Jan Veen

Concurrent-Modification-Aware Declarative Multi-Table Data Modification

Recently I had a little problem in API design which resulted in a little trick nicely suited for a blog post.

To fill this ridiculously complicated title with a concrete example, let's talk about a two-entity data model of recipes:

  1. A recipe which has a name and cooking instructions.

  2. Ingredients which have a name and amount and belong to a recipe.

Each of these entities is modelled in a separate SQL DB table and is accessible over some API over the network and allows multiple users to concurrently work on them.

What I want

To detect concurrent modifications of the same recipe, each of them contains a version number which is incremented monotonically upon each modification. Every time a user modifies the recipe he has to transmit the version on which his modification bases. If this version is different (lower) than the version in the database, the user works on an old entity and his modification is rejected. He must first refresh his entity and base his modification on the latest data version. This is how to make a single table "concurrent-modification-aware".

Next up the API should be declarative which means a modification to a recipe should be formed by a request containing the state of the recipe as it should be after the modification is applied. The server should find out by itself how to get to that state. This is easily done for a single table by sending the new state of the entity inside the request and writing it into the database.

The challenge now is to achieve both a declarative API and a concurrent-modification-aware data model if the modification involves more than one table. Say for our chocolate cake recipe we have learned that a tiny amount of salt can add an interesting new flavour to the cake. We now have to add a new ingredient to the recipe and modify the cooking instructions.

What I tried

To support this modification we can imagine an API endpoint to add an ingredient to a recipe and another endpoint to update the cooking instructions. This naive approach disallows to run both modifications as a single transaction because cross-request transactions are not common and discouraged. It is also not concurrent-modification-aware: If Alice and Bob learn about the salt trick at about the same time, both can add an ingredient and the DB won't notice this duplicate. And finally it is not declarative as the user performs the low-level adding operation by herself instead of describing the high-level target state.

So let's merge the two API endpoints into a single one. The immediate benefit is that now the modification can run inside a single transaction. We can also enrich our request data structure to include both the recipe description as well as a list of ingredients. This approach is also nicely declarative: Let the server figure out how to reach a state with these exact ingredients.

However, this design isn't concurrent-modification-aware: As Alice learns about a new spice ingredient to add to the recipe, Bob still wants to add salt as ingredient. Whoever sends her request last will delete the ingredient added before by not including it into the request's ingredient list. So how to achieve this as well?

I had some rather complicated ideas giving up the declarative approach and instead sending ADD, MODIFY or REMOVE operations for those ingredients which actually changed but didn't specifically like it.

What I ended up with

The simple solution is to include the state from which the modification started into the request. This allows the server to check inside a transaction, whether the actual ingredient list is exactly the one from the request. If any ingredient is present in only one of the lists or the version of an ingredient is different, this is a modification based on an obsolete state and must be rejected as a concurrent modification.