Amazon DynamoDB is one of the most versatile and popular services on AWS. In seconds, we can deploy a highly available, dynamically scaling key-document store with global replication, transactions, and more! However, if we modify a list attribute on a document, we need to take extra steps to achieve correctness and concurrency. Below, I'll describe the problem and offer several solutions.
Full code if you want to follow along.
Let's clarify the problem. Suppose we insert the following document, using the JavaScript AWS-SDK and the DynamoDB DocumentClient:
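The original snippet is not reproduced here, so the following is only a minimal sketch of such an insert. It assumes an AWS SDK v2 `DynamoDB.DocumentClient` instance is passed in as `docClient`; the table name `users`, the `id` key, and every sample value other than "frylock" are hypothetical.

```javascript
// Sketch only: the table name, the `id` key, and the sample values are assumptions.
const TABLE_NAME = 'users';

// docClient is expected to be an AWS SDK v2 DynamoDB.DocumentClient instance.
async function putUser(docClient) {
  const params = {
    TableName: TABLE_NAME,
    Item: {
      id: 'meatwad',
      // A plain JavaScript array, passed to the DocumentClient as-is.
      friends: ['shake', 'frylock', 'carl'],
    },
  };
  await docClient.put(params).promise();
  return params;
}
```

With the real SDK, the client would typically be created with `new AWS.DynamoDB.DocumentClient()` and passed to `putUser`.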
In the DynamoDB console, here's what the document looks like:
By default, the DocumentClient has marshalled the JavaScript array as a DynamoDB List type. How would we remove the value "frylock" from the "friends" List attribute? Here's the doc on the List's remove operation. Since we need to specify the index of the element to remove, we need to read the document and find the index:
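A rough sketch of that read-then-remove approach might look like the following. The function name, the `users` table, and the `id` key are hypothetical, and `docClient` is assumed to be an SDK v2 DocumentClient.

```javascript
// Sketch only: table name and key are assumptions.
// This version is NOT safe when other writers can touch the same document.
async function removeFriendByIndex(docClient, id, friend) {
  const key = { id };

  // 1. Read the document to find the element's current position.
  const { Item } = await docClient.get({ TableName: 'users', Key: key }).promise();
  if (!Item || !Array.isArray(Item.friends)) return false;
  const index = Item.friends.indexOf(friend);
  if (index === -1) return false; // nothing to remove

  // 2. Remove by index. The list may have changed since step 1.
  await docClient.update({
    TableName: 'users',
    Key: key,
    UpdateExpression: `REMOVE friends[${index}]`,
  }).promise();
  return true;
}
```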
But this implementation has a race condition: there is a small window of time between reading the document, finding the index, and sending the update request, during which another writer could update the document, causing the operation "remove element at index X" to produce an undesired result. This is a classic read-modify-write race, the kind of problem that techniques such as transactional memory are designed to solve. Luckily, there are several solutions.
Condition Expression on the list contents
DynamoDB supports a handy feature called Condition Expressions, which lets us specify a condition that must be met in order for the operation to execute. In this case we want to build a rule that says, "only execute this operation if the target value is in the list":
We also need to handle the error case where the condition expression is not met. Here's the updated function:
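One way this could be sketched, under the same assumptions as before (hypothetical `users` table and `id` key, SDK v2 DocumentClient passed in), is shown below. The condition used here checks that the element at the index we read is still the target value; that exact expression is this sketch's choice, not necessarily the one in the original code.

```javascript
// Sketch only: table name, key, and function name are assumptions.
async function removeFriendIfUnchanged(docClient, id, friend) {
  const key = { id };
  const { Item } = await docClient.get({ TableName: 'users', Key: key }).promise();
  if (!Item || !Array.isArray(Item.friends)) return false;
  const index = Item.friends.indexOf(friend);
  if (index === -1) return false;

  try {
    await docClient.update({
      TableName: 'users',
      Key: key,
      UpdateExpression: `REMOVE friends[${index}]`,
      // Only apply the removal if the slot we read still holds the target value.
      ConditionExpression: `friends[${index}] = :friend`,
      ExpressionAttributeValues: { ':friend': friend },
    }).promise();
    return true;
  } catch (err) {
    // SDK v2 reports a failed condition with this error code.
    if (err.code === 'ConditionalCheckFailedException') return false;
    throw err;
  }
}
```

A caller that gets `false` back can re-read the document and retry, or give up, depending on the use case.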
This technique only ensures that updates to the list attribute are safe. How can we ensure we only apply updates when the document has not changed?
Condition Expression on a version attribute
Borrowing from databases that employ multiversion concurrency control, we can introduce a "version" attribute at the root of our document. We can use the version field to set a condition expression that aborts the update when any other update has occurred. During the put operation, we can include an initial version property like so:
Let's update the condition expression and add error handling:
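As an illustration, the versioned update could look roughly like this. It assumes the document was first written with a numeric `version` attribute (for example `version: 1` in the put's `Item`); the `users` table, the `id` key, and the function name are again hypothetical. The `#version` placeholder is used as a precaution in case the attribute name clashes with a reserved word.

```javascript
// Sketch only: table name, key, and function name are assumptions.
// Expects the stored document to carry a numeric `version` attribute.
async function removeFriendVersioned(docClient, id, friend) {
  const key = { id };
  const { Item } = await docClient.get({ TableName: 'users', Key: key }).promise();
  if (!Item || !Array.isArray(Item.friends)) return false;
  const index = Item.friends.indexOf(friend);
  if (index === -1) return false;

  try {
    await docClient.update({
      TableName: 'users',
      Key: key,
      // Remove the element and bump the version in the same request.
      UpdateExpression: `REMOVE friends[${index}] SET #version = #version + :one`,
      // Abort if anyone else has updated the document since we read it.
      ConditionExpression: '#version = :expected',
      ExpressionAttributeNames: { '#version': 'version' },
      ExpressionAttributeValues: { ':one': 1, ':expected': Item.version },
    }).promise();
    return true;
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') return false;
    throw err;
  }
}
```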
Notice that the Update Expression also increments the version attribute. The two drawbacks to this approach:
- We need to add a version attribute to every document/table for which we want to enforce this pattern.
- We need to create a wrapper layer that ensures all updates respect the version attribute and educate the team that direct update operations are prohibited.
Use the Set data type
In practice, a friends list would store a list of unique foreign keys. If we know the entries are unique, we can marshal the friends field as the DynamoDB Set data type instead of a List. Compared to lists, sets have a few differences:
- All values must be of the same type (string, number, or binary)
- All values must be unique
- To remove element(s) from a set, use the DELETE operation, specifying a set of values
- A Set cannot be empty
Sounds perfect for storing a list of related document keys. However, we saw that the DocumentClient serializes JavaScript arrays as Lists, so we need to override that behavior with a custom marshaller.
Note: the example in the docs uses a "DynamoDBSet" class, but this does not appear to be available as an import from the aws-sdk JS npm module. Instead, we'll use the DocumentClient's createSet function, which accomplishes the same thing:
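A minimal sketch of writing the friends attribute as a Set follows. It relies on the SDK v2 DocumentClient's `createSet` method; the `users` table, the `id` key, and the function name are hypothetical.

```javascript
// Sketch only: table name, key, and function name are assumptions.
async function putUserWithFriendSet(docClient, id, friends) {
  const params = {
    TableName: 'users',
    Item: {
      id,
      // createSet wraps the array so that it is stored as a Set rather than a List.
      friends: docClient.createSet(friends),
    },
  };
  await docClient.put(params).promise();
  return params;
}
```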
In the console, our new document looks almost identical, except for the "StringSet" type on the friends attribute.
Now to specify the DELETE operation:
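The DELETE update could be sketched as follows, with the same hypothetical table and key names as above. Because DELETE takes the values to remove rather than an index, no prior read is needed.

```javascript
// Sketch only: table name, key, and function name are assumptions.
async function removeFriendFromSet(docClient, id, friend) {
  const params = {
    TableName: 'users',
    Key: { id },
    // DELETE removes the given values from a Set attribute.
    UpdateExpression: 'DELETE friends :toRemove',
    ExpressionAttributeValues: {
      // The operand must itself be a set, even for a single value.
      ':toRemove': docClient.createSet([friend]),
    },
  };
  await docClient.update(params).promise();
  return params;
}
```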
Working with Sets from JavaScript has two gotchas. First: a set attribute on a document does not deserialize into a JavaScript array. Let's see what it actually returns:
Aha! A DynamoDB Set deserializes into an object with its array of items stored under the values property. If we want a Set to deserialize into an array, we'll need to add an unmarshalling step where we assign the values property instead of the deserialized set object itself.
Second: remember how sets cannot be empty? If we try to remove all elements from a set, the console will stop us:
The console prevents us from deleting the last element from an existing set, but the SDK does not.
However, if we remove the last element from a set in code, the attribute will be deleted from the document. This means the unmarshalling step we mentioned in gotcha #1 will need to account for the case where the property is undefined. Here's a helper function that covers both cases:
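A possible shape for that helper is sketched below; the name `setToArray` is hypothetical, and it assumes the deserialized set exposes its items under `values`, as described above.

```javascript
// Sketch only: converts a deserialized DynamoDB set (or a missing attribute)
// into a plain JavaScript array.
function setToArray(maybeSet) {
  // The attribute disappears when the last element is removed.
  if (maybeSet === undefined || maybeSet === null) return [];
  // Copy the items so callers cannot mutate the set wrapper's array.
  return Array.from(maybeSet.values);
}
```

For example, `setToArray(item.friends)` yields an array whether or not the `friends` attribute is present on `item`.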
You'll still get an error if you try to store an empty array as a set, so here's the helper function going the other way:
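A sketch of the reverse helper might look like this. The name `arrayToSet` is hypothetical; it returns `undefined` for an empty array so that the caller can leave the attribute off the document instead of trying to store an empty set.

```javascript
// Sketch only: converts a JavaScript array into a DynamoDB set,
// or undefined when there is nothing to store.
function arrayToSet(docClient, values) {
  if (!Array.isArray(values) || values.length === 0) return undefined;
  return docClient.createSet(values);
}
```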
Global Write Lock
Let's not forget the time-honored tradition of preventing problems instead of solving them. The race condition is only a problem when we allow concurrent writes. We can avoid concurrent writes by requiring any writer to obtain a distributed write lock (using a distributed lock service, such as etcd or ZooKeeper).
Since there are many implementations of the global-write-lock pattern, I'll omit sample code and directly discuss the tradeoffs.
This technique has two significant drawbacks: 1) A distributed lock service adds extra complexity and latency. 2) A global write lock reduces write throughput. If you're already using a distributed lock service and you don't need high write throughput, this solution is worth considering.
What about Transactions?
DynamoDB recently added support for multi-document transactions, and this sounds like a promising solution. But, as my colleague Danilo puts it:
Items are not locked during a transaction. DynamoDB transactions provide serializable isolation. If an item is modified outside of a transaction while the transaction is in progress, the transaction is canceled and an exception is thrown with details about which item or items caused the exception.
For this use case, transactions will essentially act like a slower version of condition expressions.
That's all, folks
Here's a gist of all the code we've written so far. Do you have a better way to achieve safe list updates? Please share it in the comments. I hope this helps you make the most of DynamoDB. Happy hacking!
Safe List updates with DynamoDB was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.