
I've found that KV stores like DynamoDB make for a good control-plane configuration repository. For instance, say you need to know whether a client, X, is allowed to access a resource, Y. And say you have clients on the order of millions and resources on the order of hundreds, with very specific queries to run against that denormalized data, and you need consistently low latency and high throughput across key combinations.
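
A rough sketch of what that lookup might look like with boto3 (the table and attribute names here are made up for illustration):

    import boto3

    # Assumed schema: partition key "client_id", sort key "resource_id".
    table = boto3.resource("dynamodb").Table("access-control")

    def is_allowed(client_id: str, resource_id: str) -> bool:
        resp = table.get_item(
            Key={"client_id": client_id, "resource_id": resource_id},
            ConsistentRead=True,  # strongly consistent read for control-plane decisions
        )
        return bool(resp.get("Item", {}).get("allowed", False))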

Another good use-case is to store checkpointing information. Say you've processed some task and would like to check in the result. Either the information fits within DynamoDB's 400 KB item-size limit, or you use DynamoDB as an index pointing to an S3 object.
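
Roughly, that spill-to-S3 pattern could look like this (the bucket, table, and size threshold are hypothetical):

    import json
    import boto3

    ddb = boto3.resource("dynamodb").Table("checkpoints")
    s3 = boto3.client("s3")

    def checkpoint(task_id: str, result: dict) -> None:
        payload = json.dumps(result)
        if len(payload.encode()) < 350_000:  # stay well under the ~400 KB item limit
            ddb.put_item(Item={"task_id": task_id, "result": payload})
        else:
            # Too big for one item: write the result to S3, keep only a pointer.
            key = f"results/{task_id}.json"
            s3.put_object(Bucket="my-checkpoint-bucket", Key=key, Body=payload)
            ddb.put_item(Item={"task_id": task_id, "s3_key": key})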

You could do those things with a managed or self-hosted RDBMS, but DynamoDB takes away the need to manage the hardware, the backups, the scale-ups, and the scale-outs, and reduces the ceremony of dealing with locks, schemas, misbehaving clients, and myriad other configuration knobs, whilst also fitting your query patterns to a tee.

KV stores typically give you consistent performance on reads and writes, provided you avoid cascading relationships between two or more keys and make just the right trade-offs around both cross-cluster and cross-table data consistency.

Besides, in terms of features, one can add a write-through cache in front of a DynamoDB table, can point-in-time-restore data up to a minute granularity, can create on-demand tables that scale with load (not worry about provisioned capacity anymore), can auto-stream updates to Elasticsearch for materialised views or consume the updates in real-time themselves, can replicate tables world-wide with lax consistency guarantees and so on...with very little fuss, if any.
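
For what it's worth, most of that is a couple of API calls away, e.g. an on-demand table with streams enabled and point-in-time recovery turned on (names here are purely illustrative):

    import boto3

    client = boto3.client("dynamodb")

    client.create_table(
        TableName="checkpoints",
        AttributeDefinitions=[{"AttributeName": "task_id", "AttributeType": "S"}],
        KeySchema=[{"AttributeName": "task_id", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST",  # on-demand capacity: scales with load
        StreamSpecification={"StreamEnabled": True, "StreamViewType": "NEW_AND_OLD_IMAGES"},
    )
    client.get_waiter("table_exists").wait(TableName="checkpoints")

    client.update_continuous_backups(
        TableName="checkpoints",
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )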

Running databases is hard. I pretty much exclusively favour a managed solution over self-hosted one, at this point. And for denormalized data, a managed KV store makes for a viable solution, imo.



All good points, but one thing people should look at very closely before choosing DynamoDB as a primary db is the transaction limits. Most apps are going to have some operations that should be atomic and involve more than 25 items. With DynamoDB, your only option currently is to break these up into multiple transactions and hope none of them fail. But as you scale, eventually some will fail, while others in the same request succeed, leaving your data in an inconsistent state.
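
To make that concrete, a chunked write under the 25-item cap might look like this (table name and item shape are made up; items must be in DynamoDB attribute-value form, e.g. {"order_id": {"S": "123"}}):

    import boto3

    ddb = boto3.client("dynamodb")

    def write_all_or_hope(items):
        # Each TransactWriteItems call is atomic, but only up to 25 items,
        # so a larger logical transaction has to be split into chunks.
        for i in range(0, len(items), 25):
            chunk = items[i:i + 25]
            ddb.transact_write_items(
                TransactItems=[
                    {"Put": {"TableName": "orders", "Item": item}}
                    for item in chunk
                ]
            )
            # If a later chunk fails (e.g. TransactionCanceledException),
            # earlier chunks have already committed -- there is no
            # cross-chunk rollback, which is the inconsistency described above.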

While this could be ok for some apps, I think for most use cases it's really bad and ends up being more trouble than what you save on ops in the long run, especially considering options like Aurora that, while not as hands-off as Dynamo, are still pretty low-maintenance and don't limit transactions at all.




