DynamoDB 10 years later (amazon.science)
252 points by mariuz on Jan 20, 2022 | 216 comments


I can safely say that the team members working on DynamoDB are very skilled and they care deeply about the product. They really work hard and think of interesting solutions to a lot of problems that their biggest customers face, which is great from a product standpoint. There are some pretty smart people working there.

Engineering, however, was a disaster story. Code is horribly written and very few tests are maintained to make sure deployments go without issues. There was too much emphasis on deployment and getting fixes/features out over making sure it won't break anything else. It was a common scenario to release a new feature and put duct tape all around it to make sure it "works". And way too many operational issues. There are a lot of ways to break DynamoDB :)

Overall, though, the product is very solid and it's one of the few databases that you can say "just works" when it comes to scalability and reliability (as most AWS services are)

I worked at DynamoDB for over 2 years.


>Engineering, however, was a disaster story. Code is horribly written and very few tests are maintained to make sure deployments go without issues. There was too much emphasis on deployment and getting fixes/features out over making sure it won't break anything else. It was a common scenario to release a new feature and put duct tape all around it to make sure it "works". And way too many operational issues. There are a lot of ways to break DynamoDB :)

>Overall, though, the product is very solid and it's one of the few databases that you can say "just works" when it comes to scalability and reliability (as most AWS services are)

How can those two coexist?


You throw bodies at it. A small bunch of people will be overworked, stressed, constantly fighting fires and struggling to fight technical debt, implement features, and keep the thing afloat. Production is always a hair away from falling over but luck and grit keeps it running. To the team it's a nightmare, to the business everything is fine.


this is the answer.

source: currently being burned out on an adjacent aws team..


That sucks, man. If they won't move you to another team, just get out of there. We don't benefit by suffering for them, and they're not gonna change.


Yea. I can probably move to a more chill team, but I wouldn't work on anything nearly as cutting edge. I mentally check out for weeks at a time, then get back into it and deliver something large. I'm low key job hunting, but don't entirely trust that it'll be different anywhere else (previous jobs were like this too)


literally every co

if you want to know why capitalism causes this, start a startup and prioritize quality, do not get to market, do not raise money, do not pass go, watch dumpster fires with millions of betrayed and angry users raise their series d


I mean eventually enough duct tape can be solid like a tank :)




It sounds incredible, but I have heard similar things about Oracle. Maybe a large dev team can apply enough duct tape that the product is solid.


You're probably thinking of this comment, Oracle and 25 million lines of C code: https://news.ycombinator.com/item?id=18442941


They both likely have solid 80% solutions (design) and incrementally cover the 20% gap as need arises. This in turn adds to operational complexity.

The alternative would be to attempt a near-'perfect' solution for the product requirements, and that may either hit an impossibility wall or require substantial long-term effort that would impede product development cycles. So the former approach is likely the smarter choice.


You need the really good duct tape.


AWS level of disaster is very different to an average disaster. :)


Customers care about the outcome, not the internal process. Besides, I’ve never worked at any sizable company in my 20+-year-long career where I didn’t conclude, “it’s a miracle this garbage works at all.”

Enjoy the sausage, but if you have a weak stomach, don’t watch how it’s made.

(I work for AWS but not on the DynamoDB team and I have no first-hand knowledge of the above claim. Opinions are my own and not those of my employer.)


> Customers care about the outcome, not the internal process.

This is true though there's only so much technical debt and internal process chaos you can create before it affects the outcome. It's a leading indicator, so by the time customers are feeling that pain you've got a lot of work in front of you before you can turn it around, if at all, and customers are not going to be happy for that duration.

Technical debt is not something to completely defeat or completely ignore, instead it's a tradeoff to manage.


This article from Martin Fowler explores your point in greater depth. It's a good read: https://martinfowler.com/articles/is-quality-worth-cost.html

One concrete problem with technical debt the article highlights is that it negatively impacts the time to deliver new features. Customers today usually expect not only a great initial feature set from a product, but also a steady stream of improvements and growth, along with responsiveness to feedback and pain points.


> Customers care about the outcome, not the internal process

Additionally, the business cares about the outcome, not the internal process.

Ostensibly, the business should care about process but it actually doesn't matter as long as the product is just good enough to obtain/retain customers, and the people spending the money (managers) aren't incentivized to make costs any lower than previously promised (status quo).


Just curious, why do you mention you work at AWS if you're just disclaiming that fact in the next sentence? Besides, nothing you stated is specific to AWS or any of its products.


I don't work at Amazon, but our company's social media policy requires us to be transparent about a possible conflict of interest when speaking about things "close to" our company/our position in the industry, and also to be clear about whether we're speaking in an official capacity or in a personal capacity.

This is designed to reduce the chances of eager employees going out and astro-turfing or otherwise acting in trust-damaging ways while thinking they're "helping".


Crudely speaking, the fact that they work at AWS means that it’s in their best interests for AWS to be perceived positively.

When this is the case it’s often nice to state this conflict of interest, so others can take your appraisal in the appropriate context.

I’m not implying anything about the post, just stating what I assume to be the reason for the disclosure.


Exactly this. I was too young at the time to grasp this idea.


If the developers are happy about the code and testing quality of a project, then you waited too long to ship.

If the customers don't have any feedback or missed feature asks at launch, you waited too long to ship.

You know who has great internal code and test quality? Google. Which is why Google doesn't ship. They're a wealth distribution charity for talented engineers. And their competitive advantage is that they lure talented people away from other companies where they might actually ship something and compete with Google, to instead park them, distract them with toys, beer kegs, readability reviews, and monorepo upgrades.


Very interesting!

To me the takeaway is that large/interesting/challenging engineering projects are generally pretty close to disasters. Sometimes they actually do become disasters.

On the other hand, if a project looks straightforwardly designed, neatly put into JIRA stories, and developers deliver code consistently week after week, then it may be a successfully planned and delivered project. But it would mostly be doing stuff that has already been done many times over, likely by the same people on the team.

At least this has been my experience while working on standardized / templated projects vs something new.


Challenging the cutting edge of your product domain is what I get from this. Easy things are easy and predictable. Hard things and unpredictable, evolving requirements are a tension against the initial system design, which is the foundation of your code base. The larger projects get over time, the further they perhaps deviate from the original design. If you could predict it all up front, in many cases it's not all that interesting or challenging a problem. Duct tape is fine to use as long as you understand when you've gone too far and might want to redesign from scratch based on prior learnings.


On the other hand, if you don't go solder things and remove the duct tape from time to time, you will always come closer to a disaster, never further away.

Some projects are run like the Doomsday Clock, and nobody can get anything done. Other ones increase and decrease in complexity all the time, and those tend to catch up to the first set quite quickly.


I worked at a company who re-implemented the entire Dynamo paper and API, and it was exactly the same story. Completely eliminated all my illusions about the supposed superiority of distributed systems. It was a mound of tires held together with duct tape, with a tiki torch in each tire.


Did they have a spare 100 million hanging around to burn? That seems pretty ridiculous. Why did they not just run cassandra?


They did have 100 million to burn, but my mostly-wild-guess is it was closer to $1.5M/yr. But that gives you an in-house SaaS DB used across a hundred other teams/products/services, so it actually saved money (and nothing else matched its performance/CAP/functionality).

Cassandra is too opinionated and its CAP behavior wasn't great for a service like this, so they built on top of Riak. (This also eliminated any thoughts I had about Erlang being some uber-language for distributed systems, as there were (are?) tons of bugs and missing edge cases in Riak)


Erlang gives you great primitives for building reliable protocols, but they're just primitives, and there are tons of footguns since building protocols is hard.


Because Riak uses vector clocks instead of cell timestamps? Cassandra's ONE/QUORUM/ALL consistency levels otherwise allow tuning for tolerance of CP vs AP, don't they?


To be honest I don't know, I wasn't there for the initial decision, but I know it wasn't just about CAP. It could have been as simple as Riak was easier to use (which I don't know either)


> Why did they not just run cassandra?

Not Invented Here can run very deep in some branches of an organization. Depending on how engineering performance evaluations work, writing a homebrew database could totally be something that aligns with the company incentives. It might not make a single bit of sense from a business standpoint but hey, if the company rewards such behavior don't be surprised when engineers flush millions down the tube "innovating" a brand new wheel.


Dynamo paper and DynamoDB are two very different things…


Err, you're right: they reused something that implemented the Dynamo paper (Riak) and implemented the DynamoDB API on top of it.


It's a shame they don't open source it. It's funny too, being AWS they really don't have to worry about AWS running a cheaper service, so at that point why not open source it.


There is a compatible open source alternative here, https://www.scylladb.com/alternator/


If you’ve read their paper, there is a lot of detail in it to create your own. Of course they haven’t given out the code but the paper is a pretty solid design document.


They probably view it as a competitive advantage that Azure or GCP would try to copy if they figured out the "secret sauce."


I kinda doubt it. It's probably just that open sourcing it won't provide much utility (I bet lots of code is aws specific) and just adds a new maintenance burden for them.


The way you need to write code for a massively scalable service is just different. And the things you need to operate a service are also just different.


Azure has a better product (Cosmos is damn good) and Google engineers have too much hubris to import something from lowly Amazon engineers.


Azure has Cosmos and Google has Datastore. They would never.


Azure has Cosmos which is arguably better than DynamoDB for a lot of use cases.


So they just rolled out global replication, and I can't for the life of me figure out how they resolve write conflicts without cell timestamps or any other obvious CRDT measures.

Questions were handwaved away with the usual Amazon black-box non-answers, which always smell like they are hiding problems.

Any ideas how this is working? It seems bolt-on and not well thought out, and I doubt they'll ever pay for Aphyr to put it through his torture tests.


From: https://aws.amazon.com/dynamodb/global-tables/

Consistency and conflict resolution

Any changes made to any item in any replica table are replicated to all the other replicas within the same global table. In a global table, a newly written item is usually propagated to all replica tables within a second. With a global table, each replica table stores the same set of data items. DynamoDB does not support partial replication of only some of the items. If applications update the same item in different Regions at about the same time, conflicts can arise. To help ensure eventual consistency, DynamoDB global tables use a last-writer-wins reconciliation between concurrent updates, in which DynamoDB makes a best effort to determine the last writer. With this conflict resolution mechanism, all replicas agree on the latest update and converge toward a state in which they all have identical data.
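To illustrate the reconciliation they describe, here is a toy sketch of last-writer-wins, not AWS's actual implementation; the timestamps and the region tiebreaker are made up:

    # Illustrative only: each replica's copy of an item carries a timestamp,
    # and the copy with the latest timestamp wins everywhere (ties broken
    # deterministically, e.g. by region name). Losing writes are overwritten.

    def reconcile_lww(versions):
        """versions: list of (timestamp, region, item) tuples, one per replica."""
        winner = max(versions, key=lambda v: (v[0], v[1]))
        return winner[2]

    # Two regions update the same item at nearly the same time.
    versions = [
        (1642600000.120, "us-east-1", {"pk": "user#1", "name": "Alice"}),
        (1642600000.118, "eu-west-1", {"pk": "user#1", "name": "Alicia"}),
    ]
    print(reconcile_lww(versions))  # the us-east-1 write wins; the other is discarded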


So a write that doesn’t “win” just gets silently discarded?


Honestly your expectations are too high. Conflict resolution is row-level last-write-wins. It's not a globally distributed database, it's just a pile of regional DynamoDB tables duct taped together... They're not going to hire Aphyr for testing because there's nothing for him to test.


"And way too many operational issues."

I've seen this kind of thing mentioned many times; pretty baffling TBH given Dynamo's good reputation in industry. Are these mostly in the stateless components of the product, or do they see data loss?


I can't say, at the risk of violating some NDA, but a lot of it is internal stuff that customers will never even be aware of, or that would require too much effort for them to break.

There have been times when bad deployments happened and customers were impacted.


Careful about running your systems in us-east-1, folks


I've worked at 5 different tech companies now - this is par for the course. And at every single one, people wished they could go back and do it again, but by that point the product was too successful, so they ran with it.


We're at the early stages of planning an architecture where we offload pre-rendered JSON views of PostgreSQL data onto a key-value store optimised for high-volume, read-only access. Considering DynamoDB, S3, Elastic, etc. (We'll probably start without the pre-render bit, or store it in PostgreSQL until it becomes a problem).

When looking at DynamoDB I noticed that there was a surprising amount of discussion around the requirement for provisioning, considering node read/write ratios, data characteristics, etc. Basically, worrying about all the stuff you'd have to worry about with a traditional database.

To be honest, I'd hoped that it could be a bit more 'magic', like S3, and that AWS would take care of provisioning, scaling, sharding, etc. But it seems, disappointingly, that you have to focus on proactively worrying about operations and provisioning.

Is that sense correct? Is the dream of a self-managing, fire-and-forget key value database completely naive?


Your example really summarizes the challenge with the AWS paradigm: namely that they want you to believe that the thing to do is to spread the backend of your application across a large number of distinct data systems. No one uses DynamoDB alone: they bolt it onto Postgres after realizing they have availability or scale needs beyond what a relational database can do, then they bolt on Elasticsearch to enable querying, and then they bolt on Redis to make the disjointed backend feel fast. And I'm just talking operational use cases; ignoring analytics here. Honestly it doesn't need to be these particular technologies, but this is the general phenomenon you see in so many companies that adopt a relational database, a key/value store (could be Cassandra instead of DynamoDB, e.g. like what Netflix does), a search engine, and a caching layer because they think that's the only option

This inherently leads to a complexity debt explosion, fragmentation in the experience, and an operationally brittle posture that becomes very difficult to dig out of (this is probably why AWS loves the paradigm).


>No one uses DynamoDB alone

Almost every single team at Amazon that I can think of off the top of my head uses DynamoDB (or DDB + S3) as its sole data store. I know that there are teams out there using relational DBs as well (especially in analytics), but in my day-to-day working with a constantly changing variety of teams that run customer-facing apps, I haven't seen RDS/Redis/etc being used in months.


The thing about Amazon is that it is massive. In my neck of the woods, I've got the complete opposite experience. So many teams have the exact DDB induced infrastructure sprawl as described by the GP (e.g. supplemental RDBMS, Elastic, caching layers, etc..).

Which says nothing of DDB. It's a god-tier tool if what you need matches what it's selling. However, I see too many teams reach for it by default without doing any actual analysis (including young me!), thus leading to the "oh shit, how will we...?" soup of ad-hoc supporting infra. Big machines look great on the promo-doc tho. So, I don't expect it to stop.


> they bolt it onto Postgres after realizing they have availability or scale needs beyond what a relational database can do, then they bolt on Elasticsearch to enable querying, and then they bolt on Redis to make the disjointed backend feel fast.

This made my head explode. Why would you explicitly join together two systems made to solve different issues? This sounds rather like a lack of architectural vision. Postgres's need for essentially zero up-front access-pattern design inherently clashes with DynamoDB's; the same goes for the Elasticsearch scenario: DynamoDB was not made to query everything, it's made to query specifically what you designed to be queried and nothing else. Redis sort of makes sense to gain a bit of speed for some particular access, but you still lack collection-level querying with it.

In my experience, leave DynamoDB alone and it will work great. Automatic scaling is cheaper eventually if you've done your homework about knowing your traffic.


> In my experience, leave DynamoDB alone and it will work great.

My experience agrees with yours and I'm likewise puzzled by the grandparent comment. But just a shout out to DAX (DynamoDB Accelerator), which makes it scale through the roof:

https://aws.amazon.com/dynamodb/dax/


If you add DAX you are not guaranteed to read your writes. Terrible consistency model. https://docs.aws.amazon.com/amazondynamodb/latest/developerg...


> Terrible consistency model.

Judging a consistency model as "terrible" implies that it does not fit any use case and therefore is objectively bad.

On the contrary, there are plenty of use cases where “eventually consistent writes” are a perfect fit. To see that this is true, you only have to look at how every major database server offers this as an option - just one example:

https://www.compose.com/articles/postgresql-and-per-connecti...


I think the main advantage of DDB is being serverless. Adding a server-based layer on top of it doesn't make sense to me.

I have a theory that it would be better to have multiple table replicas for read access. At the application level, you randomize access to those tables according to your read scale needs.

Use main table streams and lambda to keep replicas in sync.

Depending on your traffic, this might end up more expensive than DAX, but you remain fully serverless, using the exact same technology model, and have control over the consistency model.

Haven't had the chance to test this in practice, though.
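Roughly what I have in mind, as an untested sketch: a Lambda subscribed to the main table's stream fans each change out to the replica tables. Table names and the event wiring are made up for illustration.

    import boto3

    ddb = boto3.client("dynamodb")
    REPLICA_TABLES = ["mytable-replica-1", "mytable-replica-2"]  # hypothetical names

    def handler(event, context):
        # Standard DynamoDB Streams -> Lambda event shape.
        for record in event["Records"]:
            change = record["dynamodb"]
            for table in REPLICA_TABLES:
                if record["eventName"] in ("INSERT", "MODIFY"):
                    # NewImage is already in DynamoDB attribute-value format,
                    # so it can be written as-is with the low-level client.
                    ddb.put_item(TableName=table, Item=change["NewImage"])
                elif record["eventName"] == "REMOVE":
                    ddb.delete_item(TableName=table, Key=change["Keys"])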


Thanks - I've seen DAX mentioned and possibly even recommended. I don't need faster DynamoDB that much.


You choose your consistency on reads. However, DAX won't help you much on a write-heavy workload.


In my experience, NoSQL is almost never the right answer.

And DynamoDB is worse than most.

My prediction is that the future is in scalable SQL; CockroachDB or Yugabase or similar.

NoSQL actually causes more problems than it solves, in my experience.


There are plenty of reasons when NoSQL is the right answer. The biggest is when you care more about predictable performance: https://brooker.co.za/blog/2022/01/19/predictability.html?s=...


As long as you consider "can just fail if it gets too busy" to be "predictable."

Which I don't. I'd rather see reliable operation than "predictable except for when it fails outright" in almost every situation.

If you've encountered that other situation, where failures are fine? Then great. But I still assert that's a tiny minority of real-life DB use cases.


If this is not the only option, what would you suggest instead? How to simplify it?


The alternative is to go to GCP and use the big GCP selling point, which is Bigtable/BigQuery.

Those databases build most of that in, and it's all one fairly excellent distributed monolith.


Wouldn't Spanner be closer to what you're talking about?


It's still a marriage.


> they bolt it onto Postgres

I am working with a company that is redesigning an enterprise transactional system, currently backed by an Oracle database with 3000 tables. It’s B2B so loads are predictable and are expected to grow no more than 10% per year.

They want to use DynamoDB as their primary data store, with Postgres for edge cases. It seems to me the opposite would be more beneficial.

At what point does DynamoDB become a better choice than Postgres? I know that at certain scales Postgres breaks down, but what are those thresholds?


You can make Postgres scale, but there is an operational cost to it. DynamoDB does that for you out of the box. (So does Aurora, to be honest, but there is also an overhead to setting up an Aurora cluster to the needs of your business.)

I've found also that in Postgres the query performance does not keep up with bursts of traffic -- you need to overprovision your db servers to cope with the highest traffic days. DynamoDB, in contrast, scales instantly. (It's a bit more complicated than that, but the effect of it is nearly instantaneous.) And what's really great about DynamoDB is that after the traffic levels go down, it does not scale down your table; it maintains it at the same capacity at no additional cost to you, so if you receive another burst of traffic at the same throughput, you can handle it even faster.

DynamoDB does a lot of magic under the hood, as well. My favorite is auto-sharding, i.e. it automatically moves your hot keys around so the demand is evenly distributed across your table.

So DynamoDB is pretty great. But to get the best experience from DynamoDB, you need to have a stable codebase, and design your tables around your access patterns. Because joining two tables isn't fun.


> So DynamoDB is pretty great. But to get the best experience from DynamoDB, you need to have a stable codebase, and design your tables around your access patterns. Because joining two tables isn't fun.

More than just joining--you're in the unenviable place of reinventing (in most environments, anyway) a lot of what are just online problems in the SQL universe. Stuff you'd do with a case statement in Postgres becomes some on-the-worker shenanigans, stuff you'd do with a materialized view in Postgres becomes a batch process that itself has to be babysat and managed and introduces new and exciting flavors of contention.

There are really good reasons to use DynamoDB out there, but there are also an absolute ton of land mines. If your data model isn't trivial, DynamoDB's best use case is in making faster subsets of your data model that you can make trivial.


Using more than one DynamoDB table is a bad idea in the first place.


They should be looking at Aurora, not Dynamo. Using Dynamo as the primary store for relational data (3000 tables!) sounds like an awful idea to me. I’d rather stay on Oracle.

https://aws.amazon.com/rds/aurora/?aurora-whats-new.sort-by=...


It really depends much more on the access patterns than data shape.

Certain access patterns can do pretty well with 3,000 relational tables denormalized to a single DynamoDB table.


It seems to me that what this is saying is that storage has become so cheap that if one database provides even slight advantages over another for some workload, it is likely to be deployed and have all the data copied over to it.

HN entrepreneurs take note, this also suggests to me that there may be a market for a database (or a "metadatabase") that takes care of this for you. I'd love to be able to have a "relational database" that is also some "NoSQL" databases (since there's a few major useful paradigms there) that just takes care of this for me. I imagine I'd have to declare my schemas, but I'd love it if that's all I had to do and then the DB handled keeping sync and such. Bonus points if you can give me cross-paradigm transactionality, especially in terms of coherent insert sets (so "today's load of data" appears in one lump instantly from clients point of view and they don't see the load in progress).

At least at first, this wouldn't have to be best-of-breed necessarily at anything. I'd need good SQL joining support, but I think I wouldn't need every last feature Postgres has ever had out of the box.

If such a product exists, I'm all ears. Though I am thinking of this as a unified database, not a collection of databases and products that merely manages data migrations and such. I'm looking to run "CREATE CASSANDRA-LIKE VIEW gotta_go_fast ON SELECT a.x, a.y, b.z FROM ...", maybe it takes some time of course but that's all I really have to do to keep things in sync. (Barring resource overconsumption.)


> I'd love to be able to have a "relational database" that is also some "NoSQL" databases (since there's a few major useful paradigms there) that just takes care of this for me. I imagine I'd have to declare my schemas, but I'd love it if that's all I had to do and then the DB handled keeping sync and such.

You might be interested in what we're building [0]

It synchronizes your data systems so that, for example, you can CDC tables from your Postgres DB, transform them in interesting ways, and then materialize the result in a view within Elastic or DynamoDB that updates continuously and with millisecond latency.

It will even propagate your sourced SQL schemas into JSON schemas, and from there to, say, an equivalent Elasticsearch schema.

[0]: https://github.com/estuary/flow


I'm afraid it's not feasible to develop a single general purpose implementation for that.

The amount of complexity to guarantee data integrity while covering all possible use cases will be just unmanageable.

I'd be extremely happy to be proven wrong, though...


I think there was a project like this a few years ago (wrapping a relational DB + ElasticSearch into one box) and I thought it was CrateDB, but from looking at their current website I think I'm misremembering.

The concept didn't appeal to me very much then, so I never looked into it further.

---

To address your larger point, I think Postgres has a better chance of absorbing other datastores (via FDW and/or custom index types) and updating them in sync with its own transactions (as far as those databases support some sort of atomic swap operation) than a new contender has of getting near Postgres' level of reliability and feature richness.


Were you thinking of ZomboDB? https://github.com/zombodb/zombodb


AWS tried building this with Glue Elastic Views: https://aws.amazon.com/glue/features/elastic-views/

It's been in preview forever though, not sure when it's going to officially launch.


My understanding of the CockroachDB architecture is that it's essentially two discrete components: a key-value store that actually persists the data, and a SQL layer built on top. Although I don't think it's recommended or supported to access the key-value store directly.


Postgres with Cassandra built in and scaled separately would be really great.


We use DynamoDB alone. Microservices generally use one or two tables each.


I have no direct experience with scaling DynamoDB in production, so take this with a grain of salt. But it seems to me that the on-demand scaling mode in DynamoDB has gotten _really_ good the last couple of years.

For example, you used to have to manually set RCU/WCU to a high number when you expected a spike in traffic, since the ramp-up for on-demand scaling was pretty slow (could take up to 30 minutes). But these days, on-demand can handle spikes from 10s of requests a minute to 100s/1000s per second gracefully.

The downside of on-demand is the pricing - it's more expensive if you have continuous load. But it can easily become _much_ cheaper if you have naturally spiky load patterns.

Example: https://aws.amazon.com/blogs/database/running-spiky-workload...


> The downside of on-demand is the pricing - it's more expensive if you have continuous load.

True, although you don't have to make that choice permanently. You can switch from provisioned to on demand once every 24 hours.

And you can also set up application autoscaling in provisioned mode, which'll allow you to set parameters under which it'll scale your provisioned capacity up or down for you. This doesn't require any code and works pretty well if you can accept autoscaling adjustments being made in the timeframe of a minute or two.
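For example, a rough sketch with boto3 (the table name and capacity limits are placeholders, and it assumes the usual application-autoscaling permissions are in place):

    import boto3

    aas = boto3.client("application-autoscaling")

    # Tell the autoscaler which dimension of the table it may scale, and within what bounds.
    aas.register_scalable_target(
        ServiceNamespace="dynamodb",
        ResourceId="table/my-table",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        MinCapacity=5,
        MaxCapacity=500,
    )

    # Track a target utilization; capacity is adjusted up/down to stay near it.
    aas.put_scaling_policy(
        PolicyName="my-table-read-tracking",
        ServiceNamespace="dynamodb",
        ResourceId="table/my-table",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 70.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
        },
    )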


scaling down is limited to 4x a day


It’s up to 27 times a day, if you time it well: “4 decreases in the first hour, and 1 decrease for each of the subsequent 1-hour windows in a day”.


gotcha, it's been awhile since I was looking at that


They upped it when their own autoscaler needed the ability to back it down more :-/


Indeed

We've some regular jobs that require scaling up DynamoDB in advance a few times per day, but then Dynamo is only able to scale down 4x per day, so we're probably paying for overcapacity unnecessarily (10x or more) for a couple of hours a day

Now we've just moved to on-demand and let them handle it; works fine


> Is the dream of a self-managing, fire-and-forget key value database completely naive?

It's not, if you plan it right. Learn about single table design for DynamoDB before you start. There are a lot of good resources from Amazon and the community.

Here is a very accessible video from the community:

https://www.youtube.com/watch?v=BnDKD_Zv0og

Here is a video from Rick Houlihan, a senior leader from AWS who basically helps companies convert to single table design:

https://www.youtube.com/watch?v=KYy8X8t4MB8

And a good book on the topic:

https://www.dynamodbbook.com

If you use single table design, you can turn on all of the auto-tuning features of DynamoDB and they will work as expected and get better and more efficient with more data.

Some people worry that this breaks the cardinal rule of microservices: One database per service. But the actual rule is never have one service directly access the data of another, always use the API. So as long as your services use different keyspaces and never access each other's data, it can still work (but does require extra discipline).
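A toy sketch of what a single-table keyspace can look like (the table name, PK/SK attribute names, and item layout are all made up for illustration): one customer and their orders share a partition, so "customer plus all their orders" is a single key-based Query, no scan and no join.

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("app-data")  # hypothetical table with PK (hash) and SK (range)

    # One item collection: the customer record plus each of their orders.
    table.put_item(Item={"PK": "CUSTOMER#42", "SK": "PROFILE", "name": "Ada"})
    table.put_item(Item={"PK": "CUSTOMER#42", "SK": "ORDER#2022-01-20#1001", "total": 99})

    # Fetch just that customer's orders with a single Query.
    resp = table.query(
        KeyConditionExpression=Key("PK").eq("CUSTOMER#42") & Key("SK").begins_with("ORDER#")
    )
    print(resp["Items"])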


A lot of things that used to be a concern (hot partitions, etc) are not a concern anymore and most have been solved these days :)

Put it on on-demand pricing (it'll be better and cheaper for you most likely), and it will handle any load you throw at it. Can you get it to throttle? Sure, if you absolutely blast it without ever having had that high of a need before (and it can actually be avoided[0]).

You will need to understand how to model things for the NoSQL paradigm that DynamoDB uses, but that's a question of familiarity and not much else (you didn't magically know SQL either).

My experience comes from scaling DynamoDB in production for several years, handling both massive IoT data ingestion in it as well as the user data as well. We were able to replace all things we thought we would need a relational database for, completely.

My comparison with a traditional RDS setup:

- DynamoDB issues? 0. Seriously. The only thing you need to monitor is billing.

- RDS? Oh boy: need to provision for peak capacity, need to monitor replica lag, need to monitor the replicas themselves, constant monitoring and scaling of IOPS, queries suddenly getting slow as data increases, worrying about indexes and data size, and much more...

[0]: https://theburningmonk.com/2019/03/understanding-the-scaling...


> We're at early stages of planning an architecture where we offload pre-rendered JSON views of PostgreSQL onto a key value store optimised for read only high volume.

If possible, put the json in Workers KV, and access it through Cloudflare Workers. You can also optionally cache reads from Workers KV into Cloudflare's zonal caches.

> To be honest, I'd hoped that it could be a bit more 'magic', like S3

You could opt to use the slightly more expensive DynamoDB On-Demand, or the free DynamoDB Auto-Scaling modes, which are relatively no-config. For a very read-heavy workload, you'd probably want to add DynamoDB Accelerator (a write-through in-memory cache) in front of your tables. Or, use S3 itself (but an S3 bucket doesn't really like when you load it with a tonne of small files) accelerated by CloudFront (which is what AWS Hyperplane, tech underpinning ALB and NLB, does: https://aws.amazon.com/builders-library/reliability-and-cons...)

S3, much like DynamoDB, is a KV store: https://news.ycombinator.com/item?id=11161667 and https://www.allthingsdistributed.com/2009/03/keeping_your_da...


DynamoDB is pretty much the opposite of magic.

It is a resource that can often be the right tool for the job but you really have to understand what the job is and carefully measure Dynamo up for what you are doing.

It is _easy_ to misunderstand or miss something that would make Dynamo hideously expensive for your use case.


What use cases would likely make it hideously expensive, in your view? Like, what are the red flags?


Hot keys are much lesser of an issue nowadays. It'd been a big one in old DDB architectures.

I'd say requiring scans or filters as opposed to queries is one of the biggest issues that can bite your pocket.

Think carefully about how you'll access your data later. You won't be able to change it drastically and cheaply later.


Hot keys are the primary one. They destroy your "average" calculations for your throughput.

Bulk loading data is the other gotcha I've run into. Had a beautiful use case for steady read performance of a batch dataset that was incredibly economical on Dynamo but the cost/time for loading the dataset into Dynamo was totally prohibitive.

Basically Dynamo is great for constant read/write of very small, randomly distributed documents. Once you are out of that zone things can get dicey fast.


I do not recommend starting off with a decision to use DynamoDB before you have worked with it directly for some time to understand it. You could spend months trying to shoehorn your use case into it before realizing you made a mistake. That said, DynamoDB can be incredibly powerful and inexpensive tool if used right.


I think this can be said about any technology, really...


Yea, probably, but it is especially true for DynamoDB because it can initially appear as though your use cases are all supported but that is only because you haven't internalized how it works yet. By the time you realize you made a mistake, you are way too far in the weeds and have to start over from scratch. I would venture that more than 50% of DynamoDB users have had this happen to them early on. Anecdotally, just look at the comments on this post. There are so many horror stories with DynamoDB, but they're basically all people who decided to use it before they really understood it.


I believe it used to be static provisioning: you'd set the read and write capacity limits beforehand. Then obviously there is autoscaling of those, but it is still steps of capacity being provisioned.

They now have a dynamic provisioning scheme where you simply don't care, but it is more expensive, so if you have predictable requirements it is still better to use static capacity provisioning. The option is there, though.

DynamoDB also requires the developer to know about its data storage model. While this is generally a good practice for any data storage solution, I feel like Dynamo requires a lot more careful planning.

I also think that most of the best practices, articles etc apply to giant datasets with huge scale issues etc. If you are running a moderately active app, you probably can get away with a lot of stupid design decisions.


My experience with dynamic provisioning has been that it is pretty inelastic, at least at the lower range of capacity. E.g. if you have a few read units and then try to export the data using AWS's cli client, you can pretty quickly hit the capacity limit and have to start the export over again. Last time, I ended up manually bumping the capacity way up, waiting a few minutes for the new capacity to kick in, and then exporting. Not what I had in mind when I wanted a serverless database!


I understand it's not really your point, but if you're actually looking to export all the data from the table, they've got an API call you can give to have DynamoDB write the whole table to S3. This doesn't use any of your available capacity.

https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

Beyond that, though, it's really not designed for that kind of use case.
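Roughly, the call looks like this (the table ARN and bucket name are placeholders, and the table needs point-in-time recovery enabled for the export to work):

    import boto3

    ddb = boto3.client("dynamodb")
    resp = ddb.export_table_to_point_in_time(
        TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/my-table",
        S3Bucket="my-export-bucket",
        ExportFormat="DYNAMODB_JSON",  # or "ION"
    )
    # The export runs asynchronously and consumes none of the table's read capacity.
    print(resp["ExportDescription"]["ExportStatus"])  # e.g. "IN_PROGRESS"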


Ah, fair point. Somehow I didn't encounter that when I was trying to export, even though it existed at the time. But it would have solved my problem.


The key benefit with DDB is predictability: https://brooker.co.za/blog/2022/01/19/predictability.html

Yes, you have to learn about all these things upfront. But once you figure it out, test it, and configure it - it will work as you expect. No surprises.

Whereas Relational Databases work until they don't. A developer makes a tiny (even a no-op) change to a query or stored procedure, a different SQL plan gets chosen, and suddenly your performance/latency dramatically reduces, and you have no easy way to roll it back through source control/deployment pipelines. You have to page a DBA who has to go pull up the hood.

With services like DDB, you maintain control.


It is for now but it doesn't have to be. Dynamo's design isn't particularly amenable to dynamic and heterogenous shard topologies however.

There could exist a fantasy database where you still tell it your hash and range keys, which are roughly how you tell the database which data isn't closely related to each other and which data is (and which you may want to scan), but instead of hard provisioning shard capacity it automagically splits shards when they hotspot and doesn't rely on consistent hashing, so that every shard can be sized differently depending on how hot it is.

Right now such a database doesn't exist AFAICT, as most places that need something that scales big enough also generally have the skill to avoid most of the pitfalls that cause problems on simple databases like Dynamo.


Dynamo is incredibly hard to use correctly

I’d urge you to start writing a prototype, a lot of your assumptions might get thrown out the window. Dynamo is not necessarily good for reading high volume. You’ll end up needing to use a parallel scan approach which is not fast.


I'd say Dynamo is extremely good at reading high volume, with the appropriate access pattern. It's very efficient at retrieving huge amounts of well partitioned data using the data's keys, but scanning isn't so efficient.


You can only ever fetch 1MB of data at a time though, even when using the more efficient query method (as opposed to scan). If your individual entities are not very tiny, it is hard to get for instance 2M items back in a reasonable amount of time.
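So you end up paginating with LastEvaluatedKey, something like this rough boto3 sketch (table and key names are made up): each Query response that hits the 1 MB limit includes LastEvaluatedKey, which you feed back in as ExclusiveStartKey until it disappears.

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("events")  # hypothetical table

    def query_all(pk):
        kwargs = {"KeyConditionExpression": Key("PK").eq(pk)}
        while True:
            page = table.query(**kwargs)
            yield from page["Items"]
            if "LastEvaluatedKey" not in page:
                break  # no more pages
            kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

    items = list(query_all("DEVICE#123"))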


Also can be _very_ expensive if you do not use it correctly.


I don't know your scaling needs, but I would highly recommend just using Aurora postgresql for read-only workloads. We have some workloads that are essentially K/V store lookups that were previously slated for dynamodb. On an Aurora cluster of 3*r6g.xlarge we easily handle 25k qps with p99 in the single-digit ms range. Aurora can scale up to 15 instances and up to 24xlarge, so it would not be unreasonable to see 100x the read workload with similar latencies.

Happy to talk more. We're actively moving a bunch of workloads away from DynamoDB and to Aurora so this is fresh on our minds.


Thanks, that's what I hope will work. I might drop you a mail at some point.


The salespeople always promise magic and handwave CAP away.

But data at scale is about:

1) knowing your queries ahead of time (since you've presumably reached the limit of PG/maybesql/o-rackle).

2) dealing with CAP at the application level: distributed transactions, eventual consistency, network partitions.

3) dealing with a lot more operational complexity, not less.

So if the snake oil salesmen say it will be seamless, they are very very very much lying. Either that, or you are paying a LOT of money for other people to do the hard work.

Which is what happens with managing your own NoSQL vs DynamoDB. You'll pay through the roof for DynamoDB at true big data scales.


If you know and understand S3 pretty well, and you purely need to generate, store, and read materialized static views, I highly recommend S3 for this use case. I say this as someone who really likes working with DDB daily and understands the tradeoffs with Dynamo. You can always layer on Athena or (simpler) S3 Select later if a SQL query model is a better fit than KV object lookups. S3 is loosely the fire and forget KV DB you’re describing IMO depending on your use case
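A rough sketch of the "S3 as a fire-and-forget KV store for materialized JSON views" idea (bucket name and key layout are made up):

    import json
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-rendered-views"  # placeholder bucket

    def put_view(view_id, payload):
        # Write one pre-rendered JSON view per object key.
        s3.put_object(
            Bucket=BUCKET,
            Key=f"views/{view_id}.json",
            Body=json.dumps(payload).encode("utf-8"),
            ContentType="application/json",
        )

    def get_view(view_id):
        obj = s3.get_object(Bucket=BUCKET, Key=f"views/{view_id}.json")
        return json.loads(obj["Body"].read())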


Plenty of options already exist. DynamoDB has both autoscaling and serverless modes. AWS also has managed Cassandra (runs on top of DynamoDB) which doesn't need instance management.

Azure has CosmosDB, GCP has Cloud Datastore/Firestore, and there are many DB vendors like Planetscale (mysql), CockroachDB (postgres), FaunaDB (custom document/relational) that have "serverless" options.


Exactly. This has been my experience with several AWS technologies. Like with their ElasticSearch service, where I had to constantly fine-tune various parameters, such as memory. I was curious why they couldn't auto-scale the memory, why I had to do that manually. There are several AWS services that should be a bit more magical, but they are not.


Dynamo is like the opposite of fire and forget. You really want to know your access patterns at design time.


There's not really magic with S3, you still need to name things with coherent prefixes to spread around the load.

DynamoDB is almost simple enough to learn in a day. And if you're doing nothing with it, you're only really paying for storage. Good luck with your decisions.


And S3 won't scale instantly... If your load is big enough :)

Everything has limits, but S3 is remarkably hard to break if used right.


S3 naming no longer matters for performance. Rejoice.


Prefixes are not needed in 90% of use cases


I'm not going to speculate on the accuracy of the 90% value, but I will say that appropriately prefixed objects substantially help with performance when you have tons of small-ish files. Maybe most orgs don't have that need, but in operational realms doing this with your logs makes the response faster.


DynamoDB is like S3 but with query features. It is not a relational db. It is a document store. So you need to use it for what it is.

Our entire solution is basically built on top of Lambda and DynamoDB tables and it works really well as long as you don't treat the tables like SQL.


After looking into solutions like Fauna, Upstash, and Planetscale I don't understand why anyone is bothering with DDB anymore.

I read "the dynamodb book" and almost had a stroke. So many idiosyncrasies, for what?!


Your impressions are correct: DynamoDB is quite low-level and more like a DB kit than a ready-to-use DB; for most applications it's better to use something else.


Save yourself a ton of pain and don't use DynamoDB


If it’s possible in your situation, instead of vendor lock-in, invest in cacheability of your service and leverage HTTP cache as much as possible.


If you use the "pay per request" billing model instead of provisioned throughput, DynamoDB scaling is self-managing, and you can treat your DB as a fire-and-forget key/value store. You need to plan how you'll query your data and structure the keys accordingly, but honestly, that applies even more to S3 than it does to Dynamo.
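For example, a rough boto3 sketch of creating a pay-per-request table (table and key names are placeholders):

    import boto3

    ddb = boto3.client("dynamodb")
    ddb.create_table(
        TableName="my-kv-store",
        AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
        KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST",  # no RCU/WCU provisioning to manage
    )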


Exactly my experience. I got sucked into using it more than once, thinking it would be better next time, but there are just so many sharp edges.

At one company, someone accidentally set the write rate high to transfer data into the db. This had the effect of permanently increasing the shard count to a huge number, basically making the DB useless.


I think this is a good summary, and it gets even more complicated if you start using the DAX cache. Your read/write provisioning for DAX is totally different from that of the underlying DynamoDB tables. The write throughput for DAX is limited by the size of the master node in the cluster. Can you say bottleneck?


Take a look at Firestore / Google Cloud Datastore. It's pretty much exactly what you describe - fire and forget. There's no concept of "node" (at least not from the outside).


If you don't need data persistence then consider redis instead (which can also do persistence if you enable AOF)


Thinking like this both baffles me and makes me happy, because there will always be a need for people like me: infra. AWS is not a magical tool that will replace your infra team, it is a magical tool that will allow your infra team to do more. I am the infra team of my startup and I estimate that only 50% of my time is doing infra work. The rest is supporting my peers, working on frameworky stuff, solving dev efficiency issues bla bla.

Let's say that you operate in an AWS-less environment, with everything bare metal, in a datacenter. Your GOOD infra team has to do the following:

Hardware:

- make sure there is a channel to get new hardware, both for capacity increase and spares. What are you going to do? Buy 1 server and 2 spares? If one of the servers has an issue, isn't it quite likely that the other servers, from the same batch, to have the same issue? Is this affecting you, or not? Where do you store the spares? In a warehouse somewhere, making it harder to deploy? In the rack with the one in use, wasting rackspace/switch space? Are you going to rely on the datacenter to provide you with the hardware? What if you are one of their smaller customers and your requests get pushed back because some larger customer requests get higher priority?

- make sure there is a way to deploy said hardware. You don't want to not be able to deploy a new server because there is no space in the rack, or no space in the switch. Where are your spares? In a warehouse miles away from the datacenter? Do you have access to said warehouse at midnight, on Thanksgiving? Oh shit, someone lost the key to your rack! Oh noes, we don't have any spare network cable/connectors/screws...

Software:

- did you patch your servers? did you patch your switches?

- new server, we need to install the os. And a base set of software, including the agent we use to remote manage the server.

- oh, we also need to run and maintain the management infra, say the control plane for k8.

- oh, we want some read replicas for this db, not only we need the hardware to run the replicas on (and see above for what that means), now you need to add a bunch of monitoring and have plans in place to handle things like: replicas lagging, network links between master and replicas being full, failover for the above, master crapping out yada yada.

I bet there are many other aspects I'm missing.

Choices:

Your GOOD infra team will have to decide things like: how many spares do we need, is the capacity we have atm enough for the launch of our next world-changing feature that half the internet wants to use? Are we lucky enough to survive a few months without spares or should we get extra capacity in another datacenter? Do we want to have replicas on the west coast or is the latency acceptable?

These are the main areas of what an infra team is supposed to do: Hardware, Software and Choices. AWS (and most other cloud providers) is making the first 2 points non issues. For the last area you can do 2 things: get an infra team (could be a full fledged team, could be 1 person, you could do it) and theoretically you will get choices tailored to what your business needs OR let AWS do it for you. *AWS might make these choices based on a metric you disagree with and this is the main reason people complain*.


Every system I've built on DynamoDB just works. The APIs that use it have had virtually 100% uptime and have never needed database maintenance. It is not a replacement for a RDBMS, but for some use cases, it's a killer service.


As a developer, I really have a love-hate relationship with Dynamo. I love how fast and easy it is to setup and get rolling.

The partitioning scheme came off as confusing and opaque but I think that says more about Amazon's documentation than the scheme itself.

I do not like that there's really no third-party tooling integration to be able to query. Their UI in the console is _so freaking terrible_ yet you have no other way than code to query it. This problem is so bad that I will avoid using it where I can despite it being a good option, performance-wise.


Don't think this is a fair reason to avoid DynamoDB. There are reasonable alternatives:

- DynamoDB Workbench (free, AWS official): https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

- Dynobase (paid, third-party): https://dynobase.dev/


Plus there is the local install ( Only for Dev and Testing purposes...)

"Deploying DynamoDB Locally on Your Computer" https://docs.aws.amazon.com/amazondynamodb/latest/developerg...


* SQL support is so new (https://aws.amazon.com/about-aws/whats-new/2020/11/you-now-c...) I have not seen it used in real life yet.

* No way to "DELETE * FROM T"
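For reference, the PartiQL support looks roughly like this with boto3 (the table name is made up); note it still doesn't give you a bulk delete, since deletes are per-item by key:

    import boto3

    ddb = boto3.client("dynamodb")

    resp = ddb.execute_statement(
        Statement='SELECT * FROM "orders" WHERE PK = ?',
        Parameters=[{"S": "CUSTOMER#42"}],
    )
    print(resp["Items"])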


Yes the console UI is mind numbing, and was just made 10x worse in a recent redesign.

But i like the python boto3 library https://boto3.amazonaws.com/v1/documentation/api/latest/guid...

Build yourself a few wrappers to make querying more convenient, and I query straight from a Python REPL pretty effectively
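For example, something in the spirit of this rough sketch (it assumes a PK/SK style table; adjust to your key schema):

    import boto3
    from boto3.dynamodb.conditions import Key

    def q(table_name, pk, sk_prefix=None):
        """Quick-and-dirty query helper for interactive use."""
        table = boto3.resource("dynamodb").Table(table_name)
        cond = Key("PK").eq(pk)
        if sk_prefix:
            cond = cond & Key("SK").begins_with(sk_prefix)
        return table.query(KeyConditionExpression=cond)["Items"]

    # >>> q("orders", "CUSTOMER#42", "ORDER#")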


My experience exactly.


We realized how great Dynamo was only after we migrated off AWS.

Dynamo was a key factor for us when we were releasing the MVP of our News API [0]. We used Dynamo, ElasticSearch, Lambda and could get it running in 60 days while being full-time employed.

Also, the best tech talk I saw was given by Rick Houlihan on re:Invent [1]

I highly recommend that every engineer watch it: it's a great overview of SQL vs NoSQL

[0] https://newscatcherapi.com/blog/how-we-built-a-news-api-beta...

[1] https://www.youtube.com/watch?v=HaEPXoXVf2k


BTW Rick Houlihan left AWS recently to work for Mongo.

https://twitter.com/houlihan_rick/status/1472969503575265283

On that thread he criticizes AWS regarding DynamoDB openly.

> I will always love DynamoDB, but the fact is it is losing ground fast because AWS focuses most of their resources on the half baked #builtfornopurpose database strategy. I always hated that idea, I just bit my tongue instead of saying it.

> The problem is the other half-baked database services that all compete for the same business. DocumentDB, Keyspaces, Timestream, Neptune, etc. Databases take decades to optimize, the idea that you can pump them out like web apps is silly.

> I was very tired of explaining over and over again that DynamoDB is actually not the dumbed down Key-Value store that the marketing message implied. When AWS created 6 different NoSQL databases they had to make up reasons for each one and the messaging makes no sense.


Interesting. MongoDB actually came to mind while I was reading the other comment here:

> No one uses DynamoDB alone: they bolt it onto Postgres after realizing they have availability or scale needs beyond what a relational database can do, then they bolt on Elasticsearch to enable querying, and then they bolt on Redis to make the disjointed backend feel fast. And I'm just talking operational use cases; ignoring analytics here.

Perhaps MongoDB is prime for a comeback?


Not snark: did MongoDB ever go away?

I've seen it used in many places over the years.

Today I would choose JSON in Postgres before I would just jump to Monogo but it certainly serves a purpose for many shops and it is still widely used AFAIK.

I _really_ miss RethinkDB.


What do you miss about RethinkDB?


The table model with joins that mostly "just worked" and the sweet web ui that came with it by default.

Hat tip for Compass, very nice tool I was just using last week.

We use Atlas and it "just works" so no comment on administering mongo vs rethink haha.


In my circles Mongo has always been considered a bad database.

If Rick's vouching for it maybe it's time to give it a try. It must be pretty mature by now.


It had some operational quirks 10 years ago (allocating giant chunks of space was more of an issue than data loss) and I've not used it directly in that many years. We lost some data during an OOM process kill but it was just twitter firehose data so not a huge deal.

Lots of good info in the response in this SO post

https://stackoverflow.com/questions/10560834/to-what-extent-...


For anyone else expecting this to be a paper given the domain name, it's not. It's a non-technical interview with a couple of the original paper's authors. Not bad, just not as exciting as a paper detailing what they've learnt from a distributed systems perspective from operating Dynamo and then DynamoDB for so long would be.


We don't have a paper on DynamoDB's internals (yet?), but here's a talk you might find interesting from one of the folks who built and ran DDB for a long time: https://www.youtube.com/watch?v=yvBR71D0nAQ

And Doug Terry talking through the details of how DynamoDB's transaction protocol works: https://www.usenix.org/conference/fast19/presentation/terry

If we did publish more about the internals of DDB, what would you be looking to learn? Architecture? Operational experience? Developer experience? There's a lot of material we could share, and it's useful to hear where people would like us to focus.


All of it - architecture, operational experience, best practices etc.


Just want to second this. All of the above sounds really interesting to me!


https://brooker.co.za/blog/2022/01/19/predictability.html This might be something you are looking for.


To be honest, as a customer, it is hard for me to justify using DynamoDB. Some of this criticism can be out of date:

1. DynamoDB is not as convenient. There are a bit too many dials to turn.

2. DynamoDB does not have a SQL facade on top.

3. DynamoDB is proprietary, I believe there's no OSS API equivalent if you want to migrate out.

4. DynamoDB was kind of expensive. But it has been a while since I last checked the pricing page.

It's simply much better to start with PostgreSQL Aurora and move to more scalable storage based on specific use cases later. For example: Cassandra, Elastic, Druid, or CockroachDB.


I strongly agree that most early stage businesses should be on Postgres. There's simply too much churn in early stage data models. Also, unforeseen esoteric needs that you can knock out a SQL query for, instead of having to build a solution, jump out of the woodwork constantly. However, this does assume that your development team has a competent understanding of SQL.

I've been in a couple startups that went Dynamo first and development velocity was a pale shadow of velocity with Postgres. When one of those startups dumped dynamo for Postgres velocity multiplied immediately. I'd estimate we were moving at around 1000% and the complete transition took less time than even I expected (about a month). Once the business matures, moving tables onto dynamo and wrapping them in a microservice makes a lot of sense. Dynamo does solve a lot of problems that become increasingly material as the business evolves.

Eventually, SQL's presence declines and it transitions into an analytics system as narrower, but easier-to-operate, options proliferate.


> 2. DynamoDB does not have a SQL facade on top.

I think this is out of date https://aws.amazon.com/about-aws/whats-new/2020/11/you-now-c...
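
For the curious, roughly what that looks like with boto3 (a sketch only; the table and attribute names below are hypothetical). Under the hood it is still key-based access, so this only stays cheap if customer_id is the partition key; otherwise PartiQL falls back to a scan:

    import boto3

    ddb = boto3.client("dynamodb")

    # PartiQL statement executed against a hypothetical "orders" table
    resp = ddb.execute_statement(
        Statement='SELECT * FROM "orders" WHERE customer_id = ?',
        Parameters=[{"S": "cust#1234"}],
    )
    for item in resp["Items"]:
        print(item)  # items come back in the low-level AttributeValue format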


Re 3: A lot of people don't know about it, but there's an open source, free, DynamoDB-compatible database called ScyllaDB - the DynamoDB-compatible API is called Alternator, to be specific.


I definitely didn't know about it before. Thanks!


That's surprising to me; I consider DynamoDB to be far simpler than any relational DB, including the alternatives you list.


We landed on DynamoDB when we migrated a monolith to microservice architecture. I have to say that DynamoDB fits fairly well in the microservices world where the service scope is small, query patterns are pretty narrow and don't really change much. Building new things using DynamoDB when query patterns aren't necessarily known is very painful and requires tedious migration strategies unless you don't mind paying for GSIs.


Quite a few of the teams that were early adopters of AWS DynamoDB were not prepared for the pricing nuances that had to be taken into consideration when building their solutions.


I remember trying DynamoDB around 2015/2016: you had to specify your expected read and write throughput and you would be billed for that. At that time we had a pretty spiky traffic use case, which made using DynamoDB efficiently impossible.


I had a similar experience, but ultimately wrote a service to monitor our workloads and request increased provisioning during spikes. You could reduce your provisioning like 10 times a day, but after that you could only increase it and would be stuck with the higher rate for a time.

And then on-demand provisioning was released and it was cheap enough to be worth simplifying our workflows.
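
For anyone curious what that looked like, here's a rough boto3 sketch of both approaches (the table name and numbers are made up, not a recommendation):

    import boto3

    ddb = boto3.client("dynamodb")
    TABLE = "events"  # hypothetical table name

    def bump_provisioned_capacity(rcu: int, wcu: int) -> None:
        """The old workaround: watch load and raise provisioned capacity ahead
        of a spike. Decreases per day were limited, so you sat on the higher
        rate for a while after the spike passed."""
        ddb.update_table(
            TableName=TABLE,
            ProvisionedThroughput={"ReadCapacityUnits": rcu, "WriteCapacityUnits": wcu},
        )

    def switch_to_on_demand() -> None:
        """What made that machinery unnecessary for spiky traffic: flip the
        table to on-demand (pay-per-request) billing. Note that billing-mode
        switches are themselves rate-limited."""
        ddb.update_table(TableName=TABLE, BillingMode="PAY_PER_REQUEST")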


I was one of these. However I now understand that the pricing nuances reflected a reality that I appreciate. We used DDB in a way that was not the best fit and the cost was a reflection of this.


I wonder if DynamoDB would be met with less criticism had it simply been named Dynamo Document Store.


Indeed quite a journey. If you love DynamoDB and like open source, give Scylla a try (disclosure me==founder): https://www.scylladb.com/alternator/


If you follow Rick Houlihan (@houlihan_rick), then all the accolades AWS gets for DynamoDB pale in comparison to its current team and execution: the company seems to not be investing in it, so much so that Rick left to join MongoDB.


Man I love Rick's talks as much as anyone, but let's be real: he likely left AWS not for his love of first-class geographical indexes but because Mongo offered a giant pile of money for him to evangelize their tech. Though I have no doubt that he had a lot of reservations around Dynamo's DX before, and likely has some around MongoDB too, those won't be the bulk of his content.


At his rank at AWS I don’t know if money was such an issue. He strikes me as a person who cares deeply about the underlying tech. But I have no idea one way or the other.


I think I've seen you post something similar on r/aws about how Rick was "top DynamoDb person at AWS" (apologies if that wasn't you). I think you are overestimating Rick's "rank".

I just looked him up (I had not heard of him before seeing his name mentioned on r/aws a few days ago) and he was an L7 TPM/Practice Manager in AWS's sales organization. That's not really a notably high position, and in the grand scheme of Amazon pay scales, isn't that high up. An L7 TPM gets paid about the same as, or sometimes less than, an L6 software dev (L6 is "senior", which is ~5-10 years of experience).

Also, him being in the sales org means he had practically nothing to do with the engineering of the service. AWS Sales is a revolving door of people. I mean no offense towards Rick (again, I didn't know him or even know of him before I read his name in a comment a few days ago), but I would not read anything at all into the fact that an L7 Sales TPM left for another company.


Actually, I was a direct report to Colin Lazier (https://twitter.com/clazier) who is the GM for DynamoDB, Keyspaces, and Glue Elastic Views. I was the original TPM for DocumentDB before joining the Professional Services team as a Senior Practice Manager to head up the NoSQL Blackbelt team which led the architecture/design effort for Amazon's RDBMS->NoSQL migration. I was brought back to the service team by Jim Scharf to lead the technical solutions team for strategic accounts, but I maintained the org chart role of Senior Practice Manager until I left for MongoDB.

Compensation was a minor issue. I was an org chart aberration already and AWS pulled out all the stops to retain me. I will always appreciate the opportunity that AWS provided me and my time at DynamoDB will always hold a special place in my heart. I really do believe that MongoDB is poised to do great things and my decision had more to do with being a part of that than anything else.


Whoa straight from the source!


You never heard of Rick Houlihan? He is 90% of DynamoDB evangelism... At the same time, you are able to do these internal lookups? Do you work with DynamoDB?

AWS re:Invent 2018: Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401) https://youtu.be/HaEPXoXVf2k

AWS re:Invent 2019: [REPEAT 1] Amazon DynamoDB deep dive: Advanced design patterns (DAT403-R1) https://youtu.be/6yqfmXiZTlM

AWS re:Invent 2020: Amazon DynamoDB advanced design patterns – Part 1 https://youtu.be/MF9a1UNOAQo

AWS re:Invent 2020: Amazon DynamoDB advanced design patterns – Part 2 https://youtu.be/_KNrRdWD25M

AWS re:Invent 2021 - DynamoDB deep dive: Advanced design patterns https://youtu.be/xfxBhvGpoa0

Amazon DynamoDB | Office Hours with Rick Houlihan: Breaking down the design process for NoSQL applications https://www.twitch.tv/videos/761425806


Do you expect the engineers on your team to know the top sales person at your company?

This person might be responsible for the majority of evangelism and revenue for the company. Do you expect the SDEs to know about him?

Again, no shot against Rick - he is amazing, smart, technical, competent, and a deep owner.

But the average SDE on the team won't know about these or watch these talks. There are too many deep internal engineering challenges to solve.


Maybe that was the problem. He cited that there was seemingly not enough effort going into making DynamoDB better, as evidenced by the many closely overlapping other DBs that AWS promotes. If Rick had his ear to the ground listening to customers and sending back feedback, but it was falling on deaf ears, that's enough reason for someone as high up, influential, and productive as him to leave. It also speaks to inner AWS turmoil, at least at DynamoDB.


Based on what I know, that's not the case.

DDB is a steady ship. The comment at https://news.ycombinator.com/item?id=30009611 is likely the best explanation: L7 TPMs make the same money as L6 SDEs.

Getting promoted to L8 - director - is a monumental effort and likely seemed much harder than pursuing a comparable position at MongoDB.

Good for him for doing it, and for making Amazon take a long hard look at every way they failed in not keeping him.


>It also speaks to inner AWS turmoil at least at DynamoDB.

How? Rick wasn't part of the DynamoDB service team. He wasn't an engineer, nor a manager on the team, nor even a product manager. He was a salesperson who specialized in DDB. He most likely had very little interaction, if any, with the engineering team. I don't see how him leaving speaks at all to anything about the inner workings of the engineering teams.

Rick seems cool, and after skimming some of his chats he seems really knowledgeable about the customer-facing side of DDB, and I mean absolutely no disrespect to him. But I think you're making way too many assumptions about his "rank" and "influence" within the company.


I have watched almost all of those talks; they are technically dense and full of very useful knowledge that I would be much poorer for not having watched. These are not sales videos but highly complex instructional content meant for developers on the ground.


Are you calling the person who did the core DynamoDB Technical Deep Dive sessions at reInvent, for the last 4 years in a row, a sales person?


There are over a thousand breakout sessions at re:Invent every year. Some of the speakers are sales people, some are engineers, some are managers. There are L5 or junior engineers who give re:Invent session talks. It's a fun gig, but it doesn't mean that the speaker is some top executive or anything like that.

Rick was in the sales org. His primary job was sales. re:Invent is a sales conference. Speaking at re:Invent is a sales pitch. He was a salesperson. I'm not sure why you're so offended by that. Being a salesperson isn't bad, it's just an explanation for why engineers wouldn't have heard of him.


What do you think Solutions Architects and Developer Advocates (the two groups who between them do most re:Invent sessions) are?

Hell, what do you think re:Invent is? It's a sales conference.

In any company you have two groups of people: Those that build the product, and those that sell it. Ultimately, solutions architects and developer advocates are there to help sell the product.

Of course Amazon is customer obsessed. And genuinely interested in ensuring customers have a good experience, and their technical needs are met - through education, support, and architectural guidance. But ultimately, that's what it is.


I think I understand now why he left...


No, I haven't. There are thousands of re:Invent sessions every year. I don't watch them all (I hardly watch any of them, and most people I know at Amazon watch a couple of breakout sessions, if that. Some don't even watch the keynotes). Their target audience is AWS customers, not internal engineers. re:Invent itself is a sales conference. If internal Amazonians want to learn about something like DDB, there are internal talks and documents by the engineering leaders that we watch.

>At the same time, you are able to do these internal lookups?

I looked him up on LinkedIn. Nothing internal about it.


was not me at r/aws

Unless he posts here about it we can't really know -- we can only speculate -- but I think he had more influence than his title/rank might suggest. I think Rick's influence with respect to DynamoDB is akin to Kelsey Hightower's influence over k8s at Google.


I absolutely love DynamoDB. Would definitely recommend the DynamoDB Book by Alex Debrie and Advanced DynamoDB Patterns by Rick H on YouTube.

Once you understand how to properly use Dynamo and what it's good at, you get so much power, at a fraction of the cost depending on your workload.

It was a breeze to set up multi-region applications that utilize the single-table design strategy.

If you ever need complex queries just use dynamo streams to power whatever search solution you are comfortable with.
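
A minimal sketch of that streams-to-search pattern, assuming a Lambda subscribed to the table's stream with the NEW_IMAGE view; the search client here is a stand-in, not a real library:

    # Hypothetical search client -- swap in OpenSearch/Elasticsearch/Algolia/etc.
    from my_search_lib import SearchClient  # assumption, not a real package

    search = SearchClient(index="products")

    def handler(event, context):
        """Lambda handler attached to the table's DynamoDB stream."""
        for record in event["Records"]:
            doc_id = record["dynamodb"]["Keys"]["pk"]["S"]  # assumes a string key "pk"
            if record["eventName"] == "REMOVE":
                search.delete(doc_id)
            else:  # INSERT or MODIFY
                image = record["dynamodb"]["NewImage"]
                # Naive flattening of the AttributeValue format ({"S": "..."} etc.)
                doc = {attr: list(val.values())[0] for attr, val in image.items()}
                search.index(doc_id, doc)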


Can anyone recommend me a good paper (or other resource) describing the designs of DynamoDB and S3? Ideally something in the spirit of the original Dynamo paper (i.e. https://www.allthingsdistributed.com/files/amazon-dynamo-sos... )


This one is pretty good for DynamoDB: https://youtu.be/yvBR71D0nAQ


The only experience I've had with DynamoDB has been in AWS: we set up a testing DB with the defaults... then we left it there. Two months later we realized we'd lost about $1500. Our mistake was to use the setup defaults (which had some sort of auto-scaling). I hope AWS has corrected this in the meanwhile; we did let them know.


I haven't run into anyone who uses Dynamo for anything other than managing Terraform backend state locking. And I think that's still the best use case for it: you just want to store a couple random key-values somewhere and have more functionality than AWS Parameter Store. Trying to build anything large-scale with it will probably leave you wanting.
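
For reference, a boto3 sketch of that lock table: Terraform's S3 backend expects the partition key to be a string attribute named LockID, while the table name itself is up to you.

    import boto3

    ddb = boto3.client("dynamodb")

    ddb.create_table(
        TableName="terraform-locks",  # your choice; referenced from the backend config
        AttributeDefinitions=[{"AttributeName": "LockID", "AttributeType": "S"}],
        KeySchema=[{"AttributeName": "LockID", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST",  # on-demand keeps a lock table close to free
    )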


The in-the-trenches, technical, battle-tested/battle-scarred comments on this thread are why I come to HN for the comments.


> Customers no longer want to just store and query the data in their databases. They then want to analyze that data to create value

Presumably create more value. I know it's a marketing post but still - storing and retrieving things is pretty valuable in and of itself.


DynamoDB for me is the perfect database for my serverless / GraphQL API. My only gripe is the 25-item limit on transactions. I've had to resort to layering another transaction management system on top of it.
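
The naive workaround looks something like the sketch below (boto3, hypothetical items), and it also shows why a separate transaction manager becomes necessary: each chunk is atomic on its own, but atomicity across chunks is lost.

    import boto3

    ddb = boto3.client("dynamodb")
    MAX_TX_ITEMS = 25  # the per-transaction limit being discussed

    def write_in_chunks(tx_items: list) -> None:
        """Split writes into 25-item TransactWriteItems calls. Each call is
        atomic, but the batch as a whole is not -- hence the extra layer."""
        for i in range(0, len(tx_items), MAX_TX_ITEMS):
            ddb.transact_write_items(TransactItems=tx_items[i:i + MAX_TX_ITEMS])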


We use DynamoDB like a big hash table of S3 file locations: we look up these locations via a key (at the time, it sounded like a pretty good use case for it). I suppose we could have used some other managed Redis or Memcached thing, but being an AWS shop, it was, and is, pretty useful. I have to say, it's been pretty effortless to configure; read/write units are really the only thing we've had to configure (other than the base index). The rest of it has been easy. It has about 100 million entries that are read or written pretty quickly.


I remember talking to someone who was playing with AWS stuff for the first time and they had a similar architecture, using Dynamo for a lookup store. It still seems a bit odd to me though. It's been a long time since I've worked with the S3 API, so maybe it just doesn't support the same sort of thing, but wouldn't it be nicer to just query S3 with some key and get back either the path/URL to render a link, or the content itself? Why the Dynamo intermediary? (And on the other side, if you don't need to render a link to serve the content, why not use Dynamo as the actual document store and skip S3? Storage cost?)


You're not going to get O(1) lookup using the S3 API; the listing APIs are basically O(n). So we use DynamoDB to store the S3 URL.
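
Roughly, the lookup pattern being described (a boto3 sketch with hypothetical table and attribute names):

    import boto3

    ddb = boto3.client("dynamodb")
    s3 = boto3.client("s3")

    def fetch_blob(lookup_key: str) -> bytes:
        """Resolve a logical key to an S3 location via GetItem, then fetch it."""
        item = ddb.get_item(
            TableName="file-locations",
            Key={"pk": {"S": lookup_key}},
        )["Item"]
        obj = s3.get_object(Bucket=item["bucket"]["S"], Key=item["s3_key"]["S"])
        return obj["Body"].read()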


Moreover, storing a large binary file (in this case) in Dynamo is probably not the best use case for it; most likely we would have to convert to base64 in and out of it.


We tried to implement an application on DynamoDB about 2 years ago.

We really struggled with implementing ad-hoc queries/search. For example: select * from employees where name = X and city = Y.

Any improvements in DynamoDB that make it easier to implement such queries?


This will sound flippant, but that's not what Dynamo is for. If you want to do freeform relational queries like that then put it in a relational database.

Dynamo is primarily designed for high volume storage/querying on well understood data sets with a few query patterns. If you want to be able to query information on employees based on their name and city you'll need to build another index keyed on name and city (in practice Dynamo makes that reasonably simple by adding a secondary index).


Amazon has a perfect use case for this. You click on a product in the search results, that URL contains a UUID, that UUID is used to search Dynamo and returns an object that has all the information on the product, and from that you build the page.

If what you are trying to do looks more like "Give me all the customers that live in Cuba and have spent more than $10 and have green eyes", Dynamo isn't for you. You can query that way but after you put all the work in to get it up and running, you'd probably be better off with Postgres.


If that's one of 12 or fewer query patterns you need, I can write you a simple Dynamo table for it. Dynamo's limitation is that it can only support n different query patterns, and you have to hand-craft an index for each one (well, sometimes you can serve multiple from one index).


Alternatively, practice single table design: structure your table keys in such a way that they can represent all (or at least most) of the queries you need to run.

This is often easier said than done, but it can be far less expensive and more performant than adding an index for each search.
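
As a tiny illustration of what that can look like (boto3 resource API; entity and key names are made up): customers and their orders share one table, and the sort-key prefix lets a single Query serve several access patterns without extra indexes.

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("app")  # hypothetical single table

    table.put_item(Item={"pk": "CUST#42", "sk": "PROFILE", "name": "Ada"})
    table.put_item(Item={"pk": "CUST#42", "sk": "ORDER#2022-01-19#9001", "total": 35})

    # "All orders for customer 42, newest first" -- no additional index needed
    orders = table.query(
        KeyConditionExpression=Key("pk").eq("CUST#42") & Key("sk").begins_with("ORDER#"),
        ScanIndexForward=False,
    )["Items"]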


It's always great fun compounding new, manual 'indexes' when you discover you need another query.


That's not what DynamoDB is for. If you need to run queries like that, you should be using an RDBMS. DynamoDB should only really be used for use cases where the queries are known up-front. There are ways to design your data model in Dynamo so that you could actually run queries like that, but you would have had to do that work from day 1. You won't be able to retroactively support queries like that.


The most reliable way to build a system with DynamoDB is to plan queries upfront. Trying to use it like a SQL database and make ad-hoc queries won't work, because it's not a SQL DB.

Data should be stored in the fashion you wish for it to be read, and storing the same data in more than one configuration is acceptable.

Good resource: https://docs.aws.amazon.com/amazondynamodb/latest/developerg...


DynamoDB is not meant for ad-hoc query patterns; as others have said, plan your indexes around your access patterns.

However, so long as you add a global secondary index (GSI) with name, city as the key, you can certainly do such things. But be aware for large-scale solutions:

1. There's a limit of 20 GSIs per table. You can increase with a call to AWS support.

2. GSIs are updated asynchronously; read-after-write is not guaranteed, and there is no "consistent read" option on a GSI like there is with tables.

3. WCUs on GSIs should match (or surpass) the WCUs on the original table, else throughput limit exceeded exceptions will occur. So, 3 GSIs on a table means you pay 4x+ in WCU costs.

4. The keys of the GSI should be evenly distributed, just like the PK on a main table. If not, there is additional opportunity for hot partitions on write.

Ref: https://aws.amazon.com/premiumsupport/knowledge-center/dynam...
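
To make the caveats above concrete, here's a rough boto3 sketch of adding and then querying such a GSI (all names hypothetical; an on-demand table is assumed, since a provisioned one would also need a ProvisionedThroughput block per caveat 3, and the query assumes the index has finished backfilling):

    import boto3

    ddb = boto3.client("dynamodb")

    # Add a (name, city) GSI to an existing table
    ddb.update_table(
        TableName="employees",
        AttributeDefinitions=[
            {"AttributeName": "name", "AttributeType": "S"},
            {"AttributeName": "city", "AttributeType": "S"},
        ],
        GlobalSecondaryIndexUpdates=[{
            "Create": {
                "IndexName": "name-city-index",
                "KeySchema": [
                    {"AttributeName": "name", "KeyType": "HASH"},
                    {"AttributeName": "city", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }],
    )

    # The equivalent of `... WHERE name = X AND city = Y` (eventually consistent)
    resp = ddb.query(
        TableName="employees",
        IndexName="name-city-index",
        KeyConditionExpression="#n = :n AND city = :c",
        ExpressionAttributeNames={"#n": "name"},  # "name" is a reserved word
        ExpressionAttributeValues={":n": {"S": "X"}, ":c": {"S": "Y"}},
    )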


I struggled at first but I watched Advanced Design Patterns for DynamoDB[0] a few times and it clicked. As other responses have suggested, generally you define your access patterns first and then structure the data later to fit those access patterns.

[0]: https://www.youtube.com/watch?v=HaEPXoXVf2k


DynamoDB (and other Dynamo-like systems such as Cassandra and Bigtable) is just an advanced key/value store. These systems support multiple levels of keys->values, but fundamentally you need the key to find the associated value.

If you want to search by parameters that aren't keys then you need to store your data that way. Most of these systems have secondary indexes now, and that's basically what they do for you automatically in the backend, storing another copy of your records using a different key.

If you need adhoc relational queries then you should use a relational database.


> If you want to search by parameters that aren't keys then you need to store your data that way.

Not that I recommend it, but by using space-filling curves one could index multiple dimensions onto DynamoDB's two-dimensional (hash-key, range-key) primary index: https://aws.amazon.com/blogs/database/z-order-indexing-for-m... and https://web.archive.org/web/20220120151929/https://citeseerx...
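
For illustration, a minimal Morton/Z-order sketch of the bit interleaving those links describe (the bit width and the even/odd convention here are arbitrary choices):

    def z_order_key(x: int, y: int, bits: int = 32) -> int:
        """Interleave the bits of two non-negative ints into one Z-order value.
        Stored as a numeric range key, points close in 2-D space tend to land
        close together in the 1-D key space, so a bounding box can be
        approximated with a handful of range-key scans."""
        z = 0
        for i in range(bits):
            z |= ((x >> i) & 1) << (2 * i)       # x takes the even bit positions
            z |= ((y >> i) & 1) << (2 * i + 1)   # y takes the odd bit positions
        return z

    # e.g. z_order_key(3, 5) == 0b100111 == 39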


With DynamoDB, you can now execute SQL queries using PartiQL:

https://docs.aws.amazon.com/amazondynamodb/latest/developerg...


Note that this is just new syntax for the existing querying capabilities. If you query on something that's not in the hash/sort key, you still need to filter on the "client" side, deal with the 1 MB result-set limit per request, etc.


I'm really happy that Cosmos DB has this - https://docs.microsoft.com/en-us/azure/cosmos-db/sql/sql-que...

I haven't used DynamoDB in a couple of years, so I'd be curious to know how querying compares, if anyone who has used both Cosmos and Dynamo recently can shed some light.


DynamoDB is one of my favorite AWS products.


knocking off mongo... 10 years later they still haven't caught up.



