Right now, it is basically impossible to reliably build full applications with t...

cbsmith · 2025-09-09T01:27:50 1757381270

I've built several DynamoDB apps, and while you might have some expectations of internal behaviour, you can build apps that are pretty resilient to change of the internal behaviour but rely heavily on the documented behaviour. I actually find the extent of the opacity a helpful guide on the limitations of the service.

catlifeonmars · 2025-09-09T05:51:39 1757397099

Agree. TTL 48h SLA comes to mind.

JustExAWS · 2025-09-08T22:21:31 1757370091

I am also a former AWS employee. What non-public information did you need for DDB?

tracker1 · 2025-09-08T23:39:17 1757374757

Try ingesting a complete WHOIS dump into DDB sometime. This was before autoscaling worked at all when I tried... but it absolutely wasn't anything one can consider fun.

In the end, after multiple implementations, we finally had to use a Java Spring app on a server with a LOT of RAM just to buffer the CSV reads without blowing up on the pushback from DDB. I think the company spent over $20k over a couple of months on different efforts in a few different languages (C#/.NET, Node.js, Java) across a few different routes (multiple queues, Lambda, etc.) just to get the initial data ingestion working a first time.

The Node.js implementation was fastest, but it would always blow up a few days in, and we could never catch the failure with a debugger attached. The queue and Lambda experiments had throttling issues similar to the DynamoDB ingestion itself, even with the knobs turned all the way up. I don't recall what the issue with the .NET implementation was at the time, but it blew up differently.

I don't recall all the details, and tbh I shouldn't care, but it would have been nice if there had been some extra guidance at the time on getting a few GB of CSV into DynamoDB. To this day, I still hate ETL work.
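For illustration only, here is a minimal sketch of the general pattern for ingesting without a huge in-memory buffer: stream the CSV, send batches of at most 25 items (the BatchWriteItem limit), and retry whatever comes back unprocessed with exponential backoff and jitter. It is not what any of the implementations above did. `write_batch` is a hypothetical stand-in for the real client call, assumed to return the items that were not written; the fake writer, column names, and retry settings are made up for the demo.

```python
import csv
import io
import random
import time
from itertools import islice

BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def batches(rows, size=BATCH_SIZE):
    """Yield lists of up to `size` rows without loading the whole input."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def ingest(rows, write_batch, max_retries=8, base_delay=0.05, sleep=time.sleep):
    """Write rows in batches, retrying whatever the store pushes back.

    `write_batch` is a hypothetical callable: it takes a list of items and
    returns the items that were NOT written (for example, throttled ones).
    """
    written = 0
    for chunk in batches(rows):
        pending = chunk
        for attempt in range(max_retries + 1):
            unprocessed = write_batch(pending)
            written += len(pending) - len(unprocessed)
            if not unprocessed:
                break
            pending = unprocessed
            # Exponential backoff with full jitter before the next attempt.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
        else:
            raise RuntimeError(f"gave up with {len(pending)} items unprocessed")
    return written


# Demo with a fake writer that "throttles" by accepting only 10 items per call.
store = []


def fake_write(items):
    store.extend(items[:10])
    return items[10:]


sample = "domain,tld\n" + "".join(f"d{i},com\n" for i in range(60))
rows = csv.DictReader(io.StringIO(sample))  # streamed row by row, not buffered
written = ingest(rows, fake_write, sleep=lambda seconds: None)
```

Because the reader is consumed lazily, memory use stays around one batch at a time; the backoff is what absorbs the pushback instead of a large buffer.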

JustExAWS · 2025-09-09T00:01:49 1757376109

https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

tracker1 · 2025-09-09T00:10:59 1757376659

Cool... though that would make it difficult to get the hundred or so CSVs into a single table, since that isn't supported. I guess stitching them together before processing would be easy enough. Also, no idea when that feature became available.

JustExAWS · 2025-09-09T01:08:10 1757380090

It’s never been a good idea to batch ingest a lot of little single files using any ETL process on AWS, whether it be DDB, Aurora MySQL/Postgres using “load data from S3…”, Redshift batch import from S3, or just using Athena (yeah I’ve done all of them).

tracker1 · 2025-09-10T15:57:57 1757519877

These weren't "little" single files... just separated by tld iirc.

everfrustrated · 2025-09-09T08:52:47 1757407967

Why would you expect an OLTP db like DDB to work for ETL? You'd have the same problems if you used Postgres.

It's not like AWS is short on ETL technologies to use...

scarface_74 · 2025-09-09T16:02:29 1757433749

Even in an OLTP db, there is often a need to bulk import and export data. AWS has methods for most supported data stores (Elasticsearch, DDB, MySQL, Aurora, Redshift, etc.) to bulk insert from S3.

cyberax · 2025-09-09T05:24:47 1757395487

A tool to look at hot partitions, for one thing.

JustExAWS · 2025-09-09T14:29:28 1757428168

It should handle that automatically.

https://aws.amazon.com/blogs/database/part-2-scaling-dynamod...

cyberax · 2025-09-09T18:47:51 1757443671

The keyword here is "should" :) Back then DynamoDB also had a problem with scaling: the data can be easily split into partitions, but it's never merged back into fewer partitions.

So if you scaled up and then down, you might have ended up with a lot of partitions that each got only a small IOPS quota. It's better now with burst IOPS, but it is still a problem sometimes.
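To put rough numbers on that dilution, a back-of-the-envelope sketch. It assumes provisioned throughput is divided evenly across partitions and ignores burst capacity; the capacity and partition counts are hypothetical.

```python
def per_partition_capacity(table_capacity, partitions):
    """Throughput each partition gets under an even split (an assumption)."""
    return table_capacity / partitions


# Hypothetical table: 1,000 units provisioned across 4 partitions.
before = per_partition_capacity(1_000, 4)

# Scale up far enough to split into 40 partitions, then scale back down
# to 1,000 units. The partitions are never merged, so all 40 remain.
after = per_partition_capacity(1_000, 40)
```

Same table-level setting, but each partition ends up with a tenth of what it had before, so a single hot key hits its ceiling much sooner.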

mannyv · 2025-09-09T04:56:21 1757393781

Totally incorrect for Dynamo.

It was probably correct for Cognito 1.0.