
We had major success by simply batching incoming writes into ~15-second chunks, writing each chunk as a file to S3, and keeping an index that tracks how the files are split/chunked so read performance stays decent.

This alone gave us an insanely scalable system (load tested at 100GB/day, ~100M records/day) for a grand total cost of about $10/day for everything: server, disk, and S3. https://youtu.be/x_WqBuEA7s8

Works great for timeseries data: super scalable, simple, no devops or database servers to manage, and it just works.
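
Roughly, the write path is something like the minimal Python sketch below (not our actual code; the bucket name, key layout, and the single-object index are illustrative assumptions):

    import json
    import time
    import uuid

    import boto3

    BUCKET = "my-timeseries-bucket"   # hypothetical bucket name
    FLUSH_INTERVAL_SECONDS = 15

    s3 = boto3.client("s3")

    class ChunkWriter:
        """Buffers incoming records and flushes them to S3 as ~15-second
        chunks, recording each chunk's key and time range in an index."""

        def __init__(self):
            self.buffer = []
            self.last_flush = time.time()

        def write(self, record: dict) -> None:
            self.buffer.append(record)
            if time.time() - self.last_flush >= FLUSH_INTERVAL_SECONDS:
                self.flush()

        def flush(self) -> None:
            if not self.buffer:
                return
            start_ts = self.buffer[0]["ts"]
            end_ts = self.buffer[-1]["ts"]
            key = f"chunks/{start_ts}-{uuid.uuid4().hex}.jsonl"

            # One S3 PUT per ~15s chunk keeps request costs tiny even at
            # ~100M records/day.
            body = "\n".join(json.dumps(r) for r in self.buffer)
            s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode())

            # Record the chunk's key and time range so reads can fetch only
            # the chunks overlapping a query's time window.
            self._append_index({"key": key, "start": start_ts, "end": end_ts})

            self.buffer = []
            self.last_flush = time.time()

        def _append_index(self, entry: dict) -> None:
            # Naive read-modify-write of one index object; a real system
            # needs something safer under concurrent writers.
            try:
                obj = s3.get_object(Bucket=BUCKET, Key="chunks/index.json")
                index = json.loads(obj["Body"].read())
            except s3.exceptions.NoSuchKey:
                index = []
            index.append(entry)
            s3.put_object(Bucket=BUCKET, Key="chunks/index.json",
                          Body=json.dumps(index).encode())

The key design point is that the index stays small (one entry per chunk, not per record), so a query only needs one index read plus a handful of chunk GETs.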

I believe the Discord guys also did something similar and wrote some good engineering articles on it; give it a Google as well.




Storing and retrieving data has never been all that hard. The challenge is getting user-interactive performance on complex queries against the data: comparing, correlating, deriving, integrating, and lots of other analysis. For many "scaled" systems, 100M records/minute isn't uncommon... and while that's very likely possible with your design, the question of economic feasibility comes in. Solving these problems at scale with good economics is the playground of TSDB vendors today.



