
Was going to point out the same thing - the original article's solution loses timestamps and possibly ordering. They're also losing some compressibility by converting to a structured format (JSON). And if they actually include a lot of UUIDs (their diagram is vague about what transaction IDs look like), then good luck - random UUIDs don't compress well.
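
To make that concrete, here's a minimal sketch (assuming the third-party python-zstandard bindings; the numbers are illustrative, not from the article):

    # Sketch: random UUIDs are mostly incompressible entropy.
    # Assumes the third-party python-zstandard package (pip install zstandard).
    import uuid
    import zstandard

    # A log-like payload: one random UUIDv4 per line.
    data = "\n".join(str(uuid.uuid4()) for _ in range(100_000)).encode()

    compressed = zstandard.ZstdCompressor(level=3).compress(data)
    print(f"{len(data)} -> {len(compressed)} bytes "
          f"({len(data) / len(compressed):.2f}:1)")
    # The hex encoding itself compresses (16 bytes of entropy stored as
    # 36 chars), but you won't get much past ~2:1 on the random bits.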

I worked at a Magnificent 7 company that compressed a lot of logs; after a lot of testing back in 2021, we found zstd did the best all-around job.



We have a process monitor that basically polls ps output and writes it to JSON. We see ~30:1 compression using zstd on a ZFS dataset that stores these logs.
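
The shape of the thing is roughly this (not our actual code, just a sketch; the procmon.jsonl path is made up). The ~30:1 comes from ZFS transparently compressing the dataset (compression=zstd), since consecutive snapshots are nearly identical:

    # Sketch of the pattern: poll ps, append one JSON snapshot per line.
    import json
    import subprocess
    import time

    def snapshot():
        out = subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout
        header, *rows = out.splitlines()
        cols = header.split()
        # Split each row into at most len(cols) fields so the trailing
        # COMMAND field keeps its internal spaces.
        return [dict(zip(cols, r.split(None, len(cols) - 1))) for r in rows]

    with open("procmon.jsonl", "a") as f:
        while True:
            f.write(json.dumps({"ts": time.time(), "procs": snapshot()}) + "\n")
            f.flush()
            time.sleep(10)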

I laugh every time I see it.


Agreed.

If you use something like sequential IDs (even in a UUID format), they can compress pretty well.


As a member of the UUIDv7 cheering squad let me say 'rah rah'! :D
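
To back that up: v7 puts a 48-bit millisecond timestamp up front, so consecutive IDs share a prefix that zstd can actually match on. Rough sketch below - hand-rolled per RFC 9562, since uuid.uuid7() only landed in very recent Python, and again assuming python-zstandard:

    import os
    import time
    import uuid
    import zstandard

    def uuid7() -> uuid.UUID:
        ts_ms = time.time_ns() // 1_000_000            # 48-bit ms timestamp
        rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
        val = (ts_ms & (2**48 - 1)) << 80 | rand
        val &= ~(0xF << 76); val |= 0x7 << 76          # version = 7
        val &= ~(0x3 << 62); val |= 0x2 << 62          # RFC 4122 variant
        return uuid.UUID(int=val)

    def ratio(ids):
        raw = "\n".join(ids).encode()
        return len(raw) / len(zstandard.ZstdCompressor().compress(raw))

    print(f"v4: {ratio([str(uuid.uuid4()) for _ in range(100_000)]):.2f}:1")
    print(f"v7: {ratio([str(uuid7()) for _ in range(100_000)]):.2f}:1")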


Which zstd compression level worked best in terms of the balance between compression ratio and run time?
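
To make the question concrete, here's the kind of micro-benchmark I'd run on my own data (a sketch, assuming python-zstandard; sample.log stands in for the real corpus):

    import time
    import zstandard

    data = open("sample.log", "rb").read()

    for level in (1, 3, 6, 9, 12, 19):
        cctx = zstandard.ZstdCompressor(level=level)
        start = time.perf_counter()
        compressed = cctx.compress(data)
        elapsed = time.perf_counter() - start
        print(f"level {level:2d}: {len(data) / len(compressed):5.2f}:1 "
              f"in {elapsed:.3f}s")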



