
It's wild to me that we gave up hardware error correction on memory at the same time that we increased memory sizes about 1000x and shrank the die (and with it, reliability) by a roughly similar amount.


This is true, but even today bit flips per GB per hour are still really low.

However, failures somewhere along the memory chip -> chip pin -> DIMM -> DIMM connector -> motherboard -> CPU socket -> CPU pin -> CPU path are pretty common. Sure, ECC helps with random bit flips, but it's also very useful for diagnosing that something is broken in the CPU <-> memory chip pipeline. It's very frustrating to debug a problem when its main symptom is a reboot.
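To make the diagnostic point concrete, here is a rough sketch of how you might check whether ECC has been quietly catching errors. It assumes a Linux machine with ECC memory and an EDAC driver loaded, where the kernel exposes per-memory-controller counters as `ce_count` (corrected errors) and `ue_count` (uncorrected errors) under `/sys/devices/system/edac/mc/`. The function name `read_edac_counts` and its `root` parameter are my own, purely for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}} for each mc* dir.

    `root` defaults to the usual Linux EDAC sysfs location (an assumption
    about the host); it is a parameter so other paths can be substituted.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        # No EDAC driver loaded, non-ECC memory, or not Linux.
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for key, name in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # Counter missing or unreadable on this controller.
                entry[key] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for mc, c in read_edac_counts().items():
        print(f"{mc}: corrected={c['ce']} uncorrected={c['ue']}")
```

A corrected-error count that keeps climbing on one controller is the kind of early hint you never get without ECC: it points at a specific part of the memory path before the only visible symptom is a reboot.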



