
No, I'm not surprised either. But if you're operating at this kind of scale and with this level of immediate roll-out, what I would expect are:

* A staggered roll-out, where updated machines check in with metrics that say "this new version is OK" (aka "canary deployment"), and the update is paused or rolled back if they don't.

* Basic smoke testing of the files before they're pushed to any customers

* Validation that the file is OK before accepting an update (via a checksum or whatever, matched against the "this update works" automated test checksums)

* Fuzz tests confirming that broken files don't brick the machine
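To make the first bullet concrete, here is a rough sketch of a staggered roll-out loop. Everything in it is hypothetical: `staged_rollout`, the `push` / `is_healthy` / `rollback` callbacks, and the stage fractions are my own placeholders, not anything the vendor is known to use. A real system would also wait for metrics to settle before judging a batch.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    push: Callable[[str], None],          # hypothetical: deliver the update to one host
    is_healthy: Callable[[str], bool],    # hypothetical: did the host check in OK?
    rollback: Callable[[str], None],      # hypothetical: restore the previous version
    stages: Sequence[float] = (0.01, 0.1, 0.5, 1.0),  # assumed cumulative fractions
    max_failure_rate: float = 0.0,
) -> bool:
    """Push to growing fractions of the fleet; roll back everything on a bad batch."""
    updated = []
    for fraction in stages:
        # Cumulative number of hosts that should be updated after this stage.
        target = max(1, int(len(hosts) * fraction))
        batch = hosts[len(updated):target]
        for host in batch:
            push(host)
            updated.append(host)
        failures = sum(1 for host in batch if not is_healthy(host))
        if batch and failures / len(batch) > max_failure_rate:
            # Stop the roll-out and undo every host touched so far.
            for host in updated:
                rollback(host)
            return False
    return True
```

With 100 hosts and the default stages, a bad update that fails its health check reaches one machine and is rolled back there, instead of reaching all 100.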

Literally any of the above would have saved millions and millions of dollars today.
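The checksum idea is similarly cheap. A minimal sketch, assuming the test pipeline publishes the SHA-256 digests of the files that passed (the `accept_update` function and the `tested_digests` list are hypothetical names of mine, and a real client would also want that list signed):

```python
import hashlib
from typing import Iterable


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the raw update file."""
    return hashlib.sha256(data).hexdigest()


def accept_update(payload: bytes, tested_digests: Iterable[str]) -> bool:
    """Accept the file only if its digest matches one that passed automated tests."""
    return sha256_hex(payload) in set(tested_digests)
```

A file that was truncated or corrupted after testing produces a different digest and is rejected before it is ever loaded.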


