

jiggawatts on July 9, 2021 | on: I made a mistake with Terraform and Azure made it ...

Over decades I've slowly convinced myself that the best approach is to split production into groups and do rolling updates. One of the smoothest-run environments I've ever worked with had ten silos. A failed update would not impact more than 10% of the users, and we could quickly redirect them to the remaining 90% of the platform without a material performance impact.
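A minimal sketch of the silo idea described above, not the commenter's actual setup: the silo names, the `apply_update` callback, and the modulo-based `route` function are all hypothetical, chosen only to illustrate updating one silo at a time and redirecting the affected users when an update fails.

```python
from typing import Callable, Dict, List


def rolling_update(
    silos: List[str],
    apply_update: Callable[[str], bool],
) -> Dict[str, List[str]]:
    """Update silos one at a time and stop at the first failure.

    Returns which silos were updated, which one failed (if any), and
    which were left untouched on the old version.
    """
    updated: List[str] = []
    failed: List[str] = []
    for index, silo in enumerate(silos):
        if apply_update(silo):
            updated.append(silo)
            continue
        # First failure: halt the rollout so only one silo is affected.
        failed.append(silo)
        return {"updated": updated, "failed": failed, "untouched": silos[index + 1:]}
    return {"updated": updated, "failed": failed, "untouched": []}


def route(user_id: int, silos: List[str], unhealthy: List[str]) -> str:
    """Map a user to a home silo, falling back to a healthy one if needed."""
    healthy = [s for s in silos if s not in unhealthy]
    if not healthy:
        raise RuntimeError("no healthy silos left")
    home = silos[user_id % len(silos)]
    if home in healthy:
        return home
    # Spread displaced users across the surviving silos.
    return healthy[user_id % len(healthy)]


if __name__ == "__main__":
    silos = [f"silo-{i}" for i in range(10)]
    # Hypothetical update step that fails on silo-3.
    result = rolling_update(silos, lambda s: s != "silo-3")
    print(result["failed"], "failed;", len(result["untouched"]), "silos untouched")
    print("user 3 now served by", route(3, silos, result["failed"]))
```

With ten silos, a failed update touches only the users homed on one silo, and the fallback spreads them over the other nine rather than piling them onto a single neighbour. A real deployment would also need health checks and a rollback step, which this sketch leaves out.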

corty on July 9, 2021

That is definitely an option, but its viability depends on what your environment is doing: "Sorry, we lost 10% of your bank transfers due to an update" just won't be acceptable, but "we delivered 10% fewer catpics today" might be.

And I wonder how many platforms can be made to work like that.
