
What basis do you have for saying that? Their DR was likely running on a mirror of their production systems and was similarly impacted by the CrowdStrike outage, so they fell back to Windows servers stuck in the same boot loop.

Keep in mind there was no way to opt out of or delay CS Channel updates.



If your DR system is susceptible to the same faults as your main system, it's not a DR system.

It would be like claiming RAID 1 is a backup.


Or it would be like claiming my backup isn't a backup because both systems run OpenSSH, so a remote code execution vuln there could take down both.

Any DR system has to accept some risks, and those don't necessarily invalidate it in general; they just make it insufficient for some scenarios.

Conversely, if they ran the main system on Windows with CrowdStrike and the DR one on poorly configured Linux with no security software, they probably would have needed more sysadmins, had more trouble maintaining software for both, and been exposed to bugs in both Linux and Windows. So I feel they made the right tradeoff in general.

I’m sure you, who can deride this DR system, have devised your own system such that it is resilient to a meteor destroying the earth.


> I’m sure you, who can deride this DR system, have devised your own system such that it is resilient to a meteor destroying the earth.

That reminds me of one of Corey Quinn's comfortable AWS truths.

https://x.com/QuinnyPig/status/1173371749808783360

> If your DR plan assumes us-east-1 dies unrecoverably, what you're really planning for is 100 square miles of Northern Virginia no longer existing. Good luck with that ad farm in a nuclear wasteland, buddy!


As HN itself discovered a couple of years ago, when a set of same-manufacturer, same-batch disks in both its RAID arrays and its backup server failed within a few hours of one another:

<https://news.ycombinator.com/item?id=32048148>

<https://news.ycombinator.com/item?id=32031243>


One idea: build a DR system and turn it off. Ideally it would be cloneable, but even without that ability, one could test it every few months to make sure it boots quickly enough and then turn it back off. The attack surface of a bunch of computers or instances that are powered down is pretty low.
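
To make that concrete, here's a rough sketch of what such a periodic boot test could look like, assuming the DR fleet is a set of stopped EC2 instances managed with boto3. The instance IDs, health-check URL, and boot-time budget are all placeholders, not anyone's actual setup:

  import time
  import urllib.request

  import boto3

  # Hypothetical DR fleet and health endpoint -- placeholders, not a real setup.
  DR_INSTANCE_IDS = ["i-0123456789abcdef0"]
  HEALTH_URL = "https://dr.example.internal/healthz"
  BOOT_BUDGET_SECONDS = 15 * 60  # how fast DR must come up to count as a pass

  ec2 = boto3.client("ec2")

  def run_dr_boot_drill() -> bool:
      started = time.monotonic()
      ec2.start_instances(InstanceIds=DR_INSTANCE_IDS)
      ec2.get_waiter("instance_running").wait(InstanceIds=DR_INSTANCE_IDS)

      # Poll the app's health endpoint until it answers or the budget runs out.
      healthy = False
      while time.monotonic() - started < BOOT_BUDGET_SECONDS:
          try:
              with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
                  if resp.status == 200:
                      healthy = True
                      break
          except OSError:
              time.sleep(15)

      # Power the fleet back down either way; the point is to stay dark between drills.
      ec2.stop_instances(InstanceIds=DR_INSTANCE_IDS)
      return healthy

  if __name__ == "__main__":
      print("DR drill passed" if run_dr_boot_drill() else "DR drill FAILED")

Run it from a scheduler every few months; a failure means the cold DR copy has drifted and needs attention before you actually need it.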


Better yet, alternate between them every month or two.
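
For what it's worth, a scheduled rotation could be as small as a weighted-DNS flip. A rough sketch assuming Route 53 and boto3; the zone ID, record name, and endpoint addresses are made up:

  import boto3

  # Hypothetical zone, record, and endpoints -- placeholders only.
  ZONE_ID = "Z0000000000000000000"
  RECORD_NAME = "app.example.com."
  SIDES = {"primary": "192.0.2.10", "dr": "192.0.2.20"}

  route53 = boto3.client("route53")

  def make_active(active_side: str) -> None:
      """Give the chosen side weight 100 and the other side weight 0."""
      changes = []
      for side, ip in SIDES.items():
          changes.append({
              "Action": "UPSERT",
              "ResourceRecordSet": {
                  "Name": RECORD_NAME,
                  "Type": "A",
                  "SetIdentifier": side,
                  "Weight": 100 if side == active_side else 0,
                  "TTL": 60,
                  "ResourceRecords": [{"Value": ip}],
              },
          })
      route53.change_resource_record_sets(
          HostedZoneId=ZONE_ID,
          ChangeBatch={"Comment": "scheduled prod/DR rotation", "Changes": changes},
      )

  # Run from a scheduler, e.g. make_active("dr") one month, make_active("primary") the next.

Actually carrying real traffic on the other side is the only test that proves the DR copy can run the business, not just boot.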


> Keep in mind there was no way to opt out of or delay CS Channel updates.

Do CS updates somehow work over air gaps? You know, the kind that production systems have to prevent any access to or from external networks? Well... some production systems, anyway.


What's your point? An air-gapped disaster recovery system would be useless. An airline operations application has to connect to a bunch of other external systems to be of any use.



