Perhaps - but if I made a list of all of the things your company should be doing and didn't, or even things that your side project should be doing and didn't, or even things in your personal life that you should be doing and haven't, I'm sure it would be very long.
> all of the things your company should be doing and didn't
Processes need to match the potential risk.
If your company is doing some inconsequential social app or whatever, then sure, go ahead and move fast and break things if that's how you roll.
If you are a company, let's call them CrowdStrike, that has access to push root-privileged code to a significant percentage of all machines on the internet, the minimum quality bar is vastly higher.
For this type of code, I would expect a comprehensive test suite that covers everything and a fleet of QA machines representing every possible combination of supported hardware and software (yes, possibly thousands of machines). A build has to pass that and then get rolled into dogfooding usage internally for a while. And then very slowly gets pushed to customers, with monitoring that nothing seems to be regressing.
Anything short of that is highly irresponsible given the access and risk the CrowdStrike code represents.
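To make the "QA fleet, then dogfood, then slow customer rollout with monitoring" idea concrete, here is a minimal sketch of a gated rollout loop. Everything in it is an assumption for illustration: the stage names, the fleet fractions, the crash-rate thresholds, and the `observe_crash_rate` callback are hypothetical, not anything CrowdStrike is known to use.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    fraction: float        # share of the customer fleet receiving the build
    max_crash_rate: float  # hypothetical abort threshold for this stage


# Hypothetical stages and numbers, chosen only to illustrate the shape.
STAGES = [
    Stage("qa_fleet", 0.0, 0.0),
    Stage("internal_dogfood", 0.0, 0.0),
    Stage("canary_customers", 0.01, 0.001),
    Stage("broad", 0.25, 0.001),
    Stage("everyone", 1.0, 0.001),
]


def run_rollout(stages, observe_crash_rate):
    """Advance one stage at a time and halt at the first regression.

    observe_crash_rate(stage) is a caller-supplied (hypothetical) monitoring
    hook returning the crash rate seen while the build ran on that stage.
    Returns (names of completed stages, name of the halting stage or None).
    """
    completed = []
    for stage in stages:
        if observe_crash_rate(stage) > stage.max_crash_rate:
            return completed, stage.name  # stop before reaching anyone else
        completed.append(stage.name)
    return completed, None
```

The point of the structure is that a build which crashes machines in `internal_dogfood` never reaches a single customer, whatever the rest of the pipeline looks like.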
> A build has to pass that and then get rolled into dogfooding usage internally for a while. And then very slowly gets pushed to customers, with monitoring that nothing seems to be regressing.
That doesn't work in the business they're in. They need to roll out definition updates quickly. Their clients won't be happy if they get compromised while CrowdStrike was still doing the dogfooding or phased rollout of the update that would've prevented it.
> That doesn't work in the business they're in. They need to roll out definition updates quickly.
Well clearly we have incontrovertible evidence now (if it was needed) that YOLO-pushing insufficiently tested updates to everyone at once does not work either.
This is being called in many places (rightfully) the largest IT outage in history. How many billions will it cost? How many people died?
A company deploying kernel-mode code that can render huge numbers of machines unusable should have done better. It's one of those "you had one job" kind of situations.
They would be a gigantic target for malware. Imagine pwning a CDN to pwn millions of client computers. The CDN being malicious would be a major threat.
Oh, they have one job for sure. Selling compliance. All else isn't their job, including actual security.
Antiviruses are security cosplay that works by using a combination of bug-riddled custom kernel drivers and unsandboxed C++ parsers running with the highest level of privileges to tamper with every bit of data they can get their hands on. They violate every piece of security common sense. They also won't even hesitate to disable or delay rollouts of actual security mechanisms built into browsers and OSes if those get in the way.
The software industry needs to call out this scam and put them out of business sooner rather than later. This has been the case for at least a decade or two and it's sad that nothing has changed.
> Nope, I have seen software like Crowdstrike, S1, Huntress and Defender E5 stop active ransomware attacks.
Yes, occasionally they do. This is not an either-or situation.
While they do catch and stop attacks, it is also true that CrowdStrike and its ilk are root-level backdoors into the system that bypass all protections and thus will cause problems sometimes.
That anecdote doesn't justify installing gaping security holes into the kernel with those tools. Actual security requires knowledge, good practice, and good engineering. Antiviruses can never be a substitute.
You seem security-wise, so surely you can understand that in some (many?) cases, antivirus is totally acceptable given the threat model. If you want to keep the script kiddies from metasploiting your ordinary finance employees, it's certainly worth the tradeoff for some organizations, no? It's but one tool with its tradeoffs like any tool.
That's like pointing at the occasional petty theft and mugging, and using it to justify establishing an extraordinary secret police to run the entire country. It's stupid, and if you do it anyway, it's obvious you had other reasons.
Antivirus software is almost universally malware. Enterprise endpoint "protection" software like CrowdStrike is worse: it's aggressive malware and a backdoor controlled by a third party, whose main selling points are compliance and surveillance. Installing it is a lot like outsourcing your secret police to a consulting company. No surprise, everything looks pretty early on, but two weeks in, smart consultants rotate out to bring in new customers, and bad ones rotate in to run the show.
Yeah, that's definitely a good tradeoff against script kiddies metasploiting your ordinary finance employees. Wonder if it'll look as good when loss of life caused by CrowdStrike this weekend gets tallied up.
The failure mode here was a page fault due to an invalid definition file. That (likely) means the definition file was being used as-is without any validation, and pointers were being dereferenced based on that non-validated definition file. That means this software is likely vulnerable to some kind of kernel-level RCE through its definition files, and is (clearly) 100% vulnerable to DoS attacks through invalid definition files. Who knows how long this has been the case.
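For illustration, this is roughly what "validate before you dereference" looks like. It is a sketch against a made-up format, not the real definition file layout: the `DEFS` magic, the header, and the (offset, length) entry table are all assumptions, and Python stands in for what would be kernel-mode C or C++ in practice.

```python
import struct

MAGIC = b"DEFS"                 # hypothetical magic bytes
HEADER = struct.Struct("<4sI")  # magic, entry count
ENTRY = struct.Struct("<II")    # offset, length of one record


class InvalidDefinition(ValueError):
    """Raised when the file fails any structural check."""


def parse_definitions(blob):
    """Return the records in blob, checking every offset before using it."""
    if len(blob) < HEADER.size:
        raise InvalidDefinition("truncated header")
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise InvalidDefinition("bad magic")
    table_end = HEADER.size + count * ENTRY.size
    if table_end > len(blob):
        raise InvalidDefinition("entry table exceeds file size")
    records = []
    for i in range(count):
        offset, length = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        # The offset is untrusted input: it must land after the table
        # and the whole record must fit inside the file.
        if offset < table_end or offset + length > len(blob):
            raise InvalidDefinition(f"entry {i} points outside the file")
        records.append(blob[offset:offset + length])
    return records
```

A malformed file then produces an `InvalidDefinition` error that the caller can handle, instead of a wild pointer. In C the `offset + length` check would also need to guard against integer overflow, which Python's arbitrary-precision integers hide here.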
This isn’t a matter of “either your system is protected all the time, even if that means it’s down, or your system will remain up but might be unprotected.” It’s “your system is vulnerable to kernel-level exploits because of your AV software’s inability to validate definition files.”
The failure mode here should absolutely not be to soft-brick the machine. You can have either of your choices configurable by the sysadmin; definition file fails to validate? No problem, the endpoint has its network access blocked until the problem can be resolved. Or, it can revert to a known-good definition, if that’s within the organization’s risk tolerance.
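The two sysadmin-configurable behaviours described above could be sketched like this. The policy names, the `validate` callback, and the choice to keep the last known-good definitions loaded while the endpoint is isolated are all assumptions made for the example.

```python
from enum import Enum


class OnInvalid(Enum):
    ISOLATE_NETWORK = "isolate_network"            # block traffic until fixed
    REVERT_TO_KNOWN_GOOD = "revert_to_known_good"  # keep running on old defs


def apply_update(candidate, known_good, validate, policy):
    """Return (active_definitions, network_allowed).

    validate(candidate) is a caller-supplied (hypothetical) check that
    raises ValueError on a malformed file. The machine keeps running in
    every branch; only the definitions in use and network access change.
    """
    try:
        validate(candidate)
    except ValueError:
        if policy is OnInvalid.REVERT_TO_KNOWN_GOOD:
            return known_good, True
        # ISOLATE_NETWORK: stay on the old definitions, cut network access.
        return known_good, False
    return candidate, True
```

Either way the bad file is never loaded, and the worst case is an isolated endpoint rather than one that cannot boot.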
But that would require competent engineering, which clearly was not going on here.