I'm still amazed at how the blame shifted from Microsoft to CrowdStrike. Yes, a CrowdStrike update caused it -- but applications fail all the time. It was Microsoft's oversight to put it on Windows' critical path.
And banks/airlines etc were hit hard because their _Windows_ didn't boot, not because of an application crash on a perfectly working Windows.
The application (CrowdStrike) was part of Windows' boot process.
Windows cannot simply "skip" failed drivers. Say the CrowdStrike driver failed as a one-time thing, Windows skipped it instead of retrying, that left the endpoint vulnerable, and ransomware hit. We'd be saying the opposite now.
This is a high-impact ability Windows offers to applications - and applications should take responsibility and treat it as such.
I spoke to another EDR lead I know - they said they had provisions in place to read the dump if boot crashed, check whether it was due to their driver, and skip it if so (and then send telemetry after startup so that it can be fixed, probably). CrowdStrike should have done the same.
One more thing to note is that we cannot say Windows shouldn't provide this ability - that would raise antitrust issues, because MS themselves are a competitor in this space.
The difference is that if Windows does the skipping then you probably don't find out until it's too late; if the application does the skipping, there is the opportunity to set up alerting so you can fix whatever went wrong.
Do you mean that the skip would be manually approved after telemetry is sent and folks on-call paged? Then that sounds like it could be viable and a good idea yes.
But there's always a chance that the skipping mechanism could break as well. And there must be some form of networking available to be able to send that and ask for approval.
Exactly! On skipping mechanism breaking - I mean, anything could break. Boils down to design and testing like all things.
One change - this approval and telemetry doesn't happen during the boot loading process. It's just logged and skipped.
Once bootup is done, the EDR app auto-starts, checks logs for anomalies, and sends telemetry whenever the network is available (it usually is, because they update malware signatures etc. frequently). Someone at the company gets paged, they fix it, and the process continues.
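The recovery flow described above could be sketched roughly like this (a toy simulation, not any vendor's real implementation; the file layout, dump format, and driver name are all invented for illustration):

```python
# Toy sketch of the post-boot self-check flow: inspect the last crash dump,
# disable our own driver if it caused the crash, and queue telemetry so
# someone gets paged once the network is up.
import json
from pathlib import Path

DRIVER_NAME = "example_sensor.sys"   # hypothetical driver name

def check_last_boot(dump_path: Path, state_path: Path) -> dict:
    """Run after startup: if the previous boot crashed in our driver,
    mark the driver skip-on-boot and queue telemetry for the vendor."""
    actions = {"driver_disabled": False, "telemetry_queued": False}
    if not dump_path.exists():
        return actions                      # clean boot, nothing to do
    dump = json.loads(dump_path.read_text())
    if dump.get("faulting_module") == DRIVER_NAME:
        # Persist the skip decision and page someone via telemetry.
        state_path.write_text(json.dumps({"skip_driver": DRIVER_NAME}))
        actions["driver_disabled"] = True
        actions["telemetry_queued"] = True  # sent once the network is up
    return actions
```

The key design point is that the vendor's own user-mode agent makes the skip decision after boot, so the OS never has to silently disable a security driver on its own.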
Windows could surely handle this kind of error better, but IMHO it would be a mistake to require Microsoft to absolutely block every path by which Windows could crash due to third-party software.
We'd end up in a situation similar to macOS, where there's a single gatekeeper and whole industries are subject to the will of the platform owner.
Enterprises have chosen Windows because of that flexibility and control, while having a business partner they don't get with Linux. If anything, the blame should fall on them for getting hosed even though they fully had the means to avoid that situation.
I don't think "Microsoft should lock down Windows so hard" is the solution we want here. I don't want my desktop OS to be a walled garden like iOS is. I want to be able to install software on it that does anything I need to be able to do -- and yes, having that capability to run software at the lowest possible level in the OS does also mean that that software has extra responsibility to be well-behaved, as the OS can't protect the system from it. But I still would rather have that option than not have it (and also I wouldn't use CrowdStrike).
How did Microsoft put it on the Windows critical path? (Informational question—I’m not following the issue super closely, but I thought CrowdStrike was a third-party system. Crowdstrike was wrong to put so much code in the kernel. Microsoft was reportedly legally bound to provide this access and allow third-party code to run in the kernel.)
Microsoft added a feature to Windows that allows specially-signed antimalware drivers to be loaded extremely early in the boot sequence and be marked as non-optional. The idea is to give antimalware drivers the opportunity to load first, before anything else has had the chance to start.
Furthermore, if a driver is marked as optional and crashes, Windows can reboot with that optional driver disabled next time, preventing infinite crash/boot loops. Obviously that's no good if your antimalware driver gets disabled, so they can mark theirs as "required." Obviously in the CrowdStrike case, we got the worst of both worlds.
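For concreteness, a boot-start driver is just a service entry in the registry whose `Start` and `ErrorControl` values tell the boot loader when to load it and what happens if it fails. A hedged sketch (the service name "ExampleSensor" is invented; the value meanings come from Windows' driver documentation):

```reg
Windows Registry Editor Version 5.00

; Hypothetical service entry for an ELAM-style boot-start driver.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ExampleSensor]
; Type 1 = SERVICE_KERNEL_DRIVER
"Type"=dword:00000001
; Start 0 = SERVICE_BOOT_START: loaded by the boot loader itself
"Start"=dword:00000000
; ErrorControl 3 = SERVICE_ERROR_CRITICAL: a load failure fails the
; boot (values 0/1 would log the failure and continue instead)
"ErrorControl"=dword:00000003
; Early Launch Anti-Malware load-order group, ahead of other drivers
"Group"="Early-Launch"
```

The "required vs. optional" distinction in the comment above corresponds to that `ErrorControl` value: a non-critical driver that fails is skipped and logged, while a critical one takes the boot down with it.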
Microsoft is not who made the decision to put this on Windows' critical path; CrowdStrike was. Nothing stops you from running whatever dodgy third-party kernel modules you like on Linux or FreeBSD and they could easily cause the same sort of problem.
In fact, CrowdStrike has taken down Linux systems in much the same way in the past year (in April I think). It's just that the impact was less widespread.
Partially agree. Linux yes, but *BSD systems have microkernel architecture, so must be more resilient to failures of one of the components. Although I have no idea whether the full system would boot either, I'm pretty sure it could partially load, give more information to user, and make it easier to fix.
To be fair, AFAIK the CrowdStrike driver was WHQL-certified. The loophole is that the driver loaded files at runtime, which made it impossible to predict every failure scenario.
Maybe this is the loophole that needs closing. You can't claim a driver is certified for Windows if the manufacturer can push arbitrary files that change its behavior. Especially if that manufacturer has sloppy development practices.
I understand that a primary goal of endpoint monitoring software is to be able to quickly react to new threats, and that the turnaround time for Windows certification is surely unacceptable in this scenario, but this functionality can never be allowed to jeopardize the stability of the system it's supposed to protect. So it's ultimately on Microsoft to fix this for their users.
Ironically, this is exactly the failure pattern that the changes in Chrome extensions to manifest v3 try to prevent. You can't provide a guarantee to the end-user of pre-vetted safety when the application is downloading and executing arbitrary code from a third-party source. That's like expecting a static code verifier to prevent all runtime errors.
It is, perhaps, a guarantee that no vendor should be expected to make.
> You can't provide a guarantee to the end-user of pre-vetted safety when the application is downloading and executing arbitrary code from a third-party source.
So a web browser can't be trusted or certified, ever. Unless JavaScript is disabled?
Correct, and I should have been more clear. By the nature of what they do, Chrome extensions operate outside the sandbox designed to make attacking the underlying operating system running the browser very hard.
Sandboxing is one way to attempt to enforce a guarantee (modulo sandbox bugs, of course). Since Chrome extensions aren't entirely in the sandbox, vetting and sign-off are supposed to provide the added assurance of security the sandbox can't. And those assurances are hollow when the vetted extension is running arbitrary code from a third-party source.
The article states that Microsoft HAD to allow CrowdStrike to run in kernel space under EU law, because otherwise MS would have a monopoly on kernel-level security solutions / integrations.
They probably had to, in the same way that banks had to use CrowdStrike. It's easy for banks to say "we use CrowdStrike, like everyone else" rather than implement a bespoke and accountable framework for risk assessment and mitigation for every type of endpoint use case (and argue that case to both the auditor and the regulator). In this case it's easier for Microsoft to say "see, they can run in kernel space" rather than provide a bunch of API functions that achieve what's needed, convince all third-party vendors to use them, and put in place a process to convince an auditor that Microsoft security software will never use any knowledge or functionality from the OS outside this.
I guess I don't think that's the sole reason, as I think the incentives would still be in place even if Microsoft authored security software did not run anything in kernel space.
You're spilling cheap propaganda. Microsoft likely never had[0] an appropriate userland-level API in place and them blaming the EU should not be repeated by someone calling themselves a journalist.
[0] https://www.youtube.com/watch?v=EGttFWntctU - I need to state here that I do not possess the level of knowledge the author of the video presents and therefore am unable to confirm the findings included in the video
And we're back to Microsoft -- they are responsible for not having a proper way to handle such third-party apps, nor did they maintain a process and controls to prevent such rogue breaking updates.
Let me exaggerate a bit to show how bad that analogy is:
Let's say I've developed a laptop that bricks whenever you open a website with incorrectly formatted HTML.
Not sure how to adapt your bike analogy to this... Let's say you made a bike that's intended to be ridden outdoors, but breaks down whenever the user sits on it indoors. Yeah, no one is supposed to ride it indoors. Not sure it's the best analogy though.
UPDATE: let's say the bike breaks down completely whenever it's ridden in the rain.
Same with Linux, yes; I never said Linux is any better in this regard than Windows. At least it's free, and no warranty is given. But if Red Hat had failed the same way, I think Red Hat Inc. would bear the blame just as well.
PS: I believe BSD-based systems would be more resilient because of microkernel architecture.
Isn't corporate malware by definition on the "critical path"? The article outlines the reasons why that jank runs in kernel space, and why MS is unable to "downgrade" it to userspace.
This is the comment I expected, begging to hand over your freedom to run software to a big corp.
If you replace parts in your BMW and put in some garbage or incompatible parts, it's your fault if it doesn't run.
You expect to sue your mechanic if he messed up, and for him to cover the full cost. For some reason people do not expect CrowdStrike to pay for their stupidity, which is the root of the problem. The same goes for the management that installed CrowdStrike without due diligence.
But it wasn't some garbage part in a car; it was an app. And apps fail all the time; the OS is expected to handle that. Same as a car is expected to handle rain, for example.
The EU's rules are that Microsoft can't hoard APIs away from competitors, not that they have to give competitors a kernel driver SDK. If Microsoft says Windows Defender needs a kernel driver, then CrowdStrike gets to ship a kernel driver, too.
Microsoft, interestingly enough, is working on a project to add an eBPF[0] runtime to the NT kernel. If they were to use this for their own security products then I doubt the EU would prohibit them from transitioning third-party security products to eBPF programs. Antitrust and competition law do not care about specific technical measures competitors use to compete, just that dominant companies are not shutting competitors out of markets.
[0] Formerly "extended Berkeley Packet Filter", eBPF lets you run safety-verified code in kernel space. Notably, the verifier isn't just a signing check; it can actually ensure the code won't crash the kernel directly.
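To make "the verifier isn't just a signing check" concrete, here is a toy static verifier for a made-up three-instruction language (nothing like real eBPF bytecode, which is far richer): it accepts a program only if it can prove every memory access is in bounds and every jump goes forward, so a verified program can neither loop forever nor scribble outside its memory.

```python
# Toy static verifier: proves properties of the code itself, regardless
# of who shipped it. A signature check proves only provenance.
MEM_SIZE = 16  # words of scratch memory the program may touch

def verify(program):
    """program is a list of (op, arg) tuples; ops: load, store, jmp.
    Returns True only if the program is provably safe to run."""
    for pc, (op, arg) in enumerate(program):
        if op in ("load", "store"):
            if not 0 <= arg < MEM_SIZE:          # out-of-bounds access
                return False
        elif op == "jmp":
            if arg <= pc or arg > len(program):  # backward jump => possible infinite loop
                return False
        else:
            return False                         # unknown instruction
    return True
```

The payoff is that a verifier like this rejects unsafe programs before they run, which is exactly the guarantee a signature on a blob of native kernel code cannot give.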
Yes and no. As others have pointed out above, it is factually correct that they were forced by the EU to give access to kernelspace. However, it is also true that the only reason for that was that _they_ were using kernelspace for the same things (instead of creating a framework and API into the features needed).
Microsoft didn't write the Falcon sensor software nor did they put it in the kernel. In fact, Microsoft has been shouting to the heavens trying to shift the blame from CrowdStrike onto the European Commission, because they want people to irrationally hate antitrust so they can turn Windows into shitty iOS and monopolize the security market (and applications market) for it.
Furthermore, Microsoft does actually have some rules regarding what you can and can't put into a signed kernel driver. Specifically, they won't sign kernel code unless they've seen and tested it first. CrowdStrike deliberately circumvented this rule by implementing their own configuration format - really, just a fancy way of loading code into the kernel that Microsoft doesn't have signing control over.
If there is blame to be had here for Microsoft, maybe it's that their kernel code signing program doesn't scrutinize third-party configuration formats hard enough. I mean, if you sign a code loader, you're really signing all possible programs, making code signing irrelevant. And configuration is, more often than not, code in a trenchcoat. It's often Turing-complete, and almost certainly more complicated than the actual programming languages used to write the compiled code being signed off on.
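To illustrate the "code in a trenchcoat" point, here's a minimal sketch (the rule format is invented, not CrowdStrike's actual channel-file format) of a "configuration" language that is really a program. Once the interpreter is signed, whoever writes the unsigned rules controls what the signed component does:

```python
# A tiny rule "configuration" language: it has conditionals and actions,
# so signing this interpreter effectively signs every program the unsigned
# rules file can express -- the signature no longer bounds behavior.
def run_rules(rules, event):
    """rules: list of {"match": (field, value), "action": str}.
    Returns the actions triggered by the given event dict."""
    actions = []
    for rule in rules:
        field, expected = rule["match"]
        if event.get(field) == expected:
            actions.append(rule["action"])   # behavior chosen by unsigned data
    return actions
```

Even this trivial dialect shows the problem: the signed binary never changes, yet its behavior is entirely determined by data that was never reviewed by the signer.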
But at the same time I imagine Microsoft tried this and got pushback. That might be why they feel (incorrectly) like they can blame the EU for this. Every third-party security solution does absolutely unspeakable things in kernel space that no one with actual computer science training would sign off on, using configuration to wrestle signing control away from Microsoft. Remember: Crowdstrike is designed to backdoor Windows systems so that their owners know if an attack has succeeded, not to make them more secure from attacks in the first place. Corporations are states[0], and states fundamentally suffer from poor legibility: they own and operate far too much stuff for a tribe[1] of humans to meaningfully control or remember.
The problem is that we have two different entities that all have the ability to stop this madness. When states run into this situation, they impose "joint and several liability", which means "I don't care how we precisely assign blame, I'm just going to say you all caused it and move on". In other words, it's Microsoft's fault and it's CrowdStrike's fault.
[0] ancaps fite me
[1] Maximally connected social graph with node degree below Dunbar's number.
> because they want people to irrationally hate antitrust
One only needs to look at what's happening with Google's privacy sandbox to know the perils of antitrust with regard to introducing new interfaces. Even though Google has offered new interfaces and APIs that they themselves intend to migrate to (and take a ~20% revenue reduction), they've attracted the scrutiny of regulators who claim that this is a way of locking out competitors in the advertising space.
> [0] ancaps fite me
This part is simply inciting a flamewar, and something that you can do without in the spirit of the website guidelines[1].
It's important to remember that every other browser dropped third-party cookie support years before Chrome did. Google dragged their feet on it until they could come up with a solution that would give Google the same level of tracking, because Google is an advertising company. So the competition authorities are telling Google - and only Google - that they can't drop third-party cookies anymore.
I've never actually heard anyone claim Privacy Sandbox[0] APIs would give third-party ad networks the same level of tracking as Google. But I imagine even if they did, the APIs would probably be a poor fit for competing ad networks, in the same way that, say, the iOS File Provider APIs are a terrible fit for Dropbox[1].
There are three different ways you can introduce a new standard or interface:
- You can go to or form a standards body with all the relevant market players and agree on a technical specification for that interface. This is preferred, and it's how the Web is usually done.
- You can take a competitor's interface people are already using and adopt that. This is how you get de-facto standards, and while they might have loads of technical problems[2], none of them give you an unfair market advantage.
- You can make your own interface and force competitors to adopt that. You get all the technical problems of a de-facto standard, but those are all problems your competition has to deal with, not you.
The difference is a matter of market advantage. Out of all the major browser vendors, only Google has dominance in online marketing. Microsoft and Apple would like to have a piece of that pie, but they all dropped third-party cookies without tying it to their own competing standards that they wanted to force other people to use.
[0] Hell of an Orwellian name
[1] For example, if you use Dropbox as your file storage, you can't pick folders. At all. On an operating system built by the company whose engineers are obsessed with bundles (directories that look and act like files instead of folders).
The driver is some kind of AV/signature-detection hook, e.g. a "check every open() against this list of checksums and refuse to open known viruses" style system. The 'update' was a borked definition file which triggered a bug in that system.
It's not unsigned code execution, and I think they probably do want these files to be updated hands-free.
The real problem was the lack of testing, rather than the actual mechanism I think.
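A hedged sketch of that failure pattern (the real channel-file format is not public, so this layout is invented): a definitions file declares how many entries it contains, and a parser that trusts the header can read past the end of the buffer. The mechanism is fine only if the parser validates everything before use:

```python
# Invented definitions-file layout: a 4-byte little-endian entry count,
# followed by that many 4-byte entries. The defensive check below is the
# kind a kernel-resident parser must make; trusting the header instead
# is the classic crash (or worse) when a malformed file ships.
import struct

ENTRY_SIZE = 4

def parse_definitions(blob: bytes):
    """Return the list of entries, rejecting any file whose declared
    count disagrees with the actual payload size."""
    if len(blob) < 4:
        raise ValueError("truncated header")
    (count,) = struct.unpack_from("<I", blob, 0)
    payload = blob[4:]
    if count * ENTRY_SIZE != len(payload):   # validate before indexing
        raise ValueError("declared count disagrees with file size")
    return [struct.unpack_from("<I", payload, i * ENTRY_SIZE)[0]
            for i in range(count)]
```

In user space a bad file means an exception and an error log; in kernel space the equivalent unchecked read is a bugcheck, which is why the testing bar for these hands-free updates has to be so much higher.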
This is the nugget of the issue. The code-signing process, in this case, was abused to verify something that fundamentally cannot give the guarantee "doesn't crash your OS", because it is allowed to run arbitrary code in the form of novel commands in what is essentially a DSL. So if code signing is supposed to be a guarantee from MS that "this code can't crash your system," it should never have been signed... But then MS would have been on the hook for blocking a competitor.
To get a driver signed by Microsoft, the developer of the driver is required to provide a full cert pass log from the Windows Hardware Lab Kit to dev center [0]. Do you have any article that says the CrowdStrike driver has been tested by Microsoft?
To avoid going through the full cert process, the sensor was certified but also loaded code from an uncertified module, so that it could be quickly updated to catch new threats. It's a tough corner to be in: to function properly it needs to update very quickly, but the cert process takes a while to complete, so they went with this workaround of a signed module loading uncertified code.
...you want Microsoft to forbid you from running certain kinds of programs on your own machine, even if you really, really insist on it, do I understand you correctly?
More like: "...you want Microsoft to forbid you from running certain kinds of programs (with gaping security holes / processes) on your own machine" YES
You're moving the goal post waaaay far down. How about just following best practices? How about not allowing runtime code injection? Turns out security holes often have much in common, and with ways to mitigate them. Stop 100% of security holes? nah. Stop 99.9% of security holes? Yes and what an improvement.
This is a valid opinion and I don't know why you were downvoted (well, other than the Hacker News bubble mindset (or mindless-set)).
How is Microsoft not to blame, it's their product? We wouldn't blame a Toyota supplier for a failure in a car, but we somehow segment that in the software world?
Toyota chose the supplier, worked with them on the specs and designs, and put it in their OE car delivered to the customer. It has Toyota's name on it, it was bought at a Toyota dealership, is a part of Toyota's warranty.
Crowdstrike is entirely optional software that doesn't come from Microsoft. Microsoft doesn't market it. Microsoft had no hand in making it. Microsoft doesn't sell it. Microsoft had no hand in a user installing Crowdstrike.
No. My point is that Microsoft allows the damn thing to be run in kernel space. Mac and Linux don't have this problem due to how THEY architected the system. Yes, I think that puts Microsoft at blame.
Microsoft should have no say to decide what software I am allowed to run on my computer.
> Mac, linux don't have this problem due to how THEY architected the system.
You're joking, right? You're arguing kernel panics can't happen on Linux? FFS, the CrowdStrike sensor caused kernel panics on multiple Linux distros in the last few months! Linux is not immune to kernel panics from buggy kernel modules.
The first point is pretty philosophical so I'm not gonna go far into that. At the end of the day you bought a product from a company, some of those products have a way to load programs on and some are locked down (a microwave). "should" is pretty biased whether I agree with that conclusion or not.
Two: Here I'm not arguing about what's possible but rather what happened in the real world. 8.5 M machines down, my org runs Macs, we knew about it from the news...
No smarty pants, I'm arguing that you can't load a program on a microwave's microprocessor. Should I be able to do that?
"And yes, in the real-world, third-party software can and does cause Macs to crash." Thanks for adding so much to the conversation (eyes rolled).
In the absolute sense 8.5M machines is a lot. Airlines down is a lot. Hospitals down is a lot. Hey we guarantee we won't wreck 99.4% of our machines out there! is not a good guarantee.
Yes, you are arguing for that microwave when you argue Microsoft should approve the software you're allowed to run on a Windows box and be liable for its performance. Should Microsoft also have to approve what browsers you're allowed to run, should they approve what chat applications you're allowed to use?
And sure, why shouldn't you be able to modify the software on hardware you own? It's your microwave. If you modify the software on it and that causes it to burn up don't go to the manufacturer when it burns your house down. But that's true if you open it up and rewire it as well. Which, sure, feel free to open it up. It is your microwave.
Are you arguing you shouldn't be able to modify the things you own?
> Thanks for adding so much to the conversation
I mean it seriously seems like you're arguing MacOS and Linux are immune to third party software crashing the system. Do you agree or disagree that third party software can cause MacOS and Linux instability, especially when the user chooses to run it at root level permissions?
> we guarantee we won't wreck
Microsoft didn't wreck these machines. CrowdStrike wrecked these machines. Every Windows machine that did not have CrowdStrike installed was unaffected by this, which is 99.4% of Windows machines.
> what happened in the real world
And yes, look at those bug reports, those are crashes happening in the real world not something theoretical. Kernel panics happen!
I'm not making an analogy with the microwave (you're saying food is software and the microwave is hardware); I'm literally talking about the software that runs on a microwave.
I'm aware of the point you're trying to make with the microwave. I'm making another analogy; one you're not getting. And either way, yes, I think you should be able to change the software on the microwave. It is your microwave. Do whatever you want with it. Why should Samsung or GE have the right to say what you can or cannot do with the things you own?
If we want to talk microwaves, Microsoft is the microwave manufacturer. Users installing CrowdStrike are people sticking a giant ball of foil and paper towels in the microwave and turning it on for an hour. You're arguing Microsoft is liable for the things people stick in their microwaves, and that Microsoft should put in place guards to prevent people from putting whatever they want in their own microwaves. That Microsoft should control the things people put in their microwaves. Only Microsoft tested and Microsoft approved foods in Microsoft microwaves. And the microwave needs to ensure only the proper cook time applies to the properly signed food products to make sure it doesn't get burnt. Sorry, Microsoft hasn't fully validated Red Gold potatoes, it can only cook Russet potatoes.
That is the same logic as Microsoft is liable for the third-party software people install on Windows machines and that Microsoft shouldn't have allowed the third-party software to run.
Why should Microsoft be able to say what antivirus software I choose to install or not? Why should Microsoft be able to say what browser I install? If I install some software that breaks my Windows machine, is that the fault of Microsoft or the fault of the software maker? If I stick foil in the microwave, is the ensuing fire GE's fault?