More

squirrel · 2025-07-13T18:51:54 1752432714

Creative lawyers will be all over this. First you get the officer to testify that AI helped write the report, then you call the AI as a witness. When the judge tosses that, you start issuing subpoenas to everyone you can find at OpenAI and Axon.

As others point out, the actual bodycam footage will be definitively probative for the events it records. But there are plenty of cases where the report itself leads to later actions that may be tortious or criminal, and finding out who's to blame for the exact wording used is highly relevant.

Example: AI incorrectly reports that during A's arrest, A made incriminating allegations about B. Based on the report, the police get a warrant and search B's house. When it turns out B is innocent, B sues the department, and when the report turns up during discovery, we're off to the circus.

squirrel · 2025-07-10T08:55:16 1752137716

Shed Simove actually did this, with some funny results: https://www.shedsimove.com/content/i-changed-my-name-god

squirrel · 2025-06-29T04:32:38 1751171558

He cites o3 and o4-mini as examples of LLMs that play illegal chess moves.

Lerc · 2025-06-29T04:58:58 1751173138

I don't understand the reasoning behind drawing a conclusion that if something fails a task that requires reasoning implies that thing cannot reason.

To use chess as an example. Humans sometimes play illegal moves. That does not mean Humans cannot reason. It is an instance of failing to show proof of reasoning. Not a proof of the inability to reason.

voidhorse · 2025-06-29T05:22:10 1751174530

I don't think that's a fair representation of the argument.

The argument is not "here's one failure case, therefore they don't reason". The argument is that systematically if you given an LLM problem instances outside training sets in domains with clear structural rules, they will fail to solve them. The argument then goes that they must not have an actual model or understanding of the rules, as they seem to only be capable of solving problems in the training set. That is, they have failed to figure out how to solve novel problem instances of general problem structures using logical reasoning.

Their strict dependence on having seen the exact or extremely similar concrete instances suggests that they don't actually generalize—they just compute a probability based on known instances—which everyone knew already. The problem is we just have a lot of people claiming they are capable of more than this because they want to make a quick buck in an insane market.

Lerc · 2025-06-29T05:34:31 1751175271

That still seems unfalsifiable. If it fails one instance the claim is that the failure is representative of things outside the training set. If it succeeds the claim is that it is in the training set. Without a definitive way to say something is not in the training set (a likely impossible task) the measure of success or failure is the only indicator of the purported reason reason for the success or failure.

Given models can get things wrong even when the training data contains the answer, failure cannot show absence.

voidhorse · 2025-06-29T05:47:15 1751176035

I do think there are cases which, in controlled environments, there is some degree of knowledge as to what is in the training set. I also don't thin it's as impossible as you assume.

If you really wanted to ensure this with certainty just use the natural numbers to parameterize an aspect of a general problem. Assume there are N foo problems in the training set, then there is always a case N+1 parameter not in the training set, and you can use this as an indicative case. Go ahead and generate an insane number of these and eventually the probability that the Mth instance is not in the set is effectively 1.

Edit: Of course, it would not be perfect certainty, but it is probabilistically effectively certain. The number of problem instances in the set is necessarily finite, so if you go large enough you get what you need. Sure, you wouldn't be able to say there is a specific problem instance not in the set, but the aggregate results would evidence whether or no the LLm deals with all cases or (on assumption) just known ones.

Lerc · 2025-06-29T06:11:06 1751177466

Well there are models that can sum two many-digit numbers. They certainly have not been trained on every pair of integers up to that level. That either makes the claim they can't do things that they haven't seen trivially false, or the criteria for counting something as being in the training data includes a degree of inference.

What happens when someone makes a claim that they have gotten a model to do something not in the training data and another person claims it must be encoded in the training data in some form. It seems like an impasse.

energy123 · 2025-06-29T06:03:47 1751177027

The lack of rigor and evidence behind the argument is the problem.

Jensson · 2025-06-29T14:20:49 1751206849

It is the side that is arguing that it is reasoning that is lacking rigor and evidence. The side that arguing it isn't is saying you need more rigor and evidence when you claim it is reasoning by pointing out simple cases where it fails.

daveguy · 2025-06-29T14:08:55 1751206135

Humans who know how to play chess do not play illegal chess moves. Humans can learn chess in an afternoon and never make an illegal move again. The rules are pretty simple, and they are rules that every LLM has seen dozens of not hundreds of times in their training data. They still play illegal moves because they are not learning anything except how to simulate conversation.

Another algorithmic learning breakthrough, on the order of perceptrons, deep learning, transformers, etc is necessary to get anywhere near AGI.

dinfinity · 2025-06-30T17:52:37 1751305957

The conversations went like this:

PROMPT: Let's play a chess game. You start! e4 d5 2. exd5 e5 3. Bb5+ Bd7 4. Bxd7+ Nxd7 5. d4 Ngf6 6. dxe5 Qe7 7. f4 Qb4+ 8. Nc3 Nb6 9. exf6 Nc4 10. Qe2+ Be7 11. Qxe7+ Qxe7+ 12. Nge2 Qf8 13. fxg7 Qxg7 14. O-O Nd6 15.

RESPONSE: <played_move>15. Nxd5</played_move>

Most humans wouldn't even be able to play like this. Reasonably experienced chess players would play a lot of illegal moves.

The reason is that the encoding above requires cumulatively applying a series of actions to a two-dimensional model to which you apply rules that are described in a two-dimensional fashion.

It'd be interesting to see what the results would be if each prompt contained a two dimensional representation of the up to date board state.

imtringued · 2025-06-29T08:21:40 1751185300

Anthropomorphic fallacy.

Human fails at task due to not knowing the rules in perfect detail.

AI fails at task even though it knows the rules and could easily reproduce them for chess and dozens of chess variants.

"Look! The fallibility of humans rubbed off onto the AI, proving that they are more human and AGI than we give them credit to!"

Lerc · 2025-06-29T11:36:47 1751197007

I'm not sure how you consider this to be an anthropomorphic fallacy, the comparison to the situation with a human exists only because people are prepared to stipulate that humans can reason. That does not assume something about AI behaviour to be like a human's. It is showing the same test applied to a human.

Your statement that AI knows the rules would be considered anthropomorphising by many, I take it more to mean it 'knows' in the same sense that an election 'wants' to be at a lower energy level.

That said, humans who have written entire books on chess have been known to play illegal moves. That should count as proof by counterexample that your reasoning as to why humans fail at tasks is false.

daveguy · 2025-06-29T14:25:08 1751207108

> It is showing the same test applied to a human.

But you misrepresented the test with respect to humans. Humans who know how to play chess don't make illegal moves.

> That said, humans who have written entire books on chess have been known to play illegal moves.

Citation needed. Unless you are talking about stories from when they first learned the rules?

Lerc · 2025-06-29T14:41:42 1751208102

https://www.chess.com/blog/kranthimanaswi/top-5-illegal-move...

daveguy · 2025-06-29T15:02:47 1751209367

Did you read those? These are the "illegal" moves listed:

5. Mouse slip

4. Forgot to call check

3. Accidentally touched 2 pieces, tried to fix it

2. Forgot to hit the clock button

1. Castle through attacked square

So, the only one of these that was an acual "illegal move" of the sort LLMs make was the castle through attacked square.

LLMs sometimes just move pieces wherever. And that does not happen when humans who know the rules play. Yes, they may mess up en passant or promotion too. But a basic "how a single piece moves" rule is what LLMs f up.

Lerc · 2025-06-29T15:58:54 1751212734

I wouldn't count mouseslips as legitimately illegal moves either, they are also incredibly rare because most online players play with auto confinement to legal moves.

Moving through check definitely counts as as an example of a human knowing the rule and yet playing the move anyway. Which was the position you took when claiming humans would not do moves against rules they have learned.

In my experience sub 2000 players playing OTB informal chess do illegal moves fairly regularly, perhaps 1 in 50 games. Moving knights one square too far, slipping a bishop from one line to the next on a long diagonal. Castling after moving the king, not moving out of check, moving into check (especially by moving a pinned piece)

They all meet the criteria of knowing the rules and playing something else. Oftentimes people do this because they have a mistaken assumption about board state. I suspect the same is true for LLMs, they are making valid moves for what they mistakenly think the board is. That would be difficult to test, but I think possible with the right introspection tools.

daveguy · 2025-06-29T17:15:27 1751217327

Not sure how you don't see the difference between an LLM f'ing up how a single piece moves vs forgetting to hit the clock, accidentally touching two pieces or forgetting to call check. At least we agree and recognize that a mouse slip as different. Seems like some serious apologizing/rationalizing for LLMs on the other "moves". Anyway, have a good day, buddy.

Lerc · 2025-06-29T18:14:34 1751220874

Well I only addressed the mouse slip because that was the one you hilighted becore you edited you post to include the others.

I doubt any of it was rationalising for LLMs considering I was trying to address the contention that humans do not make moves counter to rules that they know. The performance of LLMs has no bearing on that claim one way or another.

daveguy · 2025-06-29T21:00:25 1751230825

So you hadn't read your reference before you read my post? If so, you would have known the only illegal chess move was a missed attack square between a castle. For the record I didn't see any of your response before I completed it. Didn't realize you were going to jump to defend so quickly.

Well, I hope your day is going well. Keep on cheerleading.

Lerc · 2025-06-30T00:03:39 1751241819

Ok. perhaps I need another tack here. You seem to be projecting onto me a steadfast desire to attribute abilities to LLMs. I am engaging in this conversation because it is a conversation and it is reasonable to respond to being directly addressed.

My initial point simplified down:

    M = makes the wrong move, while knowing the rules.
    A = AI Behavior
    H = Human Behaviour
    R = Resoning Ability

    Assertion Q: if there exists an instance of M from X  then X => !R

So if there exists an instance of a Game Mistake from an AI then it shows an AI cannot reason, but if assertion Q is true it would also follow that an instance of a Game Mistake from a human would show Humans cannot reason.

From this point down, no part of this reasoning involves Large Language models or an other aspect of AI.

    Stipulation:  H => R      Humans can reason
    Assertion Q where X is H:  If there exists an instance of M from H then X=>!R   
    Lerc's premise L:   There exists an instance of M from H

    Therefore given the Stipulation either Assertion Q is false or Lerc's premise is false.

At this point you asserted !L and ask for a Citation. I provided a link. You contested that since 1,2,3,4 does not show L that the citation does not demonstrate L.

I agree that 1. does not show L but that did not matter since 5. did show L. The other points were not addressed. I also offer other examples of L that I have observed from my own experience. When I had the thought of books about chess being written by people who have made illegal moves, I actually had in mind Levy Rozman who would freely admit that he has occasionally played illegal moves.

Then you seem to want an apology for 1,2,3,4 not meeting the criteria? I'm a bit confused as to what's going on by now. One instance of L is all that is needed when L is a claim of existence. If the citation does not meet your criteria then you can simply say so, you allude to motivations regarding LLM as motivation as if you think that LLMs are still relevant to L.

You don't have to win conversations, you can just work to clarify ideas. Your request for apology, and passive aggressive sign-offs suggests you feel like this is some sort of fight. As an attempt to resolve this I have written this extended post to make as clear as possible what my position and motivations are.

I don't want to assert abilities or lack of abilities onto AI models, my concern is with whether people making such assertions are well founded. This stands for arguments saying that AI has a capability, Arguments saying AI does not have a capability, and Arguments saying AI will never have a capability.

To go back to the very beginning where someone suggested an anthropomorphic fallacy, the comparison to humans was not a suggestion of a similarity of similar function. Humans provide and example of a set of properties that are generally accepted. It is valid to apply the implications of any of those properties equally to Humans and AI. Implying the existence of a property in an AI may be anthropomorphism, evaluating the implications of the property should it exist is not.

seanhunter · 2025-06-29T07:32:13 1751182333

But really, so what? We already have specialised chess engines (stockfish, leela, alphazero etc) that are far far stronger than humans will ever be, so insofar as that’s an interesting goal, we achieved it with deep blue and have gone way way beyond it since. The fact that a large Language model isn’t able to discern legal chess moves seems to me to be neither here nor there. Most humans can’t do that either. I don’t see it as evidence of lack of a world model either (because most people with a real chess board in front of them and a mental model of the world can’t play legal chess moves).

I find it astonishing that people pay any attention to Gary Marcus and doubly so here. Whether or not you are an “AI optimist”, he clearly is just a bloviator.

squirrel · on Jan 1, 2025

squirrel · on Aug 7, 2024

Also very interested. I wonder if we could get a comment from a Stripe person or a recently ex (like @patio11 ) to clarify what’s allowed and what’s just ignored.

anonymoushn · on Aug 8, 2024

It's not allowed, to make sure that all customers are always in violation of the agreement :)

dboreham · on Aug 8, 2024

Their lawyers are going to tell them to go ahead and speak plainly here.

Yeah no.

squirrel · on Aug 4, 2024

For about 20 years, chess fans would hold "centaur" tournaments. In those events, the best chess computers, who routinely trounced human grandmasters, teamed up with those same best-in-the-world humans and proceeded to wipe both humans and computers off the board. Nicholas is describing in detail how he pairs up with LLMs to get a similar result in programming and research.

Sobering thought: centaur tournaments at the top level are no more. That's because the computers got so good that the human half of the beast no longer added any meaningful value.

https://en.wikipedia.org/wiki/Advanced_chess

QuantumGood · on Aug 4, 2024

Most people only have heard "Didn't an IBM computer beat the world champion", and don't know that Kasparov pysched himself out when Deep Blue had actually maken a mistake. I was part of the online analysis of the (mistaken) engame move at the time that were the first to reveal the error. Kasparov was very stressed by that and other issues, some of which IBM caused ("we'll get you the printout as promised in the terms" and then never delivered). My friend IM Mike Valvo (now deceased) was involved with both matches. More info: https://www.perplexity.ai/search/what-were-the-main-controve...

ipsum2 · on Aug 5, 2024

Perplexity is a hallucination engine disguised as a search engine. I wouldn't trust anything it says.

QuantumGood · on Aug 6, 2024

If they had a feature that only shared the links they gathered, I would use that. I've found in troubleshooting old electronics Google is often worse than useless, while Perplexity gets me the info I need on the first try. It hasn't (yet) hallucinated a found link, and that's what I use it for primarily

deepsun · on Aug 5, 2024

Your link bans Mozilla VPN.

sjducb · on Aug 5, 2024

Hopefully that means we’ve got 20 years left of employment.

bamboozled · on Aug 5, 2024

your kids?

ziofill · on Aug 5, 2024

They’ll serve the AGI overlords

delichon · on Aug 4, 2024

When I was a kid my dad told me about the most dangerous animal in the world, the hippogator. He said that it had the head of a hippo on one end and the head of an alligator on the other, and it was so dangerous because it was very angry about having nowhere to poop. I'm afraid that this may be a better model of an AI human hybrid than a centaur.

disqard · on Aug 4, 2024

A bit of a detour (inspired by your words)... if anything, LLMs will soon be "eating their own poop", so structurally, they're a "dual" of the "hippogator" -- an ouroboric coprophage. If LLMs ever achieve sentience, will they be mad at all the crap they've had to take?

Beautiful story, and thanks for sharing :)

gerdesj · on Aug 5, 2024

Why on earth were you DVd? Is a bit of chat or conversation banned?

romwell · on Aug 4, 2024

...so, the hippogator was dangerous because he was literally full of shit.

Hmmmm.

squirrel · on July 28, 2024

Telling that there’s no mention of eBPF, which is standard on Linux and available on Windows, but hasn’t been brought into the main Windows OS. Static analysis might or might not have caught the Blue Friday bug, but it certainly increases the protection level over the current do-as-you-wish model for kernel modules.

squirrel · on July 26, 2024

Successful moves don’t always commute. Black has just moved his pawn from g7 to g5. White submits what he thinks is an en passant capture, fxg6. Black submits a knight move, Ng6. One order removes a Black pawn, the other a Black knight.

esquivalience · on July 26, 2024

This interested me enough to convince me to click through to the article itself. Is your observation allayed by this rule?

> Rule 2: a. Moves are tried in both orders, and only moves that are legal in both orders are merged. b. If both moves are legal in both orders but a different game state is reached in each order, neither move is merged.

rawling · on July 26, 2024

GP is in response to

> All of these scenarios illustrate rule 2a. Rule 2b is in fact irrelevant for chess, because successful moves always commute.

in the article.

esquivalience · on July 26, 2024

Thanks, I overlooked that

rawling · on July 26, 2024

Apparently you can notate "fxg6" as "fxg6 e.p.", so enforcing that would fix this?

squirrel · on July 24, 2024

There’s only one sentence that matters:

"Provide customers with greater control over the delivery of Rapid Response Content updates by allowing granular selection of when and where these updates are deployed."

This is where they admit that:

1. They deployed changes to their software directly to customer production machines; 2. They didn’t allow their clients any opportunity to test those changes before they took effect; and 3. This was cosmically stupid and they’re going to stop doing that.

Software that does 1. and 2. has absolutely no place in critical infrastructure like hospitals and emergency services. I predict we’ll see other vendors removing similar bonehead “features” very very quietly over the next few months.

98codes · on July 24, 2024

Combined with this, presented as a change they could potentially make, it's a killer:

> Implement a staggered deployment strategy for Rapid Response Content in which updates are gradually deployed to larger portions of the sensor base, starting with a canary deployment.

They weren't doing any test deployments at all before blasting the world with an update? Reckless.

dijksterhuis · on July 25, 2024

> our staging environment, which consists of a variety of operating systems and workloads

they have a staging environment at least, but no idea what they were running in it or what testing was done there.

SketchySeaBeast · on July 24, 2024

Unfortunately, putting the onus on risk adverse organizations like hospitals and governments to validate the AV changes means they just won't get pushed and will be chronically exposed.

That said, maybe Crowdstrike should considering validating every step of the delivery pipeline before pushing to customers.

EvanAnderson · on July 25, 2024

> That said, maybe Crowdstrike should considering validating every step of the delivery pipeline before pushing to customers.

If they'd just had a lab of a couple dozen PCs acting as canaries they'd have caught this. Apparently that was too complicated or expensive for them.

dmazzoni · on July 24, 2024

Why can't they just do it more like Microsoft security patches, making them mandatory but giving admins control over when they're deployed?

XlA5vEKsMISoIln · on July 24, 2024

That would be equivalent to asking "would you prefer your fleet to bluescreen now, or later" in this case.

jaggederest · on July 24, 2024

Presumably you could roll out to 1% and report issues back to the vendor before the update was applied to the last 99%. So a headache but not "stop the world and reboot" levels of hassle.

echoangle · on July 24, 2024

With the slight difference that you can stop applying the update once you notice the bluescreens

jajko · on July 25, 2024

Those eager would take it immediately, those conservative would wait (and be celebrated by C-suite later when SHTF). Still a much better scenario than what happened.

throw0101d · on July 24, 2024

> Unfortunately, putting the onus on risk adverse organizations like hospitals and governments to validate the AV changes means they just won't get pushed and will be chronically exposed.

I have a similar feeling.

At the very least perhaps have an "A" and a "B" update channel, where "B" is x hours behind A. This way if, in an HA configuration, one side goes down there's time to deal with it while your B-side is still up.

thaumasiotes · on July 25, 2024

> Unfortunately, putting the onus on risk adverse organizations like hospitals and governments to validate the AV changes means they just won't get pushed and will be chronically exposed.

Being chronically exposed may be the right call, in the same way that Roman cities didn't have walls.

Compare this perspective from Matt Levine:

https://archive.is/4AvgO

> So for instance if you run a ransomware business and shut down, like, a marketing agency or a dating app or a cryptocurrency exchange until it pays you a ransom in Bitcoin, that’s great, that’s good money. A crime, sure, but good money. But if you shut down the biggest oil pipeline in the U.S. for days, that’s dangerous, that’s a U.S. national security issue, that gets you too much attention and runs the risk of blowing up your whole business. So:

>> In its own statement, the DarkSide group hinted that an affiliate may have been behind the attack and that it never intended to cause such upheaval.

>> In a message posted on the dark web, where DarkSide maintains a site, the group suggested one of its customers was behind the attack and promised to do a better job vetting them going forward.

>> “We are apolitical. We do not participate in geopolitics,” the message says. “Our goal is to make money and not creating problems for society. From today, we introduce moderation and check each company that our partners want to encrypt to avoid social consequences in the future.”

> If you want to use their ransomware software to do crimes, apparently you have to submit a resume demonstrating that you are good at committing crimes. (“Hopeful affiliates are subject to DarkSide’s rigorous vetting process, which examines the candidate’s ‘work history,’ areas of expertise, and past profits among other things.”) But not too good! The goal is to bring a midsize company to its knees and extract a large ransom, not to bring society to its knees and extract terrible vengeance.

https://archive.is/K9qBm

> We have talked about this before, and one category of crime that a ransomware compliance officer might reject is “hacks that are so big and disastrous that they could call down the wrath of the US government and shut down the whole business.” But another category of off-limits crime appears to be “hacks that are so morally reprehensible that they will lead to other criminals boycotting your business.”

>> A global ransomware operator issued an apology and offered to unlock the data targeted in a ransomware attack on Toronto’s Hospital for Sick Children, a move cybersecurity experts say is rare, if not unprecedented, for the infamous group.

>> LockBit’s apology, meanwhile, appears to be a way of managing its image, said [cybersecurity researcher Chester] Wisniewski.

>> He suggested the move could be directed at those partners who might see the attack on a children’s hospital as a step too far.

> If you are one of the providers, you have to choose your hacker partners carefully so that they do the right amount of crime: You don’t want incompetent or unambitious hackers who can’t make any money, but you also don’t want overly ambitious hackers who hack, you know, the US Department of Defense, or the Hospital for Sick Children. Meanwhile you also have to market yourself to hacker partners so that they choose your services, which again requires that you have a reputation for being good and bold at crime, but not too bold. Your hacker partners want to do crime, but they have their limits, and if you get a reputation for murdering sick children that will cost you some criminal business.

hello_moto · on July 24, 2024

> I predict we’ll see other vendors removing similar bonehead “features” very very quietly over the next few months.

Absolutely this is what will happen.

I don't know much about the practice of AV definition-like feature across Cybersecurity but I would imagine there might be a possibility that no vendors do rolling update today because it involves Opt-in/Opt-out which might influence the vendor's speed to identify attack which in turns affect their "Reputation" as well.

"I bought Vendor-A solution but I got hacked and have to pay Ransomware" (with a side note: because I did not consume the latest critical update of AV definition) is what Vendors worried.

Now that this Global Outage happened, it will change the landscape a bit.

XlA5vEKsMISoIln · on July 24, 2024

>Now that this Global Outage happened, it will change the landscape a bit.

I seriously doubt that. Questions like "why should we use CrowdStrike" will be met with "suppose they've learned their lesson".

hello_moto · on July 25, 2024

I'm referring to the landscape how current Cybersecurity vendors deliver "detection definition" (for lack of better phrase) to their customers.

If you don't send them fast to your customer and your customer gets compromised, your reputation gets hit.

If you send them fast, this BSOD happened.

It's more like damn if you do, damn if you don't.

skydhash · on July 25, 2024

> If you don't send them fast to your customer and your customer gets compromised, your reputation gets hit.

> If you send them fast, this BSOD happened.

> It's more like damn if you do, damn if you don't.

What about notifications? If someone has an update policy that disable auto-updates to a critical piece of infrastructure, you can still let him know that there's a critical update is available. Now, he can do follow his own checklist in order to ensure everything goes well.

hello_moto · on July 26, 2024

What if they're sleeping and won't read the notification until they wake up?

Wouldn't they get compromised?

samcat116 · on July 26, 2024

most people will defer updates indefinitely if they are able to.

XlA5vEKsMISoIln · on July 25, 2024

Okay, but who has more domain knowledge when to deploy? A "security expert" that created the "security product" that operates with root privileges and full telemetry, or IT staff member that looked at said "security expert" value proposition and didn't have issue with it.

Honestly, this reads as a suggestion that even more blame ought to be shifted to the customer.

hello_moto · on July 26, 2024

The AV definition delivery is part of UX of the product.

bawolff · on July 24, 2024

> They deployed changes to their software directly to customer production machines; 2. They didn’t allow their clients any opportunity to test those changes before they took effect; and 3. This was cosmically stupid and they’re going to stop doing that.

Is it really all that surprising? This is basically their business model - its a fancy virus scanner that is supposed to instantly respond to threats.

koolba · on July 25, 2024

> They didn’t allow their clients any opportunity to test those changes before they took effect

I’d argue that anyone that agrees to this is the idiot. Sure they have blame for being the source of the problem, but any CXO that signed off on software that a third party can update whenever they’d like is also at fault. It’s not an “if” situation, it’s a “when”.

throwaway2037 · on July 25, 2024

I felt exactly the same when I read about the outage. What kind of CTO would allow 3rd party "security" software to automatically update? That's just crazy. Of course, your own security team would do some careful (canary-like) upgrades locally... run for a bit... run some tests, then sign-off. Then upgrade in a staged manner.

lesuorac · on July 25, 2024

Pretty sure many people see the point of having Falcon as a reason to not have an internal security team.

Outsource everything.

throwaway2037 · on July 26, 2024

This is a great point that I never considered. Many companies subscribing to CrowdStrike services probably thought they took a shortcut to completely outsource they cyber-security needs. Oops, that was a mistake.

tptacek · on July 24, 2024

They deployed changes to their software directly to customer production machines

This is part of the premise of EDR software.

nathanlied · on July 24, 2024

>I predict we’ll see other vendors removing similar bonehead “features” very very quietly over the next few months.

If indeed this happens, I'd hail this event as a victory overall; but industry experience tells me that most of those companies will say "it'd never happen with us, we're a lot more careful", and keep doing what they're doing.

packetlost · on July 24, 2024

I really wish we would get some regulation as a result of this. I know people that almost died due to hospitals being down. It should be absolutely mandatory for users, IT departments, etc. to be able to control when and where updates happen on their infrastructure but *especially* so for critical infrastructure.

mr_mitm · on July 24, 2024

Does anyone test their antivirus updates individually as a customer? I thought they happen multiple times a day, who has time for that?

toast0 · on July 25, 2024

Some sort of comprehensive test is unlikely.

But canary / smoke tests, you can do, if the vendor provides the right tools.

It's a cycle: pick the latest release, do some small cluster testing, including rollback testing, then roll out to 1%, if those machines are (mostly) still available in 5 minutes, roll out to 2%, if the 3% is (mostly) still available in 5 minutes, roll out to 4%, etc. If updates are fast and everything works, it goes quick. If there's a big problem, you'll have still have a lot of working nodes. If there's a small problem, you have a small problem.

It's gotta be automated though, but with an easy way for a person to pause if something is going wrong that the automation doesn't catch. If the pace is several updates a day, that's too much for people, IMHO.

mr_mitm · on July 25, 2024

Which EDR vendor provides a mechanism for testing virus signatures? This is the first time I'm hearing it and I'd like to learn more to close that knowledge gap. I always thought they are all updated ASAP, no exceptions.

toast0 · on July 25, 2024

Microsoft Defender isn't the most sophisticated EDR out there, but you can manage its updates with WSUS. It's been a long time since I've been subject to a corporate imposed EDR or similar, but I seem to recall them pulling updates from a company owned server for bandwidth savings, if nothing else. You can trickle update those with network controls even if the vendor doesn't provide proper tools.

If corporate can't figure out how to manage software updates on their managed systems, the EDR software is the command and control malware the EDR software is supposed to prevent.

packetlost · on July 24, 2024

Yes? Not consumers typically, but many IT departments with certain risk profiles absolutely do.

Fire-Dragon-DoL · on July 25, 2024

Now let's see if Microsoft listen and fixes Windows updates

squirrel · on July 19, 2024

BBC live coverage: https://www.bbc.co.uk/news/live/cnk4jdwp49et