He is implying that the scientists involved haven't thought of those questions, when in reality this field is one of the strictest in terms of statistical procedures like pre-registration, blinding, multiple hypothesis testing, etc.
Also he makes many factual claims that are just incorrect.
Just seems like an extremely arrogant guy who hasn't done his homework
Computers are a “complete” system where everything they do is inspectable and, eventually, explainable, and I have observed that people who work with computers (myself included) overestimate their ability to interrogate and explain complex, emergent systems - economics, physics, etc. - which are not literally built on formal logic.
A single computer might be complete (even then, not everything is inspectable unless you have some very expensive equipment), but distributed systems are not.
There was an entire class of engineers at Google - SREs - many of whom were previously physicists (or experts in some other quantitative field). A fraction of them (myself included) were "cluster whisperers" - able to take a collection of vague observations and build a testable hypothesis of why things were Fucked At Scale In Prod, then come up with a way to fix it that didn't mess up the rest of the complete system.
Nothing - not even computers - is truly built on formal logic. They are fundamentally physics-driven machines with statistical failure rates. There's nothing quite like coming across a very expensive computer which occasionally calculates the equivalent of 1*1 = inf, simply because some physical gates have slightly more electrical charge on them due to RF from a power supply that's 2 feet away.
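As a concrete illustration of how little it takes: in an IEEE 754 double, 1.0 and +inf differ by a single exponent bit, so one misbehaving gate is enough to turn a correct product into infinity. A minimal Python sketch (the specific bit position and the XOR are just a stand-in for a hypothetical single-bit upset, not a model of any real hardware fault):

```python
import struct

def to_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a double."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(b: int) -> float:
    """Rebuild a double from its 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]

product = 1.0 * 1.0                                 # what the ALU should produce
flipped = from_bits(to_bits(product) ^ (1 << 62))   # flip the top exponent bit

print(product)  # 1.0
print(flipped)  # inf
```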
I think you're mixing up two different things: the challenges of building these systems at scale, and their fundamental properties. Take your example of the expensive computer returning 1*1 = inf because of a nearby power supply - that actually proves my point about computers being knowable systems. You were able to track down that specific environmental interference precisely because computers are built on logic with explicit rules and dependencies. When errors like that are caught, it is because they do not conform to the rules of the system - rules that we ourselves explicitly defined. We can measure and understand their failures exactly because we designed them.
Even massive distributed systems, while complex, still follow explicit rules for how they change state. Every bit of information exists in a measurable form somewhere. Sure, at Google scale we might not have tools to capture everything at once, and no single person could follow every step from electrical signal to final output. But it's theoretically possible - which is fundamentally different from natural systems.
You could argue the universe itself is deterministic (and philosophically, I agree), but in practice, the emergent systems we deal with - like biology or economics - follow rules we can't fully describe, using information we can't fully measure, where complete state capture isn't just impractical, it's impossible.
To simply illustrate your point: if you see a computer calculate 1*1=∞ occasionally, you know the computer is wrong and something is causing it to break.
If you see a particle accelerator occasionally make an observation that breaks the Standard Model, then depending on what it breaks you can be very confident that the observation is wrong - but you cannot know that with absolute certainty.
> when in reality this field is one of the strictest in terms of statistical procedures like pre-registration, blinding, multiple hypothesis testing, etc.
I'm not in HEP, but my graduate work had overlap with condensed matter physics. I worked with physics professors/students in a top 10 physics school (which had Nobel laureates, although I didn't work with them).
Things may have changed since then, but the majority of them had no idea what pre-registration meant, and none had taken a course on statistics. In most US universities, statistics is not required for a physics degree (although it is for an engineering one). When I probed them, the response was "Why should we take a whole course on it? We study what we need in quantum mechanics courses."
No, my friend. You studied probability. Not statistics.
Whatever you can say about reproducibility in the social sciences, a typical professor in those fields knew and understood an order of magnitude more statistics than physicists.
As an ex-HEP, I can confirm that yes, we had blinding and did correct for multiple hypothesis testing explicitly. As Kyle Cranmer points out, we called it the "look elsewhere effect." Blinding is enforced by the physics group. You are not allowed to look at a signal region until you have basically finished your analysis.
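To make the look-elsewhere correction concrete, here's a rough sketch of the idea (the Sidak-style formula and the independent-windows assumption are simplifications for illustration, not the actual procedure an experiment uses):

```python
from scipy.stats import norm

def global_significance(local_sigma: float, n_trials: int) -> float:
    """Dilute a local one-sided significance by a trials factor,
    pretending the n_trials search windows are independent."""
    p_local = norm.sf(local_sigma)                   # one-sided local p-value
    p_global = 1.0 - (1.0 - p_local) ** n_trials     # a bump could appear anywhere
    return norm.isf(p_global)                        # back to sigma

# A 3-sigma local bump, hunted over ~100 independent mass windows,
# is only about a 1-sigma effect globally.
print(round(global_significance(3.0, 100), 2))
```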
For pre-registration, this might be debatable, but what I meant was that we have teams of people looking for specific signals (SUSY, etc.). Each of those teams would have generated Monte Carlo simulations of their signals and compared those with backgrounds. Generally speaking, analysis teams were looking for something specific in the data.
However, there are sometimes more general "bump hunts", which you could argue didn't have pre-registration. But on the other hand, they are generally looking for bumps with a specific signature (say, two leptons).
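As a rough sketch of what that pre-specified signal-vs-background comparison looks like (all yields invented; the significance formula is the standard asymptotic approximation for a counting experiment, not any particular analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expected yields in a pre-specified signal region,
# as they might come out of (imaginary) Monte Carlo.
b_expected = 120.0   # expected background events
s_expected = 25.0    # expected signal events

# Asymptotic significance for a counting experiment:
# Z = sqrt(2 * ((s + b) * ln(1 + s / b) - s))
z = np.sqrt(2 * ((s_expected + b_expected)
                 * np.log(1 + s_expected / b_expected) - s_expected))
print(f"expected significance: {z:.2f} sigma")

# One pseudo-experiment: a Poisson fluctuation around background + signal.
observed = rng.poisson(b_expected + s_expected)
print(f"observed {observed} events vs {b_expected:.0f} expected background")
```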
So yes, people in HEP generally are knowledgeable about stats... and yes, this field is extremely strict compared to, say, psychology.