> "I've experienced 100 error-free hours, so this is a non-issue for me and my users"
It's a statement of fact: it has been a non-issue for me. If you're like me, it's statistically reasonable to assume it will be a non-issue for you too. Also, no users, just me. "Probably okay" is more than good enough for me, and I'm sure many people have similar requirements (clearly not you).
I have no optimism, just no empathy for the negligent: I learned my lesson with backups a long time ago. Some people blame the filesystem instead of their backup practices when their data is corrupted, but I think that's naive. The filesystem did you a favor, fix your shit. Next time it will be your NAS power supply frying your storage.
It's also a double-edged sword: the more reliable a filesystem is, the longer users can get away without backups before being bitten, and the greater their ultimate loss will be.
> No! This simply does not follow from the first statement, statistically or otherwise.
> You and I might or might not be fine; you having been fine for 100 hours on the same configuration just offers next-to-zero predictive power for that.
You're missing the forest for the trees here.
It is predictive ON AVERAGE. I don't care about the worst case like you do: I only care about the expected case. If I died when my filesystem got corrupted... I would hope it's obvious I wouldn't approach it this way.
Adding to this: my laptop has this btrfs bug right now. I'm not going to do anything about it, because it's not worth 20 minutes of my time to rebuild my kernel for a bug that is unlikely to bite before I get the fix in 6.9-rc1, and would only cost me 30 minutes of time in the worst case if it did.
I'll update if it bites me. I've bet on much worse poker hands :)
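If you want the bet spelled out, here's a minimal sketch of the arithmetic. The 20- and 30-minute figures are the ones above; the probability of being bitten before 6.9-rc1 is the unknown being bet on:

```python
# Expected-cost comparison for the bet described above. The 20- and
# 30-minute figures come from the comment; the probability that the bug
# bites before the fix lands is the unknown.
cost_rebuild_now = 20   # minutes spent rebuilding the kernel today, for certain
cost_if_bitten = 30     # minutes lost only if the race actually fires

# Doing nothing has the lower expected cost whenever
#   p_bite * cost_if_bitten < cost_rebuild_now
break_even = cost_rebuild_now / cost_if_bitten
print(f"Waiting wins as long as P(bitten before the fix) < {break_even:.2f}")  # ~0.67
```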
Well, from your data (100 error-free hours, sample size 1) alone, we can only conclude this: “The bug probably happens less frequently than every few hours”.
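To put a rough number on that: with zero failures in 100 hours, the standard zero-failure bound only rules out fairly high rates. A minimal sketch, assuming a constant failure rate and a conventional 95% confidence level (neither stated in this thread):

```python
# Zero-failure upper bound ("rule of three" style) for 100 error-free hours.
# Assumes a constant failure rate; the 95% level is a conventional choice.
import math

hours_observed = 100
confidence = 0.95

# With zero events in t hours, P(no failures) = exp(-lam * t); the largest
# rate still consistent with the observation solves exp(-lam * t) = 1 - confidence.
upper_rate = -math.log(1 - confidence) / hours_observed   # ~0.03 failures/hour
print(f"95% upper bound: ~{upper_rate:.3f} failures/hour "
      f"(can't rule out one failure every {1 / upper_rate:.0f} hours)")
```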
Is that reliable enough for you? Great! Is that “very rare”? Absolutely not for almost any type of user/scenario I can imagine.
If you’re making any statistical arguments beyond that data, or are implying more data than that, please provide either, otherwise this will lead nowhere.
I don't care about the aggregate: I only care about me and my machine here.
> The expected case after surviving a hundred hours is that you're likely to survive another hundred.
That's exactly right. I don't expect to accrue another hundred hours before the new release, so I'll likely be fine.
> Which is a completely useless promise.
Statistics is never a promise: that's a really naive concept.
> at reasonable time scales for an OS
The timescale of the OS install is irrelevant: all that matters is the time between when the bug is introduced and when it is fixed. In this case, about nine months.
> Even so, "likely" here is something like "better than 50:50". Your claim was "very very rare" and that's not supported by the evidence.
You're free to disagree, obviously, but I think it's accurate to describe a race condition that doesn't happen in 100 hours on multiple machines with clock rates north of 3GHz as "very very rare". That particular code containing the bug has probably executed tens of millions of times on my little pile of machines alone.
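For the curious, the back-of-envelope behind "tens of millions" looks something like this; the machine count and per-second execution rate are illustrative guesses, not measurements:

```python
# Illustrative back-of-envelope only: machine count and per-second rate of
# the affected code path are guesses; the 100 hours is from the comment.
hours = 100
machines = 3                   # hypothetical "multiple machines"
executions_per_second = 30     # hypothetical rate for the buggy path

total = hours * 3600 * executions_per_second * machines
print(f"~{total:,} executions without once losing the race")  # ~32,400,000
```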
> It's a promise of odds with error bars, don't be so nitpicky.
No, it's not. I'm not being nitpicky, the word "promise" is entirely inapplicable to statistics.