
The "how likely is it, really?" response to questions of technical correctness has always bothered me. It takes a mindset completely alien to mine to say "Here's a race condition. Sure, it's undefined behavior, but the race is narrow, so it's rare" or to say "Sure, memory allocation can theoretically fail, but in practice almost never does" or to say "fsync is too slow and most computers have batteries these days".

Software is unreliable enough as it is due to problems beneath our notice. It seems reckless to avoid fixing problems that we do notice. Sure, you could argue that rare problems are rare and that users probably won't notice them, but this attitude is penny-wise and pound-foolish, because you can't meaningfully reason about a system that's only probably correct.



Engineering is about tradeoffs. How many once-in-a-thousand bugs do you fix before you tackle the one-in-a-million? Or one in a billion? What about if it takes $10/bug to fix every 1:1000 bug and $100,000 to fix one 1:1000000 bug?

Correctness is great in theory, but in practice it's a matter of what's important.


If you are only looking at probability and cost-to-fix you are overlooking something important - the cost if/when it happens.
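To make that concrete, here is a rough sketch of weighing a fix against the expected cost of the failure. The function names, the 10 million uses, and the per-failure dollar figures are all made up for illustration; only the $100,000 fix cost and the 1:1,000,000 odds come from the comment above. Expected value is also a simplification, since it says nothing about how bad a single failure is.

```python
def expected_loss(p_per_use: float, uses: int, cost_per_failure: float) -> float:
    """Expected total cost of a failure mode: probability x exposure x cost per failure."""
    return p_per_use * uses * cost_per_failure


def worth_fixing(fix_cost: float, p_per_use: float, uses: int, cost_per_failure: float) -> bool:
    """True when the expected loss from leaving the bug in exceeds the cost of fixing it."""
    return expected_loss(p_per_use, uses, cost_per_failure) > fix_cost


# Hypothetical inputs: a 1-in-a-million bug, 10 million uses, $100,000 to fix.
FIX_COST = 100_000
P = 1e-6
USES = 10_000_000

# Cheap failure (say, a retry that costs $50 of support time): not worth fixing.
print(worth_fixing(FIX_COST, P, USES, cost_per_failure=50))

# Expensive failure (say, $5,000,000 in damages): clearly worth fixing.
print(worth_fixing(FIX_COST, P, USES, cost_per_failure=5_000_000))
```

Same probability, same cost to fix, opposite answers. The only thing that changed is what happens when the bug actually fires.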


This is really emphasized in things like DFMEA and other failure mode analysis documents, or in regulated industries. They want you to document the likelihood, your ability to recover from the failure, as well as the cost of the failure. You can say that you didn't want to pay for someone fixing some unlikely failure mode, but that's small consolation to the people whose lives your product is ruining.


The problem you're latching on to, I think, is how the context for calculating a probability can vary.

If it were really as likely as, say, the sun exploding that X happened then it would be of no use to expend time on X.

BUT very often people speak about the probability of events given suspicious constraints. While a memory allocation might not fail in most situations, it will fail often in some situations. And a one-in-a-million chance is almost guaranteed when there are millions of uses.
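A quick sketch of that last point, assuming each use is an independent trial with the same failure probability (a simplification; real failures tend to cluster). The function name is mine, and the formula is just 1 - (1 - p)^n.

```python
import math


def prob_at_least_once(p: float, n: int) -> float:
    """Chance that an event with per-use probability p (0 <= p < 1)
    happens at least once in n independent uses."""
    # 1 - (1 - p)**n, computed with log1p/expm1 so tiny p values keep their precision.
    return -math.expm1(n * math.log1p(-p))


for uses in (1_000, 1_000_000, 5_000_000):
    print(f"{uses:>9,} uses: {prob_at_least_once(1e-6, uses):.4f}")
```

At a million uses a one-in-a-million event has roughly a 63% chance of having happened at least once, and by five million uses it is over 99%.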


Also worth considering that our processors are handling billions of ops per second. One in a million might be happening all the time even for one user.


That's why it's called one in a million...



One in a million happens quite often if you're processing something like ~100k requests a second.


In fact I agreed with the parent and just posted a tongue in cheek remark.

One in a million literally means that at ~100k requests a second it will happen once every 10 seconds.
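The arithmetic, as a small sketch (the function name is made up, and it assumes the event is equally likely on every request):

```python
def mean_seconds_between_events(p_per_request: float, requests_per_second: float) -> float:
    """Average time between occurrences of an event that hits each request with probability p."""
    events_per_second = p_per_request * requests_per_second
    return 1.0 / events_per_second


# One-in-a-million event at ~100k requests per second.
interval = mean_seconds_between_events(1e-6, 100_000)
print(f"about one event every {interval:.1f} seconds")

# The same rate expressed per day (86,400 seconds in a day).
print(f"about {86_400 / interval:,.0f} events per day")
```

That works out to roughly 8,640 occurrences a day, which is hardly "never happens in practice".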


But it's extremely unlikely when you're processing 10 requests a week, such as might be the case for the web server in a consumer-grade router.


It's amazing how skilled blackhats are at converting "rare bug that doesn't affect the UI" into "massive DDoS cannon".


And you can see how risk analyses by senior engineers with tons of embedded experience, who are used to working with systems that are not networked, lead to problems when their systems are later networked.


...by hundreds of thousands of customers.


One in a million isn't just a typical statement of probability, it's a colloquialism used to refer to things that never happen in practice. It's highly misleading to use in the context of computers, which by their nature have one-in-a-million events occurring constantly.


My comment was tongue in cheek.


>The "how likely is it, really?" response to questions of technical correctness has always bothered me.

But the question is important in another context: language design. Why is this undefined behavior something that exists in the first place? Objects larger than PTRDIFF_MAX could just not be allowed! This avoids the problem and makes code easier to reason about, with pretty much no downside.


I like the way you're thinking, but that sort of thing probably doesn't get past a committee. "Hey we might not be able to think of an application but that doesn't mean our users won't have a legitimate reason for doing it ... Motion passed."



