You might not need it, but gcc/clang have __builtin_unreachable and msvc has __assume and both are used extensively for optimizations. std::unreachable is just standardizing existing diverging practice.
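For reference, a minimal sketch of the kind of wrapper people wrote before std::unreachable existed, assuming GCC, Clang, or MSVC. The names `unreachable_hint` and `mod8_nonnegative` are made up for illustration and do not come from any library; whether the compiler actually exploits the hint is up to the optimizer.

```cpp
#include <cassert>

// Hypothetical helper name; wraps the two vendor spellings mentioned above.
[[noreturn]] inline void unreachable_hint() {
#if defined(_MSC_VER) && !defined(__clang__)
    __assume(false);          // MSVC spelling
#else
    __builtin_unreachable();  // GCC and Clang spelling
#endif
}

// The caller promises i >= 0, which may let the compiler turn the
// signed remainder into a cheaper bit mask. Breaking the promise is UB.
inline int mod8_nonnegative(int i) {
    if (i < 0) unreachable_hint();
    return i % 8;
}

// Usage: only ever called with values that honor the promise.
inline void wrapper_usage_example() {
    assert(mod8_nonnegative(13) == 5);
    assert(mod8_nonnegative(8) == 0);
}
```

Calling `mod8_nonnegative(-1)` would be undefined behavior, which is exactly the trade being discussed in this thread.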
If a code path that's supposed to be unreachable is reached then the program is already broken. Unless it has a bug, a compiler will not make a program more broken. At worst (or best, depending on how you look at it) it will only make any bugs it already has more obvious.
Yes, but what was unreachable may become reachable as I make changes to the program. I see the utility when writing highly optimized library code.
But for my purposes I'd much rather put in something that alerts me (an exception, logging, whatever) that "this should never happen" than have the program crash, or worse, do something undefined.
The point is that you're telling the compiler "I don't care what happens if control reaches here. Assume it never will and use that information to better optimize the rest". It's not an alternative to abort() or throwing because the compiler still needs to generate code for them.
Yes, that’s what I mean, the compiler can completely remove a branch which is unreachable, along with any checks. If it was forced to keep the checks and report an error instead, any bug would be more obvious. If the compiler removes checks for bugs, then the application could silently continue, obscuring the bug.
So this is a “hide bugs but maybe improve performance” function.
No, it doesn't hide bugs. It has no defined effect on the obviousness of bugs. The reason is that modern compilers are pretty much theorem solvers, and they're able to propagate truth values in order to deduce facts about programs. For example, a compiler could deduce that since a branch is never reached, a particular pointer is never null, and could therefore skip a null check that it deduced was redundant that would have prevented a null dereference. If it turns out that the branch is reachable and the pointer is null, the pointer would be dereferenced and the program would crash immediately (typically). But UB is UB; once you hit it all behaviors are permissible, from immediate crash, to silent data corruption, to nasal demons.
I find it unhelpful to frame undefined behavior as an “escape hatch” that lets the compiler do anything it wants. Compiler writers aren’t gleefully hunting for undefined behavior and using its presence as an excuse to make your life miserable. Instead, the weirdness arises because the compiler makes certain assumptions and transforms the code in ways that depend on them. For example:
1. Null pointers are never dereferenced.
2. Ptr *p is dereferenced.
3. p therefore cannot be null.
4. Ergo, we can omit checking whether p is null.
If the initial premise isn’t actually true (i.e., you slip up and dereference a null pointer), the chain of logic breaks down and boom! Applying many such rules can certainly lead to weird emergent behavior—and maybe you should act as if anything can happen—but it’s not a total free-for-all.
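The chain of reasoning above can be sketched in code. This is an illustrative function, not taken from any real codebase, and whether a given compiler actually removes the check depends on the compiler and the optimization level.

```cpp
#include <cassert>

// Hypothetical example: the load comes first, so an optimizer may treat
// the later null test as always false and drop it.
inline int load_then_check(const int* p) {
    int value = *p;      // step 2: p is dereferenced
    if (p == nullptr)    // step 4: this check may be omitted
        return -1;
    return value;
}

// Usage with a valid pointer, where the premise holds.
inline void null_check_usage_example() {
    int x = 42;
    assert(load_then_check(&x) == 42);
}
```

Passing a null pointer breaks the premise at the load itself, so the `-1` path cannot be relied on even if the check is still present in the generated code.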
If you have undefined behavior, your program is already broken. No such thing as "more broken"; there's already no theoretical limit to what might happen if it gets triggered.
Undefined behavior is considered worse than crashing, which is typically the alternative when reaching "unreachable" codepaths.
Compare these two blocks, similar to the ones in the article:
switch (ch) {
    case 'a': do_a(); return;
    case 'd': do_d(); return;
    // ch is guaranteed to be 'a' or 'd' by previous code.
    default: assert(0);
}
switch (ch) {
    case 'a': do_a(); return;
    case 'd': do_d(); return;
    default: std::unreachable();
}
If the programmer is wrong about `ch` in the first one, the program terminates.
For the second one, the compiler could change it to be equivalent to something like
if (ch == 'a') do_a(); else do_d();
If the programmer is wrong here, the program might `do_d()` with unintended consequences. I'd say "going down unintended codepaths" is typically considered worse than crashing.
I take your point, but you can get the behavior of your first example, while still marking the default branch with std::unreachable(), by asserting the preconditions before the switch. This seems to me to be a pretty general equivalence.
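A rough sketch of that combination, assuming GCC or Clang, with `__builtin_unreachable()` standing in for std::unreachable() so it also builds before C++23. `dispatch`, the counters, and the bodies of `do_a`/`do_d` are placeholders invented for the example.

```cpp
#include <cassert>

// Placeholder handlers that just count calls so the effect is observable.
static int a_calls = 0;
static int d_calls = 0;
static void do_a() { ++a_calls; }
static void do_d() { ++d_calls; }

static void dispatch(char ch) {
    // Precondition checked up front; active unless NDEBUG is defined.
    assert(ch == 'a' || ch == 'd');
    switch (ch) {
        case 'a': do_a(); return;
        case 'd': do_d(); return;
        default: __builtin_unreachable();  // stand-in for std::unreachable()
    }
}

// Usage: both valid inputs go through their own handler.
static void dispatch_usage_example() {
    dispatch('a');
    dispatch('d');
    assert(a_calls >= 1 && d_calls >= 1);
}
```

In a debug build a bad `ch` trips the assert before the switch is reached; with NDEBUG defined the assert vanishes and only the hint remains.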
So what does std::unreachable() do here? In this particular case, and with NDEBUG defined and any level of optimization selected, I suspect that, at a minimum, the switch would be replaced as you have shown in all versions - it would take a more complex example to show how std::unreachable() makes a difference. The point is, now we have a choice - and it is one that is being offered without creating any backwards-compatibility issues.
Furthermore, the original function, without assertions, is not guaranteed to crash, with or without std::unreachable(). You need some explicit checks to get a desirable response in the case where a mistake has been made, and that option is just as available whether or not you use std::unreachable().
Therefore, while I agree you have shown that not all broken variants of a given program are equivalent, this does not show that std::unreachable() is harmful.
One thing people are missing is that you may need this to suppress a warning (or an error, if the equivalent of -Werror is enabled) that the compiler would otherwise emit. If you have strict warnings on but no way to tell the compiler that a location should not be reachable, you wind up doing something like `assert(!"not reached!")`, which isn't enabled in all build types. This is similar to how `__attribute__((noreturn))` is used on a "fatal" function that isn't always enabled but needs to convey its intent to static analyzers so that they stop evaluating branches past that call.
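One common shape of this, sketched under the assumption of GCC or Clang: a switch that returns for every enumerator, where GCC's -Wreturn-type may still complain that control reaches the end of a non-void function. `Mode` and `mode_name` are invented for the example.

```cpp
#include <cassert>
#include <cstring>

enum class Mode { Read, Write };

inline const char* mode_name(Mode m) {
    switch (m) {
        case Mode::Read:  return "read";
        case Mode::Write: return "write";
    }
    // Every enumerator returns above, yet the compiler may still warn
    // about falling off the end of the function without this line.
    __builtin_unreachable();
}

// Usage with the valid enumerators only.
inline void mode_name_usage_example() {
    assert(std::strcmp(mode_name(Mode::Read), "read") == 0);
}
```

Unlike an assert, the marker is present in every build type, so the warning stays quiet whether or not NDEBUG is defined.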
The literal example in the Clang documentation is not about optimization:
> For example, without the __builtin_unreachable in the example below, the compiler assumes that the inline asm can fall through and prints a “function declared ‘noreturn’ should not return” warning.
> the compiler assumes that the inline asm can fall through and prints a "function declared 'noreturn' should not return" warning.
It actually can (Hint: what happens if the operating system `iret`s from its int 3 handler?), although it's probably not an issue in practice. Regardless, you don't need __builtin_unreachable to write:
void myabort(void) __attribute__((noreturn));
void myabort(void) {
asm("int3");
myabort(); // might need `return myabort();` to force TCO,
// but gcc doesn't like that and it should work anyway
}
# Assuming tail-call optimization etcetera, this produces:
myabort:
int3
jmp myabort
which is a correct implementation.
However, if you're implementing built-in/standard functions like abort, you presumably know what compiler you're using and don't need a std interface in the first place. There's zero legitimate reason to use an undefined-behaviour-based `unreachable` in application code.
Claiming std::unreachable is useful for implementing abort is like proposing a std::manual_copy function because your compiler optimized an implementation of memcpy to a call to itself. At some point you do in fact have to resort to implementation-specific details to define the abstractions that abstract away said details, and "in literally the same function as the (also-nonstandard, IIRC) inline assembly that hopefully doesn't return" seems at, if not noticeably past, that point.
It seems like optimization cases where this makes sense are generally of the sort where `noreturn` can't be used because it is conditional. That makes sense to me (although, could this have covered most use cases by making `noreturn` support being passed the name of a function argument?).
One example was interesting, though. I could see someone believing these might produce the same optimized assembly (-O2):
void a(int& x, int& y) {
if (&x == &y) __builtin_unreachable();
x ^= y; y ^= x; x ^= y;
}
void b(int& __restrict x, int& __restrict y) {
x ^= y; y ^= x; x ^= y;
}
Is this something compiler contributors would optimize once they know about it? A similar question probably exists for using `__builtin_unreachable` when values aren't aligned versus using `__builtin_assume_aligned`.
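To make the alignment variant of the question concrete, here is a sketch of the two spellings side by side, assuming GCC or Clang. The function names and the 32-byte figure are arbitrary choices for illustration; whether the two produce the same assembly is exactly the open question, so no claim is made about that here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Variant 1: express the alignment promise through an unreachable branch.
inline float sum_hint_unreachable(const float* p, std::size_t n) {
    if (reinterpret_cast<std::uintptr_t>(p) % 32 != 0) __builtin_unreachable();
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Variant 2: express the same promise through the dedicated builtin.
inline float sum_hint_aligned(const float* p, std::size_t n) {
    const float* q =
        static_cast<const float*>(__builtin_assume_aligned(p, 32));
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += q[i];
    return s;
}

// Usage: the buffer really is 32-byte aligned, so both promises hold.
inline void alignment_usage_example() {
    alignas(32) float data[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    assert(sum_hint_unreachable(data, 8) == 36.0f);
    assert(sum_hint_aligned(data, 8) == 36.0f);
}
```

Comparing the -O2 output of the two functions in a compiler explorer would show whether a given compiler version treats them alike.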
I'd love to see these things illustrated with more real-world examples.
Attempting to emulate restrict with __builtin_unreachable was one of the first things I tried when I learned about unreachable. I periodically try again, but I have generally been underwhelmed by attempts to give gcc value-range information with it.
I felt physically ill when I read:
> It’s intended to be used when you know you have an execution path in your code that cannot be reached but the compiler cannot figure that out.
It felt like saying to the compiler, "please, find a way to make my program break even more easily". Exactly not what I need.