
Your compiler does not hallucinate registers or JMP instructions


Doesn't it? Many compilers offer all sorts of novel optimizations that produce the same result with entirely different runtime characteristics than the source code would imply. Going further, turn on GCC's -ffast-math and your code with no undefined behavior suddenly has undefined behavior.
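A minimal sketch of the -ffast-math case (whether the check actually gets folded away depends on your GCC version and target):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        volatile double zero = 0.0;  /* volatile so the NaN survives folding */
        double x = zero / zero;      /* NaN at runtime */
        /* -ffast-math implies -ffinite-math-only, so GCC may assume no
           NaNs exist and fold this check to false; the guard silently
           disappears even though the source has no UB. */
        if (isnan(x))
            printf("caught a NaN\n");
        else
            printf("no NaN here\n");
        return 0;
    }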

I'm not much of a user of LLMs for generating code myself, but this particular analogy isn't a great fit. The one redeeming quality is that compiler output is deterministic, or at least repeatable, whereas LLMs have some randomness thrown in intentionally.

With that said, both can give you unexpected behavior, just in different ways.


> With that said, both can give you unexpected behavior, just in different ways.

Unexpected as in "I didn't know" is different from unexpected as in "I can't predict". GCC optimizations are in the former camp: if you care to know, you just need to do a deep dive into your CPU architecture and the GCC docs and codebase. LLMs are a true shot in the dark, with a high chance of a miss and a slightly lower chance of friendly fire.
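Signed overflow is the usual example of the "I didn't know" camp. A minimal sketch (the function name is made up; the exact output depends on optimization level and GCC version):

    #include <limits.h>
    #include <stdio.h>

    /* Signed overflow is UB, so GCC is allowed to fold (x + 1 > x)
       straight to 1 (see -fstrict-overflow / -fwrapv in the manual). */
    int always_bigger(int x) {
        return x + 1 > x;
    }

    int main(void) {
        /* At -O2 this typically prints "bigger" even for INT_MAX;
           at -O0 the addition may wrap and print "wrapped". */
        printf("%s\n", always_bigger(INT_MAX) ? "bigger" : "wrapped");
        return 0;
    }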


And every single developer is supposed to memorize GCC's optimizations so they never make a change that gets optimized in a way they didn't expect?

Nah, treating undefined behavior as predictable is a fool's errand. It's also a shot in the dark.


What is that about memorization? I just need to know where the information is so I can refer to it later when I need it (and possibly archive it if it's that important).


If you're not trying to memorize the entirety of GCC's behavior (and keeping up with its updates), then you need to check if your UB is still doing what you expect every single time you change your function. Or other functions near it. Or anything that gets linked to it.

It's effectively impossible to rely on. Checking at the time of coding, or occasionally spot checking, still leaves you at massive risk of bugs or security flaws. It falls under "I can't predict".


In C, strings are just plain arrays of characters, which decay to pointers. There are no safeguards like we have in Java, so we need to write the guardrails ourselves, because failure to do so results in errors. If you didn't know about that, a buffer overflow may be unexpected, but you don't need to go and memorize the entire GCC codebase to know better. Just knowing the semantics is enough, as in the sketch below.
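A minimal sketch of such a hand-rolled guardrail (safe_copy is a made-up helper for illustration, not a library function):

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical guardrail: refuse to copy if dst is too small,
       instead of overflowing it the way strcpy happily would. */
    int safe_copy(char *dst, size_t dst_size, const char *src) {
        size_t len = strlen(src);
        if (len + 1 > dst_size)   /* +1 for the terminating NUL */
            return -1;            /* would overflow: refuse */
        memcpy(dst, src, len + 1);
        return 0;
    }

    int main(void) {
        char buf[8];
        if (safe_copy(buf, sizeof buf, "far too long for the buffer") != 0)
            fprintf(stderr, "refused: input would overflow buf\n");
        return 0;
    }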

The same thing happens with optimization. The docs usually warn about the gotchas, and it's easy to check whether the errors will bother you or not. You don't have to make an exhaustive list of errors when the classes are neatly defined.


This comment seems to mostly be describing avoiding undefined behavior. You can learn the rules to do that, though it's very hard to completely avoid UB mistakes.

But I'm talking about code that does have undefined behavior. If there is any at all, you can't reliably learn which optimizations will happen and which will break your code. And you can't check for incorrect optimization in any meaningful way, because the result can change at any point in the future for the tiniest of reasons. You can try to avoid this situation, but again, it's very hard to write code that has exactly zero undefined behavior.

When you talked about doing "a deep dive in your CPU architecture and the gcc docs and codebase", that is only necessary if you do have undefined behavior and you're trying to figure out what actually happens. But it's a waste of effort here. If you don't have UB, you don't need to do it. If you do have UB, it's not enough, not nearly enough. It's useful for debugging, but it won't predict whether your code is safe into the future.

To put it another way: if we're looking at optimizations listing gotchas, then when there's UB it's as if half the optimizations in the entire compiler are annotated with "this could break badly if there's UB". You can't predict it.
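A sketch of what that unpredictability looks like in practice (whether this terminates depends on the GCC version and optimization level, which is exactly the point):

    #include <stdio.h>

    int main(void) {
        int count = 0;
        /* i eventually overflows INT_MAX: undefined behavior. GCC at -O2
           may assume the overflow can't happen, treat i >= 0 as always
           true, and produce an infinite loop; at -O0 it typically wraps
           and exits after a few iterations. Same source, different
           binaries, and nothing pins the result down across versions. */
        for (int i = 0; i >= 0; i += 1000000000)
            count++;
        printf("%d\n", count);
        return 0;
    }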


I suppose you are talking about UB? I don't think that is anything like hallucination. It's just tradeoffs being made (speed vs. the specified instructions) with more ambiguity (UB) than one might want. -ffast-math is basically the same idea: you should probably never turn it on unless you are willing to trade accuracy for speed and accept a bunch of new UB that your libraries may never have been designed for. It's not like the compiler is making up new instructions that the hardware doesn't support, or claiming the behavior of an instruction is different than documented. If it ever did anything like that, it would be a bug, and it would be fixed.


> or claiming the behavior of an instruction is different than documented

When talking about undefined behavior, the only documentation is a shrug emoticon. If you want a working program, arbitrary undocumented behavior is just as bad as incorrectly documented behavior.


UB is not undocumented. It is documented to not be defined. In fact any given hardware reacts deterministically in the majority of UB cases, but compilers are allowed to assume UB was not possible for the purposes of optimization.
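The canonical illustration of that assumption (a sketch; whether the branch actually gets deleted depends on the compiler and flags):

    #include <stddef.h>

    /* Hypothetical function, modeled on the classic null-check-elision
       pattern. The dereference is UB if p is NULL; since UB "can't
       happen", the optimizer may infer p != NULL and drop the check. */
    int read_len(const int *p) {
        int len = *p;       /* UB if p == NULL */
        if (p == NULL)      /* may be removed entirely at -O2 */
            return 0;
        return len;
    }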


The trigger for UB is documented, the result is not documented.

And despite the triggers being documented, they're very very hard to avoid completely.
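Even textbook arithmetic trips the triggers. A minimal sketch (midpoint is a made-up name; assumes lo <= hi and both non-negative):

    /* (lo + hi) / 2 can overflow int for large values, which is UB,
       even though both inputs are perfectly valid on their own.
       Rewriting keeps every intermediate value in range. */
    int midpoint(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }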


I bet they did at one point in time, then those bugs got fixed, but compilers are still not bug-free.


lol, are you serious? I'd bet compilers are less deterministic now than before, what with all the CPUs and their speculative execution and who knows what else. But all that stuff is still documented, not made up out of thin air at random…



