Then a C compiler's optimizer decides that it knows better and changes the table underneath, or maybe it doesn't, but the Intel microcode unit decides to rewrite the assembly instructions in an unexpected way.
Or that spot we were 100% certain was a bottleneck, since it clearly wasn't micro-optimized C code, shows up as a 0.1% hit in the VTune profiler.