In that case, on a computer with an instruction cache, you can probably optimize your program by not unrolling the loops that aren't hot: unrolled code is larger, so it competes with the hot code for cache space. Maybe you can compile those parts of the program with GCC.
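As a rough sketch of what that could look like with GCC: the `hot` and `cold` function attributes and `#pragma GCC unroll` (GCC 8 and later) let you treat individual functions and loops differently within one file. The function names `checksum_hot` and `checksum_cold` are made up for illustration, and whether this actually helps depends on your program and your cache, so measure before and after.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hot path: runs constantly, so GCC is left free to
   optimize it aggressively, including unrolling if it sees fit. */
__attribute__((hot))
uint32_t checksum_hot(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Hypothetical cold path: rarely runs, so small code matters more
   than speed. The cold attribute tells GCC to optimize this function
   for size rather than speed. */
__attribute__((cold))
uint32_t checksum_cold(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    /* Ask GCC not to unroll the loop that follows. */
#pragma GCC unroll 1
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}
```

Another option, if the cold code lives in its own source files, is to compile just those files with `-Os` and the rest with `-O2` or `-O3`.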
Micro-optimizations are in many cases not justifiable, for the reasons Knuth explains in his paper. But sometimes they are.