I did tests on Kahan summation recently on my macbook pro and -O3 defeated the a...

vlovich123 · on Oct 26, 2021

Sounds like a compiler bug to me. Can you file a bug against Clang with a reduced standalone test? (Or I can do it for you if you share the standalone test.)

kloch · on Oct 26, 2021

Here is a complete simplified Kahan summation test, and indeed it works with -O3 but fails with -Ofast. There must have been something else going on in my real program at -O3. However, the original point that 'volatile' can be a workaround for some optimization problems is still valid (you may want the rest of your program to benefit from -Ofast without breaking certain parts).

Changing the three kahan_* variables to volatile makes this work (slowly) with -Ofast.

  #include <stdio.h>

  int main(int argc, char **argv) {
    int i;
    double sample, sum;
    double kahan_y, kahan_t, kahan_c;

    // initial values
    sum=0.0;
    sample=1.0; // start with "large" value

    for (i=0; i <= 1000000000; i++) { // add 1 large value plus 1 billion small values
      // Kahan summation algorithm
      kahan_y=sample - kahan_c;
      kahan_t=sum + kahan_y;
      kahan_c=(kahan_t - sum) - kahan_y;
      sum=kahan_t;

      // pre-load next small value
      sample=1.0E-20;
    }
    printf("sum: %.15f\n", sum);
  }

vlovich123 · on Oct 26, 2021

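Correct. `-Ofast`'s claim to fame is that it enables `-ffast-math`, which is why it has huge warning signs around it in the documentation. `-ffast-math` lets the compiler treat floating-point arithmetic as associative, which is problematic for Kahan summation: with `kahan_t = sum + kahan_y`, the expression `(kahan_t - sum) - kahan_y` is algebraically zero, so the compensation term can be optimized away. Rather than sprinkling in volatiles, which pessimizes the compiler to no end, I would recommend annotating the problematic function to turn off associativity [1][2].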

Something like:

    [[gnu::optimize("no-associative-math")]]
    double kahanSummation() {
      ...
    }

That way the compiler applies all the optimizations it can and only turns off associative math. This should work on GCC and be net faster than the volatile version; Clang may not honor the `optimize` attribute, so check what it does there.
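
As a rough sketch of what such an annotation could look like in plain C: the GNU `optimize` attribute is a GCC feature, and as far as I know Clang does not implement it, so the Clang branch below falls back to `#pragma clang fp reassociate(off)`, which recent Clang releases accept at the start of a block. The function name `kahan_sum_array` and the `KAHAN_NO_REASSOC` macro are made up for illustration, and whether either mechanism fully protects the loop under `-ffast-math` is something to verify on your own toolchain.

```c
#include <stddef.h>

/* Hypothetical macro: on GCC, request -fno-associative-math for one function.
   On other compilers it expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__)
#  define KAHAN_NO_REASSOC __attribute__((optimize("no-associative-math")))
#else
#  define KAHAN_NO_REASSOC
#endif

KAHAN_NO_REASSOC
double kahan_sum_array(const double *values, size_t n) {
#if defined(__clang__)
    /* Assumption: Clang new enough to know this pragma; it disables
       floating-point reassociation for this block only. */
    #pragma clang fp reassociate(off)
#endif
    double sum = 0.0;
    double c = 0.0;   /* running compensation, initialized to zero */

    for (size_t i = 0; i < n; i++) {
        double y = values[i] - c;   /* apply the carried correction */
        double t = sum + y;         /* low-order bits of y may be lost */
        c = (t - sum) - y;          /* must not be simplified to 0 */
        sum = t;
    }
    return sum;
}
```

One design caveat: the GCC manual describes the `optimize` attribute as intended for debugging rather than production use, so moving the summation into its own translation unit compiled without `-ffast-math` may be the more conservative option.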

This is what I mean by "If you're sprinkling volatile around, you probably aren't doing what you want": you're just cargo-culting bad advice.

[1] https://stackoverflow.com/questions/26266820/in-clang-how-do...
[2] https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Function-Attrib...

hermitdev · on Oct 26, 2021

I hope this isn't the actual "real" code, because you've got undefined behavior before you even have to worry about the associativity optimizations. There's an uninitialized read of 'kahan_c' on the first loop iteration.
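
For reference, a minimal sketch of the same loop with the compensation term initialized, assuming it is built without `-ffast-math`. The helper names `kahan_sum_repeated` and `naive_sum_repeated` and their parameters are hypothetical; they are not from the program above.

```c
#include <stddef.h>

/* Hypothetical helper: adds `first` once, then `small` `count` times,
   using Kahan (compensated) summation. */
double kahan_sum_repeated(double first, double small, size_t count) {
    double sum = 0.0;
    double c = 0.0;          /* starts at zero, so no uninitialized read */
    double sample = first;

    for (size_t i = 0; i <= count; i++) {
        double y = sample - c;   /* apply the correction from the last step */
        double t = sum + y;      /* low-order bits of y may be lost here */
        c = (t - sum) - y;       /* capture the rounding error of that add */
        sum = t;
        sample = small;          /* every later iteration adds the small value */
    }
    return sum;
}

/* Naive accumulation of the same inputs, for comparison. */
double naive_sum_repeated(double first, double small, size_t count) {
    double sum = first;
    for (size_t i = 0; i < count; i++) {
        sum += small;
    }
    return sum;
}
```

Calling `kahan_sum_repeated(1.0, 1.0E-20, 1000000000)` mirrors the loop in the original test. The naive version stays at exactly 1.0 for those inputs, because each 1.0E-20 is far below half an ulp of 1.0 and is rounded away on every addition.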