
Aardwolf yesterday at 6:29 PM (3 replies)

But this is not valuable if doing so produces different numerical results, and I think that will always happen if the ++ is executed at different times. There's no point in a compiler optimizing pointless code in a way that can silently give different results elsewhere.


Replies

Karliss yesterday at 10:52 PM

The same rule that makes the evaluation order of `a++ + ++a` unsequenced also applies to `(x+y+z+a+b+c)`, where x, y, z could be any expressions (in the sane case, ones operating on separate variables and without a mix of pre/post increments). Breaking questionable code and introducing UB where reordering changes the result is just a side effect of this.

Just switching between left-to-right and right-to-left evaluation wouldn't be that useful, but the rule also permits interleaving the subexpression evaluations. Grouping memory fetches/writes, and taking into account how many execution units and registers of different kinds a CPU has, can have real performance benefits.

For example, if you have something like `++a[0] + ++a[1] + ++a[2] + ++a[3]`, instead of evaluating each increment one by one, both GCC and Clang will vectorize it: load all 4 values from memory with a single SIMD instruction, increment them, and write the results back. And if you add a fifth element (but not eight), which needs a regular scalar instruction, that one is handled after the first four. If the standard required the left subexpression of an addition to be fully evaluated before the right one, this wouldn't be allowed.

AlotOfReading yesterday at 7:11 PM

Your compiler does many optimizations that break numerical reproducibility, especially with floats. I reviewed a PR the other day that wrote `X = A*B + (C*D) + E;` and when I checked 3 different compilers, each of them chose a different way to use FMAs.

Even with integer math you can get different numerical results via UB (e.g. an expression that triggers signed overflow when evaluated one way but not another).
