leereeves • today at 9:03 AM

It could be written more clearly, but I think that when it refers to a 4x and a 3x slowdown, it actually means a 4x slowdown plus 3x larger code that causes cache misses, and the impact of those cache misses on runtime is surely much larger than 3x.

> Each individual iteration: ~4x slower (register spilling)

> Cache pressure: ~2-3x additional penalty (instructions don't fit in L1/L2 cache)

> Combined over a billion iterations: 158,000x total slowdown

I think that "2-3x additional penalty" refers to this:

> The 2.78x code bloat means more instruction cache misses, which compounds the register spilling penalty.

Also, the analysis refers elsewhere to other factors that weren't included in this part.
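
To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes the quoted figures are constant per-iteration multipliers (my assumption, not something the analysis states), in which case the iteration count cancels out of the ratio and the two factors alone give only 8-12x. The function and constant names are made up for illustration.

```python
def combined_slowdown(per_iteration_factors):
    """Multiply constant per-iteration slowdown factors together.

    Assumption: each factor applies uniformly to every iteration, so the
    iteration count N cancels out of (N * t_slow) / (N * t_fast).
    """
    total = 1.0
    for factor in per_iteration_factors:
        total *= factor
    return total


# Figures quoted from the analysis.
SPILL_FACTOR = 4.0                       # "~4x slower (register spilling)"
CACHE_FACTOR_LOW, CACHE_FACTOR_HIGH = 2.0, 3.0  # "~2-3x additional penalty"
REPORTED_TOTAL = 158_000.0               # "158,000x total slowdown"

low = combined_slowdown([SPILL_FACTOR, CACHE_FACTOR_LOW])
high = combined_slowdown([SPILL_FACTOR, CACHE_FACTOR_HIGH])

# Whatever is left after the 4x spilling penalty has to come from
# cache misses and the other factors the analysis mentions elsewhere.
remaining = REPORTED_TOTAL / SPILL_FACTOR

print(f"4x times 2-3x alone: {low:.0f}x to {high:.0f}x")
print(f"left to explain after the 4x: {remaining:,.0f}x")
```

So if the 4x is taken at face value, the remaining factor is about 39,500x, which is why reading the "2-3x" as code size rather than as the runtime penalty makes more sense to me.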
