Hacker News

tripplyons yesterday at 10:29 PM

There are many ways to compute the same matrix multiplication that apply the sum reduction in different orders, which can produce different answers when using floating point values. This is because floating point addition is not truly associative because of rounding.
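A minimal Python demonstration of the point (the values are illustrative, not from the original comment):

```python
# Floating-point addition rounds after every operation, so the
# grouping of a sum changes the result: (a + b) + c != a + (b + c).
left = (0.1 + 0.2) + 0.3   # 0.1 + 0.2 rounds to 0.30000000000000004
right = 0.1 + (0.2 + 0.3)  # 0.2 + 0.3 rounds to exactly 0.5
print(left == right)       # False

# The same effect appears when a reduction accumulates its partial
# sums in a different order, as in a reordered matmul:
vals = [0.1, 0.2, 0.3]
print(sum(vals) == sum(reversed(vals)))  # False
```

Each grouping produces a perfectly valid rounded result; they just round at different intermediate points.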


Replies

spwa4 yesterday at 10:51 PM

Is that really going to matter in FP32, FP16 or BF16? I would think models would be written so they'd be at least somewhat numerically stable.

Also, if the inference provider guarantees specific hardware, this shouldn't happen.
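For a sense of scale: the effect is visible even at FP32. A sketch that emulates float32 rounding with Python's `struct` module; the values are chosen purely to sit near float32's 24-bit precision limit and expose the rounding:

```python
import struct

def f32(x):
    """Round a Python double to the nearest IEEE-754 float32."""
    return struct.unpack('f', struct.pack('f', x))[0]

# Near 1e8 a float32 ulp is 8.0, so a 3.0 added on its own is dropped,
# but 3.0 + 3.0 = 6.0 is big enough to round the sum upward.
a, b, c = 1.0e8, 3.0, 3.0
left = f32(f32(a + b) + c)   # each 3.0 absorbed: 100000000.0
right = f32(a + f32(b + c))  # 6.0 survives:      100000008.0
print(left, right)
```

In lower-precision formats like FP16 or BF16 the ulp is far larger, so the order-dependent discrepancies grow rather than shrink.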
