Hacker News

kevmo314, today at 1:40 PM

> This is not the first time we can see Nvidia taking shortcuts to achieve maximum performance of their GPUs

Why is implementing it correctly not performant? For context I have no idea how rounding is typically implemented anyways.


Replies

adrian_b, today at 4:32 PM

It is not correct because it does not implement the FP arithmetic standard and this can lead to much greater numerical errors than expected.

NVIDIA is not solely responsible, because the Microsoft DirectX specification includes the non-standard behavior.

Nevertheless, as shown in TFA, both the AMD and Intel GPUs allow the user to choose between correct behavior and incorrect behavior that might be faster, while NVIDIA ignores what the user requests and implements only the non-standard behavior.

The developers of graphics or ML/AI applications do not care about errors, but there are also people who want to use GPUs for normal computations, where the accuracy of the results matters, so they want to be able to choose between correct behavior and incorrect but faster behavior.

Actually, "faster" is a misnomer, because denormals can be handled correctly without reducing speed, but doing so costs additional die area. Thus what NVIDIA gains by not implementing the right behavior is a reduced production cost.
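
To make the accuracy point concrete, here is a rough sketch in Python. It assumes the non-standard behavior under discussion is flush-to-zero of denormal results, and it only emulates that on the CPU in double precision; `flush_to_zero` is a hypothetical helper written for this illustration, not any GPU or driver API.

```python
import math
import sys

# Smallest positive normal double (2**-1022 in IEEE 754 binary64).
SMALLEST_NORMAL = sys.float_info.min


def flush_to_zero(x: float) -> float:
    """Hypothetical helper: emulate flush-to-zero on a result.

    Any nonzero value smaller in magnitude than the smallest normal
    number (i.e. a denormal) is replaced by a zero of the same sign.
    """
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x


# Two distinct normal numbers whose difference is a denormal.
a = 1.5 * SMALLEST_NORMAL
b = 1.0 * SMALLEST_NORMAL

ieee_diff = a - b                 # gradual underflow keeps a nonzero result
ftz_diff = flush_to_zero(a - b)   # emulated flush-to-zero loses it entirely

print("a != b:          ", a != b)
print("IEEE a - b:      ", ieee_diff)
print("flushed a - b:   ", ftz_diff)
```

With standard gradual underflow, `a != b` guarantees `a - b != 0`. Under the emulated flush-to-zero that guarantee is lost, so code that divides by the difference or tests it against zero can behave very differently from what the standard promises.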
