Floor and Ceil versus Denormals on CPU and GPU

35 points • by ibobev • last Tuesday at 1:22 PM • 14 comments

Comments

petermcneeley • today at 4:36 PM

WebGPU (WGSL) handles this by having a specified accuracy for each operation.

https://www.w3.org/TR/WGSL/#concrete-float-accuracy

This is all fully tested in the CTS.

https://gpuweb.github.io/cts/standalone/?q=webgpu:shader,*
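
As a rough illustration of what an accuracy bound in ULPs means, here is a minimal sketch that counts how many representable f32 values separate two numbers. The helper names (`f32_bits`, `f32_ordered`, `ulp_distance`) are made up for this example and are not taken from the WGSL spec or the CTS; it assumes finite inputs and ignores NaN.

```python
import struct

def f32_bits(x: float) -> int:
    """Round x to float32 and return its 32-bit pattern as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def f32_ordered(x: float) -> int:
    """Map a float32 to an integer that increases with the float's value."""
    b = f32_bits(x)
    # Negative floats sort in reverse bit order, so mirror them below zero.
    # This also makes +0.0 and -0.0 map to the same integer.
    return -(b & 0x7FFFFFFF) if b & 0x80000000 else b

def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b (finite inputs only)."""
    return abs(f32_ordered(a) - f32_ordered(b))

if __name__ == "__main__":
    print(ulp_distance(1.0, 1.0 + 2.0 ** -23))  # adjacent normal values
    print(ulp_distance(0.0, 2.0 ** -149))       # zero and the smallest subnormal
```

Note that under this measure the smallest subnormal sits one step away from zero, which is why flushing subnormals changes results that a ULP-based bound would otherwise pin down.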

kevmo314 • today at 1:40 PM

> This is not the first time we can see Nvidia taking shortcuts to achieve maximum performance of their GPUs

Why is implementing it correctly not performant? For context I have no idea how rounding is typically implemented anyways.

Dwedit • today at 4:16 PM

Denormals happen to be the way that Zero can even be represented at all?
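
For reference, zero and denormals share the all-zero exponent field in IEEE 754 binary32, but zero is the case where the mantissa bits are also all zero, and it is treated as its own class. A minimal sketch that classifies a value from its bit pattern (the function name `classify_f32` is made up for this example):

```python
import struct

def classify_f32(x: float) -> str:
    """Classify x, rounded to float32, by its exponent and mantissa fields."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0xFF:
        return "nan" if mantissa else "infinity"
    if exponent == 0:
        # Same exponent field; only the mantissa tells zero from a subnormal.
        return "subnormal" if mantissa else "zero"
    return "normal"

if __name__ == "__main__":
    for value in (0.0, -0.0, 1e-40, 2.0 ** -126, 1.0):
        print(value, classify_f32(value))
```

So a flush-to-zero mode can discard every subnormal and still represent zero, since zero does not depend on the subnormal encoding.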

crote • today at 12:06 PM

Another thing to keep in mind is that CPU processing of denormals tends to be extremely slow - I vaguely recall running into something like a 10x slowdown a decade ago.

For a lot of applications the difference between a denormal and zero is small enough to be irrelevant, so if you expect near-zero values to be common, enabling a denormals-to-zero compiler flag might give you a pretty nice performance boost for free.
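
One case where the difference is not irrelevant is the one in the title. A minimal sketch, emulating flush-to-zero in software since Python does not expose the hardware flag; `to_f32` and `flush_to_zero` are hypothetical helpers, and this says nothing about performance:

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126  # smallest positive normal float32

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def flush_to_zero(x: float) -> float:
    """Emulate flushing: replace a float32 subnormal with a zero of the same sign."""
    x = to_f32(x)
    if x != 0.0 and abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x

if __name__ == "__main__":
    tiny = -1e-40  # a negative float32 subnormal
    print(math.floor(to_f32(tiny)))         # floor of the subnormal itself
    print(math.floor(flush_to_zero(tiny)))  # floor after flushing the input
    print(math.ceil(flush_to_zero(1e-40)))  # ceil of a flushed positive subnormal
```

The floor of a negative subnormal is -1, while the floor of the flushed value is 0, so a tiny input difference becomes a difference of a whole unit in the output.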

yosefk • today at 2:07 PM

Flush denormals to zero. Even their inventor had trouble writing correct code in their presence - see the Appendix to that "what every programmer should know..." paper

andrepd • today at 3:21 PM

It's one of several issues with the design of IEEE floats, unfortunately. I wish we could start thinking more seriously about a new design, to complement if not replace IEEE in the long term. Posits are an example https://github.com/andrepd/posit-rust
