Hacker News

It's OK to compare floating-points for equality

178 points | by coinfused | last Tuesday at 4:00 PM | 115 comments

Comments

_moof yesterday at 11:24 PM

Something I've observed as someone who works in the physical sciences and used to work as a software engineer is:

Very few software engineers understand that tolerances are fundamental.

In the physical sciences, strict equality - of actual quantities, not variables in equations - is almost never a thing. Even though an equation might show that two theoretical quantities are exactly equal, the practical fact is that every quantity begins life as a measurement, and measurements have inherent uncertainty.

Even in design work, nothing is exact. It's simply not possible. A resistor's value is subject to manufacturing tolerances, will vary with temperature, and can drift as it ages. A mechanical part will also have manufacturing tolerance, and changes size and shape with temperature, applied forces, and wear. So even if a spec sheet states an exact number, the heading or notes will tell you that this is a nominal value under specific conditions. (Those conditions are also impossible to achieve and maintain exactly for all the same reasons.)

Even the voltages that represent 0 and 1 inside a computer aren't exact. Digital parts like CPUs, GPUs, RAM, etc. specify low and high thresholds, under or over which a voltage is considered a 0 or 1.

Floating-point numbers have uses outside the physical sciences, so there's no one-size-fits-all approach to using them correctly. But if you are writing code that deals with physical quantities, making equality comparisons is almost always going to be wrong even if floating-point numbers had infinite precision and no rounding error. Physical quantities simply can't be used that way.

vouwfietsman yesterday at 12:20 PM

This explanation is relatively reductive when it comes to its criticism of computational geometry.

The thing with computational geometry is that it's usually someone else's geometry, i.e. you have no control over its quality or intention. In other words, whether two points or planes or lines actually align, or align within 1e-4, is no longer really mathematically interesting, because it's all about the intention of the user: does the user think these planes overlap?

This is why most geometry kernels (see Open CASCADE) sport things like "fuzzy boolean operations" [0] that lean into epsilons. These epsilons mask the error-prone supply chain of the meshes that arrive in your program by allowing some tolerance.

Finally, the remark "There are many ways of solving this problem" is also overly reductive. Everyone reading here should understand that this topic is being actively researched right now in 2026; there are currently no blessed solutions to this problem, otherwise the research would not be needed. Even more so, to some extent the problem is fundamentally unsolvable, depending on what you mean by "solvable": because your input is inexact, not all geometrical operations are topologically valid, hence an "exact" or even a "correct along some dimension" result cannot be achieved for all (combinations of) inputs.

[0] https://dev.opencascade.org/content/fuzzy-boolean-operations

jph yesterday at 11:14 AM

I have this floating-point problem at scale and will donate $100 to the author, or to anyone here, who can improve my code the most.

The Rust code in the assert_f64_eq macro is:

    if (a >= b && a - b < f64::EPSILON) || (a <= b && b - a < f64::EPSILON)
I'm the author of the Rust assertables crate. It provides floating-point assert macros much as described in the article.

https://github.com/SixArm/assertables-rust-crate/blob/main/s...

If there's a way to make it more precise and/or specific and/or faster, or create similar macros with better functionality and/or correctness, that's great.

See the same directory for corresponding assert_* macros for less than, greater than, etc.
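For illustration, one direction such a check is often taken (a sketch, not the crate's actual API): the absolute `f64::EPSILON` bound above is only meaningful for operands near 1.0, so a relative comparison, with an absolute floor for values near zero, scales better. The tolerance values here are illustrative assumptions.

```rust
/// Sketch of a relative comparison: treat `a` and `b` as equal when they
/// differ by at most `rel_tol` times the larger magnitude. The absolute
/// floor `abs_tol` handles comparisons near zero, where a purely relative
/// test would be too strict.
fn approx_eq(a: f64, b: f64, rel_tol: f64, abs_tol: f64) -> bool {
    if a == b {
        return true; // covers exact matches and equal infinities
    }
    let diff = (a - b).abs();
    diff <= abs_tol || diff <= rel_tol * a.abs().max(b.abs())
}

fn main() {
    // An f64::EPSILON-sized absolute tolerance fails for large values:
    let a = 1e16_f64;
    let b = a + 2.0; // one ULP apart at this magnitude
    assert!((a - b).abs() > f64::EPSILON);
    // ...while a relative test accepts them.
    assert!(approx_eq(a, b, 1e-12, 1e-12));
    assert!(!approx_eq(1.0, 1.1, 1e-12, 1e-12));
}
```

The `a == b` fast path also makes the function well-behaved for infinities, which a pure subtraction-based check is not.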

dnautics yesterday at 3:23 PM

> In reality it is a pretty deterministic (modulo compiler options, CPU flags, etc)

IIRC this was not ALWAYS the case: on x86, not too long ago, the CPU might carry out your operation in an 80-bit x87 register, and if, due to multitasking, the CPU state got evicted, it would only be able to store the value in a 32-bit slot while it was waiting to be scheduled back in?

It might also not be deterministic now on a modern system if, based on load patterns, the software decides to schedule some math operations on the GPU rather than the CPU, or in some corner case where you're horizontally load balancing across two different GPUs (one AMD, one Nvidia) -- I'm speculating here.

GuB-42 yesterday at 4:28 PM

The thing with floating point numbers is they are meant to work with physical quantities: distances, durations, etc...

Physical quantities involve imprecision: measurement devices, tools, display devices, ADC/DACs etc... They all have some tolerances. And when you are using epsilons, the epsilon value should be chosen based on that physical value. For example, you set the epsilon to 1e-4 because that's 100 microns and you can't display 100 micron details.

That's also the reason why there is not one size fits all solution. If you are working with microscopic objects, 100 microns is huge, and if you are doing a space simulation, 1 km may be negligible. Some operations involve a huge loss of precision, some don't, and sometimes you really want exact numbers and therefore you have to know your fractional powers of 2.
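As a concrete sketch of that idea (the numbers are illustrative assumptions, not from the article): if your model lives in meters and anything under 100 microns is invisible at your display resolution, then the tolerance comes from the domain, not from `f64::EPSILON`.

```rust
// Hypothetical domain tolerance: 100 microns, for coordinates in meters.
const DISPLAY_TOL_M: f64 = 1e-4;

/// Two coordinates are "the same position" if they are closer than
/// what the display can resolve.
fn same_position(a: f64, b: f64) -> bool {
    (a - b).abs() < DISPLAY_TOL_M
}

fn main() {
    // 50 microns apart: indistinguishable at this resolution.
    assert!(same_position(0.123_456_7, 0.123_506_7));
    // 1 mm apart: clearly distinguishable.
    assert!(!same_position(0.123, 0.124));
}
```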

amelius yesterday at 12:32 PM

Think about this. It's silly to use floating point numbers to represent geometry, because it gives coordinates closer to the origin more precision and in most cases the origin is just an arbitrary point.

desdenova yesterday at 1:53 PM

The problem with floating point comparison is not that it's nondeterministic; it's that what should be the same number may have different representations, often with different rounding behavior as well. So depending on the exact operations you use to arrive at it, it may not compare as equal -- hence the need for the epsilon trick.

If all you're comparing is the result from the same operations, you _may_ be fine using equality, but you should really know that you're never getting a number from an uncontrolled source.

demorro yesterday at 11:33 AM

I guess I'm confused. I thought epsilon was the smallest possible value to account for accuracy drift across the range of a floating point representation, not just "1e-4".

Done some reading. Thanks to the article for waking me up to this fact at least. I didn't realize that the epsilon provided by languages tends to be one that only works around 1.0, and that if you want to use epsilons globally (which the article would say is generally a bad idea) you need to be more dynamic as your ranges, and potential errors, increase.
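That scaling effect is easy to demonstrate: the gap between adjacent doubles (one ULP) grows with magnitude, and `f64::EPSILON` is just the ULP at 1.0. A quick sketch:

```rust
/// Distance from `x` to the next representable f64 above it
/// (valid for finite, positive x).
fn ulp(x: f64) -> f64 {
    f64::from_bits(x.to_bits() + 1) - x
}

fn main() {
    // EPSILON is, by definition, the ULP at 1.0...
    assert_eq!(ulp(1.0), f64::EPSILON);
    // ...but at 1e16 adjacent doubles are 2.0 apart...
    assert!(ulp(1e16) > 1.0);
    // ...and near zero the grid is far finer than EPSILON.
    assert!(ulp(1e-300) < f64::EPSILON);
}
```

So an absolute comparison against `f64::EPSILON` is simultaneously far too strict for large values and far too loose for tiny ones.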

Joker_vD yesterday at 7:33 PM

To quote from one of my previous comments:

> > the myth about exactness is that you can't use strict equality with floating point numbers because they are somehow fuzzy. They are not.

> They are though. All arithmetic operations involve rounding, so e.g. (7.0 / 1234 + 0.5) * 1234 is not equal to 7.0 + 617 (it differs by 1 ULP). On the other hand, (9.0 / 1234 + 0.5) * 1234 is equal to 9.0 + 617, so the end result is sometimes exact and sometimes is not. How can you know beforehand which one is the case in your specific case? Generally, you can't; any arithmetic operation can potentially give you 1 ULP of error, and it can (and likely will) slowly accumulate.

Also, please don't comment how nobody has a use for "f(x) = (x / 1234 + 0.5) * 1234": there are all kinds of queer computations people do in floating point, and for most of them, figuring out the exactness of the end result requires an absurd amount of applied numerical analysis, doing which would undermine most of the "just let the computer crunch the numbers" point of doing this computation on a computer.
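You can watch this happen directly. The sketch below (trusting the comment's arithmetic for the two examples) counts the number of representable doubles between the computed result and the mathematically exact one:

```rust
/// ULP distance between two positive finite doubles: the number of
/// representable values separating them, via their bit patterns.
fn ulp_distance(a: f64, b: f64) -> u64 {
    a.to_bits().abs_diff(b.to_bits())
}

fn main() {
    let seven = (7.0_f64 / 1234.0 + 0.5) * 1234.0; // exact answer: 624.0
    let nine = (9.0_f64 / 1234.0 + 0.5) * 1234.0;  // exact answer: 626.0
    // Each result lands within a ULP of the exact value; whether it is
    // *equal* depends on how the intermediate roundings happen to cancel.
    assert!(ulp_distance(seven, 624.0) <= 1);
    assert!(ulp_distance(nine, 626.0) <= 1);
}
```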

mcv yesterday at 4:02 PM

This is mostly about game logic, where I can understand the reliance on floating point numbers. I've also seen these epsilon comparisons in code that had nothing to do with game engines or positions in continuous space, and it has always hurt my eyes.

I think if you want to work with values that might be exactly equal to other values, floating point is simply not the right choice. For money, use BigDecimal or something like that. For lots of purposes, int might be more appropriate. If you do need floating point, maybe compare whether one value is larger than the other instead.

mtklein yesterday at 6:46 PM

My preference in tests is a little different from just using IEEE 754 ==,

    _Bool equiv(float x, float y) {
        return (x <= y && y <= x)
            || (x != x && y != y);
    }
which both handles NaNs sensibly (all NaNs are equivalent) and won't warn about using == on floats. I find it also easy to remember how to write when starting a new project.
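For what it's worth, the same trick ports directly to Rust (a sketch; `f64::total_cmp` is an alternative if you want NaNs ordered rather than merely equivalent):

```rust
/// Equivalence rather than IEEE equality: ordinary values compare via
/// `<=` in both directions, and all NaNs are treated as equivalent to
/// each other (NaN != NaN is the only case where `x != x` is true).
fn equiv(x: f64, y: f64) -> bool {
    (x <= y && y <= x) || (x != x && y != y)
}

fn main() {
    assert!(equiv(1.5, 1.5));
    assert!(equiv(f64::NAN, f64::NAN)); // == would say false here
    assert!(!equiv(1.0, 2.0));
    assert!(!equiv(1.0, f64::NAN));
}
```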

hansvm yesterday at 3:01 PM

My normal issue with floating-point epsilon shenanigans is that they don't usually pass the sniff test, suggesting something fundamentally wrong with the problem framing or its solution.

It's a classic, so let's take vector normalization as an example. Topologically, you're ripping a hole in the space, and that's causing your issues. It manifests as NaN for length-zero vectors, weird precision issues too close to zero, etc, but no matter what you employ to try to fix it you're never going to have a good time squishing N-D space onto the surface of an N-D sphere if you need it to be continuous.

Some common subroutines where I see this:

1. You want to know the average direction of a bunch of objects and thus have to normalize each vector contributing to that average. Solution 1: That's not what you want almost ever. In any of the sciences, or anything loosely approximating the real world, you want to average the un-normalized vectors 99.999% of the time. Solution 2: Maybe you really do need directions for some reason (e.g., tracking where birds are looking in a game). Then don't rely on vectors for your in-band signaling. Explicitly track direction and magnitude separately and observe the magic of never having direction-related precision errors.

2. You're doing some sort of lighting normalization and need to compute something involving areas of potentially near-degenerate triangles, dividing by those values to weight contributions appropriately. Solution: Same as above, this is kind of like an average of averages problem. It can make fuzzy, intuitive sense, but you'll get better results if you do your summing and averaging in an un-normalized space. If you really do need surface normals, store those explicitly and separate from magnitude.

3. You're doing some sort of ML voodoo to try to get better empirical results via some vague appeal to vanishing gradients or whatever. Solution: The core property you want is a somewhat strange constraint on your layer's Jacobian matrix, and outside of like two papers nobody is willing to put up with the code complexity or runtime costs, even when they recognize it as the right thing to do. Everything you're doing is a hack anyway, so make your normalization term x/(|x|+eps) with eps > 0 rather than equal to zero like normal. Choose eps much smaller than most of the vectors you're normalizing this way and much bigger than zero. Something like 1e-3, 1e-20, and 1e-150 should be fine for f16, f32, and f64. You don't have to tune because it's a pretty weak constraint on the model, and it's able to learn around it.
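The eps-guarded normalization in point 3 is tiny to implement. A sketch, with eps chosen per the comment's f64 suggestion:

```rust
/// x / (|x| + eps): behaves like normalization for vectors much longer
/// than eps, degrades smoothly toward the zero vector for tiny inputs,
/// and never divides by zero or produces NaN for finite inputs.
fn soft_normalize(v: [f64; 3], eps: f64) -> [f64; 3] {
    let len = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt();
    let scale = 1.0 / (len + eps);
    [v[0] * scale, v[1] * scale, v[2] * scale]
}

fn main() {
    // Zero vector: no NaN, just zeros.
    assert_eq!(soft_normalize([0.0; 3], 1e-150), [0.0; 3]);
    // Ordinary vector: effectively unit length.
    let n = soft_normalize([3.0, 0.0, 4.0], 1e-150);
    let len = (n[0] * n[0] + n[1] * n[1] + n[2] * n[2]).sqrt();
    assert!((len - 1.0).abs() < 1e-12);
}
```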

beyondCritics yesterday at 4:38 PM

If your code may be compiled to use the Intel x87 numerical coprocessor, an important issue is the so-called "excess precision": different values on-chip can collapse after being rounded and stored to their memory locations, invalidating previous comparisons. Spilling can happen unexpectedly. Note that Intel calls the x87 "legacy".

lisper yesterday at 5:02 PM

Well, at least the author is honest about it:

> The title of this post is an intentional clickbait.

Unfortunately, that's where the honesty ends.

> It's NOT OK to compare floating-points using epsilons.

> So, are epsilons good or bad? Usually bad, but sometimes okay.

So which is it? Emphatically NOT OK, or sometimes okay?

mizmar last Wednesday at 5:38 AM

There is another way to compare floats for rough equality that I haven't seen much explored anywhere: bit-cast to integer, strip a few least significant bits, and then compare for equality. This is agnostic to magnitude, unlike an epsilon, which has to be tuned to the range of values you expect in order to get a meaningful result.
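A sketch of that idea, with a caveat: it inherits a boundary problem, since two adjacent floats that straddle a truncation boundary compare unequal, so this is rough equality rather than a symmetric tolerance band.

```rust
/// Compare doubles after discarding the `k` least-significant mantissa
/// bits. Magnitude-agnostic because the exponent bits stay in the
/// comparison. Sketch only: -0.0 vs 0.0, NaNs, and values straddling a
/// truncation boundary all need more care than this.
fn eq_ignoring_low_bits(a: f64, b: f64, k: u32) -> bool {
    (a.to_bits() >> k) == (b.to_bits() >> k)
}

fn main() {
    // A couple of ULPs of drift near 1.0 is ignored...
    assert!(eq_ignoring_low_bits(1.0, 1.0 + 2.0 * f64::EPSILON, 8));
    // ...and so is proportionally sized drift at a much larger magnitude.
    assert!(eq_ignoring_low_bits(1e15, 1e15 + 1.0, 8));
    // Genuinely different values still differ.
    assert!(!eq_ignoring_low_bits(1.0, 2.0, 8));
}
```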

hun3 yesterday at 3:58 PM

I used floating timestamps as some kind of an identity. If there is ever a conflict, I just increase it by 1 ulp until it doesn't collide with anything. Sorry.

bananzamba yesterday at 9:19 PM

If you need equality, just use fixed point.
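For inherently discrete quantities (money being the classic example), that looks like this sketch: store the smallest unit in an integer, and equality is exact.

```rust
// Fixed point for currency: store integer cents, not fractional dollars.
fn main() {
    // The classic float pitfall: the sum is 0.30000000000000004.
    assert!(0.1 + 0.2 != 0.3);
    // The same sum in integer cents is exact, and == is safe.
    let cents: i64 = 10 + 20;
    assert_eq!(cents, 30);
}
```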

Cold_Miserable yesterday at 7:10 PM

I've yet to encounter a need for == equality for floating point operations.

apitman yesterday at 2:42 PM

> that's how maths works

Wait is British "maths" a singular noun or is this a typo? I was willing to go along with it if it was plural, but I have to draw the line here.

4pkjai yesterday at 10:47 AM

I do this to see if text in a PDF is exactly where it is in some other PDF. For my use case it works pretty well.

Asooka yesterday at 9:59 PM

My one small nitpick is that vector length is usually 2 instructions with SSE4:

    dpps xmm0, xmm0, 0x71 ; dot product of 3 lanes, write lane 0
    sqrtss xmm0, xmm0
    ret
And it is considerably faster than the fancy version, mainly because Intel still hasn't given us a horizontal-max vector instruction! ARM is a bit better in that regard with its fancy vmaxvq_f32 and vmaxnmvq_f32...

darepublic yesterday at 3:14 PM

Plus or minus eps

AshamedCaptain yesterday at 11:27 AM

One of the goals of comparing floating points with an epsilon is precisely so that you can apply these kinds of accuracy-increasing (or decreasing) changes to the operations and still get similar results.

Anything else is basically a nightmare for whoever has to maintain the code in the future.

Also, good luck with e.g. checking whether points are aligned to a grid or the like without introducing a concept of epsilon _somewhere_.
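The grid check mentioned above can be sketched like this (the tolerance value is an arbitrary assumption):

```rust
/// Is `x` on a grid of spacing `step`, to within `tol`? Snap to the
/// nearest grid line and measure the residual; without the tolerance,
/// almost nothing computed in floating point would ever count as
/// exactly "on the grid".
fn on_grid(x: f64, step: f64, tol: f64) -> bool {
    let snapped = (x / step).round() * step;
    (x - snapped).abs() <= tol
}

fn main() {
    // 0.1 + 0.2 is not *exactly* a multiple of 0.1 in binary floating
    // point, but it is within tolerance of the 0.3 grid line.
    assert!(on_grid(0.1 + 0.2, 0.1, 1e-9));
    // A point well off the grid is rejected.
    assert!(!on_grid(0.349, 0.1, 1e-9));
}
```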


thayne yesterday at 4:13 PM

So they say you shouldn't use epsilons, yet their solution to the first problem is to use an epsilon. There may be some cases where you can get by without an epsilon comparison, but in many cases an epsilon comparison is the right thing to do; you just need to choose a good value for it.

This is especially true when the number comes from some kind of input (user controls, sensor readings, etc.) or from random number generation.