I get the idea behind NaN != NaN, but has there ever been any instance where this design decision has made practical code better instead of becoming a tripping hazard and requiring extensive special casing?
I'm also not a fan of the other property that NaN evaluates to false for all three of <, > and =, even though I don't have a good idea what to do otherwise.
I think as programmers, we usually assume that "not (a > b)" implies "a <= b" and vice-versa and often rely on that assumption implicitly. NaN breaks that assumption, which could lead to unexpected behavior.
Consider something like this (in JS):
function make_examples(num_examples) {
  if (num_examples <= 0) {
    throw Error("num_examples must be 1 or more");
  }
  const examples = [];
  for (let i = 0; i < num_examples; i++) {
    examples.push(make_example(i));
  }
  // we assume that num_examples >= 1 here, so the loop ran at least once and the array cannot be empty.
  postprocess_first_example(examples[0]); // <-- (!)
  return examples;
}
If somehow num_examples were NaN, the (!) line would fail unexpectedly because the array would be empty.
> That’s also the reason NaN !== NaN. If NaN behaved like a number and had a value equal to itself, well, you could accidentally do math with it: NaN / NaN would result in 1, and that would mean that a calculation containing a NaN result could ultimately result in an incorrect number rather than an easily-spotted “hey, something went wrong in here” NaN flag.
While I'm not really against the concept of NaN not equaling itself, this reasoning makes no sense. Even if the standard were "NaN == NaN evaluates to true" there would be no reason why NaN/NaN should necessarily evaluate to 1.
I definitely support NaN not being equal to NaN in boolean logic.
If you have x = "not a number", you don't want 1 + x == 2 + x to be true. There would be a lot of potential for false equivalencies if you said NaN == NaN is true.
--
It could be interesting if there was some kind of complex NaN number / NaN math. Like if x is NaN but 1x / 2x resulted in 0.5 maybe you could do some funny mixed type math. To be clear I don't think it would be good, but interesting to play with maybe.
Doesn't "1 + x == 2 + x" evaluate to true for any x with large enough magnitude? In general we should expect identities that hold true in "real" math to not hold in FP.
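A quick sketch in JS showing both failure modes: NaN breaks the identity through incomparability, and large magnitudes break it through rounding:

```javascript
// With NaN, equality fails because NaN compares unequal to everything:
const x = NaN;
console.log(1 + x === 2 + x); // false: both sides are NaN, and NaN !== NaN

// With large magnitudes, equality "succeeds" spuriously because of rounding:
const big = 1e20; // well above 2^53, so adding 1 or 2 rounds away entirely
console.log(1 + big === 2 + big); // true: both sides round to 1e20
```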
The concept of NaN long predates the language that uses ===, and is part of a language-agnostic standard that doesn't consider other data types. Any language choosing to treat the equality (regardless of the operator symbol) of NaN differently would be deviating from the spec.
Julia evaluates NaN === NaN to true, as long as the underlying bit representations are the same. E.g. NaN === -NaN evaluates to false, as it also does with NaNs whose bytes you have tinkered with for some reason. I think it makes sense that it is so, though, even if I cannot think of any actual use cases beyond doing weird stuff.
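JS has a loosely similar split, for comparison: `===` follows IEEE semantics, while `Object.is` treats NaN as equal to itself. Unlike Julia's `===`, though, `Object.is` ignores payload bits entirely (any two NaNs are `Object.is`-equal), and it also distinguishes +0 from -0:

```javascript
console.log(NaN === NaN);         // false (IEEE semantics)
console.log(Object.is(NaN, NaN)); // true  (SameValue semantics)
console.log(Object.is(0, -0));    // false, another place Object.is differs from ===
```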
In R, NA (which is almost, but not quite, like NaN) actually has separate types for each result, so you can have NA_boolean, NA_integer etc. It's super confusing.
It is a minor nuisance, but I think there's ultimately a pretty good reason for it.
Old-school base R is less type-sensitive and more "do what I mean", but that leads to slowness and bugs. Now we have the tidyverse, which among many other things provides a new generation of much faster functions with vectorized C implementations under the hood, but this requires them to be more rigid and type-sensitive.
When I want to stick a NA into one of these, I often have to give it the right type of NA, or it'll default to NA_boolean and I'll get type errors.
> What would an `if (NaB) ... else ...` block do?
Either you throw an exception (and imo it would be better to just throw an exception earlier, at the point where the NaB arose) or else whatever you choose here determines what NaN === NaN actually evaluates to.
Apparently NaN (not a number) becomes false when type-cast to boolean.
Boolean(NaN)
===> false
For a hypothetical NaB (not a boolean), the same behavior seems logical.
Boolean(NaB)
===> false
So the condition `if (NaB)` is false and will fall through to the `else` branch. But..
> what you do determines what NaN === NaN actually evaluates to
I think I disagree with this because it's not about casting to boolean, it's a totally different question of self-identity, or comparing two instances (?) of a value (?!).
From the article:
typeof NaN
===> "number"
For symmetry and consistency:
typeof NaB
===> "boolean"
> NaN is the only value in the whole of JavaScript that isn’t equal to itself .. the concept of NaN is meant to represent a breakdown of calculation
Similarly, NaB would represent a breakdown of true/false condition (somehow) as an exceptional case. Whether it equals itself is a matter of convention or language design, not logic - since it's beyond logic just as NaN is beyond numbers. I would say:
NaN === NaN
===> false
NaB === NaB
===> false
> you throw an exception (and imo it is better..
I agree throwing an exception is better design for such exceptional cases - but we know JavaScript as a cowboy language would rather plow through such ambiguities with idiosyncratic dynamic typing, and let the user figure out the implicit logic (if any).
I don't necessarily think about JS in particular. A lot of languages have similar design and prob most have to deal with this issue in general.
In your examples, it does not make sense to have both
typeof NaB
===> "boolean"
and
Boolean(NaB)
===> false
But maybe if you had NaB it would make sense to evaluate
Boolean(NaN)
===> NaB
And maybe also evaluate `NaN === 1` to NaB?
I am not too fond of NaB because boolean is supposed to encode binary logic. There exist ternary logic and other concepts that could be used if you do not want strictly two values. Or, if you want to encode exceptional values, there is no reason not to use int8 directly, instead of calling it boolean but actually using sth that could be represented as int8 (NaB has to be represented by some byte anyway). In general, tbh, I think it is often not useful to encode your logic with booleans, because many times you will need to encode exceptional values one way or another. But NaB will not solve that, as it will just function as another way to encode false at best, and throw exceptions around at worst.
> Similarly, NaB would represent a breakdown of true/false condition (somehow) as an exceptional case.
Still, in JS `NaN || true` evaluates to `true` hence I assume `NaB || true` should evaluate to true too. It is not quite the same as NaN + 1 evaluating to NaN. And as NaN (and hence NaB) functions as false when logical operations are involved for all intents and purposes, except if you exactly inquire for it with some isNaB() or sth, to me NaB is another way to encode false, which is probably fine depending what you want it for.
Apparently this is also called "three-valued logic".
> In logic, a three-valued logic (also trinary logic, trivalent, ternary, or trilean, sometimes abbreviated 3VL) is any of several many-valued logic systems in which there are three truth values indicating true, false, and some third value.
> the primary motivation for research of three-valued logic is to represent the truth value of a statement that cannot be represented as true or false.
Personally I think I'd prefer a language to instead support multiple return values, like result and optional error. Or a union of result and error types.
> no reason not to use int8 directly
Hm, so it gets into the territory of flags, bit fields, packed structs.
https://andrewkelley.me/post/a-better-way-to-implement-bit-f...
It should throw a compile-time error. Anything like this which allows an invalid or unmeaningful operation to evaluate at compile-time is rife for carrying uncaught errors at run-time.
> Or a 3rd option is it should not be allowed similar to dividing by 0.
What do you mean "not allowed"? Throwing a compile or runtime error? Many languages allow division by zero: x/0 typically gives inf, unless x is 0, in which case 0/0 gives nan, and if x is negative then x/0 gives -inf. Of course this can all get tricky with floats, but mathematically it makes sense to divide by zero (interpreted as a limit).
For NaNs, maybe in some domains it could make sense, but eg I would find it impractical when wanting to select rows based on values in a column and stuff like that.
I think a better reasoning is that NaN does not have a single binary representation, and in software one may not be able to distinguish between them.
An f32-NaN has 22 bits that can have any value, originally intended to encode error information or other user data. Also, there are two kinds of NaN: quiet NaNs (qNaN) and signalling NaNs (sNaN), which behave differently when used in calculations (sNaNs may raise exceptions).
Without looking at the bits, all you can see is NaN, so it makes sense to not equal them in general. Otherwise, some NaN === NaN and some NaN !== NaN, which would be even more confusing.
I don't think that logic quite holds up because when you have two NaNs that do have the same bit representation, a conforming implementation still has to report them as not equal. So an implementation of `==` that handles NaN still ends up poking around in the bits and doing some extra logic. It's not just "are the bit patterns the same?"
(I believe this is also true for non-NaN floating point values. I'm not sure but off the top of my head, I think `==` needs to ignore the difference between positive and negative zero.)
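Both points can be sketched in JS: identical NaN bit patterns still compare unequal, and `==`/`===` deliberately ignore the sign bit of zero:

```javascript
// Two NaNs read from the exact same 64 bits still compare unequal:
const buf = new ArrayBuffer(8);
const view = new DataView(buf);
view.setFloat64(0, NaN);
const a = view.getFloat64(0);
const b = view.getFloat64(0); // bit-for-bit identical to a
console.log(a === b); // false: === must report NaN unequal regardless of bits

// Conversely, +0 and -0 have different bit patterns but compare equal:
console.log(0 === -0);         // true
console.log(1 / 0 === 1 / -0); // false: Infinity vs -Infinity reveals the sign bit
```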
In Julia, NaN === NaN evaluates to true but NaN === -NaN evaluates to false. Of course, NaN == NaN evaluates to false. I think it makes sense that in principle === looks at bit representations, but I cannot think of any reason === is useful here, unless you want to encode meaningful stuff inside your NaNs for some reason. It reminded me of this satirical repo [0], discussed also here [1].
If you're going to nitpick this comment, you should note that infinity isn't on the number line and infinity != infinity, and dividing by zero is undefined
Also, you say NaN ("not a number") is "defined as a number" but Infinity is not. I would think every IEEE 754 value is either "a number" or "not a number". But apparently you believe NaN is both and Infinity is neither?
And you say 0 / 0 is "undefined" but the standard requires it to be NaN, which you say is "defined".
It doesn't really matter if NaN is technically a number or not. I find the standard "NaN == NaN is true" to be potentially reasonable (though I do prefer the standard "NaN == NaN is false"). Regardless of what you choose, NaN/NaN = 1 is entirely unacceptable.
> That’s also the reason NaN !== NaN. If NaN behaved like a number and had a value equal to itself, well, you could accidentally do math with it: NaN / NaN would result in 1,
So, by that logic, if 0 behaved like a number and had a value equal to itself, well, you could accidentally do math with it: 0 / 0 would result in 1...
But as it turns out, 0 behaves like a number, has a value equal to itself, you can do math with it, and 0/0 results in NaN.
Try subtraction. But also, not all calculations are purely using mathematical operations. You might calculate two numbers from two different code paths and compare them.
The D language default initializes floating point values to NaN. AFAIK, D is the only language that does that.
The rationale is that if the programmer forgets to initialize a float, and it defaults to 0.0, he may never realize that the result of his calculation is in error. But with NaN initialization, the result will be NaN and he'll know to look at the inputs to see what was not initialized.
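Amusingly, JS gets a version of this safety net for free: an uninitialized variable is undefined, and arithmetic on undefined yields NaN, which then propagates just as D intends. A minimal sketch:

```javascript
let total;                       // forgot: let total = 0;
for (const r of [1, 2, 3]) {
  total += r;                    // undefined + 1 is NaN, and NaN + r stays NaN
}
console.log(total);              // NaN — the missing initialization is visible,
                                 // where a default of 0 would have hidden it
```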
Not too familiar with D, but isn't 0xFF ÿ (Latin Small Letter Y with Diaeresis) in Unicode? It's not valid UTF-8 or ASCII, but it's still a valid codepoint in Unicode.
I'm a fan of the idea in general, and don't think there's a better byte to use as an obviously-wrong default.
My understanding is that the reasoning behind all this is:
- In 1985 there were a ton of different hardware floating-point implementations with incompatible instructions, making it a nightmare to write floating-point code once that worked on multiple machines
- To address the compatibility problem, IEEE came up with a hardware standard that could do error handling using only CPU registers (no software, since it's a hardware standard)
- With that design constraint, they (reasonably imo) chose to handle errors by making them "poisonous" - once you have a NaN, all operations on it fail, including equality, so the error state propagates rather than potentially accidentally "un-erroring" if you do another operation, leading you into undefined behavior territory
- The standard solved the problem when hardware manufacturers adopted it
- The upstream consequence on software is that if your programming language does anything other than these exact floating-point semantics, the cost is losing hardware acceleration, which makes your floating-point operations way slower
NaN is just an encoding for "undefined operation".
As specified by the standard since its beginning, there are 2 methods for handling undefined operations:
1. Generate a dedicated exception.
2. Return the special value NaN.
The default is to return NaN because this means less work for the programmer, who does not have to write an exception handler, and also because on older CPUs it was expensive to add enough hardware to ensure that exceptions could be handled without slowing down all programs, regardless whether they generated exceptions or not. On modern CPUs with speculative execution this is not really a problem, because they must be able to discard any executed instruction anyway, while running at full speed. Therefore enabling additional reasons for discarding the previously executed instructions, e.g. because of exceptional conditions, just reuses the speculative execution mechanism.
Whoever does not want to handle NaNs must enable the exception for undefined operations and handle that. In that case no NaNs will ever be generated. Enabling this exception may be needed in any case when one sees unexpected NaNs, for debugging the program.
This is a matter of choice, not something with an objectively correct answer. Every possible answer has trade offs. I think consistency with the underlying standard defining NaN probably has better tradeoffs in general, and more specific answers can always be built on top of that.
That said, I don’t think undefined in JS has the colloquial meaning you’re using here. The tradeoffs would be potentially much more confusing and error prone for that reason alone.
It might be more “correct” (logically; standard aside) to throw, as others suggest. But that would have considerable ergonomic tradeoffs that might make code implementing simple math incredibly hard to understand in practice.
A language with better error handling ergonomics overall might fare better though.
>A language with better error handling ergonomics overall might fare better though.
So what always trips me up about JavaScript is that if you make a mistake, it silently propagates nonsense through the program. There's no way to configure it to even warn you about it. (There's "use strict", and there should be "use stricter!")
And this aspect of the language is somehow considered sacred, load-bearing infrastructure that may never be altered. (Even though, with "use strict", we already demonstrated that we have a mechanism for fixing things without breaking them!)
I think the existence of TS might unfortunately be an unhelpful influence on JS's soundness, because now there's even less pressure to fix it than there was before.
To some extent you’ve answered this yourself: TypeScript (and/or linting) is the way to be warned about this. Aside from the points in sibling comment (also correct), adding these kinds of runtime checks would have performance implications that I don’t think could be taken lightly. But it’s not really necessary: static analysis tools designed for this are already great, you just have to use them!
> And this aspect of the language is somehow considered sacred, load-bearing infrastructure that may never be altered. (Even though, with "use strict", we already demonstrated that we have a mechanism for fixing things without breaking them!)
There are many things we could do which wouldn't break the web but which we choose not to do because they would be costly to implement/maintain and would expand the attack surface of JS engines.
This reminds me of an interesting approach a student had to detecting NaNs for an assignment. The task was to count no-data values (-999) in a file. Pandas (Python library) has its own NaN type, and when used in a boolean expression, will return NaN instead of true or false. So the student changed -999 to NaN on import with Pandas and had a loop, checking each value against itself with an if statement. If the value was NaN the if statement would throw an exception (what could poor if do with NaN?) which the student caught, and in the catch incremented the NaN count.
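In JS no exception is needed for the same trick: self-comparison alone detects NaN, since it is the only value unequal to itself. A sketch, with a hypothetical values array in which the -999 sentinels have already been mapped to NaN:

```javascript
const values = [1.5, NaN, 2.0, NaN, 3.25]; // -999 no-data sentinels mapped to NaN

let nanCount = 0;
for (const v of values) {
  if (v !== v) nanCount++; // only NaN fails self-equality
}
console.log(nanCount); // 2
```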
JavaScript also has TypeError, which would be more appropriate here. Unfortunately, undefined has never been used well, and it has caused much more pain than it has brought interesting use cases.
It should return false, right? They are different types of thing, so they can’t be the same thing.
Or, maybe we could say that our variables just represent some ideal things, and if the ideal things they represent are equal, it is reasonable to call the variables equal. 1.0d0, 1.0, 1, and maybe “1” could be equal.
Interesting. So, having a comparison between incomparable types result in false -- what we have now -- is functionally equivalent, in an if-statement, to having the undefined evaluate to false... with the difference that the type coercion is currently one level lower (inside the == operator itself).
It kind of sounds like we need more type coercion because we already have too much type coercion!
I'm not sure what an ergonomic solution would look like though.
Lately I'm more in favour of "makes sense but is a little awkward to read and write" (but becomes effortless once you internalize it because it actually makes sense) over "convenient but not really designed so falls apart once you leave the happy path, and requires you to memorize a long list of exceptions and gotchas."
NaNs aren't always equal to each other in their bit representation either; most of the bits are kept as a "payload", which is not defined in the spec and can be anything. I believe the payload is actually used in some JS engines to encode more information in NaNs (NaN-boxing).
Also, NaN is the only value in JS that isn't === to itself, so if for some reason you want to test for strict value identity with the value NaN of type number, that's one way to do it:
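Presumably the test meant here is the self-comparison idiom (which is also essentially how `Number.isNaN` can be polyfilled); the helper name below is just for illustration:

```javascript
function isReallyNaN(x) {
  return x !== x; // true only when x is the number NaN
}
console.log(isReallyNaN(NaN));       // true
console.log(isReallyNaN("foo"));     // false — unlike the coercing global isNaN("foo")
console.log(isReallyNaN(undefined)); // false
```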
NaN comes from parsing results, and Infinity from occurring in operations. I personally end up using Number.isFinite() more, which will be false in both cases, for when I need a real (haha) numeric answer.
In the error monad NaN = NaN (or Nothing = Nothing or None = None, depending on your terminology) because mathematical equality is an equivalence relation. There are many foundational debates about equality, but whether or not it is an equivalence is never the question.
The root of the problem, completely overlooked by OP, is that IEEE 754 comparison is not an equivalence relation. It's a partial equivalence relation (PER). It does have its utility, but these things can be weird and they are definitely not interchangeable with actual equivalence relations. Actual, sane comparison of floating points got standardized eventually, but probably too late: https://en.wikipedia.org/wiki/IEEE_754#Total-ordering_predic.... It's actually kinda nuts that the partial relation is the one you get by default (no, your sorting function on float arrays does not correctly sort them).
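To illustrate the sorting point in JS: the usual `(a, b) => a - b` comparator returns NaN when either element is NaN, which the sort treats as "equal", so NaNs end up stranded wherever they happened to be. A comparator imposing a total order (here choosing, arbitrarily, to sort NaNs last) fixes it:

```javascript
// The naive comparator returns NaN for NaN elements; sort treats that as 0:
console.log([3, NaN, 1, 2].sort((a, b) => a - b)); // NaN's position is unpredictable

// A total-order comparator, placing NaNs after all real numbers:
function totalCompare(a, b) {
  if (Number.isNaN(a)) return Number.isNaN(b) ? 0 : 1;
  if (Number.isNaN(b)) return -1;
  return a - b;
}
console.log([3, NaN, 1, 2].sort(totalCompare)); // [1, 2, 3, NaN]
```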
Opened Web Inspector in Safari and pasted the above. (I knew what to expect but did not know how it would work … me trying to figure out what subtracting 1 from a string (ASCII?) would give you. But very related to this post.)
- NaN is a floating point number, and NaN != NaN by definition in the IEEE 754-2019 floating point number standard, regardless of the programming language, there's nothing JavaScript-specific here.
- In JS the global isNaN(v) returns true for NaN and for anything that coerces to NaN; Number.isNaN(v) is true only for the value NaN. And in JS, s * n and n * s return NaN for any non-empty string s and any number n ("" * n returns 0). (EDIT: WRONG, see below)
> And in JS, s * n and n * s return NaN for any non empty string s and any number n ("" * n returns 0).
No? It is easy to verify that `"3" * 4` evaluates to 12. The full answer is that * converts its operands into primitives (with a hint of being number), and any string that can be parsed as a number converts to that number. Otherwise it converts to NaN.
I always thought of NaN as more of the concept of not-a-number the way that infinity in math is not a specific value but the concept of some unbounded largest possible value.
Therefore, trying to do math with either (for example: NaN/NaN or inf/inf) was to try to pin them down to something tangible and no longer conceptual — therefore disallowed.
You can use some form of real extensions, e.g. the extended real line (+inf, -inf is often useful for programmers) or the projectively extended real line (+inf = -inf).
This is not about infinity in math not being a _specific_ value, it can certainly be (the actual infinite instead of potential).
It's simply about design and foresight, in my humble opinion.
I'm sometimes wondering if a floating point format really needs to have inf, -inf and nan, or if a single "non finite" value capturing all of those would be sufficient
Not at all sufficient. NaN typically means that something has gone wrong—e.g. your precision requirements exceed that of the floating point representation you've selected, you've done a nonsensical operation. inf and -inf might be perfectly acceptable results depending on your application and needs.
Well no, the in operator is just defined to produce results equivalent to any(nan is x or nan == x for x in a); it is counterintuitive to the extent people assume that identity implies equality, but the operator doesn't assume that identity implies equality; it is defined as returning True if either is satisfied. [0]
Well, more precisely, this is how the operator behaves for most built in collections; other types can define how it behaves for them by implementing a __contains__() method with the desired semantics.
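JS has the same wrinkle, split across two array methods: `indexOf` uses strict equality (so it can never find NaN), while `includes` uses SameValueZero (which treats NaN as equal to itself):

```javascript
const arr = [1, NaN, 3];
console.log(arr.indexOf(NaN));  // -1: uses ===, and NaN === NaN is false
console.log(arr.includes(NaN)); // true: uses SameValueZero, which matches NaN
```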
A similar issue occurs in SQL, where NULL != NULL. [0] In both cases, our typical "equals" abstraction has become too leaky, and we're left trying to grapple with managing different kinds of "equality" at the same time.
Consider the difference between:
1. "Box A contains a cursed object that the human mind cannot comprehend without being driven to madness. Does Box B also contain one? ... Yes."
2. "Is the cursed object in Box A the same as the one in Box B? ... It... uh..." <screaming begins>
Note that this is not the same as stuff like "1"==1.0, because we're not mixing types here. Both operands are the same type, our problem is determining their "value", and how we encode uncertainty or a lack of knowledge.
SQL more elegantly introduces ternary logic in this case, where any comparison with NULL is itself NULL. This is sadly not possible in most languages where a comparison operator must always return a (non-nullable) boolean value.
Also no need to define that in JS, because Number.isNaN() has been around forever (note the case). The global isNaN() has been there from the very beginning, but use Number.isNaN() because it doesn't coerce the param to a number.
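The difference in a nutshell — the global version answers "does this coerce to NaN?", while the Number version answers "is this the value NaN?":

```javascript
console.log(isNaN("hello"));        // true: "hello" coerces to NaN first
console.log(Number.isNaN("hello")); // false: not the number NaN, no coercion
console.log(isNaN("123"));          // false: "123" coerces to 123
console.log(Number.isNaN(NaN));     // true
```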
Tangentially related: one of my favourite things about JavaScript is that it has so many different ways for the computer to “say no” (in the sense of “computer says no”): false, null, undefined, NaN, boolean coercion of 0/“”, throwing errors, ...
While it’s common to see groaning about double-equal vs triple-equal comparison, and eye-rolling directed at absurdly large tables like in https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guid..., I think it’s genuinely great that we have the ability to distinguish between concepts like “explicitly not present” and “absent”.
Another use for NaN's is, suppose you have an array of sensors. Given enough sensors, you're pretty much guaranteed that some of the sensors will have failed. So the array needs to continue to work, even if degraded.
A failed sensor can indicate this by submitting a NaN reading. Then any subsequent operations on the array data will indicate which results depended on the failed sensor, as the result will be NaN. Just defaulting to zero on failure would hide the fact that it failed, and the end results would not be obviously wrong.
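A sketch of that pattern in JS, with a hypothetical readings array in which one sensor has failed:

```javascript
const readings = [12.1, 11.8, NaN, 12.3]; // third sensor failed, reports NaN

// Aggregates over the raw array are visibly poisoned:
const sum = readings.reduce((acc, r) => acc + r, 0);
console.log(sum); // NaN — the failure can't be mistaken for a valid total

// When a degraded answer is acceptable, filter explicitly and knowingly:
const ok = readings.filter(r => !Number.isNaN(r));
const mean = ok.reduce((acc, r) => acc + r, 0) / ok.length;
console.log(ok.length, mean); // 3 working sensors, and their mean
```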
Maybe the result of NaN === NaN should be neither true nor false but NaB (not a bool).
NaN is, by definition, not equal to NaN because the two are not comparable. But the comparison does have a definitive Boolean representation: false is correct.
> When I want to stick a NA into one of these, I often have to give it the right type of NA, or it'll default to NA_boolean and I'll get type errors.
Yeah, I know. I hit this when I was building S4 classes, which are similarly type-strict.
Again, I think this was the right decision (pandas's decision was definitely not), but it was pretty confusing the first time.
What would an `if (NaB) ... else ...` block do?
Reminds me of something I saw recently, a tri-value boolean, a "trilean" or "tribool".
They all feel "risky" in terms of language design, like null itself. But I suppose there are languages with Maybe or Optional.
> The tribool class acts like the built-in bool type, but for 3-state boolean logic
https://www.boost.org/doc/libs/1_48_0/doc/html/tribool/tutor...
Apparently this is also called "three-valued logic".
> In logic, a three-valued logic (also trinary logic, trivalent, ternary, or trilean, sometimes abbreviated 3VL) is any of several many-valued logic systems in which there are three truth values indicating true, false, and some third value.
> the primary motivation for research of three-valued logic is to represent the truth value of a statement that cannot be represented as true or false.
https://en.wikipedia.org/wiki/Three-valued_logic
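A minimal sketch of Kleene-style three-valued AND in JS, using null as the "unknown" third value (the function name is made up for illustration):

```javascript
// Kleene three-valued logic sketch: true, false, or null ("unknown")
function and3(a, b) {
  if (a === false || b === false) return false; // a known false dominates
  if (a === null || b === null) return null;    // otherwise unknown wins
  return true;                                  // both known true
}

console.log(and3(true, null));  // null, can't decide yet
console.log(and3(false, null)); // false, known regardless of the unknown
```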
Personally I think I'd prefer a language to instead support multiple return values, like result and optional error. Or a union of result and error types.
> no reason not to use int8 directly
Hm, so it gets into the territory of flags, bit fields, packed structs.
https://andrewkelley.me/post/a-better-way-to-implement-bit-f...
It should throw a compile-time error. Anything like this which lets an invalid or meaningless operation pass at compile time is ripe for carrying uncaught errors at run-time.
Typed nulls are good
Or a 3rd option is it should not be allowed similar to dividing by 0.
What do you mean "not allowed"? Throwing a compile or runtime error? Many languages allow division by zero: x/0 typically gives inf, unless x is 0, in which case x/0 gives nan; if x is negative, x/0 gives -inf. Of course this all can get tricky with floats, but mathematically it can make sense to divide by zero (interpreted as a limit).
For NaNs, maybe in some domains it could make sense, but eg I would find it impractical when wanting to select rows based on values in a column and stuff like that.
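In JS this is easy to verify, since all numbers are IEEE 754 doubles:

```javascript
// IEEE 754 division by zero: signed infinities, and NaN only for 0/0
console.log(1 / 0);  // Infinity
console.log(-1 / 0); // -Infinity
console.log(0 / 0);  // NaN
```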
I think a better reasoning is that NaN does not have a single binary representation, and in software one may not be able to distinguish the different representations.
An f32 NaN has 22 bits that can have any value, originally intended to encode error information or other user data. Also, there are two kinds of NaNs: quiet NaNs (qNaN) and signalling NaNs (sNaN), which behave differently when used in calculations (sNaNs may raise exceptions).
Without looking at the bits, all you can see is NaN, so it makes sense to treat them as unequal in general. Otherwise, some NaN === NaN and some NaN !== NaN, which would be even more confusing.
I don't think that logic quite holds up because when you have two NaNs that do have the same bit representation, a conforming implementation still has to report them as not equal. So an implementation of `==` that handles NaN still ends up poking around in the bits and doing some extra logic. It's not just "are the bit patterns the same?"
(I believe this is also true for non-NaN floating point values. I'm not sure but off the top of my head, I think `==` needs to ignore the difference between positive and negative zero.)
> NaN === NaN and some NaN !== NaN
In Julia, NaN === NaN evaluates to true but NaN === -NaN evaluates to false. Of course, NaN == NaN evaluates to false. I think it makes sense that in principle === looks at bit representations, but I cannot think of any reason === is useful here, unless you want to encode meaningful stuff inside your NaNs for some reason. It reminded me of this satirical repo [0], discussed also here [1].
[0] https://github.com/si14/stuffed-naan-js [1] https://news.ycombinator.com/item?id=43803724
Something like:

    // Optimize special case
    if (x == y) return 1;
    else return x/y;
Because NaN is defined as a number and two equal numbers divided by themselves equal 1
> two equal numbers divided by themselves equal 1
That's not true. For example: 0 == 0, but 0/0 != 1.
(See also +Infinity, -Infinity, and -0.)
If you're going to nitpick this comment, you should note that infinity isn't on the number line and infinity != infinity, and dividing by zero is undefined
We're commenting on an article about IEEE 754 floating point values. Following the IEEE 754 standard, we have: Infinity == Infinity evaluates to true, x / 0 is +Infinity for positive x and -Infinity for negative x, and 0 / 0 is NaN.
Also, you say NaN ("not a number") is "defined as a number" but Infinity is not. I would think every IEEE 754 value is either "a number" or "not a number". But apparently you believe NaN is both and Infinity is neither? And you say 0 / 0 is "undefined", but the standard requires it to be NaN, which you say is "defined".
It doesn't really matter if NaN is technically a number or not. I find the standard "NaN == NaN is true" to be potentially reasonable (though I do prefer the standard "NaN == NaN is false"). Regardless of what you choose, NaN/NaN = 1 is entirely unacceptable.
> That’s also the reason NaN !== NaN. If NaN behaved like a number and had a value equal to itself, well, you could accidentally do math with it: NaN / NaN would result in 1,
So, by that logic, if 0 behaved like a number and had a value equal to itself, well, you could accidentally do math with it: 0 / 0 would result in 1...
But as it turns out, 0 behaves like a number, has a value equal to itself, you can do math with it, and 0/0 results in NaN.
Try subtraction. But also, not all calculations are purely using mathematical operations. You might calculate two numbers from two different code paths and compare them.
The D language default initializes floating point values to NaN. AFAIK, D is the only language that does that.
The rationale is that if the programmer forgets to initialize a float, and it defaults to 0.0, he may never realize that the result of his calculation is in error. But with NaN initialization, the result will be NaN and he'll know to look at the inputs to see what was not initialized.
It causes some spirited discussion now and then.
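The effect can be sketched in JS (the function and defaults here are hypothetical, just to illustrate the rationale):

```javascript
// Hypothetical sketch: defaulting a missing input to NaN instead of 0
// makes the mistake visible in the output rather than silently wrong
function scale(value = NaN, factor = 2) {
  return value * factor;
}

console.log(scale(10)); // 20
console.log(scale());   // NaN, so the forgotten input is easy to spot
```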
In the same spirit, the `char` type default initializes to 0xFF, which is an invalid Unicode value.
It's the same idea for pointers, which default initialize to null.
Not too familiar with D, but isn't 0xFF ÿ (Latin Small Letter Y with diaeresis) in unicode? It's not valid UTF-8 or ascii, but it's still a valid codepoint in unicode.
I'm a fan of the idea in general, and don't think there's a better byte to use as an obviously-wrong default.
It's an invalid 8 bit code unit, which is what matters. It's a valid codepoint but codepoints are just abstract numbers, not byte patterns.
Shouldn't an operator on incompatible types return undefined? ;)
Equality on things that it doesn't make sense to compare returning false seems wrong to me. That operation isn't defined to begin with.
By shipping with undefined, JavaScript could have been the only language whose type system makes sense... alas!
My understanding is that the reasoning behind all this is:
- In 1985 there were a ton of different hardware floating-point implementations with incompatible instructions, making it a nightmare to write floating-point code once that worked on multiple machines
- To address the compatibility problem, IEEE came up with a hardware standard that could do error handling using only CPU registers (no software, since it's a hardware standard)
- With that design constraint, they (reasonably imo) chose to handle errors by making them "poisonous": once you have a NaN, all operations on it fail, including equality, so the error state propagates rather than potentially accidentally "un-erroring" if you do another operation, leading you into undefined behavior territory
- The standard solved the problem when hardware manufacturers adopted it
- The upstream consequence on software is that if your programming language does anything other than these exact floating-point semantics, the cost is losing hardware acceleration, which makes your floating-point operations way slower
NaN is just an encoding for "undefined operation".
As specified by the standard since its beginning, there are 2 methods for handling undefined operations:
1. Generate a dedicated exception.
2. Return the special value NaN.
The default is to return NaN because this means less work for the programmer, who does not have to write an exception handler, and also because on older CPUs it was expensive to add enough hardware to ensure that exceptions could be handled without slowing down all programs, regardless whether they generated exceptions or not. On modern CPUs with speculative execution this is not really a problem, because they must be able to discard any executed instruction anyway, while running at full speed. Therefore enabling additional reasons for discarding the previously executed instructions, e.g. because of exceptional conditions, just reuses the speculative execution mechanism.
Whoever does not want to handle NaNs must enable the exception for undefined operations and handle that. In that case no NaNs will ever be generated. Enabling this exception may be needed in any case when one sees unexpected NaNs, for debugging the program.
This is a matter of choice, not something with an objectively correct answer. Every possible answer has trade offs. I think consistency with the underlying standard defining NaN probably has better tradeoffs in general, and more specific answers can always be built on top of that.
That said, I don’t think undefined in JS has the colloquial meaning you’re using here. The tradeoffs would be potentially much more confusing and error prone for that reason alone.
It might be more “correct” (logically; standard aside) to throw, as others suggest. But that would have considerable ergonomic tradeoffs that might make code implementing simple math incredibly hard to understand in practice.
A language with better error handling ergonomics overall might fare better though.
>A language with better error handling ergonomics overall might fare better though.
So what always trips me up about JavaScript is that if you make a mistake, it silently propagates nonsense through the program. There's no way to configure it to even warn you about it. (There's "use strict", and there should be "use stricter!")
And this aspect of the language is somehow considered sacred, load-bearing infrastructure that may never be altered. (Even though, with "use strict" we already demonstrated that have a mechanism for fixing things without breaking them!)
I think the existence of TS might unfortunately be an unhelpful influence on JS's soundness, because now there's even less pressure to fix it than there was before.
To some extent you’ve answered this yourself: TypeScript (and/or linting) is the way to be warned about this. Aside from the points in sibling comment (also correct), adding these kinds of runtime checks would have performance implications that I don’t think could be taken lightly. But it’s not really necessary: static analysis tools designed for this are already great, you just have to use them!
> And this aspect of the language is somehow considered sacred, load-bearing infrastructure that may never be altered. (Even though, with "use strict" we already demonstrated that have a mechanism for fixing things without breaking them!)
There are many things we could do which wouldn't break the web but which we choose not to do because they would be costly to implement/maintain and would expand the attack surface of JS engines.
> Shouldn't an operator on incompatible types return undefined? ;)
NaN is a value of the Number type; I think there are some problems with deciding that Number is not compatible with Number for equality.
We just need another value in the boolean type called NaB, and then NaN == NaN can return NaB.
To complement this, also if/then/else should get a new branch called otherwise that is taken when the if clause evaluates to NaB.
This reminds me of an interesting approach a student had to detecting NaNs for an assignment. The task was to count no-data values (-999) in a file. Pandas (Python library) has its own NaN type, and when used in a boolean expression, will return NaN instead of true or false. So the student changed -999 to NaN on import with Pandas and had a loop, checking each value against itself with an if statement. If the value was NaN the if statement would throw an exception (what could poor if do with NaN?) which the student caught, and in the catch incremented the NaN count.
JavaScript also has TypeError, which would be more appropriate here. Unfortunately, undefined has never been used well, and it has caused much more pain than it has brought interesting use cases.
It should return false, right? They are different types of thing, so they can’t be the same thing.
Or, maybe we could say that our variables just represent some ideal things, and if the ideal things they represent are equal, it is reasonable to call the variables equal. 1.0d0, 1.0, 1, and maybe “1” could be equal.
"return undefined" is incoherent in almost every language, and IEEE754 predates JavaScript by a decade.
>Shouldn't an operator on incompatible types return undefined? ;)
Please no, js devs rely too much on boolean collapse for that. Undefined would pass as falsy in many places, causing hard to debug issues.
Besides, conceptually speaking if two things are too different to be compared, doesn’t that tell you that they’re very unequal?
Interesting. So, having a comparison between incomparable types result in false -- what we have now -- is functionally equivalent, in an if-statement, to having the undefined evaluate to false... with the difference that the type coercion is currently one level lower (inside the == operator itself).
It kind of sounds like we need more type coercion because we already have too much type coercion!
I'm not sure what an ergonomic solution would look like though.
Lately I'm more in favour of "makes sense but is a little awkward to read and write" (but becomes effortless once you internalize it because it actually makes sense) over "convenient but not really designed so falls apart once you leave the happy path, and requires you to memorize a long list of exceptions and gotchas."
NaNs aren't always equal to each other in their bit representation either; most of the bits are kept as a "payload", which is not defined in the spec and can be anything. I believe the payload is actually used in V8 to encode more information in NaNs (NaN-boxing).
You’ve opened a rabbit hole for me
Also remember that NaN is represented in multiple ways bitwise:
https://en.wikipedia.org/wiki/NaN
Also you even have different kinds of NaN (signalling vs quiet)
Per IEEE 754, yes, but JS the language doesn't distinguish between NaN representations.
Correct.
I guess you could actually see what representation your browser is using with ArrayBuffer.
A quiet NaN in my case if I am doing things right:
'0000000000000000111110000111111100000000000000000000000000000000'
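For the curious, a sketch of that inspection with a DataView (the exact payload an engine stores for NaN is implementation-specific, but the exponent/mantissa structure is guaranteed by IEEE 754):

```javascript
// Write NaN into a buffer and read its 64 bits back out
const buf = new ArrayBuffer(8);
const view = new DataView(buf);
view.setFloat64(0, NaN); // DataView defaults to big-endian byte order

let bits = "";
for (let i = 0; i < 8; i++) {
  bits += view.getUint8(i).toString(2).padStart(8, "0");
}

// A double is NaN iff all 11 exponent bits are 1 and the mantissa is non-zero
console.log(bits.slice(1, 12));            // "11111111111"
console.log(bits.slice(12).includes("1")); // true
```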
Also, NaN is the only value in JS that isn't === to itself, so if for some reason you want to test for strict value identity with the value NaN of type number, that's one way to do it:
if(x !== x) ... // x is NaN
NaN comes from parsing results or from Infinity occurring in operations. I personally end up using Number.isFinite() more, which will be false in both occurrences when I need a real (haha) numeric answer.
Equality is a very slippery mathematical relationship. This observation formed the genesis of modern Category Theory [0].
NaN is an error monad.
[0] https://www.ams.org/journals/tran/1945-058-00/S0002-9947-194...
In the error monad NaN = NaN (or Nothing = Nothing or None = None, depending on your terminology) because mathematical equality is an equivalence relation. There are many foundational debates about equality, but whether or not it is an equivalence is never the question.
The root of the problem, completely overlooked by OP is that IEEE 754 comparison is not an equivalence relation. It's a partial equivalence relation (PER). It does have its utility, but these things can be weird and they are definitely not interchangeable with actual equivalence relations. Actual, sane, comparison of floating points got standardized eventually, but probably too late https://en.wikipedia.org/wiki/IEEE_754#Total-ordering_predic.... It's actually kinda nuts that the partial relation is the one that you get by default (no, your sorting function on float arrays does not sort it).
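The sorting point is easy to see in JS: with NaN in the mix, the usual `(x, y) => x - y` comparator violates the contract `Array.prototype.sort` expects (a consistent total order), so what you get back is engine-dependent:

```javascript
// The standard numeric comparator is not a total order once NaN appears
const cmp = (x, y) => x - y;
console.log(cmp(NaN, 1)); // NaN, neither negative, zero, nor positive

const a = [3, NaN, 1];
a.sort(cmp); // result order for the NaN entries is not guaranteed
console.log(a);
```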
console.log(new Array(16).join("wat"-1) + " Batman!")
Opened Web Inspector in Safari and pasted the above. (I knew what to expect but did not know how it would work … me trying to figure out what subtracting 1 from a string (ASCII?) would give you. But very related to this post.)
Indeed, this shtick was funnier 13 years ago. https://www.destroyallsoftware.com/talks/wat
JavaScript is a quirky, badly-designed language and I think that is common knowledge at this point.
tl;dr:
- NaN is a floating point number, and NaN != NaN by definition in the IEEE 754-2019 floating point number standard, regardless of the programming language, there's nothing JavaScript-specific here.
- In JS the global isNaN(v) returns true for NaN and anything that coerces to NaN (Number.isNaN is stricter and is true only for the value NaN). And in JS, s * n and n * s return NaN for any non-empty string s and any number n ("" * n returns 0). (EDIT: WRONG, see below)
> And in JS, s * n and n * s return NaN for any non empty string s and any number n ("" * n returns 0).
No? It is easy to verify that `"3" * 4` evaluates to 12. The full answer is that * converts its operands into primitives (with a hint of being number), and any string that can be parsed as a number converts to that number. Otherwise it converts to NaN.
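A quick check of that coercion:

```javascript
// The * operator converts its operands to numbers first
console.log("3" * 4);   // 12, a numeric string converts to its number
console.log("abc" * 4); // NaN, a non-numeric string converts to NaN
console.log("" * 4);    // 0, the empty string converts to 0
```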
Ah indeed, thanks for the correction, I edited message.
I always thought of NaN as more of the concept of not-a-number the way that infinity in math is not a specific value but the concept of some unbounded largest possible value.
Therefore, trying to do math with either (for example: NaN/NaN or inf./inf.) was to try to pin them down to something tangible and no longer conceptual — therefore disallowed.
You can use some form of real extensions, e.g. the extended real line (+inf, -inf is often useful for programmers) or the projectively extended real line (+inf = -inf).
This is not about infinity in math not being a _specific_ value, it can certainly be (the actual infinite instead of potential).
It's simply about design and foresight, in my humble opinion.
I'm sometimes wondering if a floating point format really needs to have inf, -inf and nan, or if a single "non finite" value capturing all of those would be sufficient
Not at all sufficient. NaN typically means that something has gone wrong—e.g. your precision requirements exceed that of the floating point representation you've selected, you've done a nonsensical operation. inf and -inf might be perfectly acceptable results depending on your application and needs.
My gut reaction is that both NaN == NaN and NaN != NaN should be false, or to put it another way, NaN != NaN returning True was a surprise to me.
Does Numpy do the same? That’s where I usually meet NaN.
In a perfect world, in my opinion, they are incomparable, and the equality operation should return False in both cases.
But equality is a very complicated concept.
I guess if the values are incomparable != preserves PEM
But Boolean algebra is a lattice, with two+ binary operators, where two of those are meet and join with a shared absorption property.
X == not not X being PEM, we lose that in NP vs co-NP and in vector clocks etc…
But that is just my view.
For the built-in float type in Python, the behaviour is a bit funny: `nan == nan` is False, yet `nan in [nan]` is True.
(Because the `in` operator assumes that identity implies equality...)
Well no, the `in` operator is just defined to produce results equivalent to `any(nan is x or nan == x for x in a)`; it is counterintuitive to the extent people assume that identity implies equality, but the operator doesn't assume that identity implies equality, it is defined as returning True if either is satisfied. [0]
Well, more precisely, this is how the operator behaves for most built in collections; other types can define how it behaves for them by implementing a __contains__() method with the desired semantics.
[0] https://docs.python.org/3/reference/expressions.html#members...
Yes, in numpy we also have that `np.float64(nan) != np.float64(nan)` evaluates to true.
A similar issue occurs in SQL, where NULL != NULL. [0] In both cases, our typical "equals" abstraction has become too leaky, and we're left trying to grapple with managing different kinds of "equality" at the same time.
Consider the difference between:
1. "Box A contains a cursed object that the human mind cannot comprehend without being driven to madness. Does Box B also contain one? ... Yes."
2. "Is the cursed object in Box A the same as the one in Box B? ... It... uh..." <screaming begins>
Note that this is not the same as stuff like "1"==1.0, because we're not mixing types here. Both operands are the same type, our problem is determining their "value", and how we encode uncertainty or a lack of knowledge.
[0] https://en.wikipedia.org/wiki/Null_(SQL)
SQL more elegantly introduces ternary logic in this case, where any comparison with NULL is itself NULL. This is sadly not possible in most languages where a comparison operator must always return a (non-nullable) boolean value.
To make a deep cut: https://thedailywtf.com/articles/what_is_truth_0x3f_
random thought: to see if something equals NaN, can't you just check for the stringified form of the number equaling "NaN"?
after all, the usual WTF lists for JS usually have a stringified NaN somewhere as part of the fun.
It can usually be implemented like this. No need for strings.
fun isNan(n) = n != n
Also no need to define that in JS, because Number.isNaN() has been around forever (note the case). The global isNaN() has been there from the very beginning, but use Number.isNaN() because it doesn't coerce the param to a number.
Ah, clever!
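The difference between the two built-ins is worth seeing side by side:

```javascript
// Global isNaN coerces its argument first; Number.isNaN does not
console.log(isNaN("foo"));        // true, "foo" coerces to NaN
console.log(Number.isNaN("foo")); // false, the value itself is not NaN
console.log(Number.isNaN(NaN));   // true
```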
well, there is also one weird quirk I assumed would be included in this article:
because a <= b is defined as !(a > b)
then:
5 < NaN // false
5 == NaN // false
5 <= NaN // true
Edit: my bad, this does not work with NaN, but you can try `0 <= null`
IEEE 754 specifically prohibits that definition, and JavaScript indeed evaluates `5 <= NaN` to false.
Yep, my memory was incorrect here and I didn't have access to a computer, but it is true with `0 <= null`
This is because null coerces to 0 in JS so this is effectively 0 <= 0. NaN is already a `number` so no coercion happens.
Note that == has special rules, so 0 == null does NOT coerce to 0 == 0. If using == null, it only equals undefined and itself.
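The contrast is easy to demonstrate:

```javascript
// Relational operators coerce null to 0, but == has a special rule for null
console.log(0 <= null);         // true, null coerces to 0 for <=
console.log(0 == null);         // false, no coercion to 0 happens for ==
console.log(null == undefined); // true, null loosely equals only undefined
```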
Tangentially related: one of my favourite things about JavaScript is that it has so many different ways for the computer to “say no” (in the sense of “computer says no”): false, null, undefined, NaN, boolean coercion of 0/“”, throwing errors, ...
It's common to see groaning about double-equal vs triple-equal comparison and eye-rolling directed at absurdly large tables like in https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guid... but I think it's genuinely great that we have the ability to distinguish between concepts like "explicitly not present" and "absent".
Another use for NaN's is, suppose you have an array of sensors. Given enough sensors, you're pretty much guaranteed that some of the sensors will have failed. So the array needs to continue to work, even if degraded.
A failed sensor can indicate this by submitting a NaN reading. Then, any subsequent operations on the array data will indicate which results depended on the failed sensor, as the result will be NaN. Just defaulting to zero on failure will hide the fact that it failed, and the end results will not be obviously wrong.
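A quick sketch of how that propagation works in practice (the readings here are made up):

```javascript
// A failed sensor reports NaN, and the aggregate carries the failure along
const readings = [21.5, 22.1, NaN, 21.9]; // third sensor failed

const mean = readings.reduce((sum, r) => sum + r, 0) / readings.length;
console.log(Number.isNaN(mean)); // true, the failure is visible in the result
```

Had the failed sensor defaulted to 0 instead, the mean would have come out plausible-looking but wrong.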
Slightly off topic, I hate that typescript bundles NaN under the number type and the signature for parseInt is number.
See also: Tom7's "NaN gates and Flip FLOPS": https://www.youtube.com/watch?v=5TFDG-y-EHs
Imagine that society calls the people who have to work with these toys during office hours, engineers
What's wrong with that? No engineering is 100% strict; there is always ambiguity at the edges
> type(NaN) -> "number"
NaN should have been NaVN, not a valid number.
What's the difference between something that's "not a number" and something that's "a number but not a valid one"?
I'm now remembering the differences between "games" and "numbers" in Surreal numbers. :D
A string (like "20") that can be coerced to a number?
Isn't that just not a number? It's a text string