Mauro Bringolf

Currently geeking out as a WordPress developer at WebKinder and a student of computer science at ETH.

JS Numbers: There are no integers, but how many?

January 3, 2018

This is the first post of what I hope will become a series about numbers in JavaScript. It is supposed to be a summary of my findings for the open source project xtuc/js-webassembly-interpreter1. JavaScript is dynamically typed, which means variables are not bound to contain only values of one type. This is in contrast to statically typed languages like C, C++ or Java. One consequence is that JavaScript does not distinguish between different number types such as int, double or long. Everything is abstracted as a number:

// So simple, I love it
typeof 7 // "number"
typeof .3e-10 // "number"
typeof 0x234 // "number"

// Okay I can live with this
typeof Infinity // "number"

// Now wtf is this
typeof NaN // "number"

This post is not about the typeof operator though. I think it makes perfect sense that “Not-a-Number” is a number and we might see why in a later post. The takeaway here is that all these values are of the same primitive type.

To summarize: In JavaScript, all numerical values are of the same type called “number”. I am trying to explain quite a few things in this article from the ground up, so depending on what you already know, you might want to skip some parts.

Integers and floats in bits

As we know, JavaScript and other programming languages are high-level abstractions for bits changing inside computers. At some point in the chain of abstractions, a number has to become a sequence of bits. One method that you might know is the binary representation of natural numbers. Here is how a sequence of bits can be interpreted as a natural number with that method:

\( 1011_2 = 1 * 8 + 0 * 4 + 1 * 2 + 1 * 1 = 8 + 2 + 1 = 11_{10} \)

I used subscripts to indicate that the result is to be read as the number eleven (base 10), but the input as a binary string. As you can see, each bit can add a power of two corresponding to its position in the bit string. If you find this confusing, think about how we write down numbers and determine their values. You will see that you do this every day, but with powers of ten instead of two.
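JavaScript can actually go back and forth between numbers and binary strings, so we can check this very example in the console:

(11).toString(2) // "1011"
parseInt("1011", 2) // 11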

The representation above is simple and unique for all natural numbers. That is great, but we might want negative numbers too. The simplest solution would be to store one additional bit and let it decide about the sign of the value. That is called sign-magnitude representation and can be done, but it turns out that it makes hardware operating on those bit strings unnecessarily complicated. That is why a different representation called two’s complement is used. If you’re interested in how that works, I have written about it before2 but its details are not necessary to understand integers in JS.
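As a quick aside, JavaScript’s bitwise operators already use two’s complement: they coerce their operands to 32-bit signed integers, and the unsigned shift >>> 0 lets us print the underlying bit pattern of a negative integer:

// -11 as a 32-bit two's complement bit string
(-11 >>> 0).toString(2) // "11111111111111111111111111110101"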

Assume we have a way to represent all integers, positive and negative. That is still not good enough, because the real world deals with real numbers (not really, if you’re into math). We definitely want values like 94.53471 to work with data in any meaningful way. These numbers are called floating point numbers. Two’s complement cannot handle them, but there is a widely accepted standard called IEEE 754 which describes how to represent floating point numbers with bit strings.

To summarize: A sequence of bits is not a number. Different ways of interpreting the same bit string can yield different values. The rules of interpretation are defined by the type associated with the value. Conversely, the same value can be represented by different bit strings depending on its type.
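We can observe this directly in JavaScript with typed arrays. In the small sketch below, the same eight bytes are read once as a 64-bit float and once as a 32-bit unsigned integer, and the two interpretations have nothing to do with each other:

const view = new DataView(new ArrayBuffer(8))
view.setFloat64(0, 1)

view.getFloat64(0) // 1
view.getUint32(0) // 1072693248, the upper 32 bits of the very same bit string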

JavaScript integers are floating point numbers

If you put together the two main points from above you might see where this is going: JavaScript uses the same representation to encode all numerical values. Here is an excerpt from the ECMAScript specification3:

“The Number type has exactly \(18437736874454810627\) (that is, \(2^{64}-2^{53}+3\)) values, representing the double-precision 64-bit format IEEE 754-2008 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic.”

Your reaction to this statement depends mainly on your background: If you come from a place where bits and bytes are too close to the metal to be interesting, then you might consider this fact perfectly reasonable. Floating point can do everything we need, so why not have a nice abstraction for it and never worry about tricky conversion rules between number types? However, if you program in C or other languages that sit just above the assembly level and deal more directly with memory, you probably think that this is just crazy. I personally do not think that this design is particularly good or bad, but I want to understand its consequences.
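Whatever your reaction, the consequences are easy to observe; none of the following makes any distinction between integers and floats:

1 === 1.0 // true, integer literals are doubles too
3 / 2 // 1.5, there is no integer division
0.1 + 0.2 // 0.30000000000000004, the usual floating point rounding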

Of course, this does not mean that you cannot use integers in JavaScript. It just means that they are represented using the floating point standard. A natural question now is:

What range of integers can be encoded using 64-bit floating point representation?

This range is clearly not empty since there are integer values in JavaScript. But it is also clear that it contains fewer integers than a 64-bit two’s complement representation, which is used for large integers in C for example. One bit string can only represent one number, after all. So the fact that 0.5 exists in JS means that there is at least one 64-bit integer that is missing in JS. Don’t think about this too hard; we will count all quantities properly in the next paragraph.

How are floating point numbers represented?

In order to understand what integers exist in floating point, we need a clear understanding of how floating point numbers are encoded. Once we know the meaning of each of the 64 bits, we can look at which of these combinations represent integers. A floating point number is represented by three parts: a sign bit \( s \), a mantissa \( m \) and an exponent \( e \). Together they encode the value \( (-1)^s * m * 2^e \).

The idea behind these numbers is easy to explain: The mantissa \( m \) is some decimal, e.g. \( 1.625 \), and the exponent \( e \) moves the decimal point to the left or right in that number. The sign bit can be used to add a minus sign in front of it. The advantage of this representation is the floating decimal point: it allows us to go from very small to very large values without losing precision. If you want to dig further into this, I suggest you compare this representation to fixed point arithmetic4.

Now that we understand the idea, it is time to look at it bit for bit. Above I cheated by saying how we interpret the three numbers \( s, m, e \) but not how they are encoded in bits. According to the standard, one bit is used for \( s \), \( 11 \) bits for \( e \) and \( 52 \) bits for \( m \), which adds up to the total of 64 bits we want. The sign bit is the easiest: \( 0 \) means positive and \( 1 \) means negative. The exponent is an integer, so we can use 11-bit two’s complement to encode it as a bit string5. The mantissa is between \( 1 \) and \( 2 \), so we use a simple trick to store it: ignore the leading \( 1. \) part and just store the rest as a binary fraction. A binary fraction works just like a decimal fraction:

\begin{align*}
0.625_{10} = \frac{6}{10} + \frac{2}{100} + \frac{5}{1000} = \frac{625}{1000} = \frac{5}{8} \\
0.101_2 = \frac{1}{2} + \frac{0}{4} + \frac{1}{8} = \frac{5}{8}
\end{align*}

Again, note how a sequence of bits is not a number but can be interpreted as different numbers. The exponent moves the point in the binary fraction, not in the decimal one, which is not the same thing. So if we have the mantissa as above and exponent \( 3 \), the represented value is \( 1101.0_2 = 13_{10} \). Positive exponents shift the point to the right and negative ones to the left. In case you are wondering how the value \( 0.625 \) relates to 13: with the implicit leading \( 1 \), this value represents the mantissa \( 1 + 0.625 = \frac{13}{8} \). Shifting the point 3 places to the right is the same as multiplying by \( 2^3 \), yielding a result of \( \frac{13}{8} * 8 = 13 \).
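To make this concrete, here is a small sketch that extracts the three fields from the bit pattern of \( 13 \) using a DataView. It only reads the upper 32 bits, which contain the sign, the exponent and the top of the mantissa; the bias mentioned in footnote 5 shows up as the constant 1023:

const view = new DataView(new ArrayBuffer(8))
view.setFloat64(0, 13)

// DataView defaults to big-endian, so the first 32 bits are the most significant
const hi = view.getUint32(0)

console.log(hi >>> 31) // 0, the sign bit: positive
console.log(((hi >>> 20) & 0x7ff) - 1023) // 3, the exponent with the bias removed
console.log((hi & 0xfffff).toString(2)) // "10100000000000000000", mantissa starts with 101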

What integers exist in 64-bit floating point?

So we found the representation of the integer \( 13 \). What is the condition for the result to become an integer? Any binary digit after the fractional point has to be zero. As we have seen, the binary representation of the mantissa always starts with \( 1. \) followed by \( 52 \) digits. So if the binary representation of a natural number \( 1b_2b_1b_0 \) is of length \( 4 \), we can represent that number in floating point by setting the mantissa bits to \( b_2b_1b_0 0 \dots 0 \) and the exponent to \( 3 \). This is exactly what we did for \( 13 \) above. The exact same procedure works as long as we manage to shift all digits to the left of the decimal point, so this works for natural numbers whose binary representations are up to \( 53 \) digits long.
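In JavaScript we can convince ourselves of this 53-digit limit directly, since toString(2) prints the binary representation:

// The largest natural number whose binary representation has 53 digits
const max = Math.pow(2, 53) - 1

max.toString(2).length // 53
max // 9007199254740991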

The maximum integer in 53-bit binary is \( 2^{53} - 1 \), so with the method above, including the sign bit, we can represent all integers \( -(2^{53} - 1), \dots, -1, 0, 1, \dots, 2^{53} - 1 \). This includes the range of 32-bit signed integers, which goes from \( -2^{31} \) to \( 2^{31} - 1 \). Good news: it means that the range is not smaller than working with ints in C for example. If you have a feeling for powers of two, you can see that 64-bit floating point in fact covers a lot more integers than a 32-bit integer type. But it also covers a considerably smaller range of integers than a 64-bit integer representation6. At the time of writing, there exists a stage 3 proposal to add 64-bit integers to JavaScript7.
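JavaScript exposes exactly this boundary as Number.MAX_SAFE_INTEGER, and just above it we run into the first of the “holes” mentioned in footnote 6:

Number.MAX_SAFE_INTEGER === Math.pow(2, 53) - 1 // true

// 2^53 + 1 is the first natural number without an exact representation
Math.pow(2, 53) + 1 === Math.pow(2, 53) // true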

Not all is lost

To finish the first article of this series, I would like to mention that everything said so far is about what the language specification says. JavaScript engines are free to use whatever representation of numbers they want, as long as they make it look like they use what is defined in the specification. Actually, if you read the quote about the number type from the specification carefully, it never says what representation an engine should use. It only says that its values and behavior must be as if they were IEEE 754-2008 64-bit floating point numbers. To make the final statement as paradoxical as its title: even though JS does not have an integer type, an engine might still use integer types to represent numbers.

References

  1. https://github.com/xtuc/js-webassembly-interpreter
  2. http://beta.maurobringolf.ch/2017/05/a-formal-approach-to-twos-complement-binary-representation-of-integers/
  3. https://www.ecma-international.org/ecma-262/8.0/index.html#sec-ecmascript-language-types-number-type
  4. https://en.wikipedia.org/wiki/Fixed-point_arithmetic
  5. In reality a mechanism called bias is used instead of two’s complement. It does not matter for understanding the format though.
  6. I have not shown all integers here. For example \( 2^{100} \) can be represented by setting the mantissa to \( 1 \) and the exponent to \( 100 \). But there are “holes” once you go past the range described, so these values are less useful to work with.
  7. https://github.com/tc39/proposal-bigint