If you have seen one of my old posts: Interesting unequal math equation, you would know there is an accuracy problem in R. And I give an explanation in that post: “Most float number has no exact representation in binary format, just approximation”. Here I decide to dig a litter bit deeper.
Let’s look at some examples first.
See, notice the number there, 2.220446e-16. Do you think it’s just a coincidence ?
Of course not.
Thanks to Google, I find a detailed explanation about this problem.
Real numbers in R are stored in double precision, which means that 53 bit floating point arithmetic in base 2 is used. This may be seen from
> 1 + 2^-52 == 1
> 1 + 2^-53 == 1
The number 1 + 2^-52 with a 53 bit mantissa is exactly representable, while 1 + 2^-53 with 54 bit mantissa is rounded to 1. The smallest difference between two consecutive representable numbers in the interval [1 , 2) is about 2.220446e-16 which exactly equals to 2^-52.
Double precision is the standard for numerical calculations, where speed is required. This cannot represent irrational numbers and rational numbers, whose denominator is not a power of 2. In particular, numbers with a finite number of decimal digits need not have a finite expansion as a binary number. This is the reason for the following
> 0.1 + 0.2 - 0.3
Similar effects may be demonstrated using decimal numbers. The reason for the above is similar to the reason, why 2/3 – 1/3 – 1/3 is not 0, if 1/3 and 2/3 are rounded to a finite number of decimal digits. With 5 digits, we get 0.66667 – 0.33333 – 0.33333 = 0.00001.
The fact that numbers like 0.1 are not represented exactly does not mean that we cannot get correct result, at least in simple cases, if the calculations are done with care. In particular, for correcting errors of addition and subtraction of fractional decimal numbers, the functions round() and signif() may be used.