Interesting unequal math equation

Well, I saw an interesting problem this morning. See the code below.

> 1.37+0.12 == 1.49
[1] FALSE
> 1.36+0.12 == 1.48
[1] TRUE

It looks weird, right? I googled the problem and found an explanation along these lines: most decimal fractions have no exact representation in binary floating point, only an approximation. It isn't the clearest explanation, but at least we know what's going on: the two sides of the first comparison differ by a tiny rounding error.

> 1.37+0.12-1.49
[1] 2.220446e-16
> 1.36+0.12-1.48
[1] 0

So if you need this kind of comparison in an if statement, you may run into trouble. One solution is to compare the difference against a small tolerance: 1.37+0.12-1.49 > -1e-10 and 1.37+0.12-1.49 < 1e-10. It looks ugly, but it works.

There is also a better way to handle this in R: the all.equal() function, which tests whether two objects are nearly equal. It returns TRUE when they are and a character description of the difference when they are not, so wrap it in isTRUE() when you use it as an if condition.

> if (1.37+0.12 == 1.49) {cat('Match')}
> if (-1e-10 < 1.37+0.12-1.49 && 1.37+0.12-1.49 < 1e-10)
+ {cat('Match')}
Match
> if (isTRUE(all.equal(1.37+0.12, 1.49))) {cat('Match')}
Match
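
As a minimal sketch, the tolerance comparison can be wrapped in a small helper so the if condition stays readable. The function name nearly_equal and the default tolerance of 1e-10 are assumptions made for illustration; they are not part of base R.

```r
# Hypothetical helper: TRUE when a and b differ by less than tol.
nearly_equal <- function(a, b, tol = 1e-10) {
  abs(a - b) < tol
}

exact      <- (1.37 + 0.12 == 1.49)                 # exact comparison
within_tol <- nearly_equal(1.37 + 0.12, 1.49)       # tolerance comparison
builtin    <- isTRUE(all.equal(1.37 + 0.12, 1.49))  # base R alternative

# When the values differ, all.equal() returns a character string rather
# than FALSE, so using it bare inside if () would raise an error.
different <- all.equal(1, 2)

if (within_tol) cat("Match\n")
```

Here exact is FALSE while within_tol and builtin are both TRUE, and different holds a message describing the difference instead of a logical value.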

Recursive or non-recursive list

In R, lists can be recursive, which means you can have lists within lists.

> c(list(a=1, b=2, c=list(d=4, e=5)))
$a
[1] 1

$b
[1] 2

$c
$c$d
[1] 4

$c$e
[1] 5

The code above creates a three-component list, with the c component of the main list itself being another list.

However, sometimes you want a single flat vector instead of a recursive list. You can get one by setting the optional recursive argument of c() to TRUE. (It’s odd that setting recursive to TRUE actually gives you a non-recursive result.)

> c(list(a=1, b=2, c=list(d=4, e=5)), recursive=TRUE)
  a   b c.d c.e
  1   2   4   5
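
For comparison, here is a short sketch that flattens the same nested list with unlist(), which is the more commonly seen way to do it. The variable names are illustrative only.

```r
nested <- list(a = 1, b = 2, c = list(d = 4, e = 5))

# Both calls descend into the inner list and return a named numeric vector.
flat_c      <- c(nested, recursive = TRUE)
flat_unlist <- unlist(nested)

is.list(flat_c)   # no longer a list
names(flat_c)     # inner names are prefixed with the outer name, e.g. "c.d"
```

Note that the result is an atomic vector rather than a list, so this only makes sense when the components can share one type.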

Reference: The Art of R Programming by Norman Matloff

Avoiding Unintended Dimension Reduction

It’s a common scenario: you extract one row from a matrix and still want to apply matrix operations to this ‘one-row submatrix’.

> z <- matrix(1:8, nrow=4)
> z
     [,1] [,2]
[1,]    1    5
[2,]    2    6
[3,]    3    7
[4,]    4    8
> r <- z[3, ]
> r
[1] 3 7
> attributes(z)
$dim
[1] 4 2

> attributes(r)
NULL
> str(z)
 int [1:4, 1:2] 1 2 3 4 5 6 7 8
> str(r)
 int [1:2] 3 7

See, when you extract a row from a four-row matrix, you get a vector, not a one-row matrix. That seems natural, but in many cases it causes trouble in programs that do a lot of matrix operations.

The good news is that R has a way to suppress this kind of dimension reduction, with the drop argument.

> r <- z[3,, drop=FALSE]
> r
     [,1] [,2]
[1,]    3    7

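Alternatively, you can convert the vector back to a matrix explicitly. Be careful with as.matrix() here: it turns a vector into a one-column matrix, so to get a one-row matrix use matrix(r, nrow=1) or t(r) instead.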

Plus: the drop argument works not only for matrices but also for data frames, where it matters most when you select a single column.
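
The following sketch puts the options side by side by checking the dimensions each one produces. The data frame df and its column names are made up for illustration.

```r
z <- matrix(1:8, nrow = 4)

row_dropped <- z[3, ]                  # plain vector, dim is NULL
row_kept    <- z[3, , drop = FALSE]    # 1 x 2 matrix
as_col      <- as.matrix(z[3, ])       # 2 x 1 matrix (one column)
as_row      <- t(z[3, ])               # 1 x 2 matrix (one row)

# Illustrative data frame: selecting one column drops to a vector
# unless drop = FALSE is given.
df <- data.frame(u = 1:3, v = 4:6)
col_dropped <- df[, "u"]
col_kept    <- df[, "u", drop = FALSE]

dim(row_kept)
dim(as_col)
is.data.frame(col_kept)
```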

Reference: The Art of R Programming by Norman Matloff

Using seq() function to deal with the empty-vector problem

The for loop might be the most common control structure in R programming. The code normally looks like this:

for (i in 1:length(x)) {}

This works in most cases. However, when the vector x is empty, 1:length(x) evaluates to 1:0, which is the vector (1, 0), so the loop body runs twice with invalid indices instead of not running at all. A better way to handle this is to use the seq() function.

for (i in seq(x)) {}

Let’s see how the seq() function handles an empty vector.

> x <- c(4, 10)
> seq(x)
[1] 1 2
> x <- NULL
> seq(x)
integer(0)

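For a non-empty vector, seq(x) gives the same result as 1:length(x), but if x is empty it evaluates to integer(0), an empty vector, resulting in zero iterations of the loop. One caveat: when x is a single number, seq(x) counts from 1 up to that number, so seq_along(x) is the safer choice.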
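
A small sketch that counts how many times a loop body would run for each way of building the index. The helper count_iterations is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: counts how many times a for loop over `index` runs.
count_iterations <- function(index) {
  n <- 0
  for (i in index) n <- n + 1
  n
}

x <- NULL
runs_colon     <- count_iterations(1:length(x))   # index is 1:0, i.e. c(1, 0)
runs_seq       <- count_iterations(seq(x))        # index is integer(0)
runs_seq_along <- count_iterations(seq_along(x))  # index is integer(0)

# With a single number, seq() and seq_along() disagree.
y <- 10
len_seq       <- length(seq(y))        # seq(10) is 1, 2, ..., 10
len_seq_along <- length(seq_along(y))  # seq_along(10) is just 1
```

For the empty vector, the 1:length(x) version runs twice while both seq() versions run zero times. For the single value 10, seq(y) has ten elements but seq_along(y) has one.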

Reference: The Art of R Programming by Norman Matloff

Create a numeric vector in R: using : or c() ?

Did you know that in R, : and c() behave differently when you create a numeric vector?

See the example below.

> x <- 1:2
> y <- c(1, 2)
> identical(x, y)
[1] FALSE
> typeof(x)
[1] "integer"
> typeof(y)
[1] "double"

So : produces integers, while c(1, 2) produces doubles, because the numeric literals 1 and 2 are themselves floating-point numbers.
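
If you need the two forms to match, one option is to write integer literals with the L suffix or to convert explicitly. A brief sketch, with illustrative variable names:

```r
x <- 1:2         # integer
y <- c(1, 2)     # double
z <- c(1L, 2L)   # integer, thanks to the L suffix

same_as_literal   <- identical(x, z)              # same type and values
same_values       <- all(x == y)                  # == compares values only
same_after_coerce <- identical(x, as.integer(y))  # explicit conversion

typeof(z)
```

identical() is strict about type, which is why it reports FALSE for x and y even though every element compares equal with ==.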

Reference: The Art of R Programming by Norman Matloff
