Avoiding Unintended Dimension Reduction

A common scenario: you need to extract one row from a matrix, and you still want to apply matrix operations to that ‘one-row submatrix’.

> z <- matrix(1:8, nrow=4)
> z
     [,1] [,2]
[1,]    1    5
[2,]    2    6
[3,]    3    7
[4,]    4    8
> r <- z[3, ]
> r
[1] 3 7
> attributes(z)
$dim
[1] 4 2

> attributes(r)
NULL
> str(z)
 int [1:4, 1:2] 1 2 3 4 5 6 7 8
> str(r)
 int [1:2] 3 7

As you can see, when you extract a row from a four-row matrix, you get a vector, not a one-row matrix. That seems natural, but in many cases it causes trouble in programs that do a lot of matrix operations.
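Here is a small sketch of the kind of trouble this can cause, reusing the z matrix from above. The choice of apply() as the failing matrix operation and the variable name res are just for illustration.

```r
z <- matrix(1:8, nrow = 4)
r <- z[3, ]   # the dim attribute is dropped, so r is a plain vector

# Functions that rely on the dim attribute no longer behave as for a matrix.
nrow(r)       # NULL

# apply() requires a dim attribute, so it stops with an error on the vector.
# tryCatch() is used here only so that the script keeps running.
res <- tryCatch(apply(r, 1, sum), error = function(e) "error")
res           # "error"
```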

The good news is that R has a way to suppress this kind of dimension reduction, with the drop argument.

> r <- z[3,, drop=FALSE]
> r
     [,1] [,2]
[1,]    3    7

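Alternatively, you can explicitly convert the vector back to a matrix with the as.matrix() function. Keep in mind that as.matrix() turns a vector into a one-column matrix, so you need t() as well if you want a one-row matrix.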
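A minimal sketch comparing the shapes each approach gives, reusing the z matrix from above. The names r_vec, r_mat, r_col and r_row are made up for this example.

```r
z <- matrix(1:8, nrow = 4)

# Default subsetting: the row comes back as a vector without dimensions.
r_vec <- z[3, ]
is.matrix(r_vec)   # FALSE

# drop = FALSE: the row stays a matrix with 1 row and 2 columns.
r_mat <- z[3, , drop = FALSE]
dim(r_mat)         # 1 2

# as.matrix() on a vector produces a column, i.e. 2 rows and 1 column.
r_col <- as.matrix(r_vec)
dim(r_col)         # 2 1

# Transposing the column gives the one-row shape again.
r_row <- t(r_col)
dim(r_row)         # 1 2
```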

Plus: the drop argument works not only for matrices but also for data frames.
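For data frames, the same idea can be sketched like this. The data frame df and its columns a and b are invented for the example.

```r
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

# Selecting a single column drops the result to a plain vector by default.
col_vec <- df[, "a"]
is.data.frame(col_vec)   # FALSE

# With drop = FALSE the result is still a data frame with one column.
col_df <- df[, "a", drop = FALSE]
is.data.frame(col_df)    # TRUE
dim(col_df)              # 3 1
```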


Reference: The Art of R Programming by Norman Matloff

Using seq() function to deal with the empty-vector problem

The for loop is probably the most common control structure we use in R programming. The code normally looks like this:

for (i in 1:length(x)) {}

It works well in most cases. However, when the vector x is empty, length(x) is 0 and 1:length(x) evaluates to c(1, 0), so the loop body runs twice with invalid indices instead of not running at all. A better way to handle this is to use the seq() function.

for (i in seq(x)) {}
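To make the difference concrete, this small sketch counts how many times each loop body runs for an empty x. The counters n_colon and n_seq are hypothetical names used only here.

```r
x <- NULL   # an empty vector

# 1:length(x) is 1:0, which counts down and yields c(1, 0).
n_colon <- 0
for (i in 1:length(x)) n_colon <- n_colon + 1
n_colon     # 2

# seq(x) is empty for an empty x, so the body never runs.
n_seq <- 0
for (i in seq(x)) n_seq <- n_seq + 1
n_seq       # 0
```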

Now let’s see how the seq() function handles an empty vector.

> x <- c(4, 10)
> seq(x)
[1] 1 2
> x <- NULL
> seq(x)
integer(0)

For a nonempty vector, seq(x) gives the same indices as 1:length(x), but when x is empty it evaluates to an empty integer vector, integer(0), resulting in zero iterations of the loop.
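One caveat worth knowing: when x is a single number, seq(x) counts from 1 up to that number rather than returning the index 1. Base R's seq_along() always returns the indices of its argument, so it may be the safer choice in loops. The sketch below is only an illustration, and the variable names are made up.

```r
x <- 5

# seq() treats a single number as an upper bound.
s1 <- seq(x)
s1                 # 1 2 3 4 5

# seq_along() returns the indices of x, here just 1.
s2 <- seq_along(x)
s2                 # 1

# For an empty vector, seq_along() is also empty.
s3 <- seq_along(NULL)
length(s3)         # 0
```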


Reference: The Art of R Programming by Norman Matloff

Create a numeric vector in R: using : or c() ?

Did you know that in R, : and c() give different results when you create a numeric vector?

See the example below.

> x <- 1:2
> y <- c(1, 2)
> identical(x, y)
[1] FALSE
> typeof(x)
[1] "integer"
> typeof(y)
[1] "double"

So : produces integers, while c() applied to plain numeric literals such as 1 and 2 produces floating-point (double) numbers.
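A short sketch of what this means in practice: the values still compare as equal, and the L suffix makes c() produce integers too. The variable name y_int is introduced just for this example.

```r
x <- 1:2
y <- c(1, 2)

# The values are equal element by element, even though the types differ.
all(x == y)                 # TRUE
isTRUE(all.equal(x, y))     # TRUE

# identical() also compares the storage type, so it reports a difference.
identical(x, y)             # FALSE

# Integer literals written with the L suffix make c() return integers.
y_int <- c(1L, 2L)
typeof(y_int)               # "integer"
identical(x, y_int)         # TRUE
```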


Reference: The Art of R Programming by Norman Matloff