The difference between the subset() function and ordinary filtering

At first, I thought there was no difference between these two methods, and I normally used them interchangeably when writing R code.

But there is actually a small difference in how NA values are handled.

> x <- c(6, 1, NA, 10)
> x
[1]  6  1 NA 10
> x[x > 5]
[1]  6 NA 10
> subset(x, x > 5)
[1]  6 10
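
To see where the NA comes from, it can help to look at the logical vector that does the indexing. This is a minimal sketch reusing x from above; the names keep, bracket_result and subset_result are only for illustration.

```r
x <- c(6, 1, NA, 10)

# Comparing NA with 5 gives NA, not FALSE
keep <- x > 5
print(keep)            # [1]  TRUE FALSE    NA  TRUE

# Indexing with an NA in the logical vector returns an NA element
bracket_result <- x[keep]
print(bracket_result)  # [1]  6 NA 10

# subset() treats an NA in the condition as FALSE and drops that element
subset_result <- subset(x, keep)
print(subset_result)   # [1]  6 10
```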

So when your data have missing values, as survey data often do, choose between subset() and ordinary filtering carefully. Ordinary filtering returns an NA for every element where the condition is NA, while subset() drops those elements. This small difference can cause subtle mistakes that take a long time to debug.
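
If you prefer ordinary filtering but want the same result as subset(), one option is to remove the NA cases from the condition yourself. The sketch below shows two common ways using base R; the variable names with_which and with_isna are made up for this example.

```r
x <- c(6, 1, NA, 10)

# which() returns only the positions where the condition is TRUE,
# so positions where it is NA are skipped
with_which <- x[which(x > 5)]
print(with_which)   # [1]  6 10

# Explicitly require that the value is not missing
with_isna <- x[!is.na(x) & x > 5]
print(with_isna)    # [1]  6 10
```

Both forms give the same values as subset(x, x > 5) here. Which one reads better is mostly a matter of taste.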


Reference: The Art of R Programming by Norman Matloff

Posted in R Programming Tips.
