Character string functions provided by base R

| Function | Description | Example |
|---|---|---|
| Basic character string functions | | |
| nchar(x) | Return the string length | nchar("Hello") #5 |
| toupper(x) | Convert the string to upper case | toupper("hello world") #"HELLO WORLD" |
| tolower(x) | Convert the string to lower case | tolower("Hello World") #"hello world" |
| strtrim(x, width) | Trim character strings to specified display widths | strtrim("Hello", 2) #"He" |
| paste(…, sep = " ") | Concatenate vectors after converting to character | paste("x", 1:3, sep = "") #"x1" "x2" "x3" |
| | | paste(c("x", "y", "z"), 1:3, sep = "M") #"xM1" "yM2" "zM3" |
| | | paste("Hello", "World", sep = " ") #"Hello World" |
| Substring and pattern functions (patterns are regular expressions unless fixed = TRUE) | | |
| substr(x, start, stop) or substr(x, start, stop) <- value | Extract or replace substrings in a character vector | substr("Hello World", 1, 5) #"Hello" |
| | | x <- "Hello World" |
| | | substr(x, 1, 5) <- "Goodbye" |
| | | x #"Goodb World" (only the five characters in the range are replaced) |
| sub(pattern, replacement, x) or gsub(pattern, replacement, x) | sub() replaces the first match and gsub() replaces all matches | sub("\\s", ".", "Hello World") #"Hello.World" |
| strsplit(x, split) | Split the elements of a character vector x into substrings according to the matches to substring split within them; the result is a list | strsplit("a.b.c", ".", fixed = TRUE) #a list whose only element is "a" "b" "c" |
| grep(pattern, x) | Search for matches to argument pattern within each element of a character vector and return the indices of the matching elements | grep("foo", c("arm", "foot")) #2 |
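
The fixed argument decides whether a pattern is read as a regular expression or as a literal string. Here is a small sketch of the difference, using only base R (the variable name s is just for illustration):

```r
s <- "a.b.c"

# As a regular expression, "." matches any character,
# so every character of s is replaced.
gsub(".", "-", s)                    # "-----"

# With fixed = TRUE the dot is matched literally.
gsub(".", "-", s, fixed = TRUE)      # "a-b-c"

# Escaping the dot has the same effect as fixed = TRUE here.
gsub("\\.", "-", s)                  # "a-b-c"

# strsplit() returns a list with one element per input string,
# so [[1]] extracts the pieces of the first (and only) string.
strsplit(s, ".", fixed = TRUE)[[1]]  # "a" "b" "c"
```

When the pattern is plain text, fixed = TRUE is usually the less surprising choice, because characters such as . and * lose their special meaning.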

Use the superassignment operator

One of the most important principles of functional programming is that functions do not change non-local variables; that is, generally speaking, the code in a function has only read access to its non-local variables. This important feature protects higher-level variables from being changed by local functions. See the example below.

> x <- 10
> test <- function(x) {
+   x <- x - 5
+   print(x)
+ }
> test(x)
[1] 5
> x
[1] 10

However, sometimes you may wish to write to a global variable, or to any variable at a level higher than the one at which your write statement exists. The superassignment operator, <<-, and the assign() function are what you want. Let’s look at the superassignment operator first.
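
As a minimal sketch of both approaches (the names counter, increment, set_total and total are made up for this illustration):

```r
counter <- 0

increment <- function() {
  # <<- searches the enclosing environments for 'counter'
  # and modifies it there instead of creating a local copy.
  counter <<- counter + 1
  invisible(counter)
}

increment()
increment()
counter  # 2

set_total <- function(value) {
  # assign() writes to a variable, given by name as a string,
  # in the environment passed as envir.
  assign("total", value, envir = .GlobalEnv)
}

set_total(42)
total  # 42
```

Both functions change a variable outside their own scope, which is exactly what an ordinary <- inside a function cannot do.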

get() function in R

The get() function might be one of the most useful utilities in R. However, I had never used it before writing this page. Shame on me.

Well, let’s get to the point. The job of the get() function is actually quite simple: given the name of an object, it fetches the object itself. See the example below:

> x <- c(1:3)
> x
[1] 1 2 3
> get("x")
[1] 1 2 3

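It’s easy to imagine how useful this function is when the name of an object is known only at run time, for example when looping over a vector of variable names.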
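
One way this can be put to use, as a small sketch (the objects height and weight and their values are invented for this example):

```r
height <- c(150, 160, 170)
weight <- c(50, 60, 70)

# The names of the objects are stored as plain strings.
var_names <- c("height", "weight")

for (name in var_names) {
  values <- get(name)  # fetch the object whose name is stored in 'name'
  cat(name, "mean:", mean(values), "\n")
}

# sapply() can collect the results in a vector named after the objects.
means <- sapply(var_names, function(name) mean(get(name)))
means  # height is 160, weight is 60
```

The loop never mentions height or weight directly, so adding another variable only means adding its name to var_names.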

 

Reference: The Art of R Programming by Norman Matloff

Using the seq() function to deal with the empty-vector problem

Well, the for loop might be the most common control structure used in R programming. The code normally looks like this:

for (i in 1:length(x)) {}

It works well in most cases. However, when the vector x is empty, 1:length(x) evaluates to 1:0, that is, (1, 0), so the loop body runs twice with invalid indices, which usually leads to an error or a wrong result. A better way to handle this is to use the seq() function.

for (i in seq(x)) {}

And let’s see how the seq() function handles the empty vector.

> x <- c(4, 10)
> seq(x)
[1] 1 2
> x <- NULL
> seq(x)
integer(0)

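For a vector of length two or more, seq(x) gives the same result as 1:length(x), and for an empty x it correctly evaluates to an empty integer vector, integer(0), rather than (1, 0), so the loop runs zero times. One caveat: if x is a single number, seq(x) counts from 1 up to that number (seq(5) is 1 2 3 4 5), so seq_along(x) is the safer choice in general.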
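
The difference is easy to check by counting loop iterations. The sketch below uses a helper, count_iterations(), that is made up for this illustration, together with the base R function seq_along():

```r
# Hypothetical helper: counts how many times a for loop would run.
count_iterations <- function(index) {
  n <- 0
  for (i in index) n <- n + 1
  n
}

x <- NULL

1:length(x)                     # 1 0
count_iterations(1:length(x))   # 2, the loop body runs twice
count_iterations(seq_along(x))  # 0, the loop is skipped

# seq() treats a single number as an upper bound; seq_along() does not.
y <- 5
seq(y)        # 1 2 3 4 5
seq_along(y)  # 1
```

seq_along(x) always returns one index per element of x, which is why it is a common choice for loop indices.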

 

Reference: The Art of R Programming by Norman Matloff