anthe.sevenants

Get the last character(s) from a character vector in R

2023-02-22

Getting the last character or few characters of a string is stupid easy in Python:

test = "gebeurt"
test[-1]
>> "t"
test[-2:]
>> "rt"

In R, the same operation is more complicated. So, here is how to get the last character(s) from a character vector.

Extract the last character

If you only need the last character, you can use the following command:

test <- "gebeurt"
substr(test, nchar(test), nchar(test))
>> [1] "t"

We are telling R to take a piece (a so-called 'substring', hence substr) of the test character vector (first argument). The second argument dictates where the piece should start. We want our substring to start at the last character, so we count the number of characters in our string with nchar(test). In our example, that is the 7th character, because the string stored in test is 7 characters long. Then, we indicate that we also want our piece to end at that same 7th character, because we only want one letter. Thus, we repeat nchar(test) as the third argument. Now, we have extracted only the last character.
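To make the role of each argument more visible, the same call can be split into steps. This is only an illustrative sketch; the variable name last_pos is my own and not something R requires.

```r
test <- "gebeurt"

# Position of the last character (illustrative variable name)
last_pos <- nchar(test)

# Start and stop at the same position to get exactly one character
last_char <- substr(test, last_pos, last_pos)
print(last_char)
```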

Extract the last few characters

The process is almost the same if you need more than one of the last characters. You just have to adjust where the substring starts. If you want to get the last two characters from a character vector, you adjust the start position to start earlier. Here, I subtract 1 position from the 7th position to start from the 6th position. Of course, we still want to end at the 7th character, so the final argument remains the same.

test <- "gebeurt"
substr(test, nchar(test) - 1, nchar(test))
>> [1] "rt"

You can adjust the - 1 depending on how many characters you need: to get the last n characters, subtract n - 1 from nchar(test).
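If you need this often, the pattern could be wrapped in a small function. The sketch below is one possible way to do it; last_chars is a hypothetical helper name, not a base R function, and it assumes the input contains no missing values. Because substr and nchar work element by element, it also handles character vectors with more than one string.

```r
# Hypothetical helper (not part of base R): return the last n characters
# of each string in a character vector. Assumes x has no NA values.
last_chars <- function(x, n = 1) {
  len <- nchar(x)
  # Start n - 1 positions before the end, but never before position 1,
  # so strings shorter than n are returned whole
  start <- pmax(len - n + 1, 1)
  substr(x, start, len)
}

print(last_chars("gebeurt"))             # last character
print(last_chars("gebeurt", 2))          # last two characters
print(last_chars(c("gebeurt", "is"), 3)) # works on several strings at once
```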

Warning: it might seem intuitive to use the length function to get the length of a string, instead of nchar. However, length returns the number of elements in a vector, not the number of characters in a string. Since a single 'string' in R is actually just a character vector of length 1, length will return 1 no matter how long the string is. Always use nchar to count characters.
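A quick comparison shows the difference. The second vector, words, is just an extra example I made up to show what length actually counts.

```r
test <- "gebeurt"
print(length(test))   # number of strings in the vector
print(nchar(test))    # number of characters in the string

# Illustrative vector with two strings
words <- c("gebeurt", "is")
print(length(words))  # number of strings in the vector
print(nchar(words))   # number of characters in each string
```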