Accessing portions of an R string
Individual characters and longer pieces of a string can be extracted by position. R provides two built-in functions for this, and each of them can return either a single character or a multi-character substring.
substr() and substring() extract the part of a string that begins at the start index and ends at the end index. Indices start at 1 and both ends are inclusive. When either function is used on the left-hand side of an assignment, it replaces that part of the string with new characters instead.
Syntax
substr(x, start, stop) or substring(text, first, last)
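The replacement behaviour mentioned above relies on calling substr() on the left-hand side of an assignment. The following is a minimal sketch; the variable names s and s2 and the replacement values are chosen purely for illustration.

```r
# Replacement form of substr(): overwrite characters 1 to 5.
s <- "Learn Code"
substr(s, 1, 5) <- "Write"
print(s)   # [1] "Write Code"

# The string never changes length. A replacement shorter than the
# selected range only overwrites as many characters as it supplies.
s2 <- "Learn Code"
substr(s2, 1, 5) <- "Go"
print(s2)  # [1] "Goarn Code"
```

Because the length of the string is fixed, this form suits in-place character edits rather than general find-and-replace.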
Using substr() function
R
# R program to access characters in a string
# Accessing a character using the substr() function
substr("Learn Code Tech", 1, 1)
Output
[1] "L"
If the starting index is equal to the ending index, the single character at that position is returned. In this case, the first character, ‘L’, is printed.
Using substring() function
R
# R program to access characters in a string
str <- "Learn Code"
# Count the characters in the string
len <- nchar(str)
# Access the last character using the substring() function
print(substring(str, len, len))
# Access a position beyond the end of the string
print(substring(str, len + 1, len + 1))
Output
[1] "e"
[1] ""
The number of characters in the string is 10. The first print statement prints the last character of the string, “e”, which sits at position 10. The second print statement asks for the 11th character, which doesn’t exist; the code doesn’t throw an error and instead prints "", an empty string.
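Positions inside a string are not the same thing as vector indices. A short sketch of the difference, assuming the same example string (the names word and chars are illustrative): square brackets select elements of the character vector, so they cannot reach individual characters unless the string is split first.

```r
word <- "Learn Code"

print(length(word))   # [1] 1   (the vector holds a single string)
print(nchar(word))    # [1] 10  (that string has ten characters)
print(word[10])       # [1] NA  (there is no tenth vector element)

# Split the string into a vector of single characters.
chars <- strsplit(word, "")[[1]]
print(chars[10])      # [1] "e"
print(chars[11])      # [1] NA  (vector indexing past the end gives NA)

# substring() past the end gives an empty string instead of NA.
print(substring(word, 11, 11))  # [1] ""
```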
The following R code demonstrates string slicing, in which substrings of an R string are extracted:
R
# R program to access characters in a string
str <- "Learn Code"
# Count the number of characters of str, which is 10
len <- nchar(str)
print(substr(str, 1, 4))
print(substr(str, len - 2, len))
Output
[1] "Lear"
[1] "ode"
The first print statement prints the first four characters of the string. The second print statement prints the substring from position 8 to position 10 (len-2 to len), which is “ode”.
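Slicing can be taken a little further. The sketch below, using the same example string, relies on two properties of substring(): its last argument has a default, so a slice can run to the end of the string, and its index arguments accept vectors. The names txt, tail_part, windows, and chars are illustrative.

```r
txt <- "Learn Code"

# Omitting the last index slices to the end of the string.
tail_part <- substring(txt, 7)
print(tail_part)   # [1] "Code"

# Vectors of indices yield one substring per first/last pair.
windows <- substring(txt, 1:3, 3:5)
print(windows)     # [1] "Lea" "ear" "arn"

# One-character slices return every character of the string.
chars <- substring(txt, 1:nchar(txt), 1:nchar(txt))
print(chars[1:5])  # [1] "L" "e" "a" "r" "n"
```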
R Strings
A string in R is a sequence of characters stored as a single element of a character vector. It is not an array of individual characters, so a lone string is simply a character vector of length one. One or more characters enclosed in a pair of matching single or double quotes form a string. Strings in R Programming represent textual content and can contain letters, digits, spaces, and special characters. An empty string is written as "". R always prints strings with double quotes, whichever quote style was used to create them. A double-quoted string can contain single quotes, and a single-quoted string can contain double quotes. To use the same quote character inside a string as the one that delimits it, escape it with a backslash.
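The quoting rules above can be checked with a short sketch. The variable names and sample sentences are illustrative only.

```r
# Single and double quotes create the same kind of value.
a <- 'single quoted'
b <- "double quoted"
print(class(a))            # [1] "character"
print(a)                   # [1] "single quoted"

# The other quote character can be used inside a string freely.
c1 <- "It's here"
c2 <- 'She said "hi"'

# The delimiting quote character must be escaped with a backslash.
c3 <- "She said \"hi\""
c4 <- 'It\'s here'
print(identical(c2, c3))   # [1] TRUE
print(identical(c1, c4))   # [1] TRUE

# print() shows the escapes; cat() writes the raw characters.
print(c2)                  # [1] "She said \"hi\""
cat(c2, "\n")              # She said "hi"

# The empty string has zero characters.
print(nchar(""))           # [1] 0
```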