R Factors

Factors are used to categorize data. Examples of factors are

Factors

Factors are used to categorize data. Examples of factors are:

  • Demography: Male/Female
  • Music: Rock, Pop, Classic, Jazz
  • Training: Strength, Stamina
  • To create a factor, use the factor() function and add a vector as argument:

    Example

    # Create a factor
    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

    # Print the factor
    music_genre

    Result:

    [1] Jazz    Rock    Classic Classic Pop     Jazz    Rock    Jazz
    Levels: Classic Jazz Pop Rock
    

    You can see from the example above that that the factor has four levels (categories): Classic, Jazz, Pop and Rock.

    To only print the levels, use the levels() function:

    Example

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

    levels(music_genre)

    Result:

    [1] "Classic" "Jazz"    "Pop"     "Rock"   
    

    You can also set the levels, by adding the levels argument inside the factor() function:

    Example

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"), levels = c("Classic", "Jazz", "Pop", "Rock", "Other"))

    levels(music_genre)

    Result:

    [1] "Classic" "Jazz"    "Pop"     "Rock"    "Other"
    

    Factor Length

    Use the length() function to find out how many items there are in the factor:

    Example

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

    length(music_genre)

    Result:

    [1] 8
    

    Access Factors

    To access the items in a factor, refer to the index number, using [] brackets:

    Example

    Access the third item:

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

    music_genre[3]

    Result:

    [1] Classic
    Levels: Classic Jazz Pop Rock
    

    Change Item Value

    To change the value of a specific item, refer to the index number:

    Example

    Change the value of the third item:

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

    music_genre[3] <- "Pop"

    music_genre[3]

    Result:

    [1] Pop
    Levels: Classic Jazz Pop Rock
    

    Note that you cannot change the value of a specific item if it is not already specified in the factor. The following example will produce an error:

    Example

    Trying to change the value of the third item ("Classic") to an item that does not exist/not predefined ("Opera"):

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

    music_genre[3] <- "Opera"

    music_genre[3]

    Result:

    Warning message:
    In `[<-.factor`(`*tmp*`, 3, value = "Opera") :
      invalid factor level, NA generated
    

    However, if you have already specified it inside the levels argument, it will work:

    Example

    Change the value of the third item:

    music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"), levels = c("Classic", "Jazz", "Pop", "Rock", "Opera"))

    music_genre[3] <- "Opera"

    music_genre[3]

    Result:

    [1] Opera
    Levels: Classic Jazz Pop Rock Opera