The factor() command is used to create and modify factors in R:

    sex <- factor(c("male", "female", "female", "male"))

R will assign 1 to the level "female" and 2 to the level "male" (because f comes before m alphabetically, even though the first element in this vector is "male"). You can check this with the function levels(), and check the number of levels with nlevels():

    levels(sex)
    [1] "female" "male"
    nlevels(sex)
    [1] 2

Sometimes the order of the levels does not matter; other times you might want to specify the order because it is meaningful (e.g., "low", "medium", "high") or because it is required by a particular type of analysis. Additionally, specifying the order of the levels allows us to compare levels:

    food <- factor(c("low", "high", "medium", "high", "low", "medium", "high"))
    food <- factor(food, levels = c("low", "medium", "high"))
    min(food) # doesn't work
    Error in Summary.factor(structure(c(1L, 3L, 2L, 3L, 1L, 2L, 3L), .Label = c("low", : 'min' not meaningful for factors
    food <- factor(food, levels = c("low", "medium", "high"), ordered = TRUE)

Once the factor is declared as ordered, min(food) works and returns the level "low".

In R's memory, these factors are represented by numbers (1, 2, 3). They are better than simple integer labels because factors are self-describing: "low", "medium", and "high" is more descriptive than 1, 2, 3. Which is low? You wouldn't be able to tell with just integer data. This is particularly helpful when there are many levels (like the subjects in our example data set).
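To make the link between level labels and their underlying integer codes concrete, here is a minimal sketch that reuses the food values from the text. The variable names codes and lowest are introduced here only for illustration; everything else is base R.

```r
# Ordered factor built from the same values used in the food example
food <- factor(c("low", "high", "medium", "high", "low", "medium", "high"),
               levels = c("low", "medium", "high"), ordered = TRUE)

# Integer codes R stores internally: each value's position in levels(food)
codes <- as.integer(food)
print(codes)

# The labels travel with the data, so code 1 can always be decoded
print(levels(food)[codes])

# Because the factor is ordered, comparisons and min()/max() are defined
lowest <- min(food)
print(lowest)
print(food[1] < food[2])  # compares "low" with "high"
```

Dropping ordered = TRUE from the call would keep the same integer codes, but min() and < would no longer be meaningful for the factor.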