Simulated Ordinal Dataset with Visualization
Loading Packages
R
# Load necessary libraries
library(MASS)
library(ggplot2)
Building the Model
R
# Generate example data
set.seed(123)
data <- data.frame(
  Score = rnorm(100, mean = 50, sd = 10),
  Category = ordered(
    sample(1:5, 100, replace = TRUE),
    levels = c(1, 2, 3, 4, 5)
  )
)

# Fit an ordinal logistic regression model
model <- polr(Category ~ Score, data = data)
Interpreting Model Results
R
summary(model)
Output:
Call:
polr(formula = Category ~ Score, data = data)

Coefficients:
        Value Std. Error t value
Score 0.02324    0.01973   1.178

Intercepts:
     Value Std. Error t value
1|2 0.0712     1.0177  0.0699
2|3 1.0079     1.0122  0.9958
3|4 1.4883     1.0115  1.4713
4|5 2.1692     1.0255  2.1153

Residual Deviance: 311.9673
AIC: 321.9673
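The summary reports t values but no p-values or odds ratios. The sketch below shows one common way to derive them, assuming a normal approximation for the test statistic and a Wald-style 95% interval; these choices are illustrative and are not taken from the original tutorial. Passing Hess = TRUE to polr stores the Hessian so that summary() does not need to refit the model. Using the table above, a t value of 1.178 for Score corresponds to a two-sided p-value of roughly 0.24, and exp(0.02324) is an odds ratio of about 1.02.

```r
library(MASS)

# Recreate the simulated data and the model from the sections above
set.seed(123)
data <- data.frame(
  Score = rnorm(100, mean = 50, sd = 10),
  Category = ordered(sample(1:5, 100, replace = TRUE), levels = 1:5)
)
model <- polr(Category ~ Score, data = data, Hess = TRUE)

# Coefficient table: estimates, standard errors, t values
ctable <- coef(summary(model))

# Approximate two-sided p-values (assumption: normal reference distribution)
p_values <- 2 * pnorm(abs(ctable[, "t value"]), lower.tail = FALSE)
ctable <- cbind(ctable, "p value" = p_values)
print(round(ctable, 4))

# Odds ratio for Score with a Wald-style 95% interval on the odds scale
est <- ctable["Score", "Value"]
se <- ctable["Score", "Std. Error"]
odds_ratio <- exp(est)
wald_ci <- exp(est + c(-1, 1) * qnorm(0.975) * se)
print(odds_ratio)
print(wald_ci)
```

Because Score and Category are simulated independently here, an odds ratio close to 1 with an interval that includes 1 is the expected outcome.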
Visualizing the Model
R
# Create a sequence of scores for prediction
new_data <- data.frame(
  Score = seq(min(data$Score), max(data$Score), length.out = 100)
)

# Predict probabilities for each category
predicted_probs <- as.data.frame(
  predict(model, newdata = new_data, type = "probs")
)

# Rename the columns for clarity
colnames(predicted_probs) <- c("1", "2", "3", "4", "5")

# Keep the scores in the same data frame so aes() can refer to a column
predicted_probs$Score <- new_data$Score

# Create a visualization
# (linewidth requires ggplot2 >= 3.4.0; use size = 1 on older versions)
ggplot(predicted_probs, aes(x = Score)) +
  geom_line(aes(y = `1`, color = "Category 1"), linewidth = 1) +
  geom_line(aes(y = `2`, color = "Category 2"), linewidth = 1) +
  geom_line(aes(y = `3`, color = "Category 3"), linewidth = 1) +
  geom_line(aes(y = `4`, color = "Category 4"), linewidth = 1) +
  geom_line(aes(y = `5`, color = "Category 5"), linewidth = 1) +
  labs(
    title = "Ordinal Logistic Regression",
    x = "Score",
    y = "Predicted Probability"
  ) +
  scale_color_manual(
    values = c("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"),
    name = "Category"
  ) +
  theme_minimal()
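Output: a line plot titled "Ordinal Logistic Regression" showing predicted probability (y-axis) against Score (x-axis), with one colored line per category.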
We use ggplot2 to create the visualization:
- We specify the predicted_probs dataframe as the data source.
- We create separate lines for each category’s predicted probability using geom_line.
- We set labels for the title, x-axis, and y-axis using labs.
- We customize the line colors using scale_color_manual and provide a legend title.
- We apply the “minimal” theme using theme_minimal() for a clean and simple appearance.
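Drawing one geom_line layer per category works, but it does not scale well when the number of categories changes. As an alternative sketch (not part of the original tutorial), the predicted probabilities can be stacked into a long data frame with one row per score and category. The names long_probs and Probability are chosen here for illustration.

```r
library(MASS)

# Recreate the simulated data and the model from the sections above
set.seed(123)
data <- data.frame(
  Score = rnorm(100, mean = 50, sd = 10),
  Category = ordered(sample(1:5, 100, replace = TRUE), levels = 1:5)
)
model <- polr(Category ~ Score, data = data)

# Predicted probabilities over a grid of scores (100 rows x 5 categories)
new_data <- data.frame(
  Score = seq(min(data$Score), max(data$Score), length.out = 100)
)
probs <- predict(model, newdata = new_data, type = "probs")

# Stack the columns: as.vector() reads the matrix column by column,
# so Score is repeated once per category and each label once per row
long_probs <- data.frame(
  Score = rep(new_data$Score, times = ncol(probs)),
  Category = factor(rep(colnames(probs), each = nrow(probs)),
                    levels = colnames(probs)),
  Probability = as.vector(probs)
)
head(long_probs)
```

With this layout, a single geom_line(aes(x = Score, y = Probability, color = Category)) layer can replace the five separate layers, and the legend is built from the Category column.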
Ordinal Logistic Regression in R
Ordinal logistic regression, also called ordered logistic regression, is a statistical method for modelling and analysing ordinal categorical outcomes. An ordinal outcome is a categorical variable whose categories have a natural order but are not assumed to be equally spaced. A Likert scale with responses ranging from “Strongly Disagree” to “Strongly Agree” is a typical example.
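In R, an ordinal variable of this kind is usually stored as an ordered factor. The following minimal sketch uses made-up responses, and the three middle labels are an assumption about a five-point Likert scale; only the two endpoints come from the text above.

```r
# Hypothetical five-point Likert scale, listed from lowest to highest
likert_levels <- c("Strongly Disagree", "Disagree", "Neutral",
                   "Agree", "Strongly Agree")

# Made-up survey responses for illustration
responses <- c("Agree", "Neutral", "Strongly Agree", "Disagree", "Agree")

# ordered = TRUE tells R that the levels have a rank, not just labels
rating <- factor(responses, levels = likert_levels, ordered = TRUE)

is.ordered(rating)   # TRUE
rating < "Agree"     # FALSE TRUE FALSE TRUE FALSE
table(rating)        # counts per level, in scale order
```

An ordered factor like this is the kind of response variable that polr expects on the left-hand side of its formula.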
Ordinal logistic regression relates one or more independent variables to the probability that the ordinal outcome falls in a given category or a higher one. The fitted coefficients, once exponentiated, are odds ratios, which describe how each predictor shifts the odds of a higher response category.
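To make the link between coefficients and category probabilities concrete, the sketch below rebuilds the predicted probabilities by hand for one score and compares them with predict(). It relies on the parameterisation documented for MASS::polr, logit P(Y <= k | x) = zeta_k - eta, where zeta_k are the intercepts and eta is the linear predictor. The score of 50 is an arbitrary illustrative value.

```r
library(MASS)

# Recreate the simulated data and the model from the sections above
set.seed(123)
data <- data.frame(
  Score = rnorm(100, mean = 50, sd = 10),
  Category = ordered(sample(1:5, 100, replace = TRUE), levels = 1:5)
)
model <- polr(Category ~ Score, data = data)

# Linear predictor for an illustrative score of 50
x <- 50
eta <- coef(model)["Score"] * x

# Cumulative probabilities P(Y <= k) for k = 1..4 from the intercepts
cum_probs <- plogis(model$zeta - eta)

# Category probabilities are differences of consecutive cumulative ones
cat_probs <- diff(c(0, cum_probs, 1))
names(cat_probs) <- levels(data$Category)

# Compare with the probabilities returned by predict()
predicted <- predict(model, newdata = data.frame(Score = x), type = "probs")
print(round(cat_probs, 4))
all.equal(as.numeric(cat_probs), as.numeric(predicted))
```

Because eta enters with a minus sign, a positive coefficient lowers every cumulative probability P(Y <= k) and so moves probability mass toward higher categories, which is why exp(coefficient) reads as the odds ratio for a higher response.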