Create Multiple Cumulative Frequency Graphs in R

Here we plot multiple cumulative frequency graphs on a single chart in the R programming language.


# Install (once, if needed) and load the necessary library
# install.packages("ggplot2")
library(ggplot2)

# Create a sample data frame
data <- data.frame(
  values = c(5, 8, 12, 15, 20, 22, 25, 28, 32, 35),
  frequency = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 12)
)
# Calculate cumulative frequency
data$cumulative_frequency <- cumsum(data$frequency)
# Create additional scenarios for cumulative frequency
data$cumulative_frequency_scenario2 <- cumsum(data$frequency) + 10
data$cumulative_frequency_scenario3 <- cumsum(data$frequency) - 5
# Combine all scenarios into one data frame
all_data <- rbind(
  transform(data, scenario = "Scenario 1"),
  transform(data, cumulative_frequency = data$cumulative_frequency_scenario2,
            scenario = "Scenario 2"),
  transform(data, cumulative_frequency = data$cumulative_frequency_scenario3,
            scenario = "Scenario 3")
)
# Create a cumulative frequency plot with step format and multiple scenarios
ggplot(all_data, aes(x = values, y = cumulative_frequency, group = scenario,
                     color = scenario)) +
  geom_step(size = 1.5, direction = "hv") +
  geom_point(size = 3) +
  labs(title = "Cumulative Frequency Graph",
       x = "Values",
       y = "Cumulative Frequency",
       color = "Scenario") +
  theme_minimal() +
  scale_color_manual(values = c("Scenario 1" = "steelblue", "Scenario 2" = "red",
                                "Scenario 3" = "green"))


In this example, geom_step() with direction = "hv" draws each step horizontally first and then vertically, so the plot consists only of horizontal and vertical segments.

In this article, we are going to plot a cumulative frequency graph using the R programming language.

What is Cumulative Frequency?

Cumulative frequency is the running total of class frequencies: the frequency of the first class interval is added to the frequency of the second, this total is added to the frequency of the third, and so on.
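As a small worked illustration of this definition, the running total can be built by hand. The class frequencies below are made-up values chosen only for the example.

```r
# Hypothetical frequencies for four class intervals (illustrative values)
frequency <- c(2, 3, 4, 5)

# Build the running total step by step, following the definition
cumulative <- numeric(length(frequency))
running_total <- 0
for (i in seq_along(frequency)) {
  running_total <- running_total + frequency[i]
  cumulative[i] <- running_total
}

print(cumulative)  # 2 5 9 14
```

The last cumulative value always equals the total number of observations, here 2 + 3 + 4 + 5 = 14.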

What is a Cumulative Frequency Graph?

A graph that shows the cumulative frequency distribution of grouped data is called a cumulative frequency graph or an ogive. Plotting the data is one of the most effective ways to understand cumulative frequency data and draw conclusions from it. Graphs are especially important in statistics because they help us understand and depict data.

Functions Used for R Cumulative Frequency Graph

Here are some of the functions used to create a cumulative frequency graph in R.

seq() Method

The seq() method creates a vector of values running from the lower limit to the upper limit, spaced by the increment given in the "by" parameter.

Syntax: seq(from, to, by)

Parameters : 

from – start of the sequence

to – end of the sequence

by – increment value of the sequence
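A minimal sketch of seq() generating the break points for class intervals. The limits 0 and 50 and the step of 10 are assumed values for illustration.

```r
# Break points from 0 to 50 in steps of 10 (illustrative limits)
breaks <- seq(0, 50, by = 10)

print(breaks)  # 0 10 20 30 40 50
```

Six break points define five class intervals, so the number of intervals is always one less than the number of break points.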

cut() Method

The cut() method in R divides the range of the specified vector of data points into intervals and codes each value according to the interval in which it falls. The result is a factor whose levels are the intervals.

Syntax: cut(x, breaks) 

Parameters : 

x – The vector of data points.

breaks – The vector of break points.
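The following sketch shows how cut() assigns values to intervals. The data values and break points are hypothetical and chosen only for illustration.

```r
# Hypothetical data points and break points (illustrative values)
values <- c(3, 10, 17, 25, 30)
breaks <- seq(0, 30, by = 10)

# Each value is replaced by the label of the interval it falls in
groups <- cut(values, breaks)

print(as.character(groups))
# "(0,10]" "(0,10]" "(10,20]" "(20,30]" "(20,30]"
```

By default the intervals are closed on the right, so 10 falls in (0,10] rather than (10,20]. A value equal to the lowest break point would become NA unless include.lowest = TRUE is passed.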

table(x) Method

The table() method counts how often each distinct value occurs. Applied to the factor returned by cut(), it builds a frequency table that maps each interval to the number of data points lying in it.

Syntax: table(x)

Parameter : 

x – The vector of values to be converted.
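A short sketch of building a frequency table from binned data. The values and break points are assumptions made for the example.

```r
# Hypothetical data points, binned into intervals of width 10
values <- c(3, 10, 17, 25, 30, 28)
groups <- cut(values, seq(0, 30, by = 10))

# Count how many values fall in each interval
freq <- table(groups)

print(freq)
# (0,10]: 2, (10,20]: 1, (20,30]: 3
```

The names of the resulting table are the interval labels, and the counts are the class frequencies.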

cumsum(x) Method

The cumulative frequencies can be generated using the cumsum() method for the specified vector. The cumulative frequency at the nth interval is the sum of the frequencies of the first n intervals.

Syntax: cumsum(x)

Parameters : 

x – A vector of data points.
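A minimal sketch of cumsum() applied to interval frequencies. The frequencies are hypothetical. Prepending 0 is a common convention, assumed here, that gives one cumulative value per break point, so the curve starts at zero at the lowest break.

```r
# Hypothetical frequencies for three intervals (illustrative values)
freq <- c(2, 1, 3)

# Running total of the frequencies
cum_freq <- cumsum(freq)
print(cum_freq)  # 2 3 6

# Optional: start from 0 so there is one value per break point
cum_freq0 <- c(0, cum_freq)
print(cum_freq0)  # 0 2 3 6
```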

plot() Method

The plot of cumulative frequencies can then be created using the plot() method in R. The method takes the break points as the x-axis coordinates and their cumulative frequencies as the y-axis coordinates.

Syntax: plot(x-coordinates, y-coordinates, xlab, ylab)

Parameters : 

x-coordinates – The vector of x coordinates.

y-coordinates – The vector of y coordinates.

xlab – The label for the x-axis.

ylab – The label for the y-axis.
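The functions described above can be combined into one short sketch. The sample values, the break points, and the plot title are assumptions made for illustration. The call to lines(), a base R function not covered above, is added here to join the points into an ogive.

```r
# Hypothetical sample data (illustrative values only)
values <- c(4, 7, 12, 15, 18, 21, 26, 29, 33, 38)

# Break points every 10 units, covering the data range
breaks <- seq(0, 40, by = 10)

# Frequency of values per interval, then the running total
freq <- table(cut(values, breaks))
cum_freq <- c(0, cumsum(freq))  # 0 at the first break point

# Plot cumulative frequency against the break points
plot(breaks, cum_freq,
     xlab = "Values",
     ylab = "Cumulative Frequency",
     main = "Cumulative Frequency Graph")
lines(breaks, cum_freq)  # join the points to form the ogive
```

Because a 0 is prepended, breaks and cum_freq have the same length, which plot() requires for its x and y coordinates.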

