Plotting Linear Regression in R
The dataset used in this example is placement.csv.
R
# Reading the dataset
SalaryData <- read.csv("placement.csv")

# Fitting the model
model <- lm(package ~ cgpa, data = SalaryData)

# Printing R-squared, p-value, F-statistic, ...
summary(model)

# Opening a PNG device so the plot is saved to a file
png(file = "placement_stats.png")

# Plotting the data points
plot(SalaryData$cgpa, SalaryData$package,
     col = "blue",
     main = "CGPA and Package regression",
     cex = 1.3, pch = 16,
     xlab = "CGPA", ylab = "Package (LPA)")

# Adding the regression line on top of the points
abline(model)

# Closing the device so the file is written to disk
dev.off()
Output:
Call:
lm(formula = package ~ cgpa, data = SalaryData)

Residuals:
    Min      1Q  Median      3Q     Max
-15.517  -9.794  -4.762   2.376  53.174

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -94.141     47.167  -1.996   0.0613 .
cgpa          17.415      6.704   2.598   0.0182 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 16.56 on 18 degrees of freedom
Multiple R-squared:  0.2726,  Adjusted R-squared:  0.2322
F-statistic: 6.747 on 1 and 18 DF,  p-value: 0.01819
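Read as an equation, the coefficient table in this output gives the fitted line package ≈ -94.141 + 17.415 × cgpa. A minimal sketch of using that line for a prediction follows; the helper name predict_package and the CGPA value 7.5 are illustrative assumptions and are not part of the original script.

```r
# Coefficients copied from the summary output
intercept <- -94.141
slope     <- 17.415

# Hypothetical helper: predicted package (LPA) for a given CGPA
predict_package <- function(cgpa) {
  intercept + slope * cgpa
}

# 7.5 is an arbitrary example value, not taken from the dataset
predicted <- predict_package(7.5)
print(predicted)  # 36.4715
```

With the fitted model object available, predict(model, newdata = data.frame(cgpa = 7.5)) should give the same value up to rounding of the printed coefficients. Since the Multiple R-squared is only 0.2726, such predictions are rough estimates.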
- Here, we first import the dataset (placement.csv), which is in CSV (Comma-Separated Values) format, using R's read.csv function and assign it to the variable SalaryData.
- Next, we fit a linear regression model with R's built-in lm function, passing the formula package ~ cgpa (response ~ predictor) and our dataset as arguments. The summary function then prints the fit statistics.
- Next, we open a PNG graphics device with the png function, which takes the file name as its argument, so that the plot is stored in that file.
- Finally, we use R's plot function to draw the data points, with abline adding the regression line.
Running this script creates a file “placement_stats.png” in the current working directory that contains the plotted graph for the dataset.
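The script's comment mentions fetching R-squared, the p-value, and the F-statistic. summary(model) prints them, but they can also be pulled out as plain numbers. A minimal sketch, assuming a small made-up data frame in place of placement.csv (the values below are invented for illustration and are not the article's data):

```r
# Stand-in data (hypothetical values, NOT the real placement.csv)
SalaryData <- data.frame(
  cgpa    = c(6.1, 6.8, 7.0, 7.4, 7.9, 8.3, 8.8, 9.2),
  package = c(3.5, 4.0, 4.8, 5.1, 6.5, 6.9, 8.2, 9.0)
)

model         <- lm(package ~ cgpa, data = SalaryData)
model_summary <- summary(model)

# R-squared and adjusted R-squared are stored directly on the summary
r_squared     <- model_summary$r.squared
adj_r_squared <- model_summary$adj.r.squared

# fstatistic is a named vector: value, numdf, dendf
f_stat <- model_summary$fstatistic

# The overall p-value is not stored, so compute it from the F distribution
p_value <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
              lower.tail = FALSE)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(model_summary)

print(r_squared)
print(p_value)
print(coef_table)
```

For a regression with a single predictor, the overall p-value matches the p-value in the cgpa row of the coefficient table.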
Changing pch
R
# Reading the dataset
SalaryData <- read.csv("placement.csv")

# Fitting the model
model <- lm(package ~ cgpa, data = SalaryData)

# Printing R-squared, p-value, F-statistic, ...
summary(model)

# Opening a PNG device so the plot is saved to a file
png(file = "placement_stats.png")

# Plotting the data points
plot(SalaryData$cgpa, SalaryData$package,
     col = "blue",
     main = "CGPA and Package regression",
     cex = 1.3, pch = 17,  # pch: 16 -> 17
     xlab = "CGPA", ylab = "Package (LPA)")

# Adding the regression line on top of the points
abline(model)

# Closing the device so the file is written to disk
dev.off()
Output:
Call:
lm(formula = package ~ cgpa, data = SalaryData)

Residuals:
    Min      1Q  Median      3Q     Max
-15.517  -9.794  -4.762   2.376  53.174

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -94.141     47.167  -1.996   0.0613 .
cgpa          17.415      6.704   2.598   0.0182 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 16.56 on 18 degrees of freedom
Multiple R-squared:  0.2726,  Adjusted R-squared:  0.2322
F-statistic: 6.747 on 1 and 18 DF,  p-value: 0.01819
Changing the pch value from 16 to 17 changes the shape of the plotting points from solid circles to solid triangles, while everything else stays the same.
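To see how a few pch codes compare, the following sketch draws one point per code with a label underneath. It is an illustration only; the variable names pch_codes and pch_labels are assumptions and do not come from the original script.

```r
# A few common filled symbols and their pch codes
pch_codes  <- c(15, 16, 17, 18)
pch_labels <- c("15: square", "16: circle", "17: triangle", "18: diamond")

# Draw one point per code, spaced along the x-axis, with axes hidden
plot(seq_along(pch_codes), rep(1, length(pch_codes)),
     pch = pch_codes, cex = 2, col = "blue",
     xlim = c(0.5, 4.5), ylim = c(0.5, 1.5),
     xaxt = "n", yaxt = "n", xlab = "", ylab = "",
     main = "Some pch values")

# Label each symbol underneath
text(seq_along(pch_codes), rep(0.8, length(pch_codes)), labels = pch_labels)
```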
Changing col
R
# Reading the dataset
SalaryData <- read.csv("placement.csv")

# Fitting the model
model <- lm(package ~ cgpa, data = SalaryData)

# Opening a PNG device so the plot is saved to a file
png(file = "placement_stats.png")

# Plotting the data points
plot(SalaryData$cgpa, SalaryData$package,
     col = "red",  # col: blue -> red
     main = "CGPA and Package regression",
     cex = 1.3, pch = 16,
     xlab = "CGPA", ylab = "Package (LPA)")

# Adding the regression line on top of the points
abline(model)

# Closing the device so the file is written to disk
dev.off()
Output:
Changing col from blue to red changes the color of the plotting points from blue to red, while everything else stays the same.
Additionally, the abline function is used to add the regression line (line of best fit) to the plot.
Removing it leaves a plain scatterplot.
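The regression line that abline draws is defined by the model's intercept and slope. A minimal sketch that makes this explicit, assuming a small made-up data frame in place of placement.csv (the values are invented for illustration) and passing the coefficients to abline by hand:

```r
# Stand-in data (hypothetical values, NOT the real placement.csv)
SalaryData <- data.frame(
  cgpa    = c(6.1, 6.8, 7.0, 7.4, 7.9, 8.3, 8.8, 9.2),
  package = c(3.5, 4.0, 4.8, 5.1, 6.5, 6.9, 8.2, 9.0)
)

model <- lm(package ~ cgpa, data = SalaryData)

# coef() returns the intercept and slope of the fitted line
line_coefs <- coef(model)
intercept  <- unname(line_coefs["(Intercept)"])
slope      <- unname(line_coefs["cgpa"])

# Scatterplot of the data points
plot(SalaryData$cgpa, SalaryData$package,
     col = "blue", pch = 16,
     xlab = "CGPA", ylab = "Package (LPA)")

# Draws the same line as abline(model): a = intercept, b = slope
abline(a = intercept, b = slope)
```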
Removing abline
R
# Reading the dataset
SalaryData <- read.csv("placement.csv")

# Fitting the model
model <- lm(package ~ cgpa, data = SalaryData)

# Opening a PNG device so the plot is saved to a file
png(file = "placement_stats.png")

# Plotting the data points
# abline(model) is omitted here, so no regression line is drawn
plot(SalaryData$cgpa, SalaryData$package,
     col = "blue",
     main = "CGPA and Package regression",
     cex = 1.3, pch = 16,
     xlab = "CGPA", ylab = "Package (LPA)")

# Closing the device so the file is written to disk
dev.off()
Output:
When abline is removed, the line of best fit disappears from the graph, leaving a plain scatterplot, while everything else stays the same.
The regression plot can also be produced with external libraries such as ggplot2.
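As a rough sketch of the ggplot2 route, the following assumes the ggplot2 package is installed and uses a small made-up data frame in place of placement.csv (the values are invented for illustration). Here geom_smooth with method = "lm" fits and draws the regression line, playing the role of abline:

```r
library(ggplot2)

# Stand-in data (hypothetical values, NOT the real placement.csv)
SalaryData <- data.frame(
  cgpa    = c(6.1, 6.8, 7.0, 7.4, 7.9, 8.3, 8.8, 9.2),
  package = c(3.5, 4.0, 4.8, 5.1, 6.5, 6.9, 8.2, 9.0)
)

regression_plot <- ggplot(SalaryData, aes(x = cgpa, y = package)) +
  # Data points, styled like the base R example
  geom_point(colour = "blue", size = 3) +
  # Line of best fit from a linear model, without the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black") +
  labs(title = "CGPA and Package regression",
       x = "CGPA", y = "Package (LPA)")

print(regression_plot)

# To save the plot to a file instead of only displaying it:
# ggsave("placement_stats.png", regression_plot)
```

Dropping the geom_smooth layer leaves a plain scatterplot, analogous to removing abline in base R.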
How to Plot the Linear Regression in R
In this article, we learn how to plot a linear regression in R. To do that, we first need to understand what linear regression is.