Python Implementation of Linear Regression

Import the necessary libraries:

Python3

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

Load the dataset and separate input and Target variables

The dataset is read directly from the URL used in the code below.

Python3

url = 'https://media.w3wiki.org/wp-content/uploads/20240320114716/data_for_lr.csv'
data = pd.read_csv(url)

# Drop the rows with missing values
data = data.dropna()

# Training inputs and labels: the first 500 rows
train_input = np.array(data.x.iloc[0:500]).reshape(-1, 1)
train_output = np.array(data.y.iloc[0:500]).reshape(-1, 1)

# Test inputs and labels: the remaining rows
# (reshape(-1, 1) infers the row count instead of hard-coding it)
test_input = np.array(data.x.iloc[500:700]).reshape(-1, 1)
test_output = np.array(data.y.iloc[500:700]).reshape(-1, 1)
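The split above can be tried offline before touching the real file. The following is a minimal sketch on a synthetic stand-in for the CSV; the generated values, the position of the missing entry, and the noise level are assumptions for illustration, not properties of the actual dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV: y is roughly x plus noise (assumed, not the real data).
rng = np.random.default_rng(0)
x = np.arange(700, dtype=float)
y = x + rng.normal(0.0, 3.0, size=700)
y[213] = np.nan  # one missing target value, to exercise dropna()

data = pd.DataFrame({"x": x, "y": y}).dropna()

# Positional split: first 500 rows for training, the rest for testing.
train_input = data["x"].iloc[:500].to_numpy().reshape(-1, 1)
train_output = data["y"].iloc[:500].to_numpy().reshape(-1, 1)
test_input = data["x"].iloc[500:].to_numpy().reshape(-1, 1)
test_output = data["y"].iloc[500:].to_numpy().reshape(-1, 1)

print(train_input.shape, test_input.shape)
```

Using `reshape(-1, 1)` lets NumPy infer the number of rows, so the split keeps working if a different number of rows is dropped.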

Build the Linear Regression Model and Plot the regression line

Steps:

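  • In forward propagation, the linear regression function Y = mx + c is applied, starting from randomly initialized parameters (m and c).
  • A cost function then measures the error as the mean squared difference between the actual and predicted outputs.
  • In backward propagation, the derivatives of the cost with respect to m and c are computed.
  • The parameters are updated by subtracting the learning rate times these derivatives, and the cycle repeats for the given number of iterations.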
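
The gradients that the `backward_propagation` method in the code computes follow from differentiating the mean squared error. As a sketch, write n for the number of training points, \hat{y}_i = m x_i + c for the prediction, and \alpha for the learning rate (these symbols are chosen here for illustration):

```latex
\begin{aligned}
J(m, c) &= \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad \hat{y}_i = m x_i + c \\
\frac{\partial J}{\partial m} &= \frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)x_i \\
\frac{\partial J}{\partial c} &= \frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right) \\
m &\leftarrow m - \alpha\,\frac{\partial J}{\partial m}, \qquad c \leftarrow c - \alpha\,\frac{\partial J}{\partial c}
\end{aligned}
```

In the code, `2 * np.mean(...)` plays the role of the factor 2/n times the sum.
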
Python3

class LinearRegression:
    def __init__(self):
        self.parameters = {}

    def forward_propagation(self, train_input):
        m = self.parameters['m']
        c = self.parameters['c']
        predictions = np.multiply(m, train_input) + c
        return predictions

    def cost_function(self, predictions, train_output):
        # Mean squared error between actual and predicted outputs
        cost = np.mean((train_output - predictions) ** 2)
        return cost

    def backward_propagation(self, train_input, train_output, predictions):
        derivatives = {}
        df = (predictions - train_output)
        # dm = 2 * mean of ((predictions - actual) * input)
        dm = 2 * np.mean(np.multiply(train_input, df))
        # dc = 2 * mean of (predictions - actual)
        dc = 2 * np.mean(df)
        derivatives['dm'] = dm
        derivatives['dc'] = dc
        return derivatives

    def update_parameters(self, derivatives, learning_rate):
        self.parameters['m'] = self.parameters['m'] - learning_rate * derivatives['dm']
        self.parameters['c'] = self.parameters['c'] - learning_rate * derivatives['dc']

    def train(self, train_input, train_output, learning_rate, iters):
        # Initialize random parameters
        self.parameters['m'] = np.random.uniform(0, 1) * -1
        self.parameters['c'] = np.random.uniform(0, 1) * -1

        # Initialize loss
        self.loss = []

        # Initialize figure and axis for animation
        fig, ax = plt.subplots()
        x_vals = np.linspace(train_input.min(), train_input.max(), 100)
        line, = ax.plot(x_vals, self.parameters['m'] * x_vals + self.parameters['c'],
                        color='red', label='Regression Line')
        ax.scatter(train_input, train_output, marker='o', color='green',
                   label='Training Data')

        # Set y-axis limits to exclude negative values
        ax.set_ylim(0, train_output.max() + 1)

        def update(frame):
            # Forward propagation
            predictions = self.forward_propagation(train_input)

            # Cost function
            cost = self.cost_function(predictions, train_output)

            # Back propagation
            derivatives = self.backward_propagation(
                train_input, train_output, predictions)

            # Update parameters
            self.update_parameters(derivatives, learning_rate)

            # Update the regression line
            line.set_ydata(self.parameters['m'] * x_vals + self.parameters['c'])

            # Append loss and print
            self.loss.append(cost)
            print("Iteration = {}, Loss = {}".format(frame + 1, cost))

            return line,

        # Create animation
        ani = FuncAnimation(fig, update, frames=iters, interval=200, blit=True)

        # Save the animation as a GIF (writer='ffmpeg' requires FFmpeg to be installed)
        ani.save('linear_regression_A.gif', writer='ffmpeg')

        plt.xlabel('Input')
        plt.ylabel('Output')
        plt.title('Linear Regression')
        plt.legend()
        plt.show()

        return self.parameters, self.loss
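
The class above ties each training step to a matplotlib animation frame. As a minimal sketch of the same update rule without any plotting, the loop below runs batch gradient descent on synthetic data. The function name `fit_line`, the seeds, and the generated data are hypothetical and only serve the illustration.

```python
import numpy as np

def fit_line(x, y, learning_rate=0.0001, iters=20, seed=0):
    """Fit y ~ m*x + c by batch gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    # Small negative random start, mirroring the class above.
    m = -rng.uniform(0, 1)
    c = -rng.uniform(0, 1)
    losses = []
    for _ in range(iters):
        predictions = m * x + c
        error = predictions - y
        losses.append(float(np.mean(error ** 2)))
        # Gradients of the MSE with respect to m and c.
        dm = 2 * np.mean(error * x)
        dc = 2 * np.mean(error)
        m -= learning_rate * dm
        c -= learning_rate * dc
    return m, c, losses

# Synthetic data (assumed): y = x plus Gaussian noise, x in [0, 100].
rng = np.random.default_rng(1)
x = rng.uniform(0, 100, size=500)
y = x + rng.normal(0.0, 3.0, size=500)

m, c, losses = fit_line(x, y)
print(f"m = {m:.3f}, c = {c:.3f}, final loss = {losses[-1]:.3f}")
```

Separating the numeric loop from the plotting makes it easier to test the update rule on its own. Note that with a small learning rate and unscaled inputs, the slope converges quickly while the intercept moves only slowly.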

Train the model and make the final prediction

Python3

# Example usage
linear_reg = LinearRegression()
parameters, loss = linear_reg.train(train_input, train_output, 0.0001, 20)

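Output ("Iteration = 1" repeats, most likely because FuncAnimation processes the first frame more than once while it initializes and saves the animation; the parameters are still updated on each call, so the loss keeps falling):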

Iteration = 1, Loss = 9130.407560462196
Iteration = 1, Loss = 1107.1996742908998
Iteration = 1, Loss = 140.31580932842422
Iteration = 1, Loss = 23.795780526084116
Iteration = 2, Loss = 9.753848205147605
Iteration = 3, Loss = 8.061641745006835
Iteration = 4, Loss = 7.8577116490914864
Iteration = 5, Loss = 7.8331350515579015
Iteration = 6, Loss = 7.830172502503967
Iteration = 7, Loss = 7.829814681591015
Iteration = 8, Loss = 7.829770758846183
Iteration = 9, Loss = 7.829764664327399
Iteration = 10, Loss = 7.829763128602258
Iteration = 11, Loss = 7.829762142342088
Iteration = 12, Loss = 7.829761222379141
Iteration = 13, Loss = 7.829760310486438
Iteration = 14, Loss = 7.829759399646989
Iteration = 15, Loss = 7.829758489015161
Iteration = 16, Loss = 7.829757578489033
Iteration = 17, Loss = 7.829756668056319
Iteration = 18, Loss = 7.829755757715535
Iteration = 19, Loss = 7.829754847466484
Iteration = 20, Loss = 7.829753937309139
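
The code above stops after training. One way the returned parameters dictionary might be used for prediction on held-out inputs is sketched below. The `predict` helper, the parameter values, and the tiny test arrays are hypothetical and chosen only so the arithmetic is easy to follow; in practice the parameters would come from `linear_reg.train(...)` and the inputs from `test_input`.

```python
import numpy as np

def predict(parameters, inputs):
    """Apply y = m*x + c using a parameters dict shaped like the one train() returns."""
    return parameters["m"] * inputs + parameters["c"]

# Hypothetical values for illustration only; real ones come from linear_reg.train(...).
parameters = {"m": 1.0, "c": -0.5}

# Tiny made-up test set with shape (n, 1), matching the arrays used above.
test_input = np.array([[10.0], [20.0], [30.0]])
test_output = np.array([[9.0], [21.0], [29.5]])

test_predictions = predict(parameters, test_input)

# Mean squared error on the held-out points, the same metric as cost_function.
test_mse = np.mean((test_output - test_predictions) ** 2)
print(test_predictions.ravel(), test_mse)
```

Comparing the test error with the final training loss gives a rough indication of how well the fitted line generalizes.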

Linear Regression Line

The linear regression line provides valuable insights into the relationship between the two variables. It represents the best-fitting line that captures the overall trend of how a dependent variable (Y) changes in response to variations in an independent variable (X).

  • Positive Linear Regression Line: A positive linear regression line indicates a direct relationship between the independent variable (X) and the dependent variable (Y). This means that as the value of X increases, the value of Y also increases. The slope of a positive linear regression line is positive, meaning that the line slants upward from left to right.
  • Negative Linear Regression Line: A negative linear regression line indicates an inverse relationship between the independent variable (X) and the dependent variable (Y). This means that as the value of X increases, the value of Y decreases. The slope of a negative linear regression line is negative, meaning that the line slants downward from left to right.
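
The sign of the fitted slope can be checked directly. The sketch below fits a least-squares line to two made-up datasets with `np.polyfit`, a NumPy routine used here in place of the gradient descent class; the data values are assumptions for illustration.

```python
import numpy as np

x = np.arange(10, dtype=float)
y_up = 2.0 * x + 1.0      # made-up data that rises with x
y_down = -0.5 * x + 4.0   # made-up data that falls with x

# np.polyfit with deg=1 returns [slope, intercept] of the least-squares line.
slope_up, _ = np.polyfit(x, y_up, deg=1)
slope_down, _ = np.polyfit(x, y_down, deg=1)

# A positive slope means a direct relationship, a negative slope an inverse one.
print(slope_up > 0, slope_down < 0)
```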

Linear Regression in Machine learning

Machine learning is a branch of artificial intelligence that focuses on developing algorithms and statistical models that can learn from data and make predictions on it. Linear regression is a machine-learning algorithm, more specifically a supervised one: it learns from labelled datasets and maps the data points to the best-fitting linear function, which can then be used for prediction on new datasets.

First, we should know what a supervised machine learning algorithm is. It is a type of machine learning where the algorithm learns from labelled data. Labelled data means a dataset whose respective target values are already known. Supervised learning has two types:

  • Classification: It predicts the class of a data point based on the independent input variables. A class is a categorical or discrete value, for example whether an image of an animal shows a cat or a dog.
  • Regression: It predicts a continuous output variable based on the independent input variables, for example house prices based on parameters such as house age, distance from the main road, location, and area.

Here, we will discuss one of the simplest types of regression i.e. Linear Regression.

Table of Contents

  • What is Linear Regression?
  • Types of Linear Regression
  • What is the best Fit Line?
  • Cost function for Linear Regression
  • Assumptions of Simple Linear Regression
  • Assumptions of Multiple Linear Regression
  • Evaluation Metrics for Linear Regression
  • Python Implementation of Linear Regression
  • Regularization Techniques for Linear Models
  • Applications of Linear Regression
  • Advantages & Disadvantages of Linear Regression
  • Linear Regression – Frequently Asked Questions (FAQs)
