Python implementation of Normal Equation
We will create a synthetic dataset with a single feature using sklearn. We will use numpy for the matrix computations, namely taking the transpose of the data matrix and solving the resulting linear system, which avoids explicitly inverting the matrix. The solve step is wrapped in a try/except block so that if the matrix X_transpose_X is singular, the function returns None instead of raising an error.
Python3
import numpy as np
from sklearn.datasets import make_regression

# Create data set.
X, y = make_regression(n_samples=100, n_features=1,
                       n_informative=1, noise=10, random_state=10)


def linear_regression_normal_equation(X, y):
    X_transpose = np.transpose(X)
    X_transpose_X = np.dot(X_transpose, X)
    X_transpose_y = np.dot(X_transpose, y)

    try:
        theta = np.linalg.solve(X_transpose_X, X_transpose_y)
        return theta
    except np.linalg.LinAlgError:
        return None


# Add a column of ones to X for the intercept term
X_with_intercept = np.c_[np.ones((X.shape[0], 1)), X]
theta = linear_regression_normal_equation(X_with_intercept, y)

if theta is not None:
    print(theta)
else:
    print("Unable to compute theta. The matrix X_transpose_X is singular.")
Output:
[ 0.52804151 30.65896337]
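As a sanity check on the approach above, the normal-equation solution can be compared with NumPy's least-squares solver, which also returns an answer when X_transpose_X is singular. The following is a minimal sketch, not the article's code: the data are generated with a hypothetical random seed and coefficients chosen for illustration, and the names `X_b`, `X_dup`, `theta_solve`, `theta_lstsq` and `theta_dup` are assumptions.

```python
import numpy as np

# Hypothetical data for illustration (not the article's dataset).
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 0.5 + 30.0 * x[:, 0] + rng.normal(scale=10.0, size=100)

# Design matrix with a leading column of ones for the intercept.
X_b = np.c_[np.ones((x.shape[0], 1)), x]

# Normal equation solved as a linear system (full-rank case).
theta_solve = np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# Least-squares solver applied to the same problem.
theta_lstsq = np.linalg.lstsq(X_b, y, rcond=None)[0]

# Duplicate the feature column so that X_dup.T @ X_dup is singular.
# lstsq still returns a (minimum-norm) least-squares solution here.
X_dup = np.c_[X_b, x]
theta_dup = np.linalg.lstsq(X_dup, y, rcond=None)[0]

print("solve:", theta_solve)
print("lstsq:", theta_lstsq)
print("lstsq on rank-deficient X:", theta_dup)
```

In the full-rank case the two solvers should agree up to floating-point error. In the rank-deficient case the coefficients are no longer unique, but the fitted values should match those of the full-rank fit.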
To Predict on New Test Data Instances
Now that we have trained our model and found the parameters that give the lowest error, we can use these parameters to make predictions on new, unseen test data.
Python3
def predict(X, theta):
    predictions = np.dot(X, theta)
    return predictions


# Input features for testing
X_test = np.array([[1], [4]])
X_test_with_intercept = np.c_[np.ones((X_test.shape[0], 1)), X_test]
predictions = predict(X_test_with_intercept, theta)
print("Predictions:", predictions)
Output:
Predictions: [ 31.18700488 123.16389501]
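The predictions above can be checked by hand. With a single feature, the dot product of a row [1, x] with theta reduces to the line below, where θ₀ is the intercept and θ₁ the slope printed earlier. The symbols θ₀, θ₁ and ŷ are notation introduced here for illustration.

```latex
\hat{y} = \theta_0 + \theta_1 x

x = 1:\quad \hat{y} = 0.52804151 + 30.65896337 \cdot 1 = 31.18700488

x = 4:\quad \hat{y} = 0.52804151 + 30.65896337 \cdot 4 \approx 123.1639
```

Small differences in the last printed digits come from rounding in the displayed values of theta.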
ML | Normal Equation in Linear Regression
We know that Linear Regression is a parameterized model, which means that the model's behavior and predictions are determined by a set of parameters or coefficients. There are different methods for finding the parameters that give the lowest error on our dataset. In this article, we will look at one such method: the normal equation.
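For reference, the normal equation can be sketched as follows, assuming the usual mean-squared-error cost. Here X is the data matrix with a leading column of ones, y is the target vector, θ is the parameter vector and m is the number of samples; the cost function J and the symbol m are notation assumed here, not taken from the article.

```latex
J(\theta) = \frac{1}{2m} \lVert X\theta - y \rVert^2

\nabla_\theta J(\theta) = \frac{1}{m} X^{T} (X\theta - y) = 0
\;\Longrightarrow\; X^{T} X \,\theta = X^{T} y

\theta = (X^{T} X)^{-1} X^{T} y \quad \text{(when } X^{T} X \text{ is invertible)}
```

Setting the gradient to zero gives a linear system in θ, so the parameters can be obtained in one step, without an iterative optimizer.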