Implementation of Stochastic Gradient Descent Regressor using Scikit-learn

We will use the diabetes dataset to fit an ordinary least squares baseline and then a linear regression model trained with SGD, comparing the two by their R-squared score.

Step 1: Importing the required libraries

Python
from sklearn.datasets import load_diabetes
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

Step 2: Splitting the dataset

We now split the dataset into test and training sets in a 1:4 ratio (20% test, 80% training).

Python
# Load the features and target as NumPy arrays
X, y = load_diabetes(return_X_y=True)
print(X.shape)
print(y.shape)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

Output:

(442, 10)
(442,)
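
As a quick check of the 1:4 ratio, the minimal sketch below repeats the split and prints how many samples land in each part. The variable names `n_train` and `n_test` are introduced here for illustration only.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# Number of rows in each part of the split (illustrative names)
n_train, n_test = X_train.shape[0], X_test.shape[0]
print(n_train, n_test)

# Fraction of samples held out for testing, close to 0.2
print(round(n_test / (n_train + n_test), 3))
```

Because 442 is not divisible by 5, the test fraction is only approximately 20%.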

Step 3: Fitting the linear regression model on the training data

After training the linear regression model using the fit method, you can access the coefficients and intercept of the model. The coefficients represent the weight of each feature in the linear equation, and the intercept is the constant term.

Python
reg = LinearRegression()
reg.fit(X_train,y_train)

print(reg.coef_)
print(reg.intercept_)

Output:

LinearRegression()
[ -9.16088483 -205.46225988 516.68462383 340.62734108 -895.54360867
561.21453306 153.88478595 126.73431596 861.12139955 52.41982836]
151.88334520854633

Each value in the coefficient array corresponds to one of the ten features of the diabetes dataset, and the intercept is the constant term of the linear model.
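
To make the link between coefficients and features concrete, the sketch below pairs each coefficient with its feature name and rebuilds the predictions by hand from the linear equation. It assumes the same split as above; `manual_pred` and `matches` are illustrative names, not part of scikit-learn.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = load_diabetes()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)
reg = LinearRegression().fit(X_train, y_train)

# One coefficient per feature, in the same order as the columns of X
for name, weight in zip(data.feature_names, reg.coef_):
    print(f"{name:>4}: {weight:10.3f}")

# The model's prediction is the weighted sum of the features plus the intercept
manual_pred = X_test @ reg.coef_ + reg.intercept_
matches = np.allclose(manual_pred, reg.predict(X_test))
print(matches)
```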

Step 4: Evaluating the model

We predict values for the test set (X_test) with the trained model (reg) and then compute the R-squared score between the predictions (y_pred) and the actual values (y_test).


Python
# Predict on the held-out test set and compute the R-squared score
y_pred = reg.predict(X_test)
print(r2_score(y_test, y_pred))

Output:

0.4399387660024645
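
For reference, the R-squared score is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand and compares it with `r2_score`. The helper `manual_r2` and the four toy values are made up for illustration and are not taken from the diabetes dataset.

```python
import numpy as np
from sklearn.metrics import r2_score


def manual_r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot


# Toy values, chosen only to demonstrate the formula
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

manual = manual_r2(y_true, y_hat)
library = r2_score(y_true, y_hat)
print(manual, library)
```

A score of 1 means perfect predictions, while a score of 0 means the model does no better than always predicting the mean of the targets.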


Step 5: Implementing SGD

We will now fit a Stochastic Gradient Descent regressor using scikit-learn's SGDRegressor.

Python
from sklearn.linear_model import SGDRegressor

# Constant learning rate of 0.01, at most 100 passes over the training data
reg = SGDRegressor(max_iter=100, learning_rate='constant', eta0=0.01)
reg.fit(X_train, y_train)

# Score the SGD model on the held-out test set
y_pred = reg.predict(X_test)
print(r2_score(y_test, y_pred))

Output:

SGDRegressor(learning_rate='constant', max_iter=100)
0.49059547063734904
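
The `learning_rate` parameter selects how the step size changes during training. As a rough sketch, the code below fits the same model with three of the schedules that SGDRegressor accepts and collects the test scores. Fixing `random_state=0` and silencing `ConvergenceWarning` are choices made here for a repeatable, quiet comparison, not part of the original example.

```python
import warnings

from sklearn.datasets import load_diabetes
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

scores = {}
for schedule in ("constant", "invscaling", "adaptive"):
    # random_state is fixed (an assumption of this sketch) so runs are repeatable
    model = SGDRegressor(
        max_iter=100, learning_rate=schedule, eta0=0.01, random_state=0
    )
    with warnings.catch_warnings():
        # 100 passes may not be enough to converge; hide that warning here
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        model.fit(X_train, y_train)
    scores[schedule] = r2_score(y_test, model.predict(X_test))

for schedule, score in scores.items():
    print(f"{schedule:>10}: R^2 = {score:.3f}")
```

The scores depend on the schedule, the number of iterations, and the random seed, so they are best read as a comparison within one run rather than as fixed values.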


scikit-learn's SGDRegressor applies L2 regularization by default and supports several learning rate schedules. In this run it reaches a higher R-squared score on the test set (0.49) than the LinearRegression model from Step 4 (0.44). Because no random_state is set, the SGD score will vary slightly from run to run.
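
To show the idea behind stochastic gradient descent itself, here is a minimal NumPy sketch of the per-sample update for squared loss with a constant learning rate. It is a simplified illustration, not scikit-learn's actual implementation: it has no regularization, no stopping criterion, and no learning rate schedule. The names `w`, `b`, `eta`, `n_epochs`, and `mse` are chosen here for the example.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

rng = np.random.default_rng(0)      # fixed seed so the sketch is repeatable
n_samples, n_features = X_train.shape
w = np.zeros(n_features)            # feature weights
b = 0.0                             # intercept
eta = 0.01                          # constant learning rate
n_epochs = 100                      # passes over the training data


def mse(X, y, w, b):
    """Mean squared error of the linear model X @ w + b."""
    return float(np.mean((X @ w + b - y) ** 2))


initial_mse = mse(X_train, y_train, w, b)

for epoch in range(n_epochs):
    # Visit the training samples in a fresh random order each epoch
    for i in rng.permutation(n_samples):
        error = X_train[i] @ w + b - y_train[i]
        # Gradient of 0.5 * error**2 with respect to w and b, one sample at a time
        w -= eta * error * X_train[i]
        b -= eta * error

final_mse = mse(X_train, y_train, w, b)
print(initial_mse, final_mse)
```

Each update uses a single training example, which is what distinguishes stochastic gradient descent from batch gradient descent, where the gradient is computed over the entire dataset before every step.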

Stochastic Gradient Descent Regressor using Scikit-learn

Stochastic Gradient Descent (SGD) is a popular optimization technique in the field of machine learning. It is particularly well-suited for handling large datasets and online learning scenarios where data arrives sequentially. In this article, we will discuss how a stochastic gradient descent regressor is implemented using Scikit-Learn.
