Implementation of Stochastic Gradient Descent Regressor using Scikit-learn
We will use the diabetes dataset to first fit an ordinary linear regression model as a baseline, and then fit and evaluate a linear model trained with SGD.
Step 1: Importing the required libraries
from sklearn.datasets import load_diabetes
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
Step 2: Splitting the dataset
We will now split the dataset into training and test sets, with 80% of the samples used for training and 20% for testing (a test-to-train ratio of 1:4).
X,y = load_diabetes(return_X_y=True)
print(X.shape)
print(y.shape)
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=2)
Output :
(442, 10)
(442,)
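As a quick sanity check on the split, the sketch below prints the shapes of the resulting arrays. It reuses the same call and random_state as above; with 442 samples and test_size=0.2, scikit-learn rounds the test set up to 89 samples and leaves 353 for training.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)

# Same split as in the article: 20% of the rows are held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

print(X_train.shape, X_test.shape)  # training rows vs. test rows
print(y_train.shape, y_test.shape)
```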
Step 3: Fitting the Linear Regression model on the training data
After training the linear regression model using the fit method, you can access the coefficients and intercept of the model. The coefficients represent the weight of each feature in the linear equation, and the intercept is the constant term.
reg = LinearRegression()
reg.fit(X_train,y_train)
print(reg.coef_)
print(reg.intercept_)
Output :
LinearRegression()
[ -9.16088483 -205.46225988 516.68462383 340.62734108 -895.54360867
561.21453306 153.88478595 126.73431596 861.12139955 52.41982836]
151.88334520854633
Each number in the coefficients array corresponds to one of the ten features of the diabetes dataset, and the intercept is the constant term in the linear model.
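To see which coefficient belongs to which feature, one option is to pair the coefficients with the feature_names attribute of the dataset object. The sketch below refits the same baseline model for that purpose; the name coef_by_feature is only an illustrative choice.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the dataset as an object so the feature names are available
data = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=2
)

reg = LinearRegression().fit(X_train, y_train)

# Map each feature name to its learned weight (illustrative helper dict)
coef_by_feature = dict(zip(data.feature_names, reg.coef_))
for name, weight in coef_by_feature.items():
    print(f"{name:>4}: {weight:10.2f}")
print(f"intercept: {reg.intercept_:.2f}")
```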
Step 4: Evaluating the model
The code below predicts values on the test set (X_test) using the trained regression model (reg), and then calculates the R-squared score between the predicted values (y_pred) and the actual values (y_test).
y_pred = reg.predict(X_test)
r2_score(y_test,y_pred)
Output:
0.4399387660024645
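For intuition about what r2_score reports, the sketch below computes R-squared by hand as 1 - SS_res / SS_tot on a tiny made-up example and compares it with scikit-learn's result. The arrays y_true and y_hat are invented purely for illustration and are not taken from the diabetes dataset.

```python
import numpy as np
from sklearn.metrics import r2_score

# Toy values, invented for illustration only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 8.0])

# Residual sum of squares: 0.25 + 0.25 + 0 + 1 = 1.5
ss_res = np.sum((y_true - y_hat) ** 2)
# Total sum of squares around the mean (6.0): 9 + 1 + 1 + 9 = 20
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot  # 1 - 1.5 / 20 = 0.925
print(r2_manual, r2_score(y_true, y_hat))
```

A score of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of the targets.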
Step 5: Implementing SGD
We will now implement the Stochastic Gradient Descent Regressor using the Scikit-learn library.
from sklearn.linear_model import SGDRegressor
reg = SGDRegressor(max_iter=100,learning_rate='constant',eta0=0.01)
reg.fit(X_train,y_train)
y_pred = reg.predict(X_test)
r2_score(y_test,y_pred)
Output :
SGDRegressor(learning_rate='constant', max_iter=100)
0.49059547063734904
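scikit-learn's SGDRegressor benefits from an optimized internal implementation, configurable learning rate schedules, and built-in regularization (an L2 penalty by default), which typically gives it an edge over a hand-written SGD loop. Because SGD shuffles the training data and no random_state is set here, the exact score can vary slightly from run to run.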
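To make the constant learning rate setting (learning_rate='constant', eta0=0.01) more concrete, here is a minimal NumPy sketch of the per-sample update that SGD performs for squared loss. It is an illustration only, not scikit-learn's actual implementation: the function sgd_fit and the synthetic data are hypothetical, and the sketch omits regularization and stopping criteria.

```python
import numpy as np

def sgd_fit(X, y, eta0=0.01, epochs=100, seed=0):
    """Plain SGD for linear regression with a constant learning rate (sketch)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        # Visit the samples in a fresh random order on every epoch
        for i in rng.permutation(n_samples):
            error = X[i] @ w + b - y[i]  # prediction error on one sample
            w -= eta0 * error * X[i]     # gradient step for the weights
            b -= eta0 * error            # gradient step for the intercept
    return w, b

# Synthetic data with known weights, used only to check the sketch
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_w + 4.0 + rng.normal(scale=0.1, size=200)

w_hat, b_hat = sgd_fit(X_demo, y_demo)
print(w_hat, b_hat)  # should land close to true_w and 4.0
```

Each update uses a single sample, which is what keeps the cost per step independent of the dataset size.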
Stochastic Gradient Descent Regressor using Scikit-learn
Stochastic Gradient Descent (SGD) is a popular optimization technique in the field of machine learning. It is particularly well-suited for handling large datasets and online learning scenarios where data arrives sequentially. In this article, we will discuss how a stochastic gradient descent regressor is implemented using Scikit-Learn.
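As a rough illustration of the online learning scenario, SGDRegressor exposes a partial_fit method that updates the model with one batch at a time. In the sketch below the stream of batches is simulated with synthetic data; the batch size, the number of batches, and true_w are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])  # weights used to generate the synthetic stream

model = SGDRegressor(learning_rate='constant', eta0=0.01, random_state=0)

# Simulate data arriving sequentially in small batches
for _ in range(200):
    X_batch = rng.normal(size=(32, 2))
    y_batch = X_batch @ true_w + 0.5
    model.partial_fit(X_batch, y_batch)  # incremental update, no full refit

print(model.coef_, model.intercept_)  # should approach true_w and 0.5
```

Unlike fit, partial_fit does not reset the model, so previously learned weights are kept and refined with each new batch.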