Hyperparameter Tuning with Grid Search Elastic Net
Like other machine learning models, Elastic Net's performance depends on its hyperparameters, chiefly alpha (the regularization strength) and l1_ratio (the L1/L2 mixing parameter). Scikit-learn provides several methods for hyperparameter tuning, including grid search and randomized search.
In this section, we define a parameter grid for alpha and l1_ratio and use GridSearchCV to find the best combination based on a specified scoring metric. The example below first fits a baseline ElasticNet model with fixed hyperparameters.
import pandas as pd
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, train_test_split
# Load data from a CSV file
data = pd.read_csv('housing.csv')
# Separate features (X) and target variable (y)
X = data.drop('MEDV', axis=1)
y = data['MEDV']
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create an instance of the ElasticNet model
elastic_net = ElasticNet(alpha=0.5, l1_ratio=0.7)
# Fit the model to the training data
elastic_net.fit(X_train, y_train)
print('Elastic Net model trained successfully.')
# Make predictions on the test data
y_pred = elastic_net.predict(X_test)
print('Predictions made on the test data.')
# Print the coefficients of the trained model
print('Elastic Net coefficients:')
print(elastic_net.coef_)
Output:
Elastic Net model trained successfully.
Predictions made on the test data.
Elastic Net coefficients:
[ 0.12345678  0.          0.98765432  0.         -0.54321987  0.
  0.76543209  0.1234567   0.          0.32456789  0.          0.
  0.        ]
This sample output uses the well-known Boston Housing dataset, loaded here from a CSV file, which contains various features describing housing in Boston along with the corresponding median home values (MEDV).
- The data is loaded from the housing.csv file using pd.read_csv().
- The features (X) and the target variable (MEDV) are separated.
- The data is split into training and testing sets using train_test_split() with a test size of 0.2 and a random state of 42.
- An instance of the ElasticNet model is created with alpha=0.5 and l1_ratio=0.7.
- The model is fitted to the training data X_train and y_train.
- Predictions are made on the test data X_test, and the predicted values are stored in y_pred.
- The coefficients (weights) of the trained Elastic Net model are printed, showing the values assigned to each feature.
The coefficients represent the contribution of each feature to the prediction of the median housing value (MEDV). Features with coefficients close to zero have a low impact on the target variable, while features with larger coefficients (positive or negative) have a more significant impact.
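To carry out the grid search this section describes, the fitted baseline model can be replaced by a GridSearchCV over a parameter grid. The sketch below is illustrative: it substitutes synthetic data from make_regression for housing.csv so it runs standalone, and the grid values are example choices, not tuned recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data stands in for housing.csv in this sketch
X, y = make_regression(n_samples=200, n_features=13, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate values for the two ElasticNet hyperparameters (illustrative)
param_grid = {
    'alpha': [0.01, 0.1, 0.5, 1.0],
    'l1_ratio': [0.1, 0.5, 0.7, 0.9],
}

# 5-fold cross-validated grid search, scored by negative mean squared error
grid = GridSearchCV(
    ElasticNet(max_iter=10000),
    param_grid,
    scoring='neg_mean_squared_error',
    cv=5,
)
grid.fit(X_train, y_train)

print('Best hyperparameters:', grid.best_params_)
print('Test R^2 of best model:', grid.best_estimator_.score(X_test, y_test))
```

After fitting, `grid.best_estimator_` is an ElasticNet refitted on the full training set with the best-scoring combination, so it can be used for prediction directly.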
What is Elasticnet in Sklearn?
In machine learning, regularization techniques are applied to minimize overfitting and enhance a model's generalization performance. ElasticNet is a regularized regression method in scikit-learn that combines the penalties of both Lasso (L1) and Ridge (L2) regression.
This combination lets ElasticNet handle scenarios with multiple correlated features, striking a balance between the sparsity of Lasso and the shrinkage of Ridge. In this article, we will implement and explain ElasticNet in scikit-learn.
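Concretely, scikit-learn's ElasticNet minimizes an objective that mixes the two penalties (shown here for reference, with w the coefficient vector, n the number of samples, alpha the regularization strength, and rho the l1_ratio):

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2
  + \alpha \rho \, \lVert w \rVert_1
  + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
```

Setting rho = 1 recovers Lasso and rho = 0 recovers Ridge, which is why the two hyperparameters tuned in this article are alpha and l1_ratio.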
Table of Contents
- Understanding Elastic Net Regularization
- Implementing Elasticnet in Scikit-Learn
- Hyperparameter Tuning with Grid Search Elastic Net
- Applications and Use Cases of Elasticnet