Regression Model in Machine Learning
Let’s take an example of linear regression. We have a housing dataset and we want to predict the price of a house from its lot size. Following is the Python code for it.
```python
# Python code to illustrate regression using a housing dataset
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import linear_model

# Load the CSV and pick out the columns of interest
df = pd.read_csv("Housing.csv")
Y = df['price'].values.reshape(-1, 1)
X = df['lotsize'].values.reshape(-1, 1)

# Split the data into training/testing sets
X_train, X_test = X[:-250], X[-250:]
Y_train, Y_test = Y[:-250], Y[-250:]

# Plot the test data
plt.scatter(X_test, Y_test, color='black')
plt.title('Test Data')
plt.xlabel('Size')
plt.ylabel('Price')
plt.xticks(())
plt.yticks(())

# Create the linear regression object and train it on the training sets
regr = linear_model.LinearRegression()
regr.fit(X_train, Y_train)

# Plot the fitted line over the test data
plt.plot(X_test, regr.predict(X_test), color='red', linewidth=3)
plt.show()
```
Output:
In this graph we plot the test data; the red line is the best-fit line used to predict the price.
To make an individual prediction using the linear regression model:
```python
# Predict the price for a lot size of 5000 (the input must be 2-D)
print(round(float(regr.predict([[5000]])[0, 0])))
```
Regression Evaluation Metrics
Here are some of the most popular evaluation metrics for regression:
- Mean Absolute Error (MAE): The average absolute difference between the predicted and actual values of the target variable.
- Mean Squared Error (MSE): The average squared difference between the predicted and actual values of the target variable.
- Root Mean Squared Error (RMSE): The square root of the mean squared error.
- Huber Loss: A hybrid loss function that transitions from MAE to MSE for larger errors, providing balance between robustness and MSE’s sensitivity to outliers.
- Root Mean Squared Logarithmic Error (RMSLE): The square root of the mean squared difference between the logarithms of the predicted and actual values; useful when the target spans several orders of magnitude.
- R² Score: The proportion of variance in the target explained by the model. Higher values indicate a better fit: 1 is a perfect fit, 0 matches a mean-only baseline, and values can be negative for models that do worse than that baseline.
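The metrics above can be computed directly with scikit-learn. A minimal sketch using small synthetic arrays in place of real model predictions:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Synthetic stand-ins for actual and predicted target values
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.0])

mae = mean_absolute_error(y_true, y_pred)   # mean of |error|
mse = mean_squared_error(y_true, y_pred)    # mean of error^2
rmse = np.sqrt(mse)                         # square root of MSE
r2 = r2_score(y_true, y_pred)               # explained-variance ratio

print(mae, mse, rmse, r2)
```

For these toy values, MAE is 0.5 and MSE is 0.375; RMSE and R² follow from them.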
Applications of Regression
- Predicting prices: For example, a regression model could be used to predict the price of a house based on its size, location, and other features.
- Forecasting trends: For example, a regression model could be used to forecast the sales of a product based on historical sales data and economic indicators.
- Identifying risk factors: For example, a regression model could be used to identify risk factors for heart disease based on patient data.
- Making decisions: For example, a regression model could be used to recommend which investment to buy based on market data.
Advantages of Regression
- Easy to understand and interpret
- Robust variants (such as Huber regression) can reduce sensitivity to outliers, although ordinary least squares itself is outlier-sensitive
- Can handle both linear and nonlinear relationships.
Disadvantages of Regression
- Assumes linearity
- Sensitive to multicollinearity
- May not be suitable for highly complex relationships
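The multicollinearity problem can be made concrete with a hedged sketch on synthetic data: when two features are nearly copies of each other, ordinary least squares coefficients become unstable, while a regularized model such as ridge regression shrinks them toward a more stable solution.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)  # nearly a copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

print(ols.coef_)    # individual coefficients can be large and offsetting
print(ridge.coef_)  # the penalty pulls the pair toward similar values
```

The two OLS coefficients still sum to roughly 3, but how that total is split between the collinear features is poorly determined; ridge resolves the ambiguity by penalizing large coefficients.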
Conclusion
Regression is a statistical approach that models the relationship between dependent and independent variables, enabling numeric predictions through a variety of regression models. This article covered regression in machine learning, including a worked linear regression example, evaluation metrics, applications, and the main advantages and disadvantages of the approach.