Standard Error of Regression

The standard error of the regression is a statistical measure of the average distance between the observed values and the regression line. It describes how much the actual data is spread around the line. In other words, it provides a measure of how much the actual dependent value deviates from the predicted value. Since it is an error measure, a lower value means a better prediction.
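As a minimal sketch of the definition (using only NumPy, on the same small hypothetical data set used in the example below), the standard error of the regression is the square root of the residual sum of squares divided by the residual degrees of freedom, which is n − 2 for simple linear regression:

```python
import numpy as np

# Hypothetical data: five observations
x = np.array([2, 3, 4, 9, 5])   # Independent variable
y = np.array([2, 6, 8, 18, 10])  # Dependent variable

# Fit a simple linear regression (degree-1 polynomial)
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept

# Standard error of the regression: sqrt(SSE / (n - 2))
n = len(x)
residuals = y - y_pred
std_error = np.sqrt(np.sum(residuals ** 2) / (n - 2))
print(std_error)
```

The value agrees with what statsmodels reports internally for the same data, since both use the residual sum of squares over n − 2 degrees of freedom.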

Suppose we want to estimate the slope and the y-intercept of a regression line from the independent and dependent variables. Let us look at the Python code.

Python3

import numpy as np
import statsmodels.api as sm

x = np.array([2, 3, 4, 9, 5])   # Independent variable
y = np.array([2, 6, 8, 18, 10])  # Dependent variable

# Add a constant term (intercept) to the independent variable
x = sm.add_constant(x, prepend=True)

# Fit a linear regression model with ordinary least squares
regression = sm.OLS(y, x).fit()

# Display the model summary
print(regression.summary())

                    

Output:

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.984
Model:                            OLS   Adj. R-squared:                  0.978
Method:                 Least Squares   F-statistic:                     182.8
Date:                Thu, 05 Oct 2023   Prob (F-statistic):           0.000875
Time:                        11:18:32   Log-Likelihood:                -5.1249
No. Observations:                   5   AIC:                             14.25
Df Residuals:                       3   BIC:                             13.47
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -1.2192      0.837     -1.456      0.241      -3.883       1.445
x1             2.1781      0.161     13.519      0.001       1.665       2.691
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   2.045
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.624
Skew:                          -0.678   Prob(JB):                        0.732
Kurtosis:                       1.925   Cond. No.                         11.5
==============================================================================


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Standard Error:

Python3

# Get the standard errors of regression coefficients
error = regression.bse
 
# Print the standard errors
print('Constant error :',error[0])
print('x1 error       :',error[1])

                    

Output:

Constant error : 0.8371869364710968
x1 error       : 0.16111670104454195

After executing the code we get two values. In the code above we defined two NumPy arrays and imported the statsmodels library for the statistical calculations. We then added a constant term and used a linear regression model to fit the independent and dependent values. The model analyses the values and establishes a relationship between the dependent and independent variables; in other words, it estimates the slope and the y-intercept. Then, using the bse attribute, we obtain the standard errors of the intercept and the slope respectively. Here the intercept has a standard error of about 0.837, whereas the slope has a standard error of about 0.161.

Standard Error of the Regression vs. R-squared

Regression is a statistical technique used to establish a relationship between dependent and independent variables. It predicts a continuous set of values in a given range. The general equation of simple linear Regression is given by

y = mx + c

  • Here y is the dependent variable. It is the variable whose value changes when the independent variable is changed.
  • x is the independent variable; y depends on x. Note that there can be more than one independent variable.
  • m is the slope.
  • c is the y-intercept.

There are different types of Regression: Linear Regression, Ridge Regression, Polynomial Regression, and Lasso Regression. Since Regression Analysis predicts continuous values within a given range, we require evaluation metrics to analyse the performance of the Machine Learning model. In Regression Analysis, we calculate how much the predicted values deviate from the actual values. Common evaluation metrics for Regression Analysis include Mean Squared Error, Mean Absolute Error, and R squared.
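As a minimal sketch (using only NumPy, on the same five-point data set as the earlier example), these metrics can be computed directly from the residuals of the fitted line:

```python
import numpy as np

# Same hypothetical data as above
x = np.array([2, 3, 4, 9, 5])
y = np.array([2, 6, 8, 18, 10])

# Fit a simple linear regression and predict
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept

mse = np.mean((y - y_pred) ** 2)           # Mean Squared Error
mae = np.mean(np.abs(y - y_pred))          # Mean Absolute Error

# R squared: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r2 = 1 - ss_res / ss_tot

print(mse, mae, r2)
```

The R squared value computed this way matches the 0.984 reported in the statsmodels summary above.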
