What is Partial Least Squares Regression?

Partial least squares regression (PLS regression) is a statistical technique that shares similarities with principal components regression. Instead of identifying hyperplanes of maximum variance between the response and independent variables, PLS regression constructs a linear regression model by projecting both the predicted and observable variables into a new space. This characteristic of projecting data to new spaces classifies PLS methods as bilinear factor models. Partial least squares discriminant analysis (PLS-DA) is a specific variant used when the response variable (Y) is categorical.

PLS is employed to uncover the underlying relationships between two matrices (X and Y). It takes a latent variable approach to model the covariance structures in these matrices. The objective of a PLS model is to identify a multidimensional direction in the X space that explains the maximum multidimensional variance direction in the Y space. PLS regression is particularly advantageous when the predictor matrix has more variables than observations and when there is multicollinearity among X values. This is in contrast to standard regression, which may struggle in these situations unless regularization is applied.

Partial Least Squares Regression (PLSRegression) using Sklearn

Partial least square regression is a Machine learning Algorithm used for modelling the relationship between independent and dependent variables. This is mainly used when there are many interrelated independent variables. It is more commonly used in regression and latent variable modelling. It finds the directions (latent variables) in the independent variable space, explaining the maximum variance in both dependent and independent variables. It iteratively extracts the latent variables to find the maximum covariance between dependent and independent variables. The article explores more PLSRegression and implementation using the Sklearn library.

Similar Reads

What is Partial Least Squares Regression?

Partial least squares regression (PLS regression) is a statistical technique that shares similarities with principal components regression. Instead of identifying hyperplanes of maximum variance between the response and independent variables, PLS regression constructs a linear regression model by projecting both the predicted and observable variables into a new space. This characteristic of projecting data to new spaces classifies PLS methods as bilinear factor models. Partial least squares discriminant analysis (PLS-DA) is a specific variant used when the response variable (Y) is categorical....

Partial Least Squares Regression Implementation

To implement PLS we are taking the “Diabetes” dataset. Now, let’s take a look at how PLS is used to predict diabetes progression. In this dataset, the “diabetes progression” variable is the dependent variable In the Diabetes dataset, independent variables would be the various health-related measurements such as age, sex, BMI, blood pressure, and other serum measurements. We use the PLS model to predict the “diabetes progression” (dependent variable) based on the “health-related measurements”(independent variables)....

Why PLS is used?

...

Conclusion

...