Implement Multiple factor analysis in R

To demonstrate Multiple Factor Analysis (MFA) in R, we use the iris dataset, which ships with base R (in the datasets package) rather than with FactoMineR. It contains four flower measurements that fall naturally into two blocks, sepal and petal, so MFA can show how the variables relate to each other and how much each block contributes to the structure of the data.

Step 1. Installing and Loading Required Packages

First, make sure the FactoMineR and factoextra packages are installed, then load them:

R
install.packages("FactoMineR")
library(FactoMineR)
install.packages("factoextra")
library(factoextra)

Step 2. Load the Dataset

Now we load the dataset that we will use for Multiple Factor Analysis and look at its first rows.

R
data("iris")
head(iris)

Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa

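Step 3. Preparing the Data for Multiple Factor Analysis

Now we prepare the data for MFA by dropping the non-numeric Species column and defining the variable groups. The first two columns (the sepal measurements) form one group and the last two (the petal measurements) form the other.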

R
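# Keep only the four numeric measurements (column 5 is Species)
iris_data <- iris[, -5]
# Define the groups: 2 sepal columns followed by 2 petal columns
group_definitions <- c(2, 2)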
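
Before running the analysis, it can help to confirm that the group sizes match the column layout and that there are no missing values. The following is a minimal sketch using base R only; the names `group_names`, `na_counts`, and `group_members` are illustrative and do not appear elsewhere in this article.

```r
# Sanity checks on the inputs that MFA() will receive (base R only)
iris_data <- iris[, -5]
group_definitions <- c(2, 2)
group_names <- c("Sepal", "Petal")  # illustrative labels for the two blocks

# Group sizes must add up to the number of columns, in column order
stopifnot(sum(group_definitions) == ncol(iris_data))

# Groups declared with type "s" should contain numeric columns only
stopifnot(all(vapply(iris_data, is.numeric, logical(1))))

# Count missing values per column
na_counts <- colSums(is.na(iris_data))
print(na_counts)

# List which columns end up in which group
group_id <- factor(rep(group_names, times = group_definitions),
                   levels = group_names)
group_members <- split(names(iris_data), group_id)
print(group_members)
```

If either `stopifnot()` call fails, the grouping vector or the column selection needs to be fixed before the data is passed to MFA.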

Step 4. Performing Multiple Factor Analysis

Now we run the MFA and print a summary of the result. Both groups are declared as type "s", which means their variables are continuous and are scaled to unit variance.

R
res_mfa <- MFA(iris_data, 
               group = group_definitions, 
               type = c("s", "s"), 
               name.group = c("Sepal", "Petal"), 
               graph = FALSE)

summary(res_mfa)

Output:

Call:
MFA(base = iris_data, group = group_definitions, type = c("s",
"s"), name.group = c("Sepal", "Petal"), graph = FALSE)


Eigenvalues
Dim.1 Dim.2 Dim.3 Dim.4
Variance 1.882 0.815 0.101 0.011
% of var. 67.006 29.026 3.579 0.389
Cumulative % of var. 67.006 96.032 99.611 100.000

Groups
Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3 ctr cos2
Sepal | 0.932 49.551 0.536 | 0.810 99.336 0.404 | 0.047 46.595 0.001 |
Petal | 0.949 50.449 0.901 | 0.005 0.664 0.000 | 0.054 53.405 0.003 |

Individuals (the 10 first)
Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3 ctr cos2
1 | -1.804 1.153 0.943 | 0.422 0.146 0.052 | -0.132 0.116 0.005 |
2 | -1.577 0.881 0.834 | -0.661 0.358 0.147 | -0.230 0.350 0.018 |
3 | -1.886 1.261 0.965 | -0.361 0.107 0.035 | 0.005 0.000 0.000 |
4 | -1.842 1.202 0.902 | -0.603 0.297 0.097 | 0.052 0.018 0.001 |
5 | -1.949 1.346 0.920 | 0.573 0.268 0.080 | -0.009 0.001 0.000 |
6 | -1.733 1.064 0.615 | 1.372 1.540 0.385 | 0.003 0.000 0.000 |
7 | -2.038 1.471 0.985 | -0.004 0.000 0.000 | 0.252 0.420 0.015 |
8 | -1.781 1.123 0.987 | 0.179 0.026 0.010 | -0.096 0.061 0.003 |
9 | -1.856 1.221 0.741 | -1.094 0.978 0.257 | 0.090 0.054 0.002 |
10 | -1.676 0.995 0.911 | -0.468 0.179 0.071 | -0.233 0.361 0.018 |

Continuous variables
Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3 ctr cos2
Sepal.Length | 0.895 38.053 0.800 | 0.391 16.752 0.153 | -0.216 41.516 0.047 |
Sepal.Width | -0.492 11.497 0.242 | 0.867 82.584 0.752 | 0.076 5.078 0.006 |
Petal.Length | 0.984 26.213 0.968 | 0.052 0.168 0.003 | 0.122 7.520 0.015 |
Petal.Width | 0.946 24.236 0.895 | 0.089 0.495 0.008 | 0.301 45.885 0.091 |

The summary of the MFA on the iris dataset has four parts:

  1. Eigenvalues: These show how much variance each dimension captures. The first dimension explains 67.0% of the total variance and the second 29.0%, so the first two dimensions together retain 96.0%.
  2. Groups: This part shows how the two groups (Sepal and Petal) contribute to each dimension. Both groups contribute almost equally to Dimension 1 (49.6% and 50.4%), while Dimension 2 comes almost entirely from the Sepal group (99.3%).
  3. Individuals: This part gives the coordinate, contribution (ctr), and quality of representation (cos2) of each observation on each dimension. The first ten individuals, all setosa flowers, have negative coordinates on Dimension 1.
  4. Continuous Variables: This part gives the relationship between the original variables and the extracted dimensions. Petal.Length (0.984), Petal.Width (0.946), and Sepal.Length (0.895) are strongly correlated with Dimension 1, while Sepal.Width (0.867) drives Dimension 2.
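
The figures in the list above can also be read directly from the result object instead of from the printed summary. The sketch below refits the same model so that it runs on its own, and assumes the usual layout of a FactoMineR MFA result (`$eig`, `$group$contrib`, `$quanti.var$cor`); the variable names `eig`, `group_contrib`, and `var_cor` are illustrative.

```r
library(FactoMineR)

# Same model as in Step 4, refitted here so the snippet is self-contained
res_mfa <- MFA(iris[, 1:4],
               group = c(2, 2),
               type = c("s", "s"),
               name.group = c("Sepal", "Petal"),
               graph = FALSE)

# Eigenvalue, percentage of variance, and cumulative percentage per dimension
eig <- res_mfa$eig

# Percentage contribution of each group to each dimension
group_contrib <- res_mfa$group$contrib

# Correlation of each original variable with each dimension
var_cor <- res_mfa$quanti.var$cor

print(round(eig, 3))
print(round(group_contrib[, 1:2], 2))
print(round(var_cor[, 1:2], 3))
```

Working with these matrices directly makes it easier to sort, filter, or export the results than parsing the text of `summary()`.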

Overall, MFA helps understand the underlying structure of the data by reducing its dimensionality and highlighting the relationships between variables and observations.
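
To see what the method does for purely quantitative, standardized groups, the eigenvalues can be approximated by hand: each variable is weighted by the inverse of the first eigenvalue of its own group's PCA, and a global PCA is then run on the weighted variables. The following base R sketch illustrates that idea under those assumptions; it is not FactoMineR's actual implementation, and the names `groups`, `lambda1`, `w`, and `mfa_eig` are illustrative.

```r
# Hand-rolled check of the MFA eigenvalues for two standardized groups
X <- iris[, 1:4]
groups <- list(Sepal = 1:2, Petal = 3:4)

# Correlation matrix, i.e. the covariance of the standardized variables
R <- cor(X)

# First eigenvalue of each group's separate PCA
lambda1 <- sapply(groups, function(idx) {
  eigen(R[idx, idx], symmetric = TRUE)$values[1]
})

# Every variable receives the weight 1 / lambda1 of its group,
# so that no group can dominate the first dimension on its own
w <- rep(1 / lambda1, times = lengths(groups))

# Global weighted PCA: eigenvalues of W^(1/2) R W^(1/2)
M <- diag(sqrt(w)) %*% R %*% diag(sqrt(w))
mfa_eig <- eigen(M, symmetric = TRUE)$values

print(round(lambda1, 3))
print(round(mfa_eig, 3))
```

Up to rounding, `mfa_eig` should agree with the Variance row of the summary shown earlier, which suggests that the group weighting is the main thing separating this analysis from a plain PCA on the four columns.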

Step 5. Visualizing MFA Results

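First, we plot the individuals on the first two dimensions, colored by species, with an ellipse drawn around each species at the 0.95 level.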

R
# label = "none" hides the row labels so that only the points are drawn
fviz_mfa_ind(res_mfa, label = "none", habillage = iris$Species, 
             addEllipses = TRUE, ellipse.level = 0.95)

Output:

[Figure: individuals on the first two MFA dimensions, colored by species]

Contributions of Quantitative Variables to Dimensions

R
fviz_contrib(res_mfa, choice = "quanti.var", axes = 1)

Output:

[Figure: contributions of the quantitative variables to Dimension 1]

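fviz_contrib() draws the contribution of each quantitative variable to the chosen dimension, here the first one. In line with the summary above, Sepal.Length contributes the most to Dimension 1 (about 38%) and Sepal.Width the least (about 11%). Set axes = 2 to inspect the second dimension.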
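
A few other factoextra plots are often useful alongside the two above. This is a sketch that assumes `fviz_screeplot()` and `fviz_mfa_var()` behave as documented for MFA results; the model is refitted so the snippet runs on its own, and the names `p_scree`, `p_groups`, and `p_vars` are illustrative.

```r
library(FactoMineR)
library(factoextra)

# Same model as in Step 4, refitted here so the snippet is self-contained
res_mfa <- MFA(iris[, 1:4],
               group = c(2, 2),
               type = c("s", "s"),
               name.group = c("Sepal", "Petal"),
               graph = FALSE)

# Scree plot: percentage of variance explained by each dimension
p_scree <- fviz_screeplot(res_mfa, addlabels = TRUE)

# Groups plot: position of the Sepal and Petal groups on the first two dimensions
p_groups <- fviz_mfa_var(res_mfa, choice = "group")

# Correlation circle of the four quantitative variables
p_vars <- fviz_mfa_var(res_mfa, choice = "quanti.var", repel = TRUE)

print(p_scree)
print(p_groups)
print(p_vars)
```

Each call returns a ggplot object, so the plots can be customized further with the usual ggplot2 functions.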

Conclusion

MFA in R is a versatile method for examining complex, multi-block datasets. It is well-suited for integrating heterogeneous data and uncovering hidden patterns. The ability to visualize and interpret MFA results makes it a valuable tool in many research and industrial settings. By using packages like FactoMineR and factoextra, you can efficiently perform MFA and create insightful visualizations to guide your analysis and decision-making.