Implement Multiple Factor Analysis in R
To demonstrate Multiple Factor Analysis (MFA) in R, we use the iris dataset, which ships with base R (in the datasets package). It contains four measurements per flower, which fall naturally into two blocks (sepal measurements and petal measurements), plus a species label. MFA lets us examine the relationships among the variables and how much each block contributes to the overall data structure.
Step 1. Installing and Loading Required Packages
First, make sure the FactoMineR and factoextra packages are installed and loaded:
install.packages("FactoMineR")
library(FactoMineR)
install.packages("factoextra")
library(factoextra)
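Calling install.packages() unconditionally reinstalls the packages every time the script runs. A minimal sketch of installing only what is missing is shown here. It assumes the CRAN cloud mirror is reachable, and the names pkgs, is_installed, and missing_pkgs are illustrative rather than part of either package:

```r
# Packages used in this tutorial
pkgs <- c("FactoMineR", "factoextra")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

# Install only the missing packages (the mirror URL is an assumption;
# any CRAN mirror should work)
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```

After this check, the library() calls above load the packages as usual.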
Step 2. Load the Dataset
Now we load the dataset that we will use for Multiple Factor Analysis.
data("iris")
head(iris)
Output:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
Step 3. Preparing the Data for Multiple Factor Analysis
The analysis here uses only the numeric measurements, so we drop the Species column (column 5) and then define the groups: the first two columns form the Sepal block and the next two form the Petal block.
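# Keep only the four numeric measurements (drop Species, column 5)
iris_data <- iris[, -5]
# Define the groups: 2 sepal variables followed by 2 petal variables
group_definitions <- c(2, 2)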
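Before running MFA, it can help to confirm that the prepared table matches the group definition and contains no missing values. The sketch below uses only base R. The checks are a suggestion, not something MFA requires, and na_counts is a name chosen for this example:

```r
data("iris")
iris_data <- iris[, -5]          # numeric measurements only
group_definitions <- c(2, 2)     # 2 sepal columns, then 2 petal columns

# Every column passed to a group of type "s" should be numeric
stopifnot(all(vapply(iris_data, is.numeric, logical(1))))

# The group sizes must add up to the number of columns
stopifnot(sum(group_definitions) == ncol(iris_data))

# Count missing values per column
na_counts <- colSums(is.na(iris_data))
print(na_counts)
```

MFA assigns columns to groups in order, so the variables of each block must sit next to each other in the data frame.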
Step 4. Performing Multiple Factor Analysis
Now we run the Multiple Factor Analysis and print its summary. Both groups are declared as type "s", meaning quantitative variables that are scaled to unit variance.
res_mfa <- MFA(iris_data,
               group = group_definitions,
               type = c("s", "s"),
               name.group = c("Sepal", "Petal"),
               graph = FALSE)
summary(res_mfa)
Output:
Call:
MFA(base = iris_data, group = group_definitions, type = c("s",
"s"), name.group = c("Sepal", "Petal"), graph = FALSE)
Eigenvalues
Dim.1 Dim.2 Dim.3 Dim.4
Variance 1.882 0.815 0.101 0.011
% of var. 67.006 29.026 3.579 0.389
Cumulative % of var. 67.006 96.032 99.611 100.000
Groups
Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3 ctr cos2
Sepal | 0.932 49.551 0.536 | 0.810 99.336 0.404 | 0.047 46.595 0.001 |
Petal | 0.949 50.449 0.901 | 0.005 0.664 0.000 | 0.054 53.405 0.003 |
Individuals (the 10 first)
Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3 ctr cos2
1 | -1.804 1.153 0.943 | 0.422 0.146 0.052 | -0.132 0.116 0.005 |
2 | -1.577 0.881 0.834 | -0.661 0.358 0.147 | -0.230 0.350 0.018 |
3 | -1.886 1.261 0.965 | -0.361 0.107 0.035 | 0.005 0.000 0.000 |
4 | -1.842 1.202 0.902 | -0.603 0.297 0.097 | 0.052 0.018 0.001 |
5 | -1.949 1.346 0.920 | 0.573 0.268 0.080 | -0.009 0.001 0.000 |
6 | -1.733 1.064 0.615 | 1.372 1.540 0.385 | 0.003 0.000 0.000 |
7 | -2.038 1.471 0.985 | -0.004 0.000 0.000 | 0.252 0.420 0.015 |
8 | -1.781 1.123 0.987 | 0.179 0.026 0.010 | -0.096 0.061 0.003 |
9 | -1.856 1.221 0.741 | -1.094 0.978 0.257 | 0.090 0.054 0.002 |
10 | -1.676 0.995 0.911 | -0.468 0.179 0.071 | -0.233 0.361 0.018 |
Continuous variables
Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3 ctr cos2
Sepal.Length | 0.895 38.053 0.800 | 0.391 16.752 0.153 | -0.216 41.516 0.047 |
Sepal.Width | -0.492 11.497 0.242 | 0.867 82.584 0.752 | 0.076 5.078 0.006 |
Petal.Length | 0.984 26.213 0.968 | 0.052 0.168 0.003 | 0.122 7.520 0.015 |
Petal.Width | 0.946 24.236 0.895 | 0.089 0.495 0.008 | 0.301 45.885 0.091 |
The Multiple Factor Analysis (MFA) conducted on the iris dataset reveals key insights:
- Eigenvalues: They show how much variance each dimension captures. The first dimension explains about 67% of the total variance and the second about 29%, so the first two dimensions together account for roughly 96%.
- Groups Analysis: It shows how much each group of variables (Sepal and Petal) contributes to each dimension. Both groups contribute almost equally to Dimension 1 (about 49.6% and 50.4%), while Dimension 2 is driven almost entirely by the Sepal group (about 99.3%).
- Individuals Analysis: It gives the coordinates, contributions, and cos2 of each observation on each dimension. The first ten individuals, all setosa flowers, have negative coordinates on Dimension 1 (between about -1.58 and -2.04).
- Continuous Variables: It shows the relationship between the original variables (Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width) and the extracted dimensions. Petal.Length (0.984), Petal.Width (0.946), and Sepal.Length (0.895) are strongly tied to Dimension 1, while Sepal.Width (0.867) is the main variable behind Dimension 2.
Overall, MFA helps understand the underlying structure of the data by reducing its dimensionality and highlighting the relationships between variables and observations.
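To see where these eigenvalues come from, the weighting idea behind MFA can be sketched in base R. Standardize the variables, divide each group by the square root of the first eigenvalue of its own PCA so that no group dominates, then run a global PCA on the weighted table. This is a simplified illustration for two groups of scaled quantitative variables, not FactoMineR's actual implementation. The names groups, lambda1, col_weights, and weighted are chosen for this example:

```r
data("iris")
X <- scale(iris[, 1:4])                      # center and scale each variable
groups <- list(Sepal = 1:2, Petal = 3:4)     # column indices of each block

# First eigenvalue of each group's separate PCA (on the correlation matrix)
lambda1 <- vapply(groups,
                  function(idx) eigen(cor(X[, idx]))$values[1],
                  numeric(1))

# Weight every variable by 1 / sqrt(first eigenvalue of its group)
col_weights <- rep(1 / sqrt(lambda1), times = lengths(groups))
weighted <- sweep(X, 2, col_weights, `*`)

# Global PCA: eigenvalues of the covariance matrix of the weighted table
eig <- eigen(cov(weighted))$values
print(round(eig, 3))
print(round(100 * eig / sum(eig), 1))        # percentage of variance per dimension
```

The printed eigenvalues and percentages should agree with the Eigenvalues table in the summary, up to rounding.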
Step 5. Visualizing MFA Results
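First, plot the individuals on the first two dimensions, colored by species, with a 95% ellipse drawn around each species. Setting label = "var" leaves the individual points unlabeled, which keeps the plot readable.
fviz_mfa_ind(res_mfa, label = "var", habillage = iris$Species,
             addEllipses = TRUE, ellipse.level = 0.95)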
Output:
Contributions of Quantitative Variables to Dimensions
fviz_contrib(res_mfa, choice = "quanti.var", axes = 1)
Output:
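fviz_contrib draws a bar plot of each quantitative variable's contribution (in %) to the first dimension. Consistent with the summary in Step 4, Sepal.Length contributes the most (about 38%), followed by Petal.Length (about 26%), Petal.Width (about 24%), and Sepal.Width (about 11%).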
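The same contributions can also be read directly from the result object, and fviz_contrib can be pointed at another axis or at the groups. Here is a short sketch that repeats the MFA call so it runs on its own. The object names contrib_dim1, p_dim2, and p_groups are illustrative:

```r
library(FactoMineR)
library(factoextra)

data("iris")
res_mfa <- MFA(iris[, -5], group = c(2, 2), type = c("s", "s"),
               name.group = c("Sepal", "Petal"), graph = FALSE)

# Contributions (in %) of the quantitative variables to Dimension 1,
# sorted from largest to smallest
contrib_dim1 <- sort(res_mfa$quanti.var$contrib[, "Dim.1"], decreasing = TRUE)
print(round(contrib_dim1, 2))

# Same kind of bar plot for Dimension 2, and for the groups on Dimension 1
p_dim2 <- fviz_contrib(res_mfa, choice = "quanti.var", axes = 2)
p_groups <- fviz_contrib(res_mfa, choice = "group", axes = 1)
print(p_dim2)
print(p_groups)
```

Reading the numbers from res_mfa$quanti.var$contrib is convenient when the values are needed in a report rather than a plot.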
Multiple Factor Analysis in R
Multiple factor analysis (MFA) is designed to handle data sets in which the variables are organized into distinct groups (blocks). In this article, we discuss what multiple factor analysis is and how to implement it in the R programming language.