Performing Hierarchical Clustering on a Dataset

Hierarchical clustering is applied to the built-in mtcars dataset using hclust(), which is part of the stats package that ships with every R installation.

R

# Finding distance matrix
distance_mat <- dist(mtcars, method = 'euclidean')
distance_mat
 
# Fitting hierarchical clustering model
# to the distance matrix (average linkage)
set.seed(240)  # Kept from the original; hclust() itself is deterministic
Hierar_cl <- hclust(distance_mat, method = "average")
Hierar_cl
 
# Plotting dendrogram
plot(Hierar_cl)
 
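# Choosing no. of clusters
# Marking a candidate cut height on the dendrogram
abline(h = 110, col = "green")
 
# Cutting tree by no. of clusters
fit <- cutree(Hierar_cl, k = 3)
fit
 
# Counting observations per cluster
table(fit)
# Drawing a rectangle around each of the 3 clusters
rect.hclust(Hierar_cl, k = 3, border = "green")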

Output:

Distance matrix:

The output lists the pairwise Euclidean distances between the 32 cars in mtcars.
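To make the distance calculation concrete, here is a minimal sketch that recomputes one entry of the matrix by hand. The two cars compared and the variable names (a, b, manual, from_dist) are chosen only for illustration and are not part of the original code.

```r
# Two rows of mtcars as plain numeric vectors (rows chosen for illustration)
a <- unlist(mtcars["Mazda RX4", ])
b <- unlist(mtcars["Datsun 710", ])

# Euclidean distance: square root of the sum of squared differences
manual <- sqrt(sum((a - b)^2))

# The same pair looked up in the matrix produced by dist()
d <- as.matrix(dist(mtcars, method = "euclidean"))
from_dist <- d["Mazda RX4", "Datsun 710"]

# Should be TRUE if the manual calculation matches dist()
all.equal(manual, from_dist)
```

Because the columns of mtcars are on different scales, variables with large values such as disp and hp dominate these distances. Standardizing with scale() before calling dist() is a common option, though the code in this article works on the raw values.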
Model Hierar_cl:

The printed model reports average linkage as the cluster method, Euclidean as the distance, and 32 objects.
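The same information can be read directly from the fitted object. The following sketch refits the model under the hypothetical name hc so that it runs on its own; the components accessed are standard fields of an hclust object.

```r
# Refit the model so this snippet is self-contained (hc is an illustrative name)
hc <- hclust(dist(mtcars, method = "euclidean"), method = "average")

hc$method         # linkage method used for clustering
hc$dist.method    # distance measure recorded from dist()
length(hc$order)  # number of objects that were clustered
nrow(hc$merge)    # number of merge steps (one fewer than the objects)
```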
Dendrogram plot:

The dendrogram shows the observations (car models) as leaves along the x-axis and, on the y-axis, the height, i.e. the distance at which clusters are merged.
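What the two axes display can also be inspected without plotting. This is a small sketch, again using the illustrative name hc for the fitted model.

```r
# Refit the model so this snippet is self-contained (hc is an illustrative name)
hc <- hclust(dist(mtcars), method = "average")

# x-axis: car names in the left-to-right order of the dendrogram leaves
leaf_order <- hc$labels[hc$order]
head(leaf_order)

# y-axis: heights at which the merges happen
range(hc$height)
```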
Cut tree:

The tree is cut into k = 3 clusters: fit holds the cluster label of each car, and table(fit) counts the number of cars in each cluster.
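cutree() can also cut at a given height instead of a given number of clusters. The sketch below shows both forms side by side; the names hc, by_k and by_h are illustrative, and the height of 110 is simply the value used for the green line in the code above. How many clusters a height cut yields depends on the merge heights of the tree.

```r
# Refit the model so this snippet is self-contained (hc is an illustrative name)
hc <- hclust(dist(mtcars), method = "average")

# Cut by number of clusters
by_k <- cutree(hc, k = 3)
table(by_k)

# Cut by height: every merge above h = 110 is undone
by_h <- cutree(hc, h = 110)
table(by_h)

# Cars assigned to the first cluster of the k = 3 solution
names(by_k)[by_k == 1]
```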
Plotting dendrogram after cutting:

The plot shows the dendrogram after cutting. The green rectangles drawn by rect.hclust() enclose the three clusters obtained with k = 3.

Hierarchical Clustering in R Programming

Hierarchical clustering in the R programming language is an unsupervised non-linear algorithm in which clusters are created such that they form a hierarchy (or a pre-determined ordering). For example, consider a family of up to three generations: a grandfather and grandmother have children, who in turn become the fathers and mothers of their own children. All of them are grouped into the same family, i.e. they form a hierarchy.
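The hierarchy can be seen in code by cutting the same tree at two levels and checking that the finer clusters sit inside the coarser ones. This is a minimal sketch on the built-in mtcars data with average linkage; the names hc, cuts and nesting are illustrative.

```r
# Fit one tree, then cut it into 2 and into 3 clusters
hc <- hclust(dist(mtcars), method = "average")
cuts <- cutree(hc, k = 2:3)   # matrix with one column per value of k

# Cross-tabulate the two solutions: rows are the 2-cluster labels,
# columns are the 3-cluster labels
nesting <- table(two = cuts[, "2"], three = cuts[, "3"])
nesting

# Each of the 3 clusters falls entirely inside one of the 2 clusters,
# so every column has exactly one non-zero entry
colSums(nesting > 0)
```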
