Preparing Your Data

For our demonstration, let’s consider a hypothetical gene expression dataset with five genes measured across five samples. Heatmaps are most informative when the data contain clear patterns or relationships. Replace this example data with your own dataset as needed.

R

# Example gene expression data
gene_data <- data.frame(
  Gene = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5"),
  Sample1 = c(2.3, 1.8, 3.2, 0.9, 2.5),
  Sample2 = c(2.1, 1.7, 3.0, 1.0, 2.4),
  Sample3 = c(2.2, 1.9, 3.1, 0.8, 2.6),
  Sample4 = c(2.4, 1.6, 3.3, 0.7, 2.3),
  Sample5 = c(2.0, 1.5, 3.4, 0.6, 2.7)
)
 
# Print the example gene expression data
print(gene_data)


Output:

   Gene Sample1 Sample2 Sample3 Sample4 Sample5
1 Gene1     2.3     2.1     2.2     2.4     2.0
2 Gene2     1.8     1.7     1.9     1.6     1.5
3 Gene3     3.2     3.0     3.1     3.3     3.4
4 Gene4     0.9     1.0     0.8     0.7     0.6
5 Gene5     2.5     2.4     2.6     2.3     2.7
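
Before plotting, it can help to confirm which columns are numeric and that no values are missing, since clustering works on numbers only. The following is a minimal base R sketch, not part of the original walkthrough; it rebuilds the example data so it runs on its own, and the names `numeric_cols` and `expr_matrix` are illustrative assumptions.

```r
# Rebuild the example data so this snippet is self-contained
gene_data <- data.frame(
  Gene = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5"),
  Sample1 = c(2.3, 1.8, 3.2, 0.9, 2.5),
  Sample2 = c(2.1, 1.7, 3.0, 1.0, 2.4),
  Sample3 = c(2.2, 1.9, 3.1, 0.8, 2.6),
  Sample4 = c(2.4, 1.6, 3.3, 0.7, 2.3),
  Sample5 = c(2.0, 1.5, 3.4, 0.6, 2.7)
)

# Flag the numeric columns (the Gene label column is not numeric)
numeric_cols <- sapply(gene_data, is.numeric)

# Build a numeric matrix and keep the gene names as row names
expr_matrix <- as.matrix(gene_data[, numeric_cols])
rownames(expr_matrix) <- as.character(gene_data$Gene)

# Basic sanity checks: dimensions and missing values
print(dim(expr_matrix))
print(anyNA(expr_matrix))
```

Keeping the gene names as row names, rather than discarding them, means they can later appear as row labels on a heatmap.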

Creating Heatmaps with Hierarchical Clustering

Before diving into the main topic, let’s briefly review heatmaps and hierarchical clustering.

Heatmaps

Heatmaps are a powerful data visualization tool that can reveal patterns, relationships, and similarities within large datasets. When combined with hierarchical clustering, they become even more insightful. In this brief article, we’ll explore how to create captivating heatmaps with hierarchical clustering in R programming....

Understanding Hierarchical Clustering

Hierarchical Clustering is a powerful data analysis technique used to uncover patterns, relationships, and structures within a dataset. It belongs to the family of unsupervised machine learning algorithms and is particularly useful in exploratory data analysis and data visualization. Hierarchical Clustering is often combined with heatmap visualizations, as demonstrated in this article, to provide a comprehensive understanding of complex datasets....
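
To make the idea concrete, here is a small sketch of agglomerative hierarchical clustering using only base R (`dist`, `hclust`, `cutree`). It reuses the values of the example gene expression data; the choice of complete linkage and of two clusters is an assumption made for illustration, not something the text prescribes.

```r
# Rows are the items to cluster (genes), columns are features (samples)
expr <- matrix(
  c(2.3, 2.1, 2.2, 2.4, 2.0,
    1.8, 1.7, 1.9, 1.6, 1.5,
    3.2, 3.0, 3.1, 3.3, 3.4,
    0.9, 1.0, 0.8, 0.7, 0.6,
    2.5, 2.4, 2.6, 2.3, 2.7),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("Gene", 1:5), paste0("Sample", 1:5))
)

# Step 1: pairwise distances between rows
d <- dist(expr, method = "euclidean")

# Step 2: repeatedly merge the two closest clusters (complete linkage assumed)
hc <- hclust(d, method = "complete")

# Each row of hc$merge records one merge; hc$height is the distance at that merge
print(hc$merge)
print(hc$height)

# Step 3: cut the tree into a chosen number of groups (k = 2 assumed)
groups <- cutree(hc, k = 2)
print(groups)
```

Calling `plot(hc)` draws the resulting dendrogram, which is the same tree a clustered heatmap displays along its margins.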

Getting Started

Before diving into the code, ensure you have the necessary packages installed. We’ll use the ‘pheatmap’ package for heatmap visualization and ‘dendextend’ for dendrogram customization. If you haven’t already installed them, run the following commands:...
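
The original installation commands are not reproduced in this excerpt. One common pattern, shown here as a sketch rather than the author's exact commands, is to install only the packages that are missing. The helper name `missing_packages` is hypothetical, and the `interactive()` guard is an assumption added so that nothing is installed when the script runs unattended.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages named in the text
needed <- missing_packages(c("pheatmap", "dendextend"))

# Install from CRAN only when something is missing and a user is at the console
if (interactive() && length(needed) > 0) {
  install.packages(needed)
}
```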

Load the required packages:

R
library(pheatmap)
library(dendextend)
...

Removing Non-Numeric Labels

R
# Remove the non-numeric column (Gene names) temporarily
gene_names <- gene_data$Gene
gene_data <- gene_data[, -1]
print(gene_data)
...

Calculating Distances and Performing Hierarchical Clustering

To create meaningful heatmaps, we first calculate distances between data points using various methods. In this case, we’ll use Euclidean, Manhattan, and Pearson correlation distances....

Generating Distinct Heatmaps:

...

Euclidean Distance Heatmap:

...

Manhattan Distance Heatmap:

...

Pearson Correlation Distance Heatmap:

...
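
The three distance measures named in this article (Euclidean, Manhattan, and Pearson correlation) can all be computed in base R. The sketch below is an illustration under stated assumptions, not the article's own code: it rebuilds the example values as a matrix, defines Pearson correlation distance as one minus the correlation between gene profiles, and uses complete linkage. The variable names are illustrative.

```r
# Example expression values: genes in rows, samples in columns
expr <- matrix(
  c(2.3, 2.1, 2.2, 2.4, 2.0,
    1.8, 1.7, 1.9, 1.6, 1.5,
    3.2, 3.0, 3.1, 3.3, 3.4,
    0.9, 1.0, 0.8, 0.7, 0.6,
    2.5, 2.4, 2.6, 2.3, 2.7),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("Gene", 1:5), paste0("Sample", 1:5))
)

# Euclidean and Manhattan distances between rows (genes)
dist_euclidean <- dist(expr, method = "euclidean")
dist_manhattan <- dist(expr, method = "manhattan")

# Pearson correlation distance (assumed definition: 1 - r).
# cor() correlates columns, so transpose to compare gene profiles.
dist_pearson <- as.dist(1 - cor(t(expr), method = "pearson"))

# Hierarchical clustering on each distance matrix (complete linkage assumed)
hc_euclidean <- hclust(dist_euclidean, method = "complete")
hc_manhattan <- hclust(dist_manhattan, method = "complete")
hc_pearson   <- hclust(dist_pearson, method = "complete")

# Compare the order in which genes appear along each dendrogram
print(hc_euclidean$labels[hc_euclidean$order])
print(hc_manhattan$labels[hc_manhattan$order])
print(hc_pearson$labels[hc_pearson$order])
```

With the pheatmap package loaded, a precomputed distance object such as `dist_pearson` can be passed to `pheatmap()` through its `clustering_distance_rows` argument, so each distance measure yields its own heatmap.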

Conclusion:

...