What is Clustering ?

The task of grouping data points based on their similarity with each other is called Clustering or Cluster Analysis. This method is defined under the branch of Unsupervised Learning, which aims at gaining insights from unlabelled data points, that is, unlike supervised learning we don’t have a target variable.

Clustering aims at forming groups of homogeneous data points from a heterogeneous dataset. It evaluates the similarity based on a metric like Euclidean distance, Cosine similarity, Manhattan distance, etc. and then group the points with highest similarity score together.

For Example, In the graph given below, we can clearly see that there are 3 circular clusters forming on the basis of distance.

Now it is not necessary that the clusters formed must be circular in shape. The shape of clusters can be arbitrary. There are many algortihms that work well with detecting arbitrary shaped clusters.

For example, In the below given graph we can see that the clusters formed are not circular in shape.

Clustering in Machine Learning

In real world, not every data we work upon has a target variable. This kind of data cannot be analyzed using supervised learning algorithms. We need the help of unsupervised algorithms. One of the most popular type of analysis under unsupervised learning is Cluster analysis. When the goal is to group similar data points in a dataset, then we use cluster analysis. In practical situations, we can use cluster analysis for customer segmentation for targeted advertisements, or in medical imaging to find unknown or new infected areas and many more use cases that we will discuss further in this article.

Table of Content

What is Clustering ?
Types of Clustering
Uses of Clustering
Types of Clustering Algorithms
Applications of Clustering in different fields:
Frequently Asked Questions (FAQs) on Clustering

Similar Reads

Marketing: It can be used to characterize & discover customer segments for marketing purposes. Biology: It can be used for classification among different species of plants and animals. Libraries: It is used in clustering different books on the basis of topics and information. Insurance: It is used to acknowledge the customers, their policies and identifying the frauds. City Planning: It is used to make groups of houses and to study their values based on their geographical locations and other factors present. Earthquake studies: By learning the earthquake-affected areas we can determine the dangerous zones. Image Processing: Clustering can be used to group similar images together, classify images based on content, and identify patterns in image data. Genetics: Clustering is used to group genes that have similar expression patterns and identify gene networks that work together in biological processes. Finance: Clustering is used to identify market segments based on customer behavior, identify patterns in stock market data, and analyze risk in investment portfolios. Customer Service: Clustering is used to group customer inquiries and complaints into categories, identify common issues, and develop targeted solutions. Manufacturing: Clustering is used to group similar products together, optimize production processes, and identify defects in manufacturing processes. Medical diagnosis: Clustering is used to group patients with similar symptoms or diseases, which helps in making accurate diagnoses and identifying effective treatments. Fraud detection: Clustering is used to identify suspicious patterns or anomalies in financial transactions, which can help in detecting fraud or other financial crimes. Traffic analysis: Clustering is used to group similar patterns of traffic data, such as peak hours, routes, and speeds, which can help in improving transportation planning and infrastructure. Social network analysis: Clustering is used to identify communities or groups within social networks, which can help in understanding social behavior, influence, and trends. Cybersecurity: Clustering is used to group similar patterns of network traffic or system behavior, which can help in detecting and preventing cyberattacks. Climate analysis: Clustering is used to group similar patterns of climate data, such as temperature, precipitation, and wind, which can help in understanding climate change and its impact on the environment. Sports analysis: Clustering is used to group similar patterns of player or team performance data, which can help in analyzing player or team strengths and weaknesses and making strategic decisions. Crime analysis: Clustering is used to group similar patterns of crime data, such as location, time, and type, which can help in identifying crime hotspots, predicting future crime trends, and improving crime prevention strategies....

What is Clustering ?

Clustering in Machine Learning

Categories

Contact US