Mathematical Formulation of MDS
The basic idea of MDS is to find a configuration of points in a low-dimensional space that minimizes the mismatch between the dissimilarities (D) in the high-dimensional space and the corresponding distances (d) in the low-dimensional space.
Mathematically, this can be expressed as follows. Given a dissimilarity or distance matrix D, where Dij is the dissimilarity (e.g. Euclidean distance) between points i and j in the high-dimensional space, MDS attempts to find coordinates xi and xj in a low-dimensional space such that the Euclidean distance between xi and xj (denoted by dij) is as close as possible to Dij.
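The objective function that MDS minimizes, usually called the stress, can be defined in its common weighted form as follows:
$$\text{Stress}(x_1, \dots, x_n) = \sum_{i<j} W_{ij} \, (d_{ij} - D_{ij})^2$$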
Where:
- Wij is a weight that can be used to emphasize or de-emphasize specific distances.
- dij is the Euclidean distance between data points i and j in the lower-dimensional space.
- Dij is the dissimilarity (distance) between data points i and j in the high-dimensional space.
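To make these quantities concrete, here is a minimal pure-Python sketch that computes the stress of a candidate embedding. It assumes the objective takes the common weighted raw-stress form, the sum over pairs i < j of Wij * (dij - Dij)^2. The helper names `pairwise_distances` and `raw_stress` and the toy points are illustrative and are not part of scikit-learn.

```python
import math


def pairwise_distances(points):
    """Return the full matrix of Euclidean distances between the given points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def raw_stress(D, X_low, W=None):
    """Weighted raw stress: sum over i < j of W_ij * (d_ij - D_ij)^2.

    D     -- dissimilarity matrix in the high-dimensional space
    X_low -- candidate coordinates in the low-dimensional space
    W     -- optional weight matrix; every pair gets weight 1 if omitted
    """
    n = len(D)
    d = pairwise_distances(X_low)  # d_ij in the low-dimensional space
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 if W is None else W[i][j]
            total += w * (d[i][j] - D[i][j]) ** 2
    return total


# Toy data (assumed for illustration): three 3D points lying in the z = 0 plane.
X_high = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
D = pairwise_distances(X_high)

# Dropping the z coordinate keeps every pairwise distance intact.
X_good = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
# Moving the third point onto the first distorts two of the three distances.
X_bad = [(0.0, 0.0), (3.0, 0.0), (0.0, 0.0)]

stress_good = raw_stress(D, X_good)
stress_bad = raw_stress(D, X_bad)
print(stress_good, stress_bad)
```

The first embedding reproduces all three distances, so its stress is zero. The second turns the distances 4 and 5 into 0 and 3, giving a stress of about 16 + 4 = 20. An MDS solver searches for the coordinates that make this value as small as possible.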
Sklearn | Multi-dimensional Scaling (MDS) Python Implementation from Scratch
Scikit-learn (sklearn) is a free, open-source machine-learning package for Python. It is one of Python’s most popular machine-learning libraries and is used extensively in both industry and academia. Scikit-learn includes a wide range of machine-learning methods, including supervised learning (classification, regression), unsupervised learning (clustering, dimensionality reduction), model selection and evaluation, data preparation, and feature engineering. In this article, we will discuss multi-dimensional scaling (MDS), an unsupervised learning technique that is commonly used to visualize the relationships between data points in a high-dimensional space by mapping them to a lower-dimensional space, such as 2D or 3D, while preserving the pairwise distances between the data points as much as possible.
Table of Contents
- Multi-dimensional Scaling (MDS)
- Why is Multi-dimensional Scaling (MDS) important?
- Application of MDS
- Advanced Multi-dimensional Scaling (MDS)
- Limitations of MDS
- Mathematical Formulation of MDS
- Why MDS is better than other dimensionality reduction methods
- MDS on Digits Dataset
- MDS on Make_blobs dataset