
Dimensionality Reduction in Machine Learning

Learn how dimensionality reduction simplifies high-dimensional data while preserving important patterns. Explore techniques like PCA and understand how reducing features improves model performance, visualization, and computational efficiency.

Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026


Dimensionality Reduction

Dimensionality reduction is a type of unsupervised learning.

The idea is simple:

Take high-dimensional data and represent it using fewer dimensions while preserving as much important structure as possible.

  • This is an approximation.
  • We lose some information because we are projecting.
  • But if most of the data lies near a lower-dimensional structure, the loss is small.

Usually we start with:

$$x^{(i)} \in \mathbb{R}^n$$

And reduce to:

$$z^{(i)} \in \mathbb{R}^k \quad \text{where } k < n$$

Example:

  • 1000D → 100D
  • 300D → 50D

Dimensionality reduction finds:

  • A lower-dimensional subspace (line, plane, etc.)
  • That captures most of the variance in the data

Then it projects the data onto that subspace.
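
To make this concrete, here is a minimal sketch using scikit-learn's PCA (one standard way to find such a subspace); the synthetic data and the choice k = 2 are purely illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data: 200 examples in R^10 that mostly vary along 2 directions
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))                   # hidden 2D structure
W = rng.normal(size=(2, 10))                         # embeds it into R^10
X = latent @ W + 0.05 * rng.normal(size=(200, 10))   # small off-subspace noise

# Find the k-dimensional subspace capturing the most variance and project onto it
k = 2
pca = PCA(n_components=k)
Z = pca.fit_transform(X)                     # shape (200, 2): the z^(i)

print(Z.shape)                               # (200, 2)
print(pca.explained_variance_ratio_.sum())   # close to 1.0: little information is lost
```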

Advantages

There are two main reasons to reduce dimensionality:

1. Data Compression

  • Store fewer numbers per example
  • Reduce memory and disk usage (see the sketch below)

2. Faster Learning

  • Many algorithms scale with the number of features
  • Fewer features → faster training and prediction
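
To put rough numbers on the compression benefit, here is a back-of-the-envelope sketch; the sizes (m = 10,000 examples, 1000D → 100D) and float64 storage are assumptions for illustration:

```python
import numpy as np

# Assumed sizes: m examples, reduced from n = 1000 to k = 100 features
m, n, k = 10_000, 1000, 100
X = np.zeros((m, n), dtype=np.float64)   # original representation
Z = np.zeros((m, k), dtype=np.float64)   # reduced representation

print(X.nbytes / 1e6, "MB")   # 80.0 MB
print(Z.nbytes / 1e6, "MB")   # 8.0 MB -- a 10x saving
```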

What Projection Means

Suppose the original examples are:

$$x^{(i)} \in \mathbb{R}^2$$

After projection onto a line:

$$z^{(i)} \in \mathbb{R}$$

So instead of storing:

$$x^{(i)} = \begin{bmatrix} x_1^{(i)} \\ x_2^{(i)} \end{bmatrix}$$

We store:

$$z^{(i)}$$

One number instead of two.
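
Here is a minimal NumPy sketch of projecting a single 2D point onto a line through the origin; the point and the direction u are arbitrary illustrative choices:

```python
import numpy as np

x = np.array([3.0, 4.0])               # one example in R^2

# Unit vector defining the line we project onto (illustrative choice)
u = np.array([1.0, 1.0]) / np.sqrt(2)

z = u @ x                              # z in R: one number instead of two
x_approx = z * u                       # the point's projection, back in R^2

print(z)          # 4.9497...
print(x_approx)   # [3.5 3.5] -- the approximation we keep
```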


Example 1: Redundant Features (2D → 1D)

Suppose:

  • $x_1$ = length in centimeters
  • $x_2$ = length in inches

These features are almost perfectly correlated: one is just a fixed multiple of the other (1 inch = 2.54 cm), up to measurement noise.

Instead of storing:

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

We can project the data onto a line and represent each example with a single number:

$$z_1$$

So:

  • Original representation: 2 numbers per example
  • Reduced representation: 1 number per example

We approximate the original data by projecting onto a line that captures the main direction of variation.
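
A small sketch of this cm/inches scenario with synthetic data (the sample size and noise level are made up); PCA recovers the single direction of variation and reports that almost no variance is lost:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
cm = rng.uniform(100, 200, size=300)            # x1: length in centimeters
inches = cm / 2.54 + rng.normal(0, 0.1, 300)    # x2: same length in inches, plus noise

X = np.column_stack([cm, inches])               # 2 numbers per example

pca = PCA(n_components=1)
z = pca.fit_transform(X)                        # 1 number per example

print(z.shape)                          # (300, 1)
print(pca.explained_variance_ratio_)    # ~[0.9999]: the line captures almost everything
```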


Example 2: 3D → 2D

Now suppose:

$$x^{(i)} \in \mathbb{R}^3$$

But the data roughly lies on a plane.

Instead of keeping 3 coordinates:

$$x^{(i)} = \begin{bmatrix} x_1^{(i)} \\ x_2^{(i)} \\ x_3^{(i)} \end{bmatrix}$$

We project onto a 2D plane and represent each example as:

$$z^{(i)} = \begin{bmatrix} z_1^{(i)} \\ z_2^{(i)} \end{bmatrix} \in \mathbb{R}^2$$

Now we only need two numbers instead of three.
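
The same idea in code, with synthetic 3D data generated near a plane (the plane's coefficients and the noise level are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
m = 500

# x3 is (almost) a linear combination of x1 and x2,
# so the data lies close to a 2D plane inside R^3
x1 = rng.normal(size=m)
x2 = rng.normal(size=m)
x3 = 0.5 * x1 - 2.0 * x2 + rng.normal(0, 0.05, size=m)
X = np.column_stack([x1, x2, x3])

pca = PCA(n_components=2)
Z = pca.fit_transform(X)                     # each example now in R^2

print(Z.shape)                               # (500, 2)
print(pca.explained_variance_ratio_.sum())   # ~1.0: the plane captures nearly all variance
```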


Summary

Dimensionality reduction:

  • Removes redundancy
  • Compresses data
  • Speeds up learning
  • Represents high-dimensional data using fewer variables

In the next step, we usually use Principal Component Analysis (PCA) to compute the optimal projection direction mathematically.

Example: Countries Dataset

Suppose we collect a large dataset containing statistics about countries around the world.

Each country may have 50 features, such as:

  • $x_1$ = GDP (Gross Domestic Product)
  • $x_2$ = GDP per capita
  • $x_3$ = Human Development Index
  • $x_4$ = Life expectancy
  • $x_5, x_6, \dots$

Each country is represented as:

$$x^{(i)} \in \mathbb{R}^{50}$$

So every country corresponds to a 50-dimensional feature vector.

Reducing 50D to 2D

Using dimensionality reduction, we can transform each country:

$$x^{(i)} \in \mathbb{R}^{50} \quad \longrightarrow \quad z^{(i)} \in \mathbb{R}^{2}$$

Now each country is represented by only two numbers:

$$z^{(i)} = \begin{bmatrix} z_1^{(i)} \\ z_2^{(i)} \end{bmatrix}$$

This allows us to plot every country as a point in 2D space.
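
A sketch of how such a plot could be produced; random stand-in data replaces the real country statistics, and standardizing first is a common choice because features like GDP and life expectancy live on very different scales:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in for the real dataset: 150 "countries" x 50 features
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 50))

X_std = StandardScaler().fit_transform(X)      # put all features on a comparable scale
Z = PCA(n_components=2).fit_transform(X_std)   # 50D -> 2D

plt.scatter(Z[:, 0], Z[:, 1])
plt.xlabel("$z_1$")
plt.ylabel("$z_2$")
plt.title("Countries projected to 2D")
plt.show()
```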
