
Principal Component Analysis (PCA) Explained

Learn how Principal Component Analysis (PCA) reduces the dimensionality of datasets while preserving important information. Understand the intuition, mathematics, and practical uses of PCA in machine learning and data science.

Hitesh Sahu
Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026


Revision Cheat Sheet

Principal Component Analysis (PCA)

The most widely used algorithm for dimensionality reduction.

Intuition in 2D → 1D

Suppose we have:

$x^{(i)} \in \mathbb{R}^2$

and we want to reduce the data from 2 dimensions to 1 dimension.

That means:

  • We want to find a line
  • Onto which we project all data points

The key question:

Which line should we choose?

Good Projection Direction

A good projection line is one where:

  • When we project each point onto the line
  • The distance between the original point and its projection is small

These distances are called:

Projection errors

The orthogonal distance from a point to the line.

PCA chooses the line that minimizes the sum of squared projection errors.
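This objective can be computed directly. Below is a minimal NumPy sketch (with hypothetical toy data) that measures the sum of squared projection errors for two candidate line directions and confirms that the direction aligned with the data gives the smaller error:

```python
import numpy as np

# Hypothetical toy 2D data, lying roughly along the direction (1, 1)
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8]])
X = X - X.mean(axis=0)  # mean-normalize first

def sum_squared_projection_error(X, u):
    """Sum of squared orthogonal distances from each row of X
    to the line through the origin along direction u."""
    u = u / np.linalg.norm(u)       # make u a unit vector
    proj = np.outer(X @ u, u)       # projection of each point onto the line
    return np.sum((X - proj) ** 2)  # squared orthogonal residuals

good = sum_squared_projection_error(X, np.array([1.0, 1.0]))   # line close to the data
bad  = sum_squared_projection_error(X, np.array([1.0, -1.0]))  # orthogonal direction
print(good < bad)  # the better-aligned line has a much smaller error
```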

General Case: nD → kD

Now suppose:

$x^{(i)} \in \mathbb{R}^n$

and we want to reduce to:

$z^{(i)} \in \mathbb{R}^k \quad \text{where } k < n$

Instead of finding one vector, we find:

$u^{(1)}, u^{(2)}, \dots, u^{(k)}$

These vectors:

  • Define a k-dimensional surface
  • Span a k-dimensional linear subspace

We then project each point onto that subspace.

3D → 2D Example

If:

$x^{(i)} \in \mathbb{R}^3$

and we reduce to 2D:

  • We find two vectors: $u^{(1)}, u^{(2)} \in \mathbb{R}^3$
  • These define a plane.
  • Each point is projected onto that plane.

The projection error is:

$$\| x^{(i)} - \hat{x}^{(i)} \|^2$$

where:

  • $\hat{x}^{(i)}$ is the projected version of $x^{(i)}$

PCA minimizes:

$$\sum_{i=1}^{m} \| x^{(i)} - \hat{x}^{(i)} \|^2$$
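The projection onto a plane, and the resulting error, can be sketched in a few lines of NumPy. Here the plane and the data points are hypothetical: points near the x-y plane in $\mathbb{R}^3$, with the plane's basis vectors as the columns of a matrix $U$:

```python
import numpy as np

# Hypothetical points in R^3 that lie close to the plane z = 0
X = np.array([[ 1.0, 0.0,  0.10],
              [ 0.0, 2.0, -0.10],
              [ 1.5, 1.0,  0.05],
              [-1.0, 0.5,  0.00]])
X = X - X.mean(axis=0)  # mean-normalize

# Basis for the projection plane: columns are u^(1), u^(2)
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

Z = X @ U        # 2D representation z^(i) of each point
X_hat = Z @ U.T  # projection x̂^(i), back in R^3

error = np.sum((X - X_hat) ** 2)  # sum of squared projection errors
```

Since the points are nearly flat in $z$, the error stays small; a plane far from the data would give a much larger value.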

2D → 1D Example

We want to find a vector:

$u^{(1)} \in \mathbb{R}^2$

that defines the direction of the line.

PCA solves:

$$\min_{u^{(1)}} \sum_{i=1}^{m} \left\| x^{(i)} - \text{projection of } x^{(i)} \text{ onto } u^{(1)} \right\|^2$$

So PCA finds the direction that minimizes the total squared orthogonal distance.

Important:

  • If PCA returns $u^{(1)}$ or $-u^{(1)}$, it does not matter.
  • Both define the same line.
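One standard way to find $u^{(1)}$ is the SVD of the centered data: the first right singular vector is the first principal direction. The sketch below (synthetic data, assuming this SVD-based approach) also checks the sign-ambiguity point: $u^{(1)}$ and $-u^{(1)}$ give the same projection error because they define the same line:

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.normal(size=50)
X = np.column_stack([t, 0.5 * t + 0.05 * rng.normal(size=50)])  # points near a line
X = X - X.mean(axis=0)  # mean-normalize

# First right singular vector of the centered data = first principal direction
_, _, Vt = np.linalg.svd(X, full_matrices=False)
u1 = Vt[0]

def err(X, u):
    """Sum of squared orthogonal distances to the line along unit vector u."""
    proj = np.outer(X @ u, u)
    return np.sum((X - proj) ** 2)

print(np.isclose(err(X, u1), err(X, -u1)))  # True: u1 and -u1 define the same line
```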

Preprocessing Step

Before applying PCA, it is standard to:

  1. Perform mean normalization
  2. Perform feature scaling

So that:

  • Each feature has zero mean
  • Features have comparable ranges

This prevents one feature from dominating purely due to scale.
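Both preprocessing steps are a couple of lines in NumPy. The feature values below are hypothetical, chosen so that one feature dominates purely by scale; standardizing removes that imbalance:

```python
import numpy as np

# Hypothetical data: feature 1 on a large scale, feature 2 on a tiny one
X = np.array([[180.0, 0.002],
              [160.0, 0.004],
              [170.0, 0.003]])

mu = X.mean(axis=0)       # mean normalization: subtract the per-feature mean
sigma = X.std(axis=0)     # feature scaling: divide by the per-feature std
X_norm = (X - mu) / sigma

# Each column of X_norm now has zero mean and unit variance
```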


PCA vs Linear Regression (Very Important)

PCA is NOT linear regression.

Linear Regression:

  • Predicts a special variable $y$
  • Minimizes vertical squared errors
  • Error is measured in the y-direction only

PCA:

  • Has no special target variable
  • All features $x_1, x_2, \dots, x_n$ are treated equally
  • Minimizes orthogonal (shortest) distance to a line/plane

Linear regression minimizes:

vertical distance.

PCA minimizes:

orthogonal distance.

These are completely different objectives.
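The difference shows up numerically. In this synthetic sketch, $y \approx x$ with noise added only to $y$: the least-squares slope (vertical errors) stays near 1, while the first principal direction (orthogonal errors) tilts toward the noisy axis, so the two lines genuinely disagree:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = x + rng.normal(scale=0.8, size=200)  # noisy linear relation, noise in y only
X = np.column_stack([x, y])
X = X - X.mean(axis=0)

# Linear regression: minimize vertical errors -> slope = cov(x, y) / var(x)
slope_lr = (X[:, 0] @ X[:, 1]) / (X[:, 0] @ X[:, 0])

# PCA: minimize orthogonal errors -> slope of the first principal direction
_, _, Vt = np.linalg.svd(X, full_matrices=False)
u1 = Vt[0]
slope_pca = u1[1] / u1[0]

# The two objectives give different lines; here PCA's line comes out steeper
print(slope_lr, slope_pca)
```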


Final Summary

PCA:

  • Finds a lower-dimensional subspace
  • Projects data onto that subspace
  • Minimizes squared orthogonal projection error
  • Treats all features symmetrically
  • Is not a predictive model

Formally, PCA solves:

$$\min \sum_{i=1}^{m} \| x^{(i)} - \hat{x}^{(i)} \|^2$$

where $\hat{x}^{(i)}$ is the projection of $x^{(i)}$ onto a k-dimensional subspace.
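Putting the whole cheat sheet together, here is a minimal end-to-end PCA sketch via the SVD (one common way to solve the objective above; the function name and test data are illustrative): center, take the top-$k$ right singular vectors, project, reconstruct, and measure the error.

```python
import numpy as np

def pca(X, k):
    """Minimal PCA sketch: returns the k-dim codes Z, the reconstructions
    X_hat, and the total squared projection error."""
    mu = X.mean(axis=0)
    Xc = X - mu                             # mean normalization
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:k].T                            # columns are u^(1), ..., u^(k)
    Z = Xc @ U                              # codes z^(i) in R^k
    X_hat = Z @ U.T + mu                    # projections x̂^(i) in R^n
    error = np.sum((X - X_hat) ** 2)        # sum of squared projection errors
    return Z, X_hat, error

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))  # synthetic 5D data
Z, X_hat, e2 = pca(X, 2)
_, _, e4 = pca(X, 4)
print(e4 <= e2)  # keeping more components never increases the error
```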

