Normal Equation in Linear Regression

Detailed explanation of the Normal Equation for linear regression, including matrix formulation, closed-form solution, comparison with gradient descent, and practical considerations for implementation.

Hitesh Sahu
Written by Hitesh Sahu, a passionate developer and blogger.

Thu Feb 19 2026

Normal Equation (Closed-Form Solution)

Instead of running many iterations of gradient descent, the normal equation computes θ in a single step.

  • θ is calculated directly at the point where the cost function is minimal, found with calculus (setting the gradient to zero) rather than by iterative optimization:
$$\theta = (X^T X)^{-1} X^T y$$
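
As a sketch of where this formula comes from, assume the usual squared-error cost for linear regression with m training examples. The post does not spell out the cost function, so the exact form of J below is an assumption. Setting its gradient to zero gives the closed-form solution:

$$
J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)
$$

$$
\nabla_\theta J(\theta) = \frac{1}{m} X^T (X\theta - y) = 0
\;\Longrightarrow\; X^T X \theta = X^T y
\;\Longrightarrow\; \theta = (X^T X)^{-1} X^T y
$$

The last step requires $X^T X$ to be invertible.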

Advantages

  • No learning rate required
  • Direct computation

Limitations

  • Computationally expensive for datasets with a very large number of features n
  • Inverting the n × n matrix $X^T X$ is costly, roughly $O(n^3)$

Steps:

  • Construct the design matrix X from the feature columns, with a column of 1s added as the first column (the intercept term)
  • Construct the vector y from the target values
  • Calculate:

$$\theta = (X^T X)^{-1} X^T y$$
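
The steps above can be sketched in Python with NumPy. This is a minimal illustration, not a reference implementation: the data set, the function name `normal_equation`, and the variable names are all hypothetical. The targets are generated exactly from price = 10 + 2·size + 20·bedrooms, so the recovered θ should come out close to [10, 2, 20].

```python
import numpy as np

# Hypothetical toy data: [size, bedrooms] per row.
features = np.array([
    [50.0, 1.0],
    [80.0, 2.0],
    [100.0, 3.0],
    [120.0, 3.0],
    [150.0, 4.0],
])
# Targets built from price = 10 + 2 * size + 20 * bedrooms (an assumption for the demo).
prices = np.array([130.0, 210.0, 270.0, 310.0, 390.0])


def normal_equation(features, targets):
    """Return theta that minimises the squared error via the normal equation."""
    m = features.shape[0]
    # Step 1: design matrix X with a leading column of ones (intercept term).
    X = np.column_stack([np.ones(m), features])
    # Step 2: target vector y.
    y = np.asarray(targets, dtype=float)
    # Step 3: solve (X^T X) theta = X^T y. This is mathematically the same as
    # theta = (X^T X)^{-1} X^T y but avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)


theta = normal_equation(features, prices)

# Predictions on the training data use the same design matrix layout.
X_design = np.column_stack([np.ones(len(features)), features])
predictions = X_design @ theta
```

Solving the linear system with `np.linalg.solve` instead of calling an explicit inverse is a design choice made here for numerical stability; the result is the same θ when $X^T X$ is invertible.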

Mean Normalization

Feature scaling (such as mean normalization) is not required for the normal equation method.
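
A small check of this claim, as a sketch with hypothetical data and helper names (`fit_normal_equation`, `predict`): fitting on raw features and on mean-normalized features gives different θ values but the same predictions, which is why scaling is optional here.

```python
import numpy as np


def fit_normal_equation(features, targets):
    # Add the intercept column, then solve (X^T X) theta = X^T y.
    X = np.column_stack([np.ones(len(features)), features])
    return np.linalg.solve(X.T @ X, X.T @ targets)


def predict(theta, features):
    return np.column_stack([np.ones(len(features)), features]) @ theta


# Hypothetical data with features on very different scales: [size, bedrooms].
raw = np.array([
    [2104.0, 3.0],
    [1600.0, 3.0],
    [2400.0, 4.0],
    [1416.0, 2.0],
    [3000.0, 4.0],
])
y = np.array([400.0, 330.0, 369.0, 232.0, 540.0])

# Mean normalization per feature column: (x - mean) / (max - min).
mean = raw.mean(axis=0)
span = raw.max(axis=0) - raw.min(axis=0)
scaled = (raw - mean) / span

theta_raw = fit_normal_equation(raw, y)
theta_scaled = fit_normal_equation(scaled, y)

# The parameters differ, but the fitted predictions agree.
pred_raw = predict(theta_raw, raw)
pred_scaled = predict(theta_scaled, scaled)
same_predictions = np.allclose(pred_raw, pred_scaled)
```

Scaling can still help numerically when feature magnitudes differ by many orders of magnitude, because $X^T X$ becomes poorly conditioned.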

Normal Equation vs Gradient Descent:

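| Feature | Gradient Descent | Normal Equation |
|---|---|---|
| Complexity | More complex; the learning rate α needs tuning and debugging | Convenient and simple to implement |
| Choosing a learning rate (α) | Required | Not needed |
| Feature scaling | Required (for good convergence) | Not needed |
| Iterations | Many iterations required | Not required |
| Large feature set (n ≥ 1 million) | Efficient even if n is huge, $O(kn^2)$ | Slow if n is huge; inverting the matrix costs $O(n^3)$ |
| Complex learning algorithms | Can be used for more complex learning algorithms | Not supported |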
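
To make the comparison concrete, here is a minimal sketch that fits the same data both ways. The data, the learning rate `alpha = 0.1`, and the iteration count are assumptions chosen for this toy example; the targets are generated from y = 4 + 3x, so both methods should arrive at approximately [4, 3].

```python
import numpy as np

# Hypothetical data generated exactly from y = 4 + 3x.
x = np.linspace(0.0, 2.0, 20)
y = 4.0 + 3.0 * x
X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept column
m = len(y)

# Normal equation: one linear solve, no learning rate, no iterations.
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent: needs a learning rate and many iterations.
alpha = 0.1          # assumed learning rate, small enough to converge on this data
iterations = 5000    # assumed iteration budget
theta_gd = np.zeros(2)
for _ in range(iterations):
    # Gradient of the squared-error cost J = (1 / 2m) * ||X theta - y||^2.
    gradient = X.T @ (X @ theta_gd - y) / m
    theta_gd -= alpha * gradient
```

With a poorly chosen `alpha` the loop either diverges or needs far more iterations, which is the tuning cost the table refers to.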

Inverse Matrix ($A^{-1}$)

$A^{-1}$ is called the inverse of a square matrix $A$ if

  $A^{-1} A = A A^{-1} = I$
A matrix without an inverse is called degenerate, singular, or non-invertible.
Causes of a non-invertible matrix ($X^T X$ in the normal equation):
  • Redundant features: two features related by a linear equation $x_2 = k x_1$, e.g. size in feet and size in meters
  • More features than training examples ($m \le n$): delete some features or use regularization
Octave functions for inverting a matrix:
  • pinv(A): pseudoinverse; gives a usable result even if the matrix is non-invertible
  • inv(A): inverse; requires an invertible matrix
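
The redundant-feature case can be sketched in Python, with NumPy's `np.linalg.pinv` standing in for Octave's `pinv`. The data is hypothetical, and the explicit `rcond=1e-10` cutoff is an assumption chosen for this example so that round-off noise in the near-zero singular value is ignored.

```python
import numpy as np

# Hypothetical data: the same size recorded in feet and in meters (x2 = k * x1).
size_ft = np.array([10.0, 20.0, 30.0, 40.0])
size_m = size_ft * 0.3048
y = 5.0 + 1.0 * size_ft  # targets built from the feet column (an assumption)

X = np.column_stack([np.ones_like(size_ft), size_ft, size_m])

# X has 3 columns but only 2 are linearly independent, so X^T X is singular.
rank = np.linalg.matrix_rank(X)

# The pseudoinverse still yields a least-squares solution (the minimum-norm one).
gram = X.T @ X
theta = np.linalg.pinv(gram, rcond=1e-10) @ X.T @ y
predictions = X @ theta
```

On a matrix like this, a plain inverse (`np.linalg.inv`) may raise an error or return numerically meaningless values, so it is not used in the sketch.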