Training a Neural Network

In this post, we will put together all the pieces we've learned about neural networks to understand how to train a neural network effectively. We will cover the cost function, backpropagation, gradient checking, and random initialization, along with key intuitions for each step.

Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026

Putting It Together

Now that we have covered forward propagation, backpropagation, and gradient checking, let’s combine everything into a complete training pipeline.

1. 🔀 Choose a Network Architecture

First, decide the structure of your neural network:

  • Number of layers $L$
  • Number of hidden units per layer $j$
  • Number of outputs $y$

How to Choose the Network

  • Input layer size = dimension of the feature vector $x^{(i)}$
  • Output layer size = number of output classes
  • Hidden units:
    • More units usually give better performance
    • But they increase computational cost
  • Default choice:
    • Use 1 hidden layer
    • If using multiple hidden layers, use the same number of units in each layer
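Once the layer sizes are fixed, the shape of every weight matrix follows from them. The sketch below is a minimal illustration, assuming the usual convention that $\Theta^{(l)}$ maps layer $l$ (plus one bias unit) to layer $l+1$. The helper name `weight_shapes` and the example sizes (400 inputs, 25 hidden units, 10 classes) are hypothetical and not taken from this post.

```python
def weight_shapes(layer_sizes):
    """Return the (rows, cols) shape of each Theta^(l).

    Theta^(l) maps layer l (plus a bias unit) to layer l+1, so its shape is
    (units in layer l+1) x (units in layer l + 1).
    """
    return [(n_out, n_in + 1)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


# Hypothetical problem: 400 input features, one hidden layer of 25 units,
# and 10 output classes.
layer_sizes = [400, 25, 10]
shapes = weight_shapes(layer_sizes)
print(shapes)  # [(25, 401), (10, 26)]
```

Adding a second hidden layer of the same width, as the default choice suggests, would simply add one more entry to `layer_sizes`.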

2. 📚 Training a Neural Network

2.1 🎲 Randomly Initialize Weights

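Initialize each $\Theta^{(l)}$ to small random values, not to zeros.

If every weight started at zero, all hidden units in a layer would compute the same activation and receive the same update, so they would never become different from one another. Random initialization breaks this symmetry and allows learning.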
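A minimal sketch of random initialization, assuming weights are drawn uniformly from a small interval $[-\epsilon_{init}, \epsilon_{init}]$. The helper name `random_init` and the value `0.12` are illustrative assumptions, not values prescribed by this post.

```python
import random


def random_init(n_out, n_in, epsilon_init=0.12, rng=None):
    """Return an n_out x (n_in + 1) weight matrix as a list of rows.

    Every entry is drawn uniformly from [-epsilon_init, epsilon_init].
    The extra column holds the weights for the bias unit.
    """
    rng = rng or random.Random()
    return [[rng.uniform(-epsilon_init, epsilon_init) for _ in range(n_in + 1)]
            for _ in range(n_out)]


# Seeded generator so the sketch is reproducible (the seed is arbitrary).
theta1 = random_init(3, 2, rng=random.Random(0))
for row in theta1:
    print(row)
```

Because the entries differ from one another, each hidden unit starts from a different function of the inputs.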

2.2 ⏩ Forward Propagation (FP)

For each training example $x^{(i)}$, compute:

$$h_\Theta(x^{(i)})$$

This gives the network’s prediction.
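The forward pass can be sketched as follows, assuming sigmoid activations in every layer and weight matrices stored as lists of rows whose first column multiplies the bias unit. The function names and the hand-picked weights (which make a small network behave like XNOR) are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Return the activations of every layer for one example x.

    The last entry of the returned list is the prediction h_Theta(x).
    Bias units are added on the fly and are not stored.
    """
    activations = [list(x)]
    for theta in thetas:
        a_with_bias = [1.0] + activations[-1]
        z = [sum(w * a for w, a in zip(row, a_with_bias)) for row in theta]
        activations.append([sigmoid(v) for v in z])
    return activations


# Hand-picked weights for a 2-2-1 network that approximates XNOR.
xnor_thetas = [
    [[-30.0, 20.0, 20.0], [10.0, -20.0, -20.0]],  # hidden layer: AND, NOR
    [[-10.0, 20.0, 20.0]],                        # output layer: OR
]
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    h = forward(xnor_thetas, x)[-1][0]
    print(x, round(h))  # rounds to 1, 0, 0, 1 respectively
```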

2.3 💰 Implement the Cost Function

Compute:

$$J(\Theta)$$

This includes:

  • Logistic loss over all output units
  • Regularization term
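The two parts of the cost can be sketched like this, assuming the usual regularized cross-entropy form: the logistic loss is summed over all output units and averaged over the $m$ examples, and the regularization term adds $\frac{\lambda}{2m}$ times the sum of squared weights, skipping the bias column. The function name `cost` and its argument layout are assumptions made for illustration.

```python
import math


def cost(predictions, labels, thetas, lam):
    """Regularized cross-entropy cost J(Theta).

    predictions, labels: lists of m vectors with K entries each
                         (labels are one-hot, predictions lie in (0, 1)).
    thetas: list of weight matrices; column 0 (bias) is not regularized.
    lam: regularization strength lambda.
    """
    m = len(labels)
    data_term = 0.0
    for h, y in zip(predictions, labels):
        for h_k, y_k in zip(h, y):
            data_term -= y_k * math.log(h_k) + (1 - y_k) * math.log(1 - h_k)
    reg_term = sum(w ** 2
                   for theta in thetas
                   for row in theta
                   for w in row[1:])
    return data_term / m + lam / (2 * m) * reg_term


# One example, one output unit, prediction 0.5 for a positive label.
# Data term: -log(0.5); regularization: (1 / 2) * 2.0^2 = 2.0.
J = cost([[0.5]], [[1]], [[[0.0, 2.0]]], lam=1.0)
print(J)
```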

2.4 ⏪ Backpropagation (BP)

Use backpropagation to compute:

$$\frac{\partial}{\partial \Theta_{i,j}^{(l)}} J(\Theta)$$

This gives the gradients needed for optimization.
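A compact sketch of backpropagation for a fully connected sigmoid network, assuming the cross-entropy cost so that the output-layer error is simply $a^{(L)} - y$. All function names are hypothetical, and the code favors readability over speed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Activations of every layer for one example (bias units not stored)."""
    acts = [list(x)]
    for theta in thetas:
        a_bias = [1.0] + acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, a_bias)))
                     for row in theta])
    return acts


def backprop(thetas, X, Y, lam=0.0):
    """Return gradient matrices with the same shapes as thetas."""
    m = len(X)
    Delta = [[[0.0] * len(row) for row in theta] for theta in thetas]
    for x, y in zip(X, Y):
        acts = forward(thetas, x)
        # Output-layer error (assumes sigmoid outputs + cross-entropy cost).
        delta = [a - t for a, t in zip(acts[-1], y)]
        for l in range(len(thetas) - 1, -1, -1):
            a_bias = [1.0] + acts[l]
            # Accumulate the outer product of the error and the activations.
            for i, d in enumerate(delta):
                for j, a in enumerate(a_bias):
                    Delta[l][i][j] += d * a
            if l > 0:
                # Push the error back one layer, skipping the bias column.
                prev = acts[l]
                delta = [
                    sum(thetas[l][i][j + 1] * delta[i]
                        for i in range(len(delta)))
                    * prev[j] * (1.0 - prev[j])
                    for j in range(len(prev))
                ]
    # Average over examples; regularize every column except the bias column.
    return [
        [[D_row[j] / m + (lam / m * row[j] if j > 0 else 0.0)
          for j in range(len(row))]
         for row, D_row in zip(theta, D)]
        for theta, D in zip(thetas, Delta)
    ]


# Tiny 1-1-1 network with all-zero weights, one example (x = 1, y = 1).
grads = backprop([[[0.0, 0.0]], [[0.0, 0.0]]], [[1.0]], [[1.0]])
print(grads)
```

With all-zero weights the hidden layer receives a zero gradient, which is the symmetry problem that random initialization avoids.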

2.5 🎢 Gradient Checking

Use numerical approximation to verify backpropagation:

$$\frac{\partial}{\partial \Theta} J(\Theta) \approx \frac{J(\Theta + \epsilon) - J(\Theta - \epsilon)}{2\epsilon}$$

⚠️ Once verified:

  • Disable gradient checking before training
  • It is computationally expensive, because every parameter needs two extra evaluations of the cost
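In practice the two-sided difference is applied to one parameter at a time, with all other parameters held fixed. The sketch below assumes the parameters are unrolled into a flat list and that `J` is any function returning the cost for such a list. The toy cost and the value `epsilon=1e-4` are illustrative assumptions.

```python
def numerical_gradient(J, theta, epsilon=1e-4):
    """Two-sided finite-difference estimate of dJ/dtheta_k for every k."""
    grad = []
    for k in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[k] += epsilon    # perturb only the k-th parameter
        minus[k] -= epsilon
        grad.append((J(plus) - J(minus)) / (2 * epsilon))
    return grad


# Toy cost with a known gradient: J = theta_0^2 + 3 * theta_1,
# so the analytic gradient is [2 * theta_0, 3].
def toy_cost(t):
    return t[0] ** 2 + 3 * t[1]


theta = [1.5, -2.0]
analytic = [2 * theta[0], 3.0]
numeric = numerical_gradient(toy_cost, theta)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_diff)  # should be tiny if the analytic gradient is right
```

To check a network, `analytic` would be the unrolled output of backpropagation and `toy_cost` would be replaced by the network's cost function.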

2.6 ⚖️ Minimize the Cost Function

Use:

  • Gradient descent, or
  • A built-in advanced optimizer (e.g., conjugate gradient or L-BFGS)

to minimize $J(\Theta)$.
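Plain batch gradient descent can be sketched in a few lines, assuming the parameters are unrolled into a flat list and `grad_fn` returns the gradient for that list (for a network this would be the backpropagation output). The learning rate, iteration count, and toy cost are illustrative assumptions.

```python
def gradient_descent(grad_fn, theta, alpha=0.1, num_iters=100):
    """Repeat theta := theta - alpha * gradient for a fixed number of steps."""
    theta = list(theta)
    for _ in range(num_iters):
        grad = grad_fn(theta)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Toy convex cost J = (theta_0 - 3)^2 + (theta_1 + 1)^2 with minimum at (3, -1).
def toy_grad(t):
    return [2 * (t[0] - 3), 2 * (t[1] + 1)]


theta_min = gradient_descent(toy_grad, [0.0, 0.0])
print(theta_min)  # close to [3, -1]
```

Unlike this toy cost, $J(\Theta)$ for a neural network is not convex, so gradient descent may settle in a local minimum rather than the global one.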


Training Loop

During training, we iterate over all examples:

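% Sketch for a 3-layer network; assumes Theta1, Theta2, X (m x n), Y (m x K, one-hot),
% m = size(X, 1), and a sigmoid() helper are already defined.
Delta1 = zeros(size(Theta1)); Delta2 = zeros(size(Theta2));
for i = 1:m
    % Forward propagation: compute activations a^(l)
    a1 = [1; X(i, :)'];
    a2 = [1; sigmoid(Theta1 * a1)];
    a3 = sigmoid(Theta2 * a2);
    % Backpropagation: compute delta terms d^(l) for l = L,...,2
    d3 = a3 - Y(i, :)';
    d2 = (Theta2' * d3) .* a2 .* (1 - a2);
    % Accumulate gradients (drop the bias entry of d2)
    Delta1 = Delta1 + d2(2:end) * a1';  Delta2 = Delta2 + d3 * a2';
end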

For each example:

  • Perform forward pass
  • Compute errors
  • Accumulate gradients

Final Insight

Neural network training is simply:

  • Forward propagation
  • Backpropagation
  • Gradient-based optimization

All of deep learning is built on this foundation.

Complete Neural Network Workflow

  1. Choose architecture
  2. Initialize weights randomly
  3. Implement forward propagation
  4. Implement cost function
  5. Implement backpropagation
  6. Perform gradient checking, then disable it
  7. Optimize using gradient descent or an advanced optimizer
  8. Train until convergence