Forward Propagation in Neural Networks

Understand how forward propagation works in neural networks. Learn how inputs move through layers, how weights and biases transform data, and how activation functions generate predictions in deep learning models.

Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026

⏩ Forward Propagation (FP)

Forward propagation computes the hypothesis

h_\Theta(x)

by passing the input through each layer of the neural network in turn.

General Form

For any network layer l:

Linear term (pre-activation):

z^{(l)} = \Theta^{(l-1)} a^{(l-1)}

Activation term:

a^{(l)} = g(z^{(l)})

Or, looking forward, we can rewrite this as:

z^{(l+1)} = \Theta^{(l)} a^{(l)}

a^{(l+1)} = g(z^{(l+1)})

Where

  • a^{(l)} = activations of layer l
  • z^{(l)} = linear combination before activation
  • \Theta^{(l)} = weight matrix between layer l and layer l+1
  • g(\cdot) = activation function
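The single forward step above can be sketched in a few lines of NumPy. This is a minimal sketch, assuming the sigmoid as the activation function g; the function names are illustrative, not from the post.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_step(theta, a_prev):
    """One forward step: z^(l+1) = Theta^(l) a^(l), then a^(l+1) = g(z^(l+1))."""
    z = theta @ a_prev   # linear combination (pre-activation)
    return sigmoid(z)    # element-wise activation
```

Repeating this step layer by layer, feeding each output in as the next input, is the whole of forward propagation.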

Advanced Example: A 4-Layer Neural Network

Assume a 4-layer neural network:

  • Input layer: 3 units
  • Hidden layer 1: 3 units
  • Hidden layer 2: 3 units
  • Output layer: 1 unit

graph LR

%% Input Layer
    subgraph Input Layer
        x1(((x1)))
        x2(((x2)))
        x3(((x3)))
    end

%% Hidden Layer 1
    subgraph Hidden Layer 1
        a1{a1}
        a2{a2}
        a3{a3}
    end

%% Hidden Layer 2
    subgraph Hidden Layer 2
        b1{b1}
        b2{b2}
        b3{b3}
    end

%% Output Layer
    subgraph Output Layer
        y(((hθx)))
    end

%% Connections: Input → Hidden 1
    x1 --> a1
    x1 --> a2
    x1 --> a3
    x2 --> a1
    x2 --> a2
    x2 --> a3
    x3 --> a1
    x3 --> a2
    x3 --> a3
%% Connections: Hidden 1 → Hidden 2
    a1 --> b1
    a1 --> b2
    a1 --> b3
    a2 --> b1
    a2 --> b2
    a2 --> b3
    a3 --> b1
    a3 --> b2
    a3 --> b3
%% Connections: Hidden 2 → Output
    b1 --> y
    b2 --> y
    b3 --> y

Weight Matrices:

\Theta^{(1)} \in \mathbb{R}^{3 \times 4}

\Theta^{(2)} \in \mathbb{R}^{3 \times 4}

\Theta^{(3)} \in \mathbb{R}^{1 \times 4}
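These dimensions follow a simple rule: \Theta^{(l)} has one row per unit in layer l+1 and one column per unit in layer l plus one for the bias. A quick sketch to check the rule for this 3-3-3-1 network (the random initialization is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
units = [3, 3, 3, 1]  # units per layer, excluding bias

# Theta^(l) maps layer l (plus its bias unit) to layer l+1,
# so its shape is (units[l+1], units[l] + 1).
thetas = [rng.standard_normal((units[l + 1], units[l] + 1)) for l in range(3)]
shapes = [th.shape for th in thetas]
print(shapes)  # [(3, 4), (3, 4), (1, 4)]
```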

Layer 1 (Input Layer)

Forward Pass

a^{(1)} = x

With bias term:

a^{(1)} = \begin{bmatrix} 1 \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}

Layer 2

Linear step:

z^{(2)} = \Theta^{(1)} a^{(1)}

Activation:

a^{(2)} = g(z^{(2)})

Add bias:

a^{(2)} = \begin{bmatrix} 1 \\ g(z_1^{(2)}) \\ g(z_2^{(2)}) \\ \vdots \end{bmatrix}
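The three steps for layer 2 — append the bias, multiply, activate — can be sketched as follows. The input values and the all-zero \Theta^{(1)} are made up for illustration, and the sigmoid is assumed as g:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(a):
    """Prepend the bias unit a_0 = 1."""
    return np.concatenate(([1.0], a))

x = np.array([0.5, -1.2, 3.0])   # made-up 3-feature input
a1 = add_bias(x)                 # a^(1) with bias, shape (4,)
theta1 = np.zeros((3, 4))        # illustrative Theta^(1) (all zeros)
z2 = theta1 @ a1                 # z^(2) = Theta^(1) a^(1)
a2 = add_bias(sigmoid(z2))       # a^(2): bias + g(z^(2)), shape (4,)
```

With all-zero weights every pre-activation is 0 and g(0) = 0.5, so a2 comes out as [1, 0.5, 0.5, 0.5].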

Activation of Neurons in Layer 2

First Neuron in layer 2

a^{(2)}_1 = g(\Theta^{(1)}_{10}x_0 + \Theta^{(1)}_{11}x_1 + \Theta^{(1)}_{12}x_2 + \Theta^{(1)}_{13}x_3)

Second Neuron in layer 2

a^{(2)}_2 = g(\Theta^{(1)}_{20}x_0 + \Theta^{(1)}_{21}x_1 + \Theta^{(1)}_{22}x_2 + \Theta^{(1)}_{23}x_3)

Third Neuron in layer 2

a^{(2)}_3 = g(\Theta^{(1)}_{30}x_0 + \Theta^{(1)}_{31}x_1 + \Theta^{(1)}_{32}x_2 + \Theta^{(1)}_{33}x_3)

Generalized

a^{(2)}_i = g(\Theta^{(1)}_{i0}x_0 + \Theta^{(1)}_{i1}x_1 + \Theta^{(1)}_{i2}x_2 + \Theta^{(1)}_{i3}x_3)

for i = 1, 2, 3
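The generalized per-neuron formula is exactly the i-th row of the vectorized product g(\Theta^{(1)} x). A sketch verifying the two forms agree, with illustrative random weights and the sigmoid assumed as g:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
theta1 = rng.standard_normal((3, 4))   # illustrative Theta^(1)
x = np.array([1.0, 0.2, -0.7, 1.5])    # [x0 = 1 (bias), x1, x2, x3]

# Vectorized: a^(2) = g(Theta^(1) x)
a2_vec = sigmoid(theta1 @ x)

# Per-neuron: a_i^(2) = g(sum_j Theta_ij^(1) x_j), for i = 1..3
a2_loop = np.array([sigmoid(sum(theta1[i, j] * x[j] for j in range(4)))
                    for i in range(3)])

assert np.allclose(a2_vec, a2_loop)
```

The vectorized form is preferred in practice: one matrix-vector product replaces the explicit loop over neurons.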


Layer 3

Linear step:

z^{(3)} = \Theta^{(2)} a^{(2)}

Activation:

a^{(3)} = g(z^{(3)})

(Add bias if needed.)

Activation of Neurons in Layer 3

For each neuron i in layer 3:

a^{(3)}_i = g\left( \Theta^{(2)}_{i0}a^{(2)}_0 + \Theta^{(2)}_{i1}a^{(2)}_1 + \Theta^{(2)}_{i2}a^{(2)}_2 + \Theta^{(2)}_{i3}a^{(2)}_3 \right)

for i = 1, 2, 3

Layer 4 (Output Layer): Hypothesis

Linear step:

z^{(4)} = \Theta^{(3)} a^{(3)}

Final activation:

a^{(4)} = g(z^{(4)})

The final hypothesis is the single output neuron of layer 4:

h_\Theta(x) = a^{(4)}_1
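Putting all four layers together, the full forward pass is just the single forward step applied repeatedly. A minimal sketch for this 3-3-3-1 network, assuming the sigmoid as g and using illustrative random weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, thetas):
    """Full forward pass: returns h_Theta(x) = a^(4).

    thetas = [Theta^(1), Theta^(2), Theta^(3)].
    """
    a = x
    for theta in thetas:
        a = np.concatenate(([1.0], a))  # add the bias unit a_0 = 1
        a = sigmoid(theta @ a)          # z^(l+1) = Theta^(l) a^(l); a^(l+1) = g(z)
    return a                            # a^(4), here a single value

rng = np.random.default_rng(2)
thetas = [rng.standard_normal((3, 4)),   # Theta^(1)
          rng.standard_normal((3, 4)),   # Theta^(2)
          rng.standard_normal((1, 4))]   # Theta^(3)

h = predict(np.array([0.1, 0.4, -0.3]), thetas)
```

With a sigmoid output, h lands in (0, 1) and can be read as a probability, which is the usual setup for a single-unit binary classifier.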