Backpropagation Algorithm
Backpropagation is the algorithm used to minimize the neural network cost function. It computes the gradients of the cost function with respect to the parameters, allowing us to perform gradient descent and update our model.
Just like gradient descent in linear and logistic regression, our goal is:

$$\min_{\Theta} J(\Theta)$$

That is, we want to find parameters $\Theta$ that minimize the cost function $J(\Theta)$.
Objective
We want to compute the partial derivatives:

$$\frac{\partial}{\partial \Theta^{(l)}_{ij}} J(\Theta)$$

These derivatives are used in gradient descent to update the parameters.
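To make the role of these derivatives concrete, here is a minimal NumPy sketch of one gradient descent step. The matrices, their sizes, and the learning rate are made-up values for illustration, not anything fixed by the notes.

```python
import numpy as np

# Hypothetical parameter matrices for a small network (sizes are illustrative).
Theta1 = np.ones((3, 4))   # maps layer 1 (3 inputs + bias) to layer 2
Theta2 = np.ones((1, 4))   # maps layer 2 (3 units + bias) to the output

# Suppose D1, D2 hold the partial derivatives produced by backpropagation.
D1 = 0.1 * np.ones_like(Theta1)
D2 = 0.1 * np.ones_like(Theta2)

alpha = 0.5  # learning rate (an assumption for this sketch)

# One gradient descent step: Theta := Theta - alpha * D
Theta1 -= alpha * D1
Theta2 -= alpha * D2

print(Theta1[0, 0])  # 1.0 - 0.5 * 0.1 = 0.95
```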
Backpropagation Algorithm
Given training set:

$$\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$$
Step 1: Initialize Accumulators
Set:

$$\Delta^{(l)}_{ij} := 0$$

for all $l, i, j$.

This creates matrices of zeros to accumulate the gradients.
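The initialization can be sketched in NumPy as follows; the layer sizes are hypothetical, and the `+1` column accounts for the bias unit in each layer.

```python
import numpy as np

# Hypothetical layer sizes: 3 inputs, 5 hidden units, 1 output.
layer_sizes = [3, 5, 1]

# One accumulator per weight matrix Theta^(l); the +1 column covers the bias unit.
Deltas = [np.zeros((layer_sizes[l + 1], layer_sizes[l] + 1))
          for l in range(len(layer_sizes) - 1)]

print([D.shape for D in Deltas])  # [(5, 4), (1, 6)]
```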
Step 2: For each training example $t = 1$ to $m$
2.1 Forward Propagation
Set:

$$a^{(1)} := x^{(t)}$$

Compute forward propagation for:

$$l = 2, 3, \ldots, L$$

to obtain activations $a^{(2)}, a^{(3)}, \ldots, a^{(L)}$.
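A minimal forward-propagation sketch in NumPy, assuming a sigmoid activation throughout and a bias unit prepended to every non-output layer; the tiny network and its zero weights are purely illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Thetas):
    """Return the activations a^(1), ..., a^(L) for one example x."""
    a = np.concatenate(([1.0], x))       # a^(1) with bias unit prepended
    activations = [a]
    for l, Theta in enumerate(Thetas):
        z = Theta @ a                    # weighted sum z^(l+1)
        a = sigmoid(z)
        if l < len(Thetas) - 1:          # no bias unit on the output layer
            a = np.concatenate(([1.0], a))
        activations.append(a)
    return activations

# Hypothetical network: 2 inputs, 2 hidden units, 1 output, all-zero weights.
Thetas = [np.zeros((2, 3)), np.zeros((1, 3))]
acts = forward(np.array([1.0, 0.0]), Thetas)
print(acts[-1])  # sigmoid(0) everywhere -> [0.5]
```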
2.2 Compute Output Layer Error
Using the true label $y^{(t)}$:

$$\delta^{(L)} = a^{(L)} - y^{(t)}$$

This is the error of the output layer.
2.3 Backpropagate the Error
For layers:

$$l = L-1, L-2, \ldots, 2$$

Compute:

$$\delta^{(l)} = \left(\Theta^{(l)}\right)^T \delta^{(l+1)} \circ g'\!\left(z^{(l)}\right)$$

For sigmoid activation:

$$g'\!\left(z^{(l)}\right) = a^{(l)} \circ \left(1 - a^{(l)}\right)$$

So equivalently:

$$\delta^{(l)} = \left(\Theta^{(l)}\right)^T \delta^{(l+1)} \circ a^{(l)} \circ \left(1 - a^{(l)}\right)$$

The operator $\circ$ denotes element-wise multiplication.
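The two error computations above can be sketched for a single hidden layer as follows; the activations, label, and weights are made-up values for illustration.

```python
import numpy as np

# Hypothetical quantities for one example (values are illustrative).
a2 = np.array([1.0, 0.6, 0.4])         # hidden activations, bias unit first
a3 = np.array([0.7])                   # output activation
y  = np.array([1.0])                   # true label
Theta2 = np.array([[0.1, 0.2, 0.3]])   # weights from layer 2 to layer 3

# Output layer error: delta^(3) = a^(3) - y
delta3 = a3 - y

# Hidden layer error: (Theta^(2))^T delta^(3), element-wise times a .* (1 - a)
delta2 = (Theta2.T @ delta3) * a2 * (1.0 - a2)
delta2 = delta2[1:]                    # drop the bias-unit component

print(delta3, delta2)  # approximately [-0.3] [-0.0144 -0.0216]
```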
2.4 Accumulate Gradients
Update:

$$\Delta^{(l)}_{ij} := \Delta^{(l)}_{ij} + a^{(l)}_j \delta^{(l+1)}_i$$

Vectorized form:

$$\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)} \left(a^{(l)}\right)^T$$
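In NumPy, the vectorized accumulation is an outer product; the accumulator, activations, and errors below are hypothetical values for one example.

```python
import numpy as np

# Hypothetical quantities for one training example (values are illustrative).
Delta1 = np.zeros((2, 3))              # accumulator for Theta^(1)
a1 = np.array([1.0, 0.5, -1.0])        # layer-1 activations, bias included
delta2 = np.array([0.2, -0.4])         # layer-2 errors

# Vectorized accumulation: Delta^(l) += delta^(l+1) (a^(l))^T
Delta1 += np.outer(delta2, a1)

print(Delta1)
```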
Step 3: Compute Gradients
After processing all training examples:

For $j \neq 0$ (non-bias terms):

$$D^{(l)}_{ij} := \frac{1}{m}\left(\Delta^{(l)}_{ij} + \lambda \Theta^{(l)}_{ij}\right)$$

For bias terms ($j = 0$):

$$D^{(l)}_{ij} := \frac{1}{m}\Delta^{(l)}_{ij}$$
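The averaging and regularization step can be sketched as below, where column 0 of each weight matrix holds the bias weights; the matrices, $m$, and $\lambda$ are assumptions for this example.

```python
import numpy as np

m, lam = 4, 0.01                 # number of examples and regularization strength
Delta1 = np.ones((2, 3))         # hypothetical accumulated gradients
Theta1 = 2.0 * np.ones((2, 3))   # hypothetical weights (column 0 = bias weights)

# D = (1/m) * Delta, adding (lambda/m) * Theta only for non-bias columns (j != 0)
D1 = Delta1 / m
D1[:, 1:] += (lam / m) * Theta1[:, 1:]

print(D1)  # bias column stays 0.25; other columns become 0.255
```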
Final Result
The gradient of the cost function is:

$$\frac{\partial}{\partial \Theta^{(l)}_{ij}} J(\Theta) = D^{(l)}_{ij}$$

The matrix $D^{(l)}$ gives the partial derivatives used in gradient descent.
Key Ideas
- Forward propagation computes activations.
- Backpropagation computes errors ($\delta$ values).
- Errors are propagated from right to left, i.e., from the output layer back toward the input.
- Gradients are accumulated in $\Delta^{(l)}$.
- Regularization is added for non-bias weights.
- Finally, we divide by $m$ to obtain the average gradient.
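Putting all the steps together, here is a sketch of one full pass of the algorithm for a three-layer sigmoid network in NumPy. The network shape, the data, and the weight initialization are assumptions for this example; the per-example loop, error formulas, accumulation, and regularization follow the steps above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(X, Y, Theta1, Theta2, lam):
    """One pass of backpropagation for a 3-layer network.

    Returns (D1, D2), the regularized average gradients."""
    m = X.shape[0]
    Delta1 = np.zeros_like(Theta1)
    Delta2 = np.zeros_like(Theta2)
    for t in range(m):
        # 2.1 Forward propagation (bias unit prepended to layers 1 and 2)
        a1 = np.concatenate(([1.0], X[t]))
        a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))
        a3 = sigmoid(Theta2 @ a2)
        # 2.2 Output layer error: delta^(3) = a^(3) - y^(t)
        delta3 = a3 - Y[t]
        # 2.3 Backpropagate: (Theta^(2))^T delta^(3) .* a^(2) .* (1 - a^(2))
        delta2 = (Theta2.T @ delta3) * a2 * (1.0 - a2)
        delta2 = delta2[1:]                   # drop the bias-unit component
        # 2.4 Accumulate gradients: Delta^(l) += delta^(l+1) (a^(l))^T
        Delta2 += np.outer(delta3, a2)
        Delta1 += np.outer(delta2, a1)
    # Step 3: average, regularizing only non-bias columns (j != 0)
    D1 = Delta1 / m
    D2 = Delta2 / m
    D1[:, 1:] += (lam / m) * Theta1[:, 1:]
    D2[:, 1:] += (lam / m) * Theta2[:, 1:]
    return D1, D2

# Tiny made-up example: 2 inputs, 2 hidden units, 1 output, 4 examples.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])
Theta1 = rng.standard_normal((2, 3)) * 0.1
Theta2 = rng.standard_normal((1, 3)) * 0.1
D1, D2 = backprop(X, Y, Theta1, Theta2, lam=0.0)
print(D1.shape, D2.shape)  # (2, 3) (1, 3)
```

With $\lambda = 0$ the returned matrices equal the numerical gradient of the (cross-entropy) cost, which is a useful sanity check on any backpropagation implementation.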
