What is the problem?
You’ll be given:
- the nodes, connections, and weights of a neural network
- the activation function of the hidden layers (usually a ReLU)
- the activation function of the output layer (usually a sigmoid/softmax)
- the error function
- the learning rate

… and asked to perform backpropagation on it.
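As a concrete illustration of those givens, here is a minimal sketch — the network shape, weight names, and all numeric values are made up, not taken from any particular exercise:

```python
import math

# Hypothetical givens for a tiny 2-2-1 network (all values made up).
weights = {"w1": 0.15, "w2": 0.20,   # input -> hidden
           "w3": 0.25, "w4": 0.30,
           "w5": 0.40, "w6": 0.45}   # hidden -> output

def relu(x):                  # hidden-layer activation
    return max(0.0, x)

def sigmoid(x):               # output-layer activation
    return 1.0 / (1.0 + math.exp(-x))

def squared_error(target, out):   # error function for one sample
    return 0.5 * (target - out) ** 2

learning_rate = 0.5
```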
Before you begin
- draw the network, leaving at least a line’s space above every neuron (to later write the neuron’s net and out values in it)
- add the target value of every output node next to it, and the input value of every input node next to it as well
- perform forward propagation to fill the net and out values for every node.
- compute the derivative formula for the activation function of both the output layer and the hidden layers.
- compute the derivative formula for the error function.
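The preparation steps above can be sketched as follows — a hypothetical 2-2-1 network with made-up inputs and weights, a ReLU hidden layer, a sigmoid output, and the derivatives of each function written out:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(out):
    # Handy form: expressed via the neuron's out value,
    # sigmoid'(net) = out * (1 - out).
    return out * (1.0 - out)

def relu(x):
    return max(0.0, x)

def relu_prime(net):
    # relu'(net) = 1 if net > 0, else 0
    return 1.0 if net > 0 else 0.0

def squared_error_prime(target, out):
    # d/d(out) of 0.5 * (target - out)^2  =  (out - target)
    return out - target

# Forward propagation on the hypothetical network (inputs and weights
# are made up): fill in net and out for every neuron.
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden
w5, w6 = 0.40, 0.45                        # hidden -> output

net_h1 = w1 * i1 + w2 * i2
net_h2 = w3 * i1 + w4 * i2
out_h1, out_h2 = relu(net_h1), relu(net_h2)

net_o1 = w5 * out_h1 + w6 * out_h2
out_o1 = sigmoid(net_o1)
```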
The strategy changes depending on whether you use batch gradient descent or stochastic gradient descent.
Strategy
Using Stochastic Gradient Descent
Starting with the output layer’s neurons and moving back, do the following:
- calculate the neuron’s delta (δ)
- update the weight connecting it to every neuron in the previous layer.
Do this for every neuron within every layer…
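Putting the strategy together, here is one stochastic-gradient-descent step on a hypothetical 2-2-1 network (every value is made up): a forward pass, then deltas computed from the output layer backwards, then the weight updates:

```python
import math

def sigmoid(x): return 1.0 / (1.0 + math.exp(-x))
def relu(x): return max(0.0, x)
def relu_prime(net): return 1.0 if net > 0 else 0.0

# Hypothetical inputs, target, learning rate, and weights.
i1, i2, target, lr = 0.05, 0.10, 0.99, 0.5
w = {"w1": 0.15, "w2": 0.20,   # input -> hidden neuron h1
     "w3": 0.25, "w4": 0.30,   # input -> hidden neuron h2
     "w5": 0.40, "w6": 0.45}   # hidden -> output neuron o1

def forward(w):
    """Return net and out for every neuron."""
    net_h1 = w["w1"] * i1 + w["w2"] * i2
    net_h2 = w["w3"] * i1 + w["w4"] * i2
    out_h1, out_h2 = relu(net_h1), relu(net_h2)
    net_o1 = w["w5"] * out_h1 + w["w6"] * out_h2
    return net_h1, net_h2, out_h1, out_h2, sigmoid(net_o1)

net_h1, net_h2, out_h1, out_h2, out_o1 = forward(w)
err_before = 0.5 * (target - out_o1) ** 2

# Output neuron's delta: dE/d(out) * d(out)/d(net),
# using the sigmoid derivative out * (1 - out).
delta_o1 = (out_o1 - target) * out_o1 * (1.0 - out_o1)

# Hidden deltas use the *old* weights into the output layer:
# downstream delta, weighted by the connecting weight, times relu'.
delta_h1 = delta_o1 * w["w5"] * relu_prime(net_h1)
delta_h2 = delta_o1 * w["w6"] * relu_prime(net_h2)

# Only now update every weight: w -= lr * delta * out_of_upstream_neuron.
w["w5"] -= lr * delta_o1 * out_h1
w["w6"] -= lr * delta_o1 * out_h2
w["w1"] -= lr * delta_h1 * i1
w["w2"] -= lr * delta_h1 * i2
w["w3"] -= lr * delta_h2 * i1
w["w4"] -= lr * delta_h2 * i2

err_after = 0.5 * (target - forward(w)[-1]) ** 2
```

One design note: all deltas are computed before any weight is touched, so the hidden-layer deltas see the original output-layer weights; updating `w5`/`w6` first would silently corrupt the hidden gradients.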