Mental Model
Think of a neural network as a sequence of layers, where every neuron in a layer is wired to every neuron in the layer before it. Each of those wires (connections) has a weight attached to it.
These weights are the “knobs” that control the output the network produces when you feed it an input.
At the start, all of these weights are random (the network knows nothing). The goal of backpropagation is to tweak/update these weights so that the network’s outputs fit your training samples as closely as possible.
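In code, this mental model is just a few matrix multiplications. Below is a minimal sketch — the layer sizes, the sigmoid activation, and NumPy itself are my own illustrative choices, not fixed by these notes. The weights start random, there are no bias terms, and the output is entirely determined by the input and the weight “knobs”:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny network: 3 inputs -> 4 hidden neurons -> 2 outputs.
# Each weight matrix wires every neuron to every neuron in the previous layer.
W1 = rng.normal(size=(4, 3))   # knobs of layer 1 (random: the network knows nothing yet)
W2 = rng.normal(size=(2, 4))   # knobs of layer 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Feed an input through the layers; no bias terms anywhere."""
    h = sigmoid(W1 @ x)        # hidden activations
    return sigmoid(W2 @ h)     # output activations

x = np.array([1.0, 0.5, -0.3])
print(forward(x))              # changes whenever the weight "knobs" change
```

Turning any single entry of `W1` or `W2` changes the printed output — that is all “training” does, just in a principled direction.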
The Structure of a Neural Network
The Diagram
(the neurons don’t contain a bias term)
Notation Key
- $\vec{y}$ - the training output vector
- $l$ - the layer index
- $\vec{\sigma}^{(l)}$ - the sigma vector of layer $l$
Using Stochastic/Online Gradient Descent
This update is performed once for every training sample individually, and the whole loop runs for a number of epochs.
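The per-sample update loop can be sketched as follows. Everything concrete here — the XOR-style toy dataset, the 2-8-1 layer sizes, sigmoid activations, squared-error loss, and the learning rate — is an illustrative assumption rather than something fixed by these notes; the network has no bias terms, matching the diagram:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy dataset (hypothetical, for illustration only): 2 inputs -> 1 output.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

# Random initial knobs; no bias terms.
W1 = rng.normal(scale=0.5, size=(8, 2))
W2 = rng.normal(scale=0.5, size=(1, 8))
lr = 0.5

def mse():
    preds = sigmoid(W2 @ sigmoid(W1 @ X.T))
    return float(np.mean((preds - Y.T) ** 2))

loss_before = mse()
for epoch in range(2000):                        # runs for a number of epochs
    for x, y in zip(X, Y):                       # one update per training sample
        h = sigmoid(W1 @ x)                      # forward pass
        y_hat = sigmoid(W2 @ h)
        d2 = (y_hat - y) * y_hat * (1 - y_hat)   # output delta (squared error, sigmoid')
        d1 = (W2.T @ d2) * h * (1 - h)           # hidden delta, propagated backward
        W2 -= lr * np.outer(d2, h)               # tweak the knobs immediately
        W1 -= lr * np.outer(d1, x)
loss_after = mse()
print(loss_before, "->", loss_after)
```

The defining trait of the stochastic/online variant is the placement of the weight update: inside the per-sample loop, so the knobs move after every single example.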
Using Batch Gradient Descent
This also runs for a number of epochs, but here the weights are updated only once per epoch, using the accumulated updates from the whole training set.
$\Delta W$ contains the weight update matrix of every layer (e.g., $\Delta W^{(2)}$ is the weight update matrix of layer 2).
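Batch gradient descent differs only in *when* the knobs are turned: the per-sample updates are accumulated into one update matrix per layer and applied once per epoch. The sketch below reuses the same illustrative assumptions as before (toy XOR-style dataset, 2-8-1 sigmoid network without biases, squared-error loss):

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(scale=0.5, size=(8, 2))
W2 = rng.normal(scale=0.5, size=(1, 8))
lr = 0.5

def mse():
    preds = sigmoid(W2 @ sigmoid(W1 @ X.T))
    return float(np.mean((preds - Y.T) ** 2))

loss_before = mse()
for epoch in range(2000):
    dW1 = np.zeros_like(W1)                      # weight update matrix of layer 1
    dW2 = np.zeros_like(W2)                      # weight update matrix of layer 2
    for x, y in zip(X, Y):
        h = sigmoid(W1 @ x)
        y_hat = sigmoid(W2 @ h)
        d2 = (y_hat - y) * y_hat * (1 - y_hat)
        d1 = (W2.T @ d2) * h * (1 - h)
        dW2 += np.outer(d2, h)                   # accumulate over the whole batch
        dW1 += np.outer(d1, x)
    W2 -= lr * dW2 / len(X)                      # one update per epoch,
    W1 -= lr * dW1 / len(X)                      # averaged over all samples
loss_after = mse()
print(loss_before, "->", loss_after)
```

Compared with the stochastic version, the inner loop only *accumulates* the update matrices; the weights themselves move once per epoch.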