Introduction to Multi-Layer Perceptrons (Feedforward Neural Networks)

Multi-Layer Neural Networks

An MLP (for Multi-Layer Perceptron) or multi-layer neural network defines a family of functions. Let us first consider the most classical case of a single hidden layer neural network, mapping a d-vector to an m-vector (e.g. for regression):

g(x) = b + W \tanh(c + V x)    output is an affine transformation of the hidden layer 

where x is a d-vector (the input), V is a k \times d matrix (called input-to-hidden weights), c is a k-vector (called hidden units offsets or hidden unit biases), b is an m-vector (called output units offset or output units biases), and W is an m \times k matrix (called hidden-to-output weights).

The vector-valued function h(x)=\tanh(c + V x) is called the output of the hidden layer. Note how, in the above network, the output is an affine transformation of the hidden layer; in some network architectures a non-linearity is applied to it as well. The elements of the hidden layer are called hidden units.

The kind of operation computed by the above h(x) can be applied on h(x) itself, but with different parameters (different biases and weights). This would give rise to a feedforward multi-layer network with two hidden layers. More generally, one can build a deep neural network by stacking more such layers. Each of these layers may have a different dimension (k above). A common variant is to have skip connections, i.e., a layer can take as input not only the layer at the previous level but also some of the lower layers.
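As a concrete illustration, the single-hidden-layer network g(x) = b + W \tanh(c + V x) can be sketched in a few lines of NumPy; all dimensions and the random initialization below are illustrative, not prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, m = 4, 5, 3              # input, hidden, output dimensions (illustrative)
V = rng.normal(size=(k, d))    # input-to-hidden weights
c = np.zeros(k)                # hidden unit biases
W = rng.normal(size=(m, k))    # hidden-to-output weights
b = np.zeros(m)                # output unit biases

def g(x):
    """Single-hidden-layer MLP: affine output of a tanh hidden layer."""
    h = np.tanh(c + V @ x)     # hidden layer output h(x)
    return b + W @ h           # output is affine in the hidden layer

x = rng.normal(size=d)
y = g(x)
```

Stacking more layers, as described above, amounts to feeding h back through another affine map followed by tanh, with its own weights and biases.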

Most Common Training Criteria and Output Non-Linearities

Let f(x)=r(g(x)) with r representing the output non-linearity function. In supervised learning, the output f(x) can be compared with a target value y through a loss functional L(f,(x,y)). Here are common loss functionals, with the associated output non-linearity:

  • for ordinary (L2) regression: no non-linearity (r(a)=a), squared loss L(f,(x,y))=||f(x)-y||^2 = \sum_i (f_i(x) - y_i)^2.
  • for median (L1) regression: no non-linearity (r(a)=a), absolute value loss L(f,(x,y))=|f(x)-y|_1 = \sum_i |f_i(x) - y_i|.
  • for 2-way probabilistic classification: sigmoid non-linearity (r(a)={\rm sigmoid}(a)=1/(1+e^{-a}), applied element by element), and cross-entropy loss L(f,(x,y))= -y \log f(x) -(1-y)\log(1-f(x)) for y binary. Note that the sigmoid output f(x) is in the (0,1) interval, and corresponds to an estimator of P(y=1|x). The predicted class is 1 if f(x)>\frac{1}{2}.
  • for multiple binary probabilistic classification: each output element is treated as above.
  • for 2-way hard classification with hinge loss: no non-linearity (r(a)=a) and the hinge loss is L(f,(x,y))= \max(0,1 - (2y-1) f(x)) (again for binary y). This is the SVM classifier loss.
  • the above can be generalized to multiple classes by separately considering the binary classifications of each class against the others.
  • multi-way probabilistic classification: softmax non-linearity (r_i(a) = e^{a_i}/\sum_j e^{a_j} with one output per class) with the negative log-likelihood loss L(f,(x,y)) = - \log f_y(x). Note that \sum_i f_i(x)=1 and 0<f_i(x)<1. Note also how this is equivalent to the cross-entropy loss in the 2-class case (the output for the one of the classes is actually redundant).
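The softmax output non-linearity and its negative log-likelihood loss from the last bullet can be written directly (the input values below are arbitrary; the max-subtraction is a standard numerical-stability trick that leaves the result unchanged):

```python
import numpy as np

def softmax(a):
    # subtracting the max does not change the result but avoids overflow
    e = np.exp(a - np.max(a))
    return e / e.sum()

def nll(f, y):
    """Negative log-likelihood loss -log f_y for target class y."""
    return -np.log(f[y])

a = np.array([2.0, 0.5, -1.0])   # illustrative pre-softmax activations
p = softmax(a)                    # sums to 1, each entry in (0, 1)
loss = nll(p, 0)
```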

The Back-Propagation Algorithm

We just apply the recursive gradient computation algorithm seen previously to the graph formed naturally by the MLP, with one node for each input unit, hidden unit and output unit. Note that each parameter (weight or bias) also corresponds to a node, and the final node of the graph corresponds to the loss.

Let us formalize a notation for MLPs with more than one hidden layer. Let us denote with h_i the output vector of the i-th layer, starting with h_0=x (the input), and finishing with a special output layer h_L which produces the prediction or output of the network.

With tanh units in the hidden layers, we have (in matrix-vector notation):

  • for k = 1 to L-1:
  • h_k = {\rm tanh}(b_k + W_k h_{k-1})

    where b_k is a vector of biases and W_k is a matrix of weights connecting layer k-1 to layer k. The scalar computation associated with a single unit i of layer k is

    h_{ki} = {\rm tanh}(b_{ki} + \sum_j W_{kij} h_{k-1,j})

In the case of a probabilistic classifier, we would then have a softmax output layer, e.g.,

  • p = h_L = {\rm softmax}(b_L + W_L h_{L-1})

where we used p to denote the output because it is a vector indicating a probability distribution over classes. And the loss is

  • L = - \log p_y

where y is the target class, i.e., we want to maximize p_y=P(Y=y|x), an estimator of the conditional probability of class y given input x.
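The forward recursion above (tanh hidden layers, softmax output, loss L = -\log p_y) can be sketched as follows; the layer sizes and random initialization are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def forward(x, weights, biases):
    """Forward pass: tanh hidden layers, softmax output layer (h_0 = x)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(b + W @ h)           # h_k = tanh(b_k + W_k h_{k-1})
    a = biases[-1] + weights[-1] @ h     # pre-softmax activations a_L
    e = np.exp(a - a.max())
    return e / e.sum()                   # p = softmax(a_L)

sizes = [4, 6, 5, 3]                     # layer dimensions, input to output
weights = [rng.normal(size=(n, p)) for p, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

x = rng.normal(size=sizes[0])
p = forward(x, weights, biases)
y = 2                                    # target class (arbitrary here)
loss = -np.log(p[y])                     # L = -log p_y
```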

Let us now see how the recursive application of the chain rule in flow graphs is instantiated in this structure. First of all, let us denote

a_k=b_k + W_k h_{k-1}

(for the argument of the non-linearity at each level) and note (from a small derivation) that

\frac{\partial(- \log p_y)}{\partial a_{L,i}} = (p_i - 1_{y=i})

and that

\frac{\partial \tanh(u)}{\partial u} = (1-\tanh(u)^2).
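Both derivatives can be sanity-checked numerically with finite differences (the activation values and target class below are arbitrary):

```python
import numpy as np

a = np.array([0.3, -1.2, 0.8])   # illustrative pre-softmax activations
y = 2                             # illustrative target class

def neg_log_py(a):
    e = np.exp(a - a.max())
    p = e / e.sum()
    return -np.log(p[y]), p

eps = 1e-6
loss, p = neg_log_py(a)
grads = []
for i in range(len(a)):
    ap = a.copy(); ap[i] += eps
    am = a.copy(); am[i] -= eps
    grads.append((neg_log_py(ap)[0] - neg_log_py(am)[0]) / (2 * eps))
analytic = p - np.array([0.0, 0.0, 1.0])   # p_i - 1_{y=i}

u = 0.4
tanh_grad = (np.tanh(u + eps) - np.tanh(u - eps)) / (2 * eps)
# tanh_grad should match 1 - tanh(u)^2
```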

Now let us apply the back-propagation recipe in the corresponding flow graph. Each parameter (each weight and each bias) is a node, each neuron potential a_{ki} and each neuron output h_{ki} is also a node.

  • starting at the output node: \frac{\partial L}{\partial L} = 1

  • then compute the gradient with respect to each pre-softmax sum a_{L,i}: \frac{\partial L}{\partial a_{L,i}} = \frac{\partial L}{\partial L} \frac{\partial L}{\partial a_{L,i}} = (p_i - 1_{y=i})

  • We can now repeat the same recipe for each layer. For k = L down to 1:

    • obtain trivially the gradient wrt biases: \frac{\partial L}{\partial b_{k,i}} = \frac{\partial L}{\partial a_{k,i}} \frac{\partial a_{k,i}}{\partial b_{k,i}}=\frac{\partial L}{\partial a_{k,i}}

    • compute the gradient wrt weights: \frac{\partial L}{\partial W_{k,i,j}} = \frac{\partial L}{\partial a_{k,i}}\frac{\partial a_{k,i}}{\partial W_{k,i,j}} = \frac{\partial L}{\partial a_{k,i}} h_{k-1,j}

    • back-propagate the gradient into lower layer, if k>1:

      • \frac{\partial L}{\partial h_{k-1,j}} = \sum_i \frac{\partial L}{\partial a_{k,i}} \frac{\partial a_{k,i}}{\partial h_{k-1,j}} = \sum_i \frac{\partial L}{\partial a_{ki}} W_{k,i,j}
      • \frac{\partial L}{\partial a_{k-1,j}} = \frac{\partial L}{\partial h_{k-1,j}} \frac{\partial h_{k-1,j}}{\partial a_{k-1,j}}=\frac{\partial L}{\partial h_{k-1,j}} (1 - h_{k-1,j}^2)
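The full recursion can be implemented in a few lines; the sketch below follows the steps above exactly (gradient wrt biases, wrt weights, then back-propagation into the layer below), with illustrative sizes and a finite-difference check of one weight gradient:

```python
import numpy as np

rng = np.random.default_rng(2)

sizes = [3, 4, 2]                      # illustrative layer sizes
Ws = [rng.normal(size=(n, p)) for p, n in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Return hidden outputs h_0..h_{L-1} and output probabilities p."""
    hs = [x]
    for W, b in zip(Ws[:-1], bs[:-1]):
        hs.append(np.tanh(b + W @ hs[-1]))
    a = bs[-1] + Ws[-1] @ hs[-1]
    e = np.exp(a - a.max())
    return hs, e / e.sum()

def backward(x, y):
    """Back-propagation following the recursion in the text."""
    hs, p = forward(x)
    dW = [None] * len(Ws)
    db = [None] * len(bs)
    da = p.copy()
    da[y] -= 1.0                       # dL/da_L = p_i - 1_{y=i}
    for k in reversed(range(len(Ws))):
        db[k] = da                     # dL/db_k = dL/da_k
        dW[k] = np.outer(da, hs[k])    # dL/dW_kij = dL/da_ki * h_{k-1,j}
        if k > 0:
            dh = Ws[k].T @ da          # back-propagate into layer below
            da = dh * (1 - hs[k] ** 2) # times the tanh derivative
    return dW, db, -np.log(p[y])

x = rng.normal(size=sizes[0])
dW, db, loss = backward(x, y=1)

# finite-difference check of one weight gradient
eps = 1e-6
Ws[0][0, 0] += eps
_, p_plus = forward(x)
Ws[0][0, 0] -= 2 * eps
_, p_minus = forward(x)
Ws[0][0, 0] += eps
num = (-np.log(p_plus[1]) + np.log(p_minus[1])) / (2 * eps)
```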

Logistic Regression

Logistic regression is a special case of the MLP with no hidden layer (the input is directly connected to the output) and the cross-entropy (sigmoid output) or negative log-likelihood (softmax output) loss. It corresponds to a probabilistic linear classifier, and the training criterion is convex in terms of the parameters, which guarantees that any local minimum is also a global minimum.
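Because the criterion is convex, plain gradient descent suffices. A minimal sketch on hypothetical, linearly separable 2D data (sizes, learning rate, and step count are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical toy data: label 1 when x_0 + x_1 > 0
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.5

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

for _ in range(200):
    f = sigmoid(X @ w + b)             # f(x) estimates P(y=1 | x)
    # gradient of the mean cross-entropy wrt the pre-sigmoid activation
    # has the familiar form (f - y), analogous to (p_i - 1_{y=i})
    grad_a = (f - y) / len(y)
    w -= lr * (X.T @ grad_a)
    b -= lr * grad_a.sum()

acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
```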

Training Multi-Layer Neural Networks

Many algorithms have been proposed to train multi-layer neural networks but the most commonly used ones are gradient-based.

Two fundamental issues guide the various strategies employed in training MLPs:

  • training as efficiently as possible, i.e., getting the training error down as quickly as possible, avoiding getting stuck in narrow valleys or even local minima of the cost function,
  • controlling capacity so as to avoid overfitting, i.e., to minimize generalization error.