Wallkötter, Sebastian: Regularizing gradient properties on deep neural networks.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Bachelor's thesis (2016).
31 pages, English.
|CR classification||F.1.1 (Models of Computation)|
G.1.6 (Numerical Analysis: Optimization)
I.2.6 (Artificial Intelligence: Learning)
I.5.2 (Pattern Recognition: Design Methodology)
This bachelor's thesis presents a novel approach to training deep neural networks. When backpropagating through deep architectures, the gradient often vanishes. Moreover, layers with logistic activation functions saturate from top to bottom, which slows down convergence because the gradient cannot propagate well past the saturated layers. Both observations motivate the ability to regularize the gradient and directly enforce its properties. This thesis enables such regularization by modifying the network's cost function. These modifications change the classic backpropagation equations, and therefore extended backpropagation equations are derived. Finally, two regularization methods and their combination are presented and tested on a binary and a multi-class (MNIST) classification problem to show the benefits of training with them. A key result of this thesis is that this setup substantially improves training on logistic networks: it enables otherwise impossible classification in the multi-class case, while also speeding up training in the binary case.
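The vanishing-gradient effect the abstract refers to can be illustrated with a minimal sketch (not taken from the thesis; the depth, width, and Xavier-style initialization are illustrative assumptions): backpropagating through a chain of logistic (sigmoid) layers, the per-layer factor σ'(z) = a(1 − a) ≤ 0.25 shrinks the gradient norm layer by layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
depth, width = 10, 32  # illustrative choices, not from the thesis
# Xavier-like scale keeps activations in the near-linear sigmoid regime.
weights = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
           for _ in range(depth)]

# Forward pass, keeping activations for the backward pass.
a = rng.normal(size=width)
activations = [a]
for W in weights:
    a = sigmoid(W @ a)
    activations.append(a)

# Backward pass: delta^l = (W^{l+1})^T delta^{l+1} * sigma'(z^l),
# with sigma'(z) = a(1 - a). Start from a stand-in output error of ones.
delta = np.ones(width) * activations[-1] * (1.0 - activations[-1])
norms = [np.linalg.norm(delta)]
for l in range(depth - 1, 0, -1):
    a = activations[l]
    delta = (weights[l].T @ delta) * a * (1.0 - a)
    norms.append(np.linalg.norm(delta))

# The gradient norm decays by orders of magnitude toward the input layers,
# which is the behavior that motivates regularizing gradient properties.
print(f"output-layer gradient norm: {norms[0]:.3e}")
print(f"input-layer gradient norm:  {norms[-1]:.3e}")
```

Since each backward step multiplies the gradient by at most roughly 0.25 times the weight matrix gain, the norm at the bottom layers is many orders of magnitude smaller than at the top, which is exactly why saturated logistic layers stall training.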
|PDF (1062652 Bytes)|
|Department(s)||Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Maschinelles Lernen und Robotik|
|Supervisor||Toussaint, Prof. Marc|
|Entry date||25 September 2018|