Bachelor Thesis BCLR-2016-06

Bibliography: Wallkötter, Sebastian: Regularizing gradient properties on deep neural networks.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Bachelor Thesis (2016).
31 pages, English.
CR-Schema: F.1.1 (Models of Computation)
G.1.6 (Numerical Analysis: Optimization)
I.2.6 (Artificial Intelligence: Learning)
I.5.2 (Pattern Recognition: Design Methodology)
Abstract

This bachelor thesis presents a novel approach to training deep neural networks. When backpropagating through such deep architectures, the gradient is often found to vanish. Furthermore, layers with logistic activation functions saturate from top to bottom, which slows down convergence because the gradient cannot propagate well past these saturated layers. Both observations motivate the wish to regularize the gradient and directly enforce its properties. This thesis enables such regularization by modifying the network's cost function. These changes alter the classic backpropagation equations, and the new, extended backpropagation equations are therefore derived. Finally, two methods of regularization, as well as their combination, are presented and tested on a binary and a multi-class (MNIST) classification problem to demonstrate the benefits of training with these methods. A key result of this thesis is the finding that this setup massively improves training of logistic networks, on the one hand enabling otherwise impossible classification in the multi-class case, and on the other speeding up training in the binary case.
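To make the general idea concrete, the following is a minimal, hypothetical PyTorch sketch of augmenting a cost function with a differentiable penalty on the parameter gradients. It is not the thesis's own method (the thesis derives extended backpropagation equations analytically); the network architecture, the penalty form, and the weight lam are illustrative assumptions.

    import torch
    import torch.nn as nn

    # Toy logistic (sigmoid) network; architecture chosen for illustration only.
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(2, 8), nn.Sigmoid(),
                        nn.Linear(8, 8), nn.Sigmoid(),
                        nn.Linear(8, 1), nn.Sigmoid())
    bce = nn.BCELoss()
    opt = torch.optim.SGD(net.parameters(), lr=0.5)
    lam = 1e-2  # assumed regularization weight

    # Toy binary data: label is 1 when the coordinates sum to a positive value.
    x = torch.randn(64, 2)
    y = (x.sum(dim=1, keepdim=True) > 0).float()

    for _ in range(100):
        opt.zero_grad()
        loss = bce(net(x), y)
        # Gradients taken with create_graph=True so that the penalty term is
        # itself differentiable and contributes to the final backward pass.
        grads = torch.autograd.grad(loss, list(net.parameters()),
                                    create_graph=True)
        # Placeholder penalty R: squared gradient norm. Depending on which
        # gradient property one wants to enforce, the sign or form of R would
        # change (e.g., rewarding large gradients to counteract vanishing).
        penalty = sum(g.pow(2).sum() for g in grads)
        total = loss + lam * penalty  # modified cost function
        total.backward()
        opt.step()

Because the penalty depends on the gradients, differentiating the modified cost requires second derivatives, which is why the first backward pass is taken with create_graph=True.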

Full text and other links: PDF (1062652 Bytes)
Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Machine Learning and Robotics
Supervisor(s): Toussaint, Prof. Marc
Entry date: September 25, 2018