Bibliography | Zimmermann, Heiko: Bayesian functional optimization of neural network activation functions. University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 116 (2017). 51 pages, English.
|
Abstract | Bayesian optimization has seen many successes as a black-box and hyperparameter optimization method in machine learning applications. Most existing approaches optimize an unknown objective function by treating it as a random function and placing a parametric prior over it. Recently, an alternative approach was introduced that allows Bayesian optimization to work in nonparametric settings to optimize functionals (Bayesian functional optimization). Another well-recognized framework that powers some of today’s most competitive machine learning algorithms is artificial neural networks, which are state-of-the-art tools for parameterizing and training complex nonlinear models. However, while a lot of attention is normally paid to the network’s layout and structure, the neurons’ nonlinear activation function is often still chosen from a set of commonly used functions. While recent work addressing this problem mainly considers steepest-descent-based methods to jointly train individual neuron activation functions and the network parameters, we use Bayesian functional optimization to search for globally optimal shared activation functions. To this end, we formulate the problem as a functional optimization problem and model the activation functions as elements of a reproducing kernel Hilbert space. Our experiments show that Bayesian functional optimization outperforms a similar parametric approach using standard Bayesian optimization and works well for higher-dimensional problems. Compared to the baseline models with a fixed sigmoid activation function and a jointly trained shared activation function, we achieved a relative improvement in classification error of over 39% and over 20%, respectively.
|
Full text and other links | Full text
|
Department(s) | University of Stuttgart, Institute of Parallel and Distributed Systems, Machine Learning and Robotics
|
Supervisor(s) | Toussaint, Prof. Marc; Ngo, Ph.D. Vien |
Entry date | May 9, 2022 |
---|