Master Thesis MSTR-2019-83

Bibliography: Tschechlov, Dennis: Analysis and Transfer of AutoML Concepts for Clustering Algorithms.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 83 (2019).
91 pages, English.

Data analysts are confronted with the task of selecting an appropriate algorithm with suitable hyperparameters for the datasets they want to analyze. To do so, they typically execute and evaluate many configurations in a trial-and-error manner. For novice data analysts in particular, this is a time-consuming task. Recent advances in the research area of AutoML address this problem by automatically finding a suitable algorithm with appropriate hyperparameters. Yet these systems are applicable only to supervised learning tasks and not to unsupervised learning. In the scope of this work, existing AutoML systems are analyzed in detail. Subsequently, a concept is developed that uses components from existing AutoML systems but modifies them so that they are applicable to unsupervised learning. Although various kinds of unsupervised learning methods exist, this work focuses on clustering, a popular unsupervised method. The concept is also implemented as a proof-of-concept prototype, which is used for the evaluation. The comprehensive evaluation discusses the results of different optimization methods for selecting a suitable clustering algorithm with appropriate hyperparameters. The evaluation shows that the number of clusters predicted by the implemented prototype deviates only slightly from the actual number of clusters. Hence, this work shows that it is possible to successfully transfer the concepts of existing AutoML systems to the unsupervised learning method of clustering and at the same time achieve precise results in an acceptable amount of time.
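The idea of searching a configuration space for clustering without ground-truth labels can be illustrated with a minimal sketch. It is not the thesis prototype: the abstract names neither the optimization methods nor the evaluation metric, so the random search, the silhouette coefficient as internal metric, the restriction to k-means with the number of clusters k as the only hyperparameter, and all function names below are assumptions made purely for illustration.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; an internal metric, no true labels needed."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for one cluster; treated as worst case here
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def random_search(points, budget=15, seed=0):
    """Samples configurations at random and keeps the best-scoring one."""
    rng = random.Random(seed)
    best_score, best_k = float("-inf"), None
    for _ in range(budget):
        k = rng.randint(2, 6)  # hypothetical search space for k
        score = silhouette(points, kmeans(points, k, rng))
        if score > best_score:
            best_score, best_k = score, k
    return best_score, best_k


def make_blobs(seed=1):
    """Synthetic data: three well-separated 2D Gaussian blobs, 20 points each."""
    rng = random.Random(seed)
    centers = [(0, 0), (10, 0), (5, 9)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(20)]
```

Calling `random_search(make_blobs())` returns the best silhouette score found and the corresponding k. A fuller configuration space would also include the choice among several clustering algorithms, and the random sampling could be replaced by a more informed optimizer.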

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Applications of Parallel and Distributed Systems
Supervisor(s): Schwarz, PD Dr. Holger; Fritz, Manuel
Entry date: March 2, 2020