Master's thesis MSTR-2022-04

Bibliographic data
Kunze, Ulf: Partitioning training data for complex multi-class problems using constraint-based clustering.
Universität Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Master's thesis No. 4 (2022).
58 pages, English.
Abstract

Quality control is one of the most important tools for protecting consumers from low-quality products and is therefore essential. However, quality control should not only keep defective and substandard products off the market, but also enable their repair whenever possible. This avoids unnecessary waste and makes sustainable use of resources. In this thesis, constraint-based clustering algorithms are evaluated for the use case of quality control. Constraint-based clustering algorithms are used because they promise more flexibility than rigid partitions. Reallocating data instances into new clusters can help to reduce the influence of analytic challenges such as a heterogeneous product portfolio, multi-class imbalance, and small sample sizes. The algorithms CDBSCAN, COP-KMeans, and MPCK-Means are evaluated. The constraint sets used are constraints by product group, engine type, and error class. This work also examines the existing method in more detail to understand the different behaviours. The end result is an average improvement of 5% over the existing approach and an increase of 13% over a random forest classifier. Furthermore, methods for extracting domain knowledge from data sets are investigated. For this purpose, an active learning approach and an algorithmic approach are integrated into the existing pipeline.
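
As a rough illustration of the pairwise constraints that constraint-based clustering relies on, the sketch below derives must-link and cannot-link pairs from a single categorical attribute and checks whether a tentative cluster assignment would break one of them, in the style of the assignment step of COP-KMeans. The abstract does not say how the thesis builds its constraint sets, so the rule used here (same value gives must-link, different value gives cannot-link) is an assumption, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def constraints_from_labels(labels):
    """Build pairwise constraints from one categorical attribute.

    Assumption (not taken from the thesis): instances sharing a value
    are must-linked, instances with different values are cannot-linked.
    """
    must_link, cannot_link = set(), set()
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            must_link.add((i, j))
        else:
            cannot_link.add((i, j))
    return must_link, cannot_link


def violates(idx, cluster, assignment, must_link, cannot_link):
    """Return True if placing instance `idx` into `cluster` breaks a constraint.

    `assignment` maps already-assigned instance indices to cluster ids;
    unassigned partners are ignored.
    """
    for a, b in must_link:
        if idx in (a, b):
            other = b if a == idx else a
            # A must-link partner already sits in a different cluster.
            if other in assignment and assignment[other] != cluster:
                return True
    for a, b in cannot_link:
        if idx in (a, b):
            other = b if a == idx else a
            # A cannot-link partner already sits in the same cluster.
            if other in assignment and assignment[other] == cluster:
                return True
    return False


# Hypothetical toy data: four instances from two product groups.
product_group = ["A", "A", "B", "B"]
must_link, cannot_link = constraints_from_labels(product_group)
```

In a COP-KMeans-style loop, each instance would be assigned to the nearest centroid for which `violates` returns False. Linking every pair, as done here, fixes the partition completely; a practical setup would more likely use only a sample of the pairs.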

Full text and other links
Full text
Department(s): Universität Stuttgart, Institute for Parallel and Distributed Systems, Application Software
Supervisor(s): Schwarz, Prof. Holger; Tschechlov, Dennis
Entry date: April 28, 2022