Bibliographic data | Fritz, Manuel; Schwarz, Holger: Initializing k-Means Efficiently: Benefits for Exploratory Cluster Analysis. In: Panetto, Hervé (Ed.); Debruyne, Christophe (Ed.); Hepp, Martin (Ed.); Lewis, Dave (Ed.); Ardagna, Claudio Agostino (Ed.); Meersman, Robert (Ed.): On the Move to Meaningful Internet Systems: OTM 2019 Conferences. Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik. Lecture Notes in Computer Science (LNCS); 11877, pp. 146-163, English. Springer Nature Switzerland AG, January 2019. ISBN: 978-3-030-33245-7; DOI: 10.1007/978-3-030-33246-4. Article in conference proceedings (conference contribution).
|
Corporate body | On the Move to Meaningful Internet Systems |
CR classification | E.0 (Data General) H.2.8 (Database Applications) H.3.3 (Information Search and Retrieval)
|
Keywords | Exploratory cluster analysis; k-Means; Initialization |
Abstract | Data analysis is a highly exploratory task, where various algorithms with different parameters are executed until a solid result is achieved. This is especially evident for cluster analyses, where the number of clusters must be provided prior to the execution of the clustering algorithm. Since this number is rarely known in advance, the algorithm is typically executed several times with varying parameters. Hence, the duration of the exploratory analysis heavily depends on the runtime of each execution of the clustering algorithm. While previous work shows that the initialization of clustering algorithms is crucial for fast and solid results, it solely focuses on a single execution of the clustering algorithm and thereby neglects previous executions. We propose Delta Initialization as an initialization strategy for k-Means in such an exploratory setting. The core idea of this new algorithm is to exploit the clustering results of previous executions in order to enhance the initialization of subsequent executions. We show that this algorithm is well suited for exploratory cluster analysis as considerable speedups can be achieved while additionally achieving superior clustering results compared to state-of-the-art initialization strategies.
|
Full text and other links | Springer Link
|
Contact | manuel.fritz@ipvs.uni-stuttgart.de |
Department(s) | Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Anwendersoftware
|
Entry date | October 16, 2019 |
---|