Bibliographic data | Fritz, Manuel; Tschechlov, Dennis; Schwarz, Holger: Efficient Exploratory Clustering Analyses with Qualitative Approximations. In: Proceedings of the 24th International Conference on Extending Database Technology (EDBT). Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik. pp. 1-6, English. Online, March 2021. DOI: 10.5441/002/EDBT.2021.31. Article in proceedings (conference paper).
|
CR classification | H.2.8 (Database Applications)
|
Abstract | Clustering is a fundamental primitive for exploratory data analyses. Yet, finding valuable clustering results for previously unseen datasets is a pivotal challenge. Analysts as well as automated exploration methods often perform an exploratory clustering analysis, i.e., they repeatedly execute a clustering algorithm with varying parameters until valuable results can be found. k-center clustering algorithms, such as k-Means, are commonly used in such exploratory processes. However, in the worst case, each single execution of k-Means requires a super-polynomial runtime, making the overall exploratory process on voluminous datasets infeasible in a reasonable time frame. We propose a novel and efficient approach for approximating results of k-center clustering algorithms, thus supporting analysts in an ad-hoc exploratory process for valuable clustering results. Our evaluation on an Apache Spark cluster unveils that our approach significantly outperforms the regular execution of a k-center clustering algorithm by several orders of magnitude in runtime with a predefinable qualitative demand. Hence, our approach is a strong fit for clustering voluminous datasets in exploratory settings.
|
Department(s) | Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Anwendersoftware
|
Project(s) | INTERACT
|
Entry date | May 27, 2021 |
---|