|Fritz, Manuel; Muazzen, Osama; Behringer, Michael; Schwarz, Holger: ASAP-DM: a framework for automatic selection of analytic platforms for data mining. |
In: Software-Intensive Cyber-Physical Systems.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
pp. 1-13, English.
Springer Berlin Heidelberg, 17 August 2019.
ISSN: 2524-8510; 2524-8529; DOI: 10.1007/s00450-019-00408-7.
Journal article.
|CR classification||E.0 (Data General)|
H.2.8 (Database Applications)
H.3.3 (Information Search and Retrieval)
|Keywords||Data mining; Analytic platform; Platform selection|
The plethora of analytic platforms makes it increasingly difficult to select the platform that best fits the data mining task at hand, the dataset, and additional user-defined criteria. Analysts in particular, who focus on the analytics domain, have difficulty keeping up with the latest developments. In this work, we introduce the ASAP-DM framework, which enables analysts to use several platforms seamlessly, while programmers can easily add platforms to the framework. Furthermore, we investigate how to predict a platform based on specific criteria, such as lowest runtime or resource consumption during the execution of a data mining task. We formulate this task as an optimization problem, which can be solved by today's classification algorithms. We evaluate the proposed framework on several analytic platforms such as Spark, Mahout, and WEKA along with several data mining algorithms for classification, clustering, and association rule discovery. Our experiments show that the automatic selection process can save up to 99.71% of the execution time by automatically choosing a faster platform.
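The idea of casting platform selection as a classification problem can be illustrated with a minimal sketch. It is not the ASAP-DM implementation: the dataset features (row and column counts), the runtime figures, and the choice of a k-nearest-neighbour classifier are all assumptions made up for illustration. Each past run is labeled with the platform that had the lowest runtime, and a new dataset is assigned the label most common among its nearest neighbours.

```python
from collections import Counter
from math import dist, log10

# Hypothetical measurements, invented for illustration only (not from the paper):
# ((n_rows, n_columns), {platform: runtime in seconds})
MEASUREMENTS = [
    ((1_000, 10), {"WEKA": 0.4, "Spark": 6.0, "Mahout": 9.0}),
    ((5_000, 5), {"WEKA": 0.9, "Spark": 6.5, "Mahout": 9.5}),
    ((10_000, 20), {"WEKA": 2.1, "Spark": 7.5, "Mahout": 11.0}),
    ((50_000, 20), {"WEKA": 12.0, "Spark": 9.0, "Mahout": 14.0}),
    ((1_000_000, 50), {"WEKA": 900.0, "Spark": 40.0, "Mahout": 75.0}),
    ((10_000_000, 50), {"WEKA": 9000.0, "Spark": 210.0, "Mahout": 260.0}),
]


def fastest_platform(runtimes):
    """Return the platform with the lowest runtime (the class label)."""
    return min(runtimes, key=runtimes.get)


def build_training_set(measurements):
    """Turn raw measurements into (features, label) pairs."""
    return [(features, fastest_platform(rt)) for features, rt in measurements]


def scale(features):
    """Log-scale the features so that very different dataset sizes stay comparable."""
    return tuple(log10(value) for value in features)


def predict_platform(training_set, query, k=3):
    """Predict a platform by majority vote among the k nearest training points."""
    q = scale(query)
    neighbours = sorted(training_set, key=lambda item: dist(scale(item[0]), q))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


TRAINING = build_training_set(MEASUREMENTS)
small = predict_platform(TRAINING, (2_000, 10))
large = predict_platform(TRAINING, (5_000_000, 50))
```

With this toy data, `small` resolves to the single-node platform and `large` to a distributed one. Optimizing for a different criterion, such as resource consumption, would only change how the labels are derived in `fastest_platform`.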
|Copyright||Springer Berlin Heidelberg |
|Department(s)||Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Anwendersoftware|
|Entry date||19 August 2019|