Journal article ART-2019-11

Fritz, Manuel; Muazzen, Osama; Behringer, Michael; Schwarz, Holger: ASAP-DM: a framework for automatic selection of analytic platforms for data mining.
In: Software-Intensive Cyber-Physical Systems.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
pp. 1-13, English.
Springer Berlin Heidelberg, August 17, 2019.
ISSN: 2524-8510; 2524-8529; DOI: 10.1007/s00450-019-00408-7.
Journal article.
CR classification: E.0 (Data General)
H.2.8 (Database Applications)
H.3.3 (Information Search and Retrieval)
Keywords: Data mining; Analytic platform; Platform selection

The growing number of analytic platforms makes it increasingly difficult to select the platform that best fits the required data mining task, the dataset, and additional user-defined criteria. Analysts in particular, who focus mainly on the analytics domain, find it difficult to keep up with the latest developments. In this work, we introduce the ASAP-DM framework, which enables analysts to seamlessly use several platforms, while programmers can easily add further platforms to the framework. Furthermore, we investigate how to predict a platform based on specific criteria, such as lowest runtime or lowest resource consumption during the execution of a data mining task. We formulate this task as an optimization problem, which can be solved by today’s classification algorithms. We evaluate the proposed framework on several analytic platforms such as Spark, Mahout, and WEKA along with several data mining algorithms for classification, clustering, and association rule discovery. Our experiments show that the automatic selection process can save up to 99.71% of the execution time by automatically choosing a faster platform.
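
The idea of treating platform selection as a classification problem can be illustrated with a minimal sketch. It is not the ASAP-DM implementation: the meta-features (rows and columns of the dataset), the nearest-neighbour classifier, the function names, and all runtime figures below are assumptions made up for illustration only. Each past run is labelled with the platform that had the lowest runtime, and a new dataset receives the label of its most similar past run.

```python
import math

# Hypothetical measurements: (rows, columns) -> runtime in seconds per platform.
# All numbers are invented for illustration; they are NOT results from the paper.
RUNS = [
    ((1_000, 10), {"WEKA": 2.0, "Spark": 30.0, "Mahout": 45.0}),
    ((5_000, 20), {"WEKA": 6.0, "Spark": 32.0, "Mahout": 50.0}),
    ((2_000_000, 10), {"WEKA": 900.0, "Spark": 60.0, "Mahout": 140.0}),
    ((10_000_000, 50), {"WEKA": 7200.0, "Spark": 300.0, "Mahout": 650.0}),
]


def best_platform(runtimes):
    """Return the platform with the lowest runtime (the optimization target)."""
    return min(runtimes, key=runtimes.get)


def features(meta):
    """Map dataset meta-data to a feature vector (log scale, an assumption)."""
    rows, cols = meta
    return (math.log10(rows), math.log10(cols))


# Training set: feature vector -> label, where the label is the fastest platform.
TRAINING = [(features(meta), best_platform(rt)) for meta, rt in RUNS]


def predict_platform(meta):
    """1-nearest-neighbour classification over the labelled past runs."""
    x = features(meta)
    _, label = min(TRAINING, key=lambda example: math.dist(example[0], x))
    return label


def time_saved_percent(slow, fast):
    """Relative execution time saved by running on the faster platform."""
    return 100.0 * (slow - fast) / slow
```

Under these made-up measurements, predict_platform((3_000, 15)) selects WEKA and predict_platform((5_000_000, 30)) selects Spark. Any classifier could take the place of the nearest-neighbour rule; it is used here only because it needs no external library. Other criteria, such as resource consumption, would change only the labelling function.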

Copyright: Springer Berlin Heidelberg
Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Anwendersoftware
Entry date: August 19, 2019