Bachelor Thesis BCLR-0156

Gessler, Alexander: MapReduce to Couple a Bio-mechanical and a Systems-biological Simulation.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Bachelor Thesis No. 156 (2014).
106 pages, English.
CR classification: H.2.4 (Database Management Systems)
H.2.8 (Database Applications)
H.2.3 (Database Management Languages)
H.3.4 (Information Storage and Retrieval Systems and Software)
H.4.1 (Office Automation)

Recently, workflow technology has raised hopes in the scientific community that it could make complex scientific simulations easier to implement and maintain. The subject of this thesis is an existing workflow for a multi-scalar simulation which calculates the flux of porous mass in human bones. The simulation consists of separate systems-biological and bio-mechanical simulation steps coupled through additional data processing steps. The workflow exhibits a high potential for parallelism, which is exploited only to a marginal degree. We therefore investigate whether _Big Data_ concepts such as MapReduce or NoSQL can be integrated into the workflow.

A prototype of the workflow is developed using the Apache Hadoop ecosystem to parallelize the simulation, and this prototype is compared against a hand-parallelized baseline prototype in terms of performance and scalability. NoSQL concepts for storing inputs and results are utilized, with an emphasis on HDFS, the Hadoop Distributed File System, as a schemaless distributed file system and on MySQL Cluster as an intermediary between a classic database system and a NoSQL system.

Lastly, the MapReduce-based prototype is implemented in the WS-BPEL workflow language using the SIMPL[0] framework and a custom Web Service to access Hadoop functionality. We show the simplicity of the resulting workflow model and argue that the approach greatly decreases implementation effort while enabling simulations to scale to very large data volumes with ease.

[0] P. Reimann, M. Reiter, H. Schwarz, D. Karastoyanova, F. Leymann. SIMPL - A Framework for Accessing External Data in Simulation Workflows. In BTW, pp. 534–553. Kaiserslautern, Germany, 2011.

Full text and other links: PDF (4184413 bytes)
Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Anwendersoftware (Application Software)
Supervisor: Reimann, Peter
Project(s): SimTech - DP4DDS
Entry date: January 20, 2015