Master's Thesis MSTR-2016-88

Bibliographic data
Walter, Johannes: Design and implementation of a fault simulation layer for the combination technique on HPC systems.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Master's Thesis No. 88 (2016).
95 pages, English.
Abstract

In today's supercomputers, computing power is achieved by running a large number of processors in parallel. As the number of simultaneously used processors grows, so does the probability of hardware faults and the resulting process failures. MPI is a popular standard for exchanging messages in networks. Current MPI versions are not fault-tolerant and terminate the whole MPI network when a fault occurs. ULFM, a proposed fault-tolerant extension of MPI, has no stable implementation and is not available on supercomputers. This master's thesis introduces and implements a concept for a fault simulator that acts as an intermediate layer between MPI and the application. The fault simulator is intended to simulate process crashes and the behavior of ULFM without terminating the underlying MPI network.

Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Simulationssoftwarebau
Supervisor(s): Pflüger, Jun.-Prof. Dirk; Heene, Mario
Entry date: June 19, 2019