Article in Proceedings INPROC-2017-31

Bibliographic data
Heene, Mario; Parra Hinojosa, Alfredo; Bungartz, Hans-Joachim; Pflüger, Dirk: A Massively-Parallel, Fault-Tolerant Solver for High-Dimensional PDEs.
In: Desprez, F. (ed.); et al. (eds.): Euro-Par 2016: Parallel Processing Workshops.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology.
Lecture Notes in Computer Science (LNCS); 10104, pp. 635-647, English.
Cham: Springer, May 28, 2017.
DOI: 10.1007/978-3-319-58943-5_51.
Article in Proceedings (Conference Paper).
Corporate body: Euro-Par 2016
CR classification: G.4 (Mathematical Software)
Abstract

We investigate the effect of hard faults on a massively-parallel implementation of the Sparse Grid Combination Technique (SGCT), an efficient numerical approach for the solution of high-dimensional time-dependent PDEs. The SGCT allows us to increase the spatial resolution of a solver to a level that is out of scope with classical discretization schemes due to the curse of dimensionality. We exploit the inherent data redundancy of this algorithm to obtain a scalable and fault-tolerant implementation without the need of checkpointing or process replication. It is a lossy approach that can guarantee convergence for a large number of faults and a wide range of applications. We present first results using our fault simulation framework – and the first convergence and scalability results with simulated faults and algorithm-based fault tolerance for PDEs in more than three dimensions.

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Simulation of Large Systems
Project(s): EXAHD
Entry date: June 19, 2017