Article in Proceedings INPROC-2017-31

Bibliographic data
Heene, Mario; Parra Hinojosa, Alfredo; Bungartz, Hans-Joachim; Pflüger, Dirk: A Massively-Parallel, Fault-Tolerant Solver for High-Dimensional PDEs.
In: Desprez, F. (ed.); et al. (eds.): Euro-Par 2016: Parallel Processing Workshops.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
Lecture Notes in Computer Science (LNCS); 10104, pp. 635-647, English.
Cham: Springer, May 28, 2017.
DOI: 10.1007/978-3-319-58943-5_51.
Article in proceedings (conference contribution).
Corporate body: Euro-Par 2016
CR classification: G.4 (Mathematical Software)
Abstract

We investigate the effect of hard faults on a massively-parallel implementation of the Sparse Grid Combination Technique (SGCT), an efficient numerical approach for the solution of high-dimensional time-dependent PDEs. The SGCT allows us to increase the spatial resolution of a solver to a level that is out of scope with classical discretization schemes due to the curse of dimensionality. We exploit the inherent data redundancy of this algorithm to obtain a scalable and fault-tolerant implementation without the need of checkpointing or process replication. It is a lossy approach that can guarantee convergence for a large number of faults and a wide range of applications. We present first results using our fault simulation framework – and the first convergence and scalability results with simulated faults and algorithm-based fault tolerance for PDEs in more than three dimensions.

Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Simulation großer Systeme
Project(s): EXAHD
Entry date: June 19, 2017