Article in Proceedings INPROC-2013-26

Bibliographic data
Koldehofe, Boris; Mayer, Ruben; Ramachandran, Umakishore; Rothermel, Kurt; Völz, Marco: Rollback-Recovery without Checkpoints in Distributed Event Processing Systems.
In: Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems (DEBS).
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
pp. 27-38, English.
ACM, June 27, 2013.
DOI: 10.1145/2488222.2488259.
Article in proceedings (conference contribution).
CR classification: C.2.4 (Distributed Systems)
C.4 (Performance of Systems)
Keywords: Reliability; Recovery; Complex Event Processing
Abstract

Reliability is of critical importance to many applications involving distributed event processing systems. The use of stateful operators in particular makes it challenging to provide efficient recovery from failures and to ensure consistent event streams. Even during failure-free execution, state-of-the-art methods for achieving reliability incur significant run-time overhead in computational resources, event traffic, and event detection time. This paper proposes a novel method for rollback-recovery that allows for recovery from multiple simultaneous operator failures but eliminates the need for persistent checkpoints. To this end, the operator state is preserved in savepoints at points in time when its execution depends solely on the state of incoming event streams which are reproducible by predecessor operators. We propose an expressive event processing model to determine savepoints, and algorithms for their coordination in a distributed operator network. Evaluations show that the method achieves very low overhead during failure-free execution in comparison to other approaches.

Full text and other links
PDF (555407 Bytes)
The original publication is available at ACM Digital Library
Copyright © ACM, 2013. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 7th International Conference on Distributed Event-Based Systems, pp. 27-38, Arlington, Texas, USA, June 29 - July 3, 2013.
Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Verteilte Systeme
Project(s): Adaptive Kommunikationssysteme
CEPiL
Entry date: May 27, 2013