Studienarbeit STUD-2412

Bibliograph.
Daten
Mayer, Christian: Design and Implementation of a Coordination Service for Distributed Applications (In-memory Paxos).
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Studienarbeit Nr. 2412 (2013).
48 Seiten, englisch.
CR-Klassif.C.2.4 (Distributed Systems)
C.4 (Performance of Systems)
H.3.4 (Information Storage and Retrieval Systems and Software)
Kurzfassung

Abstract Coordination of different, independent processes is a very important aspect in the area of distributed systems. In order to coordinate each other, participants of a distributed system often have to agree on some common knowledge such as locking of a shared resource. The general problem how to reach agreement on some value is also known as consensus problem. In many practical systems, the consensus problem is outsourced to a distributed system consisting of multiple servers to increase its availability. Each server can be contacted by clients that intend to reach consensus about a specific value. Examples are Google’s locking service Chubby or Yahoo’s distributed file system Zookeeper. The standard Paxos algorithm solves this problem in an environment where nodes may recover after a crash and messages can have infinite delay. However, a system based on the classical Paxos algorithm makes use of expensive stable storage operations to guarantee that a crashed and recovered Paxos server is still able to participate in the protocol. Studies have shown that these disk costs are the bottleneck of the whole system. In this work a performance-oriented version of Paxos will be investigated that still solves the consensus problem, but trades availability of the consensus system against performance by not using stable storage operations. Without careful design this can be problematic, because a Paxos server that has lost its memory can be dangerous for the success of the protocol. This can be solved by not allowing a crashed and recovered Paxos server to participate in the protocol anymore. Instead, a recovered server rejoins the protocol with a new id, so that no active processes assume anything about the recovered process. In order to join the group of active processes, a majority of servers has to be active and agree on this. Our evaluations show, that a coordination system based on the in-memory Paxos approach has a very short response time of only one millisecond and high throughput up to 18000 write requests per second.

Volltext und
andere Links
PDF (902781 Bytes)
Abteilung(en)Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Verteilte Systeme
BetreuerDürr Frank
Eingabedatum18. Juli 2013
   Publ. Abteilung   Publ. Institut   Publ. Informatik