Chain replication for supporting high throughput and availability
Chain replication tries to support high throughput, availability, and strong consistency
The state of an object consists of a history of updates that have been performed and a set of pending requests
Chain replication uses a fail-stop approach, where servers halt in response to failure rather than making erroneous transitions
We use a primary/backup approach, so servers can fail without compromising an object. These servers are linearly ordered to form a chain.
- The reply for every request is generated and sent by the tail
- Query requests are directed to the tail
- Update requests are processed in the head and then passed down to the tail
Since update requests are processed by a single replica, they can be non-deterministic
The master service detects server failure and informs each server in the chain of its position and the current head and tail. In their prototype, this is implemented as a replicated service coordinated with Paxos
Fault Tolerance
First note that servers earlier in the chain have a more recent history than servers later in the chain
If the master detects failure of the head, we simply notify the chain of the update and lose any of the pending updates that had not been propagated.
If the master detects failure of the tail, it notifies the chain of the update. At this point, the set of completed requests has likely increased,
Failure of other servers is more complex. Each server maintains a list of sent requests that have been forwarded but not processed by the tail. When the tail processes a request, it forwards and acknowledgement up the chain. The master finds the last request that has received and tells to reforward the missing bits.
To add a new server, we append to the tail. The existing tail is responsible for forwarding the history until it is caught up, at which point requests in the sent list are forwarded to the new tail.
Performance
Chain replication is an instance of primary/backup. The head and tail share responsibility for ordering operations, which in theory is less overhead. However, serially distributing data also makes for higher latency.
In their experiments, they find that chain replication is consistently faster than a regular primary/backup protocol.