10 13 Lightweight Causal and Atomic Group Multicast - lkuper/CSE232-2021-09 GitHub Wiki
Discussion for: Kenneth Birman, André Schiper, and Pat Stephenson, "Lightweight Causal and Atomic Group Multicast" (TOCS, 1991)
Paper overview
In the ISIS toolkit, previous iterations of broadcast protocols used methods that did not scale beyond a certain number of nodes and limited performance. In this paper, a new protocol for causal broadcast is described that provides much better scalability and allows for more asynchronous behavior (thereby increasing performance), and does not need any form of stable storage. By using vector time to keep track of causal history within a process group, the new CBCAST protocol is able to maintain causality, and ABCAST can be implemented while respecting the causal order of all other multicast operations. Beyond just using vector time, the paper proves several results related to fault tolerance and message stability when process group membership changes for any reason. Specifically, every process maintains a view of the current state of its group, and when group membership changes processes use a special flush operation to ensure new multicasts cannot be sent until the process is up to date on messages sent in prior views. Several versions of this flush protocol are presented, and a protocol that sends a linear number of messages (instead of an exponential number) is given as an optimization.
Discussion questions
Q1
(Contributed by @Koda98)
CBCAST ensures causal delivery, while ABCAST ensures totally-ordered delivery and also causal delivery. What would be an example of a scenario where we would want to use ABCAST rather than CBCAST?
Discussion Summary for Q1.
- ‘Concurrent’ operations happening on different processes. Use ABCAST if you’re very interested in the order of events.
-
Events are concurrent in terms of partial ordering and we want transactions to be atomic and therefore would need ABCAST. If there’s some fixed resource, and the knowledge of this resource is spread across multiple processes, then a process cannot decide independently how to allocate the resource without leading to divergent states
-
Example: Shopping cart (the total order would matter when adding and deleting the product from the cart). Imagine two people are shopping together and are editing the same shopping cart. Removal and addition of cart values can be inconsistent for the two users if the events are not ordered.
- CBCAST isn’t enough because we can ‘lose’ messages (red herring). We need consensus between processes. CBCAST can send messages in any order and it can cause processes to not be in consensus.
-
Example 1: One seat left on a plane and two people want the seat while both are communicating to different servers (both say only one seat is left). Both of them end up buying the seat, but only one person should be getting the seat.
-
Example 2: A similar example is the bank withdrawal example in Databases. Bank deposit and withdrawal. Imagine there is less balance in the bank account. If the withdrawal message is sent before the deposit message, the withdrawal fails. So total ordering is important.
Q2
(Contributed by @gshen42)
The ABCAST protocol is an extension of the CBCAST protocol that orders messages totally. Lamport also described a mechanism to order events (and hence messages) totally using his logical clock (the "=>" relation in his paper). How do these two approaches compare to each other?
Discussion summary for Q2
Comparison between ABCAST and Lamport Clock:
-
Clocks:
- ABCAST and CBCAST use vector clocks.
- Lamport approach uses Lamport Clocks. It is difficult to find order with Lamport clocks.
-
Breaking a Tie:
- Lamport method suggests choosing a process uniquely to break a tie. This can be done ahead of time as well.
- ABCAST protocol cannot break ties in arbitrary order. The protocol depends on the token holder in order to break the tie with set orders.(Set-orders: There is a set of ids assigned for each message.)
-
Ordering:
- ABCAST (Atomic Broadcast or Totally ordered broadcast) and Lamport's clock can follow total ordering.
Q3
The CBCAST protocol does the "usual vector clock thing" of merging the received vector clock (VC) with the local VC by taking their pointwise maximum whenever a message is delivered, and then updating the local VC to this new, merged value. This happens in step (3) of the protocol in section 5.1, which says that the local VC is updated "in accordance with the vector time protocol from Section 4.3." Turning to section 4.3, the pointwise maximum computation happens in step (4) of the protocol described there.
However, as we briefly touched on in class, a small optimization to this protocol is possible, so you don't have to take the pointwise maximum. What is the optimization, and why is it OK to do in CBCAST?
Discussion summary for Q3
CBCAST can be optimized by just incrementing the value of the sender’s index, rather than doing a pointwise incrementation. This will still be valid because the deliverability conditions take care of the consistency.
Errata
Running list of typos / issues found in the paper:
- (to be filled in by scribes, if any)
Other
Any interesting points or questions from the group discussion that didn't fit above:
- (to be filled in by scribes, if any)