PAXOS – (c) Victor Yodaiken, 2022
These are the lecture notes, or see the paper.
1. Really clever distributed consensus algorithm by Leslie Lamport A. Infamously hard to understand - but not complicated B. Livelocks - can get stuck without reaching consensus C. Highly patented D. Widely used - supposedly. 2. Original paper: The Part Time Parliament, 1990 Most annoying CS paper ever published. 3. Paxos Made Simple, 2001. The key text: but it really describes 3 different algorithms. ==================================================== DISTRIBUTED CONSENSUS PROBLEM Network of computers (agents) that can lose messages but never corrupts them (semi-realistic) Agents can fail completely but don't lie or break the rules Proposer agents propose values Acceptor agents can accept proposals or reject them Goals: Majority of acceptors agree on value, Winning proposer sees agreement No more than one winning value (safety) ==================================================== SIMPLE SOLUTION - NOT PAXOS 1. One network "agent" is the only proposer 2. The proposer sends a proposed value to all acceptor agents until .. Acceptors send back acknowledgment messages 3. When the proposer has acks from a majority it knows it has won Use timeouts to make it more robust - Oki and Liskov, Viewstamped Replication, 1988 Use cycles for more efficiency Chang and Maxemchuk, Reliable Broacast Protocols, 1984 Use with "state machine" model for fault tolerance Borg, Baumbach, Glazer, A Message System Supporting Fault Tolerance 1983 ==================================================== PAXOS SOLVES THE PROBLEM OF TOO MUCH SIMPLICITY Key premise at start is the network is asynchronous (no clocks) (fashionable premise 1980s, not realistic) also that the algorithm needs multiple proposers (for fault tolerance, performance?) Clever idea: Multiple winning proposers ok - if they all propose the same value. 2 phase algorithm ensures that proposers converge on single winning value ==================================================== PAXOS MADE SIMPLE Pages 1-6 explain how the algorithm works Page 6, Lamport explains that the algorithm "livelocks" although it is safe Page 7: different algorithm also called Paxos that has a single proposer selected via timeouts (clocks) or randomization. Rest: the state machine model of fault tolerance using Paxos2 Question: What is advantage of Paxos2 over Simple Method? ==================================================== HOW DOES PAXOS1 SORT OF WORK? id numbers are unique per proposer and per proposal. (called "sequence numbers" in Paxos Made Simple) Proposers start in phase1 and ask acceptors to agree to their proposal id. When/if a majority of acceptors agree to the proposal id The proposer can pick a value, enter phase 2, and send its actual proposal When/if a majority of acceptors accept the proposal, the proposer wins Key: the messages agreeing to a proposal id carry the best (highest numbered) proposal the acceptor has accepted so far the proposer has to take (inherit) the value from the highest numbered one of those ==================================================== THERE ARE FOUR KINDS OF MESSAGES (phase1, id) - sent by proposers in phase 1 to acceptors (ok1, id, bestproposal ) -sent by acceptors to agree to phase 1 (phase2, id,value) - sent by proposers in phase 2 (ok2, id) - sent by acceptors to agree to phase 2 proposal Also messages reliably carry the identity of the original sender of the message ==================================================== ACCEPTORS ENFORCE ID ORDER Each acceptor A keeps track of A.BestID which is the highest id it has agreed to (either phase) A.BestID is initially 0 A.BestProposal =(phase2,id,value) the highest numbered proposal the acceptor has accepted A.BestProposal is initially null When an acceptor A receives m=(phase1,id) if m.id > A.BestID can send (ok1,m.id,A.BestProposal) back update A.BestId = m.id When an acceptor A receives m=(phase2,id,value) if m.id >= A.BestId can send (ok2,m.id) back and update A.BestProposal=m and A.BestId=m.id ==================================================== PROPOSERS CAN ABANDON THEIR NON-WINNING PROPOSALS They can pick a new id number that must be higher and still unique and start over at any time but it can still livelock forever Proposer 1 and 2 2 starts and gets a majority in phase 1 and then rests 1 starts and gets 1 less than a majority, blocks, increases number to 3, gets a majority for phase 1 ... repeat ===================================================== PROOF OF SAFETY: If only one proposal has won, nothing to prove If more than one proposal has won, let m_0 =(id0,value0) be the winning proposal with the lowest id number Let L = (m_0,m_1,m_2 ... m_N) add to m_0, in id order, the complete list of proposals with m_i.id > m_0.id that have been sent by some proposer, If m_i is on the list, the proposer of m_i reached phase2 with id m_i.id L includes all winning proposals plus more Claim: All of these must have the same value v = m_0.value. ==================================================== WHY? P0 sent m_0 and received (ok2,m_0.id) from a majority of acceptors because we chose it as the lowest numbered WINNING proposal if P sent proposal m_i, i>0, then P got (ok1,m_i.id, best) from a majority of acceptors The two majority sets of acceptors must intersect at some A A must have sent (ok2,m_0.id) BEFORE it sent (ok1,m_i.id,A.BestProposal) since m_i.id > m_0.id When A agrees to m_i.id it already accepted m_0, so A.BestProposal was not null and m_0.id <= A.BestProposal.id < m_i.id P must have inherited at least 1 proposal m with m_0.id <= m.id < m_i.id ==================================================== BACK TO LIST FOR INDUCTION L = (m_0,m_1,m_2 ... m_N) CLAIM i >= 0 implies m_i.value = m_0.value. Proof by induction. True for i=0 Suppose true for 0,... j-1 and consider j the proposer P of m_j inherited some best message m with m_0.id <= m.id < m_j.id and since m was sent, the proposer of m must have reached phase2 so m must be on the list. Since m.id < m_j.id and it is on the list, m.value = m_0.value and so m_j.value = m_0.value QED.
Lecture notes on Paxos