Someone you have to know – Leslie Lamport and his Bakery Algorithm

If you don’t know about Leslie Lamport you can’t really say that you know about distributed systems. A not accurate analogy is that someone who claims to know about American history doesn’t know about George Washington or Thomas Jefferson. In case you know his name but not his Bakery algorithm I will show you why this algorithm is so interesting.

To put it short Mr. Lamport did seminal research work on distributed systems which laid the foundations of the theory of distributed systems. . He won the 2013 Turing Ward due to his work on distributed systems (I think that the award is overdue). His is also famous for the document preparation system LaTex and his work on temporal logic.

He is currently working as a principal researcher in Microsoft Research Silicon Valley Lab. He is 73 at the time of writing. One reason some people who are working on distributed systems might not know about him is because people working in engineering field read papers or read the history. The more important reason is that his work is usually at the very core or very low level of distributed systems. If you are a developer you probably don’t need to know about him to understand a system or get your work done. But a lot of us are using systems which are based on his work directly or indirectly every day. e.g. part of the system behind iCloud is using the Paxos algorithm invented by him. Many Google services are using Google Big Table which relies on GFS which uses Chubby which in turn uses Paxos at its core. Microsoft Azure and Bing which power numerous services are also using Paxos.

Leslie Lamport got his BS, MS and Phd degrees on Mathematics which I think is an advantage for him when he works as a computer scientist because he can write structured proofs easily. He has a lot of humor and wisdom which you can strongly feel it from his casual comments[2] and the interview[3]. His most cited paper is “Time, Clocks, and the Ordering of Events in a Distributed System.” and it was published in 1978 and among the most cite articles in Computer Science. The famous algorithm Paxos has a colorful history since it came out of his failed attempt to prove that no such algorithm existed[3]. The paper “The Part-Time Parliament” which introduced Paxos was written by him in 1990 but only published 8 years later in 1998. This paper won an ACM SIGOPS Hall of Fame Award in 2012 [2]. You can read the details at [7]. I strongly encourage that everyone should read the two papers. One caveat. Please read “Paxos made simple” (also by Lamport) instead for the Paxos algorithm. But if you can understand “The Part-Time Parliament” well the first time you read it you are probably a genius and should stop reading my blog. Also look at the years I put in here and you will have a rough idea about the development of the theory of distributed systems.

I particularly want to talk about his bakery algorithm for the below reasons.

  1. This is the first mutual exclusion algorithm I know which doesn’t use the low level mutual exclusion (e.g. CAS – Compare And Swap). In case you don’t think this is interesting the built-in mutual exclusion algorithms in Java, C# and other languages assume atomic reads and writes. The lock keyword and Interlocked class in C# use the low level hardware support (most likely CAS instruction). The synchronized keyword and AtomicInteger class in Java are also using the low level hard support (most likely CAS instruction).
  2. It is fault tolerant. If one thread/one process/one machine which participates the algorithm crashes the algorithm can still work.
  3. Lamport’s Bakery algorithm marked the beginning of his study on distributed algorithms.
  4. It is what Ms. Lamport is proud of the most in his career according to his interview [3].

The Bakery algorithm was introduced in his paper [5] in 1974. The analogy is very simple. Let’s say there is a bakery with a numbering machine at its entrance. Each customer needs to take a number which is unique. At any time only one customer can be served and that customer has the smallest number. In real life this is easy to understand it works perfectly. In computer world please think about the below problems.

  1. What if the number is being read while it is being generated for a single party (like a process)?
  2. What if multiple customers try to get the numbers at the same time and what if they get the same number?
  3. How does it work without low level hardware support?

I encourage everyone to read this paper (just 3 pages. Actually 2 pages with proper formatting).There are three sections in this paper: the problem, the solution and the proof. It is fun to read. The code is easy to understand. The proof might not be easy or interesting for most readers but you can just assume that it is right.

If you want to read code in java you can read the Wikipedia page [6]. Unfortunately the Java code on the Wikipedia is wrong. I didn’t bother to fix the Wikipedia because I want my readers to know what is wrong. The highlighted part is wrong and it should be entering.get(i) != 0. Otherwise the algorithm would lose the remarkable property that read and write can overlap.

  1.  public void lock(int pid) //thread ID
  2.  {
  3.      entering.set(pid, 1);
  4.      ticket.set(pid, max(ticket) + 1); //find max in the array and return it
  5.      entering.set(pid, 0);
  6.      for(int i = 0; i < ticket.length(); ++i)
  7.      {
  8.          if(i != pid)
  9.          {
  10.              while(entering.get(i) == 1){} //wait while other thread picks a ticket
  11.              while(ticket.get(i) != 0 &&( ticket.get(pid) > ticket.get(i) || (ticket.get(pid) == ticket.get(i) && pid > i)))
  12.              {}
  13.          }
  14.      }
  15.  }

[1] wikipedia http://en.wikipedia.org/wiki/Leslie_Lamport

[2] My writings http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html

[3] A Discussion with Leslie Lamport by Dejan Milojicic http://research.microsoft.com/en-us/um/people/lamport/pubs/ds-interview.pdf

[4] Leslie Lamport. “Time, Locks, …” http://research.microsoft.com/users/lamport/pubs/pubs.html#time-clocks

[5] The bakery algorithm (http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#bakery)

[6] wikipedia http://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm

[7] The Part-Time Parliament http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#lamport-paxos