Lectures

The following topics will be presented over the course of the semester. Each topic will be covered in (roughly) one week of lectures. Lecture notes are linked as they become available.

  1. Course introduction

  2. Distributed systems primer
    • challenges and goals of distributed systems
    • example architectures
  3. Communication models
    • remote procedure calls (RPC)
    • RPC libraries
    • failure models
    • semantics
  4. Time and coordination
    • challenges
    • physical and logical clocks
    • distributed mutual exclusion
  5. The consensus problem
    • two-generals problem
    • practical instances of the problem
    • FLP impossibility result
  6. Commitment protocols
    • two-phase commit (2PC)
    • three-phase commit (3PC)
    • safety/liveness tradeoffs between the two
  7. Consensus protocols
    • Paxos overview, key ideas, basic algorithm
    • examples of normal operation and operation under failures
    • liveness failure mode (e.g., dueling proposers)
  8. Practical applications
    • fault-tolerant architectures: primary/secondary
    • the design of Google’s Chubby lock service
  9. Case study: Google’s “older” storage stack
  10. Transactions
    • ACID semantics
    • concurrency control mechanisms
    • recovery mechanisms
    • distributed transactions
  11. Case study: Google’s “current” storage stack
    • design of TrueTime
    • design of Spanner and its linearizable, distributed transactions
  12. Consistency models
    • sequential, causal, and eventual consistency
    • mechanisms to achieve each
    • tradeoffs
  13. Distributed systems security primer
    • security challenges and opportunities in distributed systems
    • authentication protocols: Needham-Schroeder, Kerberos
    • Byzantine fault tolerance (brief overview)

Assignments/Go Lecture

Supplemental code lecture