The following topics will be presented over the course of the semester. Each topic will be covered in (roughly) one week of lectures. Lecture notes are linked as they become available.

  1. Course introduction
  2. Distributed systems primer
    • challenges and goals of distributed systems
    • example architectures
  3. Communication models
    • remote procedure calls (RPC)
    • RPC libraries
    • failure models
    • semantics
  4. Time and coordination
    • challenges
    • physical and logical clocks
    • distributed mutual exclusion
  5. Agreement in distributed systems
    • the atomic commitment problem
    • the consensus problem
    • use cases for each
  6. The transaction abstraction
    • ACID semantics
    • concurrency control mechanisms
    • recovery mechanisms
  7. Atomic commitment protocols
    • two-phase commit (2PC)
    • blocking nature of 2PC
  8. Consensus protocols
    • Paxos overview, key ideas, basic algorithm
    • examples of normal operation and operation under failures
    • liveness failure mode
    • FLP impossibility result
  9. Case study: Google’s Spanner
    • design of TrueTime
    • design of Spanner and its linearizable, distributed transactions
  10. Broader view of isolation and consistency semantics
  11. Beyond storage: broader system architectures
    • Google’s software stack
    • Facebook’s software stack
    • open-source software stacks
  12. Distributed computation
    • MapReduce
    • tradeoffs
    • notes by invited instructor Eugene Wu, shared over CourseWorks

Supplemental code for lecture