Time: MW 1:10pm - 2:25pm
Class Location: 602 Hamilton
Piazza: Link

Distributed systems help programmers aggregate the resources of many networked computers to construct highly available and scalable services. Most of the applications and services we interact with today are distributed, some at enormous scales.

This class teaches design and implementation techniques for building fast, scalable, fault-tolerant distributed systems. Topics include distributed communication models (e.g., sockets, remote procedure calls, distributed shared memory), distributed synchronization (clock synchronization, logical clocks, distributed mutual exclusion), distributed file systems, replication, consistency models, fault tolerance, distributed transactions, agreement and commitment, Paxos-based consensus, MapReduce infrastructures, and scalable distributed databases.

The class combines concepts and algorithms with descriptions of real-world implementations at companies such as Google, Facebook, Yahoo, Microsoft, and LinkedIn. In addition to lectures, students will get hands-on experience building distributed systems through a series of coding-oriented homework assignments. The series, adapted from MIT’s 6.824 course, implements a fault-tolerant, sharded key/value store.


The grading formula is:

You’ll notice that class participation doesn’t appear to affect the grade. Actually, it does: the instructor and TAs can nominate students for extra credit if, over the course of the semester, those students have made substantial contributions either in class (through questions, answers, or perspectives relevant to the class’s topics) or on Piazza (through constructive answers to others’ questions). Nominated students can earn up to 15% (which equals one homework or one quiz!) and will be publicly acknowledged in class for their contributions. Nominations will occur three times during the semester.


The homework series requires a lot of coding, so we expect you to have solid coding experience, particularly in building systems-level components (i.e., not just apps). This experience can come from personal or industry work, or from the following Columbia courses or their equivalents:

  1. COMS W3137 Data Structures and Algorithms
  2. COMS W3157 Advanced Programming
  3. COMS W3827 Fundamentals of Computer Systems
  4. COMS W4118 Operating Systems (not required, but a big plus for the homework assignments)

Please make sure you can meet the resource requirements listed in the homeworks section.


This class, along with the materials distributed for it, was inspired by distributed systems courses at other institutions:

* MIT’s 6.824 distributed systems course. The 6.824 course materials are available under a CC BY 3.0 US license. We thank Robert Morris, Frans Kaashoek, and Nickolai Zeldovich for sharing their course materials.