22C:166 (CS 5620) Distributed Systems and Algorithms

(Fall 2012) 9:30-10:45PM Tuesdays and Thursdays, Room 218 MLH.


Instructor: Sukumar Ghosh
201P MLH, sukumar-ghosh@uiowa.edu, 319-335-0738
Office Hours: 2:00-3:30 PM Tuesdays and Thursdays.

Announcements, Prerequisites, Handouts, Homework and Exams, Lecture Notes, On-line resources,


Textbook


Sukumar Ghosh: Distributed Systems: An Algorithmic Approach, 2006 CRC Press (ISBN 158488564) Table of contents


In addition to the textbook, we will occasionally use the following books as references:

1. Gerard Tel , "Introduction to Distributed Algorithms," Cambridge University Press 2000
2. Andrew Tannenbaum, Maarten van Steen , "Distributed Systems: Principles and Paradigms," Prentice Hall (2nd edition) 2006.

Prerequisites

Some knowledge of Operating Systems and/or Networking, Algorithms, and interest in Distributed Computing. Our goal is to learn and analyze why and how distributed systems work, why some of them fail, and how to tolerate failures and various dynamic behaviors.

Teaching Assistant

Rahil Sharma, rahil-sharma@uiowa.edu
Office: 201G Maclean Hall, Office hours: 11:30AM-12:30PM (MW), 1:00-2:00PM (F)

Course Outline

Course policies are governed by the College of Liberal Arts.

Handouts

Homework and Exams

Five homeworks worth 35% of the final grade. One midterm exam (30%), and a final exam (35%).

Projects

Here is the List of projects

Each project is for a two-person team, and can be chosen in place of the last two assignments. It is an opportunity to demonstrate intellectual independence. The descriptions are in the draft form -- the details will evolve after initial discussions. Note that it is not mandatory to work on a project.

Tentative scale for letter grades
A+ = 95-100     B+ = 80-84     C+ = 65-69     D+ = 50-54    F = 0-39
A  = 90-94      B  = 75-79     C  = 60-64     D  = 45-49
A- = 85-89      B- = 70-74     C- = 55-59     D- = 40-44

The instructor reserves the right to make minor modifications in the above grading scale.

Lecture Notes

Week 1 (8/20-8/24). Introduction to Distributed Systems
Week 2 (8/27-8/31). Correctness criteria and Proof techniques
Week 3 (9/3-9/7). Time, Clocks and Causality
Week 4 (9/10-9/14). Distributed Mutual Exclusion
Week 5 (9/17-9/21). Distributed Snapshot
Week 6 (9/24-9/28). Global State Collection
Week 8 (10/8-10/12). Graph Algorithms
Week 9 (10/15-10/19). Coordination Algorithms
Week 10 (10/22-10/26). Faults and Fault-tolerance
Week 11 (10/29-11/2). Distributed Consensus
Week 12 (11/5-11/9). Failure Detectors
Week 13 (11/12-11/16). Distributed algorithms for graph coloring (pdf)
Week 14 (11/26-11/30). Group Communication
Week 15 (12/3-12/7). Replication

On-line resources

1. Leslie Lamport: Time, Clocks, and the Ordering of Events in a Distributed System. Commun. ACM 21(7): 558-565 (1978) (pdf)

2. Leslie Lamport, Robert E. Shostak, Marshall C. Pease: The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. 4(3): 382-401 (1982) (pdf)

3. K. Mani Chandy, Leslie Lamport: Distributed Snapshots: Determining Global States of Distributed Systems ACM Trans. Comput. Syst. 3(1): 63-75 (1985) (pdf)

4. Anish Arora, Mohamed Gouda: Closure and Convergence: A Foundation of Fault-Tolerant Computing. IEEE Trans. Software Eng 19 (11) pp. 1015-1027, November 1993. (pdf)

5. Leslie Lamport, P.M. Melliar-Smith: Byzantine Clock Synchronization. ACM PODC 1984. (pdf)

6. Jeffrey Dean and Sanjay Ghemawat: MapReduce: simplified data processing on large clusters. Communications of the ACM 51 (1), January 2008. (pdf)

7. Tushar Deepak Chandra and Sam Toueg: Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43:2, March 1996, 225-267. (pdf)

8. Edsger W. Dijkstra: Self-stabilizing Systems in Spite of Distributed Control. Commun. ACM 17(11): 643-644 (1974). (pdf)

9. Michael J. Fischer, Nancy A. Lynch, Michael S. Paterson: Impossibility of Distributed Consensus with One Faulty Process.Vol. 32, No. 2, pp. 374-382 (1985) (pdf)

10. Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan: Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw. 11(1): 17-32 (2003) (pdf)

11. James Aspnes, Gauri Shah: Skip graphs. SODA 2003: 384-393 (pdf)

12. Watts, D.J.; Strogatz, S.H. (1998). "Collective dynamics of 'small-world' networks.". Nature 393 (6684)

13. Jon Kleinberg. The small-world phenomenon: an algorithm perspective (2000) (pdf)

14. Leslie Lamport: Paxos made Simple. ACM SIGACT News (Distributed Computing Column) 32, 4 (No. 121, December 2001) 51-58. (pdf)

15. Hector Garcia-Molina: .Elections in a Distributed Computing System. IEEE Trans. Computers (1982) (pdf)

16. Baruch Awerbuch: The Complexity of Network Synchronization. Journal of ACM (1985) (pdf)