Professor Lorenzo Alvisi Office: ACES 6.244 Phone: 471-9792 Email: lorenzo@cs.utexas.edu Office Hours: T, Th: 11:00-12:00 |
|
Teaching Assistant Anurag Agarwal Office: ENS 512 Phone: 471-9589 Email: anurag@cs.utexas.edu Office Hours: M,W,f: 10:30-11:30 |
This course is designed to cover the some of the key ideas that have proved useful or are expected to be useful for designing and building tomorrow's distributed systems. The course focuses on fundamentals. We will cover problems, models, algorithms, and impossibility results. We will not cover Java, RMI, RPC, HTTP, CORBA, etc. At the same time, the course includes a non-trivial practical component, by requiring students to build systems that implement some of the algorithms presented in class, and some algorithms not presented in class.
You should have a good undergraduate background in Operating Systems
and be willing to participate in class. You should also be comfortable
about developing proofs, as many of the homework problems will require
you to develop protocols and prove them correct. For instance,
you should have no problems about how to use
induction.
Finally, you should be comfortable programming in
C, C++, or Java
Enrolling requires graduate standing: if you are not a graduate student but would like to take the course come and talk to me.
Here are a syllabus and a projected schedule for the class. Please note that both syllabus and schedule are tentative .
Week | Tuesday | Thursday | Assignment | Readings & Notes |
8/26 | Intro-2 Generals Problem | Week 0 Notes | ||
8/31, 9/2 | Event Ordering and Global Predicate Detection |
Lamport Clocks, Snapshot Protocol |
Week 1 Notes
| |
9/7, 9/9 | Causal Order Vector Clocks |
Detecting non-stable properties |
Week 2 Notes |
|
9/14, 9/16 | Guest Lecture ACES 6:304, 10:00 AM |
3-Phase Commit Last Process to Fail |
Implement 3 phase commit protocol |
Week 3 Notes
|
9/21, 9/23 | Atomic Commot Wrap-Up | Intro to Fault-Tolerance |
Week 4 Notes
| |
9/28, 9/30 | Rollback Recovery | Rollback Recovery | Homework 1 | Week 5 Notes |
10/5, 10/7 | The rise and fall of CIC protocols |
The State Machine Approach | Week 6 Notes | |
10/12, 10/14 | Consensus for benign failures | Consensus for benign failures | Week 7 Notes | |
10/19, 10/21 | Lower bound on TRB | Early Stopping TRB | Week 8 Notes | |
10/26, 10/28 | Paxos | Arbitrary Failures with message authentication |
Homework 2 Implement Paxos Consensus protocol |
Week 9 Notes
|
11/2, 11/4 | Arbitrary failures with message authentication |
Impossibility of consensus |
Week 10 Notes
|
|
11/9, 11/11 | Wait-free Synchronization | Wait-free Synchronization | Implement a wait-free data structure |
Week 11 Notes
|
11/16, 11/18 | Randomized consensus | Quorum Systems |
Week 12 Notes
|
|
11/23, 11/25 | Epidemic protocols | Thanksgiving | Homework 3 |
Week 13 Notes
|
11/30, 12/2 | Byzantine Quorum Systems | Byzantine Quorum Systems |
Week 14 Notes
|
Late policy. No extensions will be given for completing the homeworks or the programming projects, except that students will be allowed 6 slip days for all the assignments combined (projects, paper review, homework). A student may divide his or her slip days across projects in any way he or she wishes to extend deadlines for the projects (or a homework.) To help the TA track your slip-day status, the top of your project README file (or your homework) should include the line:
One of the most valuable components of this class will be the projects. Every project should be done in groups of two. Forming the partnership is your responsibility. We prefer not to be involved in match-making, although we will try to help the handful who cannot find a partner. Problems that arise during the partnership are also your responsibility.
All projects will require a demo. Demos will take place in the basement of Taylor the Monday following the deadline.
There is no single book that covers the material for the course. You will be able to integrate your class notes with copy of the slides I use in class and pointers to papers relevant to the material discussed in class, all of which I will post on the class web site.
9:30-11:00 Tuesday and Thursday, in Welch 2.256.
If you believe that an error has been made in grading your work, you have one week from the time that the work was returned to the class to file a complaint in writing to your instructor. If you decide to submit a complaint about the grading of your work, make sure to describe the issue clearly and return the original work and your note to your instructor.
So, which grade can you expect to receive in this class? Well, here are some guidelines. A student who fails to complete some of the assignments or shows little understanding of the material covered in class is likely to receive a final grade of C or lower. A student that does a good job in the course (i.e. masters the material satisfactorily) will earn a B. Outstanding students will earn an A. The class is not curved: if all students are outstanding, they all can earn an A. Being outstanding means doing really well in the projects, in the homeworks, and in the final. Note that being outstanding is not equivalent to working really hard during the semester. For instance, a typical B is earned by students that do great in the projects (almost everybody does: you guys are good!) but not so great in the final. Although it is not a rule cast in stone, past experience indicates that a) if you earn less than 75% of the credit in the final exam, you may struggle to earn an A and b) a little more than half of the class earns an A, and all but one or two earn B or more.
Submit the project(or homework) with the command `turnin' on the departmental public linux machines.
When submitting a project solution, it is important that you design test cases such that the grader (yours truly) is convinced of the correctness of your implementation. For this purpose, you will probably need to simulate certain scenarios for the protocols; for this purpose, it will be useful to build a network layer such that it is possible to control the communication between processes. More specifically, it should be possible to drop certain messages, delay some messages to achieve necessary message ordering. It should also be possible to shutdown processes at different stages of the protocol and then restart the processes after certain interval of time. Designing your project with these things in mind will speed up the testing and improve your chances of getting good grades.
Projects are completed in a group---you should refrain from discussing the specifics of your solution with members of other groups. It should go without saying that copying other groups' code, or asking them to help you debug you own code, are not admissible. You are welcome to use existing public libraries in your programming assignments (such as public classes for queues, trees, etc.) You may also look at operating systems code for public domain software such as Linux. Such activities qualify under approved collaboration practices and you are welcome to take advantage of them.
Failure to follow the above guidelines will be considered chating. This is not something you want to mess with. Students who violate University rules on scholastic dishonesty are subject to disciplinary penalties, including the possibility of failure in the course and/or dismissal from the University. Because such dishonesty harms the individual, all students, and the integrity of the University, policies on scholastic dishonesty will be strictly enforced. So, if in doubt, ask!