Time: Tuesday and Thursday 10:30 AM - 11:50 AM
Location: GHC 4303
Units: 12
Instructor: Weina Wang weinaw "at" cs "dot" cmu "dot" edu
Office hours: anytime, please send an email first
Prerequisite: undergraduate probability, mathematical maturity in general
Large-scale cloud computing systems are at the heart of today’s emerging technologies. A key challenge in advancing the state-of-the-art is to design systems with stringent performance guarantees such as high-throughput and low-delay. To tackle this challenge, it is important to understand how algorithms in various components of the system have an impact on the performance. This course introduces theories and techniques from applied probability and stochastic processes for building models and analyzing algorithms. The goal is to address questions such as: How should computation tasks be assigned to the large number of servers in a cloud computing system? How should the bandwidth in a datacenter network be allocated to achieve fast data transfer? How should data be distributed and how does that affect the efficiency of computation?
Topics covered include:
After successfully completing this course, you will
Date Topics
9/4 Introduction: A datacenter from the modeling perspective
9/6 Discrete-time Markov chain (DTMC) (References: Chapter 3.3 of Srikant and Ying's book, Chapter 2 & 6 of Hajek's book, Chapter 5 of Gallager's book/notes)
9/11 Foster-Lyapunov stability criterion with application in load-balancing
9/13 Applying Foster-Lyapunov to the giant switch abstraction of datacenter networks
9/18 Moment bounds given by the drift
9/20 The drift method (I)
9/25 The drift method (II)
9/27 The drift method (III)
10/2 Continuous-time Markov chain (CTMC)
10/4 No class
10/9 Extending Foster-Lyapunov and the drift method to CTMC
10/11 Course project presentation
10/16 Course project presentation
10/18 Mean-field analysis: Cavity method and differential equations method
10/23 Mean-field analysis: Kurtz's theorem example
10/25 Mean-field analysis: Kurtz's theorem proof and more examples
10/30 Mean-field analysis: Interchange of limits, asymptotic independence
11/1 Mean-field analysis: asymptotic independence, neural network example
11/6 No class
11/8 Data management in datacenters: Data locality and data transfer
11/13 A connection-level model for data transfer, multi-commodity flow algorithms
11/14 (Makeup class) Open problems in modern large-scale computing systems
11/15 Course project presentation
11/20 Course project presentation
11/27 Course project presentation
11/29 Course project presentation
Homework: 60%
You will have homework every other week.
Course project: 40%
You will choose a topic of your interest and read up papers in the literature. I will suggest some topics and papers. But you are free to choose any topic that you find particularly interesting as long as it is related to this course. You can work by yourself or together with another student. Your team will submit a brief report every four weeks to document your progress, and a final report at the end of the semester. Of course you are also welcome to have discussions with me anytime. Your team will also give a 40 min presentation to show your findings. You are encouraged to identify and solve problems beyond the existing work in the literature -- there will be bonus in your grade and I am happy work with you if desired to help you submit your work to a conference.