Instructor: Avishek Ghosh, Assistant Professor, Dept. of CSE, IIT Bombay
Contact: Room KR-218, Kresit Building; Email: avishek@cse.iitb.ac.in
Timing: Tuesday/Friday 2:00-3:30 pm
Classroom: CC-101 (changed from LA 001) [This is in the new Computer Science Building]
TA: 1) Bittu Kumar (bittu_kumar@iitb.ac.in)
2) Garima Jain (jgarima203@gmail.com)
3) Merugu Anvesh (24m0799@iitb.ac.in)
4) Sravani Gunnu (gunnu.sravi@gmail.com)
About the course:
This is an advanced course on multi-agent frameworks in machine learning systems. It is expected that you have taken courses on machine learning and optimization before; please check the prerequisites. I also expect students to be comfortable reading research papers and presenting them. The basics of probability, linear algebra, optimization, and machine learning will not be covered during lectures.
Why?
Recent applications of ML (recommendation systems, NLP, RL, etc.) require solving large-scale problems. One standard way to beat the curse of scale is to introduce multiple agents and leverage collaboration. Hence, multi-agent ML has become an important topic in AI. This course will introduce students to a variety of problems (some of them still open) and teach the tools and techniques (algorithmic solutions) to address them.
Prerequisites:
A strong foundation in probability and linear algebra is expected. Prior exposure to optimization and machine learning is also expected.
Content: The content is roughly divided into 3 modules:
Module I: Federated Learning (FL)
Basic Formulation, the FedAvg algorithm, Non-convexity. Challenges in FL: (i) Heterogeneity, (ii) Communication Bottleneck, (iii) Adversarial Attacks, (iv) Data Privacy. To address these, we study (i) Clustering in FL and Bounded-Heterogeneity control, (ii) Local Averaging and data compression (δ-approximate compressors), (iii) Robust estimators (Geometric Median, Trimmed Mean), and (iv) Differential Privacy. We study a variety of algorithms with sharp theoretical guarantees.
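To make the FedAvg formulation concrete, here is a minimal sketch in Python/NumPy (not the exact algorithm analysed in class): each client runs a few local gradient steps from the current global model and the server averages the results. The client gradient-oracle interface, step size and data below are illustrative assumptions.

    import numpy as np

    def fedavg_round(w_global, client_grads, local_steps=5, lr=0.1):
        # One communication round of FedAvg (sketch): every client starts from the
        # global model, runs `local_steps` local gradient steps on its own data,
        # and the server averages the resulting local models.
        local_models = []
        for grad in client_grads:                 # one gradient oracle per client (assumed interface)
            w = w_global.copy()
            for _ in range(local_steps):
                w -= lr * grad(w)                 # local update on the client
            local_models.append(w)
        return np.mean(local_models, axis=0)      # server-side averaging

    # Illustrative use: two clients with heterogeneous least-squares objectives
    rng = np.random.default_rng(0)
    A = [rng.normal(size=(20, 3)) for _ in range(2)]
    b = [rng.normal(size=20) for _ in range(2)]
    grads = [lambda w, A=A[i], b=b[i]: 2 * A.T @ (A @ w - b) / len(b) for i in range(2)]
    w = np.zeros(3)
    for _ in range(100):
        w = fedavg_round(w, grads)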
Module II: Multi Agent Bandits
Collaborative Multi-Agent Multi-Armed Bandits. Gossip-type Algorithms, Modified Upper Confidence Bound, Regret Guarantees. Federated Bandits: Regret and Best-Arm Identification. Heterogeneous Federated Bandits, Clustering and Personalization. Federated RL (TD and Q-learning).
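As a single-agent baseline for this module, here is a minimal UCB1 sketch (Python/NumPy). The Bernoulli arms and exploration constant are illustrative assumptions; the gossip/federated algorithms studied in class modify which arms an agent keeps and what it communicates, but reuse this kind of index.

    import numpy as np

    def ucb1(arm_means, horizon=10000, seed=0):
        # Vanilla UCB1 on Bernoulli arms; returns the cumulative (pseudo-)regret.
        rng = np.random.default_rng(seed)
        k = len(arm_means)
        counts = np.zeros(k)
        sums = np.zeros(k)
        regret = 0.0
        for t in range(1, horizon + 1):
            if t <= k:                                   # play each arm once
                arm = t - 1
            else:                                        # pick the arm with the largest upper confidence bound
                ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
                arm = int(np.argmax(ucb))
            reward = rng.binomial(1, arm_means[arm])
            counts[arm] += 1
            sums[arm] += reward
            regret += max(arm_means) - arm_means[arm]
        return regret

    # e.g. ucb1([0.9, 0.8, 0.5]) incurs regret that grows only logarithmically with the horizon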
Module III: Matching Markets
Formulation of matching markets with multiple agents and arms via a bipartite graph, Stable Matching, Matching Algorithms: Gale-Shapley, Preference Learning via a multi-armed bandit framework, Connection to economic markets. Algorithms: (i) Collision-Avoidance Upper Confidence Bound (CA-UCB), (ii) Explore-Then-Gale-Shapley (ETGS), (iii) Dominated Upper Confidence Bound (UCB-D3). Variations: Dynamic Markets, Structured Markets, Robust Markets.
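For reference, a minimal sketch of the Gale-Shapley (deferred-acceptance) algorithm with known preferences; the bandit algorithms above (CA-UCB, ETGS, UCB-D3) replace the agents' known preferences with estimates learned from noisy rewards. The input format is an illustrative assumption.

    def gale_shapley(agent_prefs, arm_prefs):
        # Agent-proposing deferred acceptance (sketch).  Assumes equally many agents
        # and arms, each with a complete preference list.
        #   agent_prefs[i]: arms in agent i's decreasing order of preference
        #   arm_prefs[j]:   agents in arm j's decreasing order of preference
        # Returns a stable matching as a dict: arm -> agent.
        n = len(agent_prefs)
        rank = [{a: r for r, a in enumerate(prefs)} for prefs in arm_prefs]
        next_choice = [0] * n              # index of the next arm agent i will propose to
        matched = {}                       # arm -> agent currently holding it
        free_agents = list(range(n))
        while free_agents:
            i = free_agents.pop()
            arm = agent_prefs[i][next_choice[i]]
            next_choice[i] += 1
            if arm not in matched:
                matched[arm] = i
            elif rank[arm][i] < rank[arm][matched[arm]]:   # arm prefers the new proposer
                free_agents.append(matched[arm])
                matched[arm] = i
            else:
                free_agents.append(i)                      # proposal rejected
        return matched

    # e.g. gale_shapley([[0, 1], [0, 1]], [[1, 0], [0, 1]]) returns {0: 1, 1: 0}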
Grading: A combination of assignments, a course project, student participation (which may include presentations, scribing, preparing slides, and attendance), and a final exam. Exact weights will be decided shortly.
References:
We will study a combination of textbooks and research papers. I am listing a few here; more will be added as the topics are covered.
[Book] Learning Theory from First Principles, Francis Bach, MIT Press, 2024
[Book] Optimization for Data Analysis, Stephen J. Wright and Benjamin Recht, Cambridge University Press, 2022
[Book] Optimization Algorithms for Distributed Machine Learning, Gauri Joshi, Springer, 2023
[Book] Bandit Algorithms, Tor Lattimore and Csaba Szepesvari, Cambridge University Press, 2020
[Paper] On the Convergence of FedAvg on Non-IID Data, X. Li et al., ICLR, 2020
[Paper] Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates, D. Yin et al., ICML, 2018
[Paper] The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits, Chawla et al., AISTATS, 2020
[Paper] Multi-Agent Multi-Armed Bandits with Limited Communication, Agarwal et al., JMLR, 2022
[Paper] Bandit Learning in Decentralized Matching Markets, Liu et al., JMLR, 2021
[Paper] Player-Optimal Stable Regret for Bandit Learning in Matching Markets, F. Kong et al., SODA, 2023
Apart from these references, we will follow the lecture notes of Prof. Peter Bartlett (UC Berkeley), Prof. Aditya Guntuboyina (UC Berkeley), Prof. Aditya Gopalan (IISc Bangalore), and Prof. Arya Mazumdar (UC San Diego) for similar courses.
Discussions:
We will use Moodle for discussions and communication. Please enroll if you haven't yet.
Course Schedule:
We are maintaining the following sheet for paper details and student presentations: see here (please sign up).
Lectures:
[July 29] Lecture 1: Introduction, Administrative details, Rough Overview of the modules
[Aug 1] Lecture 2: Basics of convex functions, first- and second-order characterizations, structural properties; smooth functions and their characterizations; strongly convex functions; examples.
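For quick reference, the standard characterizations used throughout (stated for differentiable f, holding for all x, y; a reminder only, not a substitute for the lecture):

    f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle                                   % convexity
    \|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|                                        % L-smoothness
    f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\mu}{2}\|y - x\|^2        % \mu-strong convexity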
[Aug 5] Lecture 3: Convergence of Gradient Descent for (a) smooth (possibly non-convex) functions, (b) convex and smooth functions, and (c) strongly convex and smooth functions; comparison of convergence rates. Constrained optimization, Projection, Projected Gradient Descent.
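A minimal sketch of (projected) gradient descent in the setting of this lecture; the gradient oracle, projection map and step size below are illustrative assumptions.

    import numpy as np

    def projected_gradient_descent(grad, project, w0, lr=0.1, iters=200):
        # Gradient descent with a projection step after each update.
        # For unconstrained problems pass project = lambda w: w.
        w = np.array(w0, dtype=float)
        for _ in range(iters):
            w = project(w - lr * grad(w))
        return w

    # Illustrative use: minimize ||w||^2 over the unit ball centred at (2, 2)
    center = np.array([2.0, 2.0])
    project = lambda w: center + (w - center) / max(1.0, np.linalg.norm(w - center))
    w_star = projected_gradient_descent(lambda w: 2 * w, project, center, lr=0.2)
    # w_star is (approximately) the point of the ball closest to the origin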
[Aug 8] Lecture 4: Stochastic Gradient Descent. Basics; the noisy gradient oracle (unbiased, with bounded second-order noise); examples; the finite-sum problem; coordinate descent. We also had a quiz.
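A minimal SGD sketch for a finite-sum least-squares problem, with one uniformly sampled component per step acting as the unbiased noisy gradient oracle (the data, step size and step count are illustrative assumptions):

    import numpy as np

    def sgd_least_squares(X, y, lr=0.01, steps=5000, seed=0):
        # SGD on f(w) = (1/n) * sum_i (x_i^T w - y_i)^2, one sample per step.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(steps):
            i = rng.integers(n)                      # uniformly sampled component
            g = 2 * (X[i] @ w - y[i]) * X[i]         # unbiased estimate of the full gradient
            w -= lr * g
        return w

    # Illustrative use: with X = rng.normal(size=(500, 5)) and y = X @ w_true + noise,
    # sgd_least_squares(X, y) returns an estimate close to w_true (up to the noise level)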
[Aug 12] Lecture 5: [Guest Lecture by Prof. Pratik Jawanpuria of CMInDS, IITB] Constrained Optimization, First-Order Necessary Conditions, Tangent Plane, KKT conditions, Active constraints.
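For reference, the KKT conditions for min f(x) subject to g_i(x) ≤ 0 and h_j(x) = 0 (standard statement, under a suitable constraint qualification):

    \nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \nu_j \nabla h_j(x^*) = 0   % stationarity
    g_i(x^*) \le 0, \qquad h_j(x^*) = 0                                                   % primal feasibility
    \lambda_i \ge 0                                                                       % dual feasibility
    \lambda_i \, g_i(x^*) = 0                                                             % complementary slackness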
[Aug 15] (No Lecture) Institute Holiday
[Aug 19] Lecture 6: Convergence of SGD. Slow 1/√T rate for convex functions. Special structure (PL-type), A-B Boundedness; fast (exponential) convergence, Randomized Kaczmarz, 1/T convergence. Variance reduction: Mini-batching, SVRG.
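A minimal sketch of randomized Kaczmarz for a consistent linear system Ax = b, one of the fast (linear/exponential) convergence examples from this lecture; rows are sampled with probability proportional to their squared norms, the standard choice.

    import numpy as np

    def randomized_kaczmarz(A, b, iters=5000, seed=0):
        # At each step, project the iterate onto the solution hyperplane of one
        # row of Ax = b, sampled with probability proportional to its squared norm.
        rng = np.random.default_rng(seed)
        m, d = A.shape
        probs = np.sum(A**2, axis=1) / np.sum(A**2)
        x = np.zeros(d)
        for _ in range(iters):
            i = rng.choice(m, p=probs)
            a = A[i]
            x += (b[i] - a @ x) / (a @ a) * a        # projection onto {z : a^T z = b_i}
        return x

    # e.g. with A = rng.normal(size=(100, 10)) and b = A @ x_true, the iterates converge to x_true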
[Aug 22] Lecture 7:
[Aug 26] Lecture 8:
[Aug 29] Lecture 9:
[Sep 2] Lecture 10:
[Sep 5] (No Lecture) Institute Holiday
[Sep 9] Lecture 11:
[Sep 12] Lecture 12:
Midsem Week
General Guidelines for Homeworks:
Homework should be submitted in class. Students are encouraged to discuss the HW problems among themselves; however, each student is required to write up the solutions on their own. Near-identical submissions will not be awarded any score.