ML Systems for Large Models and Federated Learning

This tutorial will teach attendees how to overcome performance, cost, privacy, and robustness challenges when using distributed and federated software systems to train and deploy Computer Vision and ML applications across various hardware settings (networked machines, GPUs, embedded and mobile systems).

The audience will learn about the theory, implementation, and practice of the following topics: state-of-the-art approaches and system architectures, forms of distributed parallelism, pitfalls in measuring parallel application performance, parallel ML compilers, computation-communication-memory efficiency in federated learning (FL), trustworthy FL, tackling device heterogeneity in FL, and on-device FL systems.

Tutorial Schedule: Sunday 18th June 2023, Pacific Daylight Time

8:30-10:00am: Tutorial Part 1
10:00-10:15am: Break
10:15-11:45am: Tutorial Part 2

Interested in the open source projects from the tutorial?
Visit casl-project.ai to learn more!

Slides

CVPR 2023 Tutorial - (Part 1) ML Systems For Large Models and Federated Learning.pdf

CVPR Tutorial Samuel.pdf

CVPR Tutorial_part_hongyiwang.pdf

About Us

Qirong Ho

Mohamed bin Zayed University of Artificial Intelligence and Petuum, Inc.

Homepage

Qirong Ho is an Assistant Professor at the Mohamed bin Zayed University of Artificial Intelligence and the CTO and Co-Founder of Petuum, Inc., a World Economic Forum Tech Pioneers Startup. Ho’s primary research interest is software systems for scaling up ML programs. These ML software systems must enable, automate, and optimize multiple tasks: composing elementary ML program and systems “building blocks” into sophisticated applications, scaling to very large data and model sizes, resource allocation and scheduling, hyperparameter tuning, and code-to-hardware placement.

Hongyi Wang

Carnegie Mellon University

Homepage

Hongyi Wang is a Senior Researcher in the Machine Learning Department at Carnegie Mellon University. Dr. Wang's research focuses on the scalability and efficiency of distributed and federated learning systems. He also studies trustworthy distributed and federated learning, and how to automatically deploy distributed machine learning algorithms on heterogeneous hardware platforms.

Samuel Horvath

Mohamed bin Zayed University of Artificial Intelligence

Homepage

Samuel Horvath is an Assistant Professor of Machine Learning at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). In his research, Horvath focuses on providing a fundamental understanding of how distributed and federated training algorithms work and how they interact with different sources of heterogeneity, such as system-level variability in the computing infrastructure and statistical variability in the training data. He is broadly interested in federated learning, distributed optimization, and efficient deep learning.
