The lectures and poster sessions will take place in Keller Hall. Please check the table below for details on the rooms!
Address: 200 Union St. SE, Minneapolis MN 55455
Program
Abstracts
Title: Approximate Computing and Privacy: New Coding Frameworks for Secure Distributed Computing [SLIDES] [VIDEO]
Abstract: This talk presents several recent advances in coding techniques to enhance resilience and privacy in distributed computing and learning. Classical approaches often rely on perfect privacy and accuracy, but in this talk, we explore the role of approximation in these frameworks. We introduce new coding-theoretic formulations over the real field—especially motivated by secure distributed computing with data privacy—that depart from traditional finite-field models to enable controlled privacy leakage and approximate computation. Focusing on distributed mean estimation and matrix multiplication, we analyze privacy-accuracy trade-offs using differential privacy and mean squared error metrics. We present results that offer better communication efficiency, lower redundancy, or improved privacy-accuracy trade-offs compared to existing approaches. These gains are enabled by techniques that depart from standard coding approaches such as Shamir secret sharing, as well as from standard noise distributions used in the differential privacy literature. Our results reveal rich coding structures and phenomena that emerge uniquely in the study of approximate distributed computing.
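To make the privacy-accuracy trade-off concrete for readers new to the area, here is a minimal sketch of distributed mean estimation in which each client perturbs its report with Gaussian noise calibrated to differential privacy; the mean squared error of the server's estimate scales as sigma^2/n. This is an illustration under standard assumptions, not the coding scheme from the talk.

```python
import numpy as np

# Minimal sketch of differentially private distributed mean estimation
# (illustration only, not the scheme from the talk). Assumptions: n clients
# each hold a vector of L2 norm at most 1; each adds Gaussian noise
# calibrated to (epsilon, delta)-DP and sends it to a server, which simply
# averages the noisy reports.
rng = np.random.default_rng(0)
n, d = 100, 10
epsilon, delta = 1.0, 1e-5

x = rng.normal(size=(n, d))
x /= np.maximum(1.0, np.linalg.norm(x, axis=1, keepdims=True))  # clip to unit norm

# Gaussian mechanism: L2 sensitivity of one client's report is 2 (two unit vectors).
sigma = 2.0 * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
noisy = x + rng.normal(scale=sigma, size=x.shape)

est = noisy.mean(axis=0)                       # server-side estimate of the mean
mse = np.mean((est - x.mean(axis=0)) ** 2)
print(f"predicted per-coordinate MSE ~ {sigma**2 / n:.4f}, observed {mse:.4f}")
```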
Title: Explainable Machine Learning Through the Lens of Information-Theoretic Methods [SLIDES] [VIDEO]
Abstract: Machine learning has driven performance improvements in several high-stakes applications. However, models are becoming increasingly complex and difficult for human stakeholders to understand and trust. Can we enable the trustworthy and efficient adoption of AI through explainability? This talk will cover some of our recent research on enabling explainability by leveraging information theory, polytope theory, statistics, and optimization. The talk will be self-contained and include a quick background on explainability research. Module I will start with an emerging problem in explainability called algorithmic recourse: how do we guide a rejected applicant to receive a favorable model outcome while also being robust to model multiplicity (the Rashomon effect)? We propose robust "counterfactual explanations" that remain valid under model multiplicity with probabilistic guarantees. Module II will discuss efficient strategies for reconstructing models from larger, more complex ones by specifically integrating explainability techniques, along with the related problem of private counterfactual retrieval. Module III will introduce an information-theoretic tool called Partial Information Decomposition and discuss its role in explainability and model compression/distillation.
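As a toy illustration of the counterfactual explanations mentioned in Module I, one can nudge a rejected input across the decision boundary of a fixed model while staying close to the original point. This is a generic gradient-based sketch for an assumed logistic model, not the robust-to-multiplicity method proposed in the talk.

```python
import numpy as np

# Toy counterfactual explanation for a fixed logistic model (illustration only;
# the talk's method additionally guarantees validity under model multiplicity).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.array([1.5, -2.0]), -0.5   # assumed model parameters
x0 = np.array([-1.0, 1.0])           # rejected applicant: score well below 0.5

x, lam, lr = x0.copy(), 0.1, 0.5     # distance penalty and step size
for _ in range(200):
    p = sigmoid(w @ x + b)
    if p >= 0.6:                     # stop once comfortably past the boundary
        break
    # Gradient of  -log p(favorable) + lam * ||x - x0||^2  with respect to x
    grad = -(1.0 - p) * w + 2.0 * lam * (x - x0)
    x -= lr * grad

print("counterfactual:", x, " new score:", sigmoid(w @ x + b))
```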
Title: Network Information Theory Through the Lens of Machine Learning [SLIDES] [VIDEO Part 1] [VIDEO Part 2]
Abstract: Network information theory addresses fundamental bounds and optimal coding strategies for information flow among distributed terminals. On the communication side, the extensive information-theoretic foundations have impacted the way modern communication networks handle multiple access, broadcast, and interference. The information theory of distributed compression is equally rich, and the applications, from camera arrays to sensor networks, are equally appealing. Yet, the impact of theory on practical network compression has been somewhat limited, partly because modeling and exploiting correlation among real-world sources, such as an array of images, is difficult. As the field of data compression is undergoing a transformation with the emergence of learning-based techniques, machine learning is becoming an important tool to reap the long-promised benefits of distributed compression.
In this tutorial, we review some of the important results in network information theory, with an emphasis on distributed compression. We discuss recent progress in the broad area of learned distributed compression, focusing on images as well as abstract sources. We show how learning-based approaches can provide interpretable results that operate close to information-theoretic bounds, and how important information-theoretic strategies such as “binning” or “grouping” in the source space emerge from a learning framework. We also illustrate how learned distributed compression, and more generally learning-based techniques, have the potential to impact multi-user communications.
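For readers who have not seen “binning” before, here is a minimal toy version of the idea: the encoder sends only the bin index (syndrome) of its source word, and the decoder resolves the ambiguity using correlated side information. This is a classical illustration under simplified assumptions, not an example taken from the tutorial.

```python
import numpy as np
from itertools import product

# Toy binning for distributed compression with decoder side information.
# Simplified assumptions for illustration: X is a uniform 3-bit word and the
# decoder's side information Y differs from X in at most one position. The
# encoder sends only the 2-bit syndrome of X under the (3,1) repetition code;
# the decoder picks the word in that bin (coset) closest to Y. Because the
# code corrects one error, X is always recovered: 2 bits suffice instead of 3.
H = np.array([[1, 1, 0],
              [1, 0, 1]])            # parity-check matrix of the repetition code

def syndrome(v):
    return tuple(H @ v % 2)

def decode(s, y):
    bin_words = [np.array(v) for v in product((0, 1), repeat=3)
                 if syndrome(np.array(v)) == s]
    return min(bin_words, key=lambda v: int(np.sum(v != y)))

rng = np.random.default_rng(1)
ok = 0
for _ in range(1000):
    x = rng.integers(0, 2, size=3)
    y = x.copy()
    if rng.random() < 0.5:           # flip at most one bit to form side information
        y[rng.integers(0, 3)] ^= 1
    ok += int(np.array_equal(decode(syndrome(x), y), x))
print("exact recovery rate:", ok / 1000)
```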
Title: Optimization Algorithms for Federated Learning: Looking Back, Looking Ahead [SLIDES] [VIDEO]
Abstract: In this tutorial, we will review the extensive research on optimization algorithms for federated learning from the past few years and reflect on key takeaways and insights. In particular, we will discuss how to design algorithms that cope with data and computational heterogeneity across devices and with communication/bandwidth limitations. Then we will look ahead and put research on federated learning in the context of training and inference on large generative AI models. We will talk about the effect of pre-training on federated fine-tuning, parameter-efficient fine-tuning, and multi-task learning. Along the way, we will discuss open problems and connections to information theory.
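As a concrete reference point for the algorithmic discussion, here is a minimal FedAvg-style sketch: each round, clients run a few local gradient steps from the current global model and the server averages the resulting local models. This is a generic illustration on synthetic heterogeneous least-squares data, not a specific algorithm from the tutorial.

```python
import numpy as np

# Minimal FedAvg-style sketch on synthetic, heterogeneous least-squares data
# (illustration only). Each round, every client runs a few local gradient
# steps from the current global model; the server then averages the locals.
rng = np.random.default_rng(0)
d, n_clients, n_per = 5, 8, 50
w_true = rng.normal(size=d)

clients = []
for _ in range(n_clients):
    A = rng.normal(size=(n_per, d)) + rng.normal(scale=0.5, size=d)  # per-client shift
    b = A @ w_true + 0.1 * rng.normal(size=n_per)
    clients.append((A, b))

w = np.zeros(d)                                  # global model
for _ in range(50):                              # communication rounds
    local_models = []
    for A, b in clients:
        w_local = w.copy()
        for _ in range(5):                       # local steps (source of client drift)
            w_local -= 0.05 * A.T @ (A @ w_local - b) / n_per
        local_models.append(w_local)
    w = np.mean(local_models, axis=0)            # server-side averaging

print("distance to w_true:", np.linalg.norm(w - w_true))
```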
Title: Theory and Experiments for Peer Review and Other Distributed Evaluations [SLIDES] [VIDEO Part 1] [VIDEO Part 2]
Abstract: Peer review is central to scientific research, influencing its progress, researchers' careers, public perception, and the distribution of billions of dollars in grants. Despite its importance, it faces challenges like unqualified reviewers, fraud, miscalibration, and subjectivity—issues that also arise in other applications like admissions and hiring. This tutorial offers two perspectives: insights from carefully designed experiments, and an algorithmic, theoretical approach. We will explore mathematical formulations, algorithms deployed in the real world, and their theoretical guarantees. We will also spend time discussing open problems that, if solved, could be highly impactful. The material also applies broadly to distributed evaluations, including hiring and admissions. Based on this survey.
Title: Bits from Nucleotides: An Information Theory for Genomic Data Processing [SLIDES] [VIDEO]
Abstract: Over the last decades, advances in DNA sequencing technologies have significantly driven down the time and cost of acquiring genomic data. However, the acquisition, processing, and analysis of large amounts of genomic data pose several fundamental information-theoretic questions. How much sequencing data needs to be collected to reliably learn the genome of a species? How much can we compress genomic sequencing data while maintaining its usefulness? How do sequencing errors impact our ability to perform biologically valid inference?
In this talk, we will introduce key algorithmic problems in genomics and discuss how information-theoretic thinking can be useful. We will start with the classical problem of genome assembly and pose a basic information feasibility question: when does the data contain enough information for correct genome reconstruction? We will show that this viewpoint can lead to new algorithms with theoretical guarantees. We will then extend this information-theoretic perspective to other problem settings such as sequence alignment, metagenomics, and haplotype phasing. Finally, we will discuss how recent advances in DNA synthesis technologies have made the idea of DNA-based storage a reality. In this context, we will study the capacity of different DNA storage channels and discuss how this can provide insights into how to optimally encode information in this emerging storage medium.
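As a back-of-the-envelope companion to the "how much sequencing data is enough" question, here is the classical Lander-Waterman coverage calculation. It is a textbook approximation under idealized assumptions (uniformly random read positions, no repeats, no errors), not a result from the talk, and the genome and read lengths below are illustrative.

```python
import math

# Classical Lander-Waterman back-of-the-envelope (idealized: reads land
# uniformly at random, no repeats, no sequencing errors). With N reads of
# length L from a genome of length G, the coverage is c = N*L/G and the
# expected number of contigs is roughly N*exp(-c).
G = 3e9      # assumed genome length (human-scale)
L = 150      # assumed read length
for c in (5, 10, 20, 30):
    N = c * G / L
    print(f"coverage {c:2d}x: {N:.2e} reads, ~{N * math.exp(-c):.2e} expected contigs")
```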
Title: Theoretical Foundations of AI [SLIDES] [VIDEO Part 1] [VIDEO Part 2]
Abstract: Despite the wide empirical success of modern AI, many of the most commonly used approaches lack a clear mathematical foundation and often rely on poorly understood heuristics. Even when theoretical guarantees do exist, they are often too crude and/or pessimistic to explain their success in practical regimes of operation or to serve as a guiding principle for practitioners. This two-part tutorial aims to familiarize the audience with the theoretical progress over the last decade in the foundations of AI. Part I focuses on more classical theoretical results in the lazy/linear training regime, which can help demystify a host of classical deep learning phenomena ranging from overparameterization to early stopping and deep image priors. Part II will focus on explaining a more recent theoretical paradigm that operates in a more dynamic and nonlinear regime. We will explain how this emerging paradigm enables a more nuanced understanding of modern generative AI, ranging from feature learning and prompting to more recent inference-time reasoning methods.
Title: Causal Representation Learning [SLIDES] [VIDEO Part 1] [VIDEO Part 2]
Abstract: Machine learning (ML) has shown great success in learning low-dimensional and semantically interpretable representations of high-dimensional data. Recent leaps in designing transformers have further proliferated representation learning. Despite such success, strong generalization — the transfer of learned representations to new problems — remains an unsolved problem. Achieving strong generalization requires moving away from learning good-enough representations toward learning the ground-truth representation. As a key step toward strong generalization, causal representation learning (CRL) has emerged as a cutting-edge field that merges the strengths of statistical inference, machine learning, and causal inference. Its objective is to estimate the ground-truth latent representation of the data and the rich structures that model the interactions among the variables in the latent space. In this talk, we will explore the latest advancements in the emerging field of CRL. We will introduce the foundational concepts and motivations behind combining representation learning with causal inference. Following a brief history of CRL, we will describe its primary objectives and theoretical challenges. We will then review the key approaches to addressing these challenges, including CRL with multi-view observations, CRL with interventions on latent variables, and CRL applied to temporal data. We will also highlight real-world application opportunities, discuss the challenges in scaling CRL to practical use cases, and outline open theoretical and empirical questions for CRL.
Title: Universal Compression [VIDEO Part 1] [VIDEO Part 2]
[Abstract.]
Title: Foundations of Preference and Metric Learning for Pluralistic Alignment [SLIDES] [VIDEO]
Abstract: Large pre-trained models trained on internet-scale data are often not ready for deployment out of the box. They require extensive fine-tuning or alignment using large quantities of human preference data, typically gathered through pairwise comparisons. When aligning an AI/ML model to human preferences or values, it is crucial to ask: whose preferences and values are we aligning it to? Current approaches to preference alignment are severely limited by the inherent assumption of uniformity in preference models. This has necessitated the development of new modeling approaches and algorithms that can learn under heterogeneity, enabling the incorporation of a plurality of values in AI alignment.
In this tutorial, we will explore comparison-based preference and metric learning under heterogeneity. Our focus will be on mathematical modeling and algorithms, grounded in practical applications, along with theoretical analysis of these models and algorithms. While our primary motivating application is preference alignment, the topics covered in this tutorial have broad relevance across cognitive and behavioral psychology, crowdsourcing democracy, analysis of social science surveys, and recommendation systems.
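To make the pairwise-comparison setup concrete, here is a minimal sketch of fitting a single (homogeneous) Bradley-Terry preference model by gradient ascent; this single-score model is precisely the uniformity assumption the tutorial argues against, and pluralistic models generalize it. The data and parameters below are illustrative assumptions, not material from the tutorial.

```python
import numpy as np

# Minimal Bradley-Terry fit from pairwise comparisons (illustration only).
# Item i beats item j with probability sigmoid(s_i - s_j); this single-score
# model encodes the "uniform preferences" assumption that pluralistic
# alignment relaxes.
rng = np.random.default_rng(0)
n_items, n_cmp = 10, 5000
s_true = rng.normal(size=n_items)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

i_idx = rng.integers(0, n_items, n_cmp)
j_idx = rng.integers(0, n_items, n_cmp)
keep = i_idx != j_idx
i_idx, j_idx = i_idx[keep], j_idx[keep]
wins = rng.random(len(i_idx)) < sigmoid(s_true[i_idx] - s_true[j_idx])

s = np.zeros(n_items)                 # maximum-likelihood fit by gradient ascent
for _ in range(500):
    resid = wins - sigmoid(s[i_idx] - s[j_idx])   # d(log-lik)/d(s_i - s_j)
    grad = np.zeros(n_items)
    np.add.at(grad, i_idx, resid)
    np.add.at(grad, j_idx, -resid)
    s += grad / len(i_idx)            # unit step on the average gradient
    s -= s.mean()                     # scores identifiable only up to a shift

print("correlation with true scores:", np.corrcoef(s, s_true)[0, 1])
```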
Title: An Information-Theoretic Approach to Stochastic Optimization: Fundamental Limits and Optimal Algorithms [SLIDES] [VIDEO]
Abstract: Zeroth-order optimization lies at the core of many contemporary challenges in machine learning. This tutorial presents an information-theoretic framework for understanding and designing sample-optimal algorithms under zeroth-order oracle access. We begin with classical results in both discrete and continuous optimization, illustrating how tools such as KL divergence can provide principled guidance for algorithm design and lower bound analysis. Building on this intuition, we present recent advances demonstrating how information-theoretic principles can be substantially extended to yield a unified approach to resolving a wide range of fundamental problems, from the optimal design of gradient estimators to achieving minimax sample complexities in both instance-dependent and instance-independent settings.