Learning to Defer in Content Moderation: The Human-AI Interplay [slides]

Date: May 20, 2024

Speaker:

Thodoris Lykouris

Massachusetts Institute of Technology

More Info:

Abstract

Successful content moderation is vital for a healthy online social platform: harmful posts must be removed promptly without jeopardizing non-harmful content. Due to the high volume of online posts, human-only moderation is operationally challenging, and platforms often employ a human-AI collaboration approach. A typical machine-learning heuristic estimates the expected harmfulness of each incoming post and uses fixed thresholds to decide whether to remove the post (classification decision) and whether to send it for human review (admission decision). This can be inefficient, as it disregards the uncertainty in the machine-learning estimates, the time-varying nature of human review capacity and post arrivals, and the selective sampling in the dataset (humans only review posts filtered in by the admission algorithm).
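A minimal sketch of the fixed-threshold heuristic the abstract describes (the threshold values, function name, and score scale are illustrative assumptions, not any platform's actual policy):

```python
# Hypothetical fixed thresholds on the estimated harmfulness score in [0, 1].
REMOVE_THRESHOLD = 0.8   # classification: remove the post above this score
ADMIT_THRESHOLD = 0.4    # admission: send the post for human review above this score

def moderate(harm_estimate: float) -> tuple[bool, bool]:
    """Return (removed, sent_to_human) for one post's estimated harmfulness."""
    removed = harm_estimate >= REMOVE_THRESHOLD
    admitted = harm_estimate >= ADMIT_THRESHOLD
    return removed, admitted

# Example: a post scored 0.6 stays up but is queued for human review.
```

Both decisions depend only on the point estimate, which is exactly the inefficiency the abstract points out: the rule ignores estimation uncertainty, the current congestion of the review queue, and the feedback loop through selectively sampled training data.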


We introduce a model that captures the human-AI interplay in content moderation. The algorithm observes contextual information for incoming posts, makes classification and admission decisions, and schedules posts for human review. Non-admitted posts receive no reviews (selective sampling), while admitted posts receive human reviews of their harmfulness. These reviews help train the machine-learning algorithm but are delayed due to congestion in the human review system. The classical learning-theoretic way to capture this human-AI interplay is the framework of learning to defer, in which the algorithm has the option to defer a classification task to humans for a fixed cost and immediately receive feedback. Our model contributes to this literature by introducing congestion in the human review system. Moreover, unlike work on online learning with delayed feedback, where the delay is exogenous to the algorithm's decisions, the delay in our model is endogenous to both the admission and the scheduling decisions.


We propose a near-optimal learning algorithm that carefully balances the classification loss from a selectively sampled dataset, the idiosyncratic loss of non-reviewed posts, and the delay loss from congestion in the human review system. To the best of our knowledge, this is the first result on online learning in contextual queueing systems, and hence our analytical framework may be of independent interest.


This talk is based on joint work with Wentao Weng (Ph.D. student at MIT); a preprint of the corresponding paper can be found here: https://arxiv.org/pdf/2402.12237.


Speaker's Bio

Thodoris Lykouris is an Assistant Professor of Operations Management at the MIT Sloan School of Management. Before joining MIT, Thodoris received his Ph.D. in Computer Science from Cornell University and was then a postdoctoral researcher at Microsoft Research New York. His research focuses on data-driven sequential decision-making and spans the areas of machine learning, dynamic optimization, and economics. Thodoris publishes in journals such as the Journal of the ACM, Mathematics of Operations Research, and Operations Research, as well as conferences such as COLT, EC, ICML, NeurIPS, and STOC. His research has been recognized with a Google Ph.D. Fellowship and was a finalist for the Dantzig Dissertation Award, the Nicholson Student Paper Competition, and the Applied Probability Society Student Paper Competition. Thodoris recently co-organized an interdisciplinary semester-long program on "Data-Driven Decision Processes" at the Simons Institute for the Theory of Computing at Berkeley and is currently serving as an Associate Editor in the Stochastic Models area of Operations Research.

Stochastic Matching Models and Matching Queues - Survey and Perspectives [Slides]

Date: May 6, 2024

Speaker:

Pascal Moyal

Université de Lorraine

More Info:

Abstract

In this talk, we present a series of results regarding the so-called stochastic matching model, seen as a generalization of bipartite matching queues to general (possibly non-bipartite) graphs. After reviewing the main existing results on stability, access control, and optimization of the matching policy, we introduce a few recent extensions and investigations of this class of systems: (i) the sub-additivity property and its applications to perfect simulation; (ii) the extension to hypergraphical matching structures, which is crucial for applications such as assemble-to-order systems; and (iii) insightful connections with the construction of performant online matching algorithms on large random graphs.
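To fix ideas, here is a toy simulation of a stochastic matching model under the first-come-first-matched (FCFM) policy on a small non-bipartite compatibility graph; the item classes, graph, and policy details are illustrative assumptions, not any specific model from the talk:

```python
from collections import deque

# Compatibility graph: class 'b' is compatible with itself, so the graph is
# non-bipartite (the generalization the talk emphasizes).
COMPATIBLE = {
    'a': {'b'},
    'b': {'a', 'b', 'c'},
    'c': {'b'},
}

def fcfm(arrivals):
    """Match each arriving item with the longest-waiting compatible item, if any;
    otherwise the item joins its class queue. Returns the list of matches."""
    queues = {c: deque() for c in COMPATIBLE}
    matches = []
    for t, cls in enumerate(arrivals):
        # Candidate partners: the head (oldest item) of each compatible class queue.
        candidates = [(queues[c][0], c) for c in COMPATIBLE[cls] if queues[c]]
        if candidates:
            _, partner = min(candidates)   # earliest arrival wins (FCFM)
            queues[partner].popleft()
            matches.append((partner, cls))
        else:
            queues[cls].append(t)
    return matches
```

For example, on the arrival sequence `['a', 'c', 'b', 'b']`, the first `b` matches the waiting `a` (it arrived before `c`) and the second `b` matches the waiting `c`; on `['b', 'b']`, the two `b` items match each other, which is only possible because the graph is non-bipartite.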


Speaker's Bio

Pascal Moyal is a Full Professor in the mathematics department (IECL) of Université de Lorraine (UL) in Nancy, France. He is currently Head of the Probability and Statistics Team of IECL and of the Master's Program IMSD at UL. He is also an associate researcher in the Inria PASTA project team and a lecturer at CNAM and Telecom ParisTech. His research focuses on the stochastic modeling and analysis of networks, in particular Markov chain modeling, weak approximations of stochastic processes in large state spaces, stochastic processes on random graphs, graph theory, and ergodic theory. Before joining UL in 2018, Pascal Moyal was an Associate Professor at UTC (Compiègne, France) and then a Visiting Associate Professor for two years in the IEMS Department of Northwestern University (Evanston, USA).

Accelerating Convergence of Score-Based Diffusion Models, Provably [Slides]

Date: April 22, 2024

Speaker:

Yuxin Chen

University of Pennsylvania

More Info:

Abstract

Diffusion models, which convert noise into new data instances by learning to reverse a Markov diffusion process, have become a cornerstone in contemporary generative modeling. While their practical power has now been widely recognized, theoretical underpinnings for mainstream samplers remain far from mature. Additionally, despite a flurry of recent activity towards speeding up diffusion-based samplers in practice, convergence theory for acceleration techniques remains severely limited. In this talk, we first present a new suite of non-asymptotic theory towards understanding the popular DDIM (or probability flow ODE) sampler in discrete time, which significantly improves upon prior convergence guarantees for this sampler. We then design training-free algorithms that provably accelerate the DDIM and DDPM samplers, which leverage insights from higher-order approximation and share similar intuitions as popular high-order ODE solvers like DPM-Solver-2. Our non-asymptotic theory accommodates L2-accurate score estimates and does not require log-concavity or smoothness of the target distribution.
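For concreteness, a single deterministic DDIM update step in the standard notation, where the cumulative signal level (often written as alpha-bar) increases toward 1 as sampling approaches the data; the noise predictor `eps_pred` stands in for a learned network, and the function name is ours:

```python
import numpy as np

def ddim_step(x_t, t_abar, s_abar, eps_pred):
    """One deterministic DDIM step: predict x0 from the current iterate and
    the predicted noise, then recombine at the next signal level s_abar
    (closer to the data, s_abar > t_abar)."""
    x0_hat = (x_t - np.sqrt(1.0 - t_abar) * eps_pred) / np.sqrt(t_abar)
    return np.sqrt(s_abar) * x0_hat + np.sqrt(1.0 - s_abar) * eps_pred
```

A sanity check on the update: if `eps_pred` equals the true noise, the step maps `sqrt(t_abar) * x0 + sqrt(1 - t_abar) * eps` exactly to `sqrt(s_abar) * x0 + sqrt(1 - s_abar) * eps`, i.e., it moves the iterate along the noise schedule without error.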


This is based on joint work with Gen Li, Yu Huang, Timofey Efimov, Yuejie Chi and Yuting Wei.


Paper 1: arxiv.org/abs/2306.09251

Paper 2: arxiv.org/abs/2403.03852


Speaker's Bio

Yuxin Chen is currently an associate professor of statistics and data science and of electrical and systems engineering at the University of Pennsylvania. Before joining UPenn, he was an assistant professor of electrical and computer engineering at Princeton University. He completed his Ph.D. in Electrical Engineering at Stanford University and was also a postdoctoral scholar at Stanford Statistics. His current research interests include high-dimensional statistics, nonconvex optimization, and machine learning theory. He has received the Alfred P. Sloan Research Fellowship, the SIAM Activity Group on Imaging Science Best Paper Prize, and the ICCM Best Paper Award (gold medal), and was selected as a finalist for the Best Paper Prize for Young Researchers in Continuous Optimization. He has also received the Princeton Graduate Mentoring Award.

Towards Learning-based Approximately Optimal Control in (Constrained) Decentralized Dynamic Teams [Slides]

Date: April 8, 2024

Speaker:

Vijay Subramanian

University of Michigan

More Info:

Abstract

In this talk we will discuss our efforts on a principled approach to developing learning-based approximately optimal control in (constrained) decentralized dynamic teams. We describe the approach using a (realistic) example: optimal control of the transmission of multiple video streams over a wireless downlink between a base-transceiver-station (BTS)/access point and N end-devices (EDs), and of the play-out of the streams at the EDs. The BTS sends video packets to each ED under a joint transmission-energy constraint, the EDs choose when to play out the received packets, and the collective goal is to provide a high Quality of Experience (QoE) to the clients/end-users. Each ED sends feedback about its state and actions to the BTS, which arrives after a fixed deterministic delay (in our model). We analyze this team problem with delayed feedback as a cooperative Multi-Agent Constrained Partially Observable Markov Decision Process (MA-C-POMDP).


A core result that we will discuss in some detail is a recently established strong duality result for MA-C-POMDPs (in the discounted cost setting). Using this new result, the original video-streaming problem is decomposed into N independent unconstrained transmitter-receiver (two-agent) problems, all sharing a common Lagrange multiplier (which must itself be optimized for optimal control). Thereafter, the common-information (CI) approach and the formalism of approximate information states (AIS) are used to guide the design of a neural-network-based architecture for learning-based multi-agent control in a single unconstrained transmitter-receiver pair (team) problem. Simulations on such a single transmitter-receiver pair with a stylized QoE model highlight the advantage of delay-aware two-agent coordination over a strategy where the transmitter chooses both transmission and play-out actions (perceiving the delayed state of the receiver as its current state). We will conclude by discussing generalizations to stochastic dynamic games.
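A toy static illustration of the Lagrangian decomposition described above: a shared multiplier prices the joint energy constraint, after which each stream's subproblem is solved independently, and the multiplier is tuned by projected dual ascent. The streams, their (energy, QoE) options, the budget, and the step sizes are all illustrative assumptions, not the paper's model:

```python
# Each stream's options: (energy used, QoE achieved). Two streams share a budget.
STREAMS = [
    [(0.0, 0.0), (1.0, 3.0), (2.0, 4.0)],
    [(0.0, 0.0), (1.0, 2.0), (2.0, 2.5)],
]
BUDGET = 2.0  # joint transmission-energy constraint

def best_response(options, lam):
    """Given the energy price lam, each stream independently maximizes
    QoE minus the priced energy cost (its unconstrained subproblem)."""
    return max(options, key=lambda o: o[1] - lam * o[0])

def dual_ascent(steps=200, lr=0.05):
    """Projected subgradient ascent on the common Lagrange multiplier."""
    lam = 0.0
    for _ in range(steps):
        used = sum(best_response(opts, lam)[0] for opts in STREAMS)
        lam = max(0.0, lam + lr * (used - BUDGET))  # raise price if over budget
    return lam, [best_response(opts, lam) for opts in STREAMS]
```

At the resulting price, the independently chosen allocations jointly satisfy the budget; this mirrors how the strong duality result lets the N-agent constrained problem be split into unconstrained two-agent problems coupled only through the common multiplier.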


This is joint work with Nouman Khan (UMich, Ann Arbor), Hsu Kao (JP Morgan, formerly at UMich, Ann Arbor), and Ujwal Dinesha, Subrahmanyam Arunachalam, Dheeraj Narasimha, and Srinivas Shakkottai (TAMU). It is based on three recent papers: one presented at AISTATS 2022, one at IEEE CDC 2023, and a third accepted at IEEE INFOCOM 2024.

Speaker's Bio

Vijay Subramanian received the Ph.D. degree in electrical engineering from the University of Illinois at Urbana-Champaign, IL, USA, in 1999. He worked at Motorola Inc. and at the Hamilton Institute, Maynooth, Ireland, for many years, and also in the EECS Department at Northwestern University, Evanston, IL, USA. In Fall 2014, he began his current position as an Associate Professor in the EECS Department at the University of Michigan, Ann Arbor. For the academic years 2022-2023 and 2023-2024, he was an Adjunct Research Associate Professor in CSL and ECE at UIUC. His current research interests are in stochastic analysis, random graphs, multi-agent systems, and game theory (mechanism and information design) with applications to social, economic, and technological networks.

Capitalizing Generative AI: Diffusion Models Towards High-Dimensional Generative Optimization [slides]

Date: March 25, 2024

Speaker:

Mengdi Wang

Princeton University

More Info:

Abstract

Diffusion models represent a significant breakthrough in generative AI, operating by progressively transforming random noise distributions into structured outputs, with adaptability for specific tasks through guidance or fine-tuning. In this presentation, we delve into the statistical aspects of diffusion models and establish their connection to theoretical optimization frameworks. In the first part, we explore how unconditional diffusion models efficiently capture complex high-dimensional data, particularly when low-dimensional structures are present. We present the first efficient sample-complexity bound for diffusion models that depends on the small intrinsic dimension, effectively addressing the curse of dimensionality. Moving to the second part, we leverage our understanding of diffusion models to introduce a pioneering optimization method termed "generative optimization." Here, we harness diffusion models as data-driven solution generators to maximize an unknown objective function. We introduce innovative reward guidance techniques incorporating the target function value to guide the diffusion model. Theoretical analysis in the offline setting demonstrates that the generated solutions yield higher function values on average, with optimality gaps aligning with off-policy bandit regret. Moreover, these solutions maintain fidelity to the intrinsic structures within the training data, suggesting a promising avenue for optimization in complex, structured spaces through generative AI.
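A toy illustration of the reward-guidance idea, using guided Langevin dynamics as a simple stand-in for a guided reverse-diffusion sampler: the drift combines the data score with the gradient of the target objective, so samples trade off fidelity to the data distribution against higher reward. The Gaussian data score, quadratic reward, and guidance weight are illustrative assumptions, not the paper's estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

def data_score(x):
    """Exact score of N(0, 1) training data: d/dx log p(x) = -x."""
    return -x

def reward_grad(x):
    """Gradient of an assumed reward r(x) = -(x - 2)^2 / 2, peaked at x = 2."""
    return -(x - 2.0)

def guided_sample(guidance=1.0, steps=500, dt=0.02):
    """Draw one sample from (roughly) p(x) * exp(guidance * r(x)) by running
    Langevin dynamics whose drift adds the reward gradient to the data score."""
    x = rng.standard_normal()
    for _ in range(steps):
        drift = data_score(x) + guidance * reward_grad(x)
        x += dt * drift + np.sqrt(2 * dt) * rng.standard_normal()
    return x
```

With `guidance = 1`, samples concentrate around mean 1, between the data mean (0) and the reward peak (2): guided samples achieve higher reward on average while staying close to the training distribution, which is the qualitative behavior the theory quantifies.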

Speaker's Bio

Mengdi Wang is an associate professor at the Center for Statistics and Machine Learning, the Department of Electrical and Computer Engineering, the Department of Computer Science, and the Omenn-Darling Bioengineering Institute at Princeton University. She is also affiliated with the Princeton ML Theory Group and the Princeton Language+Intelligence Initiative, and has been a visiting research scientist at DeepMind, the Institute for Advanced Study, and the Simons Institute for the Theory of Computing. She received her Ph.D. from MIT LIDS in 2013. She works on machine learning theory, reinforcement learning, language models, and their applications in healthcare and bioscience. She was Program Chair of ICLR 2023 and serves as a Senior Area Chair for NeurIPS, ICML, and COLT, and as an Associate Editor of the Harvard Data Science Review.