Talk Date and Time: November 10, 2022, 4:00 pm - 4:45 pm EST, followed by 10 minutes of Q&A, on Zoom and in IRB-5105
Topic: Hierarchical Bayesian Bandits
Abstract:
The exploration-exploitation trade-off is a fundamental online learning problem: balancing exploration actions that lead to learning a better model against exploitation actions that leverage it. The multi-armed bandit has arisen as a de facto standard approach to solving this problem, with many successful applications in practice. In this talk, we extend bandit algorithms to complex decision problems represented by hierarchical Bayesian models. This extension leads to simple and natural algorithms for multi-task, meta-, and federated learning. The algorithms can be implemented exactly as analyzed, work well in practice, and their analyses reflect the improvements due to the hierarchical structure.
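
For intuition, the following is a minimal sketch of hierarchical Thompson sampling in a two-level Gaussian model, in the spirit of the setting above; it is not the speaker's code, and the specific model, variances, and variable names are illustrative assumptions. Per-task mean rewards are drawn from a shared, unknown hyper-parameter, so observations from one task sharpen the posterior used by all others.

import numpy as np

# Illustrative hierarchical Thompson sampling with known variances:
#   mu ~ N(mu0, q0)        shared hyper-parameter across tasks
#   theta_m ~ N(mu, q)     per-task mean reward, m = 1..M
#   r ~ N(theta_m, s)      reward observed when task m is chosen
# This model and all names are assumptions for the sketch only.

rng = np.random.default_rng(0)
M = 5                 # number of tasks (treated as arms for simplicity)
mu0, q0 = 0.0, 1.0    # hyper-prior mean and variance
q, s = 0.5, 1.0       # task-level and observation variances

mu_true = rng.normal(mu0, np.sqrt(q0))
theta_true = rng.normal(mu_true, np.sqrt(q), size=M)

n = np.zeros(M)       # pull counts per task
sum_r = np.zeros(M)   # reward sums per task

for t in range(1000):
    # Posterior of mu (Gaussian conjugacy): each task's sample mean is
    # distributed as N(mu, q + s / n[m]) after marginalizing theta_m.
    prec = 1.0 / q0
    mean_acc = mu0 / q0
    for m in range(M):
        if n[m] > 0:
            v = q + s / n[m]
            prec += 1.0 / v
            mean_acc += (sum_r[m] / n[m]) / v
    mu_var = 1.0 / prec
    mu_hat = mean_acc * mu_var

    # Hierarchical sampling: draw mu, then each theta_m | mu, data.
    mu_s = rng.normal(mu_hat, np.sqrt(mu_var))
    theta_s = np.empty(M)
    for m in range(M):
        post_prec = 1.0 / q + n[m] / s
        post_mean = (mu_s / q + sum_r[m] / s) / post_prec
        theta_s[m] = rng.normal(post_mean, np.sqrt(1.0 / post_prec))

    a = int(np.argmax(theta_s))   # act greedily w.r.t. the posterior sample
    r = rng.normal(theta_true[a], np.sqrt(s))
    n[a] += 1
    sum_r[a] += r

print("estimated best task:", int(np.argmax(sum_r / np.maximum(n, 1))))
print("true best task:     ", int(np.argmax(theta_true)))
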
Bio:
I am a Principal Scientist at Amazon AWS AI Labs, where I propose, analyze, and apply algorithms that learn incrementally, run in real time, and converge to near-optimal solutions as the number of observations increases. Most of my recent work focuses on designing bandit algorithms for structured real-world problems, and I have made several fundamental contributions to the field of multi-armed bandits. My earlier work focused on structured bandit problems with graphs, submodularity, semi-bandit feedback, and low-rank matrices. This culminated in my work on online learning to rank, where we designed bandit algorithms that handle both combinatorial action sets and partial feedback. These algorithms are simple, theoretically sound, robust, and remain the state of the art. My recent work also involves follow-the-perturbed-leader exploration, which can be analyzed up to generalized linear bandits and applied to neural networks; latent bandits, which can be combined with offline graphical models; and even learning bandit algorithms from logged data.