Date: November 10, 2025 (JST)
Venue: RIKEN AIP Nihonbashi Open Space
Format: Hybrid (In-person + Online)
Registration will open soon!
Related Topics
Online learning
Statistical learning
Reinforcement learning
Bandit algorithms
Combinatorial optimization
Online convex optimization
Bayesian optimization
Graph algorithms and graph mining
Causal inference and counterfactual reasoning
Privacy-preserving and communication-efficient learning
Federated and decentralized systems
Applications in science and real-world domains
09:30 – 10:00 Registration / Opening
10:00 – 11:00 Keynote Talk 1
11:00 – 11:20 Coffee Break
11:20 – 11:45 Talk 1
11:45 – 12:10 Talk 2
12:10 – 14:00 Lunch
14:00 – 15:00 Keynote Talk 2
15:00 – 15:10 Coffee Break
15:10 – 15:35 Talk 3
15:35 – 16:00 Talk 4
16:00 – 16:10 Coffee Break
16:10 – 16:35 Talk 5
16:35 – 17:00 Talk 6
17:00 – 17:25 Talk 7
17:25 – 17:30 Closing
17:30 – 19:30 Reception
Nicolò Cesa-Bianchi (University of Milan / Politecnico di Milano, Italy)
Title: Trades, Tariffs, and Regret: Online Learning in Digital Markets
Abstract: Online learning explores algorithms that acquire knowledge sequentially, through repeated interactions with an unknown environment. The general goal is to understand how fast an agent can learn based on the information received from the environment. Digital markets, with their complex ecosystems of algorithmic agents, offer a rich landscape of sequential decision-making problems, characterized by diverse decision spaces, utility functions, and feedback mechanisms. This talk will demonstrate how tackling challenges within digital markets has not only advanced our understanding of machine learning capabilities but also revealed novel insights into algorithmic efficiency and decision-making under uncertainty.
Nishant Mehta (University of Victoria)
Title: Elicitation Meets Online Learning: Games of Prediction with Advice from Self-Interested Experts
Abstract: The classical game of prediction with expert advice involves two players: a Decision Maker, who forecasts outcomes based on expert advice, and an adversarial Nature, which selects both the experts’ forecasts and the outcomes themselves. The experts’ forecasts are taken at face value: benchmarks such as external regret and swap regret are based on the performance of these forecasts. Yet real-world experts may have beliefs about the outcomes they forecast. If not properly incentivized, self-interested experts can fail to report their beliefs truthfully, compromising benchmarks based on the experts’ beliefs. A series of recent works has developed online learning algorithms that succeed in the face of such self-interested experts, drawing on past results in online learning while also giving the field both new results and new understanding. This talk will begin with a tour of fundamental mechanisms for eliciting experts’ beliefs. It will then cover recent progress in games of prediction with advice from self-interested experts, highlighting many open problems along the way.
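For readers unfamiliar with the classical game referenced in this abstract, the standard exponential-weights (Hedge) algorithm is a minimal sketch of prediction with expert advice; it is illustrative background only, not material from the talk, and the loss sequence and learning-rate tuning below are assumptions chosen for the example.

```python
import math
import random

def hedge(expert_losses, eta):
    """Exponential-weights (Hedge) for prediction with expert advice:
    keep a weight per expert, suffer the weighted-average loss each
    round, then downweight each expert in proportion to its loss."""
    n = len(expert_losses[0])                    # number of experts
    weights = [1.0] * n
    total = 0.0
    for losses in expert_losses:                 # one round per loss vector
        z = sum(weights)
        total += sum(w / z * l for w, l in zip(weights, losses))
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    best = min(sum(col) for col in zip(*expert_losses))
    return total, total - best                   # cumulative loss and regret

# Illustrative run: 2 experts over T rounds; expert 0 is right 90% of the time.
random.seed(0)
T = 500
loss_seq = [[0.0 if random.random() < 0.9 else 1.0, random.random()]
            for _ in range(T)]
eta = math.sqrt(8 * math.log(2) / T)             # standard tuning for N = 2
cum_loss, regret = hedge(loss_seq, eta)
```

With this tuning, Hedge guarantees regret at most sqrt(T ln N / 2) against the best expert in hindsight for losses in [0, 1]; the self-interested-experts setting of the talk asks what happens when the reported forecasts themselves cannot be trusted.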
Kohei Hatano (Kyushu University / RIKEN)
Title: TBD
Abstract: TBD
Junya Honda (Kyoto University / RIKEN)
Title: TBD
Abstract: TBD
Shinji Ito (The University of Tokyo / RIKEN)
Title: TBD
Abstract: TBD
Kyoungseok Jang (Chung-Ang University)
Title: TBD
Abstract: TBD
Yuko Kuroki (CENTAI Institute S.p.A.)
Title: Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits
Abstract: We study the problem of minimizing polarization and disagreement in the Friedkin-Johnsen opinion dynamics model under incomplete information. Unlike prior work, which assumes a static setting with full knowledge of users' innate opinions, we address the more realistic online setting where innate opinions are unknown and must be learned through sequential observations. This novel setting, which naturally mirrors periodic interventions on social media platforms, is formulated as a regret minimization problem, establishing a key connection between algorithmic interventions on social media platforms and the theory of multi-armed bandits. In our formulation, a learner observes only scalar feedback on the overall polarization and disagreement after an intervention. For this novel bandit problem, we propose a two-stage algorithm based on low-rank matrix bandits. The algorithm first performs subspace estimation to identify an underlying low-dimensional structure, and then employs a linear bandit algorithm within the compact low-dimensional representation derived from the estimated subspace. We prove that our algorithm achieves sublinear cumulative regret over any time horizon T. Empirical results validate that our algorithm significantly outperforms a linear bandit baseline in terms of both cumulative regret and running time.
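The two-stage structure described in the abstract (subspace estimation, then a linear bandit in the reduced space) can be sketched generically. The code below is an illustrative toy, not the paper's algorithm: the low-rank instance, the rank-one probing estimator, the LinUCB subroutine, and a reward-maximization (rather than polarization-minimization) objective are all simplifying assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance: scalar feedback <Theta*, X> + noise, where the
# unknown parameter matrix Theta* has low rank r (sizes are illustrative).
d, r, T1, T2, sigma = 8, 2, 500, 500, 0.1
B = np.linalg.qr(rng.normal(size=(d, r)))[0]
Theta_star = B @ np.diag([1.0, 0.5]) @ B.T        # low-rank PSD parameter

def feedback(X):
    return float(np.sum(Theta_star * X)) + sigma * rng.normal()

# Stage 1: subspace estimation from random rank-one probes. Averaging
# y * X over uniform unit-vector probes u (X = u u^T) yields a matrix whose
# top-r eigenvectors align with Theta*'s column space -- a simple stand-in
# for the paper's estimator.
M = np.zeros((d, d))
for _ in range(T1):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    X = np.outer(u, u)
    M += feedback(X) * X
U_hat = np.linalg.svd(M / T1)[0][:, :r]           # estimated top-r subspace

# Stage 2: a linear bandit (LinUCB) on the compact r*r representation
# phi(X) = vec(U_hat^T X U_hat), over a small set of candidate actions.
actions = []
for _ in range(20):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    actions.append(np.outer(u, u))
phis = [np.ravel(U_hat.T @ X @ U_hat) for X in actions]
k = r * r
A, bvec = np.eye(k), np.zeros(k)
reward_sum = 0.0
for t in range(T2):
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ bvec
    ucb = [p @ theta_hat + 0.5 * np.sqrt(p @ A_inv @ p) for p in phis]
    i = int(np.argmax(ucb))                       # optimistic action choice
    y = feedback(actions[i])
    reward_sum += y
    A += np.outer(phis[i], phis[i])
    bvec += y * phis[i]

best = max(float(np.sum(Theta_star * X)) for X in actions)
regret = best * T2 - reward_sum                   # regret vs. best fixed action
```

The point of the reduction is dimensionality: Stage 2 runs a d^2-dimensional matrix bandit as an r^2-dimensional linear bandit, which is what makes both the regret and the running time scale with the rank rather than the ambient dimension.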
Daiki Suehiro (Kyushu University / RIKEN AIP)
Title: TBD
Abstract: TBD
Shinji Ito (The University of Tokyo / RIKEN)
Junya Honda (Kyoto University / RIKEN)
Kohei Hatano (Kyushu University / RIKEN)
Yuko Kuroki (CENTAI Institute S.p.A.)
Sequential Decision Making Team,
RIKEN Center for Advanced Intelligence Project (AIP)
Computational Learning Theory Team,
RIKEN Center for Advanced Intelligence Project (AIP)
JSPS KAKENHI Grant-in-Aid for Scientific Research (B)
"Fundamental Technologies for Robust Dynamic Decision-Making Policies with Optimality in Diverse Environments"
JST PRESTO (AI and Robotics for Innovation in Research and Development Process)
"Dynamic Environment Analysis and Its Applications Using Sequential Learning Theory and Graph Mining Techniques"