Time: March 21, Friday, 12:15 -- 1:45pm
Location: Rice 109
Title: Pessimism Principle for Efficient Decision-Making with Offline Data
Abstract:
Offline reinforcement learning (RL) is a powerful paradigm that enables agents to learn policies from pre-collected datasets without requiring active interaction with the environment. This is particularly valuable in scenarios where data collection is expensive, dangerous, or impractical, such as healthcare, robotics, autonomous driving, and finance.
A key challenge in offline RL is learning an effective policy from data that may have been collected using suboptimal policies, leading to distributional shift. Theoretically, it is crucial to characterize the quality of offline data by quantifying the discrepancy between the dataset and data that would be gathered by an optimal policy. From a lower-bound perspective, such a characterization should define an information-theoretic limit that no algorithm can surpass. On the upper-bound side, our goal is to design algorithms that scale with this tight characterization. This raises a fundamental question: what principles should guide the design of such algorithms?
In this talk, I will introduce a data-dependent characterization of offline data quality together with a matching lower bound, and discuss the pessimism principle as a guiding approach for offline RL algorithm design that yields guarantees scaling with this characterization. The pessimism principle advocates conservative estimation in the face of uncertainty, ensuring robust decision-making from offline data. I will also contrast the pessimism principle in offline RL with the optimism principle commonly used in online RL.
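To give a flavor of the idea, here is a minimal, hypothetical sketch (not from the talk) of pessimism in an offline bandit setting: instead of picking the action with the highest empirical mean, a pessimistic learner penalizes each estimate by its uncertainty (a lower confidence bound), so an action that merely looks good from a few lucky samples is not selected. The data, the penalty form `beta / sqrt(n)`, and the function names are illustrative assumptions.

```python
import numpy as np

# Toy offline dataset (illustrative): action 0 is well covered by the
# behavior policy (100 samples), action 1 was tried only once and got
# a lucky reward.
rewards = {
    0: np.full(100, 0.5),   # 100 observations, mean 0.5
    1: np.array([1.0]),     # 1 observation, mean 1.0 (noise-prone)
}

def greedy_action(rewards):
    # Naive approach: trust empirical means, ignore uncertainty.
    means = {a: r.mean() for a, r in rewards.items()}
    return max(means, key=means.get)

def pessimistic_action(rewards, beta=1.0):
    # Pessimism: act on a lower confidence bound. The penalty shrinks
    # as 1/sqrt(n), so poorly covered actions are heavily discounted.
    lcb = {a: r.mean() - beta / np.sqrt(len(r)) for a, r in rewards.items()}
    return max(lcb, key=lcb.get)

print(greedy_action(rewards))       # picks action 1 on its single sample
print(pessimistic_action(rewards))  # picks the well-covered action 0
```

The greedy rule is fooled by the single lucky sample, while the pessimistic rule prefers the action the dataset actually covers well, which is exactly the robustness property the talk motivates.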
Short Bio:
Haolin Liu is a PhD candidate in Computer Science at the University of Virginia, advised by Professor Chen-Yu Wei. His research focuses on reinforcement learning theory and its application to language models, with publications at NeurIPS and ICLR, including spotlight recognition. More details can be found on his homepage: https://liuhl2000.github.io/.