Ziteng Cheng

December 5th


Title: Incorporating risk measures into reinforcement learning and inverse reinforcement learning

Speaker: Ziteng Cheng (University of Toronto)

Date/Time: Tuesday, 12/05, 7pm CET (10am PST, 1pm EST)

Abstract: In the first part of the talk, I will explore various methods for incorporating risk measures into Markov decision processes, with a particular focus on a framework that utilizes nested compositions of conditional risk mappings. We propose a distributional approach to this framework that accommodates weakly continuous dynamics, latent costs, and mixed actions, and we derive the corresponding dynamic programming principle. Additionally, I will present a novel distributional reinforcement learning method that solves the problem in discrete environments.
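
To make the idea of nested conditional risk mappings concrete, here is a minimal sketch of a risk-averse backward recursion on a tiny finite MDP. It is an illustration only, not the method presented in the talk: it assumes a finite state and action space, deterministic stage costs, and uses CVaR as one possible choice of one-step risk mapping. The names cvar, nested_risk_dp, P, and c are hypothetical.

```python
def cvar(values, probs, alpha):
    """CVaR of a discrete cost: mean of the worst alpha-fraction of outcomes.

    Larger values are worse. alpha = 1 recovers the plain expectation.
    """
    remaining = alpha
    total = 0.0
    # Walk through outcomes from worst (largest cost) to best.
    for v, p in sorted(zip(values, probs), reverse=True):
        take = min(p, remaining)
        total += take * v
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / alpha


def nested_risk_dp(P, c, horizon, alpha):
    """Backward recursion V_t(s) = min_a [ c(s,a) + CVaR(V_{t+1}(S') | s,a) ].

    P[s][a] is a list of next-state probabilities, c[s][a] a stage cost.
    Returns the time-0 values and the greedy policy for each stage.
    """
    n_states = len(c)
    V = [0.0] * n_states          # terminal value, assumed zero
    policy = []
    for _ in range(horizon):
        new_V, greedy = [], []
        for s in range(n_states):
            q = [c[s][a] + cvar(V, P[s][a], alpha) for a in range(len(c[s]))]
            best = min(range(len(q)), key=q.__getitem__)
            new_V.append(q[best])
            greedy.append(best)
        V = new_V
        policy.insert(0, greedy)  # earliest stage goes first
    return V, policy


# Toy example (assumed): state 0 is "good", state 1 is an absorbing "bad" state.
# Action 0 is safe (cost 1, stay put); action 1 is risky (cost 0, 10% chance of
# falling into the bad state, which costs 4 per step).
P = [[[1.0, 0.0], [0.9, 0.1]],
     [[0.0, 1.0], [0.0, 1.0]]]
c = [[1.0, 0.0],
     [4.0, 4.0]]

V_neutral, pi_neutral = nested_risk_dp(P, c, horizon=2, alpha=1.0)
V_averse, pi_averse = nested_risk_dp(P, c, horizon=2, alpha=0.1)
```

In this toy problem the risk-neutral agent (alpha = 1) takes the risky action at the first stage, while the agent with alpha = 0.1 prefers the safe one, because the recursion applies the risk mapping at every stage instead of to the total cost.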

In the second part, I will discuss a novel framework for identifying an agent's risk aversion using interactive questioning, in both the one-period and infinite-horizon cases. In the one-period case, we assume that the agent's risk aversion is characterized by a state-dependent cost function and a distortion risk measure. In the infinite-horizon case, we model risk aversion with an additional component, a discount factor. We establish that the agent's risk aversion is identifiable from a finite set of candidates. We also develop an algorithm for designing optimal questions and provide empirical evidence from simulations that our method learns risk aversion significantly faster than randomly designed questions do.
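
As a rough illustration of why a well-chosen question can separate candidate risk aversions, the sketch below evaluates a distortion risk measure for a discrete cost and shows two candidate distortions ranking the same pair of lotteries differently. It is a simplified example, not the algorithm from the talk: the state-dependent cost function is omitted, and the names distortion_risk, risk_neutral, cvar_distortion, and the two lotteries are assumptions made for this example.

```python
def distortion_risk(values, probs, g):
    """Distortion risk measure of a discrete cost X.

    Uses rho_g(X) = sum_i x_(i) * [g(P(X >= x_(i))) - g(P(X > x_(i)))],
    with outcomes sorted in increasing order and g a distortion function
    (nondecreasing, g(0) = 0, g(1) = 1).
    """
    tail = 1.0   # probability of the current outcome or anything worse
    risk = 0.0
    for v, p in sorted(zip(values, probs)):
        next_tail = max(tail - p, 0.0)
        risk += v * (g(tail) - g(next_tail))
        tail = next_tail
    return risk


def risk_neutral(u):
    """Identity distortion: recovers the expected cost."""
    return u


def cvar_distortion(alpha):
    """Distortion g(u) = min(u / alpha, 1), which yields CVaR at tail level alpha."""
    return lambda u: min(u / alpha, 1.0)


# Two hypothetical lotteries over costs, given as (values, probabilities).
lottery_a = ([1.0], [1.0])            # certain cost of 1
lottery_b = ([0.0, 4.0], [0.9, 0.1])  # usually free, occasionally costly

candidates = {"neutral": risk_neutral, "cvar_0.2": cvar_distortion(0.2)}

# Which lottery each candidate would pick (lower risk is better).
choices = {
    name: "A" if distortion_risk(*lottery_a, g) < distortion_risk(*lottery_b, g) else "B"
    for name, g in candidates.items()
}
```

Here the risk-neutral candidate picks lottery B and the CVaR candidate picks lottery A, so the agent's answer to this single question rules out one of the two candidates.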


This talk is based on joint work with Sebastian Jaimungal (UToronto), Anthony Coache (UToronto), and Nick Martin.


Bio: Ziteng is currently a Postdoctoral Fellow in the Department of Statistical Sciences at the University of Toronto, under the mentorship of Professor Sebastian Jaimungal. He earned a Ph.D. in Applied Mathematics from the College of Computing at Illinois Institute of Technology in 2021, guided by Professors Tomasz R. Bielecki and Ruoting Gong. His current research centers on incorporating risk measures into reinforcement learning, inverse reinforcement learning, and mean-field games.

Meeting Recording: https://ucsb.zoom.us/rec/share/xCBishh9gMonKnrLpjX99gqSArYhFITVm1UfI0WfCV1K73YeTlIJqxw1XFOPytQ.HRF-fI6FVmPs4y0o

Access Passcode: 0Wv?LBQ3