Consistently Simulating Human Personas with Multi-turn Reinforcement Learning 

 

Marwa Abdulhai, Ryan Cheng, Donovan Clay, Tim Althoff, Sergey Levine, Natasha Jaques

UC Berkeley, University of Washington, Google Research

NeurIPS 2025
