Reinforcement Learning for Real Life Workshop @ NeurIPS 2022

Dec 3, 2022. New Orleans, USA. In-person.  

Website @ NeurIPS 2022 (videos, posters, etc.)


7:30-8:25 Posters (for early birds; optional)

8:25-8:30 Opening Remarks

8:30-9:00 Invited Talk: Peter Stone, UTAustin/Sony

9:00-9:30 Invited Talk: Robert Nishihara, Anyscale

9:30-10:00 Invited Talk: Dhruv Madeka, Amazon

10:00-10:20 Coffee Break

10:20-11:10 Panel: RL Implementation

11:10-12:00 Panel: RL Benchmarks

12:00-13:30 Lunch/Posters

13:30-14:00 Invited Talk: Matej Balog, DeepMind

14:00-14:55 Panel: RL Theory-Practice Gap

14:55-15:00 Closing Remarks

15:00-15:30 Coffee Break/Posters

15:30-17:00 Posters

Invited talks

Dhruv Madeka, Amazon

Title: Deep Reinforcement Learning for Real-World Inventory Management

Abstract: We present a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching. While this dynamic program has historically been considered intractable, we show that several policy learning approaches are competitive with or outperform classical baseline approaches. In order to train these algorithms, we develop novel techniques to convert historical data into a simulator and present a collection of results that motivate this approach. We also present a model-based reinforcement learning procedure (Direct Backprop) to solve the dynamic periodic review inventory control problem by constructing a differentiable simulator. Under a variety of metrics Direct Backprop outperforms model-free RL and newsvendor baselines, in both simulations and real-world deployments.

Bio: Dhruv Madeka is a Principal Machine Learning Scientist at Amazon. His current research focuses on applying Deep Reinforcement Learning to supply chain problems. Dhruv has also worked on developing generative and supervised deep learning models for probabilistic time series forecasting. Previously, Dhruv worked in the Quantitative Research team at Bloomberg LP, developing open source tools for the Jupyter ecosystem and conducting advanced mathematical research in derivatives pricing, quantitative finance, and election forecasting.

Robert Nishihara, Anyscale 

Title: Scaling reinforcement learning in the real world, from gaming to finance to manufacturing

Abstract: Reinforcement learning is transforming industries from gaming to robotics to manufacturing. This talk showcases how a variety of industries are adopting reinforcement learning to overhaul their businesses, from changing the nature of game development to designing the boat that won the America's Cup. These industries leverage Ray, a distributed framework for scaling Python applications and machine learning applications. Ray is used by companies across the board from Uber to OpenAI to Shopify to Amazon to scale their machine learning training, inference, data ingest, and reinforcement learning workloads.

Bio: Robert Nishihara is one of the creators of Ray, a distributed framework for scaling Python applications and machine learning applications. Ray is used by companies across the board from Uber to OpenAI to Shopify to Amazon to scale their machine learning training, inference, data ingest, and reinforcement learning workloads. He is one of the co-founders and CEO of Anyscale, which is the company behind Ray. He did his PhD in machine learning and distributed systems in the computer science department at UC Berkeley. Before that, he majored in math at Harvard.

Peter Stone, UTAustin/Sony

Title: Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning

Abstract: Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world's best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing's important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world's best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.

Bio: Dr. Peter Stone holds the Truchard Foundation Chair in Computer Science at the University of Texas at Austin. He is Associate Chair of the Computer Science Department, as well as Director of Texas Robotics. In 2013 he was awarded the University of Texas System Regents' Outstanding Teaching Award and in 2014 he was inducted into the UT Austin Academy of Distinguished Teachers, earning him the title of University Distinguished Teaching Professor. Professor Stone's research interests in Artificial Intelligence include machine learning (especially reinforcement learning), multiagent systems, and robotics. Professor Stone received his Ph.D. in Computer Science in 1998 from Carnegie Mellon University. From 1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles Research Department at AT&T Labs - Research. He is an Alfred P. Sloan Research Fellow, Guggenheim Fellow, AAAI Fellow, IEEE Fellow, AAAS Fellow, ACM Fellow, Fulbright Scholar, and 2004 ONR Young Investigator. In 2007 he received the prestigious IJCAI Computers and Thought Award, given biennially to the top AI researcher under the age of 35, and in 2016 he was awarded the ACM/SIGAI Autonomous Agents Research Award. Professor Stone co-founded Cogitai, Inc., a startup company focused on continual learning, in 2015, and currently serves as Executive Director of Sony AI America.

Matej Balog, DeepMind

Title: AlphaTensor: Discovering faster matrix multiplication algorithms with RL 

Abstract: Improving the efficiency of algorithms for fundamental computational tasks such as matrix multiplication can have widespread impact, as it affects the overall speed of a large number of computations. Automatic discovery of algorithms using ML offers the prospect of reaching beyond human intuition and outperforming the current best human-designed algorithms. In this talk I’ll present AlphaTensor, our RL agent based on AlphaZero for discovering efficient and provably correct algorithms for the multiplication of arbitrary matrices. AlphaTensor discovered algorithms that outperform the state-of-the-art complexity for many matrix sizes. Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time since its discovery 50 years ago. I’ll present our problem formulation as a single-player game, the key ingredients that enable tackling such difficult mathematical problems using RL, and the flexibility of the AlphaTensor framework.

Bio: Matej Balog is a Senior Research Scientist at DeepMind, working in the Science team on applications of AI to Maths and Computation. Prior to joining DeepMind he worked on program synthesis and understanding, and was a PhD student at the University of Cambridge with Zoubin Ghahramani, working on general machine learning methodology, in particular on conversions between fundamental computational tasks such as integration, sampling, optimization, and search.


Panel: RL Benchmarks

Moderator: Minmin Chen, Google


Pablo Samuel Castro, Google

Linxi "Jim" Fan, Nvidia

Caglar Gulcehre, DeepMind

Tony Jabara, Spotify

Peter Stone, UTAustin/Sony

Panel: RL Implementation: From the Halfway Point to the Last Mile

Chair: Xiaolin Ge, Summit Human Capital

Moderator: Alborz Geramifard, Meta


Kence Anderson, Microsoft

Craig Buhr, MathWorks

Robert Nishihara, Anyscale

Yuandong Tian, Meta/Facebook AI Research (FAIR)

Panel: RL Theory-Practice Gap

Moderator: Peter Stone, UTAustin/Sony


Matej Balog, DeepMind

Jonas Buchli, DeepMind

Jason Gauci, Argo AI 

Dhruv Madeka, Amazon

Accepted Papers

Optimizing Audio Recommendations for the Long-Term

Lucas Maystre (Spotify)*; Daniel Russo (Columbia); Yu Zhao (Spotify)

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization

Kaixuan Huang (Princeton University)*; Yu Wu (Princeton University); Xuezhou Zhang (Princeton); Shenyinying Tu (LinkedIn); Qingyun Wu (Pennsylvania State University); Mengdi Wang (Princeton University/DeepMind); Huazheng Wang (Oregon State University)

Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response

Vincent Mai (Mila, Université de Montréal)*; Philippe Maisonneuve (Polytechnique Montreal & GERAD); Tianyu Zhang (Mila, Université de Montréal); Jorge Montalvo Arvizu (Solario); Liam Paull (University of Montreal); Antoine Lesage-Landry (Polytechnique Montréal & GERAD)

LibSignal: An Open Library for Traffic Signal Control

Hao Mei (New Jersey Institute of Technology); Xiaoliang Lei (Xi'an Jiaotong University); Longchao Da (New Jersey Institute of Technology); Bin Shi (Xi'an Jiaotong University); Hua Wei (NJIT)*

Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management

Yuandong Ding (Huazhong University of Science and Technology); Mingxiao Feng (University of Science and Technology of China); Guozi Liu (Carnegie Mellon University); Wei Jiang (University of Illinois at Urbana-Champaign); Chuheng Zhang (Microsoft Research)*; Li Zhao (Microsoft Research); Lei Song (Microsoft Research); Houqiang Li (University of Science and Technology of China); Yan Jin (Huazhong University of Science and Technology); Jiang Bian (Microsoft Research)

Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs

Benjamin Fuhrer (Nvidia Networking)*; Yuval Shpigelman (Nvidia Networking); Chen Tessler (NVIDIA); Shie Mannor (Nvidia); Gal Chechik (Nvidia); Eitan Zahavi (Nvidia Networking); Gal Dalal (NVIDIA Research)

Controlling Commercial Cooling Systems Using Reinforcement Learning

Jerry Luo (DeepMind)*; Cosmin Paduraru (DeepMind); Octavian Voicu (DeepMind); Yuri Chervonyi (Google/DeepMind); Scott Munns (Trane Technologies); Jerry Li (DeepMind); Crystal Qian (Google); Praneet Dutta (Google); Daniel Mankowitz (DeepMind); Jared Q Davis (Stanford University/DeepMind); Ningjia Wu (Google); Xingwei Yang (Google); Chu-Ming Chang (Google); Ted Li (Google); Rob Rose (Google); Mingyan Fan (Google); Hootan Nakhost (Google); Tinglin Liu (Google); Deeni Fatiha (DeepMind); Neil Satra (Google); Juliet Rothenberg (Google); Molly Carlin (DeepMind); Satish Tallapaka (Google); Sims Witherspoon (DeepMind); David Parish (Google); Peter Dolan (DeepMind); Chenyu Zhao (Google)

Structured Q-learning For Antibody Design

Alexander I Cowen-Rivers (Preferred Networks, Inc.)*; Philip John Gorinski (Huawei Noah's Ark Lab); Aivar Sootla (Huawei); Asif Khan (University of Edinburgh); Jun Wang (UCL); Jan Peters (TU Darmstadt); Haitham Bou Ammar (Huawei)

Semi-analytical Industrial Cooling System Model for Reinforcement Learning

Yuri Chervonyi (Google/DeepMind)*; Praneet Dutta (Google)

A Versatile and Efficient Reinforcement Learning Approach for Autonomous Driving

Guan Wang (Tsinghua University); Haoyi Niu (Tsinghua University); Desheng Zhu (China University of Mining and Technology-Beijing); Jianming Hu (Department of Automation, Tsinghua University); Xianyuan Zhan (Tsinghua University)*; Guyue Zhou (Tsinghua University)

Function Approximations for Reinforcement Learning Controller for Wave Energy Converters

Soumyendu Sarkar (Hewlett Packard Enterprise)*; Vineet Gundecha (Hewlett Packard Enterprise); Alexander Shmakov (UC Irvine); Sahand Ghorbanpour (Hewlett Packard Enterprise); Ashwin Ramesh Babu (Hewlett Packard Enterprise Labs); Alexandre Pichard (Carnegie Clean Energy); Mathieu Cocho (Carnegie Clean Energy)

Pareto-Optimal Diagnostic Policy Learning in Clinical Applications via Semi-Model-Based Deep Reinforcement Learning

Zheng Yu (Princeton)*; Yikuan Li (Northwestern University); Joseph Kim (Princeton University); Kaixuan Huang (Princeton University); Yuan Luo (Northwestern University); Mengdi Wang (Princeton University/DeepMind)

Optimizing Industrial HVAC Systems with Hierarchical Reinforcement Learning

William Wong (Carnegie Mellon University)*; Praneet Dutta (Google); Octavian Voicu (DeepMind); Yuri Chervonyi (Google/DeepMind); Cosmin Paduraru (DeepMind); Jerry Luo (DeepMind)

Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms

Vashist Avadhanula (Facebook); Omar Abdul Baki (Meta); Hamsa Bastani (Wharton); Osbert Bastani (University of Pennsylvania); Caner Gocmen (Facebook); Daniel Haimovich (Facebook); Darren Hwang (Meta); Dmytro Karamshuk (Facebook); Thomas J. Leeper (Facebook); Jiayuan Ma (Meta); Gregory Macnamara (Meta); Jake Mullet (Meta); Christopher Palow (Meta); Sung Park (Meta); Varun S Rajagopal (Meta); Kevin Schaeffer (Facebook); Parikshit Shah (Facebook); Deeksha Sinha (Massachusetts Institute of Technology)*; Nicolas E Stier-Moses (Facebook); Ben Xu (Meta)

Power Grid Congestion Management via Topology Optimization with AlphaZero

Matthias Dorfer (enliteAI)*; Anton Fuxjäger (enliteAI); Kristián Kozák (enliteAI); Patrick M Blies (enliteAI); Marcel Wasserer (enliteAI)

Beyond CAGE: Investigating Generalization of Learned Autonomous Network Defense Policies

Melody Wolk (Apple); Andy Applebaum (Apple)*; Camron Dennler (Apple); Patrick Dwyer (Apple); Marina Moskowitz (Apple); Harold Nguyen (Apple); Nicole Nichols (Apple); Nicole Park (Apple); Paul Rachwalski (Apple); Frank Rau (Apple); Adrian Webster (Apple)

Reinforcement Learning Approaches for Traffic Signal Control under Missing Data

Hao Mei (New Jersey Institute of Technology)*; Junxian Li (Xi'an Jiaotong University); Bin Shi (Xi'an Jiaotong University); Hua Wei (NJIT)

An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning

Danil Provodin (TU Eindhoven)*; Pratik Gajane (Eindhoven University of Technology); Mykola Pechenizkiy (TU Eindhoven); Maurits Kaptein (Tilburg University)

MARLIM: Multi-Agent Reinforcement Learning for Inventory Management

Rémi Leluc (Télécom Paris)*; Elie Kadoche (Télécom Paris); Antoine Bertoncello (TotalEnergies); Sébastien Gourvénec (TotalEnergies)

tinyMAN: Lightweight Energy Manager using Reinforcement Learning for Energy Harvesting Wearable IoT Devices

Toygun Basaklar (University of Wisconsin-Madison)*; Yigit Tuncel (UW-Madison); Umit Ogras (University of Wisconsin-Madison)

Automatic Evaluation of Excavator Operators using Learned Reward Functions

Pranav Agarwal (École de technologie supérieure)*; Marek Teichmann (CM Labs); Sheldon Andrews (École de technologie supérieure); Samira Ebrahimi Kahou (École de technologie supérieure)

Hierarchical Reinforcement Learning for Furniture Layout in Virtual Indoor Scenes

Xinhan Di (Deepearthgo)*; Pengqian Yu (Independent Researcher)

Reinforcement Learning-Based Air Traffic Deconfliction

Denis Osipychev (Boeing)*; Dragos Margineantu (Boeing)

Learning an Adaptive Forwarding Strategy for Mobile Wireless Networks: Resource Usage vs. Latency

Victoria Manfredi (Wesleyan University)*; Alicia P Wolfe (Wesleyan University); Xiaolan Zhang (Fordham University); Bing Wang (University of Connecticut)

Safe Reinforcement Learning for Automatic Insulin Delivery in Type I Diabetes

Maxime Louis (Diabeloop)*; Hector M Romero Ugalde (Diabeloop SA); Pierre Gauthier (Diabeloop); Alice Adenis (Diabeloop); Yousra Tourki (Diabeloop); Erik Huneker (Diabeloop)

Identifying Disparities in Sepsis Treatment by Learning the Expert Policy

Hyewon Jeong (MIT)*; Siddharth Nagar Nayak (Massachusetts Institute of Technology); Taylor W Killian (University of Toronto, Vector Institute); Sanjat Kanjilal (Harvard Medical School); Marzyeh Ghassemi (University of Toronto, Vector Institute)

Call For Papers

Reinforcement learning (RL) is a general learning, predicting, and decision-making paradigm that applies broadly across many disciplines in science, engineering, and the arts. RL has seen prominent successes in many problems, such as those in simulated environments like Atari games and AlphaGo, and those in real life like robotics, recommender systems, and nuclear fusion. However, despite the significant theoretical and algorithmic gains made in the past few years, applying RL in real life remains challenging, and a natural question is:

Why isn’t RL used even more often and how can we improve this?

The main goals of the workshop are to: (1) identify key research problems that are critical for the success of real-world applications; (2) report progress on addressing these critical issues; and (3) have practitioners share their success stories of applying RL to real-world problems, and the insights gained from such applications.

We invite paper submissions of original work successfully applying RL algorithms to real-life problems and/or addressing practically relevant RL issues. Our topics of interest are broad and cover practical RL algorithms, practical issues, and applications.

Topics of interest include (but are not limited to):

+ studies about real-life RL systems, esp. about deployment/product

+ significant efforts for a high-fidelity simulator, esp. for a complicated system 

+ significant efforts for benchmarks/datasets

+ significant efforts for human factors

The following alone is not considered real-life RL:

- practical work with existing simulator/benchmark/dataset only, w/o significant real-life efforts

- theory/algorithm work with toy/simple experiments only, w/o significant real-life efforts

Paper Submission

Deadline: Sep. 15, 2022

Notification: Oct. 20, 2022 (moved from Oct. 15, 2022)

We invite unpublished submissions up to 9 pages excluding references and appendix, in PDF format using the NeurIPS 2022 template and style guidelines. Here is a customized style file for the workshop. (In the .tex file, use "\usepackage{neurips_2022}" for the submission, and use "\usepackage[final]{neurips_2022}" for the final version if accepted.) The paper review process will be double-blind. All accepted papers will be presented as posters, some as spotlight talks, and all will be made available on the workshop website. Accepted papers are non-archival, i.e. there will be no proceedings for this workshop. Selected papers will be further considered for the Machine Learning Journal Special Issue on Reinforcement Learning for Real Life.
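As a rough illustration of the style-file instructions above, a minimal submission skeleton might look like the following. This is only a sketch: it assumes the style file is saved as neurips_2022.sty next to the .tex file, and the title, author, and section text are placeholders rather than anything prescribed by the workshop.

```latex
\documentclass{article}

% Submission version (anonymized for double-blind review).
% Assumes neurips_2022.sty sits in the same directory as this file.
\usepackage{neurips_2022}

% Final version, if accepted: replace the line above with
% \usepackage[final]{neurips_2022}

\title{Placeholder Title}          % hypothetical title
\author{Anonymous Author(s)}       % placeholder; the submission is anonymous

\begin{document}

\maketitle

\begin{abstract}
  Placeholder abstract.
\end{abstract}

\section{Introduction}
% Main text: up to 9 pages, excluding references and appendix.
Placeholder text.

\end{document}
```

Switching between the two \usepackage lines is the only change the instructions above ask for when moving from the review version to the final version.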

The submission website is: (Choose “Papers”)

Direct MLJ submission deadline: Jan. 30, 2023

More details:

Call for Panels

We invite proposals for panel discussions at the Reinforcement Learning for Real Life (RL4RealLife) Workshop @ NeurIPS 2022.

The proposed panel should be informative and engaging. We expect selected panels to address novel, aspiring, and open questions of broad interest to the RL4RealLife community. Each panel is a 60-minute session. We encourage fully in-person panels, and consider virtual ones as well.

A panel proposal must include:

+ A title and a description of the content. (two pages)

  * Why is the topic of interest to the RL4RealLife community?

  * What are possible perspectives that might spark discussions?

  * Who are the panelists? Who has been confirmed?

  * What are the likely positions of the panelists on the topic?

  * What are the potential questions to facilitate discussions?

  * How will the audience be involved in the discussions?

+ An account of the efforts made to ensure diversity of the chair(s) and panelists.

+ A very brief advertisement or tagline for the panel discussion, up to 140 characters, that highlights any key information you wish prospective attendees to know. Please input this as the abstract.

+ Name, affiliation, brief bios, and contact info for one or two chairs/moderators of the proposed panel. (no page limit)

+ Names, affiliations, and brief bios for up to 5 confirmed panelists. (no page limit)    

The proposal is submitted as a single PDF file. 

We will follow the Guidance for NeurIPS Workshop Proposals 2022,

Our Call for Panels borrows ideas from and We encourage potential panel organizers to refer to them.

Deadline: Aug. 30, 2022

Notification: Sep. 15, 2022

The submission website is: (Choose “Panel Proposals”)

TPC Members

Abhishek Naik, University of Alberta

Abhishek Gupta, UC Berkeley

Aditya Modi, Michigan University

Alborz Geramifard, Facebook AI

Aleksandra Faust, Google Brain

Alex Lewandowski, University of Alberta

Anurag Ajay, MIT

Bardienus Duisterhof, Harvard University

Baturay Sağlam, Bilkent University

Bharathan Balaji, Amazon

Bo Chang, Google

Bo Dai, Google Brain

Branislav Kveton, Amazon

Chao Qin, DeepMind

Chih-wei Hsu, Google Research

Cong Lu, University of Oxford

Daochen Zha, Rice University

Di Wu, McGill

Dylan Ashley, The Swiss AI Lab IDSIA, USI, SUPSI

Frederik Schubert, Leibniz University Hannover

Fushan Li, Twitter

Glen Berseth, Université de Montréal, Mila

Gokul Swamy, Carnegie Mellon University

Hamsa Bastani, Wharton

Hanjun Dai, Google Brain

Haoran Xu, JD Technology

Haruka Kiyohara, Tokyo Institute of Technology

Hengshuai Yao, Sony AI

Hongming Zhang, University of Alberta

Hugo Caselles-Dupré, Flowers Team (ENSTA ParisTech & INRIA) & Softbank Robotics Europe

Ioannis Boukas, University of Liège

Iou-Jen Liu, University of Illinois at Urbana-Champaign

Jincheng Mei, Google Brain

Jingwei Zhang, DeepMind

Johannes Kirschner, University of Alberta

Joshua Greaves, Google

Juan Jose Garau Luis, MIT

Junfeng Wen, Carleton University

Konstantina Christakopoulou, Google

Kui Wu, University of Victoria

Luchen Li, Imperial College London

Manyou Ma, The University of British Columbia

Masatoshi Uehara, Cornell University

Maxime Heuillet, Université Laval

Meng Qi, University of California, Berkeley

Mengxin Wang, University of California, Berkeley

Minghao Zhang, Tsinghua University

Myounggyu Won, University of Memphis

Nathan Dahlin, University of Illinois at Urbana-Champaign

Peng Liao, Harvard University

Rahul Kidambi, Amazon Search & AI

Rasool Fakoor, AWS

Ruofan Kong, Microsoft

Sahika Genc, Amazon Artificial Intelligence

Scott Rome, Comcast

Shangtong Zhang, University of Virginia

Shengpu Tang, University of Michigan

Shie Mannor, Technion

Shuai Li, Shanghai Jiao Tong University

Srijita Das, University of Alberta

Srivatsan Krishnan, Harvard University

Subhojyoti Mukherjee, University of Massachusetts Amherst

Tao Chen, MIT

Tengyang Xie, University of Illinois at Urbana-Champaign

Vianney Perchet, ENSAE & Criteo AI Lab

Victor Carbune, Google

Vincent Mai, Mila, Université de Montréal

Wei Qiu, Nanyang Technological University

Weixun Wang, Tianjin University

Wilka Carvalho, University of Michigan

Xianyuan Zhan, Tsinghua University

Xiao-Yang Liu, Columbia University

Xinyun Chen, Chinese University of Hong Kong, Shenzhen

Xuesu Xiao, University of Texas at Austin

Xuezhou Zhang, Princeton University

Ya Le, Google

Yiding Chen, University of Wisconsin-Madison

Yijie Guo, University of Michigan

Yingru Li, The Chinese University of Hong Kong, Shenzhen

Yinlam Chow, Google AI

Yongshuai Liu, University of California, Davis

Yuandong Tian, Meta

Yue Gao, University of Alberta

Yuping Luo, Princeton University

Yuqing Hou, Intel Labs China

Yuta Saito, Cornell University

Zhang-Hua Fu, The Chinese University of Hong Kong, Shenzhen

Zheqing Zhu, Stanford University

Zhimin Hou, National University of Singapore

Zhipeng Wang, Apple

Ziniu Li, The Chinese University of Hong Kong, Shenzhen


Emma Brunskill (Stanford)

Minmin Chen (Google)

Lihong Li (Amazon)

Yao Liu (Amazon)

Matthew E. Taylor (U. of Alberta)

Additional guest editors for the fast track MLJ Special Issue on RL4RealLife

Niranjani Prasad (Microsoft Research)

Csaba Szepesvari (DeepMind & U. of Alberta)



You are welcome to join our Slack workspace for RL4RealLife.