Reinforcement Learning for Real Life Workshop @ NeurIPS 2022
Dec 3, 2022. New Orleans, USA. In-person.
Website @ NeurIPS 2022 (videos, posters, etc.)
More Workshops and Special Issue
Synergy of Reinforcement Learning and Large Language Models (RL+LLMs) @ AAAI 2024
Fast track special issue with Machine Learning journal
Reinforcement Learning for Real Life Virtual Workshop @ ICML 2021
Machine Learning Special Issue on Reinforcement Learning for Real Life
Schedule
7:30-8:25 Posters (for early birds; optional)
8:25-8:30 Opening Remarks
8:30-9:00 Invited Talk: Peter Stone, UT Austin/Sony
9:00-9:30 Invited Talk: Robert Nishihara, Anyscale
9:30-10:00 Invited Talk: Dhruv Madeka, Amazon
10:00-10:20 Coffee Break
10:20-11:10 Panel: RL Implementation
11:10-12:00 Panel: RL Benchmarks
12:00-13:30 Lunch/Posters
13:30-14:00 Invited Talk: Matej Balog, DeepMind
14:00-14:55 Panel: RL Theory-Practice Gap
14:55-15:00 Closing Remarks
15:00-15:30 Coffee Break/Posters
15:30-17:00 Posters
Invited Talks
Matej Balog (DeepMind)
Dhruv Madeka (Amazon)
Robert Nishihara (Anyscale)
Peter Stone (UT Austin/Sony)
Dhruv Madeka, Amazon
Title: Deep Reinforcement Learning for Real-World Inventory Management
Abstract: We present a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching. While this dynamic program has historically been considered intractable, we show that several policy learning approaches are competitive with or outperform classical baseline approaches. In order to train these algorithms, we develop novel techniques to convert historical data into a simulator and present a collection of results that motivate this approach. We also present a model-based reinforcement learning procedure (Direct Backprop) to solve the dynamic periodic review inventory control problem by constructing a differentiable simulator. Under a variety of metrics Direct Backprop outperforms model-free RL and newsvendor baselines, in both simulations and real-world deployments.
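To make the differentiable-simulator idea concrete, here is a minimal, hypothetical PyTorch sketch (not the talk's implementation; the policy network, cost constants, and demand model below are illustrative stand-ins): a neural policy maps the inventory state to an order quantity, a toy simulator rolls the state forward under sampled demand, and the total holding/shortage cost is backpropagated directly through the simulator dynamics.

```python
import torch
import torch.nn as nn

# Toy differentiable inventory simulator (illustrative only; the real system
# also models stochastic vendor lead times, correlated demand, and price matching).
torch.manual_seed(0)
policy = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Softplus())
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

HOLDING_COST, SHORTAGE_COST, HORIZON, BATCH = 0.1, 1.0, 20, 256

for step in range(500):
    inventory = torch.zeros(BATCH, 1)
    total_cost = 0.0
    for t in range(HORIZON):
        order = policy(inventory)              # differentiable action
        demand = torch.rand(BATCH, 1) * 10.0   # sampled exogenous demand
        inventory = inventory + order - demand # differentiable transition
        # Soft penalties keep the rollout differentiable end to end.
        total_cost = total_cost + (
            HOLDING_COST * torch.relu(inventory)
            + SHORTAGE_COST * torch.relu(-inventory)
        ).mean()
        inventory = torch.relu(inventory)      # lost sales: no backorders
    opt.zero_grad()
    total_cost.backward()                      # gradients flow through the simulator
    opt.step()
```

Because every transition is differentiable, the gradient of the cost with respect to the policy parameters is exact rather than estimated, which is the core idea behind training through a differentiable simulator.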
Bio: Dhruv Madeka is a Principal Machine Learning Scientist at Amazon. His current research focuses on applying deep reinforcement learning to supply chain problems. Dhruv has also worked on developing generative and supervised deep learning models for probabilistic time series forecasting. In the past, Dhruv worked on the Quantitative Research team at Bloomberg LP, developing open-source tools for the Jupyter ecosystem and conducting advanced mathematical research in derivatives pricing, quantitative finance, and election forecasting.
Robert Nishihara, Anyscale
Title: Scaling reinforcement learning in the real world, from gaming to finance to manufacturing
Abstract: Reinforcement learning is transforming industries from gaming to robotics to manufacturing. This talk showcases how a variety of industries are adopting reinforcement learning to overhaul their businesses, from changing the nature of game development to designing the boat that won the America's Cup. These industries leverage Ray, a distributed framework for scaling Python applications and machine learning applications. Ray is used by companies across the board from Uber to OpenAI to Shopify to Amazon to scale their machine learning training, inference, data ingest, and reinforcement learning workloads.
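To give a flavor of the programming model, here is a minimal Ray example (a generic sketch using Ray's public task API, unrelated to any specific deployment mentioned above): an ordinary Python function becomes a parallel remote task with a single decorator.

```python
import ray

ray.init()  # starts Ray locally; the same code scales out to a cluster

@ray.remote
def rollout(seed: int) -> float:
    """Stand-in for an expensive simulation or RL rollout."""
    import random
    random.seed(seed)
    return sum(random.random() for _ in range(1000))

# Launch eight rollouts in parallel and gather the results.
futures = [rollout.remote(seed) for seed in range(8)]
print(ray.get(futures))
```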
Bio: Robert Nishihara is one of the creators of Ray, a distributed framework for scaling Python and machine learning applications. He is a co-founder and CEO of Anyscale, the company behind Ray. He did his PhD in machine learning and distributed systems in the computer science department at UC Berkeley. Before that, he majored in math at Harvard.
Peter Stone, UT Austin/Sony
Title: Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning
Abstract: Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world's best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing's important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world's best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms. (A previous talk: https://www.cs.utexas.edu/~pstone/media/Stone_UofM_041522.mp4)
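The abstract's point about encoding under-specified sportsmanship rules in the reward can be illustrated with a sketch. The function below is purely hypothetical (the signals and weights are invented for illustration, not Gran Turismo Sophy's actual reward): progress along the track is rewarded, while contact with opponents and leaving the course are penalized, turning soft human norms into explicit optimization terms.

```python
def shaped_reward(progress_m: float, collided: bool, off_track: bool,
                  w_progress: float = 1.0, w_collision: float = 5.0,
                  w_off_track: float = 2.0) -> float:
    """Hypothetical composite racing reward: track progress minus penalties
    for unsportsmanlike or unsafe events. All weights are placeholders."""
    reward = w_progress * progress_m
    if collided:
        reward -= w_collision  # discourage contact with opponents
    if off_track:
        reward -= w_off_track  # discourage cutting the course
    return reward
```

Tuning such penalty weights is exactly where the "important, but under-specified" nature of sportsmanship rules shows up: too small and the agent plays dirty, too large and it stops racing competitively.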
Bio: Dr. Peter Stone holds the Truchard Foundation Chair in Computer Science at the University of Texas at Austin. He is Associate Chair of the Computer Science Department, as well as Director of Texas Robotics. In 2013 he was awarded the University of Texas System Regents' Outstanding Teaching Award and in 2014 he was inducted into the UT Austin Academy of Distinguished Teachers, earning him the title of University Distinguished Teaching Professor. Professor Stone's research interests in Artificial Intelligence include machine learning (especially reinforcement learning), multiagent systems, and robotics. Professor Stone received his Ph.D. in Computer Science in 1998 from Carnegie Mellon University. From 1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles Research Department at AT&T Labs - Research. He is an Alfred P. Sloan Research Fellow, Guggenheim Fellow, AAAI Fellow, IEEE Fellow, AAAS Fellow, ACM Fellow, Fulbright Scholar, and 2004 ONR Young Investigator. In 2007 he received the prestigious IJCAI Computers and Thought Award, given biannually to the top AI researcher under the age of 35, and in 2016 he was awarded the ACM/SIGAI Autonomous Agents Research Award. Professor Stone co-founded Cogitai, Inc., a startup company focused on continual learning, in 2015, and currently serves as Executive Director of Sony AI America.
Matej Balog, DeepMind
Title: AlphaTensor: Discovering faster matrix multiplication algorithms with RL
Abstract: Improving the efficiency of algorithms for fundamental computational tasks such as matrix multiplication can have widespread impact, as it affects the overall speed of a large amount of computations. Automatic discovery of algorithms using ML offers the prospect of reaching beyond human intuition and outperforming the current best human-designed algorithms. In this talk I’ll present AlphaTensor, our RL agent based on AlphaZero for discovering efficient and provably correct algorithms for the multiplication of arbitrary matrices. AlphaTensor discovered algorithms that outperform the state-of-the-art complexity for many matrix sizes. Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time since its discovery 50 years ago. I’ll present our problem formulation as a single-player game, the key ingredients that enable tackling such difficult mathematical problems using RL, and the flexibility of the AlphaTensor framework.
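The single-player game formulation can be made concrete with a small NumPy sketch (a simplification for exposition, not DeepMind's implementation): the game state is the residual matrix-multiplication tensor, each move subtracts a rank-one term u ⊗ v ⊗ w, and driving the state to the zero tensor in R moves certifies a correct algorithm that uses R scalar multiplications. Strassen's classic decomposition provides a 7-move win for 2 × 2 matrices.

```python
import numpy as np

# Matrix-multiplication tensor for 2x2 matrices: T[a, b, c] = 1 iff the
# product A_a * B_b contributes to C_c (entries flattened row-major).
n = 2
T = np.zeros((n * n, n * n, n * n), dtype=int)
for i in range(n):
    for j in range(n):
        for k in range(n):
            T[i * n + j, j * n + k, i * n + k] = 1

def play_move(state, u, v, w):
    """One game move: subtract the rank-one tensor u (x) v (x) w."""
    return state - np.einsum("a,b,c->abc", u, v, w)

# Strassen's seven multiplications written as (u, v, w) moves over the
# flattened entries [X11, X12, X21, X22] of A, B, and C respectively.
strassen = [
    ([1, 0, 0, 1], [1, 0, 0, 1], [1, 0, 0, 1]),    # (A11+A22)(B11+B22)
    ([0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 1, -1]),   # (A21+A22)B11
    ([1, 0, 0, 0], [0, 1, 0, -1], [0, 1, 0, 1]),   # A11(B12-B22)
    ([0, 0, 0, 1], [-1, 0, 1, 0], [1, 0, 1, 0]),   # A22(B21-B11)
    ([1, 1, 0, 0], [0, 0, 0, 1], [-1, 1, 0, 0]),   # (A11+A12)B22
    ([-1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]),   # (A21-A11)(B11+B12)
    ([0, 1, 0, -1], [0, 0, 1, 1], [1, 0, 0, 0]),   # (A12-A22)(B21+B22)
]
state = T
for u, v, w in strassen:
    state = play_move(state, np.array(u), np.array(v), np.array(w))
assert not state.any()  # game won in 7 moves: a 7-multiplication algorithm
```

AlphaTensor searches this enormous move space with an AlphaZero-style agent; for 4 × 4 matrices in a finite field it found a 47-multiplication algorithm, improving on the 49 multiplications of Strassen's two-level construction.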
Bio: Matej Balog is a Senior Research Scientist at DeepMind, working in the Science team on applications of AI to Maths and Computation. Prior to joining DeepMind, he worked on program synthesis and understanding, and was a PhD student at the University of Cambridge with Zoubin Ghahramani, working on general machine learning methodology, in particular on conversions between fundamental computational tasks such as integration, sampling, optimization, and search.
Panels
Panel: RL Benchmarks
Moderator: Minmin Chen, Google
Panelists:
Pablo Samuel Castro, Google
Linxi "Jim" Fan, NVIDIA
Caglar Gulcehre, DeepMind
Tony Jebara, Spotify
Peter Stone, UT Austin/Sony
Panel: RL Implementation: From the Halfway Point to the Last Mile
Chair: Xiaolin Ge, Summit Human Capital
Moderator: Alborz Geramifard, Meta
Panelists:
Kence Anderson, Microsoft
Craig Buhr, MathWorks
Robert Nishihara, Anyscale
Yuandong Tian, Meta/Facebook AI Research (FAIR)
Panel: RL Theory-Practice Gap
Moderator: Peter Stone, UT Austin/Sony
Panelists:
Matej Balog, DeepMind
Jonas Buchli, DeepMind
Jason Gauci, Argo AI
Dhruv Madeka, Amazon
Accepted Papers
Optimizing Audio Recommendations for the Long-Term
Lucas Maystre (Spotify)*; Daniel Russo (Columbia); Yu Zhao (Spotify)
Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization
Kaixuan Huang (Princeton University)*; Yu Wu (Princeton University); Xuezhou Zhang (Princeton); Shenyinying Tu (LinkedIn); Qingyun Wu (Pennsylvania State University); Mengdi Wang (Princeton University/DeepMind); Huazheng Wang (Oregon State University)
Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response
Vincent Mai (Mila, Université de Montréal)*; Philippe Maisonneuve (Polytechnique Montreal & GERAD); Tianyu Zhang (Mila, Université de Montréal); Jorge Montalvo Arvizu (Solario); Liam Paull (University of Montreal); Antoine Lesage-Landry (Polytechnique Montréal & GERAD)
LibSignal: An Open Library for Traffic Signal Control
Hao Mei (New Jersey Institute of Technology); Xiaoliang Lei (Xi'an Jiaotong University); Longchao Da (New Jersey Institute of Technology); Bin Shi (Xi'an Jiaotong University); Hua Wei (New Jersey Institute of Technology)*
Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management
Yuandong Ding (Huazhong University of Science and Technology); Mingxiao Feng (University of Science and Technology of China); Guozi Liu (Carnegie Mellon University); Wei Jiang (University of Illinois at Urbana-Champaign); Chuheng Zhang (Microsoft Research)*; Li Zhao (Microsoft Research); Lei Song (Microsoft Research); Houqiang Li (University of Science and Technology of China); Yan Jin (Huazhong University of Science and Technology); Jiang Bian (Microsoft Research)
Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs
Benjamin Fuhrer (NVIDIA Networking)*; Yuval Shpigelman (NVIDIA Networking); Chen Tessler (NVIDIA); Shie Mannor (NVIDIA); Gal Chechik (NVIDIA); Eitan Zahavi (NVIDIA Networking); Gal Dalal (NVIDIA Research)
Controlling Commercial Cooling Systems Using Reinforcement Learning
Jerry Luo (DeepMind)*; Cosmin Paduraru (DeepMind); Octavian Voicu (DeepMind); Yuri Chervonyi (Google/DeepMind); Scott Munns (Trane Technologies); Jerry Li (DeepMind); Crystal Qian (Google); Praneet Dutta (Google); Daniel Mankowitz (DeepMind); Jared Q Davis (Stanford University | DeepMind); Ningjia Wu (Google); Xingwei Yang (Google); Chu-Ming Chang (Google); Ted Li (Google); Rob Rose (Google); Mingyan Fan (Google); Hootan Nakhost (Google); Tinglin Liu (Google); Deeni Fatiha (DeepMind); Neil Satra (Google); Juliet Rothenberg (Google); Molly Carlin (DeepMind); Satish Tallapaka (Google); Sims Witherspoon (DeepMind); David Parish (Google); Peter Dolan (DeepMind); Chenyu Zhao (Google)
Structured Q-learning For Antibody Design
Alexander I Cowen-Rivers (Preferred Networks, Inc.)*; Philip John Gorinski (Huawei Noah's Ark Lab); Aivar Sootla (Huawei); Asif Khan (University of Edinburgh); Jun Wang (UCL); Jan Peters (TU Darmstadt); Haitham Bou Ammar (Huawei)
Semi-analytical Industrial Cooling System Model for Reinforcement Learning
Yuri Chervonyi (Google/DeepMind)*; Praneet Dutta (Google)
A Versatile and Efficient Reinforcement Learning Approach for Autonomous Driving
Guan Wang (Tsinghua University); Haoyi Niu (Tsinghua University); Desheng Zhu (China University of Mining and Technology, Beijing); Jianming Hu (Department of Automation, Tsinghua University); Xianyuan Zhan (Tsinghua University)*; Guyue Zhou (Tsinghua University)
Function Approximations for Reinforcement Learning Controller for Wave Energy Converters
Soumyendu Sarkar (Hewlett Packard Enterprise)*; Vineet Gundecha (Hewlett Packard Enterprise); Alexander Shmakov (UC Irvine); Sahand Ghorbanpour (Hewlett Packard Enterprise); Ashwin Ramesh Babu (Hewlett Packard Enterprise Labs); Alexandre Pichard (Carnegie Clean Energy); Mathieu Cocho (Carnegie Clean Energy)
Deep Reinforcement Learning for Cost-Effective Medical Diagnosis
Zheng Yu (Princeton)*; Yikuan Li (Northwestern University); Joseph Kim (Princeton University); Kaixuan Huang (Princeton University); Yuan Luo (Northwestern University); Mengdi Wang (Princeton University/DeepMind)
Optimizing Industrial HVAC Systems with Hierarchical Reinforcement Learning
William Wong (Carnegie Mellon University)*; Praneet Dutta (Google); Octavian Voicu (DeepMind); Yuri Chervonyi (Google/Deepmind); Cosmin Paduraru (DeepMind); Jerry Luo (DeepMind)
Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms
Vashist Avadhanula (Facebook); Omar Abdul Baki (Meta); Hamsa Bastani (Wharton); Osbert Bastani (University of Pennsylvania); Caner Gocmen (Facebook); Daniel Haimovich (Facebook); Darren Hwang (Meta); Dmytro Karamshuk (Facebook); Thomas J. Leeper (Facebook); Jiayuan Ma (Meta); Gregory Macnamara (Meta); Jake Mullet (Meta); Christopher Palow (Meta); Sung Park (Meta); Varun S Rajagopal (Meta); Kevin Schaeffer (Facebook); Parikshit Shah (Facebook); Deeksha Sinha (Massachusetts Institute of Technology)*; Nicolas E Stier-Moses (Facebook); Ben Xu (Meta)
Power Grid Congestion Management via Topology Optimization with AlphaZero
Matthias Dorfer (enliteAI)*; Anton Fuxjäger (enliteAI); Kristián Kozák (enliteAI); Patrick M Blies (enliteAI); Marcel Wasserer (enliteAI)
Beyond CAGE: Investigating Generalization of Learned Autonomous Network Defense Policies
Melody Wolk (Apple); Andy Applebaum (Apple)*; Camron Dennler (Apple); Patrick Dwyer (Apple); Marina Moskowitz (Apple); Harold Nguyen (Apple); Nicole Nichols (Apple); Nicole Park (Apple); Paul Rachwalski (Apple); Frank Rau (Apple); Adrian Webster (Apple)
Reinforcement Learning Approaches for Traffic Signal Control under Missing Data
Hao Mei (New Jersey Institute of Technology)*; Junxian Li (Xi'an Jiaotong University); Bin Shi (Xi'an Jiaotong University); Hua Wei (New Jersey Institute of Technology)
An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning
Danil Provodin (TU Eindhoven)*; Pratik Gajane (Eindhoven University of Technology); Mykola Pechenizkiy (TU Eindhoven); Maurits Kaptein (Tilburg University)
MARLIM: Multi-Agent Reinforcement Learning for Inventory Management
Rémi Leluc (Télécom Paris)*; Elie Kadoche (Télécom-Paris); Antoine Bertoncello (TotalEnergies); Sébastien Gourvénec (TotalEnergies)
Toygun Basaklar (University of Wisconsin-Madison)*; Yigit Tuncel (UW-Madison); Umit Ogras (University of Wisconsin-Madison)
Automatic Evaluation of Excavator Operators using Learned Reward Functions
Pranav Agarwal (École de technologie supérieure)*; Marek Teichmann (CM Labs); Sheldon Andrews (École de technologie supérieure); Samira Ebrahimi Kahou (École de technologie supérieure)
Hierarchical Reinforcement Learning for Furniture Layout in Virtual Indoor Scenes
Xinhan Di (Deepearthgo)*; Pengqian Yu (Independent Researcher)
Reinforcement Learning-Based Air Traffic Deconfliction
Denis Osipychev (Boeing)*; Dragos Margineantu (Boeing)
Learning an Adaptive Forwarding Strategy for Mobile Wireless Networks: Resource Usage vs. Latency
Victoria Manfredi (Wesleyan University)*; Alicia P Wolfe (Wesleyan University); Xiaolan Zhang (Fordham University); Bing Wang (University of Connecticut)
Safe Reinforcement Learning for Automatic Insulin Delivery in Type I Diabetes
Maxime Louis (Diabeloop)*; Hector M Romero Ugalde (Diabeloop SA); Pierre Gauthier (Diabeloop); Alice Adenis (Diabeloop); Yousra Tourki (Diabeloop); Erik Huneker (Diabeloop)
Identifying Disparities in Sepsis Treatment by Learning the Expert Policy
Hyewon Jeong (MIT)*; Siddharth Nagar Nayak (Massachusetts Institute of Technology); Taylor W Killian (University of Toronto, Vector Institute); Sanjat Kanjilal (Harvard Medical School); Marzyeh Ghassemi (University of Toronto, Vector Institute)
Call for Papers
Reinforcement learning (RL) is a general paradigm for learning, prediction, and decision making that applies broadly across science, engineering, and the arts. RL has seen prominent successes in many problems, both in simulated environments, such as Atari games and AlphaGo, and in real life, such as robotics, recommender systems, and nuclear fusion. However, despite the significant theoretical and algorithmic gains of the past few years, applying RL in real life remains challenging, and a natural question is:
Why isn't RL used even more often, and how can we improve this?
The main goals of the workshop are to: (1) identify key research problems that are critical for the success of real-world applications; (2) report progress on addressing these critical issues; and (3) have practitioners share their success stories of applying RL to real-world problems, and the insights gained from such applications.
We invite paper submissions of original work that successfully applies RL algorithms to real-life problems and/or addresses practically relevant RL issues. Our topics of interest are broad, covering practical RL algorithms, practical issues, and applications.
Topics of interest include (but are not limited to):
+ studies of real-life RL systems, especially deployments and products
+ significant efforts toward high-fidelity simulators, especially for complicated systems
+ significant efforts toward benchmarks/datasets
+ significant efforts concerning human factors
The following alone is not considered real-life RL:
- practical work using only an existing simulator/benchmark/dataset, without significant real-life efforts
- theory/algorithm work with only toy or simple experiments, without significant real-life efforts
Paper Submission
Deadline: Sep. 15, 2022
Notification: Oct. 20, 2022 (extended from Oct. 15, 2022)
We invite unpublished submissions of up to 9 pages, excluding references and appendix, in PDF format using the NeurIPS 2022 template and style guidelines. Here is a customized style file for the workshop. (In the .tex file, use "\usepackage{neurips_2022}" for the submission, and use "\usepackage[final]{neurips_2022}" for the final version if accepted.) The review process will be double-blind. All accepted papers will be presented as posters, some as spotlight talks, and all will be made available on the workshop website. Accepted papers are non-archival, i.e., there will be no proceedings for this workshop. Selected papers will be further considered for the Machine Learning journal Special Issue on Reinforcement Learning for Real Life.
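For concreteness, a minimal preamble for a submission might look as follows (a sketch assuming the customized neurips_2022.sty file sits next to the .tex source; switch to the [final] option only for the camera-ready version):

```latex
\documentclass{article}
\usepackage{neurips_2022}            % double-blind submission version
% \usepackage[final]{neurips_2022}   % camera-ready version, if accepted

\title{Your Paper Title}

\begin{document}
\maketitle
% Paper body: up to 9 pages, excluding references and appendix.
\end{document}
```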
https://cmt3.research.microsoft.com/RL4RealLife2022/ (Choose “Papers”)
Direct MLJ submission deadline: Jan. 30, 2023
More details: https://www.springer.com/journal/10994/updates/19601294
Call for Panels
We invite proposals for panel discussions at the Reinforcement Learning for Real Life (RL4RealLife) Workshop @ NeurIPS 2022.
The proposed panel should be informative and engaging. We expect selected panels to address novel, ambitious, and open questions of broad interest to the RL4RealLife community. Each panel is a 60-minute session. We encourage fully in-person panels, and will consider virtual ones as well.
A panel proposal must include:
+ A title and a description of the content. (two pages)
* Why is the topic of interest to the RL4RealLife community?
* What are possible perspectives that might spark discussions?
* Who are the panelists? Who has been confirmed?
* What are the likely positions of the panelists on the topic?
* What are the potential questions to facilitate discussions?
* How will the audience be involved in the discussions?
+ An account of the efforts made to ensure diversity of the chair(s) and panelists.
+ A very brief advertisement or tagline for the panel discussion, up to 140 characters, highlighting any key information you wish prospective attendees to know. Please enter this as the abstract.
+ Name, affiliation, brief bios, and contact info for one or two chairs/moderators of the proposed panel. (no page limit)
+ Names, affiliations, and brief bios for up to 5 confirmed panelists. (no page limit)
The proposal should be submitted as a single PDF file.
We will follow the Guidance for NeurIPS Workshop Proposals 2022, https://blog.neurips.cc/2022/04/24/guidance-for-neurips-workshop-proposals-2022/.
Our Call for Panels borrows ideas from https://vldb.org/2022/?call-for-panels and https://chi2022.acm.org/for-authors/interacting-and-discussing/panels/. We encourage potential panel organizers to refer to them.
Deadline: Aug. 30, 2022
Notification: Sep. 15, 2022
https://cmt3.research.microsoft.com/RL4RealLife2022/ (Choose “Panel Proposals”)
TPC Members
Abhishek Naik, University of Alberta
Abhishek Gupta, UC Berkeley
Aditya Modi, University of Michigan
Alborz Geramifard, Facebook AI
Aleksandra Faust, Google Brain
Alex Lewandowski, University of Alberta
Anurag Ajay, MIT
Bardienus Duisterhof, Harvard University
Baturay Sağlam, Bilkent University
Bharathan Balaji, Amazon
Bo Chang, Google
Bo Dai, Google Brain
Branislav Kveton, Amazon
Chao Qin, DeepMind
Chih-wei Hsu, Google Research
Cong Lu, University of Oxford
Daochen Zha, Rice University
Di Wu, McGill
Dylan Ashley, The Swiss AI Lab IDSIA, USI, SUPSI
Frederik Schubert, Leibniz University Hannover
Fushan Li, Twitter
Glen Berseth, Université de Montréal, Mila
Gokul Swamy, Carnegie Mellon University
Hamsa Bastani, Wharton
Hanjun Dai, Google Brain
Haoran Xu, JD Technology
Haruka Kiyohara, Tokyo Institute of Technology
Hengshuai Yao, Sony AI
Hongming Zhang, University of Alberta
Hugo Caselles-Dupré, Flowers Team (ENSTA ParisTech & INRIA) & Softbank Robotics Europe
Ioannis Boukas, University of Liège
Iou-Jen Liu, University of Illinois at Urbana-Champaign
Jincheng Mei, Google Brain
Jingwei Zhang, DeepMind
Johannes Kirschner, University of Alberta
Joshua Greaves, Google
Juan Jose Garau Luis, MIT
Junfeng Wen, Carleton University
Konstantina Christakopoulou, Google
Kui Wu, University of Victoria
Luchen Li, Imperial College London
Manyou Ma, The University of British Columbia
Masatoshi Uehara, Cornell University
Maxime Heuillet, Université Laval
Meng Qi, University of California, Berkeley
Mengxin Wang, University of California, Berkeley
Minghao Zhang, Tsinghua University
Myounggyu Won, University of Memphis
Nathan Dahlin, University of Illinois at Urbana-Champaign
Peng Liao, Harvard University
Rahul Kidambi, Amazon Search & AI
Rasool Fakoor, AWS
Ruofan Kong, Microsoft
Sahika Genc, Amazon Artificial Intelligence
Scott Rome, Comcast
Shangtong Zhang, University of Virginia
Shengpu Tang, University of Michigan
Shie Mannor, Technion
Shuai Li, Shanghai Jiao Tong University
Srijita Das, University of Alberta
Srivatsan Krishnan, Harvard University
Subhojyoti Mukherjee, University of Massachusetts Amherst
Tao Chen, MIT
Tengyang Xie, University of Illinois at Urbana-Champaign
Vianney Perchet, ENSAE & Criteo AI Lab
Victor Carbune, Google
Vincent Mai, Mila, Université de Montréal
Wei Qiu, Nanyang Technological University
Weixun Wang, Tianjin University
Wilka Carvalho, University of Michigan
Xianyuan Zhan, Tsinghua University
Xiao-Yang Liu, Columbia University
Xinyun Chen, Chinese University of Hong Kong, Shenzhen
Xuesu Xiao, University of Texas at Austin
Xuezhou Zhang, Princeton University
Ya Le, Google
Yiding Chen, University of Wisconsin-Madison
Yijie Guo, University of Michigan
Yingru Li, The Chinese University of Hong Kong, Shenzhen
Yinlam Chow, Google AI
Yongshuai Liu, University of California, Davis
Yuandong Tian, Meta
Yue Gao, University of Alberta
Yuping Luo, Princeton University
Yuqing Hou, Intel Labs China
Yuta Saito, Cornell University
Zhang-Hua Fu, The Chinese University of Hong Kong, Shenzhen
Zheqing Zhu, Stanford University
Zhimin Hou, National University of Singapore
Zhipeng Wang, Apple
Ziniu Li, The Chinese University of Hong Kong, Shenzhen
Co-Chairs
Additional guest editors for the fast track MLJ Special Issue on RL4RealLife