ICML 2024 Workshop
Models of Human Feedback for AI Alignment
July 26th 2024, Schubert 4 - 6, Messe Wien Exhibition Congress Center, Vienna, Austria
Important Dates
Paper Submission Deadline (OpenReview): May 31, 2024
Acceptance Notification: June 17, 2024
Camera-Ready Deadline: June 25, 2024
Workshop: July 26, 2024
Workshop Overview
Aligning AI agents with human intentions and values is one of the key challenges for the safe and ethical deployment of AI systems in the real world, spanning domains such as robotics, recommender systems, autonomous driving, and large language models. Understanding human decision-making and interpreting human choices is therefore fundamental to building intelligent systems that interact with users effectively, align with their preferences, and support the development of ethical, user-centric AI applications.
Despite its vital importance for human-AI alignment, current approaches such as Reinforcement Learning from Human Feedback (RLHF) and Learning from Demonstrations (LfD) rest on strong, largely unexamined assumptions about the meaning of observed human feedback and interactions. These assumptions remain mostly unchallenged by the community, and simplistic human feedback models are often reused without re-evaluating their suitability. For example, we typically assume that humans act rationally, that human feedback is unbiased, or that all humans provide similar feedback and hold similar opinions. Many of these assumptions are violated in practice, yet the role of such modeling assumptions has been largely neglected in the literature on human-AI alignment. The goals of this workshop are:
to bring together different communities towards a better understanding of human feedback;
to discuss different types of human feedback, along with mathematical and computational models of human feedback and their shortcomings;
to discuss important and promising future directions towards better models of human feedback and better AI alignment.
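To make the rationality assumption concrete: RLHF pipelines commonly model pairwise comparisons with a Bradley-Terry (logistic) preference model, where a temperature-like coefficient controls how "rational" the labeler is assumed to be. A minimal illustrative sketch (the function name and the coefficient `beta` are our own notation, not from any particular library):

```python
import math

def bradley_terry_prob(r_a: float, r_b: float, beta: float = 1.0) -> float:
    """Probability that a labeler prefers option A over option B under a
    Bradley-Terry model with rationality coefficient beta.

    beta -> infinity: a perfectly rational labeler who always picks the
    higher-reward option; beta -> 0: feedback degrades to a coin flip.
    """
    return 1.0 / (1.0 + math.exp(-beta * (r_a - r_b)))

# A near-rational labeler (large beta) almost always prefers higher reward,
# while a noisy labeler (small beta) gives nearly uninformative feedback.
print(bradley_terry_prob(1.0, 0.0, beta=10.0))  # ~0.99995
print(bradley_terry_prob(1.0, 0.0, beta=0.1))   # ~0.525
```

Real human feedback is often biased, inconsistent across annotators, and context-dependent, which is precisely why re-examining such models is a focus of this workshop.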
Speakers
Ariel Procaccia
(Harvard University)
Dylan Hadfield-Menell
(MIT)
Tracy Liu
(Tsinghua University)
David Lindner
(Google DeepMind)
Panelists
Daniele Calandriello
(Google DeepMind)
Adam Gleave
(FAR AI)
Rin Metcalf Susa
(Apple ML Research)
David Krueger
(University of Cambridge)
Organizers
Thomas Kleine Buening
(The Alan Turing Institute)
Christos Dimitrakakis
(Université de Neuchâtel)
Scott Niekum
(UMass Amherst)
Constantin Rothkopf
(TU Darmstadt)
Aadirupa Saha
(Apple ML Research)
Harshit Sikchi
(UT Austin)
Lirong Xia
(Rensselaer Polytechnic Institute)
Venue
Messe Wien Exhibition Congress Center, Vienna, Austria
Room: Schubert 4-6
Contact: mhf.icml.2024@gmail.com
Twitter: x.com/mhf_icml2024