July 18, 2025, Ballroom A, West Building, Vancouver Convention Center, Vancouver, Canada
Important Dates
Paper Submission Deadline (OpenReview): May 25, 2025
Acceptance Notification: June 9, 2025
Camera-Ready deadline: June 23, 2025
Workshop: July 18, 2025
Workshop Overview
Aligning AI agents with human intentions and values is one of the main barriers to the safe and ethical application of AI systems in the real world, spanning various domains such as robotics, recommender systems, autonomous driving, and large language models. To this end, understanding human decision-making and interpreting human choices is fundamental for building intelligent systems that can interact with users effectively, align with their preferences, and contribute to the development of ethical and user-centric AI applications.
Despite its vital importance for human-AI alignment, current approaches, such as Reinforcement Learning from Human Feedback (RLHF) or Learning from Demonstrations (LfD), rely on highly questionable assumptions about the meaning of observed human feedback and interactions. These assumptions remain mostly unchallenged by the community, and simplistic human feedback models are often reused without re-evaluating their suitability. For example, we typically assume that a human acts rationally, that human feedback is unbiased, or that all humans provide similar feedback and hold similar opinions. Many of these assumptions are violated in practice, yet the role of such modeling assumptions has largely been neglected in the literature on human-AI alignment. The goals of this workshop are:
to bring together different communities toward a better understanding of human feedback
to discuss different types of human feedback, as well as mathematical and computational models of human feedback and their shortcomings
to discuss important and promising future directions toward better models of human feedback and better AI alignment
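As a concrete illustration of the kind of modeling assumption discussed above: RLHF pipelines commonly adopt the Bradley-Terry (Boltzmann-rational) model of pairwise feedback, which treats every annotator as noisily rational with a single rationality coefficient β. A minimal sketch (the reward values and β below are purely illustrative):

```python
import math

def preference_probability(reward_a: float, reward_b: float, beta: float = 1.0) -> float:
    """Bradley-Terry / Boltzmann-rational model: the assumed probability
    that an annotator prefers option A over option B, given scalar
    rewards and a rationality coefficient beta."""
    return 1.0 / (1.0 + math.exp(-beta * (reward_a - reward_b)))

# beta -> infinity assumes a perfectly rational annotator;
# beta = 0 predicts random choices regardless of the rewards.
p = preference_probability(1.0, 0.0, beta=1.0)  # ≈ 0.731
```

The single, shared β is exactly the sort of simplification the workshop targets: it encodes rationality, unbiasedness, and homogeneity across annotators in one parameter, assumptions that real human feedback routinely violates.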
Speakers
Gillian Hadfield
(Johns Hopkins University)
Matthew Luebbers
(Georgia Tech.)
Natasha Jaques
(University of Washington)
Valentina Pyatkin
(Allen Institute for AI,
University of Washington)
Panelists
Gillian Hadfield
(Johns Hopkins University)
Natasha Jaques
(University of Washington)
(AI Security Institute and MIT)
(Salesforce AI Research)
Organizers
(Georgia Tech.)
(Berkeley)
Thomas Kleine Buening
(The Alan Turing Institute)
Maria Teresa Parreira
(Cornell University)
Venue
Ballroom A, West Building, Vancouver Convention Center, Vancouver, Canada
Contact: mhf.icml.2025@gmail.com