The ERR@HRI 3.0 Challenge provides two complementary datasets, each capturing different aspects of error detection in human-robot interaction.
Data format: Raw video files (.mp4) of participants' faces as they watched failure scenarios (average length 17.84 s), plus the video dataset used as stimuli (46 videos, .mp4).
Labels: Videos in the BAD dataset are labeled according to the content of the stimulus video: Failure (human or robot) or Control (no failure; 6 videos).
Data format: Raw video files (.mp4) of participants watching and reacting to scenarios, along with their outcome predictions (average length 1.95 s), plus the video dataset used as stimuli (30 videos, .mp4).
Labels: The Bad Idea dataset is annotated with the participant's prediction of whether the scenario will end Well (the full scenario ends in a good outcome, e.g., a human on a bike does not fall) or Poorly (the scenario ends in a bad outcome, e.g., a jumping robot fails the landing). Importantly, these labels reflect what the participant predicted would happen after watching the video with the outcome cut off, not what actually happens. Nonetheless, the data are roughly balanced (a 1.15 good-to-bad outcome-prediction ratio).
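Both datasets reduce to a binary video-classification target (Failure vs. Control, or Well vs. Poorly). The sketch below shows one way participants' labels might be loaded and encoded; the file name `labels.csv`, its columns, and the example rows are illustrative assumptions, not part of the challenge release.

```python
import csv
import io

# Hypothetical label file: one row per participant reaction video.
# Column names are assumptions for illustration only.
EXAMPLE_CSV = """video_id,prediction
p01_v01,Well
p01_v02,Poorly
p02_v01,Well
"""

def load_predictions(fh):
    """Map each participant video to a binary target: 1 = Well, 0 = Poorly."""
    return {
        row["video_id"]: 1 if row["prediction"] == "Well" else 0
        for row in csv.DictReader(fh)
    }

labels = load_predictions(io.StringIO(EXAMPLE_CSV))
n_well = sum(labels.values())
n_poorly = len(labels) - n_well
print(labels)            # {'p01_v01': 1, 'p01_v02': 0, 'p02_v01': 1}
print(n_well, n_poorly)  # on the real data, roughly a 1.15 Well-to-Poorly ratio
```

The same pattern applies to the BAD dataset by swapping the label values for Failure/Control.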