Automating the annotation process can significantly facilitate autonomous-vehicle research and expand field evaluations. Training deep-learning models for autonomous driving requires massive datasets annotated for diverse downstream tasks, such as detection, segmentation, tracking, prediction, interaction features, maneuvers, and location marks.
The primary goal of this competition is to design an automated annotation pipeline for the ROAD family of datasets that generates annotations in the existing format along with textual descriptions. The ROAD family datasets are datasets for road-event detection, comprising videos from two countries: the United Kingdom (ROAD-UK) and the United States (ROAD-WAYMO). Automatic annotation in the existing format will significantly speed up the utilization of newly collected data and reduce the effort required from human annotators. Textual descriptions provide deeper insight into the images and the environmental context, and can further be used by the community to train Vision Language Models (VLMs) and hybrid models for road-event detection.
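For concreteness, the minimal sketch below shows what a single automatically generated annotation record could look like, pairing ROAD-style agent/action/location labels with an added textual description. All field names, label values, and the clip identifier are illustrative assumptions, not the exact ROAD annotation schema.

```python
import json

# Illustrative sketch of one automatically generated annotation record.
# Field names and values are hypothetical; they only approximate the
# ROAD-style agent/action/location labelling, not the exact dataset schema.
annotation = {
    "video_id": "example_clip_001",   # hypothetical clip identifier
    "frame": 120,
    "boxes": [
        {
            "bbox": [0.42, 0.35, 0.58, 0.71],   # normalised [x1, y1, x2, y2]
            "agent": "Pedestrian",              # agent label
            "action": ["MovingTowards"],        # action label(s)
            "location": ["RightPavement"],      # location label(s)
        }
    ],
    # Added textual description intended for VLM / hybrid-model training.
    "description": "A pedestrian on the right pavement is moving towards the ego vehicle.",
}

print(json.dumps(annotation, indent=2))
```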
This includes the following tasks:
• Detecting and modelling complex activities, contributed to by several agents over an extended period of time.
• Predicting agent intentions.
• Forecasting future road events.
• Based on all these elements, deciding what action the autonomous vehicle should perform next.
• Modelling the reasoning processes of road agents in terms of goals or mental states.
We invite participants from both academia and industry (individually and in groups), in particular professionals working in the area of autonomous driving who seek to automate the annotation process.