[Biography]
Fabio Cuzzolin was born in Jesolo, Italy. He received the laurea degree magna cum laude from the University of Padova, Italy, in 1997 and a Ph.D. degree from the same institution in 2001, with a thesis entitled “Visions of a generalized probability theory”. He was a researcher with the Image and Sound Processing Group of the Politecnico di Milano, Italy, and a postdoctoral researcher with the UCLA Vision Lab at the University of California, Los Angeles. He later joined the Perception team at INRIA Rhône-Alpes, Grenoble, as a Marie Curie fellow.
He joined the Department of Computing of Oxford Brookes University in September 2008 and became Head of the Artificial Intelligence and Vision research group in September 2012. In 2018 the group took on the name of Visual Artificial Intelligence Laboratory, as part of the School of Engineering, Computing and Mathematics. He has been a Professor of Artificial Intelligence since January 2016, and since 2020 he has served on the Board of the Institute for Ethical AI.
The Visual AI Lab currently runs on a budget of £3 million, with eight live projects funded by the European Union (2), Innovate UK (2), UKIERI, the ECM School, Huawei Technologies and the Leverhulme Trust.
In 2021 the team is projected to comprise around 35 members, including five faculty, nine research fellows, two KTP associates, six Ph.D. students, six MSc and final-year students, and six external collaborators.
Fabio is a world leader in the field of imprecise probabilities and random set theory, to which he contributed an original geometric approach. His Lab's research spans artificial intelligence, machine learning, computer vision, surgical robotics, autonomous driving, AI for healthcare as well as uncertainty theory. The team is pioneering frontier topics such as machine theory of mind, epistemic artificial intelligence, predicting future actions and behaviour, neurosymbolic reasoning, self-supervised learning and federated learning.
Fabio is the author of 110+ publications, published or under review, including 4 books, 13 book chapters, and 27 journal papers.
He is a four-term member of the Board of Directors of the Belief Functions and Applications Society (BFAS) and was Executive Editor of the Society for Imprecise Probabilities and Their Applications (SIPTA). Fabio has served on the Technical Program Committees of 100+ international conferences, including UAI, BMVC, ECCV, ICCV (as Area Chair), IJCAI, CVPR, NeurIPS, AAAI and ICML. He has also served on the boards of IEEE Fuzzy Systems, IEEE SMC, IJAR, Information Fusion, IEEE TNN and Frontiers.
[Abstract]
Autonomous vehicles (AVs) employ a variety of sensors to identify roadside infrastructure and other road users, with much of the existing work focusing on scene understanding and robust object detection. Human drivers, however, approach the driving task in a more holistic fashion, one which entails, in particular, recognising and understanding the evolution of road events. Testing an AV’s capability to recognise the actions undertaken by other road agents is thus crucial to improving its situational awareness and facilitating decision making.
In this talk we introduce the ROad event Awareness Dataset (ROAD) for Autonomous Driving, to our knowledge the first of its kind. ROAD is explicitly designed to test the ability of an autonomous vehicle to detect road events, defined as triplets composed of a moving agent, the actions it performs (possibly more than one, e.g. a car concurrently turning left, blinking and moving away) and the associated locations. ROAD comprises 22 videos captured as part of the Oxford RobotCar Dataset, which we annotated with bounding boxes showing the location in the image plane of each road event. The dataset is designed to provide an information-rich playground for validating a variety of tasks related to the understanding of road user behaviour, including that of cyclists and pedestrians.
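As a rough illustration, a ROAD-style event annotation could be represented as in the minimal sketch below. The class and field names (RoadEvent, agent, actions, location, boxes) are hypothetical and chosen for readability; they are not the dataset's actual schema.

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class RoadEvent:
        """Hypothetical container for one road event: an (agent, actions,
        location) triplet plus its bounding-box tube in the image plane."""
        agent: str        # e.g. "Car", "Cyclist", "Pedestrian"
        actions: List[str]  # possibly more than one action at a time
        location: str     # e.g. the lane the agent occupies
        # One box per annotated frame: (frame index, x1, y1, x2, y2)
        boxes: List[Tuple[int, float, float, float, float]]

    # A car concurrently turning left, blinking and moving away:
    event = RoadEvent(
        agent="Car",
        actions=["TurnLeft", "Blink", "MoveAway"],
        location="OutgoingLane",
        boxes=[(0, 0.31, 0.42, 0.55, 0.71), (1, 0.32, 0.41, 0.56, 0.70)],
    )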
The dataset comes with a new baseline algorithm for online road event awareness, capable of working incrementally, an essential feature for autonomous driving. Our baseline, inspired by the success of 3D CNNs and single-stage object detectors, is based on inflating RetinaNet along the temporal direction, and achieves a mean average precision of 16.8% and 6.1% for frame-level event bounding box detection and video-level event tube detection at 50% overlap, respectively. Further significant results were achieved during the recent ROAD @ ICCV 2021 challenge, in which a number of participating teams competed on three tasks: agent, action and event detection. While promising, these figures highlight the challenges posed by realistic situation awareness in autonomous driving.
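To make the "50% overlap" criterion for tube detection concrete, the sketch below computes a spatio-temporal tube IoU under one common convention: per-frame box IoU averaged over the temporal union of the two tubes. This is an illustrative assumption, not necessarily the exact protocol of the ROAD evaluation.

    def box_iou(a, b):
        """IoU of two boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    def tube_iou(tube_a, tube_b):
        """Spatio-temporal IoU of two tubes, each a dict {frame: box}.
        Frames where only one tube exists contribute zero overlap."""
        frames = set(tube_a) | set(tube_b)
        common = set(tube_a) & set(tube_b)
        total = sum(box_iou(tube_a[f], tube_b[f]) for f in common)
        return total / len(frames) if frames else 0.0

Under this convention, a detected tube counts as a true positive when its IoU with an unmatched ground-truth tube of the same event class is at least 0.5; video-level mAP is then computed exactly as in standard object detection, with tubes in place of boxes.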
Finally, ROAD is designed to allow researchers to work on exciting new tasks, such as the understanding of complex (road) activities, the anticipation of future road events, and the modelling of sentient road agents in terms of their mental states. Further extensions are underway, in the form of ROAD-like annotations for other datasets such as Waymo and PIE, logical scene constraints for neuro-symbolic reasoning, and intent/trajectory prediction baselines.