Steps
Registration. Register your team at EvalAI.
Download the dataset. To download the dataset, follow the guidelines given at the Dataset tab.
Try our baseline models. The GitHub repository, available here, contains all the instructions needed to preprocess the videos that compose ROAD, unpack them into the correct directory structure, and run the baseline model.
Submit your predictions. Submit your predictions at EvalAI. Our challenge will accept submissions in two phases. In the first phase, we will only evaluate the performance on the validation set. In the second phase, we will open the evaluation on the test set. The submissions need to follow the format specified below.
Leaderboard. The real-time ranking of all submissions is available here.
Registration
To register for the challenge, please use this form.
You also need to register on the EvalAI platform to be able to submit your results for evaluation.
Submission Format
Each submission must be a .zip file containing your detections in a .pkl file. The file should not exceed 300 MB.
The .pkl file should contain a dictionary whose keys are the video names (e.g., "2014-06-25-16-45-34_stereo_centre_02").
The value associated with each video name should be a dictionary whose keys are the frame names (e.g., "00001.jpg").
The value associated with each frame name should be a list in which each element corresponds to a bounding box.
Each bounding box should then be represented as a dictionary with keys:
"bbox": whose value will be a list of length 4 representing the bounding box position in 2-point format in absolute image coordinates (i.e., [x_min, y_min, x_max, y_max]). Note: images have width 1280 and height 960.
"labels": whose value will be different for task 1 and task 2.
Task 1: a list of length 41, where each element corresponds to the prediction score of each of the 41 labels (e.g., [0.8, 0.1, ..., 0.2] if the model predicts score 0.8 for the label Pedestrian, 0.1 for the label Car, ..., and 0.2 for the label Parking).
Task 2: a list of variable length where each element corresponds to the index of a predicted label (e.g., [0, 13, 36] if the prediction associates the labels [Pedestrian, Moving Away, Right Pavement] to the bounding box). For your submission to be accepted, the set of predicted labels must comply with the requirements; as also noted below, we will check submissions for compliance at the key dates listed on EvalAI.
The labels should be indexed as indicated below: the first list contains the agent labels, the second the action labels, and the third the location labels.
Agents
0. Pedestrian (Ped)
1. Car (Car)
2. Cyclist (Cyc)
3. Motorbike (Mobike)
4. Medium Vehicle (MedVeh)
5. Large Vehicle (LarVeh)
6. Bus (Bus)
7. Emergency Vehicle (EmVeh)
8. Traffic Light (TL)
9. Other Traffic Light (OthTL)
Actions
10. Red (Red)
11. Amber (Amber)
12. Green (Green)
13. Move Away (MovAway)
14. Move Towards (MovTow)
15. Move (Mov)
16. Brake (Brake)
17. Stop (Stop)
18. Indicate Left (IncatLeft)
19. Indicate Right (IncatRht)
20. Hazards Lights On (HazLit)
21. Turn Left (TurnLft)
22. Turn Right (TurnRht)
23. Overtake (Ovtak)
24. Wait to Cross (Wait2X)
25. Crossing from Left (XingFmLft)
26. Crossing from Right (XingFmRht)
27. Crossing (Xing)
28. Push Object (PushObj)
Locations
29. Vehicle Lane (VehLane)
30. Outgoing Lane (OutgoLane)
31. Outgoing Cyclist Lane (OutgoCycLane)
32. Incoming Lane (IncomLane)
33. Incoming Cyclist Lane (IncomCycLane)
34. Pavement (Pav)
35. Left Pavement (LftPav)
36. Right Pavement (RhtPav)
37. Junction (Jun)
38. Crossing (xing)
39. Bus Stop (BusStop)
40. Parking (parking)
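For convenience, the 41 label abbreviations above can be collected into a single Python list so that indices (as used in Task 2 submissions) can be mapped back to names. This is an illustrative helper, not part of the official toolkit; the list mirrors the abbreviations exactly as printed above, including the lowercase "xing" and "parking".

```python
# Hypothetical lookup table: index -> ROAD-R label abbreviation,
# following the ordering given in the challenge description.
ROAD_R_LABELS = [
    # Agents (0-9)
    "Ped", "Car", "Cyc", "Mobike", "MedVeh", "LarVeh", "Bus", "EmVeh",
    "TL", "OthTL",
    # Actions (10-28)
    "Red", "Amber", "Green", "MovAway", "MovTow", "Mov", "Brake", "Stop",
    "IncatLeft", "IncatRht", "HazLit", "TurnLft", "TurnRht", "Ovtak",
    "Wait2X", "XingFmLft", "XingFmRht", "Xing", "PushObj",
    # Locations (29-40)
    "VehLane", "OutgoLane", "OutgoCycLane", "IncomLane", "IncomCycLane",
    "Pav", "LftPav", "RhtPav", "Jun", "xing", "BusStop", "parking",
]

assert len(ROAD_R_LABELS) == 41
# The Task 2 example [0, 13, 36] decodes to Ped, MovAway, RhtPav.
print([ROAD_R_LABELS[i] for i in (0, 13, 36)])
```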
Suppose your prediction includes a single video with a single frame and two bounding boxes. Then your dictionary should be in the following format:
For Task 1:
{
    '2014-06-26-09-31-18_stereo_centre_02': {
        '02012.jpg': [
            {'bbox': [878.1075894428152, 450.922275, 936.5801290322579, 493.669875],
             'labels': [0.8, 0.6, ..., 0.1]},
            {'bbox': [792.0062686217009, 429.5547375, 819.96437771261, 504.108975],
             'labels': [0.2, 0.7, ..., 0.25]}
        ]
    }
}
For Task 2:
{
    '2014-06-26-09-31-18_stereo_centre_02': {
        '02012.jpg': [
            {'bbox': [878.1075894428152, 450.922275, 936.5801290322579, 493.669875],
             'labels': [0, 24, 35]},
            {'bbox': [792.0062686217009, 429.5547375, 819.96437771261, 504.108975],
             'labels': [1, 13, 29]}
        ]
    }
}
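Once the nested dictionary is built, producing the actual submission file is a matter of pickling it and wrapping the .pkl in a .zip. A minimal sketch, assuming a small Task 2 dictionary like the example above (the file names `detections.pkl` and `submission.zip` are illustrative, not mandated by the challenge):

```python
import pickle
import zipfile

# Nested dict: video name -> frame name -> list of detection dicts.
detections = {
    '2014-06-26-09-31-18_stereo_centre_02': {
        '02012.jpg': [
            {'bbox': [878.1075894428152, 450.922275,
                      936.5801290322579, 493.669875],
             'labels': [0, 24, 35]},
        ]
    }
}

# Serialize the dictionary to a .pkl file.
with open('detections.pkl', 'wb') as f:
    pickle.dump(detections, f)

# Wrap the .pkl in a .zip for upload; the archive must stay under 300 MB.
with zipfile.ZipFile('submission.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    zf.write('detections.pkl')
```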
Evaluation
The performance is evaluated differently in the two tasks.
In Task 1 we evaluate the predictions according to the Mean Average Precision (mAP) at IoU 0.5.
In Task 2 we evaluate the predictions according to the F1-score at IoU 0.5.
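Both metrics match a predicted box to a ground-truth box when their Intersection over Union (IoU) is at least 0.5. As a reference for the geometry involved, here is a standard IoU computation for boxes in the 2-point format used by the submission (this is a generic sketch, not the official evaluation code):

```python
def iou(box_a, box_b):
    """IoU of two boxes in 2-point format [x_min, y_min, x_max, y_max]."""
    # Coordinates of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes sharing half their area: IoU = 50 / 150 = 1/3.
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))
```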
Notes for Task 1:
The evaluation on EvalAI takes about 30 minutes.
To ease the load on the EvalAI worker, we will admit submissions with up to 20 bounding boxes per frame.
Notes for Task 2:
We will check whether the predictions satisfy all requirements only at the dates specified on EvalAI.
Since only one predicted bounding box can be matched to each ground-truth bounding box, we recommend ordering the bounding boxes in each list from most to least confident to maximize performance.
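The two notes above suggest a simple post-processing step before serializing each frame. The sketch below assumes each raw detection carries an auxiliary `score` field (your model's confidence, which is not part of the submission format itself): it sorts boxes most-confident first and, for Task 1, truncates to the 20-box limit.

```python
def prepare_frame(raw_detections, max_boxes=20):
    """Sort detections by descending confidence and keep at most max_boxes.

    `raw_detections` is a hypothetical list of dicts with 'bbox', 'labels',
    and an extra 'score' key; the returned dicts keep only the submission
    fields ('bbox' and 'labels').
    """
    ordered = sorted(raw_detections, key=lambda d: d['score'], reverse=True)
    return [{'bbox': d['bbox'], 'labels': d['labels']}
            for d in ordered[:max_boxes]]

frame = [
    {'bbox': [0, 0, 10, 10], 'labels': [1, 13, 29], 'score': 0.4},
    {'bbox': [5, 5, 20, 20], 'labels': [0, 24, 35], 'score': 0.9},
]
# Highest-scoring detection comes first after sorting.
print(prepare_frame(frame)[0]['labels'])
```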
General Rules
For each task, in the first stage of the challenge, the participants will submit their predictions as generated on the validation fold and get the evaluation metric in return, in order to get a feel for how well their method(s) work. In the second stage, they will submit the predictions generated on the test fold, which will be used for the final ranking.
Evaluation will take place on the EvalAI platform. For each challenge stage and each task, the maximum number of submissions is capped at 50, with an additional constraint of 5 submissions per day.
A separate ranking will be produced for each task.
For Task 1, the participants must use only the 3 provided videos of ROAD-R and must not use external data for training their models.
Useful Papers
Eleonora Giunchiglia, Mihaela Catalina Stoian, Salman Khan, Fabio Cuzzolin, Thomas Lukasiewicz, ROAD-R: The Autonomous Driving Dataset with Logical Requirements, Machine Learning, 2023.
Gurkirt Singh, Stephen Akrigg, Manuele Di Maio, Valentina Fontana, Reza Javanmard Alitappeh, Salman Khan, Suman Saha, Kossar Jeddisaravi, Farzad Yousefi, Jacob Culley, Tom Nicholson, Jordan Omokeowa, Stanislao Grazioso, Andrew Bradley, Giuseppe Di Gironimo, Fabio Cuzzolin, ROAD: The ROad event Awareness Dataset for Autonomous Driving, IEEE TPAMI, 2022.
Eleonora Giunchiglia, Fergus Imrie, Mihaela van der Schaar, Thomas Lukasiewicz, Machine Learning with Requirements: a Manifesto, arXiv:2304.03674, 2023.