NAVI 2025
AVSS 2025 Challenge on Natural Language-based Automotive Video Identification
We chose the name NAVI as a double entendre: in Korean, 나비 ('navi') means "butterfly," symbolizing the freedom and agility that AI can bring to our daily lives, while in English it hints at "navigation," reflecting AI's potential to guide us toward smarter, safer urban futures.
The NAVI Challenge, hosted at AVSS 2025, aims to apply AI to real-world scenarios, such as intelligent transportation systems, to enhance urban management efficiency, address modern urban challenges, and pave the way toward the future of smart cities.
This year’s challenge focuses on leveraging road surveillance videos collected from various countries, paired with rich natural language descriptions, to advance natural language-based vehicle video retrieval and take a step closer to building future cities. The challenge also seeks innovative contributions leveraging computer vision, natural language processing, and deep learning, aiming to develop practical, large-scale applications that will make our environment safer and smarter.
Natural language-based Automotive Video Identification
This new task aims to identify target vehicles in road videos based on natural language descriptions. The dataset has been constructed to account for characteristics from various countries, accompanied by rich natural language annotations. The focus of this task is to provide deep insights for future urban management and surveillance.
We hope many participants will join the NAVI Challenge. To participate, please fill out the online NAVI Challenge Dataset Request Form. This year’s challenge aims to advance the frontiers of research and innovation, fostering smarter and safer urban environments.
NAVI Challenge Workshop Schedule
August 12, 09:45 AM – 11:25 AM
09:45 – 09:55: Presentation on the NAVI Challenge overview (10 minutes)
09:55 – 10:45: Invited Talk (50 minutes)
10:45 – 10:55: Break (10 minutes)
10:55 – 11:15: Winning teams' presentations (20 minutes)
11:15 – 11:25: Wrap-up and closing remarks (10 minutes)
Important Dates
Training and validation datasets shared: 25.04.23
Starter kit open-sourced on GitHub: 25.04.23
Evaluation server open to submissions: 25.04.23
Challenge track submissions due: 25.07.14 (extended from 25.06.29)
Announcement of awards: 25.07.18 (extended from 25.07.07)
Dataset Download
To participate, fill out the online NAVI Challenge Dataset Request Form.
VISION Data Download [By clicking the link, you confirm your acceptance of the data license agreement.]
After submitting this data request form and registering for the challenge on CodaBench, you’ll receive the dataset link.
Evaluation
The dataset for this track, named VISION, is based on the CityFlow Benchmark, as well as surveillance camera video data from South Korea and Indonesia, with vehicles annotated using natural language descriptions. It contains around 7,000 vehicle tracks, each described at varying levels of detail. The dataset for this challenge track consists of three files: train-tracks.json, test-tracks.json, and test-queries.json. Please refer to the README file in the dataset for further details.
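As a quick orientation, the sketch below loads the three dataset files with Python's standard json module. The structure of each record (per-track fields such as frame paths, bounding boxes, and natural language descriptions) is an assumption based on the CityFlow-NL format; the dataset README remains the authoritative reference.

import json

# Minimal loading sketch. File names come from the challenge description;
# the per-record fields are assumptions (see the dataset README).
with open("train-tracks.json") as f:
    train_tracks = json.load(f)   # track-uuid -> annotated vehicle track
with open("test-tracks.json") as f:
    test_tracks = json.load(f)    # track-uuid -> candidate track to rank
with open("test-queries.json") as f:
    test_queries = json.load(f)   # query-uuid -> natural-language query

print(len(train_tracks), "train tracks |",
      len(test_tracks), "test tracks |",
      len(test_queries), "queries")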
Task: Teams are required to retrieve and rank the vehicle tracks corresponding to each query. A basic retrieval model is provided as an initial reference for participating teams.
Submission Format:
{
"query-uuid-1": ["track-uuid-i", ..., "track-uuid-j"],
"query-uuid-2": ["track-uuid-m", ..., "track-uuid-n"]
}
For every query, the list should include the testing tracks, ranked according to the retrieval model.
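A minimal sketch of producing a file in this format follows, assuming test-tracks.json and test-queries.json are keyed by UUID as described above; the score function is a hypothetical stand-in for your retrieval model.

import json

with open("test-tracks.json") as f:
    test_tracks = json.load(f)     # track-uuid -> candidate track
with open("test-queries.json") as f:
    test_queries = json.load(f)    # query-uuid -> NL descriptions

def score(query_uuid: str, track_uuid: str) -> float:
    """Hypothetical placeholder; replace with your model's query-track similarity."""
    return 0.0  # a constant score yields an arbitrary but format-valid ranking

# For every query, rank all test tracks, best match first.
submission = {
    q: sorted(test_tracks, key=lambda t: score(q, t), reverse=True)
    for q in test_queries
}

with open("submission.json", "w") as f:
    json.dump(submission, f)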
Evaluation: The Vehicle Retrieval by NL Descriptions task will be evaluated using standard retrieval metrics. The primary evaluation criterion is the Mean Reciprocal Rank (MRR). Recall@5, Recall@10, and Recall@25 will also be computed for all submissions.
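For intuition, here is a minimal sketch of these metrics. It assumes each query has exactly one correct track (per the CityFlow-NL setup); the ground truth is held by the organizers and the mapping shown here is only illustrative.

def evaluate(submission, ground_truth, ks=(5, 10, 25)):
    """submission: query-uuid -> ranked list of track-uuids (best first).
    ground_truth: query-uuid -> correct track-uuid (illustrative only)."""
    reciprocal_ranks = []
    hits = {k: 0 for k in ks}
    for query_uuid, ranking in submission.items():
        target = ground_truth[query_uuid]
        if target in ranking:
            rank = ranking.index(target) + 1   # 1-indexed position
            reciprocal_ranks.append(1.0 / rank)
            for k in ks:
                hits[k] += rank <= k
        else:
            reciprocal_ranks.append(0.0)       # absent target contributes 0
    n = len(submission)
    return sum(reciprocal_ranks) / n, {k: hits[k] / n for k in ks}

# Example: the correct track ranked 2nd gives MRR = 0.5 and Recall@5 = 1.0.
print(evaluate({"q1": ["t2", "t1", "t3"]}, {"q1": "t1"}))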
Submission
Submission Site: Natural Language-based Automotive Video Identification on CodaBench.
Awards
Top 3 winners will be invited to present their work on-site at AVSS 2025 (paper submission is not required).
1st place : RTX-5070Ti 16G (USD 1000)
2nd place : RTX-5060Ti 16G (USD 500)
3rd place : Jetson Nano (USD 150)
To reduce the burden on participants, only a presentation is required; paper submission is not necessary.
Challenge Committee
Jaejun Yoo, Associate Professor, AIGS, UNIST
KwangJu Kim, Senior Researcher, Director of AI Infrastructure, ETRI
Kyungoh Lee, Senior Researcher, ETRI
Dongyoung Kim, Post-Master's Researcher, ETRI
Seongjun Park, Master student, AIGS, UNIST
Bumhoon Park, Master student, AIGS, UNIST
We would like to participate. What do we need to do?
Fill out the participation intent form to list your institution and your team. You just need to follow the instructions and submit the form.
How large can a team be?
No restrictions on team size.
What are the rules for downloading the data set?
A participation agreement is provided before the data is shared. You need to accept that agreement and submit your response before getting access to the dataset.
Can I use any available dataset to train models in this challenge?
Teams that wish to be listed on the public leaderboard and win the challenge awards are NOT allowed to use any external data for either training or validation. The winning teams and runners-up are required to submit their training and testing code for verification after the challenge submission deadline.
What are the prizes?
See the Awards section above: an RTX-5070Ti 16G (1st place), an RTX-5060Ti 16G (2nd place), and a Jetson Nano (3rd place).
Will we need to submit our code?
Teams need to make their code publicly accessible to be considered for winning (including a complete, reproducible pipeline for model training/creation). This ensures that no external data was used for training, that the tasks were performed by algorithms rather than humans, and it contributes to the community.
How will the submissions be evaluated?
The submission formats for each track are detailed on the Data and Evaluation page.
Are we allowed to use validation sets in training?
The validation sets are allowed to be used in training.
Are we allowed to use test sets in training?
Additional manual annotations on our testing data are strictly prohibited. We also discourage the use of testing data in any way during training, with or without labels, because the task is supposed to be fairly evaluated as in real life, where we don't have access to testing data at all. Although it is permitted to apply algorithms like clustering to automatically generate pseudo labels on the testing data, we will favor a winning method that does not use such techniques when multiple teams have similar performance (~1%). Finally, please keep in mind that all the winning methods and runners-up will be requested to submit their code for verification purposes. Their performance needs to be reproducible using only the training/validation/synthetic data.
Are we allowed to use other external data/pre-trained models?
The use of any real external data is prohibited. There is NO restriction on synthetic data. Pre-trained models trained on ImageNet/MSCOCO, such as classification models (pre-trained ResNet, DenseNet, etc.) and detection models (pre-trained YOLO, Mask R-CNN, etc.), that are not trained directly for the challenge tasks can be applied. Please check with us if you have any questions about the data/models you are using.
Do the winning teams and runners-up need to submit papers and present at the session?
All winning teams and runners-up must register and present at the session in order to qualify for awards; paper submission is not required.
Citations
@misc{feng2021cityflownltrackingretrievalvehicles,
  title={CityFlow-NL: Tracking and Retrieval of Vehicles at City Scale by Natural Language Descriptions},
  author={Qi Feng and Vitaly Ablavsky and Stan Sclaroff},
  year={2021},
  eprint={2101.04741},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2101.04741},
}