Based on the Aerial Vision-and-Dialog Navigation (AVDN) dataset, we are organizing a challenge for the Aerial Navigation from Dialog History (ANDH) task. The challenge is part of the 5th Workshop on Closing the Loop Between Vision and Language (CLVL) at ICCV 2023. To participate, visit the challenge page on Eval.ai (https://eval.ai/web/challenges/challenge-page/2049/overview).


To submit the report, please send the following to yfan71@ucsc.edu:

Winner (announcement date: Aug 28, 2023):

Congratulations to the winning team!

First place winner


The AVDN dataset is a dataset for aerial embodied AI. It includes human-human dialogs, drone navigation trajectories, and the drone's visual observations (simulated using the xView dataset) with human attention. Each dialog involves a user (commander) who provides instructions and an aerial agent (follower) that follows the instructions and asks questions when needed. Based on the AVDN dataset, we introduce the Aerial Navigation from Dialog History (ANDH) task, in which the agent must predict aerial navigation actions that lead to goal areas by following the instructions in the dialog history.
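To make the commander/follower dialog history more concrete, here is a minimal sketch of how the turns might be represented and flattened into a single text input for a navigation model. The `DialogTurn` class, the `build_history` helper, the bracketed role tags, and the example sentences are all illustrative assumptions; they are not the dataset's actual schema or the challenge's required input format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DialogTurn:
    # Hypothetical fields; the real AVDN annotations may be organized differently.
    speaker: str  # "commander" (gives instructions) or "follower" (the aerial agent)
    text: str


def build_history(turns: List[DialogTurn]) -> str:
    """Join dialog turns in order, tagging each one with its speaker role."""
    return " ".join(f"[{t.speaker}] {t.text.strip()}" for t in turns)


# Made-up example dialog, for illustration only.
turns = [
    DialogTurn("commander", "Fly north to the building with the red roof."),
    DialogTurn("follower", "I see two red roofs. Which one?"),
    DialogTurn("commander", "The larger one, next to the parking lot."),
]
history = build_history(turns)
print(history)
```

A model for the ANDH task would take a history like this together with the drone's visual observations and predict the next navigation actions; how that is done is up to each participant.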

How to participate: 

Challenge phases:


Have any questions or suggestions? Feel free to reach out at yfan71@ucsc.edu.

Please add [AVDN Challenge] to the email title.

Please cite our paper as below if you use our work.


  title={Aerial vision-and-dialog navigation},
  author={Fan, Yue and Chen, Winson and Jiang, Tongzhou and Zhou, Chun and Zhang, Yi and Wang, Xin Eric},
  journal={arXiv preprint arXiv:2205.12219},