Supported Challenge Dataset

The organizers have upgraded the 2020 HLVU dataset, previously composed of ten Creative Commons licensed movies used in the first edition of the challenge. In 2021, an additional set of Creative Commons movies will be added. Full ground truth as annotated by human annotators (at the whole-movie level as well as at the scene level) will be released for the 10 HLVU movies, which systems can use as a training set. The DVU Challenge will also complement the 2020 relationship ontology by merging in and extending the vocabulary used by MovieGraphs (Vicol, Paul, et al. "MovieGraphs: Towards Understanding Human-Centric Situations from Videos." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018) to include character interactions, scene locations, and sentiments.

If you make use of the DVU dataset or otherwise participate in the challenge, please cite this paper using the following BibTeX:

@inproceedings{curtis2020hlvu,
  title={HLVU: A New Challenge to Test Deep Understanding of Movies the Way Humans do},
  author={Curtis, Keith and Awad, George and Rajput, Shahzad and Soboroff, Ian},
  booktitle={Proceedings of the 2020 International Conference on Multimedia Retrieval},
  pages={355--361},
  year={2020}
}

Movie/TV Dataset

The full Deep Video Understanding training set is available from www-nlpir.nist.gov/projects/trecvid/dvu/training/ . This training set has been annotated by human assessors, and final ground truth has been provided to participating researchers for training and development of their systems, both at the overall movie level (ontology of relations, entities, actions and events; Knowledge Graph; and names and images of all main characters) and at the individual scene level (ontology of locations, scene textual summaries, interactions between characters, character emotional states, and scene sentiments). Full details of these movies are provided below:


Training dataset:

  1. Honey - Romance - 86 mins.

  2. Let's bring back Sophie - Drama - 50 mins.

  3. Nuclear Family - Drama - 28 mins.

  4. Shooters - Drama - 41 mins.

  5. Spiritual Contact The Movie - Fantasy - 66 mins.

  6. Super Hero - Fantasy - 18 mins.

  7. The Adventures of Huckleberry Finn - Adventure - 106 mins.

  8. The Big Something - Comedy - 101 mins.

  9. Time Expired - Comedy / Drama - 92 mins.

  10. Valkaama - Adventure - 93 mins.

Testing dataset:

  1. Bagman - Drama / Thriller - 107 mins.

  2. Manos - Horror - 73 mins.

  3. Road to Bali - Comedy / Musical - 90 mins.

  4. The Illusionist - Adventure / Drama - 109 mins.

Resources by participating teams

Automatically generated transcripts by the University of Zurich are available from HERE. Please cite the team's 2020 system paper: https://dl.acm.org/doi/10.1145/3394171.3416292

Speech and person/face bounding-box annotations for a subset of the HLVU dataset are available from the TokyoTech team HERE. Please cite the team's 2020 system paper: https://dl.acm.org/doi/abs/10.1145/3395035.3425639


Scene annotations and resources by Nanjing University are available from HERE together with a README file. Please cite the team's 2020 system paper: https://dl.acm.org/doi/10.1145/3394171.3416303


Movie Level

Query Types


  1. Fill in the graph space: Fill in spaces in the Knowledge Graph (KG). Given the listed relationships, events, or actions for certain nodes, where some nodes are replaced by variables X, Y, etc., solve for X, Y, etc. Example from The Simpsons: X Married To Marge. X Friend Of Lenny. Y Volunteers At Church. Y Neighbor Of X. The solution for X and Y in that case would be: X = Homer, Y = Ned Flanders.

  2. Question Answering: This query type consists of questions about the resulting KG of each movie in the dataset, including its actions and events. For example, we may ask 'How many children does Person A have?', in which case participating researchers should count the 'Parent Of' relationships Person A has in the Knowledge Graph. These are multiple-choice questions.

  3. Relations between characters: How is character X related to character Y? This query type asks participants for all routes through the KG from one person to another (see the path-enumeration sketch after this list). The main objective of this query type is to test the quality of the established KG: if the system managed to build a representative KG reflecting the real storyline of the movie, then it should be able to return all valid paths between characters, including the shortest (i.e., how they are related to each other).
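As an illustration of the path-based query type, the sketch below builds a toy knowledge graph and enumerates all simple paths between two characters with networkx. This is a minimal sketch under assumed data: the toy graph contents and the "relation" edge attribute are illustrative, not the official KG format.

# A minimal path-enumeration sketch, assuming the KG is held as a
# networkx graph with a "relation" attribute on each edge (an
# illustrative layout, not the official KG format).
import networkx as nx

kg = nx.Graph()
kg.add_edge("Homer", "Marge", relation="Married To")
kg.add_edge("Homer", "Lenny", relation="Friend Of")
kg.add_edge("Bart", "Homer", relation="Child Of")
kg.add_edge("Bart", "Springfield Elementary", relation="Studies At")
kg.add_edge("Superintendent Chalmers", "Springfield Elementary", relation="Superintendent At")

# Query type 3: list every simple path between two characters.
for path in nx.all_simple_paths(kg, source="Superintendent Chalmers", target="Lenny"):
    hops = [f'{a} --[{kg[a][b]["relation"]}]--> {b}' for a, b in zip(path, path[1:])]
    print(" ; ".join(hops))

The same structure supports query type 1: an unknown node can be solved for by scanning candidate nodes whose incident edges match all of the listed relationships.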


Movie Level

Metrics


  1. Results will be treated as a ranked list of result items for each unknown variable; a Reciprocal Rank score will be calculated per unknown variable, and the Mean Reciprocal Rank (MRR) per query (a scoring sketch follows this list).

  2. Scores for this query type will be calculated as the number of correct answers divided by the total number of questions.

  3. In this query type, systems are asked to submit all valid paths from a source node to a target node, with the goal of maximizing recall and precision. Each submitted path will be evaluated for validity (i.e., whether the submitted order of nodes and edges forms a path from the source person to the target person of the query), and finally recall, precision, and F1 measures will be reported.
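A minimal scoring sketch for these metrics, assuming the gold answers and system output are already loaded into plain Python structures (the data layout is an assumption, not the official evaluation code):

# Scoring sketch; the data layout is illustrative, not the official code.

def mean_reciprocal_rank(rankings, gold):
    # rankings: {variable: ranked candidate list}; gold: {variable: answer}
    scores = []
    for var, ranked in rankings.items():
        if gold[var] in ranked:
            scores.append(1.0 / (ranked.index(gold[var]) + 1))
        else:
            scores.append(0.0)  # correct answer never returned
    return sum(scores) / len(scores)

def path_precision_recall_f1(submitted, valid):
    # submitted / valid: sets of paths, each path a tuple of node names
    hits = len(submitted & valid)
    p = hits / len(submitted) if submitted else 0.0
    r = hits / len(valid) if valid else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Gold X = "Homer"; the system ranked Homer second, so RR = 0.5.
print(mean_reciprocal_rank({"X": ["Apu", "Homer", "Moe"]}, {"X": "Homer"}))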


Scene Level

Query Types


  1. Find the Unique Scene: Given a full, inclusive list of interactions unique to a specific scene in the movie, teams should find which scene this is.

  2. Fill in the graph space: Find the person in a specific scene with the following attributes and interactions with others. Participating teams will be given a scene number, a list of person attributes, and a list of interactions to and from other people. Teams should find the only person in that scene with those attributes and interactions.

  3. Find next or previous interaction: Given a specific scene and a specific interaction between person X and person Y, participants will be asked to return either the previous or the next interaction, in either direction, between person X and person Y. This can be the next or previous interaction within the same scene, or over the entire movie. These will be multiple-choice questions selected from a list of possible interactions, only one of which will be correct.

  4. Find the 1-to-1 relationship between scenes and natural language descriptions: Given a set of scenes and a set of natural language descriptions of movie scenes, match the correct natural language description to each scene (a matching sketch follows this list).

  5. Classify scene sentiment for a given scene: Given a specific movie scene and a set of possible sentiments, classify the correct sentiment label for each given scene.
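One way to approach query type 4 is to score every (scene, description) pair and solve the resulting assignment problem so that each scene receives exactly one description. The sketch below does this with scipy's Hungarian-algorithm solver; the similarity function here is a placeholder assumption, to be replaced by a real scene-text scorer.

# 1-to-1 scene/description matching as an assignment problem.
# `similarity` is a placeholder; plug in a real scene-text scorer.
import numpy as np
from scipy.optimize import linear_sum_assignment

def similarity(scene_id, description):
    return float(hash((scene_id, description)) % 100)  # stand-in score

scenes = [2, 6, 9, 16]
descriptions = [
    "Homer shouts at the TV.",
    "Bart skips school.",
    "Marge cooks dinner.",
    "Lisa plays the saxophone.",
]

# Negate similarities because linear_sum_assignment minimizes cost.
cost = np.array([[-similarity(s, d) for d in descriptions] for s in scenes])
rows, cols = linear_sum_assignment(cost)
for r, c in zip(rows, cols):
    print(f"scene {scenes[r]} -> {descriptions[c]!r}")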


Scene Level

Metrics


  1. Results will be treated as a ranked list of result items for each unknown variable; a Reciprocal Rank score will be calculated per unknown variable, and the Mean Reciprocal Rank (MRR) per query.

  2. Results will be treated as a ranked list of result items for each unknown variable; a Reciprocal Rank score will be calculated per unknown variable, and the Mean Reciprocal Rank (MRR) per query.

  3. Scores for this query type will be calculated as the number of correct answers divided by the total number of questions.

  4. Scores for this query type will be calculated as the number of correct answers divided by the total number of questions.

  5. Scores for this query type will be calculated as the number of correct answers divided by the total number of questions.

Please see the sample XML response files for a movie-level run and a scene-level run. Please make sure your runs validate against the DTD files for both movie and scene query results. Two DTD files, for movie-level and scene-level results, are available: Movie-level DTD, Scene-level DTD.
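Before submitting, a run can be checked against the relevant DTD programmatically, for example with lxml; in this sketch "movie_level.dtd" and "my_run.xml" are placeholder file names for the downloaded DTD and your run file:

# DTD validation sketch; file names are placeholders.
from lxml import etree

with open("movie_level.dtd", "rb") as f:
    dtd = etree.DTD(f)
run = etree.parse("my_run.xml")
if dtd.validate(run):
    print("run validates against the movie-level DTD")
else:
    print(dtd.error_log.filter_from_errors())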

A README file describing the query setup is available from HERE.

Sample Queries and Responses (Movie-level)

Relational paths between characters:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="1" id="1">
  <item source="Superintendent Chalmers" target="Lenny"/>
  <item description="List all possible paths between Superintendent Chalmers and Lenny"/>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="1" id="1" path="1">
  <item source="Superintendent Chalmers" relation="Superintendent At" target="Springfield Elementary"/>
  <item source="Springfield Elementary" relation="Studied At By" target="Bart"/>
  <item source="Bart" relation="Child_of" target="Homer"/>
  <item source="Homer" relation="Friend_of" target="Lenny"/>
</DeepVideoUnderstandingTopicResult>

Fill in the graph space:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="2" id="1">
  <item subject="Person:Unknown_1" predicate="Relation:Spouse Of" object="Person:Marge"/>
  <item subject="Person:Unknown_1" predicate="Relation:Parent Of" object="Person:Bart"/>
  <item subject="Person:Unknown_1" predicate="Relation:Parent Of" object="Person:Lisa"/>
  <item subject="Person:Unknown_1" predicate="Relation:Friend Of" object="Person:Lenny"/>
  <item subject="Person:Unknown_1" predicate="Relation:Friend Of" object="Person:Barnie"/>
  <item subject="Person:Unknown_1" predicate="Relation:Socialises At" object="Entity:Moe's Tavern"/>
  <item subject="Person:Unknown_1" predicate="Relation:Works At" object="Entity:Nuclear Power Plant"/>
  <item subject="Person:Unknown_1" predicate="Relation:Attends" object="Entity:[BLANK]"/>
  <item description="Which Person has the following Relations: Spouse Of Person:Marge, Parent Of Person:Bart, Parent Of Person:Lisa, Friend Of Person:Lenny, Friend Of Person:Barnie, Socialises At Entity:Moe's Tavern, Works At Entity:Nuclear Power Plant, Attends Entity:[BLANK]?"/>
</DeepVideoUnderstandingTopicQuery>

<DeepVideoUnderstandingTopicQuery question="2" id="2">
  <item subject="Person:Unknown_2" predicate="Relation:Superintendent At" object="Entity:Springfield Elementary"/>
  <item subject="Person:Unknown_2" predicate="Relation:Supervisor Of" object="Person:Principal Skinner"/>
  <item subject="Person:Unknown_2" predicate="Relation:Attends" object="Entity:Church"/>
  <item description="Which Person has the following Relations: Superintendent At Entity:Springfield Elementary, Supervisor Of Person:Principal Skinner, Attends Entity:Church?"/>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="2" id="1">
  <item order="1" subject="Homer" confidence="64"/>
  <item order="2" subject="Apu" confidence="18"/>
  <item order="3" subject="Flanders" confidence="12"/>
  <item order="4" subject="Reverend Lovejoy" confidence="6"/>
</DeepVideoUnderstandingTopicResult>

<DeepVideoUnderstandingTopicResult question="2" id="2">
  <item order="1" subject="Superintendent Chalmers" confidence="92"/>
  <item order="2" subject="Agnes Skinner" confidence="8"/>
</DeepVideoUnderstandingTopicResult>

Question Answering:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="3" id="1">
  <item subject="Person:Ms. Krabappel" predicate="Relation:Unknown_1" object="Entity:Springfield Elementary"/>
  <item description="What is the relation / connection from Ms. Krabappel to Springfield Elementary?"/>
  <Answers>
    <item type="Entity" answer="Attends"/>
    <item type="Entity" answer="Teacher At"/>
    <item type="Entity" answer="Owns"/>
    <item type="Entity" answer="Studies At"/>
    <item type="Entity" answer="Principal At"/>
    <item type="Entity" answer="Friend Of"/>
  </Answers>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="3" id="1">
  <item type="Relation" answer="Teacher_At"/>
</DeepVideoUnderstandingTopicResult>
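A result file in the format above can be emitted with the Python standard library; the element and attribute names follow the samples on this page, while the ranked candidates below are illustrative data:

# Emits a ranked-list result in the sample format; candidates are
# illustrative data, not real system output.
import xml.etree.ElementTree as ET

result = ET.Element("DeepVideoUnderstandingTopicResult", question="2", id="1")
ranked = [("Homer", 64), ("Apu", 18), ("Flanders", 12)]
for order, (name, conf) in enumerate(ranked, start=1):
    ET.SubElement(result, "item", order=str(order), subject=name, confidence=str(conf))
print(ET.tostring(result, encoding="unicode"))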

Sample Queries and Responses (Scene-level)

Find the Unique Scene:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="1" id="1">
  <item subject="Scene:Unknown_1" predicate="Interaction:orders"/>
  <item subject="Scene:Unknown_1" predicate="Interaction:talks to"/>
  <item subject="Scene:Unknown_1" predicate="Interaction:explains to"/>
  <item subject="Scene:Unknown_1" predicate="Interaction:threatens"/>
  <item description="Which Unique Scene contains the following Interactions: orders, talks to, explains to, threatens"/>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="1" id="1">
  <item order="1" scene="24" confidence="72"/>
  <item order="2" scene="13" confidence="22"/>
  <item order="3" scene="2" confidence="4"/>
  <item order="4" scene="38" confidence="2"/>
</DeepVideoUnderstandingTopicResult>

Fill in the Graph Space:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="2" id="1">
  <item subject="Person:Unknown_1" scene="10" predicate="Interaction:talks to" object="Target_Person:Homer"/>
  <item subject="Person:Unknown_1" scene="10" predicate="Interaction:asks" object="Target_Person:Apu"/>
  <item subject="Person:Unknown_1" scene="10" predicate="Interaction:asks" object="Target_Person:Homer"/>
  <item subject="Person:Unknown_1" scene="10" predicate="Interaction:embraces" object="Target_Person:Homer"/>
  <item subject="Person:Unknown_1" scene="10" predicate="Interaction:explains to" object="Source_Person:Homer"/>
  <item description="Which Person in scene 10 has the following Interactions: talks to Target_Person:Homer, asks Target_Person:Apu, asks Target_Person:Homer, embraces Target_Person:Homer, Source_Person:Homer explains to"/>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="2" id="1">
  <item order="1" subject="Person:Lisa" confidence="63"/>
  <item order="2" subject="Person:Maggie" confidence="19"/>
  <item order="3" subject="Person:Bart" confidence="10"/>
  <item order="4" subject="Person:Marge" confidence="8"/>
</DeepVideoUnderstandingTopicResult>

What is the Next Interaction:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="3" id="1">
  <item subject="Person:Homer" scene="7" predicate="Interaction:talks to" object="Person:Marge"/>
  <item description="In Scene 7, Homer talks to Marge. What is the immediate next / following interaction between Marge and Homer, in scene 19?"/>
  <Answers>
    <item type="Interaction" scene="19" answer="explains to"/>
    <item type="Interaction" scene="19" answer="yells at"/>
    <item type="Interaction" scene="19" answer="talks to"/>
    <item type="Interaction" scene="19" answer="kisses"/>
    <item type="Interaction" scene="19" answer="greets"/>
    <item type="Interaction" scene="19" answer="stops"/>
  </Answers>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="3" id="1">
  <item type="Interaction" answer="talks to"/>
</DeepVideoUnderstandingTopicResult>

What is the Previous Interaction:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="4" id="1">
  <item subject="Person:Nelson" scene="36" predicate="Interaction:hits" object="Person:Bart"/>
  <item description="In Scene 36, Nelson hits Bart. What is the immediate prior / previous interaction between Nelson and Bart, in scene 36?"/>
  <Answers>
    <item type="Interaction" scene="36" answer="yells at"/>
    <item type="Interaction" scene="36" answer="touches"/>
    <item type="Interaction" scene="36" answer="kisses"/>
    <item type="Interaction" scene="36" answer="watches"/>
    <item type="Interaction" scene="36" answer="talks to"/>
    <item type="Interaction" scene="36" answer="fights with"/>
  </Answers>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="4" id="1">
  <item type="Interaction" answer="yells at"/>
</DeepVideoUnderstandingTopicResult>

Match scene with natural language description:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="5" id="1">
  <item subject="Scene:Unknown" predicate="Description"/>
  <item description="Homer shouts at the TV. Marge shouts at Homer telling him to stop shouting."/>
  <Answers>
    <item type="Integer:Scene" answer="2"/>
    <item type="Integer:Scene" answer="6"/>
    <item type="Integer:Scene" answer="9"/>
    <item type="Integer:Scene" answer="16"/>
    <item type="Integer:Scene" answer="21"/>
    <item type="Integer:Scene" answer="28"/>
    <item type="Integer:Scene" answer="32"/>
    <item type="Integer:Scene" answer="36"/>
    <item type="Integer:Scene" answer="40"/>
    <item type="Integer:Scene" answer="44"/>
  </Answers>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="5" id="1">
  <item type="Integer:Scene" answer="9"/>
</DeepVideoUnderstandingTopicResult>

Classify Sentiment Label for a Given Scene:

  • Sample Query:

<DeepVideoUnderstandingTopicQuery question="6" id="1">
  <item subject="Sentiment:Unknown" scene="18"/>
  <item description="In Scene 18, what is the correct sentiment label?"/>
  <Answers>
    <item type="Sentiment" answer="fight"/>
    <item type="Sentiment" answer="sexual harassment"/>
    <item type="Sentiment" answer="travel"/>
    <item type="Sentiment" answer="talking / conversation"/>
    <item type="Sentiment" answer="greeting"/>
    <item type="Sentiment" answer="dressing"/>
  </Answers>
</DeepVideoUnderstandingTopicQuery>

  • Sample Response:

<DeepVideoUnderstandingTopicResult question="6" id="1">
  <item type="Sentiment" answer="fight"/>
</DeepVideoUnderstandingTopicResult>
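Queries in the formats shown on this page can likewise be parsed with the Python standard library; the sketch below pulls the query items and the multiple-choice answer options out of a query file ("query.xml" is a placeholder file name):

# Parses a multiple-choice query in the sample format; "query.xml"
# is a placeholder file name.
import xml.etree.ElementTree as ET

root = ET.parse("query.xml").getroot()
print("question:", root.get("question"), "id:", root.get("id"))
for item in root.findall("item"):
    print("item:", item.attrib)
for ans in root.findall("Answers/item"):
    print("answer option:", ans.get("answer"))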