Visual Storytelling Challenge
Visual Storytelling Challenge Organizers (afternoon)
Margaret Mitchell, Google Research, margarmitchell@gmail.com
Ishan Misra, CMU, ishanmisra@gmail.com
Justin Johnson, Stanford, jcjohns@cs.stanford.edu
Ting-Hao Kenneth Huang, windx0303@gmail.com
Frank O Ferraro, fferraro@gmail.com
Human storytelling dates back as far as we can trace, predating writing itself.
Humans have used stories for entertainment, education, cultural preservation; to convey experiences, history, lessons, morals; and to share the human experience.
This challenge begins to scratch the surface of how well artificial intelligence can share in this cultural human experience.
Participants are encouraged to create AI systems that can generate stories of their own, sharing the human experience that they see -- and begin to understand.
Participants may submit to two different tracks: The Internal track and the External track.
Submissions are evaluated on how well they can generate human-like stories given a sequence of images as input.
Dates
17 October (extended): Submissions due on EvalAI. (You will need to create an account to view the challenge.)
21 October: Results announced
Submission Tracks
Internal Track
For an apples-to-apples comparison, all participants should submit to the Internal Track. In this track, the only allowable training data is:
Any of the VIST storytelling data (SIS, DII, and/or the non-annotated album images)
Data available here: http://visionandlanguage.net/VIST/dataset.html
Allowed pretraining
Data from any version of the ImageNet ILSVRC Challenge (common in computer vision)
Data from any version of the Penn Treebank (common in natural language processing)
If you wish to use any other sources of data/labels or pre-training, please submit to the External track.
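As a sketch of how the VIST SIS annotations can be assembled into training stories, the helper below groups per-image sentences by story. The field names (`annotations`, `story_id`, `text`, `worker_arranged_photo_order`) are assumptions based on the published SIS JSON and should be verified against the actual files downloaded from the dataset page:

```python
import json
from collections import defaultdict

def load_stories(path):
    """Group SIS annotations into ordered five-sentence stories.

    Assumes each entry in data["annotations"] is (or wraps) a dict
    with "story_id", "text", and "worker_arranged_photo_order" keys;
    check these names against the real VIST download.
    """
    with open(path) as f:
        data = json.load(f)
    stories = defaultdict(list)
    for item in data["annotations"]:
        # In the published files each annotation may be wrapped in a
        # single-element list; unwrap defensively.
        ann = item[0] if isinstance(item, list) else item
        stories[ann["story_id"]].append(
            (ann["worker_arranged_photo_order"], ann["text"])
        )
    # Sort each story's sentences by the photo position in the album.
    return {
        sid: [text for _, text in sorted(parts)]
        for sid, parts in stories.items()
    }
```

Grouping by `story_id` rather than by album is the natural unit here, since each album in VIST has multiple independently written stories.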
External Track
Participants may use any data or methods they wish during training (including humans in the loop), but all data must be publicly available or made publicly available. At test time, systems must run stand-alone (no human intervention). Possible datasets include data from other ICCV workshops, COCO, and VQA. Please e-mail us with the datasets you choose and links to their sources. We will update the website with links to datasets chosen by two or more groups, and make all datasets available after submission.
Evaluation
Evaluation will have two parts:
Automatic: On EvalAI, using the METEOR metric.
Human: Crowdsourced survey of the quality of the stories.
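To illustrate roughly what the automatic metric measures, here is a simplified METEOR-style scorer. It uses the standard METEOR formula (harmonic F-mean of unigram precision and recall, discounted by a chunk-based fragmentation penalty) but matches exact unigrams only; the real METEOR also matches stems and WordNet synonyms, and the official scores come from the EvalAI evaluation, not from this sketch:

```python
def simple_meteor(hypothesis: str, reference: str) -> float:
    """Simplified METEOR: exact unigram matches only.

    The real metric additionally aligns stems and WordNet synonyms;
    this sketch is only meant to show the shape of the score.
    """
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    # Greedy one-to-one alignment of exact matches, in hypothesis order.
    used = [False] * len(ref)
    alignment = []  # (hyp_index, ref_index) pairs
    for i, tok in enumerate(hyp):
        for j, rtok in enumerate(ref):
            if not used[j] and tok == rtok:
                used[j] = True
                alignment.append((i, j))
                break
    m = len(alignment)
    if m == 0:
        return 0.0
    precision = m / len(hyp)
    recall = m / len(ref)
    # METEOR weights recall 9x more than precision.
    fmean = 10 * precision * recall / (recall + 9 * precision)
    # A chunk is a maximal run of matches contiguous in both strings.
    chunks = 1
    for (i1, j1), (i2, j2) in zip(alignment, alignment[1:]):
        if i2 != i1 + 1 or j2 != j1 + 1:
            chunks += 1
    penalty = 0.5 * (chunks / m) ** 3
    return fmean * (1 - penalty)
```

Note that, unlike BLEU, the recall-heavy F-mean rewards stories that cover the reference's content even when they add extra words, which is one reason METEOR is a common choice for story and caption generation.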
Submission Instructions
Please follow the instructions listed on the challenge webpage on EvalAI.