Track 2B: Text-to-Video Generation


Introduction

The realm of AI-powered creativity is expanding rapidly, and one of its newest frontiers is text-to-video generation. With models like Sora, VideoPoet, and Pika leading the way, this field is opening up new possibilities for storytelling and content creation.

In this competition, participants will harness the power of these advanced models to generate videos from text prompts. Imagine turning a simple description like "a bustling city at sunrise" into a dynamic, visually stunning video.

Competitors will be provided with a set of text prompts and tasked with developing models that can transform these prompts into engaging, high-quality videos. This challenge not only tests the capabilities of current text-to-video generation technology but also pushes the boundaries of what’s possible in AI-driven content creation.

Join us in this innovative competition, where you’ll have the chance to showcase your skills, contribute to the growing field of AI, and compete for a $2000 USD prize. Let your creativity and technical prowess shine as you bring text descriptions to life through the magic of video.

Quick start guide

Dates

Evaluation method

To participate in the contest, you will submit the videos generated by your model. As you develop your model, you may want to inspect your results visually and use VBench's automated metrics to track your progress.


After all submissions are uploaded, we will run a human evaluation of all submitted videos. Labelers will evaluate videos on the following criteria:


We will choose a winner and a runner-up based on both the automatic scores and the human evaluation scores.

Dataset

Our LOVEU-T2V-2024 dataset consists of 240 prompts spanning diverse categories/dimensions:

Rules

Report format

In your report, please explain clearly:

The report may be brief (one page) or detailed (many pages), and must be submitted in PDF format.

FAQ

Organizers

Zhen Dong 

UC Berkeley