This page presents samples from the autoregressive video generation model described in:
Scaling Autoregressive Video Models
Dirk Weissenborn, Oscar Täckström, Jakob Uszkoreit
ICLR 2020
Note: Generations start when the border turns red.
We start by showing results of the large and small spatiotemporal subscaling models on BAIR Robot Pushing.
This shows a random sample for each of the 256 test videos with temperature=0.9.
This shows 15 random samples for 16 random test videos (shown in the leftmost column) with temperature=0.9. Samples are ordered by decreasing log-probability from left to right.
This shows a random sample for each of the 256 test videos with temperature=0.9.
This shows 15 random samples for 16 random test videos (shown in the leftmost column) with temperature=0.9. Samples are ordered by decreasing log-probability from left to right.
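As a rough illustration of what "temperature=0.9" and "ordered by decreasing log-probability" mean in the captions above, the following is a minimal sketch and not the authors' code. It assumes a toy stand-in for the model: a fixed list of per-step logits (`step_logits`, hypothetical), whereas a real autoregressive model would recompute logits conditioned on the tokens sampled so far. It also assumes that samples are scored under the untempered distribution; the page does not say which distribution is used for the ranking. All function names are illustrative.

```python
import math
import random


def log_softmax(logits, temperature=1.0):
    """Log-probabilities of softmax(logits / temperature), computed stably."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def sample_sequence(step_logits, temperature, rng):
    """Draw one token per step at the given temperature.

    Returns (tokens, log_prob). ASSUMPTION: log_prob is the score under the
    untempered (temperature=1) distribution.
    """
    tokens, total = [], 0.0
    for logits in step_logits:
        probs = [math.exp(lp) for lp in log_softmax(logits, temperature)]
        tok = rng.choices(range(len(logits)), weights=probs, k=1)[0]
        tokens.append(tok)
        total += log_softmax(logits)[tok]
    return tokens, total


def ranked_samples(step_logits, n, temperature=0.9, seed=0):
    """Draw n samples and sort them by decreasing log-probability."""
    rng = random.Random(seed)
    samples = [sample_sequence(step_logits, temperature, rng) for _ in range(n)]
    return sorted(samples, key=lambda s: s[1], reverse=True)


if __name__ == "__main__":
    toy_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]  # hypothetical values
    for tokens, log_prob in ranked_samples(toy_logits, n=15):
        print(tokens, round(log_prob, 3))
```

A temperature below 1 sharpens each per-step distribution, so lower-probability tokens are drawn less often than at temperature 1.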
We first show samples on the "cooking" subset of Kinetics (see appendix for definition).
Each row shows 16 random samples, ordered by decreasing log-probability from left to right. This corresponds to Figure 3 of the paper.
This shows a random sample for each of the 128 cooking test videos with temperature=0.9.
We next show samples from the first 256 videos of the full Kinetics test set. Here, even the most powerful models struggle to produce good samples.
This shows a random sample for each of the first 256 test videos with temperature=0.9.
This shows 16 random samples for 16 test videos with temperature=0.9. Samples are ordered by decreasing log-probability from left to right, and test videos are ordered by increasing movement in prime frames (as measured by SSIM) from top to bottom.
This shows a random sample for each of the first 256 test videos with temperature=0.9.
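The ordering of test videos by movement in the prime frames, mentioned above, could be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: the page does not say how SSIM is turned into a movement score, so the sketch uses one minus the mean SSIM between consecutive frames. It also uses a simplified single-window ("global") SSIM over flattened grayscale frames, whereas standard SSIM averages over local windows. The function names are hypothetical.

```python
def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM between two equal-length flat lists of pixel values."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


def movement_score(frames):
    """ASSUMPTION: movement = 1 - mean SSIM of consecutive frames."""
    pairs = list(zip(frames, frames[1:]))
    return 1.0 - sum(global_ssim(f, g) for f, g in pairs) / len(pairs)


def order_by_movement(videos):
    """Indices of videos sorted by increasing movement (least motion first)."""
    return sorted(range(len(videos)), key=lambda i: movement_score(videos[i]))


if __name__ == "__main__":
    static = [[10, 50, 200, 90]] * 3                      # identical frames
    moving = [[10, 50, 200, 90], [200, 10, 90, 50], [50, 90, 10, 200]]
    print(order_by_movement([moving, static]))
```

Under this definition, a video whose prime frames are identical scores close to zero and is placed at the top of the grid.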