TECO (ours)
Using Top-k
No Top-k
Latent FDM
Not Applicable
(Diffusion models do not support Top-k sampling)
Perceiver AR
We generate 80 frames conditioned on 20 frames.