MINT

Downstream task: Object discovery in realistic scenes

[Comparison panels, 2 examples: MINT (ours) | Transporter [1] | Video Structure [2]]

Additional video results

[Comparison panels, 7 examples: MINT (ours) | Transporter [1] | Video Structure [2]]

References:

[1] Kulkarni, T. D., Gupta, A., Ionescu, C., Borgeaud, S., Reynolds, M., Zisserman, A., and Mnih, V. Unsupervised learning of object keypoints for perception and control. Advances in Neural Information Processing Systems, 32, 2019.

[2] Minderer, M., Sun, C., Villegas, R., Cole, F., Murphy, K. P., and Lee, H. Unsupervised learning of object structure and dynamics from videos. Advances in Neural Information Processing Systems, 32, 2019.