MINT
Downstream task: Object discovery in realistic scenes
Video comparison: MINT (ours), Transporter [1], Video Structure [2]
Additional video results
Video comparison: MINT (ours), Transporter [1], Video Structure [2]
References:
[1] Kulkarni, T. D., Gupta, A., Ionescu, C., Borgeaud, S., Reynolds, M., Zisserman, A., and Mnih, V. Unsupervised learning of object keypoints for perception and control. Advances in Neural Information Processing Systems, 32, 2019.
[2] Minderer, M., Sun, C., Villegas, R., Cole, F., Murphy, K. P., and Lee, H. Unsupervised learning of object structure and dynamics from videos. Advances in Neural Information Processing Systems, 32, 2019.