Predicting Video with VQVAE