Improved Conditional VRNNs for Video Prediction