Qualitative Results: 5-step prediction

Note on the prediction videos: in an n-step prediction video, the model input is reset to the ground truth every n frames. For a 5-step prediction sequence, the input is therefore reset after every fifth time step; for a 1-step prediction sequence, it is reset after every prediction step. If T-1 (T = total number of time steps in an episode) is not evenly divisible by n, we discard the remaining time steps of that episode.
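The reset schedule described above can be sketched as follows. This is only an illustration, not the code used to produce the videos: the function name `n_step_rollout`, the `step_fn` callable standing in for the model, and the assumption that the model is fed its own predictions between resets are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence, TypeVar

State = TypeVar("State")


def n_step_rollout(
    ground_truth: Sequence[State],
    step_fn: Callable[[State], State],
    n: int,
) -> List[State]:
    """Roll out a one-step predictor with a ground-truth reset every n steps.

    ground_truth holds the T recorded states of one episode, so there are
    T-1 transitions to predict. step_fn is a hypothetical stand-in for the
    model: it maps the current state to the predicted next state.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    num_transitions = len(ground_truth) - 1
    # Keep only whole n-step segments; trailing transitions are discarded.
    usable = (num_transitions // n) * n

    predictions: List[State] = []
    for start in range(0, usable, n):
        # Reset the model input to the ground truth at the segment start.
        state = ground_truth[start]
        for _ in range(n):
            # Assumption: within a segment the model consumes its own output.
            state = step_fn(state)
            predictions.append(state)
    return predictions
```

With an episode of T = 12 states and n = 5, this yields two 5-step segments (starting at time steps 0 and 5) and drops the final transition, since T-1 = 11 is not divisible by 5. With n = 1, every prediction starts from a ground-truth state and nothing is discarded.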

Novel shape, dissimilar to the ones seen during training:


AP (no f_interact)

AP (with f_interact)

Novel and dissimilar shape:


AP (no f_interact)

AP (with f_interact)

Novel and dissimilar shape:


AP (no f_interact)

AP (with f_interact)

Novel and dissimilar shape:


AP (no f_interact)

AP (with f_interact)

Novel and similar shape:


AP (no f_interact)

AP (with f_interact)

Known shape:


AP (no f_interact)

AP (with f_interact)

Known shape (cubes dataset):

AP (with f_interact)

AP (with f_interact)