Hamidreza Hashempoor, Wan Choi
Seoul National University
NeurIPS 2024, the Thirty-eighth Annual Conference on Neural Information Processing Systems
Model
Gated Inference Network (GIN) is an efficient approximate Bayesian inference algorithm for state space models (SSMs) with nonlinear state transitions and emissions. GIN disentangles two latent representations: one representing the object, obtained through a nonlinear mapping model, and another representing the latent state that describes its dynamics. This disentanglement enables direct state estimation and missing data imputation as the world evolves. To infer the latent state, GIN uses a deep extended Kalman filter (EKF) approach that integrates a compact RNN structure to compute both the Kalman gain and the smoothing gain, completing the data flow.
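To make the filtering idea concrete, the sketch below shows one plausible prediction-correction step in which a small GRU maps the innovation to a data-dependent gain. It is a minimal illustration under assumed module names and dimensions (`GatedFilterStep`, `gain_rnn`, `f`, `h`), not the paper's implementation.

```python
# Minimal sketch (not the authors' code): one filtering step in the spirit of a
# deep EKF where the Kalman gain is produced by a compact RNN. All module
# names, dimensions, and network shapes are illustrative assumptions.
import torch
import torch.nn as nn

class GatedFilterStep(nn.Module):
    def __init__(self, state_dim=4, obs_dim=8, hidden_dim=32):
        super().__init__()
        self.state_dim, self.obs_dim = state_dim, obs_dim
        # Nonlinear transition and emission networks (stand-ins for f and h).
        self.f = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.Tanh(),
                               nn.Linear(hidden_dim, state_dim))
        self.h = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.Tanh(),
                               nn.Linear(hidden_dim, obs_dim))
        # Compact RNN that maps the innovation to a data-dependent Kalman gain.
        self.gain_rnn = nn.GRUCell(obs_dim, hidden_dim)
        self.gain_out = nn.Linear(hidden_dim, state_dim * obs_dim)

    def forward(self, z_prev, rnn_hidden, obs=None):
        # Predict: propagate the previous latent state through the transition model.
        z_pred = self.f(z_prev)
        if obs is None:
            # Missing frame: skip the correction and carry the prediction forward.
            return z_pred, rnn_hidden
        # Correct: innovation = observation minus predicted emission.
        innovation = obs - self.h(z_pred)
        rnn_hidden = self.gain_rnn(innovation, rnn_hidden)
        K = self.gain_out(rnn_hidden).view(-1, self.state_dim, self.obs_dim)
        z_filt = z_pred + torch.bmm(K, innovation.unsqueeze(-1)).squeeze(-1)
        return z_filt, rnn_hidden

# Toy usage: filter a batch of 2 sequences, dropping every 4th observation.
step = GatedFilterStep()
z = torch.zeros(2, 4)
h = torch.zeros(2, 32)
for t in range(10):
    obs = None if t % 4 == 3 else torch.randn(2, 8)
    z, h = step(z, h, obs)
```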
Main Result
As shown in the paper, GIN can be trained end-to-end and is able to learn the dynamics of the system directly from videos. The trained model can generate new sequences as well as impute missing data, even under highly distorted observations.
Below, we showcase the model's ability to learn the dynamics model from video data on its own, and then use it for missing data imputation and sequence generation. These tasks are carried out in distinct environments, each featuring an irregular polygon with a different number of edges.
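The sketch below illustrates, under assumed placeholder networks (`transition`, `decoder`) and frame sizes, how such a model can generate new frames, or fill in dropped ones, by rolling the learned dynamics forward and decoding each latent state. It is an illustrative assumption of the workflow, not the paper's code.

```python
# Hedged sketch of sequence generation / imputation: roll the learned
# transition model forward and decode each latent state to a frame. A dropped
# frame is filled by decoding the predicted state for that time step.
import torch
import torch.nn as nn

state_dim, frame_pixels = 4, 32 * 32
transition = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                           nn.Linear(64, state_dim))
decoder = nn.Sequential(nn.Linear(state_dim, 256), nn.ReLU(),
                        nn.Linear(256, frame_pixels), nn.Sigmoid())

z = torch.zeros(1, state_dim)          # start from an inferred (here: zero) state
frames = []
for _ in range(20):                    # generate 20 frames
    z = transition(z)                  # propagate the dynamics
    frames.append(decoder(z).view(32, 32))
video = torch.stack(frames)            # (T, H, W) generated sequence
```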
4 edges
Sequence Generation
Image Imputation with 25% of the frames dropped; the gray ball shows the imputed result.
5 edges
Sequence Generation
Image Imputation with 25% of the frames dropped; the gray ball shows the imputed result.
6 edges
Sequence Generation
Image Imputation with 25% of the frames dropped; the gray ball shows the imputed result.
7 edges
Sequence Generation
Image Imputation with 25% of the frames dropped; the gray ball shows the imputed result.