A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data