Jialiang Wang

I am a research scientist at Meta GenAI. I currently work mainly on media foundation models (Emu, Movie Gen) and multimodal foundation models (Llama). I previously worked on depth estimation, on-device computer vision, and human vision-inspired computer vision. Prior to Meta, I obtained my Ph.D. from Harvard University, advised by Prof. Todd Zickler, and my B.A.Sc. from the University of Toronto, advised by Prof. Sven Dickinson and Prof. Sanja Fidler.

2025

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
H. Wang, CY Ma, YC Liu, J. Hou, T. Xu, J. Wang, F. Juefei-Xu, Y. Luo, P. Zhang, T. Hou, P. Vajda, N. Jha, X. Dai
CVPR, 2025
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
P. Hansen-Estruch, D. Yan, CY Chung, O. Zohar, J. Wang, T. Hou, T. Xu, S. Vishwanath, P. Vajda, X. Chen
arXiv, 2025

2024

DirectorLLM for Human-Centric Video Generation
K. Song, T. Hou, Z. He, H. Ma, J. Wang, A. Sinha, S. Tsai, Y. Luo, X. Dai, L. Chen, X. Xia, P. Zhang, P. Vajda, A. Elgammal, F. Juefei-Xu
arXiv, 2024
Movie Gen: A Cast of Media Foundation Models
The Movie Gen Team (Core contributor)
Meta AI Tech Report, 2024
Pixel-Space Post-Training of Latent Diffusion Models
C. Zhang, S. Motwani, M. Yu, J. Hou, F. Juefei-Xu, S. Tsai, P. Vajda, Z. He, J. Wang
arXiv, 2024
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
F. Wimbauer, B. Wu, E. Schoenfeld, X. Dai, J. Hou, Z. He, A. Sanakoyeu, P. Zhang, S. Tsai, J. Kohler, C. Rupprecht, D. Cremers, P. Vajda, J. Wang
CVPR, 2024
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
F. Liang, B. Wu, J. Wang, L. Yu, K. Li, Y. Zhao, I. Misra, JB Huang, P. Zhang, P. Vajda, D. Marculescu
CVPR, 2024
ControlRoom3D: Room Generation using Semantic Proxy Rooms
J. Schult, S. Tsai, L. Höllein, B. Wu, J. Wang, CY Ma, K. Li, X. Wang, F. Wimbauer, Z. He, P. Zhang, B. Leibe, P. Vajda, J. Hou
CVPR, 2024
Efficient Quantization Strategies for Latent Diffusion Models
Y. Yang, X. Dai, J. Wang, P. Zhang, H. Zhang
CVPR workshop on Efficient and On-Device Generation, 2024
An Analysis on Quantizing Diffusion Transformers
Y. Yang, J. Wang, X. Dai, P. Zhang, H. Zhang
CVPR workshop on Transformers for Vision, 2024

2023

Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
X. Dai∗, J. Hou∗, CY Ma∗, S. Tsai∗, J. Wang∗, R. Wang∗, P. Zhang∗, S. Vandenhende, X. Wang, A. Dubey, M. Yu, A. Kadian, F. Radenovic, D. Mahajan, K. Li, Y. Zhao, V. Petrovic, M. K. Singh, S. Motwani, Y. Wen, Y. Song, R. Sumbaly†, V. Ramanathan†, Z. He†, P. Vajda†, D. Parikh†
Meta AI Tech Report, 2023
∗: equal contribution, alphabetical order
†: joint last authors
NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View Indoor 3D Object Detection
C. Xu, B. Wu, J. Hou, S. Tsai, R. Li, J. Wang, W. Zhan, Z. He, P. Vajda, K. Keutzer, M. Tomizuka
ICCV, 2023
A Practical Stereo Depth System for Smart Glasses
J. Wang, D. Scharstein, A. Bapat, K. Blackburn-Matzen, M. Yu, J. Lehman, S. Alsisan, Y. Wang, S. Tsai, JM Frahm, Z. He, P. Vajda, M. F. Cohen, M. Uyttendaele
CVPR, 2023
Consistent Direct Time-of-Flight Video Depth Super-Resolution
Z. Sun, W. Ye, J. Xiong, G. Choe, J. Wang, S. Su, R. Ranjan
CVPR, 2023

2022

Toward Practical Monocular Indoor Depth Estimation
CY Wu, J. Wang, M. Hall, U. Neumann, S. Su
CVPR, 2022

2021

Before 2020

Wang, J., et al. "Distance determinations using one or more neural networks."
U.S. Patent Application No. 16/852,944

Reviewer: CVPR'20-24, NeurIPS'20-24, ICML'21-23, ICCV'21,23, ICLR'21-22, BMVC'20, ACCV'20, WACV'21-22, ECCV'22,24