I am a Staff Research Scientist working on image and video generation models in GenAI at Meta. Previously, I worked on efficient neural network design, neural architecture search, and data efficient learning. I received the B.S. degree from Peking University in 2014 and the Ph.D. degree from Princeton University in July 2019.
2014 - 2019, PhD, Princeton University
2010 - 2014, Bachelor of Science, Peking University
02/2023 - current, Staff Research Scientist, Media Foundation at GenAI, Meta.
09/2019 - 02/2023, Research Scientist, Mobile Computer Vision, Meta.
05/2018 - 01/2019, Research Intern, Mobile Computer Vision, Meta.
05/2016 - 08/2016, DSP Software Engineer Intern, Tensilica.
Generative AI
Efficient deep neural network
Data efficient learning
Meta Movie Gen Team, "Movie Gen: A Cast of Media Foundation Models", arXiv:2410.13720 (2024).
B. Lai, X. Dai, L. Chen, G. Pang, J. M. Rehg, and M. Liu, "Lego: Learning egocentric action frame generation via visual instruction tuning", ECCV 2024 (Best Paper Candidate).
X. Dai, et al. "Emu: Enhancing image generation models using photogenic needles in a haystack." arXiv preprint arXiv:2309.15807 (2023).
J. Tian, X. Dai, C. Ma, Z. He, Y. Liu, Z. Kira, "Trainable Projected Gradient Method for Robust Fine-tuning", CVPR 2023.
J. Hou, X. Dai, Z. He, A. Dai, M. Nießner, "Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors", CVPR 2023.
H. You, Y. Xiong, X. Dai, B. Wu, P. Zhang, H. Fan, P. Vajda, Y. Lin, "Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference", CVPR 2023.
F. Liang, B. Wu, X. Dai, K. Li, Y. Zhao, H. Zhang, P. Zhang, P. Vajda, and D. Marculescu, "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP", CVPR 2023.
D. Bolya, C.-Y. Fu, X. Dai, P. Zhang, C. Feichtenhofer, J. Hoffman, "Token Merging: Your ViT But Faster", ICLR 2023.
T. Zhang, D. Cheng, Y. He, Z. Chen, X. Dai, L. Xiong, F. Yan, H. Li, Y. Chen, W. Wen, "NASRec: Weight Sharing Neural Architecture Search for Recommender Systems", WWW 2023.
Y. Liu, C. Ma, X. Dai, J. Tian, P. Vajda, Z. He, Z. Kira, “Open-Set Semi-Supervised Object Detection”, ECCV, 2022.
D. Bolya, C.-Y. Fu, X. Dai, P. Zhang, J. Hoffman, "Hydra attention: Efficient attention with many heads", ECCV CADL Workshop (Best Paper Award), 2022.
Y. Li, X. Dai, C. Ma, Y. Liu, K. Chen, B. Wu, Z. He, K. Kitani, P. Vajda, "Cross-Domain Adaptive Teacher for Object Detection", CVPR, 2022.
X. Dai, A. Wan, P. Zhang, B. Wu, Z. He, Z. Wei, K. Chen, Y. Tian, M. Yu, P. Vajda, and J. Gonzalez, “FBNetV3: Joint Architecture-Recipe Search using Neural Acquisition Function,” CVPR, 2021.
Z. Yan, X. Dai, P. Zhang, Y. Tian, B. Wu, M. Feiszli, “FP-NAS: Fast Probabilistic Neural Architecture Search”, CVPR, 2021.
B. Wu, C. Xu, X. Dai, A. Wan, P. Zhang, Z. Yan, M. Tomizuka, K. Keutzer, P. Vajda, “Visual Transformers: Where Do Transformers Really Belong in Vision Models?” ICCV, 2021.
A. Wan, X. Dai, P. Zhang, Z. He, Y. Tian, S. Xie, B. Wu, M. Yu, T. Xu, K. Chen, P. Vajda, and J. Gonzalez, “FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions,” CVPR, 2020.
X. Dai, P. Zhang, B. Wu, H. Yin, F. Sun, Y. Wang, M. Dukhan, Y. Hu, Y. Wu, Y. Jia, P. Vajda, M. Uyttendaele, and N. K. Jha, “ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation,” CVPR, 2019.
B. Wu, X. Dai, P. Zhang, Y. Wang, F. Sun, Y. Wu, Y. Tian, P. Vajda, Y. Jia, and K. Keutzer, “FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search,” CVPR, 2019.
X. Dai, H. Yin, and N. K. Jha, “NeST: A Neural Network Synthesis Tool Based on a Grow-and-prune Paradigm,” IEEE Trans. on Computers, 2019.
X. Dai, H. Yin, and N. K. Jha, “Grow and Prune Compact, Fast, and Accurate LSTMs,” IEEE Trans. on Computers, 2019.
X. Dai, H. Yin, and N. K. Jha, “Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks,” IEEE Trans. on Emerging Topics in Computing, 2021.
A. Mosenia, X. Dai, P. Mittal and N. K. Jha, “PinMe: Tracking a Smartphone User around the World,” IEEE Trans. on Multi-scale Computing Syst., Aug. 2017.
Meta's Movie Gen team
Meta's media generation team
Lai, Bolin, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, and Miao Liu
Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez
Proc. IEEE Conf. CVPR, 2021
Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli
Proc. IEEE Conf. CVPR, 2021
Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Masayoshi Tomizuka, Kurt Keutzer, Peter Vajda
Proc. IEEE Conf. ICCV, 2021
Xiaoliang Dai, Hongxu Yin, and Niraj K. Jha
IEEE Trans. on Computers, 2019
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)
Neural Information Processing Systems (NeurIPS)
IEEE International Conf. on Computer Vision (ICCV)
European Conference on Computer Vision (ECCV)
International Conference on Learning Representations (ICLR)
IEEE Winter Conf. on Applications of Computer Vision (WACV)
IEEE Trans. on Pattern Analysis and Machine Intelligence
IEEE Trans. on Neural Networks and Learning Systems
IEEE Trans. on Computers
IEEE Trans. on Image Processing
IEEE Trans. on Emerging Topics in Computing
IEEE Trans. on Multi-scale Computing Systems
IEEE Signal Processing Letters
J. Visual Communication and Image Representation
J. Selected Topics in Signal Processing
Efficient Deep Learning in Computer Vision (EDLCV) Workshop
Medical Image Computing and Computer Assisted Interventions (MICCAI)