Welcome

Xiaoliang Dai

xiaoliangdai@fb.com [Google Scholar]

About Me

I am a Staff Research Scientist working on image and video generation models in GenAI at Meta. In the past, I have worked on efficient neural network design, neural architecture search, and data-efficient learning. I received the B.S. degree from Peking University in 2014 and the Ph.D. degree from Princeton University in July 2019.

Education

  • 2014 - 2019, PhD, Princeton University

  • 2010 - 2014, Bachelor of Science, Peking University

Work Experience

  • 02/2023 - current, Staff Research Scientist, Media Foundation at GenAI, Meta.

    • Lead IC and training babysitter for the pretraining stage of Meta's Movie Gen.

    • Tech lead for Meta's image generation model, Emu.

  • 09/2019 - 02/2023, Research Scientist, Mobile Computer Vision, Meta.

  • 05/2018 - 01/2019, Research Intern, Mobile Computer Vision, Meta.

  • 05/2016 - 08/2016, DSP Software Engineer Intern, Tensilica.

Research Interests

  • Generative AI

  • Efficient deep neural networks

  • Data-efficient learning

Selected Publications

  • Meta Movie Gen Team, "Movie Gen: A Cast of Media Foundation Models", arXiv:2410.13720 (2024).

  • B. Lai, X. Dai, L. Chen, G. Pang, J. M. Rehg, and M. Liu, "Lego: Learning egocentric action frame generation via visual instruction tuning", ECCV 2024, Best paper candidate

  • X. Dai, et al. "Emu: Enhancing image generation models using photogenic needles in a haystack." arXiv preprint arXiv:2309.15807 (2023).

  • J. Tian, X. Dai, C. Ma, Z. He, Y. Liu, Z. Kira, "Trainable Projected Gradient Method for Robust Fine-tuning", CVPR 2023.

  • J. Hou, X. Dai, Z. He, A. Dai, M. Nießner, "Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors", CVPR 2023.

  • H. You, Y. Xiong, X. Dai, B. Wu, P. Zhang, H. Fan, P. Vajda, Y. Lin, "Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference", CVPR 2023.

  • F. Liang, B. Wu, X. Dai, K. Li, Y. Zhao, H. Zhang, P. Zhang, P. Vajda, and D. Marculescu, "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP", CVPR 2023.

  • D. Bolya, C.-Y. Fu, X. Dai, P. Zhang, C. Feichtenhofer, J. Hoffman, "Token Merging: Your ViT But Faster", ICLR 2023.

  • T. Zhang, D. Cheng, Y. He, Z. Chen, X. Dai, L. Xiong, F. Yan, H. Li, Y. Chen, W. Wen, "NASRec: Weight Sharing Neural Architecture Search for Recommender Systems", WWW 2023.

  • Y. Liu, C. Ma, X. Dai, J. Tian, P. Vajda, Z. He, Z. Kira, “Open-Set Semi-Supervised Object Detection”, ECCV, 2022.

  • D. Bolya, C.-Y. Fu, X. Dai, P. Zhang, J. Hoffman, "Hydra attention: Efficient attention with many heads", ECCV CADL Workshop (Best Paper Award), 2022.

  • Y. Li, X. Dai, C. Ma, Y. Liu, K. Chen, B. Wu, Z. He, K. Kitani, P. Vajda, "Cross-Domain Adaptive Teacher for Object Detection", CVPR, 2022.

  • X. Dai, A. Wan, P. Zhang, B. Wu, Z. He, Z. Wei, K. Chen, Y. Tian, M. Yu, P. Vajda, and J. Gonzalez, “FBNetV3: Joint Architecture-Recipe Search using Neural Acquisition Function,” CVPR, 2021.

  • Z. Yan, X. Dai, P. Zhang, Y. Tian, B. Wu, M. Feiszli, “FP-NAS: Fast Probabilistic Neural Architecture Search”, CVPR, 2021.

  • B. Wu, C. Xu, X. Dai, A. Wan, P. Zhang, Z. Yan, M. Tomizuka, K. Keutzer, P. Vajda, “Visual Transformers: Where Do Transformers Really Belong in Vision Models?” ICCV, 2021.

  • A. Wan, X. Dai, P. Zhang, Z. He, Y. Tian, S. Xie, B. Wu, M. Yu, T. Xu, K. Chen, P. Vajda, and J. Gonzalez, “FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions,” CVPR, 2020.

  • X. Dai, P. Zhang, B. Wu, H. Yin, F. Sun, Y. Wang, M. Dukhan, Y. Hu, Y. Wu, Y. Jia, P. Vajda, M. Uyttendaele, and N. K. Jha, “ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation,” CVPR, 2019.

  • B. Wu, X. Dai, P. Zhang, Y. Wang, F. Sun, Y. Wu, Y. Tian, P. Vajda, Y. Jia, and K. Keutzer, “FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search,” CVPR, 2019.

  • X. Dai, H. Yin, and N. K. Jha, “NeST: A Neural Network Synthesis Tool Based on a Grow-and-prune Paradigm,” IEEE Trans. on Computers, 2019.

  • X. Dai, H. Yin, and N. K. Jha, “Grow and Prune Compact, Fast, and Accurate LSTMs,” IEEE Trans. on Computers, 2019.

  • X. Dai, H. Yin, and N. K. Jha, “Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks,” IEEE Trans. on Emerging Topics in Computing, 2021.

  • A. Mosenia, X. Dai, P. Mittal and N. K. Jha, “PinMe: Tracking a Smartphone User around the World,” IEEE Trans. on Multi-scale Computing Syst., Aug. 2017.

Movie Gen: A Cast of Media Foundation Models 

Meta's Movie Gen team

[Webpage] / [arXiv] / [MovieGenBench (GitHub)] / [bibtex]

Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

Meta's media generation team

[Emu] 

Lego: Learning egocentric action frame generation via visual instruction tuning

Lai, Bolin, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, and Miao Liu

paper / project page

Cross-Domain Adaptive Teacher for Object Detection

Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda

IEEE Conf. CVPR, 2022

paper / code

FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining

Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez

Proc. IEEE Conf. CVPR, 2021

paper

FP-NAS: Fast Probabilistic Neural Architecture Search

Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli

Proc. IEEE Conf. CVPR, 2021

paper

Visual Transformers: Token-based Image Representation and Processing for Computer Vision

Bichen Wu , Chenfeng Xu , Xiaoliang Dai , Alvin Wan , Peizhao Zhang , Masayoshi Tomizuka , Kurt Keutzer , Peter Vajda

Proc. IEEE Conf. ICCV, 2021

paper

FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions

Alvin Wan, Xiaoliang Dai, Peizhao Zhang, Zijian He, Yuandong Tian, Saining Xie, Bichen Wu, Matthew Yu, Tao Xu, Kan Chen, Peter Vajda, Joseph E. Gonzalez

Proc. IEEE Conf. CVPR, 2020

paper / code

ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, and Niraj K. Jha

Proc. IEEE Conf. CVPR, 2019


paper / code

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, and Kurt Keutzer

Proc. IEEE Conf. CVPR, 2019

paper / code

NeST: A Neural Network Synthesis Tool Based on a Grow-and-prune Paradigm

Xiaoliang Dai, Hongxu Yin, and Niraj K. Jha

IEEE Trans. on Computers, 2019


paper

Academic Service

Reviewer / Program committee

  • IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)

  • Neural Information Processing Systems  (NeurIPS)

  • IEEE International Conf. on Computer Vision (ICCV)

  • European Conference on Computer Vision (ECCV)

  • International Conference on Learning Representations (ICLR)

  • IEEE Winter Conf. on Applications of Computer Vision (WACV)

  • IEEE Trans. on Pattern Analysis and Machine Intelligence

  • IEEE Trans. on Neural Networks and Learning Systems

  • IEEE Trans. on Computers

  • IEEE Trans. on Image Processing

  • IEEE Trans. on Emerging Topics in Computing

  • IEEE Trans. on Multi-scale Computing Systems

  • IEEE Signal Processing Letters

  • J. Visual Communication and Image Representation

  • J. Selected Topics in Signal Processing

  • Efficient Deep Learning in Computer Vision  (EDLCV) Workshop

  • Medical Image Computing and Computer Assisted Interventions (MICCAI)
