Welcome to Xiaopeng Zhang's Homepage
Xiaopeng Zhang (张晓鹏) Ph.D
Senior Researcher, Assistant Scientist
Huawei
Email: zxphistory(at)gmail(dot)com
Bio
I am currently an Assistant Scientist at Huawei. I have led the PanGu vision team at Huawei Cloud since 2020 and am in charge of PanGu foundation model research. As a core member, I participated in the evolution of PanGu from 1.0 to 3.0; successful projects I have led include PanGu Mine, Railway, Autonomous Driving, and Electricity. Prior to that, I led a research team at Noah's Ark Lab, focusing on data-efficient learning in autonomous driving.
Before joining Huawei, I was a Research Fellow from 2017 to 2019 in the Department of Electrical and Computer Engineering at the National University of Singapore, where I was a member of the Learning and Vision Lab under the supervision of Jiashi Feng and Shuicheng Yan. I received my PhD in Electronic Engineering from Shanghai Jiao Tong University in 2017, under the supervision of Prof. Hongkai Xiong and Prof. Qi Tian. My research interests focus on vision and language foundation models, including foundation model pretraining, workflow development, data engineering, and multi-modal understanding. Before that, during my PhD and postdoctoral period, my research focused on fine-grained recognition and weakly supervised learning.
I am always recruiting highly motivated interns (PhD preferred) to work on foundation models; topics include (but are not limited to) self-supervised learning, multi-modal learning, network optimization, and data engineering. We offer ample computing resources and competitive benefits. If you are interested, please drop me an email.
Selected Honors and Awards
The Pangu Large Pre-Trained Models, SAIL Star of World Artificial Intelligence Conference, 2021.
NuScenes Autonomous Driving 3D Detection Task 2020, Top 1.
WebVision Large Scale Classification Challenge 2020, Top 1.
LVIS Long-tailed Challenge 2020, Most Innovative Award.
Outstanding Doctoral Thesis Award, China Society of Image and Graphics (CSIG), 2018.
Best Student Paper Award, Visual Communications and Image Processing (VCIP), 2014.
Projects
Technological Innovation 2030—Major Project of “New Generation Artificial Intelligence”- Machine learning technology under data security and privacy protection: Large Scale Learning System Sub-project Leader
Publications
Representative Works: Foundation models (pretraining, adaptation, model acceleration), 3D CV.
Preprints:
B. Shi, P. Zhao, X. Zhang et al. UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding, arxiv:2401.06397, 2024.
G. Wu, T. Yi, J. Fang, L. Xie, X. Zhang et al., 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering, arxiv:2310.08528, 2023.
T. Yi, J. Fang, G. Wu, L. Xie, X. Zhang et al., GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors, arxiv:2310.08529, 2023.
J. Fang, J. Wang, X. Zhang, L. Xie, Q. Tian, GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions, arxiv:2311.16037, 2023.
J. Cen, J. Fang, C. Yang, L. Xie, X. Zhang et al., Segment Any 3D Gaussians, arxiv:2312.00860, 2023.
Y. Chen, J. Fang, Y. Huang, T. Yi, X. Zhang et al., Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views, arxiv:2312.04424, 2023.
B. Shi, X. Zhang, H. Xu, W. Dai, J. Zou, H. Xiong, Q. Tian, Multi-dataset Pretraining: A Unified Model for Semantic Segmentation, arxiv:2106.04121, 2021.
Conference & Journal Papers:
Y. Wang, J. Li, X. Zhang, et al., BarLeRIa: An Efficient Tuning Framework for Referring Image Segmentation, ICLR 2024 (Spotlight).
B. Shi, X. Zhang, et al., Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners, ICLR 2024.
Y. Zhang, Y. Wei, X. Zhang et al., ControlVideo: Training-free Controllable Text-to-Video Generation, ICLR 2024.
Y. Xu, L. Xie, X. Zhang, et al., QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models, ICLR 2024.
J. Li, Y. Wang, X. Zhang, et al., AiluRus: A Scalable ViT Framework for Dense Prediction, NeurIPS 2023.
J. Cen, Z. Zhou, J. Fang, X. Zhang, et al., Segment Anything in 3D with NeRFs, NeurIPS 2023.
Y. Wang, Y. Liu, X. Zhang, et al., VioLET: Vision-Language Efficient Tuning with Collaborative Multimodal Gradients, ACM MM 2023.
J. Li, Y. Wang, X. Zhang, Progressively Compressed Auto-Encoder for Self-supervised Representation Learning, ICLR 2023.
Y. Wang, B. Shi, X. Zhang, Adapting Shortcut with Normalizing Flow: An Efficient Tuning Framework for Visual Recognition, CVPR 2023.
J. Peng, Q. Chang, H. Yin, X. Bu, J. Sun, L. Xie, X. Zhang et al., GAIA-Universe: Everything is Super-Netify, IEEE TPAMI, 2023.
L. Xie, L. Wei, X. Zhang et al., Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models, arXiv preprint arXiv:2306.08641.
H. Xu, X. Zhang, H. Li, L. Xie, W. Dai, H. Xiong, Q. Tian, Seed the views: Hierarchical semantic alignment for contrastive representation learning, IEEE TPAMI, 2022.
H. Xu, J. Fang, X. Zhang, L. Xie, X. Wang, W. Dai, H. Xiong, Q. Tian, Bag of instances aggregation boosts self-supervised distillation, ICLR 2022.
H. Cao, Y. Wang, X. Zhang, et al., Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation, ECCV Workshop, 2022.
Y. Xu, L. Xie, X. Zhang et al., PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search, ICLR, 2020.
Early Publications:
Fine-grained recognition, weakly supervised learning:
X. Zhang, Y. Yang, J. Feng. Learning to Localize Objects with Noisy Labeled Instances, AAAI, 2019.
X. Zhang, Y. Yang, J. Feng. ML-LocNet: Improving Object Localization with Multi-view Learning Network, ECCV, 2018.
X. Zhang, J. Feng, H. Xiong, and Q. Tian, Zigzag Learning for Weakly Supervised Object Detection, CVPR, 2018.
X. Zhang, H. Xiong, W. Zhou, W. Lin and Q. Tian, Picking Deep Filter Responses for Fine-grained Image Recognition, CVPR, 2016.
X. Zhang, H. Xiong, W. Zhou, and Q. Tian, Fused One-vs-All Mid-Level Features for Fine-Grained Visual Categorization, ACM Multimedia, Full paper, 2014.
H. Wang, X. Zhang, H. Xiong, Spatio-Temporal Coherence for 3-D View Synthesis with Curve-Based Disparity Warping, VCIP, 2014. Best Student Paper Award.
X. Zhang, Y. Yang, H. Xiong, J. Feng. PML-LocNet: Improving Object Localization with Prior-induced Multi-view Learning Network. IEEE TIP, 2019.
X. Zhang, H. Xiong, W. Lin, and Q. Tian, Weak to Strong Detector Learning for Simultaneous Classification and Localization, IEEE TCSVT, 2018.
X. Zhang, H. Xiong, W. Zhou, W. Lin and Q. Tian, Picking Neural Activations for Fine-grained Recognition, IEEE TMM, 2017.
X. Zhang, H. Xiong, W. Zhou, and Q. Tian, Fused One-vs-All Features with Semantic Alignments for Fine-Grained Visual Categorization, IEEE TIP, 2016.