Sibo Zhang (Michael) 

Baidu USA, Senior Research Engineer

Generative AI, Computer Vision, 

Robotics and Autonomous Driving

Senior Research Engineer with 6+ years of experience in Generative AI, Machine Learning, Computer Vision, Autonomous Driving, and Robotics.

I have been working at Baidu Research USA since February 2018. I received my master's degree in Computer Engineering from the University of Illinois at Urbana-Champaign (UIUC). Before UIUC, I received my bachelor's degree in Electrical Engineering and Computer Science from the University of Minnesota, Twin Cities (UMN).

Published 10+ papers in top AI conferences and journals, including AAAI 2019 (Oral), ECCV 2020, ACCV 2020, ISARC 2021 (Oral), ICASSP 2022, and the Construction Robotics journal. Published 15+ US/EU/CN/PCT patents. Google Scholar citations: 600+ (as of June 2024).

I'm open to new opportunities, so feel free to reach out to me. I'd love to chat with you!


9. Sibo Zhang, Liangjun Zhang. Construction Site Safety Monitoring and Excavator Activity Analysis System. Construction Robotics Journal 2022

[PDF] [Project Page]  

8. Sibo Zhang, Jiahong Yuan, Miao Liao, Liangjun Zhang. Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022). [PDF] [Project Page] [Demo Video] [Github]

7. Sibo Zhang, Liangjun Zhang. Vision-based Excavator Activity Analysis and Safety Monitoring System. 38th International Symposium on Automation and Robotics in Construction (ISARC 2021 (Oral)). (6/170+)  [PDF] [Project Page] [10 min Presentation]

6. Miao Liao*, Sibo Zhang*, Peng Wang, Hao Zhu, and Ruigang Yang. Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses. Asian Conference on Computer Vision (ACCV 2020). [PDF] [Project Page] [Result Video] [1 min Spotlight] [10 min Presentation] [Github]

Media news (in Chinese): Speech-driven 3D virtual humans: a walkthrough of Baidu's latest ACCV 2020 paper

5. Miao Liao, Feixiang Lu, Dingfu Zhou, Sibo Zhang, Wei Li, Ruigang Yang. "DVI: Depth Guided Video Inpainting for Autonomous Driving". European Conference on Computer Vision (ECCV 2020). [PDF] [Project Page] [Result Video] [10 min Presentation] [Inpainting Dataset] [Github]

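Media news (in Chinese): ECCV 2020 accepted papers announced: Baidu AI has 10 papers accepted, covering a wide range of research areas

Achieving the strongest street-view simulation for autonomous driving: a walkthrough of Baidu's ECCV 2020 video inpainting paper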

4. Sibo Zhang, Yuexin Ma, Ruigang Yang, Xin Li, Yanliang Zhu, Deheng Qian, Zetong Yang, Wenjing Zhang, Yuanpei Liu. CVPR 2019 WAD Challenge on Trajectory Prediction and 3D Perception (arXiv). [PDF] [Website]  

3. Yuexin Ma, Xinzhe Zhu, Sibo Zhang, Ruigang Yang, Wenping Wang, Dinesh Manocha. "TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents." The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019 (Oral)). (acceptance rate: 16%) [PDF] [Project Page] [Webpage] [Dataset] [BibTeX] [Video]

Media news (in Chinese): Baidu appears at the top AI conference AAAI 2019 with 15 papers accepted

2. Zhang, Qichao, Dongbin Zhao, and Sibo Zhang. "Off-Policy Reinforcement Learning for Partially Unknown Nonzero-Sum Games." International Conference on Neural Information Processing (ICONIP). Springer, Cham, 2017. [PDF]

1. Sibo Zhang, Yuan Cheng, Deyuan Ke. Event-Radar: Real-time Local Event Detection System for Geo-Tagged Tweet Streams (arXiv).  [PDF]


4. Sibo Zhang, Liangjun Zhang. VISION-BASED EQUIPMENT ACTIVITY ANALYSIS AND SAFETY MONITORING. BN210719USN1 | US63/233,146. Publication Date: 2023/03/09. [US Patent]   

3. Sibo Zhang, Jiahong Yuan, Miao Liao, Liangjun Zhang. TEXT-DRIVEN VIDEO SYNTHESIS WITH PHONETIC DICTIONARY. BN210126USN1 | US17/221,701. Publication Date: 2021/12/15. [US Patent]

2. Miao Liao*, Sibo Zhang*, Peng Wang, Hao Zhu, and Ruigang Yang. PERSONALIZED SPEECH-TO-VIDEO WITH THREE-DIMENSIONAL (3D) SKELETON REGULARIZATION AND EXPRESSIVE BODY POSES. BY200327USN8-PCT | PCT/CN2020/095891, US16/980,373. Publication Date: 2021/12/15. [US Patent]   

1. Miao Liao, Feixiang Lu, Dingfu Zhou, Sibo Zhang, Wei Li, Ruigang Yang. DEPTH-GUIDED VIDEO INPAINTING FOR AUTONOMOUS DRIVING. PCT/CN2020/092390, US16/770,904. Publication Date: 2021/12/01. [US Patent]            

Professional Activities

[2020] Organized Kaggle PKU/Baidu Workshop on Autonomous Driving. [Kaggle Link]

[2019] Organized the "Workshop on Autonomous Driving - Beyond Single-Frame Perception" talk at ICCV 2019. [ICCV 2019 Workshop]

[2019] Organized the "Workshop on Autonomous Driving - Beyond Single-Frame Perception" at CVPR 2019. Our challenge had three tasks: Trajectory Prediction, 3D LiDAR Object Detection, and 3D LiDAR Object Tracking. For more information, see: [WAD 2019 challenge]

The ApolloScape Trajectory dataset and the 3D LiDAR Detection and Tracking dataset were released. More than 2,000 participants attended the workshop, 50,000 people downloaded the ApolloScape dataset, and more than 1,400 teams submitted results to the leaderboard. Published a toolkit for the ApolloScape dataset on GitHub.

[2018] Organized CVPR 2018 Workshop on Autonomous Driving. [WAD 2018 challenge]

Reviewer for Journals 

Reviewer for Conferences