@SIGGRAPH 24 Big Bear
I am an assistant professor at Tsinghua University.
I was a research scientist at Intel Labs China.
I was a joint postdoc affiliated with Peking University.
I received my Ph.D. and bachelor's degrees from the Department of Electronic Engineering at Tsinghua University.
I am proud to have been a leader of Skyworks (天空工场), the largest robotics club at THU and an amazing geek utopia.
I am generally interested in any computer vision field related to robotics, especially 3D scene understanding.
I have been a serial entrepreneur since 2009, co-launching 10+ startups in social networks, web development tools, unmanned aerial vehicles, intelligent delivery, smart grids, VR devices, virtual humans, cloud design, autonomous driving, smart manufacturing, and more.
Recruitment
I am actively looking for Postdocs to join my Lab.
I am casually looking for Research Interns.
I select Ph.D. students from my research intern pool.
Contact me at zhaohao@air.tsinghua.edu.cn
Publications
[coming soon] Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences, WACV 2025
[coming soon] Diffusion-based Visual Anagram as Multi-task Learning, WACV 2025
[coming soon] Dual-frame Fluid Motion Estimation with Test-time Optimization and Zero-divergence Loss, NeurIPS 2024
[coming soon] Locate n' Rotate: Two-stage Openable Part Detection with Geometric Foundation Model Priors, ACCV 2024
[coming soon] Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving, CoRL 2024
[coming soon] P-MapNet: Far-seeing map generator enhanced by both SDMap and HDMap priors, RA-L 2024
[coming soon] Inverse Rendering of Outdoor Scenes under Time-variant Illumination, BMVC 2024
[coming soon] Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty, BMVC 2024
[coming soon] Model Merging for Multi-target Domain Adaptation, ECCV 2024
[coming soon] Structured-NeRF: Hierarchical Scene Graph with Neural Representation, ECCV 2024
[coming soon] SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior, ECCV 2024
[coming soon] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes, ECCV 2024
[coming soon] Active Neural Mapping at Scale, IROS 2024
[coming soon] Large Language Models Powered Context-aware Motion Prediction, IROS 2024
[coming soon] PreAfford: An Affordance-based Pre-grasping Framework with high Adaptability, IROS 2024
[coming soon] Blending Distributed NeRFs with Tri-stage Robust Pose Optimization, IROS 2024
[coming soon] FairDiff: Fair Segmentation with Point-Image Diffusion, MICCAI 2024
Rip-NeRF: Anti-aliasing Radiance Fields with Ripmap-Encoded Platonic Solids, SIGGRAPH 2024
Encoding biological metaverse: Advancements and challenges in neural fields from macroscopic to microscopic, The Innovation 2024 (IF: 33.1)
Adaptive Surface Normal Constraint for Geometric Estimation From Monocular Images, T-PAMI 2024 (In-the-wild depth and normal, https://www.xxlong.site/ASNDepth/)
Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration, CVPR 2024
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis, CVPR 2024
FastMAC: Stochastic Spectral Sampling of Correspondence Graph, CVPR 2024
ECT: Fine-grained Edge Detection with Learned Cause Tokens, IVC 2024
Camera Relocalization in Shadow-free Neural Radiance Fields, ICRA 2024
MonoOcc: Digging into Monocular Semantic Occupancy Prediction, ICRA 2024
Block-Map-Based Localization in Large-Scale Environment, ICRA 2024
Car-Studio: Learning Car Radiance Fields from Single-View and Unlimited In-the-wild Images, RA-L 2024
SlimmeRF: Slimmable Radiance Fields, 3DV 2024 (Best Paper, https://github.com/Shiran-Yuan/SlimmeRF)
PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection, NeurIPS 2023
MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving, CICAI 2023 (Best Paper Runner-up, https://open-air-sun.github.io/mars/ )
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection, ICCV 2023
City-scale continual neural semantic mapping with three-layer sampling and panoptic representation, KBS 2023
INT2: Interactive Trajectory Prediction at Intersections, ICCV 2023
3D Implicit Transporter for Temporally Consistent Keypoint Discovery, ICCV 2023
Understanding Embodied Reference with Touch-Line Transformer, ICLR 2023
From Semi-supervised to Omni-supervised Room Layout Estimation Using Point Clouds, ICRA 2023
ADAPT: Action-aware Driving Caption Transformer, ICRA 2023
LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR, ICRA 2023
STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation, ICRA 2023
Delving into Shape-aware Zero-shot Semantic Segmentation, CVPR 2023
DPF: Learning Dense Prediction Fields with Weak Supervision, CVPR 2023
Planning assembly sequence with graph transformer, ICRA 2023
LATITUDE: Robotic Global Localization with Truncated Dynamic Low-pass Filter in City-scale NeRF, ICRA 2023
Unsupervised Road Anomaly Detection with Language Anchors, ICRA 2023
TOIST: Task oriented instance segmentation transformer with noun-pronoun distillation, NeurIPS 2022
SNAKE: Shape-aware Neural 3D Keypoint Field, NeurIPS 2022
A boundary-guided transformer for measuring distance from rectal tumor to anal verge on magnetic resonance images, Cell Patterns 2023
Language-guided Semantic Style Transfer of 3D Indoor Scenes, PIES-ME 2022
Measuring distance from lowest boundary of rectal tumor to anal verge on CT images using pyramid attention pooling transformer, CIBM 2023
VIBUS: Data-efficient 3D scene parsing with VIewpoint Bottleneck and Uncertainty-Spectrum modeling, ISPRS 2022
SC-wLS: Towards interpretable feed-forward camera re-localization, ECCV 2022
Distance-Aware Occlusion Detection With Focused Attention, T-IP 2022
Brick Yourself within 3 Minutes, ICRA 2022
Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing, CVPR 2022
PQ-Transformer: Jointly parsing 3D objects and layouts from point clouds, RA-L&ICRA 2022
Pointly-supervised scene parsing with uncertainty mixture, CVIU 2020
3D room layout estimation from a single RGB image, T-MM 2020
Seeing Through the Occluders: Robust Monocular 6-DOF Object Pose Tracking via Model-Guided Video Object Segmentation, RA-L&IROS 2020
Learning to Draw Sight Lines, IJCV 2019
Deeply-supervised knowledge synergy, CVPR 2019
A closed-form solution to universal style transfer, ICCV 2019
Efficient semantic scene completion network with spatial group convolution, ECCV 2018
Decoder network over lightweight reconstructed feature for fast semantic style transfer, ICCV 2017
Network sketching: Exploiting binary structure in deep CNNs, CVPR 2017
Physics inspired optimization on semantic transfer features: An alternative method for room layout estimation, CVPR 2017
Other Stuff
Talk Show @ ASPARA 2020
Young Scientist Representative for Intel 2021
BUCEA-Pinlan AI+Design Seminar
Implicit Representation Seminar
Welcome to our ICRA 2022 Sim2Real Challenge
Intel China TikTok Campaign 2022
Seeing old and new friends @ Bytedance AI
Proud to have several (many?) papers accepted to ICRA 2023, on 3D scene understanding and its applications.
CCL 2023 Tutorial on LLM for robotics
Seeing old and new friends @ THUEE
Presenting ADAPT at FISITA Forum
#Embodied intelligence's use... - @环球人物杂志 (Global People magazine) on Weibo (weibo.com)
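"Dialogue with Scientists": How Embodied Intelligence Shapes the Future - Toutiao (toutiao.com)
"Dialogue with Scientists": How Embodied Intelligence Shapes the Future | Zhao Hao | Large Models | Artificial Intelligence | Bionic Robots - NetEase Subscriptions (163.com)
[Yidian Zixun] "Dialogue with Scientists": How Embodied Intelligence Shapes the Future www.yidianzixun.com
"Dialogue with Scientists": How Embodied Intelligence Shapes the Future (peopleapp.com)
"Dialogue with Scientists": How Embodied Intelligence Shapes the Future - China.com (china.com)
"Dialogue with Scientists": How Embodied Intelligence Shapes the Future - Domestic News - Global People Online, a people-focused site with warmth (globalpeople.com.cn)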
Thrilled and proud.
(Honored to Give This) Guest Lecture
Talk @ CARIAD China
A gift from life (from fate?)
CCDM 2024
ChinaGraph 2024 Tutorial
CIRAC 2024 Talk
This is ARTS
Visiting Shanghai AI Lab
Visiting teleAI
Detailed Agenda | China Unmanned Driving Equipment Application Innovation Ecosystem Conference (qq.com) WAIC-AD Forum
Embodied AI Forum
PreAfford !!!
Visiting Ant Research
VENUE@ECCV 2024
Visiting BUAA