I work at the intersection of Computer Vision, Signal Processing, Graphics, and Machine Learning, and on their applications. This covers a broad range of topics, from raw signal recovery and 3D reconstruction to high-level visual understanding (segmentation, recognition, agentic reasoning and planning, etc.).
In spirit, my research can be divided into two aspects:
1. Learning powerful priors
My research focuses on learning strong, data-driven priors. Past and ongoing work has dealt with image recovery (e.g., image super-resolution and deblurring), 3D reconstruction (e.g., feature matching and 3D foundation models), and recognition (e.g., human identification and tumor segmentation). Currently, this line of research incorporates methods such as Generative Modeling, Transformer-based architectures, and Neural Architecture Search.
2. Adapting to the real world
My research also focuses on algorithms that work without priors or that adapt priors to specific observations. This includes 3D neural rendering and reconstruction methods, e.g., ray tracing in NeRF and CT imaging, and rasterization with 3DGS. It also includes test-time adaptation methods for super-resolution, 3D reconstruction, etc.
More recently, I have been thinking about how visual reconstruction and understanding can be applied in embodied settings.
I constantly balance my interest in fundamental AI advancements with my pragmatic interest in how AI algorithms can be used in the real world. So far, I have successfully delivered several major governmental efforts that have been transitioned to real-world use cases. From an application perspective, my research is highly relevant to construction, robotic surgery, medical imaging, satellite imaging, sports and human analysis, etc.
Prospective PhD: I am particularly interested in Ph.D. applicants who have strong backgrounds in Computer Vision, Computational Imaging, and coding. Current emphasis is given to the following directions:
3D/4D Foundation Models.
3D Reconstruction/Surface Reconstruction/Novel View Synthesis/Camera Calibration.
Image/Video/3D Generative Modeling (World Models).
Medical imaging and robotic applications.