Experience

Ph.D. Researcher, Multimedia Group, IMEC, Leuven, Belgium, Oct 2004 - Aug 2009
- Conducted cross-disciplinary, innovative research on next-generation multi-view multimedia applications and systems
- Researched and developed several key technologies fundamental to a variety of advanced multi-view video applications, ranging from scene analysis, representation, and compression to interactive rendition
- Served as a key contributor who pioneered, coordinated, and steered IMEC's research in stereo vision and view synthesis
- Proposed, explored and validated a set of high-level principles for efficient visual computing algorithm design, collectively called A^3 design principles, namely, Application-Driven, Architecture-Aware, Algorithm Design and Optimization
Research Intern, Media Communication Group, Microsoft Research Asia, Beijing, China, Oct 2005 - Jan 2006
- Researched novel image-based representation and rendering techniques for multi-view video systems
- Developed an effective epipolar geometry-based fast disparity search for multi-view image/video coding
- Contributed to one patent for the real-time interactive multi-view video system project: http://research.microsoft.com/en-us/projects/imv/default.aspx
Architecture Design Engineer, Algorithm and Architecture Group, VIA-S3 Graphics, Shanghai, China, Apr 2003 - Aug 2004
- Played an active role in next-generation GPU architecture design, focusing especially on the GPU's memory hierarchy design, modeling, and optimization
- Proposed a novel and efficient system-level modeling approach for graphics pipeline implementation verification, GPU architecture exploration, and performance analysis
- Took the lead in integrating, verifying, and debugging a variety of GPU architecture constructs based on the framework I proposed
- Contributed 5 internal technical proposals and reports, and 2 comprehensive sets of training slides
- Won the slogan competition with "Rendering your dream", and received all-excellent evaluations in the year-end performance review
Research Intern, Internet Media Group, Microsoft Research Asia, Beijing, China, May 2002 - Sep 2002
- Researched and developed a low-complexity, efficient video compression method for video communication on mobile devices
- Integrated this video codec into Microsoft Portrait version 1.6 (Sep 2002), the first research prototype for two-way full-color video communication on Pocket PC. More details are available at: http://research.microsoft.com/en-us/projects/portrait/
- The codec, which I developed in C++, later became the basis for other projects
Research Assistant, Dept. of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China, Aug 2001 - Mar 2003
- Participated in a project with Motorola Shanghai Lab: "Facial model-based low bit-rate video coding"
- Researched MPEG-4 compatible facial animation systems with text-to-speech (TTS) support, including TTS-driven facial animation, virtual facial image synthesis, co-articulation, and action blending
- Developed 2 prototype systems: i) "Grimace VTTS", targeting a live virtual newscaster with synthetic facial animation and speech, and ii) "Grimace CHAT", an instant virtual visual communication application for ultra-low bandwidth, supporting real-time synthetic talking facial animation
- Prototype systems were demonstrated at the Euro-China Digital Olympics Workshop, Beijing, April 2003
Research Assistant, Dept. of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China, Sep 1999 - Jul 2001
- Participated in the research project: "The technique development of the system chip used for MPEG-2 decoding and receiving programs from the 8VSB channel"
- Implemented and optimized the core part of a Dolby AC-3 audio decoder, and developed a real-time Winamp-like AC-3 player
- Proposed an efficient S/PDIF digital audio interface implementation on FPGAs, and performed the RISC core redesign, simulation, and optimization for decoding AC-3 audio streams

My previous research topics (some are still of keen interest to me) primarily include:

Stereo matching and depth estimation
- Accurate and efficient stereo correspondence
- Robust stereo matching in the presence of radiometric differences
- Real-time accurate stereo on GPUs
- High-speed stereo matching on GPUs
- Complexity-scalable stereo matching
View synthesis and image-based rendering
- Depth image-based view synthesis and occlusion handling on GPUs
- Quality-complexity scalable view synthesis methods
- Quality metric and assessment for stereo and view interpolation
- Stereo-based view synthesis framework and end-to-end performance evaluation on GPUs
Locally adaptive image approximation techniques and their applications
- Point-wise adaptive polygon approach
- Point-wise variable cross approach
- Applications in stereo matching, image denoising, image abstraction, structure-preserving smoothing, etc.
Video coding, processing, communication, and video conferencing
- Video coding and transmission for real-time video conferencing over the lossy Internet
- Motion region-of-interest detection and perceptual quality-driven advanced video coding (Online briefing)
- Practical real-time full-color video codec for mobile devices
- Facial model-based visual communication and TTS-driven talking head animation
Multi-view image and video coding
- Epipolar-geometry based fast disparity/motion estimation
- Fast illumination compensation and its seamless integration into the video coding framework
- (Multi-view) video coding standards (Online tutorial)
Algorithm-architecture co-design and exploration
- GPU architecture design, transaction-level modeling, and performance evaluation
- Media processor design for AC-3 audio decoding