Experience

  • Ph.D. Researcher, Multimedia Group, IMEC, Leuven, Belgium, Oct 2004 - Aug 2009

    • Conducted cross-disciplinary research on next-generation multi-view multimedia applications and systems

    • Researched and developed several key technologies fundamental to a variety of advanced multi-view video applications, ranging from scene analysis, representation, and compression to interactive rendition

    • Pioneered, coordinated, and steered IMEC's research in stereo vision and view synthesis as a key contributor

    • Proposed, explored, and validated a set of high-level principles for efficient visual computing algorithm design, collectively called the A^3 design principles: Application-Driven, Architecture-Aware Algorithm Design and Optimization

  • Research Intern, Media Communication Group, Microsoft Research Asia, Beijing, China, Oct 2005 - Jan 2006

    • Researched novel image-based representation and rendering techniques for multi-view video systems

    • Developed an effective epipolar geometry-based fast disparity search for multi-view image/video coding

    • Contributed to one patent for the real-time interactive multi-view video system project: http://research.microsoft.com/en-us/projects/imv/default.aspx

  • Architecture Design Engineer, Algorithm and Architecture Group, VIA-S3 Graphics, Shanghai, China, Apr 2003 - Aug 2004

    • Played an active role in next-generation GPU architecture design, focusing on the design, modeling, and optimization of the GPU memory hierarchy

    • Proposed a novel and efficient system-level modeling approach for graphics pipeline implementation verification, GPU architecture exploration, and performance analysis

    • Took the lead in integrating, verifying, and debugging a variety of GPU architecture constructs based on the framework I proposed

    • Contributed 5 internal technical proposals and reports, and 2 comprehensive sets of training slides

    • Won the slogan competition with "Rendering your dream", and received all-excellent evaluations in the year-end performance review

  • Research Intern, Internet Media Group, Microsoft Research Asia, Beijing, China, May 2002 - Sep 2002

    • Researched and developed a low-complexity, efficient video compression method for video communication on mobile devices

    • Integrated this video codec into Microsoft Portrait version 1.6 (Sep 2002), the first research prototype for two-way full-color video communication on Pocket PC. More details are available at: http://research.microsoft.com/en-us/projects/portrait/

    • The codec, which I developed in C++, later became the basis for other projects

  • Research Assistant, Dept. of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China, Aug 2001 - Mar 2003

    • Participated in a project with Motorola Shanghai Lab: "Facial model-based low bit-rate video coding"

    • Researched MPEG-4 compatible facial animation systems with text-to-speech (TTS) support, including TTS-driven facial animation, virtual facial image synthesis, co-articulation, and action blending

    • Developed 2 prototype systems: i) "Grimace VTTS", targeting a live virtual newscaster with synthetic facial animation and speech, and ii) "Grimace CHAT", an instant virtual visual communication application for ultra-low bandwidth, supporting real-time synthetic talking facial animation

    • Demonstrated the prototype systems at the Euro-China Digital Olympics Workshop, Beijing, April 2003

  • Research Assistant, Dept. of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China, Sep 1999 - Jul 2001

    • Participated in the research project: "The technique development of the system chip used for MPEG-2 decoding and receiving programs from the 8VSB channel"

    • Implemented and optimized the core part of a Dolby AC-3 audio decoder, and developed a real-time Winamp-like AC-3 player

    • Proposed an efficient S/PDIF digital audio interface implementation on FPGAs, and performed the RISC core redesign, simulation, and optimization for decoding AC-3 audio streams


My previous research topics (some are still of keen interest to me) primarily include:

  • Stereo matching and depth estimation

    • Accurate and efficient stereo correspondence

    • Robust stereo matching in the presence of radiometric differences

    • Real-time accurate stereo on GPUs

    • High-speed stereo matching on GPUs

    • Complexity-scalable stereo matching

  • View synthesis and image-based rendering

    • Depth image-based view synthesis and occlusion handling on GPUs

    • Quality-complexity scalable view synthesis methods

    • Quality metrics and assessment for stereo and view interpolation

    • Stereo-based view synthesis framework and end-to-end performance evaluation on GPUs

  • Locally adaptive image approximation techniques and their applications

    • Point-wise adaptive polygon approach

    • Point-wise variable cross approach

    • Applications in stereo matching, image denoising, image abstraction, structure-preserving smoothing, etc.

  • Video coding, processing, communication, and video conferencing

    • Video coding and transmission for real-time video conferencing over the lossy Internet

    • Motion region-of-interest detection and perceptual quality-driven advanced video coding (Online briefing)

    • Practical real-time full-color video codec for mobile devices

    • Facial model-based visual communication and TTS-driven talking head animation

  • Multi-view image and video coding

    • Epipolar-geometry based fast disparity/motion estimation

    • Fast illumination compensation and its seamless integration into the video coding framework

    • (Multi-view) video coding standards (Online tutorial)

  • Algorithm-architecture co-design and exploration

    • GPU architecture design, transaction-level modeling, and performance evaluation

    • Media processor design for AC-3 audio decoding