Research

My research interests include:

  • Machine Learning: classification, clustering, feature selection
  • Image/Video Processing: image restoration, object segmentation, video coding
  • Computer Vision: object recognition, video tracking, video surveillance, 3D reconstruction

PARC East (formerly known as Xerox Research Center Webster)

Computer Vision Research | Mentor: Dr. Raja Bala

Vehicle Driver Monitoring based on Gaze Estimation

06/2013─08/2013

We introduce a portable, camera-based driver monitoring framework designed to reduce traffic accidents caused by driver drowsiness and distraction. Input video frames from a smartphone camera facing the driver are first processed to estimate a coarse head pose direction. Next, the locations and scales of face parts (mouth, eyes, and nose) define a feature descriptor that is supplied to an SVM gaze classifier, which outputs one of eight common driver gaze directions.

A key novel aspect is an in-situ approach for gathering training data that improves generalization performance across drivers, vehicles, devices, and capture geometry. Experimental results show that the framework achieves high gaze direction estimation accuracy in practical scenarios with different drivers, vehicles, phones, and camera mounting locations.

[Published in CVPR Workshops 2014]

University of Washington

Information Processing Lab, EE | Adviser: Prof. Jenq-Neng Hwang

Segmentation and Tracking for Moving Cameras

06/2014─present

Tracking objects from moving cameras is difficult because background modeling is inapplicable and object perspective and scale vary widely. In this work, a novel deformable multiple-kernel tracking algorithm is proposed to address these challenges. Adopting the deformable part model (DPM) object detector, a set of kernels is defined to represent the holistic object and several of its parts in terms of color histogram, texture histogram, and HOG features. Kernel positions are optimized by the mean-shift procedure on each feature to realize tracking. The deformation costs from the DPM constrain the part configuration as soft constraints. Experimental results show that the proposed method accurately tracks live fish from moving underwater cameras.
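As a rough illustration of the mean-shift step only, the sketch below moves a single kernel toward the weighted centroid of the samples inside its window. It is not the published tracker: the function name `mean_shift`, the flat circular window, and the per-sample weights are assumptions made for this example, and the multiple kernels and DPM deformation costs are left out.

```python
import math


def mean_shift(points, weights, start, bandwidth, max_iter=50, tol=1e-3):
    """Shift a 2D kernel center to the weighted mean of nearby samples.

    points    : list of (x, y) sample positions (e.g. pixel coordinates)
    weights   : per-sample weights (hypothetical; a tracker would derive
                them from the feature-histogram similarity)
    start     : initial (x, y) kernel center
    bandwidth : radius of the flat window around the center
    """
    x, y = start
    for _ in range(max_iter):
        sum_w = sum_x = sum_y = 0.0
        for (px, py), w in zip(points, weights):
            # Only samples inside the window contribute to the update.
            if (px - x) ** 2 + (py - y) ** 2 <= bandwidth ** 2:
                sum_w += w
                sum_x += w * px
                sum_y += w * py
        if sum_w == 0.0:
            break  # empty window: nothing to move toward
        new_x, new_y = sum_x / sum_w, sum_y / sum_w
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < tol:
            break  # converged
    return x, y


# Four samples near the origin and one distant outlier.
samples = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
center = mean_shift(samples, [1.0] * 5, start=(2, 2), bandwidth=3)
```

In a multiple-kernel setting, one such update would presumably run per kernel and per feature, with the part configuration coupling the kernels together.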

Fine-Grained Object Recognition

09/2012─06/2014

A novel feature learning and object recognition framework is proposed to address challenges in practical applications of fine-grained object recognition. We propose an unsupervised learning algorithm that discovers discriminative details of objects via a non-rigid part model. In addition, an unsupervised clustering approach generates a binary class hierarchy, in which the notion of partial classification is introduced to assign coarse labels to ambiguous instances. Experiments show that the system outperforms existing approaches on data with high inter-class similarity, data uncertainty, and class imbalance.

[Published in ICIP 2014, CVAUI 2014]

Camera-based Monitoring for Fishery Conveyor Belts

06/2012─09/2012

We developed a system to automate the counting and isolation of fish on conveyor belts. Given the monitoring video, a systematic marker-driven watershed segmentation separates clustered fish. An automated target selection algorithm is trained to reject irrelevant objects with high discriminative power. Most importantly, we proposed an innovative algorithm, named aggregated segmentation, which provides a means of refining the shape boundary as the objects are tracked across video frames.

Watch demo video

[Published in ICASSP 2013]

Tracking from Low-Contrast and Low-Frame-Rate Stereo Video

09/2010─06/2012

A robust segmentation algorithm is developed that extracts low-contrast objects under poor illumination. A variant of the Viterbi algorithm is proposed for low-frame-rate video data, where abrupt motion and frequent entrances and exits hinder object tracking. The proposed system provides an automatic and reliable solution for NOAA's trawl-based underwater camera system, which is devoted to fisheries surveys.
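To show the general idea of Viterbi-style data association, the sketch below runs the standard Viterbi recursion (not the variant proposed in this project) to find the minimum-cost chain of detections across frames. The function name `viterbi_track`, the candidate lists, and the squared-distance transition cost are assumptions for illustration; entrance and exit handling is not modeled.

```python
def viterbi_track(frames, cost):
    """Pick one candidate per frame so the summed transition cost is minimal.

    frames : list of non-empty candidate lists, one list per video frame
    cost   : function (previous_candidate, candidate) -> transition cost
    """
    best = [0.0] * len(frames[0])  # best cost of a path ending at each candidate
    back_pointers = []
    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for cand in frames[t]:
            costs = [best[i] + cost(prev, cand)
                     for i, prev in enumerate(frames[t - 1])]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min])
            pointers.append(i_min)
        best = new_best
        back_pointers.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back_pointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [frames[t][k] for t, k in enumerate(path)]


def squared_distance(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


detections = [
    [(0, 0), (10, 10)],
    [(1, 1), (11, 11)],
    [(2, 2), (30, 30)],
]
track = viterbi_track(detections, squared_distance)
```

Because the whole sequence is optimized jointly, a single abrupt jump between two frames does not necessarily break the track, which is one reason this family of methods suits low-frame-rate video.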

Watch demo video

[Published in ICIP 2011, ISCAS 2013, IEEE T-CSVT 2015]

National Taiwan University

Media IC & System Lab, GIEE | Adviser: Prof. Shao-Yi Chien

Color Filter Array Demosaicking

08/2008─02/2009

Proposed a color filter array demosaicking algorithm based on the joint bilateral upsampling technique to exploit the high correlation between RGB color channels while preserving object edges in images; suppressed false color artifacts in reconstructed images more effectively than state-of-the-art techniques, as measured by PSNR and the percentage of zipper effect, and produced the most visually acceptable results.

[Published in ICME 2009]

Bandwidth and Local Memory Reduction for Video Encoders

09/2007─06/2008

Proposed a memory management scheme for the reference frame buffer based on the concept of bit plane partitioning (BPPMM), which is especially suitable for motion estimation with bit truncation; reduced on-chip SRAM size and external memory bandwidth with very little quality loss; adopted a reference frame subsampling method to further reduce memory bandwidth; integrated the proposed scheme into an H.264/AVC encoder using the JM 13.0 reference software for simulation.
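The sketch below illustrates the basic idea of bit plane partitioning under simple assumptions; it is not the BPPMM implementation. Each 8-bit reference pixel is split into a most-significant part and a least-significant part, so a bit-truncated motion search only needs to fetch the most-significant part. The 4/4 split and all function names are hypothetical.

```python
def split_bit_planes(pixels, msb_bits=4):
    """Split 8-bit pixels into separately stored MSB and LSB parts."""
    lsb_bits = 8 - msb_bits
    lsb_mask = (1 << lsb_bits) - 1
    msb = [p >> lsb_bits for p in pixels]
    lsb = [p & lsb_mask for p in pixels]
    return msb, lsb


def merge_bit_planes(msb, lsb, msb_bits=4):
    """Rebuild full-precision pixels when both parts are fetched."""
    lsb_bits = 8 - msb_bits
    return [(m << lsb_bits) | l for m, l in zip(msb, lsb)]


def truncated_sad(current, ref_msb, msb_bits=4):
    """Sum of absolute differences using only the MSB part of each pixel."""
    lsb_bits = 8 - msb_bits
    return sum(abs((c >> lsb_bits) - r) for c, r in zip(current, ref_msb))


reference = [0, 255, 200, 17]
msb, lsb = split_bit_planes(reference)          # msb = [0, 15, 12, 1]
restored = merge_bit_planes(msb, lsb)           # equals reference
sad = truncated_sad([200, 17], [msb[2], msb[3]])
```

With this layout, the search stage reads half the bits per pixel, and the remaining bits are read only where full precision is required.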

[Published in ISCAS 2009]

Data Compression and Rendering of Concentric Mosaics

02/2007─06/2007

Investigated image-based rendering with concentric mosaics and various data compression techniques designed for concentric mosaics.

Speech Processing Lab, GICE | Adviser: Prof. Lin-shan Lee

Energy- and Spectral-Domain Voice Activity Detection

09/2006─01/2007

Developed an automatic threshold decision scheme for the energy contour technique and for noise-suppressed spectral entropy-based voice activity detection (NSSE-VAD); implemented various approaches, including endpoint detection, energy contour, and NSSE-VAD; improved the recall and precision of human speech detection in both high- and low-SNR audio clips.
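A minimal sketch of energy-contour voice activity detection follows. The threshold rule here (a fixed fraction `alpha` of the way from the minimum to the maximum frame energy) is a placeholder assumption, not the automatic threshold decision scheme developed in this project, and the function names are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Short-time energy of consecutive non-overlapping frames."""
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def auto_threshold(energies, alpha=0.1):
    """Placeholder rule: a point between the quietest and loudest frame."""
    lowest, highest = min(energies), max(energies)
    return lowest + alpha * (highest - lowest)


def detect_speech(samples, frame_len=4, alpha=0.1):
    """Return one boolean per frame: True where energy exceeds the threshold."""
    energies = frame_energies(samples, frame_len)
    threshold = auto_threshold(energies, alpha)
    return [e > threshold for e in energies]


# Silence, a short burst, then silence again.
signal = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.0] * 4
flags = detect_speech(signal)
```

A spectral-entropy detector would replace the frame energy with the entropy of each frame's normalized spectrum, while keeping the same thresholding structure.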