Research
Select Projects
Temporal Context Network for Activity Localization in Videos (ICCV 2017) [Code]
A Faster R-CNN-like proposal and detection architecture
Proposal anchors spanning multiple temporal scales are placed at equal intervals in a video
Pair-wise input sampling to capture context
Temporal convolution is used to learn temporal consistency and the positions of boundaries
Top-3 method in the ActivityNet 2017 Challenge for Action Proposals
We demonstrate that building a representation by sampling from a pair of scales, which explicitly captures context around proposal anchors, is important for precise temporal localization.
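The anchor placement and pair-of-scales sampling described above can be illustrated with a minimal sketch. This is not the released code: the function names, the stride, the scale values, and the context factor of 2.0 are all assumptions chosen for illustration. It places anchors of several lengths at equally spaced centers, then samples time points from both the anchor and a wider context window around it.

```python
def make_anchors(video_len, stride, scales):
    """Place anchors at equally spaced centers, one per temporal scale.

    Returns (start, end) tuples clipped to [0, video_len].
    All parameter values used here are illustrative assumptions.
    """
    anchors = []
    center = stride / 2.0
    while center < video_len:
        for scale in scales:
            start = max(0.0, center - scale / 2.0)
            end = min(float(video_len), center + scale / 2.0)
            anchors.append((start, end))
        center += stride
    return anchors


def context_window(anchor, video_len, factor=2.0):
    """Widen an anchor around its center to obtain a context segment.

    The factor of 2.0 is a hypothetical choice, not a value from the paper.
    """
    start, end = anchor
    center = (start + end) / 2.0
    half = (end - start) * factor / 2.0
    return (max(0.0, center - half), min(float(video_len), center + half))


def sample_points(segment, n):
    """Return n evenly spaced sample times (bin midpoints) inside a segment."""
    start, end = segment
    step = (end - start) / n
    return [start + (i + 0.5) * step for i in range(n)]


def pairwise_sample(anchor, video_len, n):
    """Sample n points from the anchor and n from its context window."""
    return (sample_points(anchor, n),
            sample_points(context_window(anchor, video_len), n))


if __name__ == "__main__":
    anchors = make_anchors(video_len=100, stride=10, scales=[8, 16])
    inner, outer = pairwise_sample(anchors[0], video_len=100, n=4)
    print(len(anchors), inner, outer)
```

In a full system the sampled time points would index per-frame or per-snippet features, and the two sampled sequences would be fed to the temporal convolution layers; that part is omitted here.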
Efficient Fine-grained Classification and Part Localization Using One Compact Network (ICCVW 2017)
We propose a novel multi-task deep learning and fusion architecture that has both shared and dedicated convolutional layers for simultaneous part labeling and make-model classification.
The accuracy of our approach is competitive with state-of-the-art methods in both the car and bird domains.
Our network architecture is more compact (30M parameters) and runs much faster (78 FPS) than competitors, enabling real-time mobile applications.
FASON: First and Second Order Information Fusion Network (CVPR 2017)
We design a deep fusion architecture that effectively combines second-order information (calculated from a bilinear model) and first-order information (preserved through our leaking shortcut) in an end-to-end deep network.
Our architecture enables more effective learning.
We extend our fusion architecture to take advantage of the multiple features from different convolution layers.
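To make the distinction between first- and second-order information concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the FASON implementation: first-order information is taken as average-pooled local descriptors, second-order information as the averaged outer product of those descriptors (a basic bilinear pooling), and the fusion is simple concatenation. The leaking shortcut and the multi-layer extension are not reproduced, and all function names are hypothetical.

```python
def first_order(features):
    """Average-pool C-dimensional local descriptors over all locations."""
    n = len(features)
    c = len(features[0])
    return [sum(f[k] for f in features) / n for k in range(c)]


def second_order(features):
    """Bilinear pooling: mean of outer products f f^T, flattened to C*C values."""
    n = len(features)
    c = len(features[0])
    return [sum(f[i] * f[j] for f in features) / n
            for i in range(c) for j in range(c)]


def fuse(features):
    """Concatenate first- and second-order descriptors (assumed fusion rule)."""
    return first_order(features) + second_order(features)


if __name__ == "__main__":
    # Two spatial locations, each with a 2-dimensional descriptor.
    feats = [[1.0, 2.0], [3.0, 4.0]]
    print(fuse(feats))
```

For C-dimensional descriptors the fused vector has C + C*C entries, which shows why second-order features dominate the representation size and why preserving the first-order signal separately is a deliberate design choice.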