Xi (Stephen) Chen

Xi (Stephen) Chen, Ph.D.

陈曦

Manager, Machine Learning Engineering

Ph.D. in Computer Vision, University of Maryland

Xi (Stephen) Chen specializes in computer vision, multimodal vision-language understanding, and machine learning research and products. He received his Ph.D. in Computer Science from the University of Maryland (UMD) in 2015, advised by Prof. Larry Davis. He has 12+ years of research experience in computer vision and machine learning. His interests lie in Computer Vision and Pattern Recognition (esp. Object Detection and Recognition), Deep Learning, and its applications.

Research Interests

Computer Vision: Object/Human Detection, Object Recognition, Semantic/Instance Segmentation
Vision Language: Multi-modal vision/language representation training
Machine Learning: Deep Learning and applications
Visual Search and Recommendation: Applying computer vision and machine learning to image/visual search and recommendation experiences

Education

PhD in Computer Science, Department of Computer Science, University of Maryland
- Advisor: Prof. Larry S. Davis
- Thesis: Context Driven Scene Understanding
- Research Interests: Computer Vision, Machine Learning, Image Understanding, Deep Learning
Bachelor in Engineering, Department of Computer Science, Zhejiang University
- Advisor: Prof. Qunsheng Peng

News

- Our paper "Scala: Slicing Vision Transformer for Flexible Inference" has been accepted by NeurIPS 2024. [Link][Arxiv]
- Our TinyCLIP paper has been accepted by ICCV 2023. Thanks all collaborators!
- Our paper "Dissecting Deep Metric Learning Losses for Image-Text Retrieval" has been accepted by WACV 2023. [Link]
- Our paper "Web-Scale Generic Object Detection at Microsoft Bing" has been accepted by KDD 2021. [Arxiv][PDF][ACM DL]
- Our paper "Stacked Cross Attention for Image-Text Matching" has been accepted by ECCV 2018.
- Our paper "Web-Scale Responsive Visual Search at Bing" has been accepted by KDD 2018.

Thesis Research Projects (Selected)

Computer Vision (Context-driven Object Detection and Recognition):

Jigsaw Puzzle: Piecing Together the Segmentation Jigsaw using Context (PDF)(Bibtex). CVPR 2011, Colorado.

Xi Chen, Arpit Jain, Abhinav Gupta, Larry Davis. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2011, Colorado Springs, USA.
Highlights: We propose an approach for semantic segmentation and object recognition in a single optimization procedure that utilizes context to perform segment selection and labeling coherently.

Scene Understanding in 20 Questions: Searching for Objects in 20 Questions.

Xi Chen, He He, Larry Davis. Object Detection in 20 Questions. Accepted, IEEE WACV 2016, New York, USA (PDF).
Xi Chen, He He, Larry Davis. Searching for Objects in 20 Questions (PDF). SUNw CVPR 2015, Boston, MA.
Highlights: Proposed a new general algorithm to handle the task of object detection and instance segmentation in complex scenes using a divide-and-conquer approach. It dynamically poses questions about related context cues to sequentially narrow down the search space in order to both improve detection accuracy and reduce computation compared to exhaustive search.

Object Co-labeling in Multiple Images (PDF). WACV 2014, Steamboat Springs, Colorado.

Goal: To jointly annotate multiple images of the same or related scenes that lack temporal consistency, using information from the other images of the scene to improve recognition performance.

Computer Graphics (Simulation of Physical Phenomena/GPGPU):

Real-time Simulation of Fluid with Adaptive SPH

Click Here to Watch a Video Demo

Goal: To simulate large-scale fluid scenes that interact with complex boundaries, with realistic effects, in real time.

Helped design and implement a stable adaptive SPH (Smoothed Particle Hydrodynamics) particle system on the GPU, which can handle up to one million particles simulating fluid in real time, taking full advantage of the speed of GPGPU and the programmability of CUDA.

Result: A robust GPGPU-based system that can simulate up to 1 million particles in real time (>15 fps) with realistic effects (breaking waves, splashes). The demo can be viewed by clicking the link on the figure.

Publication

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance. Kan Wu, Houwen Peng, Zhenghong Zhou, Bin Xiao, Mengchen Liu, Lu Yuan, Hong Xuan, Michael Valenzuela, Xi (Stephen) Chen, Xinggang Wang, Hongyang Chao, Han Hu. ICCV 2023, Paris, France. [Arxiv][PDF]
Dissecting Deep Metric Learning Losses for Image-Text Retrieval. Hong Xuan, Xi Chen. IEEE WACV 2023, Hawaii, HI. [Arxiv][PDF]
Web-Scale Generic Object Detection at Microsoft Bing. Stephen Xi Chen, Saurajit Mukherjee, Unmesh Phadke, Tingting Wang, Junwon Park, Ravi Yada. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’21), August 14–18, 2021, Virtual Event, Singapore. [Arxiv][PDF][ACM DL]
Stacked Cross Attention for Image-Text Matching. Kuang-Huei Lee, Xi Chen, Gang Hua, Houdong Hu, Xiaodong He. European Conference on Computer Vision (ECCV), 2018. [Arxiv][Project Page]
Web-Scale Responsive Visual Search at Bing. Houdong Hu, Yan Wang, Linjun Yang, Pavel Komlev, Li Huang, Xi Chen, Jiapei Huang, Ye Wu, Meenaz Merchant, Arun Sacheti. KDD 2018 (Applied Data Science Track). [Arxiv][PDF]
The Role of Context Selection in Object Detection. Ruichi (Rich) Yu, Xi (Stephen) Chen, Vlad I. Morariu, Larry S. Davis. BMVC 2016, York, UK (Oral). [PDF]
Object Detection in 20 Questions. Xi Chen, He He, Larry Davis. IEEE WACV 2016, New York, USA [PDF][Link].
Unsupervised Network Pretraining via Encoding Human Design. IEEE WACV 2016, New York, USA.
Generating Discriminative Object Proposals via Submodular Ranking. Yangmuzi Zhang, Zhuolin Jiang, Xi Chen, Larry S. Davis. Robust Features for Computer Vision, CVPR 2016. [Arxiv][PDF]
Searching for Objects in 20 Questions [PDF]. Xi Chen, He He, Larry Davis. Scene Understanding Workshop CVPR 2015, Boston, MA.
Object Co-Labeling in Multiple Images [PDF]. Xi Chen, Arpit Jain, Larry Davis. IEEE Winter Conference on Applications of Computer Vision (WACV) 2014, Steamboat Springs, Colorado.
Piecing Together the Segmentation Jigsaw using Context [PDF]. Xi Chen, Arpit Jain, Abhinav Gupta, Larry Davis. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2011.
Real-time fluid simulation with adaptive SPH. [PDF/Video Demo]. He Yan, Jian He, Xi Chen, Changbo Wang, Qunsheng Peng, Zhangye Wang. Oral Presentation, 22nd Annual Conference on Computer Animation and Social Agents (CASA’2009), June 2009, Amsterdam, Netherlands.
A New Adaptive Model for Real-time Fluid Simulation with Complex Boundaries. [PDF]. Jian He, Xi Chen, Zhangye Wang, He Yan, Chen Cao, Qunsheng Peng. Oral Presentation, 11th IEEE International Conference on Computer-Aided Design and Computer Graphics (CAD/Graphics 2009), Aug. 2009, Yellow Mountain City, China.
Visual Analysis of Temporal Trends in Social Networks Using Edge Color Coding and Metric Timelines. U. Khurana, V. Nguyen, H. Cheng, J. Ahn, Xi Chen, Ben Shneiderman. 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom).
A New Adaptive Model for Real-time Fluid Simulation with Complex Boundaries. [PDF]. Jian He, Xi Chen, Zhangye Wang, He Yan, Chen Cao, Qunsheng Peng. The Visual Computer, April 2010, Volume 26, Issue 4, pp. 243-252.
Real-time fluid simulation with adaptive SPH. [PDF/Video Demo]. He Yan, Jian He, Xi Chen, Changbo Wang, Qunsheng Peng, Zhangye Wang. Computer Animation and Virtual Worlds (CAVW), Vol. 20, Num. 2-3, June 2009, pp. 417-426.
An Integrated Algorithm of Real-time Simulation of Fluid Scene in GPU. [PDF]. Xi Chen, Zhangye Wang, Jian He, He Yan, Qunsheng Peng. Published in: Special Issue on GPGPU, Journal of CAD\CG (Mar. 2010, in Chinese)

Patents

Multi-modal visual search pipeline for web scale images. US Patent 11,074,289
Stacked cross-modal matching. US Patent 11,093,560

Academic Services

Program Committee Member/Reviewer:

ICLR 2022
Annual Conference of the Association for Computational Linguistics (ACL) 2020
IEEE Conference on Computer Vision and Pattern Recognition (CVPR): 2017, 2018, 2020
European Conference on Computer Vision (ECCV): 2018, 2020
International Conference on Computer Vision (ICCV) 2017
Association for the Advancement of Artificial Intelligence (AAAI) 2017
International Joint Conference on Artificial Intelligence (IJCAI) 2016
IEEE International Symposium on Multimedia (ISM) 2016
Journal of Neurocomputing
IEEE Transactions on Cybernetics
Pattern Recognition
Transactions on Multimedia

Awards

Dean's Fellowship, Department of Computer Science, University of Maryland. 2010-2012
IBM China Fellowship. Aug 2009
Citibank Scholarship. Sept. 2008
1st Class Scholarship of Research and Innovation, Zhejiang University. 2009
1st Class Scholarship of Academic Excellence, Zhejiang University. 2008
