Wan-Cyuan Fan (Chris Fan)

My name is Wan-Cyuan Fan (Chris Fan), and I am a Ph.D. student at the University of British Columbia, working with Prof. Leonid Sigal on multimodal learning.

I previously worked as a Research Assistant in the Vision and Learning Lab (VLL) at National Taiwan University (NTU), where Prof. Yu-Chiang Frank Wang advised me during both my Master's studies and my time as a Research Assistant. I also collaborated with Yen-Chun Chen, DongDong Chen, Yu Cheng, and Lu Yuan as a student intern at Microsoft Research for six months. I received my Bachelor of Science degree in Electrical Engineering from NTU in 2020; during my undergraduate studies, I spent a year as a student research intern at the Institute of Information Science, Academia Sinica, under the guidance of Prof. Tyng-Luh Liu.

My research focuses on two interconnected areas: (1) the fundamental study of multimodal large language models (MLLMs), including customization, synthetic data generation for training, benchmarking, and training strategies; and (2) the application of MLLMs in agentic frameworks, specifically designing LLM-based agents to more effectively solve general vision-and-language tasks and exploring collaborative dynamics within multi-agent systems.

National Taiwan University, B.S. in EE and M.S. in ECE, Jan. 20 & Jan. 22

Azure Computer Vision Research at Microsoft, Research Intern, March 22 - Sep. 22 and March 24 - Sep. 24

Amazon, Applied Scientist Intern, May 25 - Sep. 25

Vector Institute for AI, Canada, PhD student, Sep. 23 - present

University of British Columbia, PhD student in CS, Sep. 23 - present

Waabi AI

News

Jan. 2026 - One paper has been accepted to ICLR.
Aug. 2025 - One paper has been accepted to EMNLP.
Feb. 2025 - I will join Amazon Research, Seattle, as a PhD student intern this summer.
Oct. 2024 - Our paper "On Pre-training of Multimodal Language Models Customized for Chart Understanding" was accepted to the NeurIPS Workshop on Adaptive Foundation Models.
Aug. 2024 - Our paper "Surprising Observations in Basic Vision Language Model Capabilities" was accepted to ECCVW.
May 2024 - Our model, TAM-VT, achieved 5th place in the VOTS2024 Challenge at ECCV '24.
Aug. 2023 - I completed my conscripted military service as a rifleman in the attack helicopter group.
Feb. 2023 - Our paper "IOU-Aware Multi-Expert Cascade Network via Dynamic Ensemble for Long-tailed Object Detection" was accepted to ICASSP.
Nov. 2022 - Our papers "Feature Pyramid Diffusion for Complex Scene Image Synthesis" and "Target-free Text-guided Image Manipulation" were accepted to AAAI.
Oct. 2022 - I gave my first 40-minute tech talk at Microsoft Research. Thanks to Yen-Chun Chen and XiYang Dai for such a great opportunity!
Sep. 2022 - Our paper "Paraphrasing Is All You Need for Novel Object Captioning" was accepted to NeurIPS 2022.
June 2022 - My master's thesis on image manipulation received the Honorable Master Thesis Award at IPPR 2022.
March 2022 - Our paper "Scene Graph Expansion for Semantics-Guided Image Outpainting" was accepted to CVPR 2022.
March 2022 - I joined Microsoft Research as a student intern.
Jan. 2022 - I completed my master's thesis defense! Thanks to Prof. Yu-Chiang Frank Wang, Prof. Chu-Song Chen, and Prof. Wei-Chen Walon Chiu for advising.
Dec. 2021 - Our work "Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation" was accepted to AAAI 2022 (oral).
Feb. 2021 - Our work "LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity" was accepted to CVPR 2021.
Sep. 2020 - I started my graduate student life.
Aug. 2020 - Our MEC detection model was a Top 10 Winner (World Ranking 7) in the Large Vocabulary Instance Segmentation (LVIS) Challenge, ECCV workshop.
June 2020 - Due to the COVID-19 outbreak in California, my internship at ICT was canceled ;(
March 2020 - I received my first overseas position, as a summer intern at the Institute for Creative Technologies, USC. Thanks to Dr. Andrew (Wei-Wen) Feng and Dr. Meida Chen for giving me such an opportunity!
July 2019 - I joined the Computer Vision & Machine Learning Lab at IIS, Academia Sinica, Taiwan, as an undergraduate student intern advised by Prof. Tyng-Luh Liu.

Selected Publications

To Sink or Not to Sink: Visual Information Pathways in Large Vision-Language Models

Jiayun Luo*, Wan-Cyuan Fan*, Lyuyang Wang, Xiangteng He, Tanzila Rahman, Purang Abolmaesumi, Leonid Sigal

ICLR 2026 (*Equal contribution. Listed in alphabetical order.)

[Paper] [Project Page]

On Pre-training of Multimodal Language Models Customized for Chart Understanding

Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Lu Yuan, Leonid Sigal

NeurIPSW '24 (work done during the internship at Microsoft GenAI Research)

[Paper] [Project Page] [Code] [Data]

TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking

Raghav Goyal*, Wan-Cyuan Fan*, Mennatullah Siam, Leonid Sigal

WACV, 2025. Ranked 5th in the VOTS2024 Challenge at ECCV '24 (*Equal contribution. Listed in alphabetical order.)

[Paper] [Project Page]

IOU-Aware Multi-Expert Cascade Network for Long-tailed Object Detection

Wan-Cyuan Fan, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu

ICASSP, 2023

[Paper]

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Wan-Cyuan Fan, Yen-Chun Chen, Dongdong Chen, Yu Cheng, Lu Yuan, Yu-Chiang Frank Wang

AAAI, 2023 (Oral) (work done during the internship at Microsoft Azure Computer Vision Research)

[Paper] [Project page] [Code]

Target-free Text-guided Image Manipulation

Wan-Cyuan Fan, Cheng-Fu Yang, Chiao-An Yang, Yu-Chiang Frank Wang

AAAI, 2023

[Paper] [Project page]

LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity

Cheng-Fu Yang*, Wan-Cyuan Fan*, Fu-En Yang, Yu-Chiang Frank Wang

CVPR 2021 (*Equal contribution. Listed in alphabetical order.)

[Paper] [Code]

Pre-prints and Miscellaneous

MMFactory: A Universal Solution Search Engine for Vision-Language Tasks

Wan-Cyuan Fan, Tanzila Rahman, Leonid Sigal

Publication Under Review (Journal)

[Paper] [Project Page]

ACE: Adaptive Confusion Energy for Natural World Data Distribution

Yen-Chi Hsu, Cheng-Yao Hong, Wan-Cyuan Fan, Ming-Sui Lee, Davi Geiger, Tyng-Luh Liu

under submission (Journal), 2021

[Paper (Arxiv pre-print)]

Auto-drawer: Generating and Modifying Images Continually by Visual-Relational Knowledge Graph

Wan-Cyuan Fan

Submitted to Second Workshop on Computer Vision for Fashion, Art and Design, ICCVw 2019

[Paper]

Awards

UBC 4YF Fellowship

The University of British Columbia, Canada, 2023

Awarded to only 2-3 students each year in the CS department

NSERC PGSD/CGSD Award

Natural Sciences and Engineering Research Council of Canada, 2023

Honorable Master Thesis Award

Chinese Image Processing and Pattern Recognition Society, Taiwan, 2022

Only 11 recipients in Taiwan

IPPR web

Research Scholarship

Novatek Foundation, Taiwan, 2021

Only 3 recipients in NTU EECS (500+ students)

Novatek

7th place

Large Vocabulary Instance Segmentation Challenge, ECCV workshop, 2020

Top 1 among non-industrial teams

[CVF Paper] [LVIS leaderboard (team: Argus)] [Eval.AI]

Professional Activities

Program Committee Member
AAAI 2026

Reviewer

AAAI 2021, CVPR 2021, ICCV 2021, CVPR 2022, AAAI 2022, NeurIPS 2022