Jen-Chun Lin received the Ph.D. degree in computer science and information engineering from National Cheng Kung University, Tainan, Taiwan, in 2014. He was a Post-Doctoral Research Fellow with Academia Sinica, Taipei, Taiwan, from 2014 to 2018, and an Assistant Professor with the Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan, from 2018 to 2020. He is currently an Associate Research Fellow at Academia Sinica, Taipei, Taiwan. 

An illustration from his paper in the March/April 2014 issue of the IEEE/ACM Transactions on Audio, Speech, and Language Processing was chosen for the journal's cover. He also received the Gold Thesis Award from Merry Electronics in 2014; the Excellent Ph.D. Dissertation Award from the Chinese Image Processing and Pattern Recognition Society in 2014; the Excellent Ph.D. Dissertation Award from the Taiwanese Association for Artificial Intelligence in 2014; the Most Interesting Paper Award from the Affective Social Multimedia Computing Workshop in 2015; the Postdoctoral Academic Publication Award from the Ministry of Science and Technology (MOST) in 2017; and the APSIPA Sadaoki Furui Prize Paper Award in 2018.

Research Statement
My research focuses on designing structured, stable, and interpretable learning paradigms for cross-modal representation and generation in artificial intelligence. Rather than relying on ad hoc architectural extensions or heavily engineered loss functions, I aim to reformulate learning problems around clear structural assumptions that improve both training stability and conceptual clarity.

A central theme of my research is simplifying complex learning objectives while preserving expressive power. I investigate principled approaches such as disentangled representations, causal consistency, and structured competitive or counterfactual learning dynamics; these approaches have guided my work on attention modeling and multimodal learning.

Recently, I have expanded my research toward cross-modal generative modeling of 3D human motion and camera trajectories, where controllability, temporal coherence, and semantic alignment are critical. By integrating representation learning with generative frameworks such as VQ-VAE, diffusion models, and transformers, my goal is to bridge high-level semantic intent with realistic and controllable motion generation.
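
To make the VQ-VAE component concrete, the sketch below shows the vector-quantization step at the core of such a model: a continuous encoder feature (for example, a per-frame motion embedding) is snapped to its nearest codebook entry, with a straight-through estimator letting gradients pass through the non-differentiable lookup. This is a minimal illustration of the standard technique (van den Oord et al., 2017), not the pipeline of any specific system; the class name, dimensions, and commitment weight are illustrative choices.

```python
import torch
import torch.nn as nn


class VectorQuantizer(nn.Module):
    """Nearest-neighbor codebook lookup with a straight-through estimator.

    A minimal sketch of the discrete bottleneck in a VQ-VAE;
    sizes and the commitment weight are illustrative defaults.
    """

    def __init__(self, num_codes: int = 512, code_dim: int = 64, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # weight on the commitment loss

    def forward(self, z: torch.Tensor):
        # z: (batch, time, code_dim) continuous encoder features,
        # e.g. per-frame motion or camera-trajectory embeddings.
        flat = z.reshape(-1, z.shape[-1])  # (B*T, D)

        # Squared Euclidean distance to every codebook entry: (B*T, K).
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2.0 * flat @ self.codebook.weight.t()
                + self.codebook.weight.pow(2).sum(1))

        idx = dist.argmin(dim=1)             # nearest code per feature
        z_q = self.codebook(idx).view_as(z)  # quantized features

        # Codebook loss pulls codes toward encoder outputs; the
        # commitment term keeps the encoder close to its chosen codes.
        vq_loss = ((z_q - z.detach()).pow(2).mean()
                   + self.beta * (z - z_q.detach()).pow(2).mean())

        # Straight-through estimator: gradients flow back to z unchanged.
        z_q = z + (z_q - z).detach()
        return z_q, vq_loss, idx.view(z.shape[:-1])


if __name__ == "__main__":
    vq = VectorQuantizer()
    motion = torch.randn(2, 120, 64)  # two clips, 120 frames, 64-d features
    quantized, loss, codes = vq(motion)
    print(quantized.shape, codes.shape, float(loss))
```

The discrete code indices produced this way form a token sequence that an autoregressive transformer or a diffusion prior can then model, which is the usual route from high-level semantic intent to controllable motion synthesis.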