Assistant Professor
Sungkyunkwan University (SKKU)
sangmin.lee [at] skku.edu
I am an Assistant Professor at Sungkyunkwan University. Previously, I was a Postdoctoral Researcher at the University of Illinois Urbana-Champaign, working with James M. Rehg. During that time, I was also affiliated with the Georgia Institute of Technology. I received my Ph.D. from KAIST under the supervision of Yong Man Ro. Prior to that, I obtained my B.S. from Yonsei University.
My research interests lie in expanding machine capabilities through multimodal perception and minimal supervision. I investigate multimodal learning to comprehensively leverage visual, language, audio, and physiological signals for holistic reasoning. Furthermore, I explore self-supervised learning to derive effective feature representations even from weakly labeled or unlabeled data. Building upon these foundations, my current research focuses on developing socially intelligent machines that can seamlessly understand and interact with humans in social contexts.
Multimodal Learning
- Visual + Language / Audio / Physiological Signals
Self-supervised Learning
- Weakly-labeled / Unlabeled Data
Social Artificial Intelligence
- Social Understanding / Reasoning
Sungkyunkwan University (SKKU), Seoul, South Korea (Sep 2024 - Present)
- Assistant Professor
University of Illinois Urbana-Champaign (UIUC), IL, United States (May 2023 - Aug 2024)
- Postdoctoral Researcher in Computer Science
- Advisor: Prof. James M. Rehg
Georgia Institute of Technology (Georgia Tech), GA, United States (Oct 2023 - Aug 2024)
- Affiliated Researcher in Interactive Computing
Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea (Feb 2023)
- Ph.D. in Electrical Engineering
- Advisor: Prof. Yong Man Ro
Yonsei University, Seoul, South Korea (Feb 2017)
- B.S. in Electrical & Electronic Engineering
MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization
SocialGesture: Delving into Multi-person Gesture Understanding
Object-aware Sound Source Localization via Audio-Visual Scene Understanding
Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Question-Aware Gaussian Experts for Audio-Visual Question Answering
Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection
Text-Guided Distillation Learning to Diversify Video Embeddings for Text-Video Retrieval
Analyzing Visible Articulatory Movements in Speech Production for Speech-Driven 3D Facial Animation
Method for Video Frame Interpolation Robust to Exceptional Motion and the Apparatus Thereof
Korea Patent 2244187 / PCT Patent App. 003461
Method for VR Sickness Assessment Considering Neural Mismatch Model and the Apparatus Thereof
Korea Patent 2284266 / US Patent 11699072
Apparatus and Method for Virtual Reality Sickness Reduction Based on Virtual Reality Sickness Assessment
Korea Patent 2291257 / US Patent 11252371
Video Sequences Generating System Using Generative Adversarial Networks and the Method Thereof
BMVC Outstanding Reviewer (2024)
- British Machine Vision Conference (BMVC)
CVPR Oral Presentation (Top 0.8% of Submissions) (2024) — 1st-Author
- IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
1st Place Winner of Ad-hoc Video Search Competition (2022) — Team Lead
- 11th Video Browser Showdown (International Challenge)
CVPR Oral Presentation (Top 4% of Submissions) (2021) — 1st-Author
- IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Samsung HumanTech Paper Award, Honor Prize ($2,000) (2021) — 1st-Author
- Samsung Electronics
Outstanding Project Selection (2020) — Graduate Project Lead
- Institute of Information & communications Technology Planning & Evaluation (IITP)
Outstanding TA Award (2 Times) (2019, 2020) — Head TA
- Korea Advanced Institute of Science and Technology (KAIST)
ICIP Best Paper Finalist (Top 1% of Submissions) (2019) — 1st-Author
- IEEE International Conference on Image Processing (ICIP)
ICIP Top 10% Paper Selection (2019) — 1st-Author
- IEEE International Conference on Image Processing (ICIP)
National Government Fellowship (2017 - 2023)
- Government of South Korea
Organizer
- Artificial Social Intelligence Workshop @ ECCV 2024
Reviewer
- Conference on Computer Vision and Pattern Recognition (CVPR)
- International Conference on Computer Vision (ICCV)
- European Conference on Computer Vision (ECCV)
- Medical Image Computing and Computer Assisted Intervention (MICCAI)
- IEEE Transactions on Image Processing (TIP)
- IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- IEEE Transactions on Visualization and Computer Graphics (TVCG)
- IEEE Transactions on Multimedia (TMM)
- International Journal of Computer Vision (IJCV)
Invited Talks
- Predictive Visions: Exploring the Potential of Video Predictive Models @ TechArt Conference (2024)
- Associative Learning for Multimodal Representation under Ambiguous Pair Problems @ KHU (2023)
- Weakly Paired Associative Learning for Sound and Image Representations @ ETRI (2022)
- Deep Learning-based VR Sickness Assessment @ IEEE Standard Association WG 3079 (2019)
- Quantitative Analysis on VR Sickness Considering Content Quality Factor @ TTA Standardization PG 610 (2019)
Teaching
- [Lecturer] Artificial Intelligence with Deep Learning @ DTaQ (2022)
- [Programming Lecturer & TA] EE474 Introduction to Multimedia @ KAIST (2018, 2019, 2020)
- [TA] EE205 Data Structures and Algorithms @ KAIST (2020)
- [TA] EE636 Digital Video Processing @ KAIST (2018, 2019)