Kim Sung-Bin
Hi! I am a PhD student in the Algorithmic Machine Intelligence (AMI) Lab, Dept. of Electrical Engineering, POSTECH, South Korea, advised by Prof. Tae-Hyun Oh. I received my Master's degree from the AMI Lab at POSTECH, and my Bachelor's degree from the Dept. of Electrical Engineering, Handong University, South Korea.
My research interests include, but are not limited to, multi-modal learning and cross-modal generation.
contact: sungbin [at] postech [dot] ac [dot] kr | sbkim052 [at] gmail [dot] com
Google scholar | LinkedIn | CV
Publications (conferences)
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models, ICLR 2025
Kim Sung-Bin*, Oh Hyun-Bin*, JungMok Lee, Arda Senocak, Joon Son Chung, Tae-Hyun Oh
[project page] [arxiv]
We introduce a comprehensive audio-visual hallucination benchmark specifically designed to evaluate the perception and comprehension capabilities of audio-visual LLMs
SoundBrush: Sound as a Brush for Visual Scene Editing, AAAI 2025
Kim Sung-Bin, Kim Jun-Seong, Junseok Ko, Yewon Kim, Tae-Hyun Oh
[project page] [arxiv]
We manipulate visual scenes to reflect the mood of the input audio or to insert sounding objects while preserving the original structure
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset, Interspeech 2024
Kim Sung-Bin*, Lee Chae-Yeon*, Gihun Son*, Oh Hyun-Bin, Janghoon Ju, Suekyeong Nam, Tae-Hyun Oh
[project page] [dataset] [arxiv]
We generate a 3D talking head with enhanced performance on multilingual speech
Enhancing Speech-driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert, Interspeech 2024
Han Eungi*, Oh Hyun-Bin*, Kim Sung-Bin, Corentin Nivelet Etcheberry, Suekyeong Nam, Janghoon Ju, Tae-Hyun Oh
[project page]
We enhance the lip accuracy of a 3D talking head using a lip reading expert
😀SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models, NAACL 2024 Findings
Lee Hyun*, Kim Sung-Bin*, Seungju Han, Youngjae Yu, Tae-Hyun Oh
[dataset] [arxiv]
We introduce video laugh reasoning, a new task for machines to understand the rationale behind laughter in video
Presented in [Workshop on AV4D, in conjunction with ICCV, 2023]
LaughTalk: Expressive 3D Talking Head Generation with Laughter, WACV 2024
Kim Sung-Bin, Lee Hyun, Da hye Hong, Suekyeong Nam, Janghoon Ju, Tae-Hyun Oh
[project page] [paper] [arxiv]
We generate a 3D talking head that simultaneously expresses speech articulation and laughter
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment, CVPR 2023
Kim Sung-Bin, Arda Senocak, Hyunwoo Ha, Andrew Owens, Tae-Hyun Oh
[project page] [paper] [arxiv]
We generate images from diverse in-the-wild environmental sounds
Covered by Korean news media (Yonhap News, etc.), and featured by YTN Science Channel
Invited talk in [Workshop on Sound and Sight, in conjunction with CVPR, 2023], and [Korean Artificial Intelligence Association, 2023]
Presented in [Workshop on AI4CC, in conjunction with CVPR, 2023], and [Workshop on AV4D, in conjunction with ICCV, 2023]
Prefix Tuning for Automated Audio Captioning, ICASSP 2023 [ORAL]
Minkyu Kim*, Kim Sung-Bin*, Tae-Hyun Oh
[project page] [paper] [arxiv]
We generate text descriptions from environmental sounds
Covered by Korean news media (Yonhap News, etc.)
Real-time Face Registration and Classification System using Fuzzy ARTMAP, ICROS 2020
Kim Sung-Bin, Wong Hyong Lee
[paper]
We register and classify faces in real-time
Publications (journals)
Revisiting Learning-based Video Motion Magnification for Real-time Processing, under review
Hyunwoo Ha*, Oh Hyun-Bin*, Kim Jun-Seong, Kwon Byung-Ki, Kim Sung-Bin, Linh-Tam Tran, Ji-Yun Kim, Sung-Ho Bae, Tae-Hyun Oh
[arxiv]
We magnify small and subtle motions into human-perceptible motion
A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization, TMLR 2024
Kim Youwang, Lee Hyun*, Kim Sung-Bin*, Suekyeong Nam, Janghoon Ju, Tae-Hyun Oh
[project page] [arxiv]
We propose reliable 3D face mesh annotations on large-scale facial video datasets
Presented in [Workshop on AV4D, in conjunction with ICCV, 2023]
Invited to ICLR 2025 main conference as a poster presentation (6.61%)
The Devil in the Details: Simple and Effective Optical Flow Synthetic Data Generation, TVCJ 2024 [IF: 3.5]
Kwon Byung-Ki, Kim Sung-Bin, Tae-Hyun Oh
[arxiv]
We generate a simple yet effective synthetic optical flow dataset
Lightweight Speaker Recognition in Poincaré Spaces, SPL 2021 [IF: 3.2]
Jieun Lee*, Kim Sung-Bin*, Seokhyeong Kang, Tae-Hyun Oh
[paper]
We design a Poincaré speaker embedding space for speaker recognition and verification
Awards & Honors
Full-ride scholarship from SBS Cultural Foundation (news), up to $75,000 over 4 years
Summa Cum Laude, Handong University
Best Student Paper Award, ICROS, 2020
Full-ride scholarship during the B.S. degree
Academic Services
Journal Reviewer: TASL 2023, TASL 2024
Conference Reviewer: CVPR 2024, NeurIPS 2024, ICLR 2025, CVPR 2025
Work Experiences
Military Service (discharged as a sergeant), Vanguard Unit, Republic of Korea Army, 09/2016 - 06/2018