MMMI Lab - Publications

Publication

2026 : ECCV(1), CVPR(2), AAAI(1)

See & Sniff: Learning Visuo-Olfactory Representation
The European Conference on Computer Vision (ECCV), 2026
Seongyu Kim*, Seungwoo Lee*, Hyeonggon Ryu, Joon Son Chung, Arda Senocak (*: equal contribution)
Hear you are: Teaching LLMs Spatial Reasoning with Vision and Spatial Sound
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2026
Hyeonggon Ryu, Joon Son Chung, David Harwath
Seeing Through Touch: Tactile-Driven Visual Localization of Material Regions
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2026
Seongyu Kim, Seungwoo Lee, Hyeonggon Ryu, Joon Son Chung, Arda Senocak
MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence
Annual AAAI Conference on Artificial Intelligence (AAAI), 2026
Sonal Kumar et al.

2025 : CVPR(1)

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025
Hyeonggon Ryu*, Seongyu Kim*, Joon Son Chung, Arda Senocak (*: equal contribution)

2024 : ICASSP(1), ACMMM(1)

Let me finish my sentence: Video temporal grounding with holistic text understanding
ACM International Conference on Multimedia (ACMMM), 2024
Jongbhin Woo, Hyeonggon Ryu, Youngjoon Jang, Jae Won Cho, Joon Son Chung
Speech Guided Masked Image Modeling for Visually Grounded Speech
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
Jongbhin Woo, Hyeonggon Ryu, Arda Senocak, Joon Son Chung

2023 : ICASSP(1), CVPR(1), ICCV(1)

Hindi as a second language: Improving Visually Grounded Speech with Semantically Similar Samples
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
Hyeonggon Ryu*, Arda Senocak*, In So Kweon, Joon Son Chung (*: equal contribution)
Generative Bias for Robust Visual Question Answering
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2023
Jae Won Cho, Dong-Jin Kim, Hyeonggon Ryu, In So Kweon
Sound Source Localization is all about Cross-Modal Alignment
IEEE International Conference on Computer Vision (ICCV), 2023
Arda Senocak*, Hyeonggon Ryu*, Junsik Kim*, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung (*: equal contribution)

2022 : WACV(1), ICASSP(1), Preprint(1)

Less Can Be More: Sound Source Localization With a Classification Model
IEEE Winter Conference on Applications of Computer Vision (WACV), 2022
Arda Senocak*, Hyeonggon Ryu*, Junsik Kim*, In So Kweon (*: equal contribution)
Received Honorable Mention, 28th HumanTech Paper Award, Samsung Electronics Co., Ltd.
Learning Sound Localization Better from Semantically Similar Samples
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
Arda Senocak*, Hyeonggon Ryu*, Junsik Kim*, In So Kweon (*: equal contribution)
Audio-Visual Fusion Layers for Event Type Aware Video Recognition
arXiv preprint arXiv:2202.05961
Arda Senocak*, Junsik Kim*, Tae-Hyun Oh, Hyeonggon Ryu, Dingzeyu Li, In So Kweon (*: equal contribution)