PUBLICATIONS
*: equal contribution
2025
Generative Audio Language Modeling with Continuous-valued Tokens and Masked Next-Token Prediction [PDF]
Shu-wen Yang, Byeonggeun Kim, Kuan-Po Huang, Qingming Tang, Huy Phan, Bo-Ru Lu, Harshavardhan Sundar, Shalini Ghosh, Hung-yi Lee, Chieh-Chi Kao, Chao Wang
International Conference on Machine Learning (ICML) 2025 (I served as the primary mentor for this internship project)
IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling [PDF]
Kuan-Po Huang, Shu-wen Yang, Huy Phan, Bo-Ru Lu, Byeonggeun Kim, Sashank Macha, Qingming Tang, Shalini Ghosh, Hung-yi Lee, Chieh-Chi Kao, Chao Wang
International Conference on Machine Learning (ICML) 2025 (Mentored internship work)
Effective Techniques for Scaling Audio Encoder Pretraining [PDF]
Byeonggeun Kim*, Andrew Bydlon*, Qingming Tang, Huy Phan, Chieh-Chi Kao, Tao Zhang, and Chao Wang
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025 (Oral presentation)
2024
Unlocking Transfer Learning for Open-World Few-Shot Recognition [PDF]
Byeonggeun Kim*, Jun-Tae Lee*, Kyuhong Shim, and Simyung Chang
arXiv, 2401.09986, 2024
Cross-Triggering Issue in Audio Event Detection and Mitigation [PDF]
Huy Phan, Byeonggeun Kim, Vu Nguyen, Andrew Bydlon, Qingming Tang, Chieh-Chi Kao, and Chao Wang
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
2023
Task-Agnostic Open-Set Prototype for Few-Shot Open-Set Recognition [PDF]
Byeonggeun Kim*, Jun-Tae Lee*, Kyuhong Shim, and Simyung Chang
IEEE International Conference on Image Processing (ICIP) 2023
Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data [PDF]
Seunghan Yang, Byeonggeun Kim, Kyuhong Shim, and Simyung Chang
INTERSPEECH 2023 (Oral presentation)
Scalable Weight Reparametrization for Efficient Transfer Learning [PDF]
Byeonggeun Kim*, Jun-Tae Lee*, Seunghan Yang, and Simyung Chang
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023 (Oral presentation)
TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation [PDF]
Hyesu Lim, Byeonggeun Kim, Jaegul Choo, Sungha Choi
International Conference on Learning Representations (ICLR) 2023 (Mentored internship work)
2022
Dummy Prototypical Networks for Few-shot Open-set Keyword Spotting [PDF]
Byeonggeun Kim, Seunghan Yang, Inseop Chung, Simyung Chang
INTERSPEECH 2022
Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification [PDF]
Byeonggeun Kim, Seunghan Yang, Jangho Kim, Hyunsin Park, Juntae Lee, Simyung Chang
INTERSPEECH 2022
Personalized Keyword Spotting through Multi-task Learning [PDF]
Seunghan Yang, Byeonggeun Kim, Inseop Chung, Simyung Chang
INTERSPEECH 2022 (Oral presentation)
2021
Broadcasted Residual Learning for Efficient Keyword Spotting [PDF] [code]
Byeonggeun Kim*, Simyung Chang*, Jinkyu Lee, Dooyong Sung (* equal contribution)
INTERSPEECH 2021
BCResNets
Domain Generalization on Efficient Acoustic Scene Classification Using Residual Normalization [PDF] [poster] [video]
Byeonggeun Kim, Seunghan Yang, Jangho Kim, Simyung Chang
Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE), 2021
QTI submission to DCASE 2021: Residual normalization for device imbalanced acoustic scene classification with efficient design [PDF] [results]
Byeonggeun Kim, Seunghan Yang, Jangho Kim, Simyung Chang
IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE challenge), 2021
1st place in DCASE-2021 challenge
2019
Orthogonality Constrained Multi-Head Attention For Keyword Spotting [PDF]
Mingu Lee, Jinkyu Lee, Hye Jin Jang, Byeonggeun Kim, Wonil Chang, Kyuwoong Hwang
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
Query-by-Example On-Device Keyword Spotting [PDF] [Qualcomm keyword speech dataset]
Byeonggeun Kim, Mingu Lee, Jinkyu Lee, Yeonseok Kim, Kyuwoong Hwang
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
GRANTED PATENTS
[P8] Multi-task learning for personalized keyword spotting [PDF]
Seunghan Yang, Byeonggeun Kim, Inseop Chung, and Simyung Chang
U.S. Patent No. 12,347,439. 1 Jul. 2025.
[P7] Relaxed instance frequency normalization for neural-network-based audio processing [PDF]
Byeonggeun Kim, Seunghan Yang, Hyunsin Park, Juntae Lee, and Simyung Chang
U.S. Patent No. 12,266,379. 1 Apr. 2025.
[P6] Target Keyword Selection [PDF]
Wonil Chang, Jinseok Lee, Mingu Lee, Jinkyu Lee, Byeonggeun Kim, Dooyong Sung, Jaewon Choi, and Kyu Woong Hwang
U.S. Patent No. 12,039,968. 16 Jul. 2024.
[P5] Task Agnostic Open-set Prototypes for Few-shot Open-set Recognition [PDF]
Byeonggeun Kim, Juntae Lee, and Simyung Chang
U.S. Patent No. 12,019,641. 25 Jun. 2024.
[P4] Systems and methods of image processing based on gaze detection [PDF]
Hyunsin Park, Juntae Lee, Simyung Chang, Byeonggeun Kim, Jaewon Choi, and Kyu Woong Hwang
U.S. Patent No. 11,798,204. 24 Oct. 2023.
[P3] On-device self training in two-stage wakeup system comprising a system on chip which operates in a reduced-activity mode [PDF]
Young Mo Kang, Sungrack Yun, Kyu Woong Hwang, Hye Jin Jang, Byeonggeun Kim
U.S. Patent No. 11,664,012. 30 May 2023.
[P2] Activating speech recognition based on hand patterns detected using plurality of filters [PDF]
Sungrack Yun, Young Mo Kang, Hye Jin Jang, Byeonggeun Kim, Kyu Woong Hwang
U.S. Patent No. 11,437,031. 6 Sep. 2022.
[P1] Method and apparatus for activating speech recognition [PDF]
Byeonggeun Kim, Young Mo Kang, Sungrack Yun, Kyu Woong Hwang, Hye Jin Jang
U.S. Patent No. 11,205,433. 21 Dec. 2021.