Taehwan Kim's Homepage

Taehwan Kim
Assistant Professor in Artificial Intelligence Graduate School and Department of Computer Science and Engineering
Ulsan National Institute of Science and Technology (UNIST)

Director, Interactive Multimodal Machine Learning Lab

Email: taehwankim[at]unist[dot]ac[dot]kr
Google scholar

NEWS: Our paper on speech generation with a multimodal LLM was accepted to Interspeech 2025.

NEWS: Our paper on listener generation was accepted to ICASSP 2025.

NEWS: Three papers on astronomy + AI were accepted to the Astrophysical Journal.

NEWS: Our paper on embodied AI was accepted to the CVPR 2024 Embodied AI workshop.

NEWS: Our paper on multimodal storytelling with sound was accepted to Findings of EMNLP 2023.

NEWS: Our paper on slogan generation with LLM and noise perturbation was accepted to CIKM 2023 (short paper).

NEWS: Our paper on sound-to-image generation was accepted to ICCV 2023.

NEWS: Our lab team won the 2nd place award in the CVPR 2022 LOng-form VidEo Understanding (LOVEU) challenge, track 3 (2nd in Recall@1 and 1st in Recall@3) [tech report].

NEWS: In Aug. 2021, I joined the Artificial Intelligence Graduate School and the Department of Computer Science and Engineering at Ulsan National Institute of Science and Technology (UNIST) as an assistant professor. I am currently looking for self-motivated and curiosity-driven graduate students and undergraduate interns to join my group. If you are interested, please send me an email with your CV and transcript, along with the degree program you intend to pursue.

Currently, I am an assistant professor in the Artificial Intelligence Graduate School and the Department of Computer Science and Engineering at Ulsan National Institute of Science and Technology (UNIST). Previously, I was an applied scientist at Amazon Alexa AI and a lead research scientist at a start-up company, ObEN. Before that, I was a postdoctoral scholar in the Computing and Mathematical Sciences department at the California Institute of Technology, working with Prof. Yisong Yue. I completed my PhD in 2016 at Toyota Technological Institute at Chicago, a philanthropically endowed academic computer science institute located on the University of Chicago campus, where my advisor was Prof. Karen Livescu. I received my master's degree in Computer Science from USC and my bachelor's degree in Computer Science & Engineering and Mathematics from POSTECH.

My main research interests span Machine Learning and its applications to Computer Vision and Language Processing. Specifically, I am interested in Deep Learning, Generative Models, Multimodal Learning, Transfer Learning, and Spatial-Temporal Data Analysis.

I am fortunate to have worked with great students:

Students

· PhD students

Jaeyeon Bae

Jinsik Bang

Minchang Chung

Seonghee Han

Jaemu Heo

Seokhoon Jeong

Taesoo Kim

Donggyu Lee

Linda Sarmiento (co-advised by Hyounghun Kim)

Hyunmin Song

· Master's students

Chanhyuk Choi

Eldor Fozilov

Yongsik Jo

Jaeho Jo

Siyeol Jung

Youngbin Ki

Minyeong Kim

Semin Myung

Thu Phuong Nguyen

· Undergraduate interns

· Alumni

Dahye Jang (Master's, 2023)

Seok-Un Kang (Master's, 2024)

Chaeri Kim (Master's, 2024)

Hyeonyu Kim (Master's, 2024)

Jeonghun Kang (Master's, 2024)

Jongeun Kim (Master's, 2024)

Gaurav Saha (Master's, 2024)

Wonjin Yang (Master's, 2024)

Geonho Kim (Undergraduate intern, 2022-2023)

Chanbin Lee (Undergraduate intern, 2022-2023)

Teaching

· AI517 Deep Learning for Natural Language Processing and Understanding, Spring 2025, UNIST

· AI517 Deep Learning for Natural Language Processing and Understanding, Fall 2024, UNIST

· CSE402 Natural Language Processing, Spring 2024, UNIST

· AI517 Deep Learning for Natural Language Processing and Understanding, Fall 2023, UNIST

· CSE402 Natural Language Processing, Spring 2023, UNIST

· CSE221 Data Structure, Fall 2022, UNIST

· AI517 Deep Learning for Natural Language Processing and Understanding, Spring 2022, UNIST

· AI503 AI Toolkits, Fall 2021, UNIST

· CS159 Advanced Topics in Machine Learning: Structured Prediction, Spring 2017, Caltech

Selected Peer Reviewed Publications

(for the complete list of the publications, please see my Google scholar page)

· Hyeonyu Kim, Seokhoon Jeong, Seonghee Han, Chanhyuk Choi and Taehwan Kim, Audio-Guided Visual Editing with Complex Multi-Modal Prompts, British Machine Vision Conference (BMVC), 2025 (to appear)

· Seokun Kang and Taehwan Kim, Semi-Supervised Audio-Visual Action Recognition with Audio Source Localization Guided Mixup, IEEE International Workshop on Machine Learning for Signal Processing, 2025 (to appear)

· Taesoo Kim, Yongsik Jo, Hyunmin Song and Taehwan Kim, Towards Human-like Multimodal Conversational Agent by Generating Engaging Speech, Interspeech, 2025 (to appear)

· Siyeol Jung and Taehwan Kim, DiffListener: Discrete Diffusion Model for Listener Generation, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025 [pdf][project page]

· Sangjun Cha, M. James Jee, Sungwook E. Hong, Sangnam Park, Dongsu Bak and Taehwan Kim, Weak-lensing Mass Reconstruction of Galaxy Clusters with a Convolutional Neural Network. II. Application to Next-generation Wide-field Surveys, Astrophysical Journal 981:52 (2025) [pdf]

· Ashraf Ayubinia, Jong-Hak Woo, Fatemeh Hafezianzadeh, Taehwan Kim and Changseok Kim, Prediction of Star Formation Rates Using an Artificial Neural Network, Astrophysical Journal 980:177 (2025) [pdf]

· Hyosun Park, Yongsik Jo, Seokun Kang, Taehwan Kim and M. James Jee, Deeper, Sharper, Faster: Application of Efficient Transformer to Galaxy Image Restoration, Astrophysical Journal 972:45 (2024) [pdf]

· Jaeyeon Bae*, Seokhoon Jeong*, Seokun Kang, Namgi Han, Jae-Yon Lee, Hyounghun Kim and Taehwan Kim, Sound of Story: Multi-modal Storytelling with Audio, Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 [pdf][project page]

· Jongeun Kim, Minchung Kim and Taehwan Kim, Effective Slogan Generation with Noise Perturbation, ACM International Conference on Information and Knowledge Management (CIKM), 2023 (short paper) [pdf][project page]

· Taegyeong Lee, Jeonghun Kang, Hyeonyu Kim and Taehwan Kim, Generating Realistic Images from In-the-wild Sounds, IEEE/CVF International Conference on Computer Vision (ICCV), 2023 [pdf][project page]

· Hyeshin Chu, Joohee Kim, Seongouk Kim, Hongkyu Lim, Hyunwook Lee, Seungmin Jin, Jongeun Lee, Taehwan Kim and Sungahn Ko, An Empirical Study on How People Perceive AI-generated Music, ACM Conference on Information and Knowledge Management (CIKM), 2022

· Seyed Hamidreza Mohammadi and Taehwan Kim, One-shot voice conversion with disentangled representations by leveraging phonetic posteriorgrams, Interspeech, 2019 [pdf]

· Chao Yang, Taehwan Kim, Ruizhe Wang, Hao Peng and C.-C. Jay Kuo, Show, attend and translate: Unsupervised image translation with self-regularization and attention, IEEE Transactions on Image Processing 28 (10), 4845-4856 (2019) [pdf]

· Chao Yang, Taehwan Kim, Ruizhe Wang, Hao Peng and C.-C. Jay Kuo, ESTHER: Extremely Simple Image Translation Through Self-Regularization, British Machine Vision Conference (BMVC), 2018 [pdf]

· Seyed Hamidreza Mohammadi and Taehwan Kim, Investigation of Using Disentangled and Interpretable Representations for One-shot Cross-lingual Voice Conversion, Interspeech, 2018 [pdf]

· Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari and Karen Livescu, Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation, Computer Speech and Language, 2017 [pdf]

· Sarah Taylor, Taehwan Kim, Yisong Yue, James Krahe, Anastasio Garcia Rodriguez, Jessica Hodgins, Moshe Mahler, Iain Matthews, A Deep Learning Approach for Generalized Speech Animation, ACM Conference on Computer Graphics (SIGGRAPH), 2017 [pdf][supplementary][demo video]

· Taehwan Kim, Weiran Wang, Hao Tang and Karen Livescu, Signer-independent Fingerspelling Recognition with Deep Neural Network Adaptation, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016 (Best Student Paper of Speech and Language Processing) [pdf]

· Taehwan Kim, Yisong Yue, Sarah Taylor and Iain Matthews, A Decision Tree Framework for Spatiotemporal Sequence Prediction, ACM Conference on Knowledge Discovery and Data Mining (KDD), 2015 [pdf]

· Taehwan Kim, Greg Shakhnarovich and Karen Livescu, Fingerspelling Recognition with semi-Markov Conditional Random Fields, IEEE International Conference on Computer Vision (ICCV), 2013 [pdf]

· Taehwan Kim, Karen Livescu and Greg Shakhnarovich, American Sign Language Fingerspelling Recognition With Phonological Feature-based Tandem Models, IEEE Workshop on Spoken Language Technology (SLT), 2012 [pdf]

· Taehwan Kim, Greg Shakhnarovich and Raquel Urtasun, Sparse Coding for Learning Interpretable Spatio-temporal Primitives, Neural Information Processing Systems (NIPS), 2010 [pdf]

· Jihie Kim, Erin Shaw, Saul Wyner, Taehwan Kim and Jia Li, Discerning Affect in Student Discussions, Annual Meeting of the Cognitive Science Society (CogSci), 2010

· Jihie Kim, Jia Li and Taehwan Kim, Identifying student online discussions with unanswered questions, K-CAP, 2009

· Jihie Kim, Taehwan Kim and Jia Li, Identifying unresolved issues in online students discussions: A multi-phase dialogue classification approach, Proc. of the AI in Education Conference (AIED), 2009
