NEWS: Our paper on image outpainting was accepted to ECCV 2024.

NEWS: Our paper on text-to-video generation was accepted to CVPR 2024.

NEWS: Our paper on multimodal storytelling with sound was accepted to Findings of EMNLP 2023.

NEWS: Our paper on slogan generation with LLMs and noise perturbation was accepted to CIKM 2023 (short paper).

NEWS: Our paper on sound-to-image generation was accepted to ICCV 2023.

NEWS: Our lab team won the 2nd place award in the CVPR 2022 LOng-form VidEo Understanding (LOVEU) challenge, Track 3 (2nd in Recall@1 and 1st in Recall@3) [tech report].

NEWS: In Aug. 2021, I joined the Artificial Intelligence Graduate School and the Department of Computer Science and Engineering at Ulsan National Institute of Science and Technology (UNIST) as an assistant professor. I am currently looking for self-motivated and curiosity-driven graduate students and undergraduate interns to join my group. If you are interested, please send me an email with your CV and transcript.


Currently, I am an assistant professor in the Artificial Intelligence Graduate School and the Department of Computer Science and Engineering at Ulsan National Institute of Science and Technology (UNIST). Previously, I was an applied scientist at Amazon Alexa AI and a lead research scientist at a start-up company, ObEN. Before that, I was a postdoctoral scholar in the Computing and Mathematical Sciences department at the California Institute of Technology, working with Prof. Yisong Yue. I completed my PhD in 2016 at Toyota Technological Institute at Chicago, a philanthropically endowed academic computer science institute located on the University of Chicago campus, where my advisor was Prof. Karen Livescu. I received my master's degree in Computer Science from USC and my bachelor's degree in Computer Science & Engineering and Mathematics from POSTECH.

My main research interests span problems in Machine Learning and its applications to Computer Vision and Language Processing. Specifically, I am interested in Deep Learning, Generative Models, Multimodal Learning, Transfer Learning, and Spatio-temporal Data Analysis.

I am fortunate to have worked with great students:


·  PhD students

 Jaeyeon Bae

 Jinsik Bang

 Minchang Chung

 Seonghee Han

 Seokhoon Jeong

 Hyunmin Song 


·  Master's students

 Eldor Fozilov 

 Jaemu Heo

 Yongsik Jo

 Siyeol Jung

 Jeonghun Kang 

 Youngbin Ki 

 Jongeun Kim

 Taesoo Kim

 Soyeong Kwon

 Taegyeong Lee

 Gaurav Saha

 Wonjin Yang


·  Undergraduate interns

·  Alumni

 Dahye Jang (Master's, 2023)

 Seok-Un Kang (Master's, 2024) 

 Chaeri Kim (Master's, 2024) 

 Hyeonyu Kim (Master's, 2024)

 Geonho Kim (Undergraduate intern, 2022-2023)

 Chanbin Lee (Undergraduate intern, 2022-2023)



Teaching

·  CSE402 Natural Language Processing, Spring 2024, UNIST

·  AI517 Deep Learning for Natural Language Processing and Understanding, Fall 2023, UNIST

·  CSE402 Natural Language Processing, Spring 2023, UNIST

·  CSE221 Data Structure, Fall 2022, UNIST

·  AI517 Deep Learning for Natural Language Processing and Understanding, Spring 2022, UNIST

·  AI503 AI Toolkits, Fall 2021, UNIST

·  CS159 Advanced Topics in Machine Learning: Structured Prediction, Spring 2017, Caltech


Selected Peer-Reviewed Publications

(For the complete list of publications, please see my Google Scholar page.)

·  Hyosun Park, Yongsik Jo, Seokun Kang, Taehwan Kim, M. James Jee, Deeper, Sharper, Faster: Application of Efficient Transformer to Galaxy Image Restoration, Astrophysical Journal 972:45 (2024) [pdf]

·  Soyeong Kwon*, Taegyeong Lee* and Taehwan Kim, Zero-shot Text-guided Infinite Image Generation with LLM guidance, European Conference on Computer Vision (ECCV), 2024 (to appear) [pdf][project page]

·  Taegyeong Lee*, Soyeong Kwon* and Taehwan Kim, Grid Diffusion Models for Text-to-Video Generation, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024 [pdf][project page]

·  Jaeyeon Bae*, Seokhoon Jeong*, Seokun Kang, Namgi Han, Jae-Yon Lee, Hyounghun Kim and Taehwan Kim, Sound of Story: Multi-modal Storytelling with Audio, Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 [pdf][project page]

·  Jongeun Kim, Minchung Kim and Taehwan Kim, Effective Slogan Generation with Noise Perturbation, ACM International Conference on Information and Knowledge Management (CIKM), 2023 (short paper) [pdf][project page]

·  Taegyeong Lee, Jeonghun Kang, Hyeonyu Kim and Taehwan Kim, Generating Realistic Images from In-the-wild Sounds, IEEE/CVF International Conference on Computer Vision (ICCV), 2023 [pdf][project page]

·  Hyeshin Chu, Joohee Kim, Seongouk Kim, Hongkyu Lim, Hyunwook Lee, Seungmin Jin, Jongeun Lee, Taehwan Kim and Sungahn Ko, An Empirical Study on How People Perceive AI-generated Music, ACM International Conference on Information and Knowledge Management (CIKM), 2022

·  Seyed Hamidreza Mohammadi and Taehwan Kim, One-shot voice conversion with disentangled representations by leveraging phonetic posteriorgrams, Interspeech, 2019 [pdf]

·  Chao Yang, Taehwan Kim, Ruizhe Wang, Hao Peng and C.-C. Jay Kuo, Show, attend and translate: Unsupervised image translation with self-regularization and attention, IEEE Transactions on Image Processing 28 (10), 4845-4856 (2019) [pdf]

·  Chao Yang, Taehwan Kim, Ruizhe Wang, Hao Peng and C.-C. Jay Kuo, ESTHER: Extremely Simple Image Translation Through Self-Regularization, British Machine Vision Conference (BMVC), 2018 [pdf]

·  Seyed Hamidreza Mohammadi and Taehwan Kim, Investigation of Using Disentangled and Interpretable Representations for One-shot Cross-lingual Voice Conversion, Interspeech, 2018 [pdf]

·  Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari and Karen Livescu, Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation, Computer Speech and Language, 2017 [pdf]


·  Sarah Taylor, Taehwan Kim, Yisong Yue, James Krahe, Anastasio Garcia Rodriguez, Jessica Hodgins, Moshe Mahler, Iain Matthews, A Deep Learning Approach for Generalized Speech Animation, ACM Conference on Computer Graphics (SIGGRAPH), 2017 [pdf][supplementary][demo video]


·  Taehwan Kim, Weiran Wang, Hao Tang and Karen Livescu, Signer-independent Fingerspelling Recognition with Deep Neural Network Adaptation, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016 (Best Student Paper of Speech and Language Processing) [pdf]


·  Taehwan Kim, Yisong Yue, Sarah Taylor and Iain Matthews, A Decision Tree Framework for Spatiotemporal Sequence Prediction, ACM Conference on Knowledge Discovery and Data Mining (KDD), 2015 [pdf]


·  Taehwan Kim, Greg Shakhnarovich and Karen Livescu, Fingerspelling Recognition with semi-Markov Conditional Random Fields, IEEE International Conference on Computer Vision (ICCV), 2013 [pdf]


·  Taehwan Kim, Karen Livescu and Greg Shakhnarovich, American Sign Language Fingerspelling Recognition With Phonological Feature-based Tandem Models, IEEE Workshop on Spoken Language Technology (SLT), 2012 [pdf]


·  Taehwan Kim, Greg Shakhnarovich and Raquel Urtasun, Sparse Coding for Learning Interpretable Spatio-temporal Primitives, Neural Information Processing Systems (NIPS), 2010 [pdf]


·  Jihie Kim, Erin Shaw, Saul Wyner, Taehwan Kim and Jia Li, Discerning Affect in Student Discussions, Annual Meeting of the Cognitive Science Society (CogSci), 2010


·  Jihie Kim, Jia Li and Taehwan Kim, Identifying student online discussions with unanswered questions, K-CAP, 2009


·  Jihie Kim, Taehwan Kim and Jia Li, Identifying unresolved issues in online students discussions: A multi-phase dialogue classification approach, Proc. of the AI in Education Conference (AIED), 2009