Se Jin Park

PhD Candidate,

Integrated Vision and Language Lab (IVLLab),

Department of Electrical Engineering,

Korea Advanced Institute of Science and Technology (KAIST)

Email: jinny960812@kaist.ac.kr

CV | Google Scholar | LinkedIn

News

I will be joining Google DeepMind as a Student Researcher in Mountain View, CA, from July to October 2024, supervised by Julian Salazar and Aren Jansen.

Publications

<C: Conference, J: Journal, P: Preprint, *: Equal Contribution>

2024
[P4] AV-EmoDialog: Chat with Audio-Visual Users Leveraging Emotional Cues Se Jin Park, Yeonju Kim, Hyeongseop Rha, and Yong Man Ro Under Review
[C8] Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation Se Jin Park*, Chae Won Kim*, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeonghun Yeo, and Yong Man Ro Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2024, [paper | data]
[C7] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation Jeongsoo Choi*, Se Jin Park*, Minsu Kim*, and Yong Man Ro IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR Highlight), 2024, [paper | demo | code]
[C6] Persona Extraction Through Semantic Similarity For Emotional Support Conversation Generation Seunghee Han, Se Jin Park, Chae Won Kim, and Yong Man Ro IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, [paper]
[C5] Exploring Phonetic Context in Lip Movement for Authentic Talking Face Generation Se Jin Park, Minsu Kim, Jeongsoo Choi, and Yong Man Ro IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, [paper | demo]
[C4] Reprogramming Audio-driven Talking Face Synthesis into Text-driven Jeongsoo Choi, Minsu Kim, Se Jin Park, and Yong Man Ro IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, [paper | demo]
[P3] Efficient Multilingual Visual Speech Recognition by Modeling Discretized Visual Speech Units Minsu Kim*, Jeonghun Yeo*, Jeongsoo Choi, Se Jin Park, and Yong Man Ro Under Review, 2024, [paper]

2023
[C3] Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model Joanna Hong, Se Jin Park, and Yong Man Ro Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023, [paper]
[P2] DF-3DFace: One-to-Many Speech Synchronized 3D Facial Animation with Diffusion Se Jin Park, Joanna Hong, Minsu Kim, and Yong Man Ro arXiv Preprint, 2023, [paper]

2022
[C2] SyncTalkFace: Talking Face Generation with Precise Lip-syncing via Audio-Lip Memory Se Jin Park, Minsu Kim, Joanna Hong, Jeongsoo Choi, and Yong Man Ro AAAI Conference on Artificial Intelligence (AAAI Oral), 2022, [paper]
[P1] Test-time Adaptation for Real Image Denoising via Meta-transfer Learning Agus Gunawan, Muhammad Adi Nugroho, and Se Jin Park arXiv Preprint, 2022, [paper]

2021
[C1] Multi-Modality Associative Bridging Through Memory: Speech Sound Recollected From Face Video Minsu Kim*, Joanna Hong*, Se Jin Park, and Yong Man Ro IEEE/CVF International Conference on Computer Vision (ICCV), 2021, [paper]
[J1] Speech Reconstruction with Reminiscent Sound via Visual Voice Memory Joanna Hong, Minsu Kim, Se Jin Park, and Yong Man Ro IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2021, [paper]
[J2] CroMM-VSR: Cross-Modal Memory Augmented Visual Speech Recognition Minsu Kim, Joanna Hong, Se Jin Park, and Yong Man Ro IEEE Transactions on Multimedia (TMM), 2021, [paper]

Education

Korea Advanced Institute of Science and Technology (KAIST), South Korea (2022 - Present)

Ph.D. in Electrical Engineering (advisor: Prof. Yong Man Ro)

Korea Advanced Institute of Science and Technology (KAIST), South Korea (2020 - 2022)

M.S. in Electrical Engineering (advisor: Prof. Yong Man Ro)

Korea Advanced Institute of Science and Technology (KAIST), South Korea (2015 - 2020)

B.S. in Electrical Engineering

Nanjing International School (NIS), China (2009 - 2015)

International Baccalaureate (IB) Diploma

Paparoa Street School, New Zealand (2006 - 2007)

Teaching Experience

- EE474 Introduction to Multimedia, KAIST (2022 Spring, 2023 Spring, 2024 Spring)
- EE305 Introduction to Electronics Design Lab, KAIST (2022 Fall, 2023 Fall)

Academic Services

Conference Reviewer

AAAI Conference on Artificial Intelligence (AAAI) (2023, 2024)
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)

Journal Reviewer

IEEE Transactions on Multimedia
Journal of Natural Language Processing
Neural Processing Letters

Skills

Programming Languages

Python, C, and MATLAB

Languages

Korean, English, and Chinese