Our research is rooted in natural language processing, with a focus on how machines represent meaning, model context, and reason with knowledge. We study the core capabilities required for language systems to function effectively in complex real-world scenarios. As language increasingly serves as an interface to diverse data sources and environments, we examine how these systems generalize across languages and modalities and interact with external tools and web-scale information. Ultimately, we aim to develop AI systems that operate reliably across diverse real-world users and environments.
Among other topics, our research centers on three primary directions:
Evaluation and Reliability of Language Generation
We study how to rigorously evaluate natural language generation systems beyond surface-level metrics. Our research analyzes the limitations of automatic evaluation methods and examines how evaluation protocols influence model behavior and comparative judgments. We develop principled evaluation frameworks that better reflect human preferences and real-world usage, particularly in dialogue and open-ended generation.
Keywords💡: Dialogue system, Evaluation, Reliability, Robustness
Multilingual and Multimodal Language Intelligence
We extend language systems beyond text-only and monolingual settings to multilingual and multimodal contexts. Our work explores how models interpret meaning across languages, how cultural context shapes understanding, and how linguistic and visual signals interact in complex tasks. We aim to improve robustness and generalization across diverse linguistic, cultural, and multimodal environments.
Keywords💡: Multilingual, Multimodal, Multicultural
Agentic Language Systems
We investigate language-based agents that reason, retrieve information, and interact with external tools and environments. Our research focuses on enabling models to integrate external knowledge sources and perform structured multi-step reasoning. We study how to improve the reliability, controllability, and alignment of agentic systems operating in open-ended and web-scale settings.
Keywords💡: Agentic AI, Tool-augmented LLM, Web agent