Assistant Professor
Department of Computer and Information Science and Engineering
University of Florida
Address: Malachowsky Hall, Gainesville, Florida 32611
Email: yuanyuan.lei@ufl.edu
About Me
I am a Tenure-Track Assistant Professor in the Department of Computer and Information Science and Engineering at the University of Florida, starting Fall 2025. My research focuses on Natural Language Processing (NLP), Large Language Models (LLMs), Machine Learning, and Deep Learning. I received my PhD in Computer Science from Texas A&M University in Summer 2025. Prior to that, I received my Master's degree in Statistics from Columbia University. I received my Bachelor's degree in Mathematics and Applied Mathematics, as well as a second Bachelor's degree in Computer Science, from the University of Science and Technology of China (USTC). I was named a Rising Star in Data Science (30 selected worldwide) by the University of Chicago and the University of California San Diego, and a Future Research Leader in Artificial Intelligence (30 selected nationally) by the University of Michigan.
Recruiting CS PhD students and Research Interns: I am looking for PhD students starting in Fall 2026, and Research Interns year-round. If you are interested in NLP and LLM research, please feel free to email me. Here are more details.
Research Overview
My research focuses on Natural Language Processing (NLP) and Large Language Models (LLMs). My main research contribution is to innovate Knowledge-aware Language Modeling - equipping LLMs with Structured Knowledge.
What is Structured Knowledge: Structured knowledge is structure, organized from unstructured text, that represents the structural connections between different pieces of information, such as logical structure or narrative structure. We expect structured knowledge to capture how information flows through a complex information ecosystem.
Why Knowledge-aware LLMs: My research shows that structured knowledge is key to solving complex semantic reasoning tasks, such as identifying logical fallacies or information framing. However, because LLMs are trained with an autoregressive loss function designed for generating fluent language, they still lack a deep understanding of such intricate structured knowledge.
How to develop Knowledge-aware LLMs: My research develops both supervised and unsupervised algorithms to extract the knowledge hidden in text, organize that knowledge into structured forms (relational, tree, graph), and integrate the structured knowledge into language modeling. In this way, we aim to enhance LLMs' ability to perform knowledge-based reasoning.
Education
Texas A&M University, College Station TX, Sep 2020 - May 2025
PhD in Computer Science, Natural Language Processing
Columbia University, New York NY, Sep 2017 - May 2019
Master's in Statistics, Data Science
University of Science and Technology of China, Sep 2013 - June 2017
Bachelor's in Mathematics and Applied Mathematics
Second Bachelor's degree in Computer Science
Awards
Rising Stars in Data Science (30 worldwide), University of Chicago and University of California San Diego, 2023
Future Research Leaders in Artificial Intelligence (30 national), University of Michigan - Ann Arbor, 2023
Outstanding PhD Research Award (one per year), Texas A&M University, 2025
Outstanding PhD Research Award (one per year), Texas A&M University, 2024
Alumni Honor Roll, Columbia University, 2022
Travel Grant Award, Texas A&M University, Fall 2024
Travel Grant Award, Texas A&M University, Spring 2024
Scholarship for Master students, Columbia University, Fall 2018
Scholarship for Master students, Columbia University, Spring 2018
The First Prize in National Mathematical Modeling Contest in China (Top 2%), Fall 2016
News
May 2025: I received the Outstanding PhD Research Award in 2025 from Texas A&M University
May 2025: One paper on multi-document summarization was accepted by ACL 2025
Sep 2024: One paper on logical fallacy reasoning in LLMs was accepted by EMNLP 2024
May 2024: I received the Outstanding PhD Research Award in 2024 from Texas A&M University
Mar 2024: Three papers were accepted by NAACL 2024: multi-document summarization, moral opinions, and information framing
Nov 2023: I was named a Rising Star in Data Science by the University of Chicago and the University of California San Diego
Oct 2023: Two papers were accepted by EMNLP 2023: discourse structures, and event relation graphs
April 2023: I was named a Future Research Leader in Data Science and Artificial Intelligence by the University of Michigan
Dec 2022: One paper on multi-modality learning was accepted by IEEE Transactions on Affective Computing
Oct 2022: I was named to the Alumni Honor Roll by Columbia University
Oct 2022: Two papers were accepted by EMNLP 2022: meta-learning for domain adaptation, and detecting information framing