Data Science Talk Series

Title: Synergizing Knowledge and Large Language Models

Speaker: Dr. Xinya Du, University of Texas at Dallas

Time: 2:00 pm - 3:30 pm on 12-04-2024 (Wednesday)

Room: E297L, Discovery Park, UNT

Coordinator: Dr. Haihua Chen

Abstract:

Large Language Models (LLMs) have revolutionized the field of natural language processing and reshaped how humans acquire and interact with knowledge. In this talk, I will discuss my research on synergizing LLMs and knowledge, where LLMs not only extract and discover knowledge but also continually improve by integrating new knowledge. First, I will cover our work on improving knowledge extraction from the vast amount of existing literature, focusing on enabling models to understand long documents in a cost-efficient and comprehensive manner. I will describe a novel paradigm for representing document-level structured information as question-answer pairs, and how we enable LLMs to extract them by leveraging global context through retrieval-augmented modeling, addressing the fundamental challenges of long-context understanding. Next, I will introduce our pioneering work on LLMs for new scientific knowledge discovery, in which we explore a multi-stage, LLM-based framework to generate and iteratively refine natural language scientific hypotheses. Finally, building on the above efforts, I will demonstrate how LLMs can self-improve in reasoning by continuously integrating new knowledge, enhancing their ability to generate logical, trustworthy, and explainable outputs.
