Bitter Lessons in Structured Reasoning for Text Data
It is an appealing idea that structured representations of text, including representations from formal semantics, can help advance the state of the art for NLP applications. In this talk, I will discuss three problem settings where the "bitter lesson" has reared its head: LLMs trained on large datasets have rendered structured approaches unnecessary. However, throughout the talk, I will highlight how lightweight natural language representations of meaning may still have a part to play. First, I will discuss work on textual entailment for detecting hallucinations from language models. Grounding in representations like propositions or Abstract Meaning Representation is no longer necessary to achieve high performance on benchmark datasets, although these methods can provide interpretability and help localize errors. Second, I will discuss similar trends in the problem of fact-checking political claims. Finally, I will discuss work on entailment trees, structures connecting natural language premise statements to derived conclusions. Although these promise higher-precision inferences than methods like chain-of-thought, it is not clear that they can easily scale to complex settings. I will describe a challenging new benchmark dataset, MuSR, and provide motivation for future work in this area.
Bio: Greg Durrett is an associate professor of Computer Science at UT Austin. He received his BS in Computer Science and Mathematics from MIT and his PhD in Computer Science from UC Berkeley, where he was advised by Dan Klein. His research is broadly in the areas of natural language processing and machine learning. Currently, his group's focus is on techniques for reasoning about knowledge in text, verifying factuality of LLM generations, and building systems using LLMs as primitives. He is a 2023 Sloan Research Fellow and a recipient of a 2022 NSF CAREER award. He has co-organized the Workshop on Natural Language Reasoning and Structured Explanations at ACL 2023 and ACL 2024, as well as workshops on low-resource NLP and NLP for programming. He has served in numerous roles for ACL conferences, recently as a member of the NAACL Board since 2024 and as Senior Area Chair for NAACL 2024 and ACL 2024.
Representing Illustrative Visual Semantics with Descriptive Language
Contemporary visual semantic representations predominantly revolve around commonplace objects found in everyday images and videos, ranging from ladybugs and bunnies to airplanes. However, crucial visual cues extend beyond mere object recognition and interaction. They encompass a spectrum of richer semantics, including vector graphics (e.g., angles, mazes), scientific charts, and molecule graphs. Moreover, they entail intricate visual dynamics, such as object interactions, actions, and activities. Regrettably, traditional visual representations relying solely on pixels and regions fail to fully encapsulate these nuances. In this talk, I propose to design intermediate symbolic semantic representations to precisely describe and aggregate these low-level visual signals. This augmentation promises to enhance their utility as inputs for large language models or vision-language models, thereby facilitating high-level knowledge reasoning and discovery tasks. I will present several applications, ranging from playful maze solving and action detection to the intricate realm of drug discovery.
Bio: Heng Ji is a professor in the Computer Science Department, and an affiliated faculty member in the Electrical and Computer Engineering Department and the Coordinated Science Laboratory, at the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE). She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Multimedia Multilingual Information Extraction, Knowledge-enhanced Large Language Models, and Vision-Language Models. She was selected as a Young Scientist by the World Laureates Association in 2023 and 2024. She was selected to participate in DARPA AI Forward in 2023. She was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. She was named one of the Women Leaders of Conversational AI (Class of 2023) by Project Voice. Her other awards include the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, PACLIC 2012 Best Paper runner-up, "Best of ICDM 2013" paper award, "Best of SDM 2013" paper award, an ACL 2018 Best Demo Paper nomination, the ACL 2020 Best Demo Paper Award, the NAACL 2021 Best Demo Paper Award, Google Research Awards in 2009 and 2014, IBM Watson Faculty Awards in 2012 and 2014, and Bosch Research Awards in 2014-2018. She was invited to testify to the U.S. House Cybersecurity, Data Analytics, & IT Committee as an AI expert in 2023. She was invited by the Secretary of the U.S. Air Force and AFRL to join the Air Force Data Analytics Expert Panel to inform the Air Force Strategy 2030, and was invited to speak at the Federal Information Integrity R&D Interagency Working Group (IIRD IWG) briefing in 2023.
She coordinated the NIST TAC Knowledge Base Population task from 2010 to 2020. She was an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, and served as Program Committee Co-Chair of many conferences, including NAACL-HLT 2018 and AACL-IJCNLP 2022. She was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020-2023.