SustaiNLP 2021

Second Workshop on Simple and Efficient Natural Language Processing


Invited Speakers

Jacob Andreas: Better language modeling with worse context representations

Abstract: Transformer-based language models benefit from conditioning on enormous contexts containing hundreds to thousands of previous tokens. Do we really need explicit representations of long-range context to support accurate language modeling? I'll discuss recent work showing that extremely reduced context representations (including contexts without ordering information, contexts with most words deleted, and contexts in which all linguistic content has been replaced with a compact semantic summary) do not hurt, and in some cases improve, downstream prediction. These results suggest that the path to improved transformer-based NLP might not require attention over entire documents.
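
A minimal sketch of two of the context ablations described above, removing ordering information and deleting most context words; the function names and pure-Python setup are illustrative, not the implementation from the talk:

    import random

    def shuffle_context(tokens, seed=0):
        """Remove ordering information: the same words, in a random order."""
        rng = random.Random(seed)
        shuffled = list(tokens)
        rng.shuffle(shuffled)
        return shuffled

    def delete_words(tokens, keep_fraction=0.2, seed=0):
        """Keep only a small random subset of the context words, preserving order."""
        rng = random.Random(seed)
        return [t for t in tokens if rng.random() < keep_fraction]

    context = "the quick brown fox jumps over the lazy dog".split()
    print(shuffle_context(context))
    print(delete_words(context, keep_fraction=0.3))

Either ablated context would then be fed to the language model in place of the full original context before measuring prediction quality.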


Bio: Jacob Andreas is the X Consortium Assistant Professor at MIT. His research aims to build intelligent systems that can communicate effectively using language and learn from human guidance. Jacob earned his Ph.D. from UC Berkeley, his M.Phil. from Cambridge (where he studied as a Churchill Scholar), and his B.S. from Columbia. As a researcher at Microsoft Semantic Machines, he founded the language generation team and helped develop core pieces of the technology that powers conversational interaction in Microsoft Outlook. He has received a Sony Faculty Innovation Award, a Kolokotrones award for teaching at MIT, and paper awards at NAACL and ICML.

Colin Raffel: The Sweet Lesson

Abstract: Richard Sutton's essay "The Bitter Lesson" argues that "general methods that leverage computation are ultimately the most effective". In this talk, I will argue that the bitter lesson implies that, at a given point in time, it is often possible to outperform large-scale methods with methods that are more efficient and clever. Furthermore, actively working to develop more efficient methods has often uncovered new approaches that scale better. I call this perspective "the sweet lesson" and will present many examples of this principle. Finally, I will wrap up with some thoughts on how to internalize bitter and sweet lessons in NLP's current era of scale.


Bio: Colin Raffel is an Assistant Professor in the Department of Computer Science at the University of North Carolina at Chapel Hill. He also spends one day a week as a Faculty Researcher at Hugging Face.

Nazneen Rajani: Analyzing and Repurposing off-the-shelf Summarization Systems

Abstract: Despite their widespread success, LLMs are notorious for generating factually inconsistent outputs, or hallucinating. In particular, evaluation of text summarization systems has revealed that abstractive systems are susceptible to this problem, which in turn has limited their deployment in real-world applications. Using SummVis, an interactive visual analysis toolkit, we analyzed the state-of-the-art summarization systems BART and PEGASUS for different types of factual inconsistencies. Motivated by this analysis, we propose a three-step approach to repurposing off-the-shelf summarization systems for improved factual consistency. The first step filters noisy training data for faithful summaries based on two factual consistency metrics: entity overlap and dependency arc entailment. In the second step, we train expert models to generate factually consistent output, using reinforcement learning to minimize the error defined by the two metrics. Finally, we ensemble the experts with the base system to generate more faithful and consistent summaries on two popular datasets, XSum and CNN/DailyMail.
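
A minimal sketch of the first step, filtering training pairs by entity overlap, assuming spaCy's off-the-shelf NER; the function names and the strict threshold are illustrative choices rather than the exact setup from the talk:

    import spacy

    # Requires a spaCy model, e.g.: python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")

    def entity_overlap(source, summary):
        """Fraction of named entities in the summary that also appear in the source."""
        summary_ents = {ent.text.lower() for ent in nlp(summary).ents}
        if not summary_ents:
            return 1.0  # no entities, so nothing to hallucinate
        source_text = source.lower()
        return sum(ent in source_text for ent in summary_ents) / len(summary_ents)

    def filter_faithful(pairs, threshold=1.0):
        """Keep only (source, summary) pairs whose summary entities all occur in the source."""
        return [(src, summ) for src, summ in pairs if entity_overlap(src, summ) >= threshold]

The same filtering idea extends to the second metric, dependency arc entailment, by swapping in a different consistency scorer.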


Bio: Nazneen is a senior research scientist at Salesforce working on commonsense reasoning, interpretability, robustness, and human-AI collaboration. She received her Ph.D. in Computer Science from UT Austin in 2018 and has published more than 20 papers at top-tier AI conferences. Nazneen was a finalist for the VentureBeat Transform 2020 Women in AI Awards in the research category. Her work has been covered by various media outlets, including Quanta Magazine, VentureBeat, SiliconANGLE, ZDNet, and Datanami. More details about her work can be found on her website: http://www.nazneenrajani.com/.

Dan Roth: It’s Time for Reasoning

Abstract: The fundamental issue underlying natural language understanding is that of semantics: we need to move toward understanding natural language at an appropriate level of abstraction in order to support understanding and communication with computers. Machine learning has become ubiquitous in our attempts to induce semantic representations of natural language and to support decisions that depend on them; however, while we have made significant progress over the last few years, it has focused on classification tasks for which we have large amounts of annotated data. Supporting high-level decisions that depend on natural language understanding is still beyond our capabilities, partly because most of these tasks are very sparse and generating supervision signals for them does not scale. I will discuss some of the challenges underlying reasoning, that is, making natural language understanding decisions that depend on multiple, interdependent models, and will exemplify them mostly using the domain of reasoning about time as it is expressed in natural language.


Bio: Dan Roth is the Eduardo D. Glandt Distinguished Professor in the Department of Computer and Information Science at the University of Pennsylvania, lead of NLP Science at Amazon AWS AI, and a Fellow of the AAAS, the ACM, AAAI, and the ACL. In 2017 Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers; he was recognized “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.” Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory, and has developed advanced machine learning-based tools for natural language applications that are widely used. Roth was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR) and a program chair of AAAI, ACL, and CoNLL. He has been involved in several startups; most recently he was a co-founder and chief scientist of NexLP, a startup that leverages the latest advances in natural language processing, cognitive analytics, and machine learning in the legal and compliance domains. NexLP was acquired by Reveal in 2020. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.

Roy Schwartz: Not all Textual Instances are Alike: Efficient NLP by Better Understanding of our Data

Abstract: The computation required for deep learning research has been doubling every few months, resulting in an estimated 300,000x increase between 2012 and 2018. In this talk I will discuss several ways to reduce the computational costs of NLP models by better understanding our data. I will start by presenting a method for efficient inference that devotes the right amount of computation to each instance based on its complexity. I will continue with a method that substantially reduces the time required to train NLP models by selecting the instances that are most beneficial for learning and training the model only on them. Finally, I will present several limitations of the masked language modeling objective for vision-and-language tasks, along with new masking strategies that mitigate these limitations and lead to more efficient use of the training data. This is joint work with Yonatan Bitton, Yejin Choi, Jesse Dodge, Michael Elhadad, Hannaneh Hajishirzi, Nicholas Lourie, Noah A. Smith, Gabriel Stanovsky, Swabha Swayamdipta, and Yizhong Wang.
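
One common recipe for this kind of instance-adaptive inference is confidence-based early exiting, in which cheap classifiers attached to intermediate layers let easy inputs stop computation early. The sketch below is a generic illustration under that assumption (a fixed confidence threshold and precomputed per-layer logits), not the specific method from the talk:

    import numpy as np

    def softmax(logits):
        z = np.asarray(logits, dtype=float)
        z = z - z.max()  # subtract the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    def early_exit_predict(layer_logits, confidence_threshold=0.9):
        """Try each intermediate classifier in order; stop at the first confident one."""
        for depth, logits in enumerate(layer_logits, start=1):
            probs = softmax(logits)
            if probs.max() >= confidence_threshold:
                return int(probs.argmax()), depth  # easy instance: exit early
        return int(probs.argmax()), depth  # hard instance: use the full depth

    # An "easy" instance exits after one layer; a "hard" one needs all three.
    easy = [[4.0, 0.1], [5.0, 0.1], [6.0, 0.1]]
    hard = [[0.4, 0.5], [0.6, 0.5], [3.0, 0.1]]
    print(early_exit_predict(easy))  # (0, 1)
    print(early_exit_predict(hard))  # (0, 3)

Per-instance depth then tracks difficulty: confident inputs cost a single layer of classification, while ambiguous ones pay for the full stack.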


Bio: Roy Schwartz is a senior lecturer at the School of Computer Science and Engineering at The Hebrew University of Jerusalem (HUJI), where he studies natural language processing and artificial intelligence. Prior to joining HUJI, Roy was a postdoc (2016-2019) and then a research scientist (2019-2020) at the Allen Institute for AI and at the School of Computer Science and Engineering at the University of Washington, where he worked with Noah A. Smith. Roy completed his Ph.D. in 2016 at the School of Computer Science and Engineering at HUJI, where he worked with Ari Rappoport. Roy's work has appeared on the cover of CACM and has been featured in, among other outlets, The New York Times, MIT Technology Review, and Forbes.

Yulia Tsvetkov: Towards Interpretability Without an Overhead

Abstract: Modern neural networks provide limited insight into their decisions and are typically treated as black boxes. Opening up the black box incurs a significant overhead, which leads practitioners to fall back on black-box models that are not well understood, overfit to confounds, and propagate biases. In my talk, I will discuss one proposed direction for interpreting the decisions of text classification models without additional overhead. Specifically, I'll present SelfExplain, a self-explaining model architecture that incorporates local and global interpretability layers while maintaining training/inference times and classification accuracy. Next, I'll present three use cases for generating interpretations via SelfExplain: (1) in supervised classification, (2) in unsupervised settings, and (3) in text classification models that incorporate auxiliary features into large language model-based classifiers. I will conclude with a discussion of several future directions towards interpretable and sustainable classifiers.


Bio: Yulia Tsvetkov is an assistant professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research group works on NLP for social good, multilingual NLP, and language generation. These projects are motivated by a unified goal: to extend the capabilities of human language technology beyond individual populations and across language boundaries, thereby enabling NLP for diverse and disadvantaged users, the users who need it most. Prior to joining UW, Yulia was an assistant professor at Carnegie Mellon University and a postdoc at Stanford. Yulia is a recipient of the Okawa Research Award, an Amazon Machine Learning Research Award, a Google Faculty Research Award, and multiple NSF awards.

Panel Discussion

We will also hold a moderated panel discussion with our invited speakers as well as the following researchers, who have also agreed to participate:

  • Emily M. Bender

  • Goran Glavaš

  • Colin Raffel

  • Roy Schwartz

  • Moshe Wasserblat

  • Perez Ogayo