Workshop Schedule

Following the rest of the ACL conference, RepL4NLP will be held remotely this year. To make the virtual format as accessible as possible to attendees around the world, the workshop will be split into three sessions, each falling approximately within "normal" working hours for a different set of timezones. All talks will be pre-recorded and available to watch before and during the workshop; live Q&A sessions for the invited talks and live poster sessions will be held on the workshop date of July 9. We're excited about this opportunity to make the workshop more accessible!

Pre-recorded talks are available now on the virtual ACL 2020 website! Per-poster Zoom links will be posted soon in the RepL4NLP Rocket Chat channel.

All times below are in PDT on July 9, 2020.

Session 1

  • 01:00-01:15 Welcome and Opening Remarks
  • 01:15-02:45 Poster session 1
    • Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces. Ivan Vulić, Anna Korhonen and Goran Glavaš. Best short paper.
    • On the Ability of Self-Attention Networks to Recognize Counter Languages. Satwik Bhattamishra, Kabir Ahuja and Navin Goyal.
    • Word Embeddings as Tuples of Feature Probabilities. Siddharth Bhat, Alok Debnath, Souvik Banerjee and Manish Shrivastava.
    • Compositionality and Capacity in Emergent Languages. Abhinav Gupta, Cinjon Resnick, Jakob Foerster, Andrew Dai and Kyunghyun Cho.
    • Learning Geometric Word Meta-Embeddings. Pratik Jawanpuria, Satya Dev N T V, Anoop Kunchukuttan and Bamdev Mishra.
    • Variational Inference for Learning Representations of Natural Language Edits. Edison Marrese-Taylor, Machel Reid and Yutaka Matsuo.
    • Adversarial Training for Commonsense Inference. Lis Pereira, Xiaodong Liu, Fei Cheng, Masayuki Asahara and Ichiro Kobayashi.
    • Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference. Cemil Cengiz and Deniz Yuret.
    • A Metric Learning Approach to Misogyny Categorization. Juan Manuel Coria, Sahar Ghannay, Sophie Rosset and Hervé Bredin.
    • On the Choice of Auxiliary Languages for Improved Sequence Tagging. Lukas Lange, Heike Adel and Jannik Strötgen.
    • Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text. Lukas Lange, Anastasiia Iurshina, Heike Adel and Jannik Strötgen.
    • Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation. Alessio Miaschi and Felice Dell'Orletta.
    • Staying True to Your Word: (How) Can Attention Become Explanation? Martin Tutek and Jan Snajder.
    • A Simple Approach to Learning Unsupervised Multilingual Embeddings. Pratik Jawanpuria, Mayank Meghwanshi and Bamdev Mishra.
    • What's in a Name? Are BERT Named Entity Representations just as Good for any other Name? Sriram Balasubramanian, Naman Jain, Gaurav Jindal, Abhijeet Awasthi and Sunita Sarawagi.
    • AI4Bharat-IndicNLP Dataset: Monolingual Corpora and Word Embeddings for Indic Languages. Anoop Kunchukuttan, Divyanshu Kakwani, Satish Golla, Gokul N.C., Avik Bhattacharyya, Mitesh M. Khapra and Pratyush Kumar.
    • Evaluating Natural Alpha Embeddings on Intrinsic and Extrinsic Tasks. Riccardo Volpi and Luigi Malagò.
    • Predicting Sexual and Reproductive Health of Migrants using Data Science. Amber Nigam, Pragati Jaiswal, Teertha Arora, Uma Girkar and Leo Anthony Celi.
    • Job Recommendation through Progression of Job Selection. Amber Nigam, Aakash Roy, Hartaran Singh and Arpan Saxena.


Session 2

  • 07:00-07:15 Welcome and Opening Remarks
  • 07:15-07:45 Invited speaker Q&A: Ellie Pavlick
  • 07:45-08:00 Break
  • 08:00-08:30 Invited speaker Q&A: Evelina Fedorenko
  • 08:30-10:00 Poster session 2
    • Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces. Ivan Vulić, Anna Korhonen and Goran Glavaš. Best short paper.
    • Are All Languages Created Equal in Multilingual BERT? Shijie Wu and Mark Dredze. Best long paper.
    • On the Ability of Self-Attention Networks to Recognize Counter Languages. Satwik Bhattamishra, Kabir Ahuja and Navin Goyal.
    • Zero-Resource Cross-Domain Named Entity Recognition. Zihan Liu, Genta Indra Winata and Pascale Fung.
    • Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages. Tyler A. Chang and Anna Rafferty.
    • Learning Probabilistic Sentence Representations from Paraphrases. Mingda Chen and Kevin Gimpel.
    • Word Embeddings as Tuples of Feature Probabilities. Siddharth Bhat, Alok Debnath, Souvik Banerjee and Manish Shrivastava.
    • Compositionality and Capacity in Emergent Languages. Abhinav Gupta, Cinjon Resnick, Jakob Foerster, Andrew Dai and Kyunghyun Cho.
    • Learning Geometric Word Meta-Embeddings. Pratik Jawanpuria, Satya Dev N T V, Anoop Kunchukuttan and Bamdev Mishra.
    • Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference. Cemil Cengiz and Deniz Yuret.
    • A Metric Learning Approach to Misogyny Categorization. Juan Manuel Coria, Sahar Ghannay, Sophie Rosset and Hervé Bredin.
    • On the Choice of Auxiliary Languages for Improved Sequence Tagging. Lukas Lange, Heike Adel and Jannik Strötgen.
    • Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text. Lukas Lange, Anastasiia Iurshina, Heike Adel and Jannik Strötgen.
    • Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation. Alessio Miaschi and Felice Dell'Orletta.
    • Staying True to Your Word: (How) Can Attention Become Explanation? Martin Tutek and Jan Snajder.
    • Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning. Mitchell Gordon, Kevin Duh and Nicholas Andrews.
    • On Dimensional Linguistic Properties of the Word Embedding Space. Vikas Raunak, Vaibhav Kumar, Vivek Gupta and Florian Metze.
    • A Simple Approach to Learning Unsupervised Multilingual Embeddings. Pratik Jawanpuria, Mayank Meghwanshi and Bamdev Mishra.
    • A Cross-Task Analysis of Text Span Representations. Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu and Kevin Gimpel.
    • Enhancing Transformer with Sememe Knowledge. Yuhui Zhang, Chenghao Yang, Zhengping Zhou and Zhiyuan Liu.
    • Evaluating Compositionality of Sentence Representation Models. Hanoz Bhathena, Angelica Willis and Nathan Dass.
    • Supertagging with CCG primitives. Aditya Bhargava and Gerald Penn.
    • What's in a Name? Are BERT Named Entity Representations just as Good for any other Name? Sriram Balasubramanian, Naman Jain, Gaurav Jindal, Abhijeet Awasthi and Sunita Sarawagi.
    • Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT. Ashutosh Adhikari, Achyudh Ram, Raphael Tang, William L. Hamilton and Jimmy Lin.
    • AI4Bharat-IndicNLP Dataset: Monolingual Corpora and Word Embeddings for Indic Languages. Anoop Kunchukuttan, Divyanshu Kakwani, Satish Golla, Gokul N.C., Avik Bhattacharyya, Mitesh M. Khapra and Pratyush Kumar.
    • Evaluating Natural Alpha Embeddings on Intrinsic and Extrinsic Tasks. Riccardo Volpi and Luigi Malagò.
    • Predicting Sexual and Reproductive Health of Migrants using Data Science. Amber Nigam, Pragati Jaiswal, Teertha Arora, Uma Girkar and Leo Anthony Celi.
    • Job Recommendation through Progression of Job Selection. Amber Nigam, Aakash Roy, Hartaran Singh and Arpan Saxena.


Session 3

  • 17:00-17:15 Welcome and Opening Remarks
  • 17:15-17:45 Invited speaker Q&A: Kristina Toutanova
  • 17:45-18:00 Break
  • 18:00-18:30 Invited speaker Q&A: Mike Lewis
  • 18:30-20:00 Poster session 3
    • Are All Languages Created Equal in Multilingual BERT? Shijie Wu and Mark Dredze. Best long paper.
    • Zero-Resource Cross-Domain Named Entity Recognition. Zihan Liu, Genta Indra Winata and Pascale Fung.
    • Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages. Tyler A. Chang and Anna Rafferty.
    • Learning Probabilistic Sentence Representations from Paraphrases. Mingda Chen and Kevin Gimpel.
    • Variational Inference for Learning Representations of Natural Language Edits. Edison Marrese-Taylor, Machel Reid and Yutaka Matsuo.
    • Adversarial Training for Commonsense Inference. Lis Pereira, Xiaodong Liu, Fei Cheng, Masayuki Asahara and Ichiro Kobayashi.
    • Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning. Mitchell Gordon, Kevin Duh and Nicholas Andrews.
    • On Dimensional Linguistic Properties of the Word Embedding Space. Vikas Raunak, Vaibhav Kumar, Vivek Gupta and Florian Metze.
    • A Cross-Task Analysis of Text Span Representations. Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu and Kevin Gimpel.
    • Enhancing Transformer with Sememe Knowledge. Yuhui Zhang, Chenghao Yang, Zhengping Zhou and Zhiyuan Liu.
    • Evaluating Compositionality of Sentence Representation Models. Hanoz Bhathena, Angelica Willis and Nathan Dass.
    • Supertagging with CCG primitives. Aditya Bhargava and Gerald Penn.
    • Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT. Ashutosh Adhikari, Achyudh Ram, Raphael Tang, William L. Hamilton and Jimmy Lin.

Invited Speakers

Ellie Pavlick is the Manning Assistant Professor of Computer Science at Brown University, where she runs the Language Understanding and Representation (LUNAR) Lab. She received her PhD from the University of Pennsylvania in 2017 and was a postdoc at Google Research, where she remains a part-time Research Scientist. Ellie's interests are in semantic representations and reasoning, and in particular in building computational models that mirror what we understand about humans' language representation, processing, and acquisition. She is currently focused on grounded and interactive language learning. Her lab works closely with the departments of linguistics and cognitive science in studying humans' semantic representations, and with Brown's Humans to Robots Lab in developing situated models for language learning.

I don’t know what you mean semantics is hard: Challenges in evaluation of semantic phenomena

Recently, there has been intense focus on understanding what exactly deep language models understand about language. A wave of papers has presented exciting results, in particular suggesting that deep LMs encode a surprising amount of information about complex syntactic structure, from parts of speech to dependency graphs to coreference chains. Evidence that models encode 'semantics' has been less plentiful, and the work that does exist suggests more mixed results. In this talk, I argue that asking the question "do LMs encode semantics?" is premature, since we do not yet have a convincing evaluation, or even a definition, of what it means for a model to "encode semantics" in the way we do for syntax. I'll present three case studies to illustrate why models of human semantic inferences are difficult to codify, and discuss what this means for research on computers' natural language understanding.


Evelina Fedorenko is a cognitive neuroscientist who specializes in the study of the human language system. She received her Bachelor's degree in Psychology and Linguistics from Harvard University in 2002. She then pursued graduate studies in cognitive science and neuroscience at MIT. After receiving her Ph.D. in 2007, she was awarded a K99/R00 career development award from NICHD and stayed on as a postdoctoral researcher and then a research scientist at MIT. In 2014, she joined the faculty at HMS/MGH, and in 2019 she moved back to MIT, where she is currently an Associate Professor in the Brain and Cognitive Sciences Department and a member of the McGovern Institute for Brain Research. Fedorenko aims to understand the computations we perform and the representations we build during language processing, and to provide a detailed characterization of the brain regions underlying these computations and representations. She uses an array of methods, including fMRI, ERPs, MEG, intracranial recordings and stimulation, and tools from Natural Language Processing and machine learning, and works with diverse populations, including healthy adults and children, as well as individuals with developmental and acquired brain disorders.

Artificial Neural Networks as models of language comprehension in the human brain

Human language surpasses all other animal communication systems in its complexity and generative power. My lab uses a combination of behavioral, brain imaging, and computational approaches to illuminate the functional architecture of language, with the ultimate goal of deciphering the representations and computations that enable us to understand and produce language. I will first briefly summarize three discoveries my lab has made over the last decade. In particular, I will show that i) the language network is selective for language processing over a wide range of non-linguistic processes; ii) every brain region that supports syntactic processing is also sensitive to word meanings; and iii) local linguistic composition is the core driver of the response in the language-selective areas: as long as nearby words can be combined into phrases/clauses, the language areas respond as strongly as they do to their preferred stimulus, naturalistic sentences. These results jointly constrain the space of possibilities for the computations underlying language comprehension. However, until recently, we lacked computationally precise, neurally plausible models that could serve as quantitative hypotheses for how core aspects of language might be implemented in neural tissue. Inspired by the success of artificial neural network (ANN) models in explaining neural responses in perceptual tasks, we recently investigated whether state-of-the-art ANN language models capture human brain activity elicited during language comprehension. We tested 43 diverse language models on three neural datasets (two fMRI, one ECoG) and found that the most powerful transformer models accurately predict neural responses, in some cases achieving near-perfect predictivity relative to the noise ceiling. The models' predictivity correlated with their success on a next-word-prediction task and with their ability to explain comprehension difficulty in an independent behavioral dataset. Intriguingly, model architecture alone drives a large portion of model-brain predictivity, with each model's untrained score predictive of its trained score. These results show that specific language models capture substantial variance in neural and behavioral responses to language, thereby providing the first computationally precise accounts of how the human brain may solve the problem of language comprehension and laying the foundation for further explorations of the actual computations engaged in language understanding.


Kristina Toutanova is a research scientist at Google Research in the Language team in Seattle and an affiliate faculty member at the University of Washington. She obtained her Ph.D. from Stanford University with Christopher Manning. Prior to joining Google in 2017, she was a researcher at Microsoft Research, Redmond. Kristina focuses on modeling the structure of natural language using machine learning, most recently in the areas of representation learning, question answering, information retrieval, semantic parsing, and knowledge base completion. Kristina is a general chair for NAACL 2021, and has previously served as an ACL program co-chair and a TACL co-editor-in-chief.

Text representations for retrieval and question answering

In this talk, I will first give an overview of our team's recent work on learning efficient representations for large-scale text retrieval and on understanding the capacity of these fixed-dimensional dense representations as the document length increases. I will then present approaches that enrich contextual word representations by retrieving relevant knowledge derived from text, improving the ability of the representations to take into account fine-grained, entity-focused background knowledge. Finally, I will highlight progress toward enabling more resource-efficient and multilingual representations.


Mike Lewis is a research scientist at Facebook AI Research in Seattle, working on representation learning for language. Previously he was a postdoc at the University of Washington (working with Luke Zettlemoyer), developing search algorithms for neural structured prediction. He has a PhD from the University of Edinburgh (advised by Mark Steedman) on combining symbolic and distributed representations of meaning. He received an Outstanding Submission Award at the 2014 ACL Workshop on Semantic Parsing, Best Paper at EMNLP 2016, Best Resource Paper at ACL 2017, and Best Paper Honourable Mention at ACL 2018. His work has been extensively covered in the media, with varying levels of accuracy.

Beyond BERT

Denoising auto-encoders can be pre-trained at a very large scale by corrupting and then reconstructing any input text. Existing methods, based on variations of masked language models, have transformed the field and now provide the de facto initialization to be fine-tuned for nearly any text classification problem. In this talk, I will describe our recent unsupervised representation learning models that can perform more tasks in more languages with less supervision. I will discuss the surprising effectiveness of simplifying and scaling masked language models (RoBERTa), and show how similar ideas can be used to pre-train sequence-to-sequence models that can also be used for generation (BART). I will also argue for using separate mechanisms for modeling encyclopedic and linguistic knowledge: in particular, equipping language models with large non-parametric memories (kNN-LM, RAG), and pre-training sequence-to-sequence models via paraphrasing (MARGE). MARGE performs well on classification, generation and retrieval tasks in many languages, in some cases without supervision, making it arguably the most broadly applicable pre-trained model to date.