Workshop Schedule (20 July 2018)
- 09.30-09.45: Welcome and Opening Remarks
- 09.45-10.30: Invited talk 1: Yejin Choi -- The Missing Representation in Neural Language Models [Slides]
- 10.30-11.00: Coffee Break
- 11.00-11.45: Invited talk 2: Tim Baldwin -- Adversarial learning for unbiased and robust text processing [Slides]
- 11.45-12.30: Invited talk 3: Margaret Mitchell -- Fair and Inclusive Representation Learning
- 12.30-14.00: Lunch
- 14.00-14.45: Invited talk 4: Yoav Goldberg
- 14.45-15.00: Outstanding Papers Spotlight Presentations
- Kawin Ethayarajh. Unsupervised Random Walk Sentence Embeddings: A Strong but Simple Baseline. [PDF]
- Yanrong Wu and Zhichun Wang. Knowledge Graph Embedding with Numeric Attributes of Entities. [PDF]
- Ahmet Üstün, Murathan Kurfalı and Burcu Can. Characters or Morphemes: How to Represent Words? [PDF]
- Yinfei Yang, Steve Yuan, Daniel Cer, Sheng-Yi Kong, Noah Constant, Petr Pilar, Heming Ge, Yun-hsuan Sung, Brian Strope and Ray Kurzweil. Learning Semantic Textual Similarity from Conversations. [PDF]
- Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan and Noah A. Smith. LSTMs Exploit Linguistic Attributes of Data. [PDF]
- 15.00-16.30: Poster Session (including Coffee Break from 15.30-16.00) + Drinks Reception
- 16.30-17.30: Panel Discussion
- 17.30-17.40: Closing Remarks + Best Paper Awards Announcement
- Yejin Choi, University of Washington
Yejin Choi is an Associate Professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington and also a senior research manager at AI2 overseeing Project Alexandria. Her research interests include language grounding with vision, physical and social commonsense knowledge, language generation with long-term coherence, conversational AI, and AI for social good. She was among the IEEE’s AI Top 10 to Watch in 2015, a co-recipient of the Marr Prize at ICCV 2013, and a faculty advisor for the Sounding Board team that won the inaugural Alexa Prize Challenge in 2017. Her work on detecting deceptive reviews, predicting literary success, and interpreting bias and connotation has been featured by numerous media outlets including NBC News for New York, NPR Radio, the New York Times, and Bloomberg Business Week. She received her Ph.D. in Computer Science from Cornell University.
Keynote: The Missing Representation in Neural Language Models
It is curious how machines can learn to translate from one language to another, while still having major difficulties in performing tasks humans complete with ease, such as document summarization and conversation. Equally curious is the fact that language models, trained only on natural language text, often generate text so unnatural that humans would instantly rule it implausible.
In this talk I will discuss the missing representations in neural language models and how we might lift their limitations. First, I will argue that the current paradigm of “learning by reading” is fundamentally limiting for long-form text generation and introduce a new framework of “learning by writing” where machines can learn through practice. Second, I will discuss how commonsense knowledge and reasoning are critical for enabling machines to read between the lines, and how we might be able to model this latent commonsense through language.
- Tim Baldwin, University of Melbourne
Timothy Baldwin is a Professor in the Department of Computing and Information Systems, The University of Melbourne, an Australian Research Council Future Fellow, and a contributed research staff member of NICTA Victoria. He has previously held visiting positions at the University of Washington, University of Tokyo, Saarland University, and NTT Communication Science Laboratories. His research interests include text mining of social media, computational lexical semantics, information extraction and web mining, with a particular interest in the interface between computational and theoretical linguistics. Current projects include web user forum mining, text mining of Twitter, and intelligent interfaces for Japanese language learners. He is currently Secretary of the Australasian Language Technology Association, and a member of the Executive Committee of the Asian Federation of Natural Language Processing. Tim completed a BSc(CS/Maths) and BA(Linguistics/Japanese) at The University of Melbourne in 1995, and an MEng(CS) and PhD(CS) at the Tokyo Institute of Technology in 1998 and 2001, respectively. Prior to commencing his current position at The University of Melbourne, he was a Senior Research Engineer at the Center for the Study of Language and Information, Stanford University (2001-2004).
Keynote: Adversarial learning for unbiased and robust text processing
Natural Language Processing systems can be highly accurate, yet are often brittle in the face of linguistic variation and shifts in domain, and moreover are generally biased by the composition of the training data. I will detail a number of approaches for improving the robustness of NLP systems through a range of techniques, including:
(1) engineering more diverse inputs, through linguistic transformations;
(2) joint learning of both domain-specific and domain-general model components, coupled with adversarial training for domain; and
(3) explicitly learning representations that can obscure author characteristics at training time.
Over tasks including sentiment analysis, language identification, and POS tagging, I will show that the resulting models are more robust out-of-domain, and in the case of the last method, are less demographically biased, while also paving the way towards privacy-preserving modelling of language.
- Margaret Mitchell, Google Research and Machine Intelligence
Margaret Mitchell is a Senior Research Scientist in Google AI, serving as a tech lead in Google's ML Fairness effort, and lead of the Ethical AI team. Her research involves vision-language and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI. Her recent work focuses on issues of diversity and representation in text and face images.
Keynote: Fair and Inclusive Representation Learning
This talk will define some notions of both fairness and inclusion in machine learning, and cover learning methods that meet fairness and inclusion goals. Transfer learning, multi-task learning, and adversarial multi-task learning have recently emerged as powerful tools for learning representations where some problematic biases are mitigated, and some desired traits are preserved. I will discuss the strengths and weaknesses of the different learning methods, and their application to recent technology in both natural language processing and computer vision.
- Yoav Goldberg, Bar Ilan University
Yoav Goldberg is a Senior Lecturer at Bar Ilan University's Computer Science Department. Before that, he was a post-doctoral Research Scientist at Google Research New York. He works on problems related to Natural Language Processing and Machine Learning. In particular, he is interested in syntactic parsing, structured-prediction models, learning for greedy decoding algorithms, multilingual language understanding, and cross-domain learning. Lately, he has also become interested in neural-network-based methods for NLP. He completed his PhD at Ben-Gurion University in 2011, under the supervision of Prof. Michael Elhadad.