Invited Speakers

We are happy to announce that we'll have four keynote speakers during NLPBT 2021.

Jason Baldridge. Google

Title: Language, vision and action are better together

Abstract: Human knowledge and use of language is inextricably connected to perception, action and the organization of the brain, yet natural language processing is still dominated by text! More research involving language---including speech---in the context of other modalities and environments is needed, and there has never been a better time to do it.

Without ever invoking the worn-out, overblown phrase "how babies learn" in the talk, I'll cover three of my team's efforts involving language, vision and action. First: our work on speech-image representation learning and retrieval, where we demonstrate settings in which directly encoding speech outperforms the hard-to-beat strategy of using automatic speech recognition and strong text encoders. Second: two models for text-to-image generation: a multi-stage model which exploits user-guidance in the form of mouse traces and a single-stage one which uses cross-modal contrastive losses. Third: Room-across-Room, a multilingual dataset for vision-and-language navigation, for which we collected spoken navigation instructions, high-quality text transcriptions, and fine-grained alignments between words and pixels in high-definition 360-degree panoramas. I'll wrap up with some thoughts on how work on computational language grounding more broadly presents new opportunities to enhance and advance our scientific understanding of language and its fundamental role in human intelligence.

Bio: Jason is a research scientist at Google, where he works on natural language understanding. He was previously an Associate Professor of Computational Linguistics at the University of Texas at Austin. His main research interests include applied natural language processing, semantics, and language grounding. Jason received his Ph.D. from the University of Edinburgh in 2002, where his doctoral dissertation on Multimodal Combinatory Categorial Grammar was awarded the 2003 Beth Dissertation Prize from the European Association for Logic, Language and Information.

==========

Desmond Elliott. University of Copenhagen

Title: Beyond Text and Back Again

Abstract: A talk with two parts covering three modalities. In the first part, I will talk about NLP Beyond Text, where we integrate visual context into a speech recognition model and find that the recovery of different types of masked speech inputs is improved by fine-grained visual grounding against detected objects. In the second part, I will come Back Again, and talk about the benefits of textual supervision in cross-modal speech--vision retrieval models.

Bio: Desmond is an Assistant Professor at the University of Copenhagen. He received his PhD from the University of Edinburgh, and was a postdoctoral researcher at CWI Amsterdam, the University of Amsterdam, and the University of Edinburgh, funded by an Alain Bensoussan Career Development Fellowship and an Amazon Research Award. His research interests include multimodal and multilingual machine learning, and his work has appeared in papers at ACL, CoNLL, EMNLP and NAACL. He was involved in the creation of the Multi30K and How2 multilingual multimodal datasets and has developed a variety of models that learn from these types of data. He co-organised the How2 Challenge Workshop at ICML 2019, the Multimodal Machine Translation Shared Task from 2016--2018, and the 2018 Frederick Jelinek Memorial Workshop on Grounded Sequence-to-Sequence Learning.

==========

Raquel Fernandez. University of Amsterdam

Title: Grounding language in visual and conversational contexts

Abstract: Most language use is driven by specific communicative goals in interactive setups, where often visual perception goes hand in hand with language processing. I will discuss some recent projects by my research group related to modelling language generation in socially and visually grounded contexts, arguing that such models can help us to better understand the cognitive processes underpinning these abilities in humans and contribute to more human-like conversational agents.

Bio: Raquel is an Associate Professor at the Institute for Logic, Language & Computation (ILLC), University of Amsterdam, where she leads the Dialogue Modelling Group. Her work and interests revolve around language use, encompassing topics that range from computational semantics and pragmatics to the dynamics of dialogue interaction, visually-grounded language processing, and child language acquisition, among others.

After studying Cognitive Science and Language in Barcelona, she received her PhD in Computational Linguistics from King's College London. She has held research positions at the Linguistics Department of the University of Potsdam and at the Center for the Study of Language and Information (CSLI), Stanford University. Over her career, she has been awarded several personal fellowships by the Dutch Research Council (NWO), and she is a recent (2019) recipient of an ERC Consolidator Grant. Other distinctions include having been a member of the editorial boards of the Computational Linguistics and Dialogue & Discourse journals, co-president of the SemDial Workshop Series for ten years, and a member of the scientific advisory board of SIGdial on several occasions.

==========

Rada Mihalcea. University of Michigan

Title: Challenges (and Opportunities) in Multimodal Sensing of Human Behavior

Abstract: Much of what we do today is centered around humans — whether it is creating the next generation smartphones, understanding interactions with social media platforms, or developing new mobility strategies. A better understanding of people can not only answer fundamental questions about “us” as humans, but can also facilitate the development of enhanced, personalized technologies. In this talk, I will overview the main challenges (and opportunities) faced by research on multimodal sensing of human behavior, and illustrate these challenges with projects conducted in the Language and Information Technologies lab at Michigan.

Bio: Rada Mihalcea is a Professor of Computer Science and Engineering at the University of Michigan and the Director of the Michigan Artificial Intelligence Lab. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social sciences. She serves or has served on the editorial boards of the journals Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, Journal of Artificial Intelligence Research, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics. She was a program co-chair for EMNLP 2009 and ACL 2011, and a general chair for NAACL 2015 and *SEM 2019. She is an ACM Fellow, a AAAI Fellow, and currently serves as the President of the ACL. She is the recipient of a Sarah Goddard Power award (2019) for her contributions to diversity in science, and the recipient of a Presidential Early Career Award for Scientists and Engineers awarded by President Obama (2009).