Schedule/Program

The times and order of events listed below are nearly final.


Workshop Schedule (19 July 2018)

  • 9:00 - 9:10 Opening Remarks
  • 9:10 - 9:50 Invited Talk - Stefan Riezler

Deep Sequence-to-Sequence Learning from Human Reinforcement

  • 9:50 - 10:30 Invited Talk - Sujith Ravi

Large-scale Neural Structured Learning for Low-Resource AI

  • 10:30 - 11:00 Coffee Break
  • 11:00 - 12:40 Oral Presentations
      • 11:00 - 11:25 Phrase-Based & Neural Unsupervised Machine Translation
      • 11:25 - 11:50 Character-level Supervision for Low-resource POS Tagging
      • 11:50 - 12:15 Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data
      • 12:15 - 12:40 Exploiting Subword Similarities in Low-Resource Document Classification
  • 12:40 - 14:00 Lunch Break
  • 14:00 - 15:30 Poster Session
  • 15:30 - 16:00 Coffee Break
  • 16:00 - 16:40 Invited Talk - Trevor Cohn

Translation on a shoestring

  • 16:40 - 17:40 Panel Discussion
  • 17:40 - 17:55 Closing Remarks


Please note that all accepted papers/abstracts will be presented in the poster session, including those also presented in the oral session.


Information for Oral and Poster Presentations

For oral presentations, please note that the allocated time for each paper is 25 minutes: 18 minutes for the talk and 7 minutes for setup and Q&A.

For posters, please note the following guidelines:

1. Posters for the workshop should be A0 portrait or smaller -- in particular, please keep them under 1 meter in width.

2. We also encourage presenters, when possible, to print on plain paper or light fabric -- some posters will be mounted on walls, and heavy laminated posters may be difficult to attach there.


Invited Talk 1: Deep Sequence-to-Sequence Learning from Human Reinforcement

Human reinforcement signals are often easier to obtain than supervision signals in the form of gold-standard structures, and are available in large amounts at no cost in commercial settings. We discuss the prerequisites human reinforcement signals must meet to serve as a basis for machine learning, and present first small-scale experiments on neural machine translation from human feedback that showcase the great potential for applications at larger scale.

Bio: Prof. Stefan Riezler was appointed full professor and head of the chair of Linguistic Informatics at Heidelberg University in 2010, after spending a decade in the world’s most renowned industry research labs (Xerox PARC, Google). He received his PhD in Computational Linguistics from the University of Tübingen in 1998, and then conducted post-doctoral work at Brown University in 1999. Prof. Riezler's research focus is on machine learning and statistics applied to natural language processing problems, especially for the application areas of natural-language based web search and statistical machine translation.


Invited Talk 2: Large-scale Neural Structured Learning for Low-Resource AI

Significant advances in machine learning have enabled us to build intelligent systems capable of perceiving and understanding the real world from text, speech and images. While great progress has been made in many AI fields, building scalable intelligent systems from “scratch” still remains a daunting challenge for many applications. Doing this at scale, especially for NLP applications that often involve noisy and high-dimensional text inputs, is even harder. Researchers have attempted to address this using a variety of techniques in recent years. In this talk, I will formalize some of the popular techniques and show how they apply to large-scale learning for low-resource scenarios. I will then introduce and describe a powerful graph-based computing paradigm that we use to overcome data/annotation sparsity by leveraging structure inherent in data via flexible representations and powerful graph algorithms combined with deep learning. Our machine learning framework operates efficiently at large scale, easily handling massive graphs containing billions of vertices and trillions of edges; it effectively combines graph learning with deep neural networks, including joint end-to-end training; and it is used to solve a variety of real-world prediction tasks, from conversational modeling to image recognition and multimodal learning.

Bio: Sujith Ravi is a Senior Staff Research Scientist and Manager at Google, where he leads the company’s large-scale graph-based machine learning platform and on-device machine learning efforts that power natural language understanding and image recognition for products used by millions of people every day in Search, Gmail, Photos, Android, and YouTube. This machine learning technology enables features such as Smart Reply, which automatically suggests replies to incoming e-mails or chat messages in Gmail, Inbox and Allo; search in Photos for anything from “hugs” to “dogs” with the latest image recognition system; conversational modeling and smart messaging directly from Android Wear smartwatches; and the Learn2Compress platform for ML Kit, which enables training custom on-device deep learning models. His research interests include large-scale inference, unsupervised and semi-supervised learning, on-device machine learning for IoT, conversational AI, computer vision, multimodal learning, and computational decipherment.

Dr. Ravi has authored more than 60 scientific publications and patents in top-tier machine learning and natural language processing venues, and his work won the ACM SIGKDD Best Research Paper Award in 2014. His work has been showcased in press articles from Wired, Forbes, Forrester, NYTimes, TechCrunch, VentureBeat, Engadget, and New Scientist, among others. He is the Co-Chair (AI & Deep Learning) for the 2019 National Academy of Engineering (NAE) German-American Frontiers of Engineering symposium. In 2017, he was selected for the NAE U.S. Frontiers of Engineering symposium. He organizes machine learning symposia and workshops and regularly serves as Area Chair and PC member of top-tier machine learning and natural language processing conferences such as NIPS, ICML, ACL, EMNLP, COLING, KDD, and WSDM.


Invited Talk 3: Translation on a shoestring

Machine translation is challenging with limited resources, notably for richly parameterised deep-learning sequence-to-sequence approaches, which dominate on large competition datasets. For most languages, the available text corpora are insufficient for model estimation. In this talk I will present several ways to address this shortcoming by developing more robust neural models: 1) extending models to support more complex structured inputs, such as trees and semantic graphs, and 2) better modelling the translation process by incorporating stochasticity into the generative process. Finally, 3) I will cover speech recognition and translation, which can better facilitate field linguists' efforts to document truly low-resource or endangered languages.

Bio: Dr. Trevor Cohn is an Associate Professor and ARC Future Fellow at the University of Melbourne, in the School of Computing and Information Systems. His research focuses on probabilistic and statistical machine learning for natural language processing, with applications in several areas including machine translation, parsing and grammar induction. Current projects include translating diverse and noisy text sources, deep learning of semantics in translation, rumour diffusion over social media, and algorithmic approaches for scaling to massive corpora. Dr. Cohn has more than 100 research publications, and his research has been recognised by several best paper awards, including best short paper at EMNLP in 2016. He will be jointly organising ACL 2018 in Melbourne. He received Bachelor's degrees in Software Engineering and Commerce, and a PhD in Engineering, from the University of Melbourne. He was previously based at the University of Sheffield, and before that worked as a Research Fellow at the University of Edinburgh.