ACL 2010 Workshop on Domain Adaptation for Natural Language Processing (DANLP)
July 15, 2010
Uppsala, Sweden

  • July 19, 2010: Thanks to everyone for making DANLP such a success! Special thanks to our keynote speaker, the panelists, the speakers and the workshop participants!
  • July 5, 2010: Abstract of invited keynote by John Blitzer is available now (note changed title)
  • June 11, 2010: Panelists: John Blitzer, Walter Daelemans, Hal Daumé III, Jing Jiang, Khalil Sima'an. Check out our Workshop Program.
  • June 11, 2010: Chairs added to the program.
  • June 10, 2010: Title of invited keynote by John Blitzer: Unsupervised Domain Adaptation: From Practice to Theory.
  • June 9, 2010: Information for presenters - The presentation format will be a 22-minute talk followed by 3 minutes of Q&A. Five minutes are allocated between talks for changing speakers and as a buffer. Because the schedule is tight, we strongly recommend that each speaker take no more than 22 minutes for the oral presentation. We advise a maximum of 22 slides.
  • June 8, 2010: Preliminary Program is online.
  • May 15, 2010: Final papers are limited to 8 pages (including references) and should be formatted using the ACL 2010 style files: http://acl2010.org/authors.html.
  • May 15, 2010: Review form added.
  • May 12, 2010: Accepted papers added
  • May 11, 2010: Notifications sent. Camera-ready paper length: 8 pages (including references).
  • April 11, 2010: Submission closed
  • April 4, 2010: Deadline extended to Sunday April 11, 2010 - 23:59 CET
  • March 31, 2010: Final Call for papers published
  • March 9, 2010: Second Call for papers published
  • January 17, 2010: Call for papers published
  • January 12, 2010: The workshop takes place on July 15, 2010
  • December 4, 2009: Invited speaker added
  • November 26, 2009: Workshop website goes online
Workshop Goals

Most modern Natural Language Processing (NLP) systems are subject to the well-known problem of lack of portability to new domains/genres of language: there is a substantial drop in their performance when tested on data from a new domain, i.e., when their test data is drawn from a related but different distribution than their training data. This problem is inherent in the assumption of independent and identically distributed (i.i.d.) variables for machine learning systems, but has started to get attention only in recent years. The need for domain adaptation arises in almost all NLP tasks: part-of-speech tagging, semantic role labeling, statistical parsing and statistical machine translation, to name but a few.

The goal of this workshop is to provide a meeting point for research that approaches the problem of adaptation from the varied perspectives of machine learning and a variety of NLP tasks such as parsing, machine translation, word sense disambiguation, etc. We believe there is much to gain by treating domain adaptation as a general learning strategy that utilizes prior knowledge of a specific or a general domain in learning about a new domain; here the notion of a “domain” could be as varied as child language versus adult language, or the source-side re-ordering of words to target-side word order in a statistical machine translation system.

Sharing insights, methodologies and successes across tasks and perspectives will thus contribute towards a better understanding of this problem. For instance, self-training the Charniak parser alone was not effective for adaptation (it has been common wisdom that self-training is generally not effective), but self-training with a re-ranker was surprisingly highly effective (McClosky et al., 2006). Is this an insight into adaptation that can be used elsewhere? We believe that the key to future success will be to exploit large collections of unlabeled data in addition to labeled data, not only because unlabeled data is easier to obtain, but also because existing labeled resources are often not even close to the envisioned target application domain. Directly related is the question of how to measure closeness (or differences) among domains.

We therefore especially encourage submissions on semi-supervised approaches to domain adaptation with a deep analysis of models, data and results, although we do not exclude papers on supervised adaptation.

Invited speaker

   John Blitzer, University of California at Berkeley, USA: Unsupervised Domain Adaptation: From Practice to Theory.
Important Dates
  April 11, 2010 (originally April 5, 2010): Submission deadline
  May 11, 2010 (originally May 6, 2010): Notification of acceptance
  May 21, 2010 (originally May 16, 2010): Camera-ready papers due (23:59 CET)
  July 15, 2010: Workshop


Organizers

    Hal Daumé III, University of Utah, USA
    Tejaswini Deoskar, University of Amsterdam, The Netherlands
    David McClosky, Stanford University, USA
    Barbara Plank, University of Groningen, The Netherlands
    Jörg Tiedemann, Uppsala University, Sweden


   This workshop is kindly supported by the Stevin project PaCo-MT (Parse and Corpus-based Machine Translation).
