Call for Papers

Second Workshop on Subword and Character LEvel Models in NLP (SCLeM), to be held at NAACL 2018 in New Orleans, Louisiana, in June.

Dates

Deadline for paper submission: ~~March 2, 2018~~ (New Deadline: March 15, 2018)
Notification of acceptance: April 10, 2018
Camera ready submission due: April 16, 2018
Workshop: June 6, 2018

Invited Speakers

Jacob Eisenstein, Georgia Tech
Wang Ling, DeepMind
Graham Neubig, CMU
Barbara Plank, University of Groningen
Brian Roark, Google

Overview

Traditional NLP starts with a hand-engineered layer of representation, the level of tokens or words. A tokenization component first breaks up the text into units using manually designed rules. Tokens are then processed by components such as word segmentation, morphological analysis and multiword recognition. The heterogeneity of these components makes it hard to create integrated models of both structure within tokens (e.g., morphology) and structure across multiple tokens (e.g., multi-word expressions). This approach can perform poorly (i) for morphologically rich languages, (ii) for noisy text, (iii) for languages in which the recognition of words is difficult, and (iv) for adaptation to new domains. In addition, (v) it can impede the optimization of preprocessing in end-to-end learning.

The workshop provides a forum for discussing recent advances as well as future directions on sub-word and character-level natural language processing and representation learning that address these problems.

Topics of Interest

tokenization-free models
character-level machine translation
character-ngram information retrieval
transfer learning for character-level models
models of within-token and cross-token structure
natural language generation (e.g., of words not seen in training)
out-of-vocabulary words
morphology and segmentation
relationship between morphology and character-level models
stemming and lemmatization
inflection generation
orthographic productivity
form-meaning representations
true end-to-end learning
spelling correction
efficient and scalable character-level models

Submission of Long and Short Papers and Extended Abstracts

Please submit your paper using START:

https://www.softconf.com/naacl2018/SCLeM18

Submissions must be in PDF format, anonymized for review, written in English, and follow the NAACL 2018 formatting requirements (available at http://naacl2018.org/call_for_paper.html). We strongly advise you to use the LaTeX template files.

Long paper submissions consist of up to eight pages of content. Short paper submissions consist of up to four pages of content. There is no limit on the number of pages for references. There is no extra space for appendices. Accepted papers will be given one additional page for content.

Authors can also submit extended abstracts of up to eight pages of content. Add "(EXTENDED ABSTRACT)" to the title of an extended abstract submission. Extended abstracts will be presented as talks or posters if selected by the program committee, but will not be included in the proceedings. Thus, your work will retain the status of being unpublished, and later submission to another venue (e.g., a journal) is not precluded.

Organizing Committee

Manaal Faruqui, Google
Hinrich Schütze, LMU Munich
Isabel Trancoso, INESC-ID/IST
Yulia Tsvetkov, CMU
Yadollah Yaghoobzadeh, MSR Montreal

Sponsor

Microsoft Research is sponsoring the best paper awards.
