Epilogue: The papers can now be accessed in full at the ACL website, and a selection of presentations is available for download too, linked below.
Programme, Sunday 6th June, 2010, Los Angeles
Invited Talk: Prof. Tom Mitchell, Carnegie Mellon University
Full Presentation: Learning Semantic Features for fMRI Data from Definitional Text (Pereira, Botvinick & Detre) [Presentation]
Full Presentation: Concept Classification with Bayesian Multi-Task Learning (van Gerven & Simanova) [Presentation]
Full Presentation: WordNet Based Features for Predicting Brain Activity associated with Meanings of Nouns (Jelodar, Alizadeh & Khadivi)
Short Presentation: Network Analysis of Korean Word Associations (Jung, Li & Akama)
Full Presentation: Detecting Semantic Category in Simultaneous EEG/MEG Recordings (Murphy & Poesio) [Presentation]
Short Presentation: Hemispheric processing of Chinese polysemy in the disyllabic verb/noun compounds: an event-related potential study (Huang & Lee) [Presentation]
Short Presentation: An Investigation on Polysemy and Lexical Organization of Verbs (Germann, Villavicencio & Siqueira)
Mini-Tutorial: Crash Course in Computational Neuroscience of Language (Murphy) [Presentation]
Mini-Tutorial: Crash Course in Computational Neuroscience of Language (Murphy, cntd.)
Full Presentation: Acquiring Human-like Feature-Based Conceptual Representations from Corpora (Kelly, Devereux & Korhonen) [Presentation]
Full Presentation: Using fMRI Activation to Conceptual Stimuli to Evaluate Methods for Extracting Conceptual Representations from Corpora (Devereux, Kelly & Korhonen) [Presentation]
for authors can be found here
- Full paper oral presentations are 25 minutes in duration, plus 5 minutes for questions.
- Short paper oral presentations are 15 minutes in duration, plus 5 minutes for questions.
Computational neurolinguistics is an emerging research area which integrates
recent advances in computational linguistics and cognitive neuroscience, with
the objective of developing cognitively plausible models of language and
gaining a better understanding of the human language system. It builds on
research in decoding cognitive states from recordings of neural activity, and
computational models of lexical representations and sentence processing.
Published work in this area includes the discovery of semantic features in
neural activity (Mitchell et al., 2008), using brain signals for the relative
evaluation of corpus semantic models (Murphy et al., 2009), and recognizing the
semantics of adjective-noun meaning composition (Chang et al., 2009).
On-going research focuses on a number of topics such as brain-computer interfaces to
provide dictation systems for paraplegic patients, and algorithms to perform
tagging and shallow parsing of neural activity recorded during sentence
comprehension. Both computational linguistics and neuroscience stand to gain from these techniques. In
computational linguistics, the cognitive plausibility of language models has primarily
been evaluated against collections of subjective intuitions (e.g. semantic feature norms,
grammaticality judgments, corpus annotations, dictionaries). Evaluation of the large
body of Computational Linguistics work based on data driven distributional approaches
has also relied on hand-crafted resources such as WordNet or data sets manually
tagged with a predefined list of categories. Comparison with neural data may provide a
more objective yardstick for both models and resources. And in brain imaging,
language-related research has often been limited to relatively coarse analyses (e.g.
high level features such as animacy or part-of-speech) but now computational
neurolinguistic methods have leveraged the richness of corpus-based descriptions to
extract finer-grained representations for single lexemes.
Advances in computational neurolinguistics require close collaboration between
computational linguists and neuroscientists. To this end, an interdisciplinary workshop
can play a key role in advancing existing and initiating new research. We hope that it will
attract an interdisciplinary target audience consisting of computational linguists,
machine learning researchers, computational neuroscientists and cognitive scientists.
Topics of Interest
- Computational Linguistic Focus
- Word-level analyses (e.g. corpus semantic models, lexica, lexical relations and ontologies, parts-of-speech, word senses, morphology)
- Phrase-level analyses (e.g. word compounds, meaning composition in multi-word expressions)
- Machine Learning Focus
- Decoding of cognitive states from neural activity
- Feature selection and data mining techniques for decoding linguistic information
- Neural Science Focus
- Brain imaging techniques: fMRI, EEG, MEG, NIRS, including cross-modality analysis (e.g. combining fMRI and EEG)
- Localizing Regions of Interest (e.g. identify the roles / functions of brain regions)
- Cognitive Science Focus
- Comparisons with behavioral (e.g. priming experiments, eye-tracking, self-paced reading) and elicited data (e.g. semantic feature norms)
- Biologically plausible connectionist approaches
Submissions based on any data-sets or tasks are welcomed, and originality of
approach is encouraged. However, to assist researchers who are new to this
topic, we are providing the data used in Mitchell et al. (2008) and Murphy et
al. (2009), as well as a number of sample shared tasks. Submissions that follow
the tasks in whole or in part, or that simply use them as an evaluation
baseline for the authors' own work, are also welcome. Performance will not be independently
validated by the organizers, and will only be one of the criteria used to
select among submissions.
The CMU fMRI data-set of 60 concrete concepts, in 12 categories, collected while nine English speakers were presented with 60 line drawings of objects with text labels and were instructed to think of the same properties of the stimulus object consistently during each presentation. For each concept there are 6 instances of ~20k neural activity features (brain blood oxygenation levels).
The Trento EEG data-set for 60 concepts, in 2 categories (work tools and land mammals), collected while seven Italian speakers were silently naming photographic images that represent these concepts. For each concept there are 6 instances of ~15k neural activity features (spectral power in voltage signals).
Sample Shared Tasks
As noted above, submissions on any task are welcomed, and these tasks are primarily intended to provide a possible starting point for researchers who are new to the topic.
Concept-pair neural discrimination task: For two concepts randomly left out of training, teach a classifier to match recorded neural data to the correct lexeme. This may be achieved by taking advantage of corpus-based models of word meaning, as in published research, or otherwise. This task is based on the evaluation method used with fMRI data in Mitchell et al. (2008), and replicated with EEG data in Murphy et al. (2009).
Corpus semantic model evaluation task: Teach a classifier to predict the neural activity observed for single concepts, based on each of several corpus semantic models. The average similarity between observed activity and predicted activity over all concepts can be taken as a metric of corpus model fidelity.
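For researchers new to the topic, the scoring step of the two sample tasks above might be sketched as follows. This is only an illustration under stated assumptions, not the organizers' evaluation code: cosine similarity is assumed as the similarity measure (the task descriptions do not prescribe one), and `predict_pair` is a hypothetical caller-supplied function that trains a model with two concepts held out and returns predicted activity vectors for them. All function names are made up for this example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length activity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def pair_match_correct(pred_a, pred_b, obs_a, obs_b, sim=cosine):
    """True if the correct pairing of predictions to observations
    scores strictly higher than the swapped pairing."""
    correct = sim(pred_a, obs_a) + sim(pred_b, obs_b)
    swapped = sim(pred_a, obs_b) + sim(pred_b, obs_a)
    return correct > swapped


def leave_two_out_accuracy(observed, predict_pair, sim=cosine):
    """Concept-pair discrimination: fraction of concept pairs matched correctly.

    observed     -- dict mapping concept name to observed activity vector
    predict_pair -- hypothetical callable (a, b) -> (pred_a, pred_b); it is
                    assumed to train without concepts a and b
    """
    outcomes = []
    for a, b in combinations(sorted(observed), 2):
        pred_a, pred_b = predict_pair(a, b)
        outcomes.append(
            pair_match_correct(pred_a, pred_b, observed[a], observed[b], sim)
        )
    return sum(outcomes) / len(outcomes)


def mean_similarity(predicted, observed, sim=cosine):
    """Corpus model evaluation: average similarity between predicted and
    observed activity over all concepts."""
    return sum(sim(predicted[c], observed[c]) for c in observed) / len(observed)
```

With 60 concepts the pair loop visits 1,770 held-out pairs, so in practice `predict_pair` dominates the running time. Chance performance on the pair task is 0.5, which gives a natural baseline when comparing corpus semantic models.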
Authors are invited to submit full papers on original, unpublished work in the topic area of this workshop via the NAACL submission site. Submissions should be formatted using the NAACL 2010 stylefiles, with blind review and not exceeding 8 pages plus an extra page for references. The stylefiles are available at http://naaclhlt2010.isi.edu/authors.html.
The PDF files will be submitted electronically through the NAACL submission system; the link will be available later.
Each submission will be reviewed by at least two members of the programme committee. Accepted papers will be published in the workshop proceedings. Dual submissions to the main NAACL 2010 conference and this workshop are allowed; if you submit to the main session, indicate this when you submit to the workshop. If your paper is accepted for the main session, you should withdraw your paper from the workshop upon notification by the main session.
- March 10, 2010: Deadline for submission of workshop papers
- March 30, 2010: Notification of acceptance
- April 12, 2010: Camera-ready papers due
- June 6, 2010: Workshop date
- Brian Murphy, Centre for Mind/Brain Studies, University of Trento, Italy
- Kai-min Kevin Chang, Language Technologies Institute, Carnegie Mellon University, USA
- Anna Korhonen, Computer Laboratory, University of Cambridge, UK
Tom Mitchell, Carnegie Mellon University, USA
- Afra Alishahi, Saarland University, Germany
- Ben Amsel, University of Toronto, Canada
- Stefano Anzellotti, Harvard University, USA
- Colin Bannard, University of Texas Austin, USA
- Marco Baroni, University of Trento, Italy
- Gemma Boleda, Universitat Politècnica de Catalunya, Spain
- Ina Bornkessel, Max Planck Leipzig, Germany
- Augusto Buchweitz, Carnegie Mellon University, USA
- George Cree, University of Toronto, Canada
- Barry Devereux, University of Cambridge, UK
- Katrin Erk, University of Texas Austin, USA
- Stefan Evert, University of Osnabrück, Germany
- Adele Goldberg, Princeton University, USA
- Chu-Ren Huang, Hong Kong Polytechnic University, Hong Kong
- Aravind Joshi, University of Pennsylvania, USA
- Marcel Just, Carnegie Mellon University, USA
- Frank Keller, University of Edinburgh, UK
- Charles Kemp, Carnegie Mellon University, USA
- Mirella Lapata, University of Edinburgh, UK
- Chia-Ying Lee, Academia Sinica, Taiwan
- Roger Levy, University of California San Diego, USA
- Angelika Lingnau, University of Trento, Italy
- Brad Mahon, University of Rochester, USA
- Robert Mason, Carnegie Mellon University, USA
- Diana McCarthy, Lexical Computing Ltd, UK
- Ken McRae, University of Western Ontario, Canada
- Tom Mitchell, Carnegie Mellon University, USA
- Fermin Moscoso del Prado Martin, University of Provence, France
- Sebastian Padò, University of Stuttgart, Germany
- Francisco Pereira, Princeton University, USA
- Massimo Poesio, University of Trento, Italy
- Thierry Poibeau, CNRS and Ecole Normale Supérieure, France
- Dean Pomerleau, Intel Labs Pittsburgh, USA
- Ari Rappoport, Hebrew University of Jerusalem, Israel
- Brian Roark, Oregon Health & Science University, USA
- Kenji Sagae, University of Southern California, USA
- Hinrich Schütze, Stuttgart University, Germany
- Sabine Schulte im Walde, University of Stuttgart, Germany
- Svetlana Shinkareva, University of South Carolina, USA
- Nathaniel Smith, University of San Diego, USA
- Aline Villavicencio, Federal University of Rio Grande do Sul, Brazil
- David Vinson, University College London, UK
- Yang ChinLung, City University of Hong Kong, China