Main EMNLP 2015 website is available here.
The VL'15 workshop is organized and supported by the European Network on Integrating Vision and Language (iV&L Net).
- Marco Baroni (CLIC, University of Trento)
- Krystian Mikolajczyk (University of Surrey)
Call for Papers
Computational vision-language integration is the process of associating visual information with corresponding linguistic information. Fragments of natural language, in the form of tags, captions, subtitles, surrounding text or audio, can aid the interpretation of image and video data by adding context or disambiguating visual appearance. In addition, labeled images are essential for training object or activity classifiers. Conversely, by providing the contextual and world knowledge that is often implied but absent from textual input, visual data can help resolve challenges in language processing such as word sense disambiguation, language understanding, machine translation and speech recognition. Moreover, sign language and gestures are languages that require visual interpretation. Since studying language and vision together can also provide new insight into cognition and universal representations of knowledge and meaning, researchers are increasingly turning towards models that ground language in action and perception. There is growing interest in the NLP, computer vision and cognitive science communities in models that are capable of learning from and exploiting multi-modal data, that is, of building semantic representations from both linguistic and visual or perceptual input.
The purpose of the VL'15 workshop is to bring together researchers from the natural language processing, computer vision, human language technologies, computational linguistics, machine learning, representation learning, reasoning, cognitive science and application communities. The workshop will serve as a strong interdisciplinary forum, sparking cross-fertilizing discussion of how to combine and integrate established techniques from different (but related) fields into new unified modeling approaches, as well as how to approach multi-modal data processing for NLP and vision from entirely new angles. This initiative to integrate vision and text will naturally yield a better understanding of the nature and usability of the vast multi-modal data available online and in other multi-modal information sources and repositories.
Topics of interest include, but are not limited to (in alphabetical order):
We solicit full papers describing original research combining and integrating language and vision. Full papers should be 6-9 pages
in length plus any number of additional pages containing references only. In addition to the long papers to be presented at the
workshop, we also invite 2-page abstracts for posters to be presented during the VL'15 poster session. More information on the
submission format and guidelines is available here.
All deadlines are 23:59, UTC-10.