HOME
News
The workshop will be held at the Marriott Ballroom ABC
12/07/2015 Extended abstracts for the oral session have been added to the CLVL Program
12/01/2015 Presentation instructions are now available
11/30/2015 Notification of acceptance
10/30/2015 Abstract submission deadline extended to Nov 20, 2015
10/30/2015 Program is now available!
The scope of this workshop lies at the boundary of Computer Vision and Natural Language Processing. In recent years, there has been increasing interest in the intersection between Computer Vision and NLP. Research has addressed several interesting tasks, including generating text descriptions from images and videos, language embedding of images, and predicting visual classifiers from unstructured text. More recent work further extends the scope of this area to combining videos and language, learning to solve non-visual tasks using visual cues, and question answering by visual verification of relation phrases. In this workshop, we aim to cover all these interesting aspects, which benefit from jointly modelling visual and semantic concepts, and to discuss the area's future and impact. We will also have a panel discussion focused on how to develop useful datasets and benchmarks that are suitable for the various tasks in this area.
To achieve this goal, the program of this workshop will include three to four invited talks by leading researchers in this area, covering its diverse aspects. There will also be a call for extended abstracts focused on this area.
Location
The workshop is co-located with ICCV 2015 in Santiago, Chile.
Sponsors
Facebook Event Page
https://www.facebook.com/events/1623936767859445/
Call for papers: Submitted extended abstracts will be considered for presentation. Accepted papers will be presented in the workshop poster session, and a portion of them will be presented orally. We solicit 2-page extended abstracts. Extended abstracts will not be included in the Proceedings of ICCV 2015 and will not be published in any other form. Topics of this workshop include:
learning to solve non-visual tasks using visual cues
question answering by visual verification
novel problems in vision and language
visual sense disambiguation
deep learning methods for vision and language
visual reasoning on language problems
language based visual abstraction
text as weak labels for image or video classification
image/video annotation and natural language description generation
text-to-scene generation
transfer learning for vision and language
jointly learning to parse and perceive (text+image, text+video)
multimodal clustering and word sense disambiguation
unstructured text search for visual content
visually grounded language acquisition and understanding
language-based image and video search
linguistic descriptions of spatial relations
auto-illustration
natural language grounding & learning by watching
learning knowledge from the web
language as a mechanism to structure and reason about visual perception
language as a learning bias to aid vision in both machines and humans
dialog as means of sharing knowledge about visual perception
stories as means of abstraction
understanding the relationship between language and vision in humans
Intended audience
The intended audience of this workshop is researchers working at the intersection of vision and language.
Workshop Program Chairs
Ahmed Elgammal Rutgers University
Leonid Sigal Disney Research Pittsburgh
Organizers
Mohamed Elhoseiny (Workshop Organizer, PhD Candidate at Rutgers University)
Ahmed Elgammal (Workshop Organizer and Program Chair, Rutgers University)
Leonid Sigal (Workshop Organizer and Program Chair, Disney Research Pittsburgh)