Program
Audience Interaction:
Poll ("What to ask during the panel discussion?"): https://app.sli.do/event/jwf61vux/live/polls
Mobile App: The Slido app (enter event code #5325) gives access to both the live Q&A and the poll:
https://apps.apple.com/us/app/slido-q-a-and-polling/id954596240
Schedule
Invited Speakers
T-Brain, SK Telecom
University of Illinois at Urbana-Champaign
University of Washington & AI2
Seoul National University
Georgia Tech & FAIR
Georgia Tech
Spotlight and Poster Presentations
All spotlights are also presented as posters in the poster sessions (10:30-11:00 and 12:00-2:00); the posters are numbered 84-107 and 109.
Spotlight Presentation 1 (9:30-10:00)
1.
Are we asking the right questions in MovieQA?, Bhavan Jasani (Robotics Institute, Carnegie Mellon University)*; Rohit Girdhar (Carnegie Mellon University); Deva Ramanan (Carnegie Mellon University)
3.
(Poster only) Video-Text Compliance: Activity Verification based on Natural Language Instructions, Mayoore Jaiswal (IBM)*; Frank Liu (IBM Research); Anupama Jagannathan (IBM); Anne Gattiker (IBM); Inseok Hwang (IBM); Jinho Lee (Yonsei University); Matt Tong (IBM); Sahil Dureja (IBM); Soham Shah (IBM); Peter Hofstee (IBM); Valerie Chen (Yale University); Suvadip Paul (Stanford University); Rogerio Feris (IBM Research AI, MIT-IBM Watson AI Lab)
4.
SUN-Spot: An RGB-D Dataset With Spatial Referring Expressions, Cecilia Mauceri (University of Colorado Boulder)*; Christoffer Heckman (University of Colorado); Martha S Palmer (University of Colorado), supp
5.
Evaluating Text-to-Image Matching using Binary Image Selection (BISON), Hexiang Hu (USC)*; Ishan Misra (Facebook AI Research); Laurens van der Maaten (Facebook), supp
13.
Visual Storytelling via Predicting Anchor Word Embeddings in the Stories, Bowen Zhang (University of Southern California)*; Hexiang Hu (USC); Fei Sha (Google Research)
14.
Prose for a Painting, Prerna Kashyap (Columbia University)*; Samrat H Phatale (Columbia University); Iddo Drori (Columbia University and Cornell)
16.
Why Does a Visual Question Have Different Answers?, Danna Gurari (University of Texas at Austin)*
17.
Analysis of diversity-accuracy tradeoff in image captioning, Ruotian Luo (Toyota Technological Institute at Chicago)*; Greg Shakhnarovich (TTI-Chicago)
19.
nocaps: novel object captioning at scale, Harsh Agrawal (Georgia Institute of Technology)*; Karan Desai (University of Michigan); Yufei Wang (Macquarie University); Xinlei Chen (Facebook AI Research); Rishabh Jain (Georgia Tech); Mark Johnson (Macquarie University); Dhruv Batra (Georgia Tech & Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research); Stefan Lee (Oregon State University); Peter Anderson (Georgia Tech)
20.
Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach, Dong-Jin Kim (KAIST)*; Jinsoo Choi (KAIST); Tae-Hyun Oh (MIT CSAIL); In So Kweon (KAIST)
21.
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering, Soravit Changpinyo (Google AI)*; Bo Pang; Piyush Sharma (Google Research); Radu Soricut (Google)
Spotlight Presentation 2 (11:30-12:00)
22.
MULE: Multimodal Universal Language Embedding, Donghyun Kim (Boston University)*; Kuniaki Saito (Boston University); Kate Saenko (Boston University); Stan Sclaroff (Boston University); Bryan Plummer (Boston University)
23.
Incorporating 3D Information into Visual Question Answering, Yue Qiu (National Institute of Advanced Industrial Science and Technology (AIST), University of Tsukuba)*; Yutaka Satoh (National Institute of Advanced Industrial Science and Technology (AIST)); Kazuma Asano (National Institute of Advanced Industrial Science and Technology (AIST); University of Tsukuba); Kenji Iwata (National Institute of Advanced Industrial Science and Technology (AIST)); Ryota Suzuki (National Institute of Advanced Industrial Science and Technology (AIST)); Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology (AIST))
24.
Multimodal Differential Network for Visual Question Generation, Badri Patro (IIT Kanpur)*; Sandeep Kumar (IIT Kanpur); Vinod Kumar Kurmi (IIT Kanpur); Vinay P Namboodiri (IIT Kanpur)
25.
Learning Semantic Sentence Embeddings using Pair-wise Discriminator, Badri Patro (IIT Kanpur)*; Vinod Kumar Kurmi (IIT Kanpur); Sandeep Kumar (IIT Kanpur); Vinay P Namboodiri (IIT Kanpur)
26.
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning, Jyoti Aneja (University of Illinois, Urbana-Champaign)*; Harsh Agrawal (Georgia Institute of Technology)
27.
Reinforcing an Image Caption Generator using Off-line Human Feedback, Paul Hongsuck Seo (POSTECH)*; Piyush Sharma (Google Research); Tomer Levinboim (Google); Bohyung Han (Seoul National University); Radu Soricut (Google)
28.
Use What You Have: Video retrieval using representations from collaborative experts, Yang Liu (University of Oxford)*; Samuel Albanie (University of Oxford); Arsha Nagrani (Oxford University); Andrew Zisserman (University of Oxford)
29.
ICDAR 2019 Competition on Scene Text Visual Question Answering, Ali Furkan Biten (Computer Vision Center); Rubèn Tito (Computer Vision Center); Andrés Mafla (Computer Vision Center); Lluis Gomez (Universitat Autònoma de Barcelona)*; Marçal Rusiñol (Computer Vision Center, UAB); Minesh Mathew (CVIT, IIIT-Hyderabad); C.V. Jawahar (IIIT-Hyderabad); Ernest Valveny (Universitat Autònoma de Barcelona); Dimosthenis Karatzas (Computer Vision Center)
30.
Recognizing and Characterizing Natural Language Descriptions of Visually Complex Images, Ziyan Yang (University of Virginia)*; Yangfeng Ji (University of Virginia); Vicente Ordonez (University of Virginia)
31.
Adversarial Learning of Semantic Relevance in Text to Image Synthesis, Miriam Cha (Harvard University)*; Youngjune Gwon (Samsung SDS); H.T. Kung (Harvard University)
32.
(Poster only) ShapeGlot: Learning Language for Shape Differentiation, Panos Achlioptas, Judy Fan, Robert Hawkins, Noah Goodman, Leonidas Guibas (ICCV 2019, Seoul)
VATEX Challenge Presentations (2:30-3:15)
Multi-modal Information Fusion and Multi-stage Training Strategy for Video Captioning
Ziqi Zhang*, Yaya Shi*, Jiutong Wei*, Chunfeng Yuan, Bing Li, Weiming Hu
Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019
Shizhe Chen, Yida Zhao, Yuqing Song, Qin Jin, Qi Wu
Multi-View Features and Hybrid Reward Strategies for VATEX Video Captioning Challenge 2019
Xinxin Zhu*, Longteng Guo*, Peng Yao*, Jing Liu, Hanqing Lu