Program

Location: 317A, COEX Convention Center, Seoul, South Korea. http://iccv2019.thecvf.com/

Audience Interaction: 

Mobile App: Live Q&A and the poll are also available through the Slido app (event code #5325):

https://apps.apple.com/us/app/slido-q-a-and-polling/id954596240

Schedule

Invited Speakers

Jin-Hwa Kim

T-Brain, SK Telecom 

Svetlana Lazebnik

University of Illinois at Urbana-Champaign

Yejin Choi

University of Washington & AI2

Gunhee Kim

Seoul National University

Devi Parikh

Georgia Tech & FAIR

Jiasen Lu

Georgia Tech

Spotlight and Poster Presentations

All spotlights are also presented as posters in the poster sessions 10:30-11:00 and 12:00-2:00

Session times: 9:30-10:00, 11:30-12:00, and 2:30-3:15

Spotlight Presentation 1

Poster numbers: 84-107 and 109

1. 

https://www.dropbox.com/s/0r9q56dcdxjv165/0001.pdf?dl=0

Are we asking the right questions in MovieQA?, Bhavan Jasani (Robotics Institute, Carnegie Mellon University)*; Rohit Girdhar (Carnegie Mellon University); Deva Ramanan (Carnegie Mellon University)

3. 

https://www.dropbox.com/s/btedaep5pgxtavm/VTC_paper_CLVL.pdf?dl=0

(Poster only) Video-Text Compliance: Activity Verification based on Natural Language Instructions, Mayoore Jaiswal (IBM)*; Frank Liu (IBM Research); Anupama Jagannathan (IBM); Anne Gattiker (IBM); Inseok Hwang (IBM); Jinho Lee (Yonsei University); Matt Tong (IBM); Sahil Dureja (IBM); Soham Shah (IBM); Peter Hofstee (IBM); Valerie Chen (Yale University); Suvadip Paul (Stanford University); Rogerio Feris (IBM Research AI, MIT-IBM Watson AI Lab)

 4. 

https://www.dropbox.com/s/btedaep5pgxtavm/VTC_paper_CLVL.pdf?dl=0

SUN-Spot: An RGB-D Dataset With Spatial Referring Expressions,  Cecilia Mauceri (University of Colorado Boulder)*; Christoffer Heckman (University of Colorado); Martha S Palmer (University of Colorado), supp 

https://www.dropbox.com/s/qchioicfph90lnt/0004-supp.pdf?dl=0

 5. 

https://www.dropbox.com/s/2npqat04crw3nba/camera_ready.pdf?dl=0

Evaluating Text-to-Image Matching using Binary Image Selection (BISON),  Hexiang Hu (USC)*; Ishan Misra (Facebook AI Research ); Laurens van der Maaten (Facebook), supp 

https://www.dropbox.com/s/xd91r0szu8n698n/supp.pdf?dl=0

 13. 

https://www.dropbox.com/s/vttv2r94mypoxp3/main.pdf?dl=0

Visual Storytelling via Predicting Anchor Word Embeddings in the Stories, Bowen Zhang (University of Southern California)*; Hexiang Hu (USC); Fei Sha (Google Research) 

 14. 

https://www.dropbox.com/s/5wscdx3wrz1mx0h/Prose_for_a_Painting__ICCV_%20%282%29.pdf?dl=0

Prose for a Painting,  Prerna Kashyap (Columbia University)*; Samrat H Phatale (Columbia University); Iddo Drori (Columbia University and Cornell)

 16. 

https://www.dropbox.com/s/vsvnxf3smr5172a/vSTS_clvl2017.camera.pdf?dl=0

Why Does a Visual Question Have Different Answers?,  Danna Gurari (University of Texas at Austin)*

 17. 

https://www.dropbox.com/s/tdqr9efrjdkeicz/iccv.pdf?dl=0

Analysis of diversity-accuracy tradeoff in image captioning, Ruotian Luo (Toyota Technological Institute at Chicago)*; Greg Shakhnarovich (TTI-Chicago)

 19. 

https://www.dropbox.com/s/56lea9vvm9ybyhs/nocaps%20%281%29.pdf?dl=0

nocaps: novel object captioning at scale, Harsh Agrawal (Georgia Institute of Technology)*; Karan Desai (University of Michigan); Yufei Wang (Macquarie University); Xinlei Chen (Facebook AI Research); Rishabh Jain (Georgia Tech); Mark Johnson (Macquarie University); Dhruv Batra (Georgia Tech & Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research); Stefan Lee (Oregon State University); Peter Anderson (Georgia Tech)

20. 

https://www.dropbox.com/s/dhskhm016yblgt6/Unpaired_Caption_Data.pdf?dl=0

Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach, Dong-Jin Kim (KAIST)*; Jinsoo Choi (KAIST); Tae-Hyun Oh (MIT CSAIL); In So Kweon (KAIST) 

21.

https://www.dropbox.com/s/vsvnxf3smr5172a/vSTS_clvl2017.camera.pdf?dl=0

Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering, Soravit Changpinyo (Google AI)*; Bo Pang (); Piyush Sharma (Google Research); Radu Soricut (Google)

 Spotlight Presentation 2

22. 

https://www.dropbox.com/s/yaopiqq863x66c8/MULE_cam.pdf?dl=0

MULE: Multimodal Universal Language Embedding,  Donghyun Kim (Boston University)*; Kuniaki Saito (Boston University); Kate Saenko (Boston University); Stan Sclaroff (Boston University); Bryan Plummer (Boston University)

 23.  

https://www.dropbox.com/s/c8izb928nwvkdw5/PID6085667.pdf?dl=0

Incorporating 3D Information into Visual Question Answering, Yue Qiu (National Institute of Advanced Industrial Science and Technology (AIST),University of Tsukuba)*; Yutaka Satoh (National Institute of Advanced Industrial Science and Technology (AIST)); Kazuma Asano (National Institute of Advanced Industrial Science and Technology (AIST); University of Tsukuba); Kenji Iwata (National Institute of Advanced Industrial Science and Technology (AIST)); Ryota Suzuki (National Institute of Advanced Industrial Science and Technology (AIST)); Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology (AIST))

 24. 

https://www.dropbox.com/s/e4kelkdhskojbl1/0024.pdf?dl=0

Multimodal Differential Network for Visual Question Generation, Badri Patro (IIT Kanpur)*; Sandeep Kumar (IIT Kanpur); Vinod Kumar Kurmi (IIT Kanpur); Vinay P Namboodiri (IIT Kanpur)

 25. 

https://www.dropbox.com/s/u8zqaff4wo4rjlm/0025.pdf?dl=0

Learning Semantic Sentence Embeddings using Pair-wise Discriminator, Badri Patro (IIT Kanpur)*; Vinod Kumar Kurmi (IIT Kanpur); Sandeep Kumar (IIT Kanpur); Vinay P Namboodiri (IIT Kanpur)

 26. 

https://www.dropbox.com/s/ams143zaqud38zz/seqcave_camera_ready.pdf?dl=0

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning, Jyoti Aneja (University of Illinois, Urbana-Champaign)*; Harsh Agrawal (Georgia Institute of Technology)

 27. 

https://www.dropbox.com/s/g2kerei332mhiqn/ICCV2019CLVL%20%281%29.pdf?dl=0

Reinforcing an Image Caption Generator using Off-line Human Feedback, Paul Hongsuck Seo (POSTECH)*; Piyush Sharma (Google Research); Tomer Levinboim (Google); Bohyung Han (Seoul National University); Radu Soricut (Google)

 28. 

https://www.dropbox.com/s/u334ylme6pbnyqv/028_Yang%20Liu%20-%200028.pdf?dl=0

Use What You Have: Video retrieval using representations from collaborative experts, Yang Liu (University of Oxford)*; Samuel Albanie (University of Oxford); Arsha Nagrani (Oxford University ); Andrew Zisserman (University of Oxford)

 29. 

https://www.dropbox.com/s/5fe88xkd4v3qk9e/STVQA_ICCV_Workshop.pdf?dl=0

ICDAR 2019 Competition on Scene Text Visual Question Answering, Ali Furkan Biten (Computer Vision Center); Rubèn Tito (Computer Vision Center); Andrés Mafla (Computer Vision Centre); Lluis Gomez (Universitat Autónoma de Barcelona)*; Marçal Rusiñol (Computer Vision Center, UAB); Minesh Mathew (CVIT, IIIT-Hyderabad); C.V. Jawahar (IIIT-Hyderabad); Ernest Valveny (Universitat Autónoma de Barcelona); Dimosthenis Karatzas (Computer Vision Centre)

 30. 

https://www.dropbox.com/s/dru7u9htgpnts3i/ICCV_CLVL19_30.pdf?dl=0

Recognizing and Characterizing Natural Language Descriptions of Visually Complex Images,  Ziyan Yang (University of Virginia)*; Yangfeng Ji (University of Virginia); Vicente Ordonez (University of Virginia) 

 31. 

https://www.dropbox.com/s/advksyg68hy4ld5/paper.pdf?dl=0

Adversarial Learning of Semantic Relevance in Text to Image Synthesis,  Miriam Cha (Harvard University)*; Youngjune Gwon (Samsung SDS); H.T. Kung (Harvard University) 

 32. 

https://arxiv.org/abs/1905.02925

(Poster only) ShapeGlot: Learning Language for Shape Differentiation, Panos Achlioptas, Judy Fan, Robert Hawkins, Noah Goodman, Leonidas Guibas (International Conference on Computer Vision, 2019, Seoul)

 

VATEX Challenge Presentations 

Multi-modal Information Fusion and Multi-stage Training Strategy for Video Captioning

Ziqi Zhang*, Yaya Shi*, Jiutong Wei*, Chunfeng Yuan, Bing Li, Weiming Hu

Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019

Shizhe Chen, Yida Zhao, Yuqing Song, Qin Jin, Qi Wu

Multi-View Features and Hybrid Reward Strategies for VATEX Video Captioning Challenge 2019

Xinxin Zhu*, Longteng Guo*, Peng Yao*, Jing Liu, Hanqing Lu