Big Data Meets Computer Vision:
First International Workshop on Large Scale Visual Recognition and Retrieval
December 7, 2012, Lake Tahoe, Nevada, USA
News
- Videos are up.
- Slides of the invited speakers are up.
- The location of the workshop is Harrah's Sand Harbor 1.
- 10/17/2012: Paper decisions have been sent out to the primary authors.
- 9/17/2012: The submission deadline is extended to 11:59pm PDT, 9/17/2012.
- 8:35pm PDT, 9/16/2012: The CMT submission site is experiencing problems. We will extend our deadline for at least 24 hours.
- We have added an FAQ on submission.
Schedule
7:30-7:35: opening remarks
7:35-7:55: invited talk: Charles R. Fay, Strategic Highway Research Program 2 (SHRP2) [slides]
7:55-8:15: contributed talk: Creating a Big Data Resource from the Faces of Wikipedia. Md. Kamrul Hasan (Ecole Polytechnique de Montreal), Christopher Pal (Ecole Polytechnique de Montreal)
8:50-9:50: poster session + coffee break
- A k-NN Approach for Scalable Image Annotation Using General Web Data. Mauricio Villegas (Universidad Politecnica de Valencia), Roberto Paredes (Universidad Politecnica de Valencia)
- Loss-based Learning of Binary Hash Functions. Mohammad Norouzi (University of Toronto), David Fleet (University of Toronto), Ruslan Salakhutdinov (University of Toronto)
- Randomly Multi-view Clustering for Hashing. Caiming Xiong (SUNY at Buffalo), Jason Corso (SUNY at Buffalo)
- Beyond Classification -- Large-scale Gaussian Process Inference and Uncertainty Prediction. Alexander Freytag (Friedrich Schiller University Jena), Erik Rodner (UC Berkeley, University of Jena), Paul Bodesheim (Computer Vision Group, University of Jena), Joachim Denzler (Computer Vision Group, University of Jena)
- Classifier-as-a-Service: Online Query of Cascades and Operating Points. Brandyn White (University of Maryland, College Park), Andrew Miller (University of Central Florida), Larry Davis (University of Maryland, College Park)
15:35-15:55: contributed talk: Picture Tags and World Knowledge. Lexing Xie (Australian National University)
16:30-16:50: contributed talk: Semantic Kernel Forests from Multiple Taxonomies. Sung Ju Hwang (University of Texas at Austin), Fei Sha (University of Southern California), Kristen Grauman (University of Texas at Austin)
16:50-17:50: poster session + coffee break
- Large-scale image classification with lifted coordinate descent. Zaid Harchaoui (INRIA), Matthijs Douze (INRIA), Mattis Paulin (INRIA), Miro Dudik (Microsoft Research), Jerome Malick (CNRS)
- Aggregating descriptors with local Gaussian metrics. Hideki Nakayama (The University of Tokyo)
- Learning from Incomplete Image Tags. Minmin Chen (Washington University), Kilian Weinberger, Alice Zheng
- Adaptive representations of scenes based on ICA mixture model. Wooyoung Lee (Carnegie Mellon University), Michael Lewicki (Case Western Reserve University)
- Visually-Grounded Bayesian Word Learning. Yangqing Jia (UC Berkeley), Joshua Abbott (UC Berkeley), Joseph Austerweil (UC Berkeley), Thomas Griffiths (UC Berkeley), Trevor Darrell (UC Berkeley)
17:50-18:10: contributed talk: Overcoming Dataset Bias: An Unsupervised Domain Adaptation Approach. Boqing Gong (University of Southern California), Fei Sha (University of Southern California), Kristen Grauman (University of Texas at Austin)
18:10-18:30: panel discussion: Samy Bengio, Alex Berg, Shih-Fu Chang, Andrew Ng, Florent Perronnin
Overview
The emergence of “big data” has brought about a paradigm shift throughout computer science, and computer vision is no exception. The explosion of images and videos on the Internet and the availability of large amounts of annotated data have created unprecedented opportunities and fundamental challenges in scaling up computer vision.
Over the past few years, machine learning on big data has become a thriving field, with a plethora of theories and tools developed. Meanwhile, large scale vision has attracted increasing attention in the computer vision community. This workshop aims to bring together researchers in large scale machine learning and large scale vision to foster cross-talk between the two fields. The goals are to encourage machine learning researchers to work on large scale vision problems, to inform computer vision researchers about new developments in large scale learning, and to identify unique challenges and opportunities.
This workshop will focus on two distinct yet closely related vision problems: recognition and retrieval. Both are inherently large scale: each must handle high-dimensional features (hundreds of thousands to millions of dimensions), a large variety of visual classes (tens of thousands to millions), and a large number of examples (millions to billions).
This workshop will consist of invited talks, panels, discussions, and paper submissions. The target audience of this workshop includes industry and academic researchers interested in machine learning, computer vision, multimedia, and related fields.
Call for Papers
We invite high quality submissions of extended abstracts on topics including, but not limited to:
- State of the field: What really defines large scale vision? How does it differ from traditional vision research? What are its unique challenges for large scale learning?
- Indexing algorithms and data structures: How do we efficiently find similar features/images/classes from a large collection, a key operation in both recognition and retrieval?
- Semi-supervised/unsupervised learning: Large scale data comes with different levels of supervision, ranging from fully labeled and quality controlled to completely unlabeled. How do we make use of such data?
- Metric learning: Retrieving visually similar images/objects requires learning a similarity metric. How do we learn a good metric from a large amount of data?
- Visual models and feature representations: What is a good feature representation? How do we model and represent images/videos to handle tens of thousands of fine-grained visual classes?
- Exploiting semantic structures: How do we exploit the rich semantic relations between visual categories to handle a large number of classes?
- Transfer learning: How do we handle new visual classes (objects/scenes/activities) after having learned a large number of them? How do we transfer knowledge using the semantic relations between classes?
- Optimization techniques: How do we perform learning with training data that do not fit into memory? How do we parallelize learning?
- Dataset issues: What is a good large scale dataset? How should we construct datasets? How do we avoid dataset bias?
- Systems and infrastructure: How do we design and develop libraries and tools to facilitate large scale vision research? What infrastructure do we need?
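To make the indexing question above concrete, here is a minimal sketch of one common approach, random-hyperplane locality-sensitive hashing (an illustrative choice, not a method prescribed by the workshop); the dataset size, dimensionality, and code length are arbitrary toy values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image descriptors": 1000 points in 128 dimensions.
X = rng.standard_normal((1000, 128))

# 16 random hyperplanes; each contributes one bit, so vectors with a small
# angle between them tend to receive the same 16-bit code.
planes = rng.standard_normal((16, 128))

def hash_code(v):
    # Sign of the projection onto each hyperplane normal, packed into an int.
    return sum(int(b) << i for i, b in enumerate(planes @ v > 0))

# Index: bucket point ids by their hash code.
buckets = {}
for idx, x in enumerate(X):
    buckets.setdefault(hash_code(x), []).append(idx)

# Query with a copy of point 42: only its bucket is scanned, not all 1000 points.
query = X[42].copy()
candidates = buckets.get(hash_code(query), [])
best = max(candidates,
           key=lambda i: X[i] @ query
           / (np.linalg.norm(X[i]) * np.linalg.norm(query)))
# best == 42: the duplicate is retrieved from a small candidate set.
```

In practice, near-duplicates can land in adjacent buckets, so real systems use multiple hash tables (or multi-probe variants) to trade memory for recall.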
Submissions must be in NIPS 2012 format, with a maximum of 4 pages (excluding references). The submission deadline is 11:59pm PDT, September 16th, 2012. Submissions do not have to be anonymous. Accepted papers will be presented as oral talks or posters during the workshop. The submission link is https://cmt.research.microsoft.com/BigVision2012/.
FAQ
- Q: What do you mean by "extended abstract"? Is it different from the 4 page submission?
- A: The extended abstract refers to the 4 page submission.
- Q: I have a recent paper published elsewhere. Can I submit a 4 page version to the workshop?
- A: We encourage submissions of new work. Authors are also welcome to submit papers that have been recently published or accepted at another venue, as long as this information is disclosed at the time of submission.
- Q: Will papers accepted to the workshop be published?
- A: Authors of accepted papers will have two options for publication: (1) publish the full 4 page version at the workshop website, or (2) provide only a one page abstract and a URL to a current version of the paper.
Important Dates
Submission deadline: September 16th, 2012.
Notification of acceptance: October 17th, 2012 (originally October 7th, 2012).
Workshop date: December 7th, 2012.
Organizers
Alex Berg, Kevin Tang, Jia Li, Juan Carlos Niebles, Olga Russakovsky, Florent Perronnin, Ming Yang, Ning Zhou, Xiaoyu Wang, Shenghuo Zhu.