The 1st Workshop on Vision Datasets Understanding
Overview
Data is the fuel of computer vision, on which state-of-the-art systems are built. A robust object detection system needs not only a strong model architecture and learning algorithm but also a comprehensive, large-scale training set. Despite the pivotal significance of datasets, existing research in computer vision is usually algorithm-centric: given fixed training and test data, it is primarily the algorithms or models that are improved. As such, while significant progress has been made in understanding and improving algorithms, the community has devoted much less effort to dataset-level analysis. For example, compared with the abundance of algorithm-centric work in domain adaptation, quantitative understanding of the domain gap itself is much more limited. As a result, there are currently few investigations into representations of datasets, whereas an abundance of literature concerns ways to represent images or videos, the essential elements of datasets.
Research centered on datasets can bring much benefit. For example, if we can quantify the distribution difference between datasets in a more principled way, such as through end-to-end training, we will better understand how datasets differ from each other and thus be able to design better domain adaptation algorithms. If we can learn to predict the level of labeling noise in a training set, we will be better positioned to design noise-resistant learning schemes. Moreover, by quantifying the quality of training datasets, it eventually becomes possible to improve training data quality through data generation approaches.
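As a concrete illustration of the first point, one plausible (though by no means the only) way to quantify the distribution difference between two datasets is the Fréchet distance between Gaussians fitted to their feature embeddings, in the spirit of the Fréchet Inception Distance. The minimal sketch below assumes features have already been extracted with a pretrained backbone; the function name frechet_dataset_distance and the random stand-in features are illustrative, not a method prescribed by this workshop.

# Minimal sketch: compare two datasets by the Frechet distance between
# Gaussians fitted to their feature embeddings (the statistic behind FID).
# Assumes features were already extracted, e.g. with a pretrained backbone.
import numpy as np
from scipy.linalg import sqrtm

def frechet_dataset_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_images, feature_dim).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Matrix square root of the covariance product; numerical error can
    # introduce a tiny imaginary component, which we discard.
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

# Toy usage with random stand-in features (replace with real embeddings):
rng = np.random.default_rng(0)
feats_source = rng.normal(0.0, 1.0, size=(500, 64))
feats_target = rng.normal(0.5, 1.2, size=(500, 64))
print(frechet_dataset_distance(feats_source, feats_target))

The larger the value, the further apart the two feature distributions are; a value near zero would indicate closely matched datasets.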
This workshop aims to bring together research and discussion focused on analysing vision datasets, as opposed to the commonly seen algorithm-centric counterparts. Specifically, the following topics are of interest:
Properties and attributes of vision datasets
Application of dataset-level analysis
Representations of and similarities between vision datasets
Improving vision dataset quality through generation and simulation
In summary, the questions related to this workshop include but are not limited to:
Can vision datasets be analysed on a large scale?
How to holistically understand the visual semantics contained in a dataset?
How to define vision-related properties and problems on the dataset level?
How can we improve algorithm design by better understanding vision datasets?
Can we predict the performance of an existing model on a new dataset?
What are good dataset representations? Can they be hand-crafted, learned through neural networks, or a combination of both?
How do we measure similarities between datasets?
How to measure dataset bias and fairness?
Can we improve training data quality through data engineering or simulation?
How to efficiently create labelled datasets in new environments?
How to create realistic datasets that serve real-world application purposes?
How can we alleviate the need for large-scale labelled datasets in deep learning?
Important Dates
Friday, March 25 [11:59 PM Pacific Time]: Paper submission deadline
Tuesday, April 12 [11:59 PM Pacific Time]: Final decisions to authors
Friday, April 15 [11:59 PM Pacific Time]: Camera-ready deadline
Monday, June 27: Half-day workshop (AM)
Note: the above deadlines apply to those who want their papers included in the proceedings. If you prefer not to be included in the proceedings but still want to share your work with the community, please contact the organizing committee to discuss possible arrangements.
Submissions
To ensure the high quality of the accepted papers, all submissions will be evaluated by research and industry experts from the corresponding fields. Reviewing will be double-blind, and we will accept submissions on work that is unpublished, currently under review, or already published. All accepted workshop papers will be published in the CVPR 2022 Workshop Proceedings via Computer Vision Foundation Open Access. The authors of all accepted papers (oral/spotlight/poster) will be invited to present their work at the workshop at CVPR 2022.
Submissions must be in English, in PDF format, and at most 8 pages (excluding references) in double-column format. Papers must follow the same formatting guidelines as all CVPR 2022 submissions; the author kit provides a LaTeX2e template and detailed formatting instructions. The submission site is: https://cmt3.research.microsoft.com/VDU2022/
Program
Location:
The event is being held virtually.
Schedule:
All times below are in Central Time (CT)
08:30 - 08:40 Workshop Kickoff and Opening Comments
08:40 - 09:10 First Keynote Speech
09:10 - 10:30 6 Long Oral Presentations (10-minute talk and 3-minute Q&A each)
ID-2: A Challenging Benchmark of Anime Style Recognition
ID-12: Few-Shot Image Classification Benchmarks are Too Far From Reality: Build Back Better with Semantic Task Sampling
ID-17: On the Choice of Data for Efficient Training and Validation of End-to-End Driving Models
ID-22: Dark Corner on Skin Lesion Image Dataset: Does it matter?
ID-31: The Topology and Language of Relationships in the Visual Genome Dataset
ID-37: Investigating Neural Architectures by Synthetic Dataset Design
10:30 - 10:50 Coffee Break
10:50 - 11:20 Second Keynote Speech
11:20 - 12:20 8 Spotlight/Poster (Short Oral) Presentations (5-minute talk and 2-minute Q&A each)
ID-1: Delving into High-Quality Synthetic Face Occlusion Segmentation Datasets
ID-3: Rethinking Illumination for Person Re-Identification: A Unified View
ID-4: What’s in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics
ID-8: Dataset Distillation by Matching Training Trajectories
ID-9: Can the Mathematical Correctness of Object Configurations Affect the Accuracy of Their Perception?
ID-13: BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
ID-15: Towards Explaining Image-Based Distribution Shifts
ID-16: deepPIC: Deep Perceptual Image Clustering For Identifying Bias In Vision Datasets
12:20 - 12:40 Coffee Break
12:40 - 13:40 8 Spotlight/Poster (Short Oral) Presentations (5-minute talk and 2-minute Q&A each)
ID-20: Can we trust bounding box annotations for object detection?
ID-21: Why Object Detectors Fail: Investigating the Influence of the Dataset
ID-27: Mitigating Paucity of data in Sinusoid Characterization Using Generative Synthetic Noise
ID-30: The Effect of Improving Annotation Quality on Object Detection Datasets: A Preliminary Study
ID-32: Analysis of Temporal Tensor Datasets on Product Grassmann Manifold
ID-33: A3D: Studying Pretrained Representations with Programmable Datasets
ID-38: Self-supervision versus synthetic datasets: which is the lesser evil in the context of video denoising?
ID-40: Video Action Detection: Analysing Limitations and Challenges
Additional Talk: The Missing Link: Finding label relations across datasets
13:40 - 13:45 Concluding Remarks
Invited Speakers
Organizers
Australian National University
Purdue University
University of Melbourne
Samsung AI Center
CSIRO, Australian National University
University of California San Diego
Program Committee
Yunzhong Hou Australian National University
Yue Yao Australian National University
Yao Ni Australian National University
Xiaoxiao Sun Australian National University
Weijian Deng Australian National University
Zheyuan David Liu Australian National University
Hao Zhu Australian National University
Shan Zhang Australian National University
Jaskirat Singh Australian National University
Changsheng Lu Australian National University
Lei Wang Australian National University
Mahsa Ehsanpour University of Adelaide
Gaurav Patel Purdue University
Vishal Purohit Purdue University
Taewook Kim Purdue University
Ze Wang Purdue University
Wei Chen Purdue University
Zichen Miao Purdue University
Seunghyun Hwang Purdue University
Bianjiang Yang Purdue University
Jeeyung Kim Purdue University
Guillermo Carbajal Universidad de la República
Mario Gonzalez Olmedo Universidad de la República
Matias Di Martino Duke University
Haopeng Li University of Melbourne
Zhun Zhong University of Technology Sydney
Saimunur Rahman University of Wollongong
Tarun Kalluri UCSD
Vishal Vinod UCSD
Adrien Courtois École Normale Supérieure Paris-Saclay
Samuel Hurault I.M.B Bordeaux