Paper decisions and competition results have been announced. The list is available here.
The results of the paper acceptance and the competition will be announced on July 10th, 2025.
The paper submission deadline is June 30th, 2025, 11:59 PM HST *. Submission Website: OpenReview
The DataCV Challenge can be found here.
DataCV 2025 will be held in conjunction with ICCV 2025 in Honolulu, Hawai'i.
Information about the previous years' workshops, DataCV (VDU) 2022, 2023, and 2024, is available here.
Data is the fuel of computer vision, on which state-of-the-art systems are built. A robust face recognition system needs not only a strong model architecture and learning algorithm but also a comprehensive, large-scale training set. Despite the pivotal significance of datasets, existing research in computer vision is usually algorithm-centric. In domain adaptation, for example, quantitative understanding of the domain gap itself is far more limited than the abundance of algorithm-centric work. As a result, there are currently few investigations into the representations of datasets, whereas an abundance of literature concerns ways to represent images and videos, the essential elements of datasets.
The 4th DataCV workshop aims to bring together research and discussion focused on analyzing vision datasets, as opposed to the commonly seen algorithm-centric work. Specifically, the following topics are of interest in this workshop.
Properties and attributes of vision datasets
Application of dataset-level analysis
Representations of and similarities between vision datasets
Improving vision dataset quality through generation and simulation
Exploring Vision-Language Models (VLMs) from a data-centric perspective
In summary, the questions relevant to this workshop include, but are not limited to:
Can vision datasets be analyzed on a large scale?
How to holistically understand the visual semantics contained in a dataset?
How to define vision-related properties and problems on the dataset level?
How can we improve algorithm design by better understanding vision datasets?
Can we predict the performance of an existing model on a new dataset?
What are good dataset representations? Can they be hand-crafted, learned through neural nets or a combination of both?
How do we measure similarities between datasets? (A minimal illustration follows this list.)
How to measure dataset bias and fairness?
Can we improve training data quality through data engineering or simulation?
How to efficiently create labelled datasets under new environments?
How to create realistic datasets that serve our real-world application purpose?
How can we alleviate the need for large-scale labelled datasets in deep learning?
How to analyze model performance in environments lacking annotated data?
How can we assess model bias and fairness in vision models from a data perspective?
How can generated data be used to alleviate privacy concerns in computer vision tasks?
How to better evaluate diffusion models and large language models using data-centric approaches?
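To make the dataset-similarity question above concrete, one common approach is to compare the statistics of image features extracted from the two datasets. Below is a minimal sketch, assuming per-image features have already been extracted with a pretrained backbone (not shown here); it computes the Fréchet distance between Gaussian fits of the two feature sets, the same statistic underlying FID. The arrays `feats_a` and `feats_b` are hypothetical placeholders, not part of any workshop code.

```python
# A minimal sketch of one dataset-similarity measure: the Fréchet distance
# between Gaussians fitted to per-image feature vectors. Feature extraction
# is assumed to happen elsewhere, e.g. with a pretrained backbone; feats_a
# and feats_b are hypothetical (N, D) feature arrays for the two datasets.
import numpy as np
from scipy import linalg


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussian fits of two feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Matrix square root of the covariance product; small imaginary parts
    # arising from numerical error are discarded.
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))


# Example with random stand-in features (replace with real backbone features):
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(512, 64))
feats_b = rng.normal(loc=0.5, size=(512, 64))
print(frechet_distance(feats_a, feats_b))
```

This is only one possible similarity measure; learned dataset representations and other distribution distances are equally valid answers to the question and are exactly the kind of contribution the workshop invites.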
The 1st VDU Workshop @ CVPR 2022, New Orleans, Louisiana
The 2nd VDU Workshop @ CVPR 2023, Vancouver, Canada
The 3rd DataCV Workshop @ CVPR 2024, Seattle, Washington