1st CVPR Workshop on
Dataset Distillation
Overview
In the past decade, deep learning has advanced mainly by training ever-larger models on ever-larger datasets, at the price of massive computation and expensive hardware. As a result, research on designing state-of-the-art models is increasingly monopolized by large companies, while research groups with limited resources, such as universities and small companies, are unable to compete. Reducing the training dataset size while preserving the effect of training on the full data is therefore significant for cutting training costs, enabling green AI, and encouraging university research groups to engage in the latest research.
This workshop focuses on the emerging research field of dataset distillation (DD), which aims to compress a large training dataset into a tiny, informative one (e.g., 1% of the original size) while maintaining the performance of models trained on it. Beyond general-purpose efficient model training, dataset distillation can also greatly facilitate downstream tasks: neural architecture/hyperparameter search by speeding up model evaluation, continual learning by producing compact memory, federated learning by reducing data transmission, and privacy-preserving learning by avoiding the release of raw, privacy-sensitive data. Dataset distillation is also closely related to research topics including core-set selection, prototype generation, active learning, few-shot learning, generative models, and the broad area of learning from synthetic data.
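To make the task concrete, below is a minimal, self-contained sketch of one popular formulation, gradient matching (as in dataset condensation): the synthetic examples are treated as learnable tensors and optimized so that they induce roughly the same model gradients as the real data. Everything here (the toy data, shapes, and hyperparameters) is a hypothetical placeholder rather than any specific published method; practical methods also re-sample and update the network across many outer iterations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy stand-in for a real dataset: 10 classes, 64-d features.
num_classes, ipc, dim = 10, 1, 64            # ipc = synthetic examples per class
real_x = torch.randn(1000, dim)              # placeholder "real" data
real_y = torch.randint(0, num_classes, (1000,))

# The distilled set: a handful of learnable examples with fixed labels.
syn_x = torch.randn(num_classes * ipc, dim, requires_grad=True)
syn_y = torch.arange(num_classes).repeat_interleave(ipc)

model = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, num_classes))
params = list(model.parameters())
opt_syn = torch.optim.SGD([syn_x], lr=0.1)

for step in range(200):
    # Gradients of the training loss on real data (targets; no graph needed).
    g_real = torch.autograd.grad(
        F.cross_entropy(model(real_x), real_y), params)
    g_real = [g.detach() for g in g_real]

    # Gradients on synthetic data, keeping the graph so the matching loss
    # can backpropagate through them into syn_x.
    g_syn = torch.autograd.grad(
        F.cross_entropy(model(syn_x), syn_y), params, create_graph=True)

    # Update the synthetic examples so both gradient sets agree.
    match = sum(((gs - gr) ** 2).sum() for gs, gr in zip(g_syn, g_real))
    opt_syn.zero_grad()
    match.backward()
    opt_syn.step()
```

After distillation, a fresh model trained only on (syn_x, syn_y) should approximate one trained on the full data; quantifying that gap is exactly the kind of benchmarking question this workshop solicits.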
Although DD has become an important paradigm in various machine-learning tasks, its potential in computer vision (CV) applications, such as face recognition, person re-identification, and action recognition, is far from fully exploited. Moreover, DD has rarely been demonstrated effectively on advanced CV tasks such as object detection, image segmentation, and video understanding.
The purpose of this workshop is to bring together researchers and practitioners who share an interest in dataset distillation for computer vision, and to foster the development of the next generation of dataset distillation methods for CV applications.
News
Feb. 14: We offer three free workshop registrations for students.
Feb. 12: The paper submission site will open soon.
Important Dates
Submission Deadline: April 2, 2024 (11:59 PM GMT)
Notification of Acceptance: April 16, 2024 (11:59 PM GMT)
Camera-Ready Submission Deadline: April 23, 2024 (11:59 PM GMT)
Workshop Date: June 17, 2024 (Full day)
Call for Papers
We invite papers on dataset distillation, its related fields, and its applications. We welcome submissions along two tracks:
Full Papers: Up to 8 pages (not including references and appendices)
Short Papers: Up to 4 pages (not including references and appendices)
Accepted papers will be presented during a poster session and displayed on the workshop website. A select few outstanding papers will also be offered an oral presentation.
Topics
Potential topics include, but are by no means limited to:
Novel methods of dataset distillation for computer vision tasks, e.g., image classification, object detection, image segmentation, scene understanding, face recognition, object re-identification, human action recognition, medical image analysis, etc.
Dataset distillation for downstream tasks, e.g., continual learning, neural architecture search, federated learning, domain adaptation, and privacy preservation.
Robustness and fairness of models trained on the distilled dataset.
Theory study and interpretability of dataset distillation.
Dataset distillation via generative models.
Scaling dataset distillation for larger datasets and models.
Benchmarking dataset distillation methods.
Dataset distillation for machine unlearning.
Coreset selection theory, methods, and applications.
Hardware-accelerated approaches to boosting dataset distillation efficiency.
Emerging trends and future directions in dataset distillation research.
Privacy preservation with dataset distillation.
For a comprehensive list of previous dataset distillation works, please see this GitHub repo: Guang000/Awesome-Dataset-Distillation
Submission Instructions
Please use the CVPR 2024 template and upload your anonymized paper as a single PDF (including appendices).
Submissions are handled through OpenReview: link
Please note that authors will have the option to add their papers to the archival proceedings.
For any questions, please contact Dr. Saeed Vahidian (saeed.vahidian@duke.edu).
Chairs
Committee Members
Invited Speakers
Program Schedule
TBD