Exploring the Next Generation of Data
CVPR Workshop 2025
June 11th, 2025
Room 201B
The recording will be released after CVPR
Nashville, TN
Full Day Workshop
Data is more crucial than ever, having enabled everything from the first generation of deep learning models to the new generation of foundation models. These foundation models are rapidly being incorporated into several safety-critical applications in human life, such as medical diagnosis, autonomous driving, and AI chatbots. Thus, the large volume of data they rely on must be high-quality for safe model development. Due to the sheer volume of raw data, a scalable way to rank and select data by its inherent quality and value, for both generic and specific tasks, is necessary. Recently, foundation models themselves have been used to discover even more data to feed into further foundation model training. This cyclic relationship between data and foundation models introduces another layer of complexity and bias to consider. Overall, the enormous challenge of discovering the next generation of data requires several considerations: the definition of quality data, bias-free data, scalability, data generation, ethical data gathering, continuous data gathering, and hallucination-free foundation models for data mining.
The objective of this proposed CVPR 2025 workshop on Exploring the Next Generation of Data is to gather researchers and engineers across academia and industry to discuss how to tackle this large challenge together. We have invited leading experts as speakers and will call for papers to encourage further engagement and research in this challenging new field. We hope this workshop can be a platform to gather the new cutting-edge research required to address this challenge, spanning dataset distillation, bias and fairness, generative AI, large language models, and multimodal language models.
The proposed workshop will span a full day and consist of two main parts:
A total of 5 keynote presentations. Our speakers, who are leading industry experts and academics, will each discuss various facets of progress in the next generation of data.
A peer-reviewed paper track with oral presentations.
This workshop will feature invited talks and the publication of selected papers. See the program section for details.
Workshop paper submission deadline: Sunday, March 23rd, 2025 (23:59 PST); previously Friday, March 14th, 2025 (23:59 PST)
Notification to authors: Monday, March 31st, 2025 (23:59 PST)
Camera ready deadline: Sunday, April 6th, 2025 (23:59 PST)
We invite original paper submissions that address data mining, data distillation, data generation, and bias-free and fair data selection. Topics include, but are not limited to:
Dataset bias, fairness, and ethical considerations
Data distillation
Data curation
Data mixtures under compute budget
Scalable data mining
Generative models for synthetic data generation
Foundation models for data mining
Foundation models for data annotation
Hallucination-free vision-language models
We are following the CVPR paper format: https://cvpr.thecvf.com/Conferences/2025/AuthorGuidelines
LaTeX/Word Templates: CVPR 2025 Paper Template
We accept full-length submissions (max 8 pages, excluding references).
All submissions will be peer-reviewed by at least two reviewers.
Blind review: We adopt double-blind review for this workshop. Submitted papers and supplementary materials should not reveal any information about the authors.
Dual submission: We do not accept paper submissions that have been published or are under review at other conferences or workshops. Accepted papers are expected to be published in the CVPR proceedings.
In submitting a manuscript to the NEXD Workshop, the authors agree to the review process and agree to contribute to the reviewing process.
Submission site: https://cmt3.research.microsoft.com/NEXD2025
Acknowledgment
The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.