GOOD-DATA @ AAAI 2025
1st Workshop on Preparing Good Data for Generative AI: Challenges and Approaches
3 March 2025
Location: Philadelphia, PA, USA
Room TBD
Motivation
Foundation models depend heavily on the data they are trained on. Although self-supervised learning is one of their promises, it is clear that carefully processed datasets lead to better models. While datasets and models are frequently released by the community, data preparation recipes remain relatively nascent and not fully open. In this workshop, we invite contributions and collaborations on data preparation recipes for creating and using foundation models and generative AI applications, including (but not limited to) pre-training, alignment, fine-tuning, and in-context learning. Data preparation spans data acquisition, cleaning, processing, mixtures, quality assessments, value of data, ablation studies, safety, and governance.
Participation
We encourage submissions on the topics of interest listed below, and we also welcome other interesting and relevant research on preparing good data.
Data acquisition, cleaning, processing, and mixture recipes
Data quality assessment and quantifying the value of data
Data sequence for multi-phase and curriculum learning
Model-based data improvement techniques
Ablation study strategies to understand the interplay between data and model
Data safety and governance
Responsible and ethical considerations of data collection and human annotation
Diversity, bias, transparency, and privacy of data
Theoretical modeling and analysis of data-related aspects in generative AI
Large-scale data processing (intersection between systems and algorithms)
Data value
Papers will be peer-reviewed under a double-blind policy, and the submission deadline is November 15th, 2024. Accepted papers will be presented at the poster session, some will also be given as oral presentations, and one paper will receive the best paper award.
Please see the call for papers page for more details about paper submission.
Registration: To register for the workshop, please follow the workshop registration guidelines of AAAI 2025.