Catamaran Resort, San Diego, CA
Data fuels Artificial Intelligence (AI) applications. Collecting and maintaining high-quality data that has been evaluated as ready for AI is therefore critical to creating new large AI models that enable scientific breakthroughs across domains. However, achieving AI readiness for research data poses several challenges. This workshop aims to enable discussion of novel and efficient methods for collecting and preparing data for AI, metrics to quantify data readiness, frameworks for improving data readiness, assessments of the impact of data on AI model performance, and existing challenges and caveats in managing and transforming historical and new data from various science domains into AI-ready data. The workshop seeks to bring together researchers from academia, industry, and national laboratories to share insights, foster collaboration, and push the existing boundaries between data and AI in scientific discovery.
The Data Readiness for AI (DRAI) workshop will feature contributed papers and invited speakers to discuss and shape the state of the art of this cross-cutting topic. Topics of interest include (but are not limited to):
Efficient Data Quality Improvement (Semi-Automatic and Automatic Methods)
Quantifiable Metrics for AI Data Readiness
Frameworks and Tools for Assessing & Automating Data Readiness
Frameworks for Data Readiness Improvement
Distributed and Parallel Algorithms for AI Data Processing (HPC Intersection)
Scalability of AI Readiness Solutions
Creation and Usage of AI-Ready Benchmark Datasets
Novel Approaches for Handling Massive Datasets
Real-World Use Case Assessments of Data Readiness Impact on AI Performance
Data Readiness in Integrated Scientific Pipelines: Use Cases, Challenges, Needs
Role of FAIR Principles in AI Readiness of Data
Ethical and Comprehensive Data Collection
Collection of Balanced Data and Characterization of Biases
AI Algorithms for Non-AI-Ready Scientific Data
Social Impacts of Data in AI Applications
Data Governance: Metrics, Methods, and Use Cases (Security, Privacy, Management)
Importance of Provenance and Contextual Information (e.g., as Captured in Datasheets)
Transparency and Ethics Related to AI Readiness
AI Readiness of Multi-Modal Scientific Data and Streaming Data
Submission Deadline: June 30, 2025 (previously June 15, 2025)
Notification of Acceptance: July 23, 2025 (previously July 1, 2025, then July 10, 2025)
Camera-ready Deadline: July 28, 2025 (previously July 15, 2025)
DRAI 2025 | Data Readiness for AI (DRAI) Workshop | Bez, Byna