Research Data Management in Data Science and AI: Avoiding a Replicability Crisis (RDM4AI 2024)
The tutorial will take place on the morning of the 23rd of September @ KI 2024
The RDM4AI tutorial, supported by NFDI4DataScience, addresses the replicability crisis in Artificial Intelligence, with a particular focus on machine learning-based approaches. It covers the research data life cycle, emphasizing best practices for data and software management, metadata, documentation, versioning, and sharing, with practical examples of how to put them into practice. It also introduces model and dataset cards for comprehensive reporting and advocates adopting the FAIR Data Principles to turn research outputs into FAIR Digital Objects (FDOs). Aimed at academics from all domains who work in AI-related fields, the tutorial provides practical guidance to enhance transparency and accountability, fostering a more reliable and impactful AI landscape.
Programme:
9:00 - 10:00 Hands-On Session 1 — Model and Data Documentation for AI (Angelie Kraft). Participants will learn about documentation schemas that facilitate comprehensive reporting of key information such as model architectures, hyperparameters, and dataset characteristics. By standardizing documentation practices, researchers can enhance the reproducibility of experiments, support collaboration across diverse AI domains, and promote transparency regarding biases and limitations.
10:00 - 10:30 Coffee Break
11:00 - 11:30 Hands-On Session 2 — Metadata and SMPs for FAIR (Findable, Accessible, Interoperable, and Reusable) Research Software (Leyla Jael Castro). Participants will learn how to align their research software with the FAIR principles and how Software Management Plans (SMPs) complement this effort and promote good practices, making it easier for researchers to integrate software and its metadata into existing infrastructures and maximize the impact of their work.
11:30 - 12:30 Hands-On Session 3 — RO-Crate + Signposting to Create FDOs (Leyla Jael Castro). Participants will learn the basics of RO-Crate (an approach to packaging research objects together with their metadata) and Signposting (a web-based approach that adds typed links so that machines can more easily find the metadata relevant to a webpage), and how the two together support "webby" FAIR Digital Objects.
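To give a flavour of what Session 3 deals with, the following is a minimal sketch in Python, not the tutorial's own material. It builds a small RO-Crate 1.1 metadata document (the content of an ro-crate-metadata.json file) and formats Signposting-style typed links as an HTTP Link header. The crate name, file names, example.org URL, and licence are hypothetical placeholders, and the helper functions build_ro_crate and signposting_link_header are invented for illustration.

```python
import json


def build_ro_crate(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate 1.1 metadata document as a dict.

    `files` is a list of (path, label) pairs; all values passed in below
    are hypothetical examples.
    """
    # Metadata descriptor: identifies this file and points at the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # Root dataset: describes the package as a whole and lists its parts.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "hasPart": [{"@id": path} for path, _ in files],
    }
    # One data entity per packaged file.
    file_entities = [
        {"@id": path, "@type": "File", "name": label} for path, label in files
    ]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *file_entities],
    }


def signposting_link_header(links):
    """Format (target, rel, media_type) triples as an HTTP Link header value."""
    parts = []
    for target, rel, media_type in links:
        part = f'<{target}>; rel="{rel}"'
        if media_type:
            part += f'; type="{media_type}"'
        parts.append(part)
    return ", ".join(parts)


# Hypothetical example values, for illustration only.
crate = build_ro_crate(
    name="Example model and training data",
    description="A trained model card and the dataset it documents.",
    date_published="2024-09-23",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=[("model_card.md", "Model card"), ("train.csv", "Training data")],
)

link_header = signposting_link_header(
    [
        (
            "https://example.org/crate/ro-crate-metadata.json",
            "describedby",
            "application/ld+json",
        ),
        ("https://creativecommons.org/licenses/by/4.0/", "license", None),
    ]
)

print(json.dumps(crate, indent=2))
print("Link:", link_header)
```

A landing page that sends such a Link header lets a machine client discover the crate's metadata without scraping the HTML, which is the idea behind combining the two approaches.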
Materials:
Brief NFDI4DS intro: https://sync.academiccloud.de/index.php/s/CzybwOShPtvywjg
Session 1 — Model and Data Documentation for AI
Session 2 — Introduction and Tutorial on Metadata and SMPs for FAIR Research Software
Slides (including hands-on): DOI:10.5281/zenodo.13799879
Additional material for hands-on at DOI:10.5281/zenodo.13799121
Session 3 — Introduction and Tutorial on Signposting
Session recordings will be uploaded afterwards.
Organizing Committee:
Leyla Jael Castro (ZB MED)
Angelie Kraft (University of Hamburg)
Ricardo Usbeck (Leuphana University Lüneburg)