The Reproducibility Crisis in ML‑based Science
The use of machine learning (ML) methods for prediction and forecasting has become widespread across the quantitative sciences. However, a reproducibility crisis is brewing: we found 20 reviews across 17 scientific fields that collectively identify errors in 329 papers that adopt ML methods.
Hosted by the Center for Statistics and Machine Learning at Princeton University, our online workshop aimed to highlight the scale and scope of the crisis, identify root causes of the observed reproducibility failures, and make progress towards solutions.
We have made the workshop materials public: the talks and slides below, and the annotated reading list.
Talks and slides
Reading list and interactive session
In addition to the public session on July 28th, we prepared further material for participants interested in going deeper into reproducibility:
Annotated reading list: We prepared a reading list with relevant research on reproducibility from the last few years. Most of these papers were presented by speakers at the workshop, and the list is meant as an accompanying resource for further reading.
Tutorial and interactive session on July 29th, 3-4:30 PM ET: In a recent preprint, we (Kapoor and Narayanan) introduced model info sheets for improving reproducibility by detecting and preventing leakage. In our testing so far, users have been able to detect leakage in models they previously built by filling out model info sheets.
On the day after the workshop, we gave a brief tutorial on how model info sheets can help prevent leakage in your own research, and then hosted an interactive session.
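To make the notion of leakage concrete, here is a minimal, hypothetical sketch of one common form: preprocessing leakage, where normalization statistics are computed on the full dataset before splitting, so information about the test set bleeds into training. The toy data and variable names are illustrative assumptions, not from the preprint or the model info sheets themselves.

```python
import statistics

# Toy feature column: the last two rows will serve as the held-out test set.
# (Illustrative data, chosen so the leak is obvious.)
data = [1.0, 2.0, 3.0, 4.0, 100.0, 101.0]
train, test = data[:4], data[4:]

# Leaky preprocessing: the mean is computed over ALL rows, including the
# test rows, so the training features encode information about the test set.
leaky_mean = statistics.mean(data)

# Correct preprocessing: the mean is computed on the training rows only;
# the same statistic is then reused to transform the test rows.
clean_mean = statistics.mean(train)

print(leaky_mean, clean_mean)  # the two statistics differ substantially
```

The gap between the two means (roughly 35.2 versus 2.5 here) is exactly the kind of train–test contamination that a model info sheet asks authors to check for before reporting results.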