Modern data analysis can be conceptualized as a modular, linear pipeline of sequential decisions. In a complete pipeline, decisions are made about how to collect data, how to clean and transform those data, and how to train models and quantify their success. These decisions depend on the software tools used and on the operations applied to the data at each stage of the pipeline. R, one of the most widely used statistical software environments for data analysis, relies on user-developed “packages” for many data science and data analysis tasks. These packages change over time, which can undermine computational reproducibility efforts and frustrate users, who are left to locate the points of failure in broken analysis code.
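Because a broken pipeline often traces back to a package that changed between runs, a common first step toward reproducibility is simply recording the exact versions in use at analysis time. The sketch below is a minimal base R illustration of this idea; the package names and the output filename are illustrative assumptions rather than anything prescribed here, and dedicated tools such as renv provide fuller, lockfile-based workflows.

```r
# Minimal sketch: record the package versions an analysis depends on, so that
# later breakage can be traced to a version change. The packages listed here
# are stand-ins; replace them with the packages your analysis actually loads.
pkgs <- c("stats", "utils")

# Capture the exact version of each package at analysis time.
versions <- vapply(pkgs, function(p) as.character(packageVersion(p)),
                   character(1))
print(versions)

# sessionInfo() records the R version plus all attached packages; saving its
# output alongside the analysis is a lightweight reproducibility record.
# The filename "session-info.txt" is an arbitrary choice.
writeLines(capture.output(sessionInfo()), "session-info.txt")

# Tools such as renv go further by snapshotting a full lockfile, e.g.:
# renv::init(); renv::snapshot()   # requires the renv package
```

A version record of this kind does not prevent packages from changing, but it makes the comparison between a working and a broken run tractable, which is precisely the diagnostic burden the paragraph above describes.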