Relancer: Restoring the Executability of Jupyter Notebooks by Automatic Upgrade of Deprecated APIs
Relancer
Relancer is an automatic technique that restores the executability of broken Jupyter Notebooks by upgrading deprecated APIs.
A Video Guide to Relancer
About
Data scientists typically practice exploratory programming in computational notebooks to comprehend new data and extract insights. They iteratively refine their code and actively try to re-use and re-purpose solutions created by other data scientists. However, recent studies have shown that a vast majority of publicly available notebooks cannot be executed out of the box. One prominent reason is that the data science APIs used in these notebooks have been deprecated as the underlying libraries rapidly evolve.
Relancer is an automatic technique that restores the executability of broken Jupyter Notebooks, in near real time, by upgrading deprecated APIs. Relancer employs an iterative, runtime-error-driven approach to identify and fix one API issue at a time. This is supported by a machine-learned model that uses the runtime error message to predict the kind of API repair needed: an update to an API or package name, a parameter, or a parameter value. Relancer then creates a search space of candidate repairs by combining knowledge from API migration examples on GitHub with the API documentation, and employs a second machine-learned model to rank the candidate mappings in this space.
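The iterative, error-driven loop described above can be illustrated with a small, self-contained Python sketch. It is not Relancer's implementation: the toy library `lib`, the `CANDIDATES` table, and the functions `predict_repair_kind`, `try_run`, and `repair` are all hypothetical names introduced here. Simple rules stand in for the first learned model, and list order stands in for the learned ranking. A candidate is accepted only if the patched cell runs or fails with a different type of error, which is a crude assumption about what counts as progress.

```python
from types import SimpleNamespace


def read_table(path, columns=None):
    """Toy replacement API in a hypothetical upgraded library."""
    return {"path": path, "columns": columns}


# Hypothetical library object; the old name `load_table` no longer exists on it.
lib = SimpleNamespace(read_table=read_table)

# Hypothetical candidate repairs per repair kind, as (old, new) text pairs.
# List order stands in for the learned ranking of candidates.
CANDIDATES = {
    "api_name": [("load_table", "read_table"), ("load_table", "read_csv")],
    "parameter": [("cols=", "names="), ("cols=", "columns=")],
}


def predict_repair_kind(error):
    """Rule-based stand-in for the learned error-message classifier."""
    if isinstance(error, (AttributeError, NameError, ImportError)):
        return "api_name"
    if isinstance(error, TypeError) and "unexpected keyword argument" in str(error):
        return "parameter"
    return None


def try_run(cell):
    """Execute a cell in a fresh namespace; return the exception, or None on success."""
    try:
        exec(cell, {"lib": lib})
        return None
    except Exception as error:
        return error


def repair(cell, max_iterations=10):
    """Fix one runtime error per iteration; return the repaired cell or None."""
    for _ in range(max_iterations):
        error = try_run(cell)
        if error is None:
            return cell  # the cell executes, nothing left to repair
        kind = predict_repair_kind(error)
        for old, new in CANDIDATES.get(kind, []):
            if old not in cell:
                continue
            patched = cell.replace(old, new)
            new_error = try_run(patched)
            # Accept the candidate if the cell now runs or fails differently.
            if new_error is None or type(new_error) is not type(error):
                cell = patched
                break
        else:
            return None  # no candidate made progress on this error
    return None


broken = 'result = lib.load_table("data.csv", cols=["a"])'
print(repair(broken))
```

In this sketch the first iteration repairs the API name, which exposes a second error about the renamed parameter; the second iteration rejects the `names=` candidate because the error type does not change, then accepts `columns=`.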