The rapid increase in biological data acquisition has made computational analysis essential to research in the life sciences. However, the many software tools used to analyze these data were developed in diverse settings, without the capability to interact with one another or to capture the information necessary to reproduce an analysis. The burden of maintaining analytical provenance is therefore placed on the individual scientist. As a consequence, publications in biomedical research usually do not contain sufficient information to reproduce the presented results. To alleviate these problems, we created GenePattern, a computational genomics environment that tracks the steps in the analysis of genomic data. Recently, in collaboration with Microsoft, we linked GenePattern to Microsoft Word. The resulting combination provides a Reproducible Research System that enables users to link analytical tools into workflows, to automatically record their work, and to transparently embed that “recording” into a publication without ever leaving their word processing environment. Importantly, it also allows exact reproduction of published results. In this talk we will review the motivating use case for GenePattern, its architecture and capabilities, and finally the Microsoft Word add-in that supports the GenePattern Reproducible Research Document (GRRD).