Open Science and R

03–A: What is Open Science?


“The process of making the content and process of producing evidence and claims transparent and accessible to others” (Munafò et al., 2017)

READ: https://www.apa.org/science/about/psa/2019/11/better-research-practices

03–B: Why Open Science?

Video (12:27)

03–C: The Open Science Lifecycle

Video (5:53)

03–D: Pre-registration

When you preregister your research, you're simply specifying your research plan in advance of your study and submitting it to a registry.

Preregistration separates hypothesis-generating (exploratory) research from hypothesis-testing (confirmatory) research. Both are important, but the same data cannot be used both to generate and to test a hypothesis. Doing so can happen unintentionally, and it reduces the credibility of your results. Addressing this problem through planning improves the quality and transparency of your research. It helps you report your study clearly and helps others who may wish to build on it.

So how do you preregister? Several websites and databases allow for preregistration. The most broadly applicable for the social sciences is the Open Science Framework: https://osf.io/prereg/. The Center for Open Science, which maintains the Open Science Framework, also provides guidance on what to include in your preregistration and how much detail to give, and offers templates, here: https://www.cos.io/initiatives/prereg?_ga=2.263330764.1195627208.1585935801-1853960792.1572623623.



Video (12:27)

Video (5:33)

Confirmatory Research

  • Hypothesis testing

  • Results are held to the highest standards

  • Data-independent

  • Minimizes false positives

  • P-values retain diagnostic value

  • Inferences may be drawn to wider population

Exploratory Research

  • Hypothesis generating

  • Results deserve to be replicated and confirmed

  • Data-dependent

  • Minimizes false negatives in order to find unexpected discoveries

  • P-values lose diagnostic value

  • Not useful for making inferences to any wider population
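The contrast in the two lists above, that p-values keep their diagnostic value only when the analysis is data-independent, can be illustrated with a small simulation. What follows is a minimal sketch in Python (standard library only) and not part of the original course materials. The function names, the five-outcome design, the sample sizes, and the z-test with a known standard deviation are all assumptions chosen for illustration. Every simulated outcome is pure noise, so any "significant" result is a false positive.

```python
import math
import random


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean == 0, assuming a known standard deviation."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(n_studies=2000, n_outcomes=5, n_per_outcome=30,
                         alpha=0.05, seed=1):
    """Simulate null studies and return (confirmatory_rate, exploratory_rate)."""
    rng = random.Random(seed)
    confirmatory_hits = 0
    exploratory_hits = 0
    for _ in range(n_studies):
        # Each outcome is drawn from a distribution with true mean 0 (no effect).
        p_values = [
            z_test_p_value([rng.gauss(0.0, 1.0) for _ in range(n_per_outcome)])
            for _ in range(n_outcomes)
        ]
        # Confirmatory: the outcome to test was fixed before seeing the data.
        if p_values[0] < alpha:
            confirmatory_hits += 1
        # Exploratory: the outcome is picked after looking (data-dependent).
        if min(p_values) < alpha:
            exploratory_hits += 1
    return confirmatory_hits / n_studies, exploratory_hits / n_studies


confirmatory_rate, exploratory_rate = false_positive_rates()
print(f"prespecified outcome:  {confirmatory_rate:.3f}")
print(f"best of five outcomes: {exploratory_rate:.3f}")
```

Under these assumptions, the prespecified test should produce false positives at roughly the nominal 5% rate, while reporting the best of five independent outcomes should do so at roughly 1 - 0.95^5, or about 23%. The same comparison could be written in R.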

When Can You Pre-Register?

  • Right before your next round of data collection

  • After you are asked to collect more data in peer review

  • Before you begin analysis of an existing data set

Why Pre-register?

  • Makes your science better by increasing the credibility of your results

  • Allows you to stake your claim to your ideas earlier

  • Offers an easy way to plan for better research

03–E: Other Open Science Practices

To increase transparency and improve the journal review process, many high-profile journals have started requiring researchers to make their raw data and analysis code available. Many journals still do not, but researchers who support open science will often make their data and code available anyway. If these are not included in the supplemental materials for the journal article, some scientists make them available on a lab website or GitHub account.

Another open science practice is to post preprints of your work and to publish with open-access publishers that do not charge individuals or universities extra to view the manuscript. This helps build equity in science and increases both access to your work and citations of it.

For example, researcher Richard Mann had to retract a paper after a friend, who had requested his code in order to build on the research, told him he had found an error in it. Dr. Mann had run a simulation with an N of 1 rather than N = 100. Typos happen, and having another set of eyes look at your work can keep mistakes from making it into the literature. Mann self-retracted the paper and later redid the research, but the episode serves as a cautionary tale (https://www.statnews.com/2017/06/01/shrimp-study-error/).
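One inexpensive guard against the kind of slip described above (a simulation run with an N of 1 instead of N = 100) is to state the intended number of replicates once and have the analysis check it before reporting anything. The following is a minimal Python sketch of that idea, not a reconstruction of Dr. Mann's code. The toy simulation, the constant N_REPLICATES, and the function names are all hypothetical.

```python
import random
import statistics

N_REPLICATES = 100  # intended number of simulation runs (hypothetical constant)


def run_simulation(n_replicates, seed=42):
    """Toy stand-in for a real simulation: one summary value per replicate."""
    rng = random.Random(seed)
    return [
        statistics.mean([rng.gauss(0.0, 1.0) for _ in range(50)])
        for _ in range(n_replicates)
    ]


def summarize(results, expected_n):
    """Summarize results, failing loudly if the replicate count is unexpected."""
    if len(results) != expected_n:
        raise ValueError(f"expected {expected_n} replicates, got {len(results)}")
    return {
        "n": len(results),
        "mean": statistics.mean(results),
        "sd": statistics.stdev(results),
    }


summary = summarize(run_simulation(N_REPLICATES), expected_n=N_REPLICATES)
print(summary["n"])
```

A check like this does not replace review by a second person, but it turns a silent typo into an immediate error, and sharing the code lets others see exactly which value was used.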

Video (14:03)

03–F: Questionable Research Practices and Barriers to Reproducibility

Video (12:02)

Video (12:52)

Video (11:59)

03–G: How R Encourages and Contributes to Open Science

Since R is freely available and new developments can be easily added and modified by the greater scientific community, it is inherently open. Additionally, the statistical packages in R are transparent about the calculations they perform, giving the user more flexibility and understanding when conducting analyses than other statistical software programs do. Finally, the integration of R code into LaTeX packages and the inclusion of R Markdown allow for easy and clear documentation of thought process, code, and analytical output for reporting and sharing. The ability to keep track of all details of your work and analysis in R allows for greater reproducibility in research.

Image by Allison Horst

References

  1. Abbeyelder, CC BY 4.0 <https://creativecommons.org/licenses/by/4.0>, via Wikimedia Commons

  2. Gaelen Pinnock, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

  3. Horst, A. Reproducibility_court. CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via https://github.com/allisonhorst

  4. Center for Open Science, CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>, via https://www.cos.io/

  5. https://ropensci.org/

  6. Oransky, I and Marcus, A (2017). A shrimp study’s jumbo error — and what other researchers can learn https://www.statnews.com/2017/06/01/shrimp-study-error/

Interested in Learning More?

  1. About pre-registration, including frequently asked questions: https://www.cos.io/initiatives/prereg?_ga=2.263330764.1195627208.1585935801-1853960792.1572623623.

  2. Tons of Open Science resources from Benjamin Le: http://www.benjaminle.com/openscience/