Open Science and R
TABLE OF CONTENTS:
03–A: What is Open Science?
“The process of making the content and process of producing evidence and claims transparent and accessible to others” (Munafo et al., 2017)
READ: https://www.apa.org/science/about/psa/2019/11/better-research-practices
03–B: Why Open Science?
Video (12:27)
03–C: The Open Science Lifecycle
Video (5:53)
03–D: Pre-registration
When you preregister your research, you're simply specifying your research plan in advance of your study and submitting it to a registry.
Preregistration separates hypothesis-generating (exploratory) research from hypothesis-testing (confirmatory) research. Both are important. However, the same data should not be used both to generate and to test a hypothesis. This can happen unintentionally, and it reduces the credibility of your results. Addressing the problem through advance planning improves the quality and transparency of your research. It helps you report your study clearly and helps others who may wish to build on it.
So how do you preregister? Several websites and databases accept preregistrations. The most broadly applicable for the social sciences is the Open Science Framework (OSF): https://osf.io/prereg/. The Center for Open Science, which maintains the OSF, also explains what to include in your preregistration and how much detail to provide, and offers templates: https://www.cos.io/initiatives/prereg
Video (12:27)
Video (5:33)
Confirmatory Research
Hypothesis testing
Results are held to the highest standards
Data-independent
Minimizes false positives
P-values retain diagnostic value
Inferences may be drawn to wider population
Exploratory Research
Hypothesis generating
Results deserve to be replicated and confirmed
Data-dependent
Minimizes false negatives in order to find unexpected discoveries
P-values lose diagnostic value
Not useful for making inferences to any wider population
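To see why p-values lose diagnostic value when many data-dependent tests are run, consider a minimal sketch. It is written in Python purely for illustration, and the function name and the independence assumption are ours, not part of the course materials. If every null hypothesis is true and k independent tests are each run at alpha = 0.05, the chance of at least one false positive is 1 - (1 - alpha)^k.

```python
def family_wise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n_tests tests.

    Assumes the tests are independent and that every null hypothesis
    is true (both are simplifying assumptions for this illustration).
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_tests


# Compare a single planned test with several unplanned ones.
for k in (1, 5, 20):
    print(f"{k:>2} tests: {family_wise_error_rate(k):.3f}")
```

Under these assumptions, a single preregistered test keeps the false positive rate at 5%, while twenty unplanned tests on the same data push the chance of at least one spurious "finding" to roughly 64%.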
When Can You Pre-Register?
Right before your next round of data collection
After you are asked to collect more data in peer review
Before you begin analysis of an existing data set
Why Pre-register?
Makes your science better by increasing the credibility of your results
Allows you to stake your claim to your ideas earlier
It's an easy way to plan for better research
03–E: Other Open Science Practices
To increase transparency and improve the journal review process, many high-profile journals now require researchers to make their raw data and analysis code available. Many journals still do not, but researchers who support open science often share their data and code anyway. If these materials are not included in the article's supplemental materials, some scientists post them on a lab website or a GitHub account.
Another open science practice is to post pre-prints of your work and to publish with open-access publishers, which do not charge individuals or universities to view the manuscript. This helps build equity in science and can increase both access to your work and citations of it.
Sharing code also gives others a chance to catch errors. For example, researcher Richard Mann retracted a paper after a friend, who had requested his code in order to build on the research, told him he had found an error in it. Dr. Mann had run a simulation with N = 1 rather than N = 100. Typos happen, and having another set of eyes look at your work can keep mistakes from making it into the literature. Mann self-retracted the paper and later redid the research, but the episode serves as a cautionary tale (https://www.statnews.com/2017/06/01/shrimp-study-error/).
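One inexpensive safeguard against this kind of slip is to have the analysis code check its own settings before it runs. The sketch below is hypothetical: it is not Dr. Mann's code, it is written in Python only for illustration, and the function name, parameters, and toy simulation are all assumptions. It refuses to run with fewer than two replicates and reports the replicate count alongside the results so that a reader can spot an N of 1 at a glance.

```python
import random
import statistics


def run_simulation(n_replicates, seed=0):
    """Run a toy simulation and summarize it (hypothetical example).

    Each replicate is a single draw from a standard normal distribution,
    standing in for whatever a real simulation would compute.
    """
    # Guard against accidentally running a single replicate.
    if n_replicates < 2:
        raise ValueError(
            f"n_replicates must be at least 2, got {n_replicates}"
        )

    # A fixed seed makes the run repeatable for anyone who reuses the code.
    rng = random.Random(seed)
    results = [rng.gauss(0.0, 1.0) for _ in range(n_replicates)]

    # Report the replicate count with the summary statistics.
    return {
        "n_replicates": len(results),
        "mean": statistics.mean(results),
        "sd": statistics.stdev(results),
    }


summary = run_simulation(n_replicates=100)
print(summary["n_replicates"])
```

The same idea carries over to R, for example by checking the replicate count at the top of a script before the simulation starts.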
Video (14:03)
03–F: Questionable Research Practices and Barriers to Reproducibility
Video (12:02)
Video (12:52)
Video (11:59)
03–G: How R Encourages and Contributes to Open Science
Because R is freely available and the wider scientific community can easily add and modify packages, it is inherently open. R's statistical packages are also transparent about the calculations they perform, which gives users more flexibility and understanding when conducting analyses than other statistical software programs do. Finally, embedding R code in LaTeX documents and in R Markdown makes it easy to document thought process, code, and analytical output clearly for reporting and sharing. Being able to keep track of every detail of your work and analysis in R makes research more reproducible.
Image by Allison Horst
References
Abbeyelder, CC BY 4.0 <https://creativecommons.org/licenses/by/4.0>, via Wikimedia Commons
Gaelen Pinnock, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
Horst, A. Reproducibility_court. CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via https://github.com/allisonhorst
Center for Open Science, https://www.cos.io/, CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>
Oransky, I and Marcus, A (2017). A shrimp study’s jumbo error — and what other researchers can learn https://www.statnews.com/2017/06/01/shrimp-study-error/
Interested in Learning More?
About pre-registration, including Frequently Asked Questions: https://www.cos.io/initiatives/prereg
Tons of Open Science resources from Benjamin Le: http://www.benjaminle.com/openscience/