Lesson 1

Synopsis

This chapter briefly summarizes the history of reproducible research in linguistics and describes how this Handbook came to be. It also gives an overview of the Handbook's structure and explains how to use it.

Core concepts & keywords

Data: The various bits of evidence that arise during research and upon which conclusions, analyses, observations, generalizations, etc., can be based.

Data Management: A general name for the many tasks involved in the proper care of research data (e.g., data collection, storage, organization, analysis, sharing, and preservation).

Data Management Use Case: A concrete example of the application of data management principles in a research study; many Data Management Use Cases are found in this Handbook.

Reproducible Research: Research is reproducible when the published results can be replicated using the documented data, code, and methods employed by the author or provider, without any additional information and without needing to communicate with the author or provider. Reproducible research is generally seen as a marker of good research design.

Transparency: Being explicit, usually in writing, about the conditions under which your data was collected, where it is stored, and how it may be accessed and shared.

Activities

Exercises - Practice what you've learned

  • Look through the Data Management Use Case chapters and find at least three that are relevant to your research interests.

Implement these practices in your career

Related readings

Fidler, Fiona and John Wilcox. (2021). Reproducibility of Scientific Results. In Edward N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2021 Edition). https://plato.stanford.edu/archives/sum2021/entries/scientific-reproducibility/

McDonnell, Bradley and Patrick Hall. (2017, January). Developing methods for reproducible research in linguistics: A first step. [Poster presentation]. https://scholarspace.manoa.hawaii.edu/bitstream/10125/43573/1/Poster_McDonnell_Hall.pdf

About the authors:

Andrea L. Berez-Kroeker

Andrea Berez-Kroeker is a Professor in the Department of Linguistics at the University of Hawaiʻi at Mānoa, where she teaches classes primarily in language documentation. She is active in the field of endangered language archiving and her research interests include morphology, discourse, language reclamation, and data sustainability for linguistics.

Bradley McDonnell

Bradley McDonnell is an Assistant Professor in the Department of Linguistics at the University of Hawaiʻi at Mānoa. His specializations include documentary linguistics, Austronesian languages, interactional linguistics, and usage-based linguistics. He is also interested in improving data management workflows for reproducible research in linguistics.

Lauren B. Collister

Lauren B. Collister is the Director of the Office of Scholarly Communication and Publishing at the University Library System, University of Pittsburgh. She holds a Ph.D. in Sociolinguistics, earned while researching language change in online discourse. Her current work covers publishing, copyright, author rights, and advocacy for open research.

Eve Koller

Eve Koller is an Assistant Professor at Brigham Young University Hawaiʻi. She holds a Ph.D. in Linguistics from the University of Hawaiʻi at Mānoa. Her research interests include historical linguistics, language typology, morphology, writing systems, and language documentation and reclamation.

Citations

Cite this chapter:

Berez-Kroeker, Andrea L., Bradley McDonnell, Lauren Collister, and Eve Koller. 2022. Data management and reproducible research in linguistics: On the need for The Open Handbook of Linguistic Data Management. In The Open Handbook of Linguistic Data Management, edited by Andrea L. Berez-Kroeker, Bradley McDonnell, Eve Koller, and Lauren B. Collister, 3-8. doi.org/10.7551/mitpress/12200.003.0005. Cambridge, MA: MIT Press Open.

Cite this online lesson:

Gabber, Shirley, Danielle Yarbrough, Andrea L. Berez-Kroeker, Bradley McDonnell, Eve Koller, and Lauren B. Collister. 2022. "Lesson 1." Linguistic Data Management: Online companion course to The Open Handbook of Linguistic Data Management. Website: https://sites.google.com/hawaii.edu/linguisticdatamanagement/course-lessons/01-data-data-management-and-reproducible-research-in-linguistics-on-the [Date accessed].