Lesson 8

Synopsis

A Data Management Plan (DMP) is a document created early in a research project that describes the types of data to be generated; how the data will be compiled, analyzed, and stored; who will have access to the data during the project; the legal and ethical status of the data; and how the data will be handled after the project is complete, including deletion or destruction of some or all of the data, long-term preservation of a subset of the data, and how preserved data will be shared. While many funders, publishers, and institutions require a DMP for all new research projects, many researchers view the DMP as a burden. In practice, however, good data management planning from the project outset can save time, money, and frustration, while ultimately helping to increase the impact of research. This chapter is intended to guide researchers through the process of developing and writing a comprehensive DMP that can be modified to satisfy the specific requirements of a funder, publisher, or institution.

Core concepts & keywords

3-2-1 Rule: Keep 3 copies of your data on at least 2 storage media with at least 1 copy in an off-site location.

LOCKSS: Lots of Copies Keep Stuff Safe

Data Content: The type of data to be collected and created, such as recorded conversations, transcriptions and translations of the recordings, and coding of the grammatical features of the conversations.

Data Dictionary: A key used to document all terms, conventions, codes, abbreviations, units of measure, recording frequencies, software settings, etc. used in a particular project.

Data Digital Parameters: The file types and formats of the data, such as .wav audio files, .xml transcription/translation files, and .csv coding files.

Data Management Plan: A written document that outlines a researcher's long-term and short-term plans for generating, handling, describing, organizing, processing, analyzing, preserving, and sharing the data resulting from a research project.

Data Preservation: Data storage combined with the management activities that ensure digital files and their metadata remain accessible in the future, even as software and hardware change.

Data Storage: The location where you keep your files (such as cloud storage, IT-managed storage, or a hard drive) so that you or your team can access them.

Documentation: Explains the context of a research project and how that project is carried out (methodology, protocols, workflows, procedures, manuals, programs, equipment configurations, software settings, etc.); how data are organized, managed, stored, and backed up; how data files, points or sets are related; and how data quality is controlled or assured.

Digital Repository: An organization that is committed to preserving data files in a digital format for an agreed-upon period while making the files discoverable and accessible online; also known as a data repository or digital archive.

Metadata: Structured information about an item that makes the data discoverable.

Administrative Metadata: The technical information about a file, as well as the rights management (copyright, licenses) and preservation information.

Descriptive Metadata: Information such as author, title, abstract, keywords, and publication date.

Structural Metadata: Information about the relationship between files or other objects in a dataset.

Readme file: A human-readable file used to describe and contextualize a project, experiment, or protocol and to explain what the individual files in a folder are and how they are related to each other.
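
As a rough illustration of the 3-2-1 Rule defined above, the following Python sketch checks whether a planned set of copies meets the rule. The `DataCopy` class, the `satisfies_3_2_1` function, and the example labels are hypothetical names chosen for this illustration; they do not come from the chapter or from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One planned copy of the project data (hypothetical structure)."""
    label: str     # human-readable name for the copy
    medium: str    # kind of storage medium, e.g. "external_hdd" or "cloud"
    offsite: bool  # True if the copy is kept away from the main work site


def satisfies_3_2_1(copies):
    """Return True if there are at least 3 copies, on at least 2 media,
    with at least 1 copy stored off-site."""
    distinct_media = {c.medium for c in copies}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(distinct_media) >= 2 and has_offsite


# Example plan (assumed values, for illustration only)
plan = [
    DataCopy("laptop", "internal_ssd", offsite=False),
    DataCopy("desk backup drive", "external_hdd", offsite=False),
    DataCopy("institutional cloud", "cloud", offsite=True),
]

print(satisfies_3_2_1(plan))      # all three conditions are met
print(satisfies_3_2_1(plan[:2]))  # only two copies, none off-site
```

A check like this is only a planning aid: it confirms that the plan lists enough copies, not that the backups actually exist or can be restored.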

Activities

Exercises - Practice what you've learned

Implement these practices in your career

  • Try using an online tool, such as the DMPTool (https://dmptool.org/) and/or ezDMP (https://ezdmp.org/index), to create a first draft of your DMP.

    • Does the resulting DMP seem complete?

    • Does it satisfy your needs?

    • Does it seem relevant to your sub-discipline of linguistics?

    • Think about how you could revise it to make it more comprehensive for your entire project.

  • Test some of the file size calculators listed in Kung’s Resources for Creating a DMP (DMP-Resources.pdf) to determine your data storage needs.
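
To give a sense of what such a calculator does, here is a minimal Python sketch that estimates storage for uncompressed PCM audio (the kind typically stored in .wav files). It assumes the size is simply sample rate × bytes per sample × channels × duration, and it ignores the small file header and any embedded metadata. The function names and the example project figures (50 hours, 48 kHz, 24-bit, stereo, 3 copies) are assumptions made for illustration, not values from Kung's resources.

```python
def pcm_audio_bytes(hours, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Approximate size in bytes of uncompressed PCM audio.

    Ignores the file header and metadata, so real files are slightly larger.
    """
    seconds = hours * 3600
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


def total_storage_bytes(raw_bytes, copies=3):
    """Total space needed if every file is kept in `copies` copies."""
    return raw_bytes * copies


# Hypothetical project: 50 hours of 48 kHz, 24-bit, stereo recordings
raw = pcm_audio_bytes(50, sample_rate_hz=48_000, bit_depth=24, channels=2)
total = total_storage_bytes(raw, copies=3)

print(f"Raw audio: {raw / 1e9:.2f} GB")           # Raw audio: 51.84 GB
print(f"With 3 copies: {total / 1e9:.2f} GB")     # With 3 copies: 155.52 GB
```

The estimate covers audio only; video, transcription files, and derived data would need to be added separately, and compressed formats will come out smaller than this calculation suggests.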

Quiz - Test yourself!

Related readings

Kung, Susan, 2019, "Data management plans for linguistic research", https://doi.org/10.18738/T8/538EEN, Texas Data Repository, V6.

Share your thoughts on this article or topic

Use #LingData #LingDataManagement #DMP on your favorite social media platform!

About the author:

Susan Smythe Kung, PhD, is the Archives Manager of the Archive of the Indigenous Languages of Latin America (AILLA) at the University of Texas at Austin, and she is internationally engaged in the formulation of best practices for organizing, archiving, sharing, and citing language documentation data. She is one of the creators and authors of the open educational resource "Archiving for the Future: Simple Steps for Archiving Language Documentation Collections" available at archivingforthefuture.teachable.com. Kung is also a documentary field linguist who has extensively documented Huehuetla Tepehua, an endangered Indigenous language spoken in Huehuetla, Hidalgo, Mexico. Her dissertation, A Descriptive Grammar of Huehuetla Tepehua, won the Mary R. Haas Book Award from the Society for the Study of the Indigenous Languages of the Americas. Language data that she collected during her fieldwork is publicly accessible in AILLA, www.ailla.utexas.org.

Citations

Cite this chapter:

Kung, Susan Smythe. 2022. Developing a data management plan. In The Open Handbook of Linguistic Data Management, edited by Andrea L. Berez-Kroeker, Bradley McDonnell, Eve Koller, and Lauren B. Collister, 101-116. doi.org/10.7551/mitpress/12200.003.0012. Cambridge, MA: MIT Press Open.

Cite this online lesson:

Gabber, Shirley, Danielle Yarbrough, Andrea L. Berez-Kroeker, Bradley McDonnell, Eve Koller, Lauren B. Collister, and Susan Smythe Kung. 2022. "Lesson 8." Linguistic Data Management: Online companion course to The Open Handbook of Linguistic Data Management. Website: https://sites.google.com/hawaii.edu/linguisticdatamanagement/course-lessons/08-developing-a-data-management-plan [Date accessed].