Faculty Collaborator: Andrew Creamer
About:
For her project, Lily collaborated with Andrew Creamer, the Science Data Specialist at Brown, to enhance the discoverability of data in the Brown Digital Repository (BDR). First, she researched the FAIR principles and the DataCite schema for storing metadata. Then, together they developed a list of descriptors and persistent identifiers (PIDs) that each dataset in the BDR should have. She also used OpenRefine to write a script that transforms metadata from a spreadsheet format into XML that follows the DataCite schema.
Making Data FAIR
FAIR Principles:
Findable
Accessible
Interoperable
Reusable
Goals
Identify a set of metadata that will make the datasets in the BDR (Brown Digital Repository) FAIR
Create a crosswalk between analogous properties in the MODS schema and the DataCite schema
Build a tool to collect metadata for a dataset and output it in an XML format in the DataCite schema
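A crosswalk of the kind described in the goals above can be pictured as a lookup table from one schema's element names to the other's. The sketch below is only an illustration in Python: the handful of MODS and DataCite element names are real, but the pairings shown, the flat "path" keys, and the function name crosswalk are assumptions made for this example and are not the crosswalk produced in the project.

```python
# Illustrative MODS -> DataCite pairs (assumed for this example,
# NOT the project's actual crosswalk).
MODS_TO_DATACITE = {
    "titleInfo/title": "title",
    "name/namePart": "creatorName",
    "originInfo/publisher": "publisher",
    "abstract": "description",
    "subject/topic": "subject",
    "accessCondition": "rights",
}


def crosswalk(mods_record):
    """Rename MODS keys to DataCite names.

    Returns a pair: the mapped record, and a list of MODS keys that
    have no entry in the table and need a human decision.
    """
    mapped = {}
    unmapped = []
    for key, value in mods_record.items():
        target = MODS_TO_DATACITE.get(key)
        if target is None:
            unmapped.append(key)
        else:
            mapped[target] = value
    return mapped, unmapped
```

Returning the unmapped keys separately, rather than dropping them silently, makes it easier to spot properties that have no analogue in the other schema.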
Challenges
Cognitive load: Balancing the amount of metadata requested from researchers
Technical: Converting tabular data into XML in the DataCite schema
Time management: Planning and prioritizing a project with few tangible outputs
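To make the technical challenge above concrete, here is a minimal sketch of turning one spreadsheet row into DataCite-style XML. The project itself used OpenRefine; this example swaps in Python's standard library purely for illustration. The column names (doi, creators, title, publisher, year), the sample values, and the placeholder DOI are all hypothetical, and only a few required DataCite properties are shown, so the output is not guaranteed to validate against the full schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by version 4 of the DataCite Metadata Schema.
DATACITE_NS = "http://datacite.org/schema/kernel-4"

# Hypothetical spreadsheet export; every value here is made up.
SAMPLE_CSV = (
    "doi,creators,title,publisher,year\n"
    '10.1234/example,"Doe, Jane; Roe, Richard",Example dataset,'
    "Example University,2020\n"
)


def _tag(name):
    """Qualify an element name with the DataCite namespace."""
    return f"{{{DATACITE_NS}}}{name}"


def row_to_datacite(row):
    """Build a DataCite <resource> element from one row (a dict)."""
    # Serialize DataCite elements without a prefix (default namespace).
    ET.register_namespace("", DATACITE_NS)
    resource = ET.Element(_tag("resource"))

    identifier = ET.SubElement(resource, _tag("identifier"), identifierType="DOI")
    identifier.text = row["doi"]

    # Assumed convention: multiple creators are separated by semicolons.
    creators = ET.SubElement(resource, _tag("creators"))
    for name in row["creators"].split(";"):
        creator = ET.SubElement(creators, _tag("creator"))
        ET.SubElement(creator, _tag("creatorName")).text = name.strip()

    titles = ET.SubElement(resource, _tag("titles"))
    ET.SubElement(titles, _tag("title")).text = row["title"]

    ET.SubElement(resource, _tag("publisher")).text = row["publisher"]
    ET.SubElement(resource, _tag("publicationYear")).text = row["year"]

    resource_type = ET.SubElement(
        resource, _tag("resourceType"), resourceTypeGeneral="Dataset"
    )
    resource_type.text = "Dataset"

    return ET.tostring(resource, encoding="unicode")


# Convert each row of the sample spreadsheet and print the XML.
for record in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    print(row_to_datacite(record))
```

The nested creators and titles wrappers are where a flat table and the XML structure diverge: one cell has to be split into repeated child elements, which is the kind of conversion the challenge refers to.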
Accomplishments
Developed a list of metadata that will make the datasets in the BDR FAIR
Created a crosswalk between the MODS schema and the DataCite schema
In progress: Tool to collect metadata for a dataset and output it in an XML format in the DataCite schema
Additional Achievements:
Researched data storage in depth, along with the interaction between data science and libraries
Gained experience with OpenRefine for reformatting spreadsheets, saving editing scripts, and exporting XML files
Tackled the challenges of adopting data storage principles and new metadata schemas
Learned to balance cognitive load with rigor in data collection
Gained experience in creating meeting agendas and a project management plan
Maintained frequent and valuable communication with client/mentor
Actively sought clarification and feedback