Project Description-
This was a partnered project I completed in the fall of 2025 as part of Rampin’s Data Librarianship Course. The GDA(E) (General, Descriptive, Administrative, Experiential) Rubric is a holistic tool for museums to evaluate the completeness and interoperability of their metadata. The rubric synthesizes elements from the FAIR principles, the Dublin Core metadata standards, and the DataCite metadata schema to produce an analysis of museum metadata records. It is aimed at assessing the functionality, quality, and overall “completeness” of museum metadata.
We constructed this rubric based on our analysis of data we retrieved from the open-access API services of four museums:
CHM - Cooper-Hewitt Museum (Smithsonian) (New York, NY)
CMA - Cleveland Museum of Art (Cleveland, OH)
MET - Metropolitan Museum of Art (New York, NY)
VA - Victoria & Albert Museum (South Kensington, London, UK)
Role-
Co-Principal Investigator alongside another student
Methods-
My co-principal investigator and I began with a wish to research data functionality and use in museums. We started off with the FAIR guidelines (also referred to as “FAIR-ness”) as a baseline for building the rubric. Our end goal was to understand the ease with which metadata for museum objects could be repurposed for specific uses and/or moved between multiple systems.
Due to the massive collections of the institutions we were evaluating, we had to narrow our scope to ceramics from the 19th century to establish a baseline of gatherable data. The focus on a specific type of item also serves as a model for how a patron or researcher may discover and interface with the metadata.
Our research entailed using API queries to retrieve the data and then Python to build further queries and format the returned responses into a more structured form. Our data was returned as .json files.
After using Python and Google Colab to format our API calls, I was tasked with cleaning the data and assessing the overall “completeness” of said data. (I had no experience with Python, so to better understand the programming language I sat in on a Fall 2025 session of “Programming for Cultural Heritage.”)
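The query-and-structure step might look roughly like the sketch below. It is not the code we actually ran: the endpoint URL, the parameter names, the field names, and the sample response are all hypothetical placeholders, since each museum's API has its own URL scheme and vocabulary.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only;
# each museum's real API uses its own URL and query vocabulary.
BASE_URL = "https://api.example.org/objects"


def build_query_url(base_url, **params):
    """Return base_url with params appended as a URL-encoded query string."""
    return f"{base_url}?{urlencode(params)}"


def select_fields(record, fields):
    """Keep only the requested metadata elements; missing ones become None."""
    return {field: record.get(field) for field in fields}


# Build a query for 19th-century ceramics (parameter names are assumptions).
url = build_query_url(BASE_URL, type="ceramic", date_from=1800, date_to=1899)

# A made-up response standing in for what an API call might return.
raw_response = (
    '{"records": ['
    '{"id": "A1", "title": "Vase", "medium": "porcelain", "internal_note": "x"},'
    '{"id": "A2", "title": "Plate"}'
    ']}'
)

records = json.loads(raw_response)["records"]
structured = [select_fields(r, ["id", "title", "medium"]) for r in records]

print(url)
print(json.dumps(structured, indent=2))
```

Keeping missing elements as None, rather than dropping them, makes gaps in a record visible for the later completeness check.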
I used OpenRefine to clean the data. The application uses GREL (General Refine Expression Language) expressions to perform data transformations. I applied value.trim() to strip leading and trailing whitespace from the values in every cell. The .json files were kept as .json tables rather than converted to .tsv or .csv, which would have flattened the data. The tables were then exported as .xlsx files, as I was using Microsoft Excel to further clean and organize the data. I used Excel formulas (specifically COUNTA and COUNTIF) to calculate the percentage of populated rows for a given metadata element. These calculations allowed us to quickly and effectively evaluate levels of metadata completeness in a more holistic manner.
All of this work was essential to our creation of the GDA rubric, which is meant to evaluate metadata completeness and interoperability on a more thorough level.
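For readers more comfortable with code than spreadsheets, the completeness calculation can be approximated in Python. This is only a sketch of the idea behind the COUNTA and COUNTIF formulas, not the workbook itself; the sample records and element names are made up, and treating whitespace-only values as empty is an assumption that mirrors the trimming step.

```python
def is_populated(value):
    """Treat None, empty or whitespace-only strings, and empty lists/dicts as unpopulated."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, dict)):
        return len(value) > 0
    return True


def completeness(records, elements):
    """Return the percentage of records with a populated value for each element."""
    total = len(records)
    report = {}
    for element in elements:
        filled = sum(1 for record in records if is_populated(record.get(element)))
        report[element] = round(100 * filled / total, 1) if total else 0.0
    return report


# Made-up records for illustration; not data from any of the four museums.
sample = [
    {"title": "Vase", "medium": "porcelain", "maker": " "},
    {"title": "Plate", "medium": "", "maker": "Unknown"},
    {"title": "Jug", "medium": "stoneware"},
    {"title": "Bowl", "medium": None, "maker": "Wedgwood"},
]

report = completeness(sample, ["title", "medium", "maker"])
for element, percent in report.items():
    print(f"{element}: {percent}% populated")
```

Note that a placeholder such as "Unknown" counts as populated here; whether it should is a judgment call the rubric would need to make explicit.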
Learning Outcomes and Rationale-
The research for this project aligns best with the technology learning outcome. The methodology entailed intense problem solving with finicky and hard-to-understand museum APIs, coupled with OpenRefine as a data cleaning tool and Microsoft Excel formulas as a means of connecting the data to the objective of our research. The combination of technologies used shows capability in troubleshooting and technical proficiency with data processing tools.
To see an example of data after it has been cleaned in OpenRefine and then evaluated for completeness with Microsoft Excel formulas, look here: CMA
Below I have included the following components of this project: Data Management Plan (DMP), Final Essay, the completed GDA rubric and the poster.