About

2018-2019 MoMA Archives Linked Open Data (LOD) Fellowship Report by Sarah Ann Adams

EXECUTIVE SUMMARY

For the 2018-2019 academic year, Pratt Institute School of Information offered the first Linked Open Data (LOD) Fellowship at the Museum of Modern Art (MoMA) Archives, the purpose of which was to investigate the application of Linked Open Data within a museum setting. The fellowship was supervised by Jonathan Lill, Leon Levy Foundation Project Manager, The Museum of Modern Art Archives, in consultation with Professors M. Cristina Pattuelli and Matt Miller, co-directors of the Semantic Lab at Pratt. This fellowship entailed helping to build a multi-institution integrated exhibition history index by finding, refining, and standardizing exhibition history data. The fellowship also included exploring the index's potential expression as linked open data through recommendations for appropriate semantic data schemas. The subject of this final project report is the second of these two tasks, the deliverable of which is a preliminary model for the exhibition index data. This report has been completed for the INFO 698-01: Practicum/Seminar course held in the spring of 2019 and led by Professor M. Cristina Pattuelli.

The first semester of the LOD fellowship was spent working with large amounts of exhibition history data from various art institutions, standardizing the heterogeneous information so that it could all live in the same relational database as an exhibition history index. This first part of the fellowship provided the space to understand what type of information was being targeted for the project, and what level of nuance and granularity was desired. Working closely with the data at the outset of the fellowship provided the working knowledge to begin thinking about how this tabular data could be transformed into linked data.

This project - further detailed in the project phases component of this report - started with a survey of the current properties used to describe art exhibitions in Wikidata, and then led to researching the CIDOC Conceptual Reference Model (CIDOC CRM) and the Linked Art extension, in order to determine how (or if) either of these two resources should inform the preliminary data model for the exhibition history index. Discussions with Jonathan Lill throughout the fellowship also informed the creation of the model, particularly the need for an exhibition concept as a class distinct from an exhibition instance, and the need for a way to relate an artist directly to an exhibition, rather than through the third party of an art object. The culminating deliverable of the project was the preliminary model for the potential transformation of the tabular exhibition history data into linked open data, which can be found on The Preliminary Model page of this report. Reflections on the project, references, and resources are also included in the final pages of this report.

INTRODUCTION

Aims and objectives of project: This practicum project report is the result of inquiry and exploration of the conceptual modeling of art exhibition data using semantic web standards. This project surveys current trends in the modeling of art exhibition data, identifies gaps in existing models, and proposes a preliminary option of how to semantically model the MoMA Archives' multi-institutional exhibition history index for eventual public consumption.

Problems addressed by project: As the Leon Levy Foundation Project Manager of the Museum of Modern Art Archives, Jonathan Lill has been gathering art exhibition data from more than 200 institutions to create a multi-institutional index of art exhibition histories, with the goal of creating a linked open data (LOD) set of this information. While this data set is being compiled, the information is being stored in Microsoft Access, a relational database. Before the data can be transformed into linked data, a linked data model must be created. That model must be informed by current linked data trends and capabilities in the cultural heritage domain, but it must also suit the MoMA Archives department's vision for how the historic exhibition data can be accessed, used, and reused by interested parties.

Challenges of the project: While it is important to have a strong understanding of existing practices and frameworks, it is neither possible nor practical to review every iteration or variant of the systems and ontologies capable of semantically modeling art exhibition data. Therefore, a primary challenge of this data modeling project has been that of scale: deciding which models and ontologies to look to for guidance on current trends in the cultural heritage domain. Additionally, while the MoMA Archives department is considering modeling exhibition history data in its own instance of Wikibase - the software that powers Wikidata (Miller, 2018) - the exact technological trajectory of the exhibition history index project is not yet known. This presents a second challenge: developing a preliminary data model without knowing the full scope of what the end technological landscape might be. It is for this reason that the model presented in this report is conceptual in nature, presented with the hope that it will be further improved and refined by future MoMA LOD fellows and adjusted when deemed necessary during the implementation process.

Solution produced and success of the project to date: As this is a work in progress that will be continued by subsequent fellows, there are not “solutions” produced but rather advancements in research, especially as they pertain to the cultural heritage domain and the exhibition history index data modeling needs. This project is successful in that it has taken the first step of many that will be needed for the transformation of the index into linked open data. Subsequent fellows will be able to build on the research and modeling work that has been completed, adjust the model when needed - either due to changes in conceptual needs or technical infrastructure - and can then start determining the workflow for the creation of linked open data.

BACKGROUND

Background of problem to be solved: Both independent research and conversations with Jonathan Lill have revealed that the current inclination for the organization of art exhibition data in the cultural heritage domain has been to focus on the art objects as a link between an artist and an exhibition. The artist is related to the object, and the object is related to the exhibition; there is rarely direct linking between the artist and the exhibition. During the processing of MoMA PS1’s archive, Jonathan Lill worked with records from alternative art spaces, many of which did not (or still do not) own objects like museums do; sometimes the only "collections" these art spaces have are the art exhibition records themselves, records of which artists' work was included in a given show. This led Jonathan Lill to begin thinking about collecting exhibition records as data, indexing the artists that were a part of each show, independent of the art object itself (Lill, 2018).

Existing solutions to the problem: The existing solutions to the problem are only solutions in part. The internal relational database developed by Jonathan Lill relates the artist directly to an exhibition, but this database is not public, nor is the information in a linked data format. There are existing ontologies for modeling art and art exhibition data, but these models do not link the artist directly to the exhibition, nor have any of these models been instantiated with the data MoMA has compiled. The fully realized solution would be a publicly available linked open data set that links an artist directly to an exhibition, without the intermediary of an art object.
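The difference between the object-mediated pattern and the direct artist-to-exhibition link can be illustrated with a minimal sketch using plain subject-predicate-object tuples. All identifiers and property names here (ex:created, ex:shownIn, ex:participatedIn) are hypothetical placeholders chosen for illustration; they are not drawn from CIDOC CRM, Linked Art, Wikidata, or the preliminary model itself.

```python
# Illustrative sketch only: every identifier and property name below is a
# hypothetical placeholder, not a term from any published ontology.

# Object-mediated pattern: the artist reaches the exhibition through an object.
object_mediated = {
    ("ex:artist1", "ex:created", "ex:object1"),
    ("ex:object1", "ex:shownIn", "ex:exhibition1"),
}

# Direct pattern: the artist is linked to the exhibition with no object record.
direct = {
    ("ex:artist2", "ex:participatedIn", "ex:exhibition2"),
}


def exhibitions_for(artist, triples):
    """Return exhibitions reachable from an artist, directly or via an object."""
    # Exhibitions linked directly to the artist.
    found = {o for s, p, o in triples
             if s == artist and p == "ex:participatedIn"}
    # Exhibitions reached through objects the artist created.
    objects = {o for s, p, o in triples
               if s == artist and p == "ex:created"}
    found |= {o for s, p, o in triples
              if s in objects and p == "ex:shownIn"}
    return found


print(exhibitions_for("ex:artist1", object_mediated | direct))
print(exhibitions_for("ex:artist2", object_mediated | direct))
```

In this sketch, the second artist's participation can be recorded only because a direct property exists; an institution holding exhibition records but no object records would have no way to express it in the object-mediated pattern alone.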

Project requirements: There were no predetermined requirements of the project at the outset of the fellowship. The idea for the project came about after working with the existing art exhibition data and through numerous conversations with Jonathan Lill about the nature of art exhibition record-keeping. As the fellowship progressed, two needs arose: an understanding of current linked data trends and/or capabilities related to art exhibition data modeling, and an exploration of a linked data model to which the current art exhibition information in the Microsoft Access database could be mapped. These two lines of inquiry composed the project "requirements".
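As a rough illustration of what mapping relational rows to a linked data model might involve, the sketch below turns rows of a join table into triples and serializes them in N-Triples syntax. The column names (exhibition_id, artist_id), the example.org namespace, and the participatedIn property are all assumptions made for this example; the actual Access schema and the properties of the preliminary model are not reproduced here.

```python
# Illustrative sketch only: column names, namespace, and property are
# assumptions, not the actual schema or vocabulary of the exhibition index.

BASE = "https://example.org/"  # placeholder namespace

# Hypothetical rows from an artist-exhibition join table.
rows = [
    {"exhibition_id": 1, "artist_id": 10},
    {"exhibition_id": 1, "artist_id": 11},
]


def row_to_triple(row):
    """Map one tabular row to a single artist-to-exhibition triple."""
    artist = f"{BASE}artist/{row['artist_id']}"
    exhibition = f"{BASE}exhibition/{row['exhibition_id']}"
    return (artist, f"{BASE}prop/participatedIn", exhibition)


def to_ntriples(triples):
    """Serialize (subject, predicate, object) IRI tuples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = [row_to_triple(r) for r in rows]
print(to_ntriples(triples))
```

A production workflow would more likely rely on an established RDF library or mapping tool; the point here is only that each row identifying an artist and an exhibition can yield one direct statement between the two.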