03. Contextualization of Content

After the configuration of crawl settings, archivists supply each website with descriptive metadata to help contextualize the preserved content and facilitate access. These metadata fields are based on the Dublin Core Metadata Element Set, and the Bentley Historical Library has established the following conventions to govern their usage.

Although Archive-It provides its own template for authoring descriptive metadata, the Bentley uses a locally developed ArchivesSpace plugin to create and update descriptive metadata for archived websites. Managing this metadata in ArchivesSpace, rather than in the Archive-It interface, provides several affordances: the ability to manage metadata for websites that exist at multiple seed URLs, the ability to associate existing subjects and agents with seeds, and the ability to export custom MARC XML for archived websites.

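The data model that the Bentley uses to manage metadata for Archive-It seeds in ArchivesSpace is as follows: each Archive-It collection corresponds to an ArchivesSpace resource, each website within that collection corresponds to a site-level archival object, and each seed URL for that website corresponds to a seed-level archival object beneath the site-level archival object.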

The screenshot below shows the data model within the context of the resource for the Michigan Historical Collections Web Archives, with multiple site-level archival objects for each distinct intellectual entity, which themselves have one or more seed-level archival objects for each distinct URL at which the website has existed.

Adding New Seeds to ArchivesSpace

To create metadata for a new seed, first navigate to the seed in the Archive-It administrative interface and copy the URL, which should be of the form https://partner.archive-it.org/934/collections/[collection_id]/seeds/[seed_id].
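Before pasting a URL into the import form, it can help to confirm that it matches the expected pattern. The following is a minimal sketch, not part of the Bentley's plugin: the function name `parse_seed_url`, the `SeedRef` type, and the assumption that collection and seed identifiers are purely numeric are all illustrative, and the identifiers in the usage example are made up.

```python
import re
from typing import NamedTuple


class SeedRef(NamedTuple):
    """Identifiers extracted from an Archive-It seed URL (hypothetical helper type)."""
    account_id: int
    collection_id: int
    seed_id: int


# Pattern taken from the URL form given above; numeric IDs are an assumption.
SEED_URL_PATTERN = re.compile(
    r"^https://partner\.archive-it\.org/(\d+)/collections/(\d+)/seeds/(\d+)/?$"
)


def parse_seed_url(url: str) -> SeedRef:
    """Return the account, collection, and seed IDs, or raise ValueError."""
    match = SEED_URL_PATTERN.match(url.strip())
    if match is None:
        raise ValueError(f"Not an Archive-It seed URL: {url!r}")
    account_id, collection_id, seed_id = (int(part) for part in match.groups())
    return SeedRef(account_id, collection_id, seed_id)


if __name__ == "__main__":
    # The collection and seed IDs below are invented for illustration.
    example = "https://partner.archive-it.org/934/collections/1234/seeds/567890"
    print(parse_seed_url(example))
```

A URL copied from a collection page or a report page instead of the seed page would raise `ValueError` here, which is the kind of mistake this check is meant to catch.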

In ArchivesSpace, click the gear menu and navigate to Plug-ins > Archive-It Import.

Next, paste the Archive-It seed URL into the text box and click the Import button.

This will create a seed-level archival object in ArchivesSpace associated with the resource for the seed's Archive-It collection. The seed-level archival object will have a title of the seed URL, external documents that link to the seed in the Archive-It administrative interface and the seed's URL on the live web, and an ArchivesSpace digital object for the seed in the Internet Archive's Wayback Machine.
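The shape of the imported record can be pictured with a small data structure. This is only a sketch of the description above: the class and field names are hypothetical and do not reproduce the ArchivesSpace JSON schema or the plugin's actual output, and the URLs in the usage example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExternalDocument:
    """A titled link attached to an archival object (illustrative only)."""
    title: str
    location: str


@dataclass
class DigitalObject:
    """A pointer to the archived captures of the seed (illustrative only)."""
    title: str
    file_uri: str


@dataclass
class SeedArchivalObject:
    """Seed-level record: titled with the seed URL, with two links and one digital object."""
    title: str
    external_documents: List[ExternalDocument] = field(default_factory=list)
    digital_object: Optional[DigitalObject] = None


def describe_seed(seed_url: str, admin_url: str, wayback_url: str) -> SeedArchivalObject:
    """Assemble the pieces the import is described as creating."""
    return SeedArchivalObject(
        title=seed_url,
        external_documents=[
            ExternalDocument("Archive-It administrative interface", admin_url),
            ExternalDocument("Live website", seed_url),
        ],
        digital_object=DigitalObject(title=seed_url, file_uri=wayback_url),
    )


if __name__ == "__main__":
    # All three URLs are placeholders, not real records.
    record = describe_seed(
        seed_url="https://example.org/",
        admin_url="https://partner.archive-it.org/934/collections/1234/seeds/567890",
        wayback_url="https://wayback.example.org/placeholder",
    )
    print(record.title, len(record.external_documents))
```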

Next, make the seed-level archival object a child of the series-level archival object for its website. If a series-level archival object already exists for the website, drag and drop the seed-level archival object onto it; otherwise, create a new series-level archival object first. In either scenario, verify that the series-level archival object has the correct and up-to-date metadata as detailed below.

Site-level Descriptive Metadata

The following details the metadata elements to associate with an ArchivesSpace site-level archival object. For instructions on how to create metadata in ArchivesSpace, including how to associate subjects and agents, refer to the Bentley's ArchivesSpace documentation, in particular the sections on ArchivesSpace Subjects and Agents, ArchivesSpace Resource Records, and ArchivesSpace Archival Objects.

Title: The Bentley Historical Library standardizes the names of preserved sites by using the title found at the top of the target web page or, in the absence of a formal or adequate title, the name of the creator (i.e., the individual or organization responsible for the intellectual content of the site). The library follows the best practices for collection titles established by Describing Archives: A Content Standard (DACS). To ensure that the nature of the collections is clear, archivists supply “Web Archives” or “Archived Blog” at the end of the title. University sites furthermore include “University of Michigan” in their titles to highlight the provenance of websites. Complete names for sites in the University of Michigan Web Archives thus follow the pattern “Board of Regents (University of Michigan) Web Archives.” Add a title using the site-level archival object's "Title" field.
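The title pattern can be expressed as a short helper. This is a sketch of the convention only: `build_site_title` and its parameters are hypothetical names, and the function does not decide what the base name should be, which remains the archivist's call.

```python
ALLOWED_SUFFIXES = ("Web Archives", "Archived Blog")


def build_site_title(name: str, suffix: str = "Web Archives", university: bool = False) -> str:
    """Compose a site-level title from a base name, provenance flag, and suffix."""
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Suffix must be one of {ALLOWED_SUFFIXES}, got {suffix!r}")
    base = name.strip()
    if not base:
        raise ValueError("A site title needs a non-empty name")
    if university:
        # University sites carry the institution name to show provenance.
        base = f"{base} (University of Michigan)"
    return f"{base} {suffix}"


if __name__ == "__main__":
    # Reproduces the example pattern given in the Title convention.
    print(build_site_title("Board of Regents", university=True))
    # → Board of Regents (University of Michigan) Web Archives
```

Names that already contain the university's name are not handled specially here; an archivist would need to decide whether the parenthetical is still appropriate in those cases.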

Creator: The creator denotes the individual or organization that generated or supplied the website’s intellectual content (and not merely the web designer who created the page). Add the site's creator by associating an ArchivesSpace agent record with the site-level archival object with a "Role" of "Creator." If an existing agent record for the creator cannot be found, follow the instructions in the Bentley's ArchivesSpace Agents documentation to create a new agent record.

Subjects: Subjects provide relevant terms that denote the nature of content in the web archives and facilitate patron searches. These may include personal names, topical subjects, or geographic areas. Use Library of Congress subject authorities (http://authorities.loc.gov/) that correspond to MARC21 6XX fields. If the library holds an archival collection for the creator, an existing catalog record may provide useful subject terms. For University of Michigan websites, it may also be helpful to use subjects from the list of basic terms. High-priority university sites (which include the Board of Regents, President, Provost, and 19 schools and colleges) should receive additional subject terms to improve the visibility of content. Add subjects in either of two ways: link ArchivesSpace agents with a role of "Subject" in the site-level archival object's list of Agent Links, or link ArchivesSpace subjects in the site-level archival object's list of Subjects.

Descriptive Note: The Bentley adds a descriptive note in order to contextualize preserved websites with an overview of the creator and/or subject matter. This step can be simplified by using relevant text from an existing finding aid or the “About”/“More Information” section of a website, if available. Staff should also give some indication of the nature and scope of the content found in the resource (e.g., newsletters, events, or curricula). Add a descriptive note by associating a Note of type Abstract with the site-level archival object.

Publisher: The publisher is the entity ultimately responsible for the production and presentation of content. For University of Michigan websites, the Regents of the University of Michigan are recognized as the collective publisher for all affiliated sites in the “edu” domain. Outside the university, similar situations may arise in which a group or organization is formally identified as being responsible for presenting information or holding copyright. If a publisher is identified, add it to the site-level archival object by associating an agent record for the publisher with a role of "Subject" and a relator of "Publisher."

Coverage: Coverage identifies the place of publication. Enter the geographic location in the format corresponding to MARC field 260. Examples: ‘Ann Arbor (Mich.)’ or ‘Detroit (Mich.).’ For all University of Michigan websites, enter ‘Ann Arbor (Mich.).’ If the place of publication is known, associate the ArchivesSpace subject record for the geographic entity with the site-level archival object.