Web Archiving

The Bentley Historical Library’s Curation Division has developed a methodology and workflow for the acquisition of web content. These procedures are based on the available features of the Archive-It web archiving service as well as standard archival practices (such as appraisal and description). This document provides an overview of the library’s methodology for website preservation. To determine whether we have collected a website, please see "Finding an Archived Website."

The actual process of website preservation may be broken down into five main steps: 

1.     Identification of the crawl target

2.     Configuration of the crawler settings

3.     Contextualization of content

4.     Initiation of a test crawl

5.     Quality assurance of completed crawls

Guided by collecting priorities, surveys of relevant websites, and knowledge of significant individuals and organizations, archivists identify potential targets for preservation. By standardizing the configuration of web crawler settings and the addition of metadata and descriptions, archivists ensure that websites are preserved in a manner that is consistent, efficient, and cost-effective.
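
As an illustration of what standardized settings and metadata could look like in practice, the following is a minimal sketch in Python. It does not represent the Archive-It interface or any Archive-It API. The field names, the default values, and the `build_seed` helper are all hypothetical and chosen only to show the idea of applying one shared template to every crawl target.

```python
from dataclasses import dataclass, field

# Hypothetical defaults for illustration only; the values actually used
# by the library are not stated in this document.
DEFAULT_SETTINGS = {
    "frequency": "annual",
    "max_duration_days": 3,
    "obey_robots_txt": True,
}


@dataclass
class SeedRecord:
    """One crawl target with its descriptive metadata and crawler settings."""
    url: str
    title: str
    creator: str
    description: str
    # Each record gets its own copy of the shared defaults.
    settings: dict = field(default_factory=lambda: dict(DEFAULT_SETTINGS))


def build_seed(url, title, creator, description, **overrides):
    """Create a seed record from the shared template.

    Only settings that exist in DEFAULT_SETTINGS may be overridden, so
    every record carries the same set of keys.
    """
    unknown = set(overrides) - set(DEFAULT_SETTINGS)
    if unknown:
        raise ValueError(f"Unknown crawler settings: {sorted(unknown)}")
    record = SeedRecord(
        url=url.strip(),
        title=title.strip(),
        creator=creator.strip(),
        description=description.strip(),
    )
    record.settings.update(overrides)
    return record


# Example with placeholder values (not a real crawl target).
seed = build_seed(
    "https://example.org/",
    "Example Organization website",
    "Example Organization",
    "Placeholder description of the site.",
    frequency="quarterly",
)
```

Keeping the defaults in one place means a deviation from the standard settings has to be stated explicitly for each target, which is one way to keep records consistent across archivists.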

Given the fast pace of change in web archiving technology and ongoing development of features and functionalities in Archive-It, this methodology document will be periodically reviewed and revised accordingly. Archivists should also consult the Archive-It Help Center, which includes the Archive-It User Guide and a set of FAQs, for comprehensive documentation of the Archive-It web archiving service.