Recommended Practices for Coordinating a Collection Analysis
Library staff in the early stages of exploring a shared print program, whether they are starting a new program, joining an existing program, or performing a program review, will conduct a collection analysis. This process helps them determine whether the analyzed collections can both benefit from and contribute to a shared print program, and how a library’s collection will fit into an existing program.
A shared print program collection analysis of a potential member library will look at overlap and divergence across multiple member library collections. It may also highlight scarce titles in need of protection. The shared print program will then use the analysis to determine how to distribute retention commitments across member libraries.
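As a rough illustration of the overlap and scarcity checks described above, the following minimal sketch inverts per-library holdings into a title-to-holders map. The library codes, OCLC numbers, and the `scarcity_threshold` parameter are all hypothetical; a real analysis would work from exported records and a program-defined scarcity rule.

```python
from collections import defaultdict

# Hypothetical holdings: library code -> set of OCLC numbers (illustrative values only).
holdings = {
    "LIB_A": {"1001", "1002", "1003"},
    "LIB_B": {"1002", "1003", "1004"},
    "LIB_C": {"1003", "1005"},
}

def holders_by_title(holdings):
    """Invert library -> titles into title -> set of holding libraries."""
    holders = defaultdict(set)
    for library, titles in holdings.items():
        for oclc in titles:
            holders[oclc].add(library)
    return dict(holders)

def summarize(holders, scarcity_threshold=1):
    """Split titles into scarce (held by at most `scarcity_threshold` libraries) and overlapping."""
    scarce = sorted(t for t, libs in holders.items() if len(libs) <= scarcity_threshold)
    overlapping = sorted(t for t, libs in holders.items() if len(libs) > scarcity_threshold)
    return scarce, overlapping

holders = holders_by_title(holdings)
scarce, overlapping = summarize(holders)
print("Scarce titles:", scarce)
print("Overlapping titles:", overlapping)
```

Matching on a single identifier is the simplest case; it will miss editions or title changes cataloged under different numbers, which is one reason the level of analysis needs to be decided up front.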
Collection analysis is generally the initial step in beginning or joining a shared print program, and there are a variety of approaches that libraries and programs can take. Analysis projects require significant time and effort, with several areas that merit consideration throughout the planning and implementation process.
Program/Library Considerations
Before commencing a shared print collection analysis, the shared print program should:
Define Project Scope
What are the goals of the analysis? e.g., identify overlaps, gaps, or scarcity; subject (by LC Call Number or Subject Heading) strengths; publication dates; geographic area/local subject interests; geographic area of the holding library.
What are the final outputs desired? e.g., description report, spreadsheets, record sets, or platform for tailored analysis.
What is the scope of the records to be analyzed? (See the Shared Print Toolkit discussion of scope.) Determine whether the set of records to be analyzed will be limited to this scope at the outset, or whether the collection analysis tool needs to be able to filter out out-of-scope materials.
What level of analysis is needed? For serials and multi-part works, will title-level overlap analysis be sufficient, or will you need volume-level information? For monographs, will you need to verify or analyze edition, printing, or version status? Note that most tools will only conduct automated analysis at the ISBN, OCLC number, or title level. Will you have resources to analyze further manually?
What other data should be included? e.g., circulation stats (highly circulated titles may need more copies retained); HathiTrust or other digital collections; holdings at peer libraries; OCLC data (uniqueness/scarcity); age/publication date; LC Call Numbers and/or headings (subject strengths); cost of items; existing retention commitments either in the target program or in other shared print programs of interest.
Identify Project Requirements
Where are the records to be analyzed located, where will analysis reports be stored, and (if applicable) what process will be used to export records to another system?
Who are the personnel responsible for gathering, preparing, processing, and analyzing data?
What systems match needs and budgets?
Consider Project Obstacles
What are the barriers to participation? Consider the cost of the collection analysis tool, the staff time required to participate, and the difficulty of using the chosen system (e.g., does it require special training, record export and cleanup, or dedicated staff?). Can barriers be overcome or lowered?
What commitment to ongoing staff time will be necessary, given that collection analysis for shared print is not a “one and done” activity? Collections are dynamic, and retention commitments need review and care (e.g., materials that become lost or damaged or out-of-scope, ILL considerations and/or requirements of the shared print program, and onboarding new staff).
What is the data quality of the records to be analyzed? Are there any issues with those records (e.g., brief records, records from legacy systems such as RLIN, records that may have been created or changed through system migrations)? Are there data points that will be inconsistent across participants? For example, if circulation data is included, was in-house circulation added in some instances? How many years of circulation data are available, and does it matter to your analysis? How are ‘bound with’ items and analytics handled? How are serial title changes handled: as successive entries or as title families? How are multiple volumes enumerated?
Are there any current weeding projects underway that should be completed before entering a collection analysis, or groups of records under consideration for weeding that should be excluded?
Selecting a Collection Analysis Tool
When selecting a collection analysis tool, the shared print program should:
Identify Record Quality
Do records meet specified data requirements for the system under consideration? What are the potential issues with the records being analyzed (e.g., lacking OCLC number for systems that use that as a match point, brief records, bound-with materials, analyzed series)?
How does the collection analysis tool handle these issues?
Will any data cleanup be required before input? If so, does this work align with your resources and staffing?
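To make the record quality questions above concrete, here is a minimal sketch that flags records missing fields a tool might need as match points. The record layout and the `REQUIRED_FIELDS` list are assumptions for illustration, not the schema of any particular ILS or analysis tool.

```python
# Hypothetical exported records; field names are assumptions, not a real ILS schema.
records = [
    {"id": "r1", "oclc": "1001", "title": "Flora of the region", "pub_year": "1954"},
    {"id": "r2", "oclc": "", "title": "Annual report", "pub_year": "1988"},
    {"id": "r3", "oclc": "1003", "title": "", "pub_year": ""},
]

# Fields the (hypothetical) analysis tool requires in every record.
REQUIRED_FIELDS = ("oclc", "title", "pub_year")

def quality_issues(record):
    """Return the required fields that are missing or blank in one record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]

def quality_report(records):
    """Map record id -> list of missing fields, for records with any issue."""
    report = {}
    for record in records:
        issues = quality_issues(record)
        if issues:
            report[record["id"]] = issues
    return report

report = quality_report(records)
for record_id, issues in report.items():
    print(record_id, "missing:", ", ".join(issues))
```

Counting flagged records per field gives a quick estimate of how much cleanup would be needed before input.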
Assess Results Format Utility
Will the tool provide actionable information and reports?
For external tools that are not part of a local ILS/LSP, will the tool give the information needed to reflect the results in local systems?
Assess Analysis Tool Capability
Can the assessment tool match title families across multiple OCLC or ISSN numbers?
How does the tool accommodate variances in bound issues (e.g., multiple volumes bound together, or single volumes bound in multiple bindings)?
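One way to think about title-family matching is as grouping identifiers that are linked by title changes. The sketch below assumes the program already has pairs of related identifiers (for example, drawn from preceding and succeeding entry links in the bibliographic records) and groups them with a simple union-find. The function name and sample identifiers are hypothetical, and this is not a description of how any specific tool works.

```python
def build_families(links):
    """Group identifiers into families using union-find over (earlier, later) links."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for ident in parent:
        families.setdefault(find(ident), set()).add(ident)
    return list(families.values())

# Hypothetical links between successive titles of two serials.
links = [("2001", "2002"), ("2002", "2003"), ("3001", "3002")]
families = build_families(links)
for family in families:
    print(sorted(family))
```

Once families are built, holdings can be compared at the family level so that a run split across a title change is not treated as two unrelated titles.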
Determine Retention Allocation Functionality
Can the tool allocate retention copies equitably across the members? Can it assign them based on each library’s preferences (e.g., subject strengths)?
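The two allocation questions above can be illustrated with a small greedy sketch: honor a stated preference when the preferred library holds the title, and otherwise give the commitment to the holder with the fewest commitments so far. The function, the single-copy-per-title rule, and the sample data are assumptions for illustration; actual programs may retain multiple copies and apply different equity rules.

```python
def allocate_retentions(holders, strengths=None):
    """Assign one retention copy per title.

    holders: title -> set of libraries holding it.
    strengths: optional title -> preferred library (e.g., a subject strength).
    A preference is honored only when the preferred library holds the title;
    otherwise the holder with the fewest commitments so far is chosen,
    with ties broken alphabetically by library code.
    """
    strengths = strengths or {}
    load = {lib: 0 for libs in holders.values() for lib in libs}
    assignments = {}
    # Handle titles with the fewest holders first, since they offer the least flexibility.
    for title in sorted(holders, key=lambda t: (len(holders[t]), t)):
        libs = holders[title]
        preferred = strengths.get(title)
        if preferred in libs:
            chosen = preferred
        else:
            chosen = min(libs, key=lambda lib: (load[lib], lib))
        assignments[title] = chosen
        load[chosen] += 1
    return assignments, load

# Hypothetical title -> holders map.
holders = {
    "1001": {"LIB_A"},
    "1002": {"LIB_A", "LIB_B"},
    "1003": {"LIB_A", "LIB_B", "LIB_C"},
    "1004": {"LIB_B"},
    "1005": {"LIB_C"},
}
assignments, load = allocate_retentions(holders)
print(assignments)
print(load)
```

Passing `strengths={"1003": "LIB_C"}` would move that title's commitment to LIB_C, which shows how preferences and equity can pull in different directions.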
Using a Collection Analysis Tool
When using a collection analysis tool, the shared print program should:
Verify Accuracy of Analysis
How will the program determine if the collection analysis tool is producing results that meet project goals? The program should plan to have a person with expertise review tool-generated analyses to ensure results are accurately flagged and distributed, and to further hone the analysis if necessary to produce desired outputs. Particular attention should be paid to the matching algorithm of the tool and whether it meets a program’s or library’s needs, as well as the scope of the project (e.g., do subsequent editions or publications of a title count as overlap or distinct entities?). Programs should check for non-print formats or other materials that may have been included by mistake.
Do the results align with the goals stated in the program’s MOU?
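Part of the verification step described above can be supported with simple checks on the tool's output. The following sketch, under assumed field names and format labels, flags results that look like non-print materials and draws a reproducible random sample for expert review. The `NON_PRINT_FORMATS` set and the result layout are hypothetical.

```python
import random

# Assumed format labels; adjust to whatever the tool or local system actually exports.
NON_PRINT_FORMATS = {"ebook", "microform", "online"}

def flag_non_print(results):
    """Return results whose format label suggests they are not print materials."""
    return [r for r in results if r.get("format", "").lower() in NON_PRINT_FORMATS]

def review_sample(results, size, seed=0):
    """Draw a reproducible random sample of results for manual review."""
    rng = random.Random(seed)
    return rng.sample(results, min(size, len(results)))

# Hypothetical tool output.
results = [
    {"oclc": "1001", "format": "print", "retain": True},
    {"oclc": "1002", "format": "Microform", "retain": True},
    {"oclc": "1003", "format": "print", "retain": False},
    {"oclc": "1004", "format": "ebook", "retain": False},
]

suspect = flag_non_print(results)
sample = review_sample(results, size=2)
print("Possible non-print records:", [r["oclc"] for r in suspect])
print("Records sampled for review:", [r["oclc"] for r in sample])
```

A fixed seed lets a second reviewer reproduce the same sample; automated checks like these supplement, rather than replace, review by a person with expertise.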
It may also be worth reviewing the Common Pitfalls of shared print programs when embarking on a collection analysis.
Last Updated September 2025