5. Loading CONTENTdm links into Watsonline

Required: CONTENTdm server login; Excel; Notepad; OpenRefine; MarcEdit; Sierra Data Exchange, Create Lists, and Global Update.

Log in to CONTENTdm server

  • Go to Collections tab, select your collection, and go to Export.

  • Check the "Return field names in first record" box

  • Export a tab-delimited file and save to a convenient location.

Import the export file into Excel

  • Tab-delimited

  • File origin: Unicode (UTF-8)

  • For MMA Pubs collection only:

    • Delete the contents of the transcription column (leave or re-enter the column header). Otherwise the file is too large for OpenRefine to handle.

  • Save as Unicode text

Open in Notepad

  • Save as a text file with UTF-8 encoding
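
The Excel and Notepad steps above amount to re-encoding the file. A minimal sketch of the same conversion, assuming Excel's "Unicode text" output is UTF-16 with a byte-order mark; the function name and file names are hypothetical and not part of the documented workflow:

```python
def utf16_to_utf8(src_path, dst_path):
    """Re-encode a UTF-16 text file (with BOM) as UTF-8.

    Assumption: the source is what Excel writes for "Unicode text".
    newline="" keeps the original line endings untouched.
    """
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Example call (hypothetical file names):
# utf16_to_utf8("export_unicode.txt", "export_utf8.txt")
```

This only changes the encoding; tabs, headers, and cell contents pass through as they are.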

OpenRefine

  • Create a project from the UTF-8 text file you just saved

  • Encoding: UTF-8

  • Undo/Redo > Apply, then paste in the JSON code at the bottom of this document. This will:

    • Remove all columns except:

      • Title

      • Digital Collection

      • Watsonline record number

      • Date created

      • Reference URL

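    • Replace b with .b in Watsonline record number (a period is added to the front of each value)

    • Replace cdm16028.contentdm.oclc.org:80 with libmma.contentdm.oclc.org in Reference URL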

  • Facet by blank on Watsonline record number column

    • Select false

  • Export to Excel
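
For reference, the effect of the OpenRefine steps above can be sketched in a few lines of Python. This is an illustration only, not a replacement for the JSON operations. The column names and host names come from the JSON code at the bottom; the function name is hypothetical, and the sketch assumes the export parses cleanly as tab-delimited text with default quoting.

```python
import csv
import io

# Columns the OpenRefine operations leave in place.
KEEP = ["Title", "Digital Collection", "Watsonline record number",
        "Date created", "Reference URL"]
OLD_HOST = "cdm16028.contentdm.oclc.org:80"
NEW_HOST = "libmma.contentdm.oclc.org"


def clean_rows(tsv_text):
    """Return the rows that would remain after the OpenRefine steps."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    cleaned = []
    for row in reader:
        number = row.get("Watsonline record number") or ""
        if not number.strip():
            continue  # same effect as the blank facet with "false" selected
        out = {name: (row.get(name) or "") for name in KEEP}
        out["Watsonline record number"] = "." + number  # b... becomes .b...
        out["Reference URL"] = out["Reference URL"].replace(OLD_HOST, NEW_HOST)
        cleaned.append(out)
    return cleaned
```

Rows without a Watsonline record number are dropped because there is no bib record to match them against at load time.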

In Excel

  • Import the data from the export file.

  • Sort by the Date created column. In that column:

    • Refer to the date listed in the "Links current through items created on" column in the "Items count" tab of the Digitization Project Chart document (shared via Google Drive).

    • Remove every row whose Date created is blank, equal to, or earlier than the date listed.

    • Once only the unlinked items remain, remove the Date created column.

    • Update the "Links current through" date in the Digitization Project Chart document with the latest date created remaining in the data.

  • Edit the data:

    • Digital Collection

      • Add to the beginning of the column: CONTENTdm[space]

      • If there are multiple Digital Collections listed (indicated by a semicolon-space), insert a new column, move the second DC there, and precede it with CONTENTdm[space].

    • [insert a column here called] Reproduction

      • Fill with data conforming to these practices and changing the information as appropriate (See MARC 533 field documentation for guidelines):

        • For in-house digitized material, for example:

          • Also available as electronic reproduction.$bNew York, N.Y. :$cThomas J. Watson Library,$d2017.

          • $3Cover :$aElectronic reproduction.$bNew York, N.Y. :$cThomas J. Watson Library,$d2021.

        • For outsourced digitized material, for example:

          • Also available as electronic reproduction.$bLa Crosse, Wisc. :$cNorthern Micrographics ;$cThomas J. Watson Library,$d2017.

          • Also available as electronic reproduction.$bProvo, Utah :$cBackstage Library Works ;$cThomas J. Watson Library,$d2017.

        • For downloaded digital material, use "Electronic reproduction" and enter information about the producer, for example:

          • Electronic reproduction.$bParis :$cTajan S.A.,$d2011.

    • [insert a column here if necessary called] Provider

      • Use only for non-MMA locations. Leave blank if not needed.

      • For all MMA locations, delete the contents of this column.

      • For digitized items from non-MMA libraries and sources ONLY, reformat the Provider column as follows (example):

        • Frick Art Reference Library to: The Frick Collection, Frick Art Reference Library, New York, NY

    • Reference URL

    • WWW item data (add only if needed)

      • Follow template at Q > Linking data files

      • ...or enter field by field as follows:

        • 983$a (note) -- Item added for www

        • 983$b (location) -- www

        • 983$c (itype) -- 15

        • 983$d (suppress) -- n

        • 983$e (status) -- j

        • 983$f (stickystatus) -- j

  • Save as a Unicode text file "[file name].txt" to Q > Linking data files

  • Open the text file and Save As a UTF-8 text file (overwrite existing file).

  • Close
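
The date filtering and Digital Collection edits in the Excel steps above can be sketched as follows. This is a rough illustration under stated assumptions: Date created values are taken to be in YYYY-MM-DD form, rows are represented as dictionaries, and both function names are hypothetical.

```python
from datetime import date


def select_unlinked(rows, current_through):
    """Keep rows created after the 'Links current through' date.

    Returns the kept rows (with the Date created column removed) and the
    latest creation date among them, which becomes the new cutoff.
    Assumption: dates are ISO formatted (YYYY-MM-DD).
    """
    kept = []
    latest = None
    for row in rows:
        raw = (row.get("Date created") or "").strip()
        if not raw:
            continue  # blank dates are removed
        created = date.fromisoformat(raw)
        if created <= current_through:
            continue  # equal to or earlier than the cutoff: already linked
        if latest is None or created > latest:
            latest = created
        kept.append({k: v for k, v in row.items() if k != "Date created"})
    return kept, latest


def split_collections(value):
    """Split on semicolon-space and prefix each name with 'CONTENTdm '."""
    return ["CONTENTdm " + name.strip()
            for name in value.split("; ") if name.strip()]
```

The second return value of `select_unlinked` is the date to record in the Digitization Project Chart once the load is done.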

Open MarcEdit

  • Open the Delimited Text Translator under Add-ins.

  • Select your new text file as the Input File.

  • Under Output File, select the save location as Q > Linking data files and name the file you'll create.

  • Enter " [double quotation mark] in the Text Qualifier box.

  • Check the UTF-8 Encoded box.

  • Next

  • Click Load template and find the appropriate .mrd template in Q > Linking data files

  • …or enter field by field as follows:

    • Select Field 0 (Title)

      • Map to 245$a -- Indicators 00 -- Apply

    • Select Field 1 (Digital collection)

      • Map to 799$a -- Leave indicators as \\ -- Apply

    • Select Field 2 (second Digital collection, if present)

      • Map to 799$a -- Leave indicators as \\ -- Apply

    • Select Field 3 (Reproduction)

      • Map to 533$a -- Leave indicators as \\ -- Apply

    • Select Field 4 (Provider)

      • Map to 535$a -- Indicators 1\ -- Apply

    • Select Field 5 (Watsonline number)

      • Map to 035$a -- Leave indicators as \\ -- Apply

    • Select Field 6 (Reference URL)

      • Map to 856$u -- Indicators 40 -- Apply

      • For Publishers bindings: Map to 856$u -- Indicators 41 -- Apply

  • Finish
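
To make the mapping above easier to check, here is a small sketch of the mnemonic (.mrk) lines one row of the text file should turn into. It is not MarcEdit's own code, and MarcEdit also adds fields such as the leader and 008 that are left out here. The function name is hypothetical, and skipping empty columns is an assumption about how blank cells are handled.

```python
# (tag, indicators, subfield) for columns 0-6, in the order mapped above.
# A backslash stands for a blank indicator in mnemonic format.
FIELD_MAP = [
    ("245", "00", "a"),      # Title
    ("799", "\\\\", "a"),    # Digital collection
    ("799", "\\\\", "a"),    # second Digital collection
    ("533", "\\\\", "a"),    # Reproduction
    ("535", "1\\", "a"),     # Provider
    ("035", "\\\\", "a"),    # Watsonline number
    ("856", "40", "u"),      # Reference URL
]


def row_to_mrk(columns, publishers_bindings=False):
    """Format one row of column values as mnemonic field lines."""
    lines = []
    for value, (tag, indicators, subfield) in zip(columns, FIELD_MAP):
        if not value:
            continue  # assumption: blank columns produce no field
        if tag == "856" and publishers_bindings:
            indicators = "41"
        lines.append(f"={tag}  {indicators}${subfield}{value}")
    return lines
```

Because the Reproduction column already contains $b, $c, and $d, mapping it to 533$a yields a complete 533 field with all of its subfields.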

Open your new .mrk file in MarcEdit

  • Go to Tools > Edit Subfield Data

    • Enter 856 in Field

    • Enter z in Subfield

    • Enter Full text from Watson Library Digital Collections in Replace with

      • For Publishers bindings, enter instead: Photographs of binding from Watson Library Digital Collections

    • Check "New subfield only" box

    • Click Replace Text

  • Go to Add/Delete Field

    • Delete all 008 fields

    • Add 007 field with cr |||   ||a|| [C, R, space, 3 pipes, 3 spaces, 2 pipes, A, 2 pipes] in Field Data

    • Close

  • Find and Replace

    • If you needed the Provider column for a third Digital Collection name, find and replace 535 1\ [535, space, space, 1, backslash] with 799 \\ [799, space, space, backslash, backslash].

  • Collapse multi-volume sets into one record

    • Move the OCLC numbers and 856 fields for additional volumes into the record for the first volume. Delete the remainder of these records.

    • To the 856 $z, add an appropriate parenthetical qualifier for all volumes, for example:

      • Full text from Watson Library Digital Collections (Volume 1)

      • Full text from Watson Library Digital Collections (Volume 2)

  • File > Compile file into MARC

  • Save as .mrc with a new name

  • Close MarcEdit
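
The scripted part of the .mrk edits above (856 $z, 008 removal, 007 addition) can be sketched like this. It is a rough illustration, not what MarcEdit does internally. The function name is hypothetical, the 007 value is written with literal spaces as described in the steps (MarcEdit may show blanks differently in the mnemonic file), and placing the 007 directly after the leader is an assumption.

```python
NOTE = "Full text from Watson Library Digital Collections"
NOTE_BINDINGS = "Photographs of binding from Watson Library Digital Collections"
FIELD_007 = "=007  cr |||   ||a||"


def tidy_record(lines, publishers_bindings=False):
    """Apply the 856 $z, 008, and 007 edits to one record's field lines."""
    note = NOTE_BINDINGS if publishers_bindings else NOTE
    out = []
    for line in lines:
        if line.startswith("=008"):
            continue  # delete all 008 fields
        if line.startswith("=856") and "$z" not in line:
            line = line + "$z" + note  # add the public note as a new subfield
        out.append(line)
    # Assumption: the 007 goes right after the leader when one is present.
    position = 1 if out and out[0].startswith("=LDR") else 0
    out.insert(position, FIELD_007)
    return out
```

Collapsing multi-volume sets still needs a person to decide which records belong together, so it is left out of the sketch.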

Open Sierra Data Exchange

  • Select Load Records via a Locally Created Load Profile

  • Get PC

  • Find your .mrc file

  • Upload as .lfts

  • Prep

  • Load M (Load CONTENTdm fields)

  • Check Use Review Files box

  • Load

  • Verify that the records you intended to load actually loaded, and loaded correctly.

  • Pull the records into a review file: copy the "Load: Overlaid records for..." file into an empty slot in Sierra Create Lists.

  • Delete .errlog, .lfts and .lmarc files if everything went fine.

Sierra Create Lists

  • Create review file of items attached to the bibs of works you just linked.

Sierra Global Update

Bib records:

  • Delete duplicate fields in bib records.

Item records:

  • Convert any IMESSAGE = z ("Digitize later") to - (dash)

  • Change all REQUESTABLE item records to AVAILABLE ONLINE, regardless of location.

    • Exception: For MMA Publications items, change the STATUS of WARC items only.

    • Exception: For items linked to multiple bib records, verify that each item in the volume has been digitized. If this is the case, change the status to AVAILABLE ONLINE. If not, leave as REQUESTABLE.

  • For other statuses, do the following:

    • DEPARTMENT USE ONLY, NOLEN OPEN SHELVES and OPEN SHELVES:

      • Leave status alone.

      • Update Sticky Status to match.

    • MISSING, LOST, REPLACED:

      • Leave status alone.

      • Update Sticky Status to AVAILABLE ONLINE.
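
The item-record rules above reduce to a small decision table. The sketch below is only a way to check the logic, not something to run against Sierra. The dictionary keys, the function name, and both keyword arguments are hypothetical; WARC is treated as a location code, and "update Sticky Status to match" is read as "set it to the item's current status".

```python
ONLINE = "AVAILABLE ONLINE"
KEEP_AND_MATCH = {"DEPARTMENT USE ONLY", "NOLEN OPEN SHELVES", "OPEN SHELVES"}
KEEP_STICKY_ONLINE = {"MISSING", "LOST", "REPLACED"}


def update_item(item, *, mma_publications=False, all_parts_digitized=True):
    """Return a copy of an item record with the Global Update rules applied."""
    updated = dict(item)
    if updated.get("imessage") == "z":  # "Digitize later"
        updated["imessage"] = "-"
    status = updated.get("status")
    if status == "REQUESTABLE":
        # Exception: for MMA Publications, only WARC items change status.
        if mma_publications and updated.get("location") != "WARC":
            return updated
        # Exception: items on multiple bibs change only if fully digitized.
        if not all_parts_digitized:
            return updated
        updated["status"] = ONLINE
    elif status in KEEP_AND_MATCH:
        updated["sticky_status"] = status  # status itself is left alone
    elif status in KEEP_STICKY_ONLINE:
        updated["sticky_status"] = ONLINE  # status itself is left alone
    return updated
```

For example, a LOST item keeps its status but gets a sticky status of AVAILABLE ONLINE.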

Move digitized files

  • Move to uploaded folder in Q > Production > [Project] > uploaded

  • Move page image folders and PDFs, Excel and text metadata files

Final cleanup and administration

  • File management

    • Delete linking files

To finish, go to section 6, Final processing and clean up.

############ JSON code for OpenRefine below #####################

[

{

"op": "core/column-removal",

"description": "Remove column Alternative Title",

"columnName": "Alternative Title"

},

{

"op": "core/column-removal",

"description": "Remove column Creator",

"columnName": "Creator"

},

{

"op": "core/column-removal",

"description": "Remove column Description",

"columnName": "Description"

},

{

"op": "core/column-removal",

"description": "Remove column Related Resource",

"columnName": "Related Resource"

},

{

"op": "core/column-removal",

"description": "Remove column Subject",

"columnName": "Subject"

},

{

"op": "core/column-removal",

"description": "Remove column Publisher",

"columnName": "Publisher"

},

{

"op": "core/column-removal",

"description": "Remove column Contributor",

"columnName": "Contributor"

},

{

"op": "core/column-removal",

"description": "Remove column Date",

"columnName": "Date"

},

{

"op": "core/column-removal",

"description": "Remove column Date (Text)",

"columnName": "Date (Text)"

},

{

"op": "core/column-removal",

"description": "Remove column Dimensions",

"columnName": "Dimensions"

},

{

"op": "core/column-removal",

"description": "Remove column Format (Medium)",

"columnName": "Format (Medium)"

},

{

"op": "core/column-removal",

"description": "Remove column Repository",

"columnName": "Repository"

},

{

"op": "core/column-removal",

"description": "Remove column Provider",

"columnName": "Provider"

},

{

"op": "core/column-removal",

"description": "Remove column Credit Line",

"columnName": "Credit Line"

},

{

"op": "core/column-removal",

"description": "Remove column Location",

"columnName": "Location"

},

{

"op": "core/column-removal",

"description": "Remove column Time Period",

"columnName": "Time Period"

},

{

"op": "core/column-removal",

"description": "Remove column References",

"columnName": "References"

},

{

"op": "core/column-removal",

"description": "Remove column Identifier",

"columnName": "Identifier"

},

{

"op": "core/column-removal",

"description": "Remove column Language",

"columnName": "Language"

},

{

"op": "core/column-removal",

"description": "Remove column Type",

"columnName": "Type"

},

{

"op": "core/column-removal",

"description": "Remove column Copyright Status",

"columnName": "Copyright Status"

},

{

"op": "core/column-removal",

"description": "Remove column Copyright Notice",

"columnName": "Copyright Notice"

},

{

"op": "core/column-removal",

"description": "Remove column Copyright Information",

"columnName": "Copyright Information"

},

{

"op": "core/column-removal",

"description": "Remove column Link to Watsonline record",

"columnName": "Link to Watsonline record"

},

{

"op": "core/column-removal",

"description": "Remove column Transcription",

"columnName": "Transcription"

},

{

"op": "core/column-removal",

"description": "Remove column Local Use",

"columnName": "Local Use"

},

{

"op": "core/column-removal",

"description": "Remove column OCLC number",

"columnName": "OCLC number"

},

{

"op": "core/column-removal",

"description": "Remove column Date modified",

"columnName": "Date modified"

},

{

"op": "core/text-transform",

"description": "Text transform on cells in column Watsonline record number using expression grel:\".\" + value",

"engineConfig": {

"facets": [],

"mode": "row-based"

},

"columnName": "Watsonline record number",

"expression": "grel:\".\" + value",

"onError": "keep-original",

"repeat": false,

"repeatCount": 10

},

{

"op": "core/text-transform",

"description": "Text transform on cells in column Reference URL using expression grel:value.replace('cdm16028.contentdm.oclc.org:80', 'libmma.contentdm.oclc.org')",

"engineConfig": {

"facets": [],

"mode": "row-based"

},

"columnName": "Reference URL",

"expression": "grel:value.replace('cdm16028.contentdm.oclc.org:80', 'libmma.contentdm.oclc.org')",

"onError": "keep-original",

"repeat": false,

"repeatCount": 10

},

{

"op": "core/column-removal",

"description": "Remove column CONTENTdm number",

"columnName": "CONTENTdm number"

},

{

"op": "core/column-removal",

"description": "Remove column CONTENTdm file name",

"columnName": "CONTENTdm file name"

},

{

"op": "core/column-removal",

"description": "Remove column CONTENTdm file path",

"columnName": "CONTENTdm file path"

}

]