Export data from the review file:
1. Apply the “CONTENTdm export template” saved export.
2. This template should use these parameters:
% as the field delimiter
<none> as the text qualifier
; [semicolon] as the repeated field delimiter
<none> as the maximum field length
3. Save the .txt file to your desktop.
Data manipulation in OpenRefine:
1. Run google-refine.exe in S:\SY_Systems\SY_Hardware and Software\openrefine-3.4.1 (open the .exe)
2. Create project:
Import .txt file
Under "Parse data as" click "CSV/TSV/separator-based files"
In "character encoding" box, enter/select UTF8
Check "custom" (instead of "tabs" which is the default), and make % the symbol columns are separated by
Uncheck the box for "Use character " to enclose cells containing column separators"
Click "Create project"
3. Undo/Redo tab >
4. Apply... >
5. paste in the JSON code (see bottom of this page--to rename columns, add Watsonline links and folder names, and do some basic edits) >
6. Perform operations
7. Export to Excel
7. Before going to Excel, be sure you’ve downloaded the below macro
Installing and Using the Macro
1. Before editing the data in Excel, you will need to download a macro. SKIP the 4 steps below if you have already downloaded this macro on your machine.
2. Run the BkupAndImportMacroWorkbook.bat script in S:\RS_Reader Services\RS_Digitization\CONTENTdm\ContentDM Macros\Current Macro. This will copy personal.xlsb to %appdata%\microsoft\excel\xlstart\. Doing this will create a personal macro workbook for your Excel, which will allow you to run macros on any macro-enabled workbook.
3. To run macros, you have to enable the Developer tab in Excel. To do this, go to File-->Options-->Customize Ribbon and select “Developer.”
4. To run the macro, go to the Developer tab and select Macro. This will open a window that will allow you to run macros and assign hotkeys to them. The macro will be named PERSONAL.XLSB!ContentDMCleanEdit. Select it and click “run.”
Data manipulation in Excel:
1. Open in Excel, enable editing and (on the File tab) convert to .xlsx
2. Click on Developer tab, then the Macros icon, then run the “PERSONAL.XLSB!ContentDMCleanEdit” macro. This will do the following operations:
Entire sheet:
replace [three spaces] with [space];[space]
replace [two spaces] with [single space]
replace all " (double quote) marks with the % character.
Alt title:
remove 799s and any other inappropriate titles.
change the space-semicolon-space exported from 830s into a comma-space.
Publisher:
delete the contents of any cells containing junk values like “s.n.”
Date and Date text:
Delete Date (text) data that is identical to the data in the Date column.
Format/Medium:
Remove trailing periods.
3. Next, do the following operations:
4. Subject:
Remove FAST headings (indicated with "[term]. fast (OCoLC)fast[number]" or just ". fast") if they duplicate LCSH. If they are unique, remove the ". fast (OCoLC)[number]" part).
5. Title:
For multi-volume sets, if more than one volume has been digitized:
Replicate rows as needed.
Add [space]:[space][volume x] to the end of the title as appropriate.
For multi-volume sets, if only one volume or issue has been digitized, edit the title as above.
6. Creator:
remove $e role terms
7. Description:
remove irrelevant notes (e.g. "Nolen Library has v. 1 Arno reprint of 1974 cataloged separately"--use your judgment).
for books from multi-volume sets, remove any notes that don't refer to that volume.
8. Related resource: usually empty
9. Contributor
Remove $e role terms.
10. Date and Date text
If complete date (i.e. 1984) exists in Date, delete whatever info is in Date text column
For individual volumes in multi-volume sets, edit the date to match the publication date of the volume.
Date column must contain numerical dates in the yyyy-mm-dd (single dates) or yyyy-yyyy (year range) format.
Remove dates including non-numerical characters (e.g. 18uu or 192-?) from Date. Reformat these as follows:
Known decade, e.g. 182u -- enter an appropriate date range in Date (e.g. 1820-1829) and appropriate text in Date text (e.g. 1820s).
Known century, e.g. 18uu -- enter appropriate text in the Date text column (e.g. 19th century).
Unknown century, e.g. 1uuu or uuuu, -- remove the data entirely.
11. Dimensions
For books from multi-volume sets, replace the volume information with the page count for that volume (e.g., "3 v." becomes "368 p.")
12. Digital collection: fill in as appropriate
13. Repository: fill in as appropriate, but nearly always:
Metropolitan Museum of Art (New York, N.Y.)
14. Provider: fill in as appropriate.
15. Credit line
Use for specific donor information or, for accessioned Museum objects, the credit line as stated in the object's TMS record.
16. Location: usually empty
17. Time period: usually empty
18. References: usually empty
19. Identifier: usually empty
20. Language
Confirm that the code transforms have worked for all languages, and fix any that were not.
21. Type:
usually "Text"
22. Copyright status: fill in as appropriate:
Copyrighted
Unknown
Public domain
23. Copyright Note : fill in as appropriate, matching to the copyright status.
Copyright © [name of copyright holder]
These digitized collections are accessible for research purposes. We have indicated what we know about copyright and rights of privacy, publicity, or trademark. Due to the nature of archival collections, we are not always able to identify this information. We are eager to hear from any rights owners, so that we may provide accurate information. When a rights issue needs to be addressed, upon request we will remove the material from public view, while we look into it.
Material is in the public domain. No restrictions on use.
24. Copyright information : fill in for all items, regardless of copyright status:
The Libraries of the Metropolitan Museum of Art make digital versions of collections accessible for research purposes in the following situations: They are in the public domain; the rights are owned by The Metropolitan Museum of Art; we have permission to make them accessible; we make them accessible as a fair use, or there are no known restrictions on use. To learn what your responsibilities are if you would like to use the materials, go to http://www.metmuseum.org/information/terms-and-conditions
25. Watsonline record #: should generate automatically
26. Watsonline link: should generate automatically
27. Transcript: usually empty
28. Local use: usually empty
29. Folder: should generate automatically
30. Select and delete empty rows and columns surrounding the data to ensure there are no errant spaces or data.
31. Save the Excel sheet.
32. Save As... to Q > _Production Master JPEGs > [Project folder] as a Unicode text file. You can only save the currently active sheet this way.
Close the file and go to: 3. Batch importing metadata and items via the Project Client
Last updated 17 Sept 2019
================ JSON code for first transform--copy & paste all code below below this line ================
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [
{
"type": "list",
"name": "260|ab",
"expression": "isBlank(value)",
"columnName": "260|ab",
"invert": false,
"omitBlank": false,
"omitError": false,
"selection": [
{
"v": {
"v": false,
"l": "false"
}
}
],
"selectBlank": false,
"selectError": false
}
],
"mode": "row-based"
},
"columnName": "264|ab",
"expression": "grel:cells['260|ab'].value",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column 264|ab using expression grel:cells['260|ab'].value"
},
{
"op": "core/column-removal",
"columnName": "260|ab",
"description": "Remove column 260|ab"
},
{
"op": "core/column-rename",
"oldColumnName": "264|ab",
"newColumnName": "IMPRINT",
"description": "Rename column 264|ab to IMPRINT"
},
{
"op": "core/column-rename",
"oldColumnName": "246",
"newColumnName": "Alt title",
"description": "Rename column 246 to Alt title"
},
{
"op": "core/column-rename",
"oldColumnName": "AUTHOR",
"newColumnName": "Creator",
"description": "Rename column AUTHOR to Creator"
},
{
"op": "core/column-rename",
"oldColumnName": "500",
"newColumnName": "Description",
"description": "Rename column 500 to Description"
},
{
"op": "core/column-rename",
"oldColumnName": "902",
"newColumnName": "Related resource",
"description": "Rename column 902 to Related resource"
},
{
"op": "core/column-rename",
"oldColumnName": "IMPRINT",
"newColumnName": "Publisher",
"description": "Rename column IMPRINT to Publisher"
},
{
"op": "core/column-rename",
"oldColumnName": "SUBJECT",
"newColumnName": "Subject",
"description": "Rename column SUBJECT to Subject"
},
{
"op": "core/column-rename",
"oldColumnName": "ADD.AUTHOR",
"newColumnName": "Contributor",
"description": "Rename column ADD.AUTHOR to Contributor"
},
{
"op": "core/column-rename",
"oldColumnName": "008 Date One",
"newColumnName": "Date",
"description": "Rename column 008 Date One to Date"
},
{
"op": "core/column-rename",
"oldColumnName": "260|c",
"newColumnName": "Date text",
"description": "Rename column 260|c to Date text"
},
{
"op": "core/column-rename",
"oldColumnName": "300",
"newColumnName": "Dimensions",
"description": "Rename column 300 to Dimensions"
},
{
"op": "core/column-rename",
"oldColumnName": "655|a",
"newColumnName": "Format medium",
"description": "Rename column 655|a to Format medium"
},
{
"op": "core/column-rename",
"oldColumnName": "9022",
"newColumnName": "Digital collection",
"description": "Rename column 9022 to Digital collection"
},
{
"op": "core/column-rename",
"oldColumnName": "9023",
"newColumnName": "Repository",
"description": "Rename column 9023 to Repository"
},
{
"op": "core/column-rename",
"oldColumnName": "9024",
"newColumnName": "Provider",
"description": "Rename column 9024 to Provider"
},
{
"op": "core/column-rename",
"oldColumnName": "904",
"newColumnName": "Credit line",
"description": "Rename column 904 to Credit line"
},
{
"op": "core/column-rename",
"oldColumnName": "9025",
"newColumnName": "Location",
"description": "Rename column 9025 to Location"
},
{
"op": "core/column-rename",
"oldColumnName": "9026",
"newColumnName": "Time Period",
"description": "Rename column 9026 to Time Period"
},
{
"op": "core/column-rename",
"oldColumnName": "510",
"newColumnName": "References",
"description": "Rename column 510 to References"
},
{
"op": "core/column-rename",
"oldColumnName": "9027",
"newColumnName": "Identifier",
"description": "Rename column 9027 to Identifier"
},
{
"op": "core/column-rename",
"oldColumnName": "LANG",
"newColumnName": "Language",
"description": "Rename column LANG to Language"
},
{
"op": "core/column-rename",
"oldColumnName": "9028",
"newColumnName": "Type",
"description": "Rename column 9028 to Type"
},
{
"op": "core/column-rename",
"oldColumnName": "9029",
"newColumnName": "Copyright status",
"description": "Rename column 9029 to Copyright status"
},
{
"op": "core/column-rename",
"oldColumnName": "90210",
"newColumnName": "Copyright Note",
"description": "Rename column 90210 to Copyright Note"
},
{
"op": "core/column-rename",
"oldColumnName": "90211",
"newColumnName": "Copyright information",
"description": "Rename column 90211 to Copyright information"
},
{
"op": "core/column-rename",
"oldColumnName": "RECORD #(Bibliographic)",
"newColumnName": "RECORD #",
"description": "Rename column RECORD #(BIBLIO) to RECORD #"
},
{
"op": "core/column-removal",
"columnName": "90212",
"description": "Remove column 90212"
},
{
"op": "core/column-rename",
"oldColumnName": "90213",
"newColumnName": "Transcription",
"description": "Rename column 90213 to Transcription"
},
{
"op": "core/column-rename",
"oldColumnName": "90214",
"newColumnName": "Local Use",
"description": "Rename column 90214 to Local Use"
},
{
"op": "core/column-removal",
"columnName": "90215",
"description": "Remove column 90215"
},
{
"op": "core/column-addition",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"baseColumnName": "RECORD #",
"expression": "grel:substring(value, 0, -1)",
"onError": "set-to-blank",
"newColumnName": "Folder",
"columnInsertIndex": 26,
"description": "Create column Folder at index 26 based on column RECORD # using expression grel:substring(value, 0, -1)"
},
{
"op": "core/column-addition",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"baseColumnName": "Folder",
"expression": "grel:\"https://library.metmuseum.org/record=\"+value",
"onError": "set-to-blank",
"newColumnName": "Watsonline",
"columnInsertIndex": 27,
"description": "Create column Watsonline at index 27 based on column Folder using expression grel:\"https://library.metmuseum.org/record=\"+value"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Format medium",
"expression": "grel:replace(value, \").\", \")\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Format medium using expression grel:replace(value, \").\", \")\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Format medium",
"expression": "grel:replace(value, \".;\", \";\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Format medium using expression grel:replace(value, \".;\", \";\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Description",
"expression": "grel:replace(value, \";\", \"; \")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Description using expression grel:replace(value, \";\", \"; \")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Subject",
"expression": "grel:replace(value, \";\", \"; \")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Subject using expression grel:replace(value, \";\", \"; \")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Format medium",
"expression": "grel:replace(value, \";\", \"; \")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Format medium using expression grel:replace(value, \";\", \"; \")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:replace(value, \"eng\", \"English\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:replace(value, \"eng\", \"English\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:replace(value, \"ger\", \"German\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:replace(value, \"ger\", \"German\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:replace(value, \"fre\", \"French\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:replace(value, \"fre\", \"French\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:replace(value, \"ita\", \"Italian\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:replace(value, \"ita\", \"Italian\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:replace(value, \"spa\", \"Spanish\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:replace(value, \"spa\", \"Spanish\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:replace(value, \"jpn\", \"Japanese\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:replace(value, \"jpn\", \"Japanese\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Dimensions",
"expression": "grel:replace(value, \" \", \" ; \")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Dimensions using expression grel:replace(value, \" \", \" ; \")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Publisher",
"expression": "grel:replace(value, \":;\", \": \")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Publisher using expression grel:replace(value, \":;\", \": \")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Publisher",
"expression": "grel:value.replace(/\\,$/,\"\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Publisher using expression grel:value.replace(/\\,$/,\"\")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Publisher",
"expression": "grel:value.replace(/,;/, \" : \")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Publisher using expression grel:value.replace(/,;/, \" : \")"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Language",
"expression": "grel:value.replace(\"rus\", \"Russian\")",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Language using expression grel:value.replace(\"rus\", \"Russian\")"
},
{
"op": "core/column-rename",
"oldColumnName": "245",
"newColumnName": "Title",
"description": "Rename column 245 to Title"
},
{
"op": "core/column-move",
"columnName": "Folder",
"index": 29,
"description": "Move column Folder to position 29"
}
]