General Guidelines

[The content on this page as of Friday, July 8, 2011, was moved to a permanent spot on the MWDL website, http://harvester.lib.utah.edu/mwdl_test/index.php/about/guidelines. If you are referring to the Guidelines from another website, please point to "the General Guidelines for Digital Metadata posted on the Mountain West Digital Library website at http://mwdl.org." Do not point to the exact URL, as this is likely to change. As the Metadata Task Force approves further changes to the Guidelines, Sandra McIntyre must add the new content to the MWDL website manually.]

Many of the guidelines below apply not only to the Mountain West Digital Library, but also to other OAI-harvesting environments, such as Scientific Commons.

Mapping to Dublin Core

  • Fields that are to be shared via OAI for harvesting should be mapped to Qualified Dublin Core (QDC) or simplified Dublin Core (DC). Mountain West Digital Library harvests QDC from servers where QDC is provided, and DC from other servers. Note: All CONTENTdm servers provide both QDC and DC by default.
  • Local fields that you do not wish to share for harvesting should be mapped to "None". In CONTENTdm, fields that are set to be hidden from display are also unavailable for harvesting.
  • Multiple local fields may be mapped to the same QDC or DC field. These fields will be shared via OAI as distinct fields with the same DC/QDC tag.  Keep in mind that the harvester may concatenate these distinct fields into one field in the harvested environment. Therefore, to avoid the values of those fields being run together illegibly, place a semicolon at the end of each entry.
  • At times you may wish to refrain from mapping more than one field to the same QDC or DC field.  For example, if you are using both a Title and a Filing Title, both of which are mapped to <dcterms:title>, then they will both appear in the harvested "Title" field. This could be confusing to users. Decide on one of them to be mapped to "Title" (dcterms:title) and map the other to "Alternative" (dcterms:alternative) or to "None".
  • You can view exactly what the MWDL harvester and other Open Archives Initiative (OAI) harvesters can retrieve from your digital asset management system by requesting the OAI stream via queries in a Web browser. Instructions for doing this are on the MWDL website page on Open Archives Initiative (OAI) Queries.
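
As a rough illustration of what such a browser query looks like, the sketch below builds request URLs from standard OAI-PMH verbs. The base URL and the set name "my_collection" are hypothetical placeholders; the real path of the OAI endpoint depends on your system. The prefix oai_dc requests simple Dublin Core; the prefix for Qualified Dublin Core varies by server and is not shown here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with your own server's OAI base URL.
BASE_URL = "http://digital.example.edu/oai/oai.php"


def oai_query(verb, **params):
    """Return an OAI-PMH request URL that can be pasted into a Web browser."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


# Describe the repository, list its sets, then list records in one set.
print(oai_query("Identify"))
print(oai_query("ListSets"))
print(oai_query("ListRecords", metadataPrefix="oai_dc", set="my_collection"))
```

Opening any of the printed URLs in a browser should return the XML that a harvester would receive, assuming the server exposes OAI at that address.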

Searchable fields

  • Locally searchable:
    In CONTENTdm, published collections whose metadata is not restricted are searchable. Within those collections, a field whose "Searchable" property is set to "Yes" is searchable within the local CONTENTdm environment. If a user searches within this field's collection only, the search will use the local field names.
    If a user searches across more than one collection, the search will use the Dublin Core-mapped or Qualified Dublin Core-mapped field names.
  • Centrally searchable:
    Only fields with these characteristics are shared for harvesting: (a) mapped to Dublin Core or Qualified Dublin Core and (b) in CONTENTdm and perhaps other systems, not hidden. (In CONTENTdm, hidden fields are not shared via OAI.) Therefore only those fields will be searchable in a central harvested environment such as Mountain West Digital Library.
Note: These two searchability characteristics are independent.  Therefore, a field whose "Searchable" property is set to "No" could still be shared and searchable in the harvested environment by virtue of being mapped to DC or QDC.  Also, a field whose "Searchable" property is set to "Yes" might not be searchable in the harvested environment by virtue of not being mapped to DC or QDC, or, in the case of CONTENTdm and perhaps other systems, by virtue of being hidden and therefore not shared via OAI.
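
The independence of the two characteristics can be sketched as two separate checks. This is a simplified model, not CONTENTdm code: the Field class and its attribute names are hypothetical, and the local check assumes the collection is published and unrestricted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    name: str
    dc_mapping: Optional[str]  # None stands for a mapping of "None"
    searchable: bool           # the field's "Searchable" property
    hidden: bool               # the field's "Hidden" property


def locally_searchable(field: Field) -> bool:
    """Local searching depends only on the "Searchable" property."""
    return field.searchable


def centrally_searchable(field: Field) -> bool:
    """Harvested searching depends only on the mapping and on not being hidden."""
    return field.dc_mapping is not None and not field.hidden


notes = Field("Notes", "description", searchable=False, hidden=False)
remarks = Field("Staff Remarks", None, searchable=True, hidden=False)

print(locally_searchable(notes), centrally_searchable(notes))      # False True
print(locally_searchable(remarks), centrally_searchable(remarks))  # True False
```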

Placeholder Data in Required Fields

Information necessary for required fields may not yet be known, or not yet entered, when a collection is first uploaded or even published. In such a case, enter a placeholder that both fulfills the entry requirement and makes the records easy to find for follow-up. The recommended placeholder is the word "Pending". Example:

Subject: Pending
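
One way to find such records later is to scan exported metadata for the placeholder. The sketch below assumes each record is available as a plain dictionary of field names and values (for example, after reading a tab-delimited export); the function name and sample records are hypothetical.

```python
PLACEHOLDER = "Pending"


def records_needing_follow_up(records):
    """Return (record index, field name) pairs whose value is the placeholder."""
    hits = []
    for index, record in enumerate(records):
        for field_name, value in record.items():
            # Ignore case, surrounding spaces, and a trailing semicolon.
            cleaned = value.strip().rstrip(";").strip().lower()
            if cleaned == PLACEHOLDER.lower():
                hits.append((index, field_name))
    return hits


sample = [
    {"Title": "Main Street", "Subject": "Pending"},
    {"Title": "Depot", "Subject": "Railroads;"},
]
print(records_needing_follow_up(sample))  # [(0, 'Subject')]
```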

Local field name vs. DC mapping

Each collection can have its own local field names. The "labels" indicated in the MWDL Dublin Core Application Profile are only suggestions, and you are free to name your fields as you wish. However, you must map the fields correctly, because only the mapping matters when a collection is harvested. If a field is not mapped, it will not appear in the MWDL record. If it is mapped to the wrong Dublin Core element, the metadata will appear in the wrong MWDL field.

Example:

The entity primarily responsible for making the resource must be mapped to "Creator" (dc:creator), but the local field name can be whatever you want: "Creator", "Artist", "Author", "Photographer", etc. If relevant to your collection, you may create several fields mapped to "Creator".
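
The effect of the mapping can be sketched with a small model of a harvest. The field names, the sample record, and the way values are joined are all assumptions for illustration; an actual harvester may combine repeated elements differently.

```python
from collections import defaultdict

# Hypothetical local field names mapped to Dublin Core elements.
# None stands for a mapping of "None" (not shared for harvesting).
FIELD_MAP = {
    "Photographer": "creator",
    "Artist": "creator",
    "Title": "title",
    "Filing Title": None,
}


def harvested_view(record):
    """Group a local record's values by the DC element each field maps to."""
    grouped = defaultdict(list)
    for local_name, value in record.items():
        element = FIELD_MAP.get(local_name)
        if element is None:
            continue  # unmapped fields never reach the harvested record
        grouped[element].append(value)
    # Assumption: the harvester runs same-element values together.
    return {element: " ".join(values) for element, values in grouped.items()}


local_record = {
    "Photographer": "Smith, Jane;",
    "Artist": "Doe, John;",
    "Title": "Main Street, looking north",
    "Filing Title": "Main Street",
}
print(harvested_view(local_record))
# {'creator': 'Smith, Jane; Doe, John;', 'title': 'Main Street, looking north'}
```

The local names "Photographer" and "Artist" disappear in the harvested view; only the mapping determines where each value lands, and the trailing semicolons keep the two creators legible after they are combined.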

Identifier

The value of the required field Identifier is the URI of the resource. This field is automatically created and mapped in CONTENTdm. You do not have to create this field or enter a value.

If you create additional Identifier fields in your collection, map them to "None", not to "Identifier". Only the automatically generated "reference URL" from CONTENTdm is allowed to be mapped to "Identifier".

Date Fields

When setting up the fields for your collection and starting to enter values, remember to treat Date fields differently. Here are some tips about configuring field properties and formatting dates.

Date Fields Setup

You can establish several different kinds of dates, if you like. The metadata standard requires you to enter the Date (original date). In CONTENTdm and possibly other systems, the field must not be hidden; in CONTENTdm, hidden fields are not shared via OAI and therefore cannot be harvested. Also, we suggest you set the Date field to be searchable.

  • Date (original date): Set this required field to have the data type of "Date" and to be searchable. Go to "fields" on the "collections" tab in the CONTENTdm Administration interface, click "edit" next to the Date field and set its properties:
    • Field name: Date (or "Date.Original" if you prefer)
    • Dublin Core mapping: Date
    • Data type: Date
      Note: The data type lets CONTENTdm know what sort of data to expect: text, date, or full text search. In CONTENTdm, this data type will constrain the format of your entry of metadata to one of the date formats that CONTENTdm allows.
    • Searchable: Yes
    • Hidden: No
      Note: This date must not be hidden. In CONTENTdm and possibly other systems, hidden fields are not shared via OAI.
  • Digitized Date (Date.Digital): You may wish to record the date that a resource was digitized, for local reference. Do not map this field. Only one field should be mapped to Date.
    • Field name: Digitized Date (or "Date.Digital" if you prefer)
    • Dublin Core mapping: None (to prevent it being harvested and creating confusion downstream)
    • Data type: Date
    • Searchable: No
    • Hidden: No (or Yes if you prefer)

Date Formatting

Unsure how to format dates? You can look at the CONTENTdm Help page on "Entering Dates". However, this is not quite a complete list. Here is a modified list that we think is more accurate for CONTENTdm 4.3 and above:

  • Acceptable formats for import:
    • When using the Media Editor, the Project Spreadsheet, or the Template Creator:
      • yyyy
      • yyyy-mm
      • yyyy-mm-dd
      • mm/yyyy
      • mm/dd/yyyy
      • mm-yyyy
      • mm-dd-yyyy
      • dd-month yyyy
      • yyyy-yyyy
    • When using a tab-delimited file for multiple-file imports:
      • yyyy
      • yyyy-yyyy 
      • mm/dd/yyyy
      • mm-dd-yyyy
  • Stored formats: Dates are automatically converted in CONTENTdm to one of these storage formats when the record is imported or when the Media Editor is saved. This is also how the dates are shared via OAI (regardless of the display format chosen in CONTENTdm; see Display formats below). These formats are all compliant with the international standard for dates, ISO 8601.
    • yyyy
    • yyyy; yyyy; yyyy; yyyy [date ranges are converted to a semicolon-separated list of single years]
    • yyyy-mm
    • yyyy-mm-dd
  • Display formats within CONTENTdm viewers: How dates are displayed in the Web templates can be configured. This does not affect the formats under which they are stored.
  • Non-standard dates: In CONTENTdm, setting the data type of the Date field to "Date" constrains the entry of metadata to one of the date formats above, all of which require a four-digit year. None of these formats allows entry of Before Common Era (BCE)/BC dates, Common Era (CE)/AD dates before 1000, or dates in other calendar systems. Label such non-standard dates appropriately, and set the field's data type to "Text" in order to allow non-date-formatted entry. Examples:
BCE Date: 48 BCE;
BCE Date: 1000-800 BCE;
Date: 915 CE;
Date: 404-415 AD;
Hebrew Date: 5750;
Islamic Date: Hijri 1350;
Julian Date: 1849 AD;

If a collection consists of both standard dates and non-standard dates, it is recommended to set up two fields both mapped to Dublin Core date. One may be date-formatted while the other remains text to accommodate some of the forms above. Within the local CONTENTdm environment this limits searching within dates, but will allow the display of all forms of date information locally and in OAI-harvested records.
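
The conversion from entry formats to stored formats described under Date Formatting can be sketched as follows. This is an illustration of the rules listed in this guide, not CONTENTdm's actual code: the function names are hypothetical, the "dd-month yyyy" entry format is left out because its exact form is not specified here, and the range expansion assumes every year in the range is listed.

```python
import re
from datetime import date


def to_stored_format(value: str) -> str:
    """Convert one entered date to a yyyy, yyyy-mm, or yyyy-mm-dd style string."""
    text = value.strip()

    # yyyy-yyyy: expand the range into a semicolon-separated list of years.
    match = re.fullmatch(r"(\d{4})-(\d{4})", text)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        if end < start:
            raise ValueError(f"range ends before it starts: {value!r}")
        return "; ".join(str(year) for year in range(start, end + 1))

    # yyyy, yyyy-mm, yyyy-mm-dd: already year-first.
    match = re.fullmatch(r"(\d{4})(?:-(\d{1,2}))?(?:-(\d{1,2}))?", text)
    if match:
        return _iso(*match.groups())

    # mm/yyyy, mm-yyyy, mm/dd/yyyy, mm-dd-yyyy: month-first, one separator.
    match = re.fullmatch(r"(\d{1,2})([/-])(?:(\d{1,2})\2)?(\d{4})", text)
    if match:
        month, _separator, day, year = match.groups()
        return _iso(year, month, day)

    # Non-standard dates (for example "48 BCE") belong in a Text field.
    raise ValueError(f"not a recognized date format: {value!r}")


def _iso(year, month=None, day=None):
    """Assemble year-first output, validating the month and day if given."""
    if month is None:
        return year
    if day is None:
        if not 1 <= int(month) <= 12:
            raise ValueError(f"month out of range: {month}")
        return f"{year}-{int(month):02d}"
    return date(int(year), int(month), int(day)).isoformat()


print(to_stored_format("03/15/1907"))  # 1907-03-15
print(to_stored_format("03-1907"))     # 1907-03
print(to_stored_format("1905-1908"))   # 1905; 1906; 1907; 1908
```

An entry such as "48 BCE" raises ValueError in this sketch, which mirrors the advice above to keep such values in a separate field with the data type "Text".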

Copyright

The Rights field in MWDL metadata records may contain information regarding copyright ownership or physical ownership. Physically owning something does not always mean copyright ownership. Making a digital version of a work also does not merit copyright protection because, according to the Bridgeman decision http://www.law.cornell.edu/copyright/cases/36_FSupp2d_191.htm, it lacks sufficient original creativity (one of the tests to meet for copyright protection). For an explanation of the difference between copyright and physical ownership, see the following succinct overview:  http://www.library.yale.edu/special_collections/copyright.html.

In formulating copyright statements, refer to your institution’s copyright page. Or, if your institution does not have one, see Marriott Library’s copyright resource page at http://tinyurl.com/5dy84f for more information and a list of tools to determine copyright status, etc. 

Do you have rights to the material you are adding to your digital collection? 

Copyright protects the creators of original literary, dramatic, musical, artistic, and certain other intellectual works (Title 17, U.S. Code). The protection extends to both published and unpublished material. "Section 106 of the 1976 Copyright Act generally gives the owner of copyright the exclusive right to do and to authorize others to do the following" (Copyright Basics, US Copyright Office):

  • To reproduce the work
  • To prepare derivative works
  • To distribute copies of the work
  • To perform the work publicly
  • To display the work publicly

Use these questions as an initial guide:

Are you the original creator/author? 
If yes, then you are the rights holder.

Did someone else create the work?
If so, then you are most likely not the rights holder.

Did someone assign rights to you through a written assignment? 
If yes, then you are the rights holder.

If someone else created the work and did not assign rights to you, you will need to determine who the rights holder is. Determining a work's copyright status requires a bit of investigation, but there are many tools to assist with this. 

Step 1: Research

Research U.S. Copyright Office registration records (http://cocatalog.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First). The catalog contains records from 1978 to the present. For works from before 1978, use Stanford's Copyright Renewal Database (http://collections.stanford.edu/copyrightrenewals/bin/page?forward=home). Search by author, creator, publisher, or title.

Step 2: Ask (if needed)

If you find a record, it usually means there is a rights holder. That entity (not necessarily the library) should be listed as the copyright holder, and you may need to consider getting permission to digitize. For more information, see "The Basics of Getting Permission" (http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter1/1-b.html).

Step 3: Use Public Domain Slider

If there is no record, check the Public Domain slider http://librarycopyright.net/digitalslider/ to determine whether the work fits the criteria for public domain.

Once you've done some investigation and have an informed idea of the work's copyright status, consider using these sample copyright statements. The sample statements below also include wording in the case of unknown copyright status.

Below are sample wordings of rights statements; replace underlined text with applicable local information.

  • For copyrighted works with all rights reserved, use:

© Personal/Corporate name, year, email/web address (if available). Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user.

  • For copyrighted works with some permission built-in (Creative Commons), use:

© Personal/Corporate name, year, email/web address (if available). Use of this file is allowed in accordance with the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License; http://creativecommons.org/licenses/by-nc-nd/3.0/us/

  • For public domain works, use:

Material in the public domain. No restrictions on use.  If you wish to purchase print copies or a high-resolution version of the image, see [local site].

  • For works where copyright status is unknown, use:

Copyright status unknown. Some material in these collections may be protected by the U.S. Copyright Law (Title 17, U.S.C.). In addition, the reproduction and/or commercial use of some materials may be restricted by gift or purchase agreements, donor restrictions, privacy and publicity rights, licensing agreements, and/or trademark rights. Distribution or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. To the extent that restrictions other than copyright apply, permission for distribution or reproduction from the applicable rights holder is also required. Responsibility for obtaining permissions and for any use rests exclusively with the user.

Brief example

Let's say a digital collection contains a digital copy of an original photograph taken in 1907. The photograph is likely in the public domain (check the Public Domain slider). In this case the digital reproduction of the original is not eligible for copyright protection because it lacks sufficient creativity/originality.  The Rights field indicates that the photograph is in the public domain with a statement like “Material in the public domain. No restrictions on use.” However, the library that digitized the photograph offers prints of it for a fee, so the Rights statement explains that  users can order copies of the digital image for a fee and provides a link to an order form and pricing information. The resulting statement looks like this:


Material in the public domain. No restrictions on use. To purchase print copies or a high-resolution version of the image, see [URL for webpage describing how to order].

No HTML tags within metadata

Metadata should be kept free of tags and formatting codes as much as possible since it is shared as text via OAI with MWDL and other harvesters like Scientific Commons. Because it is not predictable how metadata will be used, crosswalked, or formatted at the harvesting end, it is advisable to keep it "clean" of any tags. 
  • Do not use HTML tags within the values of your metadata fields.  For example, do not use "<br>" or "<br />" within metadata fields to force a line break. Do not use "<em>", "<i>", "<strong>", "<b>", or other formatting tags within metadata fields. Even where CONTENTdm is configured to render these tags (as OCLC has configured it to render "<br>" or "<br />" by default), they will be included in the OAI stream and therefore shared with central harvesters. This leads to ineffective and ugly metadata in the harvested environment.
  • CONTENTdm can be configured to recognize hard carriage returns, without using HTML tags, if you like. Nathan Pugh posted instructions to the CONTENTDM-L list. See his posting at http://listserv.oclc.org/scripts/wa.exe?A2=ind0804A&L=CONTENTDM-L&P=R2290. Ask your CONTENTdm system administrator to make this configuration change if you want to include line breaks in your metadata display within CONTENTdm. Keep in mind that these line breaks will not be included in the OAI stream and therefore will not appear in any centrally harvested environment.
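
Before sharing a collection, a quick scan can flag values that still contain tags. The sketch below is a heuristic only: the pattern matches anything shaped like an opening or closing tag, the function name is hypothetical, and records are assumed to be plain dictionaries of field names and values.

```python
import re

# Matches things shaped like <tag>, </tag>, or <tag />; a heuristic, not a parser.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def find_tagged_fields(record):
    """Return {field name: [tags found]} for every value containing a tag."""
    return {
        name: TAG_PATTERN.findall(value)
        for name, value in record.items()
        if TAG_PATTERN.search(value)
    }


sample_record = {
    "Description": "First line<br />Second line",
    "Title": "Main <em>Street</em>",
    "Subject": "Railroads;",
}
print(find_tagged_fields(sample_record))
# {'Description': ['<br />'], 'Title': ['<em>', '</em>']}
```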

Relationship between Genre, Medium, Extent, Subject, Type, and Format

[To be developed.]


Please notify the UALC Digitization Committee Metadata Task Force or the MWDL Program Director if you have corrections or additions to the above Guidelines.

