The Developed Common Terminology (CT) 1.1
Newly Developed Websites (from January 1, 2015):
The International Open Public Digital Library (IOPDL) project is found at http://www.iopdl.org
Common Terminology (CT) is found at http://www.ct.iopdl.org
To achieve the goal of CT, a Common Terminology for the widely used MARC, MODS, DC, and QDC metadata formats has been under development since May 2012 as a prototype and case study. The Common Terminology is a set of Common Terms of MARC & MODS and DC & QDC. It is based on the Library of Congress crosswalks (e.g., MARC to MODS, MARC to DC, MODS 3.4 to MARC, MODS 3.4 to DC, DC to MODS, and DC to MARC) (LC, Conversions). Actual metadata records from Harvard (MARC, 12 million records), UIUC (MARCXML, 10 million), and MIT (QDC, 20 thousand) were used in developing CT. These records were analyzed with specially designed Python programs to investigate the usage of MARC tags and QDC elements. MARC tag usage in WorldCat and in searching was also consulted (Smith-Yoshimura, et al., 2010). The detailed work for the CT project is described in the paper 'A Model and Roles of a Common Terminology to Improve Metadata interoperability', available at http://hdl.handle.net/2142/50100.
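The tag-usage investigation mentioned above can be illustrated with a small sketch. This is not the project's actual program, which is not reproduced on this page. It assumes the records are available as MARCXML in the standard MARC21 slim namespace, and the two-record sample and the function name count_tag_usage are made up for illustration. It counts, for each MARC tag, how many records contain that tag at least once.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard MARCXML ("MARC21 slim") namespace.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Made-up two-record sample, for illustration only.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">rec1</controlfield>
    <datafield tag="245" ind1="1" ind2="0"><subfield code="a">First title</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Metadata</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Libraries</subfield></datafield>
  </record>
  <record>
    <controlfield tag="001">rec2</controlfield>
    <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Second title</subfield></datafield>
  </record>
</collection>"""


def count_tag_usage(marcxml_text):
    """Return (record_count, Counter mapping tag -> number of records using it)."""
    root = ET.fromstring(marcxml_text)
    usage = Counter()
    record_count = 0
    for record in root.iter(MARC_NS + "record"):
        record_count += 1
        # A set ensures a repeated tag (e.g. two 650 fields) counts once per record.
        tags = {field.get("tag") for field in record if field.get("tag")}
        usage.update(tags)
    return record_count, usage


records, usage = count_tag_usage(SAMPLE)
for tag, n in sorted(usage.items()):
    print(tag, n, f"{100 * n / records:.0f}%")
```

For files with millions of records, a streaming parser such as xml.etree.ElementTree.iterparse would likely be needed instead of loading the whole document at once.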
1. Definition of Common Terminology
Common Terminology (CT) is a set of Common Terms.
Common Terminology 1.1 is a set of Common Terms of commonly used standards (MARC & MODS and DC & QDC).
Common Terminology is meant to encompass various metadata standards, allowing communities to keep using their own standards while providing uniformity in searching.
The purposes of Common Terminology are:
to give uniformity, in the form of Common Terms, to an integrated search engine while allowing communities to use their own metadata standards according to their needs.
to achieve metadata interoperability at the schema level by designing a CT of commonly used metadata schemas (MARC & MODS and DC & QDC).
to achieve metadata interoperability at the schema definition language level by representing CT 1.1 in XML Schema and RDF Schema, and as SKOS concepts that link CT semantically on the Web.
to achieve metadata interoperability at the record level by conducting mapping experiments with conversions of Harvard (MARC, 12 million), MIT (QDC, 20,000), and UIUC (MARCXML, 10 million) metadata records.
to achieve metadata interoperability at the repository level by building a CT union catalog and Linked Open Data that connect 3 million online-accessible records of the Harvard (MARC), MIT (QDC), and UIUC (MARCXML) libraries and provide a portal for them.
The definitions of terms for Common Terminology are:
A Common Terminology is a set of Common Terms of element names in widely used metadata schemas such as MARC, MODS, DC, and QDC.
A Common Term is a property (element) or class.
A property (sub-property) can be one kind of common element (field) or attribute (subfield) in two or more metadata schemas.
A class is an authority group or “a group containing members that have attributes, [behaviors], relationships or semantics in common or a kind of category” (DCMI, 2013).
The CTScheme is a set of authorities, which are sets of value lists of MODS & MARC and DC & QDC:
CTTypeGenre
CTFormat
CTRelator
CTLanguage
CTDescription
CTIdentifier
CTSubject
2. Versions of the Developed Common Terminology
The current Common Terminology 1.1
The refined and published CT documentation for version 1.1 is CT version 1.1.
The implemented and published CT version 1.1 schema in XML is ct.xsd.
The implemented and published CT version 1.1 schema in RDF is ct.rdf.
The implemented and published CT version 1.1 in SKOS is ctskos.rdf.
The diagram of the Common Terminology 1.1 is in CT Diagram 1.1.
Prior Version CT 1.0
CT version 1.0 was finalized in March 2014, but it has since been modified and improved by reviewers into version 1.1.
Implemented the CT version 1.0 schema in XML and RDF form to achieve schema language interoperability.
Published the CT on the Semantic Web.
The implemented and published CT version 1.0 schema in XML form is ct.xsd.
The implemented and published CT version 1.0 schema in RDF form is ct.rdf.
The refined and published CT documentation for version 1.0 is CT version 1.0.
The crosswalks developed for CT 1.0 (March 2014) are:
The MARC to CT crosswalk for CT version 1.0.
The DC & QDC to CT crosswalk for CT version 1.0.
The CT Primer, which refines the CT so that it can improve metadata interoperability (still under development).
3. Improving Metadata Interoperability at the Schema Level
CT has been developed to improve lexical and semantic interoperability at the schema level of the metadata model.
CT is a set of Common Terms drawn from the element names of MODS (for MARC) and of DC & QDC.
It has 12 Common Terms (properties) and 58 qualifiers (sub-properties) that specify and subdivide the 12 Common Terms in detail.
The Common Terms and qualifiers were selected to minimize the gap between the different degrees of generality or specificity of MARC, MODS, DC, and QDC.
The 12 Common Terms are contributor, date, description, format, identifier, language, publisher, relation, rights, subject, title, and typeGenre.
The 58 qualifiers of CT 1.1 were selected to preserve much of the information carried by the 1,000 MARC tags and their many subfields, and by the elements and attributes of MODS.
3.1. The developed 12 Common Terms and 58 qualifiers
3.2. The Developed Crosswalks
MARC (Harvard and UIUC) to CT Crosswalk
4. Improving Metadata Interoperability at the Schema Definition Language Level
To improve metadata interoperability at the schema definition language level of the metadata model, the generalized 12 Common Terms and 58 qualifiers of the Common Terminology are represented in the XML Schema definition language (XSD) and the RDF Schema language (RDFS).
This opens the way for many communities to use CT in either XML or RDF form.
In particular, CT in RDF offers more opportunities for further development that enhances semantic interoperability on the Web.
The RDF schema for CT will be a foundation for building Linked Open Data on the Web to improve metadata interoperability at the repository level.
In practice, in XML Schema (W3C, XML Schema), the 12 Common Terms of CT are defined as elements, and the 58 qualifiers are defined as attributes: type, authority, name, role, and source.
In RDF Schema (W3C, RDF Schema 1.1, 2014), on the other hand, the 12 Common Terms of CT are defined as properties, and the type, name, role, and source qualifiers as sub-properties. For example, CT:subject has type attributes such as type="spatial" and type="temporal" in XML Schema, but in RDFS these are defined as sub-properties of the subject property. They are also defined and connected through narrower relationships in SKOS concepts (W3C, SKOS Simple Knowledge Organization System Primer, 2009). An authority qualifier in XML is defined as a class in RDFS.
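The shift from XML attributes to RDFS sub-properties described above can be sketched in a few lines of Python. This is only an illustration, not the contents of ct.rdf. The namespace http://example.org/ct# is a placeholder, and the sub-property names subjectSpatial and subjectTemporal are hypothetical, since the published identifiers are not listed on this page. Only the rdfs:subPropertyOf URI is taken from the W3C vocabulary.

```python
# rdfs:subPropertyOf from the W3C RDF Schema vocabulary.
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

# Placeholder namespace (assumption): the real CT namespace is not given here.
CT = "http://example.org/ct#"


def subproperty_triples(term, qualifier_types):
    """Turn XML-style type values of a Common Term into RDFS sub-property triples."""
    triples = []
    for qualifier in qualifier_types:
        # Hypothetical naming rule: term + capitalized type, e.g. subjectSpatial.
        sub_property = f"{CT}{term}{qualifier.capitalize()}"
        triples.append((sub_property, RDFS_SUBPROPERTY_OF, CT + term))
    return triples


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# In XML: <subject type="spatial"> and <subject type="temporal">.
triples = subproperty_triples("subject", ["spatial", "temporal"])
print(to_ntriples(triples))
```

Each printed line states that one qualifier is a sub-property of subject, so a search on the general subject property can also reach values recorded under the narrower ones.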
The Common Terminology, with its 12 Common Terms (elements/properties) and 58 qualifiers (sub-properties) with authorities (classes), is defined in SKOS as concepts with URIs. The SKOS concepts of CT clarify the relationships between properties (sub-properties) and CTSchemes in XML and RDF. CT is defined as a Concept Scheme in SKOS that has 12 top concepts, one for each of the 12 Common Terms (properties) (W3C, SKOS Simple Knowledge Organization System Reference, 2009). For example, CT:contributor is a top concept of the CT 1.1 scheme and has two narrower terms (sub-properties), name and role. The role sub-property of contributor has narrower terms that designate the role of the contributor, such as author and creator, which are defined in the related CTRelator authority of CTScheme. The full SKOS concepts of CT are in Appendix C.
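The contributor example above can be pictured as a small tree of narrower relationships. The following sketch models only that fragment (contributor, name, role, author, creator) with plain labels. It is an assumption-laden illustration rather than the contents of ctskos.rdf, where each concept would be identified by a URI. The helper names descendants and broader_of are hypothetical.

```python
# Fragment of the CT concept scheme taken from the example in the text.
# Keys are concepts; values are their direct narrower concepts.
NARROWER = {
    "contributor": ["name", "role"],
    "role": ["author", "creator"],
}

# Only one of the 12 top concepts is modeled in this sketch.
TOP_CONCEPTS = ["contributor"]


def descendants(concept, narrower=NARROWER):
    """Return all transitively narrower concepts, depth first."""
    found = []
    for child in narrower.get(concept, []):
        found.append(child)
        found.extend(descendants(child, narrower))
    return found


def broader_of(concept, narrower=NARROWER):
    """Return the direct broader concept, or None for a top concept."""
    for parent, children in narrower.items():
        if concept in children:
            return parent
    return None


print(descendants("contributor"))
print(broader_of("author"))
```

Walking the tree this way mirrors how a search on a top concept could be expanded to its narrower terms.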
The CT 1.1 diagram indicates the relationships between properties and sub-properties with authorities. The blue nodes represent properties and the yellow nodes represent sub-properties. The diagram shows all 12 properties (Common Terms) and 58 sub-properties (qualifiers) with the authorities that are defined in CTScheme.
5. Improving Metadata Interoperability at the Record Metadata Model Level
To improve metadata interoperability at the record level, a conversion was written in Python to convert MIT (QDC) records into the Common Terminology 1.1. This is part of the mapping experiments with conversions involving Harvard (MARC), MIT (QDC), and UIUC (MARCXML) metadata records. A second conversion, from UIUC (MARCXML) to CT 1.1, has also been developed. A third conversion, for Harvard (MARC or their description), is under development. The aim is to achieve and improve metadata interoperability at the record level among the three universities' libraries and among MARC (MODS), QDC (DC), and CT. This part would not have been possible without the cooperation of the three universities. I sincerely thank the people at Harvard, MIT, and UIUC who cooperated with the CT project by providing their metadata records.
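The general shape of such a record-level conversion can be sketched as follows. This is not the project's converter or its published DC & QDC to CT crosswalk. The mapping table is an illustrative assumption built from the 12 Common Terms named above, and the input format (element, qualifier, value) and the sample values are made up. Elements without an entry in the table are reported rather than guessed.

```python
# Illustrative mapping (assumption): DC/QDC element -> (CT term, fixed attributes).
QDC_TO_CT = {
    "title": ("title", None),
    "creator": ("contributor", {"role": "creator"}),
    "contributor": ("contributor", None),
    "subject": ("subject", None),
    "description": ("description", None),
    "publisher": ("publisher", None),
    "date": ("date", None),
    "type": ("typeGenre", None),
    "format": ("format", None),
    "identifier": ("identifier", None),
    "language": ("language", None),
    "relation": ("relation", None),
    "rights": ("rights", None),
}


def qdc_to_ct(fields):
    """Convert (element, qualifier, value) tuples into CT-style field dicts.

    Returns (converted, unmapped) so that nothing is dropped silently.
    """
    converted, unmapped = [], []
    for element, qualifier, value in fields:
        target = QDC_TO_CT.get(element)
        if target is None:
            unmapped.append((element, qualifier, value))
            continue
        term, fixed_attrs = target
        attrs = dict(fixed_attrs or {})
        if qualifier:
            # Carry the QDC qualifier over as a type qualifier (assumption).
            attrs.setdefault("type", qualifier)
        converted.append({"term": term, "attrs": attrs, "value": value})
    return converted, unmapped


# Made-up sample record, for illustration only.
sample = [
    ("title", None, "A Sample Thesis"),
    ("creator", None, "Doe, Jane"),
    ("date", "issued", "2012"),
    ("coverage", "spatial", "Boston"),
]
converted, unmapped = qdc_to_ct(sample)
for field in converted:
    print(field)
print("unmapped:", unmapped)
```

Keeping the mapping in a table separate from the conversion logic means the same function could be reused with a different crosswalk, for example one derived from MARCXML fields.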
Last Modified October 7, 2014