Work Plan

Task 1. Proposing the Research Project and Conducting Background Research

Objective

The objective of Task 1 is to propose the research project and to assemble the background research materials needed to study the problems caused by using many different metadata schemas. It also covers collecting papers on achieving interoperability, in particular metadata interoperability, among the three chosen databases and universities.

Work Steps

  • Propose the research project with the work plan.
  • Assemble all background research materials required to support the research.
  • Review the literature on the interoperability problems caused by using various metadata schemas.
  • Identify and assess the existing interoperability methods.
  • Document the interoperability problems found, evaluate the existing interoperability methods, and identify the best way to achieve interoperability.

Deliverables

  • Research proposal – Roles of Common Terminology for Improving Metadata Interoperability (Technical Requirement)
  • Progress reports as described above and documentation of the literature review.

Tools and Technologies

A number of open source tools and technologies could be used in the project, including:

Task 2. Suggesting a Common Terminology (CT) for MARC, MODS, DC, and QDC, and Designing Crosswalks from MARC to CT and from (Q)DC to CT

Objective

The objective of Task 2 is to suggest and develop a Common Terminology (CT) for MARC, MODS, and (Q)DC by examining these metadata schemas. The task designs crosswalks from MARC to CT and from (Q)DC to CT, and builds a detailed CT crosswalk covering three very different metadata schemas. It also aims to achieve metadata interoperability among them at the lexical, semantic, syntactic, and grammatical levels.

Work Steps

  • Research the MARC21, QDC, and MODS metadata schemas and the crosswalks published by the Library of Congress (DC to MODS 3.2/3.3/3.4, MARC to MODS 3.4, MODS 3.2 to DC, MODS 3.4 to MARC).
  • Build a crosswalk for the three metadata schemas, based on the crosswalks above, that addresses metadata interoperability at the lexical, semantic, syntactic, and grammatical levels.
  • Design a framework for the Common Terminology covering the element names of the three metadata schemas.
  • Document each element name of CT (e.g., why we chose it, how it achieves interoperability at each interoperability level).
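As an illustration of the crosswalk step, here is a minimal sketch in Python of what one crosswalk table might look like. The CT element names (ct:title, ct:creator, ct:subject) are hypothetical and invented only for this example; the project's actual CT may name and group elements differently. The MARC 21, MODS, and DC entries are the usual elements for title, main-entry personal name, and topical subject.

```python
# Hypothetical crosswalk table: each row links one CT element name (invented
# here for illustration) to the corresponding element in each source schema.
CROSSWALK = [
    {"ct": "ct:title",   "marc": "245$a", "mods": "titleInfo/title", "dc": "title"},
    {"ct": "ct:creator", "marc": "100$a", "mods": "name/namePart",   "dc": "creator"},
    {"ct": "ct:subject", "marc": "650$a", "mods": "subject/topic",   "dc": "subject"},
]


def to_ct_lookup(schema):
    """Build a {source element: CT element} lookup for one schema column."""
    return {row[schema]: row["ct"] for row in CROSSWALK}


# One lookup per source schema, all derived from the same table.
marc_to_ct = to_ct_lookup("marc")
dc_to_ct = to_ct_lookup("dc")
```

Keeping all schemas in one table, rather than in separate dictionaries, is only one possible design; it makes it easier to see at a glance which CT element has no counterpart in a given schema.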

Deliverables

  • Progress reports as described above and documented assessments of how well the chosen metadata schemas achieve interoperability.
  • A crosswalk for the three metadata schemas.
  • A suggested Common Terminology that achieves metadata interoperability at the lexical, semantic, syntactic, and grammatical levels.
  • A document that describes each element name of CT (e.g., why we chose it, how it achieves interoperability at each interoperability level).

Task 3. Refining the Suggested CT and Applying It to Metadata Records from the Harvard, MIT, and UIUC Libraries

Objective

The objective of Task 3 is to refine the suggested Common Terminology by applying it to three different sets of metadata records (e.g., Harvard University, which uses MARC 21; the MIT DSpace library, which uses DC; and the UIUC libraries, which use MARCXML). The developed Common Terminology will be represented in RDF and XML schemas with the aim of achieving schema definition language interoperability. We will publish it on the Semantic Web, linking CT properties and subproperties through SKOS concepts. We will define the concepts of the CT using existing MODS, (Q)DC, and SKOS concepts so that the CT can be used in mappings with the defined SKOS concepts.

Work Steps

  • Research Harvard, MIT, and UIUC metadata.
  • Revise the designed CT against the three universities’ metadata to reduce the information loss rate.
  • Identify and assess the chosen metadata schemas and records to achieve interoperability.
  • Choose sample metadata in various genres from the three databases.
  • Apply the suggested CT to the chosen sample metadata in various genres.
  • Refine the CT so that it can improve metadata interoperability.
  • Define the CT as SKOS concepts.
  • Represent the CT in RDF and XML schemas to achieve schema language interoperability.
  • Publish the CT on the Semantic Web, linking the properties and subproperties of the CT through the defined SKOS concepts.
  • Document the refined and defined CT.
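The SKOS and RDF steps above could be sketched as follows, using only the Python standard library. The CT namespace (http://example.org/ct#) is a placeholder, and the choice of skos:closeMatch to link a CT concept to a DC term is an assumption made for illustration; which SKOS mapping property is appropriate is a modelling decision for the project.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
CT = "http://example.org/ct#"  # placeholder namespace, not a published vocabulary

# Use the conventional prefixes when the tree is serialized.
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def ct_concept(local_name, label, close_matches):
    """Return an RDF/XML skos:Concept element for one (hypothetical) CT term."""
    concept = ET.Element(f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": CT + local_name})
    ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = label
    for uri in close_matches:
        # Assumed mapping property; exactMatch or others may fit better.
        ET.SubElement(concept, f"{{{SKOS}}}closeMatch", {f"{{{RDF}}}resource": uri})
    return concept


root = ET.Element(f"{{{RDF}}}RDF")
root.append(ct_concept("title", "Title", ["http://purl.org/dc/terms/title"]))
rdf_xml = ET.tostring(root, encoding="unicode")
```

A dedicated RDF library would normally be preferable for publishing; the standard-library version is used here only to keep the sketch self-contained.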

Deliverables

  • Chosen sample metadata in various genres from the three databases.
  • The refined CT, which can improve metadata interoperability at every level.
  • The CT defined with SKOS concepts.
  • The CT represented in RDF and XML schemas and published.
  • The revised CT documentation.

Task 4. Conducting the Mapping Experiments with the Developed CT

Objective

The objective of Task 4 is to conduct mapping experiments in the Python programming language: direct mappings between the metadata of two different schemas, and indirect mappings between a metadata record and the CT. Each program measures accuracy and the data loss rate to assess the performance of the mapping.

Work Steps

  • Develop Python programs to implement the direct mappings based on dictionaries or RDF, measuring accuracy and data loss rates:
    • The direct mapping from Harvard metadata format (MARC 21) to MIT metadata format (QDC),
    • The direct mapping from Harvard metadata format (MARC 21) to UIUC metadata format (MODS),
    • The direct mapping from MIT metadata format (QDC) to UIUC metadata format (MODS),
    • The direct mapping from MIT metadata format (QDC) to Harvard metadata format (MARC 21),
    • The direct mapping from UIUC metadata format (MODS) to Harvard metadata format (MARC 21),
    • The direct mapping from UIUC metadata format (MODS) to MIT metadata format (QDC).
  • Develop Python programs to implement the indirect mappings based on RDF graph formats, measuring accuracy and data loss rates:
    • The indirect mapping from Harvard metadata format (MARC 21) to the designed Common Terminology,
    • The indirect mapping from MIT metadata format (QDC) to the designed Common Terminology,
    • The indirect mapping from UIUC metadata format (MODS) to the designed Common Terminology.
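A minimal sketch of a dictionary-based direct mapping with a loss measurement might look like the following. The plan does not define the metrics, so the loss rate here is an assumed definition: the share of source fields with no target element. The crosswalk table is deliberately incomplete and the sample record is made up, both only to exercise the code.

```python
# Hypothetical, deliberately incomplete MARC 21 to QDC crosswalk. The field
# 300$a is left out on purpose so the example exercises the loss path.
MARC_TO_QDC = {"245$a": "title", "100$a": "creator", "650$a": "subject"}


def map_record(record, crosswalk):
    """Map one record; return (mapped record, list of source fields lost)."""
    mapped, lost = {}, []
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            lost.append(field)  # no equivalent element in the target table
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, lost


def loss_rate(record, lost):
    """Assumed definition: share of source fields with no target element."""
    return len(lost) / len(record) if record else 0.0


# Made-up sample record, keyed by MARC tag and subfield.
sample = {"245$a": "Example title", "100$a": "Doe, Jane",
          "650$a": "Metadata", "300$a": "200 p."}

mapped, lost = map_record(sample, MARC_TO_QDC)
rate = loss_rate(sample, lost)
```

The same two functions could be reused for an indirect mapping by passing a schema-to-CT table instead of a schema-to-schema table; an RDF-graph-based version would replace the dictionary lookup with a graph query.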

Deliverables

  • The direct and indirect mapping programs.
  • The measured performance: accuracy in terms of lexical and semantic match rates, and information transfer and loss rates.

Task 5. Evaluating the Mapping Performance and Developing the Final Paper

Objective

The objective of Task 5 is to compare the performance of the direct and indirect mappings, evaluate the results, describe the analyses, and develop the final paper. The final paper covers the roles, usefulness, effectiveness, and importance of the Common Terminology (CT) in achieving and improving interoperability among the different metadata schemas and records of the three universities. It also covers what we learn from the project and our recommendations for future work to improve interoperability, especially metadata interoperability, in sharing information.

Work Steps

  • Compare and analyze the performance of the direct mapping and the indirect mapping.
  • Describe their performance and the roles, usefulness, effectiveness, and importance of the Common Terminology (CT), including findings.
  • Recommend the best strategies for achieving and improving interoperability in mappings in future work.
  • Prepare a draft research paper and submit it for review to the faculty involved in the project.
  • Prepare a PowerPoint briefing and present it.
  • Upon receipt of written comments from faculty, revise the research paper.
  • Submit the research paper.
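One structural point that may be worth including in the comparison is the number of crosswalks each approach requires. The sketch below enumerates them for the three schemas named in Task 4; the function name crosswalk_counts is introduced here only for illustration.

```python
from itertools import permutations

SCHEMAS = ["MARC 21", "QDC", "MODS"]

# Direct approach: one crosswalk per ordered (source, target) pair.
direct_pairs = list(permutations(SCHEMAS, 2))

# Indirect approach as listed in Task 4: one crosswalk per schema into the CT.
indirect_pairs = [(schema, "CT") for schema in SCHEMAS]


def crosswalk_counts(n):
    """Return (direct, indirect) crosswalk counts for n schemas."""
    return n * (n - 1), n
```

For three schemas this gives the six direct and three indirect mappings listed in Task 4. If records also had to be converted back out of the CT into each schema, the indirect count would double to 2n, which still grows more slowly than n(n - 1) once more than three schemas are involved.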

Deliverables

  • Draft research paper of Roles of Common Terminology for Improving Metadata Interoperability
  • PowerPoint of Roles of Common Terminology for Improving Metadata Interoperability
  • Final research paper of Roles of Common Terminology for Improving Metadata Interoperability
  • A suggestion to expand the project to a wider variety of metadata schemas and metadata records, so that interoperability can be established among well-designed digital libraries to realize the proposed National Open Public Digital Library.

The Suggested Expanded Work Plan (if possible), Following Task 5

Task 6. Implementing an Integrated Search Engine for the Harvard, MIT, and UIUC Databases

Objective

The objective of Task 6 is to build a CT union catalog, a fundamental step toward Linked Open Data and a CT union catalog for a future International Open Public Digital Library (IOPDL). First, we will research existing problems in search engines and in building an integrated search engine that shares information among the three university libraries: Harvard, MIT, and UIUC. The goal is for users at the three universities to be able to access any item across the three databases. The integrated service is expected to improve work efficiency considerably for the students, faculty, and staff of the three universities. Lastly, the task aims to improve the effectiveness of search engines through the designed CT.

Work Steps

  • Identify the problems with the three universities’ search engines: why is building an integrated search engine difficult, and what has blocked collaboration on providing an integrated service?
  • Address the current problems by suggesting a possible solution.
  • Periodically merge the metadata records of the three universities.
  • Convert the merged metadata into CT.
  • Build a CT union catalog with the converted CT.
  • Provide an integrated service that allows users at the three universities to access any item across the three databases.
  • Achieve and improve metadata interoperability at the repository level.
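The merge, union catalog, and search steps above could be sketched as follows. This assumes the records have already been converted to CT element names; the element name ct:title and the sample records are made up for illustration, and a real integrated search engine would need ranking, multiple fields, and incremental updates.

```python
from collections import defaultdict


def build_union_catalog(sources):
    """Merge CT records from several libraries, tagging each with its source."""
    catalog = []
    for library, records in sources.items():
        for record in records:
            catalog.append({"source": library, **record})
    return catalog


def build_index(catalog, field="ct:title"):
    """Build an inverted index from lowercase words to catalog positions."""
    index = defaultdict(set)
    for position, record in enumerate(catalog):
        for word in record.get(field, "").lower().split():
            index[word].add(position)
    return index


def search(catalog, index, query):
    """Return records whose indexed field contains every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return [catalog[position] for position in sorted(hits)]


# Made-up records, assumed to be already converted to CT element names.
sources = {
    "Harvard": [{"ct:title": "Metadata basics"}],
    "MIT": [{"ct:title": "Linked data and metadata"}],
    "UIUC": [{"ct:title": "Digital libraries"}],
}
catalog = build_union_catalog(sources)
index = build_index(catalog)
results = search(catalog, index, "metadata")
```

Because every record carries the same CT element names regardless of its origin, a single index can serve all three libraries, which is the point of converting to the CT before merging.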

Deliverables

  • Metadata records of the three university libraries converted from MARC, DC, and QDC.
  • The Linked Open Data and the CT union catalog.
  • The integrated search engine for the Harvard, MIT, and UIUC libraries.

Task 7. Documenting, Developing, and Presenting the Final Paper

Objective

The objective of Task 7 is to document and develop the final paper. The final paper includes the problems found with the three universities’ search engines, an analysis of why building an integrated search engine is difficult, and the reasons that have blocked collaboration on providing an integrated service. It suggests a possible solution to address the current problems. It also describes the CT union catalog built by a purpose-designed automatic generator and the search engine designed for the three universities. Lastly, it covers what we learn from the project and our recommendations for future work to improve interoperability, especially at the repository metadata level.

Work Steps

  • Prepare a draft research paper for the prototype study.
  • Recommend the best strategies for achieving and improving interoperability at the repository level in future work.
  • Prepare a PowerPoint briefing and present it.
  • Upon receipt of written comments from faculty, revise the research paper.
  • Submit the final research paper.

Deliverables

  • Draft CT paper on improving metadata interoperability at the repository level.
  • PowerPoint demonstrating the integrated search engine built on LOD and the CT union catalog.
  • Recommendations or suggestions for future work to improve interoperability, especially at the repository metadata level.

Last Modified August 25, 2014