A place to share tools and techniques for migrating CONTENTdm collections to the DRC.

Migration Scripts and Tools

Lynna Cekova's documentation and Java app: https://dev.ohiolink.edu/svn/Scripts/Documentation/CONTENTdm%20to%20DRC/
John Millard's example PHP ingest script and sample data: Ingest Script

Using the CONTENTdm OAI Interface as a Data Source

This XSLT from Tom Habin at UIUC may be useful: http://dlf.grainger.uiuc.edu/dlfcollectionsregistry/oai/oai_dc2csv.xsl
From his message on OAI-general:

Some Notes on CONTENTdm Compound Objects

Compound objects are represented as individual page (or other label) images bound together by a compound object description file (the .cpd file), a standard XML document that represents the structure of the compound object. There is one .cpd file for each compound object, and it is located in the /image subfolder of the CONTENTdm collection folder, alongside the image files that it describes.

Because the .cpd file is standard XML, it should be straightforward to apply an XSLT transform to it to extract the page image filenames and output a formatted contents manifest. With a little more work to retrieve the objects themselves, the entire bulk submission package can be constructed. Note that this technique is hypothesized in Lynna's proof of concept.

See an example .cpd file from the Letters of John Browne Collection at Miami University.
See an example XSLT stylesheet that successfully transforms a .cpd file into a DSpace contents file.

In the tab-delimited export from CONTENTdm, there is a metadata line for each digital object as well as for the .cpd that binds them together. If you created metadata only for the object as a whole, you can ignore the individual page records. If you applied metadata to each page image, you will need to find a way to merge the data into an appropriate DSpace record, as DSpace appears to use a one-record-per-intellectual-object model.
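
The step of turning a .cpd file into a DSpace contents file, described in the compound object notes above, could also be sketched without XSLT. The Python below is only an illustration under stated assumptions: the element names (cpd, page, pagefile), the sample filenames, and the helper names page_files and contents_manifest are hypothetical and are not taken from the linked example files. It assumes a contents file that simply lists one bitstream filename per line.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The element names (cpd, page, pagefile) are an
# assumption about the .cpd layout; check them against a real file.
SAMPLE_CPD = """<?xml version="1.0"?>
<cpd>
  <type>Document</type>
  <page>
    <pagetitle>Page 1</pagetitle>
    <pagefile>101.jpg</pagefile>
    <pageptr>100</pageptr>
  </page>
  <page>
    <pagetitle>Page 2</pagetitle>
    <pagefile>102.jpg</pagefile>
    <pageptr>101</pageptr>
  </page>
</cpd>
"""


def page_files(cpd_xml):
    """Return the page image filenames in document order."""
    root = ET.fromstring(cpd_xml)
    # iter() walks the whole tree, so pages nested inside other
    # elements are found as well.
    return [
        el.text.strip()
        for el in root.iter("pagefile")
        if el.text and el.text.strip()
    ]


def contents_manifest(cpd_xml):
    """Build the text of a contents file: one filename per line."""
    return "".join(name + "\n" for name in page_files(cpd_xml))


if __name__ == "__main__":
    print(contents_manifest(SAMPLE_CPD), end="")
```

Writing the returned text to a file named contents in the item directory, next to the retrieved images, would be the remaining step; retrieving the images themselves is not covered here.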
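
For the case where metadata was applied to each page image, one possible way to fold page-level values from the tab-delimited export into a single record per compound object is sketched below. Everything specific here is an assumption made for illustration: the column name "CONTENTdm file name", the sample rows, the "; " separator, and the helper names read_rows and merge_compound. The list of page filenames belonging to a compound object is assumed to come from its .cpd file.

```python
import csv
import io

# Assumed name of the export column that holds each record's filename.
FILE_COL = "CONTENTdm file name"

# Hypothetical export: one line for the .cpd and one line per page image.
SAMPLE_EXPORT = (
    "Title\tDescription\tCONTENTdm file name\n"
    "Letter, 1820\t\t12.cpd\n"
    "Page 1\tAddress leaf\t10.jpg\n"
    "Page 2\tSignature\t11.jpg\n"
)


def read_rows(text):
    """Parse the tab-delimited export into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def merge_compound(rows, cpd_name, page_names, merge_fields):
    """Return a copy of the object-level record with page values merged in.

    For each field in merge_fields, distinct non-empty page values are
    appended to the object-level value, joined with "; ".
    """
    by_file = {row[FILE_COL]: row for row in rows}
    record = dict(by_file[cpd_name])
    for field in merge_fields:
        values = [record[field]] if record[field] else []
        for page in page_names:
            value = by_file[page][field]
            if value and value not in values:
                values.append(value)
        record[field] = "; ".join(values)
    return record


if __name__ == "__main__":
    rows = read_rows(SAMPLE_EXPORT)
    merged = merge_compound(rows, "12.cpd", ["10.jpg", "11.jpg"], ["Description"])
    print(merged)
```

Fields not named in merge_fields keep the object-level value, which matches the simpler case where the page records can be ignored.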