Digital Preservation and Metadata

Digital Preservation and Metadata: History, Theory, Practice. Lazinger, Susan S. Englewood, CO: Libraries Unlimited; 359pp. Price: $55 (ISBN: 1-56308-777-4)

The publisher’s blurb describes this as a long-needed guide for anyone involved in the preservation of digital information and a “must for librarians, archiving professionals, faculty and students of library science, administrators and corporate leaders”. The author herself describes it as having changed from a collation of the latest theories to a history of digital preservation. In sum, then, it is intended as a basic primer for the non-technical reader with a professional interest in information management. This produces a simple, homely but lucid style happily bereft of real jargon; what jargon there is is well explained. There is, however, an occasional and unintentionally (?) amusing use of mixed metaphors such as “hair raising stories about vital data that is gone with the wind are scattered through the literature… like raisins in rice pudding”, which enlivens the tale. The book could be considered a comprehensive literature review, and the author has clearly been extremely thorough in tracking down an exhaustive record of the literature. This is then paraphrased and synthesised, but not critically assessed.

The book begins with a chapter which sets out the reason for the book and examines why digital preservation is an issue. It is really an extended literature review looking at inadvertence, accident or misfeasance as reasons for data loss and the damage this can cause.

Chapter 2 considers what to preserve. It considers the definition of digital resources and assets and notes how inconclusive much of the work on typologies has been. It also contains the first major example of according equal and uncritical status to the literature. The claim that the DOI will be seen as the electronic equivalent of the ISBN for the 21st century is an unrealistic, inflated and partisan view from one stakeholder community. The claim is however simply recorded rather than questioned.

Chapter 3 is perhaps the weakest because of the tendency to review the literature rather than the underlying issues. It covers the responsibility for archiving in a fairly thin chapter describing stakeholders and potential responsible agencies. The bulk of the text tends to focus on these agencies and on legal deposit. There is little examination of the potential role of either publishers or aggregators in archiving. Although the role of publishers is touched on, the aggressive stance adopted by some of them on this issue has tended not to appear in the literature and so is not exposed. Similarly, although aggregators such as OCLC are mentioned, the development of a cadre of quasi-commercial models based on the serials agent’s back-issues department in the paper world has not been fully explored. JSTOR (which is mentioned elsewhere in the text) is perhaps closer to this model than a conglomerate such as OCLC, and mention is made of UMI in particular niche markets. By extension there is no substantial discussion of the threats to long-term preservation from publishers and commercial aggregators. The focus on commercial data also tends to obscure the issues of responsibility which might apply to authors or to their institutions, and largely ignores the huge quantity of non-commercial data which enriches the Internet so much. Again these issues are raised but might perhaps have benefited from more prominence.

Chapter 4 offers useful coverage of the debate over migration versus emulation. It opens up the topic of processing centers and centralized data stores. The assumption is made that this role will most likely fall on existing utilities such as OCLC. However, although the Open Archives Initiative is discussed, this comment predates the real push to OAI which the Budapest Declaration seems certain to promote, and which may well then shift the focus of the debate on responsibility for preservation. Perhaps even more important than the question of what type of center might emerge is the very real problem that there are no agreed standards for such centers. There is a full discussion of the projects looking at different standards for preservation, but there has been no discussion in the literature of how to define what is a good center using good standards and what is a bad center misusing good standards. It is that element of quality assurance in data management which will give data owners the confidence to make deposits, rather than a concern as to whether the centers are commercial companies, major universities or national libraries.

For the practitioner Chapter 5 on costing models is in many ways the most useful and one where the strength of an exhaustive literature review is most obvious. It contains useful diagrams as well as detailed costings and methodologies for costing and is a model of its kind.

The following two chapters form a second section which describes models, formats and standards. Here the jargon begins to emerge but it is all lucidly explained. This section is tantamount to a reference guide to all of the key standards and acronyms with a full explanation of how metadata works. Every key standard is discussed in clear prose and anyone who has ever confused their Warwick Framework with their Dublin Core can sleep easily knowing that simple descriptions are readily available here.

The final third of the book, one hundred pages, consists of a directory of selected data archives in the United States and of international digital cultural heritage centers and sites and electronic data archives. Although the list is selective it uncovers a huge and rich vein of projects and data centres, arranged by state in the American case and by country for the rest of the world. Inevitably one can quibble at sins of omission but the selection is a large and rich one. The URLs were checked by the author in September 2001 and a satisfyingly large number still worked six months later, albeit through redirection in some cases.

The whole is completed with a bibliography of twenty-seven pages. This does not claim to be exhaustive but is satisfyingly rich and particularly notable for its international coverage.

The book does achieve its aim of being a basic primer for the non-technical. As such it is substantial, comprehensive and authoritative. It is always somewhat unfair to criticise a book for what it does not set out to be. However, the role of dispassionate reviewer of the literature creates two weaknesses. Firstly, because the book presents a review of the literature, there is no analysis of topics which have hitherto been ignored. Some issues and stakeholders are therefore ignored or underplayed. For example, although archivists are part of the target group for the book, their potential contribution to the subject is underplayed. Their contributions to digital debates are discussed but not the crossover lessons from their experience with paper archives. In fairness this is part of a much wider neglect of the similarities between the traditional role of archivists in the paper-based domain and the lessons they have learned which are relevant to electronic archives. Secondly, the emphasis on reviewing the literature is pursued from an even-handed position which produces an unwillingness to adopt or promote a view. Every side of a question is exposed and authorities cited, but all are accorded equal courtesy and status. Where the literature is thin the book reflects that. The author is clearly hugely knowledgeable on the theme and, even as a textbook, the book might have benefited from some relaxation of her self-imposed impartiality to allow her own opinions and views expression.

The book’s most irritating feature is a desire to reduce the content to bite-sized chunks by numbering large quantities of paragraphs in a detailed hierarchical taxonomy, presumably for ease of reference by and for students. However, the use of headings and subheadings such as “4.3.1.4.3. Digital Time Stamping” soon becomes a hindrance rather than a help and detracts from any sense of narrative flow.

Perhaps the greatest strength of the book is its recognition that this is not simply a global issue in the abstract but one which is being addressed by many groups in many countries, and that no single answer resides with any of them. This recognition of the need to share experience, resources and lessons is clearly brought out and enriches the book substantially.

All in all this makes an excellent primer for the subject.