Access to the world’s literature: the global strategy

Derek Law

The Author

Derek Law is Director of Information Services and Systems, King’s College, London, having previously worked in St Andrews, Edinburgh and (briefly) Glasgow University Libraries. He has known Henry Heaney for many years as colleague, fellow committee member and occasional conspirator - as well as unsuccessful interviewee. More importantly, Henry has always proved both mentor and role model.

Abstract

Progress towards achieving the concepts of Universal Bibliographic Control and Universal Access to Publications is outlined in relation to printed publications and discussed in the context of electronic information. The unique problems relating to consistent identification of electronic materials are indicated and various metadata projects for recording and searching electronic files are outlined. The problems of access to electronic materials are discussed and various relevant international projects considered.

Article Type: Research paper
Keyword(s): Bibliographies; Cataloguing; Electronic data interchange; Electronic publishing; Identification; Libraries; Strategy.
Content Indicators: Research Implication - *; Practice Implication - **; Originality - *; Readability - **

Library Review Volume 47 Number 5/6 1998 pp. 296-300 Copyright © MCB UP Ltd ISSN 0024-2535

Access to printed literature

One of the great successes of post-war librarianship is how close it has come to providing comprehensive access to the world’s literature. So great has been that success, that we take it absolutely for granted, which in turn produces a dangerous complacency in assuming that a parallel infrastructure will just happen as we enter a digital and electronic era for research libraries. This paper will briefly review the building blocks of access in the printed environment and discuss what new building blocks are required to provide equivalent access to digital materials. The building blocks may conveniently be split into two groups, reflecting IFLA’s original core programmes of Universal Bibliographic Control (UBC) and Universal Availability of Publications (UAP).

Universal Bibliographic Control

All of the contributors to this issue will have been brought up on those twin pillars of UBC, the Library of Congress Catalogue and the British Library Catalogue. Although they filled great banks of shelves in their dress of green and blue, no research library was (or is) complete without them. But they followed somewhat idiosyncratic rules, reflecting their separate and venerable origins. The international nature of scholarly literature has been reflected in the growth of international (or at least interoperable) standards which complement, if not replace, these great works of scholarship. The Anglo-American Cataloguing Rules, the MARC standard, ISBNs and ISSNs and the growth of national bibliographies all represent a huge and successful professional effort over the last four decades to globalise and standardise bibliographic control. The dominance and constant revision of both the Dewey and LC Classification schemes have also forced order and structure on an essentially chaotic publishing output. Research libraries have introduced coherent collection development policies in an effort to focus their collections and as a response to financial pressures. As library housekeeping systems became universal in the 1970s and 1980s, co-operative catalogues grew and retrospective conversion took place on a massive scale. Although not yet complete, it has greatly enhanced access to collections throughout the world from anywhere in the world. The abstracting and indexing of journals has a longer and largely commercial history. It too has provided a near-comprehensive system of access which has kept pace comfortably with the great post-war expansion of scientific literature. Yet even in this long-standing area, Eugene Garfield’s innovative concept of citation analysis has brought a new thrust to the exploitation of journal literature.

Political and economic instability in many parts of the globe may have prevented the total establishment of UBC, but as a profession we have developed and actively maintained a vigorous and robust bibliographic infrastructure. All of this has required a constant but largely unremarked stream of diplomacy and activity of the very highest order.

Universal Availability of Publications

Universal Availability of Publications has been a particularly British-led activity - indeed IFLA’s UAP office is based at the British Library in Boston Spa. If bibliographic control is a necessary prerequisite of access to collections, inter-library lending is its acme. Of course libraries have lent books for centuries; but the development of standardised national and international systems, with first Urquhart then Line, the successive Directors-General of the British Library’s Lending Division at Boston Spa, acting as leading thinkers, advocates, drivers and promoters of internationalism, has created the hard-won rather than inevitable system we see today. It is not self-evident that subject-based document delivery systems such as that of the National Library of Medicine or co-operative systems such as that run by OCLC should be interoperable with systems run by a national library. The development of a common currency; the very act of trusting fellow professionals - and even more implausibly their readers - in small libraries in countries at the other end of the globe to act in a uniformly responsible way; the meshing of different copyright and fair use traditions and legislation, all represent quite remarkable acts of international co-operation. Not all is perfect, of course, since not everything is deliverable and sometimes the scholar must still move to the book. This author fondly remembers an occasion in the early 1970s when a baffled Arabic scholar in St Andrews could not comprehend why the Topkapi Museum would not send original mediaeval manuscripts to him on inter-library loan. Nevertheless it is broadly true that scholars anywhere can identify and acquire or gain access to the most obscure works of scholarship, wherever held.

In sum, the combined efforts of librarians throughout the world have created global systems of bibliographic description and access over the last 40 or 50 years. It might be argued that these are a natural consequence of our professional skills and interests, but even that view would have to acknowledge the sheer scale and complexity of what has been achieved. Yet having climbed the twin peaks of UBC and UAP, like Sisyphus we must now repeat the feat in a yet more complicated digital environment. And the task is much harder this time. In the world of print the field was largely ours to make of it what we would. In the new environment we are jostled by, and compete with, computer scientists, publishers, authors, learned societies and agents, all of whom feel they have a role to play and skills to offer. This is perhaps most simply illustrated by the creation of the ubiquitous URL as a standard. It requires a very particular skill to create a descriptor of an ugliness and lack of structure which makes the Library of Congress Classification seem elegant and classically simple.

Electronic UBC

A whole host of problems surrounds the electronic equivalent of bibliographic control. Some are a function of the medium while others are variations of old issues. A thread which runs through all of the problems is the failure of the academy to recognise that the problems exist. The Internet is seen as a great and liberating development, but it is not a neutral development and requires very substantial international effort if it is to be made usable for sustained scholarly communication rather than short-term gratification. The problems begin at the most basic levels.

The very act of naming and identifying electronic objects consistently is fraught with difficulty. A book is a static object which does not change over time. In an electronic environment there is a need to reference objects as they move and change over time and place. The temporary nature of URLs is notorious. This author was involved in teaching a course recently which involved citing some 64 URLs. These have changed or disappeared at the rate of four a month over the course of one semester - and that in the field of information management! Even where the URL remains constant, issues of version control and quality assurance remain unresolved. The seriousness of this problem cannot be overemphasised, for the continuity of citation is central to scholarship and without it scholarship cannot flourish. Some attempts are being made to deal with this problem, the current favourite being Digital Object Identifiers. These originate from the commercial publishing world and it is therefore not clear whether they have validity and applicability beyond the commercial sector. A significant if unquantified proportion of the material held in any library and in any medium is either non-commercial or out-of-copyright and any new system must be able to embrace everything from incunables to examination papers.

The issue of naming objects is also difficult and as yet unresolved. At present anyone can name an object with no obligation to maintain names over time. This is compounded by the fact that many of the reference points we take for granted in the print world disappear. A book published by Oxford University Press implies a set of values, standards and scholarly rigour that is understood. But an address incorporating the phrase “ox.ac.uk” could be anything from a university press to a student PC in a rented room. The persistence of object names is a long way from having a settled structure - and there is little evidence that the official bodies in scholarship understand the threat this poses.

Metadata and the description of objects are in rather better case. The Dublin Core standard first produced by Stu Weibel at OCLC has very rapidly developed international acceptance with participation in standards work from Europe, the USA and the Pacific Rim. But even here much work remains to be done. Cataloguing has historically described static and largely immutable objects. The Internet offers new genres of multimedia and even services which will require appropriate description. This work remains to be developed.

Unlike the book, terms and conditions of use must also be described for electronic materials. Many will have multiple copyright permissions, many will be licensed rather than purchased, many will have restrictions on categories of users - and these will vary according to the terms of sale rather than be inherent in the product. Although the initial success of the Dublin Core gives confidence that these problems can be resolved, a great deal of international effort will be required to create a usable system.

Searching and indexing have proved much more difficult technically than the designers of Web robots would have us believe. Web indexing systems are breaking down as their architecture collapses under the weight of data. It is increasingly common to undertake a search on Lycos, Excite or Infoseek and recover hundreds of thousands of hits in apparently random order. Much work is going on here but designers despair at the inability or unwillingness of the public to master Boolean searching and most systems still have a long way to go to beat a halfway competent reference librarian. Web searching has undoubtedly transformed the ability of searchers to acquire a whole range of current reference information, but is dramatically poor at discovering scholarship and research.

Electronic UAP

Unlike the print world, the electronic one will require validation of the rights of the user. User authentication is regarded as an essential element of electronic commerce, but it too lacks basic elements for the furtherance of scholarly activity. At present there are no good ways of proving membership of the “data club” when away from the parent institution. Scholars visiting another institution, students on vacation or researchers on field trips are difficult to validate. There is then a very knotty problem surrounding usage data. On the one hand commercial publishers wish to collect usage information as a marketing tool. They are, however, unwilling to release this information to libraries, which need it to judge whether usage justifies subscription. On the other hand many users do not wish anyone to know what they are reading or researching. Traditionally, libraries have preserved the anonymity of user data except where criminal acts are suspected. Is this a right or simply a custom?

Then there is a series of issues and old battlegrounds to revisit. Rights Management Systems are growing quickly and are promoted largely by commercial concerns. They provide many areas of philosophic contention. First, as mentioned above, the question of whether the user can remain anonymous conflicts with commercial need. Secondly, the issue of preservation remains technically, legally and operationally unresolved. Historically this has been the domain of the national libraries, but it is not clear that they will or can perform the same role in an electronic environment. We cannot reasonably expect preservation to be undertaken by publishers. And thirdly the whole issue of fair use is being revisited by publishers, some of whom declare that it does not or cannot exist electronically. Major battles need to be fought on these issues, again with little evidence that the academy understands or cares about them.

The preservation and archiving of electronic information has barely surfaced as a very complex issue. The Data Archive at the University of Essex has existed for some 25 years and has perhaps as clear a picture as anywhere of the so far intractable problems of storing, refreshing and kite-marking information. The problems are staggeringly complex technically and staggeringly expensive to resolve. Although some progress is being made on the legal deposit of commercial material, little appears to be done on the non-commercial and primary materials of scholarship. There are no standards or control or approval mechanisms for institutions or data repositories. This position may be compared with that in the UK where archives are expected to meet the BS5454 standard and the Historical Manuscripts Commission takes an active interest in the state of repositories and where archivists have specialist professional training. A new class of electronic material, what Clifford Lynch of CNI has called “endangered content”[1], is emerging, where the formal and informal records of disciplines are effectively at risk through neglect. Archives collect papers, but institutions do not sample or preserve the electronic mail or word-processed files of their scholars. Lab books are routinely preserved by scientists but it is doubtful if any institution has a policy for the preservation of digitally captured images or data from research equipment.

Network topology is barely discussed as an issue due to a naïve assumption that there will be an infinitely expanding amount of bandwidth which will somehow be made available to scholarship. And yet there is no evidence to support this view. US universities have abandoned the failing Internet provided by telecommunications companies to create Internet II as a private network attuned to their needs. In Europe the relatively modest ambition of the European Union to link existing research networks through the TEN-34 Project has been “shaped by a series of non-technical influences such as non-availability of required public services” (Behringer, 1997), while “standard PNO (public network operator) services in Europe could not fulfil the requirements of the R&D community in Europe” (Behringer, 1997). Equally the assumption that we accept a simple commercial approach to network planning is questionable. At present in the UK, bandwidth is acquired in the light of use rather than as a result of scholarly or educational policy decisions. Thus bandwidth expands at a great rate to the East Coast of North America to meet traffic growth. There is almost no debate on whether policy should drive such acquisition and route bandwidth, say, to Southern Africa, then India, Singapore, Australia and then the West Coast of the USA, opening up markets and scholarship to what is sometimes called UK Higher Education Limited. There is a creeping form of cybercolonialism in the assumption that only the USA has digital material of value to the world. It is interesting to note the recent decision of the Australian Vice-Chancellors to use network charges to discriminate against overseas Web sites and in favour of Australian ones (THES, 1998). No discussion appears to take place of how the products and output of small learned societies are to be mirrored around the world and what standards and quality controls will apply to mirror sites.

Again the scholarly community is silent while the commercial giants of the STM world dictate the shape of electronic scholarly communication - despite the fact that large scientific publishers are aberrant rather than the norm.

Nor is the network yet totally robust. A recent Dilbert cartoon suggested, pointedly and with uncomfortable accuracy, that all of the time saved through automation in the information age had been lost by people sitting at Web browsers waiting for pages to load. Networks do not yet, for example, give the reliable quality of service required for multicasting, while video clips have all the power, quality and assurance of early silent films. It should be self-evident that for research institutions working at the leading edge of scholarship and indeed telecommunications, the standard services provided by Internet Service Providers will always be inadequate.

A more positive element which is emerging in the electronic era is the broadening of what constitutes content. Services such as the Arts and Humanities Data Service[2] based at King’s College London or the excellent SCRAN project[3] funded by the museums of Scotland are much involved in the digitisation of museum and archive collections. This is happening fast and brings relevant experience in activities such as new licensing models and standards. It also highlights the role of curators in the digital environment as relating to presentation as well as preservation. But again there appears to be little concerted effort by the official organs of scholarship to build formal cross-domain linkages.

International and Non-Governmental Organisation activity

Although this paper has argued that the academic community has failed to appreciate the threats to scholarship in the Internet - a Pandora’s box disguised as a treasure chest - it would be unfair not to note that some steps are being taken to address some of the issues highlighted.

ACLS - the American Council of Learned Societies - recently held an international round table to look at computing and the humanities, with some emphasis on economic and institutional issues[4].

CNI - the Coalition for Networked Information - has become a de facto international forum for librarians, computer scientists and senior university managers and teachers to meet and consider many of the issues, but it has no authority to act.

ICSU - the International Council of Scientific Unions - has held several meetings and a major conference in Paris with UNESCO in 1996 (Shaw and Moore, 1996) to look at future patterns of scholarly communications. However, it appears to see its role as broker between interested parties rather than visionary.

IETF - the Internet Engineering Task Force - is the forum where most of the technical decisions are made on the future of the Internet, but it has little input from information professionals.

IFLA and ICA - the International Federation of Library Associations and the International Council on Archives - signed the Beijing Agenda in 1996 to improve their co-operation on issues of common concern (IFLA Journal 23, 1997).

UNESCO has launched a programme through its information and informatics activities to promote the digitisation and dissemination of public domain information[5].

This is of course a partial list, but it is intended to demonstrate that the efforts made thus far are fragmented, unfocused and incoherent compared with the achievements of many of the same groups in the print on paper world.

Conclusion

Thus we have reached a position where we are building our electronic house on shifting sands. If permanent global access is to be ensured, now is the time to create forums and activities where we can replicate in the coming years the success we have achieved in managing access to the printed word over recent decades. We must take the legacy of UBC and UAP and transpose it into a new vision for an electronic future. But visions are neither easily developed nor readily translated into action. Henry Heaney once famously paraphrased the verses of the prophet Joel that “old men dream dreams…[and] young men shall see visions”[6], adding that unfortunately the world was forever run by the middle-aged. But perhaps behind the characteristically elegant bon mot Henry had in mind that other Irishman, Yeats, and his view of the university as the home of “reckless middle age” (Finneran, 1989). For long before Glasgow had become a European City of Culture, Glasgow University Library was a home to cosmopolitan influences, its librarian open to the thinking of research librarians throughout the world and its services developing to meet the international challenges of a global information economy, albeit with rather less overt display than others who strutted and fretted on the public stage. A little reckless middle age might be just what is needed to deliver the digital library.

Notes

1 Discussed in an unpublished paper given at the European Union Telematics Conference in Barcelona, February 1998.

2 http://www.ahds.ac.uk/

3 http://www.scran.ac.uk/

4 ACLS Occasional Paper No 41 at http://www.acls.org/op41-iv.htm

5 http://www.unesco.org/webworld

6 Book of Joel ii, 28.