
The Universal Library: Realising Panizzi’s Dream

Introduction

Sir Anthony Panizzi famously organised the British Museum Library based on 91 Principles or cataloguing rules. Behind his passionate attention to detail lay an equally passionate philosophical belief in the importance and value of libraries to all: “I want a poor student to have the same means of indulging his learned curiosity, of following his rational pursuits, of consulting the same authorities, of fathoming the most intricate inquiry as the richest man in the kingdom”[i].

This principled view of the function of libraries and their collections still resonates today – at least with librarians – and remains at the heart of the drive to move libraries into a digital world.

However, recent public discourse has come to be seen as dominated by the big beasts of the new information landscape and by their aggressive and often crude lunges to occupy the information space. Four recent examples demonstrate this very visible appetite for confrontation.

In the United States in early 2012, the proposed Research Works Act pitted publishers against scientists and led, amongst other things, to a proposed boycott of Elsevier by 7500 scientific authors, as a response to the company’s support for the Act. The language soon became heated: “The US Research Works Act would allow publishers to line their pockets by locking publicly funded research behind paywalls”[ii]. The weight of public disapprobation, much of it organised through social networking, soon forced a withdrawal of the Act and a hasty climb-down by Elsevier. At almost the same time another publisher, Penguin, withdrew from its partnership deal with OverDrive, the largest provider of ebook and audiobook lending to libraries, citing security concerns. OverDrive works with about 7,500 public libraries in the United States and over 1,000 publishers, and has access to over 100,000 works which it lends to library users. Most commentators felt the withdrawal had more to do with feared loss of sales than with security.[iii]

Nor is this dystopian landscape confined to libraries and publishers. Behemoths such as Google can also behave in what are seen as arbitrary and authoritarian ways. In the latest row over personal privacy, the change in Google’s privacy settings has sparked government-led requests to force the company to change, using legal coercion if necessary. France has asked European data authorities to investigate this pooling of user data, which began on 1 March 2012.[iv] And finally one might consider the anguished debate over Wikileaks and whether the US legal system should pursue Julian Assange. Liberal and respected columnists such as Clay Shirky fret over the balance between transparency and privacy and over how to make governments accountable.[v]

And yet behind this very visible, very public and apparently violent clash of information cultures, a quiet transformation has been going on which the chapters of this book collectively describe. We can see the different sectors of the information world working together. We can see the aggregation of resources; the building of portals and services; the creation of new resources. And behind this sits a sharp-eyed concern with ROI (return on investment) and business models, applying just as much to the public sector as to private business. This partnership approach may be less visible and less vocal, but it is infinitely more effective.

Each of the chapters in the book describes in detail a single initiative, but collectively they present a picture of an emerging ecosystem where existing and new players in the information world not only co-exist but actively support each other in a sophisticated and complicated environment. In order to summarise the themes which emerge from the chapters it is sensible to explore particular aspects. But almost all of the projects have multiple facets which involve at the very least the need to have some kind of sustainable economic model and the need to aggregate resources and so their division may seem somewhat arbitrary. Additional examples will also be introduced to demonstrate that these are not isolated examples but rather exemplars of a steady and ineluctable movement to a new world where Panizzi’s dream can be realised.

Aggregation

Historically libraries and librarians have been zealous advocates of aggregation. Beginning with union catalogues, this has led to one of the great if unsung triumphs of international diplomacy, which is enshrined in IFLA’s twin programmes of Universal Bibliographic Control (UBC) and the Universal Availability of Publications (UAP). These have been based not just on the necessary adoption of common standards but on the acceptance of a common philosophy which is the direct descendant of Panizzi’s ambition. Thanks to the success of these programmes, it is broadly possible to identify any book or article ever published in any language and in any country, to request it through one’s home library and to have the work, or a copy of it, delivered to that library within a short space of time. We take this astonishing feat for granted and yet there is no inherent reason to have, say, a public library in Lithuania receive an article from a medical library in Hawaii. But the system manages this extraordinary achievement routinely.

The habit of co-operation seemed to have been sidetracked for well over a decade. But after a period when library groups and consortia became perhaps over-focused on co-operative purchase, we can again see the emergence of the aggregation of resources and adding value to them as a hugely important phenomenon. This time aggregation focuses as much on the sharing of collections as the listing of them.

Two chapters refer to two quite different models. accessCeramics displays a number of interesting characteristics. Firstly, it is image based, and images have proved a powerful educational tool for twenty-first-century users. Secondly, it is, at least in a loose sense, user created, although there is a vetting process for contributions. And thirdly, it sees adding value to institutional brand and reputation as a powerful benefit. Like most of the activities described it has been created with much charitable and grant support. But the paper also gives a very honest sense of the financial challenges such a project faces when run by a small institution.

But there are other forms of aggregation. One is the recreation of a document which has been scattered. One such example is the Codex Sinaiticus. The Codex is an ancient, handwritten copy of the Greek Bible which came to the attention of scholars in the 19th century at the Greek Orthodox Monastery of Mount Sinai. It became scattered with further material discovered in the 20th and even 21st centuries. Parts of the Codex are held in four libraries around the world. The principal surviving portion is now held by the British Library. A further 43 leaves are kept at the University Library in Leipzig. Parts of six leaves are held at the National Library of Russia in Saint Petersburg. Further portions remain at Saint Catherine’s Monastery on Mount Sinai. A major project[vi] has preserved, digitised, transcribed and “re-united” this important manuscript and made it available on the web.

A second is the creation by institutions of a collection which had not previously existed. A small example of this is the Red Clydeside collection in the Glasgow Digital Library.[vii] During the period between 1910 and 1932 the city of Glasgow was witness to an unparalleled wave of working class protest and political agitation which challenged the forces of capitalism and also, on occasion, directly challenged the state itself. This was strongly suppressed by governments fearfully watching what was happening in St Petersburg and Moscow. The events and people who shaped this period forged an enduring legacy which remains part of the political and social fabric of the city to the present day, and which is known quite simply as Red Clydeside. But the clandestine nature of this movement left its records scattered and fragmented. The Glasgow Digital Library then collected and digitised some 220 items drawn from local archives and special collections to build a coherent record of what had happened almost a century previously.

The third and most ambitious model is the assembly and enrichment of known collections from multiple international sources. Emory University has done this for the Atlantic Slave Trade.[viii] The Voyages database assembles searchable records collected by scholars from all round the Atlantic basin and has value added through the addition of maps, images, data and name indices for individual Africans who were transported. Seen as a dynamic rather than a completed project, scholars who discover new information can add it to the database, and thus share it with their colleagues.

Building Infrastructure: the long haul

The sort of aggregation just described focuses on what historically have been seen as special collections and archives – rare and unusual material. But just as much thought is needed when considering the ordinary: the material which has formed the vast physical bulk of what was stored in paper-based libraries, which for most libraries is journal literature. Again quite new and complex models can be seen to be emerging. We have moved very rapidly from a time when shared library storage was the hot agenda item to considering models where resources are stored in what might be considered a version of the Cloud and made accessible to multiple user categories, each with differing rights. The growth of JSTOR is perhaps the best example of this. It now has a widely known brand name, a positive reputation and a huge client base. With 7000 participating institutions, holdings of 1500 academic journals and 600 million accesses a year, its growth has been phenomenal. It is also a broad community resource which is creating sustainable business models shaped by the needs of several groups – libraries, publishers and scholars – and not just one. And again it embraces change and adaptability rather than stasis as the new digital landscape develops, changes and matures.

The growth of multinational science publishers has tended to disguise the fact that these are in the strict sense aberrant forms. The backbone of scholarship and scholarly publishing remains the small learned society. Project MUSE recognised this almost twenty years ago and has explored how smaller journals in the humanities and social sciences can afford to have a digital presence. And the resource pressures on these societies can be as much technical as financial. How are they to gain access to competences and standards and to make decisions well outside their professional domain? MUSE has, of course, now moved beyond journals to ebooks, but it again displays all the benefits of collaboration, aggregation, partnership and the recognition of a linked ecology from researcher to reader, with different partners using different skill sets to manage the process of dissemination.

Longevity

One of the beauties of copyright libraries is that their mission is evident and simple. Give them a copy of a book or even a manuscript and they will endeavour to keep it forever. And they have done this effectively for literally hundreds of years so far. And even when war, tempest or age prevent this, there will usually be copies somewhere else. How different the fate of computer files and digital objects. Technological obsolescence, media ephemerality, and the content deletion policies of computer centres all conspire to offer very little in the way of guaranteed preservation. Some of the responses are described in other chapters but we can already see that this is both a complex issue and one where co-operation and collaboration have been in play for some time. Again the following examples show the power of working together.

The wonderfully titled LOCKSS (Lots of Copies Keep Stuff Safe), based at Stanford University Libraries, was initiated in 1999 and is an international community initiative that provides libraries with digital preservation tools and support so that they can easily and inexpensively collect and preserve their own copies of authorized e-content. LOCKSS provides open-source software and support so that libraries can preserve today’s web-published materials for tomorrow’s readers while building their own collections and acquiring a copy of the assets they pay for, instead of simply leasing them. It is a decentralized digital preservation infrastructure. LOCKSS preserves all formats and genres of web-published content.

PORTICO (see Chapter xx) was set up in 2005 and is another example of the partnership model bringing together libraries, publishers and funders. As of 2012[ix] it preserved 12,555 e-journal titles, 123,586 e-book titles and some 46 D-Collections, working with 140 publishers (representing over 2000 societies and associations) and 728 libraries. This simple list of numbers shows an impressive requirement for collaboration but conceals the huge activity which goes on underneath to create a sustainable economic model.

The KEEP Project[x], funded by the European Commission, looks at the preservation of emulation environments which will allow access to all sorts of digital outputs which form the cultural heritage of the late twentieth century and beyond. Using computer games as its testbed it has shown many of the systemic difficulties which are emerging in the preservation of digitally born resources. At least some of these stem not from technical difficulties but from such things as copyright legislation framed in what now seems a different world.

Reducing the data burden relies on promoting data interchange standards, so that data is held once. This facilitates systems integration both across the institution and within wider stakeholder groups. It fosters the community cloud. This will never eliminate keeping data on site and on campus but will reduce data volumes. This can both save money and add value. The Chronopolis network described by Minor and Kozbial is a perfect example of this. The project leverages high-speed networks, mass-scale storage capabilities, and the expertise of its partners to provide a geographically distributed, heterogeneous, highly redundant archive system.

Tools and Services

Digital environments add a new dimension to what libraries can do. Technology allows us to explore and develop new services to meet changing user needs and to match the way they work, live and use technology. Expectations of immediacy are driven by everything from instant Kindle purchases to confirmed restaurant reservations for the same evening. Mobile technologies have led to a step shift in user experience and expectations.

The study from the Borough of Manhattan Community College is a perfect expression of this, where its electronic reserves and streaming video service are tailored to meet the needs of 20,000 commuter students. Almost as important is that its stated mission is to leverage technology to innovate in support of learning.

A more recent development has been to look at providing the infrastructure which allows others to contribute. Perhaps the most obvious form of this is portals, two of which are described in chapters by Hogenaar and Smith.

As is often the case, the Netherlands has developed an interesting and clear national structure in the NARCIS system, which Hogenaar describes. It has a clear focus – research – and is based on a number of collaborating institutions representing different but closely related communities. They have combined to create a portal which sits comfortably within the SURFfoundation, which, like JISC in the UK, links all researchers. It is an excellent model of what can be achieved by building on existing relationships.

Smith describes a portal framework for humanities scholars based on a grant-funded study carried out at Emory University. Again the key element of the study is that the framework is designed to encourage community engagement and to respond to expressed user needs. It also begins to address one of the major challenges facing cultural heritage bodies in a digital age – the identification and exposure of “hidden” collections.

Perhaps the largest portal of them all is Europeana[xi], another EU funded project. It provides a single access point to millions of books, paintings, films, museum objects and archival records that have been digitised throughout Europe. It is an authoritative source of information coming from European cultural and scientific institutions and links to over twenty million objects, coming from more than 1500 institutions in 32 countries. But its very size has uncovered other issues. Although it is an astonishing example of collaboration, aggregation and standards development it has had more difficulty in defining its audience and their needs. This in turn has led to some complicated and arguably needlessly arcane issues between Europeana, the Europeana Libraries Project and the European Library over who should hold and provide access to which content.

Standards issues have been at the core of library co-operation for generations. From cataloguing rules to MARC, from Dublin Core to OAI-PMH, the creation and more importantly the application of standards has driven forward library co-operation. And this will remain the case. One such tools and standards issue is addressed by Starr in her description of EZID. Persistent identifiers are a longstanding issue in the digital world. While standards such as DOI have come from the commercial world, it is less common to see standards emerging from the library world, in this case the California Digital Library. Most importantly EZID recognises the sheer variety of the things libraries collect, which range well beyond the books and journals which dominate the traditional publishing world. EZID can assign identifiers to anything: scientific datasets, technical reports, audio files, and digital photographs, for example.

Born digital collection building

Johnson and Palmer describe the experience of the University Library at Indiana University Purdue University Indianapolis. A key element here has been working with local community groups to aggregate their skills with collections wished for locally by the community. Although not all of them are in the strict sense born digital, it is again the development of partnership models which has led to success. Indeed it allows them to describe such collaboration with one of the most striking phrases in this book: “organic relationship development has been wildly beneficial”.

Perhaps the largest growth of born digital material lies in institutional and subject repositories. OAIster[xii] claims to contain over 25 million records from over 1100 institutions. These include records for digitized (scanned) books, journal articles, newspapers, manuscripts, digital text, audio files (wav, mp3), video files (mp4, QuickTime), photographic images (jpeg, tiff, gif), data sets (downloadable statistical information), theses and research papers. OpenDOAR[xiii] (The Directory of Open Access Repositories) lists over 2000 repositories, just over 80% of which are institutional with the number steadily growing, and almost half in Europe. These provide access to ten million items. With most of the material crawled by Google and available through Google Scholar searches, this has become a powerful mainstay of the new ecology. The repository movement is closely aligned to the Open Access movement, which has proved hugely contentious. But irrespective of views on Open Access, repositories are here to stay.

Monographs

Most discussion on the digital environment tends to focus either on journals or special collections. Much less thought seems to have been given to the monograph, which historically has been the backbone particularly of humanities and social science scholarship. The emergence of ebooks and the flailing search by major publishers for a sustainable economic model has disguised some interesting developments.

One fascinating aspect is covered in Gorrell’s chapter on ebooks and audiobooks. It has been suggested[xiv] that only 29% of library patrons have e-readers and that while so many publishers refuse to make e-books available to libraries, librarians would do better to wait until the hugely volatile market has settled down and focus resources on the needs of the majority. But the EBSCO model which Gorrell describes offers libraries a very attractive combination of aggregation, professional support and technical skill. As importantly, this new medium is not seen as separate, different and awkward, but is integrated into an existing platform with which users will be familiar. Again the key is aggregation, but this time of delivery platforms and tools. And most importantly of all libraries are seen as partners for the delivery of a commercial product and not as museums of the book.

There are well known projects which are delivering collections of digitised free e-books, but they do tend to focus on the aggregation of content rather than the integration of delivery mechanisms. The oldest of these is the Gutenberg Project, which has been running since 1971, is run by volunteers, has over 38000 books and is adding 100 titles a week. But there are now other major sites such as Many Books, Munseys, Feedbooks and Open Library.

OAPEN (Open Access Publishing in European Networks)[xv] began as a publisher-led but EU funded project to explore the feasibility of publishing scholarly monographs in a sustainable open access model. It now has around 1000 monographs listed on its website, mainly from university presses. Perhaps unsurprisingly given its humanities background, the principal aim is the sharing of knowledge rather than the maximising of profit, but sustainability is a key driver. In this open access model the monograph is made freely available: readers (or their libraries) do not have to pay to read it online. Rather, the costs of the publishing process (e.g. peer review, typesetting, marketing) are recovered through alternative routes such as research grants, institutional funding or perhaps through readers purchasing print editions or particular formats for their iPad or Kindle. The project is now being extended more widely to the UK. OAPEN-UK[xvi] is an Arts and Humanities Research Council and JISC funded project exploring the issues impacting upon the publishing of scholarly monographs in the humanities and social sciences (HSS). OAPEN-UK has two strands: an open access pilot gathering data on the usage, sales and citations of sixty monographs, and a wider research project which explores the environment for open access publishing. The project is working with Taylor & Francis, Palgrave Macmillan, Berg Publishers, Liverpool University Press, University of Wales Press, research funders and universities, to understand the challenges and steps required to move towards an open access publishing model for scholarly monographs.

A different model again is described by Woodhead, whose mid-sized firm specialises in science and technology monographs and has been buffeted by the pace of technological change. But again key messages about partnership and flexibility come through. The company has made huge efforts to find out what the market wants and then deliver that, rather than attempting to dictate what it shall have. It has then chosen to work in partnership with an aggregator, again ensuring that customers will interact with an existing delivery platform aimed at supporting the research process.

Funding

Higher education libraries have not hitherto had to undertake a great deal in the way of financial planning. In the United Kingdom, for example, the library budget is typically last year’s figure plus (or sometimes minus!) a few per cent. Until very recently the Library was seen simply as a necessary if expensive part of the fabric of any university. Of course the budgets were and are very well and very professionally managed, but little was needed in the way of business planning and such revenue generation as was undertaken tended to be either for endowments or was a way of paying for new services whose costs were readily identifiable, whether photocopying, interlending or online searching. Libraries inhabited a dependency culture where sustainability was not a consideration. In essence the Library was simply a topsliced cost from the University budget, or delegated to faculties as is often the case in continental European universities.[xvii]

We do, of course, know quite a lot about existing library costs – at least about direct costs – for example through the long time series of the SCONUL (Society of College, National and University Libraries) statistics in the UK. But it is important to note that these simply do not cover indirect costs such as estate, heating and lighting, security, maintenance and building amortisation, which are both increasing costs to the institution and costs which may look quite different for digital libraries. Nor has there been any significant analysis of these figures. Indeed, the whole issue of total cost of ownership is a hugely neglected topic.[xviii] Charles Bailey’s comprehensive bibliography of scholarly electronic publishing gives evidence of this[xix]. Only fourteen pages out of over 450 – barely 3% – cover the economic issues associated with the whole digital environment.

This neglect may be changing and indeed many of the chapters in this book offer comment on the business models they have explored or adopted. And two notable trailblazers do offer insightful views on aspects of the economic model. The chapter by Griffiths and King summarises work undertaken over two decades and explores the important topic of return on investment. This analysis of tools and metrics allows a fascinating exploration of such important metrics as contingent valuation and return on investment which are increasingly valuable tools in exploring and explaining the indirect benefits libraries can bring to their parent organisations.

A newer name to enter the battleground of the costing of scholarly publishing is the economist John Houghton. His seminal report[xx] in 2009 opened up the debate on the costs of scholarly publishing, the profits of publishers and where the costs should lie. His chapter considers the work he has done since then in other countries, which confirms the thesis that there are alternative models to the historic ones and that these alternative models must be explored.

While most of the authors of chapters focus on financial matters, there are two striking discussions, on the accessCeramics financial model and on the Chronopolis network. Dahl describes how many projects begin, with local enthusiasm and grant funding. But as the resource grows new financial models must be explored. accessCeramics has shied away from subscription models, in part on principle, and the paper clearly articulates the options and difficult choices it faces in making content available. Chronopolis also began as a grant funded programme but has made decisions to seek a mixed funding economy for the future. This quite different approach reflects different stakeholder groups and different types of data. But what is perhaps most striking with these as with other chapters is the degree of sophisticated economic understanding that underpins their thinking. The digital economy is not just about raising funds but about concepts as varied as monetisation and cost-benefit analysis and intangible institutional benefits.

Conclusions

The diverse range of contributors to this volume demonstrates that the information world is populated by a naturally collaborative set of innovators. For over a decade they have slowly grappled with transient technology and with developing and changing standards, whilst absorbing huge shifts in the forms and nature of communication. But throughout it all they have begun to create linkages which allow us to begin to perceive the emerging shape of a viable new information world, where no one group dominates, where collaboration is to the advantage of all and where we still have valuable products and services to offer users in the prosecution of their lives. The mantras listed in the chapter on Project MUSE pithily sum up the lessons learned and therefore bear repeating. If the digital world were old enough to have grandmothers, these would be the pieces of good advice that they would recite and pass on:

· Develop the rationale for ongoing investment into the platform

· Build a robust stable of key stakeholders from all communities

· Listen to your customers and build relationships based on trust

· Communicate extensively in person and one-on-one with partners

· Know your readers, editors, authors, researchers, students, and librarians

· Embrace the digital chaos of tomorrow – do not fear it

· Invest in your people

· Commit to project management

References

[i] Quoted in Fagan, Louis (1880). The Life of Sir Anthony Panizzi, K.C.B.

[ii] http://www.guardian.co.uk/science/2012/jan/16/academic-publishers-enemies-science

[iii] Laura June Feb 13, 2012 The Verge http://www.theverge.com/2012/2/13/2795791/penguin-kills-library-ebook-lending-deal-with-overdrive

[iv] http://www.guardian.co.uk/technology/2012/mar/01/google-privacy-policy-changes-eu

[v] Wikileaks and the long haul by Clay Shirky (December 6th 2010) http://www.shirky.com/weblog/

[vi] http://codexsinaiticus.org/en/project/

[vii] http://gdl.cdlr.strath.ac.uk/redclyde/

[viii] http://www.slavevoyages.org/tast/index.faces

[ix] http://www.portico.org/digital-preservation/the-archive-content-access/archive-facts-figures

[x] http://www.keep-project.eu/ezpub2/index.php?/eng

[xi] http://www.europeana.eu/portal/

[xii] http://www.oclc.org/oaister/about/default.htm

[xiii] http://www.opendoar.org/

[xiv] Newman, Bobbi (2012) Should Libraries Get Out of the eBook Business? Librarian by Day blog http://librarianbyday.net/2012/03/07/should-libraries-get-out-of-the-ebook-business/

[xv] http://www.oapen.org/home

[xvi] http://www.oapen-uk.jiscebooks.org

[xvii] Law, Derek Digital Libraries in Higher Education in Collier, M. Business Models for Digital Libraries. Leuven: University of Leuven Press, 2010

[xviii] Law, Derek Digital library economics: aspects and prospects in Baker, D. Planning and building digital Libraries. Cambridge: Chandos, 2009

[xix] Bailey, Charles Scholarly Electronic Publishing Bibliography 2010. Digital Scholarship: Houston, 2011

[xx] Houghton, J., Rasmussen, B. and Sheehan, P. Economic implications of alternative scholarly publishing models: Exploring the costs and benefits. A report to the Joint Information Systems Committee (JISC), 2009. Available at http://ie-repository.jisc.ac.uk/278/
