Why Scholars and Students Still Need Libraries (and Books)

posted May 16, 2017, 1:01 AM by Professor Katz   [ updated May 16, 2017, 1:05 AM ]


The Rector asked me to defend continued investment by the University in libraries and physical books.  This is what I wrote to him:

        Around 6,000 years ago, people began to write things down.  It took about 800 more years for Egyptian hieroglyphs to be conceived, and by 1000 BC, alphabets were in use.  The invention of writing marks the transition from prehistoric humankind to a world where events and personalities could be recorded.  Once we had writing, we had history. 

        Near the beginning of the Christian era, someone had the good idea of chopping up scrolls into individual pages and binding the loose sheets to form a book, or a codex as it was called.  This was a huge technological breakthrough, and created the modern way of visualizing text, in pages, ultimately in paragraphs on numbered sheets.  Reading a book was vastly superior to using a scroll, and the utility of the new format gained widespread recognition and spread rapidly.  Some scholars have even attributed the rise of Christianity to that religion’s early adoption of the codex. 

        The third great technological advance – after the invention of writing and the transition to books – came over a millennium later, when printing with moveable type was invented in about 1450.  Although the Chinese had the same idea 400 years earlier and the Koreans 200 years after that, Gutenberg’s printing had a much wider impact and led to the proliferation of books throughout Europe and beyond.  With that came vastly increased literacy, both the cause and effect of cheaper book production.  In the nineteenth century, paper began to be made from wood pulp rather than cloth rags, and although this led to a decline in durability, books became ever more affordable, and enabled the growth of a huge reading public before the beginning of the twentieth century. 

        We are now in the digital age, which began about 550 years after printing, and more than 1800 years after the invention of books.  Although ARPANET came into being back in 1969, the World Wide Web and accompanying search engines became common tools only from the mid-1990s.  Netscape had its moment as a browser, but was replaced by Internet Explorer and Safari, and Google (founded in 1998) ultimately came to dominate search. 

        Books became scrolls again, available as PDFs or other text formats on the internet.  A number of virtual libraries sprang up.  The Internet Archive (www.archive.org) offers 10M books and texts.  JSTOR covers about 2,000 journals, and an increasing number of books.  The HathiTrust Digital Library adds millions of books to the pool.  Google has scanned over 25M books, although its plan to include copyrighted materials was shot down in the courts.  The Digital Public Library of America tries to unite many of these sites under one virtual roof.  Finally, there is a mysterious Russian site, heir to the now-defunct South Pacific/Irish site, on which are posted a huge number of illegally available digital texts, mostly books still in copyright and many published only last week. 

        With all of this material available online, why do we still need physical books?  More importantly, why should a research university spend its scarce resources to maintain library buildings and update holdings of books, journals and other printed materials contained therein, as we progress deeper into the twenty-first century and over 20 years after the democratization of the digital age? 


1.  There are a lot of books in the world’s libraries.  Google claims that the sum total of books published since printing was invented comes to about 130M.  Nevertheless, there are nearly 550M volumes in American research libraries alone, and this figure does not include holdings in private libraries.  Google Books is based on the collections of five research libraries – Oxford (Bodleian Library), Harvard, Stanford, Michigan and the New York Public Library.  Significantly, 60% of the books digitized can be found in only one of these libraries.  In other words, Google’s scan of 25M books just begins to make libraries digital. 

2.  Books in copyright cannot be legally digitized and made freely available online without the author’s consent.  According to current copyright law, this rule applies to most books published since 1923, and is generally in force for 70 years after the death of the author.  Well over 2M new books are published each year, 300,000 of these in the United States alone.  Many appear in electronic editions as well, available for purchase, but preserving them all is well beyond the scale of Google Books, even if the courts decide someday to weaken copyright provisions.  In any case, Google, Inc. will probably not outlast the physical books that have been digitized.  Fifteenth-century books still exist; apart from a few universities and the Roman Catholic Church, most institutions that were thriving at that time have disappeared. 

3.  Libraries do not own most of their digital books but access them via a bulk subscription service, often a commercial enterprise.  A service can alter the package of, say, 40,000 titles and simply notify the library of the change.  It is under no obligation to guarantee access to titles the library would prefer to keep.  In turn, the library may decline to acquire a particular digital book even if requested because it doesn't work with the supplier who markets it … and may therefore buy a paper copy instead. 

4.  What is a book?  And on what basis is it decided to digitize a text?  Research libraries are full of ephemera, centuries-old equivalents of handouts, pamphlets, flyers, defunct computer manuals, and telephone books.  Little of this material is likely to be high up on anyone’s digitization agenda, Google or otherwise.  Scholars thrive on sources such as these, and libraries are where they are stored and catalogued. 

5.  Today, the PDF is the digitization medium of choice, as Microsoft Word is the standard writing program.  But before MS Word there was WordPerfect, and Logoscript and others.  Before PDFs there were photographic images of various kinds.  Even earlier we had microfilm and microfiche and xeroxes and photostats.  At every stage, it seemed as if we had come to a technological resting place, and a good deal of money was swept under the bridge when the tide turned and the next big thing appeared on the horizon and became the industry standard.  A 500-year-old book printed on high-quality rag paper is just as readable today as it was in 1517.  Go find a way to access a Logoscript document saved on a 5¼-inch floppy disk!  Even digitized files can become degraded and unreadable, and this is true both for books digitized and books born digital. 

6.  As any user of the various digital archives knows, online PDFs are not perfect.  The original paper copy may have been disfigured by the underlining and notation of a reader from a previous century.  Pages can be missing or obscured by images of latex-covered fingers.  Pages too big, too small, or thought to be insignificant may be omitted by the army of underpaid students who do the actual mind-numbing work of flipping pages of old books under a camera. 

7.  Books often come in multiple editions and various printings.  Not all of them will get digitized, in the interest of increasing the number of distinct titles made available.  Scholars who study books and their contents often need to see multiple copies of the same text in order to recreate the reading experience of a particular historical moment.  Even relatively small research libraries are likely to have more than one edition of a given book. 

8.  Reading a book on a computer screen is an incomplete substitute for holding it in your hands.  The huge folio volumes produced in seventeenth-century England, for example, often with engravings and fold-out maps, are distorted on a small screen, which homogenizes all books to the same dimensions.  The effect of reading such books on a computer is like watching ‘Lawrence of Arabia’ on an airplane monitor. 

9.  Libraries also house manuscripts and archives, which are unlikely to be digitized in their entirety.  This material is not easily catalogued, and can be arranged and ordered only by people with local knowledge who may be otherwise engaged or long deceased.  The preservation of such historical records until scholarly attention is paid to them is part of an institution’s mission. 


Digital books have transformed the way we do our research.  Thanks to the digitization projects, not only can we read books that previously required an expensive and time-consuming trip to, for example, the Bodleian Library in Oxford, but very often we can download to our mobile phones the very same unique copy of a book stored in that library.  Searchable PDFs make certain kinds of research feasible. 

Even after the invention of printing, there was an extended period of overlap during which manuscript production of texts continued.  Some people have predicted the demise of the paper book, but that final death is a long way off, and paper and PDF will co-exist for many years to come.  In any case, manuscripts, books, PDFs: they still need to be purchased, catalogued, and stored, and this is the work of librarians and libraries. 

A research university still needs to maintain and develop its library buildings.  What we lose by accessing books outside of libraries is the productive experience of sitting in a quiet space in the company of other readers, some of whom can offer help and advice, and even take part in chance conversations of the sort that can sometimes spark a mental paradigm shift.  Libraries are still the heart of a university, the magnet that draws visitors to a campus, and the repository of collective memory.
