Google is not the answer: How the digital age imperils history

SATURDAY, MAY 30, 2015 10:30 PM JST

From floppy disks to thumb drives, we get better at storing things -- while trapping history in obso

Google is not the answer: How the digital age imperils history

(Credit: Spiderstock via iStock)

Our species created about 5 billion gigabytes of information from the dawn of time until 2003. Before long, we will create that much information many times per day, according to IBM. The problem: No one is doing enough to select and preserve the bits that really matter.

One of the great paradoxes of the digital age is that we are producing vastly more information than ever before, but we are not very good at preserving knowledge in digital form for the long haul. There’s a difference between building big server farms to store information for near-term retrieval (industry is very good at that) and actually choosing and preserving the data that matters, and being able to render it useful at some time in the future (something that, scarily, we are not nearly as good at). We are radically underinvesting in the processes and technologies that will allow us to preserve our cultural, literary and scientific records.

Consider the experience of pulling out an old shoebox from under a bed and discovering a series of floppy disks there from the 1980s. Perhaps you smile, thinking of what might be on them; perhaps you shiver. How would you find out? Most of us have not preserved a vintage Macintosh SE to be able to read them. Data formats have changed multiple times since then. From 8-inch to 5-and-a-quarter-inch to 3-and-a-half-inch floppy disks to compact discs to thumb drives, we are continuously making progress in how we store our media, and trapping information in lost formats in the process. Best that you put the box back under the bed and not worry too much about it.

Obsolescence of this kind may, in fact, be a blessing. It’s important that much of the information we create is ephemeral. Otherwise, the world would become far too cluttered. Our behaviors would shift, torqued by the constant surveillance to which we increasingly subject ourselves. We would have an even harder time finding the knowledge that’s important in the vast ocean of the unimportant – much less making sense of it all.

It’s fine when it’s your old term papers that are locked away in an obsolete format. And many blogs, tweets, photos and status updates don’t need to be kept for the long run. It’s not so fine, though, when the lost knowledge has historical significance.

The problem is not that it’s impossible to transfer information from one format to another; with enough effort and cost, most data can be transferred to formats that can be read today. A cloud-based world, toward which we are headed, is likely to be simpler to manage than a world of shoeboxes, floppy disks and thumb drives.

But different problems come into relief in a digital era in which we are creating information at such speed and scale. First, most of the parties holding the data are for-profit firms, whose core business is not long-term storage. Unlike universities, libraries and archives, these firms are unlikely to be around for hundreds of years. In the blog-hosting business, the industry has changed enormously in just a decade or so. Even in traditional publishing, consolidation and change have been the watchwords, not persistence of firms over the centuries. Second, the scale of what is being created is so far beyond what has been created in the past that we will need new, technologically sophisticated approaches – ones that can scale along with the pace of production – to curate the meaningful bits of it.

Today, librarians and archivists are not involved enough in selecting and preserving knowledge in born-digital formats, nor in developing the technologies that will be essential to ensuring interoperability over time. Librarians and archivists do not have the support or, in many cases, the skills they need to play the central role in preserving our culture in digital format.

There is added reason to worry. Our national systems have been found to be weak in information technology. This concern was confirmed in March, when the United States Government Accountability Office published a report that criticized the Library of Congress for its information technology practices. The report’s headline: “Library of Congress: Strong Leadership Needed to Address Serious Information Technology Management Weaknesses.”

The good news: the National Archives is much stronger and more advanced when it comes to digital matters, under the leadership of David Ferriero. The completely wonderful Internet Archive, the brainchild of Brewster Kahle, saves iterations of the web in a converted church in San Francisco. A group of universities has come together in a partnership called the Digital Preservation Network, founded at the University of Virginia, to address key aspects of the problem. And despite the GAO’s deep and valid concerns, the Library of Congress has projects focused on this task, including the National Audio-Visual Conservation Center, which features recording and playing devices to help render materials in now-obsolete formats.