Too Big to Know highlights

First, it’s unavoidably obvious that our old institutions are not up to the task because the task is just too large: How many people would you have to put on your library’s Acquisitions Committee to filter the Web’s trillion pages? We need new filtering techniques that don’t rely on forcing the ocean of information through one little kitchen strainer. The most successful so far use some form of social filtering, relying upon the explicit or implicit choices our social networks make as a guide to what will be most useful and interesting for us. These range from Facebook’s simple “Like” button (or Google’s “+1” button) that enables your friends to alert you to items they recommend, to personalized searches performed by Bing based on information about you on Facebook, to Amazon’s complex algorithms for recommending books based on how your behavior on its site matches the patterns created by everyone else’s behavior.
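
A minimal sketch of the explicit social filtering described above, assuming invented data (the FRIEND_LIKES table and the rank_for function are illustrative, not any platform’s real API): items that more of a reader’s friends have liked are simply ranked ahead of the rest.

    from collections import Counter

    # Illustrative data: which items each person in a small social network
    # has "liked". Names and structure are invented for this sketch.
    FRIEND_LIKES = {
        "alice": {"article-12", "article-7", "article-3"},
        "bob":   {"article-7", "article-9"},
        "carol": {"article-3", "article-7"},
    }

    def rank_for(reader_friends, candidate_items):
        """Rank candidate items by how many of the reader's friends liked each.

        Explicit social filtering at its simplest: the network's choices,
        not an acquisitions committee, do the sorting.
        """
        votes = Counter()
        for friend in reader_friends:
            for item in FRIEND_LIKES.get(friend, set()):
                if item in candidate_items:
                    votes[item] += 1
        # Unliked items are not discarded; they just end up at the bottom.
        return sorted(candidate_items, key=lambda item: votes[item], reverse=True)

    if __name__ == "__main__":
        items = {"article-3", "article-7", "article-9", "article-42"}
        print(rank_for(["alice", "bob", "carol"], items))
        # article-7 (3 votes) comes first, article-42 (no votes) last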

Second, the abundance revealed to us by our every encounter with the Net tells us that no filter, no matter how social and newfangled, is going to reveal the complete set of knowledge that we need. There’s just too much good stuff.

Third, there’s also way too much bad stuff. We can now see every idiotic idea put forward seriously and every serious idea treated idiotically. What we make of this is, of course, up to us, but it’s hard to avoid at least some level of despair as the traditional authorities lose their grip and before new tools and types of authority have fully settled in. The Internet may not be making me and you stupid, but it sure looks like it’s making a whole bunch of other people stupid.

Fourth, we can see—or at least are led to suspect—that every idea is contradicted somewhere on the Web. We are never all going to agree, even when agreement is widespread, except perhaps on some of the least interesting facts. Just as information overload has become a fact of our environment, so is the fact of perpetual disagreement. We may also conclude that even the ideas we ourselves hold most firmly are subject to debate, although there’s evidence (which we will consider later) that the Net may be driving us to hold to our positions more tightly.

Sixth, filters have themselves become crucial content. The information that the filters add—“These are the important pages if you’re studying hypercomputation and cognitive science”—is itself publicly available and may get linked up with other pages and other filters. The result of this new filtering to the front is an increasingly smart network, with more and more hooks and ties by which we can find our way through it and make sense of what we find.
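
A small sketch of the contrast behind “filtering to the front,” with made-up pages and relevance scores: an old-style filter removes what falls below its bar, while a Net-style filter only reorders, so everything stays reachable and the ranking itself becomes linkable content.

    # Hypothetical relevance scores for a few pages on a topic; the page
    # names and scores are invented for this sketch.
    pages = {
        "intro-to-hypercomputation": 0.92,
        "cognitive-science-survey":  0.85,
        "random-blog-post":          0.20,
        "unrelated-recipe":          0.05,
    }

    def filter_out(pages, threshold=0.5):
        """Old-style filtering: whatever falls below the bar disappears."""
        return {page: score for page, score in pages.items() if score >= threshold}

    def filter_to_front(pages):
        """Net-style filtering: nothing is removed. The filter brings the most
        relevant pages forward, and its ranking is itself content that other
        pages and other filters can link to."""
        return sorted(pages, key=pages.get, reverse=True)

    print(filter_out(pages))       # two pages survive; the rest are gone
    print(filter_to_front(pages))  # all four pages, best first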

The New Institution of Knowledge

Wide.

Boundary-free.

Populist.

“Other”-credentialed.

Unsettled. We used to rely on experts to have decisive answers. It is thus surprising that in some branches of biology, rather than arguing to a conclusion about how to classify organisms, a new strategy has emerged to enable scientists to make progress together even while in fundamental disagreement.

White House and the American Association for the Advancement of Science—recognize that traditional ways of channeling and deploying expertise are insufficient to meet today’s challenges. Both agree that the old systems of credentialing authorities are too slow and leave too much talent outside the conversation. Both see that there are times when the rapid development of ideas is preferable to careful and certain development. Both acknowledge that there is value in disagreement and in explorations that may not result in consensus. Both agree that there can be value in building a loose network that iterates on the problem, and from which ideas emerge. In short, Expert Labs is a conscious response to the fact that knowledge has rapidly gotten too big for its old containers. . . . [35]

Cass Canfield of Harper’s was approached one day in his editorial sanctum by a sweet-faced but determined matron who wanted very much to discuss a first novel on which she was working. “How long should a novel be?” she demanded. “That’s an impossible question to answer,” explained Canfield. “Some novels, like Ethan Frome, are only about 40,000 words long. Others, Gone with the Wind, for instance, may run to 300,000.” “But what is the average length of the ordinary novel?” the lady persisted. “Oh, I’d say about 80,000 words,” said Canfield. The lady jumped to her feet with a cry of triumph. “Thank God!” she cried. “My book is finished!”

When Data.gov launched, it had only 47 datasets. Nine months later, there were 168,000[37] and there had been 64 million hits on the site.[38]

Obama’s executive order intended to establish—to use a software industry term—a new default. A software default is the configuration of options with which software ships; the user has to take special steps to change them, even if those steps are as easy as clicking on a check box. Defaults are crucial because they determine the user’s first experience of the software: Get the defaults wrong, and you’ll lose a lot of customers who can’t be bothered to change their preferences, or who don’t know that a particular option is open to them. But defaults are even more important as symbols indicating what the software really is and how it is supposed to work. In the case of Microsoft Word, writing multi-page, text-based documents, and not posters or brochures, is the default. The default for Ritz crackers, as depicted on the front of the box, is that they’re meant to be eaten by themselves or with cheese.[39]
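
The idea of a default can be made concrete with a small configuration sketch; the settings class and field names below are invented for illustration, not drawn from any actual product.

    from dataclasses import dataclass

    @dataclass
    class DocumentSettings:
        """Hypothetical word-processor settings; the fields are illustrative.

        The values given here are the defaults: the configuration the software
        ships with, and so the first (often only) experience most users have.
        """
        page_size: str = "Letter"     # multi-page text documents, not posters
        orientation: str = "portrait"
        autosave: bool = True
        track_changes: bool = False   # available, but the user must opt in

    # Most users never open the preferences, so the defaults are the product:
    settings = DocumentSettings()
    print(settings)

    # Changing a default takes a deliberate step, like ticking a checkbox:
    reviewer_settings = DocumentSettings(track_changes=True)
    print(reviewer_settings)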

We know that there could be another hundred, thousand, or ten thousand columns of data, and reality would still outrun our spreadsheet. The unimaginably large fields of data at Data.gov—we are back to measuring stacked War and Peaces—do not feel like they’re getting us appreciably closer to having a complete picture of the world. Their magnitude is itself an argument against any such possibility.

Data.gov and FuelEconomy.gov are not parliamentary blue books. They are not trying to nail down a conclusion. Data.gov and the equivalents it has spurred in governments around the world, the massive databases of economic information released by the World Bank, the entire human genome, the maps of billions of stars, the full text of over 10 million books made accessible by Google Books, the attempts to catalog all Earth species, all of these are part of the great unnailing: the making accessible of vast quantities of facts as a research resource for anyone, without regard to point of view or purpose. These open aggregations are often now referred to as “data commons,” and they are becoming the default for data that has no particular reason to be kept secret.