Big Data

IBM reported in 2012 that 90% of the world's data had been created since 2010. Written to DVDs, all that data would form a stack reaching to the moon and back.

What does all this data mean? How can it be used? What sorts of tools can we use to visualize it? Explore the examples and sites shown below.

Baby Name Voyager

A large body of numeric data has been turned into a visualization in the Baby Name Voyager, which shows the popularity of names over the last 130 years. For instance, consider the popularity of the name "Jesus", which illustrates immigration trends from Spanish-speaking countries.
Try typing in "Mohammed" or "Forrest" and note the sudden changes. Ask your students why they think these sudden changes occurred. Ask them to predict what the graph for the girl's name "Sunshine" looks like, and why.
If you are teaching an introductory programming course (e.g. CS1 or CS2), consider retrieving the data and giving something similar as a programming assignment.
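
One possible shape for such an assignment, as a minimal sketch. It assumes the data arrives as lines of name,sex,count grouped by year; that layout, the sample rows, and the function names parse_year, popularity, and text_chart are all assumptions made for illustration. The counts are invented and are not real statistics.

```python
# Hypothetical sample data: year -> text in "name,sex,count" form.
# These counts are invented for illustration; they are not real statistics.
SAMPLE = {
    1990: "Forrest,M,300\nJohn,M,700\nSunshine,F,50\nMary,F,650\n",
    1995: "Forrest,M,900\nJohn,M,600\nSunshine,F,40\nMary,F,560\n",
    2000: "Forrest,M,200\nJohn,M,800\nSunshine,F,60\nMary,F,740\n",
}

def parse_year(text):
    """Return {(name, sex): count} for one year's worth of lines."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, sex, count = line.strip().split(",")
        counts[(name, sex)] = int(count)
    return counts

def popularity(data, name, sex):
    """Return {year: occurrences of the name per 1,000 records of that sex}."""
    result = {}
    for year in sorted(data):
        counts = parse_year(data[year])
        total = sum(c for (_, s), c in counts.items() if s == sex)
        share = counts.get((name, sex), 0) / total if total else 0.0
        result[year] = round(1000 * share, 1)
    return result

def text_chart(series, width=40):
    """Render a series as rows of '#' scaled to the largest value."""
    peak = max(series.values()) or 1
    return [f"{year} {'#' * round(width * v / peak)} {v}"
            for year, v in series.items()]

if __name__ == "__main__":
    for row in text_chart(popularity(SAMPLE, "Forrest", "M")):
        print(row)
```

Students could start from a text chart like this one and then move on to a plotting library, or extend the program to compare several names at once.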

Baby Names

Google's Ngram Viewer

Google's Ngram Viewer is a time machine for words that lets you explore 5 billion words from 5 million books over 5 centuries. Take a look at: http://books.google.com/ngrams

At right is the result of a search on the phrase "computational thinking". Selecting the year ranges below the graph takes you to the books where this phrase was found. What do you think the following searches would look like: Olympics, Race, Black Lives Matter, Salsa, Ketchup?
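
For a programming class, the idea behind such a graph can be sketched in a few lines. The example below counts how often a phrase occurs among all n-grams (runs of n consecutive words) of the same length, for texts grouped by year. The tiny corpus, the tokenizer, and the function names are assumptions made for illustration; this is not a description of how Google builds or queries its dataset.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: year -> text. Invented for illustration only.
CORPUS = {
    1980: "Thinking about machines. Thinking about people.",
    2010: "Computational thinking is a skill. Teach computational thinking early.",
}

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(words, n):
    """Return all runs of n consecutive words as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def relative_frequency(corpus, phrase):
    """Return {year: share of that year's n-grams equal to the phrase}."""
    target = tuple(tokenize(phrase))
    result = {}
    for year in sorted(corpus):
        # Simplification: n-grams are allowed to span sentence boundaries.
        grams = Counter(ngrams(tokenize(corpus[year]), len(target)))
        total = sum(grams.values())
        result[year] = grams[target] / total if total else 0.0
    return result

if __name__ == "__main__":
    print(relative_frequency(CORPUS, "computational thinking"))
```

Dividing by the total number of n-grams for each year, rather than reporting raw counts, keeps years with more text from dominating the comparison.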

Ngram Viewer

GoodFilms

See which movies are rewatched the most (x-axis) and which are the most critically acclaimed (y-axis) at Goodfilms. You can filter by where you can watch the movie (Netflix, Hulu, etc.). The bigger the dot, the more reviews it has. Hovering over a dot reveals which movie it is, along with a small graph of other movies that are similar in terms of being rewatched and critically acclaimed. Movies near the upper-right are both rewatched often and critically acclaimed. The opposite is true for the movies in the lower-left!

GoodFil.ms

Google's Earth Engine

See Google's Earth Engine, which presents a time-lapse sequence of satellite images since 1984 revealing changes to the planet's surface. There are preset locations such as Las Vegas and the man-made islands of Dubai that show drastic changes, but you can also zoom in on your own location. Do views of glaciers provide evidence of global warming?

Google Earth Engine Dubai

Live Flights Radar

At www.flightradar24.com you can see the pattern of flights in the air right now, along with details for each one.

Why do you suppose there are lots of flights in and out of Memphis at night?

Real-time flights radar

Educational Attainment

See educational attainment in the US, mapped by area. Does it match how you think of the different neighborhoods where you live?

Educational Attainment

Information is Beautiful

Information is Beautiful is a collection of visualizations, with a sampling of them shown here.

Lines of Code

Maptive

Maptive.com has a collection of map-based visualizations, such as the ones shown here.



A day in the life of Americans, showing activity throughout the day.

See how the Face of America is changing over time.

See the wind, weather and ocean conditions across the globe.

Explore the relative sizes of things with "Universcale", which lets you zoom in and out.

Gapminder

Hans Rosling's Gapminder.org allows you to graph real-world data. See also Rosling's TED talks.

Sources

  1. There are many types and examples of visualizations at Information is Beautiful and at Maptive Visualizations

  2. Steve Ballmer, the former chief executive of Microsoft, has put together a dataset that shows where all government funding (local, state, federal) goes. See usafacts.org

  3. Project Gutenberg
    See the related project Cyberwill (modeled after Joe Zachary's Random Writer) for a PC executable and the data files to generate random text in the style of other texts.

  4. Government Data: City of Chicago public data portal, US Government Data.gov, and the US Census Data

  5. Divvy Bikes usage data

  6. See uses of historical Big Data visualizations, along with some modern equivalents

  7. Online book describing how to best visualize data.

  8. DataCounts! gives databases and online tools

  9. See examples of how not to represent data: http://cas.illinoisstate.edu/jpda/charting_data/badcharts.shtml

References and more

  1. Information is Beautiful has a helpful list of books on data visualization.

  2. Hacking OKCupid to find true love.

  3. See the 20-minute talk by Alex Pentland (MIT) about the implications of Big Data.

  4. See the April 6, 2014 NY Times article on 9 potential problems with big data. For instance, "from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two."

How Unique are You?

Latanya Sweeney's site at dataprivacylab.org/people/sweeney has great examples of this. See for instance aboutmyinfo.org, where you can see how unique you are.
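
The notion of uniqueness can be illustrated with a short sketch: group records by a few fields and count how many records are alone in their group. The choice of fields here (ZIP code, birth date, sex) is an assumption made for illustration, and the records and function names are hypothetical; this is not the method used by any particular site.

```python
from collections import Counter

# Hypothetical records, invented for illustration: (zip_code, birth_date, sex).
PEOPLE = [
    ("60607", "1990-04-12", "F"),
    ("60607", "1990-04-12", "F"),
    ("60607", "1985-11-03", "M"),
    ("60614", "1990-04-12", "F"),
]

def group_sizes(records):
    """Count how many records share each combination of field values."""
    return Counter(records)

def unique_share(records):
    """Return the fraction of records whose combination appears exactly once."""
    sizes = group_sizes(records)
    unique = sum(1 for r in records if sizes[r] == 1)
    return unique / len(records) if records else 0.0

if __name__ == "__main__":
    print(unique_share(PEOPLE))
```

A record that is alone in its group could be singled out by anyone who knows those few fields, even if the name has been removed from the data.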