Posted on February 26, 2019
by Mika Cadiz
Data empowers people to make better, more informed decisions, whether those decisions are made by policy makers to benefit their constituents or by nonprofits and individual citizens to empower themselves and their communities. It allows us to move from the realm of speculation and assumption into concrete reality.
This is the role of the Rural Atlas, a web app that presents data and trends focused on rural Minnesota in interactive graphs, charts, and maps. The Center for Rural Policy and Development, a non-partisan research organization funded by the Minnesota State Legislature, has presented versions of the Atlas online since 2002. Since the fall of 2018, I’ve been working on its latest iteration with research associate Kelly Asche. The app is live, and we will be updating it as new data becomes available. It is written in the programming language R, with visualizations built using the Shiny package and data cleaned and organized with the tidyverse.
The US Census Bureau collects tremendous amounts of data every year on subjects such as housing, migration, demographics, and employment. However, much of this data, while publicly available, is not readily accessible. In the easiest cases, the data is online but spread across several different databases that are not easily reached through a simple Google search. Furthermore, the spreadsheet format this data is stored in, while highly useful and accessible for statisticians, programmers, or data geeks (like us!), does not lend itself well to quick comprehension by laymen looking for a broad understanding of rural trends. In more difficult cases, as with older data, the data is not online at all and is available to us only through the work of previous students, who transcribed it into spreadsheets from physical data books.
The Rural Atlas not only aggregates this data, it also presents it in accurate and accessible formats, allowing the wealth of information collected by the Census Bureau to be comprehended and utilized by anyone, regardless of their background in data or statistics.
Therefore, much of the work of building the Atlas involved aggregating, cleaning, and wrangling large amounts of data to create clean and tidy data sets from which we could build interactive visualizations. Combining data from multiple sources presents a significant challenge, especially when the sources are in incompatible formats. The questions this kind of work poses are difficult because they are not merely technical. It is easy enough to write lines of code that merge spreadsheets. But how do we merge data that is in some ways incompatible without losing valuable information or divorcing it from the context in which it was collected? City, township, and county lines are drawn and redrawn; small towns are absorbed into larger cities; data collection organizations change their formats and collection conventions. These all have implications for how we present data and how this data is understood, whether by policy makers working on development strategies or by ordinary people defining what "rural" means in our cultural narrative.
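To make the merging problem concrete, here is a minimal sketch of one way to combine two sources when a small town has been absorbed into a larger city between collection years. It is not the Atlas code, which is written in R; it uses plain Python only for illustration. The place names, figures, the `crosswalk` table, and the function names `harmonize` and `merge_sources` are all hypothetical.

```python
# Hypothetical illustration: place names and figures are invented, not Atlas data.

# Older source, keyed by place name as it was recorded at the time.
older = [
    {"place": "Town A", "year": 1990, "population": 400},
    {"place": "City B", "year": 1990, "population": 5000},
    {"place": "Township C", "year": 1990, "population": 120},
]

# Newer source, collected after Town A was absorbed into City B.
newer = [
    {"place": "City B", "year": 2010, "population": 6100},
    {"place": "Township D", "year": 2010, "population": 90},
]

# Crosswalk: maps a historical place name to the unit that contains it today.
crosswalk = {"Town A": "City B"}


def harmonize(rows, crosswalk):
    """Aggregate rows to present-day units, recording which places were folded in."""
    combined = {}
    for row in rows:
        current = crosswalk.get(row["place"], row["place"])
        entry = combined.setdefault(
            (current, row["year"]), {"population": 0, "sources": []}
        )
        entry["population"] += row["population"]
        entry["sources"].append(row["place"])  # keep the original context
    return combined


def merge_sources(older, newer, crosswalk):
    """Outer-join both sources by present-day place so nothing is silently dropped."""
    merged = {}
    tables = (("older", harmonize(older, crosswalk)),
              ("newer", harmonize(newer, crosswalk)))
    for label, table in tables:
        for (place, year), entry in table.items():
            record = merged.setdefault(place, {"series": {}, "found_in": set()})
            record["series"][year] = entry
            record["found_in"].add(label)
    return merged


merged = merge_sources(older, newer, crosswalk)
for place, record in sorted(merged.items()):
    # Places present in only one source are reported, not discarded.
    print(place, sorted(record["found_in"]), record["series"])
```

Two choices in this sketch reflect the concerns above. Each aggregated value keeps a list of the original place names it was built from, so the figure stays connected to how it was collected. Places that appear in only one source are kept and flagged instead of being dropped by the join. Summing across absorbed places is only appropriate for additive counts such as population; rates or medians would need a different treatment.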