OpenRefine and Tableau for Visualization REU

This tutorial introduces powerful yet user-friendly tools for cleaning, analyzing, and visualizing tabular data. As a running example, we look at the relationship between reported direct emissions and stationary combustion at major facilities across the US.

We will use OpenRefine to clean the data and Tableau to analyze and visualize it.

OpenRefine

OpenRefine is a free, powerful tool for exploring, cleaning, and organizing messy data, and for linking it on the fly to other databases.

The main reasons for using OpenRefine are:

- It's free and open source, and your data stays on your own computer, so you can clean it without privacy concerns

- It combines the power of scripting with the simplicity of spreadsheets, which makes it more interactive and experimental than other tools

- Its user-friendly design makes it easy to detect and fix inconsistencies in your data
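
To make "detecting and fixing inconsistencies" concrete, the sketch below approximates in Python the idea behind key-collision clustering, one of the techniques OpenRefine offers for spotting variant spellings of the same value. It is a simplified illustration, not OpenRefine's own code: the `fingerprint` and `cluster` helpers and the facility names are hypothetical, and the normalization only handles ASCII punctuation.

```python
import string
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Simplified key: trim, lowercase, drop ASCII punctuation, sort unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep only groups with more than one spelling."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical facility names, made up for illustration only.
names = [
    "Acme Power Plant",
    "acme power plant ",
    "Power Plant, Acme",
    "Riverside Refinery",
]

clusters = cluster(names)
for key, variants in clusters.items():
    print(key, "->", variants)
```

The three "Acme" spellings share one key and land in a single group, while "Riverside Refinery" has no variants and is left out. In OpenRefine you would review such groups interactively and choose one spelling to merge them into, rather than writing any code.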

OpenRefine supports the following file formats: comma-separated values (CSV), Excel workbooks (.xls, .xlsx), OpenDocument spreadsheets (.ods), JSON, and XML. Support for other formats can be added through OpenRefine extensions.

You can find excellent resources, and download the program, at openrefine.org.

Tableau

Tableau is a data visualization and analytics platform used by thousands of companies to understand their data. It offers an intuitive drag-and-drop interface for building interactive graphs and generating insights quickly.

Additional Resources

Tableau's Getting Started Guide

Tableau's Starter Kit - tutorials and guide for beginners

Tableau's online help website

Tableau's student resources page - download the software and receive a free student license