In this project, I created data visualizations that reveal insights from the US Census Demographic Data Set, tell a story, and highlight patterns in the data. The work reflects the theory and practice of data visualization, including visual encodings, design principles, and effective communication.
The visualizations in this project focus on transportation, population, and income data across the US.
The data comes from a Kaggle data set that includes 2015 census data for all counties.
Question: Which state has the best transportation?
Summary:
In this project, I define ‘best transportation’ as the lowest mean commute time of a state. To find it, I placed the ‘state, county’ hierarchy on the rows, because counties are nested within states. I also changed SUM(Mean Commute) to AVG(Mean Commute) so that each bar shows the average of the county-level mean commute times in a state rather than their sum.
The visualization is a bar chart of the mean commute time of each state. The state with the best transportation is Alaska, with a mean commute time of only 11.23 minutes. New Jersey, with a mean commute time of 30.06 minutes, ranks worst among all the states. The difference between the two states is 18.83 minutes, which almost equals the mean commute time of North Dakota.
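The switch from SUM to AVG described in this summary can be illustrated outside Tableau with a small sketch. The rows below are hypothetical (the state names and commute values are invented, not taken from the census data), and `aggregate_by_state` is an illustrative helper, not part of the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical county rows: (state, county, mean commute in minutes).
# These numbers are made up for illustration and are not from the census data.
rows = [
    ("State A", "County 1", 10.0),
    ("State A", "County 2", 14.0),
    ("State B", "County 1", 30.0),
    ("State B", "County 2", 26.0),
    ("State B", "County 3", 31.0),
]


def aggregate_by_state(rows, agg):
    """Group county values by state and apply an aggregation function."""
    groups = defaultdict(list)
    for state, _county, value in rows:
        groups[state].append(value)
    return {state: agg(values) for state, values in groups.items()}


state_sum = aggregate_by_state(rows, sum)   # analogous to SUM(Mean Commute)
state_avg = aggregate_by_state(rows, mean)  # analogous to AVG(Mean Commute)

# Sort ascending so the shortest average commute comes first, as in the bar chart.
ranking = sorted(state_avg.items(), key=lambda item: item[1])
for state, avg in ranking:
    print(f"{state}: AVG={avg:.2f}, SUM={state_sum[state]:.2f}")
```

In this toy data, the sum grows with the number of counties in a state, while the average does not, which is why the average is the more suitable aggregation here. Note that this is an unweighted average of county means; a population-weighted average could give a different ranking.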
Design:
I sorted the ‘state, county’ hierarchy in ascending order of average mean commute so that Alaska, the state with the best transportation, appears at the top of the bar chart. I also added the hierarchy to the filter, showed the filter, and changed it to a "multiple value dropdown menu" to make it easier to study the data of specific states. I kept a single color, blue, to keep the chart simple and easy to read.
Question: Which state has the lowest transportation percentage?
Summary:
In this project, I define ‘lowest transportation percentage’ as the lowest state average of the combined percentage of all transportation modes except ‘walk’. I created a calculated field, ‘All Transportation’, by adding the percentages of ‘drive’, ‘carpool’, ‘transit’, and ‘other transportation’. I then changed ‘SUM(All Transp)’ to ‘AVG(All Transp)’ to show the state average of All Transportation across the country.
The visualization is a color-coded map of the state average of All Transportation across the U.S. Alaska again stands out, with the lowest percentage of all transportation used at only 69.78 percent. At 96.37 percent, Alabama has the highest percentage across the U.S.
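The ‘All Transportation’ calculated field and its state average can be sketched as follows. This is only an illustration under stated assumptions: the records are invented, and the key names (`drive`, `carpool`, `transit`, `other_transp`, `walk`) simply mirror the categories named above rather than verified column names in the data set.

```python
from statistics import mean

# Hypothetical county records; the percentages are invented for illustration.
# Key names mirror the categories named in the text, not verified column names.
counties = [
    {"state": "State A", "drive": 60.0, "carpool": 10.0, "transit": 5.0, "other_transp": 2.0, "walk": 8.0},
    {"state": "State A", "drive": 64.0, "carpool": 11.0, "transit": 4.0, "other_transp": 2.0, "walk": 7.0},
    {"state": "State B", "drive": 80.0, "carpool": 10.0, "transit": 3.0, "other_transp": 1.0, "walk": 2.0},
    {"state": "State B", "drive": 82.0, "carpool": 9.0, "transit": 4.0, "other_transp": 1.0, "walk": 1.0},
]


def all_transportation(record):
    """Row-level calculated field: the modes listed in the text, excluding 'walk'."""
    return record["drive"] + record["carpool"] + record["transit"] + record["other_transp"]


# Collect the calculated field per state, then average it (analogous to AVG(All Transp)).
by_state = {}
for record in counties:
    by_state.setdefault(record["state"], []).append(all_transportation(record))

state_avg = {state: mean(values) for state, values in by_state.items()}

# The state with the lowest average is the one the map highlights.
lowest_state = min(state_avg, key=state_avg.get)
print(state_avg, lowest_state)
```

The calculation is done per county first and averaged per state afterwards, which matches the order of operations of a row-level calculated field followed by an AVG aggregation.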
Design:
The original color palette was blue only, which made it hard to identify the lowest and the highest percentages. I changed it to orange and blue to strengthen the contrast and to help color-blind readers navigate the map more easily. I also added the ‘state, county’ hierarchy to the filter, showed the filter, and changed it to a "multiple value dropdown menu" to make it easier to study the data of specific states.
Question: How do population and income look across the country?
Summary:
I added the ‘Income’ field to the rows and the ‘Total Population’ field to the columns. The visualization is a scatter plot that shows the relationship between state income and state total population across the U.S. I changed SUM(Income) to AVG(Income) to show the state average income rather than the state total income.
I found that most states (44 out of 50) have populations of less than 10 million, and that the average income of the majority of the states (46 out of 50) ranges from 35k to 70k. There are two obvious outliers in this scatter plot: California and Puerto Rico. With an average income of 56k, California lies within the typical income range across the country. However, its total population, 38 million, is about 4 times the state average. On the other hand, Puerto Rico has the lowest state average income and total population.
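The aggregation behind the scatter plot, and the counts quoted above, can be sketched like this. The rows are hypothetical and the thresholds (10 million people, 35k to 70k income) are taken from the text; summing county populations to get the state total is an assumption about how the population axis was aggregated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical county rows: (state, county income, county population).
# All values are invented for illustration.
rows = [
    ("State A", 50_000, 4_000_000),
    ("State A", 60_000, 2_000_000),
    ("State B", 56_000, 30_000_000),
    ("State B", 58_000, 8_000_000),
    ("State C", 20_000, 2_000_000),
    ("State C", 18_000, 1_000_000),
]

incomes = defaultdict(list)
populations = defaultdict(int)
for state, income, population in rows:
    incomes[state].append(income)
    populations[state] += population  # assumed: state total = sum of counties

# One point per state: (total population, average income).
points = {state: (populations[state], mean(incomes[state])) for state in incomes}

# Counts analogous to "populations of less than 10 million" and "35k to 70k".
under_10m = [s for s, (pop, _inc) in points.items() if pop < 10_000_000]
mid_income = [s for s, (_pop, inc) in points.items() if 35_000 <= inc <= 70_000]
print(points, under_10m, mid_income)
```

In this toy data, State B plays the role of a population outlier with a typical income, and State C that of a low-income outlier.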
Design:
The ‘state, county’ hierarchy is used in the visualization. It is added to the ‘detail’ of the ‘Marks’ and can be drilled down from state to county. When it is expanded, the county-level view shows a wider range of average income, and Los Angeles is an outlier with the largest total population, which is consistent with California being an outlier at the state level. I also added this hierarchy to the filter and changed it to a "multiple value dropdown menu" to make it easier to study the data of specific states. In addition, I kept a single blue color to keep the visualization simple and clear.
Question: Is there any state whose transportation, population, and income look special?
Summary:
In this project, I define a ‘special state’ as a state that has extreme values on at least 2 of the following measures: mean commute time (sheet 1), the state average of all transportation (sheet 2), and the total population and average income of a state (sheet 3).
This visualization is a dashboard that combines the three sheets mentioned above. After studying the 5 states that had extreme values in the previous analysis (Alaska, California, New Jersey, Puerto Rico, and Alabama), I found that Alaska is a ‘special state’. Alaska has the lowest values on 3 measures: mean commute time, the state average of all transportation, and total population. However, its state average income ranks second among the 5 states, which makes Alaska special.
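The ‘special state’ rule can be written down as a small check. This is a sketch under assumptions, not the procedure used in the dashboard: ‘extreme’ is taken here to mean holding the minimum or maximum of a measure, the state names and values are invented, and `extreme_measures` is a hypothetical helper.

```python
# Hypothetical state-level values; invented for illustration only.
states = {
    "State A": {"mean_commute": 11.0, "all_transp": 70.0, "total_pop": 700_000, "avg_income": 60_000},
    "State B": {"mean_commute": 30.0, "all_transp": 90.0, "total_pop": 9_000_000, "avg_income": 55_000},
    "State C": {"mean_commute": 25.0, "all_transp": 96.0, "total_pop": 5_000_000, "avg_income": 45_000},
    "State D": {"mean_commute": 27.0, "all_transp": 88.0, "total_pop": 38_000_000, "avg_income": 20_000},
    "State E": {"mean_commute": 20.0, "all_transp": 85.0, "total_pop": 3_000_000, "avg_income": 70_000},
}


def extreme_measures(states):
    """For each state, list the measures on which it holds the minimum or maximum."""
    result = {name: [] for name in states}
    measures = next(iter(states.values())).keys()
    for measure in measures:
        lowest = min(states, key=lambda s: states[s][measure])
        highest = max(states, key=lambda s: states[s][measure])
        result[lowest].append(f"lowest {measure}")
        result[highest].append(f"highest {measure}")
    return result


extremes = extreme_measures(states)

# Assumed rule from the definition above: extreme on at least 2 measures.
special = [name for name, hits in extremes.items() if len(hits) >= 2]
print(special)
```

With these invented values, State A is lowest on three measures, much like Alaska in the dashboard, and State D is extreme on two.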
Design:
I removed the title of each sheet so that the visualization focuses on the data. I also added a filter to the map and applied it to the other worksheets by selecting the ‘all using related data sources’ option. As a result, to examine the data of the 5 states mentioned above, we can select them in the drop-down menu, which floats over the map, and the data in the other 2 visualizations will change accordingly.