ESRI's ArcGIS is used in this assignment.
In this task we act as public health analysts helping a nonprofit in the United States that is interested in the causes and impacts of air quality in California. For now, we are concerned with what causes the buildup of ozone, a chemical that is beneficial high up in the atmosphere but harmful to human health when it is in the air we breathe.
We are also interested in whether air with higher ozone concentrations disproportionately affects certain groups of people.
Provided Data
1. Air quality station locations as a file geodatabase point feature class. (name: air_quality_stations)
2. Ozone data as a file geodatabase table, with a site_id attribute for joining back to the air quality stations data (field site). The data is the average hourly ozone concentration for the combined years of 2010 and 2011. When joining, keep only the records that match. (name: ozone_averages)
3. Census Tracts, with an attribute for average household income, as a file geodatabase polygon feature class. (name: census_tracts_with_income)
4. A 30 meter digital elevation model as a file geodatabase raster. (name: dem_30m_ca)
Map of California census tracts with the air quality station points overlaid on top.
Air quality stations point layer.
To find out which ozone value belongs to which station, we join the two tables on the primary key "Site" and the foreign key "site_id".
Our Data table for Ozone average levels
Our Air quality locations table
Results:
We now have an additional field called avg_ozone_2010_2011 added to our air_quality_locations data table.
It appears that some air quality stations don't have an average ozone concentration and show NULL values instead.
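The NULL values are what a join produces when it keeps every station record, including those with no matching ozone row. As a minimal sketch of that behaviour in plain Python (not ArcGIS's own join implementation), the example below joins on site / site_id both ways. The station names, site numbers and ozone values are made-up sample data, and `join_ozone` is a hypothetical helper.

```python
# Hypothetical sample rows; site numbers and ozone values are made up.
stations = [
    {"site": 1001, "name": "Station A"},
    {"site": 1002, "name": "Station B"},
    {"site": 1003, "name": "Station C"},
]
ozone_averages = [
    {"site_id": 1001, "avg_ozone_2010_2011": 0.031},
    {"site_id": 1003, "avg_ozone_2010_2011": 0.044},
]


def join_ozone(station_rows, ozone_rows, keep_all=True):
    """Attach avg_ozone_2010_2011 to each station by matching site to site_id."""
    lookup = {row["site_id"]: row["avg_ozone_2010_2011"] for row in ozone_rows}
    joined = []
    for station in station_rows:
        value = lookup.get(station["site"])  # None plays the role of NULL
        if value is None and not keep_all:
            continue  # "keep only matching records": drop unmatched stations
        joined.append({**station, "avg_ozone_2010_2011": value})
    return joined


keep_all_result = join_ozone(stations, ozone_averages, keep_all=True)
matches_only_result = join_ozone(stations, ozone_averages, keep_all=False)
```

With `keep_all=True` the unmatched station stays in the table with a NULL-like value. With `keep_all=False` it is dropped, which has the same effect as filtering out the NULL rows afterwards.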
Using the "Identify" tool in ArcGIS, we can get an overview of the data in each layer. Clicking a feature shows all of its attribute values from the selected layer.
Now that air_quality_locations has ozone values, we need the average ozone level for each county in California.
For that, we need a way to add this information to census_tracts_with_income.
Before we get to that, note that the census tracts table currently describes every tract in California, about 8057 records. In other words, California is not subdivided into counties in this table. It is subdivided into smaller areas that must be grouped into counties before we can study the data at county level.
Sample of data table values for the census_tracts_with_income feature class.
We use the Dissolve tool to merge the polygons that belong to the same county into new polygons, one per county.
This gives us every county in California, which is the level at which we estimate average ozone.
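The attribute side of a dissolve is a group-by with a summary statistic. The sketch below shows only that part in plain Python. Merging the polygon geometry is not shown, and the county names, income values and the `dissolve_mean` helper are illustrative assumptions rather than anything taken from the assignment data.

```python
from collections import defaultdict

# Hypothetical tract rows; county names and income values are made up.
tracts = [
    {"county": "Alpha", "avg_income": 50000.0},
    {"county": "Alpha", "avg_income": 70000.0},
    {"county": "Beta", "avg_income": 40000.0},
]


def dissolve_mean(rows, key_field, value_field):
    """Group rows by key_field and return the mean of value_field per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_field]].append(row[value_field])
    return {key: sum(values) / len(values) for key, values in groups.items()}


county_income = dissolve_mean(tracts, "county", "avg_income")
```

Note that this is an unweighted mean of tract averages, so a tract with few households counts as much as a tract with many.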
Our Counties Layer
Notice that some counties contain no air quality stations. To get values for those counties, we build a TIN (triangulated irregular network) from the stations and use it to interpolate ozone in areas that have no data but lie near stations that do.
The symbology title says "elevation", but the values are actually the average ozone values we set as Z values.
The interpolated ozone surface now touches almost every county, so we can derive a mean value for each one. Interpolation can distort the results, because values between stations are estimates rather than measurements, but it is acceptable here because we are interested in the relationship between ozone and other attributes such as household income and elevation.
Note that the TIN is an intermediate step toward our goal of getting mean ozone levels across all of California. A TIN can be converted to a raster, and we know how to extract values from a raster, which is why we take this extra conversion step.
CELLSIZE was set to 30
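To make the TIN-to-raster step more concrete, here is a minimal sketch of linear interpolation inside a single triangle, sampled at cell centres. It is plain Python rather than the ArcGIS tool, it handles one triangle instead of a full triangulation, and the coordinates and ozone values are made up. Row 0 is the southernmost row in this sketch, which is an arbitrary choice.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of point p relative to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate(p, triangle):
    """triangle is three (x, y, z) tuples; return z at p, or None if p is outside."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = triangle
    wa, wb, wc = barycentric_weights(p, (ax, ay), (bx, by), (cx, cy))
    if min(wa, wb, wc) < -1e-12:
        return None  # outside the triangle: no data for this cell
    return wa * az + wb * bz + wc * cz


def rasterize(triangle, x_min, y_min, n_cols, n_rows, cell_size):
    """Sample the triangle's plane at the centre of every grid cell."""
    grid = []
    for row in range(n_rows):
        y = y_min + (row + 0.5) * cell_size
        grid.append([
            interpolate((x_min + (col + 0.5) * cell_size, y), triangle)
            for col in range(n_cols)
        ])
    return grid


# Hypothetical stations: (x, y, average ozone as the Z value).
triangle = [(0.0, 0.0, 0.02), (60.0, 0.0, 0.04), (0.0, 60.0, 0.06)]
ozone_grid = rasterize(triangle, x_min=0.0, y_min=0.0, n_cols=2, n_rows=2, cell_size=30.0)
```

Cells whose centre falls outside the triangle get `None`, which mirrors the way areas outside the TIN end up without data.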
Our raster data looks the same as the TIN but is very different under the hood.
Using the Zonal Statistics as Table tool, we get the mean of the ozone values from the newly created raster for each county.
The tool writes the statistics we select (in our case only MEAN) to a new table. We want that value in the counties table, which already contains household income, so we join the two tables afterwards. With both attributes in one table, we can then create a graph of the relationship between them.
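Conceptually, a zonal mean averages the raster cells that fall inside each zone. The sketch below illustrates the idea and the follow-up join in plain Python. It assumes the counties have already been rasterized to a grid of zone labels aligned with the ozone grid, and all names and values are made up.

```python
def zonal_mean(zone_grid, value_grid):
    """Return the mean of value_grid cells per zone label, skipping no-data cells."""
    sums, counts = {}, {}
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone, value in zip(zone_row, value_row):
            if zone is None or value is None:
                continue  # cell is outside every zone or has no ozone value
            sums[zone] = sums.get(zone, 0.0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical aligned grids: county label per cell and ozone value per cell.
zone_grid = [["Alpha", "Alpha", "Beta"],
             ["Alpha", "Beta", "Beta"]]
ozone_grid = [[0.030, 0.040, 0.050],
              [0.050, None, 0.070]]
county_income = {"Alpha": 60000.0, "Beta": 40000.0}  # made-up dissolve output

mean_ozone = zonal_mean(zone_grid, ozone_grid)

# Join the MEAN statistic back to the county attributes by county name.
county_table = {
    county: {"avg_income": income, "MEAN": mean_ozone.get(county)}
    for county, income in county_income.items()
}
```

A county that no ozone cell touches would get `None` for MEAN here, which matches the counties that the interpolated surface does not reach.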
The graph shows a somewhat negative trend between household income and ozone concentration, so, judging by this graph alone, our hypothesis was not correct.
The next part is about showing our findings on the map using proper symbology.
The next item on the list is comparing elevation with ozone concentration. We start by loading the elevation raster. We need to extract values from it and attach them to a layer that already has ozone concentrations, so that we can compare the two.
We can add the data to our California counties layer or simply add it to the air quality stations, since we are looking for a trend rather than exact values. Using the Extract Multi Values to Points tool, we extract the raster values at the locations of our air quality stations.
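Extracting a raster value at a point comes down to converting the point's coordinates into a row and column. The following sketch shows that lookup in plain Python. It assumes a north-up raster stored top row first, the elevation values and coordinates are made up, and `sample_raster` is a hypothetical helper rather than an ArcGIS function.

```python
def sample_raster(raster, x_min, y_max, cell_size, x, y):
    """Return the value of the cell containing (x, y), or None if outside.

    raster rows are stored from top (north) to bottom (south).
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if 0 <= row < len(raster) and 0 <= col < len(raster[0]):
        return raster[row][col]
    return None


# Hypothetical 2 x 2 elevation raster with 30-unit cells.
dem = [[100, 110],
       [120, 130]]
station_points = [(45.0, 50.0), (10.0, 10.0), (100.0, 10.0)]
elevations = [
    sample_raster(dem, x_min=0.0, y_max=60.0, cell_size=30.0, x=x, y=y)
    for x, y in station_points
]
```

The third point lies outside the raster extent, so it receives `None` instead of an elevation.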
As before, we create a graph, this time showing ozone levels vs. elevation.
There appears to be a positive trend: according to the graph, altitude and ozone concentration are correlated, with higher altitude corresponding to higher ozone concentration.
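A trend read off a graph can be backed up with a number. One common choice, which this assignment did not necessarily use, is the Pearson correlation coefficient: values near +1 indicate a positive linear trend and values near -1 a negative one. The sketch below computes it in plain Python, and the sample values are made up.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation of xs and ys, skipping pairs with None."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical station values: elevation and average ozone.
elevation = [10.0, 200.0, None, 900.0]
ozone = [0.020, 0.030, 0.050, 0.045]
r = pearson_r(elevation, ozone)
```

The same function could be applied to the county table to quantify the household income vs. ozone trend.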
Rough summary of the steps:
Imported the census_tracts_with_income layer, then dissolved it on the bloc_id field, taking the mean income as the statistic for each county (tool used: Dissolve)
Imported the Air_quality_locations layer and the ozone_averages table, and joined the two on the site / site_id fields respectively (tool used: Join)
Selected only the points whose ozone value is not NULL (tool used: Select by Attribute)
Created a TIN from the selected Air_quality_locations points (tool used: Create TIN)
Generated a raster from the TIN (tool used: TIN to Raster) with CELLSIZE = 30
Computed the average ozone level for each county with the Zonal Statistics as Table tool and joined the result to the dissolved county layer
Adjusted the symbology to show ozone levels in 6 classes using the equal interval classification method
Imported the elevation DEM layer, then used the Extract Multi Values to Points tool to attach the elevation value to each of the selected air_quality_locations points
Created a graph of elevation vs. ozone levels from the air_quality_locations layer
Created a graph of household income vs. ozone levels from the county layer table
Added all the required elements to the layout view to create the output that communicates our results
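For the symbology step in the summary above, equal interval classification splits the range between the minimum and maximum value into classes of identical width. The sketch below shows the arithmetic for 6 classes in plain Python. The helper names and sample values are illustrative, and ArcGIS may treat values that fall exactly on a class boundary differently.

```python
def equal_interval_breaks(values, n_classes=6):
    """Return the upper bound of each class for an equal interval scheme."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Return the zero-based class index of value given the class upper bounds."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # guard against floating-point rounding at the maximum


# Hypothetical values chosen so that the breaks are round numbers.
sample_values = [0, 12, 30, 60]
breaks = equal_interval_breaks(sample_values, n_classes=6)
classes = [classify(value, breaks) for value in sample_values]
```

Because the class width depends only on the minimum and maximum, a single extreme value can leave several classes nearly empty.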