Problem
Fort Worth Fire Department, Oleander Library, and Dallas County request spatial data analysis given specified criteria. Fort Worth Fire Department requests multiple spatial data analyses of incident calls received during January and February 2015. Oleander Library requests spatial data analysis to determine areas of customer clustering. Dallas County requests spatial data analysis to determine where high and low median incomes are clustered.
Analysis Procedures
Data provided by the Fort Worth Fire Department included January and February 2015 incident calls (separate shapefiles), potential fire station sites, actual fire station sites, alarm territories, battalion boundaries, and major roads in the Fort Worth area. Data provided by Oleander Library included library districts, patron locations, major roads, land records, and an apartment complex shapefile. Data provided by Dallas County included census block data, with median income as an attribute, and major roads for the area. In each case, data was added to ArcMap and the appropriate spatial analysis tools were run. Where appropriate, results were evaluated for statistical significance. Maps were symbolized for visual display.
Fort Worth Fire Department:
In each case, as a first step, available data was added to ArcMap for analysis.
- To determine if false alarms cluster in Battalion 2, I set a query to define the appropriate calls for analysis (false alarms) and then ran the Average Nearest Neighbor tool using February 2015 incident calls. I reviewed the graphed results of the analysis and determined whether there was statistically significant clustering. I symbolized the map for visual display.
- To determine if there was clustering of high priority calls, I first prepared by using the Calculate Distance Band from Neighbor Count tool to determine the average distance needed for the spatial data analysis tool. I used this distance in the High/Low Clustering (Getis-Ord General G) tool to determine high priority call clustering using January 2015 incident calls. I repeated the use of the tool at incremental distances to determine peak Z-score and statistical significance. I symbolized the map for visual display.
- To determine if there was clustering of high priority calls at multiple distances, I used the Multi-Distance Spatial Cluster Analysis tool to obtain a Ripley’s K Function output for January 2015 incident calls. I then graphed the difference between the observed and expected Ripley’s K values to identify the peak value (largest difference) and the related distance. I then reran the Multi-Distance Spatial Cluster Analysis tool to include a confidence envelope. I graphed the difference in Ripley’s K values (observed - upper confidence envelope) to determine distance of maximum clustering. I symbolized the map for visual display.
- To determine if there was spatial clustering of incident call types (at both high and low values), I ran the Cluster and Outlier Analysis tool. The results were saved as a feature layer, added to the map, and symbolized for analysis. The results were then compared to the census block median income data for further analysis. I symbolized the map for visual display. I then reran the Cluster and Outlier Analysis tool using a fixed distance band. The resulting layer was saved as a feature layer, added to the map, and symbolized for analysis. The results were again compared to the census block median income data for further analysis. I symbolized this second map for visual display.
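As an illustration of what the Average Nearest Neighbor step in the first bullet computes, here is a minimal sketch in plain Python. It is not the ArcMap tool itself. The function name, the sample points, and the study-area value are assumptions made for this example. It uses the standard formulas for the expected mean distance under a random pattern, 0.5 / sqrt(n / A), and its standard error, 0.26136 / sqrt(n^2 / A).

```python
import math

def average_nearest_neighbor(points, area):
    """Return (ratio, z_score) for (x, y) points in a study area of given size.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    Hypothetical helper for illustration; not an ArcMap/arcpy function.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Observed mean distance from each point to its nearest neighbor
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    # Expected mean distance and standard error for a random pattern
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error

# Made-up example: four calls packed into one corner of a 10 x 10 area
clustered = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1)]
ratio, z = average_nearest_neighbor(clustered, area=100.0)
print(f"ratio={ratio:.3f}, z={z:.3f}")
```

A strongly negative z-score with a ratio below 1 corresponds to the statistically significant clustering that the tool's graphed report would show. Note that the result depends heavily on the area value supplied.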
Oleander Library:
Available data (including a required grid) was added to ArcMap for analysis. I performed a spatial join between the provided grid and the patron locations, summing the customer counts for each grid cell. I excluded cells with a count of zero from the analysis. I then used the Spatial Autocorrelation tool to calculate Moran’s I and examined the graphed results, including the index, z-score, and p-value. I repeated this process with the Spatial Autocorrelation tool at incremental distances to find the peak z-score. The map was symbolized for display.
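The Moran’s I index reported by the Spatial Autocorrelation tool can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names, the binary fixed-distance weights, and the sample counts are hypothetical, and the z-score and p-value that the tool also reports are not computed here.

```python
import math

def distance_band_weights(coords, threshold):
    """Binary weights: 1 if two distinct cells lie within the threshold distance."""
    n = len(coords)
    return [[1.0 if i != j and math.hypot(coords[i][0] - coords[j][0],
                                          coords[i][1] - coords[j][1]) <= threshold
             else 0.0
             for j in range(n)] for i in range(n)]

def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Made-up example: six grid cells in a row with customer counts per cell
cells = [(float(i), 0.0) for i in range(6)]
counts = [1, 1, 1, 5, 5, 5]  # low counts next to low, high next to high
w = distance_band_weights(cells, threshold=1.0)
print(round(morans_i(counts, w), 3))
```

Values above the expected index of -1/(n - 1) point toward clustering of similar counts, and values below it point toward dispersion. Rerunning with a range of threshold values mirrors the incremental-distance step described above.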
Dallas County:
Available data was added to ArcMap for analysis. I used the Hot Spot Analysis tool to identify clustered areas of low and high values of median income by census block. I symbolized the map for display.
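The Hot Spot Analysis tool is based on the Getis-Ord Gi* statistic, which returns a z-score for each feature. The following is a minimal sketch of that statistic in plain Python, assuming a binary fixed distance band in which each feature counts as its own neighbor. The function name, coordinates, and income values are hypothetical, and the settings used in the actual ArcMap run are not stated above.

```python
import math

def gi_star(coords, values, distance):
    """Return a Getis-Ord Gi* z-score for every feature.

    Binary weights: w_ij = 1 when feature j lies within `distance` of
    feature i (the feature itself included). Assumes the values are not
    all identical and that no feature has every feature as a neighbor,
    since either case makes the denominator zero.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in coords:
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= distance else 0.0
             for xj, yj in coords]
        sw = sum(w)
        sw2 = sum(x * x for x in w)
        num = sum(wj * v for wj, v in zip(w, values)) - mean * sw
        den = s * math.sqrt((n * sw2 - sw * sw) / (n - 1))
        scores.append(num / den)
    return scores

# Made-up example: ten census blocks in a row, incomes in tens of thousands
blocks = [(float(i), 0.0) for i in range(10)]
income = [9, 9, 9, 1, 1, 1, 1, 1, 1, 1]
z_scores = gi_star(blocks, income, distance=1.0)
print([round(z, 2) for z in z_scores])
```

Large positive z-scores mark hot spots (high incomes surrounded by high incomes) and large negative z-scores mark cold spots. The map symbology in the tool's output layer is driven by these scores and their p-values.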
General workflow diagram
Results
Map highlighting high/low cluster analysis based on median incomes at census block level
Map highlighting clustering of library patrons
Map highlighting clustered areas of low and high values of median income by census block in Dallas County
Map highlighting high/low cluster analysis based on median incomes at census block level at fixed distance band
Application & Reflection
Spatial data analysis provides a convenient way of determining spatial clustering and dispersion of features as well as attributes of interest. These types of analyses can be used across a variety of industries and public services. As a sociologist interested in food systems, I would be interested in determining whether there are spatial patterns among farmers markets here in North Carolina. In particular, I would be interested in assessing whether any spatial clustering or dispersion is related to income. For this project, I would obtain a list of farmers market locations in North Carolina from the Farmer’s Market Online website. These would be geocoded as points against a TIGER/Line shapefile of North Carolina obtained from the US Census Bureau. Using the Average Nearest Neighbor tool, I would determine whether the farmers markets in North Carolina were clustered or dispersed. If the markets were clustered, I would then compare the pattern to median income by census block in North Carolina, which can be obtained from the US Census Bureau, to determine whether there is any relationship between the clustering and median income levels.