The goal of this lab was to analyze spatial patterns in United States and Canadian bigfoot sighting data, and visualize clusters with high and low "reliability" of bigfoot sightings.
To complete this lab, I first loaded the shapefiles (bigfoot location sightings, US population by county, and US/Canada boundaries) into ArcGIS Pro. In addition to sighting locations, the bigfoot shapefile also contained a sighting reliability score, running from 1 (low reliability) to 5 (high reliability).
Next, I ran the Average Nearest Neighbor tool, using the Euclidean distance method, on the bigfoot points to assess whether the sightings are clustered or randomly distributed. This tool measures the distance between each feature and its nearest neighbor, then compares the average of those distances with the average distance expected if the same number of points were randomly distributed over the same area. An observed average lower than the expected one suggests the points are clustered.
With this tool, I generated a significance report by checking the "Generate Report" box. The report returned a p-value of 0.000000 and a z-score with a large absolute value (-65.35), indicating that the points are not randomly distributed and are highly clustered (i.e., the observed mean distance between points was much lower than expected).
Average Nearest Neighbor Summary
Observed Mean Distance: 19064.3701 Meters
Expected Mean Distance: 47325.4297 Meters
Nearest Neighbor Ratio: 0.402836
z-score: -65.347865
p-value: 0.000000
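The calculation behind this report can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: it assumes the commonly documented formulas (expected distance of 0.5 / sqrt(n / A) and standard error of 0.26136 / sqrt(n^2 / A)), uses a brute-force neighbor search, and runs on four made-up points in a made-up study area. The function name and sample data are hypothetical.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (observed, expected, ratio, z) for a list of (x, y) points.

    Assumed formulas: expected = 0.5 / sqrt(n / area),
    standard error = 0.26136 / sqrt(n^2 / area).
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Brute-force search: distance from point i to every other point.
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    std_err = 0.26136 / math.sqrt(n * n / area)
    z = (observed - expected) / std_err
    return observed, expected, observed / expected, z

# Hypothetical data: four points packed into one corner of a 100 x 100 area.
sample_points = [(0, 0), (1, 0), (0, 1), (1, 1)]
observed, expected, ratio, z = average_nearest_neighbor(sample_points, area=10_000)
print(observed, expected, ratio, z)

# The ratio in the report above is simply observed / expected.
report_ratio = 19064.3701 / 47325.4297
print(round(report_ratio, 6))
```

A ratio below 1 with a negative z-score, as in both the toy data and the report, points toward clustering; a ratio above 1 would point toward dispersion.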
Next, I tested whether the sighting reliability scores were spatially clustered, dispersed, or randomly distributed, and whether that pattern was statistically significant, using the Spatial Autocorrelation (Global Moran's I) tool, with the bigfoot dataset as the input feature class and "Class" (sighting reliability score) as the input field.
It returned a Moran's Index higher than 1 (1.415684). Moran's Index typically falls between -1 and +1, so I re-ran the tool, this time standardizing the spatial weights by selecting "Row" in the "Standardization" drop-down and setting the "Conceptualization of Spatial Relationships" to "Fixed Distance Band", with a distance of 1,000,000 m (1,000 km).
This addresses the problem of the highly skewed bigfoot data by forcing the tool to calculate Moran's Index from eight or more neighbors per feature instead of a minimum of one. The corrected summary is below:
Global Moran's I Summary
Moran's Index: 0.006809
Expected Index: -0.000306
Variance: 0.000001
z-score: 8.299707
p-value: 0.000000
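To make the role of row standardization and the distance band more concrete, here is a minimal sketch of Global Moran's I in plain Python. It is an illustration under stated assumptions, not the tool's implementation: weights are 1 for neighbors within the band and 0 otherwise, each row of weights is optionally divided by its sum, and the four points and their scores are made up. The function names are hypothetical. The expected index under no spatial autocorrelation is taken to be -1 / (n - 1); if that formula applies here, the reported -0.000306 would correspond to roughly 3,270 features.

```python
import math

def morans_i(points, values, band, row_standardize=True):
    """Global Moran's I with binary fixed-distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Binary weights: 1 if two different features lie within the band.
    w = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= band:
                w[i][j] = 1.0

    if row_standardize:
        # Divide each row by its sum so every feature's weights total 1.
        for i in range(n):
            row_sum = sum(w[i])
            if row_sum > 0:
                w[i] = [x / row_sum for x in w[i]]

    s0 = sum(sum(row) for row in w)  # total of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))

def expected_index(n):
    """Expected Moran's I when values are randomly arranged (assumed formula)."""
    return -1.0 / (n - 1)

# Hypothetical data: four points on a line, one unit apart.
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
i_clustered = morans_i(line, [1, 1, 5, 5], band=1)    # similar scores adjacent
i_alternating = morans_i(line, [1, 5, 1, 5], band=1)  # unlike scores adjacent
print(i_clustered, i_alternating, expected_index(len(line)))
```

Positive values indicate that similar scores sit near each other, negative values that unlike scores do, and values near the expected index indicate no spatial pattern.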
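With a p-value of 0.000000 and a large positive z-score, the reliability scores appear to be clustered rather than randomly distributed, although the index itself (0.006809) is close to zero, so the clustering is weak.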
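A p-value printed as 0.000000 is a rounded value rather than an exact zero. The sketch below shows why, assuming the z-scores are compared against a standard normal distribution with a two-sided test. The helper name is hypothetical, and the z-scores are the ones from the two reports above.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z-score under a standard normal distribution."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))

p_moran = two_sided_p(8.299707)   # Global Moran's I report
p_ann = two_sided_p(-65.347865)   # Average Nearest Neighbor report
p_reference = two_sided_p(1.96)   # familiar reference point, close to 0.05

# Both reported z-scores give p-values far too small to show at six decimals.
print(f"{p_moran:.6f}", f"{p_ann:.6f}", f"{p_reference:.4f}")
```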
Next, I ran a hot spot analysis with the Hot Spot Analysis (Getis-Ord Gi*) tool. The hot spot analysis identifies spatial clusters within a dataset that have high (hot) or low (cold) values, using the Getis-Ord Gi* statistic.
I ran it first with the settings as follows:
The "Conceptualization of Spatial Relationships" field as "Inverse Distance"
No distance band input value (Arc will calculate this).
Using the inverse distance setting tells the tool to weight nearby features more than features farther away when running the spatial relationship analysis.
Then I ran it again with the following settings:
The "Conceptualization of Spatial Relationships" field as "Fixed Distance Band"
No distance band input value (Arc will calculate this).
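Using the fixed distance setting tells the tool to analyze each feature in the context of its neighbors within a set distance: features inside that distance are included in the calculation, and features outside it are ignored. With no value entered, the tool calculates a default distance for the dataset.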
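The difference between the two conceptualizations comes down to how a neighbor's weight depends on its distance. The sketch below is a simplified illustration with hypothetical function names and made-up distances; it assumes a plain 1/d form for inverse distance and an all-or-nothing cutoff for the fixed band, and it ignores details the tool handles itself, such as coincident points and default thresholds.

```python
def inverse_distance_weight(distance, power=1.0):
    """Weight falls off smoothly as distance grows (simplified 1 / d^power)."""
    # Coincident points (distance 0) are given zero weight in this sketch.
    return 0.0 if distance == 0 else 1.0 / distance ** power

def fixed_band_weight(distance, band):
    """Weight is 1 inside the distance band and 0 outside it."""
    return 1.0 if distance <= band else 0.0

# Hypothetical neighbor distances in meters: 10 km, 50 km, and 200 km away.
distances = [10_000, 50_000, 200_000]
inverse_weights = [inverse_distance_weight(d) for d in distances]
band_weights = [fixed_band_weight(d, band=50_000) for d in distances]

print(inverse_weights)  # steadily shrinking weights
print(band_weights)     # [1.0, 1.0, 0.0]
```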
Finally, I ran the tool again with these settings:
The "Conceptualization of Spatial Relationships" field as "Fixed Distance Band"
Distance band input as 50,000 m (50 km)
Setting the distance band manually tells the tool to consider only features within 50 km when assessing a point's significance.
This run produced a cluster of highly significant low-reliability sightings in Missouri, and highly significant high-reliability sighting clusters in the Pacific Northwest, California, near the Great Lakes, and in Florida.
To finalize the map, I symbolized each US county with a color ramp indicating its population density and then overlaid the 50 km bigfoot hot spot analysis shapefile on top. Then I added text and a legend, and saved the maps as .png and .pdf files.
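For readers curious how a hot or cold spot is scored, the following is a minimal sketch of the Getis-Ord Gi* z-score with a fixed distance band. It assumes the standard textbook form of the statistic, with binary weights that include the feature itself, and it runs on six made-up points with made-up reliability scores. The function name is hypothetical, and this is not the tool's implementation.

```python
import math

def getis_ord_gi_star(points, values, band):
    """Return a Gi* z-score for every feature, using a fixed distance band.

    Note: this sketch divides by zero if all values are identical or if
    every feature falls inside every other feature's band.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    scores = []
    for xi, yi in points:
        # Binary weights; the feature itself (distance 0) is included.
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        sum_wx = sum(wj * v for wj, v in zip(w, values))
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append((sum_wx - mean * sum_w) / denom)
    return scores

# Hypothetical data: high scores on the left of a line, low scores on the right.
line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
reliability = [5, 5, 5, 1, 1, 1]
gi = getis_ord_gi_star(line, reliability, band=1)
print([round(g, 3) for g in gi])
```

Large positive scores mark features surrounded by high values (hot spots), and large negative scores mark features surrounded by low values (cold spots). Changing `band` changes which neighbors count, which is why the three runs described above can highlight different clusters.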
High resolution (PDF) download available here