Two scenarios are investigated for spatial patterns in the data. The first involves Fort Worth Fire Department Emergency Medical Services (EMS) calls for Battalion 2. The department would like to know where the hot spots and cold spots are and whether they relate to the census tracts. In the second scenario, the Dallas County Economic Development Office in Texas wants to know where median household income clusters so that it can target its charity efforts at high-income households and its job-creation efforts at low-income households.
The data for the two scenarios comes from Fort Worth Fire Department Jan 15 calls and from Dallas County. ArcGIS Pro is used in both scenarios. The first scenario runs the Anselin Local Moran’s I to find clustering in the high-priority and low-priority calls. This tool also shows whether there are outliers among the high-priority calls and among the low-priority calls. The Anselin Local Moran’s I is then run again with a distance band. Since prior analysis established a significant z-score at 900 feet, this is the distance band of interest. The second scenario computes the Hot Spot Analysis Gi* index for the Dallas County census tract variable P053001 at a distance of 5280 feet.
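To make the distance-band idea concrete, here is a minimal NumPy sketch of the textbook Local Moran's I with a fixed distance band and row-standardized weights. It is not the ArcGIS Pro implementation: the function names, the sample coordinates, and the numeric priority codes are made up for illustration, and the significance testing that the tool performs is left out, so every feature gets a cluster-type label here.

```python
import numpy as np

def local_morans_i(coords, values, band):
    """Textbook Local Moran's I with a fixed distance band (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    # Pairwise Euclidean distances between all features.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Binary weights: 1 for neighbors inside the band, 0 otherwise; no self-weight.
    w = (dist <= band).astype(float)
    np.fill_diagonal(w, 0.0)
    # Row-standardize so each feature's neighbor weights sum to 1.
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    dev = x - x.mean()                 # deviation of each value from the mean
    m2 = (dev ** 2).sum() / len(x)     # second moment of the values
    lag = w @ dev                      # mean deviation of the neighbors
    return dev * lag / m2, dev, lag

def cluster_type(dev, lag):
    """Quadrant label: HH/LL are clusters, HL/LH are outliers."""
    if dev == 0 or lag == 0:
        return "--"
    return ("H" if dev > 0 else "L") + ("H" if lag > 0 else "L")

# Made-up sample: coordinates in feet, values are assumed priority codes.
coords = [(0, 0), (100, 0), (0, 100), (50, 50),
          (5000, 5000), (5100, 5000), (5000, 5100)]
priority = [3, 3, 3, 1, 1, 1, 1]

local_i, dev, lag = local_morans_i(coords, priority, band=900)
types = [cluster_type(d, l) for d, l in zip(dev, lag)]
for i_val, t in zip(local_i, types):
    print(f"{i_val:7.3f}  {t}")
```

In this sample both the high-value group and the low-value group get a positive index, while the low value sitting among high values gets a negative one. That is why the cluster type is needed alongside the index to tell high clusters from low clusters.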
As is standard practice, the data is first checked for projection. The first scenario used the Cluster and Outlier Analysis (Anselin Local Moran’s I) tool to locate clusters in high- and low-priority calls. The result is shown both in the cluster type and in the z-scores for clusters and outliers of high- and low-priority calls. The red dots are high positive z-scores, which include clusters of both high- and low-ranked calls, so the cluster type output was needed to distinguish between the two. The result was overlaid on the census tract Median Household Income data to determine whether there was a relationship between call priority and household income. The Cluster and Outlier Analysis tool was then run again with a distance band of 900 feet. The result was again shown in the cluster type and in the z-scores for clusters and outliers of high- and low-priority calls. This time the result was overlaid on the station boundaries and the network of roads and highways. The second scenario used the Hot Spot Analysis (Getis-Ord Gi*) tool, with 5280 feet as the distance threshold, to identify the hot spots (red) where high-income households cluster and the cold spots (blue) where low-income households cluster.
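The Gi* statistic used in the second scenario can be sketched in a few lines of NumPy. This follows the standard Getis-Ord Gi* z-score formula with binary fixed-distance weights in which each feature counts as its own neighbor. It is an illustration, not the ArcGIS Pro tool itself: the function name, the centroid coordinates, and the income figures are invented for the example.

```python
import numpy as np

def getis_ord_gi_star(coords, values, band):
    """Gi* z-scores with binary fixed-distance weights (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    n = len(x)
    # Pairwise Euclidean distances between feature centroids.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # The feature itself (distance 0) is inside the band, as Gi* requires.
    w = (dist <= band).astype(float)
    x_bar = x.mean()
    s = np.sqrt((x ** 2).mean() - x_bar ** 2)
    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)
    numer = w @ x - x_bar * w_sum
    denom = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    # A feature whose band covers every feature has a zero denominator; report 0.
    return np.divide(numer, denom, out=np.zeros(n), where=denom > 0)

# Made-up tract centroids (feet) and median household incomes (dollars).
centroids = [(0, 0), (3000, 0), (0, 3000),
             (50000, 50000), (53000, 50000), (50000, 53000)]
income = [90000, 95000, 100000, 30000, 28000, 25000]

z = getis_ord_gi_star(centroids, income, band=5280)
for score in z:
    label = "hot" if score > 0 else "cold"
    print(f"{score:6.2f}  {label}")
```

Unlike Local Moran's I, the sign of the Gi* z-score separates the two kinds of cluster directly: positive scores mark high-value neighborhoods and negative scores mark low-value ones.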
The statistical analysis concepts learned here were applied to point data, except in the Dallas scenario. They can also be applied to areal data, as the following example shows.
Problem Description: In my study of the Ebola virus outbreak in West Africa for the geocoding assignment, I wrote about geocoding the tabular data from hospitals in those countries. As a matter of standard protocol, epidemiologists want to know where the hot spots for the virus are.
Data Needed: The tabular data is downloaded from the World Health Organization (WHO). The administrative area boundaries for West Africa can be downloaded from https://data.humdata.org/dataset/west-and-central-africa-administrative-boundaries-levels. This website belongs to the United Nations Office for the Coordination of Humanitarian Affairs’ Center for Humanitarian Data.
Analysis Procedures: The geocoded data can be converted into areal data by running Spatial Join against the administrative area boundaries. The Spatial Join tool generates a count field that aggregates the Ebola cases found within each administrative area boundary. The Hot Spot Analysis (Getis-Ord Gi*) tool is then run on this count field to identify the hot spots. The Anselin Local Moran’s I can also be applied to the areal data, but it generates high positive z-scores for both high-value and low-value clusters, so additional work on the cluster type is needed to tease out the high-value clusters from the low-value ones.
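The aggregation step described above amounts to counting points inside polygons. The sketch below shows that idea in plain Python with a ray-casting point-in-polygon test. It stands in for what a spatial join produces and does not call ArcGIS: the function names, the area names, and the coordinates are hypothetical, and the polygons are simple squares rather than real administrative boundaries.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_cases_by_area(points, areas):
    """Count geocoded points per area, similar to a spatial-join count field."""
    counts = {name: 0 for name in areas}
    for x, y in points:
        for name, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break  # assume areas do not overlap
    return counts

# Hypothetical areas (square polygons) and made-up case locations.
areas = {
    "Area A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Area B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
cases = [(2, 3), (5, 5), (12, 4), (25, 5)]

print(count_cases_by_area(cases, areas))
```

The resulting per-area counts play the role of the analysis field for the hot spot step. A case that falls outside every boundary, like the last point here, is simply not counted.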