Census Data

John Kamanga's Course Portfolio

Assessing the distribution patterns of Median Household Income for Baltimore County, Maryland, USA.

Problem Statement

The COVID-19 pandemic caused significant economic hardship for households in America. Millions of people lost their jobs, and as they started bouncing back, the Russia-Ukraine war added more salt to an already bitter soup. America's lower- and middle-income families were affected the most because they had low resilience levels. Baltimore County Council officials have received funding from the federal government to help households bounce back from these hardships. However, the Council has limited information on which block groups are affected the most, and on whether lower- and middle-income households are clustered, so that it can provide targeted interventions. The solutions on the table include rental assistance, mortgage relief, food stamps and meal programs, homeless services, and employment opportunities. The objective of this assignment is therefore to extract median household income for Baltimore County, Maryland, and use it as a proxy to identify block groups with high and low poverty levels. This will inform where the County Council should prioritize its support.

Analysis procedure

To address the problem, I used ArcGIS Pro 3.0.2. I extracted data from the U.S. Census Bureau's dissemination platform and the Topologically Integrated Geographic Encoding and Referencing (TIGER) system, and joined the tabular data to the shapefile on the GEOID field. Because the join matches records by a shared key rather than by location, it is an attribute (table) join rather than a spatial join. I displayed the joined dataset using graduated color symbology in ArcGIS Pro to make the thematic map of median household income in Baltimore County, Maryland. I then used the Cluster and Outlier Analysis (Anselin Local Moran's I) tool to identify cold spots, which would be prioritized for the economic hardship relief program. The tabular data was the Estimate Median household income in the past 12 months (in 2018 inflation-adjusted dollars) (field B19013_001E), downloaded from https://data.census.gov/cedsci/all?t=Income%20and%20Poverty with a file name of ACSDT5Y2018.B19013_2022-10-27T070909. The shapefile was the block group shapefile for Maryland, downloaded from https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2018&layergroup=Block+Groups with a file name of tl_2018_24_bg.

In Part A of the assignment, I started by extracting the data from the U.S. Census Bureau's dissemination platform. I applied the following filters: 1) on topics, income and poverty; 2) on geography, “all block groups within Baltimore”; and 3) on year, 2018. I then downloaded the data. Next, I used the Topologically Integrated Geographic Encoding and Referencing (TIGER) system, where I selected Maryland, to download the shapefile. After preparing the tabular data, I used the Excel To Table tool to convert it into a table in ArcGIS Pro. I realized that the field in question (B19013_001E) had been imported as a string field, which cannot be symbolized with graduated colors. I therefore created a new numeric field (Median Income Estimate) and used Calculate Field to copy the values from the text field into it. Secondly, I ran a definition query on the Maryland block group shapefile in order to keep only the Baltimore County block groups. I restricted the COUNTYFP field to 005, which is the code for Baltimore County. I also created a new GEOID field in the Median Household Income table to serve as the joining field with the shapefile. The GEOID in the tabular data had 21 characters, 9 more than the 12-digit GEOID in the shapefile. I populated the new field from the original GEOID field using a Python 3 expression in the Calculate Field pane that copied only the last 12 characters. Then I joined the GEOID field in the shapefile with the newly created GEOID field in the Median Household Income table (an attribute join on the shared key). Finally, I displayed the Median Household Income field using graduated colors.
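The data preparation steps above can be sketched in plain Python, outside ArcGIS Pro. This is a minimal illustration under stated assumptions, not the exact expressions used in the assignment: the column names `GEO_ID`, `GEOID`, and `COUNTYFP`, the helper names, and the sample values are all assumed for illustration. The sketch also assumes that some ACS estimates may arrive as annotated text such as "250,000+" or "-", which is one reason a string field needs cleaning before it can be treated as a number.

```python
def last_12(geoid):
    """Keep the last 12 characters, mirroring a slice such as [-12:]."""
    return geoid[-12:]


def to_income(text):
    """Convert an estimate stored as text to a float, or None if not numeric.

    Assumption: commas and a trailing "+" or "-" may appear in the text.
    """
    cleaned = text.replace(",", "").strip().rstrip("+-")
    return float(cleaned) if cleaned.isdigit() else None


def join_income(block_groups, acs_rows, county="005"):
    """Attribute join: match block groups to ACS rows on the 12-digit GEOID."""
    income_by_geoid = {
        last_12(row["GEO_ID"]): to_income(row["B19013_001E"]) for row in acs_rows
    }
    joined = []
    for bg in block_groups:
        if bg["COUNTYFP"] != county:  # same idea as the definition query
            continue
        joined.append({**bg, "MedianIncome": income_by_geoid.get(bg["GEOID"])})
    return joined


# Made-up sample records, for illustration only.
acs_rows = [
    {"GEO_ID": "1500000US240054001001", "B19013_001E": "52,417"},
    {"GEO_ID": "1500000US240054001002", "B19013_001E": "-"},
]
block_groups = [
    {"GEOID": "240054001001", "COUNTYFP": "005"},
    {"GEOID": "240054001002", "COUNTYFP": "005"},
    {"GEOID": "245100101001", "COUNTYFP": "510"},
]
joined = join_income(block_groups, acs_rows)
```

Block groups whose estimate cannot be parsed end up with `None`, so they can be excluded or symbolized separately instead of being silently treated as zero income.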

In Part B, I used the Cluster and Outlier Analysis (Anselin Local Moran's I) tool, a local pattern analysis tool, to display the hot spots and cold spots in the data. The input feature class was the Baltimore County Median Income layer and the input field was Median Household Income. The conceptualization of spatial relationships was Inverse Distance, the number of permutations was left at zero, and the weights were not row standardized. The output showed statistically significant hot spots and cold spots, so I rejected the null hypothesis that the distribution of median household income was the result of random chance.
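To make the statistic behind the tool more concrete, here is a minimal sketch of a textbook form of Anselin's Local Moran's I with inverse distance weights and no row standardization. It is an illustration under assumptions, not ArcGIS Pro's implementation: the function names and the sample centroids and incomes are made up, the variance term uses the simple global form, and the significance test (z-scores and p-values) that decides whether a feature is reported as a cluster is left out.

```python
import math


def inverse_distance_weights(points):
    """Build an n x n matrix with w[i][j] = 1 / distance and a zero diagonal.

    Assumption: no two points coincide (that would divide by zero).
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] = 1.0 / math.dist(points[i], points[j])
    return weights


def local_morans_i(values, weights):
    """Return (I_i, quadrant) per feature, with I_i = (z_i / m2) * sum_j w_ij z_j."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]        # deviations from the mean
    m2 = sum(d * d for d in z) / n        # global variance term
    results = []
    for i in range(n):
        # Weighted sum of the neighbors' deviations (the spatial lag).
        lag = sum(weights[i][j] * z[j] for j in range(n) if j != i)
        own = "H" if z[i] > 0 else "L"
        neighbors = "H" if lag > 0 else "L"
        results.append((z[i] / m2 * lag, own + neighbors))
    return results


# Made-up centroids and incomes: two low-income neighbors, two high-income neighbors.
centroids = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
incomes = [10.0, 12.0, 50.0, 52.0]
results = local_morans_i(incomes, inverse_distance_weights(centroids))
```

A positive I_i means a feature resembles its neighbors (HH or LL), while a negative value flags a potential outlier (HL or LH). The quadrant label alone does not establish a hot or cold spot; that requires the significance test omitted here.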

Process Diagram

Figure 1: Process used to conduct cluster analysis using census data

Results

Using the Cluster and Outlier Analysis (Anselin Local Moran's I) tool to assess the local distribution patterns of median household income in Baltimore County.

I used the Baltimore County Median Income shapefile as the input feature class and focused on the Median Household Income field. The spatial relationship was set to Inverse Distance, and the number of permutations was set to zero. The figure below shows the output of the cluster and outlier analysis.

After visually assessing the thematic map, it seemed that the data exhibited some clustering. But the question was: is there indeed significant clustering, and if so, where? I then conducted Cluster and Outlier Analysis (Anselin Local Moran's I) to identify local cluster patterns that would help the County officials provide targeted interventions. I used Inverse Distance for the spatial relationship because I didn’t have adequate information on the optimal distance, that is, the distance at which the z-score peaks and clustering is most pronounced. If this information had been available, I could have used a fixed distance relationship or some other tool. The results showed statistically significant hot and cold spots, and I therefore rejected the null hypothesis that the observed spatial pattern was due to random chance.
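The difference between the two spatial relationships mentioned above comes down to how a neighbor's weight depends on distance. The sketch below is illustrative only: the function names, the `power` parameter, and the threshold value are assumptions, not ArcGIS Pro parameters or outputs.

```python
def inverse_distance_weight(distance, power=1.0):
    """Weight decays smoothly with distance; no cutoff needs to be chosen."""
    return 1.0 / distance ** power


def fixed_band_weight(distance, threshold):
    """Neighbors inside the band count equally; those outside are ignored."""
    return 1.0 if distance <= threshold else 0.0


# Compare the two schemes at a few made-up distances (arbitrary units).
distances = [1.0, 2.0, 4.0, 8.0]
comparison = [
    (d, inverse_distance_weight(d), fixed_band_weight(d, threshold=3.0))
    for d in distances
]
```

The fixed band needs a defensible threshold, which is why it depends on knowing the distance at which clustering is strongest, whereas inverse distance can be applied without that information.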

The results also showed that the block groups in the southern part of the county contain clusters of households with lower median income, which makes them the best areas to prioritize for economic empowerment support programs. The map also suggests that high-income households tend to cluster close to the middle of the county. The officials can also study these areas to find out what is helping their block groups maintain a higher median income.

Figure 2: Thematic map of median household income estimates by block group in Baltimore County

Figure 3: Median household income hot/cold spot map for Baltimore County

Application and Reflection

Problem statement: COVID-19, a cholera outbreak, inflation, and fuel and forex shortages have continuously exacerbated economic hardship in the already poverty-stricken nation of Malawi. The government and donor community would like to come up with targeted interventions to help households move out of the vicious cycle of poverty. The resources are not enough to cover everyone, hence the need to identify the most affected households using a poverty lens.

Data needed: 2018 Population and Housing Census metadata for Malawi and Malawi district boundary shapefiles. The data will be obtained from the Malawi National Statistical Office.

Analysis procedure: I will use the Cluster and Outlier Analysis (Anselin Local Moran's I) tool to assess poverty rates in Malawi and identify hot spots, that is, traditional authorities that are heavily stricken with poverty and may need economic empowerment projects and social cash transfers to help lessen the burden. The data will be downloaded from the website of the Malawi National Statistical Office, the main custodian of census data.
