Exploring Key Factors of Firearm Fatalities in Mississippi Using Regression Analysis
by Heather Robbins
Firearm fatalities are a significant problem in the United States and are the leading cause of death for young people (Annual Gun Violence Data | Center for Gun Violence Solutions, n.d.). These deaths vary substantially from state to state, so identifying potential predictors of firearm fatalities through regression analysis can help clarify local factors and show whether the relationships between those factors and outcomes vary across geography.
Located in the southeastern United States, Mississippi is characterized by a diverse landscape that includes lowlands along the Mississippi River, rolling hills in the central and northern regions, and flat plains in the Delta region. The state is comprised of 82 counties, with a mix of rural agricultural areas, small towns, and a few urban centers such as Jackson and Gulfport. Mississippi has historically faced high rates of poverty, limited healthcare access, and educational disparities, especially in its rural counties (Office of Preventive Health and Health Equity et al., 2023).
What socioeconomic and health-related factors best explain the variation in firearm fatality rates across counties in Mississippi?
The County Health Rankings & Roadmaps (CHR&R) organization provides an annual report and data drawn from a multitude of entities, including the Bureau of Labor Statistics, USDA Food, American Community Surveys (Census Bureau) and the National Center for Labor Statistics. The County Health Rankings Feature Service includes over 300 variables with national, state, and county-level data on various health factors. The firearm fatality rate (per 100,000 people) from this service is the dependent variable. Esri's Demographic Variables is a Feature Service available in the Living Atlas that provides key demographic characteristics of the population. The GINI index from this service is used as one of the explanatory variables.
Figure 1 - Histogram of Dependent Variable - Firearm fatalities per 100,000
Summary statistics indicate that the data contain outliers. Five counties are flagged as outliers: Hinds, Holmes, Leflore, Washington, and Wilkinson. Nine counties are missing the dependent variable; all remaining counties are included in the analysis.
A note on the distribution of the firearm fatality rate data (Figure 1):
The raw firearm fatality data are moderately skewed to the right (skewness is positive and the mean exceeds the median). A breakdown of the general statistics is provided here:
Range: 10.69 – 47.91
Mean: 24.06
Median: 21.96
Standard Deviation: 7.97
Skewness: 0.96
Kurtosis: 3.78
Nulls: 9
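As a quick cross-check on the direction of skew, the reported mean, median, and standard deviation can be plugged into Pearson's second skewness coefficient. This is only a rough sketch: it is a different measure from the moment-based skewness of 0.96 listed above, and the variable names are illustrative rather than taken from the project files.

```python
# Summary statistics as reported above for the firearm fatality rate (per 100,000).
mean_rate = 24.06
median_rate = 21.96
std_dev = 7.97
total_counties = 82
null_counties = 9

# Counties with a usable dependent variable.
n_used = total_counties - null_counties

# Pearson's second skewness coefficient: 3 * (mean - median) / standard deviation.
pearson_skew = 3 * (mean_rate - median_rate) / std_dev

# A positive coefficient means the long tail is on the high (right) side.
direction = "right (positive)" if pearson_skew > 0 else "left (negative)"
print(f"counties analyzed: {n_used}")
print(f"Pearson skewness: {pearson_skew:.2f} -> {direction} skew")
```

Both measures are positive, which is consistent with a long tail toward the higher fatality rates.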
All data were projected to the USA Contiguous Albers Equal Area Conic coordinate system (ESRI:102003) to support accurate distance and area calculations for the regression analysis.
Dependent Variable:
Firearm fatalities per 100,000
Explanatory variables:
The final explanatory variables used in the exploratory regression analysis include:
Percentage of adults reporting fair or poor health
Population to mental health provider ratio
Percentage of children living below the poverty line
Percentage of adults reporting excessive drinking
Percentage of youth (ages 16–19) not in school or work
GINI Index (income inequality)
These variables provide a well-rounded foundation for examining county-level variation in firearm fatality rates across Mississippi.
Exploratory Variables 1 and 4
Exploratory regression identifies which variables best correlate with a dependent variable, helping assess the strength and the positive or negative direction of each relationship. It tests multiple variable combinations to find models with strong statistical performance. For this project, exploratory regression tested how factors like education, poverty, and income inequality relate to firearm fatalities. Models were evaluated using R², AICc, and the statistical significance of the explanatory variables. This process narrowed down the most relevant predictors and guided the selection of variables for Ordinary Least Squares (OLS) regression, while screening candidate models for multicollinearity and for coefficient significance.
OLS regression tests a global linear relationship between selected explanatory variables and a dependent variable across all features (e.g., counties). It produces a single equation and suite of statistical diagnostics to assess model strength, redundancy, and bias. In this project, OLS evaluated how well chosen factors—identified through exploratory regression—explained firearm fatality rates. The model was checked for multicollinearity, residual normality, heteroskedasticity (Koenker test), and spatial autocorrelation. Results showed a reasonably good fit, with minimal residual clustering and a normally distributed error term, suggesting a stable and unbiased global model appropriate for this dataset.
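The fit diagnostics named above can be written out as small helper functions. This is a minimal sketch using the textbook definitions; the function names are hypothetical, and the OLS tool may include additional constant terms in its AICc, so values from this sketch should not be expected to match the tool's report exactly.

```python
import math

def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def aicc(rss, n, k):
    """Corrected AIC from the residual sum of squares (textbook Gaussian form).

    The parameter count p includes the k slopes, the intercept, and the
    error variance. Lower values indicate a better trade-off of fit and size.
    """
    p = k + 2
    aic = n * math.log(rss / n) + 2 * p
    return aic + 2 * p * (p + 1) / (n - p - 1)

def vif(r2_j):
    """Variance inflation factor for one explanatory variable.

    r2_j is the R-squared from regressing that variable on the others.
    """
    return 1 / (1 - r2_j)

# Hypothetical inputs for illustration only (not values from this study),
# apart from n = 73 counties and k = 3 explanatory variables.
print(round(adjusted_r2(0.50, n=73, k=3), 3))
print(vif(0.50))
print(aicc(4000.0, n=73, k=3) > aicc(3000.0, n=73, k=3))
```

The adjustment penalizes each added variable, which is why adjusted R-squared and AICc, rather than raw R-squared, are the more useful figures when comparing models of different sizes.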
Geographically Weighted Regression (GWR) extends OLS by allowing the relationships between variables to vary over space. Rather than a single global equation, GWR fits a local model for each feature (e.g., county), with neighboring features weighted by spatial proximity. This method helps identify spatial nonstationarity, which occurs when explanatory relationships differ by location. If residuals from the OLS model show spatial clustering, or if the Koenker test is significant, GWR may better explain the data. The neighborhood of each feature is specified in the tool parameters and determines how the local regression equations for the dependent and explanatory variables are calculated.
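The idea of weighting neighbors by proximity can be illustrated with a Gaussian distance-decay kernel, one common choice for GWR. The sketch below is an assumption-laden illustration: the function names, the centroid coordinates, and the 50 km bandwidth are all hypothetical, and it does not reproduce the GWR tool's internals.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight 1.0 at distance 0, decaying smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_weights(target, centroids, bandwidth):
    """Weight every feature by its distance to the target location.

    target is an (x, y) pair and centroids maps a name to an (x, y) pair,
    all in the same projected units (meters here).
    """
    tx, ty = target
    weights = {}
    for name, (x, y) in centroids.items():
        distance = math.hypot(x - tx, y - ty)
        weights[name] = gaussian_weight(distance, bandwidth)
    return weights

# Hypothetical projected centroids in meters (not real county coordinates).
centroids = {
    "county_a": (0.0, 0.0),
    "county_b": (30_000.0, 40_000.0),   # 50 km from county_a
    "county_c": (120_000.0, 0.0),       # 120 km from county_a
}
weights = local_weights(centroids["county_a"], centroids, bandwidth=50_000.0)
for name, w in weights.items():
    print(name, round(w, 3))
```

In a local regression centered on county_a, these weights would give the nearer county far more influence on the fitted coefficients than the distant one. A smaller bandwidth makes the local models more local, and a larger one pushes them toward the global OLS result.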
As mentioned earlier, firearm fatality data across Mississippi counties are moderately right-skewed (skewness 0.96) and include some outliers. Although the data are not normally distributed, the skewness was not judged severe enough to require transformation. This should still be noted as a potential influence on model behavior.
For exploratory regression, the following criteria were applied:
Explanatory variables: 2 to 5
Minimum R²: 0.50
Maximum p-value (coefficients): 0.05
Maximum VIF (multicollinearity): 7.5
Minimum Jarque-Bera p-value (residual normality): 0.10
Minimum Moran’s I p-value (spatial autocorrelation): 0.10
These thresholds ensure that chosen models explain at least 50% of the variation in firearm fatalities, use non-redundant explanatory variables, and produce unbiased, normally distributed, and spatially random residuals.
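The screening logic described by these criteria can be sketched as follows: enumerate every combination of two to five of the six candidate variables, then check each model's diagnostics against the thresholds. The variable labels, function names, and the example diagnostics are hypothetical placeholders; only the thresholds come from the list above.

```python
from itertools import combinations

# Short illustrative labels for the six candidate explanatory variables.
CANDIDATES = [
    "fair_poor_health", "mental_health_provider_ratio", "child_poverty",
    "excessive_drinking", "disconnected_youth", "gini_index",
]

# Search criteria as listed above.
THRESHOLDS = {
    "min_r2": 0.50,
    "max_coef_p": 0.05,
    "max_vif": 7.5,
    "min_jb_p": 0.10,
    "min_moran_p": 0.10,
}

def candidate_models(variables, min_size=2, max_size=5):
    """Return every variable combination with min_size..max_size members."""
    return [combo
            for size in range(min_size, max_size + 1)
            for combo in combinations(variables, size)]

def failed_criteria(diag, t=THRESHOLDS):
    """List the screening criteria that a model's diagnostics fail."""
    failed = []
    if diag["r2"] < t["min_r2"]:
        failed.append("r2")
    if diag["worst_coef_p"] > t["max_coef_p"]:
        failed.append("coef_p")
    if diag["max_vif"] > t["max_vif"]:
        failed.append("vif")
    if diag["jb_p"] < t["min_jb_p"]:
        failed.append("jarque_bera")
    if diag["moran_p"] < t["min_moran_p"]:
        failed.append("moran")
    return failed

models = candidate_models(CANDIDATES)
print(len(models))  # number of models to evaluate

# Hypothetical diagnostics for a single model, for illustration only.
example = {"r2": 0.49, "worst_coef_p": 0.04, "max_vif": 2.0,
           "jb_p": 0.30, "moran_p": 0.40}
print(failed_criteria(example))
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to see when a model misses on only one threshold.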
For clarification, variable identifiers within the following models are:
A copy of the second and third best model statistics is provided here in Figure 2
Figure 2 - Exploratory Regression 2nd and 3rd Best Models
The top three models, ranked by R-squared (0.49) and AICc, explain about 49% of the variance in firearm fatalities in Mississippi. No model met the minimum R-squared of 0.50, but multicollinearity is not a concern because all VIF values are well below 7.5, and the Jarque-Bera statistic is acceptable. The second- and third-ranked models have higher AICc values and include variables that were not statistically significant.
The best model based on R-squared and AICc is provided in Figure 3:
Figure 3 - Best Model from Exploratory Regression
In the "Choose 3 of 6 Summary," this model has the highest R-squared and the lowest AICc. The Jarque-Bera statistic indicates that residuals are normally distributed, the VIF values suggest multicollinearity is not a concern, and spatial autocorrelation is low. The model fits comparatively well, and all three variables are statistically significant. However, the Koenker (BP) p-value (0.03) suggests that the relationships between variables are nonstationary, so local patterns may be better explained through Geographically Weighted Regression (GWR).
The top model variables are:
+ Percentage of adults reporting fair or poor health*** - Percentage of adults reporting excessive drinking* + Percentage of children living below the poverty line***
Note: *** indicates significance at the 0.05 level (95% confidence); * indicates significance at the 0.15 level (85% confidence).
Figure 4: OLS Diagnostics and Summary
After selecting the top-performing model from the Exploratory Regression results, I reran the model using the OLS tool to obtain the full set of statistics for those variables. The OLS tool shows that the adjusted R-squared decreased slightly, to 0.443, and the AICc increased to 474.323. These changes are somewhat expected, as Exploratory Regression provides approximated model summaries for many variable combinations, whereas the OLS tool recalculates exact statistics based on the finalized set of variables. The second explanatory variable, the percentage of adults who reported excessive drinking, is no longer statistically significant in the OLS results.
The standardized residuals map from the OLS analysis shows no obvious spatial clustering of over- or under-predictions: most counties fall within one standard deviation, and residuals appear randomly distributed across the study area. The residual vs. predicted plot (Figure 5) offers a visual check on the Koenker (BP) result. The Koenker (BP) p-value rose to 0.057, so the residuals no longer show statistically significant nonstationarity at the 0.05 level, and the model may meet OLS assumptions.
Figure 5: Residuals vs. Predicted Plot from OLS
Figure 6: Histogram of Standardized Residuals from Ordinary Least Squares report
Additionally, the histogram of standardized residuals (Figure 6) approximately follows a normal distribution. These patterns, along with a Koenker (BP) p-value above 0.05, suggest the model meets the assumptions of stationarity and homoscedasticity. Since there is no significant spatial dependence in the residuals, a Geographically Weighted Regression (GWR) model is not warranted. Therefore, the OLS model is appropriate and sufficient for interpreting the relationships between explanatory variables and the dependent variable in this analysis.
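The check for spatial dependence in residuals is typically summarized with Global Moran's I. The sketch below shows the basic statistic on a toy example, assuming a simple binary adjacency matrix; the function name and the four-unit layout are hypothetical, and it omits the significance test (z-score and p-value) that a full spatial autocorrelation tool reports.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix.

    weights[i][j] > 0 when units i and j are neighbors. Values near +1
    indicate clustering of similar values, values near -1 indicate
    alternation, and values near 0 indicate spatial randomness.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    total = sum(zi * zi for zi in z)
    return (n / s0) * cross / total

# Toy example: four units in a row, each adjacent to its immediate neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trending = [1.0, 2.0, 3.0, 4.0]       # similar values sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbors always differ

print(round(morans_i(trending, chain), 3))
print(round(morans_i(alternating, chain), 3))
```

Applied to regression residuals, a value close to zero (with a non-significant p-value) is what supports the conclusion that over- and under-predictions are not clustered.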
Mississippi consistently ranks among the states with the highest firearm fatality rates. This analysis identified county-level socioeconomic and health-related factors associated with this pattern. Exploratory regression revealed that the percentage of adults reporting fair or poor health was the strongest explanatory variable, positively associated with firearm fatalities in over 93% of all models. The percentage of children living below the poverty line also showed a strong, positive association. Interestingly, the percentage of adults reporting excessive drinking was negatively correlated with firearm fatalities. This finding was unexpected, was not statistically significant in the final OLS model, and may warrant further investigation to understand the underlying dynamics.
The other candidate explanatory variables, including the GINI index, mental health provider access, and disconnected youth rates, were not found to be significant predictors in this context. The study is limited by missing data in 9 of 82 counties (about 11%), which may affect how well the model generalizes across the entire state. Additionally, the adjusted R-squared of 0.44 indicates that relevant variables may be missing from this analysis.
While the results successfully highlight some key predictors of firearm fatalities in Mississippi, future research should explore more comprehensive datasets and investigate potential spatial variability or better explanatory variables that might improve the model.
Annual Gun Violence Data | Center for Gun Violence Solutions. (n.d.). Center for Gun Violence Solutions. https://publichealth.jhu.edu/center-for-gun-violence-solutions/annual-gun-violence-data
Esri. (2024). ESRI updated Demographics variables 2024 [Dataset; Feature Service]. https://www.arcgis.com/home/item.html?id=b0b3b31e531e406185f2de4fff596060
Geiger, A. (2025, April 24). What the data says about gun deaths in the U.S. Pew Research Center. https://www.pewresearch.org/short-reads/2025/03/05/what-the-data-says-about-gun-deaths-in-the-us/
Geographically Weighted Regression (GWR) (Spatial Statistics)—ArcGIS Pro | Documentation. (n.d.). https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/geographicallyweightedregression.htm
Ordinary Least Squares (OLS) (Spatial Statistics)—ArcGIS Pro | Documentation. (n.d.). https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/ordinary-least-squares.htm
Tucker, E., & Krishnakumar, P. (2022, May 27). States with weaker gun laws have higher rates of firearm related homicides and suicides, study finds. CNN.com. https://www.cnn.com/2022/01/20/us/everytown-weak-gun-laws-high-gun-deaths-study/index.html
University of Wisconsin Population Health Institute. (2024). County Health Rankings 2024 [Dataset; Feature Service]. Esri Living Atlas. https://www.arcgis.com/home/item.html?id=76246cfea2f346b99f39c038d39967a8