Showcase: Geospatial & GIS

R Users Group: Introduction to GIS and Disease Mapping with R

December 2022

When: Wednesday, Dec. 14, 2022, Noon – 1 pm 

About RUG:  

The R Users Group (RUG) is a series of organized talks by and for anyone at CCHMC who uses the statistical programming language R.

You’ll Learn: 

Code & Slides -> https://github.com/maurosc3ner/intro2GIS_R_CCHMCDec2022

Using GDAL in Python

February 2021

Geospatial data is becoming central to data science. For remote sensing, earth conservation, or health research such as mine (effects of long-term pollutant exposure), geospatial datasets, and raster datasets in particular, are generated every day, minute, and second by several satellites.

Projects, theses, newer packages (rasterio, shapely, or geopandas), and entire companies rest on its shoulders: GDAL has stood as the main geospatial library for years, even decades. Initially implemented in C/C++, GDAL is responsible for numerous successes in the geospatial community.

As with any other low-level library, it is far from easy and has a steep learning curve, especially for vector data. However, I will start adding basic examples of geospatial tasks from a bottom-up perspective. Those examples will be driven by my curiosity and the need for well-documented examples in Python 3. I plan to add examples to my playground repo incrementally.
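As a flavor of the bottom-up style, here is a minimal sketch of one of the first concepts a GDAL user runs into: the six-element geotransform that maps pixel (column, row) indices to map coordinates. In GDAL's Python bindings the tuple comes from `dataset.GetGeoTransform()`; the sketch below uses plain Python so it runs without GDAL installed, and the function names and the example tuple are hypothetical, chosen only for illustration.

```python
from typing import Tuple

# GDAL-style geotransform layout:
# (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_map(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Apply the affine geotransform to a pixel (col, row) position."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def map_to_pixel(gt: GeoTransform, x: float, y: float) -> Tuple[float, float]:
    """Invert the affine geotransform to recover fractional (col, row)."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    if det == 0:
        raise ValueError("geotransform is not invertible")
    dx = x - gt[0]
    dy = y - gt[3]
    col = (gt[5] * dx - gt[2] * dy) / det
    row = (-gt[4] * dx + gt[1] * dy) / det
    return col, row


# Hypothetical north-up raster with 30 m pixels (values are made up).
gt = (500000.0, 30.0, 0.0, 4350000.0, 0.0, -30.0)
x, y = pixel_to_map(gt, 10, 20)
col, row = map_to_pixel(gt, x, y)
```

The pixel height is negative for the usual north-up raster because row indices grow downward while northing grows upward.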

Learning goals:

References

NDVI of Cincinnati Metropolitan Area
Yearly average PM2.5 by county

Two Decades of Pollution in the U.S.

Raster processing using Raster imagery and R

We used the North American Regional Estimates (V4.NA.02.MAPLE) of surface PM2.5 [1]. You can get the raw images from here. For the county-level spatial data, I am using the CDC's US-ADM2 map because it fits my analysis [2], but you can also use getData boundaries as long as they include the proper code column (FIPS or GEOID) for the join.
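The join on a county code is where this kind of workflow often goes wrong, because FIPS codes read as numbers lose their leading zeros. The following is a minimal sketch in plain Python (the project itself uses R) of normalizing codes to five-character strings before joining. The function names, the record layout, and the PM2.5 values are hypothetical and only illustrate the idea.

```python
def normalize_fips(code) -> str:
    """Return a 5-character county FIPS string, restoring dropped leading zeros."""
    text = str(code).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {code!r}")
    return text.zfill(5)


def join_by_fips(counties, pm25_by_fips):
    """Left-join county records to PM2.5 values on the normalized FIPS code."""
    lookup = {normalize_fips(k): v for k, v in pm25_by_fips.items()}
    joined = []
    for county in counties:
        key = normalize_fips(county["FIPS"])
        # Counties without a match keep pm25=None instead of being dropped.
        joined.append({**county, "FIPS": key, "pm25": lookup.get(key)})
    return joined


# Illustrative inputs: the PM2.5 numbers are made up, not real estimates.
counties = [
    {"FIPS": 1001, "name": "Autauga"},      # integer lost its leading zero
    {"FIPS": "39061", "name": "Hamilton"},
    {"FIPS": "02013", "name": "Aleutians East"},
]
pm25_by_fips = {"01001": 9.1, "39061": 10.4}

joined = join_by_fips(counties, pm25_by_fips)
```

Keeping unmatched counties with a missing value makes gaps visible on the map rather than silently shrinking the dataset.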

References

[1] van Donkelaar, A., R. V. Martin, et al. (2019). Regional Estimates of Chemical Composition of Fine Particulate Matter using a Combined Geoscience-Statistical Method with Information from Satellites, Models, and Monitors. Environmental Science & Technology, 2019, doi:10.1021/acs.est.8b06392.

[2] CDC's Social Vulnerability Index (SVI) https://svi.cdc.gov/data-and-tools-download.html

[3] Initial custom albers projection https://github.com/hrbrmstr/rd3albers

GIS & Public Health ArcMap 10.x 

The goal of this course is to illustrate the role of ArcGIS in analyzing spatial data for public health applications. We begin by reviewing the fundamentals of the ArcMap interface, basic file structures, and operations. We then explore the capabilities of ArcGIS for manipulating information, and finally apply ArcGIS to real-world scenarios.

Format of the Lab sessions:

References

COVID19 Mortality Risk and Spatial Autocorrelation Bias

Small Area Estimation (SAE) using R-INLA 

Here we used a Bayesian hierarchical spatial model with an Intrinsic Conditional Auto-Regressive (ICAR) component to assess the risk of COVID-19 related death per county, adjusting for previously selected covariates [30]. Due to the strong association between age and COVID-19 death counts, we adjusted all models for age-group population proportions. The multilevel Poisson regression was extended to estimate the local risk of COVID-19. The benefits of model-based disease mapping for this study are three-fold. First, it yields reliable disease risk estimates based not only on Standardized Incidence Ratios (SIRs, the ratio of observed to expected disease counts) but also on the regression covariates. Second, disease models offer a mechanism to include the nested county-state structure as random effects, improving the local relative risk (RR) estimates while avoiding extreme values in areas with small populations. Lastly, the ICAR model allows us to evaluate the spatial effect (Φ) for connected territories and for disconnected ones, if any exist (e.g., AK and HI). For counts, we used the cumulative number of deaths at the smallest unit of analysis available (county level). All mapping was performed in R using the R-INLA and ggplot2 packages.
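To make the SIR concrete, and to show why raw ratios are unstable in small counties, here is a minimal sketch in plain Python. It uses a crude internal standardization (expected counts from a single overall rate), which is an assumption made for brevity and not the age-adjusted procedure described above. The function names and the counts are hypothetical.

```python
def expected_counts(populations, deaths):
    """Expected deaths per area if every area shared the overall death rate."""
    overall_rate = sum(deaths) / sum(populations)
    return [overall_rate * p for p in populations]


def standardized_incidence_ratios(observed, expected):
    """SIR per area: observed count divided by expected count."""
    return [o / e for o, e in zip(observed, expected)]


# Made-up counties: one very small, two larger.
populations = [1_000, 100_000, 50_000]
deaths = [3, 100, 48]

expected = expected_counts(populations, deaths)
sirs = standardized_incidence_ratios(deaths, expected)
# The small county shows an SIR of about 3 on the strength of only 3 deaths,
# while the larger counties stay close to 1.
```

A difference of one or two deaths in the small county would swing its SIR widely, which is the kind of extreme value that the random effects in a hierarchical model shrink toward the neighborhood and state averages.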

References

You can check the pre-print at https://pubmed.ncbi.nlm.nih.gov/32699858/

Spatiotemporal analysis of Covid19

Here I explore the spatio-temporal variation of COVID-19. How the epidemic spreads from urban to rural counties is the central question of this work. A parametric spatiotemporal Bayesian model is fitted using monthly death counts per county.


Paper in preparation...

Interactive mapping using Leaflet


We used kriging to generate a continuous surface across the primary sampling units (PSUs).
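For readers unfamiliar with kriging, the sketch below shows ordinary kriging at a single target location in plain Python. The source does not say which kriging variant or variogram was used, so the ordinary-kriging form, the exponential variogram with fixed sill and range, and all names and sample values here are assumptions for illustration. In practice the variogram is fitted to the data and a dedicated geostatistics package does the work.

```python
import math


def exponential_variogram(h, sill=1.0, range_=1.0):
    """Assumed variogram model: sill * (1 - exp(-h / range)), no nugget."""
    return sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    """Return (prediction, weights) at target; weights are forced to sum to 1."""
    n = len(points)
    # Variogram matrix bordered by ones (the Lagrange-multiplier row/column).
    a = [[variogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(distance(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    prediction = sum(w * z for w, z in zip(weights, values))
    return prediction, weights


# Made-up samples on the corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]

center_prediction, center_weights = ordinary_kriging(points, values, (0.5, 0.5))
corner_prediction, _ = ordinary_kriging(points, values, (0.0, 0.0))
```

Evaluating `ordinary_kriging` on a grid of target locations produces the continuous surface. With no nugget, the predictor reproduces the observed value at each sampled location.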