Showcase: Geospatial & GIS
R Users Group: Introduction to GIS and Disease Mapping with R
December 2022
When: Wednesday, Dec. 14, 2022, Noon – 1 pm
About RUG:
The R Users Group (RUG) is a series of organized talks by and for anyone at CCHMC who uses the statistical programming language R.
You’ll Learn:
Reasons you might want to create your own maps
How to perform spatial and geometry operations
How to project your geographic data
Where to find more geospatial variables
Code & Slides -> https://github.com/maurosc3ner/intro2GIS_R_CCHMCDec2022
Using GDAL in Python
February 2021
Geospatial data is becoming central to the work of data scientists. For remote sensing, earth conservation, or health, as in my research (effects of long-term pollutant exposure), geospatial and particularly raster datasets are generated every day, minute, and second by several satellites.
With projects, theses, new packages (rasterio, shapely or geopandas), and companies standing on its shoulders, GDAL has been the main geospatial library for years, even decades. Initially implemented in C/C++, GDAL is behind numerous successes in the geospatial community.
As with any other low-level library, it is far from easy and has a steep learning curve, especially for vector data. I will start adding basic examples of geospatial tasks from a bottom-up perspective. Those examples will be driven by my curiosity and by the need for well-documented examples in Python 3. I plan to add examples to my playground repo incrementally.
Learning goals:
Learn common geographical tasks (transformation, clipping, vector manipulation, zonal statistics) in Python and its ecosystem.
Improve my Python skills, since I am used to R for GIS tasks.
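As a rough illustration of one task from the list above, zonal statistics, here is a minimal sketch in plain Python. It is not GDAL code: it assumes the raster and the zone layer have already been read into same-shaped 2D lists, and the function name `zonal_mean`, the nodata value, and the toy grids are all hypothetical.

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Return {zone_id: mean of the raster cells falling in that zone}.

    `values` and `zones` are same-shaped 2D lists (rows of cells).
    Cells equal to `nodata` are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if nodata is not None and value == nodata:
                continue  # ignore missing cells
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Toy 3x3 raster (hypothetical values) and a zone grid with three zones.
values = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, -9999.0],
]
zones = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 3],
]

result = zonal_mean(values, zones, nodata=-9999.0)
for zone_id in sorted(result):
    print(zone_id, round(result[zone_id], 3))
```

With real data, the two grids would come from a raster band and a rasterized polygon layer on the same grid; the per-zone accumulation stays the same.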
References
In the meantime, this is the first exercise [URL]
For a proper installation of the GDAL environment, you can check this tutorial
The API reference and documentation are at https://gdal.org/
Two Decades of Pollution in the U.S.
Raster processing using Raster imagery and R
We used the North American Regional Estimates (V4.NA.02.MAPLE) for surface PM2.5 [1]. You can get the raw images from here. For the county-level spatial data, I am using the CDC's US-ADM2 map because of my analysis [2], but you can use getData boundaries too, as long as they have the proper code column (FIPS or GEOID) for the join.
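The join on a county code is where things often go wrong, because FIPS codes read as integers lose their leading zeros. The project itself is in R; the sketch below only illustrates the join logic in Python. The helper names, the record layout, and the PM2.5 numbers are hypothetical and are not taken from the dataset.

```python
def normalize_fips(code):
    """Return a county FIPS code as a 5-character, zero-padded string."""
    return str(code).strip().zfill(5)


def join_by_fips(counties, pm25_by_fips):
    """Left-join per-county PM2.5 values onto county records by FIPS code.

    Counties without a matching value get pm25 = None.
    """
    lookup = {normalize_fips(k): v for k, v in pm25_by_fips.items()}
    joined = []
    for county in counties:
        key = normalize_fips(county["FIPS"])
        joined.append({**county, "FIPS": key, "pm25": lookup.get(key)})
    return joined


# Hypothetical inputs: codes arrive as a mix of integers and strings.
counties = [
    {"FIPS": 1001, "name": "Autauga"},           # integer, leading zero lost
    {"FIPS": "39061", "name": "Hamilton"},
    {"FIPS": "02013", "name": "Aleutians East"},
]
pm25_by_fips = {"01001": 9.8, 39061: 11.2}       # made-up values

joined = join_by_fips(counties, pm25_by_fips)
for row in joined:
    print(row["FIPS"], row["name"], row["pm25"])
```

Keeping unmatched counties with a missing value, rather than dropping them, makes gaps in the raster coverage visible on the final map.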
References
[1] van Donkelaar, A., R. V. Martin, et al. (2019). Regional Estimates of Chemical Composition of Fine Particulate Matter using a Combined Geoscience-Statistical Method with Information from Satellites, Models, and Monitors. Environmental Science & Technology, 2019, doi:10.1021/acs.est.8b06392.
[2] CDC's Social Vulnerability Index (SVI) https://svi.cdc.gov/data-and-tools-download.html
[3] Initial custom albers projection https://github.com/hrbrmstr/rd3albers
GIS & Public Health ArcMap 10.x
The goal of this course is to exemplify the role of ArcGIS in the analysis of spatial data in public health applications. In the beginning, we will review the fundamentals of the use of the ArcMap interface, basic file structures, and operations. Then, we will explore the capabilities of manipulating information in ArcGIS, and finally the use of ArcGIS in real-world scenarios.
Format of the Lab sessions:
This is a class about the use of ArcGIS in health-related problems
Subtitled videos (no audio) with detailed explanations of the topics.
References
You can check the playlist on YouTube
The site is on my GitHub page at https://maurosc3ner.github.io/6031gis_publichealth/
COVID19 Mortality Risk and Spatial Autocorrelation Bias
Small Area Estimation (SAE) using R-INLA
Here, a Bayesian hierarchical spatial model with an Intrinsic Conditional Auto-Regressive (ICAR) component was used to assess the risk of COVID-19-related death per county, adjusting for previously selected covariates [30]. Due to the strong association between age and COVID-19 death counts, we adjusted all models for age-group population proportions. The multilevel Poisson regression was extended to estimate the local risk of COVID-19. The benefits of model-based disease mapping for this study are threefold. First, it yields reliable disease risk estimates based not only on Standardized Incidence Ratios (SIRs, the ratio of observed to expected disease counts) but also on the regression covariates. Second, disease models offer a mechanism to incorporate the nested county-state structure as random effects, improving the local estimates while avoiding extreme relative risk (RR) values in areas with small populations. Lastly, the ICAR model allows us to evaluate the spatial effect (Φ) for connected and disconnected territories, if they exist (e.g., AK and HI). For counts, the cumulative number of deaths was included at the smallest unit of analysis available (county level). All mapping was performed in R using the R-INLA and ggplot2 packages.
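To make the SIR concrete, here is a minimal sketch of the raw ratio that the model-based estimates improve on. It is not the R-INLA model from the study: expected counts are computed from a single crude overall rate (no age adjustment), and the area names and numbers are hypothetical.

```python
def standardized_ratios(observed, population):
    """Return {area: {"expected": E, "sir": O / E}} using a crude overall rate.

    Expected counts are population * (total observed / total population).
    """
    overall_rate = sum(observed.values()) / sum(population.values())
    ratios = {}
    for area, obs in observed.items():
        expected = population[area] * overall_rate
        ratios[area] = {"expected": expected, "sir": obs / expected}
    return ratios


# Hypothetical counties: B is very small, so its raw ratio is unstable.
observed = {"A": 50, "B": 3, "C": 47}
population = {"A": 100_000, "B": 2_000, "C": 98_000}

ratios = standardized_ratios(observed, population)
for area in sorted(ratios):
    print(area, round(ratios[area]["expected"], 2), round(ratios[area]["sir"], 2))
```

In this toy case area B has an expected count of about 1, so two or three extra deaths are enough to triple its ratio. This is the small-population instability that the random effects are meant to smooth.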
References
You can check the pre-print at https://pubmed.ncbi.nlm.nih.gov/32699858/
Spatiotemporal analysis of Covid19
Here I explore the spatiotemporal variation of Covid19. How the epidemic spreads from urban to rural counties is the central question of this work. A parametric spatiotemporal Bayesian model is fitted using monthly death counts per county.
Paper in preparation...
Interactive mapping using Leaflet
We used kriging to generate a continuous surface across the primary sampling units (PSUs).
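For readers unfamiliar with kriging, the following is a minimal sketch of ordinary kriging at a single target location, written in plain Python. It is not the code behind the map: the exponential variogram, its sill and range, and the sample points are all assumptions chosen for illustration, and a real analysis would fit the variogram to the data first.

```python
import math


def exponential_variogram(h, sill=1.0, range_=10.0):
    """Assumed exponential variogram with no nugget: gamma(0) = 0."""
    return sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    """Return (estimate, weights) for `target` from sampled points.

    Solves the ordinary kriging system, in which the weights are
    constrained to sum to 1 through a Lagrange multiplier.
    """
    n = len(points)
    a = [
        [variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    a.append([1.0] * n + [0.0])  # unbiasedness constraint row
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # last entry is the Lagrange multiplier
    estimate = sum(w * v for w, v in zip(weights, values))
    return estimate, weights


# Hypothetical sampling units at the corners of a square.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
values = [1.0, 2.0, 3.0, 4.0]

center_estimate, center_weights = ordinary_kriging(points, values, (5.0, 5.0))
corner_estimate, _ = ordinary_kriging(points, values, (10.0, 0.0))
print(center_estimate, center_weights)
print(corner_estimate)
```

Repeating the prediction over a regular grid of target locations gives the continuous surface. Without a nugget, the estimate at a sampled location reproduces the sampled value.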