Episode I: Introduction to the messy world of analysing spatial data in R
I have been working on spatial data in R since 2013... wow, that's a long time ago! I started my PhD in spatial statistics at the University of St Andrews, UK, in 2013 after a few years of work in industry following my postgraduate studies in geography. Despite my relatively long experience in dealing with spatial data, in ArcGIS in the old days and in R from 2015, I still regularly need a search engine or ChatGPT for guidance on simple operations on spatial data in R.
It seems that I never remember anything when it comes to operations like a spatial join or just setting a projection, even though I have done them many, many times! Besides my obviously poor memory, I want to believe that my best excuse is that R packages evolve too rapidly. It is difficult for most of us to keep pace, even for those with expertise in spatial data analysis.
But I think that one recent change in the R spatial community made things really messy: several key packages for the analysis of spatial data in R have stopped being maintained. The good news for you is that I have been through this, and I am happy to share what I have learned during this spatial package transition. Here I will do my best to highlight the most important things you need to know and avoid unnecessary technical jargon.
Have you ever written code to analyse spatial data in R? Do you want your code to be reproducible by others, or to collaborate with colleagues? If so, you may well need to change parts of your code.
From R version 4.4.0 onwards, the packages rgeos, rgdal, maptools, and a few more are no longer available. Issues also remain with the dependencies of spatial packages such as raster and sp. Yep, a big mess indeed, and you probably need help!
See the 👍 blog of the excellent Edzer Pebesma and Roger Bivand for further details about the evolution of packages for spatial data in R: R-spatial evolution
OK. No need to cry that loud. There are new packages to replace rgdal, rgeos, and maptools. And they work as well as before and are amazingly well integrated with the other R packages.
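Before switching, it can help to find out which of your scripts still rely on the retired packages. The snippet below is only a minimal sketch in base R: the helper name find_retired and the example script lines are made up for illustration, and the pattern only catches library(), require() and pkg:: calls.

```r
# Packages named in this post as no longer available
retired <- c("rgdal", "rgeos", "maptools")

# Hypothetical helper: return the retired packages that a script
# attaches with library()/require() or calls with the pkg:: prefix
find_retired <- function(code_lines, pkgs = retired) {
  hits <- vapply(pkgs, function(p) {
    pattern <- paste0(
      "(library|require)\\(\\s*[\"']?", p, "[\"']?\\s*\\)", # library(rgdal)
      "|\\b", p, "::"                                       # rgeos::gBuffer
    )
    any(grepl(pattern, code_lines, perl = TRUE))
  }, logical(1))
  names(hits)[hits]
}

# Made-up lines standing in for an old script
example_script <- c(
  "library(sp)",
  "library(rgdal)",
  "shp <- readOGR('data', 'districts')",
  "buf <- rgeos::gBuffer(shp, width = 100)"
)

found <- find_retired(example_script)
print(found)
```

On a real file you could pass readLines("my_script.R") instead of the made-up lines (the file name here is just a placeholder).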
The main packages for spatial analysis are sf and terra. Say hello to your new friends!
sf
Encodes spatial vector data and links to non-R libraries such as:
GDAL for reading and writing data
GEOS for geometrical operations
PROJ for projection conversions and datum transformations.
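To make the role of these three libraries concrete, here is a small sketch with sf. The two points and their coordinates are invented for illustration, and EPSG:27700 (British National Grid) is just one possible projected system. It covers the operations I keep forgetting: setting a projection, transforming it, and a spatial join.

```r
library(sf)

# Two made-up points given as longitude/latitude (WGS84, EPSG:4326)
pts <- st_as_sf(
  data.frame(name = c("A", "B"),
             lon = c(-2.80, -2.79),
             lat = c(56.34, 56.35)),
  coords = c("lon", "lat"),
  crs = 4326
)

# PROJ: reproject to British National Grid, so units are metres
pts_bng <- st_transform(pts, 27700)

# GEOS: draw a 500 m buffer around each point
buf <- st_buffer(pts_bng, dist = 500)

# Spatial join: attach to each point the buffer(s) it falls within
joined <- st_join(pts_bng, buf, join = st_within)

# GDAL: write to a temporary GeoPackage and read it back
tmp <- tempfile(fileext = ".gpkg")
st_write(pts_bng, tmp, quiet = TRUE)
pts_back <- st_read(tmp, quiet = TRUE)
```

The points are more than 1 km apart, so each one falls only within its own buffer and the join returns one row per point.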
terra
It replaces the raster package for spatial data analysis and spatial operations such as intersection or interpolation with:
vector data such as points, lines, polygons
raster, also called grid, data.
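As a rough illustration of how terra handles both data types, the sketch below builds a tiny raster and two points from scratch (all values and coordinates are invented) and extracts the raster value under each point.

```r
library(terra)

# A made-up 10 x 10 raster covering 0-10 degrees in both directions
r <- rast(nrows = 10, ncols = 10,
          xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          crs = "EPSG:4326")
values(r) <- 1:ncell(r)  # cells are numbered row by row from the top left

# Two made-up points stored as a SpatVector
pts <- vect(cbind(x = c(0.5, 9.5), y = c(9.5, 0.5)),
            type = "points", crs = "EPSG:4326")

# Raster value under each point: a data.frame with an ID column
# followed by one column per raster layer
vals <- extract(r, pts)
print(vals)
```

The first point sits in the top-left cell and the second in the bottom-right cell, so the extracted values are the first and last cell numbers.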
Spatial analysis resources for non-specialists
I found an excellent online resource for those who want to learn from A to Z how to deal with spatial data, click here: Zhukov-spatial.analysis. It was released in 2010 but remains relevant today. It provides an excellent summary of the main tasks needed to analyse spatial data properly. The slides come with R code and data, so the user can easily link theory with practice.
Together with Dr Brandsch, I have published a guide to analysing spatial data for students in International Relations / Political Science. More information here: Python, A. and Brandsch. A Case Study of Spatial Analysis: Approaching a Research Question With Spatial Data. SAGE Research Methods Cases. 2019(58): 77-89. DOI: 10.4135/9781526467454.
Spatial modelling with R-INLA
For those who model spatial data within a Bayesian framework, the R-INLA approach has become very popular in the last few years. Despite the apparent simplicity of the code needed to run very complex models, users may run into technical issues that are not easy to deal with. For guidance, I would recommend the book written by Profs Blangiardo and Cameletti: Blangiardo-Cameletti. The book provides a very well structured approach to building complex models from scratch using R-INLA.
In the next episode, I will introduce a few tips for setting up R and RStudio. This is mainly for those who are not familiar with the software. After that episode, I will show how to do simple operations on spatial data using the sf and terra packages, in comparison with the previous methods. There will be fewer videos, fewer images, less text and more code to help you with your coding, I promise. Follow me!