Lukumon's Projects Gallery
Highlighted here are several of Lukumon's projects that combine Earth Observation (EO) data and time-series satellite imagery with geospatial technologies, demonstrating his experience and competence.
A. REMOTE SENSING AND GIS PROJECTS
1. Aboveground Biomass Density (AGBD) Estimation Using GEDI and Satellite Data with Machine Learning
Problem Statement: Aboveground Biomass (AGB) is the total mass of plants and trees above the ground in a forest, while Aboveground Biomass Density (AGBD) is the biomass per unit area. AGBD provides information on carbon stock and plays a crucial role in forest management. Field sampling for AGBD is time-consuming and costly in both financial and human resources. Satellite observations can be integrated to estimate AGBD, saving cost and time while covering larger extents. In this project, I integrated Global Ecosystem Dynamics Investigation (GEDI) data, the recently released High Resolution 1m Global Canopy Height dataset, Sentinel-2 imagery, land cover, and a Digital Elevation Model (DEM). The focus was on forests/trees, which play a key role in carbon sequestration. The study site is Mata Nacional do Cabeção (Cabeção National Forest) in Portugal, "an ecological network aimed at conserving local habitats, fauna, and flora." The Random Forest (RF) machine learning algorithm was employed in Google Earth Engine (GEE) to estimate the biomass over the study site.
Tasks:
Satellite image processing of time-series Sentinel-2, Global Ecosystem Dynamics Investigation (GEDI), Land cover & DEM.
Data Fusion.
Training/testing Sample Collections.
Machine Learning model training and evaluation.
Accuracy assessment (R2).
Feature importance estimation.
Maps production and interpretations.
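The accuracy assessment step relies on the coefficient of determination (R²). The following is a minimal pure-Python sketch of that metric, not the project's GEE code; the AGBD values are hypothetical and only illustrate the calculation.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared distance between observation and prediction
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared distance between observation and the mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical AGBD values in Mg/ha (not project data)
observed = [20.0, 35.0, 50.0, 65.0, 80.0]
predicted = [25.0, 30.0, 55.0, 60.0, 85.0]
score = r_squared(observed, predicted)
print(round(score, 3))
```

A value close to 1 means the predictions explain most of the variance in the held-out reference samples.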
Tools/Software:
Google Earth Engine (GEE) and QGIS.
Results/Conclusions:
A map of the Aboveground Biomass Density (AGBD) estimate.
Over an area of 3,634 ha, the estimated total AGB is 129,833 Mg, with an uncertainty of ± 12.56 Mg/ha.
The model achieved an R² of 0.84, indicating strong predictive accuracy.
The five features most influential on the AGB prediction are NDBI, Canopy Height, DEM, B11, and EVI.
2. Crop Type Classification using Random Forest Machine Learning Classifier and Sentinel-2 Time Series Images in Google Earth Engine (GEE)
Problem Statement: Information about crop types is crucial to food security (SDG2) and requires high precision/level of detail. However, the existing Crop Data Layer (CDL) has a 30m spatial resolution, which generalises some crop types/land cover features. Therefore, there is a need for a higher spatial resolution crop-type map for accurate estimation of crop yield and harvest forecasting. This project produced a 10m crop-type map with finer details.
Tasks:
Satellite image processing of time-series Sentinel-2 & NAIP (filtering, cloud masking, resampling, reprojection, spectral indices computation, band composites).
Training/testing Sample Collections.
Machine Learning model training and evaluation.
Accuracy assessment (overall accuracy, producer's and user's accuracies).
Feature importance estimation.
Maps production and interpretations.
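The accuracy measures named in the task list can be derived from a confusion matrix. The sketch below assumes rows hold the reference (ground truth) classes and columns hold the mapped classes; the class names and counts are hypothetical, not the project's results.

```python
def accuracy_report(matrix, class_names):
    """Overall, producer's and user's accuracy from a square confusion matrix.

    Assumes rows = reference classes and columns = predicted (mapped) classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    report = {"overall": correct / total, "producers": {}, "users": {}}
    for i, name in enumerate(class_names):
        reference_total = sum(matrix[i])                     # row total
        mapped_total = sum(matrix[r][i] for r in range(n))   # column total
        # Producer's accuracy: share of reference samples mapped correctly
        report["producers"][name] = matrix[i][i] / reference_total
        # User's accuracy: share of mapped pixels that are truly that class
        report["users"][name] = matrix[i][i] / mapped_total
    return report


# Hypothetical classes and counts for illustration only
classes = ["maize", "soybean", "other"]
matrix = [
    [45, 3, 2],   # reference: maize
    [6, 40, 4],   # reference: soybean
    [1, 7, 42],   # reference: other
]
report = accuracy_report(matrix, classes)
print(report)
```

Producer's accuracy reflects omission errors, while user's accuracy reflects commission errors, so the two can differ for the same class.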
Tools/Software:
Python (geemap, scikit-learn, folium, pandas, geopandas, numpy), Google Earth Engine (GEE), VS Code, GitHub, and QGIS.
Results/Conclusions:
A 10m spatial resolution crop layer was produced with finer details.
Knowledge about the planting seasons and/or crop calendar is crucial in image selection to capture crop phenology and achieve satisfactory results.
The red-edge bands (B5 & B6) and SWIR bands (B11 & B12) are the most important features in the classification. This is expected, considering the classes are mostly vegetation/crops.
3. Prediction and Spatial Distribution of Chlorophyll a (Chl-a), Turbidity, and Total Suspended Solids (TSS) in High Rock Lake Using Machine Learning and Satellite Images
Problem Statement: Monitoring water quality is important to understand the condition of a water body and to achieve Sustainable Development Goal 6 ("clean water and sanitation for all"). However, the conventional approach of monitoring water bodies by field observation is time-consuming, expensive, and labour-intensive, especially for medium and large water bodies such as lakes. Furthermore, field-observed data might lack the temporal resolution and spatial coverage needed to describe the current status of the water body. Therefore, Machine Learning (ML) algorithms were employed in this project with in-situ sampled water quality parameters (WQPs) and Sentinel-2 satellite images to predict and map the spatial distribution of Chlorophyll a (Chl-a), Total Suspended Solids (TSS), and Turbidity in High Rock Lake, North Carolina, USA.
Tasks:
In-situ WQPs point data processing, cleaning, and sorting.
Satellite image processing of time-series Sentinel-2 (filtering, cloud masking, resampling, spectral indices computation, band composites).
In-situ WQPs points and satellite image match-up (by dates).
Extract spectral reflectance of in-situ samples from the Sentinel-2 band composite.
Delineate lake extent using MNDWI.
Machine Learning (ML) models training and evaluation.
Accuracy assessment (R2, RMSE, and MAE).
Feature importance.
Maps production and interpretations.
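The match-up of in-situ samples with satellite images by date can be sketched as a nearest-date search. This is an illustrative sketch only: the 3-day tolerance window and all dates are assumptions, since the project description states only that the match-up was done by dates.

```python
from datetime import date


def match_by_date(sample_dates, image_dates, max_gap_days=3):
    """Pair each in-situ sampling date with the closest image acquisition date.

    Returns None for a sample when no image falls within max_gap_days
    (the window size is an assumed value, not taken from the project).
    """
    matches = {}
    for sample in sample_dates:
        nearest = min(image_dates, key=lambda img: abs((img - sample).days))
        gap = abs((nearest - sample).days)
        matches[sample] = nearest if gap <= max_gap_days else None
    return matches


# Hypothetical sampling and acquisition dates
samples = [date(2022, 6, 14), date(2022, 7, 20)]
images = [date(2022, 6, 12), date(2022, 6, 17), date(2022, 7, 27)]
matched = match_by_date(samples, images)
for sample, image in matched.items():
    print(sample, "->", image)
```

Samples without an image inside the window are dropped from training, which keeps the reflectance values close in time to the measured water quality.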
Tools/Software:
Python (geemap, scikit-learn, folium, pandas, geopandas, numpy), Google Earth Engine (GEE), VS Code, GitHub, and QGIS.
Results/Conclusions:
The Random Forest (RF) Regression model produced satisfactory results with the following metrics for the Water Quality Parameters (WQPs):
Chlorophyll a (Chl-a): R2 = 0.67 & MAE = 6.86.
Turbidity: R2 = 0.74 & MAE = 5.73.
Total suspended solids (TSS): R2 = 0.51 & RMSE = 3.25.
4. Land Cover Classification Using Machine Learning Algorithms and Multi-Source Satellite Images
Problem Statement: Land cover (LC) refers to the features on the Earth’s surface, such as water, soil, and vegetation, whereas land use is the purpose that the land serves, which can be residential, recreational, or agricultural. Accurate and up-to-date LC information is important for decision-making processes and planning at various scales/levels. The increasing pressure on land because of population changes also calls for regular information on LC to capture the changes in the ecosystem. However, conventional methods are time-consuming and not efficient for large-scale land cover mapping. This project assessed the performance of the Random Forest (RF) and Support Vector Machine (SVM) machine learning algorithms for land cover classification in a predominantly agricultural landscape using the fusion of time-series Sentinel-1 and Sentinel-2 images.
Tasks:
Satellite image processing of time-series Sentinel-1 & Sentinel-2 (filtering, cloud masking, resampling, spectral indices computation, band composites).
Training/testing Sample Collections.
Machine Learning (ML) models training and evaluation.
Comparison of ML performances.
Accuracy assessment (overall accuracy, producer's and user's accuracies).
Feature importance.
Land cover area estimation.
Maps production and interpretations.
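The spectral indices computation step can be illustrated with the normalized-difference form shared by NDVI, MNDWI, and NDBI. The sketch below uses the commonly cited Sentinel-2 band pairings (an assumption about the project's exact setup) and a hypothetical vegetated pixel.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when both inputs sum to zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(bands):
    """Common Sentinel-2 indices from surface reflectance values.

    Band pairings follow the usual definitions:
    NDVI = (NIR - Red), MNDWI = (Green - SWIR1), NDBI = (SWIR1 - NIR).
    """
    return {
        "NDVI": normalized_difference(bands["B8"], bands["B4"]),
        "MNDWI": normalized_difference(bands["B3"], bands["B11"]),
        "NDBI": normalized_difference(bands["B11"], bands["B8"]),
    }


# Hypothetical reflectance values for a vegetated pixel
pixel = {"B3": 0.06, "B4": 0.04, "B8": 0.36, "B11": 0.18}
indices = spectral_indices(pixel)
print(indices)
```

For this pixel NDVI is high while MNDWI and NDBI are negative, the pattern expected for dense vegetation rather than water or built-up surfaces.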
Tools/Software:
Python (geemap, scikit-learn, folium, pandas, geopandas, numpy), Google Earth Engine (GEE), VS Code, GitHub, and QGIS.
Results/Conclusions:
RF produced the best result with an Overall Accuracy (OA) of 90%.
Both SVM and RF produced satisfactory results, with OA higher than 80% for default and adjusted parameters (Tables 1 & 2).
The top seven features from the best (RF) results are MNDWI, B2, NDVI, VV, VH, B8 and NDBI. This could be because of the heterogeneous landscape.
The relative importance of these features changed with the parameter values.
5. Geospatial Analysis of Agricultural Land Suitability using GIS-MCDA with AHP
Problem Statement: Information about land suitability is crucial to sustainable agriculture and food security. Land suitability is the ability of the land to meet the intended usage, whether in its present state or the near future (FAO, 1976). The increasing demand for land and water is responsible for a decrease in agricultural land in developing countries, which in turn decreases agricultural production. In Nigeria, the long-standing lack of accurate data on crop-specific land suitability, together with insufficient consideration of environmental and socio-economic factors before farming begins, has caused low crop yields, massive crop and livestock losses, and excessive spending on agriculture. These challenges contribute to food insecurity and to the rise in Nigeria’s cumulative agricultural imports between 2016 and 2019. This project combined environmental, meteorological, and socio-economic factors to address the lack of information on areas suitable for agricultural use in the southern zone of Taraba State, Nigeria.
Tasks:
Literature review.
Land cover classification.
Raster data processing, reclassification, and modelling.
Analytic Hierarchy Process (AHP) and Weighted Overlay Analysis (WOA).
Determine the location and extent of suitable land for agricultural use.
Accuracy assessment with existing croplands.
Report writing and map production.
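The AHP and weighted overlay steps can be sketched in a few lines. This is a simplified illustration, not the ArcGIS Pro workflow used in the project: the three criteria and their pairwise judgements are hypothetical, weights are approximated by column normalisation and row averaging, and the random index values are Saaty's published ones.

```python
# Saaty's random consistency index for matrices of size n
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights and the consistency ratio (CR)."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    # Normalise each column, then average along each row
    weights = [
        sum(matrix[r][c] / col_sums[c] for c in range(n)) / n for r in range(n)
    ]
    # Estimate the principal eigenvalue from (A * w) / w
    weighted = [sum(matrix[r][c] * weights[c] for c in range(n)) for r in range(n)]
    lambda_max = sum(weighted[r] / weights[r] for r in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / RANDOM_INDEX[n]


def suitability_score(weights, criterion_scores):
    """Weighted overlay for one cell: sum of weight x reclassified score."""
    return sum(w * s for w, s in zip(weights, criterion_scores))


# Hypothetical pairwise comparisons for three criteria (e.g. rainfall, soil, slope)
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights, cr = ahp_weights(pairwise)
cell_score = suitability_score(weights, [4, 3, 2])  # scores on an assumed 1-4 scale
print([round(w, 3) for w in weights], round(cr, 3), round(cell_score, 2))
```

A consistency ratio below 0.1 is the usual threshold for accepting the pairwise judgements; above it, the comparisons are normally revisited before the overlay is run.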
Tools/Software:
ArcGIS Pro, Google Earth Pro, Microsoft Excel, and Word.
Results/Conclusions:
35% of the study area was found to be highly suitable and 41% moderately suitable.
17% of the area had low suitability and 7% was not suitable.
6. Time Series NDVI for Crop Growth Monitoring using GEE
Problem Statement: Floods damaged cropland, creating the need to assess the damage and the recovery of the cropland after the floods. In this project, the recovery pattern and the disturbance in the crop phenological pattern were tracked using a time series of the Normalized Difference Vegetation Index (NDVI).
Tasks:
Satellite image processing (cloud masking and band composites).
Spectral indices computation.
NDVI plot charts and map production.
Results interpretations.
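Comparing flood and non-flood years comes down to summarising the NDVI time series per year. A minimal sketch follows, assuming the series is already available as (ISO date, NDVI) pairs; the values are hypothetical and are not the project's measurements.

```python
from collections import defaultdict
from statistics import mean


def yearly_mean_ndvi(observations):
    """Average NDVI per calendar year from (ISO date string, NDVI) pairs."""
    by_year = defaultdict(list)
    for obs_date, ndvi in observations:
        by_year[obs_date[:4]].append(ndvi)  # first four characters are the year
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Hypothetical growing-season NDVI values for a flood year and a non-flood year
series = [
    ("2022-06-01", 0.30), ("2022-07-01", 0.20), ("2022-08-01", 0.40),
    ("2023-06-01", 0.50), ("2023-07-01", 0.70), ("2023-08-01", 0.60),
]
yearly = yearly_mean_ndvi(series)
print(yearly)
```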
Tools/Software:
Google Earth Engine (GEE), Sentinel Hub, VS Code, GitHub, and QGIS.
Results/Conclusions:
NDVI was lower during the flood years (2020 & 2022) than in the non-flood year (2023).
Average NDVI values after the floods in 2020 & 2022 did not reach their usual levels.
7. Flood Extent Mapping in Rio Grande do Sul, Brazil
Problem Statement: The state of Rio Grande do Sul in Brazil experienced what commentators called unprecedented floods, caused by torrential rains from 29 April 2024 through May 2024, which resulted in loss of lives and the displacement of people from their homes. The floods also damaged critical infrastructure such as roads, airports, and stadiums and disrupted socio-economic activities. The flood extent was mapped in this project using Sentinel-2 images.
Tasks:
Satellite image processing (filtering, cloud masking, band composites).
Mapping the extent of water bodies.
Masking of permanent water bodies.
Flood extent delineation using MNDWI.
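The last three steps can be sketched on a small grid: threshold MNDWI to find water, then remove pixels flagged as permanent water. This is an illustrative pure-Python sketch, not the project's GEE script; the threshold of 0 is an assumed, commonly used default and the grids are hypothetical.

```python
def flood_mask(mndwi_grid, permanent_water, threshold=0.0):
    """Flag pixels that are water (MNDWI > threshold) but not permanent water."""
    return [
        [value > threshold and not permanent
         for value, permanent in zip(row, perm_row)]
        for row, perm_row in zip(mndwi_grid, permanent_water)
    ]


# Hypothetical 2 x 3 MNDWI grid from a post-event composite
mndwi = [
    [0.4, 0.3, -0.2],
    [0.1, -0.3, 0.5],
]
# Hypothetical permanent water mask (e.g. rivers and lakes before the event)
permanent = [
    [True, False, False],
    [False, False, True],
]
flooded = flood_mask(mndwi, permanent)
flooded_pixels = sum(cell for row in flooded for cell in row)
print(flooded, flooded_pixels)
```

In practice the threshold is usually tuned per scene, since turbid flood water and built-up areas can shift MNDWI values.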
Tools/Software:
Google Earth Engine, QGIS, and Sentinel Hub EO Browser.
Results/Conclusions:
8. Flood Extent Mapping and Damage Assessment in Tana River County, Kenya
Problem Statement: Kenya experienced devastating flooding that caused loss of lives and property and the displacement of people from their homes. The flooding was caused by continuous rainfall and the overflow of rivers/dams. This project used the optical Sentinel-2 images available for April 2024 to extract the flood extent along the Tana River and assessed the damage by integrating the flood extent layer, land cover data, and building footprints from OpenStreetMap (OSM).
Tasks:
Satellite image (Sentinel-2) processing.
Flood extent delineation.
Flood damage assessment: estimation of the area of affected land cover classes and the number of affected building footprints.
Map production and interpretation of results.
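The damage assessment step amounts to overlaying the flood extent on the land cover and building layers. The sketch below is a raster-style simplification under stated assumptions: the project itself lists geopandas and osmnx, whereas here buildings are reduced to hypothetical (row, column) cell positions and each pixel is assumed to be 10 m x 10 m (0.01 ha).

```python
from collections import Counter


def affected_area_by_class(flood, land_cover, pixel_area_ha):
    """Flooded area per land cover class, in hectares."""
    counts = Counter(
        cover
        for flood_row, cover_row in zip(flood, land_cover)
        for flooded, cover in zip(flood_row, cover_row)
        if flooded
    )
    return {cover: n * pixel_area_ha for cover, n in counts.items()}


def count_affected_buildings(flood, building_cells):
    """Number of buildings whose (row, col) cell lies inside the flood mask."""
    return sum(1 for row, col in building_cells if flood[row][col])


# Hypothetical flood mask and land cover grid
flood = [
    [True, True, False],
    [False, True, True],
]
land_cover = [
    ["cropland", "shrubland", "built-up"],
    ["cropland", "shrubland", "shrubland"],
]
buildings = [(0, 2), (1, 2), (0, 0)]  # hypothetical building locations

area = affected_area_by_class(flood, land_cover, pixel_area_ha=0.01)
n_buildings = count_affected_buildings(flood, buildings)
print(area, n_buildings)
```

With vector footprints, the same logic becomes a spatial join between the building polygons and the flood extent polygon.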
Tools/Software:
Python (geemap, osmnx, pandas, geopandas, numpy) and QGIS.
Results/Conclusions:
Shrubland and cropland are the most affected land cover classes.
More than 2,000 buildings are affected.
Infrastructure such as roads and bridges is also affected, which could cut off the affected population from other parts of the country and make rescue operations challenging.
B. GEO-STATISTICS PROJECTS
1. Spatio-Temporal Analysis of Armed Clash Events and Fatalities in Nigeria
Problem Statement: For more than a decade, Nigeria has been battling with armed clashes by non-state armed and insurgent groups (Boko Haram, Militia/Bandits etc.), especially in the northern part of the country, which has caused the deaths and displacement of people and destruction of towns. The Nigerian military and other security agencies, including those from neighbouring countries, have been fighting these violent groups, and they have recorded some successes. But how well has the most populous African country been able to address the armed clashes that have affected almost all the 36 States of the Federation including the Capital?
Tasks: Analysis of the trend of armed clash events perpetrated by terrorists, insurgents, and militia groups, and of the resulting fatalities, in Nigeria between 2010 and 2023, including multi-temporal views of two affected towns.
Tools/Software:
Python (requests, folium, pandas, geopandas, numpy, matplotlib), VS Code, GitHub, Microsoft Excel, and QGIS.
Results/Conclusions:
2. Housing Price Prediction using Linear Regression and Random Forest with Python
Problem Statement: Predicting housing prices accurately is challenging due to the complexity of real estate markets and the interplay of numerous features influencing prices. This project addresses the problem by using data preprocessing, feature selection, and machine learning models (Linear Regression and Random Forest) to build predictive pipelines and evaluate their performance.
Tasks:
Data cleaning and Exploratory Data Analysis (EDA).
Feature selection (cardinality, multicollinearity).
ML pipeline for Regression models to predict house prices.
ML pipeline for Random Forest models for house price predictions.
Model evaluation (R2, MAE, and RMSE).
Results visualization and interpretations.
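The error metrics in the evaluation step are closely related, which a short sketch makes explicit. This is a pure-Python illustration with hypothetical prices, not the project's scikit-learn pipeline.

```python
import math


def regression_errors(actual, predicted):
    """Return MAE, MSE and RMSE for paired actual/predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    # RMSE is the square root of MSE, so it is in the same units as the target
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Hypothetical house prices (in thousands), for illustration only
actual = [300.0, 450.0, 500.0, 620.0]
predicted = [310.0, 430.0, 530.0, 580.0]
metrics = regression_errors(actual, predicted)
print(metrics)
```

Because RMSE shares the target's units and is never smaller than MAE, the two are directly comparable, whereas MSE is in squared units and is typically orders of magnitude larger for price data.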
Tools/Software:
Python (scikit-learn, pandas, seaborn, numpy, matplotlib), VS Code, and GitHub.
Results/Conclusions:
The R2 score of the LR model is 0.67.
The R2 score of the RF model is 0.63.
The mean absolute error of the LR model is 713192.94.
The mean absolute error of the RF model is 747397.56.
The mean squared error of the LR model is 823898911494.51.
The mean squared error of the RF model is 933066838125.89.
All the features have a moderately positive correlation with the target (price).
As expected, "area" is the most correlated feature with "price": 0.535997.
"Stories", "bedrooms", and "bathrooms" are moderately correlated with one another.
3. Population Density, Nighttime Lights, and IGR Pattern Map of Nigeria Using QGIS
Problem Statement:
Tasks:
Tools/Software:
Results/Conclusions:
C. GEOMATICS AND SURVEYING PROJECTS
1. Topographic Map of Hiking Site
Problem Statement:
Tasks:
Tools/Software:
Results/Conclusions:
2. Building Damage Assessment in Adeyi Avenue, Ibadan
Problem Statement: A devastating blast was reported on the evening of 16th January 2024 at Adeyi Avenue in Bodija, Ibadan, Oyo State, Nigeria. The illegal storage of explosive material for mining purposes was identified as the cause of the explosion. The blast resulted in several casualties, including loss of lives, and severe damage to buildings.
Tasks:
Tools/Software:
Results/Conclusions: