From Rainfall Predictions to Crop Simulations:
Model Validation in Climate-Data Pipelines
Advanced Workshop on Model Validation, Robustness and Responsible AI for Climate Data Models
5th Philippine Junior Data Science Challenge (PJDSC 2025)
University of the Philippines Data Science Society
Saturday, 9 November 2025, from 1:00 PM to 2:00 PM.
Crop models as tools for climate resilience
Perspectives:
Farmer’s Perspective: managing volatility, extreme weather, planting windows
Business Perspective: forecasting for risk management, insurance, logistics
Food Security Perspective: early-warning systems and vulnerability mapping
Policy & Research Perspective: evidence-based adaptation and investment planning
Integration of rainfall predictions makes crop models dynamic resilience tools
Reliability of process-based crop models: mechanistic, data-driven, integrative
Features of reliable models:
Mechanistic (physics, chemistry, biology)
Integration across the soil–plant–atmosphere continuum
Scalability and uncertainty quantification
Overview: From data → model → simulation → decision → impact
Data (Foundation):
Inputs: weather (temperature, solar radiation, precipitation), soil, management
Challenges: noise, inconsistency, coarse resolution
Model (Processing Engine):
Process-based equations for photosynthesis, transpiration, growth, yield
Requires calibration and validation
Simulation: thousands of runs → probabilistic outcomes
Decision: translating results into farmer, business, or policy action
Impact: economic, environmental, and social implications
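The simulation step above (many runs summarized as probabilistic outcomes) can be sketched as a small Monte Carlo loop. Everything numeric here is a hypothetical placeholder: the rainfall distribution, the `toy_yield` response curve, and the run count are assumptions for illustration only. This is not a process-based crop model, and none of the values come from the workshop.

```python
import random
import statistics

def toy_yield(seasonal_rain_mm):
    """Hypothetical stand-in for a crop model: yield (t/ha) peaks at 800 mm."""
    optimum_mm, max_yield = 800.0, 5.0
    penalty = abs(seasonal_rain_mm - optimum_mm) / optimum_mm
    return max(0.0, max_yield * (1.0 - penalty))

def simulate(n_runs=5000, mean_mm=750.0, sd_mm=200.0, seed=42):
    """Sample uncertain seasonal rainfall and run the toy model once per sample."""
    rng = random.Random(seed)
    yields = []
    for _ in range(n_runs):
        rain = max(0.0, rng.gauss(mean_mm, sd_mm))  # rainfall cannot be negative
        yields.append(toy_yield(rain))
    return yields

yields = simulate()
deciles = statistics.quantiles(yields, n=10)  # 9 cut points: 10%, 20%, ..., 90%
p10, p50, p90 = deciles[0], deciles[4], deciles[8]
print(f"yield t/ha: p10={p10:.2f}, median={p50:.2f}, p90={p90:.2f}")
```

Reporting a spread (p10 to p90) rather than a single number is what lets the decision step weigh risk rather than only an expected value.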
Validation as both a scientific and ethical responsibility
Example: Typhoon rainfall forecasting and its role in crop-management decisions
Reference: Zhou et al. (2023) on global concurrent climate extremes
Imbalance: many “normal” days, few “extreme” events
Noise: sensor error, missing values, measurement uncertainty
Case Example: Taal Lake Fishkill (Pabico et al. 2015)
Many “no-fishkill” vs. rare “fishkill” days
Missing and irregular data issues
Standard accuracy misleads under imbalanced classes
Alternative metrics: precision, recall, F1, PR-curve (Miftahushudur et al. 2025)
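A minimal sketch of why accuracy misleads under class imbalance, using only the standard library. The 95/5 split and the "always predict normal" classifier are hypothetical, chosen to make the effect visible; they are not data from the workshop or from the cited studies.

```python
def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the positive (rare) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical record: 95 normal days (0) and 5 extreme days (1).
y_true = [0] * 95 + [1] * 5
# A useless classifier that never predicts an extreme day.
always_normal = [0] * 100

scores = evaluate(y_true, always_normal)
print(scores)
```

The classifier scores 95% accuracy while detecting none of the extreme days (recall of 0), which is the failure that precision, recall, and F1 expose.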
Resampling Techniques:
Undersampling: random, edited nearest neighbors, neighborhood cleaning rule, Tomek links
Oversampling: random, SMOTE, ADASYN, Borderline SMOTE, SL-SMOTE, K-means SMOTE, SVM SMOTE
Reference: Chawla et al. (2002)
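SMOTE in five steps:
Identify the minority class
Select a minority sample and find its nearest minority-class neighbors
Randomly choose one of those neighbors
Create a synthetic sample on the line segment between the sample and the chosen neighbor
Repeat until the classes are balanced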
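The five steps above can be sketched in plain Python. This is a simplified illustration of the SMOTE idea, not the reference implementation from Chawla et al. (2002): the function name `smote_balance`, the default `k`, and the toy points are assumptions, and the sketch expects two classes with at least two minority samples.

```python
import math
import random
from collections import Counter

def smote_balance(points, labels, k=3, seed=0):
    """Generate synthetic minority points until both classes have equal counts."""
    rng = random.Random(seed)
    counts = Counter(labels)
    minority_label = min(counts, key=counts.get)           # step 1
    minority = [p for p, y in zip(points, labels) if y == minority_label]
    n_needed = max(counts.values()) - counts[minority_label]

    synthetic = []
    while len(synthetic) < n_needed:                        # step 5
        i = rng.randrange(len(minority))                    # step 2: pick a sample
        sample = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(sample, p))[:k]
        neighbor = rng.choice(neighbors)                    # step 3
        gap = rng.random()                                  # step 4: interpolate
        synthetic.append(tuple(s + gap * (n - s)
                               for s, n in zip(sample, neighbor)))
    return synthetic, minority_label

# Hypothetical 2-D feature vectors: 8 majority points, 3 minority points.
points = [(float(i), 1.0) for i in range(8)] + [(10.0, 5.0), (11.0, 6.0), (12.0, 5.5)]
labels = [0] * 8 + [1] * 3

synthetic, label = smote_balance(points, labels, k=2)
print(label, synthetic)
```

Because each synthetic point lies on a segment between two real minority points, SMOTE fills in the minority region rather than duplicating existing samples, which is how it differs from random oversampling.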
Rainfall measurement errors propagate downstream (Heinemann et al. 2002)
Sources: gauge design, aerodynamic effects, wetting losses
Effects on simulated soil moisture, evapotranspiration, yield
Model calibration cannot compensate for poor rainfall data
Rainfall time series are inherently imbalanced
Rare events (heavy rainfall) drive model sensitivity
Measurement errors on rare days disproportionately affect outcomes
Lesson: validation must emphasize rare, high-impact events
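One way to make validation emphasize rare events is to report error separately for extreme and normal days instead of a single pooled score. The sketch below assumes daily rainfall in millimetres; the 50 mm threshold, the function names, and the six-day series are hypothetical.

```python
import math

def rmse(pairs):
    """Root-mean-square error over (observed, simulated) pairs."""
    pairs = list(pairs)
    if not pairs:
        return float("nan")
    return math.sqrt(sum((o - s) ** 2 for o, s in pairs) / len(pairs))

def stratified_rmse(observed, simulated, extreme_threshold_mm):
    """Report RMSE overall, on normal days, and on extreme days."""
    pairs = list(zip(observed, simulated))
    extreme = [(o, s) for o, s in pairs if o >= extreme_threshold_mm]
    normal = [(o, s) for o, s in pairs if o < extreme_threshold_mm]
    return {
        "overall": rmse(pairs),
        "normal": rmse(normal),
        "extreme": rmse(extreme),
    }

# Hypothetical series: five ordinary days and one heavy-rain day
# that the simulation underestimates by half.
observed = [0.0, 2.0, 1.0, 0.0, 120.0, 3.0]
simulated = [0.0, 2.0, 1.0, 0.0, 60.0, 3.0]

rain_scores = stratified_rmse(observed, simulated, extreme_threshold_mm=50.0)
print(rain_scores)
```

The pooled score dilutes the single large miss across all six days, while the extreme-day score reports it at full size.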
Imbalance between normal and extreme rainfall affects yield predictions
Conceptual parallel with minority-class misclassification in ML
Example mapping between data bias and crop-model implications
Imbalance appears as temporal or spatial unevenness
Ethical duty to represent rare, high-impact conditions
When extremes are underrepresented, both AI and crop models learn an incomplete reality
Spatial variability of precipitation affects crop simulations
High-resolution local data yields better outputs
Interpolation smooths extremes, underestimating water stress
Satellite rainfall estimates can misrepresent local variability
Importance of spatial + temporal resolution in validation
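A small sketch of why interpolation smooths extremes, using inverse-distance weighting (IDW) along a one-dimensional transect. IDW is picked here only as a common example of spatial interpolation; the source does not say which method was discussed, and the gauge positions and rainfall values are hypothetical.

```python
def idw_estimate(target_km, gauges, power=2):
    """Inverse-distance-weighted rainfall at target_km.

    gauges: list of (position_km, rain_mm) pairs along a transect.
    """
    numerator = 0.0
    denominator = 0.0
    for position_km, rain_mm in gauges:
        distance = abs(target_km - position_km)
        if distance == 0:
            return rain_mm  # target sits exactly on a gauge
        weight = 1.0 / distance ** power
        numerator += weight * rain_mm
        denominator += weight
    return numerator / denominator

# Hypothetical gauges at 0, 10 and 20 km.
gauges = [(0.0, 40.0), (10.0, 45.0), (20.0, 38.0)]
estimate = idw_estimate(5.0, gauges)

values = [rain for _, rain in gauges]
print(f"estimate at 5 km: {estimate:.1f} mm "
      f"(gauge range {min(values):.0f} to {max(values):.0f} mm)")
```

The estimate is a weighted average, so it always falls within the range of the gauge values. A dry pocket or a local downpour between gauges cannot appear in the interpolated field, which is how water stress ends up underestimated.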
Encoding continuous and cyclic variables (Pabico 1996)
Interpolation representation for continuous ranges (e.g., growth −20% → 40%)
Example: proportional encoding between adjacent nodes
Periodic variables (time, date) require circular encoding
Wrap-around representations for cyclic continuity
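Both encodings can be sketched briefly. The first spreads a continuous value over the two adjacent nodes in proportion to its distance from each; the second maps a periodic value onto a circle with sine and cosine so that the end of a cycle sits next to its start. The node positions (−20, 0, 20, 40), the function names, and the sine/cosine choice are assumptions for illustration and are not necessarily the exact scheme in Pabico (1996).

```python
import math

def interpolation_encode(value, anchors):
    """Activate the two nodes adjacent to value, proportionally to closeness."""
    if not anchors[0] <= value <= anchors[-1]:
        raise ValueError("value outside the encoded range")
    activations = [0.0] * len(anchors)
    for i in range(len(anchors) - 1):
        low, high = anchors[i], anchors[i + 1]
        if low <= value <= high:
            fraction = (value - low) / (high - low)
            activations[i] = 1.0 - fraction
            activations[i + 1] = fraction
            break
    return activations

def cyclic_encode(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)

# Growth change of +10% on hypothetical nodes at -20, 0, 20 and 40 percent.
growth_code = interpolation_encode(10.0, [-20.0, 0.0, 20.0, 40.0])
print(growth_code)

# Day 365 and day 1 are far apart as raw integers but adjacent on the circle.
end_of_year = cyclic_encode(365, 365)
start_of_year = cyclic_encode(1, 365)
mid_year = cyclic_encode(183, 365)
print(math.dist(end_of_year, start_of_year), math.dist(start_of_year, mid_year))
```

With a raw day-of-year input, 31 December and 1 January look maximally different to a model; the circular form removes that artificial jump.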
Invitation to questions and open discussion: “Audience's puzzling questions and speaker's (hopefully) logical answers.”