Model evaluation

Repeated, well-documented comparisons between model simulations and real-world measurements increase confidence in the suitability of a model for a given purpose. A combination of graphical presentations and statistical measures is used to evaluate the performance of the model.

Graphical analysis

Simulated and measured soil water tension, field water depth, aboveground biomass, and grain yield are plotted as time series to show how well the model simulates the temporal fluctuation of each variable. If the fluctuations in simulated field water depth and soil water tension generally agree with the dynamics of the measured values, the temporal trends are considered to be simulated well. The dynamics of simulated and measured values may differ by 1 day, because the time step of integration in the model is 1 day and it is unknown whether rainfall events occurred during the night (i.e., after integration of state variables) or during the day (i.e., before integration). Similarly, it is mostly unknown whether irrigation was applied before or after field water depth was recorded.

Scatter plots of simulated against measured data, drawn together with the 1:1 line, show the overall goodness of fit of the model. The closer the points lie to the 1:1 line, the better the overall model performance.

Statistical analysis

Independent experimental data sets are used to evaluate the performance of ORYZA2000. The model is run to simulate crop growth and development for each treatment in each experiment, using the constructed crop data file, the derived soil properties, actual daily weather data, and groundwater table depths. The simulated results are then compared with the measured data.

For each crop, soil, and water variable, the slope (a), intercept (b), and coefficient of determination (R2) of the linear regression between simulated (S) and measured (M) values are calculated.
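The regression statistics can be sketched as follows. This is a minimal illustration, not part of ORYZA2000: the function name linear_fit and the sample values are hypothetical, and the sketch assumes that simulated values are regressed on measured values (S = a * M + b), which the text does not specify.

```python
from math import fsum


def linear_fit(measured, simulated):
    """Least-squares fit of simulated = a * measured + b.

    Returns (a, b, r2): slope, intercept and coefficient of determination.
    Assumption: measured values are the independent variable.
    """
    n = len(measured)
    if n != len(simulated) or n < 2:
        raise ValueError("need at least two paired values")
    mean_m = fsum(measured) / n
    mean_s = fsum(simulated) / n
    # Sums of squares and cross-products around the means.
    sxx = fsum((m - mean_m) ** 2 for m in measured)
    syy = fsum((s - mean_s) ** 2 for s in simulated)
    sxy = fsum((m - mean_m) * (s - mean_s)
               for m, s in zip(measured, simulated))
    if sxx == 0 or syy == 0:
        raise ValueError("values must not all be identical")
    a = sxy / sxx
    b = mean_s - a * mean_m
    r2 = sxy ** 2 / (sxx * syy)
    return a, b, r2


# Hypothetical biomass values (kg/ha), for illustration only.
measured = [1000.0, 2000.0, 3000.0, 4000.0]
simulated = [1100.0, 2100.0, 3100.0, 4100.0]
a, b, r2 = linear_fit(measured, simulated)
```

In this made-up case the model overestimates every measurement by the same amount, so the slope and R2 are ideal while the intercept reveals the constant bias. This is why the three statistics are read together.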

A Student’s t-test of means assuming unequal variance may be used to compare the simulated and measured data, giving the probability P(t*). Furthermore, the absolute (RMSEa) and normalized (RMSEn) root mean square errors between simulated and measured values may also be computed.
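The two errors are commonly defined as follows. The symbols n (number of paired values), i (index of a pair), and \bar{M} (mean of the measured values) are assumptions introduced here, as is the normalization by the mean of the measured values:

```latex
\mathrm{RMSE}_a = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(S_i - M_i\right)^2}
\qquad
\mathrm{RMSE}_n = 100 \times \frac{\mathrm{RMSE}_a}{\bar{M}}
```

Under these definitions RMSEa has the units of the variable itself, whereas RMSEn is a percentage of the measured mean.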

A model reproduces experimental data best when the slope a is 1, the intercept b is 0, R2 is 1, P(t*) is larger than 0.05, RMSEa is similar to the standard deviation (SD) of the measured values, and RMSEn is of the same order of magnitude as the coefficient of variation (CV) of the measured values. If the CV of the measured values is not available, RMSEn may instead be compared with literature data or with the standard scale developed by Jamieson et al. (1991).
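The error measures used in these criteria can be sketched as below. This is an illustration under stated assumptions, not the ORYZA2000 implementation: the function names and data are hypothetical, RMSEn is assumed to be RMSEa expressed as a percentage of the measured mean, and SD and CV are assumed to be sample statistics of the measured values. Only the t statistic and its degrees of freedom are computed; obtaining P(t*) additionally requires the t distribution, which a statistics library provides (for example scipy.stats.ttest_ind with equal_var=False).

```python
from math import sqrt
from statistics import mean, stdev, variance


def rmse_abs(measured, simulated):
    """Absolute root mean square error, in the units of the variable."""
    n = len(measured)
    return sqrt(sum((s - m) ** 2 for m, s in zip(measured, simulated)) / n)


def rmse_norm(measured, simulated):
    """RMSE as a percentage of the measured mean (assumed normalization)."""
    return 100.0 * rmse_abs(measured, simulated) / mean(measured)


def welch_t(measured, simulated):
    """t statistic and Welch-Satterthwaite degrees of freedom
    for a test of means assuming unequal variance."""
    n_m, n_s = len(measured), len(simulated)
    v_m = variance(measured) / n_m    # squared standard error, measured
    v_s = variance(simulated) / n_s   # squared standard error, simulated
    t = (mean(simulated) - mean(measured)) / sqrt(v_m + v_s)
    df = (v_m + v_s) ** 2 / (v_m ** 2 / (n_m - 1) + v_s ** 2 / (n_s - 1))
    return t, df


# Hypothetical paired values, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
simulated = [3.0, 3.0, 7.0, 7.0]

rmse_a = rmse_abs(measured, simulated)
rmse_n = rmse_norm(measured, simulated)
sd = stdev(measured)                    # compare with rmse_a
cv = 100.0 * sd / mean(measured)        # compare with rmse_n
t, df = welch_t(measured, simulated)
```

With these made-up numbers the simulated and measured means coincide, so the t statistic is zero even though individual points deviate. The t-test therefore detects bias in the mean, and the RMSE terms are what capture point-by-point error.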