VMD0053: Bias Tests, Prediction Error & the Model Validation Report
The core of model validation
Two statistical tests determine whether a biogeochemical model is credible enough to generate VCS carbon credits: the bias test (is the model systematically over- or under-estimating?) and the prediction error test (how precise are its estimates?). Both results feed directly into the credit deduction applied to the project.
Analogy: Testing a Rifle Scope
A bias test is like checking whether a rifle consistently shoots left of target (systematic bias). A prediction error test checks how scattered the shots are around where they land (precision). A scope that always shoots left by 5 cm can be corrected: you know the offset. A scope with completely scattered shots is unusable because you cannot predict where the next shot lands. VMD0053's two tests mirror this: bias tells you the systematic offset; prediction error tells you the scatter.
The Bias Test (Section 5.2.4)
Bias is the systematic tendency of a model to overestimate or underestimate effects. Even a biased model can still be used, but large biases produce large prediction errors, which result in larger credit deductions.
Equation 1 - Per-Study Bias

bias_s = Σ_i (predicted_i − observed_i) / n

| Term | Definition |
|---|---|
| bias_s | The average difference between what the model predicted and what was actually observed across all observations in one study, in tCO₂e |
| predicted_i | The model's predicted value for observation i, in tCO₂e |
| observed_i | The actual measured value for observation i, in tCO₂e |
| n | Total count of paired observations in the study, used as the divisor |
Mean bias = unweighted mean of all per-study biases (unweighted so that one large study with many observations does not automatically dominate model acceptance).
Equation 2 - Pooled Measurement Uncertainty (PMU)

PMU pools the standard errors of all validation observations into a single combined uncertainty that serves as the bias-test threshold. Its terms:

| Term | Definition |
|---|---|
| PMU | The combined measurement uncertainty across all validation observations, used as the threshold for the bias test, in tCO₂e |
| se_j | Standard error of the j-th observation from field measurements |
| n_j | Number of measurement replicates in the j-th observation |
| k | Total number of observations across all validation studies |
Bias validity test: |mean bias| ≤ PMU means the model PASSES the bias test.
Worked Bias Calculation (from VMD0053 Figure 5)
Three studies measuring the effect of reduced tillage on SOC change (values in tCO₂e):
| Study | Observations (modeled vs observed) | Per-study bias |
|---|---|---|
| Study 1 | (1.1 vs 4.5) and (12.2 vs 3.1) | ((1.1-4.5) + (12.2-3.1)) / 2 = +2.85 |
| Study 2 | (0.6 vs 1.2) | (0.6-1.2) / 1 = -0.60 |
| Study 3 | (-3.3 vs 3.5) and (4.5 vs 4.0) | ((-3.3-3.5) + (4.5-4.0)) / 2 = -3.15 |
Mean bias = (2.85 + (-0.60) + (-3.15)) / 3 = (-0.90) / 3 = -0.30
PMU calculated from all study standard errors = 1.6 tCO₂e
Bias test: |-0.30| = 0.30 ≤ 1.6 (PMU) - Model PASSES bias test
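The worked numbers above can be reproduced with a short script. This is a sketch, not methodology code; the study data and the PMU value of 1.6 tCO₂e are taken directly from the example:

```python
# Reproduces the VMD0053 Figure 5 worked example: per-study bias,
# unweighted mean bias, and the bias validity test against PMU.
studies = {
    "Study 1": [(1.1, 4.5), (12.2, 3.1)],   # (modeled, observed) pairs, tCO2e
    "Study 2": [(0.6, 1.2)],
    "Study 3": [(-3.3, 3.5), (4.5, 4.0)],
}
PMU = 1.6  # pooled measurement uncertainty from the example, tCO2e

def per_study_bias(pairs):
    """Equation 1: mean of (modeled - observed) within one study."""
    return sum(m - o for m, o in pairs) / len(pairs)

biases = {name: round(per_study_bias(pairs), 2)
          for name, pairs in studies.items()}

# Unweighted mean so one large study cannot dominate acceptance
mean_bias = sum(biases.values()) / len(biases)

passes = abs(mean_bias) <= PMU  # bias validity test
print(biases, round(mean_bias, 2), passes)
```

Note the unweighted average in the last step: each study contributes equally to the mean bias regardless of how many observations it holds.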
The Perverse Effect of Bias on Credits
Large model biases, whether positive (overestimation) or negative (underestimation), produce large residuals when computing prediction error. Large prediction error means fewer credits survive the uncertainty deduction. So even a positively biased model that overestimates benefits is penalized through the credit deduction system. There is no "free lunch" from inflating model outputs.
The Prediction Error Test (Section 5.2.5)
Even an unbiased model may be imprecise: its predictions may scatter widely around observed values. VMD0053 quantifies this as model prediction error, which is used to deduct credits under VM0042's uncertainty framework.
Model Prediction Error Variance

σ²_model = Σ_j ((m_j − o_j) − d̄)² / (k − 1), where d̄ = Σ_j (m_j − o_j) / k

| Term | Definition |
|---|---|
| σ²_model | How widely the model's errors scatter around their own mean; larger values mean less precise predictions, in (tCO₂e)² |
| m_j | The model's predicted value for observation j, in tCO₂e |
| o_j | The actual measured value for observation j, in tCO₂e |
| d̄ | The mean residual: the average of all (modeled minus observed) differences across k observations |
| k | Total number of paired modeled-observed observations across all studies |
σ_model = √(σ²_model), the standard deviation of prediction errors.
90% Prediction Interval (Frequentist Models)

PI_i = [μ_i − z × σ_model, μ_i + z × σ_model]

| Term | Definition |
|---|---|
| PI_i | The range within which 90% of actual observations should fall if the model is valid, for observation i |
| μ_i | The model's predicted effect of practice change for observation i, in tCO₂e |
| z | The z-score for 90% confidence under the standard normal distribution (≈ 1.64) |
| σ_model | Standard deviation of model prediction errors from the variance calculation above |
Confidence coverage test: At least 90% of observed validation data values must fall within these intervals for the model to PASS.
Note: A prediction interval tells you where a future individual observation is expected to fall. A confidence interval tells you the uncertainty in an estimated mean. VMD0053 uses prediction intervals, the harder of the two tests, because it checks individual data points, not just the average.
Prediction Error: Continuing the Worked Example
From the five modeled-observed pairs in our earlier example, the residuals (modeled − observed) are −3.4, 9.1, −0.6, −6.8 and 0.5, giving:
σ²_model = 35.2, so σ_model = √35.2 ≈ 5.9 tCO₂e
For modeled value μ_i = 6.8: 90% PI = [6.8 − 1.64 × 5.9, 6.8 + 1.64 × 5.9] = [-2.9, 16.5]
For modeled value μ_i = -4.0: 90% PI = [-4.0 - 9.7, -4.0 + 9.7] = [-13.7, 5.7]
The test: Are all (or at least 90% of) observed values within their respective intervals?
If yes - model PASSES confidence coverage test - σ_model is used in VM0042 credit deductions
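Continuing in code, the same five pairs give the prediction error variance (sample variance of the residuals, divisor k − 1), σ_model, and the coverage check, using z = 1.64 as in the text:

```python
import math

# Prediction error variance and 90% PI coverage for the worked example.
pairs = [(1.1, 4.5), (12.2, 3.1), (0.6, 1.2), (-3.3, 3.5), (4.5, 4.0)]

residuals = [m - o for m, o in pairs]          # modeled - observed
k = len(residuals)
mean_resid = sum(residuals) / k                # d-bar, the mean residual
var_model = sum((r - mean_resid) ** 2 for r in residuals) / (k - 1)
sigma_model = math.sqrt(var_model)             # ~5.9 tCO2e

Z90 = 1.64  # z-score the text uses for a 90% interval

def prediction_interval(mu):
    """90% PI centred on the modeled value mu."""
    return mu - Z90 * sigma_model, mu + Z90 * sigma_model

covered = 0
for m, o in pairs:
    lo, hi = prediction_interval(m)
    covered += lo <= o <= hi                   # count observed values inside

coverage = covered / k                         # must be >= 0.90 to pass
print(round(var_model, 1), round(sigma_model, 1), coverage)
```

Running this reproduces σ²_model = 35.2 and σ_model ≈ 5.9, confirming the divisor is k − 1 (sample variance).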
How Model Prediction Error Flows into Credit Deductions
VM0042 uses a probability-of-exceedance framework for uncertainty. The model prediction error (σ_model) becomes one of the uncertainty inputs to the 95% confidence lower-bound calculation. A larger σ_model means the lower-bound estimate of carbon benefits is further from the central estimate, and fewer VCUs are issued. This is why improving model precision directly increases project revenue.
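As a rough illustration of the direction of this effect only (not VM0042's actual deduction procedure, which has additional uncertainty inputs), a one-sided 95% lower bound under a normal approximation moves further from the central estimate as σ_model grows:

```python
# Illustrative sketch: a larger sigma_model shrinks the conservative
# lower-bound estimate of carbon benefits under a normal approximation.
# The one-sided z of 1.645 and this simplified formula are assumptions
# for illustration; VM0042's real calculation combines more inputs.
Z95_ONE_SIDED = 1.645

def lower_bound_estimate(central_tco2e, sigma_model):
    """Value exceeded with ~95% probability if errors are ~Normal."""
    return central_tco2e - Z95_ONE_SIDED * sigma_model

precise = lower_bound_estimate(100.0, 5.9)    # more precise model
imprecise = lower_bound_estimate(100.0, 12.0) # less precise model
print(precise, imprecise)  # the imprecise model yields the lower bound
```

The same central estimate of 100 tCO₂e yields a noticeably smaller creditable lower bound when σ_model doubles, which is why precision improvements flow directly to revenue.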
Two Options for Calculating Prediction Error
| Option | When Used | Method |
|---|---|---|
| Option A: Analytical | Sufficient field trial data available | Direct calculation of variance of (modeled - observed) differences |
| Option B: Monte Carlo | Models with parameter uncertainty (e.g., Bayesian) | Posterior predictive distributions (PPDs) resampled iteratively to estimate prediction error |
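Option B can be sketched as follows, assuming a Gaussian posterior predictive distribution (PPD) per observation as a stand-in for a real Bayesian model's output; the data and distribution parameters here are hypothetical:

```python
import random
import statistics

# Sketch of Option B: estimate prediction error by resampling posterior
# predictive distributions. The per-observation Gaussian PPD is a
# placeholder assumption; a real Bayesian model supplies its own draws.
random.seed(42)

# Hypothetical (observed, ppd_mean, ppd_sd) per validation observation
observations = [(4.5, 3.8, 2.0), (3.1, 5.0, 2.5), (1.2, 0.9, 1.5)]

n_draws = 10_000
residual_draws = []
for obs, mu, sd in observations:
    for _ in range(n_draws):
        predicted = random.gauss(mu, sd)       # one PPD draw
        residual_draws.append(predicted - obs)  # draw minus observed

# Monte Carlo estimate of the prediction error spread
sigma_model = statistics.stdev(residual_draws)
print(round(sigma_model, 2))
```

The pooled spread of the resampled residuals plays the role that the analytical residual standard deviation plays in Option A.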
The Model Validation Report (MVR) and IME Review
All validation work is documented in a Model Validation Report (MVR), which must be reviewed by an Independent Modeling Expert (IME) contracted by the VVB.
What the MVR Must Contain
- Model version and calibration process description
- All internal model parameter sets with proof of climate-zone-level resolution
- Full citation of calibration and validation datasets
- Per-study bias and mean bias across all studies (ranked highest to lowest)
- All PMU values used for each PC/CFG/ES combination
- Graphs of measured vs modeled with 90% prediction interval coverage
- Scatterplot and histogram of residuals (modeled - observed)
- Mean squared error for each combination
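The residual and mean-squared-error line items can be computed with a small helper; the combination labels below are hypothetical placeholders for PC/CFG/ES combinations:

```python
from collections import defaultdict

# Computes two MVR line items: residuals (modeled - observed) for the
# scatterplot/histogram, and mean squared error per combination.
# (combination, modeled, observed) triples; labels are made up.
records = [
    ("reduced_tillage", 1.1, 4.5),
    ("reduced_tillage", 12.2, 3.1),
    ("cover_crops", 0.6, 1.2),
]

by_combo = defaultdict(list)
for combo, m, o in records:
    by_combo[combo].append(m - o)  # residual, feeds the diagnostic plots

mse = {combo: sum(r * r for r in resids) / len(resids)
       for combo, resids in by_combo.items()}
print(mse)
```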
IME Assessment Steps
- Project proponent submits MVR to VVB
- VVB selects and contracts an IME meeting Verra's minimum qualifications
- IME assesses: model eligibility (Section 4), calibration procedures, bias and prediction error calculations, confidence coverage
- IME issues an assessment report submitted to VVB
- Both MVR and IME assessment report are made publicly available in the Verra registry
IME Minimum Qualifications
- ≥5 years relevant experience quantifying GHG fluxes from ALM using biogeochemical models
- Demonstrated use of the specific model type or conceptually similar models (peer-reviewed publications or project reports as evidence)
- Freedom from conflict of interest (organizational affiliations must be disclosed)
- Two references from researchers or academic staff
Peer-Reviewed Journal Alternative
Instead of a standalone IME review, an MVR may be published in one of 30+ approved peer-reviewed journals listed in VMD0053 Table 3 (including Geoderma, Global Biogeochemical Cycles, Soil Science Society of America Journal, etc.) AND reviewed by an IME. The peer-reviewed publication must explicitly state its purpose as validating the model for VCS carbon credits under ISO 14064. Important: the project proponent must also submit a separate sub-report outlining how the MVR requirements have been met and clarifying any aspects of the peer-reviewed paper as they pertain to the overall requirements; the published paper alone is not sufficient.
Petitions for Exception: When Tests Fail Due to Data Scarcity
Failing the bias test or the 90% confidence coverage test is not always an automatic dead end. VMD0053 allows project proponents to petition for validation approval even when strict thresholds are not met, provided data scarcity is the documented reason. For example:
- Bias > PMU: petition allowed if the mean bias is demonstrably due to insufficient validation observations, subject to IME and VVB approval
- Coverage below 90%: petition allowed for marginal failures (e.g., 6/7 or 7/8 observations covered) where global data gaps exist for the specific PC/CFG/ES combination
Petitions are evaluated case-by-case and require clear justification; they are not routine exceptions.
Default PMU When Standard Errors Are Not Published
Many legacy soil studies do not report standard errors for their measurements, making it impossible to compute PMU directly. In these cases, VMD0053 allows a default replacement PMU value based on typical measurement error for the specific measurement technique (e.g., the combined measurement error of the SOC content and bulk density measurement techniques used). PMU may also be defined as a function of cumulative sampling depth based on observed measurement uncertainty in validation datasets. These default values must be justified and are subject to IME approval and VVB review. The North American Proficiency Testing Program (NAPT) is suggested as a reference for instrument-specific measurement error rates.
When Must the MVR Be Updated?
- Project area expands to include new PCs, CFGs, or emission sources not covered in the original MVR
- Model is changed in a way that substantially affects model runs and estimated emission reductions/removals
- The MVR must be submitted alongside each monitoring report; the existing MVR continues to apply as long as the project remains within the already validated domain
Key Takeaways
1. The bias test checks whether the model systematically over- or under-estimates: pass requires absolute mean bias to be less than or equal to pooled measurement uncertainty (PMU)
2. Prediction error variance quantifies model scatter: larger variance means bigger credit deductions through VM0042's uncertainty framework
3. The 90% confidence coverage test requires at least 90% of observed values to fall within their respective 90% prediction intervals
4. The Model Validation Report (MVR) must be reviewed by an Independent Modeling Expert (IME) with 5+ years of biogeochemical modeling experience and no conflict of interest
5. Petitions for exception are permitted when bias or coverage tests fail due to documented data scarcity, but require case-by-case IME and VVB approval
6. The MVR must be updated whenever the project expands to new practice categories, crop functional groups, or emission sources not covered in the original validation