VMD0053: Bias Tests, Prediction Error & the Model Validation Report
The core of model validation
Two statistical tests determine whether a biogeochemical model is credible enough to generate VCS carbon credits: the bias test (is the model systematically over- or under-estimating?) and the prediction error test (how precise are its estimates?). Both results feed directly into the credit deduction applied to the project.
Analogy: Testing a Rifle Scope
A bias test is like checking whether a rifle consistently shoots left of target (systematic bias). A prediction error test checks how scattered the shots are around where they land (precision). A scope that always shoots left by 5 cm can be corrected: you know the offset. A scope with completely scattered shots is unusable because you cannot predict where the next shot lands. VMD0053's two tests mirror this: bias tells you the systematic offset; prediction error tells you the scatter.
The Bias Test (Section 5.2.4)
Bias is the systematic tendency of a model to overestimate or underestimate effects. Even a biased model can still be used, but large biases produce large prediction errors, which result in larger credit deductions.
Equation 1 - Per-Study Bias

bias_s = Σ_i (predicted_i − observed_i) / n

| Term | Definition |
|---|---|
| bias_s | The average difference between what the model predicted and what was actually observed across all observations in one study, in tCO₂e |
| predicted_i | The model's predicted value for observation i, in tCO₂e |
| observed_i | The actual measured value for observation i, in tCO₂e |
| n | Total count of paired observations in the study, used as the divisor |
Mean bias = unweighted mean of all per-study biases (unweighted so that one large study with many observations does not automatically dominate model acceptance).
Equation 2 - Pooled Measurement Uncertainty (PMU)

PMU pools the standard errors of all validation observations into a single combined uncertainty that serves as the bias-test threshold. Its terms:

| Term | Definition |
|---|---|
| PMU | The combined measurement uncertainty across all validation observations, used as the threshold for the bias test, in tCO₂e |
| se_j | Standard error of the j-th observation from field measurements |
| n_j | Number of measurement replicates in the j-th observation |
| k | Total number of observations across all validation studies |
Bias validity test: |mean bias| ≤ PMU means the model PASSES the bias test.
Worked Bias Calculation (from VMD0053 Figure 5)
Three studies measuring the effect of reduced tillage on SOC change (values in tCO₂e):
| Study | Observations (modeled vs observed) | Per-study bias |
|---|---|---|
| Study 1 | (1.1 vs 4.5) and (12.2 vs 3.1) | ((1.1-4.5) + (12.2-3.1)) / 2 = +2.85 |
| Study 2 | (0.6 vs 1.2) | (0.6-1.2) / 1 = -0.60 |
| Study 3 | (-3.3 vs 3.5) and (4.5 vs 4.0) | ((-3.3-3.5) + (4.5-4.0)) / 2 = -3.15 |
Mean bias = (2.85 + (-0.60) + (-3.15)) / 3 = (-0.90) / 3 = -0.30
PMU calculated from all study standard errors = 1.6 tCO₂e
Bias test: |-0.30| = 0.30 ≤ 1.6 (PMU) - Model PASSES bias test
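The worked numbers above can be reproduced with a short script. This is a sketch, not methodology code; the study data and the PMU value of 1.6 tCO₂e are taken directly from the example:

```python
# Reproduces the VMD0053 Figure 5 worked example: per-study bias,
# unweighted mean bias, and the bias validity test against PMU.
studies = {
    "Study 1": [(1.1, 4.5), (12.2, 3.1)],   # (modeled, observed) pairs, tCO2e
    "Study 2": [(0.6, 1.2)],
    "Study 3": [(-3.3, 3.5), (4.5, 4.0)],
}
PMU = 1.6  # pooled measurement uncertainty from the example, tCO2e

def per_study_bias(pairs):
    """Equation 1: mean of (modeled - observed) within one study."""
    return sum(m - o for m, o in pairs) / len(pairs)

biases = {name: round(per_study_bias(pairs), 2)
          for name, pairs in studies.items()}

# Unweighted mean so one large study cannot dominate acceptance
mean_bias = sum(biases.values()) / len(biases)

passes = abs(mean_bias) <= PMU  # bias validity test
print(biases, round(mean_bias, 2), passes)
```

Note the unweighted average in the last step: each study contributes equally to the mean bias regardless of how many observations it holds.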
The Perverse Effect of Bias on Credits
Large model biases, whether positive (overestimation) or negative (underestimation), produce large residuals when computing prediction error. Large prediction error means fewer credits survive the uncertainty deduction. So even a positively biased model that overestimates benefits is penalized through the credit deduction system. There is no "free lunch" from inflating model outputs.
The Prediction Error Test (Section 5.2.5)
Even an unbiased model may be imprecise: its predictions may scatter widely around observed values. VMD0053 quantifies this as model prediction error, which is used to deduct credits under VM0042's uncertainty framework.
Model Prediction Error Variance

σ²_model = Σ_j ((m_j − o_j) − d̄)² / (k − 1), where d̄ = Σ_j (m_j − o_j) / k

| Term | Definition |
|---|---|
| σ²_model | How widely the model's errors scatter around their own mean; larger values mean less precise predictions, in (tCO₂e)² |
| m_j | The model's predicted value for observation j, in tCO₂e |
| o_j | The actual measured value for observation j, in tCO₂e |
| d̄ | The mean residual: the average of all (modeled minus observed) differences across k observations |
| k | Total number of paired modeled-observed observations across all studies |
σ_model = √(σ²_model), the standard deviation of prediction errors.
90% Prediction Interval (Frequentist Models)

PI_i = [μ_i − z × σ_model, μ_i + z × σ_model]

| Term | Definition |
|---|---|
| PI_i | The range within which 90% of actual observations should fall if the model is valid, for observation i |
| μ_i | The model's predicted effect of practice change for observation i, in tCO₂e |
| z | The z-score for 90% confidence under the standard normal distribution (≈ 1.64) |
| σ_model | Standard deviation of model prediction errors from the variance calculation above |
Confidence coverage test: At least 90% of observed validation data values must fall within these intervals for the model to PASS.
Note: A prediction interval tells you where a future individual observation is expected to fall. A confidence interval tells you the uncertainty in an estimated mean. VMD0053 uses prediction intervals, the harder of the two tests, because it checks individual data points, not just the average.
Prediction Error: Continuing the Worked Example
From the five modeled-observed pairs in our earlier example, the residuals (modeled − observed) are −3.4, 9.1, −0.6, −6.8 and 0.5, giving:
σ²_model = 35.2, so σ_model = √35.2 ≈ 5.9 tCO₂e
For modeled value μ_i = 6.8: 90% PI = [6.8 − 1.64 × 5.9, 6.8 + 1.64 × 5.9] = [-2.9, 16.5]
For modeled value μ_i = -4.0: 90% PI = [-4.0 - 9.7, -4.0 + 9.7] = [-13.7, 5.7]
The test: Are all (or at least 90% of) observed values within their respective intervals?
If yes - model PASSES confidence coverage test - σ_model is used in VM0042 credit deductions
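Continuing in code, the same five pairs give the prediction error variance (sample variance of the residuals, divisor k − 1), σ_model, and the coverage check, using z = 1.64 as in the text:

```python
import math

# Prediction error variance and 90% PI coverage for the worked example.
pairs = [(1.1, 4.5), (12.2, 3.1), (0.6, 1.2), (-3.3, 3.5), (4.5, 4.0)]

residuals = [m - o for m, o in pairs]          # modeled - observed
k = len(residuals)
mean_resid = sum(residuals) / k                # d-bar, the mean residual
var_model = sum((r - mean_resid) ** 2 for r in residuals) / (k - 1)
sigma_model = math.sqrt(var_model)             # ~5.9 tCO2e

Z90 = 1.64  # z-score the text uses for a 90% interval

def prediction_interval(mu):
    """90% PI centred on the modeled value mu."""
    return mu - Z90 * sigma_model, mu + Z90 * sigma_model

covered = 0
for m, o in pairs:
    lo, hi = prediction_interval(m)
    covered += lo <= o <= hi                   # count observed values inside

coverage = covered / k                         # must be >= 0.90 to pass
print(round(var_model, 1), round(sigma_model, 1), coverage)
```

Running this reproduces σ²_model = 35.2 and σ_model ≈ 5.9, confirming the divisor is k − 1 (sample variance).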
How Model Prediction Error Flows into Credit Deductions
VM0042 uses a probability-of-exceedance framework for uncertainty. The model prediction error (σ_model) becomes one of the uncertainty inputs to the 95% confidence lower-bound calculation. A larger σ_model means the lower-bound estimate of carbon benefits is further from the central estimate, and fewer VCUs are issued. This is why improving model precision directly increases project revenue.
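As a rough illustration of the direction of this effect only (not VM0042's actual deduction procedure, which has additional uncertainty inputs), a one-sided 95% lower bound under a normal approximation moves further from the central estimate as σ_model grows:

```python
# Illustrative sketch: a larger sigma_model shrinks the conservative
# lower-bound estimate of carbon benefits under a normal approximation.
# The one-sided z of 1.645 and this simplified formula are assumptions
# for illustration; VM0042's real calculation combines more inputs.
Z95_ONE_SIDED = 1.645

def lower_bound_estimate(central_tco2e, sigma_model):
    """Value exceeded with ~95% probability if errors are ~Normal."""
    return central_tco2e - Z95_ONE_SIDED * sigma_model

precise = lower_bound_estimate(100.0, 5.9)    # more precise model
imprecise = lower_bound_estimate(100.0, 12.0) # less precise model
print(precise, imprecise)  # the imprecise model yields the lower bound
```

The same central estimate of 100 tCO₂e yields a noticeably smaller creditable lower bound when σ_model doubles, which is why precision improvements flow directly to revenue.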
Two Options for Calculating Prediction Error
| Option | When Used | Method |
|---|---|---|
| Option A: Analytical | Sufficient field trial data available | Direct calculation of variance of (modeled - observed) differences |
| Option B: Monte Carlo | Models with parameter uncertainty (e.g., Bayesian) | Posterior predictive distributions (PPDs) resampled iteratively to estimate prediction error |
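Option B can be sketched as follows, assuming a Gaussian posterior predictive distribution (PPD) per observation as a stand-in for a real Bayesian model's output; the data and distribution parameters here are hypothetical:

```python
import random
import statistics

# Sketch of Option B: estimate prediction error by resampling posterior
# predictive distributions. The per-observation Gaussian PPD is a
# placeholder assumption; a real Bayesian model supplies its own draws.
random.seed(42)

# Hypothetical (observed, ppd_mean, ppd_sd) per validation observation
observations = [(4.5, 3.8, 2.0), (3.1, 5.0, 2.5), (1.2, 0.9, 1.5)]

n_draws = 10_000
residual_draws = []
for obs, mu, sd in observations:
    for _ in range(n_draws):
        predicted = random.gauss(mu, sd)       # one PPD draw
        residual_draws.append(predicted - obs)  # draw minus observed

# Monte Carlo estimate of the prediction error spread
sigma_model = statistics.stdev(residual_draws)
print(round(sigma_model, 2))
```

The pooled spread of the resampled residuals plays the role that the analytical residual standard deviation plays in Option A.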
The Model Validation Report (MVR) and IME Review
All validation work is documented in a Model Validation Report (MVR), which must be reviewed by an Independent Modeling Expert (IME) contracted by the VVB.
What the MVR Must Contain
- Model version and calibration process description
- All internal model parameter sets with proof of climate-zone-level resolution
- Full citation of calibration and validation datasets
- Per-study bias and mean bias across all studies (ranked highest to lowest)
- All PMU values used for each PC/CFG/ES combination
- Graphs of measured vs modeled with 90% prediction interval coverage
- Scatterplot and histogram of residuals (modeled - observed)
- Mean squared error for each combination
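The residual and mean-squared-error line items can be computed with a small helper; the combination labels below are hypothetical placeholders for PC/CFG/ES combinations:

```python
from collections import defaultdict

# Computes two MVR line items: residuals (modeled - observed) for the
# scatterplot/histogram, and mean squared error per combination.
# (combination, modeled, observed) triples; labels are made up.
records = [
    ("reduced_tillage", 1.1, 4.5),
    ("reduced_tillage", 12.2, 3.1),
    ("cover_crops", 0.6, 1.2),
]

by_combo = defaultdict(list)
for combo, m, o in records:
    by_combo[combo].append(m - o)  # residual, feeds the diagnostic plots

mse = {combo: sum(r * r for r in resids) / len(resids)
       for combo, resids in by_combo.items()}
print(mse)
```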
IME Assessment Steps
- Project proponent submits MVR to VVB
- VVB selects and contracts an IME meeting Verra's minimum qualifications
- IME assesses: model eligibility (Section 4), calibration procedures, bias and prediction error calculations, confidence coverage
- IME issues an assessment report submitted to VVB
- Both MVR and IME assessment report are made publicly available in the Verra registry
IME Minimum Qualifications
- ≥5 years relevant experience quantifying GHG fluxes from ALM using biogeochemical models
- Demonstrated use of the specific model type or conceptually similar models (peer-reviewed publications or project reports as evidence)
- Freedom from conflict of interest (organizational affiliations must be disclosed)
- Two references from researchers or academic staff
Peer-Reviewed Journal Alternative
Instead of a standalone IME review, an MVR may be published in one of 30+ approved peer-reviewed journals listed in VMD0053 Table 3 (including Geoderma, Global Biogeochemical Cycles, Soil Science Society of America Journal, etc.) AND reviewed by an IME. The peer-reviewed publication must explicitly state its purpose as validating the model for VCS carbon credits under ISO 14064. Important: the project proponent must also submit a separate sub-report outlining how the MVR requirements have been met and clarifying any aspects of the peer-reviewed paper as they pertain to the overall requirements; the published paper alone is not sufficient.
Petitions for Exception: When Tests Fail Due to Data Scarcity
Failing the bias test or the 90% confidence coverage test is not always an automatic dead end. VMD0053 allows project proponents to petition for validation approval even when strict thresholds are not met, provided data scarcity is the documented reason. For example:
- Bias > PMU: petition allowed if the mean bias is demonstrably due to insufficient validation observations, subject to IME and VVB approval
- Coverage below 90%: petition allowed for marginal failures (e.g., 6/7 or 7/8 observations covered) where global data gaps exist for the specific PC/CFG/ES combination
Petitions are evaluated case-by-case and require clear justification; they are not routine exceptions.
Default PMU When Standard Errors Are Not Published
Many legacy soil studies do not report standard errors for their measurements, making it impossible to compute PMU directly. In these cases, VMD0053 allows a default replacement PMU value based on typical measurement error for the specific measurement technique (e.g., the combined measurement error of the SOC content and bulk density measurement techniques used). PMU may also be defined as a function of cumulative sampling depth based on observed measurement uncertainty in validation datasets. These default values must be justified and are subject to IME approval and VVB review. The North American Proficiency Testing Program (NAPT) is suggested as a reference for instrument-specific measurement error rates.
When Must the MVR Be Updated?
- Project area expands to include new PCs, CFGs, or emission sources not covered in the original MVR
- Model is changed in a way that substantially affects model runs and estimated emission reductions/removals
- The MVR must be submitted alongside each monitoring report; the existing MVR continues to apply as long as the project remains within the already validated domain
Key Takeaways
1. The bias test checks whether the model systematically over- or under-estimates: pass requires absolute mean bias to be less than or equal to pooled measurement uncertainty (PMU)
2. Prediction error variance quantifies model scatter: larger variance means bigger credit deductions through VM0042's uncertainty framework
3. The 90% confidence coverage test requires at least 90% of observed values to fall within their respective 90% prediction intervals
4. The Model Validation Report (MVR) must be reviewed by an Independent Modeling Expert (IME) with 5+ years of biogeochemical modeling experience and no conflict of interest
5. Petitions for exception are permitted when bias or coverage tests fail due to documented data scarcity, but require case-by-case IME and VVB approval
6. The MVR must be updated whenever the project expands to new practice categories, crop functional groups, or emission sources not covered in the original validation