Machine-learning grade prediction models are proliferating across the exploration industry. Used well, they surface patterns that classical geostatistics misses. Used poorly — or disclosed poorly — they expose the signing QP to regulatory and reputational risk that most practitioners have not yet fully reckoned with.
This post focuses specifically on gradient boosting methods (XGBoost, LightGBM, CatBoost) because they are the most widely adopted ML family in geological prediction tasks right now. The principles, however, apply equally to random forests, neural networks, and any other non-parametric predictive model used to inform a resource estimate that will appear in a regulatory filing.
Where ML Adds Genuine Value in Grade Prediction
Classical kriging excels when mineralisation follows a well-defined spatial covariance structure that can be adequately captured by a variogram model. It struggles when grade is co-controlled by multiple geological variables — lithology, alteration intensity, structural proximity, and geochemical pathfinders — whose interactions are non-linear and high-dimensional.
This is precisely the problem gradient boosting is designed to solve. Given a sufficiently rich training dataset, a well-tuned XGBoost model can learn interaction terms between predictor variables that would require dozens of indicator kriging passes to approximate. In practice I have seen well-constructed gradient boosting models reduce cross-validation RMSE by 15–30% relative to ordinary kriging in lithologically complex porphyry and skarn systems.
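To make the residual-fitting mechanism concrete, here is a minimal NumPy sketch of gradient boosting with depth-1 trees (stumps), fitted to a hypothetical synthetic dataset whose grade is driven by an interaction term. Everything here is illustrative — real work would use a tuned library implementation such as XGBoost — but it shows why the ensemble, not any single parameter, defines the estimator.

```python
import numpy as np

def fit_stump(X, residuals):
    """Fit a depth-1 regression tree: the single-feature split that
    minimises squared error against the current residuals."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            lv, rv = residuals[left].mean(), residuals[~left].mean()
            err = ((residuals[left] - lv) ** 2).sum() + \
                  ((residuals[~left] - rv) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]

def boost(X, y, n_rounds=100, lr=0.1):
    """Each stump is fitted to the residuals of the current ensemble,
    then added with shrinkage lr -- the core gradient boosting loop."""
    pred = np.full(len(y), y.mean())
    for _ in range(n_rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred = pred + lr * np.where(X[:, j] <= t, lv, rv)
    return pred

# Hypothetical composites: grade driven by an interaction of two predictors.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = 2.0 * X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(0.0, 0.05, 200)

pred = boost(X, y)
rmse_mean = np.sqrt(((y - y.mean()) ** 2).mean())  # baseline: global mean
rmse_gbm = np.sqrt(((y - pred) ** 2).mean())       # boosted ensemble
```

Note that the fitted model is the accumulated set of splits and leaf values, which is why disclosing hyperparameters alone does not characterise its behaviour.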
The other area where ML adds clear value is early-stage target generation: identifying drill targets from geophysics, geochemistry and remote sensing data where no resource estimate is yet intended. In this setting the regulatory stakes are lower and the model can be used in a more exploratory, iterative fashion.

There is a meaningful difference between using ML to support a resource estimate (feature engineering, domain classification, outlier detection) and using ML as the primary interpolation engine within a regulatory resource estimate. The disclosure obligations — and the QP's liability exposure — differ substantially.
The Black-Box Disclosure Problem Under SK-1300
The SEC's SK-1300 standard (compliance required for fiscal years beginning on or after January 1, 2021) requires that the QP provide a sufficiently detailed description of the estimation methodology to allow a reasonably informed person to understand the basis for the estimate. NI 43-101 Form 43-101F1 Item 14 has a comparable requirement.
This is where gradient boosting creates a genuine disclosure challenge. A hyperparameter-tuned XGBoost model with 500 trees, max depth 6, and learning rate 0.05 is not meaningfully described by those parameters alone. Unlike ordinary kriging — where the variogram model and search ellipsoid fully characterise the estimator — a gradient boosting model's behaviour depends on the entire training dataset, the feature set, the interaction structure learned during fitting, and the regularisation choices made to control overfitting.
"A kriging estimate can be audited parameter by parameter. A gradient boosting estimate cannot. That asymmetry is a regulatory fact every QP signing an ML-assisted resource estimate must confront."
The practical implication is that QPs using gradient boosting in resource estimation must supplement parameter disclosure with interpretability outputs:
- SHAP (SHapley Additive exPlanations) value plots showing which features drive predictions at the individual block level
- partial dependence plots showing grade response to key predictor variables
- cross-validation results broken out by geological domain, not just aggregate RMSE
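As an illustration of one such interpretability output, a one-way partial dependence curve can be computed directly from any fitted model's prediction function using nothing beyond NumPy. The stand-in `predict` function and the predictor interpretation below are hypothetical; for a real model, the fitted model's own prediction method is substituted unchanged.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """One-way partial dependence: for each grid value, overwrite the
    chosen feature in every sample, predict, and average. The resulting
    curve is the marginal grade response to that predictor."""
    out = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        out.append(predict(Xv).mean())
    return np.array(out)

# Stand-in for a fitted model's predict(); here grade responds linearly
# to predictor 0 (e.g. a hypothetical alteration-intensity index).
predict = lambda A: 3.0 * A[:, 0] + A[:, 1]

X = np.random.default_rng(1).uniform(size=(50, 2))
grid = np.linspace(0.0, 1.0, 5)
pd_curve = partial_dependence(predict, X, feature=0, grid=grid)
```

SHAP values complement this view by attributing individual block predictions rather than dataset-wide averages, which is why both belong in the disclosure package.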
Due Diligence Questions a QP Must Ask Before Signing Off
When I am retained to review or co-sign a resource estimate that incorporates ML grade prediction, my technical review includes a structured set of questions that I require the modelling team to answer in writing. These are:
- What is the training sample size, and is it sufficient for the model complexity used? A gradient boosting model with 500 trees trained on 200 composites is almost certainly overfitting. Cross-validation RMSE alone will not catch this if the folds are not spatially stratified.
- Were training/test splits performed spatially? Random splits produce optimistic validation metrics in geological data because nearby samples share grade information. Spatial cross-validation (leave-one-block-out or spatial k-fold) is the minimum standard.
- Does the model extrapolate outside the training feature space? Gradient boosting cannot extrapolate — a prediction is a sum of leaf values fitted to the training data, so outputs are bounded by the range of training responses. In areas of the block model with predictor combinations not represented in the training set, the model will silently produce unreliable estimates. Feature space coverage maps should be reviewed for every estimation domain.
- How sensitive are the resource tonnage and grade to hyperparameter choices? A sensitivity analysis over the key hyperparameters (max depth, learning rate, subsample fraction) should demonstrate that the estimate is not critically dependent on any single tuning decision.
- Has the ML estimate been benchmarked against a conventional kriging estimate? Where the two methods diverge materially (more than 10% on contained metal at any classification level), the divergence should be explained geologically, not simply noted.
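A minimal sketch of the spatial fold assignment called for in the list above, assuming two-dimensional sample coordinates and NumPy; the cell size, fold count, and coordinates are illustrative, and in practice the cell size would be chosen with reference to the variogram range.

```python
import numpy as np

def spatial_folds(coords, cell_size, n_folds):
    """Spatial k-fold assignment: samples are binned into grid cells and
    every sample in a cell shares a fold, so near-neighbour pairs cannot
    straddle the train/test split and leak grade information."""
    cells = np.floor(coords / cell_size).astype(int)
    cell_ids = cells[:, 0] * 100003 + cells[:, 1]  # hash cell -> integer
    return np.abs(cell_ids) % n_folds

# Illustrative drillhole collar coordinates (metres).
coords = np.array([[10.0, 10.0],    # same 200 m cell as the next sample
                   [15.0, 12.0],
                   [510.0, 480.0]])
folds = spatial_folds(coords, cell_size=200.0, n_folds=5)
```

A random K-fold split would routinely put the first two samples on opposite sides of the train/test boundary, producing exactly the optimistic validation metrics described above.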
Validation Protocols for ML-Assisted Resource Estimates
Beyond cross-validation, ML-assisted resource estimates require the same global validation checks as any other estimate — swath plots, nearest-neighbour comparison, change-of-support analysis — plus additional checks specific to the ML component:
- Predicted-vs-actual plots for the hold-out test set, with spatial coverage maps showing where the test samples are relative to the resource boundary
- Residual analysis by geological domain to confirm that prediction error is not systematically biased in any sub-population
- Feature importance stability across cross-validation folds — if the top features change between folds, the model is not robust
- A reproduction check: can a second practitioner reproduce the resource estimate from the disclosed training data, feature set and hyperparameters?
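The feature-space coverage check referred to above can be automated with a crude per-feature bounding-box test, sketched below in NumPy with hypothetical arrays. A convex-hull or leverage-based test is stricter; the box test is a hedged minimum that still catches the clearest extrapolation cases.

```python
import numpy as np

def outside_training_box(X_train, X_blocks):
    """Flag block-model rows where any predictor falls outside the
    per-feature min-max range of the training composites. A tree
    ensemble cannot extrapolate there, so flagged blocks need
    geological review before estimates are classified."""
    lo = X_train.min(axis=0)
    hi = X_train.max(axis=0)
    return ((X_blocks < lo) | (X_blocks > hi)).any(axis=1)

# Training composites span [0, 1] on both predictors (illustrative).
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [0.2, 0.8]])
X_blocks = np.array([[0.5, 0.5],    # inside the training box
                     [1.5, 0.5],    # predictor 0 out of range
                     [0.5, -0.2]])  # predictor 1 out of range
flags = outside_training_box(X_train, X_blocks)
```

The flagged fraction per estimation domain is a useful single number to report alongside the coverage maps.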
Signing Off on ML-Assisted Estimates: A Personal Position
I will sign a resource estimate that incorporates gradient boosting grade prediction where: (1) the model is used as a complementary estimator alongside kriging rather than as the sole interpolation method; (2) the disclosure meets the interpretability standard described above; (3) the cross-validation protocol is spatially aware; and (4) the estimate has been peer-reviewed by a second practitioner with ML competence.
I am not yet prepared to sign an estimate where a black-box ML model is the sole basis for resource classification without the supplementary disclosures described. The regulatory frameworks have not yet caught up with practice, and that gap is the QP's liability exposure.
AI and ML will transform resource estimation. The transition will be smoother — for practitioners and for capital markets — if QPs engage with the disclosure challenge now rather than waiting for a high-profile regulatory challenge to set the precedent.
JNA Resource Advisory offers independent AI/ML grade forecasting review for resource estimates under NI 43-101, SK-1300 and JORC. Contact us to discuss your project.