Articles | Volume 11, issue 2
https://doi.org/10.5194/soil-11-553-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Using Monte Carlo conformal prediction to evaluate the uncertainty of deep-learning soil spectral models
Yin-Chung Huang
CORRESPONDING AUTHOR
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
José Padarian
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
Budiman Minasny
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
Alex B. McBratney
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
Short summary
Uncertainty quantification plays a crucial role in reporting machine learning models in soil spectroscopy. This study introduces Monte Carlo conformal prediction (MC-CP), a novel method for uncertainty quantification in deep-learning soil spectral models. MC-CP outperformed two established methods, providing the most reliable results. Its efficiency and robustness make it a practical choice for implementing soil spectral models in decision making.
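The summary does not spell out how MC-CP is computed, so the following is only a minimal sketch of one plausible way to combine Monte Carlo draws with split conformal prediction, not the authors' implementation. It assumes that each sample already has a set of stochastic predictions (for example from repeated dropout forward passes), takes their empirical quantiles as a raw interval, and then widens that interval by a conformal correction estimated on a held-out calibration set. All function names, the synthetic data, and the choice of nonconformity score are hypothetical.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Finite-sample conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: only an infinite interval is valid.
        return np.inf
    return np.sort(scores)[k - 1]


def mc_conformal_intervals(cal_draws, cal_y, test_draws, alpha=0.1):
    """Conformalise Monte Carlo quantile intervals (illustrative assumption).

    cal_draws, test_draws: arrays of shape (n_samples, n_mc_draws) holding
    stochastic predictions; cal_y: observed values for the calibration set.
    """
    lo_q, hi_q = 100 * alpha / 2, 100 * (1 - alpha / 2)
    cal_lo, cal_hi = np.percentile(cal_draws, [lo_q, hi_q], axis=1)
    # Score: how far each observation falls outside its raw MC interval
    # (negative when the observation lies inside the interval).
    scores = np.maximum(cal_lo - cal_y, cal_y - cal_hi)
    qhat = conformal_quantile(scores, alpha)
    test_lo, test_hi = np.percentile(test_draws, [lo_q, hi_q], axis=1)
    return test_lo - qhat, test_hi + qhat


def demo_coverage(seed=0, n_cal=1000, n_test=2000, n_mc=50, alpha=0.1):
    """Synthetic check with deliberately over-confident MC draws."""
    rng = np.random.default_rng(seed)
    n = n_cal + n_test
    signal = rng.uniform(0.0, 10.0, size=n)
    y = signal + rng.normal(0.0, 1.0, size=n)  # true noise sd = 1.0
    # Hypothetical MC draws whose spread (sd = 0.2) understates the true noise.
    draws = signal[:, None] + rng.normal(0.0, 0.2, size=(n, n_mc))
    lo, hi = mc_conformal_intervals(
        draws[:n_cal], y[:n_cal], draws[n_cal:], alpha=alpha
    )
    y_test = y[n_cal:]
    return float(np.mean((y_test >= lo) & (y_test <= hi)))
```

Calling `demo_coverage()` returns the fraction of synthetic test observations that fall inside the corrected intervals. In this sketch the raw Monte Carlo intervals are far too narrow, and the calibration step is what brings empirical coverage close to the nominal `1 - alpha` level.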