Articles | Volume 11, issue 2
https://doi.org/10.5194/soil-11-553-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Using Monte Carlo conformal prediction to evaluate the uncertainty of deep-learning soil spectral models
Yin-Chung Huang
CORRESPONDING AUTHOR
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
José Padarian
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
Budiman Minasny
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
Alex B. McBratney
School of Life and Environmental Science & Sydney Institute of Agriculture, The University of Sydney, Sydney, NSW, Australia
Short summary
Uncertainty quantification plays a crucial role in reporting machine learning models in soil spectroscopy. This study introduces Monte Carlo conformal prediction (MC-CP), a novel method for uncertainty quantification in deep-learning soil spectral models. MC-CP outperformed two established methods, providing the most reliable results. Its efficiency and robustness make it a practical choice for implementing soil spectral models in decision making.
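The summary names the method without describing its steps. As a rough illustration of the general idea only, the sketch below combines repeated stochastic forward passes (standing in for Monte Carlo dropout) with split conformal calibration. It is not the authors' implementation: the toy predictor `mc_forward`, the normalized score |y - mean| / std, the synthetic data, and all parameter values are assumptions chosen for illustration.

```python
import math
import random
import statistics


def mc_forward(x, rng, n_passes=50):
    """Hypothetical stand-in for repeated dropout-enabled forward passes."""
    # Assumed toy model: signal 2*x plus pass-to-pass jitter that grows with |x|.
    draws = [2.0 * x + rng.gauss(0.0, 0.1 + 0.05 * abs(x)) for _ in range(n_passes)]
    return statistics.fmean(draws), statistics.stdev(draws)


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def calibrate(cal_x, cal_y, rng, alpha=0.1, eps=1e-8):
    """Score each calibration sample by its error relative to the MC spread."""
    scores = []
    for x, y in zip(cal_x, cal_y):
        mean, std = mc_forward(x, rng)
        scores.append(abs(y - mean) / (std + eps))
    return conformal_quantile(scores, alpha)


def predict_interval(x, q_hat, rng, eps=1e-8):
    """Return (lower, mean, upper), scaling the MC spread by the calibrated factor."""
    mean, std = mc_forward(x, rng)
    half_width = q_hat * (std + eps)
    return mean - half_width, mean, mean + half_width


def sample(n, rng):
    """Synthetic data (assumption): y = 2*x plus unit-variance noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
cal_x, cal_y = sample(500, rng)
test_x, test_y = sample(500, rng)

q_hat = calibrate(cal_x, cal_y, rng, alpha=0.1)

hits = 0
for x, y in zip(test_x, test_y):
    lower, _, upper = predict_interval(x, q_hat, rng)
    hits += lower <= y <= upper
coverage = hits / len(test_x)
print(f"calibrated factor: {q_hat:.2f}, empirical coverage: {coverage:.3f}")
```

In this toy setting the raw Monte Carlo spread is far narrower than the actual prediction error, so the calibrated factor ends up well above 1. The conformal step widens the intervals until the empirical coverage on held-out data sits near the 90 % target.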