Articles | Volume 6, issue 1
https://doi.org/10.5194/soil-6-35-2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.
Machine learning and soil sciences: a review aided by machine learning tools
José Padarian
CORRESPONDING AUTHOR
Sydney Institute of Agriculture & School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
Budiman Minasny
Sydney Institute of Agriculture & School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
Alex B. McBratney
Sydney Institute of Agriculture & School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
Related authors
Yin-Chung Huang, José Padarian, Budiman Minasny, and Alex B. McBratney
SOIL, 11, 553–563, https://doi.org/10.5194/soil-11-553-2025, 2025
Short summary
Uncertainty quantification plays a crucial role in reporting machine learning models in soil spectroscopy. This study introduces Monte Carlo conformal prediction (MC-CP), a novel method for uncertainty quantification in deep-learning soil spectral models. MC-CP outperformed two established methods, providing the most reliable results. Its efficiency and robustness make it a practical choice for implementing soil spectral models in decision making.
Marliana Tri Widyastuti, José Padarian, Budiman Minasny, Mathew Webb, Muh Taufik, and Darren Kidd
SOIL, 11, 287–307, https://doi.org/10.5194/soil-11-287-2025, 2025
Short summary
This work aims to predict soil water content at a fine spatiotemporal resolution (80 m grids, daily) to support agricultural management in Tasmania. It proves that transfer learning can improve the accuracy of deep learning models to predict multilevel soil moisture. We address the challenge of mapping soil moisture at field-scale resolution and integrate the model into a near-real-time monitoring system.
Marliana Tri Widyastuti, Budiman Minasny, José Padarian, Federico Maggi, Matt Aitkenhead, Amélie Beucher, John Connolly, Dian Fiantis, Darren Kidd, Yuxin Ma, Fraser Macfarlane, Ciaran Robb, Rudiyanto, Budi Indra Setiawan, and Muh Taufik
Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2024-333, 2024
Preprint withdrawn
Short summary
PEATGRIDS is the first dataset containing maps of global peat thickness and carbon stock at 1 km resolution. The dataset is publicly available on Zenodo to support further analyses and modelling of peatlands across the globe. This work employed a random forest machine learning model to provide spatially explicit peat carbon stock on a per-pixel basis.
José Padarian, Budiman Minasny, Alex B. McBratney, and Pete Smith
SOIL Discuss., https://doi.org/10.5194/soil-2021-73, 2021
Manuscript not accepted for further review
Short summary
Soil organic carbon sequestration is considered an attractive technology to partially mitigate climate change. Here, we show how the SOC storage potential varies globally. The estimated additional SOC storage potential in the topsoil of global croplands (29–67 Pg C) equates to only 2 to 5 years of emissions offsetting and 32 % of agriculture's 92 Pg historical carbon debt. Since SOC is temperature-dependent, this potential is likely to reduce by 18 % by 2040 due to climate change.
Tobias Karl David Weber, Lutz Weihermüller, Attila Nemes, Michel Bechtold, Aurore Degré, Efstathios Diamantopoulos, Simone Fatichi, Vilim Filipović, Surya Gupta, Tobias L. Hohenbrink, Daniel R. Hirmas, Conrad Jackisch, Quirijn de Jong van Lier, John Koestel, Peter Lehmann, Toby R. Marthews, Budiman Minasny, Holger Pagel, Martine van der Ploeg, Shahab Aldin Shojaeezadeh, Simon Fiil Svane, Brigitta Szabó, Harry Vereecken, Anne Verhoef, Michael Young, Yijian Zeng, Yonggen Zhang, and Sara Bonetti
Hydrol. Earth Syst. Sci., 28, 3391–3433, https://doi.org/10.5194/hess-28-3391-2024, 2024
Short summary
Pedotransfer functions (PTFs) are used to predict parameters of models describing the hydraulic properties of soils. The appropriateness of these predictions critically relies on the nature of the datasets for training the PTFs and the physical comprehensiveness of the models. This roadmap paper is addressed to PTF developers and users and reflects critically on the utility and future of PTFs. To this end, we present a manifesto aiming at a paradigm shift in PTF research.
Frisa Irawan Ginting, Rudiyanto Rudiyanto, Fatchurahman, Ramisah Mohd Shah, Norhidayah Che Soh, Sunny Goh Eng Giap, Dian Fiantis, Budi Indra Setiawan, Sam Schiller, Aaron Davitt, and Budiman Minasny
Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2024-90, 2024
Preprint withdrawn
Short summary
This study is the first to map rice cropping intensity and the harvested area across Southeast Asia at a spatial resolution of 10 m (SEA-Rice-Ci10). We have developed a geospatial inventory of paddy rice parcels and rice cropping intensity by integrating Sentinel-1 and Sentinel-2 time-series data in a framework called LUCK-PALM, based on local phenological expert interpretation. To the best of our knowledge, it is the finest-resolution and most accurate database of paddy rice in Southeast Asia.
Wartini Ng, Budiman Minasny, Alex McBratney, Patrice de Caritat, and John Wilford
Earth Syst. Sci. Data, 15, 2465–2482, https://doi.org/10.5194/essd-15-2465-2023, 2023
Short summary
With a higher demand for lithium (Li), a better understanding of its concentration and spatial distribution is important to delineate potential anomalous areas. This study uses a framework that combines data from recent geochemical surveys and relevant environmental factors to predict and map Li content across Australia. The map shows high Li concentration around existing mines and other potentially anomalous Li areas. The same mapping principles can potentially be applied to other elements.
Mercedes Román Dobarco, Alexandre M. J-C. Wadoux, Brendan Malone, Budiman Minasny, Alex B. McBratney, and Ross Searle
Biogeosciences, 20, 1559–1586, https://doi.org/10.5194/bg-20-1559-2023, 2023
Short summary
Soil organic carbon (SOC) is of a heterogeneous nature and varies in chemistry, stabilisation mechanisms, and persistence in soil. In this study we mapped the stocks of SOC fractions with different characteristics and turnover rates (presumably PyOC >= MAOC > POC) across Australia, combining spectroscopy and digital soil mapping. The SOC stocks (0–30 cm) were estimated as 13 Pg MAOC, 2 Pg POC, and 5 Pg PyOC.
Short summary
Machine learning (ML) is being adopted at an accelerating pace in the soil sciences, and manually reviewing every paper on its application is difficult. This paper reviews the application of ML with the aid of topic modelling, which is used to find patterns in a large collection of publications. The objective is to gain insight into the applications and to discuss research gaps. We identified 12 main topics and found that ML methods usually perform better than traditional ones.