https://doi.org/10.5194/soil-2020-102

  28 Apr 2021

Review status: this preprint is currently under review for the journal SOIL.

On the benefits of clustering approaches in digital soil mapping: an application example concerning soil texture regionalization

Istvan Dunkl1,2 and Mareike Ließ2
  • 1Max Planck Institute for Meteorology, Hamburg, Germany
  • 2Department Soil System Science, Helmholtz Centre for Environmental Research – UFZ, Halle (Saale), Germany

Abstract. High-resolution soil maps are urgently needed by land managers and researchers for a variety of applications. Digital Soil Mapping (DSM) makes it possible to regionalize soil properties by relating them to environmental covariates with the help of an empirical model. In this study, a legacy soil data set was used to train a machine learning algorithm to predict the particle size distribution within the catchment of the Bode river in Saxony-Anhalt (Germany). The ensemble learning method random forest was used to predict soil texture based on environmental covariates derived from a digital elevation model, land cover data, and geologic maps. We studied the usefulness of clustering applications in addressing various aspects of the DSM procedure. To investigate the role of the imbalanced data problem in the learning process, the environmental variables were used to cluster the landscape of the study area. Different sampling strategies were used to create balanced training data and were evaluated on their ability to improve model performance. Clustering applications were also involved in feature selection and stratified cross-validation. Overall, clustering applications appear to be a versatile tool that can be employed at various steps of the DSM procedure. Beyond these successful applications, further fields of application in DSM were identified. One of them is finding adequate means to include expert knowledge.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on soil-2020-102', Anonymous Referee #1, 05 May 2021
  • RC2: 'Comment on soil-2020-102', Anonymous Referee #2, 04 Jun 2021

Viewed

Total article views: 306 (including HTML, PDF, and XML), calculated since 28 Apr 2021
  • HTML: 238
  • PDF: 60
  • XML: 8
  • Total: 306
  • BibTeX: 2
  • EndNote: 2

Viewed (geographical distribution)

Total article views: 289 (including HTML, PDF, and XML) Thereof 289 with geography defined and 0 with unknown origin.
Latest update: 15 Jun 2021
Short summary
Legacy soil data provide a valuable basis for generating high-resolution soil maps by digital soil mapping (DSM). DSM makes it possible to regionalize soil properties by relating them to environmental covariates with the help of an empirical model. We studied the usefulness of data clustering methods for tackling potential sampling bias in the legacy soil data while applying DSM for soil texture regionalization. Clustering proved useful in various steps of the DSM procedure.
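
The two uses of clustering named above, balancing the training data and stratifying cross-validation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each soil sample already carries a landscape cluster label (obtained, for example, from a clustering of the environmental covariates, which is not shown), and it uses plain random oversampling as one possible balancing strategy. The function names `balance_by_cluster` and `cluster_stratified_folds` are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def balance_by_cluster(samples, cluster_ids, seed=0):
    """Randomly oversample so every cluster has as many samples as the largest one.

    Assumption: random oversampling with replacement; the study evaluates
    several sampling strategies that are not specified here.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sample, cid in zip(samples, cluster_ids):
        groups[cid].append(sample)
    target = max(len(members) for members in groups.values())
    balanced = []
    for cid in sorted(groups):
        members = groups[cid]
        # Draw duplicates only for clusters smaller than the target size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        balanced.extend((s, cid) for s in members + extra)
    return balanced


def cluster_stratified_folds(cluster_ids, k, seed=0):
    """Split sample indices into k folds, spreading each cluster evenly over the folds."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        groups[cid].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for cid in sorted(groups):
        members = groups[cid]
        rng.shuffle(members)
        # Deal the shuffled members round-robin, continuing where the
        # previous cluster stopped so that fold sizes stay even.
        for j, idx in enumerate(members):
            folds[(offset + j) % k].append(idx)
        offset += len(members)
    return folds


# Toy data: six samples in cluster 0 and two in cluster 1 (hypothetical values).
samples = list(range(8))
cluster_ids = [0, 0, 0, 0, 0, 0, 1, 1]
balanced = balance_by_cluster(samples, cluster_ids)
folds = cluster_stratified_folds(cluster_ids, k=2)
```

In a real workflow, any oversampling would be applied only to the training portion of each fold, so that duplicated samples cannot appear in both the training and the validation data.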