Mapping of soil properties at high resolution in Switzerland using boosted geoadditive models
- 1Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, Universitätstrasse 16, 8092 Zürich, Switzerland
- 2Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Zürcherstrasse 111, 8903 Birmensdorf, Switzerland
- 3Research Station Agroscope Reckenholz-Taenikon ART, Reckenholzstrasse 191, 8046 Zürich, Switzerland
Abstract. High-resolution maps of soil properties are a prerequisite for assessing soil threats and soil functions and for fostering the sustainable use of soil resources. For many regions in the world, accurate maps of soil properties are missing, but often sparsely sampled (legacy) soil data are available. Soil property data (response) can then be related by digital soil mapping (DSM) to spatially exhaustive environmental data that describe soil-forming factors (covariates) to create spatially continuous maps. With airborne and space-borne remote sensing and multi-scale terrain analysis, large sets of covariates have become common. Building parsimonious models amenable to pedological interpretation is then a challenging task.
We propose a new boosted geoadditive modelling framework (geoGAM) for DSM. The geoGAM models smooth non-linear relations between responses and single covariates and combines these model terms additively. Residual spatial autocorrelation is captured by a smooth function of spatial coordinates, and non-stationary effects are included through interactions between covariates and smooth spatial functions. The core of fully automated model building for geoGAM is component-wise gradient boosting.
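The selection mechanism behind component-wise gradient boosting can be illustrated with a minimal sketch. It is not the geoGAM implementation: it assumes a squared-error loss and one simple linear base learner per covariate, whereas geoGAM also uses smooth, spatial and interaction terms. The function name `componentwise_l2_boost` and the parameters `n_iter` and `nu` are hypothetical names chosen for illustration.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_iter=200, nu=0.1):
    """Illustrative component-wise L2 boosting (assumption: linear base learners).

    In each iteration every covariate is fitted separately to the current
    residuals, and only the single best-fitting covariate is updated by a
    small step `nu`. Covariates that are never selected keep a zero
    coefficient, which is what makes the final model sparse.
    Assumes no column of X is constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centred covariates
    col_ss = (Xc ** 2).sum(axis=0)       # sum of squares per covariate

    intercept = y.mean()
    coef = np.zeros(p)
    fit = np.full(n, intercept)

    for _ in range(n_iter):
        resid = y - fit                  # negative gradient of squared-error loss
        beta = Xc.T @ resid / col_ss     # least-squares slope of each base learner
        gain = beta ** 2 * col_ss        # reduction in residual sum of squares
        j = int(np.argmax(gain))         # best-fitting covariate this iteration
        coef[j] += nu * beta[j]          # damped update of that covariate only
        fit += nu * beta[j] * Xc[:, j]

    return intercept, coef, x_mean
```

Predictions for new data `X_new` would be `intercept + (X_new - x_mean) @ coef`. In practice the number of iterations acts as the main tuning parameter and is typically chosen by cross-validation; here it is simply fixed.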
We illustrate the application of the geoGAM framework by using soil data from the Canton of Zurich, Switzerland. We modelled effective cation exchange capacity (ECEC) in forest topsoils as a continuous response. For agricultural land we predicted the presence of waterlogged horizons at given soil depths as binary responses and drainage classes as ordinal responses. For the latter we used a proportional odds geoGAM, taking the ordering of the response properly into account. The fitted geoGAM contained only a few covariates (7 to 17) selected from large sets (333 covariates for forests, 498 for agricultural land). Model sparsity allowed for covariate interpretation through partial effects plots. Prediction intervals were computed by model-based bootstrapping for ECEC. The predictive performance of the fitted geoGAM, tested with independent validation data and specific skill scores for continuous, binary and ordinal responses, compared well with other studies that modelled similar soil properties. Skill score (SS) values of 0.23 to 0.53 (with SS = 1 for perfect predictions and SS = 0 for zero explained variance) were achieved depending on the response and type of score. GeoGAM combines efficient model building from large sets of covariates with effects that are easy to interpret and is therefore likely to raise the acceptance of DSM products by end-users.
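For a continuous response, a skill score with the stated properties (SS = 1 for perfect predictions, SS = 0 for zero explained variance) can be sketched as one minus the ratio of the squared prediction error to the squared deviation from the mean of the observations. The snippet assumes this mean-squared-error form; the abstract does not spell out the formula, and the scores used for binary and ordinal responses are different and not shown. The function name `skill_score_mse` is hypothetical.

```python
import numpy as np


def skill_score_mse(y_obs, y_pred):
    """MSE-based skill score (assumed form): 1 - SSE / SST.

    Returns 1 for perfect predictions and 0 when the predictions are no
    better than always predicting the mean of the observations. Negative
    values are possible for predictions worse than that mean.
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_obs - y_pred) ** 2)          # squared prediction error
    sst = np.sum((y_obs - y_obs.mean()) ** 2)    # squared deviation from the mean
    return 1.0 - sse / sst
```

The score would be evaluated on independent validation data, with `y_obs` the measured values and `y_pred` the corresponding model predictions.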