Machine learning (ML) models have become key ingredients for digital soil mapping. To improve the interpretability of their predictions, diagnostic tools such as the widely used local attribution approach known as SHapley Additive exPlanations (SHAP) have been developed. However, the analysis of ML model predictions is only one part of the problem, and there is an interest in obtaining deeper insights into the drivers of the prediction uncertainty as well, i.e. explaining why an ML model is confident given the set of chosen covariate values in addition to why the ML model delivered some particular results. In this study, we show how to apply SHAP to local prediction uncertainty estimates for a case of urban soil pollution – namely, the presence of petroleum hydrocarbons in soil in Toulouse (France), which pose a health risk via vapour intrusion into buildings, direct soil ingestion, and groundwater contamination. Our results show that the drivers of the prediction best estimates are not necessarily the drivers of confidence in these predictions, and we identify those leading to a reduction in uncertainty. Our study suggests that decisions regarding data collection and covariate characterisation as well as communication of the results should be made accordingly.

Maps of soil physical properties, such as cation exchange capacity, pH, soil organic content, and hydraulic properties; chemical properties, such as concentrations of heavy metals (arsenic, mercury, etc.) and radionuclides (caesium-137, plutonium-239+240); and target-oriented indicators, such as erodibility and soil compaction (see, for example, Panagos et al., 2022), are essential for multiple types of studies, such as pollution assessment, urban planning, and construction design. In recent years, these maps have led to many advances in improving spatial prediction in the domain of digital soil mapping, denoted by DSM (McBratney et al., 2003), with the development of methods and approaches based on either the geostatistical framework (Chilès and Desassis, 2018) or machine learning (denoted ML) techniques (Wadoux et al., 2020).

Beyond spatial prediction, the question of uncertainty in spatial prediction has emerged as a key challenge (Heuvelink and Webster, 2022, Sect. 4). Historically, this question has been addressed with kriging (see, for example, Veronesi and Schillaci, 2019, for a discussion for DSM). Techniques based on ML have been increasingly used or adapted for this purpose. For instance, the popular quantile random forest model (e.g. Vaysse and Lagacherie, 2017) has been used to produce soil information worldwide in the SoilGrids 2.0 database together with uncertain information (Poggio et al., 2021). Along these lines, improvements to validation procedures have been proposed (Schmidinger and Heuvelink, 2023) together with tools for assessing prediction error and transferability (Ludwig et al., 2023).

However, quantifying the uncertainty is only one part of the problem, and there is an interest in gaining deeper insights into the influence of each covariate on the overall prediction uncertainty. This is the objective of global sensitivity analysis (Saltelli et al., 2008), which can be conducted within two settings: either “factor fixing” to identify noninfluential covariates or “factor prioritisation” to rank the covariates in terms of importance. The expected results can be of different types: the former setting provides justification for simplifying the spatial predictive model by removing the noninfluential covariates, whereas the latter setting provides justification for prioritising future characterisation efforts by focusing on the most important variables. In DSM, this question has been addressed with the tools of variance-based global sensitivity analysis (i.e. the Sobol' indices, as implemented by Varella et al., 2010) or with variable importance scores together with potentially recursive feature elimination procedures (as implemented by Poggio et al., 2021, and Meyer et al., 2018). Both approaches provide a “global” answer to the problem of sensitivity analysis, i.e. by exploring the influence over the whole range of variation in the covariates. However, these methods do not enable us to measure the influence of the covariates locally, i.e. for a prediction at a certain spatial location.

Recently, an alternative local approach was proposed by relying on a popular method from the domain of interpretable machine learning (Molnar, 2022) based on Shapley values (Shapley, 1953). This method has shown promising results in attributing the contributions of each covariate to any spatial prediction (Padarian et al., 2020; Wadoux et al., 2023; Wadoux and Molnar, 2022).

To date, the application of Shapley values to DSM has focused mainly on the prediction best estimate, and little information has been provided on the local prediction uncertainty. Motivated by a case of pollution concentration mapping in the city of Toulouse, France (Belbeze et al., 2019), we aim to investigate how to use Shapley values to decompose the local uncertainty, measured by either an interquantile width or a variance estimate. Our objective is to explore the relationship between the drivers of the prediction best estimate and the drivers of confidence in the predictions. Providing evidence of differences in the dominant drivers is expected to have implications in terms of data collection and covariate characterisation. Communication of the results is also expected to be adapted accordingly. Figure 1 illustrates the type of result that can be derived with the approach. In this example (based on the synthetic test case fully detailed in Sect. 2.1), the mean prediction (best estimate, left panel) of the variable of interest at a certain location does not have the same contributors as the local uncertainty measured by the interquartile width (right panel). The group of covariates including the maximum and mean temperatures of the warmest quarter of the year (named

Example of SHAP-based decomposition of the prediction best estimate

The remainder of the paper is organised as follows. We first describe two application cases that motivated this study. In Sect. 3, we provide further details on the statistical methods used to estimate the local contributions to the prediction uncertainty. In Sect. 4, we apply the methods and provide an in-depth analysis of the differences in the drivers of prediction best estimates and uncertainty. In Sect. 5, we discuss the practical implications of the proposed procedure and its transferability to global-scale projects.

The first test case is synthetic. It aims to predict a virtual species suitability surface, denoted by

Covariates (with prior normalisation between 0 and 1) used in the synthetic test case (see Table 1 for a detailed description). The spatial distributions of the 25 soil samples are indicated by square-shaped markers. The size of each square is proportional to the synthetic variable calculated from the covariates based on Eq. (1).

Description of the covariates for the synthetic test case.

The real case focuses on DSM to predict the total petroleum hydrocarbon (C10–C40) concentration over the city of Toulouse (located in southwestern France) as part of the definition of urban soil geochemical backgrounds (see the comprehensive review by Belbeze et al., 2023). In this case, petroleum hydrocarbons have multiple sources, such as road and air traffic, industrial emissions, and residential heating. The presence of this pollutant may inhibit several soil functions and hence prevent the delivery of associated ecosystem services (Adhikari and Hartemink, 2016). A primary soil function that may be jeopardised is the ability of the soil to provide a platform for human activities in a risk-free environment. Petroleum hydrocarbons in soil pose a risk to human health via several pathways – namely, direct soil ingestion, which is a particularly sensitive pathway for young children; exposure through respiration via vapour intrusion into buildings; and contamination of groundwater used for drinking water purposes. Notably, our study uses the data of this case to illustrate and discuss the applicability of the proposed approach and is not meant to supplement the results of Belbeze et al. (2019).

We use 1043 soil samples collected over a depth interval

Description of the covariates of the real test case.

In addition to these covariates, we follow the approach proposed by Behrens et al. (2018) to better account for spatial dependence: we also consider seven additional covariates – namely, the two geographical coordinates,

We first consider that the value of the variable of interest

Our objective is to decompose

Importantly, Eq. (3) does not aim to linearise

Covariates of continuous type over the city of Toulouse (see Table 2 for a detailed description).

Covariates of categorical type over the city of Toulouse.

The SHAP approach relies on the Shapley value (Shapley, 1953), which is used in game theory to evaluate the “fair share” of a player in a cooperative game; i.e. it is used to fairly distribute the total gains between multiple players working cooperatively. It is a “fair” distribution in the sense that it is the only distribution that satisfies some desirable properties (efficiency, symmetry, linearity, and “dummy player”). See the proofs by Shapley (1953) and Aas et al. (2021, Appendix A) for a comprehensive interpretation of these properties from an ML model perspective.

Formally, we consider a cooperative game with

Formally, the following applies:

In this setting, the Shapley value can then be interpreted as the contribution of the considered covariate to the difference between the prediction,

In this study, we aim to calculate the Shapley values for both the prediction best estimates modelled by

In practice, the computation of the Shapley value may be demanding because Eq. (4) implies covering all subsets

This definition raises the practical question of how to define groups. As explained by Jullum et al. (2021), this definition can be based on knowledge/expertise, i.e. on information that makes sense in relation to the problem at hand. The main advantage is that this facilitates the interpretation of the Shapley values. The second grouping option, which is complementary to the one based on expertise, consists of identifying covariates that provide redundant information because they share a strong dependency. The groups of dependent covariates can be identified with a clustering algorithm (see Hastie et al., 2009, Chap. 14) by taking the matrix of pairwise similarities as input. This approach does not, however, ensure that the effect of dependence among the covariates is completely removed, which may influence the SHAP results, as was extensively investigated by Aas et al. (2021). To account for this, we rely on the method proposed by Redelmeier et al. (2020) using conditional inference trees (Hothorn et al., 2006) to model the dependence structure of the covariates.

The proposed approach, named group-based SHAP (see the implementation details in Appendix A), comprises three steps.

Step 1 aims to build and train the RF models based on the dataset of soil samples together with the covariate values, which is named the training dataset. The RF hyperparameters correspond to the number of variables at which to possibly split in each node (denoted by

To eliminate the covariates of negligible influence, different options are available in the literature (see, for example, the procedure based on recursive feature elimination described by Poggio et al., 2021). Here, we propose relying on a popular method in the ML community – namely, the dependence measure based on the Hilbert–Schmidt independence criterion (HSIC) of Gretton et al. (2005). This generic measure has the advantages of being applicable (1) to any type of dependence, i.e. linear, monotonic, or nonlinear (see the discussion by Song et al., 2022); (2) to mixed variables, i.e. continuous or categorical (as in our case, described in Sect. 2); and (3) without the use of RF importance measures, for which limitations have been identified in the literature, as has been extensively discussed (see, for example, Bénard et al., 2022, and references therein). In addition, the combination of HSIC with a hypothesis testing procedure (El Amri and Marrel, 2024) provides a rigorous setting for quantifying the significance of the considered covariate through the computation of

Step 2 is optional and aims to identify groups of covariates. The objective is twofold. First, by grouping covariates suitably with respect to the problem at hand, the Shapley values can be easily interpreted. This can be done based on expert knowledge or/and by identifying covariates that share a strong dependence to reduce the redundancy of information, for instance, using the HSIC pairwise dependence measure (Appendix C). The second practical implication is a reduction in the computational burden of the Shapley value estimation, as discussed in Sect. 3.2.

Step 3 is to compute the Shapley values associated with each covariate or group identified in Step 2 to decompose the prediction uncertainty provided by the qRF model (trained in Step 1).

Using the 25 randomly selected soil samples described in Sect. 2.1, we construct an RF model to estimate the conditional mean, which is used as the best estimate of the prediction, and a qRF model to estimate the interquartile width (IQW), which is used as the uncertainty estimate. To select the RF parameters

Using the trained RF model, Fig. 6 shows the best estimate of the true value of the synthetic variable of interest (panel a) and the prediction best estimate (panel b) together with the uncertainty measure (panel c) at 10 000 grid points across the European study area (with a spatial resolution of

To ease the interpretation, we group the temperature variables

Using the trained RF model and the selected groups of covariates, we apply the group-based SHAP approach to decompose the data at the 10 000 grid points of the study area. An example of this analysis at the grid point of coordinates 4206729 m, 2149423 m is provided in Fig. 1. To facilitate comparison across the study area, we plot the scaled Shapley values, as defined in Sect. 3.2, and use them to map the contributions to the prediction best estimate (i.e. the conditional mean; Fig. 7a) and to the corresponding uncertainty (i.e. IQW; Fig. 7b). With regard to the prediction best estimate, the upper panels of Fig. 7 show that the three groups of covariates have Shapley values in the range

Scaled Shapley values (in %) for the synthetic test case considering the prediction best estimate modelled by the RF conditional mean

We construct an RF model using the 1043 soil samples described in Sect. 2.2. We use the conditional mean as the best estimate of the prediction and the interquartile width IQW as the uncertainty estimate, with the 25th and 75th percentiles computed using a qRF model. In our case, one additional difficulty is related to how the points are spatially distributed. Figure 3a shows that the points are spatially clustered as they overrepresent some regions while underrepresenting or even missing others. This situation might lead to biased predictions because the same weight is given to every point, and thus regions with high sampling density are overweighted. To alleviate this problem, we follow an approach similar to that of Bel et al. (2009). First, we use an inverse sampling intensity weighting to assign more weight to the observations in sparsely sampled zones and less weight to the observations in densely sampled zones. Second, to estimate the sampling intensity, we use a two-dimensional normal kernel density estimation with bandwidth values estimated based on the rule of thumb of Venables and Ripley (2002). Finally, the inverse sampling intensity weights, with prior normalisation to between 0 and 1, are used for RF training during bootstrap sampling (see Appendix B) to create individual trees with different probability weights by following a method similar to that of Xu et al. (2016).

To select the RF parameters

Screening analysis showing the

A screening analysis is performed within the cross-validation procedure using HSIC measures combined with a hypothesis testing procedure. Figure 10 shows the statistics of the

The distance to potentially polluted sites,

Land use has a strong impact, whereas lithology appears to have little impact, with a

The distance to roads was not included even though its relation to hydrocarbon concentration was expected. This is less due to its dependence on the other covariates, whose HSIC values are as high as 0.14 (Sect. S2), than to its very dense spatial distribution: the value of this covariate varies very little over a large area, as indicated by the almost homogeneous colour in Fig. 3; i.e. very few zones are discriminated by this covariate in this case.

Out of all the cross-validation replicates, nine covariates have a statistically significant influence on hydrocarbon concentration, considering a significance threshold of 5 %. These covariates are retained in the construction of the final RF model that is used for the application of the group-based SHAP approach.

Figure 9 shows the prediction (panel a) of the hydrocarbon concentration together with the uncertainty measurement (panel b) at the grid points across the city with a spatial resolution of

To ease the interpretation of the Shapley values, we define two groups of covariates whose contributions are considered together with the land use and the elevation:

The group

The geographical group includes

To support these choices, we analyse the pairwise dependence measure (Sect. S2), which confirms the moderate-to-high pairwise dependence of all variables in this group. This analysis also indicates that land use has a low dependence on the other variables.

Using the trained RF model, we apply the group-based SHAP approach to decompose the data at

Scaled Shapley values (in %) for each group of covariates of the Toulouse test case considering the prediction best estimate using the RF conditional mean

Three distinct situations are identified that are relevant from the viewpoint of uncertainty management. The first situation corresponds to the locations outlined in Fig. 11, where the groups of covariates significantly contribute to the prediction best estimate, with a scaled Shapley value exceeding that of the uncertainty by more than 25 %. Interestingly, the main contributors to this situation are the three groups that are relevant to the soil prediction problem as opposed to the geographical group, whose objective is to account for spatial dependence. This gives some confidence in the process underlying the RF prediction because it indicates that the best estimate is controlled mainly by the soil-relevant predictors. This also indicates that their influence on the prediction is not masked by the use of geographical covariates, i.e. the use of spatial surrogate covariates, also called pseudocovariates, as discussed by Wadoux et al. (2020, Sect. 3.3).

Locations where each corresponding group of covariates significantly contributes to the best estimate, with a scaled Shapley value exceeding that of the uncertainty by more than 25 %. The colours indicate the scaled Shapley value for the best estimate. The areas in grey correspond to grid points where the condition is not met. The black squares indicate the locations of the soil samples used for RF training.

An examination of the distribution of the corresponding covariates (Fig. 12) reveals that these locations have elevation values and distances

Boxplots of the covariate values for the training dataset and for the locations (named selection) where the corresponding group of covariates significantly contributes to the best estimate, with a scaled Shapley value exceeding that of the uncertainty by more than 25 %. Panel

The second situation is the opposite of the first and corresponds to locations where the groups of covariates significantly contribute to the uncertainty, exceeding their contributions to the best estimate by more than 25 %. These are shown in dark green and light red in Fig. 10 (right) and only concern the geographical and

Finally, the third situation corresponds to where the covariate groups have negative contributions (outlined in light blue in Fig. 10, right), i.e. where they participate directly in reducing the prediction uncertainty. An examination of the distribution of the corresponding covariates (Sect. S2) indicates the same result as for the first situation, with distances to the nearest rivers even smaller than those observed in the training dataset and with even more marked land use behaviour, where agriculture, forest, and urban are almost the only categories identified. We also note that the geographical group negatively contributes to uncertainty only in the vicinity of the soil samples at distances of 1 to 2 km.

To date, the Shapley values have been used to explain individual predictions related to a certain instance of covariates by computing the contribution of each of them to the prediction. Translated for DSM, the Shapley values can be used to determine why a spatial ML model reached a certain value for the soil or chemical properties at a certain spatial location. As discussed by Wadoux and Molnar (2022), the use of Shapley values has the potential to constitute a key tool for environmental soil scientists to improve the interpretation of ML-based DSMs, particularly by providing insights into the underlying physical processes that drive soil variations.

In this study, we complement this type of analysis by addressing the “why” question with respect to the prediction uncertainty, i.e. by explaining why the spatial ML model is confident. For this purpose, the SHAP approach for estimating the Shapley values is applied to decompose the uncertainty indicator provided by the ML spatial model. The attribution results are expected to facilitate communication between environmental soil scientists and stakeholders, which is essential for the inclusion of these new digital soil map products in current practices (see, for example, the discussion by Arrouays et al., 2020). The SHAP results are expected to improve the framing of the prediction results together with the associated uncertainty as illustrated with the synthetic test case described in the introduction as follows: the predicted value of 12.20 is mainly attributable, by a positive factor of almost 50 %, to the maximum and mean temperature of the warmest quarter of the year. The confidence in this result measured by the uncertainty indicator of 1.98 is explained by the diurnal range of almost 50 %. To further increase this confidence, i.e. by decreasing the uncertainty, next efforts should concentrate on the characterisation of this particular covariate.

The second implication of our study is in terms of uncertainty management. Our application to hydrocarbon concentration mapping in Toulouse as well as to the synthetic test case reveals that the determining contributors to the best estimate or the uncertainty may not necessarily be the same. Figure 13 shows maps of the most important groups of covariates with respect to the scaled Shapley absolute values for the real case. These maps show that the prediction uncertainty is dominated by the geographical group over almost 65 % of the entire study area, whereas the best estimate is influenced by this group of covariates over less than 35 %; these areas are outlined in purple in Fig. 13a and b. Overall, the most important group of covariates differs for both prediction objectives over about 50 % of the entire study area (outlined in red in Fig. 13c), mostly in the western and central part of the city. The dichotomy between the drivers of the best estimate and uncertainty is also illustrated for the synthetic test case in Fig. 7.

Spatial distributions of the most important group of covariates with respect to the scaled Shapley values for the prediction best estimate (mean; panel

On this basis, distinct situations can be identified with different practical implications for data collection and covariate characterisation. This is illustrated in the Toulouse case in Sect. 4.2. If the primary objective of environmental soil scientists is to increase the confidence in the prediction, the characterisation efforts should be concentrated in the zones outside the spatial domain covered by the soil samples, i.e. in regions where the RF models appear to be used to make spatial extrapolations. Two improvements are particularly notable: (1) the modelling of the spatial dependence in the ML model, as revealed by the high importance of the geographical group, and (2) the need for more samples outside the range covered by the soil samples to better characterise the pair of distances

The application cases analysed in this study correspond to situations with a moderate number of covariates (a few tens) and predictions at either the city scale or the national scale, with several tens of thousands of grid cells, which are representative of other real case situations described in the literature, such as those of Meyer et al. (2018), de Bruin et al. (2022), and Wadoux et al. (2023). In this section, we extend the discussion regarding the applicability to global-scale projects such as that described by Poggio et al. (2021) with hundreds of covariates and millions of grid cells. Applying the SHAP approach is challenging in this context due to its computational load, which is directly related to the number of covariates (Sect. 3.2).

As an illustration, we run SHAP for the Toulouse test case using the nine important covariates (without grouping) at 100 randomly selected grid points (on a Windows desktop x64 with a PC – Intel®Core™ i5-13600H, 2800 MHz, 12 cores, 16 logical processors, with 32 GB physical RAM), which led to an average CPU time of 2.15 s. Given the constraints of global-scale studies, a direct SHAP analysis would require at least 200 d of calculation on a single laptop. The first solution relies on the use of a high-performance computing architecture, as proposed by Wadoux et al. (2023). The second option involves approximating the Shapley values using, for instance, sampling algorithms (Chen et al., 2023), with some approximation errors opposite to those of the exact method used here. The third option explored in this study is the combination of a screening analysis and a grouping approach. Although RF models can handle a large number of covariates, eliminating the covariates before calculating the Shapley values has a clear benefit for saving CPU time. In the real case, the SHAP computational complexity is proportional to

Providing insights into the uncertainty impacting DSM is a key challenge that requires appropriate diagnostic tools. In an effort to complement the toolbox of environmental soil scientists, in this study, we assessed the feasibility of using the SHAP approach to quantify the contributions of covariates to the machine-learning-based prediction uncertainty at any location in the study area. Using a real case of pollution concentration mapping in the city of Toulouse as well as a synthetic test case, we explored the benefits of jointly analysing the contributors to the prediction best estimate and to the prediction uncertainty. Our results revealed that the drivers of the prediction best estimate are not necessarily the drivers of the confidence in the predictions: this means that decisions in terms of data collection and covariate characterisation may differ depending on the target, the prediction best estimate or the confidence/uncertainty, and the way in which the results of the prediction (and their uncertainty) are communicated.

However, to integrate SHAP at a fully operational level, several lines of improvement need to be considered. First, the implementation to global-scale projects still remains challenging and deserves further work to find a compromise between accuracy, efficiency, and interpretability by paying particular attention to estimation algorithms (Chen et al., 2023), with potential combination with screening and grouping analysis. Second, we focused on a unique uncertainty indicator – namely, the interquartile width – but in some situations, it may not be representative of the total uncertainty, and additional developments are necessary to integrate the entire prediction probability distribution within the SHAP setting. The use of an information-theoretic variant of Shapley values, as investigated by Watson et al. (2023), may be helpful here. Third, we focused on one type of machine learning model, i.e. the quantile RF model. Alternative approaches should be considered in future research: different types of machine learning models, such as deep learning techniques, which have shown promising results (see, for example, Kirkwood et al., 2022), and improved approaches, in particular, to addressing complex sample distributions such as clustering (see, for example, de Bruin et al. (2022) and references therein). Different uncertainty measures should also be tested – for example, geostatistical methods using the kriging variance or statistical quantities derived from stochastic simulations (see, for example, Chilès and Delfiner, 2012), Bayesian techniques (see Abdar et al., 2021, for deep learning techniques), and data-driven approaches such as cross-validation procedures (Ben Salem et al., 2017). These future studies are made possible by the model-agnostic nature of SHAP.

The R package

The random forest (RF) model, as introduced by Breiman (2001), is used here for regression. It is a non-parametric technique based on a combination (ensemble named forest) of regression trees (Breiman et al., 1984). Each tree is constructed by relying on recursive partitioning, which aims at finding an optimal partition of the covariates' domain of variation by dividing it into

The RF method is very flexible and can be adapted to predict quantiles. The quantile random forest (qRF) model was originally developed by Meinshausen (2006), who proposed to estimate the conditional quantile,

The major difference to the formulation for regression RF is that the qRF model computes a weighed empirical CDF (cumulative distribution function) of

The number of covariates is 15 (Sect. 2), which is sufficiently large to pose some difficulties regarding the computational time cost of the SHAP approach (Sect. 3.2). To filter out covariates of negligible influence (screening analysis), we rely on the HSIC (Hilbert–Schmidt independence criterion) measure, which can capture arbitrary dependence between two random variables (potentially of mixed type, continuous or categorical). In the following, we describe the main aspects, and the interested readers can refer to Gretton et al. (2005) and da Veiga (2015).

Let us associate

The role of the characteristic kernel is central here because it can be defined depending on the type of the considered variable. For continuous variables, the Gaussian kernel is used and is defined as

The pairwise dependence is measured by

To perform the screening analysis, we rely on the interpretation of

Sources of data of the covariates are listed in Tables 1 and 2. We provide the R scripts to run the synthetic test case in the form of an R markdown on the GitHub repository at

The supplement related to this article is available online at:

JR designed the concept. SB provided support for the data processing, for the implementation of the machine learning model, and for the application to the Toulouse case. JR undertook the statistical analyses. The results were analysed by JR, SB, and DG. JR wrote the manuscript; SB and DG reviewed the manuscript.

The contact author has declared that none of the authors has any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

We are grateful to the Ministere de la Transition Ecologique et Solidaire, Direction Générale de la Prévention des Risques (MTES/DGPR) and Toulouse Métropole for letting us use the data that supported the study (Belbeze et al., 2019). We thank Sebastien da Veiga (ENSAI) for useful discussions on the use of HSIC measures. We are grateful to the anonymous reviewers whose comments and recommendations led to the improvements of the manuscript.

This research has been supported by the Agence Nationale de la Recherche (grant no. ANR-22-CE56-0006).

This paper was edited by Alexandre Wadoux and reviewed by two anonymous referees.