Articles | Volume 12, issue 1
https://doi.org/10.5194/soil-12-321-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
Assessing the potential of complex artificial neural networks for modelling small-scale soil erosion by water
Download
- Final revised paper (published on 30 Mar 2026)
- Preprint (discussion started on 08 Aug 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-3583', Anonymous Referee #1, 29 Aug 2025
- AC1: 'Reply on RC1', Nils Barthel, 16 Oct 2025
- RC2: 'Comment on egusphere-2025-3583', Anonymous Referee #2, 17 Sep 2025
- AC2: 'Reply on RC2', Nils Barthel, 16 Oct 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Reconsider after major revisions (further review by editor and referees) (21 Oct 2025) by Pedro Batista
AR by Nils Barthel on behalf of the Authors (02 Dec 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (04 Dec 2025) by Pedro Batista
RR by Anonymous Referee #1 (14 Dec 2025)
RR by Anonymous Referee #2 (22 Jan 2026)
ED: Publish subject to revisions (further review by editor and referees) (25 Jan 2026) by Pedro Batista
AR by Nils Barthel on behalf of the Authors (04 Mar 2026)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (10 Mar 2026) by Pedro Batista
ED: Publish subject to technical corrections (10 Mar 2026) by Peter Fiener (Executive editor)
AR by Nils Barthel on behalf of the Authors (16 Mar 2026)
Manuscript
The manuscript explores whether machine learning approaches could improve our ability to predict soil erosion. This field is still in its infancy, and I appreciate the authors' efforts. The manuscript reports an interesting piece of work that needs some improvement, but it is clearly worth publishing.
However, the conclusions are based on unreplicated results and are thus speculative. Setting up a replicated experiment would be relatively easy and fast (see my last paragraph). Furthermore, the authors justify their work with clearly wrong statements, and I wonder whether a better justification exists. Enthusiasm about something relatively new cannot replace logically sound arguments:
Please be accurate in your arguments. They are not random quantities.
Details
L 77: Which models?
Chapter Data collection: In general, this chapter does not give enough details about the sources of data, the measurement methods, their range, their resolution, and their quality. The lack of reference to the sources also makes it impossible for the reader to get an idea about these relevant aspects.
L 95: What is the accuracy of the data? Were there independent repeated surveyors to estimate the accuracy? How did you know there had been an erosion event, given that high-intensity rain cells have only a spatial extent of about 1 km² (see Lochbihler et al. 2017, Geophysical Research Letters)?
L 97: What is sheet-to-linear erosion? Isn't this rill erosion, which is already in the first group?
L 101: Nineteen variables is a rather limited set. I would not criticise this in itself, but in L 38 you criticised a limited number of variables, so your arguments do not match. (By the way, the (R)USLE uses more than 19 variables to calculate its final six factors; hence, your data set is the more limited one.)
L 110: Better call it the Pearson correlation coefficient, because both Pearson statistics and regression involve several coefficients. In the following text, r is mostly, but not consistently, set in italics; please be consistent.
Table 1: 'DEM' is definitely the wrong variable name, because a DEM is the entirety of the elevation data. Do you mean altitude?
More details about the resolution and the quality of your DEM have to be given (see the general remark regarding the data chapter) because many of your following variables depend strongly on these two parameters.
How was slope length defined, in the sense of the USLE or in a geomorphological sense? Was it defined for the field or for the raster cell? I guess you did not use slope length, which would be one value for the entire slope, but you may have used the upslope length of each raster cell. I do not like guessing what you did (a similar question could be raised for almost all variables).
Flow accumulation is described as the total accumulated runoff. This would require runoff modelling because runoff will depend on soil, crops, heterogeneity of rain and other variables. I guess you mean the upslope drainage area. More explanation required!
Wetness index: What is a 'modified catchment area calculation'?
Machining direction: This will differ on different field parts because of the headland and complex topography. How was it defined? It may also vary over time.
Regarding the R and LS factors, see below. How was the C factor determined? Did you consider individual rains and the corresponding field states, or did you use some more generalised C factor? Which degree of generalisation did you use? K factor, based on which data?
The table must be complemented with statistical metrics like mean, SD, min, and max, which give an idea of the range the data covers. This is essential for the interpretation of Fig. 5.
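The requested metrics are cheap to produce. A minimal sketch, using pandas and hypothetical stand-in variables (altitude_m, slope_deg, flow_acc_m2 are invented names, not the manuscript's columns):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the manuscript's variable table; the point
# is only the requested mean / SD / min / max summary per predictor.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "altitude_m": rng.uniform(300, 600, 100),
    "slope_deg": rng.uniform(0, 15, 100),
    "flow_acc_m2": rng.lognormal(5, 1, 100),
})

# One row per statistic, one column per variable
summary = df.agg(["mean", "std", "min", "max"]).round(2)
print(summary)
```

Such a summary table also lets the reader judge whether Fig. 5's importances span the full range of each variable or only a narrow slice of it.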
L 178: Conventional cross-validation is inappropriate in your case because your raster cells are highly autocorrelated. Hence, the left-out data are not an independent data set. I suggest using a seven-fold cross-validation by leaving out one of your study areas at a time.
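The suggested leave-one-study-area-out scheme is straightforward with scikit-learn. A sketch under assumptions: X, y, and the per-cell study-area label `area` are synthetic stand-ins for the authors' data, and a random forest stands in for their models:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneGroupOut

# Synthetic stand-in data: 7 study areas, 100 raster cells each
rng = np.random.default_rng(0)
n = 700
X = rng.normal(size=(n, 5))           # 5 predictor variables
y = 2 * X[:, 0] + rng.normal(size=n)  # synthetic erosion target
area = np.repeat(np.arange(7), 100)   # study-area label per cell

# Leave one whole study area out per fold, so the held-out cells are
# not spatially autocorrelated with the training cells
logo = LeaveOneGroupOut()
scores = []
for train_idx, test_idx in logo.split(X, y, groups=area):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores.append(r2_score(y[test_idx], model.predict(X[test_idx])))

print("spatial CV R^2 per held-out area:", np.round(scores, 2))
```

The contrast between these scores and conventional random-split scores would directly quantify how much the autocorrelation inflates the reported performance.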
L 185: I cannot see the five pairs in Table 1. Which pairs do you mean?
L 187: The correlation between R and altitude is strange. I am not aware of any meteorological process that would influence rain within your altitudinal and spatial range. I guess the correlation is an artefact of an inappropriate resampling procedure. Unfortunately, resampling was not described.
Fig. 4: The x-axis appears to use a log scale. On a log scale, zero is not possible, although it is shown (likely plotted at 0.001) and although zeros occur in the data set. I recommend a square-root scale, which allows a true zero and does not compress the data in the relevant range of 0.1 to 50 t by inflating the irrelevant range between 0.001 and 0.1.
This also leads to the question: Were there no negative values in your data set (colluviation)? Including negative values would be a clear advantage compared to the USLE. In any case, the reason for the lack of negative values has to be explained.
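A square-root axis is easy to realise in matplotlib via a function scale. A minimal sketch with invented erosion values, including a true zero:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for non-interactive use
import matplotlib.pyplot as plt

# Hypothetical erosion values in t, including true zeros
erosion = np.array([0.0, 0.05, 0.5, 2.0, 10.0, 50.0])

fig, ax = plt.subplots()
ax.hist(erosion, bins=10)
# A square-root x-scale keeps zero representable and does not
# inflate the 0.001-0.1 range the way a log scale does
ax.set_xscale("function", functions=(np.sqrt, np.square))
ax.set_xlim(0, 55)
fig.savefig("erosion_hist.png")
```

The `functions` pair is the forward transform and its inverse; matplotlib applies them to axis positions only, so the data themselves stay untransformed.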
L 224: The high importance of altitude shows that the results of your approach lack transferability to other areas. I can easily imagine a similar erosion situation (similar topography, similar soils, similar land use, similar rain), but a few hundred metres higher (or even a few thousand metres higher if we think of a high valley in the Andes). The large importance of altitude would then cause very strange predictions. Matching the training and the application situation is an indispensable prerequisite for your approach, which does not restrict the input data to meaningful and universally valid variables (especially if you call for unlimited variables). This constraint is worth discussing, and it is especially important for the black box of neural networks. Whether the network uses the variables in a way that is meaningful with respect to the erosion process is unknown and irrelevant for the result; it is, however, highly relevant for transferability. While it is relatively easy to find out whether, for instance, the K factor equation is applicable in a specific case (e.g., peatland erosion), it is difficult to find out in which cases a neural network result will fail when transferred to a different situation.
Fig. 5: The low importance of LS is strange, particularly given the higher importance of flow accumulation and slope. Essentially, LS is the product of flow accumulation and slope gradient and thus must be of higher importance. Could LS be wrongly calculated by assuming straight slopes, although you have converging and diverging slopes? Furthermore, did you use the field's LS factor or the pixel's LS factor, which is entirely different information? Your M&M section clearly requires more information; otherwise, the results cannot be understood.
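The dependence of LS on the two "more important" variables can be made explicit. A sketch using one common pixel-based formulation (Moore & Burch 1986); the manuscript's actual LS equation is unknown, and the input values are invented:

```python
import numpy as np

def ls_moore_burch(upslope_area_m2, cell_size_m, slope_deg):
    """Pixel-based LS after Moore & Burch (1986). This is one common
    formulation, used here for illustration only."""
    # Specific catchment area = upslope area per unit contour width
    a_s = upslope_area_m2 / cell_size_m
    beta = np.radians(slope_deg)
    return (a_s / 22.13) ** 0.4 * (np.sin(beta) / 0.0896) ** 1.3

# LS grows monotonically with both flow accumulation and slope, so a
# model that ranks those two variables highly should rank LS highly too
ls_flat = ls_moore_burch(500.0, 5.0, 2.0)
ls_steep = ls_moore_burch(5000.0, 5.0, 12.0)
print(ls_flat, ls_steep)
```

If the reported LS was computed per field rather than per pixel, this monotone relationship with the pixel-level inputs breaks down, which could explain the low importance.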
CNN was the best method in your case. Does this have any relevance? Will CNN always, or at least often, be the best? We don't know, because this is an unreplicated experiment. Usually, we regard unreplicated results as meaningless. I wonder whether you could improve the validity of your analysis. For instance, you could run your seven study areas separately. Is CNN the best in all seven cases? Is the ranking of variables similar in all seven cases (which would allow us to say something about transferability at least within your region)? You could run your analysis ten times with a subset of 10 randomly selected variables from your data set. Is CNN the best method in all cases? Presently, we do not know, and hence your conclusion that CNN outperforms other methods remains mere speculation.
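The per-area replication I suggest amounts to a simple bookkeeping loop. A sketch with synthetic data, where a random forest and a linear model stand in for the manuscript's CNN and competing methods:

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
wins = Counter()
for area in range(7):
    # Synthetic data standing in for one study area
    X = rng.normal(size=(200, 5))
    y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=200)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=area)

    models = {
        "forest": RandomForestRegressor(n_estimators=50, random_state=0),
        "linear": LinearRegression(),
    }
    area_scores = {name: r2_score(yte, m.fit(Xtr, ytr).predict(Xte))
                   for name, m in models.items()}
    # Record which method wins this replicate
    wins[max(area_scores, key=area_scores.get)] += 1

print(dict(wins))
```

Only if one method wins in (nearly) all seven replicates does the claim of superiority carry weight; a 4:3 split would show that the single-run ranking is noise.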