Articles | Volume 11, issue 1
https://doi.org/10.5194/soil-11-413-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Using 3D observations with high spatio-temporal resolution to calibrate and evaluate a process-focused cellular automaton model of soil erosion by water
Download
- Final revised paper (published on 12 Jun 2025)
- Supplement to the final revised paper
- Preprint (discussion started on 12 Sep 2024)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2024-2648', Anonymous Referee #1, 14 Oct 2024
- AC1: 'Reply on RC1', Anette Eltner, 20 Nov 2024
- RC2: 'Comment on egusphere-2024-2648', Anonymous Referee #2, 23 Oct 2024
- AC2: 'Reply on RC2', Anette Eltner, 20 Nov 2024
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Publish subject to revisions (further review by editor and referees) (06 Dec 2024) by Nikolaus J. Kuhn
AR by Anette Eltner on behalf of the Authors (14 Jan 2025)
Author's response
Manuscript
ED: Referee Nomination & Report Request started (22 Jan 2025) by Nikolaus J. Kuhn
RR by Pedro Batista (17 Feb 2025)
EF by Polina Shvedko (06 Feb 2025)
Author's tracked changes
ED: Publish subject to minor revisions (review by editor) (18 Feb 2025) by Nikolaus J. Kuhn
AR by Anette Eltner on behalf of the Authors (03 Mar 2025)
Author's response
Author's tracked changes
Manuscript
ED: Publish subject to technical corrections (15 Mar 2025) by Nikolaus J. Kuhn
ED: Publish as is (17 Mar 2025) by Peter Fiener (Executive editor)
AR by Anette Eltner on behalf of the Authors (17 Mar 2025)
Manuscript
This manuscript is precisely what we need in soil erosion modelling: innovative, nuanced approaches to improve model evaluation and an honest assessment of model performance.
In the specific comments below, I raise several questions that came up while reading the manuscript, along with some suggestions that I hope will improve the paper. In particular, I recommend adjusting some of the modelling terminology (e.g. the usage of terms such as calibration, validation, and evaluation) and reducing the focus on identifying a ‘best model run’. The figures could also generally use some improvements.
Specific comments
Abstract: Here and in the introduction you could highlight the novelty of your work. To my knowledge, this is the first time that data with such high spatiotemporal resolution have been used to evaluate erosion models.
L50-75: I found the model description a bit on the long side for the introduction. I would consider moving most of this to section 2.3 in the methods.
L74: Please define the abbreviations RD-ST and RD-FT.
L105: “Validate” and “evaluate” seem to be used interchangeably, but these terms can signify different meanings (Oreskes, 1998; Oreskes et al., 1994). Beven and Young (2013) suggest avoiding the term “validation” in hydrological modelling.
L121: Consider rephrasing to: “Ten objective functions were considered to calibrate model parameters”.
L122: By model runs, do you mean model realisations, i.e. “one random sample taken from the set of all possible random samples in a Monte Carlo simulation” (Beven, 2009)?
I missed a stronger statement about the importance and novelty of your work. What you have done is innovative and exciting and creates new possibilities for testing erosion models.
L167: How did you choose the DEM resolution?
L168: Why do the time-lapse data have finer resolutions than the DEMs used as input for the model?
L170: What is M3C2-PM?
L170-180: It’s great to have this spatially distributed error estimate for the point clouds. Have you considered using this as part of the model evaluation process, i.e. for defining limits of acceptability of model error (Beven, 2018)?
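To illustrate what I mean by limits of acceptability: the spatially distributed error estimate could serve as a cell-by-cell acceptance bound for simulated surface change. A minimal sketch (my own illustration, not anything from the manuscript; names and tolerances are hypothetical):

```python
def behavioural(sim_dod, obs_dod, error_map, factor=1.0):
    """Limits-of-acceptability check in the spirit of Beven (2018):
    a model realisation is deemed behavioural only if, at every cell,
    the simulated elevation change falls within the observed change
    plus/minus the local point-cloud error estimate."""
    return all(abs(s - o) <= factor * e
               for s, o, e in zip(sim_dod, obs_dod, error_map))

# Toy example: two cells, simulated vs observed change (m), local errors (m)
ok = behavioural([0.10, 0.20], [0.12, 0.25], [0.05, 0.05])
```

Relaxing `factor` above 1 would then give a graded, rather than binary, acceptance scheme.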
I appreciate the narrative model description, but having a list of model equations in the supplement would be very helpful. Please add this information.
L204: Please check if this formulation is correct: “If a wet cell’s sediment load is less than the transport capacity, then soil is eroded from the cell using a probabilistic detachment equation by Nearing (1991)”.
Shouldn’t you first calculate soil detachment for a given cell, add it to the sediment load delivered to this cell by upstream cells, and then compare the sum to the transport capacity of the overland flow for this cell to estimate the amount of sediment routed downstream? That is, why would soil detachment (and not transport) depend on the transport capacity of the overland flow? Maybe I misunderstood something – please clarify.
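In pseudocode, the sequencing I have in mind would be something like the following (a sketch of my reading only, not a claim about RillGrow's actual implementation; all names are hypothetical):

```python
def route_cell(upstream_load, detachment_rate, transport_capacity, dt=1.0):
    """Sketch of the cell-update order suggested in the comment above:
    detachment is computed first and is independent of transport capacity;
    transport capacity then caps what the flow can carry downstream."""
    load = upstream_load + detachment_rate * dt   # incoming load + local detachment
    routed = min(load, transport_capacity)        # Tc limits transport, not detachment
    deposited = load - routed                     # any excess settles in the cell
    return routed, deposited
```

Under this ordering, transport capacity governs deposition and routing, while detachment remains a function of the flow's erosive forces alone.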
L231: I found it strange to calibrate the parameter ‘DEM base level’, as this is a measurable quantity that would not need to be estimated via calibration. Can you explain your rationale here?
L242-243: Based on this initial simulation, how did you choose the parameters for calibration? Based on some kind of sensitivity measure?
What parameter space was sampled in the Latin hypercube simulation? Please give the ranges (assuming you sampled from uniform distributions) for each calibrated parameter.
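For clarity, this is the kind of sampling scheme I assume was used (a minimal sketch; the parameter names and ranges are hypothetical, purely for illustration):

```python
import random

def latin_hypercube(ranges, n, seed=0):
    """Minimal Latin hypercube sampler: each parameter's uniform range is
    split into n equal strata, one value is drawn per stratum, and the
    strata are shuffled independently per parameter so every sample set
    covers the full range of every parameter."""
    rng = random.Random(seed)
    samples = [{} for _ in range(n)]
    for name, (lo, hi) in ranges.items():
        strata = list(range(n))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            u = (s + rng.random()) / n        # uniform position within stratum s
            samples[i][name] = lo + u * (hi - lo)
    return samples

# Hypothetical parameters and uniform ranges, for illustration only
sets = latin_hypercube({"soil_erodibility": (0.1, 1.0),
                        "flow_roughness": (0.01, 0.1)}, n=100)
```

Reporting the equivalent of the `ranges` dictionary above for each calibrated parameter would make the sampling design reproducible.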
L254: How did you assess the suitability of a function for calibrating the model?
L269-280: How is the DL metric interpreted? The higher the value, the greater the similarity?
L310: Why did you smooth out the DoDs and not the simulated DEMs?
The methods employed in section 2.5 must have been difficult to explain, but I think you did a good job. Still, I have some questions/comments:
Figure 4: The font size for the axis text in the DTW WC and DTW SY panels is too small. The legend for the rasters is missing – I do not think referring to the legend from figure 2 is very helpful here. Moreover, it would be nice to identify the panels (e.g. a, b, c…) to improve readability.
L339: Where is this shown in Figure 6?
L345-347: This is very cool!
L349-350: Maybe the DEM smoothing is necessary for this kind of model application. Edit: Why was this not necessary for the lab rainfall simulation?
L355: I did not understand the statement that “the metrics capture different aspects of soil surface change, including erosion”. The example in the next sentence did not clarify your point to me. Moreover, are there any processes potentially leading to changes in the soil surface that RillGrow does not represent?
L359-363: This is what I meant above – a single model realisation that optimises all functions is irrelevant. If you choose different functions, repeat the rainfall simulation experiment, or change any steps in the DEM processing, you’ll end up with a different optimal parameter set. Moreover, what is a good fit in this case? How do you define if the realisation fits a function “well”? I suggest rephrasing this to something along the lines of “We are looking to explore the behavioural parameter space constrained by different sources of data and objective functions”.
L364-365: This is a great demonstration of the equifinality problem!
L378: Where are these metrics being used for “validation”? From what I understand, so far you have explored different metrics as part of the model calibration procedure.
L389: What does it mean that the model does not predict splash or interill erosion? Is this identified by a given parameterisation or by the outputs? Moreover, I thought RillGrow did not differentiate between rill and interill processes (L207).
L390: Do you mean more splash is modelled for the realisations with better DTW and SY metrics?
You go into a lot of detail describing single model realisations, which makes the text long and sometimes difficult to follow. I think this stems from your focus on identifying a single realisation to optimise all functions. I suggest focusing on more generalisable patterns and shortening some of the results.
L405-406: Calibration and validation seem to get confused; please check or define this somewhere. Beven (2009) defines calibration as “the process of adjusting parameter values of a model to obtain a better fit between observed and predicted variables”. A calibrated model can then be tested against new data not used during the calibration procedure (Klemeš, 1986). After reading the manuscript, I understand you tested different data and functions for calibrating RillGrow. Of course, this can be considered part of an evaluation process, but I suggest being precise about the terminology.
L412: Could this result from model input variables changing during the simulation and this not being picked up by the model parametrisation?
L415: Please try to be more precise when describing model performance. What is a very close fit to the observation?
L440: Similar error metrics?
Figure 9: Please add a legend for the point colours. Moreover, while the ten ‘best’ realisations can be very scattered, using a larger number of behavioural realisations might help you identify and describe patterns in the dotty plots.
L453-455: Is this a limitation of the model or the data? As you mentioned above, the initial changes in the soil surface are too small to be detected, considering the DEM errors.
Figure 10: Here the comma is used as a decimal separator.
L455-464: I had a hard time understanding this paragraph. What are these ranges? Which parameters do they represent?
Figure 11: Would be great to have the observed DoD here. Also, shouldn’t the abbreviations in the panel titles be described in the figure legend?
L530-535: It makes sense that the same model realisations that simulate higher changes in elevation also simulate higher sediment yield, right? The output variables should be correlated. Getting the rill patterns right is a different story.
Figure 15: Please check the decimal separators. Why doesn’t the y-axis start at zero? I also don’t understand this figure; what is this parameter range?
L580-595: There have been multiple attempts to evaluate erosion models using spatial data, e.g. from field surveys, aerial images, and fallout-radionuclide data (Brazier et al., 2001; Fischer et al., 2018; Jetten et al., 2003; Saggau et al., 2022; Vigiak et al., 2006; Wilken et al., 2020). So, I am not sure that model evaluation has lagged behind the models – the technology is out there; the problem is that it is so much easier not to use it.
What I think is really unique and exciting in your approach is the quality, the spatiotemporal resolution, and the different sources of data (plot outlet and SfM) used for model calibration.
L605-610: Yes, I found this similarity index very useful!
References
Beven, K.: Towards a methodology for testing models as hypotheses in the inexact sciences, Proc. R. Soc. A Math. Phys. Eng. Sci., 475(2224), doi:10.1098/rspa.2018.0862, 2019.
Beven, K. J.: Environmental Modelling: An Uncertain Future, Routledge, Oxon., 2009.
Beven, K. J.: Rainfall-Runoff Modelling, 2nd ed., John Wiley & Sons, Chichester., 2012.
Beven, K. J.: On hypothesis testing in hydrology: Why falsification of models is still a really good idea, WIREs Water, 5, e1278, doi:10.1002/wat2.1278, 2018.
Beven, K. J. and Young, P.: A guide to good practice in modeling semantics for authors and referees, Water Resour. Res., 49(8), 5092–5098, doi:10.1002/wrcr.20393, 2013.
Brazier, R. E., Beven, K. J., Anthony, S. G. and Rowan, J. S.: Implications of model uncertainty for the mapping of hillslope-scale soil erosion predictions, Earth Surf. Process. Landforms, 26, 1333–1352, 2001.
Cândido, B. M., Quinton, J. N., James, M. R., Silva, M. L. N., de Carvalho, T. S., de Lima, W., Beniaich, A. and Eltner, A.: High-resolution monitoring of diffuse (sheet or interrill) erosion using structure-from-motion, Geoderma, 375, 114477, doi:10.1016/j.geoderma.2020.114477, 2020.
Fischer, F. K., Kistler, M., Brandhuber, R., Maier, H., Treisch, M. and Auerswald, K.: Validation of official erosion modelling based on high-resolution radar rain data by aerial photo erosion classification, Earth Surf. Process. Landforms, 43(1), 187–194, doi:10.1002/esp.4216, 2018.
Jetten, V., Govers, G. and Hessel, R.: Erosion models: Quality of spatial predictions, Hydrol. Process., 17(5), 887–900, doi:10.1002/hyp.1168, 2003.
Klemeš, V.: Operational testing of hydrological simulation models, Hydrol. Sci. J., 31(1), 13–24, doi:10.1080/02626668609491024, 1986.
Oreskes, N.: Evaluation (not validation) of quantitative models, Environ. Health Perspect., 106(6), 1453–1460, doi:10.1289/ehp.98106s61453, 1998.
Oreskes, N., Shrader-Frechette, K. and Belitz, K.: Verification, validation, and confirmation of numerical models in the Earth Sciences, Science, 263, 641–646, doi:10.1126/science.263.5147.641, 1994.
Saggau, P., Kuhwald, M., Hamer, W. B. and Duttmann, R.: Are compacted tramlines underestimated features in soil erosion modeling? A catchment-scale analysis using a process-based soil erosion model, L. Degrad. Dev., 33(3), 452–469, doi:10.1002/ldr.4161, 2022.
Takken, I., Beuselinck, L., Nachtergaele, J., Govers, G., Poesen, J. and Degraer, G.: Spatial evaluation of a physically-based distributed erosion model (LISEM), Catena, 37(3–4), 431–447, doi:10.1016/S0341-8162(99)00031-4, 1999.
Vigiak, O., Sterk, G., Romanowicz, R. J. and Beven, K. J.: A semi-empirical model to assess uncertainty of spatial patterns of erosion, Catena, 66(3), 198–210, doi:10.1016/j.catena.2006.01.004, 2006.
Warren, S. D., Mitasova, H., Hohmann, M. G., Landsberger, S., Iskander, F. Y., Ruzycki, T. S. and Senseman, G. M.: Validation of a 3-D enhancement of the Universal Soil Loss Equation for prediction of soil erosion and sediment deposition, Catena, 64(2–3), 281–296, doi:10.1016/j.catena.2005.08.010, 2005.