A GLUE-based assessment of WaTEM/SEDEM for simulating soil erosion, transport, and deposition in soil conservation optimised agricultural watersheds

Seufferheld, Kay D.; Batista, Pedro V. G.; Shokati, Hadi; Scholten, Thomas; Fiener, Peter

doi:10.5194/soil-12-301-2026

Articles | Volume 12, issue 1

https://doi.org/10.5194/soil-12-301-2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Special issue:

Advances in dynamic soil modelling across scales

Original research article

30 Mar 2026

Final revised paper (published on 30 Mar 2026)
Preprint (discussion started on 05 Sep 2025)

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-3391', Anonymous Referee #1, 02 Oct 2025

The Authors investigate the use of a soil erosion and sediment delivery model (WaTEM/SEDEM: WS) on a well-instrumented catchment in Southern Germany. WS is replicated in Python and implemented within a GLUE approach, assuming prior uncertainty on input parameters and predictive target (sediment load) measurements. The study adopts a limits-of-acceptability approach, comparing model realisation against measurements to test its ability to perform adequate simulations in the presence of uncertainty. Model realisations are tested using both annual timesteps (i.e. annually aggregated sediment yield) and the average annual sediment yield (i.e. the yearly average sediment load) in several microcatchments, allowing comparisons at different levels of spatial and temporal aggregation. Although the average annual sediment yield is the intended predictive target for WS, testing WS at different temporal resolutions is nevertheless both scientifically and practically interesting.
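For readers unfamiliar with the approach summarised above, the limits-of-acceptability variant of GLUE can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stand-in model `toy_sediment_yield`, the parameter names and ranges, and the acceptability limits are all hypothetical.

```python
import random


def toy_sediment_yield(ktc, erosion_multiplier):
    """Hypothetical stand-in for one model run; returns sediment yield (t/ha/yr)."""
    gross_erosion = 2.0 * erosion_multiplier   # assumed gross erosion term
    transport_capacity = 0.05 * ktc            # assumed transport capacity term
    # Sediment export is limited by whichever term is smaller.
    return min(gross_erosion, transport_capacity)


def glue_limits_of_acceptability(n_runs, lower, upper, seed=0):
    """Sample parameters from uniform priors and keep runs inside the limits."""
    rng = random.Random(seed)
    behavioural = []
    for _ in range(n_runs):
        params = {
            "ktc": rng.uniform(0.0, 20.0),                 # hypothetical prior range
            "erosion_multiplier": rng.uniform(0.1, 2.0),   # hypothetical prior range
        }
        simulated = toy_sediment_yield(**params)
        # A run is behavioural only if it falls within the observation limits.
        if lower <= simulated <= upper:
            behavioural.append((params, simulated))
    return behavioural


behavioural_runs = glue_limits_of_acceptability(n_runs=5000, lower=0.2, upper=0.4)
print(f"{len(behavioural_runs)} of 5000 runs fall inside the limits of acceptability")
```

The same loop can be evaluated against either annual values or a long-term aggregate; only the observed limits passed as `lower` and `upper` change.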
Firstly, I commend the authors on their work. The manuscript reads well, and I believe the general research question is of interest to multiple backgrounds. In terms of the scientific advancements offered for soil erosion modelling, I find that the conclusion reiterates what is perhaps the most fundamental concept of the USLE, namely that temporal aggregation is required to generate acceptable predictions according to the central limit theorem, which isn't necessarily surprising. Reiterating this point is nevertheless useful; however, one could argue that “why” is more relevant than “if”. While the study correctly demonstrates that the model fails at the annual timestep, it offers limited insight into the reasons for this failure. A key value of testing a model outside its intended use case is to diagnose its structural weaknesses; this potential is not fully realised here.
Linking to this broader point is the main methodological critique I have of the manuscript. Despite using an implementation design which is intended to understand model drawbacks and accept only a subset of simulations, the authors do not consider uncertainty on the individual USLE parameters but instead opt for the use of an error surface on the gross erosion predictions. Given that a subset of these parameters are propagated into the transport capacity (i.e. L, S, R and K), the setup potentially ignores important parameter interactions which may impact both the uncertainty quantification and the insights into the model’s shortcomings. Considering the simplistic design of the model, parameter lumping seems avoidable. The choice of approach also induces a circularity into the argumentation of the study, where the lack of acceptable model realisations at the annual timestep is attributed to unconsidered uncertainty in the input factors (e.g. the (bio)physical impacts of conservation tillage on overland flow and erosion), which could have otherwise been considered in the modelling approach.
Secondly, the study implements a replication of the WS code but neither benchmarks it against the standard model nor releases the source code openly. I am in favour of replications of models such as WS in popular programming languages such as Python (which, although slower on average, permit easy data integration and parallelization, as mentioned by the authors), but without at least a benchmarking use-case the current implementation lacks reproducibility and falls short of good modelling practice.
I deem both points necessary to address before recommending the study for publication in SOIL. I have included additional points below on a section-by-section basis.
Kind regards,
Introduction:
The Introduction does not address temporal resolution or how it influences the model assumptions, constrains the model parameters, and affects equifinality. WaTEM/SEDEM simulates the central tendency, not the temporal variability. Finer timescales can mean more variability through time than through space (i.e. in the long-term annual average), which has obvious implications for the (required) parameter sensitivity. Its use in a case where it is tested on the annual dynamics of sediment load should therefore be clarified, along with the assumed impact on model equifinality compared to the long-term simulation.
L54-57 - can you add evidence regarding the most used model? I suggest adding a citation.
L66-68: I suggest being more specific and mentioning which conservation measures haven’t been evaluated. It should be stated what differences there are compared to grass and non-arable elements, which are commonly represented in the model. Or is it that they are not typically evaluated with real data in studies?
L73-75: This lumps measurements at vastly different spatial scales (e.g. erosion plots to large watersheds) into the same context, despite having considerably different scale-related implications when considering sediment delivery.
L69-77: This paragraph would benefit from a consideration of the practical considerations in the modelling process, since the implications depend on the objective of the modeller. Many modelling efforts seek acceptable sediment yield predictions, and use models with this predictive target but producing intermediate spatially distributed estimates. Others do require accurate spatial estimations with an acceptable level of uncertainty at their representative spatial and temporal scale.
Methods:
Currently I don’t see a justification for the selection of the priors given the driving processes of erosion. Why are error distributions considered uniform for all parameter distributions at all considered time scales? Is it not the case that driving events may be driven by low probability rainfall events or high intensity bursts? I suggest including justifications which match the nature of the driving processes, particularly in the case of changing temporal scales. In such a well-measured watershed, is it not possible to constrain the uncertainty components? It is later discussed that short windows of coincidence between bare soil and heavy rainfall can be critical, which would manifest as high uncertainty in the C-factor and R-factor. This is arguably the advantage of generating synthetic data.
As mentioned above, lumping everything into an error parameter on gross erosion poorly represents the individual contributions of sub-parameters, their interactions, and identifiability. At present, I miss a justification for this. What about the contribution (combinations of) sub-factors and their contribution to erosion and sediment transport realisations?
Regarding the general model implementation, a German USLE formulation is used in place of the typical RUSLE formulation. Can the authors show time series data of the annual parameter inputs used? It would be helpful for the reader to know the distributions of the input values. I would also suggest a discussion on what impact this may have on the model, plus the consequences for comparing parameter values (e.g. ktc) with other studies given the parameter compensation effects.
What about stream initiation and transition to channelised flow? In WS, there are various ways to consider the stream channel initiation by digitizing channels or considering a flow accumulation threshold. I would recommend mentioning this.
L197: The word seasonal is ambiguous in this case. I would also suggest using a mathematical formulation for the C-factor, showing how the SLR is generated and combined with rainfall erosivity and at what time scale. It’s also of general interest to the reader to know what these SLR and C-factor values are for both arable and grassland, and how they change through time. The literature reference for the SLR formulation is also grey literature, so more details are justified.
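As an illustration of the kind of formulation requested here, the conventional USLE-type C-factor is an erosivity-weighted mean of the soil loss ratios (SLR) over the periods of the crop calendar. The sketch below assumes that standard form; it is not taken from the manuscript, and the period values are hypothetical placeholders rather than data from the study.

```python
def c_factor(soil_loss_ratios, erosivity_shares):
    """Erosivity-weighted mean SLR: C = sum(SLR_i * EI_i) / sum(EI_i)."""
    if len(soil_loss_ratios) != len(erosivity_shares):
        raise ValueError("one SLR value is needed per erosivity period")
    total_erosivity = sum(erosivity_shares)
    if total_erosivity <= 0:
        raise ValueError("total erosivity must be positive")
    weighted = sum(slr * ei for slr, ei in zip(soil_loss_ratios, erosivity_shares))
    return weighted / total_erosivity


# Hypothetical four-period crop calendar (seedbed, growth, full cover, residue).
slr_per_period = [0.40, 0.10, 0.05, 0.20]   # soil loss ratio per period (assumed)
ei_per_period = [0.10, 0.50, 0.30, 0.10]    # share of annual erosivity (assumed)

print(f"C-factor: {c_factor(slr_per_period, ei_per_period):.3f}")
```

The weighting makes explicit why a short overlap between low soil cover (high SLR) and a high share of erosivity can dominate the resulting C-factor.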
L297-304: How does the calculation of likelihoods vary between temporal aggregations? For individual years is this done by comparing the time series or individual simulations?
L316-324: Can the authors justify the use of the median? This would assume an underestimation of the total sum due to the positively skewed nature of sediment yield, which would have obvious implications for watershed management. Typical applications of WaTEM/SEDEM are applied to the mean.
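The concern about the median can be made concrete with a small numerical sketch. The eight annual sediment yields below are hypothetical values chosen only to be positively skewed; they are not measurements from the study.

```python
import statistics

# Hypothetical annual sediment yields (t/ha/yr) for an eight-year record;
# one large year dominates the total, as is typical for skewed erosion data.
annual_sediment_yield = [0.02, 0.03, 0.05, 0.05, 0.08, 0.10, 0.30, 1.20]

median_sy = statistics.median(annual_sediment_yield)
mean_sy = statistics.mean(annual_sediment_yield)
n_years = len(annual_sediment_yield)

print(f"median: {median_sy:.3f} t/ha/yr, mean: {mean_sy:.3f} t/ha/yr")
# Scaling the median by the record length underestimates the exported total,
# whereas scaling the mean reproduces it.
print(f"total from median: {median_sy * n_years:.2f} t/ha")
print(f"total from mean:   {mean_sy * n_years:.2f} t/ha")
```

For such a record, the median describes a typical year, while the mean is the statistic that preserves the cumulative sediment export.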
Results:
The results are in general concisely presented. However, the spatial analysis section is overly brief. Do the multiple model runs which were made to address equifinality not give significantly more spatial information in addition to the median? What is the spatial variability of the behavioural predictions? Only the median is currently given but arguably one of the advantages of multiple realisations in WS is that you can get some idea of the variability in the spatial patterns from acceptable simulations. This is also useful to know the added spatial information which can be achieved for land management.
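One way to extract the additional spatial information asked about here is to summarise each raster cell across all behavioural runs rather than reporting the median alone. The following is a minimal sketch with hypothetical 2 × 2 maps (negative values taken as erosion, positive as deposition); the function name and data are illustrative assumptions, not part of the authors' code.

```python
import statistics


def per_cell_summary(behavioural_maps):
    """Return per-cell median and 5-95 % spread across behavioural runs."""
    n_rows = len(behavioural_maps[0])
    n_cols = len(behavioural_maps[0][0])
    median_map = [[0.0] * n_cols for _ in range(n_rows)]
    spread_map = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            values = [run[r][c] for run in behavioural_maps]
            # 19 cut points for n=20: the first is the 5th percentile, the last the 95th.
            cuts = statistics.quantiles(values, n=20, method="inclusive")
            median_map[r][c] = statistics.median(values)
            spread_map[r][c] = cuts[-1] - cuts[0]
    return median_map, spread_map


# Three hypothetical behavioural realisations of net soil redistribution (t/ha/yr).
maps = [
    [[-1.0, 0.2], [0.5, 0.0]],
    [[-2.0, 0.4], [0.1, 0.0]],
    [[-3.0, 0.3], [0.9, 0.0]],
]

median_map, spread_map = per_cell_summary(maps)
print("median:", median_map)
print("5-95 % spread:", spread_map)
```

Cells with a wide spread are those where the behavioural parameter sets disagree most, which indicates where the simulated spatial pattern is least constrained.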
Discussion:
A typical explanation for the lack of global ktc parameters is the existence of unconsidered processes. Is this the case, or is it more of an inadequacy to capture the system behaviour? Can the field data give insights on this?
L459-467: This is somewhat difficult to follow. Is including this uncertainty in the input parameters not the purpose of using GLUE?
“Conservation landscapes” is combining multiple physical characteristics of the agricultural watersheds together, which have differing roles in soil erosion and sediment delivery through on-site and off-site effects. Is it due to grassed areas or conservation tillage? Indeed, grassed areas are commonly applied in the model through land use elements and grass buffer strips. So one could argue that they are indeed commonly applied in the model, but the impacts of conservation tillage on erosion and overland flow generation less so. It would help to separate conservation landscapes into their specific elements.
Can the authors elaborate on the effect of using the USLE formulation for Germany versus the typical RUSLE formulation used in WaTEM/SEDEM? Indeed, I expect the model to be better calibrated for the German agri-environmental conditions on which it was developed; however, there are differences compared to the RUSLE formulation which are worthwhile to mention.
Conclusion:
L584-585: I didn’t see this point addressed in the manuscript. Is it not the case that the gross erosion estimates from the USLE factors overestimate the rates based on the most likely error surface values?

Citation: https://doi.org/10.5194/egusphere-2025-3391-RC1
- AC1: 'Reply on RC1', Kay Seufferheld, 01 Dec 2025
  
  Thank you for the detailed review of our manuscript. Please find attached our response to the general comments provided.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3391-AC1
RC2:
'Comment on egusphere-2025-3391', Joris Eekhout, 12 Nov 2025

Review of “A GLUE-based assessment of WaTEM/SEDEM for simulating soil erosion, transport, and deposition in soil conservation optimised agricultural watersheds” submitted to SOIL for consideration for publication.
The manuscript describes a model application of the WaTEM/SEDEM model in six agricultural fields in Germany. The authors applied a GLUE approach to assess the parameter space of a number of model parameters that are relevant for in-field and structural conservation measures. The authors show that the model gives relatively good results at the field-scale, especially when considering the long-term trends, but largely overestimates sediment yield when structural conservation measures are considered.
In general, the manuscript is very well written and is accompanied with clear figures and tables. However, there are some aspects of the manuscript that need revisions. These are mostly related to the objectives and the description of the applied methods.
The objectives should be better defined. The first objective focusses on testing the model’s capabilities. This does not seem to be too ambitious. No matter what the outcome is, this objective will always be achieved. So please refine this objective to make it more ambitious. The second objective seems related to the GLUE approach; I suggest explicitly including the GLUE approach in this objective. The last objective is similarly not too ambitious (either testing or analysing something will always be achieved). The concept of aggregating the data using different spatiotemporal resolutions has not been mentioned in the Introduction. I was expecting that this would go in some direction of using different spatial and temporal resolutions (different cell sizes and time steps, for instance). However, this is totally not the case. The authors instead use the long-term median model outcome, instead of the annual outcomes (the way USLE-type models should actually be used). And the spatial aggregation is related to the two different conservation types considered. I’m not sure if this needs to be included in an objective.
Below I have provided specific comments to the text, figures and tables.
Specific comments

Lines 43-44: I was expecting the two strategies already in this sentence. I suggest to add after the colon “(i) in-field control measures and (ii) off-site sediment transport control structures”. It would be even better to make a clearer distinction, such as on-site and off-site measures.

Lines 43-46: In-field measures also frequently have the aim to increase infiltration and reduce runoff generation or to increase surface roughness to reduce flow velocities. This definition of in-field measures can be made a bit broader.

Line 47: Replace the first “and” with a comma.

Lines 50-51: Field demonstration would be highly feasible at the scale the authors are working and likely more convincing for stakeholders. Models are indeed valuable tools, but likely more for scenario evaluation, for instance, for different configurations of on-site and off-site measures.

Lines 72-73: What is the difference between meso-scale watersheds and large-scale catchments? Please clarify in the text.

Line 80: Replace “at the” with “in a”.

Line 81: Replace “in large-catchment sediment yield observations” with “at larger scales”.

Line 87: Replace the first “and” with a comma.

Lines 94-96: These two sentences are a bit difficult to understand. For instance, what do the authors mean by “These behavioural models”? Behavioural models were not mentioned in the previous sentence.

Lines 101-102: The second objective refers to a sensitivity analysis? Or is this related to the application of the GLUE method? Please clarify in the text.

Lines 102-103: The third objective seems unrelated to the information provided in the Introduction or is this also related to the GLUE method (seems unlikely)? If not, please provide a short introduction on how differences in spatiotemporal resolutions impact soil erosion model outcomes.

Line 105: Replace “from” with “for”.

Lines 121-125: The main difference between the two different systems seems to be that W05 and W06 include grass strips, while the other study areas don’t. The other study areas also include retention ponds, which I consider to be a structural conservation measure. Please clarify in the text.

Line 127: Looking at Figure 1, it does not seem that the fields are arranged parallel to the contour lines (assuming that the curved lines indicate the contour lines). Please clarify in the text.

Line 135: What is meant with F15-F18? These are different configurations of the crop rotation? Please clarify.

Figure 1: I had to study the figure quite a bit to figure out which study area belonged to which field. I suggest to include another smaller panel where the fields are better indicated, with different colours, for instance. It would also be useful to get some more information about the contour lines, to give the reader an idea about the slopes in the study area. Moreover, the differences in crop rotation are not that clear from this figure. To which fields do the different F-codes belong?

Line 151: With “aliquot” the authors mean “sample”?

Lines 148-156: How were runoff and sediment totals estimated using this system? Was this continuously monitored or estimated after each event? Were the sediment samples further analysed on grain size distribution? Please clarify in the text.

Lines 161-164: The authors included a code availability statement saying that the code is available on reasonable request. I highly suggest to make the code publicly available through an open-source repository such as GitHub or Zenodo.

Lines 197-223: I suggest restructuring this subsection; the first paragraph in particular introduces several concepts that are only described in more detail later on. This might be confusing for many readers. Please add 1-2 sentences explaining what will follow in this subsection.

Lines 197-202: This seems to be a bit confusing. What do the authors mean by “combining seasonal rainfall erosivity with temporal changes in soil cover”? The rainfall erosivity is used to calculate the crop factor? How is the SLR calculated and how is the SLR related to the crop factor? Please clarify in the text.

Lines 203-205: Change “bi-weekly measurements” to “bi-weekly crop and residue cover measurements” and remove the sentence in line 205.

Lines 205-207: How were the bi-weekly measurements translated to daily cover values? What is meant by standardised crop development? Please clarify.

Lines 238-240: Were these values obtained from the literature or included in a further analysis, e.g. GLUE? Please clarify.

Lines 260-261: How was this standard deviation applied? Please clarify.

Line 270: So the runoff samples are collected in a barrel. This has not been described under Data.

Lines 317-319: Here the authors mean that the study areas were subdivided into field- and structure-dominated systems? But that was already defined much earlier in this section. Why is there a need to repeat that here? Please clarify.

Lines 319-320: How did the authors aggregate the median values between field- and structure-dominated systems, by taking the average? Please clarify.

Lines 343-344: With “model performance” the authors mean “simulations”?

Lines 344-345: It seems that the median of the median is higher than 0.3 t/ha/yr (based on the boxplot in panel j of Figure 3), but here the authors suggest 0.24 t/ha/yr. Please explain what this value is based on.

Lines 345-347: Similar to the previous comment. The simulated median of the median is higher than 0.5 t/ha/yr (panel d of Figure 4). Please explain what the 0.15 t/ha/yr is based on.

Lines 356-358: This means that the model is performing better in W04 and worse in W05?

Table 2: If the Simulated SY is indeed the median of the median, then these values do not align with what is shown in Figures 3 and 4. See previous comments about this.

Figure 7: It seems that negative values are erosion and positive values deposition, please indicate this in the figure caption.

Lines 476-479: In that sense, it would be logical to also apply the model using long-term average rainfall erosivity values for the different study areas, instead of taking the median of the annual results.

Lines 520-521: The TC is controlled by any value of kTC/A, not only high values. Please revise the sentence accordingly.

Lines 520-526: What exactly is the point the authors want to make here? That TC remains high enough to transport all sediment without causing deposition? The question should be whether this coincides with the observations, or whether the inclusion of retention ponds has a large influence on the modelled processes. (Ok, this is further explained in the subsequent paragraphs. I suggest adding 1-2 sentences explaining the likely reasons for this behaviour in the model, which you subsequently explain in the following paragraphs.)

Citation: https://doi.org/10.5194/egusphere-2025-3391-RC2
- AC2: 'Reply on RC2', Kay Seufferheld, 01 Dec 2025
  
  Dear Joris,
  Thank you for the detailed review of our manuscript. Please find attached our responses to your general comments.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3391-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Publish subject to revisions (further review by editor and referees) (05 Jan 2026) by Lutz Weihermueller

Based on the comments from expert reviewers and the reply from the authors, the manuscript is returned to the authors for revision. I recommend that you revise the manuscript in accordance with the suggestions provided in the comments and below and submit a revision. If you choose to revise the manuscript, you must address each revision request or concern with a written response and add or modify the text accordingly.

One referee pointed to the objectives, which were stated in the original version but are too general and should therefore be adapted and/or made more precise. The same holds for the Introduction, where one reviewer stated that it is not fully developed with respect to the spatiotemporal resolution of the data used in the modelling approach and its implications. Clarifying this point in the Introduction, and maybe also in the Methods section, might be helpful to better understand your motivation and the model outcome.
The second reviewer pointed to the discussion and would like to see an in-depth discussion of why the temporal aggregation is required. Here, it seems that the discussion provided so far was not clear to the reviewer and therefore needs some adaptation. The same holds for the uncertainty analysis. The authors already provided feedback that an individual parameter uncertainty analysis or generating single error surfaces by Monte Carlo simulations does not make sense, as the study is based on monitoring data providing decent information on the ABAG factors. Nevertheless, this should be clearly stated, and the approach should also be critically discussed.
Finally, the reviewer requested benchmarking of the new Python code, and information about the benchmarking has been provided by the authors in the response. As this is important information that has not been reported in the current version of the manuscript, I recommend describing the testing and its outcome in detail.

AR by Kay Seufferheld on behalf of the Authors (20 Feb 2026): Author's response, Author's tracked changes, Manuscript

ED: Referee Nomination & Report Request started (23 Feb 2026) by Lutz Weihermueller

RR by Joris Eekhout (27 Feb 2026)

RR by Anonymous Referee #1 (13 Mar 2026)

ED: Publish as is (13 Mar 2026) by Lutz Weihermueller

ED: Publish as is (13 Mar 2026) by Rémi Cardinael (Executive editor)

AR by Kay Seufferheld on behalf of the Authors (16 Mar 2026)

Short summary

Soil erosion threatens global food security, yet modeling soil conservation remains challenging. We evaluated WaTEM/SEDEM (Water and Tillage Erosion Model/Sediment Delivery Model) in six highly instrumented micro-scale watersheds optimised for soil conservation using a GLUE (Generalized Likelihood Uncertainty Estimation) framework. The model captured the magnitude of very low sediment yields but showed limited accuracy for annual steps. However, it performed well over eight-year timeframes and larger spatial scales, demonstrating its suitability for strategic, long-term soil conservation planning.