I thank the authors for the responses they provided to each of my comments. While I was convinced by some responses (e.g. about DOC concentration, or the two scenarios of temporal pH and SOC trends for long-term ecotoxicological predictions), I remain very doubtful about what is, to me, the core of the paper, i.e. the modelling of total and labile metal concentrations in soil over time.
To me, an adequate prediction of total metal concentrations in soil over time is a vital prerequisite to an adequate prediction of the labile metal pool, particularly when the latter is then used to predict ecotoxicological consequences. Indeed, total metal concentrations in contaminated soils remain the primary (and sometimes even a reasonably good) indicator for further risk assessment (see e.g. Mossa et al. 2020, https://doi.org/10.1016/j.scitotenv.2020.137441). However, the simulations of total metal concentrations over time provided by the authors in figure S6 show that predictions barely match the observations, with very large discrepancies both in the absolute values predicted and in the trends simulated (increasing vs. decreasing). Given such large discrepancies, it is surprising that the predictions of labile metal concentrations were relatively good, and it raises the question of how they can be so good while total metal concentrations are poorly predicted. Something does not match here. As already underlined, the case of Cu is even more worrying and should definitely be removed. Overall, the IDDM is an interesting approach, but I think that the dataset from the Zofe used here to assess the relevance of its application to the study of long-term soil contamination by organic amendment applications is an issue. The trends in soil metal concentrations observed are at odds with the bulk of the literature, which shows an increasing pattern of metal concentrations in soil over time, and this discrepancy with the literature is not clearly characterised and discussed (only disputable hypotheses, for Cu for instance). In brief, the tool is relevant, but the present case study is not, due to too many uncertainties and to non-representative, unexplained patterns in the dataset.
Secondly, the fact that the authors did not have dedicated measurements to validate the modelling approach used to account for lateral mixing is worrying. Considering the simulations presented in figure 5 for the Idealised trend, it is really not obvious that lateral mixing provides an improvement in the fit to the observed data (except for SS). Accordingly, validating the modelling approach used to account for lateral mixing against a dedicated, complementary dataset seems to me necessary.
Finally, I am not convinced by the statement about the value of the FTIR and XRD datasets. I do not see the necessity of using the FTIR data to confirm the decreasing trend of SOC over time (FTIR is not an adequate tool for quantitative estimation of SOC). Also, showing that the nature of SOC differs between treatments does not give any practical information about the potential impact of these differences on the evolution of metal lability in soil. Finally, XRD is not a very sensitive technique, so demonstrating a change in the mineralogy of a soil sample requires very substantial changes. In the present case, do the authors have factual data supporting that the amounts of inorganic phases added by organic amendments can be high enough to quantitatively alter soil mineralogy? Beyond that, in my view, both the FTIR and XRD datasets are extra datasets that do not provide additional relevant information (i.e. factual and quantitative) for the purpose of the paper, which is to simulate metal lability over time.