Quantifying soil carbon in temperate peatlands using a mid-IR soil spectral library

Traditional laboratory methods of acquiring soil information remain important for assessing key soil properties, soil functions and ecosystem services over space and time. Infrared spectroscopic modelling can link and massively scale up these methods for many soil characteristics in a cost-effective and timely manner. In Switzerland, only 10 % to 15 % of agricultural soils have been mapped sufficiently to serve spatial decision support systems, presenting an urgent need for rapid quantitative soil characterization. The current Swiss soil spectral library (SSL; n = 4374) in the mid-infrared range includes soil samples from the Biodiversity Monitoring Program (BDM), arranged in a regularly spaced grid across Switzerland, and temporally resolved data from the Swiss Soil Monitoring Network (NABO). Given that less than 2 % of samples in the SSL originate from organic soils, we aimed to develop both an efficient calibration sampling scheme and an accurate modelling strategy to estimate soil carbon (SC) contents of heterogeneous samples from 0 m to 2 m depth from 26 locations within two drained peatland regions (HAFL dataset; n = 116). The focus was on minimizing the need for new reference analyses by efficiently mining the spectral information of the SSL. We used partial least squares regression (PLSR) together with a 5 times repeated, grouped by location, 10-fold cross-validation to predict SC ranging from 1 % to 52 % in the local HAFL dataset. We compared the validation performance of different calibration schemes involving (1) local models, (2) models using the entire SSL combined with local samples, commonly referred to as spiking, and (3) subsets of local and SSL samples optimized for the peatland target sites using the RS-LOCAL algorithm.
Using local and RS-LOCAL calibrations with at least 5 local samples, we achieved similar validation results for predictions of SC up to 52 % (R2 = 0.93 to 0.97, bias = -0.07 to 1.65, RMSE = 2.71 % to 3.89 % total carbon, RPD = 3.38 to 4.86 and RPIQ = 4.93 to 7.09). However, calibrations using RS-LOCAL only required 5 or 10 local samples for very accurate models (RMSE = 3.16 % and 2.71 % total carbon, respectively), while purely local calibrations required 50 samples for similarly accurate results (RMSE < 3 % total carbon). Of the three approaches, the entire SSL spiked with local samples for model calibration led to validations with the lowest performance in terms of R2, bias, RMSE, RPD and RPIQ. Hence, we show that a simple

The slight disadvantage is that often more sample preparation is needed compared to samples measured in the vis-NIR range.
For spectroscopy in the mid-IR, unlike in the NIR range for example, the soil has to be finely ground in order to optimize the signal-to-noise ratio (Guillou et al., 2015). This, however, makes it especially efficient to use prepared (legacy) soil datasets for mid-IR spectroscopy models.
In the soil spectroscopy modelling community, most current research efforts focus on minimizing the differences in performance between local (e.g. location- or field-specific) models and large-scale (e.g. national, continental or global) models. On the one hand, this may be because the choice of the statistical model itself only results in slight performance variability depending on the complexity of the soils. Traditional chemometric approaches (e.g. partial least squares regression (PLSR); e.g. Janik and Skjemstad, 1995), machine learning (e.g. regression tree methods; e.g. Clairotte et al., 2016; Dangal et al., 2019) and deep learning (e.g. convolutional neural networks; e.g. Padarian et al., 2019a, b) have all been used fairly successfully. On the other hand, the focus of these studies may be explained by the fact that in the past, large-scale models still tended to perform less accurately than small-scale models (Guerrero et al., 2016; Stevens et al., 2013). Two reasons for lower accuracy are a higher soil heterogeneity across larger spatial scales, which leads to a higher variability of spectral patterns, and the limitations of statistical models in dealing with such variability. Another reason is inconsistent sample preparation, measurement protocols and instruments (Nocita et al., 2015). Therefore, local models used to predict soil properties for a specific location or region were initially favored (Stevens et al., 2013; Guerrero et al., 2016; Sila et al., 2016; Viscarra Rossel et al., 2016a), but these had to be re-calibrated using new samples and laboratory analysis for every new region.
More recently, however, methods were developed to use large soil spectral libraries (SSL) to predict soil properties locally at new locations, further minimizing the time and expenses required for sampling and laboratory work. Currently, several countries have established SSLs using archived, legacy and new soil data, such as the Czech Republic (Brodský et al., 2011), France (Gogé et al., 2012; Clairotte et al., 2016), Denmark (Knadel et al., 2012), China (Shi et al., 2014), the United States (Wijewardane et al., 2018; Dangal et al., 2019), Brazil and Switzerland (Baumann et al., 2021). Continental SSLs, e.g. for Australia (Viscarra Rossel et al., 2008) or Europe (Stevens et al., 2013), as well as global SSLs have also been established (Viscarra Rossel et al., 2016a; ICRAF, 2020). The operational value of SSLs lies in the ability to pull representative information (either the actual soil spectra or learnt model "rules") from them, requiring fewer new local samples and less laboratory analysis. These methods can be summarized as spiking (Shepherd and Walsh, 2002; Brown, 2007; Wetterlind and Stenberg, 2010; Seidel et al., 2019), subsetting (Araújo et al., 2014; Lobsey et al., 2017), memory- or instance-based learning (Ramirez-Lopez et al., 2013; Gholizadeh et al., 2016), or transfer learning (Padarian et al., 2019a). Spiking can be defined as adding local soil samples to a general SSL. Subsetting can generally be defined as dividing the SSL into smaller partitions based on characteristic features (e.g. geographic regions, soil type, etc.) or a specific method. One such method is memory- or instance-based learning, in which soil samples similar or related to the target local samples are retrieved from memory and merged to calibrate a new model. Finally, transfer learning is the process of sharing intra-domain information and rules learnt by general models to a local domain (Pan and Yang, 2010).
In this study, we used the RESAMPLING-LOCAL, or RS-LOCAL, algorithm developed by Lobsey et al. (2017) because it combines several advantages of all four methods listed above. RS-LOCAL is a data-driven method to subset a SSL using spectra from local samples. The subset includes these local, or spiked, samples for calibration and may thus be summarized as instance-based transfer learning. In two case studies in Australia and New Zealand (Lobsey et al., 2017), the reduction of the SSL by means of local performance-based selection (RS-LOCAL) gave better results than constraining the SSL feature space by spectral similarity (memory-based learning).
We chose to specifically focus on soil carbon (SC) from peat soils in our mid-IR spectroscopic modelling approaches using different datasets for several reasons. Firstly, scientists agree that SC is an indispensable soil property for assessing agricultural lands (e.g. Noellemeyer and Six, 2015). Secondly, Cardelli et al. (2017) pointed out that spectroscopic modelling has almost only been used for mineral soils, stating the need for soil spectroscopy of more diverse datasets that include organic soils.
Thirdly, we argue that organic soil samples are currently underrepresented in SSLs and that this is a problem because the agricultural use of drained organic soils, or peatlands, is the subject of immense debate in multiple sectors of society. On the one hand, drained organic soils belong to the most fertile agricultural areas (Ferré et al., 2018), especially due to their high SOM content and the release of plant nutrients during mineralization. On the other hand, drained peatlands are a major source of greenhouse gas emissions (e.g. Parish et al., 2008; Joosten, 2010; Leifeld and Menichetti, 2018), are susceptible to wind and water erosion (Zobeck et al., 2013), cause subsidence of agricultural parcels through compaction and rapid mineralization, and are prone to flooding (Leifeld et al., 2011). As a result, often only a thin organic horizon above a geologic and/or waterlogged substrate remains. These factors have made crop production on such locations increasingly expensive; expenses may include drainage renovation or adding allochthonous sand to the soil, among other measures (Ferré et al., 2018).
Due to the ongoing discussion about optimizing the land use of drained organic soils between stakeholders with agricultural, socioeconomic and environmental interests, there is a need for using the advantages of mid-IR soil spectroscopic modelling to quantitatively characterize these soils. It is unknown whether current SSLs can ultimately be used to make location-specific land use decisions, particularly for small-scale heterogeneous regions made up of a variety of mineral and organic soils. In the current soil spectroscopy literature, there is to our knowledge no study about partitioning a SSL using RS-LOCAL with mid-IR spectroscopy, especially for a specialized organic soils dataset. Unlike Padarian et al. (2019a), who demonstrated the application of transfer learning at a continental scale, this study looks into the application of transfer models from a national to a local scale, specifically for peat soils.
The aim of this study is to compare mid-IR spectroscopic modelling approaches for SC from peat soils using different datasets: 1) a local dataset specifically from drained peatlands; 2) the Swiss SSL spiked with local samples; and 3) RS-LOCAL subsets containing local and representative SSL samples. The goal is to develop both an accurate modelling strategy for predicting SC ranging from 1 % to 52 % and an efficient calibration sampling scheme to minimize the number of new samples required.

Soil Data: The Swiss SSL and the HAFL Dataset
The current Swiss SSL in the mid-IR range consists of 3723 topsoil (0 cm to 20 cm) samples from 1094 locations from the Biodiversity Monitoring program (BDM; e.g. FOEN, 2018;Meuli et al., 2017) and 572 topsoil samples from 71 locations from the National Soil Monitoring Network (NABO; Figure 1 and Table 1; e.g. NABO, 2018;Gubler et al., 2015). The Swiss SSL is described in full detail in Baumann et al. (2021). Less than 2 % of the samples in the SSL originate from organic soils.
We introduce a dataset from the Bern University of Applied Sciences, School of Agricultural, Forest and Food Sciences (HAFL), which was set up using a purposive sampling design to specifically study drained peatlands and organic soils. This local "HAFL" dataset (n = 116) contains soil samples from 0 m to 2 m depth from a range of natural as well as disturbed histosols from 26 different locations in the "Seeland" and "St. Galler Rheintal" regions of Switzerland (Figure 1 and Table 1; IUSS Working Group WRB, 2014). These samples originate from either undisturbed and waterlogged organic horizons, mineralized organic horizons under agricultural use, horizons with sandy or calcareous substrate material or horizons containing a mixture of these characteristics.

Table 1. Summary statistics of all samples in the HAFL dataset, in this study also referred to as the local dataset, and the BDM and NABO datasets. The latter two together constitute the current Swiss SSL, which contains carbon reference measurements from 4295 samples from 1150 different locations (Baumann et al., 2021).

Mid-IR Soil Spectroscopy Measurements and Pre-Processing
All samples were dried, sieved (< 2 mm) and finely ground using a ball mill to maximize the signal-to-noise ratio (Guillou et al., 2015). The samples were measured with a VERTEX 70 ® FT-IR Spectrometer with a High Throughput Screening Extension (HTS-XT) from Bruker Optics (Massachusetts, USA). We used a spectral range of 7500 to 600 cm−1 and a spectral resolution of 2 cm−1 so that each spectrum comprised reflectance values at 6901 wavelengths. On each 24-well plate, a fixed gold panel as the reflectance background, three NABO standards and two subsamples of 10 different samples were measured using the HTS-XT extension. This means that ten samples with two different measurements were analyzed per plate, maximizing the signal-to-noise ratio by averaging the two measurements. Reflectance spectra were transformed to apparent absorbance and recorded as such. OPUS ® software was used for correcting atmospheric water and CO2.
We tested several pre-processing steps in order to increase the information content for modelling and reduce collinearity between consecutive wavelengths. We tested a Savitzky-Golay (SG) filter with first and second derivatives as well as first, second and third order polynomials (Savitzky and Golay, 1964). The window size of 35 variables (70 cm−1) was kept constant. Instead, we tested different resolutions by selecting either all variables, every 4th or every 8th variable to reduce collinearity and redundancy among predictors. For all final modelling approaches, we chose the combination of pre-processing steps that resulted in the lowest RMSE across the cross-validated calibration (see section 2.4 below).
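As an illustration of this type of pre-processing, the following minimal Python sketch applies a Savitzky-Golay first derivative with a second-order polynomial and then keeps every 4th variable. It is not the code used in this study, which relied on the simplerspec package in R. The function name `sg_first_derivative` and its parameters are hypothetical, the derivative is expressed per variable step rather than per cm−1, and window positions that do not fit entirely inside the spectrum are simply dropped.

```python
def sg_first_derivative(spectrum, window=35, step=4):
    """Savitzky-Golay first derivative (2nd-order polynomial), then thinning.

    Assumes equally spaced variables. For a symmetric window, the
    least-squares first derivative at the window centre has the closed-form
    weights j / sum(j^2), j = -half..half (valid for polynomial order 1 or 2).
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    if len(spectrum) < window:
        raise ValueError("spectrum is shorter than the filter window")
    half = window // 2
    offsets = range(-half, half + 1)
    norm = sum(j * j for j in offsets)
    weights = [j / norm for j in offsets]
    # Only centres with a complete window are evaluated (edges are dropped).
    derived = [
        sum(w * spectrum[centre + j] for w, j in zip(weights, offsets))
        for centre in range(half, len(spectrum) - half)
    ]
    # Keep every `step`-th variable to reduce collinearity among predictors.
    return derived[::step]
```

Setting `step=1` or `step=8` would correspond to the other two resolutions mentioned above.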
We used the simplerspec package for the R statistical language for reading spectra and metadata from Bruker OPUS ® binary files into an R list, gathering spectra into a list-column data structure, resampling spectra to new wavenumber intervals, averaging spectra of replicate scans and pre-processing the raw spectra with the parameters described above (Baumann, 2020).

Reference Chemical Analysis
In order to guarantee that reference data for all mid-IR spectroscopy models were measured using the same standard soil chemical analysis methods, we prepared and measured SC in the local HAFL dataset using the same procedures as for the Swiss SSL (Baumann et al., 2021). Briefly, all the dried, sieved (< 2 mm) and finely ground samples were measured for total carbon content by dry combustion using the CHN628 Series Elemental Determinator ® from the Laboratory Equipment Corporation (LECO Corp., St. Joseph, MI, USA). We used a soil standard sample with a mean total carbon content of 2.372 %.
In order to compare measurement accuracy and agreement between the two different CHN628 Series Elemental Determinator machines used for the Swiss SSL and HAFL samples, representative samples were selected using a two-step process. The data were separated into two clusters using the K-Means clustering algorithm (Hartigan and Wong, 1979), followed by the Kennard-Stone (KS) algorithm applied to each cluster separately (Kennard and Stone, 1969). The KS algorithm is a deterministic approach that uses Euclidean or Mahalanobis distance to select a set of samples uniformly distributed in principal component (PC) space (Kennard and Stone, 1969). Within the Swiss SSL, NABO and BDM samples used the same machine (Baumann et al., 2021).

Spectroscopic Modelling
We resampled the data using a 5 times repeated, grouped by location, 10-fold cross-validation for all of our models to determine the optimal number of components in model tuning as well as to evaluate the model performance using the hold-out samples.
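The resampling scheme can be sketched as follows. This is a simplified Python illustration, not the implementation used in this study: the function name `grouped_repeated_kfold`, the round-robin assignment of locations to folds and the seed handling are assumptions made for the example. The key property is that all samples from one location are always held out together.

```python
import random


def grouped_repeated_kfold(locations, n_folds=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) with locations kept intact.

    `locations` holds one location label per sample. In every repeat the
    unique locations are reshuffled and dealt round-robin into the folds,
    so no location ever appears in both the training and the hold-out set.
    """
    rng = random.Random(seed)
    unique_locations = sorted(set(locations))
    if len(unique_locations) < n_folds:
        raise ValueError("need at least as many locations as folds")
    for repeat in range(n_repeats):
        shuffled = unique_locations[:]
        rng.shuffle(shuffled)
        fold_of = {loc: i % n_folds for i, loc in enumerate(shuffled)}
        for fold in range(n_folds):
            test_idx = [i for i, loc in enumerate(locations) if fold_of[loc] == fold]
            train_idx = [i for i, loc in enumerate(locations) if fold_of[loc] != fold]
            yield repeat, fold, train_idx, test_idx
```

With 116 samples from 26 locations, each repeat produces 10 hold-out sets of two or three locations, and every sample is predicted exactly once per repeat.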
The predicted carbon content was calculated in each model using the hold-out values of the measured, pre-processed and averaged mid-IR spectra. For the final models, the calculated average (mean) predictions over these five repeats with the chosen number of components are shown. To avoid overfitting, we used the "one standard error" rule: instead of choosing the tuning parameter associated with the lowest RMSE, we chose the simplest model within one standard error (SE) of the empirically optimal model (e.g. Hastie et al., 2009).
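The "one standard error" rule can be illustrated with a short Python sketch. The function name `one_se_components` and the input layout (a dictionary mapping each number of components to its list of resampled RMSE values) are assumptions made for this example; the SE is taken here as the sample standard deviation of the resampled RMSE values divided by the square root of their count.

```python
import math
import statistics


def one_se_components(rmse_by_components):
    """Pick the smallest number of components within one SE of the best.

    `rmse_by_components` maps a number of components to the RMSE values
    obtained on the individual resamples (at least two per entry).
    """
    summary = {}
    for n_comp, values in rmse_by_components.items():
        mean_rmse = statistics.fmean(values)
        se = statistics.stdev(values) / math.sqrt(len(values))
        summary[n_comp] = (mean_rmse, se)
    # Empirically optimal model: lowest mean RMSE.
    best = min(summary, key=lambda n_comp: summary[n_comp][0])
    threshold = summary[best][0] + summary[best][1]
    # Simplest model whose mean RMSE does not exceed the threshold.
    return min(n for n, (mean_rmse, _) in summary.items() if mean_rmse <= threshold)
```

If, for example, three components give the lowest mean RMSE but two components lie within one SE of it, the rule selects two.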
We used partial least squares regression (PLSR; e.g. Wold, 1975; Wold et al., 1983, 1984, 2001) to predict SC. We tuned the PLSR using 1 to 10 components, and the final number of components was chosen according to the "one SE" rule. All spectroscopic models were evaluated for their performance using the root mean squared error (RMSE), the ratio of performance to deviation (RPD; Williams and Norris, 1987, Equation 5), the ratio of performance to interquartile range (RPIQ), the bias and R2. The RPD is suitable for normal distributions while the RPIQ is more suitable for non-normal distributions (Bellon-Maurel et al., 2010). Since different definitions of R2 exist, we used the equation of the mean squared error skill score (SSmse; Wilks, 2011), also known as the model efficiency coefficient (MEC; Nash and Sutcliffe, 1970), to indicate the R2.
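For reference, the performance metrics named above can be computed as in the following Python sketch. It rests on several assumptions that are not specified in the text: bias is taken as the mean of predicted minus observed values, the RPD uses the sample standard deviation of the observed values, and the quartiles for the RPIQ come from the default method of `statistics.quantiles`. The function name `validation_metrics` is hypothetical.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Return bias, RMSE, R2 (as MEC / MSE skill score), RPD and RPIQ."""
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    bias = sum(residuals) / n  # assumed sign convention: predicted - observed
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    mean_obs = sum(observed) / n
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - (mse * n) / ss_total  # 1 - SS_residual / SS_total
    sd_obs = statistics.stdev(observed)
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return {
        "bias": bias,
        "rmse": rmse,
        "r2": r2,
        "rpd": sd_obs / rmse if rmse > 0 else math.inf,
        "rpiq": (q3 - q1) / rmse if rmse > 0 else math.inf,
    }
```

Because the R2 is computed as a skill score, it can become negative when a model predicts worse than the mean of the observed values.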
Given the large range of SC in these datasets, we also assessed the RMSE, RPD and RPIQ by increments of 10 % SC for all model validations. Hence, we calculated these metrics for all samples for which the measured SC values are between 0 % and 10 % SC, 10 % and 20 % SC and so on. In this manner, we expected to detect for which range of SC the prediction error increases or decreases.
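A minimal Python sketch of this binned assessment, shown for the RMSE only, is given below. The function name `rmse_by_sc_bin` is hypothetical, and the handling of bin edges is an assumption: a sample with exactly 10 % measured SC is assigned to the 10 % to 20 % bin.

```python
import math
from collections import defaultdict


def rmse_by_sc_bin(observed, predicted, width=10.0):
    """Return {(lower, upper): RMSE} for bins of measured SC of given width."""
    squared_errors = defaultdict(list)
    for obs, pred in zip(observed, predicted):
        # Bins are defined on the measured (reference) value, not the prediction.
        lower = math.floor(obs / width) * width
        squared_errors[lower].append((pred - obs) ** 2)
    return {
        (lower, lower + width): math.sqrt(sum(errors) / len(errors))
        for lower, errors in sorted(squared_errors.items())
    }
```

Bins without any samples are omitted from the result rather than reported with a missing value.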
One advantage of repeated cross-validation is that model imprecision can easily be assessed for each prediction (Ŷ) using the standard deviation (SD) of the predictions around their mean (Ȳ) across the 5 repeats. In this study, the SD is shown as error bars for each prediction (Ŷ) and was also calculated for each overall summary statistic assessing the model performance.

Local Models
Spectroscopy becomes time and cost efficient when minimizing the amount of laborious chemical reference analysis. Therefore, it makes sense to split the local HAFL data into a calibration and a validation subset (Figure 2). In this manner, the validation can be used to determine how many samples are needed to accurately and precisely calibrate a model. We selected n = 15, 20, 25, 30, 40, 50 and 58 representative local HAFL calibration samples by applying the KS algorithm to the first 5 PCs. We did not build models using fewer than 15 samples. These selected samples were then used to calibrate iterations of PLSR models (Figure 2).
In an application of the method described, reference data would only have to be measured for the selected samples used for calibration. Each calibrated model iteration was validated using the same 58 remaining local samples never used for any of the calibration iterations.
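The selection of representative calibration samples with the Kennard-Stone algorithm can be sketched in Python as follows. This is an illustrative implementation using Euclidean distance, not the code used in this study; the function name `kennard_stone` is hypothetical, and `points` stands for the scores of each sample on the retained PCs.

```python
import math


def kennard_stone(points, n_select):
    """Return the indices of `n_select` points chosen by Kennard-Stone.

    Starts with the two most distant points, then repeatedly adds the point
    whose distance to its nearest already selected point is largest.
    """
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of points")
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: math.dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    # Distance of every point to its nearest selected point.
    nearest = [
        min(math.dist(p, points[first]), math.dist(p, points[second]))
        for p in points
    ]
    while len(selected) < n_select:
        candidate = max(
            (i for i in range(n) if i not in selected),
            key=lambda i: nearest[i],
        )
        selected.append(candidate)
        for i in range(n):
            nearest[i] = min(nearest[i], math.dist(points[i], points[candidate]))
    return selected
```

Because the procedure is deterministic, a selection of 20 samples always contains the 15 samples selected first, which makes nested calibration sets of increasing size straightforward to construct.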

SSL Spiked Models
In the next step, we utilized the Swiss SSL in iterations of model calibrations to see if predictive performance can be improved while further reducing the number of new local samples needed for reference analysis (Figure 2). Also, we expected a large amount of additional data from the SSL to improve model robustness and reliability (Lobsey et al., 2017). With the help of all SSL samples containing carbon reference data (n = 4295), we were able to include iterations of PLSR calibrations spiked with as few as n = 3, 5, 7 and 10 local HAFL samples. Further iterations with the same n = 15, 20, 25, 30, 40, 50 and 58 local HAFL samples as for the local models were also calibrated. Just as with the local models, each iteration was validated using the same 58 remaining local samples never used for any of the calibration iterations.

Models using RS-LOCAL Subsets
In the third approach, we tested whether representative subsets of the SSL selected using the RS-LOCAL algorithm improved the accuracy of predicting SC of the local HAFL samples (Figure 2). The RS-LOCAL algorithm was used to data-mine the SSL for samples suitable for local or location-specific calibrations according to similarities of spectral signatures between the local HAFL and SSL soil samples (Lobsey et al., 2017). Local HAFL samples were selected in the same manner as in local and SSL spiked models, resulting in iterations of the same samples as before (Figure 2). The set of local samples is defined in the RS-LOCAL algorithm as m (Lobsey et al., 2017); so in our case, m = 3, 5, 7, 10, 15, 20, 25, 30, 40, 50 and 58 for each respective iteration.

RS-LOCAL used the m data to resample, evaluate and then remove irrelevant data from the SSL so that only the most appropriate data for deriving a local calibration remained in a new SSL subset K. K and m together formed an RS-LOCAL dataset, which was used for a calibration. In addition to the SSL and m data, three parameters were needed for RS-LOCAL (Lobsey et al., 2017): k: the number of SSL samples randomly selected in the resampling step, and also the target number of SSL samples

Figure 2 (caption partly recovered): 2) SSL spiking and 3) RS-LOCAL approaches. The same 58 local HAFL samples never used in calibration were used for validating each modelling approach and iteration. Note different scales of the y-axis.

Spectral Pre-processing and Analysis of Local HAFL Spectra
The lowest RMSE was achieved in the cross-validated calibration when using a Savitzky-Golay (SG) filter with a first derivative and second-order polynomial (Savitzky and Golay, 1964)

The raw and pre-processed measured mid-IR spectra of the local HAFL soil samples (n = 116) were clearly distinct in relation to the SC content, ranging from 1 % to 52 % (Figures 3a & b). Mineral and organic soil samples showed different absorbance patterns in both the raw and pre-processed mid-IR spectra. Pre-processed absorbance values showed a clear pattern according to the SC content almost across the entire spectrum and particularly, but not exclusively, around 800 cm−1, 1050 cm−1, 1900 cm−1, 2050 cm−1, 2900 cm−1 and 3600 cm−1.
A PCA of the pre-processed spectra of the local HAFL samples clearly revealed that the distribution of the soil samples varied with the SC content (Figure 4). The first two PCs together explained 53.5 % of the total variance in the pre-processed spectra. Figure 4 also exemplifies one possible local calibration scheme, whereby the HAFL data is split into 20 representative samples used for calibration, 58 samples used for validation and the remaining samples. We also compared the similarity of soil samples from different depths at the same location by coloring the first two PCs by location (Appendix C). However, the pre-processed spectra from the same locations generally showed little similarity; there was no distinct pattern as there was for the SC content.

Comparing Local HAFL to SSL data
When comparing the three datasets, we found that the local HAFL dataset showed a different SC distribution and covered a different range of soil variability than the Swiss SSL (Table 1 and Figure 5). The relatively small HAFL dataset originating from peaty soils had a uniform continuous distribution whereas the BDM and NABO data had a positively skewed distribution with regard to SC (Table 1). Although the BDM dataset contained the highest single value of SC, over 98 % of soil samples in the Swiss SSL originated from mineral soils. In contrast, more than half of the HAFL samples ranging from 1 % to 52 % were classified as organic soils. The first and second PCs, which together covered 40.3 % of the total variance, also revealed a clear overlap of pre-processed mid-IR absorbance variance for the BDM and NABO datasets (Figure 5). There was less overlap in the variance of the pre-processed mid-IR absorbance values of the Swiss SSL and the HAFL dataset. In PCA space, the BDM and NABO datasets show a similarly shaped convex hull and almost identical centroid (mean), whereas the centroid is very distinct for the HAFL dataset (Figure 5). In Figure 5, 122 samples, of which 20 are the local HAFL samples, were used as a RS-LOCAL subset to calibrate a model.

Predicted SC Using 1) Local, 2) SSL Spiking and 3) RS-LOCAL Subsets
We predicted SC content (Ŷ) of the HAFL, BDM and NABO datasets using mid-IR soil spectroscopic PLSR models and compared them to the reference chemical measurements (Y). This is exemplified for each modelling approach in the case of 20 local HAFL samples in Figure 6. For models using an RS-LOCAL subset, we found best overall validation results using k =

For calibration, all modelling approaches showed a high fit (R2 > 0.9) and low overall bias (≈ 0) (Figure 6a). The SSL spiking calibration scheme showed the lowest RMSE value. However, the accuracy of the spiked SSL calibration decreased

We compared R2, bias, RMSE, RPD and RPIQ as indicators of overall model performance depending on the number of local HAFL samples in each calibration scheme for all three modelling approaches: 1) local, 2) SSL spiking and 3) RS-LOCAL subsets (Figures 7 & 8). R2 was a poor indicator of model performance and did not show substantial differences between modelling approaches and calibration schemes (Figure 7). Local models showed the lowest bias, regardless of the number of samples used during calibration (Figure 7). The bias of validations of spiked SSL models decreased only slightly as the number of local samples was increased and remained large overall (> 3). Bias was reduced significantly in validated models of RS-LOCAL subsets when at least 5 local samples were included and continued to decrease slightly with increasing number of local samples. Models with 5 and 10 samples stand out with very little bias.
Model accuracy (RMSE) varied considerably between calibrations and validations when SSL samples were used during calibration (Figure 7). There was less of a difference between calibration and validation in local models, where the RMSE decreased with increasing number of local samples in model calibration. As with model bias, the RS-LOCAL subsets again showed a threshold of at least 5 local samples for the RMSE of model validations to decrease significantly. Model validations of local and RS-LOCAL subsets showed very similar accuracy overall (RMSE ≈ 3 % total carbon). However, with RS-LOCAL subsets only 5 or 10 local samples were required to achieve an accuracy of RMSE = 3.16 % or 2.71 % SC, respectively (Figure 7 & Appendix F). In contrast, local modelling approaches required 50 local samples to achieve a RMSE of < 3 % SC.
Local and RS-LOCAL models also revealed similar and better model performance than SSL spiking modelling approaches ( Figure 8). Both RPD and RPIQ gradually increased in local models with increasing numbers of local HAFL samples in calibration. As with bias and RMSE, RPD and RPIQ values also indicate high prediction accuracy with as few as 5 or 10 local HAFL samples when calibrating with RS-LOCAL subsets (RPD = 4.08 and 4.66, RPIQ = 5.96 and 6.81, respectively).

Prediction accuracy in model validations was highest for mineral soils, i.e. SC between 0 % and 20 %, for all modelling approaches (Figure 9). Local and RS-LOCAL modelling approaches showed better predictive performance for samples with higher SC. In local modelling approaches, samples with 0 % to 20 % and especially 10 % to 20 % SC were predicted increasingly well with increasing number of local HAFL samples in model calibration. Soils between 0 % and 10 % SC were predicted with the highest accuracy with the SSL spiking approach. However, samples with > 10 % were predicted with

We found that firstly, mid-IR spectra can be used to predict SC up to 52 % with R2 ≥ 0.94, negligible bias and RMSE = 2.8 % to 3.6 % total carbon using validated local PLSR models (RPD = 3.6 to 4.69; RPIQ = 5.32 to 6.85). Secondly and most importantly, time-consuming and expensive field and laboratory measurements can be reduced for new locations when using a SSL together with RS-LOCAL. In our study, only 10 local HAFL samples were required in a RS-LOCAL subset to achieve similar validation performance as with at least 50 local samples in a local model (R2 = 0.96, bias ≈ 0.2, RMSE ≈ 2.7 % total carbon, RPD = 4.86 and RPIQ = 7.09; Figures 7 & 8). This is a major improvement over local models without a SSL, not only because it reduces field and laboratory expenses, but also because no reliable local model can be calibrated using so little data. Furthermore, a SSL subsetting method such as RS-LOCAL combined with a simple model such as PLSR is easy to understand and requires little computational power compared to alternative machine or deep learning approaches (e.g. Padarian et al., 2019a, b).

The measured and pre-processed mid-IR soil spectra and PCA results of all local HAFL samples revealed high correlation between the spectral absorbance values and a broad range of SC content overall (Figures 3 & 4). According to past studies that assessed variable importance, we assumed that soil texture and mineralogical and organic composition influence mid-IR spectral absorbance the most (Madari et al., 2006; Bornemann et al., 2010; Calderón et al., 2011). SOC, for example, is known to be related to a variety of bands that represent absorptions due to organic molecules such as proteins with C=O, C-O and N-H bonds (Viscarra Rossel and Behrens, 2010). Local HAFL samples containing both a high amount of organic compounds as well as carbonates created distinct absorbance bands around 1450 cm−1, 1460 cm−1, 2855 cm−1 and 2930 cm−1 for aliphatic C-H constituents (Madari et al., 2006) or around 1320 cm−1 for hydroxyl groups bonded to carbon (C-O-H) (Bornemann et al., 2010) and around 2500 cm−1 for carbonates (Calderón et al., 2011). Future studies should investigate whether overlapping of spectral signals for organic and mineral components increases at high SC concentrations, implying that spectral absorbance patterns above a certain threshold of SC no longer differentiate substantially.

RS-LOCAL Improves Model Performance and Increases Efficiency
The RS-LOCAL approach using representative local and SSL samples was found to be the best of the three compared approaches. It reduces the number of reference measurements that need to be made at new locations to as few as 5 samples.
In addition, the RS-LOCAL approach helps remove the strong bias of spiked SSL calibrations (Figures 6 & 7), increasing the under-estimated predictions of SC in the upper range. Finally, the additional samples provided from the SSL also reduce uncertainty of SC predictions of resampled repeats in the PLSR models, as can be seen from the smaller error bars of the residuals.
Predictions of SC from 0 % to 10 % became more accurate when using additional samples from the SSL in the RS-LOCAL subset ( Figure 9).
We postulate that SC prediction accuracy of organic soil samples using SSL-derived models may be improved in future studies by adding more peat soil data to the Swiss SSL (Baumann et al., 2021), specifically samples at different decomposition and mineralization stages. One of the most important characteristics of a high-quality SSL is that it contains the highest possible variation of soil characteristics within its designated area (Viscarra Rossel et al., 2016a; Figure 5).

The use of our modelling approach, PLSR of RS-LOCAL subsets, to predict soil properties at new locations in future studies and applications depends on the level of accuracy needed. For organic soils on a farm or landscape level, an accuracy of approximately 2 % to 3 % total carbon is suitable for quantifying SC. On the one hand, this range of accuracy is not useful for mineral soils, which, in Switzerland for example, contain average (mean) topsoil SOC concentrations of 2 % on arable locations and 2.5 % on temporary grassland locations (Leifeld et al., 2005). On the other hand, our validation results using the SSL spiking approach show that samples with 0 % to 10 % SC were consistently predicted with a RMSE < 1 % SC and RPD above 2 (Figure 9). This implies that when targeting mineral agricultural soils, mid-IR spectroscopic models making use of a SSL deliver the required accuracy for applications and end-users. These findings are supported by Baumann et al. (2021).
To our knowledge, few studies have predicted SC in organic soils up to 52 % total carbon using mid-IR soil spectroscopy without splitting model calibration for mineral and organic soils. One exception is the mid-IR SSL of the United States, which contains about 2000 organic soil samples (Wijewardane et al., 2018; Dangal et al., 2019). Nocita et al. (2014) predicted SOC separately for croplands, grasslands, woodlands and organic soils, using about 20,000 samples from the Land Use/Cover Area frame Statistical Survey (LUCAS) across Europe and Vis-NIR spectroscopy. For the model using only organic soil data with a range of 12.0 % to 58.68 % SOC, predictions were less accurate (RMSE = 5.114 % SOC) than in this study (Nocita et al., 2014).
One advantage of the RS-LOCAL-PLSR approach used here is that the statistical modelling is simple and produces easy to understand models compared to other transfer and deep learning approaches (Padarian et al., 2019a, b). Using spectral- or model-based information from SSLs by spiking, subsetting and memory, instance and transfer learning may even be beneficial for parts of the world lacking legacy soil data or funding to establish their own SSL. In other words, a SSL of one region may be used to predict locally for another region that does not have a SSL. This is comparable to the concept of Homosoil in digital soil mapping (Mallavan et al., 2010).

However, there are still some drawbacks. As mentioned by Padarian et al. (2019a), spiking and subsetting are dependent on the size of local and global datasets and may still bias the predictions towards the local dataset rather than fully using valuable global information, which generates less robust models. Although transfer learning of model "rules", or network weights, showed promising results on a continental scale (Padarian et al., 2019a), it has not been tested when transferring national spectral knowledge to a field scale.

Ultimately, soil spectroscopy has the potential to speed up quantification of soil properties for soil mapping and monitoring.
Several studies have shown how this affects soil maps and associated uncertainties (Brodský et al., 2013; Viscarra Rossel et al., 2016b; Ramirez-Lopez et al., 2019), but only on a farm scale. Soil spectroscopy still remains to be implemented in a large-scale soil information system.

This study reveals that, if adequately mined, the information in a SSL is sufficient to predict soil carbon of a new study region with very different soil characteristics. Whereas past spectroscopy studies mostly focused on mineral soils, these model validations of SC ranging from 1 % to 52 % show that using as few as 5 new samples in combination with RS-LOCAL and a SSL yields promising results. This approach decreases the time and cost of field and laboratory soil analysis, and reduces both the bias of large-scale spectroscopy or SSL spiked models and the uncertainty of small-scale local models. Including more organic soil samples in the Swiss SSL will make it more robust for future modelling applications (Baumann et al., 2021). This case study for assessing SC in drained peatlands under agricultural management shows that an operative SSL is useful for scaling up quantitative soil information over space and time.

Appendix C: Variance of Local HAFL Samples Over Depth

Local HAFL soil samples from different depths of the same location were diverse. This may be due to pedogenetic formation conditions unique to peatlands near bodies of water as well as anthropogenic influence. The "Seeland" and "St. Galler Rheintal" regions are characterised by an extreme diversity of intact peat, decomposed and mineralised peat, calcareous lacustrine sediments and fluvial sand, silt and clay deposits depending on past river flow conditions (Bader et al., 2018; Burgos et al., 2018). Anthropogenic influences such as changing the course of and channeling rivers, lowering lake and groundwater tables and draining peatlands further complicate soil characterisation. These conditions create a mosaic of extremely heterogeneous soil characteristics that vary vertically depending on soil depth as strongly as they vary horizontally across the entire three study areas that are part of the HAFL dataset (Figure C1).

Figure D1. R2, bias and RMSE (% total carbon) of the validation of RS-LOCAL modelling approaches using different numbers of local HAFL samples in model calibration, when altering the RS-LOCAL parameter k. RMSE is on a logarithmic scale.

Appendix E: Geographical Position of Chosen Samples by RS-LOCAL
We mapped the locations from which RS-LOCAL selected samples from the SSL for calibration together with 20 local HAFL samples (n = 334, k = 300; Figure E1). There appeared to be no spatial correlation between chosen RS-LOCAL locations and geographical distance from the local HAFL locations. In other words, the locations chosen by RS-LOCAL suggest that the SSL samples that are spectrally relevant for predicting SC in local HAFL samples are not confined to geographically nearby areas. This may be linked to the heterogeneity of soils found in these drained peatlands: in between layers of organic soils, sampled soil layers also contained geologic substrate material from lacustrine carbonates, dense clay or fluvial sand depositions. We speculate that this soil and spectral diversity at local HAFL sampling locations may explain why RS-LOCAL even selected relevant SSL samples originating from the Alps. This ultimately suggests that RS-LOCAL is able to use segments of soil spectra from a variety of similar but also dissimilar locations for prediction of new local soil samples.

Figure F1. Predicted (Ŷ) vs. observed (Y) total carbon content [%] by calibrating a PLSR of a RS-LOCAL subset with 5, 7 and 10 local HAFL samples (a) and validating it with 58 local HAFL samples (b). The error bars signify the SD; where none are present, the components remained identical across 5 repeats and thus SD = 0.