Short-range-order minerals as powerful factors explaining deep soil organic carbon stock distribution: the case of a coffee agroforestry plantation on Andosols in Costa Rica

Soil organic carbon (SOC) constitutes the largest terrestrial C stock, particularly in the Andosols of volcanic areas. Quantitative information on distribution of SOC stocks is needed to construct a baseline for studying temporal changes in SOC. The spatial variation of soil short-range-order minerals such as allophane usually explains the variability of topsoil SOC contents, but SOC data for deeper soil layers are needed. We found that within a 1 km2 Costa Rican basin covered by coffee agroforestry, SOC stocks in the upper 200 cm of soil were highly variable (24 to 72 kgCm−2). Topsoil SOC stocks were not correlated with SOC stocks present in deeper layers. Diffuse-reflectance mid-infrared (MIR) spectroscopy made possible the analysis of a large number of samples (69 soil profiles, i.e. 598 soil samples) for ammonium-oxalate and sodium-pyrophosphateextractable forms of Al, Fe, and Si, as well as SOC content and bulk density. Using the MIR spectra, we identified two different soil materials, which were identified as allophanic and halloysitic soil material. Allophanic soil occurred on top of the halloysitic soil. The thickness of the allophanic soil material, rich in SRO minerals and related to a young andic A horizon, explained the variability of SOC. This study illustrates that knowledge of topography and pedogenesis is needed to understand and extrapolate the distribution of SOC stocks at landscape scales.

additional hypothesis: that signatures of SRO minerals in MIR spectra of soil samples would be useful proxies to predict type of soil material (Andic vs Halloysitic soil material), SOC and bulk density (Bd).
Several authors have shown that diffuse-reflectance MIR spectroscopy is a time-and cost-effective analysis to quantify SOC contents. Therefore, MIR spectroscopy has become increasingly popular for spatial mapping of SOC (Ben-Dor et al., 2009;75 Clairotte et al., 2016;Nocita et al., 2015;Visacarra Rossel et al., 2016). Especially in the MIR region, each of a soil's mineral constituents affects spectra in a characteristic way. For example, absorbance peaks of allophane and imogolite, two SRO minerals that are specific to Andosols-are near 1000 cm -1 . In contrast, the absorbance peaks of polymerized silicates are near 350 cm -1 (Parfitt, 2009). As SRO minerals control soil Bd (Shoji et al. 1996) and SOC content (Torn et al., 1997), SRO signatures in MIRS spectra may be useful proxies for soil Bd and SOC. Therefore, MIR spectroscopic analysis could replace soil extractions (Janik et 80 al., 1998) and be used for classifying soil samples as Andosols or non-Andosols. Based upon those classifications, MIRS could be appropriate for spatial mapping of SROs and SOCs in volcanic areas where soil age, type, and andic properties vary. In summary, the three hypotheses tested were (1) Spatial distribution of SOC stocks at depths down to 200 cm can vary dramatically in volcanic areas, even within a small watershed; (2) Surface SOC stocks in volcanic areas are not reliable predictors of stocks that might exist down to depths of 200 cm; and (3) MIRS is an effective and reliable technique for classifying soil materials according to some 85 characteristics of andic soils associated with contents of SRO minerals.

Site description
Previous authors have studied the site and described it in detail (Kinoshita et al. 2016, Gómez-Delgado et al., 2011. It is a 0.9 km 2 watershed located in the Central-Caribbean area of Costa Rica. More specifically, it is located within the Aquiares Coffee Farm 90 (83° 43' 35"W; 9° 56' 35" N), in the Reventazón river basin on the slope of the Turrialba volcano. The site's elevation ranges from 1020 to 1280 m a.s.l., and the mean slope is 11.31°. Mean annual precipitation was 3014 mm between 1973 and 2009. The climate is tropical humid without a dry season; the mean annual temperature is 19.5°C. According to Mora-Chinchilla (2000), volcanic avalanche deposits form the geology of the area, which was originally produced by the collapse of a 1.3-km-wide strip of the southeastern slope of the Turrialba volcano. The Turrialba deposits are mainly andesitic volcanic ashes. Indications of lava flows, 95 agglomerates, lahars, and ashes are also present. Thin ash deposits still occur regularly; the last was in mid-2016. The site's soils are classified as Andosols (Payan et al., 2009), with a pH 5 at the surface and 6 below 1 m (data not shown).
Before the introduction of the coffee in 1975, the watershed's lands were occupied by housing and gardens-including a cardamom plantation-all of which had replaced a pristine forest. Since 1975, the vegetation has consisted of coffee trees (Coffea arabica L., var Caturra, 6,300 trees ha -1 ) and Erythrina poeppigiana leguminous shade trees at a density of 7.4 trees ha -1 . The Aquiares farm 100 is intensively managed, with regular pruning and applications of fertilizer (214 ± 44 kg N ha -1 yr -1 ). Its management of pests and weeds complies with Rainforest Alliance TM standards. Coffee yields were1375 ± 341 kg green coffee ha -1 yr -1 (1994-2011).

Soil-profile samples for MIRS and chemical analyses
Taking into account Kinoshita et al.'s (2016) data on the spatial variation of SOC contents in the study site's topsoils, we chose 69 105 sampling points, all of them within inter-row of coffee plants ( Fig. 1). At each location, after removing surface plant residues, we used a hand-driven, 5-cm-diameter steel auger to extract soil profile down to the lithic contact or 200 cm. Each soil profile thus consisted of four to ten 20-cm-thick soil samples, depending on soil depth at each location. The soil samples (a total of 598) were then oven-dried at 40°C for 72 hours and ground to <200 µm for MIRS and chemical analyses.

Samples for Bd determination 110
Because digging soil pits for bulk density is tedious, and because the spatial variability of Bd may be less than that of SOC contents, at least in non-Andosols (Don et al., 2007), we collected samples for Bd determination at only seven of the 69 locations identified in Sect. 2.2.1. The seven locations were chosen because of their locations and the spectral characteristics of their corresponding MIRS samples (See Sect. 2.4.1.) For logistical reasons, the soil pits for bulk density could not be dug at exactly the same points where MIRS samples had been obtained. However, every pit was dug at a distance of less than 100 cm from the corresponding 115 MIRS-sample location. Each pit was dug at depth intervals of 20 cm to a depth of 200 cm. Four replicates were sampled from each soil layer of each pit.

Acquisition and pre-processing of MIR spectra
For each of the 598 soil samples, we acquired a reflectance spectrum in the MIR region at 934 wavenumbers between 4000 and 400 cm −1 , at 3.86 cm −1 intervals. The spectrophotometer (a Fourier transform Nicolet 6700, by Thermo Fischer Scientific, Madison, 120 WI, USA) was equipped with a silicon carbide source, a Michelson interferometer as dispersive element, and a deuterated triglycine sulphate detector. Soil samples were placed in a 17-well plate, then scanned using an auto-sampler (soil surface area scanned: ca. 10 mm²). The resulting MIR spectrum for each sample was the sum of 32 co-added scans. The body of the plate was used as reference standard; it was scanned once per plate. Reflectance was converted to absorbance. Twenty wavenumbers were removed due to noisy spectra. The data set that resulted from acquisition and pre-processing consisted of 598 mean MIR spectra. 125

Chemical analyses
Of the 69 soil profiles described in Sect. 2.2.1, we selected ten spatially dispersed 200-cm soil profiles (comprising a total of 98 soil samples), as being representative of the total sample set from a spectral viewpoint ( Fig. 2A). The 98 soil samples were analysed for SOC contents and extractable Al, Si, and Fe. The SOC contents were determined with a CHN elemental analyser (Carlo Erba 130 NA 2000, Milan, Italy). Al, Si, and Fe associated with active amorphous constituents and organo-metal ligands (Alo, Sio, and Feo) were extracted with ammonium oxalate solution (Blakemore et al., 1981) as follows. Soil samples (0.5 g) were shaken for 4 h in the dark in 50 mL of a solution of 0.2 M ammonium oxalate and 0.2 M oxalic acid at pH 3. Then, after centrifuging and filtration, we measured the Alo, Sio, and Feo in the filtrate by ICP-AES.
Al and Fe associated with soil organic matter (Alp and Fep) were estimated by extraction with sodium pyrophosphate solution 135 (Blakemore et al., 1981). We caution that pyrophosphate has been reported to extract different forms of Fe, some of which are not specifically related to organo-Fe complexes (Parfitt and Childs, 1988). For this analysis, soil samples (0.5 g) were shaken for 16 h in 50 mL of a 0.1 M sodium pyrophosphate solution at pH 10. After centrifuging and filtration, sodium-pyrophosphate-extractable Alp and Fep in the filtrate were determined by ICP-AES.
The data set that resulted from this task consisted of the 98 samples' respective contents of SOC; Sio; Alo and Alp; and Feo and Fep. 140

Bulk density (Bd)
Bulk densities were determined according to the soil-core method, using a bevelled cylinder (98 mL). The four replicates (Sect. 2.2.2) of each soil sample were oven-dried for 48 h at 105 °C and sieved at 2 mm to remove coarse fragments such as stones and living roots. Those fragments represented less than 1% of each soil sample mass, and therefore were considered negligible. The Bd of each soil sample was calculated as the average of the Bds of the sample's four replicates. 145 The data set that resulted from this task consisted of 66 Bd values. One value for each of the ten soil depths at the seven sampling locations, except where the sampling pit could not be dug to the full 200-cm depth.
2.5 Development of models relating MIR spectra to Bd, and contents of Alo, Alp, Sio, Feo, and SOC

Models for contents of Alo, Alp, Sio, Feo, and SOC
From the data sets for MIRS (Sect. 2.3) and laboratory analyses (Sect. 2.4.1), we developed predictive models for contents of Alo, 150 Alp, Sio, Feo, and SOC. One model was developed for each constituent, for a total of five models. All of the models were based upon 69 samples, from seven soil profiles, that were common to both data sets. The other 29 samples that were common to both data sets were used for validating the models, as described below. The models were developed by fitting the samples' MIR absorbance spectra (with no mathematical pre-treatments) to the samples' measured contents of each of the five constituents.
Fitting was done via modified partial least-squares regressions. The accuracy of each prediction model was determined by external 155 validation using laboratory analyses and MIRS spectra of the above-mentioned 29 samples, which were from three different soil profiles. The accuracy was quantified by computing (i) the coefficient of determination (R²), (ii) the root mean square error (RMSE) between predicted and measured values, and (iii) the ratio (denoted as RPD) of the standard deviation of the value set to RMSE.
We also tested four additional calibration models for predicting SOC contents. These models were based on two spectral classes and two conventionally measured data classes. One model was built for each combination of spectral class and soil-type class. The 160 two spectral classes and two soil-type classes were defined via two analyses. The first was a principal-component analysis (PCA) and a K-means clustering analysis of the 598 soil spectra. The second was a principal-component analysis (PCA) and a K-means clustering analysis of the 98 samples conventionally analysed data: Alo, Alp, Sio, Feo, Alo+0.5 Feo, Allophane, and (Alo-Alp)/Sio (Terra et al., 2018). Because these models were built on a small number of soil samples (about 30 or 60 soil samples in one cluster), the accuracy of the prediction models was determined by cross validation, using the leave-one-out method. 165

Model for Bd
This model was based upon the 66 measured Bd values (Sect. 2.4.3), and the MIR absorbance spectra of soils from the seven corresponding sites. To build the model, we used modified least-squares regressions to fit the MIR absorbance spectra to the Bd values. Our Bd-prediction model (built on spectral signatures of soil samples passed through a 200 µm-sieve) did not pretend to predict Bd sensu stricto, which is a physical property determined by soil constituents and their arrangements; i.e. the soil structure. 170 Instead, it used specific vibrational processes of the SRO minerals and organo-metal complexes as proxies for predicting Bd.
Because only 66 Bd were measured, we could not build two different models based on clusters (as we had done for SOC contents).
Hence, we assessed the accuracy of the Bd-prediction model by cross validation, using the leave-one-out method.

Data calculations 175
In the case of samples whose contents of Alo, Alp, Sio, Feo, and SOC, and Bds had not been measured analytically, we estimated those contents from the samples' MIR spectra, using the models described in Sect. 2.5. Then for each of the 598 soil samples, we calculated: • The degree of weathering of the volcanic glass, calculated as Alo+1/2 Feo, with Alo and Feo in grams per 100 g of soil (Shoji et al., 1996) 180 • Grams of Allophane per 100 g of soil, calculated as 100 /�23.4 − 5.1� − � ÷ � with Sio, Alo, and Alp given in grams per 100 g of soil (Mizota and Van Reewijk, 1989), • The atomic Al:Si ratio in the soil aluminosilicates, calculated as Al:Si = (Alo-Alp)/Sio , with Sio, Alo, and Alp given in grams per 100 g of soil (Parfitt and Wilson, 1985). A ratio of Al:Si about 2 is characteristic of imogolite, proto-imogolite, and Alrich Allophane. Ratios lower than 2 indicate Si-rich Allophane and the presence of significant amounts of crystallized minerals 185 (Levard et al., 2012).
• SOC stock (kgC per m -2 ), calculated as "sample thickness" which refers to the 20-cm thickness of each of the 598 soil samples.

Statistical analyses
Effects of spectral cluster and soil depth on SRO minerals, organo-metal complexes and SOC contents, Bd, and SOC stocks were 190 analysed with linear mixed models that considered soil profile a random effect. The T-test was used to assess the effect of spectral cluster on the variable to be explained for each depth. A random forest regression model was used to evaluate and order the importance of SRO minerals and organic-metal complexes, spectral cluster, and soil depth on SOC contents. Random forest is a machine-learning technique based on randomly built decision trees. At each node, a subset of covariates is also randomly chosen.
Random forest was used instead of multiple-linear-regression methods because it allows use of both categorical and numeric 195 covariates, collinearity between covariates, and non-linear relationships between covariates and the variable to be explained. %IncMSE was used to assess the relative importance of covariates in explaining variability of SOC content. For a given covariate, %IncMSE is the difference between mean standard error (MSE) of the model with permutation of this covariate, and model without that permutation. The larger the %IncMSE, the more important this covariate in predicting SOC content. We used R software (R Core Team, 2018) for the statistical analyses. Results were given for the 598 soil samples, but statistics on the conventionally-200 measured variables can be found in the supplementary materials (S1, S2, S3).

Spatial maps
Golden Software Surfer V 8.0 was used for spatial mapping of the predicted SOC stocks, Allophane contents, and thickness of allophanic soil material layer (see Sect. 3.1.2 for definition). The Digital elevation model (DEM) used in some of the mapping was created from a vertical-resolution, digital-terrain model obtained from the TERRA-1998 project (scale ≈ 1: 25000; CENIGA, 205 1998). The vertical and vertical resolutions of that model were 5 m and 10 m respectively.

Prediction of extractable Al, Si, Fe contents (SRO minerals and organo-metal complexes)
The five models for predicting contents of extractable Al, Si, and Fe from MIR spectra gave accurate results for the whole set of 210 soil samples (Table 1) with RPD>2 (Chang et al., 2001). Our attempts to develop a prediction model for Fep content proved unsuccessful, probably because pyrophosphate extraction does not discriminate sufficiently among forms of Fe (Parfitt and Childs, 1988). The model for predicting Alp contents from MIR spectra gave negative values (mean= -0.19 ± 0.08 g Alp 100 g -1 soil) for nine samples, all of which were among the 500 that were not analysed conventionally. Most of those nine samples were from two specific soil profiles (i.e., two locations). By convention, we set the Alp contents of those nine to zero when calculating their other 215 soil characteristics. All other predicted contents of extractable Al, Si and Fe contents were positive.
In previous published studies, Soriano-Disla et al. (2014) found that contents of oxalate-extractable forms could be predicted from MIR spectra, but Misnany et al. (2009) found otherwise. Our results indicated that MIR spectroscopy seems to be a promising tool for predicting contents of extractable Al, Si, and Fe in volcanic soils, and thus contents of SRO minerals and organo-Al complexes, providing that the specific prediction models are based upon conventional analyses of a large set of representative samples. That 220 was likely due to the specific peaks of imogolite and allophane in the MIR range (Parfitt, 2009). However, further studies are needed, especially on pyrophosphate-extractable elements.

Clustering according to soil type based upon MIR spectra, versus conventional analyses
MIR spectra for the 598 soil samples formed two distinct clusters ("blue" and "orange" in Fig2B), as did the conventionally analysed data of the 98 samples ("circle" and "square" in Fig 2B). The data from the measured cluster represented by circle showed 225 andic properties and was called Allophanic measured cluster by comparison to the other measured cluster represented by square and called Halloysitic measured cluster. A comparison of the clusterings of the 98 samples that were analysed both conventionally and by MIRS is revealing. For the most part, samples whose MIR spectra are in the "blue" spectral cluster (see Fig. 2B) fell in the Allophanic measured cluster, while samples whose spectra are in the "orange" spectral cluster fell in the Halloysitic measured cluster. That correlation held true for all but 18 of the 98 samples. The conventionally analysed data of all 18 of those samples are 230 in the measured Allophanic cluster, but their spectra fall in the "orange" spectral cluster. In Fig. 2B, those samples are identifiable as the ones whose andic-property symbol is a circle that surrounds an "orange" spectral symbol. All of those occurrences along the border between the two spectral clusters reveal that the Allophanic and Halloysitic measured clusters overlap slightly. This study of the clusterings shows that MIR spectra depended upon some of the characteristics of andic soil samples, and also shows a graduation of the two soil materials. Because of the above-described strong correlations between spectral clusters and measured 235 clustered, we will denote the orange spectral cluster as the Halloysitic spectral cluster, and the blue spectral cluster as the Allophanic spectral cluster from this point onward.
Their Alp:Alo ratios are highly variable (about 1.7 ± 1.5), and their Al:Si ratios are about 1.7 ± 0.6. In a given cluster, the Alp and the Feo content, the Alp:Alo ratio decrease with depth, but not the allophane content, Al:Si ratio, nor the quantity Alo+0.5 Feo.
Chemical extraction provides no structural information, but it does help distinguish between crystalline and SRO minerals. The literature (Levard et al., 2012) notes that SRO minerals like Al-rich allophanes, proto-imogolite, and imogolite predominate in 250 volcanic-ash soils with Alo+0.5Feo >> 2% and Al:Si ratios >2, such as in the Allophanic cluster. Surprisingly, and especially in the surface soil, ratios of Alp to Alo are greater than 1 with high allophane contents (from 5 to 20 g allophane per 100 g soil), indicating a co-existence of allophane and Al-organo complexes; which may be due to some combination of the soil pH (approximately 5; i.e. near the boundary between allophane and Al/Fe-organo complexes) and the regular inputs of organic materials and ashes from the surface (Mizota and Van Reewijk, 1989). On the contrary in the Halloysitic cluster, a lower Al:Si ratio should sign a high Si 255 activity in solution and likely formation of halloysite (Parfitt et al., 1997). However, in both types of soil materials, the high Alp:Alo ratio revealed that the majority of active Al groups were present mainly as organo-Al complexes, especially near the surface.
Even though the soil reflectance and andic characteristics varied continuously through soil samples (Fig. 2B), we observed that a two-cluster classification of MIR spectra was powerful for describing the horizontal and vertical variation of andic soil properties.
However, MIR does not provide such definitive results for soils in other contexts and at a global scale (Viscara Rossel et al., 2016) 260 because most of the world's soil types are complex mixtures of materials from diverse origins, and because the criteria for global soil taxonomic classifications include parameters with no direct influence upon spectral character. When applied to soil materials sharing the same weathering and pedogenesis processes, vis-NIR spectra (Terra et al., 2018) or MIR spectra seem to be useful tools for soil survey and classification. In volcanic areas where strong variations in mineralogy occur (e.g. SRO minerals may dehydrate to form phyllosilicates), MIR spectra seem to be especially well adapted. These difference in mineralogy and proportion 265 of SRO minerals in the Allophanic and Halloysitic materials cause changes MIR spectra in reflectance intensity and absorption features in particular at the bands near 3620 cm -1 and 3700 cm -1 , which are characteristic of halloysite (Hidalgo et al., 2010;Fig. 2C). Hence, the effects of mineral weathering or pedogenesis pathways upon MIR spectra of volcanic-derived soil samples were clear enough for building two different soil-material clusters (allophanic vs. halloysitic) using PCA and K-means algorithms based upon spectral data distance. 270

Three types of soil profiles in the watershed
Both types of material were encountered at every soil depth, but allophanic materials predominated near the surface (n=57 on 68 samples at 0-20 cm), while halloysitic material predominated at depth (n=24 on 42 soil samples at 180-200 cm) ( Table 2). These results confirmed the common observation that allophane is associated with halloysite (Ross andKerr, 1934 quoted in Parfitt 2009). Both minerals may be derived directly from volcanic glass and halloysite can also form via weathering of Allophane (Parfitt, 275 2009;Torn et al., 1997;Wada, 1989). The allophane: halloysite ratio depends upon the chemistry and regularity of ash depositions as well as upon the concentration of Si in solutions which depend on precipitation and drainage (Churchman et al. 2016).
Of the 69 sampled soil profiles (0-200 cm), 35 were denoted as "allophanic profiles", i.e. every sample's spectrum fell in the allophanic cluster. The soil was composed of allophanic materials down to 200 cm. The samples from eight profiles (the "halloysitic profiles") were exclusively of the halloysitic spectral class, and 26 soil profiles (the allophanic-halloysitic profiles) 280 contained soil of both types (Fig. 3).
Of the 26 allophanic-halloysitic profiles, 20 had soils of the allophanic spectral class at surface, and soils of the halloysitic spectral class at depth. In five of the other allophanic-halloysitic profiles, no clear relationship between depth and spectral class was present.
The thickness of the allophanic material at sites with allophanic-halloysitic soil profiles was taken as being equal to the depth at which halloysitic material first appeared in the soil profile. Calculated in this way, the thickness of allophanic material varied from 285 0 cm in the halloysitic soil profiles to 200 cm in the allophanic soil profiles (Fig. 4a).
In this micro-watershed, allophanes predominated over halloysite because total annual rainfall was above 3000 mm (Parfitt et al. 1983) and because the study site does not experience a pronounced dry season. Therefore, a high leaching rates (Gomez-Delgado et al., 2011;Benegas et al., 2014;Welsh et al., 2018) of Si probably resulted in the formation of materials in which Al groups predominate (Al:Si>> 2). A majority of soil profiles (35) were indeed rich in SRO minerals and organo-mineral (Al/Fe) complexes 290 down to 200 cm. In 26 other profiles, contents of SRO minerals and amorphous materials were also present but decreased with depth (Fig. 3). In these allophanic-halloysitic soil profiles, allophane contents (10 g 100 g -1 soil) were intermediate between the allophane contents in allophanic soil profiles (> 15 g 100 g -1 soil) and in halloysitic soil profiles (< 5 g 100 g -1 soil) (Fig. 3). This coexistence of allophane and halloysite marked a transition from the allophanic soils to the halloysitic soils. This transition could be explained by formation of halloysite at microsites within an allophane matrix (Aomine & Wada, 1962quoted in Parfitt 2009, 295 when periods of leaching alternate with periods of desiccation (Churchman et al., 2016). The resulting fluctuations in activity of Si in soil solutions could cause differential formation of amorphous and crystalline weathering products. A transition zone attributable to that phenomenon has been observed in pedons along the slopes of volcanoes in Ecuador at intermediate altitudes (Zehetner et al., 2003). However, the variations in the soil profile types observed in the present study occurred over much shorter distances than in Zehetner's study. Only 20% of the soil profile in the watershed were of the halloysitic type, in which halloysite 300 minerals predominated at all depths. Halloysite could be the result of an older soil development from allophane weathering on older tephras, or the result of silica-rich environments, such in buried soil layers that receive silica from overlying soil layer, or in zones where restricted drainage prevents removal of Si via leaching (Churchman et al. 2016). This is likely to occur in Si-rich, rhyolitic tephras than in basaltic ones (Dahlgren et al., 2004). Even if in the last millennia, most of the volcanic deposits at Turrialba have been andesitic ashes that varied little in composition over time (Meijer and Buurman, 2003), recent investigations found 305 rhyolitic materials and multistage magmas in which rhyolite is mixed with end members of basaltic andesite (Devitre et al., 2018).
As climate, paleoclimate, and composition of volcanic ejecta were identical over the studied 1-km 2 micro-watershed, we assumed that the thickness and composition of different soil materials likely resulted from an integration through time of several pedogeomorphic processes: (i) regular new ash deposits, (ii) deeper weathering of parent materials, (iii) hydrological dynamics (i.e. Si leaching or accumulation of Si), and (iv) soil erosion (Gessler et al., 2000), i.e. in unstable landscape positions, the erosion 310 of topsoil could have removed allophanic soil materials and exposed the older halloysitic subsoils (Zehetner et al., 2003). The influence of the site's topography upon hydrology, pedogenesis, and topsoil erosion (Fig. 4a) superimposed upon the general weathering trend, and could contribute to explaining the spatial variation of the different soil profiles and allophanic material thickness in the micro-watershed.

Prediction of SOC contents from MIR spectra
The model built on the whole data set yielded better results (higher R 2 , higher RPD and smaller RMSE) than those using two soil groups separately (Table 3). The SOC prediction have been considered accurate according to the usual criteria (Chang et al., 2001), and are comparable in performance to those developed by McDowell et al (2012) for Hawaiian soil samples. However, this model predicted negative SOC contents for 12 samples, which all belonged to the halloysitic spectral class. Negative predicted SOC 320 contents have been seen previously in an Andosol dataset in Mc Dowell et al. (2012), who noted difficulties in predicting low SOC contents accurately. Those difficulties could be explained by the presence of different soil materials, with their own properties to stabilize SOC. Consequently, we preferred the SOC-content predictions made by the models that were built on soils clusters that were differentiated by their MIRS spectra and thus their type of soil material. For these models, which did not predict any negative SOC contents, the predictions of low SOC contents were better with a decreasing RMSE from 5 to 2 gC kg -1 soil for samples in 325 the halloysitic cluster (Table 3).

SRO minerals and metal-organo complexes explain the variation of SOC contents
The SOC contents were explained by soil depth and by clustering based on soil MIR spectra. SOC contents decreased significantly with soil depth, and were much larger in allophanic than in halloysitic spectral classes at all depths (Table 4). As already and commonly observed, the more SRO minerals that are present, the more SOC is concentrated and preserved in soils (Fig. 5, 6; e.g. 330 Torn et al. 1997;Chevallier et al. 2010). The concentration of SOC in the allophanic materials of this study were similar to those observed in the pedogenetic A horizons of Andosols from Costa Rica (Meijer and Buurman, 2003), Ecuador (Zehetner et al. 2003), Martinique (Chevallier et al. 2010), and Hawai (Torn et al. 1997).
The strongest correlations were between Feo, or Alp, and SOC, suggesting both (i) an important role of organo-metal complexes in SOC stabilization, but also (ii) a stronger impact of SRO mineral with Fe on SOC contents at all depths (Fig. 5) than was expected 335 from Kinoshita et al.'s (2016) research on the surface soil (0-5 cm) at the same site. Ferrihydrite is known to be highly involved in the stabilization of SOC in Andosol (Kleber et al. 2005;Matus et al. 2014;Filimonova et al. 2016;Parfitt et al. 1997). In our study, we did not explicitly analyse the form of Fe extractable by oxalate, whether ferrihydrite dominated Fe forms or not, but the preservation of SOC related to Feo, even in relatively small amounts (0.9-1.3 g Feo 100g -1 soil), seemed to be the major factor explaining SOC-content variations at the surface and at depth (Fig. 5 and Fig. 6). In addition, the relation between SOC and Feo were high and very sensitive to small variations in Feo content (SOC content = 82 Feo -66, r 2 = 0.73). In contrast, below that threshold SOC were lower and less sensitive to Feo content (SOC content = 24 Feo -5, r 2 = 0.85). However, our results did not provide analysis for determining whether the form of Feo present in ALL materials differed from that in halloysitic materials.
The organo-Al complexes, represented by Alp, were the second most important soil variable in explaining the variation of SOC 345 (Fig. 6). The importance of Alp on SOC stabilization has already been noted in literature (Beare et al., 2014;Huygens et al., 2005), although the organo-Al stabilization is not fully understood (Takahashi and Dahlgren, 2016). The most given explanations were: Al toxicity for microorganisms, or electrostatic sorption between Al and SOC which limits the accessibility of SOC to microorganisms or to their enzymes. The organo-Al complexes were observed to be more effective than allophane in protecting SOC from degradation (Boudot, 1992;Powers and Schlesinger, 2002), especially in the surface soil layers (Fig. 6) where contents 350 of organo-Al complexes and the Alp:Alo ratios tended to be high (Fig. 3), revealing that the majority of active Al groups pertained to organo-Al complexes, which stabilized the vegetation organic inputs.
To a lesser extent, the SOC contents were explained by calculated parameters linked to soil andic properties, the ratio (Alo-Alp)/Sio, allophane, Alo + 0.5 Feo, and to soil depth (Fig. 5, 6; Table 4). The mineral matrix explained the SOC stabilization even better at depth than at the surface (Fig. 6c). The relationship between allophane or Alo + 0.5 Feo, and SOC was mediated by soil depth. For 355 a given amount of allophane, the soil was richer in SOC at the surface than at depth. In surface samples, C inputs by vegetation likely explained the higher SOC contents as well as the slightly greater variation of that content (Table 4; Matus et al., 2014).
Nevertheless, the SOC contents are reported to be poorly correlated with above-ground biomass in Andosols (Noponen et al., 2013) and mainly controlled by contents of organo-Al complexes and SRO minerals even in the topsoil (Kinoshita et al., 2016).
Furthermore, the decrease in SOC content with depth in allophanic cluster was much less noticeable than in other studies on 360 Andosols. The SOC contents reported here decreased slightly and smoothly with depth throughout the soil profile (Fig. 3), and still remained high (40 gC kg -1 soil) at the 200-cm depth. The ratio (Alo-Alp)/Sio displayed a threshold near 2.5 (Fig.6) similarly to that was noticed for Feo at 1.3 g Feo 100g -1 soil. Beyond that threshold, all soil samples were in allophanic clusters and SOC was very sensitive to a small variation in (Alo-Alp)/Sio ratio.
As the volcanic ejecta at the study site contained little carbon (2 to 3 mgC g -1 ash, data not shown), the SOC-content distribution 365 in the watershed was controlled mainly by vegetation inputs with pedogenic processes. The dominant pedogenic processes could be the development of an A horizon by andosolization, which consists of the stabilization of accumulations of vegetation-derived organic matter by active Al and Fe, in parallel with rapid formation of SRO minerals from volcanic ejecta. Rapid weathering of the ejecta released Si, Al, and Fe faster than crystalline minerals could form. Complexation of humic/colloidal organic substances with metal to form organo-metal (Al/Fe) complexes was especially noticeable at the soil surface. Preferential precipitations of 370 metastable SRO minerals (allophane, imogolite, ferrihydrite) occurred in the subsoil Dahlgren, 2002, Kramer et al., 2012). The permanent humid weather and the regular ash deposition favoured these mechanisms (Ugolini and Dahlgren, 2002;Mora et al., 2014) and the stabilization of SOC (Buurman et al. 2007;Chevallier et al., 2010). Regular burial of soil by volcanic ashes would explain why some locations have a very thick layer of allophanic material, assimilated to a young pedogenetic Andosol A horizon rich in SOC. Over time, or in specific locations where Si could accumulated, halloysite could form at the expense of 375 SRO minerals. Formation of halloysite with lower surface area and charge density would cause loss of stabilized OC (Torn et al., 1997). Our results showed that deep-soil carbon contents as well as surface-soil carbon contents were essentially driven by the type of soil material and contents of SRO minerals. The combination of soil development and mineralogy was a powerful factor for explaining SOC content, regardless of the soil depth.

Predictions of Bd
The prediction model for Bd (Sect. 2.5.2) gave accurate results, with a RMSE of about 0.09 and a average Bd of 0.8 g cm -1 (Table   5). This level of performance was close that of Cambou et al.'s (2016) predictions of Bd from near-infrared spectroscopy of Luvisols. Bd is determined not only by soil constituents but also by soil structure, which has no effect upon the MIR spectral signature of a soil sample that has been dried and sieved to <200 µm. Nevertheless, the MIR spectra of such a sample could contain 385 enough information to predict Bd. Here, andic properties (e.g. Alo+0.5 Feo), which are well predicted by MIRS was known to control Bd strongly (Shoji et al. 1996). Therefore, these well predicted properties were used as proxy for Bd prediction. The MIR spectroscopy of soil samples prepared for soil analysis seemed to offer promise for predicting Bd of Andosols.
The Bd increased slightly with soil depth: from 0.65 to 0.75 in allophanic soil materials, and from 0.87 to 0.98 in halloysitic soil materials. As already and commonly observed, soil horizons with SRO minerals have much lower Bd than soil with crystalline 390 clays (e.g. Mora et al. 2014) independently of soil depth (Table 6).

Variation of SOC stocks in the Aquiares watershed
Most of the ecosystem C in the watershed is in the soil (from 24 to 72 kgC m -2 ) compared to the above-ground biomass of coffee and Erythrina poeppigiana of 2.8 ± 0.2 kgC m -2 (Charbonnier et al., 2017), or the below-ground biomass of coffee down to 400 cm at 0.9 kgC m -2 (Defrenet et al., 2016). The SOC stocks were highly variable in both dimensions, horizontal and vertical along 395 the soil depth (Fig. 7). The relative distribution of SOC stocks with depth, i.e. the ratio of SOC stocks at 0-100 cm to SOC stocks at 0-200 cm, also varied and was lower in the allophanic (0.56 ± 0.03) than in the halloysitic (0.75 ± 0.10) soil profile type. Our results were consistent with most other studies (e.g. Mora et al. 2014;Batjes 2014), the allophanic soil materials with high contents of SRO minerals had much larger SOC stocks, at every soil depth, than did halloysitic soil materials with crystalline clays (Fig. 7).
Thus SOC stocks in the upper 200 cm of soil varied markedly among the three soil profile types defined in Sect. 3.1.3. The range 400 of calculated stocks for halloysitic soil profiles was 24.5 ± 0.5 kgC m -2 , compared to 49.9 ± 1.8 kgC m -2 for allophanic -halloysitic soil profiles and 72.4 ± 2.0 kgC m -2 for allophanic soil profiles (Fig. 7). The high SOC stocks allophanic soil profiles confirmed that large stocks can be present under plantations at moderate altitudes (1000 m a.s.l.), possibly as a result of organic-material accumulation from previous land cover (pristine forest, then households) along with andosolization. Larger SOC stocks were also measured in deep, homogeneous volcanic-ash soils of Andean ecosystems in Ecuador under upper montane forest and high altitude 405 paramo (87 ± 12 kgC m -2 within the 0-200 cm depth, Tonneijck et al., 2010), as well as in young Hydric Andosols derived from recent volcanic ash (Poulenard et al., 2003). Nevertheless, when Alo-Alp/Sio ratio > 2.5 and Feo > 1.3 g 100g -1 soil (see Sect. 3.2.2.), the SOC stocks were high with high variations (50 and 74 kgC m -2 ) which remained unexplained.
Globally, the distribution of SOC stocks down to 200 cm was based on the distribution of the soil profile type in the watershed and could not be predicted from the SOC stocks at surface. For halloysitic soil profiles, we observed a linear relationship between SOC 410 stock in 0-20 cm and SOC stock in 0-200 cm as already reported for soils with different texture but without SRO minerals (Andriamananjara et al. 2016). However, this relationship did not hold in allophanic and allophanic -halloysitic soil profiles ( Figure 8a). The SOC stocks down to 200 cm could not be easily predicted from the SOC stock at the surface because of the discontinuity in the type of soil material in horizontal and vertical (i.e. allophanic-halloysitic soil profile) dimensions.
The thickness of the allophanic soil material layer was a better predictor of SOC stock in 0-200 cm than the SOC stocks in the 415 topsoil ( Figure 8b) because the SOC stocks varied only weakly within this layer and is especially high. Thus, the distribution of this layer in the watershed explained the distribution of SOC stocks (0-200 cm). Thick allophanic soil material and high SOC stocks were on gentle slopes ( Fig. 4; Mora et al., 2014), while outcroppings of halloysitic soil material and small SOC stocks were found where slopes changed abruptly (Fig. 4). We thus hypothesized that erosion and regular ash deposition moulded the watershed's landscape, the thickness of a young andic A horizon, and the distribution of SOC within the upper two meters of that 420 soil. Pedogeomorphic processes could explain the high variation of SOC stocks at fine-scales within the landscape.

Conclusion
The SOC stocks of the studied 1 km 2 watershed were much larger than the SOC stocks stored in vegetation biomass. The large variation of SOC stocks in the 0-200 cm soil profiles at fine scales (m) was related to the horizontal and vertical variations in SRO mineral contents. Knowing surface SOC stocks provided little information about deep SOC stocks, whereas SOC stocks within 425 the 0-200 cm depth were positively correlated with thickness of a allophanic soil material layer, likely a young andic A horizon.
In this volcanic environment, where soil minerology varies greatly, MIR spectroscopy was confirmed as a convenient and accurate tool for classifying soil materials as either andic or non-andic. As a result, MIR spectroscopy also proved useful for predicting SOC contents, bulk density, and SOC stocks for a large set of samples. The main conclusion of our study is that ignoring topography and soil pedogenesis would introduce a serious weakness in current approaches for evaluating regional C stocks, especially in 430 young volcanic areas where mineralogy controls SOC stocks and varies strongly on a fine spatial scale. Improved knowledge of pedogenesis is thus needed for understanding the variability and distribution of SOC stocks.

Acknowledgements
Our study was part of the work of the CoffeeFlux observatory, developed by CIRAD (Centre de Coopération Internationale en Recherche Agronomique pour le Développement) and CATIE (Centro Agronómico Tropical de Investigación y Enseñanza). This 435 site contributes to FLUXNET (CR-AqC) and belongs to the SOERE F-ORE-T network of observatories, with support from Ecofor, Allenvi, and the French national research infrastructure ANAEE-F (http://www.anaee-france.fr/fr/). This work was also supported by AIRD (SAFSE) and the ANR, the French Research Agency through two projects: Ecosfix (ANR-2010-STRA-003-01) and MACACC (ANR-13-AGRO-0005).
The authors give warm thanks for the field support we received in Costa Rica, including that from Álvaro Barquero and his family; 440 from Aléxis Pérez; and from the following persons at Cafetalera Aquiares: Alfonso and Diego Robelo, Luis Guillermo Navarro, Manuel Jara and Rafael Acuña Vargas.          n: number of soil samples for calibration. Mean and Standard deviation (Sd) of the measured Alo, Sio, Feo, Alp population (in g 100 g -1 soil) and SOC content (g kg -1 ) used for model calibration RMSE: Root Mean Square Error, R 2 : cross-validation determination coefficient, RPD : ratio of Sd to RMSE, Sd : Standard deviation. Table 2. Organo-Al complexes and SRO minerals contents (means ± standard deviation) of the predicted data of all soil samples (All) and for those in allophanic and halloysitic spectral clusters. Letters indicate significant differences among soil depth at p < 0.05.

Depth (cm)
Alp g 100 g -1 soil Feo g 100 g -1 soil Allophanes g 100 g -1 soil (Alo-Alp)/Sio Alo + 0.5 Feo g 100 g -1 soil Number of samples per cluster Table 3 Cross validation statistics of modified partial least square regression for the models of prediction by MIRS of soil organic carbon contents (gC kg -1 soil) after calibration on all of the soil samples (All) or on two clusters defined by their MIRS spectra, or by their conventionally measured data (Alo, Alp, Sio, Feo, Alo+0.5 Feo, Allophane, (Alo-Alp)/Sio    Table 6 Descriptive statistics for bulk densities with depth and with their spectral cluster. P-value express results from t-test between samples from the two cluster classes (Welsh, two sided alternative). For a given cluster, means followed by the same letters do not differ significantly at p=0.05.