Mapping homogeneous spectral response zones in a soil profile

Homogeneous spectral response zones represent relatively uniform regions of soil that may be useful for identifying soil horizons or delineating soil units spatially. External parameter orthogonalisation (EPO) and direct standardisation (DS) were assessed for their ability to conserve intrinsic soil information of spectra under variable moisture conditions, as experienced when taking measurements in situ. A 1 m x 1 m section of a soil profile was intensively sampled using visible near-infrared diffuse reflectance spectroscopy at 2.5 cm vertical intervals and 10 cm horizontal intervals. Further samples were taken on a 10 cm grid and scanned in a laboratory under field moist and air-dry conditions. A principal component space was constructed based on the in situ scans following either EPO transformation, DS transformation or pre-processing only (PP). Scores from the first four principal components (which accounted for more than 0.97 of the total variance in each case) were subject to k-means clustering to identify homogeneous spectral response zones. Laboratory-based scans were then projected onto the same principal component space and fitted to the pre-existing cluster centroids. Both EPO and DS were found to have potential in reconciling differences observed between in situ and laboratory-based measurements compared to PP. EPO outperformed DS in terms of conserving the relationship between PC scores (LCCC: EPO = 0.84, DS = 0.58, PP = 0.44; RMSE: EPO = 11.8, DS = 15.4, PP = 38.5) and also in identifying homogeneous spectral response zones that corresponded to field observed horizons.


Introduction
Horizons are characteristic features of soils, which represent regions of relative uniformity in a highly heterogeneous medium. Historically, horizons have offered an efficient way of characterising a profile by capturing the maximum variation within a soil profile using a minimum number of investigation sites. Horizons form through many factors including the accumulation of organic matter (OM), deposition of aeolian or alluvial material, surface weathering, or translocation of clays or Fe/Al chelates (Isbell, 2002). They are identified in the field by observing changes of soil properties with depth. Common diagnostic criteria include colour, texture, mineral composition, structure, redoximorphic features and the presence of inclusions.
Many horizon diagnostic criteria such as colour, texture and mineral composition can be estimated using visible near-infrared diffuse reflectance spectroscopy (VisNIR) (e.g. Viscarra Rossel et al., 2009). Previous studies have utilised this relationship to characterise horizons with VisNIR. Galvao et al. (1997) investigated VisNIR spectra of 35 air-dry and ground, horizon-based samples from six profiles in Brazil. The authors identified that the principal components of VisNIR spectra held intrinsic information that showed a characteristic decrease with depth. Viscarra Rossel and Webster (2011) analysed VisNIR spectra from 13,654 air-dried and ground samples from Australia. Horizon centroids in canonical space were identified and, by reallocating samples to the nearest centroid, it was possible to distinguish topsoil and subsoil horizons. Meanwhile, Fajardo et al. (2016) intensively sampled 59 air-dry soil cores at 2 cm increments up to 130 cm depth with a VisNIR contact probe. Principal components (PCs) of the spectra were subject to fuzzy clustering and a digital gradient was applied to identify spectrally derived horizon boundaries that exhibited similarity to traditional horizons.
Direct standardisation was developed to allow calibrated models developed on one spectrometer to be used on another spectrometer (Wang et al., 1991). The approach establishes a relationship between the spectra obtained by the 'master' spectrometer and the corresponding spectra obtained by the 'slave' spectrometer; the relationship is then used to transform the slave spectra to correspond with the master spectra. It has been adapted to remove the effects of soil moisture, where the moist spectra act as the slave set and are converted to the air-dry master set (Ji et al., 2015). Both sets of spectra are collected with the same spectrometer in this case.
It remains unclear if spectra corrected for moisture by either EPO or DS can conserve sufficient intrinsic soil information for the identification of homogeneous spectral response zones under field conditions. This study aims to evaluate EPO and DS in terms of their ability to conserve relationships between VisNIR spectra obtained in situ and those obtained under laboratory conditions for the identification of homogeneous spectral response zones.

Site description
The study site was located on Westwood Farm, an experimental property 3 km northwest of the township of Cobbitty, NSW, Australia (33°59'44.9"S, 150°39'11.9"E) (Fig. 1). The parent material of the site is Ashfield Shale, a Triassic sedimentary rock comprising black mudstones and grey shales (Howard, 1969). Soils developing from this parent material are known to have a well-developed texture profile and the marine nature of the parent material commonly results in expression of sodicity in the subsoil. The mineralogy of the clay fraction of this soil is dominated by kaolinite, producing soils of low to moderate fertility (Davey et al., 1975). The site has been extensively cleared for agricultural purposes and is currently used for intensive grazing on naturalised kikuyu (Pennisetum clandestinum) and paspalum (Paspalum dilatatum) grasses.

Profile preparation
A pit was excavated 1.5 m wide, 5 m long and reaching a depth of 1.5 m at the centre. Four horizons were identified and the soil was classified as a Brown Kurosol in the Australian Soil Classification (Isbell, 2002), or an Abruptic Lixisol using the World Reference Base. Notable features of the soil include an abrupt textural contrast from sandy clay loam in the E horizon to medium heavy clay in the Bt1 horizon (Table 1). The Bt1 horizon was also found to be strongly acidic, pH (1:5 H2O) < 5.5 (Hazelton and Murphy, 2016). A small quantity of magnetic gravel (~2-4 mm diameter) was found in the A and E horizons, and heavy mottling occurs in the Bt2. Horizon-based sampling and laboratory analysis were conducted to further characterise the soil, including a surface sample taken at 0-2 cm depth (Table 2).
A 1 m x 1 m sampling region was delineated on the pit wall and sheared to a smooth surface (Fig. 2). The final shearing was conducted in a horizontal direction, progressing from the soil surface to the bottom of the sampling region to limit surface contamination from falling debris. Galvanised nails were inserted on a 10 cm grid to guide sampling.

In situ scanning and sample collection
Visible near-infrared (VisNIR) spectra were obtained with an AgriSpec™ device, connected via fibre-optic cable to a contact probe attachment (Analytical Spectral Devices, Boulder, Colorado, USA). A Spectralon® tile (Labsphere Inc., North Sutton, New Hampshire, USA) was used to take a baseline reading every 15-20 measurements. Indico® Pro software was used to interface with the spectrometer; spectra were exported at 1 nm resolution covering the 350-2,500 nm wavelength range. Visible near-infrared readings were taken in 2.5 cm increments to give 41 readings over each 1 m transect. Eleven vertical transects were taken at 10 cm lateral spacing, as well as three horizontal transects at 0, 50 and 100 cm depth (Fig. 3).
SOIL Discuss., https://doi.org/10.5194/soil-2018-12. Manuscript under review for journal SOIL. Discussion started: 5 June 2018. © Author(s) 2018. CC BY 4.0 License.
Bulk density cores were extracted on a 10 cm grid for further scanning under laboratory conditions (Fig. 3). Bulk density cores at 0 cm depth were taken perpendicular to the soil surface, i.e. driven into the soil surface. Those taken at depth were taken parallel to the soil surface, i.e. driven into the pit wall. Bulk density cores were immediately placed in aluminium tins and sealed with vinyl tape to conserve field moist condition.

Constructing the projection matrices
The DS transfer matrix was constructed as per Wang et al. (1991) and the EPO projection matrix following Minasny et al. (2012). A single library was used to construct both EPO and DS transformation matrices. Structural differences in the EPO and DS matrices are immediately evident (Fig. 4). However, features around the 1,400 and 1,900 nm water absorption bands can be identified in both.
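As a hedged sketch of how the two matrices are commonly formulated (standard textbook forms, not reproduced from this manuscript), let $X_{\mathrm{dry}}$ and $X_{\mathrm{moist}}$ be matrices of paired air-dry and moist spectra from the library, with one row per sample. The symbols $F$, $D$, $V_c$ and $P$, and the number of retained components $c$, are notation assumed here for illustration only.

```latex
% Direct standardisation: least-squares transfer matrix F
X_{\mathrm{dry}} \approx X_{\mathrm{moist}} F,
\qquad F = X_{\mathrm{moist}}^{+} X_{\mathrm{dry}}

% External parameter orthogonalisation: projection matrix P
D = X_{\mathrm{moist}} - X_{\mathrm{dry}},
\qquad D^{\top} D = V \Lambda V^{\top},
\qquad P = I - V_c V_c^{\top}
```

Here $X^{+}$ denotes the Moore-Penrose pseudo-inverse and $V_c$ holds the first $c$ eigenvectors of $D^{\top} D$. Under these assumptions a moist spectrum $x$ is corrected as $xF$ for DS, whereas EPO projects every spectrum, moist or dry, as $xP$.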

Data processing
All data processing and analysis was performed in the R environment for statistical computing (R Core Team, 2016).

Spectral pre-processing
Some discontinuities were observed at the site of the VisNIR detector junctions, i.e. 1,000 and 1,800 nm. The spliceCorrection() function from the "prospectr" package was employed to remove these artefacts (Stevens and Ramirez-Lopez, 2013). This process corrects for the offset of VNIR and SWIR2 and applies linear interpolation at the edges to create a smooth junction with the SWIR1 range. Spectra were then trimmed to remove areas at the end of the detector range with low signal to noise ratios, leaving the 500-2,450 nm wavelength range. Reflectance readings were converted to absorbance using A = log(1/R). Data were compressed by a factor of two by dropping alternate wavelengths. Compressing the data reduces calculation time without affecting model performance, as adjacent wavelengths are highly correlated. Spectra were then smoothed using a Savitzky-Golay filter with a window size of 11 and a second order polynomial (Savitzky and Golay, 1964).
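The trimming, absorbance conversion, compression and smoothing steps can be illustrated with a minimal standard-library Python sketch. It is not the authors' R code: the function names are hypothetical, a base-10 logarithm is assumed for A = log(1/R), splice correction is omitted, and the five points at each end of a spectrum are left unsmoothed as a simple edge-handling assumption. The filter weights are the standard tabulated values for an 11-point, second-order Savitzky-Golay smoother.

```python
import math

# Tabulated Savitzky-Golay smoothing weights: 11-point window,
# second-order polynomial, normalised by 429.
SG_11_2 = [-36, 9, 44, 69, 84, 89, 84, 69, 44, 9, -36]
SG_NORM = 429


def trim(wavelengths, values, lo=500, hi=2450):
    """Keep only the bands within [lo, hi] nm."""
    kept = [(w, v) for w, v in zip(wavelengths, values) if lo <= w <= hi]
    return [w for w, _ in kept], [v for _, v in kept]


def to_absorbance(reflectance):
    """Convert reflectance (0 < R <= 1) to absorbance, assuming log base 10."""
    return [math.log10(1.0 / r) for r in reflectance]


def drop_alternate(values):
    """Halve the data by keeping every second band, starting with the first."""
    return values[::2]


def savgol_smooth(values):
    """Apply the 11-point quadratic smoother to interior points only."""
    half = len(SG_11_2) // 2
    out = list(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        out[i] = sum(c * v for c, v in zip(SG_11_2, window)) / SG_NORM
    return out


# Toy usage on a synthetic, linearly increasing reflectance spectrum.
wavelengths = list(range(350, 2501))
reflectance = [0.2 + 0.0001 * (w - 350) for w in wavelengths]
wl, refl = trim(wavelengths, reflectance)
absorbance = savgol_smooth(drop_alternate(to_absorbance(refl)))
wl = drop_alternate(wl)
```

The order of operations follows the description above: trim, convert, drop alternate bands, then smooth.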

Principal component analysis
Principal component analysis (PCA) is a statistical procedure commonly utilised as a dimensionality reducing technique when processing VisNIR spectra. Data are subjected to a number of orthogonal projections, each accounting for the maximum variability remaining in the dataset. The effectiveness of PCA is attributed to the autocorrelation between wavelengths in VisNIR spectra, which allows a small number of variables to explain the vast majority of observed variance.

The in situ VisNIR dataset was used to build the principal component (PC) space. The PP, DS and EPO spectral datasets were individually centred and scaled to a mean of zero and unit variance before PCA was performed. The centring and scaling parameters, as well as the loadings of the PCs, were then used to project laboratory-based VisNIR scans under field moist and air-dry conditions onto the same PC space for comparison.
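The projection step can be sketched as follows, assuming the loadings have already been obtained from a PCA of the in situ scans. The function names and the toy loadings are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed for scaling. The point of the sketch is that laboratory scans are standardised with the in situ parameters rather than their own.

```python
import math


def fit_scaler(rows):
    """Column means and sample standard deviations of the reference scans."""
    n = len(rows)
    cols = list(zip(*rows))
    centre = [sum(c) / n for c in cols]
    scale = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
             for c, m in zip(cols, centre)]
    return centre, scale


def project(rows, centre, scale, loadings):
    """Standardise rows with the reference parameters, then multiply by the
    loadings (one list of band weights per principal component)."""
    scores = []
    for row in rows:
        z = [(x - m) / s for x, m, s in zip(row, centre, scale)]
        scores.append([sum(zi * w for zi, w in zip(z, pc)) for pc in loadings])
    return scores


# Toy usage: two "bands", one hypothetical principal component.
in_situ = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
centre, scale = fit_scaler(in_situ)
loadings = [[1 / math.sqrt(2), 1 / math.sqrt(2)]]  # assumed, not estimated here
lab_scores = project([[3.0, 6.0], [5.0, 10.0]], centre, scale, loadings)
```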

k-means clustering
k-means clustering is an iterative process which partitions observations into clusters based on minimum distance from cluster centroids. Following partitioning, new cluster centroids are calculated and observations are repartitioned to the new centroids. The algorithm proceeds until an error function is minimised (Eq. 1), so as to minimise the within cluster variance (MacQueen, 1967). The PCs of in situ scans were subject to k-means clustering to identify zones of homogeneous spectral response. To standardise the analysis the number of clusters was set equal to four, i.e. the same number of soil horizons observed. Methods are available to automate the selection of cluster number if the number of required clusters is unknown, e.g. cubic clustering criterion (Sarle, 1983). The PCs of moist and air-dry laboratory scans were fit to the cluster centroids established from in situ scans.
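The two clustering steps (deriving centroids from in situ scores, then fitting laboratory scores to those centroids) can be sketched in plain Python. This is a batch, Lloyd-style iteration from user-supplied starting centroids, which is an assumption and may differ from the variant used in the study. Function names and the toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid updates until centroids stop moving."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # An empty cluster keeps its previous centroid.
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else old)
        if new == centroids:
            break
        centroids = new
    return centroids, assign(points, centroids)


# Toy usage: cluster "in situ" scores, then fit "laboratory" scores
# to the existing centroids without re-estimating them.
in_situ_scores = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, in_situ_classes = kmeans(in_situ_scores, [[0.0, 0.0], [10.0, 10.0]])
lab_classes = assign([[1.0, 1.0], [9.0, 9.0]], centroids)
```

Keeping the centroids fixed when classifying the laboratory scans is what allows class allocations to be compared across scanning conditions.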

Evaluation
Differences between PP, DS and EPO spectra under field moist and air-dry conditions were assessed by calculating Lin's concordance correlation coefficient (LCCC) (Eq. 2) and root mean square error (RMSE) (Eq. 3) of PC scores projected into the PC space of in situ scans.

Soil moisture content varied vertically and horizontally within the profile (Fig. 5). Moisture content showed a local maximum at the soil surface, decreased to 20 cm depth, then increased again in the clayey subsoil. A maximum of 0.25 was observed at 50 cm depth and a minimum of 0.10 was observed at 20 cm depth. This large vertical variability in moisture was accompanied by considerable lateral variability: at 100 cm depth the moisture content ranged from 0.17 to 0.24. The observed vertical and lateral variability in moisture content reinforces the need to remove the effects of moisture to gain useful insights from the spectra.

Moisture and treatment effects on spectra
Spectra taken on field condition cores had a lower reflectance than those taken on air-dry or ground samples, as noted previously (Bowers and Hanks, 1965) (Fig. 6a). The effect was nonlinear: the reduction generally increased with increasing wavelength, and two broad absorption bands were observed at 1,400 nm and 1,900 nm, representing overtones of the fundamental vibrations of water molecules (Stoner and Baumgardner, 1980). Pre-processing only had little effect on removing the influence of variable soil moisture (Fig. 6b). Although spectra did converge around the mineral peak at 2,200 nm, large differences were still observed, specifically at the broad 1,400 nm and 1,900 nm water absorbance peaks.

Direct standardisation reduced the influence of soil moisture (Fig. 6c). For topsoil samples, DS resulted in near perfect alignment of moist samples. However, for subsoil samples DS worked best between 800 and 1,850 nm, with divergence observed in the visible section and also at wavelengths greater than 1,850 nm. External parameter orthogonalisation produced a high degree of convergence between air-dry and field moist samples in the orthogonal space (Fig. 6d). Differences between the topsoil and subsoil samples also appeared to be accentuated.

Transfer to principal component space
The first PC explained 0.69 of the variance for PP, 0.63 for DS and 0.79 for EPO respectively.The cumulative proportion of variance explained by the first four PCs was greater than 0.97 in each instance.All treatments showed a high correlation between scans taken in situ and those taken in the laboratory under field moist condition, indicating that soil moisture was effectively conserved and that field moist results can be extrapolated to in situ readings (Fig. 7).

The effect of moisture can be seen by comparing PC scores of samples scanned in the laboratory under moist and air-dry conditions. Deviations between moist and air-dry PP spectra occurred for all of PC1 and in the subsoil and topsoil for PC2 and PC3 respectively. Direct standardisation exhibited a strong coherence for PC1 and PC2; however, deviations occurred for PC3 and PC4 in the subsoil. Following EPO, there was a strong coherence throughout all four PCs and the PC scores also exhibited stronger vertical differentiation than was seen under PP and DS. Comparison of the first four PC scores for moist and air-dry scans shows that EPO (LCCC = 0.84, RMSE = 9.6) conserved more intrinsic information than DS (LCCC = 0.58, RMSE = 22.3) and PP (LCCC = 0.44, RMSE = 37.0) (Fig. 8).

Comparison of clusters to observed horizons
The four classes identified from clustering the PP PC scores were only able to effectively identify the A horizon in situ, as other horizons showed no continuous vertical disaggregation (Fig. 9). Under moist conditions, PP effectively isolated the A and E horizons from the B horizons. However, under air-dry conditions only one horizon was identified, indicating that the

External parameter orthogonalisation was the most effective approach for identifying horizons in situ, and for conserving class allocations under variable moisture conditions. Continuous horizontal bands that resembled the field observed horizons were identified under all scanning environments.
The success of horizon identification by k-means clustering of VisNIR spectra is attributable to horizon delineation in this soil being driven by strong changes in colour and clay content, rather than by less spectrally active properties such as structure.

Organic carbon ranged from 3.39 g dag⁻¹ at the soil surface to 0.37 g dag⁻¹ in the Bt2 horizon, while clay ranged from 16 to 67 g dag⁻¹ (Table 2).
Assigned classes often did not translate to contiguous, mutually exclusive zones on the soil profile. Associations of classes were observed, especially in the heavily mottled Bt2 horizon. Within-horizon variation is expected, as horizons are never uniform. Horizons may represent gradational zones between two more clearly identifiable horizons, as distinguished with transitional AB and BC horizons. Alternatively, discrete parts of one horizon may be present in another, as represented by broken horizons A/B and B/C. In addition, VisNIR is capable of identifying horizons not identified through field observations (Fajardo et al., 2016).

The preservation of the spatial variability of horizons when captured in this way is likely to provide insight into the development and functioning of soils, in contrast to the homogenisation that occurs when soils are ground and sieved prior to analysis. The benefits of this spatial disaggregation warrant further investigation.

Evaluation of DS and EPO
Direct standardisation produced variable results for the profile wall under the observed moisture contents. Slight improvements in the prediction accuracy of models calibrated following DS have been found when the moisture content of the training set is similar to the moisture content of the unknown sample (Wijewardane et al., 2016b). This moisture-explicit DS adds complexity to the moisture correction process. To apply the correct DS transfer set, a priori knowledge of the sample's moisture content is required. Any method of ascertaining soil moisture that requires drying a sample renders the correction process redundant, as the dried sample could instead be scanned, and is also impractical in situ.

One approach is to predict the soil moisture content directly from the VisNIR spectra. Haubrock et al. (2008) found that soil moisture could be predicted (R² = 0.71) with a normalised soil moisture index utilising just the 1,800 and 2,119 nm wavelength channels. However, this approach could lead to compounding errors when a sample is placed in the wrong moisture class. If moisture classes were to be applied to this soil profile, three different calibration models would be required in total, and two would be required within the majority of lateral transects. It remains unclear if underlying homogeneous spectral response zones would be retained or if they would become a reflection of predicted moisture content and the subsequent transfer matrix applied.

As both moist and air-dry spectra are projected into the same space when applying EPO, a priori knowledge of soil moisture content is not required. EPO was more effective under the variable soil moisture levels seen in this soil profile, which would also be expected when surveying a larger area for delineation of soil map units. It is thus seen as a more effective approach.
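For readers unfamiliar with the normalised soil moisture index mentioned above, the commonly cited form is a normalised difference of reflectance at the two wavelengths. The sketch below assumes that form and uses a hypothetical function name and made-up reflectance values; any conversion from the index to a moisture content or moisture class would require a separate calibration that is not shown.

```python
def nsmi(r_1800, r_2119):
    """Normalised difference of reflectance at 1,800 nm and 2,119 nm
    (assumed form of the normalised soil moisture index)."""
    return (r_1800 - r_2119) / (r_1800 + r_2119)


# Toy usage with made-up reflectance values for a drier and a wetter scan.
drier = nsmi(0.40, 0.38)
wetter = nsmi(0.30, 0.22)
```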

Conclusions
Both EPO and DS were able to reduce the negative effects of soil moisture on VisNIR spectra, whilst retaining useful spectral information. More intrinsic soil information was retained following EPO, as opposed to DS, and k-means classes consistent with field observed horizons were better expressed under field moist and air-dry conditions. The approach can be easily upscaled to mapping soil units spatially.
Lin's concordance correlation coefficient (Eq. 2) and the root mean square error (Eq. 3) are defined as:

\[ \mathrm{LCCC} = \frac{2\sigma_{xy}}{\sigma_x^2 + \sigma_y^2 + (\bar{x} - \bar{y})^2} \qquad (2) \]

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (3) \]

where $\sigma_{xy}$ is the covariance between the two variables, $\sigma_x^2$, $\sigma_y^2$ are the variance of each variable, $\bar{x}$, $\bar{y}$ are the mean of each variable, $y_i$ is the $i$th observed value, $\hat{y}_i$ is the $i$th predicted value and $n$ is the number of observations.

Qualitative assessment of homogeneous spectral response zones was provided by comparison of the distribution of classes on the profile with field observed horizons.
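The two evaluation metrics can be computed with a few lines of standard-library Python. This is a minimal sketch using the usual definitions of LCCC and RMSE with population (1/n) moments, which is an assumption. The function names and toy values are hypothetical.

```python
import math


def lccc(x, y):
    """Lin's concordance correlation coefficient using 1/n moments."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)


def rmse(observed, predicted):
    """Root mean square error between paired values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Toy usage: scores offset by a constant are perfectly correlated,
# yet the bias term lowers the concordance below 1.
air_dry = [1.0, 2.0, 3.0]
moist = [2.0, 3.0, 4.0]
concordance = lccc(air_dry, moist)
error = rmse(air_dry, moist)
```

Unlike a correlation coefficient, LCCC penalises a systematic offset between the moist and air-dry scores, which is why it suits an assessment of how well a correction reconciles the two.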
spectra of air-dry subsoil samples are more similar to moist topsoil samples. This reaffirms the notion that variation in moisture can greatly exceed variation between samples (Wijewardane et al., 2016a).

Direct standardisation effectively identified three horizons in situ, despite the A and E horizons being combined. Under field moist condition in the laboratory, the separation of the two B horizons is less clear and is completely removed under ground condition, where DS could only effectively identify two horizons, with the E horizon split in half.

Figure 1: Location of the sampled profile in relation to Sydney within the state of New South Wales, Australia.

Figure 2: Photograph of the prepared soil profile displaying a natural face section (left) and the prepared 1 m x 1 m sampling area. Galvanised nails were inserted on a 10 cm grid to guide sampling locations.

Figure 3: Schematic representation of sampling design of the soil profile.

Figure 4: Comparison of matrix structure: a) direct standardisation transfer matrix; and b) external parameter orthogonalisation projection matrix.

Figure 5: Box plots displaying the distribution of gravimetric moisture content by depth.

Figure 6: Comparison of a representative topsoil and subsoil sample scanned in field moist and air-dried condition: a) trimmed and splice corrected reflectance spectra (500-2,450 nm); b) pre-processed (PP) spectra; c) direct standardisation (DS) approach whereby the moist sample is corrected to resemble the

Figure 7: Principal component scores for VisNIR spectra obtained in situ (black), field moist in the laboratory (blue) and air-dry in the laboratory (red).

Figure 8: The first four PCs of VisNIR spectra under moist and air-dry condition: a) pre-processing only; b) direct standardisation; c) external parameter orthogonalisation.

Figure 9: The distribution of classes identified by k-means clustering of the first four PC scores of spectra following: i) PP, pre-processed only spectra; ii) DS, direct standardisation; and iii) EPO, external parameter orthogonalisation. Field observed horizon boundaries are indicated by dashed horizontal lines. Horizon designations are indicated.

Table 1 Horizon-based field observations of a 1 m x 1 m surface of the soil profile.