<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">SOIL</journal-id><journal-title-group>
    <journal-title>SOIL</journal-title>
    <abbrev-journal-title abbrev-type="publisher">SOIL</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">SOIL</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">2199-398X</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/soil-5-79-2019</article-id><title-group><article-title>Using deep learning for digital soil mapping</article-title><alt-title>Using deep learning for digital soil mapping</alt-title>
      </title-group><?xmltex \runningtitle{Using deep learning for digital soil mapping}?><?xmltex \runningauthor{J. Padarian et al.}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes">
          <name><surname>Padarian</surname><given-names>José</given-names></name>
          <email>jose.padarian@sydney.edu.au</email>
        <ext-link>https://orcid.org/0000-0003-2250-5299</ext-link></contrib>
        <contrib contrib-type="author" corresp="no">
          <name><surname>Minasny</surname><given-names>Budiman</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no">
          <name><surname>McBratney</surname><given-names>Alex B.</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-0913-2643</ext-link></contrib>
        <aff id="aff1"><institution>Sydney Institute of Agriculture and School of Life and Environmental Sciences, the University of Sydney, Sydney, New South Wales, Australia</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">José Padarian (jose.padarian@sydney.edu.au)</corresp></author-notes><pub-date><day>26</day><month>February</month><year>2019</year></pub-date>
      
      <volume>5</volume>
      <issue>1</issue>
      <fpage>79</fpage><lpage>89</lpage>
      <history>
        <date date-type="received"><day>14</day><month>August</month><year>2018</year></date>
           <date date-type="rev-request"><day>3</day><month>September</month><year>2018</year></date>
           <date date-type="rev-recd"><day>18</day><month>January</month><year>2019</year></date>
           <date date-type="accepted"><day>7</day><month>February</month><year>2019</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2019 José Padarian et al.</copyright-statement>
        <copyright-year>2019</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019.html">This article is available from https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019.html</self-uri><self-uri xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019.pdf">The full text article is available as a PDF file from https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019.pdf</self-uri>
      <abstract><title>Abstract</title>
    <p id="d1e95">Digital soil mapping (DSM) has been widely used as a cost-effective
method for generating soil maps. However, current DSM data representation
rarely incorporates contextual information of the landscape. DSM models are
usually calibrated using point observations intersected with spatially
corresponding point covariates. Here, we demonstrate the use of the
convolutional neural network (CNN) model that incorporates contextual information
surrounding an observation to significantly improve the prediction accuracy
over conventional DSM models. We describe a CNN model that takes inputs as images of covariates and explores spatial
contextual information by finding non-linear local spatial relationships of
neighbouring pixels. Unique features of the proposed model include input
represented as a 3-D stack of images, data augmentation to reduce overfitting,
and the simultaneous prediction of multiple outputs. Using a soil mapping example
in Chile, the CNN model was trained to simultaneously predict soil organic
carbon at multiple depths across the country. The results showed that, in
this study, the CNN model reduced the error by 30 % compared with
conventional techniques that only used point information of covariates. In
the example of country-wide mapping at 100 m resolution, neighbourhood
sizes of 3 to 9 pixels were more effective than a single point location or
larger neighbourhood sizes. In addition, the CNN model produces less prediction
uncertainty and is able to predict soil carbon at deeper soil layers more
accurately. Because the CNN model takes covariates represented as images, it
offers a simple and effective framework for future DSM models.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <title>Introduction</title>
      <p id="d1e105">Digital soil mapping (DSM) has now been widely used globally for mapping soil
classes and properties <xref ref-type="bibr" rid="bib1.bibx6" id="paren.1"/>. In particular, DSM
has been used to map soil carbon efficiently around the world (e.g.
<xref ref-type="bibr" rid="bib1.bibx11" id="altparen.2"/>). DSM methodology has been adopted by FAO (2018) so
that digital soil maps can be produced reliably for sustainable land
management. While DSM can be said to now be operational, there are still
unresolved methodological issues regarding a better representation of landscape
pattern and soil processes. Some of the methodological research studies
include the use of multiple remotely sensed images
<xref ref-type="bibr" rid="bib1.bibx45" id="paren.3"/> or time series of images as covariates
<xref ref-type="bibr" rid="bib1.bibx15" id="paren.4"/>, testing of novel regression and machine learning
models <xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx53" id="paren.5"/>, and incorporation
of spatial residuals of the regression model
<xref ref-type="bibr" rid="bib1.bibx27 bib1.bibx4" id="paren.6"/>.</p>
      <?pagebreak page80?><p id="d1e127">The DSM methodology was formalised in the publication of
<xref ref-type="bibr" rid="bib1.bibx35" id="text.7"/>. Following the ideas of
<xref ref-type="bibr" rid="bib1.bibx17" id="text.8"/> and <xref ref-type="bibr" rid="bib1.bibx23" id="text.9"/>, they described the
<italic>scorpan</italic> model as the empirical quantitative relationship of a soil
attribute and its spatially implicit forming factors. Such factors correspond
to the following: <inline-formula><mml:math id="M1" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> – soil, which refers to the other properties
of the soil at a point; <inline-formula><mml:math id="M2" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula> – climate, which refers to the climatic
properties of the environment at a point; <inline-formula><mml:math id="M3" display="inline"><mml:mi>o</mml:mi></mml:math></inline-formula> – organisms, which refers to the vegetation or fauna
or human activity; <inline-formula><mml:math id="M4" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> – topography, which refers to the landscape attributes; <inline-formula><mml:math id="M5" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> – parent
material, which refers to the lithology; <inline-formula><mml:math id="M6" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> – age, which refers to the time factor; and <inline-formula><mml:math id="M7" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> – space, which refers to the spatial
position. Explicitly, the scorpan model can be written as

              <disp-formula specific-use="align" content-type="numbered"><mml:math id="M8" display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>f</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>o</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mlabeledtr id="Ch1.E1"><mml:mtd/><mml:mtd><mml:mstyle displaystyle="true" class="stylechange"/></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" 
displaystyle="true"/><mml:mo>+</mml:mo><mml:msub><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

          where <inline-formula><mml:math id="M9" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> corresponds to the coordinates of a soil observation, and <inline-formula><mml:math id="M10" display="inline"><mml:mi>e</mml:mi></mml:math></inline-formula>
is the spatial residual.</p>
      <p id="d1e386">The usual steps for deriving the scorpan spatial soil prediction
functions include intersecting soil observations (point data) with the
scorpan factors (raster images at a particular resolution) and
calibrating a prediction function <inline-formula><mml:math id="M11" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula>. In effect, we are only looking at
relationships between point observations and point representation of
covariates. The scorpan factors have implicit spatial information;
however the prediction function <inline-formula><mml:math id="M12" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> does not explicitly take into account the
spatial relationship.</p>
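The point-intersection step described above can be sketched as follows; this is a minimal illustration with a hypothetical covariate stack and observation coordinates, not the authors' pipeline (real workflows read georeferenced rasters with GIS tooling).

```python
import numpy as np

# Hypothetical covariate stack: n = 5 scorpan rasters on a shared grid.
rng = np.random.default_rng(0)
covariates = rng.random((5, 100, 100))

def extract_point_values(stack, rows, cols):
    """Intersect soil observations with covariates: one pixel vector each."""
    return stack[:, rows, cols].T  # shape (n_observations, n_covariates)

rows = np.array([10, 50, 80])  # hypothetical observation pixel coordinates
cols = np.array([20, 60, 90])
X = extract_point_values(covariates, rows, cols)
print(X.shape)  # (3, 5): the usual input for a prediction function f
```

Each row of `X` is then paired with the soil attribute observed at that point to calibrate `f`.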
      <p id="d1e403">Attempts have been made to incorporate more local information in the
scorpan covariates, in particular topography. Approaches to include
covariate information about the vicinity around the observations <inline-formula><mml:math id="M13" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> have
been devised. One approach is to derive topographic or terrain attributes
(e.g. slope, curvature) at multiple scales by expanding the size of the
window or neighbour size in the calculation
<xref ref-type="bibr" rid="bib1.bibx37 bib1.bibx7" id="paren.10"/>. Another approach includes
multi-scale analysis using spatial filters such as wavelets on the covariate
raster <xref ref-type="bibr" rid="bib1.bibx10 bib1.bibx56" id="paren.11"/>. Thus, the raster
represents larger spatial support. Studies indicated that, generally,
covariates with larger support than their original resolution could
enhance the prediction accuracy of the model <xref ref-type="bibr" rid="bib1.bibx36 bib1.bibx56" id="paren.12"/>.</p>
      <p id="d1e432">DSM can be thought of as linking observable landscape structure and soil
processes expressed as observed soil properties. To effectively link
structure and processes, <xref ref-type="bibr" rid="bib1.bibx16" id="text.13"/> suggested the use of
analysis that spans over several spatial and temporal scales.
<xref ref-type="bibr" rid="bib1.bibx9" id="text.14"/> proposed contextual spatial modelling to
account for the interactions of covariates across multiple scales and their
influence on soil formation. The authors' approach (e.g.
<xref ref-type="bibr" rid="bib1.bibx7 bib1.bibx8" id="altparen.15"/>) derived covariates based on the
elevation at the local to the regional extent. Their approaches include
ConMap <xref ref-type="bibr" rid="bib1.bibx7" id="paren.16"/>, which is based on elevation differences from
the centre pixel to each pixel in a sparse neighbourhood, and ConStat
<xref ref-type="bibr" rid="bib1.bibx8" id="paren.17"/>, which used statistical measures of elevation within
growing sparse circular spatial neighbourhoods. These approaches produce a
large number of predictors computed for each location, as shown in an example
with 100 distance scales (e.g. from 20 m to 20 km) and 1000 predictors per
grid cell. These hyper-covariates, solely based on elevation, are used as
predictors in a random forest regression model.</p>
      <p id="d1e450">Spatial filtering, multi-scale terrain calculation, and contextual mapping
approaches require the preprocessing of each covariate independently. The
useful scale for each covariate must be determined through numerical
experiments, and the process often relies on ad hoc decisions.
Here, we take advantage of the success of deep learning models that are used
for image recognition, as an effective tool in DSM to optimally search for
local contextual information of covariates. This work aims to expand the
classic DSM approach by including information about the vicinity around
<inline-formula><mml:math id="M14" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> to fully leverage the spatial context of a soil observation. The aim
is achieved by devising a convolutional neural network (CNN) which can take
multiple spatial contextual inputs.</p>
</sec>
<sec id="Ch1.S2">
  <title>Rationale</title>
      <p id="d1e475">The theoretical background of DSM is based on the relationship between a soil
attribute and soil-forming factors. In practice, a single soil observation is
usually described as a point <inline-formula><mml:math id="M15" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> with coordinates <inline-formula><mml:math id="M16" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>
(Eq. <xref ref-type="disp-formula" rid="Ch1.E1"/>), and the corresponding soil-forming factors are
represented by a vector of pixel values of multiple covariate rasters <inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> at the same location, where <inline-formula><mml:math id="M18" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> is the total number of
covariate rasters.</p>
      <p id="d1e543">Soils are highly dependent on their position in the landscape, and
information at a particular pixel might not be sufficient to represent that
complex relationship. Our method expands the classic DSM approach by
replacing the covariates, usually represented as a vector, with a 3-D array
with shape <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M20" display="inline"><mml:mi>w</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M21" display="inline"><mml:mi>h</mml:mi></mml:math></inline-formula> are the width and height in pixels
of a window centred at point <inline-formula><mml:math id="M22" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> (Fig. <xref ref-type="fig" rid="Ch1.F1"/>). Methods commonly
used in DSM are not designed to adequately handle the data structure depicted
in Fig. <xref ref-type="fig" rid="Ch1.F1"/>. The data representation is similar to the network
model by <xref ref-type="bibr" rid="bib1.bibx33" id="text.18"/> which used hyperspectral images for
classification purposes.</p>
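The (w, h, n) representation of Fig. 1 can be sketched as a window extraction around each observation; the stack, coordinates, and window size below are illustrative placeholders.

```python
import numpy as np

# Hypothetical stack of n = 5 covariate rasters on a 100 x 100 grid.
rng = np.random.default_rng(0)
stack = rng.random((5, 100, 100))

def extract_window(stack, row, col, half=4):
    """Return the (w, h, n) vicinity centred on observation p.
    half=4 yields a 9 x 9 pixel window with p at its centre."""
    patch = stack[:, row - half:row + half + 1, col - half:col + half + 1]
    return np.moveaxis(patch, 0, -1)  # reorder (n, w, h) to (w, h, n)

patch = extract_window(stack, 50, 60)
print(patch.shape)  # (9, 9, 5)
```

The central pixel of the patch holds exactly the vector that the classic point-based approach would use, so the window strictly adds information.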
      <p id="d1e595">As described in the introduction, while multi-scale or contextual mapping
approaches have been used in DSM, they still rely on a vector representation
of covariates and rely on machine learning methods such as random forest to
select important predictors. While deep learning methods have been used in
DSM (e.g. <xref ref-type="bibr" rid="bib1.bibx54" id="altparen.19"/>), most studies still use a vector
representation of covariates.</p>
      <p id="d1e601">In the following sections, we introduce the use of convolutional neural
networks to exploit the spatial information of covariates and thereby
perform more effective DSM.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F1"><label>Figure 1</label><caption><p id="d1e607">Representation of the vicinity around a soil observation <inline-formula><mml:math id="M23" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula>, for
<inline-formula><mml:math id="M24" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> number of covariate rasters. <inline-formula><mml:math id="M25" display="inline"><mml:mi>w</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M26" display="inline"><mml:mi>h</mml:mi></mml:math></inline-formula> are the width and height in
pixels, respectively.</p></caption>
        <?xmltex \igopts{width=156.490157pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f01.png"/>

      </fig>

</sec>
<sec id="Ch1.S3">
  <title>Deep learning</title>
      <?pagebreak page81?><p id="d1e650">Deep learning is a machine learning method that is able to learn the
representation of data through a series of processing layers. In agriculture
and environmental mapping, it is mainly used in hyperspectral and
multispectral image classification problems, e.g. land cover classification
<xref ref-type="bibr" rid="bib1.bibx25" id="paren.20"/>. We have not seen much application of deep learning
in DSM, except for <xref ref-type="bibr" rid="bib1.bibx54" id="text.21"/>, who used a deep belief network
for predicting soil moisture.<?xmltex \hack{\newpage}?></p>
      <p id="d1e660">In this section we briefly introduce CNNs and some associated methods used
during this work. For a more detailed and general description about CNNs we
refer the reader to <xref ref-type="bibr" rid="bib1.bibx30" id="text.22"/> and
<xref ref-type="bibr" rid="bib1.bibx29" id="text.23"/>.</p>
<sec id="Ch1.S3.SS1">
  <title>CNN</title>
      <p id="d1e674">CNNs are based on the concept of a layer of convolving windows which move
along a data array in order to detect features (e.g. edges) of the data by
using different filters (Fig. <xref ref-type="fig" rid="Ch1.F2"/>). When stacked together,
convolutional layers are capable of extracting features of increasing
complexity and abstraction <xref ref-type="bibr" rid="bib1.bibx30" id="paren.24"/>. Since CNNs have the
capacity to leverage the spatial structure of the data, they have been widely
and effectively used in computer vision for image recognition or extraction
<xref ref-type="bibr" rid="bib1.bibx31" id="paren.25"/>.</p>
      <p id="d1e685">A CNN has a number of three-dimensional hidden layers, with each layer learning to
detect different features of the input images <xref ref-type="bibr" rid="bib1.bibx32" id="paren.26"/>. In our
case, each of the layers can perform one of two types of operations:
convolution or pooling. Convolution takes the input images through a set of
convolutional filters (e.g. a <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula> size filter), each of which detects and
enhances certain features from the images. Units in a convolutional layer are
organised in feature maps (here we used <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula>). Each unit of the feature map
is connected to local patches in the feature maps of the previous layer
through a set of weights. This local weighted sum is then passed through a
non-linear transfer function.</p>
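The local weighted sum described above is the operation illustrated in Fig. 2 and can be made explicit with a short loop; deep learning libraries implement this far more efficiently, and the example filter here is only a hypothetical illustration.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide a filter over an image; each output pixel is the sum of the
    element-wise multiplication of the local patch and the filter."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)        # the 5 x 5 array of Fig. 2
edge_filter = np.array([[1., 0., -1.]] * 3)  # a simple vertical-edge filter
print(convolve2d_valid(image, edge_filter).shape)  # (3, 3)
```

In a CNN the filter weights are not fixed like this edge filter; they are learned during training.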

      <?xmltex \floatpos{t}?><fig id="Ch1.F2"><label>Figure 2</label><caption><p id="d1e717">Example of the first three steps of a convolution of a <inline-formula><mml:math id="M29" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula>
filter over a <inline-formula><mml:math id="M30" display="inline"><mml:mrow><mml:mn mathvariant="normal">5</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula> array (image). The resulting pixel values
correspond to the sum of the element-wise multiplication of the initial
pixels (dashed lines) and the filter.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f02.png"/>

        </fig>

      <p id="d1e750">A pooling operation merges similar features by performing non-linear
down-sampling. Here we used max-pooling layers which combine inputs from a
small <inline-formula><mml:math id="M31" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula> window. Pooling also makes the features robust against
noise. All the convolutional and pooling layers are finally “flattened” to
the fully connected layer. In effect, the fully connected layer is a weighted
sum of the previous layers.<?xmltex \hack{\newpage}?></p>
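Max-pooling over a 2 x 2 window, as used here, can be written in a few lines; this sketch assumes an input with even side lengths for simplicity.

```python
import numpy as np

def max_pool_2x2(x):
    """Non-linear down-sampling: keep the maximum of each 2 x 2 window."""
    h, w = x.shape
    trimmed = x[:h - h % 2, :w - w % 2]  # drop odd trailing row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [4., 5., 6., 7.]])
print(max_pool_2x2(x))
# [[4. 8.]
#  [9. 7.]]
```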
      <p id="d1e767">To obtain optimal weights for the network, we train the network using a
training dataset. Weights were adjusted based on a gradient-based algorithm
to minimise the error using an Adam optimiser <xref ref-type="bibr" rid="bib1.bibx28" id="paren.27"/>. We
refer to a review by <xref ref-type="bibr" rid="bib1.bibx32" id="text.28"/> on the details of CNN.</p>
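The gradient-based weight adjustment can be illustrated with a toy one-parameter problem; note that this sketch uses plain gradient descent, whereas the paper uses the Adam optimiser, which adds per-parameter adaptive learning rates and momentum.

```python
# Minimise f(w) = (w - 3)^2 by repeatedly stepping against the gradient.
def gradient_step(w, grad, lr=0.01):
    return w - lr * grad

w = 0.0
for _ in range(500):
    w = gradient_step(w, 2.0 * (w - 3.0))  # gradient of (w - 3)^2
print(round(w, 3))  # converges towards the minimiser, 3.0
```

In a CNN the same update rule is applied simultaneously to every filter weight, with gradients obtained by backpropagation.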
</sec>
<sec id="Ch1.S3.SS2">
  <title>Multi-task learning</title>
      <p id="d1e782">CNNs have the capacity to predict multiple properties simultaneously. By doing so,
a multi-task CNN is capable of sharing learned representations between
different targets and also using the other targets as “clues” during the
prediction process. In consequence, the error of the simultaneous prediction
is generally lower compared with a single prediction for each target
<xref ref-type="bibr" rid="bib1.bibx42 bib1.bibx48" id="paren.29"/>. An additional advantage of
using a multi-task CNN is the reported reduction in the risk of overfitting
<xref ref-type="bibr" rid="bib1.bibx50" id="paren.30"/>.</p>
      <p id="d1e791">In DSM, the combination of large extents, high resolution, and
bootstrap routines leads to running multiple model realisations on billions
of pixels; because CNNs use a group of pixels around each
soil observation instead of a single pixel, the time and computational
resources required for training and inference are an important factor. Due to
the simultaneous training and inference of multiple targets, a multi-task CNN
presents the advantage of reducing both training and inference time compared
with a single-task model.</p>
</sec>
</sec>
<sec id="Ch1.S4">
  <title>Methods</title>
<sec id="Ch1.S4.SS1">
  <title>Data</title>
      <p id="d1e806">The data used in this work correspond to Chilean soil information. Since most
observations are distributed on agricultural lands, we complemented that
information with a second small data collection compiled from the literature
and collaborators. We selected soil organic carbon (SOC) content (%) at depths
0–5, 5–15, 15–30, 30–60, and 60–100 cm as our target attribute. In total, 485 soil
profiles were used after excluding soil profiles shallower than
100 cm (in<?pagebreak page82?> order to ensure that all the profiles have observations at all
depth intervals). For more details about the data and depth standardisation
we refer the reader to <xref ref-type="bibr" rid="bib1.bibx41" id="text.31"/>.</p>
      <p id="d1e812">As covariates, we used (a) a digital elevation model (HydroSHEDS,
<xref ref-type="bibr" rid="bib1.bibx34" id="altparen.32"/>), which is provided at 3 arcsec resolution, in
addition to its derived slope and topographic wetness index, calculated using
SAGA <xref ref-type="bibr" rid="bib1.bibx14" id="paren.33"/>; and (b) long-term mean annual temperature and
total annual rainfall derived from information provided by WorldClim
<xref ref-type="bibr" rid="bib1.bibx22" id="paren.34"/> at 30 arcsec resolution. All data layers were
standardised to a 100 m grid size.</p>
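Standardising layers of different native resolutions onto a common grid can be sketched with a simple nearest-neighbour resampler; this is a stand-in for the reprojection normally done with GIS tooling (e.g. SAGA or GDAL), and the array below is a placeholder.

```python
import numpy as np

def resample_nearest(raster, out_shape):
    """Nearest-neighbour resampling of a raster onto a target grid shape."""
    rows = (np.arange(out_shape[0]) * raster.shape[0] / out_shape[0]).astype(int)
    cols = (np.arange(out_shape[1]) * raster.shape[1] / out_shape[1]).astype(int)
    return raster[np.ix_(rows, cols)]

dem = np.arange(12.0).reshape(3, 4)  # hypothetical coarse raster
print(resample_nearest(dem, (6, 8)).shape)  # (6, 8)
```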
</sec>
<sec id="Ch1.S4.SS2">
  <title>Data augmentation</title>
      <p id="d1e830">Deep learning techniques are described as “data-hungry” since they usually
work better with large volumes of data. The direct effect of data
augmentation is to generate new samples by modifying the original data
without changing its meaning <xref ref-type="bibr" rid="bib1.bibx52" id="paren.35"/>. To achieve this, we
rotated the 3-D array shown in Fig. <xref ref-type="fig" rid="Ch1.F1"/> by 90, 180, and 270<inline-formula><mml:math id="M32" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>, thereby quadrupling the number of observations. It is important to
note that the central pixel preserves its initial position.</p>
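The rotation-based augmentation can be sketched with NumPy's `rot90`; because the window has an odd number of pixels per side, the central pixel (the observation) stays in place under every rotation.

```python
import numpy as np

def augment_rotations(patch):
    """Return the original (w, h, n) patch plus its 90, 180, and 270-degree
    rotations in the spatial plane, quadrupling the sample count."""
    return [np.rot90(patch, k, axes=(0, 1)) for k in range(4)]  # k=0: original

patch = np.arange(9 * 9 * 5, dtype=float).reshape(9, 9, 5)  # hypothetical window
rotated = augment_rotations(patch)
print(len(rotated), rotated[1].shape)  # 4 (9, 9, 5)
```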
      <p id="d1e847">A secondary effect of data augmentation is regularisation, reducing the
variance of the model and overfitting <xref ref-type="bibr" rid="bib1.bibx29" id="paren.36"/>. Data
augmentation also induces rotation invariance <xref ref-type="bibr" rid="bib1.bibx57" id="paren.37"/> by
generating alternative situations (rotated data) where the model response
should be similar to the original data (e.g. a soil profile next to a gully
is expected to be similar to a profile next to the opposite side of the
gully, <italic>ceteris paribus</italic>).</p>
</sec>
<sec id="Ch1.S4.SS3">
  <title>Network architecture</title>
      <p id="d1e865">The multi-task CNN used in this study (Fig. <xref ref-type="fig" rid="Ch1.F3"/>;
Table <xref ref-type="table" rid="Ch1.T1"/>) consists of an input layer passed through a
series of convolutional and pooling layers with a ReLU (rectified linear
unit) activation function, which adds non-linearity by
passing the weighted sums through the function <inline-formula><mml:math id="M33" display="inline"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>max⁡</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The
initial shared network serves to extract features common to the five
target depth ranges. Next, the shared features are
propagated through independent branches, one per depth range, of three fully
connected layers.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F3" specific-use="star"><label>Figure 3</label><caption><p id="d1e902">Architecture of the multi-task network. “Shared layers” represent
the layers shared by all the depth ranges. Each branch, one per depth range,
first flattens the information to a 1-D array, followed by a series of
two fully connected layers and a fully connected layer of size equal to 1, which
corresponds to the final prediction.</p></caption>
          <?xmltex \igopts{width=441.017717pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f03.png"/>

        </fig>

      <p id="d1e911">The many connections between the layers generate a large number of
parameters. In order to reduce the risk of overfitting, we introduce a
dropout rate. In between the layers, 30 % of the connections were randomly
disconnected <xref ref-type="bibr" rid="bib1.bibx40" id="paren.38"/>. We added this dropout operation
in the shared layer and another dropout before the output.</p>
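Dropout can be sketched as a random mask on the activations; this is the common "inverted dropout" variant, which rescales surviving units so expected activations are unchanged at inference time (the exact variant used by the training framework is an assumption here).

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate=0.3):
    """Randomly disconnect a fraction `rate` of units during training and
    rescale the survivors by 1 / (1 - rate)."""
    mask = rng.random(activations.shape) >= rate  # keep ~70 % of units
    return activations * mask / (1.0 - rate)

a = np.ones(1000)
d = dropout(a)
print(d.shape)  # survivors are scaled to 1 / 0.7; dropped units are 0
```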

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T1"><label>Table 1</label><caption><p id="d1e921">Sequence of layers used to build the multi-task neural network.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Layer type</oasis:entry>
         <oasis:entry colname="col2">Kernel size</oasis:entry>
         <oasis:entry colname="col3">Filters</oasis:entry>
         <oasis:entry colname="col4">Activation</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Convolutional<inline-formula><mml:math id="M36" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">a</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3">16</oasis:entry>
         <oasis:entry colname="col4">ReLU</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Max-pooling<inline-formula><mml:math id="M38" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">a</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3">–</oasis:entry>
          <oasis:entry colname="col4">–</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Dropout (0.3)<inline-formula><mml:math id="M40" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">a</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">–</oasis:entry>
         <oasis:entry colname="col3">–</oasis:entry>
          <oasis:entry colname="col4">–</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convolutional<inline-formula><mml:math id="M41" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">a</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3">32</oasis:entry>
         <oasis:entry colname="col4">ReLU</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Fully connected<inline-formula><mml:math id="M43" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">b</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">–</oasis:entry>
         <oasis:entry colname="col3">10</oasis:entry>
         <oasis:entry colname="col4">ReLU</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Dropout (0.3)<inline-formula><mml:math id="M44" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">b</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">–</oasis:entry>
         <oasis:entry colname="col3">–</oasis:entry>
          <oasis:entry colname="col4">–</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Fully connected<inline-formula><mml:math id="M45" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">b</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">–</oasis:entry>
         <oasis:entry colname="col3">10</oasis:entry>
         <oasis:entry colname="col4">ReLU</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Fully connected<inline-formula><mml:math id="M46" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">b</mml:mi></mml:msup></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">–</oasis:entry>
         <oasis:entry colname="col3">1</oasis:entry>
         <oasis:entry colname="col4">ReLU</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table><table-wrap-foot><p id="d1e924"><inline-formula><mml:math id="M34" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">a</mml:mi></mml:msup></mml:math></inline-formula> Shared layers; <inline-formula><mml:math id="M35" display="inline"><mml:msup><mml:mi/><mml:mi mathvariant="normal">b</mml:mi></mml:msup></mml:math></inline-formula> for each property.</p></table-wrap-foot></table-wrap>
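The layer sequence in Table 1 can be sketched with the Keras functional API (the framework used in this study; see Sect. on implementation). The window size, number of covariates, number of depth ranges, and the use of "same" padding are illustrative assumptions, not details taken from the authors' code.

```python
# Minimal sketch of the multi-task network in Table 1.
# Assumptions (not from the paper's code): 7x7 window, 5 covariates,
# 5 depth ranges, and "same" padding so the 2x2 pooling stays valid.
import numpy as np
from tensorflow.keras import layers, Model

def build_multitask_cnn(window=7, n_covariates=5, n_depths=5):
    inp = layers.Input(shape=(window, window, n_covariates))
    # Shared layers (marked "a" in Table 1)
    x = layers.Conv2D(16, (3, 3), padding="same", activation="relu")(inp)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Dropout(0.3)(x)
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    # One branch per predicted property/depth (marked "b" in Table 1)
    outputs = []
    for d in range(n_depths):
        b = layers.Dense(10, activation="relu")(x)
        b = layers.Dropout(0.3)(b)
        b = layers.Dense(10, activation="relu")(b)
        outputs.append(layers.Dense(1, activation="relu", name=f"depth_{d}")(b))
    return Model(inputs=inp, outputs=outputs)
```

Each branch shares the convolutional feature extractor, which is what produces the synergistic effect between depth ranges discussed in the results.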

</sec>
<sec id="Ch1.S4.SS4">
  <title>Inputs</title>
      <p id="d1e1200">As explained in Sect. <xref ref-type="sec" rid="Ch1.S2"/>, our method uses a window around
a soil observation which encloses a group of pixels instead of the single
pixel that coincides with the observation. Most likely, the extent or size of
that window will affect the model performance. To assess this effect, we
compared the results of different models trained with a window size of 3, 5,
7, 9, 15, 21, and 29 pixels.</p>
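Extracting such a window from a covariate stack can be sketched as follows; the array layout and helper name are assumptions for illustration, not the authors' code.

```python
# Sketch: extract a size x size vicinity centred on a soil observation.
# cov_stack is assumed to have shape (height, width, n_covariates).
import numpy as np

def extract_window(cov_stack, row, col, size):
    """Return a size x size window of covariates centred on (row, col).

    size must be odd so the observation sits in the centre pixel.
    """
    assert size % 2 == 1, "window size must be odd"
    half = size // 2
    return cov_stack[row - half:row + half + 1,
                     col - half:col + half + 1, :]
```

Observations closer than `size // 2` pixels to the raster edge would need padding or exclusion, a detail omitted here.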
      <p id="d1e1205">As the vicinity size increases, so does the number of parameters of the
network (considering a fixed network architecture) and the risk of
overfitting. To minimise overfitting, we modified the architecture of the
network depending on the vicinity size (Table <xref ref-type="table" rid="Ch1.T2"/>).</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T2"><label>Table 2</label><caption><p id="d1e1213">List of modifications made to the base network architecture for
specific input window sizes.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="2">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="justify" colwidth="142.26378pt"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Window size</oasis:entry>
         <oasis:entry colname="col2">Changes</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M47" display="inline"><mml:mrow><mml:mn mathvariant="normal">15</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">15</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M48" display="inline"><mml:mo>•</mml:mo></mml:math></inline-formula> Extra max-pooling (<inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>) after last convolutional layer</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:mn mathvariant="normal">21</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">21</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M51" display="inline"><mml:mo>•</mml:mo></mml:math></inline-formula> Extra max-pooling (<inline-formula><mml:math id="M52" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>) after last convolutional layer</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M53" display="inline"><mml:mo>•</mml:mo></mml:math></inline-formula> Extra convolutional (<inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula>, 16 filters)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mn mathvariant="normal">29</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">29</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M56" display="inline"><mml:mo>•</mml:mo></mml:math></inline-formula> Extra max-pooling (<inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>) after last convolutional layer</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M58" display="inline"><mml:mo>•</mml:mo></mml:math></inline-formula> Extra convolutional (<inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula>, 64 filters)</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M60" display="inline"><mml:mo>•</mml:mo></mml:math></inline-formula> Dropouts changed to 0.5</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S4.SS5">
  <title>Training and validation</title>
      <p id="d1e1425">First, 10 % (<inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">49</mml:mn></mml:mrow></mml:math></inline-formula>) of the total dataset was randomly selected and
used as a test set. The remaining 90 % of the samples (<inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">436</mml:mn></mml:mrow></mml:math></inline-formula>) were
augmented (see Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/>), obtaining a total of
1744 samples. Following the data augmentation, we performed a bootstrapping
routine <xref ref-type="bibr" rid="bib1.bibx20" id="paren.39"/> with 100 repetitions, where the
training set is obtained by sampling with replacement, generating a set of size 1744. The samples which were not
selected, about one-third, correspond to the out-of-bag validation set.</p>
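One bootstrap repetition of this routine can be sketched as below; the function name is illustrative. Sampling n items with replacement leaves each sample out with probability (1 - 1/n)^n ≈ 1/e ≈ 0.37, which is where the "about one-third" out-of-bag fraction comes from.

```python
# Sketch of one bootstrap repetition (of the 100 used here): sample the
# augmented training set with replacement; the unselected samples form
# the out-of-bag validation set.
import numpy as np

def bootstrap_split(n_samples, rng):
    # indices drawn with replacement (duplicates allowed)
    train_idx = rng.integers(0, n_samples, size=n_samples)
    # samples never drawn -> out-of-bag validation set
    oob_idx = np.setdiff1d(np.arange(n_samples), train_idx)
    return train_idx, oob_idx
```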
      <p id="d1e1457">As a control, we compared our results with a previous study by
<xref ref-type="bibr" rid="bib1.bibx41" id="text.40"/> where we used a Cubist regression tree model
<xref ref-type="bibr" rid="bib1.bibx47" id="paren.41"/> to predict soil organic carbon at a
national extent. The Cubist model has been used<?pagebreak page83?> in many other DSM studies due to its
interpretability and robustness. In that study, we used the same set of soil
observations and covariates described in Sect. <xref ref-type="sec" rid="Ch1.S4.SS1"/>.</p>
</sec>
<sec id="Ch1.S4.SS6">
  <title>Uncertainty analysis</title>
      <p id="d1e1474">In this work (and in <xref ref-type="bibr" rid="bib1.bibx41" id="altparen.42"/>), the uncertainty is
represented as the 90 % prediction interval derived from the 100 bootstrap
iterations. To estimate the upper and lower prediction interval limits, we
used the following formula:
            <disp-formula id="Ch1.E2" content-type="numbered"><mml:math id="M63" display="block"><mml:mrow><mml:mi mathvariant="normal">PIL</mml:mi><mml:mo>=</mml:mo><mml:mover accent="true"><mml:mi>x</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.645</mml:mn><mml:msqrt><mml:mrow><mml:msup><mml:mi mathvariant="italic">σ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>+</mml:mo><mml:mi mathvariant="normal">MSE</mml:mi></mml:mrow></mml:msqrt><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M64" display="inline"><mml:mover accent="true"><mml:mi>x</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M65" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="italic">σ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> are the mean and variance of the 100 iterations,
and MSE is the mean square error of the 100 fitted models.</p>
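Numerically, the interval limits follow directly from the bootstrap predictions; this is a minimal sketch with an illustrative function name (1.645 is the 95th percentile of the standard normal, giving a 90 % interval).

```python
# Sketch: 90 % prediction interval limits from bootstrap predictions,
# PIL = mean +/- 1.645 * sqrt(variance + MSE).
import numpy as np

def prediction_interval(boot_preds, mse):
    """boot_preds: array (n_iterations, n_points);
    mse: mean square error of the fitted models (scalar)."""
    mean = boot_preds.mean(axis=0)       # x-bar over the 100 iterations
    var = boot_preds.var(axis=0)         # sigma^2 over the 100 iterations
    half_width = 1.645 * np.sqrt(var + mse)
    return mean - half_width, mean + half_width
```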
</sec>
<sec id="Ch1.S4.SS7">
  <title>Implementation</title>
      <p id="d1e1539">The CNN was implemented in Python (v3.6.2; <xref ref-type="bibr" rid="bib1.bibx46" id="altparen.43"/>) using Keras
(v2.1.2; <xref ref-type="bibr" rid="bib1.bibx12" id="altparen.44"/>) and Tensorflow (v1.4.1;
<xref ref-type="bibr" rid="bib1.bibx1" id="altparen.45"/>) back end. Computing was done using the
University of Sydney's Artemis high-performance computing facility.</p>
</sec>
</sec>
<sec id="Ch1.S5">
  <title>Results and discussion</title>
<sec id="Ch1.S5.SS1">
  <title>Data augmentation</title>
      <p id="d1e1564">To generalise and improve the CNN model, we created new data using only
information from the training data by rotating the original image input. Data
augmentation was effective at reducing model error and variability
(Fig. <xref ref-type="fig" rid="Ch1.F4"/>). It was possible to observe a decrease in the
error by 10.56 %, 10.56 %, 11.25 %, 14.51 %, and 24.77 % for the 0–5, 5–15, 15–30,
30–60, and 60–100 cm depth ranges, respectively. The results are in
accordance with image classification studies which generally showed that data
augmentation increased the accuracy of classification tasks
<xref ref-type="bibr" rid="bib1.bibx44" id="paren.46"/>. It is hypothesised that, by increasing the
amount of training data, we can reduce overfitting of CNN models.</p>
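The augmentation itself amounts to rotating each training window while keeping its SOC label; the factor of four (436 to 1744 samples) implies the original plus three 90° rotations, which can be sketched as:

```python
# Sketch: augment one training window by rotating it in the spatial
# plane; each rotation keeps the same SOC label.
import numpy as np

def augment_by_rotation(window):
    """Return the original window plus its 90, 180, and 270 degree
    rotations. window: array (size, size, n_covariates)."""
    return [np.rot90(window, k, axes=(0, 1)) for k in range(4)]
```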
      <p id="d1e1572">In terms of the spatial autocorrelation of the data, we need to consider that, after
augmenting the data, we have four samples at the same location with exactly the
same SOC content, therefore assuming that there is no variance when the
distance is equal to 0. That is theoretically true if we consider that the distance is
exactly equal to 0. In practice, when calculating the semivariogram, the
semivariance value of the first bin will be lower, but that does not
significantly affect the final model.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4"><label>Figure 4</label><caption><p id="d1e1577">Effect of using data augmentation as a pretreatment on a 7 pixel <inline-formula><mml:math id="M66" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 7 pixel array.</p></caption>
          <?xmltex \igopts{width=213.395669pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f04.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5" specific-use="star"><label>Figure 5</label><caption><p id="d1e1596">Effect of vicinity size on prediction error, by depth range.
Ref_<inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> corresponds to a fully connected neural network without any
surrounding pixels. Ref_Cubist corresponds to the Cubist models used by
<xref ref-type="bibr" rid="bib1.bibx41" id="text.47"/>.</p></caption>
          <?xmltex \igopts{width=469.470472pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f05.png"/>

        </fig>

</sec>
<sec id="Ch1.S5.SS2">
  <title>Vicinity size</title>
      <?pagebreak page84?><p id="d1e1626">To incorporate contextual information for DSM prediction, we represent the
input as an image, with the observation in the centre and the surrounding
pixels forming a square window. The size of the neighbourhood
window (vicinity) has a significant effect on the prediction error
(Fig. <xref ref-type="fig" rid="Ch1.F5"/>). There is no significant difference when using a
vicinity size of 3, 5, 7, or 9 pixels, but sizes above 9 pixels showed an
increase in the error. It is possible to observe a lower error in the test
dataset, compared with the training and validation, due to the slight
differences in the dataset distributions (Fig. <xref ref-type="fig" rid="Ch1.F6"/>).
Since the SOC distribution is right-skewed, the random sampling used to
generate the test dataset does not completely recreate the original
distribution, excluding samples with very high SOC values. This should not
significantly affect the conclusions given that the error for the samples
with high SOC values is accounted for during the bootstrapping routine and
reflected in the training and validation curves of
Fig. <xref ref-type="fig" rid="Ch1.F5"/>. In this example, for a country-scale mapping of
SOC at 100 m grid size, information from a 150 to 450 m radius is useful. A
similar influence distance was obtained by <xref ref-type="bibr" rid="bib1.bibx24" id="text.48"/> and
<xref ref-type="bibr" rid="bib1.bibx55" id="text.49"/>, who reported a medium-scale spatial correlation
range for SOC in China of around 300 and 550 m, respectively; by
<xref ref-type="bibr" rid="bib1.bibx49" id="text.50"/>, who reported a range of around 190 m for a coastal
forest in Tanzania; and by <xref ref-type="bibr" rid="bib1.bibx18" id="text.51"/>, who reported a range of
around 200 m in two grassland sites in Germany. A similar spatial
correlation range was reported for croplands in a review by
<xref ref-type="bibr" rid="bib1.bibx43" id="text.52"/>, where, based on 41 variograms, the authors
estimated an average spatial correlation range of around 400 m.</p>
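One way to read the 150 to 450 m figure: the radius from the central observation to the window edge is half the window extent, so at 100 m resolution the 3 and 9 pixel windows span 150 and 450 m, respectively. A trivial sketch (function name illustrative):

```python
# Sketch: distance from the central observation to the window edge.
def vicinity_radius_m(window_px, resolution_m=100):
    return window_px * resolution_m / 2.0
```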

      <?xmltex \floatpos{t}?><fig id="Ch1.F6"><label>Figure 6</label><caption><p id="d1e1653">Distribution of the original dataset and the test dataset. The
random sampling excludes some observations with high SOC values.</p></caption>
          <?xmltex \igopts{width=227.622047pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f06.png"/>

        </fig>

      <p id="d1e1662">As described in Sect. <xref ref-type="sec" rid="Ch1.S4.SS3"/>, we slightly modified the
architecture of the network as input window size increased, in order to
minimise the risk of overfitting and isolate the effect of the vicinity size.
As we increase the vicinity size, we give the model a broader spatial
context. Our results show that just a small amount of extra context provides
enough information to improve the model predictions, and a larger amount of
neighbouring information acts as noise, impairing the generalisation of the
model. Since we used a relatively coarse resolution of 100 m, it is hard to
tell specifically what<?pagebreak page85?> the minimum amount of context needed to improve SOC
predictions is. We believe that using higher resolutions (<inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:mo>&lt;</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:math></inline-formula> m) could produce
more insights about this matter.</p>
      <p id="d1e1677">Soil-forming factors interact in complex ways and affect soil properties with
different strengths. At the local scale, a broader context (i.e. a larger vicinity
size) does not necessarily provide extra information to the model, for
instance when one of the factors is relatively homogeneous. The extra
information could be even detrimental if the vicinity size is well beyond the
area of influence of a factor, which is what probably happened when we
increased the vicinity size above 9 pixels (radius <inline-formula><mml:math id="M69" display="inline"><mml:mrow><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">450</mml:mn></mml:mrow></mml:math></inline-formula> m).
Representing this complexity in numerical terms would imply varying the size
of the input array, such as using a different vicinity
size for each forming factor, most likely also varying with the spatial location of the soil
observation (e.g. smaller vicinity for homogeneous areas and larger for
heterogeneous areas). This is technically possible but considerably increases
the complexity of the modelling.</p>
</sec>
<sec id="Ch1.S5.SS3">
  <title>Comparison with other methods</title>
      <p id="d1e1696">We compared our approach with the Cubist model used in our previous study
<xref ref-type="bibr" rid="bib1.bibx41" id="paren.53"/>, where we did not use any contextual information.
We observed a significant decrease in the error (Fig. <xref ref-type="fig" rid="Ch1.F5"/>)
by 23.0 %, 23.8 %, 26.9 %, 35.8 %, and 39.8 % for the 0–5, 5–15, 15–30, 30–60, and
60–100 cm depth ranges, respectively. Most current DSM studies rely on
punctual observations without contextual information, and, given the
improvements shown by our approach, we believe there is a big potential for
CNNs to be used in operational DSM.</p>
      <p id="d1e1704">To compare our results with a method that uses contextual information, we ran
a test using wavelet decomposition as per <xref ref-type="bibr" rid="bib1.bibx36" id="text.54"/>. In
addition to the five covariates, we used their approximation coefficients
from the first, second, and third levels of a Haar decomposition
<xref ref-type="bibr" rid="bib1.bibx13 bib1.bibx21" id="paren.55"/>. The results including
wavelet-decomposed variables were similar to those obtained with the Cubist
model. The CNN approach reduced the error by 24.8 %, 24.7 %, 28.5 %, 28.6 %, and 23.5 %
for the 0–5, 5–15, 15–30, 30–60, and 60–100 cm depth ranges,
respectively. <xref ref-type="bibr" rid="bib1.bibx36" id="text.56"/> reported an average improvement of
1 % for the prediction of clay content. In our case the wavelet
decomposition reduced the error of the SOC content by 5.1 %, on average,
compared with the Cubist model, but the reduction was only observed at depth,
where SOC content is low (2.4 %, 1.2 %, 2.3 %, <inline-formula><mml:math id="M70" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">10.1</mml:mn></mml:mrow></mml:math></inline-formula> %, and
<inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">21.4</mml:mn></mml:mrow></mml:math></inline-formula> % error change for the 0–5, 5–15, 15–30, 30–60, and
60–100 cm depth ranges, respectively), hence reducing the effect in
applications such as carbon accounting. Our approach showed greater error
reductions throughout the whole profile.<?xmltex \hack{\newpage}?></p>
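For reference, the Haar approximation coefficients used as extra covariates can be sketched without a wavelet library: each decomposition level halves the resolution by averaging 2 × 2 blocks (the Haar scaling function, up to a normalisation constant). This plain-numpy version is an illustration, not the decomposition code used in the cited study.

```python
# Sketch: Haar approximation coefficients (levels 1..3) of a covariate
# grid via 2x2 block averaging; each level halves the resolution.
import numpy as np

def haar_approximations(grid, levels=3):
    """grid: 2-D array with sides divisible by 2**levels.
    Returns the approximation grid for each level."""
    approx, out = grid.astype(float), []
    for _ in range(levels):
        h, w = approx.shape
        approx = approx.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        out.append(approx)
    return out
```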
</sec>
<sec id="Ch1.S5.SS4">
  <title>Prediction of deeper soil layers</title>
      <p id="d1e1743">Our approach uses a multi-task CNN to predict multiple depths simultaneously
in order to produce a synergistic effect. Compared with predicting each depth
range in isolation by training a network with the same structure
(Sect. <xref ref-type="sec" rid="Ch1.S4.SS3"/>) but with only one output, our approach
reduced the error by 1.5 %, 6.7 %, 6.6 %, 8.9 %, and 13.0 % for the 0–5, 5–15, 15–30,
30–60, and 60–100 cm depth ranges, respectively. In this case, the reduction
was modest and we believe the effect can be greater when more soil
observations are available.</p>
      <p id="d1e1748">In DSM, there are two main approaches to deal with the vertical variation of
a soil attribute: 2.5-D and 3-D modelling. In the former, an independent
model is fitted for each depth range; the latter explicitly incorporates
depth in order to obtain a single model for the whole profile. Interestingly,
both approaches show a decrease in the variance explained by the model as the
prediction depth increases. In a 3-D mapping of SOC for a 125 km<inline-formula><mml:math id="M72" display="inline"><mml:msup><mml:mi/><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:math></inline-formula> region
in the Netherlands, <xref ref-type="bibr" rid="bib1.bibx26" id="text.57"/> presented <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values of 0.75,
0.23, and 0.09 for the 0–30, 30–60, and 60–90 cm depth ranges,
respectively. In our previous study <xref ref-type="bibr" rid="bib1.bibx41" id="paren.58"/>, the 2.5-D mapping
showed <inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values of 0.39, 0.39, 0.27, 0.19, and 0.17 for the 0–5, 5–15,
15–30, 30–60, and 60–100 cm depth ranges, respectively. Similar studies
show the same trend <xref ref-type="bibr" rid="bib1.bibx3 bib1.bibx38 bib1.bibx2" id="paren.59"/>, independent of the models used or the soil attribute
predicted. This is expected as the information used as covariates usually
represents surface conditions. Our multi-task network presented the opposite
trend (Fig. <xref ref-type="fig" rid="Ch1.F7"/>), showing an increase in the explained
variance as the prediction depth increases.</p>
      <p id="d1e1794">The prediction of the adjacent layers served as guidance, producing a
synergistic effect. A soil attribute through a profile usually has a
predictable behaviour (unless there are lithological discontinuities), which
has been described by many authors in the form of depth functions
<xref ref-type="bibr" rid="bib1.bibx26 bib1.bibx39 bib1.bibx51" id="paren.60"/>. A CNN is
capable of generating an internal representation of the vertical distribution
of the target attribute, which resembles the observed pattern
(Fig. <xref ref-type="fig" rid="Ch1.F8"/>).</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F7"><label>Figure 7</label><caption><p id="d1e1804">Percentage change in model <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> as a function of depth. The
multi-task model corresponds to a CNN trained using a 7 pixel <inline-formula><mml:math id="M76" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 7 pixel
vicinity. Data for “other studies” correspond to validation statistics from
<xref ref-type="bibr" rid="bib1.bibx41" id="text.61"/>, <xref ref-type="bibr" rid="bib1.bibx3" id="text.62"/>,
<xref ref-type="bibr" rid="bib1.bibx38" id="text.63"/>, and <xref ref-type="bibr" rid="bib1.bibx2" id="text.64"/>.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f07.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F8"><label>Figure 8</label><caption><p id="d1e1847">Vertical SOC distribution for 20 randomly selected profiles.
Predictions correspond to the multi-task CNN.</p></caption>
          <?xmltex \igopts{width=227.622047pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f08.png"/>

        </fig>

</sec>
<sec id="Ch1.S5.SS5">
  <title>Visual evaluation of maps</title>
      <?pagebreak page86?><p id="d1e1863">Visually, the maps generated with the Cubist tree model and our multi-task
CNN showed differences (Fig. <xref ref-type="fig" rid="Ch1.F9"/>). In an example for an area
in southern Chile (around 72.57<inline-formula><mml:math id="M77" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> W), the map generated with the
Cubist model (Fig. <xref ref-type="fig" rid="Ch1.F9"/>a) shows more details related with the
topography, but also presents some artefacts due to the sharp limits
generated by the tree rules. On the other hand, the map generated with the
multi-task CNN using a <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:mn mathvariant="normal">7</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:math></inline-formula> window (Fig. <xref ref-type="fig" rid="Ch1.F9"/>b) shows
a smoothing effect, which is an expected consequence of using
neighbouring pixels.<?xmltex \hack{\newpage}?></p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F9"><label>Figure 9</label><caption><p id="d1e1896">Detailed view of the <bold>(a)</bold> map generated by a Cubist model
<xref ref-type="bibr" rid="bib1.bibx41" id="paren.65"/> and <bold>(b)</bold> map generated by our multi-task
CNN showing the smoothing effect of the CNN.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f09.png"/>

        </fig>

</sec>
<sec id="Ch1.S5.SS6">
  <title>Uncertainty</title>
      <p id="d1e1920">A recommended DSM practice is to present a map of a predicted attribute along
with its associated uncertainty <xref ref-type="bibr" rid="bib1.bibx6" id="paren.66"/>. Our multi-task
CNN significantly reduced the prediction interval width (PIW,
Table <xref ref-type="table" rid="Ch1.T3"/>) compared with the Cubist model. On average, we
observed a reduction of 13.8 % and 13.1 % for the CNN model generated
with and without data augmentation pretreatment, respectively, for the first
three depth intervals. Our multi-task CNN model showed a slightly lower
prediction interval coverage, but all were above the proposed 90 %
coverage.</p>
      <p id="d1e1928">In terms of the spatial patterns of the uncertainty
(Fig. <xref ref-type="fig" rid="Ch1.F10"/>), the greater reductions of the PIW were observed
in elevated areas of the Andes, followed by the central valleys. A slight
increase, on the order of 6 %–8 %, was observed in the western coastal ranges.
The reduction of the PIW in the Andes is most likely due to more conservative
extrapolation by the CNN models compared with the Cubist model. It is worth noting that
the central valleys are where most of the agricultural lands are located and
the uncertainty reduction observed in these areas could have important
implications.</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T3"><label>Table 3</label><caption><p id="d1e1936">Median prediction interval width (PIW, SOC %) and proportion of
observations that fell within the 90 % prediction interval (PICP)
estimated at the test dataset locations. For the Cubist model, values were
extracted from the final maps. For the CNN models, the values correspond to
the mean of the 100 bootstrap iterations.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3">Cubist</oasis:entry>
         <oasis:entry colname="col4">Not augmented</oasis:entry>
         <oasis:entry colname="col5">Augmented</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">0–5 cm</oasis:entry>
         <oasis:entry colname="col2">PICP</oasis:entry>
         <oasis:entry colname="col3">0.96</oasis:entry>
         <oasis:entry colname="col4">0.96</oasis:entry>
         <oasis:entry colname="col5">0.94</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">PIW</oasis:entry>
         <oasis:entry colname="col3">7.96</oasis:entry>
         <oasis:entry colname="col4">7.20</oasis:entry>
         <oasis:entry colname="col5">7.25</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">5–15 cm</oasis:entry>
         <oasis:entry colname="col2">PICP</oasis:entry>
         <oasis:entry colname="col3">0.97</oasis:entry>
         <oasis:entry colname="col4">0.96</oasis:entry>
         <oasis:entry colname="col5">0.92</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">PIW</oasis:entry>
         <oasis:entry colname="col3">7.69</oasis:entry>
         <oasis:entry colname="col4">6.15</oasis:entry>
         <oasis:entry colname="col5">6.06</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">15–30 cm</oasis:entry>
         <oasis:entry colname="col2">PICP</oasis:entry>
         <oasis:entry colname="col3">0.97</oasis:entry>
         <oasis:entry colname="col4">0.96</oasis:entry>
         <oasis:entry colname="col5">0.96</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">PIW</oasis:entry>
         <oasis:entry colname="col3">7.16</oasis:entry>
         <oasis:entry colname="col4">6.47</oasis:entry>
         <oasis:entry colname="col5">6.35</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <?xmltex \floatpos{t}?><fig id="Ch1.F10"><label>Figure 10</label><caption><p id="d1e2086">Percentage change in the prediction interval width, using a
Cubist model <xref ref-type="bibr" rid="bib1.bibx41" id="paren.67"/> as the reference. The comparison uses the model trained with
data augmentation pretreatment.</p></caption>
          <?xmltex \igopts{width=156.490157pt}?><graphic xlink:href="https://soil.copernicus.org/articles/5/79/2019/soil-5-79-2019-f10.png"/>

        </fig>

</sec>
</sec>
<sec id="Ch1.S6" sec-type="conclusions">
  <title>Conclusions</title>
      <p id="d1e2105">The incorporation of contextual information into DSM models is an important aspect
that deserves more attention. Since a soil surveyor will look at the
surrounding landscape to make a prediction of soil type, DSM models should
also incorporate information surrounding an observation. We demonstrated the
use of a convolutional neural network as an efficient, effective, and
accurate method to achieve this goal. In particular, we introduced a deep
learning model for DSM with the following innovative features:
<list list-type="bullet"><list-item>
      <p id="d1e2110"><italic>The representation of the input as an image, which takes into account information surrounding a point observation.</italic> A CNN is able to
recognise contextual information and extract multi-scale information automatically,
which circumvents the need to preprocess the data with spatial filtering or multi-scale analysis.</p></list-item><list-item>
      <p id="d1e2116"><italic>The use of data augmentation as a general representation of soil in the landscape, which can reduce overfitting and improve model accuracy.</italic></p></list-item><list-item>
      <p id="d1e2121"><italic>The ability to predict different soil depths simultaneously in a single model, thus taking into account the depth correlation of soil properties and attributes.</italic> In our example, the prediction of soil properties at
deeper layers, which is a common problem in DSM studies, improved significantly.</p></list-item></list></p>
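      <!-- The first two bullet points above can be sketched in a few lines of code. The snippet below is a hypothetical illustration, not the paper's code: the window size, covariate stack, and function names are assumptions, and the eight flip/rotation variants mirror the kind of augmentation described above. -->

```python
import numpy as np

def extract_patch(covariates, row, col, half_window=4):
    """Return a (bands, w, w) patch of covariate rasters centred on an
    observation at (row, col), where w = 2 * half_window + 1 (e.g. 9 x 9)."""
    r0, r1 = row - half_window, row + half_window + 1
    c0, c1 = col - half_window, col + half_window + 1
    return covariates[:, r0:r1, c0:c1]

def augment(patch):
    """Eight symmetries of a square patch (4 rotations x optional flip),
    a common augmentation for image-like inputs."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k=k, axes=(1, 2))  # rotate spatial axes
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=2))    # mirrored copy
    return variants

# toy stack: 5 covariate rasters of 100 x 100 pixels
stack = np.random.rand(5, 100, 100)
patch = extract_patch(stack, row=50, col=50, half_window=4)
print(patch.shape)          # (5, 9, 9)
print(len(augment(patch)))  # 8
```

      <!-- In the paper these patches feed a multi-output CNN; this sketch stops at the data-preparation step. -->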
      <p id="d1e2126">Overall, in this study, we observed an error reduction of 30 % compared with
conventional techniques. The resulting predictions also have less uncertainty.
Furthermore, the use of this data structure with a CNN appears to eliminate
artefacts commonly found in DSM products, such as those caused by the
incompatible scales of covariates or the sharp discontinuities produced by tree models.</p>
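      <!-- The uncertainty statistics reported earlier (PICP and PIW) can be computed directly from an ensemble of predictions. The following is a minimal hypothetical sketch, assuming bootstrapped model outputs and a 90 % central interval; the function name and the simulated ensemble are illustrative only and do not reproduce the paper's intervals. -->

```python
import numpy as np

def picp_piw(ensemble_preds, observed, alpha=0.1):
    """Prediction interval coverage probability (PICP) and mean prediction
    interval width (PIW) for a (n_models, n_obs) array of ensemble
    predictions, using a (1 - alpha) central interval."""
    lower = np.percentile(ensemble_preds, 100 * alpha / 2, axis=0)
    upper = np.percentile(ensemble_preds, 100 * (1 - alpha / 2), axis=0)
    picp = np.mean((observed >= lower) & (observed <= upper))
    piw = np.mean(upper - lower)
    return picp, piw

# simulated bootstrap ensemble: 100 models, 500 observations
rng = np.random.default_rng(0)
preds = rng.normal(loc=10.0, scale=1.0, size=(100, 500))
obs = rng.normal(loc=10.0, scale=1.0, size=500)
picp, piw = picp_piw(preds, obs)
print(picp, piw)  # PICP should be close to 0.90 for well-calibrated intervals
```

      <!-- A lower PIW at a comparable PICP indicates a sharper, equally reliable model, which is the sense in which the CNN's predictions carry less uncertainty than the Cubist baseline. -->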
      <p id="d1e2129">A CNN can handle a large number of covariates and has advantages over other
machine learning algorithms used in DSM, such as random forests and Cubist
regression tree models, because its architecture is flexible and explicitly takes
into account the spatial information of covariates around observations. While there have been
attempts to include information surrounding an observation as covariates in a
random forest model, those inputs still lack a spatial relationship. A CNN
does not require preprocessing such as wavelet transformation; rather, such
a function is built into the model. Other features, such as handling
missing values via data imputation <xref ref-type="bibr" rid="bib1.bibx19" id="paren.68"/>, can be
readily added to the network.</p>
      <p id="d1e2135">The example presented in this paper is a country-wide model at 100 m
resolution, and such an approach needs further testing for regional- to
landscape-scale mapping. The CNN model would also be highly suitable for mapping soil
classes. In addition, the presented model can be used for other environmental
mapping applications.</p>
</sec>

      
      </body>
    <back><notes notes-type="dataavailability"><title>Data availability</title>

      <p id="d1e2142">The data were manually extracted from books, which are
publicly available and cited in Padarian et al. (2017).</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d1e2148">The authors declare that they have no conflict of
interest.</p>
  </notes><?xmltex \hack{\newpage}?><ack><title>Acknowledgements</title><p id="d1e2155">This research was supported by Sydney Informatics Hub, funded by the University of Sydney.<?xmltex \hack{\newline}?><?xmltex \hack{\newline}?>
Edited by: Bas van Wesemael<?xmltex \hack{\newline}?>
Reviewed by: two anonymous referees</p></ack><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><?xmltex \def\ref@label{{Abadi et~al.(2015)Abadi, Agarwal, Barham, Brevdo, Chen, Citro,
Corrado, Davis, Dean, Devin, Ghemawat, Goodfellow, Harp, Irving, Isard, Jia,
Jozefowicz, Kaiser, Kudlur, Levenberg, Man\'{e}, Monga, Moore, Murray, Olah,
Schuster, Shlens, Steiner, Sutskever, Talwar, Tucker, Vanhoucke, Vasudevan,
Vi\'{e}gas, Vinyals, Warden, Wattenberg, Wicke, Yu, and
Zheng}}?><label>Abadi et al.(2015)Abadi, Agarwal, Barham, Brevdo, Chen, Citro,
Corrado, Davis, Dean, Devin, Ghemawat, Goodfellow, Harp, Irving, Isard, Jia,
Jozefowicz, Kaiser, Kudlur, Levenberg, Mané, Monga, Moore, Murray, Olah,
Schuster, Shlens, Steiner, Sutskever, Talwar, Tucker, Vanhoucke, Vasudevan,
Viégas, Vinyals, Warden, Wattenberg, Wicke, Yu, and
Zheng</label><mixed-citation>Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado,
G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp,
A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M.,
Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C.,
Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P.,
Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P.,
Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X.: TensorFlow: Large-Scale
Machine Learning on Heterogeneous Systems, available at:
<uri>https://www.tensorflow.org/</uri> (last access: 22 February 2019), 2015.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Adhikari et al.(2014)Adhikari, Hartemink, Minasny, Kheir, Greve, and
Greve</label><mixed-citation>Adhikari, K., Hartemink, A. E., Minasny, B., Kheir, R. B., Greve, M. B., and
Greve, M. H.: Digital mapping of soil organic carbon contents and stocks in
Denmark, PloS one, 9, e105519, <ext-link xlink:href="https://doi.org/10.1371/journal.pone.0105519" ext-link-type="DOI">10.1371/journal.pone.0105519</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Akpa et al.(2016)Akpa, Odeh, Bishop, Hartemink, and
Amapu</label><mixed-citation>
Akpa, S. I., Odeh, I. O., Bishop, T. F., Hartemink, A. E., and Amapu, I. Y.:
Total soil organic carbon and carbon sequestration potential in Nigeria,
Geoderma, 271, 202–215, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Angelini and Heuvelink(2018)</label><mixed-citation>
Angelini, M. E. and Heuvelink, G. B.: Including spatial correlation in
structural equation modelling of soil properties, Spat. Stat.-Neth., 25,
35–51, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Angelini et al.(2017)</label><mixed-citation>
Angelini, M., Heuvelink, G., and Kempen, B.: Multivariate mapping of soil
with
structural equation modelling, Eur. J. Soil Sci., 68,
575–591, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Arrouays et al.(2014)</label><mixed-citation>Arrouays, D., McBratney, A., Minasny, B., Hempel, J., Heuvelink, G.,
MacMillan, R., Hartemink, A., Lagacherie, P., and McKenzie, N.: The
GlobalSoilMap project specifications, in: GlobalSoilMap: Basis of the Global
Spatial Soil Information System – Proceedings of the 1st GlobalSoilMap
Conference, edited by: Arrouays D., McKenzie, N., Hempel, J., Richer de
Forges, A., and McBratney, A. B., Orleans, France, 7–9 October 2013, CRC
Press, 9–12, <ext-link xlink:href="https://doi.org/10.1201/b16500-4" ext-link-type="DOI">10.1201/b16500-4</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Behrens et al.(2010)Behrens, Schmidt, Zhu, and
Scholten</label><mixed-citation>
Behrens, T., Schmidt, K., Zhu, A.-X., and Scholten, T.: The ConMap approach
for terrain-based digital soil mapping, Eur. J. Soil Sci.,
61, 133–143, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Behrens et al.(2014)Behrens, Schmidt, Ramirez-Lopez, Gallant, Zhu,
and Scholten</label><mixed-citation>
Behrens, T., Schmidt, K., Ramirez-Lopez, L., Gallant, J., Zhu, A.-X., and
Scholten, T.: Hyper-scale digital soil mapping and soil formation analysis,
Geoderma, 213, 578–588, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Behrens et al.(2018)Behrens, Schmidt, MacMillan, and
Rossel</label><mixed-citation>
Behrens, T., Schmidt, K., MacMillan, R., and Rossel, R. V.: Multiscale
contextual spatial modelling with the Gaussian scale space, Geoderma, 310,
128–137, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Biswas and Si(2011)</label><mixed-citation>
Biswas, A. and Si, B. C.: Revealing the controls of soil water storage at
different scales in a hummocky landscape, Soil Sci. Soc. Am.
J., 75, 1295–1306, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Chen et al.(2018)Chen, Martin, Saby, Walter, Angers, and
Arrouays</label><mixed-citation>
Chen, S., Martin, M. P., Saby, N. P., Walter, C., Angers, D. A., and
Arrouays,
D.: Fine resolution map of top- and subsoil carbon sequestration potential in
France, Sci. Total Environ., 630, 389–400, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Chollet(2015)</label><mixed-citation>Chollet, F.: Keras, available at: <uri>https://github.com/fchollet/keras</uri>
(last access: 22 February 2019), 2015.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Chui(2016)</label><mixed-citation>
Chui, C. K.: An introduction to wavelets, Elsevier, Boston, MA, USA, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx14"><?xmltex \def\ref@label{{Conrad et~al.(2015)Conrad, Bechtel, Bock, Dietrich, Fischer, Gerlitz,
Wehberg, Wichmann, and B\"{o}hner}}?><label>Conrad et al.(2015)Conrad, Bechtel, Bock, Dietrich, Fischer, Gerlitz,
Wehberg, Wichmann, and Böhner</label><mixed-citation>Conrad, O., Bechtel, B., Bock, M., Dietrich, H., Fischer, E., Gerlitz, L.,
Wehberg, J., Wichmann, V., and Böhner, J.: System for Automated
Geoscientific Analyses (SAGA) v. 2.1.4, Geosci. Model Dev., 8, 1991–2007,
<ext-link xlink:href="https://doi.org/10.5194/gmd-8-1991-2015" ext-link-type="DOI">10.5194/gmd-8-1991-2015</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx15"><?xmltex \def\ref@label{{Dematt\^{e} et~al.(2018)Dematt\^{e}, Fongaro, Rizzo, and
Safanelli}}?><label>Demattê et al.(2018)Demattê, Fongaro, Rizzo, and
Safanelli</label><mixed-citation>
Demattê, J. A. M., Fongaro, C. T., Rizzo, R., and Safanelli, J. L.:
Geospatial Soil Sensing System (GEOS3): A powerful data mining procedure to
retrieve soil spectral reflectance from satellite images, Remote Sens.
Environ., 212, 161–175, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Deumlich et al.(2010)Deumlich, Schmidt, and
Sommer</label><mixed-citation>
Deumlich, D., Schmidt, R., and Sommer, M.: A multiscale soil–landform
relationship in the glacial-drift area based on digital terrain analysis and
soil attributes, J. Plant Nutr. Soil Sc., 173, 843–851,
2010.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Dokuchaev(1883)</label><mixed-citation>
Dokuchaev, V. V.: Russian Chernozem. Selected works of V. V. Dokuchaev. v.
1, Israel Program for Scientific Translations, Jerusalem, Israel, 1883
(translated in 1967).</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Don et al.(2007)Don, Schumacher, Scherer-Lorenzen, Scholten, and
Schulze</label><mixed-citation>
Don, A., Schumacher, J., Scherer-Lorenzen, M., Scholten, T., and Schulze,
E.-D.: Spatial and vertical variation of soil carbon at two grassland
sites – implications for measuring soil carbon stocks, Geoderma, 141,
272–282, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Duan et al.(2016)Duan, Lv, Liu, and Wang</label><mixed-citation>
Duan, Y., Lv, Y., Liu, Y.-L., and Wang, F.-Y.: An efficient realization of
deep learning for traffic data imputation, Transport. Res. C-Emer., 72,
168–181, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Efron and Tibshirani(1993)</label><mixed-citation>
Efron, B. and Tibshirani, R. J.: An introduction to the bootstrap, vol. 57,
CRC press, New York, USA, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Haar(1910)</label><mixed-citation>
Haar, A.: Zur theorie der orthogonalen funktionensysteme, Math. Ann., 69,
331–371, 1910.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Hijmans et al.(2005)Hijmans, Cameron, Parra, Jones, and
Jarvis</label><mixed-citation>
Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G., and Jarvis, A.:
Very high resolution interpolated climate surfaces for global land areas,
Int. J. Climatol., 25, 1965–1978, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Jenny(1941)</label><mixed-citation>
Jenny, H.: Factors of soil formation: a system of quantitative pedology,
McGraw-Hill, New York, USA, 1941.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Jian-Bing et al.(2006)Jian-Bing, Du-Ning, Xing-Yi, Xiu-Zhen, and
Xiao-Yu</label><mixed-citation>
Jian-Bing, W., Du-Ning, X., Xing-Yi, Z., Xiu-Zhen, L., and Xiao-Yu, L.:
Spatial variability of soil organic carbon in relation to environmental
factors of a typical small watershed in the black soil region, northeast
China, Environ. Monit. Assess., 121, 597–613, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx25"><?xmltex \def\ref@label{{Kamilaris and Prenafeta-Bold\'{u}(2018)}}?><label>Kamilaris and Prenafeta-Boldú(2018)</label><mixed-citation>
Kamilaris, A. and Prenafeta-Boldú, F. X.: Deep learning in agriculture:
A survey, Comput. Electron. Agr., 147, 70–90, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Kempen et al.(2011)Kempen, Brus, and Stoorvogel</label><mixed-citation>
Kempen, B., Brus, D., and Stoorvogel, J.: Three-dimensional mapping of soil
organic matter content using soil type–specific depth functions, Geoderma,
162, 107–123, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>Keskin and Grunwald(2018)</label><mixed-citation>
Keskin, H. and Grunwald, S.: Regression kriging as a workhorse in the
digital soil mapper's toolbox, Geoderma, 326, 22–41, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>Kingma and Ba(2014)</label><mixed-citation>
Kingma, D. and Ba, J.: Adam: A method for stochastic optimization, arXiv
preprint arXiv:1412.6980, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Krizhevsky et al.(2012)</label><mixed-citation>
Krizhevsky, A., Sutskever, I., and Hinton, G. E.: Imagenet classification
with deep convolutional neural networks, in: Advances in neural information
processing systems, MIT Press, 1097–1105, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>LeCun et al.(1990)</label><mixed-citation>
LeCun, Y., Boser, B. E., Denker, J. S., Henderson, D., Howard, R. E.,
Hubbard, W. E., and Jackel, L. D.: Handwritten digit recognition with a
back-propagation network, in: Advances in neural information processing
systems, MIT Press, 396–404, 1990.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>LeCun and Bengio(1995)</label><mixed-citation>
LeCun, Y. and Bengio, Y.: Convolutional networks for images, speech, and
time series, The handbook of brain theory and neural networks, MIT Press,
3361, 1995.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>LeCun et al.(2015)LeCun, Bengio, and Hinton</label><mixed-citation>
LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521,
436–444, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Lee and Kwon(2017)</label><mixed-citation>
Lee, H. and Kwon, H.: Going deeper with contextual CNN for hyperspectral
image classification, IEEE T. Image Process., 26, 4843–4855, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Lehner et al.(2008)Lehner, Verdin, and Jarvis</label><mixed-citation>
Lehner, B., Verdin, K., and Jarvis, A.: New global hydrography derived from
spaceborne elevation data, EOS, Transactions American Geophysical Union, 89,
93–94, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx35"><?xmltex \def\ref@label{{McBratney et~al.(2003)McBratney, {Mendon\c{c}a Santos}, and
Minasny}}?><label>McBratney et al.(2003)McBratney, Mendonça Santos, and
Minasny</label><mixed-citation>
McBratney, A., Mendonça Santos, M. L., and Minasny, B.: On digital
soil mapping, Geoderma, 117, 3–52, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx36"><?xmltex \def\ref@label{{Mendon\c{c}a-Santos et~al.(2006)Mendon\c{c}a-Santos, McBratney, and
Minasny}}?><label>Mendonça-Santos et al.(2006)Mendonça-Santos, McBratney, and
Minasny</label><mixed-citation>
Mendonça-Santos, M., McBratney, A., and Minasny, B.: Soil prediction
with spatially decomposed environmental factors, Dev. Soil Sci., 31,
269–278, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Miller et al.(2015)Miller, Koszinski, Wehrhan, and
Sommer</label><mixed-citation>
Miller, B. A., Koszinski, S., Wehrhan, M., and Sommer, M.: Impact of
multi-scale predictor selection for modeling soil properties, Geoderma, 239,
97–106, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Mulder et al.(2016)Mulder, Lacoste, de Forges, and
Arrouays</label><mixed-citation>
Mulder, V., Lacoste, M., de Forges, A. R., and Arrouays, D.: GlobalSoilMap
France: High-resolution spatial modelling the soils of France up to two meter
depth, Sci. Total Environ., 573, 1352–1369, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>Nakane(1976)</label><mixed-citation>
Nakane, K.: An empirical formulation of the vertical distribution of carbon
concentration in forest soils, Japanese Journal of Ecology, 26, 171–174,
1976.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>Nitish et al.(2014)Nitish, Hinton, Krizhevsky, Sutskever, and
Salakhutdinov</label><mixed-citation>
Nitish, S., Hinton, G. E., Krizhevsky, A., Sutskever, I., and Salakhutdinov,
R.: Dropout: a simple way to prevent neural networks from overfitting,
J. Mach. Learn. Res., 15, 1929–1958, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Padarian et al.(2017)Padarian, Minasny, and
McBratney</label><mixed-citation>
Padarian, J., Minasny, B., and McBratney, A.: Chile and the Chilean soil
grid: a contribution to GlobalSoilMap, Geoderma Regional, 9, 17–28, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Padarian et al.(2019)Padarian, Minasny, and
McBratney</label><mixed-citation>Padarian, J., Minasny, B., and McBratney, A.: Using deep learning to predict
soil properties from regional spectral data, Geoderma Regional, 16, e00198,
<ext-link xlink:href="https://doi.org/10.1016/j.geodrs.2018.e00198" ext-link-type="DOI">10.1016/j.geodrs.2018.e00198</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Paterson et al.(2018)Paterson, McBratney, Minasny, and
Pringle</label><mixed-citation>
Paterson, S., McBratney, A. B., Minasny, B., and Pringle, M. J.: Variograms
of Soil Properties for Agricultural and Environmental Applications, in:
Pedometrics, Springer, Cham, Switzerland, 623–667, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Perez and Wang(2017)</label><mixed-citation>
Perez, L. and Wang, J.: The effectiveness of data augmentation in image
classification using deep learning, arXiv preprint arXiv:1712.04621, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>Poggio and Gimona(2017)</label><mixed-citation>
Poggio, L. and Gimona, A.: Assimilation of optical and radar remote sensing
data in 3D mapping of soil properties over large areas, Sci. Total
Environ., 579, 1094–1110, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Python Software Foundation(2017)</label><mixed-citation>Python Software Foundation: Python Language Reference, Python Software
Foundation, available at: <uri>https://www.python.org</uri> (last access: 22
February 2019), 2017.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Quinlan(1992)</label><mixed-citation>
Quinlan, J. R.: Learning with continuous classes, in: 5th Australian joint
conference on artificial intelligence, Singapore, 16–18 November 1992,
vol. 92, 343–348, 1992.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Ramsundar et al.(2015)Ramsundar, Kearnes, Riley, Webster, Konerding,
and Pande</label><mixed-citation>
Ramsundar, B., Kearnes, S., Riley, P., Webster, D., Konerding, D., and Pande,
V.: Massively multitask networks for drug discovery, arXiv preprint
arXiv:1502.02072, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Rossi et al.(2009)Rossi, Govaerts, De Vos, Verbist, Vervoort,
Poesen, Muys, and Deckers</label><mixed-citation>
Rossi, J., Govaerts, A., De Vos, B., Verbist, B., Vervoort, A., Poesen, J.,
Muys, B., and Deckers, J.: Spatial structures of soil organic carbon in
tropical forests—a case study of Southeastern Tanzania, Catena, 77,
19–27, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx50"><label>Ruder(2017)</label><mixed-citation>
Ruder, S.: An overview of multi-task learning in deep neural networks,
arXiv preprint arXiv:1706.05098, 2017.</mixed-citation></ref>
      <?pagebreak page89?><ref id="bib1.bibx51"><label>Russell and Moore(1968)</label><mixed-citation>
Russell, J. S. and Moore, A. W.: Comparison of different depth weighting in
the numerical analysis of anisotropic soil profile data, in: Transactions of
the 9th International Congress Soil Science, Adelaide, Sydney, International
Soil Science Society, 205–213, 1968.</mixed-citation></ref>
      <ref id="bib1.bibx52"><label>Simard et al.(2003)</label><mixed-citation>
Simard, P. Y., Steinkraus, D., and Platt, J. C.: Best practices for
convolutional neural networks applied to visual document analysis, Seventh
International Conference on Document Analysis and Recognition, Edinburgh, UK,
2003, vol. 3, 958–962, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx53"><label>Somarathna et al.(2017)Somarathna, Minasny, and
Malone</label><mixed-citation>Somarathna, P., Minasny, B., and Malone, B. P.: More Data or a Better Model?
Figuring Out What Matters Most for the Spatial Prediction of Soil Carbon,
Soil Sci. Soc. Am. J., 81, 1413–1426, <ext-link xlink:href="https://doi.org/10.2136/sssaj2016.11.0376" ext-link-type="DOI">10.2136/sssaj2016.11.0376</ext-link>, 2017.
</mixed-citation></ref><?xmltex \hack{\newpage}?>
      <ref id="bib1.bibx54"><label>Song et al.(2016)Song, Zhang, Liu, Li, Zhao, and
Yang</label><mixed-citation>
Song, X., Zhang, G., Liu, F., Li, D., Zhao, Y., and Yang, J.: Modeling
spatio-temporal distribution of soil moisture by deep learning-based cellular
automata model, J. Arid Land, 8, 734–748, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx55"><label>Sun et al.(2003)Sun, Zhou, and Zhao</label><mixed-citation>
Sun, B., Zhou, S., and Zhao, Q.: Evaluation of spatial and temporal changes
of soil quality based on geostatistical analysis in the hill region of
subtropical China, Geoderma, 115, 85–99, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx56"><label>Sun et al.(2017)Sun, Wang, Zhao, Zhang, and Zhang</label><mixed-citation>
Sun, X.-L., Wang, H.-L., Zhao, Y.-G., Zhang, C., and Zhang, G.-L.: Digital
soil mapping based on wavelet decomposed components of environmental
covariates, Geoderma, 303, 118–132, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx57"><label>Vo and Hays(2016)</label><mixed-citation>
Vo, N. N. and Hays, J.: Localizing and orienting street views using overhead
imagery, in: European Conference on Computer Vision, Amsterdam, the
Netherlands, Springer, 494–509, 2016.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Using deep learning for digital soil mapping</article-title-html>
<abstract-html><p>Digital soil mapping (DSM) has been widely used as a cost-effective
method for generating soil maps. However, current DSM data representation
rarely incorporates contextual information of the landscape. DSM models are
usually calibrated using point observations intersected with spatially
corresponding point covariates. Here, we demonstrate the use of the
convolutional neural network (CNN) model that incorporates contextual information
surrounding an observation to significantly improve the prediction accuracy
over conventional DSM models. We describe a CNN model that takes inputs as images of covariates and explores spatial
contextual information by finding non-linear local spatial relationships of
neighbouring pixels. Unique features of the proposed model include input
represented as a 3-D stack of images, data augmentation to reduce overfitting,
and the simultaneous prediction of multiple outputs. Using a soil mapping example
in Chile, the CNN model was trained to simultaneously predict soil organic
carbon at multiple depths across the country. The results showed that, in
this study, the CNN model reduced the error by 30&thinsp;% compared with
conventional techniques that only used point information of covariates. In
the example of country-wide mapping at 100&thinsp;m resolution, a neighbourhood
size of 3 to 9 pixels is more effective than a point location or larger
neighbourhood sizes. In addition, the CNN model produces less prediction
uncertainty and it is able to predict soil carbon at deeper soil layers more
accurately. Because the CNN model takes covariates represented as images, it
offers a simple and effective framework for future DSM models.</p></abstract-html>
<ref-html id="bib1.bib1"><label>Abadi et al.(2015)Abadi, Agarwal, Barham, Brevdo, Chen, Citro,
Corrado, Davis, Dean, Devin, Ghemawat, Goodfellow, Harp, Irving, Isard, Jia,
Jozefowicz, Kaiser, Kudlur, Levenberg, Mané, Monga, Moore, Murray, Olah,
Schuster, Shlens, Steiner, Sutskever, Talwar, Tucker, Vanhoucke, Vasudevan,
Viégas, Vinyals, Warden, Wattenberg, Wicke, Yu, and
Zheng</label><mixed-citation>
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado,
G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp,
A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M.,
Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C.,
Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P.,
Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P.,
Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X.: TensorFlow: Large-Scale
Machine Learning on Heterogeneous Systems, available at:
<a href="https://www.tensorflow.org/" target="_blank">https://www.tensorflow.org/</a> (last access: 22 February 2019), 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Adhikari et al.(2014)Adhikari, Hartemink, Minasny, Kheir, Greve, and
Greve</label><mixed-citation>
Adhikari, K., Hartemink, A. E., Minasny, B., Kheir, R. B., Greve, M. B., and
Greve, M. H.: Digital mapping of soil organic carbon contents and stocks in
Denmark, PloS one, 9, e105519, <a href="https://doi.org/10.1371/journal.pone.0105519" target="_blank">https://doi.org/10.1371/journal.pone.0105519</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Akpa et al.(2016)Akpa, Odeh, Bishop, Hartemink, and
Amapu</label><mixed-citation>
Akpa, S. I., Odeh, I. O., Bishop, T. F., Hartemink, A. E., and Amapu, I. Y.:
Total soil organic carbon and carbon sequestration potential in Nigeria,
Geoderma, 271, 202–215, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Angelini and Heuvelink(2018)</label><mixed-citation>
Angelini, M. E. and Heuvelink, G. B.: Including spatial correlation in
structural equation modelling of soil properties, Spat. Stat.-Nath., 25,
35–51, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Angelini et al.(2017)</label><mixed-citation>
Angelini, M., Heuvelink, G., and Kempen, B.: Multivariate mapping of soil
with
structural equation modelling, Eur. J. Soil Sci., 68,
575–591, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Arrouays et al.(2014)</label><mixed-citation>
Arrouays, D., McBratney, A., Minasny, B., Hempel, J., Heuvelink, G.,
MacMillan, R., Hartemink, A., Lagacherie, P., and McKenzie, N.: The
GlobalSoilMap project specifications, in: GlobalSoilMap: Basis of the Global
Spatial Soil Information System – Proceedings of the 1st GlobalSoilMap
Conference, edited by: Arrouays D., McKenzie, N., Hempel, J., Richer de
Forges, A., and McBratney, A. B., Orleans, France, 7–9 October 2013, CRC
Press, 9–12, <a href="https://doi.org/10.1201/b16500-4" target="_blank">https://doi.org/10.1201/b16500-4</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Behrens et al.(2010)Behrens, Schmidt, Zhu, and
Scholten</label><mixed-citation>
Behrens, T., Schmidt, K., Zhu, A.-X., and Scholten, T.: The ConMap approach
for terrain-based digital soil mapping, Eur. J. Soil Sci.,
61, 133–143, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Behrens et al.(2014)Behrens, Schmidt, Ramirez-Lopez, Gallant, Zhu,
and Scholten</label><mixed-citation>
Behrens, T., Schmidt, K., Ramirez-Lopez, L., Gallant, J., Zhu, A.-X., and
Scholten, T.: Hyper-scale digital soil mapping and soil formation analysis,
Geoderma, 213, 578–588, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Behrens et al.(2018)Behrens, Schmidt, MacMillan, and
Rossel</label><mixed-citation>
Behrens, T., Schmidt, K., MacMillan, R., and Rossel, R. V.: Multiscale
contextual spatial modelling with the Gaussian scale space, Geoderma, 310,
128–137, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Biswas and Si(2011)</label><mixed-citation>
Biswas, A. and Si, B. C.: Revealing the controls of soil water storage at
different scales in a hummocky landscape, Soil Sci. Soc. Am.
J., 75, 1295–1306, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Chen et al.(2018)Chen, Martin, Saby, Walter, Angers, and
Arrouays</label><mixed-citation>
Chen, S., Martin, M. P., Saby, N. P., Walter, C., Angers, D. A., and
Arrouays,
D.: Fine resolution map of top-and subsoil carbon sequestration potential in
France, Sci. Total Environ., 630, 389–400, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Chollet(2015)</label><mixed-citation>
Chollet, F.: Keras, available at: <a href="https://github.com/fchollet/keras" target="_blank">https://github.com/fchollet/keras</a>
(last access: 22 February 2019), 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Chui(2016)</label><mixed-citation>
Chui, C. K.: An introduction to wavelets, Elsevier, Noston, MA, USA, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Conrad et al.(2015)Conrad, Bechtel, Bock, Dietrich, Fischer, Gerlitz,
Wehberg, Wichmann, and Böhner</label><mixed-citation>
Conrad, O., Bechtel, B., Bock, M., Dietrich, H., Fischer, E., Gerlitz, L.,
Wehberg, J., Wichmann, V., and Böhner, J.: System for Automated
Geoscientific Analyses (SAGA) v. 2.1.4, Geosci. Model Dev., 8, 1991–2007,
<a href="https://doi.org/10.5194/gmd-8-1991-2015" target="_blank">https://doi.org/10.5194/gmd-8-1991-2015</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Demattê et al.(2018)Demattê, Fongaro, Rizzo, and
Safanelli</label><mixed-citation>
Demattê, J. A. M., Fongaro, C. T., Rizzo, R., and Safanelli, J. L.:
Geospatial Soil Sensing System (GEOS3): A powerful data mining procedure to
retrieve soil spectral reflectance from satellite images, Remote Sens.
Environ., 212, 161–175, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>Deumlich et al.(2010)Deumlich, Schmidt, and
Sommer</label><mixed-citation>
Deumlich, D., Schmidt, R., and Sommer, M.: A multiscale soil–landform
relationship in the glacial-drift area based on digital terrain analysis and
soil attributes, J. Plant Nutr. Soil Sc., 173, 843–851,
2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>Dokuchaev(1883)</label><mixed-citation>
Dokuchaev, V. V.: Russian Chernozem. Selected works of V. V. Dokuchaev. v.
1, Israel Program for Scientific Translations, Jerusalem, Israel, 1883
(translated in 1967).
</mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>Don et al.(2007)Don, Schumacher, Scherer-Lorenzen, Scholten, and
Schulze</label><mixed-citation>
Don, A., Schumacher, J., Scherer-Lorenzen, M., Scholten, T., and Schulze,
E.-D.: Spatial and vertical variation of soil carbon at two grassland
sites – implications for measuring soil carbon stocks, Geoderma, 141,
272–282, 2007.
</mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Duan et al.(2016)Duan, Lv, Liu, and Wang</label><mixed-citation>
Duan, Y., Lv, Y., Liu, Y.-L., and Wang, F.-Y.: An efficient realization of
deep learning for traffic data imputation, Transport. Res. C-Emer., 72,
168–181, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Efron and Tibshirani(1993)</label><mixed-citation>
Efron, B. and Tibshirani, R. J.: An introduction to the bootstrap, vol. 57,
CRC Press, New York, USA, 1993.
</mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Haar(1910)</label><mixed-citation>
Haar, A.: Zur theorie der orthogonalen funktionensysteme, Math. Ann., 69,
331–371, 1910.
</mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Hijmans et al.(2005)Hijmans, Cameron, Parra, Jones, and
Jarvis</label><mixed-citation>
Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G., and Jarvis, A.:
Very high resolution interpolated climate surfaces for global land areas,
Int. J. Climatol., 25, 1965–1978, 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Jenny(1941)</label><mixed-citation>
Jenny, H.: Factors of soil formation: a system of quantitative pedology,
McGraw-Hill, New York, USA, 1941.
</mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Jian-Bing et al.(2006)Jian-Bing, Du-Ning, Xing-Yi, Xiu-Zhen, and
Xiao-Yu</label><mixed-citation>
Jian-Bing, W., Du-Ning, X., Xing-Yi, Z., Xiu-Zhen, L., and Xiao-Yu, L.:
Spatial variability of soil organic carbon in relation to environmental
factors of a typical small watershed in the black soil region, northeast
China, Environ. Monit. Assess., 121, 597–613, 2006.
</mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Kamilaris and Prenafeta-Boldú(2018)</label><mixed-citation>
Kamilaris, A. and Prenafeta-Boldú, F. X.: Deep learning in agriculture:
A survey, Comput. Electron. Agr., 147, 70–90, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>Kempen et al.(2011)Kempen, Brus, and Stoorvogel</label><mixed-citation>
Kempen, B., Brus, D., and Stoorvogel, J.: Three-dimensional mapping of soil
organic matter content using soil type–specific depth functions, Geoderma,
162, 107–123, 2011.
</mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>Keskin and Grunwald(2018)</label><mixed-citation>
Keskin, H. and Grunwald, S.: Regression kriging as a workhorse in the
digital soil mapper's toolbox, Geoderma, 326, 22–41, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>Kingma and Ba(2014)</label><mixed-citation>
Kingma, D. and Ba, J.: Adam: A method for stochastic optimization, arXiv
preprint arXiv:1412.6980, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Krizhevsky et al.(2012)</label><mixed-citation>
Krizhevsky, A., Sutskever, I., and Hinton, G. E.: ImageNet classification
with deep convolutional neural networks, in: Advances in neural information
processing systems, MIT Press, 1097–1105, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>LeCun et al.(1990)</label><mixed-citation>
LeCun, Y., Boser, B. E., Denker, J. S., Henderson, D., Howard, R. E.,
Hubbard, W. E., and Jackel, L. D.: Handwritten digit recognition with a
back-propagation network, in: Advances in neural information processing
systems, MIT Press, 396–404, 1990.
</mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>LeCun and Bengio(1995)</label><mixed-citation>
LeCun, Y. and Bengio, Y.: Convolutional networks for images, speech, and
time series, The handbook of brain theory and neural networks, MIT Press,
3361, 1995.
</mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>LeCun et al.(2015)LeCun, Bengio, and Hinton</label><mixed-citation>
LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521,
436–444, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Lee and Kwon(2017)</label><mixed-citation>
Lee, H. and Kwon, H.: Going deeper with contextual CNN for hyperspectral
image classification, IEEE T. Image Process., 26, 4843–4855, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Lehner et al.(2008)Lehner, Verdin, and Jarvis</label><mixed-citation>
Lehner, B., Verdin, K., and Jarvis, A.: New global hydrography derived from
spaceborne elevation data, EOS, Transactions American Geophysical Union, 89,
93–94, 2008.
</mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>McBratney et al.(2003)McBratney, Mendonça Santos, and
Minasny</label><mixed-citation>
McBratney, A., Mendonça Santos, M. L., and Minasny, B.: On digital
soil mapping, Geoderma, 117, 3–52, 2003.
</mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Mendonça-Santos et al.(2006)Mendonça-Santos, McBratney, and
Minasny</label><mixed-citation>
Mendonça-Santos, M., McBratney, A., and Minasny, B.: Soil prediction
with spatially decomposed environmental factors, Dev. Soil Sci., 31,
269–278, 2006.
</mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Miller et al.(2015)Miller, Koszinski, Wehrhan, and
Sommer</label><mixed-citation>
Miller, B. A., Koszinski, S., Wehrhan, M., and Sommer, M.: Impact of
multi-scale predictor selection for modeling soil properties, Geoderma, 239,
97–106, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Mulder et al.(2016)Mulder, Lacoste, de Forges, and
Arrouays</label><mixed-citation>
Mulder, V., Lacoste, M., de Forges, A. R., and Arrouays, D.: GlobalSoilMap
France: High-resolution spatial modelling the soils of France up to two meter
depth, Sci. Total Environ., 573, 1352–1369, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Nakane(1976)</label><mixed-citation>
Nakane, K.: An empirical formulation of the vertical distribution of carbon
concentration in forest soils, Japanese Journal of Ecology, 26, 171–174,
1976.
</mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>Nitish et al.(2014)Nitish, Hinton, Krizhevsky, Sutskever, and
Salakhutdinov</label><mixed-citation>
Nitish, S., Hinton, G. E., Krizhevsky, A., Sutskever, I., and Salakhutdinov,
R.: Dropout: a simple way to prevent neural networks from overfitting,
J. Mach. Learn. Res., 15, 1929–1958, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Padarian et al.(2017)Padarian, Minasny, and
McBratney</label><mixed-citation>
Padarian, J., Minasny, B., and McBratney, A.: Chile and the Chilean soil
grid: a contribution to GlobalSoilMap, Geoderma Regional, 9, 17–28, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Padarian et al.(2019)Padarian, Minasny, and
McBratney</label><mixed-citation>
Padarian, J., Minasny, B., and McBratney, A.: Using deep learning to predict
soil properties from regional spectral data, Geoderma Regional, 16, e00198,
<a href="https://doi.org/10.1016/j.geodrs.2018.e00198" target="_blank">https://doi.org/10.1016/j.geodrs.2018.e00198</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Paterson et al.(2018)Paterson, McBratney, Minasny, and
Pringle</label><mixed-citation>
Paterson, S., McBratney, A. B., Minasny, B., and Pringle, M. J.: Variograms
of Soil Properties for Agricultural and Environmental Applications, in:
Pedometrics, Springer, Cham, Switzerland, 623–667, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Perez and Wang(2017)</label><mixed-citation>
Perez, L. and Wang, J.: The effectiveness of data augmentation in image
classification using deep learning, arXiv preprint arXiv:1712.04621, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Poggio and Gimona(2017)</label><mixed-citation>
Poggio, L. and Gimona, A.: Assimilation of optical and radar remote sensing
data in 3D mapping of soil properties over large areas, Sci. Total
Environ., 579, 1094–1110, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Python Software Foundation(2017)</label><mixed-citation>
Python Software Foundation: Python Language Reference, Python Software
Foundation, available at: <a href="https://www.python.org" target="_blank">https://www.python.org</a> (last access: 22
February 2019), 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>Quinlan(1992)</label><mixed-citation>
Quinlan, J. R.: Learning with continuous classes, in: 5th Australian joint
conference on artificial intelligence, Singapore, 16–18 November 1992,
vol. 92, 343–348, 1992.
</mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Ramsundar et al.(2015)Ramsundar, Kearnes, Riley, Webster, Konerding,
and Pande</label><mixed-citation>
Ramsundar, B., Kearnes, S., Riley, P., Webster, D., Konerding, D., and Pande,
V.: Massively multitask networks for drug discovery, arXiv preprint
arXiv:1502.02072, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>Rossi et al.(2009)Rossi, Govaerts, De Vos, Verbist, Vervoort,
Poesen, Muys, and Deckers</label><mixed-citation>
Rossi, J., Govaerts, A., De Vos, B., Verbist, B., Vervoort, A., Poesen, J.,
Muys, B., and Deckers, J.: Spatial structures of soil organic carbon in
tropical forests – a case study of Southeastern Tanzania, Catena, 77,
19–27, 2009.
</mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>Ruder(2017)</label><mixed-citation>
Ruder, S.: An overview of multi-task learning in deep neural networks,
arXiv preprint arXiv:1706.05098, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>Russell and Moore(1968)</label><mixed-citation>
Russell, J. S. and Moore, A. W.: Comparison of different depth weighting in
the numerical analysis of anisotropic soil profile data, in: Transactions of
the 9th International Congress Soil Science, Adelaide, Australia, International
Soil Science Society, 205–213, 1968.
</mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>Simard et al.(2003)</label><mixed-citation>
Simard, P. Y., Steinkraus, D., and Platt, J. C.: Best practices for
convolutional neural networks applied to visual document analysis, Seventh
International Conference on Document Analysis and Recognition, Edinburgh, UK,
2003, vol. 3, 958–962, 2003.
</mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>Somarathna et al.(2017)Somarathna, Minasny, and
Malone</label><mixed-citation>
Somarathna, P., Minasny, B., and Malone, B. P.: More Data or a Better Model?
Figuring Out What Matters Most for the Spatial Prediction of Soil Carbon,
Soil Sci. Soc. Am. J., 81, 1413–1426, <a href="https://doi.org/10.2136/sssaj2016.11.0376" target="_blank">https://doi.org/10.2136/sssaj2016.11.0376</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>Song et al.(2016)Song, Zhang, Liu, Li, Zhao, and
Yang</label><mixed-citation>
Song, X., Zhang, G., Liu, F., Li, D., Zhao, Y., and Yang, J.: Modeling
spatio-temporal distribution of soil moisture by deep learning-based cellular
automata model, J. Arid Land, 8, 734–748, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>Sun et al.(2003)Sun, Zhou, and Zhao</label><mixed-citation>
Sun, B., Zhou, S., and Zhao, Q.: Evaluation of spatial and temporal changes
of soil quality based on geostatistical analysis in the hill region of
subtropical China, Geoderma, 115, 85–99, 2003.
</mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>Sun et al.(2017)Sun, Wang, Zhao, Zhang, and Zhang</label><mixed-citation>
Sun, X.-L., Wang, H.-L., Zhao, Y.-G., Zhang, C., and Zhang, G.-L.: Digital
soil mapping based on wavelet decomposed components of environmental
covariates, Geoderma, 303, 118–132, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>Vo and Hays(2016)</label><mixed-citation>
Vo, N. N. and Hays, J.: Localizing and orienting street views using overhead
imagery, in: European Conference on Computer Vision, Amsterdam, the
Netherlands, Springer, 494–509, 2016.
</mixed-citation></ref-html>--></article>
