Transformation of n-alkanes from plant to soil: a review

Abstract. Despite the importance of soil organic matter (SOM) in the global carbon cycle, there remain many open questions regarding its formation and preservation. The study of individual organic compound classes that make up SOM, such as lipid biomarkers including n-alkanes, can provide insight into the cycling of bulk SOM. While studies of lipid biomarkers, particularly n-alkanes, have increased in number in the past few decades, only a limited number have focused on the transformation of these compounds following deposition in soil archives. We performed a systematic review to consolidate the available information on plant-derived n-alkanes and their transformation from plant to soil. Our major findings were (1) a nearly ubiquitous trend of decreased total concentration of n-alkanes either with time in litterbag experiments or with depth in open plant–soil systems and (2) preferential degradation of odd-chain length and shorter chain length n-alkanes represented by a decrease in either carbon preference index (CPI) or odd-over-even predominance (OEP) with depth, indicating degradation of the n-alkane signal or a shift in vegetation composition over time. The review also highlighted a lack of data transparency and standardization across studies of lipid biomarkers, making analysis and synthesis of published data time-consuming and difficult. We recommend that the community move towards more uniform and systematic reporting of biomarker data. Furthermore, as the number of studies examining the complete leaf–litter–soil continuum is very limited as well as unevenly distributed over geographical regions, climate zones, and soil types, future data collection should focus on underrepresented areas as well as quantifying the transformation of n-alkanes through the complete continuum from plant to soil.


Introduction
Soil organic matter (SOM) is one of the largest terrestrial reservoirs in the carbon cycle, containing a carbon stock more than 10 times greater than that of forest biomass (Settele et al., 2015). Despite its importance, there remain many gaps in the knowledge of the formation, degradation, and preservation of SOM (Schmidt et al., 2011). Studying the degradation and preservation of various individual compound classes included in bulk SOM, such as lipids, could increase the overall understanding of SOM dynamics. Although the currently prevailing paradigm holds that there is no intrinsic recalcitrance of particular molecular classes (e.g., Schmidt et al., 2011; Lehmann and Kleber, 2015), insights are emerging that SOM turnover rates are linked to functional complexity with an important role for variations in the molecular diversity of SOM (Lehmann et al., 2020). As lipids represent a molecularly diverse sub-class of SOM (Kögel-Knabner, 2002), further study of lipid dynamics in soils under various pedogenic and environmental conditions will help advance the current debate on SOM dynamics.
In the last few decades, there has been a myriad of studies using lipids in soil as molecular proxies for a variety of purposes, including paleoecology and paleoclimate reconstructions (Jansen and Wiesenberg, 2017). Many of these compounds are considered "biological markers", or "biomarkers" for short, because they can be indicative of their source organisms and may be preserved following deposition in environmental archives, such as soils and sediments (Peters et al., 2005). n-Alkanes, in particular, have been used extensively due to their potential for preservation and because their extraction and analysis are relatively easy compared to other compound classes (Diefendorf and Freimuth, 2017).
Published by Copernicus Publications on behalf of the European Geosciences Union.
The primary sources of n-alkanes in soils and sediments are cuticular waxes of plant leaves and roots (Eglinton and Eglinton, 2008). Plant-derived n-alkanes are characterized as having long chains of carbon atoms (between C25 and C35) and a strong predominance of molecules with an odd number of carbons (Eglinton et al., 1962a, b; Kolattukudy et al., 1976). Because n-alkanes are ubiquitous in plant leaves and roots, it is the distribution pattern of n-alkanes across multiple chain lengths, not merely the presence of individual n-alkanes with specific chain lengths, that is used as a proxy to determine past vegetation species (Jansen et al., 2006). Therefore, the preservation of these patterns following deposition in an archive is essential to their continued use as a proxy for paleovegetation, mediated by (and therefore also an estimator of) contemporary climatic and hydrological conditions (e.g., Bush and McInerney, 2015).
To aid in characterizing n-alkane patterns, various index measurements have been developed over the years. These include the carbon preference index (CPI), odd-over-even predominance (OEP), and average chain length (ACL). The CPI and OEP were both developed initially as methods for identifying sources of petroleum (Bray and Evans, 1961; Scalan and Smith, 1970). They have since been used extensively in other environmental settings, such as soils and sediments, to determine the sources and degree of degradation of n-alkanes (e.g., Zhou et al., 2010; Trigui et al., 2019). Higher values for the CPI and OEP indicate n-alkanes derived from plant waxes and are also characteristic of well-preserved biomarker signals (Cranwell, 1981; Zech et al., 2009). Lower values are typical of n-alkanes originating from other sources such as microbes or indicate a high degree of degradation (Cranwell, 1981; Zech et al., 2009). The ACL is a weighted average of the chain length distribution of the odd-chain lengths of n-alkanes. It has been used to differentiate between dominant vegetation types, such as woody plants versus grasses, and as an indicator of environmental conditions such as drought (e.g., Crausbay et al., 2014; Wüthrich et al., 2017).
The equations used for each of these indices often vary between studies, particularly in terms of which chain lengths are included. Basic equations for CPI (Marzi et al., 1993), OEP (Hoefs et al., 2002), and ACL (Poynter et al., 1989) are given in Eqs. (1), (2), and (3), respectively, where C_x is the concentration of each n-alkane containing x carbon atoms; n and m are the chain lengths of, respectively, the starting and ending n-alkane divided by 2 (note: both 2n and 2m should be even numbers).
Though there have been many studies regarding the diagenesis of n-alkanes and other biomarkers in sediments (e.g., Cranwell, 1981; Meyers and Ishiwatari, 1993; Bourbonniere and Meyers, 1996; Zhang et al., 2006), only a few have focused on the potential degradation or transformation of these molecules in soil (e.g., Bull et al., 2000; Wiesenberg et al., 2004; Otto and Simpson, 2005). This review aims to consolidate the available information on the fate of n-alkanes in soil and how their distribution patterns may be altered from fresh plant material to litter to soil. Clearer knowledge of degradation and transformation will lead to a better understanding of the accuracy of biomarker observations and more careful interpretation; it can also lay the basis for a better process understanding of degradation and transformation. Ultimately, this understanding can be translated into observation models to standardize biomarker observations and render them more useful.
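For orientation, the three indices introduced above are commonly written in the forms below. This is a sketch based on the cited definitions and on the symbol description given in the text (C_x, n, m), not a verbatim copy of the original equations. In particular, the fixed chain lengths in the OEP expression and the restriction of the ACL sums to odd chain lengths are assumptions.

```latex
% Assumed forms; C_x is the concentration of the n-alkane with x carbon atoms.
\begin{align}
\mathrm{CPI} &= \frac{\sum_{i=n}^{m} C_{2i+1} + \sum_{i=n+1}^{m+1} C_{2i+1}}
                     {2 \sum_{i=n+1}^{m+1} C_{2i}} \\
\mathrm{OEP} &= \frac{C_{27} + C_{29} + C_{31} + C_{33}}
                     {C_{26} + C_{28} + C_{30} + C_{32}} \\
\mathrm{ACL} &= \frac{\sum_{x\,\mathrm{odd}} x \, C_{x}}
                     {\sum_{x\,\mathrm{odd}} C_{x}}
\end{align}
```

Under this form of the CPI, choosing n = 11 and m = 16 makes the numerator sum the odd chain lengths C23 to C33 and C25 to C35, while the denominator is twice the sum of the even chain lengths C24 to C34.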

Methods
The review was performed systematically. A Boolean search string was developed to search all of the databases included in the Web of Science, except the Derwent Innovation Index: (("leaf wax*" OR "lipid biomarker*" OR "alkane*" OR "n-alkane*" OR "chemical fossil*" OR "epicuticular wax*" OR "molecular prox*") AND ("soil*" OR "peat*" OR "topsoil*" OR "litter*")). Selection criteria were also developed to guide the process of screening the search results. These include the following.
1. Peer-reviewed source.
2. Study includes primary data or observations on the degradation of the distribution patterns, concentration, or index measurements (e.g., CPI, OEP, or ACL) of n-alkanes.
3. Study occurs in natural soil or peat with minimal contamination or in lab conditions if microbial degradation of n-alkanes in natural soils was investigated.
Studies including data obtained using pyrolysis were not included, as this method provides indirect measurements of n-alkanes.
For studies that included concentration data but not index measurements, Eq. (1) was used to calculate CPI with n = 11 and m = 16, Eq. (2) was used to calculate OEP, and Eq. (3) was used to calculate ACL using odd-chain lengths between 27 and 33.
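The index calculation described above can be sketched in a few lines of Python. This is an illustration only: the function names, the treatment of missing chain lengths as zero, and the sample concentrations are hypothetical, and the formulas assume the commonly used Marzi-type CPI, an OEP over C26 to C33, and an ACL over the odd chain lengths C27 to C33.

```python
from typing import Mapping, Sequence


def cpi(conc: Mapping[int, float], n: int = 11, m: int = 16) -> float:
    """Carbon preference index (assumed Marzi-type form).

    conc maps chain length (number of carbon atoms) to concentration.
    Chain lengths missing from conc count as zero (assumption of this sketch).
    """
    odd_low = sum(conc.get(2 * i + 1, 0.0) for i in range(n, m + 1))
    odd_high = sum(conc.get(2 * i + 1, 0.0) for i in range(n + 1, m + 2))
    even = sum(conc.get(2 * i, 0.0) for i in range(n + 1, m + 2))
    return (odd_low + odd_high) / (2.0 * even)


def oep(conc: Mapping[int, float],
        odd: Sequence[int] = (27, 29, 31, 33),
        even: Sequence[int] = (26, 28, 30, 32)) -> float:
    """Odd-over-even predominance over fixed chain lengths (assumed form)."""
    return sum(conc.get(c, 0.0) for c in odd) / sum(conc.get(c, 0.0) for c in even)


def acl(conc: Mapping[int, float],
        chains: Sequence[int] = (27, 29, 31, 33)) -> float:
    """Average chain length: concentration-weighted mean of odd chain lengths."""
    total = sum(conc.get(c, 0.0) for c in chains)
    return sum(c * conc.get(c, 0.0) for c in chains) / total


# Hypothetical plant-like sample: odd chains 10 units each, even chains 1 unit each.
sample = {c: (10.0 if c % 2 else 1.0) for c in range(23, 36)}

print(cpi(sample), oep(sample), acl(sample))  # 10.0 10.0 30.0
```

With n = 11 and m = 16 the CPI covers C23 to C35. A real dataset with no measurable even-chain n-alkanes would lead to division by zero here, so such cases would need separate handling.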

Results
Prior to incorporation in soil, plant debris can form a litter layer on top of the soil, especially in forest ecosystems. This plant debris is the principal source of material for the formation of soil organic matter in the topsoil (Kögel-Knabner, 2002), and degradation of plant-derived n-alkanes begins in the litter layer. There have been 27 studies examining the degradation of n-alkanes in the litter, including six litterbag experiments as well as studies assessing the litter layer in open plant-soil systems, which we define as natural systems including no manipulation.

Results from litterbag experiments
We found a total of six litterbag experiment studies during the review, with durations ranging from 300 d to 23 years (Table 1). The species investigated for n-alkane degradation included Calluna vulgaris in a peatland setting (Huang et al., 1997), Acer pseudoplatanus, Fagus sylvatica, and Sorbus aucuparia in forest settings (Zech et al., 2011; Nguyen Tu et al., 2017), Setaria viridis, Eleusine indica, Amaranthus retroflexus, and Erigeron speciosus deposited at an agricultural experimental field (Wang et al., 2014), and Neosinocalamus affinis and Osmanthus fragrans (Li et al., 2017). Additionally, Schulz et al. (2012) measured degradation of Zea mays and Pisum sativum in an incubation experiment using agricultural soil.
The primary trend spanning nearly all (five out of six) of the litterbag experiments is a considerable decrease in the total concentration of n-alkanes compared to the initial biomass (µg/g dry weight). The majority of species contained a much lower relative concentration of n-alkanes in the final measurement of the experiments (Fig. 1). The largest decreases in the first 100 d occurred in litter from N. affinis (Li et al., 2017), Z. mays, and P. sativum (Schulz et al., 2012) (Fig. 1). In contrast, n-alkanes in E. speciosus and S. viridis actually increased in concentration by the end of the experimental period (Wang et al., 2014) (Fig. 1). From the beginning to the end of the litterbag experiments, the distribution patterns of the n-alkanes remained relatively constant, with the same range of chain lengths and the dominant chain length staying the same in nearly all of the species (Supplement).
The ACL was reported in four of the studies; the results from Huang et al. (1997), Li et al. (2017), Wang et al. (2014), and Nguyen Tu et al. (2017) are shown in Fig. 2a. There were only small variations in ACL over the course of the studies. The N. affinis litter showed the largest increase in ACL (Li et al., 2017), while C. vulgaris was the only species that ended the experiment with a lower ACL than the initial measurement (Fig. 2a).
The CPI was also reported in the same four studies; the results are shown in Fig. 2b. There is some variability between the studies in how the CPI changed with time. The CPI of N. affinis dropped rapidly at the beginning of the experiment but then slightly increased until the end of the experiment (Li et al., 2017) (Fig. 2b). In contrast, Nguyen Tu et al. (2017) measured an increase in the CPI of F. sylvatica litter from 9.9 to 14.7 after 2 years (Fig. 2b). C. vulgaris and O. fragrans showed slight decreases in CPI over the experimental periods, 23 years and 369 d, respectively. Three of the grasses in the experiment of Wang et al. (2014) revealed a similar pattern with small increases or decreases at different sampling points but returning to about their starting CPI after 210 d (Fig. 2b). However, A. retroflexus showed a different pattern with a sharp increase at 60 d followed by a decrease to 1.4 from its initial 3.7. Zech et al. (2011) reported OEP rather than CPI (Fig. 2b). They found that A. pseudoplatanus was characterized by a small initial increase and then remained about the same. Both S. aucuparia and F. sylvatica had a significant decrease in OEP over 27 months.

Results from the plant-litter-soil continuum in open plant-soil systems
Twenty-one studies compared the n-alkane composition of litter to that of fresh plants or soil in open plant-soil systems without using litterbags (Fig. 3, Table 2). These include field sites from a range of climates and environmental conditions. To examine the presence of general trends throughout the data, we clustered them into six broad classes of biomes: coniferous forest, deciduous forest, mixed forest, grassland or shrubland, peatland, and steppe. The choice of these biomes was based on the abundance and distribution of the study sites for which results were found.
The majority of sites included in these studies showed a clear decrease in absolute n-alkane concentration from fresh leaves to senescent leaves (if included) to litter to the organic layer (if included) to the topsoil (Fig. 3) (e.g., Chikaraishi and Naraoka, 2006; Zhang et al., 2017; Marseille et al., 1999; Otto and Simpson, 2005; Nguyen Tu et al., 2001). From either the fresh plants or the litter to the topsoil, the total concentration of n-alkanes decreased noticeably. The average percent decrease from litter to topsoil across the included studies was 46 %, while that from fresh plants to topsoil was 87 %. Figure 3a shows the total concentration normalized to dry weight, and Fig. 3b shows the total concentration normalized to total organic carbon (TOC). The primary difference between the trends shown in Fig. 3a and b is that for coniferous and mixed forest vegetation, the concentration of n-alkanes increases along the continuum when normalized to TOC.
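The difference between the two normalizations can be made concrete with a short sketch. The helper names and all numbers below are hypothetical and chosen only to show how a concentration can fall per gram of dry weight yet rise per gram of organic carbon when the TOC content drops faster than the n-alkane content.

```python
def percent_change(initial: float, final: float) -> float:
    """Percent change from initial to final (negative values are decreases)."""
    return 100.0 * (final - initial) / initial


def normalize_to_toc(conc_ug_per_g_dw: float, toc_fraction: float) -> float:
    """Convert ug/g dry weight to ug/g OC, with TOC given as g OC per g dry weight."""
    return conc_ug_per_g_dw / toc_fraction


# Hypothetical values: litter is carbon rich, topsoil is mostly mineral material.
litter_dw, litter_toc = 100.0, 0.45    # ug/g dry weight, g OC per g dry weight
topsoil_dw, topsoil_toc = 20.0, 0.05

litter_oc = normalize_to_toc(litter_dw, litter_toc)
topsoil_oc = normalize_to_toc(topsoil_dw, topsoil_toc)

print(percent_change(litter_dw, topsoil_dw))   # -80.0 (decrease per g dry weight)
print(percent_change(litter_oc, topsoil_oc))   # about +80 (increase per g OC)
```

The same two helpers apply to any pair of compartments along the continuum, provided both concentrations are expressed in the same units before the percent change is taken.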
The studies reported various changes to the distribution patterns of the n-alkanes from plant to litter to soil (Supplement). Some of the results showed contradictory trends; for example, Zhang et al. (2017) measured an increase in long-chain n-alkanes (C27-C33) and a decrease in mid-chain n-alkanes (C21-C25) from fresh plants to litter in Sphagnum-dominated peatlands, while Marseille et al. (1999) found evidence of increasing proportions of mid-chain lengths in deeper litter layers in forest soils under Fagus sylvatica. Other studies noted decreases in the relative concentration of long-chain n-alkanes (Chikaraishi and Naraoka, 2006; Otto and Simpson, 2005; Hirave et al., 2020), while Nguyen Tu et al. (2001) noted a preferential decrease in shorter chain lengths from fresh leaves to litter of Ginkgo biloba.
Figure 4a-c show the changes in ACL, CPI, and OEP from plant to soil that were either reported or calculated with provided data. The ACL typically increased from fresh plant material to topsoil, though not at sites with grassland or shrubland vegetation and mixed forest vegetation (Fig. 4a). Both the CPI and OEP decreased from fresh material to topsoil in all vegetation types except for coniferous forest (Fig. 4b, c).

Results from soil profiles
With time, partially degraded plant material from the litter layer will be incorporated into the topsoil. Degradation will continue in the soil, aided by soil microbiota and organisms, such as earthworms. Below the topsoil, there is a relative increase in root-derived carbon, including lipids, and a simultaneous decrease in litter-derived carbon (Angst et al., 2016). Limited studies have shown that root biomass could be an important source of n-alkanes in soil, particularly in the subsoil under certain environmental circumstances, such as the presence of a nutrient-rich fossil subsoil horizon (Jansen and Wiesenberg, 2017). We found a large number of studies containing n-alkane measurements from soil profiles and grouped these in the same six biomes as previously mentioned. As some of the studies only included results from the organic layer, we analyzed the trends from 0 to 20 cm depth and 20 cm depth and below separately. The deepest measurement varied from study to study. Figure 5a shows the changes in the total concentration of n-alkanes normalized to dry weight in the six biomes from 0 to 20 cm depth, and Fig. 5b shows the total concentration normalized to dry weight from 20 cm depth and below. Figure 6a, b also present concentration data but normalized to TOC. The latter was only reported for studies in the deciduous forest, grassland or shrubland, and peat vegetation types.
Below the uppermost mineral soil horizon, the results varied per study and biome. In areas with coniferous forest vegetation, there was some evidence of increasing concentration of n-alkanes with increasing depth within the A horizons (Marseille et al., 1999; Schäfer et al., 2016), though the majority of sites still had decreasing concentrations (Schäfer et al., 2016) (Fig. 5a). There were mixed results at the sites with deciduous forest vegetation, depending on whether the n-alkane concentration was normalized to the TOC (Figs. 5a, b, 6a, b). Within the A horizons, conflicting trends were identified in two studies: Anokhina et al. (2018) found an increase in n-alkane concentrations by a range of 66.8 % to 120.9 % (Fig. 6b), while Schäfer et al. (2016) noted decreases in concentrations ranging from 33.0 % to 89.3 % (Fig. 5a). Further down the profiles, Anokhina et al. (2018) found a significant increase in concentration in an E horizon followed by a decrease in a B horizon (Fig. 6a, b). Other studies noted generally decreasing concentration through the whole profile (e.g., Angst et al., 2016; Bull et al., 2000; Cui et al., 2010; Wu et al., 2019). At sites with grassland vegetation, from the first to second horizons, there was typically a decrease in the concentration of n-alkanes (Marseille et al., 1999; Schäfer et al., 2016; Bull et al., 2000), though there were three sites where n-alkanes increased in concentration by a range of 4.8 %-22.2 % (Celerier et al., 2009; Schäfer et al., 2016) (Fig. 5a).
SOIL, 7, 785-809, 2021 https://doi.org/10.5194/soil-7-785-2021
Figure 3. (a) [...] (Schäfer et al., 2016). For the deciduous forest type, 39 sites were included from seven studies (Bliedtner et al., 2018; Bush and McInerney, 2015; Chikaraishi and Naraoka, 2006; Schäfer et al., 2016; Stout, 2020; Trigui et al., 2019; Wu et al., 2019). For the mixed forest type, 25 sites were included from two studies (Bush and McInerney, 2015; Howard et al., 2018). For the grassland or shrubland vegetation type, 110 sites were included from eight studies (Bliedtner et al., 2018; Bush and McInerney, 2015; Howard et al., 2018; Lemma et al., 2019; Li et al., 2018; Schäfer et al., 2016; Trigui et al., 2019; Yao et al., 2019). For the peat vegetation type, five sites were included from two studies (Ficken et al., 1998; Zhang et al., 2017). For the steppe vegetation type, 57 sites were included from two studies (Struck et al., 2020; Yao et al., 2019). (b) Total concentration (µg/g OC) of n-alkanes in fresh leaves, litter, organic layer (litter removed), and topsoil. For the coniferous forest vegetation type, five sites were included from two studies (Hirave et al., 2020; Otto and Simpson, 2005). For the deciduous forest vegetation type, 11 sites were included from five studies (Angst et al., 2016; Anokhina et al., 2018; Hirave et al., 2020; Otto and Simpson, 2005; Wu et al., 2019). For the mixed forest type, one site was included from one study (Hirave et al., 2020). For the grassland or shrubland vegetation type, two sites were included from one study [...].
Figure 4. (a) [...] (Hirave et al., 2020; Schäfer et al., 2016). For the deciduous forest type, 41 sites were included from nine studies (Angst et al., 2016; Bliedtner et al., 2018; Bush and McInerney, 2015; Hirave et al., 2020; Lei et al., 2010; Schäfer et al., 2016; Stout, 2020; Trigui et al., 2019; Wu et al., 2019). For the mixed forest type, 26 sites were included from three studies (Bush and McInerney, 2015; Hirave et al., 2020; Howard et al., 2018). For the grassland or shrubland vegetation type, 110 sites were included from eight studies (Bliedtner et al., 2018; Bush and McInerney, 2015; Howard et al., 2018; Lemma et al., 2019; Li et al., 2018; Schäfer et al., 2016; Trigui et al., 2019; Yao et al., 2019). For the peat vegetation type, five sites were included from two studies (Ficken et al., 1998; Zhang et al., 2017). For the steppe vegetation type, 58 sites were included from three studies (Buggle et al., 2010; Struck et al., 2020; Yao et al., 2019). (b) CPI of n-alkanes in fresh leaves, litter, organic layer (litter removed), and topsoil. For the coniferous forest vegetation type, there were 13 sites included from two studies (Hirave et al., 2020; Schäfer et al., 2016). For the deciduous forest type, 42 sites were included from 10 studies (Angst et al., 2016; Bliedtner et al., 2018; Bush and McInerney, 2015; Chikaraishi and Naraoka, 2006; Hirave et al., 2020; Lei et al., 2010; Schäfer et al., 2016; Stout, 2020; Trigui et al., 2019; Wu et al., 2019). For the mixed forest type, 26 sites were included from three studies (Bush and McInerney, 2015; Hirave et al., 2020; Howard et al., 2018). For the grassland or shrubland vegetation type, 110 sites were included from eight studies (Bliedtner et al., 2018; Bush and McInerney, 2015; Howard et al., 2018; Lemma et al., 2019; Li et al., 2018; Schäfer et al., 2016; Trigui et al., 2019; Yao et al., 2019). For the peat vegetation type, five sites were included from two studies (Ficken et al., 1998; Zhang et al., 2017). For the steppe vegetation type, 57 sites were included from two studies (Struck et al., 2020; Yao et al., 2019).
Only two studies on grasslands reported on horizons below the A horizon: Celerier et al. (2009) found the concentration decreased from the A to B horizon by 53.56 %, while Feng and Simpson (2007) noted increases in the concentration of n-alkanes C24-C33 from the A to B horizon in four grasslands ranging from 100 % to 350 %, followed by further changes from the B to C horizon ranging from a 25.0 % decrease to a 314.3 % increase. Another study in a steppe biome measured not by horizons but by 15 cm intervals, and this study found alternating increases and decreases in the n-alkane concentrations down to a depth of 97.5 cm (Buggle et al., 2010).
The soil contained a wider range of chain lengths than plants or litter (e.g., Angst et al., 2016; Anokhina et al., 2018), though no study reported measuring n-alkanes shorter than C14. Soils were also reported to contain more n-alkanes with even chain lengths than the plants or litter, indicative of a source other than higher terrestrial plants (Almendros et al., 1996). Even so, the dominant chain length in many studies remained the same at all depths (e.g., Buggle et al., 2010; Cui et al., 2010; Huang et al., 1996) or was replaced by a longer chain length in the lower depths (e.g., Angst et al., 2016; Anokhina et al., 2018; Bull et al., 2000). Every study, except one (Almendros et al., 1996), reported a dominant chain length of at least C25 in the soil layers, showing that the most abundant source of n-alkanes is leaf wax from higher plants.
As with the litter studies, not every study included the same type of index measurements. A few studies reported the ACL (Fig. 7a, b), which decreased with depth at many sites (e.g., Wu et al., 2019) but also increased at a significant number of sites, particularly in the upper horizons (e.g., Schäfer et al., 2016). The CPI is presented in Fig. 8a, b and was generally found to decrease with depth in the majority of studies that included it (e.g., Angst et al., 2016; Celerier et al., 2009; Huang et al., 1996; Wu et al., 2019). The OEP (Fig. 9a, b) was likewise found to decrease with depth at most sites, even at those sites that only provided measurements from the organic layer (e.g., Schäfer et al., 2016). However, there were a few instances of the OEP increasing in lower soil layers, including at the Anokhina et al. (2018) study site, in which the OEP had decreased to ∼ 1 in the 12-28 cm interval and then increased in the following 28-60 cm interval (Fig. 9a, b).

Discussion
Overall, the literature review indicated that there is a very limited number of studies that examined the degradation of n-alkanes either in manipulative experiments or in open systems. The studies that are available focus on very specific settings and are modest in scope and size, which makes it difficult to determine what the primary drivers of change are and how these might cause a variance between environmental settings and vegetation species. Studies with available data are unevenly distributed geographically, with the vast majority performed in Europe. Therefore, more research focusing on n-alkane degradation along the entire pathway from plant to soil is urgently needed, and future research should be performed on a wider geographic range. Despite these limitations, there are some trends emerging from the literature available to date with respect to n-alkane degradation along (parts of) the trajectory plant-litter-soil. These trends are as follows:

Decrease in total concentration of n-alkanes
There are many potential variations in the fate of n-alkanes in soil and peat profiles. However, across vegetation types and study sites, there is generally a trend of decreasing overall n-alkane concentration with soil depth. This is most likely caused by the degradation or reworking of n-alkanes by soil microbiota. Degradation products of n-alkanes include n-methyl ketones through subterminal oxidation (Klein et al., 1968; Amblès et al., 1993; Jansen and Nierop, 2009) or n-alcohols through terminal oxidation that can be converted into aldehydes and finally into n-fatty acids (Rojo, 2009) [...] the decarboxylation of n-fatty acids (Jansen and Nierop, 2009).
In litterbag experiments, the total concentrations of n-alkanes decreased with time in nearly all of the vegetation species (Fig. 1). Biodegradation is likely accelerated in the litter layer due to oxic conditions (Chikaraishi and Naraoka, 2006) and an active microbial community (Rojo, 2009). Additionally, Zech et al. (2011) acknowledged the possibility that n-alkanes may have been washed out of the litterbags. However, as n-alkanes are hydrophobic and almost insoluble in water, this is unlikely, especially for plant-derived long-chain n-alkanes. Unexpectedly, each of the grass species included in the Wang et al. (2014) study showed increases in the concentration of total n-alkanes (Fig. 1). Wang et al. (2014) explained this unexpected increase as the result of n-alkane degradation being a complex process influenced by both chemical properties and plant species, as evidenced by the results of their short-term experiment.
In open plant-soil systems, the absolute concentration of n-alkanes generally decreased substantially (averaging 87 %) along the sequence from fresh plant leaves to topsoil, as seen in most of the litterbag experiments (Figs. 1, 3). The observed deviation from this trend in a few of the coniferous forest sites (Fig. 3) is likely due to the fact that the vegetation or litter samples at these sites often had a very low initial concentration of n-alkanes (e.g., Schäfer et al., 2016; Otto and Simpson, 2005). Therefore, it is feasible that the accumulation of n-alkanes in the lower layers over time could lead to higher concentrations, even as degradation is occurring. Additionally, as waxes on coniferous needles are renewed throughout the year, physical abrasion and subsequent deposition of the waxes onto the soil can cause an enrichment of wax lipids in the topsoil compared to the fresh biomass (Heinrich et al., 2015).
When subsequently focusing on the development of the n-alkanes with depth in soil profiles, the trends are generally in line with the expected microbial degradation over time (Figs. 5, 6). However, an interesting case is presented by the single study of a peat profile that reported concentrations of n-alkanes with depth relative to the TOC (Andersson and Meyers, 2012; Fig. 6b). Recently, Keiluweit et al. (2017) proposed that under (partially) anaerobic conditions, reduced organic compounds such as lipids and waxes could be selectively protected from degradation. This would result in a relative increase in the n-alkane concentrations with time and, consequently, also with depth until a certain threshold in anaerobic environments such as peat soils. However, such a trend is not supported by the results of Andersson and Meyers (2012; Fig. 6b). Even earlier studies (e.g., Ficken et al., 1998) showed in such peat sequences that the historical development of the peatland with changing biogenic inputs can have a stronger effect on alkane concentrations than the selective enrichment under anaerobic conditions as proposed by Keiluweit et al. (2017). The latter might be more relevant for nearly steady-state conditions of peaty soils with continuous accumulation of similar peat biomass over long periods of time. The lack of more datasets of relative n-alkane concentrations with depth in peat and other (partially) anaerobic soils prevents us from reconciling this apparent mismatch. This underscores the need to expand our database of n-alkane trends in different systems, also from the point of view of unraveling the factors governing carbon stabilization in various soil environments.

Preferential degradation
As the overall distribution patterns of n-alkanes are often used for the purpose of vegetation reconstruction (e.g., Schwark et al., 2002), it is essential that these patterns remain relatively unchanged following deposition onto the litter layer so that they can be used accurately. All of the species in the litterbag experiments retained their range of chain lengths as well as their most abundant chain length (Supplement), indicating that there are limited changes in the distribution patterns of the n-alkanes and no preferential degradation of long chain lengths. Though some of the results from the litterbag experiments are supported by those seen in the open plant-soil systems, there are also some differences. While the distribution patterns of the n-alkanes in the litterbag experiments changed little from their initial state, this was not the case in all of the studies in the open systems. Differences in results from the litterbag experiments could be a result of the mesh bags used, which prevent larger soil organisms such as earthworms from accessing the litter inside (Bradford et al., 2002).
There is considerable evidence for microbial alteration of n-alkanes in open plant-soil systems and for the resulting changes in distribution patterns. These changes could be partially due to preferential degradation of n-alkanes of certain chain lengths or selective preservation of some alkanes (Lichtfouse et al., 1998) rather than the same rate of microbial degradation for all n-alkane compounds. A few studies have found that n-alkanes with shorter chain lengths are degraded more quickly than those with longer chain lengths (Moucawi et al., 1981; Amblès et al., 1993). The results of the other studies included in the review appear to support this as well: in the majority of the studies, the dominant long-chain n-alkane either remained the most abundant in the soil or was superseded by an even longer chain length (e.g., Bliedtner et al., 2018; Buggle et al., 2010; Cui et al., 2010; Huang et al., 1996; Angst et al., 2016; Anokhina et al., 2018; Bull et al., 2000). Therefore, even if preferential degradation occurs, it is still generally possible to identify characteristic patterns of plant-derived n-alkanes in soil because the primary long chain lengths remain dominant.
As evidence of preferential degradation, the CPI and OEP tended to decrease with depth or from litter to soil (Figs. 8, 9), with some exceptions. Under aerobic conditions, microbial reworking has been shown to cause an increase in lower chain length n-alkanes and a loss of high odd-over-even predominance (Grimalt et al., 1998; Brittingham et al., 2017). This is a result of the degradation of longer chain length n-alkanes occurring at the same time as an accumulation of microbial-derived medium-chain n-alkanes (Brittingham et al., 2017). The resulting change in chain length distribution can explain the decreases in CPI and OEP with depth throughout the mineral soil that are seen in many studies (Angst et al., 2016; Celerier et al., 2009; Huang et al., 1996; Wu et al., 2019; Li et al., 2017; Bliedtner et al., 2018; Schäfer et al., 2016). Changes in CPI could also indicate a shift in vegetation composition over time.
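To make the two indices concrete, the following minimal sketch computes CPI and OEP from per-chain-length concentrations. The equations shown are one commonly used formulation and are an assumption here, since index definitions vary between studies; the concentration values are invented purely for illustration and are not taken from any reviewed study.

```python
# Illustrative sketch only: index equations vary between studies, and the
# formulations below are ASSUMED, not taken from the review itself.

def cpi(conc):
    """Carbon preference index over C23-C33 (assumed formulation):
    (sum odd C23-C31 + sum odd C25-C33) / (2 * sum even C24-C32)."""
    odd_low = sum(conc.get(n, 0.0) for n in range(23, 32, 2))   # C23..C31
    odd_high = sum(conc.get(n, 0.0) for n in range(25, 34, 2))  # C25..C33
    even = sum(conc.get(n, 0.0) for n in range(24, 33, 2))      # C24..C32
    return (odd_low + odd_high) / (2.0 * even)

def oep(conc):
    """Odd-over-even predominance (assumed formulation):
    (C27 + C29 + C31 + C33) / (C26 + C28 + C30 + C32)."""
    odd = sum(conc.get(n, 0.0) for n in (27, 29, 31, 33))
    even = sum(conc.get(n, 0.0) for n in (26, 28, 30, 32))
    return odd / even

# HYPOTHETICAL concentrations (arbitrary units), keyed by carbon number.
litter = {23: 2, 24: 1, 25: 4, 26: 1, 27: 10, 28: 1,
          29: 20, 30: 1, 31: 10, 32: 1, 33: 2}
topsoil = {23: 2, 24: 2, 25: 3, 26: 2, 27: 5, 28: 2,
           29: 8, 30: 2, 31: 5, 32: 2, 33: 2}

for name, sample in (("litter", litter), ("topsoil", topsoil)):
    print(name, "CPI =", round(cpi(sample), 2), "OEP =", round(oep(sample), 2))
```

In this invented example both indices are lower in the topsoil than in the litter, which is the direction of change described above. Values approaching 1 indicate that the odd-over-even predominance has largely been lost.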
In the litterbag experiments, the trend was not quite as clear. For Calluna vulgaris and three of the grass species, only slight changes were noted (Fig. 2b). Wang et al. (2014) inferred from their results that the CPI might be insensitive to early litter degradation, e.g., within the first year. This cannot fully explain the results of Huang et al. (1997) as the experiment lasted 23 years; however, as it occurred in a peatland with anaerobic conditions present at shallow depth, it is likely that degradation was inhibited following the litterbag's burial by newer litter accumulation. Nguyen Tu et al. (2017) found that the CPI increased in Fagus sylvatica litter after 2 years, which can further support that there is limited degradation of the distribution pattern of n-alkanes in litter. Zech et al. (2011) found that there was a large decrease in OEP in Fagus sylvatica and Sorbus aucuparia, which could indicate that there is preferential degradation of odd n-alkanes rather than even n-alkanes as the ratio approaches 1 or that there is a different source of n-alkanes affecting the litter's signal that has an even-over-odd predominance. Additionally, other organic compounds such as biopolymers could break down through degradation and contribute n-alkanes that do not have an odd-over-even predominance (Jansen and Wiesenberg, 2017). The general decreases in CPI and OEP will not affect source apportionment or vegetation reconstruction if the odd-chain length n-alkanes are used. However, changes in the ACL could indicate that there has been a shift in the pattern of odd-chain lengths. In terms of the biomes, the largest relatively short-term changes seem to occur in deciduous forest vegetation (Fig. 7a). This could cause difficulties in identifying dominant vegetation.
To determine what could be causing a change in ACL in the short term in certain biomes, further study of preservation in specific environments should be considered. While changes in the deeper profiles could be a result of a shift in the vegetation input over time, this is not likely the case in the studies that only considered topsoil samples, as did most of the included studies.
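As a minimal illustration of how a shift among the odd chain lengths shows up in the ACL, the sketch below computes a concentration-weighted average chain length. The chain-length range (odd C27 to C33) and the concentrations are assumptions chosen for the example; published studies use differing ranges.

```python
# Illustrative sketch only: the chain-length range used for ACL differs
# between studies; odd C27-C33 is an ASSUMPTION made for this example.

def acl(conc, chain_lengths=(27, 29, 31, 33)):
    """Average chain length: sum(n * C_n) / sum(C_n) over the chosen chains."""
    total = sum(conc.get(n, 0.0) for n in chain_lengths)
    if total == 0:
        raise ValueError("no n-alkanes present in the selected range")
    weighted = sum(n * conc.get(n, 0.0) for n in chain_lengths)
    return weighted / total

# HYPOTHETICAL concentrations (arbitrary units), keyed by carbon number.
litter = {27: 10, 29: 20, 31: 10, 33: 2}
topsoil = {27: 5, 29: 8, 31: 5, 33: 2}

print("litter ACL  =", round(acl(litter), 2))
print("topsoil ACL =", round(acl(topsoil), 2))
```

Because ACL is a weighted mean, it can change even when the most abundant chain length stays the same, which is why it is sensitive to shifts in the pattern of odd chain lengths.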

Other sources of n-alkanes
There are also other potential sources of n-alkanes that may affect distribution patterns and alter plant-derived signals. These include n-alkanes present in aerosols after being abraded from leaves (Rogge et al., 1993; Nelson et al., 2017, 2018). The leaf waxes found in aerosols may obscure the local vegetation biomarker signal found in soil with a more regional signal (Nelson et al., 2017; Howard et al., 2018). Two additional potential sources that have been relatively neglected in studies of biomarkers are pollen (Hagenberg et al., 1990) and insects (Chikaraishi et al., 2012).
Leaves are not the only plant organs containing n-alkanes. Roots have been found to generally have different distributions of n-alkanes than aboveground biomass (Jansen et al., 2006; Angst et al., 2016), though for grasses, the composition can be similar (e.g., Kuhn et al., 2010). Therefore, the root signal can potentially affect the distribution of n-alkanes in deeper soil by imprinting a new signal over that of plant waxes (Gocke et al., 2010). However, roots often contain a much lower concentration of n-alkanes compared to plant leaves and are not likely to be the primary source of n-alkanes in topsoils (Angst et al., 2016; Gamarra and Kahmen, 2015). Nevertheless, studies have found that roots in some species have a higher concentration of n-alkanes than in the leaves (Huang et al., 2011), and the continuous growth of roots, as well as the release of exudates, can cause roots to be a larger source of n-alkanes, particularly in lower soil horizons (Jansen and Wiesenberg, 2017). Woody tissues and bark also contain low concentrations of n-alkanes, though they may be relevant sources in forests (Seca et al., 2000).

Factors affecting preservation
Additionally, there are many environmental factors that can affect the preservation of the n-alkane concentrations and distribution patterns in soil, including soil characteristics. Soil pH has been found to affect the preservation of lipids and other organic compounds, with lower rates of decomposition noted in more acidic soils (Moucawi et al., 1981; Oades, 1988; Bull et al., 2000). This is likely due to reduced microbial activity in acidic soils. Furthermore, n-alkanes can be physically protected from microbes due to encapsulation within larger organic macromolecules (Almendros and González-Vila, 1987; Lichtfouse et al., 1998) or in soil aggregates. Forest fires have also been shown to affect the distribution pattern of n-alkanes found in soil, causing a higher abundance of shorter chain lengths (Almendros et al., 1988; González-Pérez et al., 2008).

Transportation of n-alkanes in soil
Because lipids are generally hydrophobic and not soluble in water, n-alkanes are not considered to be as susceptible to leaching as other organic compounds (Naafs et al., 2004). However, n-alkanes could be transported as particulate matter when sorbed to minerals or other organic material. This could be a substantial factor, in particular in soil types such as Alisols and Luvisols characterized by vertical transport of clay particles (IUSS Working Group, 2014). The potential for changes in n-alkane concentration or composition as a result of vertical transport in soil has not yet been researched in depth.

Potential to correct for post-depositional changes
Despite the evidence of n-alkane degradation, very few studies have attempted to address this when measuring and quantifying n-alkanes in soil archives, particularly when using the composition to reconstruct vegetation changes. A notable exception is Zech et al. (2009), who developed a two-end-member model using long-chain n-alkane ratios from litter and topsoil under pure forest and pure grassland cover as the two end-members. Data from their study and their reviewed literature were plotted on "degradation lines" using the n-alkane ratios as the dependent variable and OEP as the independent one. Using these simple models, it was possible to estimate the percentage of total n-alkanes that had been contributed by grass species or forest species. However, Zech et al. (2009) noted that the accuracy decreased with lower OEP values as the lines converged. This method has been subsequently used in a number of studies from the same working group (Zech et al., 2012, 2013; Schäfer et al., 2016; Bliedtner et al., 2018). Buggle et al. (2010) used a linear regression approach to calculate a relationship between long-chain n-alkane ratio and OEP values in a modern soil, and using this "alteration line", they corrected the long-chain n-alkane ratios found in paleosol samples (i.e., removed the effect of degradation). This approach assumed that the alteration of n-alkanes in the paleosol would be similar to that in the modern soil. The corrected ratios were used to determine the contributions of n-alkanes in the paleosol that were derived from trees or grasses.
These two approaches rely on the assumption that forest and grassland vegetation can be differentiated based only on the long-chain n-alkane composition, which is likely not always applicable due to variations in composition across species and environmental conditions (Bush and McInerney, 2013). Therefore, there is certainly room for the development of more approaches to accounting for degradation effects on n-alkane compositions. However, the lack of quantitative, process-based studies of n-alkane degradation in various ecosystems hampers the development of more advanced approaches to correct for n-alkane degradation.
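The logic of an end-member correction based on degradation lines can be sketched as a generic linear two-end-member mixing calculation. This is not the published model of Zech et al. (2009) or Buggle et al. (2010): the linear form of the lines, their coefficients, and the function names are all hypothetical and were chosen only so that the two lines converge at low OEP, as described in the text.

```python
# Illustrative sketch only. The line coefficients are HYPOTHETICAL and the
# linear form is an ASSUMPTION; neither is taken from a published model.

GRASS_LINE = (0.05, 0.5)    # (slope, intercept): ratio vs. OEP under pure grass
FOREST_LINE = (-0.03, 0.5)  # (slope, intercept): ratio vs. OEP under pure forest

def line_value(oep, slope, intercept):
    """Expected end-member n-alkane ratio at a given OEP."""
    return slope * oep + intercept

def grass_fraction(ratio, oep, grass=GRASS_LINE, forest=FOREST_LINE):
    """Linear mixing between the two end-member lines at the sample's OEP."""
    g = line_value(oep, *grass)
    f = line_value(oep, *forest)
    if abs(g - f) < 1e-9:
        raise ValueError("end-member lines coincide at this OEP")
    frac = (ratio - f) / (g - f)
    return min(1.0, max(0.0, frac))  # clamp to the physically meaningful range

def fraction_error(ratio_error, oep, grass=GRASS_LINE, forest=FOREST_LINE):
    """Error in the estimated fraction caused by an error in the measured ratio."""
    gap = abs(line_value(oep, *grass) - line_value(oep, *forest))
    return abs(ratio_error) / gap

print("grass fraction at OEP 10:", round(grass_fraction(0.6, 10.0), 3))
print("error at OEP 10:", round(fraction_error(0.05, 10.0), 4))
print("error at OEP 2: ", round(fraction_error(0.05, 2.0), 4))
```

With these made-up lines, the same measurement error in the ratio produces a much larger error in the estimated grass fraction at low OEP, where the lines lie close together. This mirrors the loss of accuracy at low OEP noted by Zech et al. (2009).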

Knowledge gaps
What this review has illustrated above all is that there are still many gaps in fully understanding the fate of n-alkanes in soil, particularly in quantifying post-depositional changes. This is not aided by the fact that there does not seem to be a uniform method for reporting lipid data measured in soils. Although soil characteristics have been shown to influence the preservation of n-alkane composition, basic information, such as soil pH, is often not included in papers.
Additionally, there has been a noticeable shift towards data transparency and increased availability in recently published studies. However, no data and meta-data standards exist for storing and labeling biomarker data, the soil and vegetation from which they were collected, and the lab methods used for their determination, and as a result, the biomarker data found in the literature are not interoperable (i.e., they cannot readily be processed by analysis software) (Wilkinson et al., 2016). Furthermore, the data from older but still frequently cited studies are often simply shown in poorly labeled graphs or histograms and are not accessible anywhere else. Access to these older datasets could potentially provide information helpful for answering enduring questions related to n-alkanes.
Finally, there is a noticeable lack of available data from many areas worldwide, as seen in Fig. 10. Most sites with reported n-alkane data are concentrated in Europe, though recent studies are increasingly being performed in less represented areas, such as Peru (Wu et al., 2019), Ethiopia (Lemma et al., 2019), and Mongolia (Struck et al., 2020). Future studies of n-alkanes should similarly aim to cover less researched geographic areas.

Recommendations
Meta-analysis and synthesis have become more prevalent across scientific disciplines, including in the natural sciences, and have enabled better understanding of scientific questions on global scales (Gurevitch et al., 2018). This review provides a rudimentary synthesis of the current state of knowledge on the degradation and transformation of plant-derived n-alkanes in soil archives. Due to the knowledge gaps and issues of data accessibility and compatibility, truly rigorous systematic review and meta-analysis are currently not possible for biomarker studies. However, if the research community begins to move towards more uniform and systematic reporting of biomarker data, the potential for synthesis across studies could increase rapidly and enable a more complete and quantitative understanding of the fate of n-alkanes in soils across ecosystems.
To move towards this goal, we provide three concrete suggestions for anyone who plans to collect plant and soil biomarker data.
1. Measure and report basic soil parameters, such as bulk OC and soil pH. If the soil type has already been characterized, include a reference to this information. If the soil has not been characterized, the soil type should be determined using the World Reference Base for Soil Resources (IUSS Working Group, 2014) or the USDA Soil Taxonomy (Soil Survey Staff, 2014) guidelines rather than a localized classification system in the interest of standardization.
2. Local climate information should be reported, or enough specific location data should be provided so that it is easy for readers to find the climate information from other sources. Additionally, local vegetation should be characterized at a minimum by reporting dominant species cover.
3. Longer chain lengths of n-alkanes should be measured and reported, e.g., C20-C36. The primary data are more useful to report than index measurements due to the variations in equations for index measurements.
In the future, these data should be gathered in a freely accessible database dedicated to lipid biomarker measurements, including n-alkanes, that could allow scientists globally to have a better understanding of what data are already available and what study areas are underrepresented. Building such a database would require a general template of required data including soil characteristics and environmental parameters in addition to the biomarker concentration and pattern data. In addition, meta-data standards for describing these data would need to be selected (for an overview, see Hoffmann et al., 2020). Schädel et al. (2020) recently developed a database for soil incubation studies, the Soil Incubation Database (SIDb), along with guidelines for reporting data from the studies. Due to the related nature of the research, we have adapted their reporting guidelines as a suggested starting point for standardization of biomarker studies (Table 3). As mentioned, site information including soil classification and vegetation should be considered essential to report. Some soil characteristics are essential, particularly bulk OC, pH, and depth. Others are recommended as they could enable a more quantitative understanding of biomarker degradation or preservation. It is preferable to report primary biomarker data rather than index measurements, though if index measurements are included, the equations used for calculation must also be included. Studies using litterbag experiments should report some additional data regarding the setup and execution of the experiment. Wider adoption of these guidelines would be very beneficial for developing a biomarker database. Eventually, such a database could be similar in structure and usability to the Neotoma Paleoecology Database (https://www.neotomadb.org/, last access: 20 August 2021).
Until that time, but also as an alternative in case data are incompatible with the structure imposed by the database, scientists could use existing scientific data repositories such as Pangaea (https://www.pangaea.de, last access: 20 August 2021) or the EarthChem Library (https://www.earthchem.org, last access: 20 August 2021).
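As a minimal sketch of how the essential reporting items named above could be checked before a dataset is deposited, the example below defines a simple record and lists which essential fields are missing. The class, field names, and units are hypothetical and do not reproduce Table 3 or the schema of any existing database.

```python
# Illustrative sketch only: field names and units are HYPOTHETICAL and do not
# reproduce Table 3 or the schema of any existing database or repository.
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class BiomarkerRecord:
    soil_classification: Optional[str] = None   # e.g., a WRB or USDA class
    dominant_vegetation: Optional[str] = None
    bulk_oc_percent: Optional[float] = None     # bulk organic carbon
    soil_ph: Optional[float] = None
    depth_cm: Optional[float] = None
    # Primary data: concentration per chain length, keyed by carbon number.
    alkane_conc: Dict[int, float] = field(default_factory=dict)
    # Equations for any reported indices, keyed by index name.
    index_equations: Dict[str, str] = field(default_factory=dict)

ESSENTIAL_FIELDS = ("soil_classification", "dominant_vegetation",
                    "bulk_oc_percent", "soil_ph", "depth_cm")

def missing_essentials(record: BiomarkerRecord) -> List[str]:
    """Return the names of essential items that have not been reported."""
    missing = [name for name in ESSENTIAL_FIELDS
               if getattr(record, name) is None]
    if not record.alkane_conc:
        missing.append("alkane_conc")
    return missing

# HYPOTHETICAL record with soil pH left unreported.
record = BiomarkerRecord(
    soil_classification="Luvisol",
    dominant_vegetation="Fagus sylvatica",
    bulk_oc_percent=2.1,
    depth_cm=10.0,
    alkane_conc={27: 5.0, 29: 8.0, 31: 5.0},
)
print(missing_essentials(record))
```

A check of this kind could be run at submission time so that records lacking basic soil parameters, which the review found are often omitted, are flagged early.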
Data availability. Data extracted from the reviewed studies may be found in the Supplement.