Riverbank erosion affects river morphology and local habitat, and results in riparian land loss, property and infrastructure damage, and ultimately flood defence weakening. An important issue concerning riverbank erosion is the identification of the vulnerable areas in order to predict river changes and assist stream management/restoration. An approach to predict areas vulnerable to erosion is to quantify the erosion probability by identifying the underlying relations between riverbank erosion and geomorphological or hydrological variables that prevent or stimulate erosion. In the present work, a statistical methodology is proposed to predict the probability of the presence or absence of erosion in a river section. A physically based model determines the locations vulnerable to erosion by quantifying the potential eroded area. The derived results are used to determine validation locations for the evaluation of the statistical tool performance. The statistical tool is based on a series of independent local variables and employs the logistic regression methodology. It is developed in two forms, logistic regression and locally weighted logistic regression, which both deliver useful and accurate results. The second form, though, provides the most accurate results as it validates the presence or absence of erosion at all validation locations. The proposed tool is easy to use and accurate and can be applied to any region and river.

Riverbank erosion is a complex phenomenon resulting from various factors which affect the balance of ecosystems. It is also important from a geomorphological perspective, as it induces changes in the river channel course and in the development of the floodplain (Hooke, 1979; Bridge, 2003). Mass-failure processes, which occur due to a combination of hydraulic and geotechnical processes that undercut bank toes and cause bank collapse, constitute a significant source of sediment in disturbed streams (Simon et al., 2009). Riverbank erosion is a natural geomorphological process that affects the fluvial environment in many aspects: physical, ecological and socio-economic. It is the result of a complex interaction between the channel hydraulic conditions and the physical characteristics of the banks, both of which are highly variable in nature. Bank retreat affects the riverbed structure and morphology as well as the floodplain morphology and the physical habitat. In addition, riparian land losses and damage to human property and infrastructure lead to direct financial consequences. Moreover, increased turbidity, sediment and debris transport, and flood defence weakening reveal a complex combination of issues arising from riverbank erosion. According to Atkinson et al. (2003), significant parameters affecting erosion are vegetation index (stability), the presence or absence of meanders, bank material (classification) and stream power. Other factors such as bank height, riverbank slope, river cross section width, riverbed slope and water velocity have also been reported to affect the erosion rate (Hooke, 1979; Abam, 1993; Winterbottom and Gilvear, 2000; Rinaldi et al., 2008; Luppi et al., 2009). Therefore, the identification of riverbanks which are vulnerable to erosion is of utmost importance, either for their protection or restoration.

On the other hand, riverbank erosion is a significant factor in the functioning of river-dependent ecosystems and provides a sediment source that creates riparian habitat. Bank erosion is a key geomorphological mechanism for fluvial ecosystems, since it regulates the diversity of habitats, species and vegetal units. The process drives riparian vegetation succession and develops dynamic habitats, vital for fluvial plants and animals. Small-scale bank erosion, or erosion of local extent, has no significant adverse influence on the aquatic ecosystem and contributes to its sustainability. In the opposite case, the ecosystem is significantly affected, while riparian land losses and damages occur, leading to areas vulnerable to flooding (Piégay et al., 1997, 2005; Florsheim, 2008).

The bank erosion process is closely related to the soil composition of the riverbanks, and the erodibility factor is affected by the proportions of sand, silt and clay. A high content of sand and silt leads to easily eroded soils since both are fine in size and can be carried away by river flow. The most common type of bank structure is a stratified or interbedded bank of cohesive or non-cohesive layers. Riverbanks made up of non-cohesive soil are very erodible due to the low clay content and the weak erosion resistance of the bank soil. In contrast, cohesive soils have increased clay or clayey silt content and are more resistant to erosion. Non-cohesive soils erode as individual grains, while cohesive soils erode as aggregates. A bedrock bank, finally, is usually very stable and will only experience gradual erosion (Raudkivi, 1998; Roslan et al., 2013).

Although riverbank erosion is a common phenomenon, the prediction of the location and the extent of riverbank erosion is difficult. Therefore, a range of approaches and methods have been developed and tested. The most important issue concerning riverbank erosion is the identification of the areas vulnerable to bank erosion in order to predict changes in the river channel form and assist stream management/restoration options. Different methods have been used to predict erodibility, such as analyses of historical maps and the use of sequential aerial photographs based on GIS technology. However, riverbank erosion is usually approached by using a combination of bank stability methods and hydrodynamic models to predict the vulnerable areas and estimate the erosion rate (Nardi et al., 2013). Of these two methods, the former has a relatively high degree of inaccuracy, while the latter is too complex to be applied, as it requires a significant number of data variables.

Herein, a statistical tool is proposed using the logistic regression (LR) technique for the determination of riverbank erosion probability. This technique was selected due to its ability to link related dependent and independent variables by converting their relationship to a probability of presence or absence of the dependent variable. In addition, it can be extended to account for locally spatially correlated independent variables. The suggested statistical model, termed locally weighted logistic regression (LWLR), combines LR and locally weighted regression (LWR) principles to create a local model that calculates the probability of erosion occurring based on spatially correlated secondary information (e.g. bank slope, river cross section). Therefore, the accuracy of the predictions is expected to improve compared to the global LR model.

The proposed statistical model identifies the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. It utilizes the available data to detect areas vulnerable to erosion. In addition, the erosion occurrence probability can be calculated in conjunction with the model deviance for each independent variable or model form tested. A similar method, based on simple logistic regression, was introduced and applied successfully to a river in northern Wales (Atkinson et al., 2003) to estimate the variables that most strongly affect riverbank erosion.

This work also involves the application of the Bank-Stability and Toe-Erosion Model (BSTEM 5.2) in order to predict eroded or non-eroded riverbank areas for the validation of the proposed tool. BSTEM is a physically based model, developed by the National Sedimentation Laboratory in Oxford, Mississippi, USA (Simon et al., 2000), and it has been used to simulate the hydraulic and geotechnical processes responsible for mass failure. It represents two distinct processes, namely the failure by shearing of a soil block of variable geometry and the erosion by flow of bank and bank toe material. BSTEM has been successfully applied in diverse alluvial environments (e.g. Simon et al., 2000, 2002; Simon and Thomas, 2002; Pollen and Simon, 2005; Pollen-Bankhead and Simon, 2009; Simon et al., 2011). It was used to simulate the effects of enhanced matric suction from evapotranspiration and decreased soil erodibility driven by the presence of plant roots, quantifying the effects on the streambank factor of safety and comparing them with the effects of mechanical root reinforcement (Pollen-Bankhead and Simon, 2010). BSTEM was also used to quantify bank retreat, which ranged from 7.8 to 20.9 m along 100 m of riverbank at the Barren Fork Creek site (Midgley et al., 2012). It was also used to quantify the reductions of mass-failure frequency and sediment loading from streambanks at Lake Tahoe, USA (Simon et al., 2009).

The developed methodology was applied to the Koiliaris River basin on the island of Crete, Greece. The overall concept of this work is to provide estimates of the erosion probability at specific ungauged riverbank locations, based on independent secondary explanatory information, by means of the LWLR methodology. BSTEM has an auxiliary role: it estimates and validates potentially eroded riverbank locations by calculating the potential eroded area, using field measurements of hydraulic, hydrologic and geomorphologic variables. These estimations (dependent variables) are then used to set up and validate the statistical model. To the best of our knowledge, this is the first time that this combination of deterministic and stochastic models to predict riverbank erosion has appeared in the scientific literature.

The Koiliaris River basin is situated 25 km east of Chania
(350

The downstream part of the Koiliaris River, located in the western part of the island of Crete. The yellow pins represent the measurement locations, the red pins the validation locations and the green pin the gauging station located at the intersection of the Koiliaris River with the Keramianos tributary. A representation of the measured geomorphological values is provided in the upper left corner.

Typical hydrograph of the Koiliaris River at the gauging station (November 2013–June 2014).

The bank erosion vulnerability of the Koiliaris' riverbanks was first studied
during the hydrologic period 2010–2011. The downstream section of the river
was divided into eight subsections of variable length, starting from the
gauging station up to pin no. 8 on the study area map (Fig. 1). In each subsection,
the geomorphological characteristics of the riverbanks and the riverbed were
measured at the beginning and at the end of the subsection, during the first
field campaign. Channel and bank geometry characteristics, flow parameters,
bank material, bank vegetation and protection parameters were identified and
used as input to the BSTEM model to calculate the riverbank eroded area
(

Therefore, reach slope varied between 0.0042 and 0.11 m m

At the beginning of the hydrological year 2013–2014, a second field campaign was designed, this time to identify specific locations vulnerable to erosion. Twelve riverbank locations were selected along the aforementioned eight subsections and scaled sticks were installed. Two of those locations were selected at restored parts of the river section to monitor potentially stable riverbank points. Six months later, at the end of the wet period and after three flood events (Fig. 2, red peaks), the erosion sticks were visually inspected during a field trip to identify the presence or absence of erosion. In this way, the eroded area was also roughly estimated.

The concept for this second campaign was to establish measurement points, which were necessary to develop and apply a statistical model that, taking into account a series of explanatory variables, would determine the probability of riverbank erosion at local scale. Furthermore, a series of validation points were necessary to validate the model's efficiency. For this purpose, the endpoints of each subsection from the first campaign were used as validation points, because an overall estimate of the riverbank vulnerability was available there from the BSTEM results.

However, in order to verify the BSTEM prediction efficiency, it was decided that the model would be tested by using the 12 locations of the second campaign. Based on the model's efficiency and the quality of estimation, the reliability of BSTEM results was evaluated at the eight subsections of the aforementioned river section. The second BSTEM model application estimated the cumulative riverbank erosion effect for the three flash flood events (Fig. 2) at the 12 locations. The other parameters were similar for both model applications since the same river section was employed.

Process flowchart that presents the combined application of the BSTEM and of the proposed statistical model (SMODEL) based on LR principles. “S” and “U” correspond to stable and unstable riverbanks, respectively.

The BSTEM model results (at the 12 locations) together with field
inspection were used to set up the statistical model by interpreting the
erosion existence in terms of binary data (1

Next, the probability of erosion at the riverbanks of the Koiliaris River was estimated considering a series of easy-to-determine independent geomorphological variables (bank slope, river cross section) through LR and LWLR methodologies, first at the validation points and then at ungauged riverbank locations. The methodological steps of the proposed tool and of the overall process are briefly described by a flowchart presented in Fig. 3.

Riverbank erosion can be simulated by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary with geographical location, and therefore a spatially non-stationary regression model is preferred instead of a stationary equivalent. Locally weighted regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression (LR) model. It is referred to as locally weighted logistic regression (LWLR). The two independent variables considered herein were river cross section width and bank slope.
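As a minimal sketch of the global (stationary) LR step with the two predictors named above, the following Python code fits a logistic regression by Newton-Raphson iterations and reports the model deviance. The data are synthetic and purely illustrative, not the Koiliaris measurements; the function names, the slope units (degrees) and the small ridge term added for numerical stability are assumptions made here, not part of the original method.

```python
import numpy as np

# Synthetic, illustrative data (NOT the Koiliaris measurements):
# 12 hypothetical locations with cross section width (m) and bank slope (deg).
width = np.array([5, 6, 4, 15, 14, 16, 5, 4, 6, 15, 16, 14], dtype=float)
slope = np.array([20, 22, 18, 20, 18, 22, 60, 58, 62, 60, 62, 58], dtype=float)
eroded = np.array([1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0], dtype=float)

def sigmoid(z):
    """Logistic function; the argument is clipped to avoid overflow."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic(X, y, ridge=1e-3, n_iter=50, tol=1e-8):
    """Newton-Raphson fit of [intercept, coefficients]; ridge is an assumption."""
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        grad = A.T @ (y - p) - ridge * beta
        hess = (A * (p * (1.0 - p))[:, None]).T @ A + ridge * np.eye(A.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict(beta, X):
    """Erosion probability for one or more (width, slope) pairs."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return sigmoid(beta[0] + X @ beta[1:])

def deviance(y, p):
    """Binomial deviance, -2 times the log-likelihood."""
    return -2.0 * np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

X = np.column_stack([width, slope])
beta = fit_logistic(X, eroded)
model_dev = deviance(eroded, predict(beta, X))
null_dev = deviance(eroded, np.full_like(eroded, eroded.mean()))

# Probability at two hypothetical unmeasured locations.
p_narrow_steep = predict(beta, [4.0, 65.0])[0]
p_wide_gentle = predict(beta, [16.0, 15.0])[0]
print(beta, model_dev, null_dev, p_narrow_steep, p_wide_gentle)
```

Because one set of coefficients serves every location, this form cannot reflect spatial variation in how the predictors act on erosion, which is the motivation for the locally weighted variant.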

In statistics, LR is a type of regression analysis used for predicting the
outcome of a categorical dependent variable (e.g. binary response) based on
one or more predictor variables (continuous or categorical). The method can
be used along with LWR to assign weights to local independent variables. LWR
allows model parameters to vary over space in order to reflect spatial
heterogeneity (Atkinson et al., 2003; Lall et al., 2006). The probabilities
of the possible outcomes are modelled as a function of independent variables
using a logistic function. LR measures the relationship between a categorical
dependent variable and, usually, one or several continuous independent
variables by converting the dependent variable to probability scores. Then, a
LR is formed, which predicts success or failure of a given binary variable
(e.g. 1

The LR model is based on the logistic function, a common sigmoid function.
The mathematical form is represented by the following equation:
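In its standard textbook form, the logistic function and the corresponding logit link can be written as below. The symbols ($p$ for the erosion probability, $z$ for the linear predictor, $\beta_0$ and $\beta_i$ for the coefficients, $x_i$ for the $k$ independent variables) are chosen here for illustration and may differ from the authors' notation; with the two variables considered in this work, $x_1$ and $x_2$ would stand for river cross section width and bank slope.

```latex
p = \frac{1}{1 + e^{-z}}, \qquad
z = \beta_0 + \sum_{i=1}^{k} \beta_i x_i, \qquad
\ln\!\left(\frac{p}{1 - p}\right) = z
```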

The goal of LR is to derive estimates for the

LWR is an extension to the concept of general regression. The difference
between LWR and multiple linear regression is that, in LWR, the independent
variables' effect on the dependent one is weighted based on a weighted
function in terms of their geographical location. Basically, LWR is a form of
spatial data analysis that allows for the evaluation of a dependent variable,
based on one or more local independent variables (Cleveland and Devlin, 1988;
Brunsdon et al., 1996; Fotheringham et al., 2002; Atkinson et al., 2003; Lall
et al., 2006). LWR is used to improve the results obtained with simple LR,
allowing for the coefficients

The erosion occurrence probability can be calculated in conjunction with the
model deviance. The reliability of both LR and LWLR is determined using the

The BSTEM model was validated for the predicted erosion (m

The evaluation of the BSTEM model results involved the calculation of the
percentiles used to categorize the significance of the BSTEM calculated
eroded area. The BSTEM model results are in very good agreement with the
behaviour of the banks after the flood events. Of the 12 measurement
points, 4 were identified with no or low erosion, as the affected area was
below or very close to the 25th percentile and equal to 0.52 m

Photo highlighting the riverbank location (KI) with the most intense observed erosion accompanied by the appropriate scaled tools to provide a rough estimate of the eroded area.

Amount of bank erosion at the measurement locations (Fig. 1). Modelling results obtained by BSTEM.

The aforementioned results mean that the BSTEM outcome for the eight subsections of the first campaign can also be characterized as reliable, as these subsections are located in between the 12 points that were successfully validated by the field inspection. The model outcome identified seven subsections as potentially vulnerable to erosion and one as not vulnerable, based on the estimated affected area in comparison to the total area of the bank at the respective river subsection. Therefore, they could be used as validation locations for the assessment of the statistical model performance.

Presence (1) or absence (0) of erosion at measurement locations using a binary indication for the statistical model (LR and LWLR) setup based on inspection and BSTEM results. The third and fourth columns present the measured independent geomorphological variables.

Consequently, the 12 measurements of the second field campaign were used to
apply LR and LWLR, while the 8 locations of the first campaign were
employed as validation points. The first BSTEM application has provided a
vulnerability assessment of the riverbank sections that these eight locations
represent. The riverbank areas vulnerable to erosion, and therefore the
associated locations, are characterized as unstable (“U”) and the
non-vulnerable ones as stable (“S”). Corresponding to the LR and LWLR that
deliver probabilities of erosion occurring,

Result of LR application at the eight validation locations (Fig. 1). The independent variables used and the BSTEM estimates are also presented. In the fourth column, S denotes stable and U unstable bank locations.

Result of LWLR application at the eight validation locations (Fig. 1). The LR estimates, the independent variables used and the BSTEM estimates are also presented. The diverged values are indicated in bold. In the fourth column, S denotes stable and U unstable bank locations.

Erosion probability predictions using

The results derived from the application of the LR model, with uniform
parameters for all estimation points, are presented in Table 3. The values of
the independent variables and the BSTEM erosion estimates at the validation
points are also presented in the same table. The model deviance was
calculated equal to 6.14 and the

On the other hand, results for the erosion probability at the validation
points derived by applying LWLR with the exponential and the tri-cubic
weighting functions are presented in Table 4. The graphical representation
of the results for the erosion probability at the ungauged locations is
provided in Fig. 5b and c for the exponential and tri-cubic functions,
respectively. In the case of the exponential weighting function, the model
deviance is equal to 6.27 and the

Inter-comparison of estimations of the three methods tested is possible, as
the

Both LWLR models involve a non-linear parameter in the weighting function
that determines the correlation distance of the spatially correlated
measurement points. The optimal distance in each case was calculated using a
leave-one-out cross validation analysis, involving the measurement locations.
As a result, parameter

The results obtained with the LR method were in close agreement with those of BSTEM, as the presence or absence of erosion was accurately predicted at six out of the eight locations, with one of the failure locations deviating only narrowly from the set erosion presence limit. Next, to improve predictions, the LWLR method was applied to account for the local spatial dependence of the independent variables at the measurement locations. The LWLR model with the exponential function has, overall, similar performance to the LR model. The derived results are in agreement with the BSTEM estimates at seven out of the eight validation locations, and the approach fails at only one validation location. The application of the LWLR model with the tri-cubic function leads to significant improvement in the estimates and to the accurate prediction of the erosion probability at all eight validation locations. The significant result for this model was the validation of a clearly unstable point (pin no. 7) whose independent variables would by themselves suggest a stable bank (as delivered by LR). Another point with similar characteristics (pin no. 4, Fig. 1) was correctly identified as stable. Such performance is therefore possible only when local spatial weighting functions are used.

The only validation point indicated as stable (pin no. 4, Fig. 1) belongs to the fourth river section (between pins no. 3 and 4, Fig. 1), which as a whole was determined by BSTEM to be stable. However, two out of the three local measurements in the same section (pins KB and KC in Fig. 1) showed signs of erosion after the inspection. Generally though, apart from limited locations, the banks of that section did not show erosion signs due to the presence of dense seasonal riparian vegetation. The erosion probability estimation at this point is affected significantly, at local scale, by the spatially correlated measurement points with low vulnerability to erosion. Similarly, validation points 6 and 7 are also affected by the close presence of measurement locations with low vulnerability to erosion. This explains the difficulty in predicting erosion at these points. The model results may confirm the presence or absence of erosion at the validation points, but they are quite different from the targeted values of zero for no erosion and one for erosion presence. This is expected to improve when a larger data set with greater variability in the independent variables' effect on erosion becomes available.

The graphical representation of the LWLR model results at the discretized river section (Fig. 5b and c) shows a significant difference in performance for the two weighting functions. The tri-cubic function (Fig. 5c) delivers more reliable results as it clearly considers the variability in the independent variables inside the correlation distance. This can be observed through the colour variability in the graph of Fig. 5c, which represents the variability in the erosion occurrence probability. On the other hand, the exponential function (Fig. 5b) shows a smooth change in probability for the different pairs of independent variable values. This can be explained in terms of the shape of each function and the correlation distance. The tri-cubic function is herein applied over a shorter correlation distance, according to the cross validation results, and can therefore capture the local dependence of the explanatory variables that, at longer distances, is smoothed out due to the presence of more data.
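The contrast in shape between the two weighting functions can be sketched in Python as follows. The functional forms below are the commonly used textbook definitions and are an assumption here, as are the function names and the symbol `h` for the correlation distance; the exact expressions used in this work may differ.

```python
import numpy as np

def exponential_weight(d, h):
    """Exponential kernel: decays smoothly and never reaches exactly zero."""
    d = np.asarray(d, dtype=float)
    return np.exp(-d / h)

def tricubic_weight(d, h):
    """Tri-cubic kernel: compact support, exactly zero for distances >= h."""
    u = np.clip(np.asarray(d, dtype=float) / h, 0.0, 1.0)
    return (1.0 - u**3) ** 3

# Weights at a few distances for a hypothetical correlation distance of 300 m.
distances = np.array([0.0, 100.0, 200.0, 300.0, 600.0])
w_exp = exponential_weight(distances, 300.0)
w_tri = tricubic_weight(distances, 300.0)
print(w_exp)
print(w_tri)
```

Under these assumed forms, measurement points beyond `h` are excluded entirely by the tri-cubic kernel, whereas the exponential kernel lets every point contribute, which is consistent with the smoother probability surface described above.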

The LWLR method with the tri-cubic function yields the highest value for the

The LR-based models' results suggest that riverbank erosion probability generally increases as the bank slope increases and the river cross section decreases. This is due to an increase in the flow velocity that removes the non-cohesive soil components from the banks. Based on the analysis of field measurements, the bank material at the Koiliaris River was classified as “fine rounded sand”. The fine rounded material is more easily removed due to its low resistance and increased flow friction. This characteristic is consistent with the LR-based models' results, which mainly yield probabilities in favour of riverbank erosion at the validation points. However, in order to connect in detail the effect of soil properties with the probability of erosion that results from geomorphological variables, the LR-based models should also account for soil properties, such as particle size distribution and bulk density, which reflect the mechanical properties of the riverbanks. This is a task that the authors plan to address in future research.

The proposed statistical model is a useful, fast, efficient and fairly easy
to apply tool that requires information from easy-to-determine
geomorphological and/or hydrological variables. This tool provides a
quantified measure of the erosion probability along the riverbanks, and
could be used to assist in managing erosion and flooding events. On the other
hand, the BSTEM model can be successfully applied to determine the potential
riverbank eroded area (

The BSTEM model setup provides reliable results regarding the potential erosion vulnerability of the riverbanks that can be used to validate the estimations of the proposed statistical model. On the other hand, the proposed LR-based statistical model efficiently estimates the erosion probability at the riverbanks, using two secondary variables that significantly affect the presence or absence of erosion. However, in LWLR, locality is of utmost importance; the location of each new pair of secondary variables is used to identify and weight the effect of spatially correlated measurement points in order to calculate the model parameters. The proposed methodology, LWLR, exploits the local information of independent variables and translates it successfully to bank erosion probability. This is not a typical regression estimation based on global parameters; instead, the model parameters are recalculated for each new pair of secondary variables.
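The local refitting described above can be sketched in Python as follows: for each new location, weights are computed from the distances to the measurement points and a weighted logistic regression is fitted to obtain location-specific parameters. This is a minimal sketch under stated assumptions, not the authors' code: the data are synthetic, distances are taken between hypothetical planar coordinates, the tri-cubic kernel is written in its common textbook form, and the predictor standardization, the ridge term and the step-halving safeguard are additions made here for numerical stability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def tricubic(d, h):
    """Tri-cubic kernel (assumed form): zero weight at distances >= h."""
    u = np.clip(d / h, 0.0, 1.0)
    return (1.0 - u**3) ** 3

def fit_weighted_logistic(A, y, w, ridge=0.1, n_iter=100, tol=1e-8):
    """Weighted, ridge-penalised Newton fit with step halving (assumptions)."""
    def objective(b):
        z = A @ b
        return np.sum(w * (y * z - np.logaddexp(0.0, z))) - 0.5 * ridge * (b @ b)
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        grad = A.T @ (w * (y - p)) - ridge * beta
        hess = (A * (w * p * (1.0 - p))[:, None]).T @ A + ridge * np.eye(A.shape[1])
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while t > 1e-8 and objective(beta + t * step) < objective(beta):
            t *= 0.5  # halve the step until the objective does not decrease
        beta = beta + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return beta

def lwlr_probability(coords, X, y, coord_new, x_new, h):
    """Erosion probability at a new location from a locally weighted fit."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    A = np.column_stack([np.ones(len(y)), (X - mu) / sd])
    d = np.linalg.norm(coords - np.asarray(coord_new, dtype=float), axis=1)
    beta = fit_weighted_logistic(A, y, tricubic(d, h))
    a_new = np.concatenate([[1.0], (np.asarray(x_new, dtype=float) - mu) / sd])
    return float(sigmoid(a_new @ beta))

# Synthetic, illustrative data: 12 hypothetical points every 100 m along a reach.
coords = np.column_stack([np.arange(12) * 100.0, np.zeros(12)])
width = np.array([12, 11, 13, 12, 8, 10, 9, 7, 6, 5, 6, 5], dtype=float)
slope = np.array([20, 25, 22, 30, 45, 28, 35, 50, 55, 60, 58, 62], dtype=float)
X = np.column_stack([width, slope])
y = np.array([0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1], dtype=float)

# Same predictor pair, two different locations, hypothetical h = 300 m.
x_query = X.mean(axis=0)
p_up = lwlr_probability(coords, X, y, (100.0, 0.0), x_query, h=300.0)
p_down = lwlr_probability(coords, X, y, (1000.0, 0.0), x_query, h=300.0)
print(p_up, p_down)
```

In this sketch the same pair of predictor values yields different probabilities at the two locations, because each fit only sees the measurement points within the correlation distance; a global LR would return a single value for both.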

The LR method performs satisfactorily in the plain form where uniform
parameters are considered for all estimation points. A difference from the
BSTEM results is observed only at two of the eight validation points. The
LWLR method with the exponential weighting function gives results similar to
those of LR. The LWLR method with the tri-cubic function provides
significantly improved estimates which coincide with the BSTEM results at all
validation points. The graphical presentation of the results in the
discretized river section shows that the erosion probability increases with
bank slope and decreases with cross section width. This is also confirmed by
the positive sign of the bank slope coefficients and the negative sign of the
cross section width coefficients in all LR applications. The deviance and the

This work presents the framework of a methodology that can be applied in order to estimate the probability of erosion at specific riverbank locations considering explanatory and easy-to-determine secondary variables. Channel geomorphological characteristics, such as cross section and bank slope, are relatively easy to determine at unmeasured locations by using a digital elevation model. On the other hand, hydrological variables or bank material require extensive field measurements before characteristic variables can be considered as secondary information. Such measurements did not take place during the field campaigns, as they were outside the scope of this work. The developed statistical tool provides an alternative for the estimation of riverbank locations vulnerable to erosion, which requires limited information on explanatory variables yet can provide vulnerable location estimates with increased reliability. It is therefore considered a very promising approach for the estimation of riverbank erosion probability. The tool is proposed as a supplementary solution to the riverbank erosion identification issue.

E. A. Varouchakis developed the statistical model and the model code and performed the simulations. Together with G. V. Giannakis, M. A. Lilli and N. P. Nikolaidis, he designed and carried out the field campaigns, and with the aid of G. P. Karatzas they analysed the collected data and the model results. E. Ioannidou performed part of the model simulations. M. A. Lilli and N. P. Nikolaidis applied the BSTEM model. Finally, E. A. Varouchakis prepared the manuscript with the contribution of all co-authors.

This work is part of a THALES project (CYBERSENSORS – High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project was co-financed by the European Union (European Social Fund – ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) – Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.

In addition, the authors would like to thank the topical editor and the anonymous reviewers for their contributions to improving the manuscript. Edited by: A. Millares