Deformation prediction models of the Wuqiangxi concrete gravity dam are developed, including two statistical models and a deep learning model. In the statistical models, the reliable monitoring data are first identified with the Lahitte criterion; then, stepwise regression and partial least squares regression models for the deformation prediction of the concrete gravity dam are constructed from the reliable monitoring data, with the factors of water pressure, temperature and time effect considered in the models; finally, according to the monitoring data from 2006 to 2020 of five typical measuring points, J_{23}, J_{33}, J_{35}, J_{37} and J_{39}

During their service life, dams bear various cyclic loads and sudden disasters and also suffer erosion and corrosion from harsh environments, which gradually degrades their local and overall safety performance. A dam failure has disastrous consequences. Therefore, it is very important to identify potential risks and evaluate dam safety behavior in a timely manner based on the monitoring data collected by prototype observation instruments.

The traditional analysis models for dam deformation monitoring data include statistical model [

Li et al. [

Stepwise regression and partial least squares regression are widely used statistical models. In the stepwise regression model, environmental variables are added to the model one by one, and the significance of each variable is assessed in turn to obtain the optimal variable set. Hu et al. [

In recent years, with the development of dam safety monitoring, computing, big data, artificial intelligence and related theories and technologies, more and more data mining methods have been applied to dam safety monitoring modeling, and many intelligent monitoring models have emerged. These models show unique advantages in handling the uncertainty and nonlinearity of model factors and in improving prediction accuracy and generalization. Qu et al. [

This study aims to establish deformation prediction models of the Wuqiangxi concrete gravity dam based on the monitoring data. According to the monitoring data from 2006 to 2020 of measuring points J_{23}, J_{33}, J_{35}, J_{37} and J_{39}

After the introduction, the statistical models and LSTM model for deformation prediction of concrete gravity dam are described in

Individual monitoring readings are independent and are easily affected by environmental factors during observation, so some readings do not follow the regular pattern of change and are unreliable. To improve the prediction performance of the statistical models, the unreliable data should be removed.

Generally, the reliability of data is judged by the Lahitte criterion (also called the 3σ criterion). The runout deviation of the i^{th} monitoring data can be expressed as

where _{i}^{th} monitoring data.

Assume that

The ratio of the absolute value of the runout deviation to the mean square deviation of the i^{th} monitoring data is defined as

As _{i}

The displacement vector at one point in dam can be divided into horizontal displacement

where

The displacement component at one point in the concrete gravity dam under water pressure and reservoir water weight can be described as [

where _{1i} and _{2i} are the regression coefficients of upstream and downstream water pressure factors, respectively.

After some years of dam operation, the temperature inside the dam reaches a quasi-stable temperature field. It can therefore be assumed that the internal temperature of the dam body is affected only by the water temperature and air temperature, that the water temperature and air temperature vary harmonically, and that the deformation is linearly related to the temperature of the concrete. The temperature component is expressed as [

where _{1} and _{2} are the regression coefficients of temperature factor,

At the beginning of impoundment, the aging displacement generally changes sharply, and then tends to be stable. The aging component can be expressed as [

where _{1} and _{2} are the regression coefficients of aging factor, and

Substituting

The stepwise regression method [

There are

with

where

In order to solve regression coefficients

where

The process of solving

In order to objectively evaluate the application of stepwise regression method to dam deformation prediction, the multiple correlation coefficient (MCC) and residual standard deviation (RSD) are used as the evaluation indexes of prediction accuracy. The MCC and RSD are defined as

where ^{th} eliminating-introducing.

where

The larger the MCC and the smaller the RSD, the better the prediction accuracy.
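
As an illustration of how these two indexes might be computed, the following minimal sketch assumes the usual textbook definitions, MCC = sqrt(1 − SSE/SST) and RSD = sqrt(SSE/(n − k − 1)), where k is the number of factors retained in the regression. The function name `mcc_and_rsd` and the sample values are hypothetical and are not taken from the monitoring data.

```python
import math


def mcc_and_rsd(observed, fitted, n_factors):
    """Return (MCC, RSD) for a fitted regression.

    Assumed definitions (common textbook forms, not confirmed by the paper):
      MCC = sqrt(1 - SSE / SST)
      RSD = sqrt(SSE / (n - k - 1)), with k = n_factors
    """
    n = len(observed)
    if n != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = n - n_factors - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many factors")
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    sst = sum((y - mean_obs) ** 2 for y in observed)
    if sst == 0.0:
        raise ValueError("observed values are constant")
    mcc = math.sqrt(max(0.0, 1.0 - sse / sst))
    rsd = math.sqrt(sse / dof)
    return mcc, rsd


# Hypothetical observed and fitted displacements.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.0, 4.1, 4.9]
mcc, rsd = mcc_and_rsd(observed, fitted, n_factors=1)
```

A perfect fit gives MCC = 1 and RSD = 0 under these definitions, which matches the reading that a larger MCC and a smaller RSD indicate better accuracy.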

Partial least squares regression [_{1} and _{1} (_{1} is linear combination of _{1} is linear combination of _{1} and _{1} contain as much variation information in _{1} and _{1} is the highest. After extracting the first component, perform regression of _{1}, and _{1}. If the regression equation does not achieve satisfactory accuracy, the second component extraction is performed with the interpreted residual information _{1} and the interpreted residual information _{1}, until the satisfactory accuracy is achieved. If the _{k}

The long short-term memory (LSTM) network is a kind of recurrent neural network trained by back propagation. In the LSTM [

In general, the hyper-parameters used in training a recurrent neural network are selected from experience. However, the LSTM recurrent neural network is sensitive to the choice of hyper-parameters, so an optimization algorithm is adopted to tune them. The optimization focuses mainly on three hyper-parameters: the sample length, the number of hidden neurons in the feedforward network layer (also known as the state vector size), and the learning rate, which controls how far the network parameters are adjusted at each update. Grid search and random search are widely used to optimize the hyper-parameters of the LSTM recurrent neural network.

Grid search sets the dimension of the search space to the number of hyper-parameters, divides each dimension into a grid, and then selects the best hyper-parameters from the results at the grid intersections. The search proceeds in three steps: first, the values of the less important hyper-parameters are fixed; second, the ranges of the three hyper-parameters are set; finally, the objective is set to the highest accuracy of the LSTM recurrent neural network on the test set. Existing studies show that grid search can obtain high-precision results, but its computing cost increases exponentially with the number of hyper-parameters.

The search space of random search need not be discrete, which allows random search to try more hyper-parameter combinations with the same computing resources. Grid search spends much of its computing budget on hyper-parameters that have little impact on network performance, while random search tries a distinct value of each influential hyper-parameter in almost every trial; that is, random search explores more useful hyper-parameter combinations. Random search can greatly shorten the search time and improve the computational efficiency while maintaining a certain accuracy. After the optimized hyper-parameters are selected, the accuracy of the LSTM recurrent neural network is greatly improved.
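
The contrast between the two strategies can be sketched as follows. This is only an illustration under stated assumptions: `toy_validation_error` is a hypothetical stand-in for training the LSTM network and measuring its error, and the search ranges for the sample length, state vector size and learning rate are invented for the example.

```python
import itertools
import math
import random


def toy_validation_error(sample_length, hidden_size, learning_rate):
    """Hypothetical stand-in for 'train the network, return its error'."""
    return (abs(sample_length - 20) / 20.0
            + abs(hidden_size - 64) / 64.0
            + abs(math.log10(learning_rate) + 3.0))


def grid_search(lengths, sizes, rates):
    """Evaluate every grid intersection and return the best combination."""
    candidates = itertools.product(lengths, sizes, rates)
    return min(candidates, key=lambda c: toy_validation_error(*c))


def random_search(n_trials, seed=0):
    """Draw each hyper-parameter at random; keep the best trial seen."""
    rng = random.Random(seed)
    best, best_err = None, float("inf")
    for _ in range(n_trials):
        candidate = (
            rng.randint(5, 60),          # sample length (assumed range)
            rng.randint(8, 128),         # state vector size (assumed range)
            10 ** rng.uniform(-5, -1),   # learning rate, log-uniform
        )
        err = toy_validation_error(*candidate)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err


# 3 x 3 x 3 = 27 evaluations on the grid; the cost grows as a power of the
# number of hyper-parameters.
grid_best = grid_search([10, 20, 30], [32, 64, 128], [1e-4, 1e-3, 1e-2])

# 27 random trials use the same budget but try 27 distinct values of every
# hyper-parameter instead of 3.
random_best, random_err = random_search(27, seed=1)
```

The learning rate is drawn on a logarithmic scale here, which is a common choice for continuous hyper-parameters and is one reason a non-discrete search space is convenient.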

In this study, the LSTM recurrent neural network coupled with random search is used to construct the deformation prediction model of concrete gravity dam.
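
For orientation, one time step of an LSTM cell can be sketched as below. This is a minimal single-unit illustration using the standard gate equations with sigmoid and tanh activations; it is not the configuration used in this study, and the parameter layout (`params` mapping each gate to an input weight, a recurrent weight and a bias) is an assumption made for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with scalar input and state.

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (input weight, recurrent weight, bias); this layout is hypothetical.
    """
    pre = {}
    for name in ("i", "f", "o", "g"):
        w_x, w_h, b = params[name]
        pre[name] = w_x * x + w_h * h_prev + b
    i = sigmoid(pre["i"])      # input gate
    f = sigmoid(pre["f"])      # forget gate
    o = sigmoid(pre["o"])      # output gate
    g = math.tanh(pre["g"])    # candidate cell state
    c = f * c_prev + i * g     # new cell state
    h = o * math.tanh(c)       # new hidden state
    return h, c


def run_sequence(xs, params):
    """Feed a sequence through the cell, starting from zero states."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h


# With all weights and biases zero, every gate equals 0.5 and the
# candidate state is 0, so the cell state is simply halved.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("i", "f", "o", "g")}
h_new, c_new = lstm_step(1.0, 0.0, 1.0, zero_params)
```

The forget gate controls how much of the previous cell state is kept, which is what lets the network carry information across long input sequences.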

In order to objectively evaluate the application of LSTM model to dam deformation prediction, the root mean square error (RMSE) is used as the evaluation index of prediction accuracy. The RMSE is defined as

where _{i}^{th} data of the prediction group;

The Wuqiangxi hydropower station, completed in 1999, is located in the middle and upper reaches of the main stream of the Yuan River in Hunan Province, China, as shown in

The dam is a concrete gravity dam. The crest elevation is 117.5 m, the maximum dam height is 85.83 m, and the total crest length is 719.7 m. The main dam is divided into 34 dam sections, including the right bank retaining dam sections

There are 51 measurement points of the hydrostatic leveling instrument on the dam crest, numbered J_{47}, as shown in

The hydrostatic leveling system on the dam crest takes J_{0} on the right bank as its reference, and double metal markers SJ_{1} and SJ_{2} are embedded there. Based on the monitoring data of measuring points J_{23}, J_{33}, J_{35}, J_{37} and J_{39}

The unreliable monitoring data of vertical displacements are determined with the Lahitte criterion. _{23}. _{23}. After removing the unreliable monitoring data, the subsidence process line of J_{23} is smoother.

Date | Subsidence | Date | Subsidence | Date | Subsidence |
---|---|---|---|---|---|
2007/5/24 | −0.37 | 2007/7/27 | −9.11 | 2007/9/16 | −5.05 |
2007/7/26 | −4.85 | 2007/7/31 | −6.79 | 2007/12/11 | −4.65 |
2007/7/27 | −8.84 | 2007/9/15 | −9.44 | 2007/12/12 | 1 |
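
A 3σ screening of this kind can be sketched as follows. The sketch assumes that the runout deviation of an interior reading is its second difference, 2y_i − (y_{i−1} + y_{i+1}), and that a reading is unreliable when its deviation lies more than three standard deviations from the mean deviation; both are assumptions about the criterion, and the function name and the synthetic series are hypothetical.

```python
import math


def flag_unreliable(series, threshold=3.0):
    """Return indices of readings that fail a 3-sigma check.

    Assumption: the runout deviation of reading i is the second
    difference 2*y[i] - (y[i-1] + y[i+1]); end points are not tested.
    """
    n = len(series)
    if n < 4:
        return []
    # Runout deviation of every interior reading.
    deviation = {
        i: 2.0 * series[i] - (series[i - 1] + series[i + 1])
        for i in range(1, n - 1)
    }
    mean_dev = sum(deviation.values()) / len(deviation)
    variance = sum((d - mean_dev) ** 2 for d in deviation.values())
    variance /= len(deviation) - 1
    sigma = math.sqrt(variance)
    if sigma == 0.0:
        return []
    # Flag readings whose normalised deviation exceeds the threshold.
    return [
        i for i, d in deviation.items()
        if abs(d - mean_dev) / sigma > threshold
    ]


# Synthetic series: a smooth trend with one spurious jump at index 25.
series = [0.5 * i for i in range(50)]
series[25] += 10.0
flagged = flag_unreliable(series)
```

Because the check is statistical, it needs a reasonably long series: with only a handful of readings no deviation can exceed three standard deviations.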

For convenience of expression,

where

In the cross-validity analysis of measuring point J_{23} from 2006 to 2020, _{h}^{*}, i.e.,

The regression equation of standardized displacement variable is written as

By reducing standardized variables to original variables, the regression equation of partial least squares method is obtained as follows

The multiple correlation coefficients (MCCs) and the residual standard deviations (RSDs) of the different measuring points for the partial least squares model are given in

Measuring point | J_{23} | J_{33} | J_{35} | J_{37} | J_{39} |
---|---|---|---|---|---|
MCC | 0.9118 | 0.9280 | 0.9273 | 0.9310 | 0.9445 |
RSD | 0.0830 | 0.0786 | 0.0831 | 0.0778 | 0.0778 |

_{23}, J_{33}, J_{35}, J_{37} and J_{39} obtained with the partial least squares model, respectively. _{23}, J_{33}, J_{35}, J_{37} and J_{39}, respectively. It can be seen that this model can better reflect the variation law of dam crest settlement.
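
The core of partial least squares regression, extracting one component and regressing the displacement on it, can be sketched as below. This is a simplified illustration, not the model fitted in this study: it handles a single response, extracts only the first component, and centres the variables without scaling them. The function name and the sample data are hypothetical.

```python
import math


def pls1_one_component(X, y):
    """Fit a one-component PLS model for a single response.

    Returns a function that predicts y for a new row of factors.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the direction in factor space whose scores have
    # the largest covariance with the response.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("factors are uncorrelated with the response")
    w = [v / norm for v in w]
    # Scores of the first component, then regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(row):
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict


# Two perfectly collinear factors (second column is twice the first),
# a case in which ordinary least squares has no unique solution.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [1.0, 2.0, 3.0, 4.0]
predict = pls1_one_component(X, y)
prediction = predict([5.0, 10.0])
```

Because the component is a combination of all factors, highly correlated factors are combined rather than discarded, which is why no factor needs to be removed in this approach.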

_{23} with stepwise regression method, showing the significance level of each factor. According to the significance level of the factors, the stepwise regression equation can be obtained by introducing variables in turn.

Factor | ^{2} |
^{3} |
^{2} |
||
---|---|---|---|---|---|

F-significance test | 0.007 | 0.00702 | 0.0069 | 0.0039 | 0.0044 |

Factor | ^{3} |
||||

F-significance test | 0.0048 | 0.3720 | 0.4437 | 0.0235 | 0.0166 |

The multiple correlation coefficients (MCCs) and the residual standard deviations (RSDs) of the different measuring points for the stepwise regression model are given in

Measuring point | J_{23} | J_{33} | J_{35} | J_{37} | J_{39} |
---|---|---|---|---|---|
MCC | 0.9161 | 0.9258 | 0.9293 | 0.9336 | 0.9523 |
RSD | 0.4013 | 0.3782 | 0.3695 | 0.3587 | 0.3055 |

_{23}, J_{33}, J_{35}, J_{37} and J_{39} obtained with stepwise regression model, respectively. _{23}, J_{33}, J_{35}, J_{37} and J_{39}, respectively.

From

Based on numerical experiments, two LSTM layers are used, the rectified linear unit function is adopted as the activation function, and the input sequence length is 20; that is, the subsidence data of the preceding 20 days are used to predict the subsidence of the 21st day. The monitoring data from 2006 to 2017 form the training set, and the monitoring data from 2018 to 2020 form the test set. The RMSEs of the trained model in the training set and in the test set for different measuring points are shown in _{23}, J_{33}, J_{35}, J_{37} and J_{39} by LSTM method. The RMSEs of measuring point J_{23} in the training set and in the test set for different training datasets are shown in

Measuring point | J_{23} | J_{33} | J_{35} | J_{37} | J_{39} |
---|---|---|---|---|---|
Training set | 0.39 | 0.39 | 0.48 | 0.47 | 0.47 |
Test set | 0.40 | 0.44 | 0.54 | 0.50 | 0.49 |

Training dataset | 2006–2010 | 2006–2013 | 2006–2017 |
---|---|---|---|
Training set | 0.53 | 0.43 | 0.39 |
Test set | 0.43 | 0.41 | 0.40 |
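
The data preparation and scoring behind such a comparison can be sketched as follows: the series is split chronologically, each run of 20 consecutive values is paired with the value that follows it, and predictions are scored by RMSE. The helper names are hypothetical, and the persistence forecast (repeat the last value of the window) is only a placeholder for a trained network.

```python
import math


def split_by_year(records, last_training_year=2017):
    """Chronological split of (year, value) records into train and test."""
    train = [v for year, v in records if year <= last_training_year]
    test = [v for year, v in records if year > last_training_year]
    return train, test


def make_windows(series, length=20):
    """Pair each run of `length` consecutive values with the next value."""
    return [
        (series[i:i + length], series[i + length])
        for i in range(len(series) - length)
    ]


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and equally long")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Synthetic records: (year, value) pairs with a slow linear trend.
records = [(2016 + i // 30, 0.1 * i) for i in range(120)]
train, test = split_by_year(records)

# Placeholder forecast: predict that the next value equals the last
# value in the window (a stand-in for the trained model).
windows = make_windows(test, length=20)
targets = [target for _, target in windows]
forecasts = [window[-1] for window, _ in windows]
test_rmse = rmse(targets, forecasts)
```

Splitting by date rather than at random keeps every test value later than every training value, so the score reflects genuine forecasting.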

The fitting results of the stepwise regression model and the partial least squares regression model are very similar, and both models fit well enough to reflect the subsidence behavior of the measuring points. The multiple correlation coefficient of the stepwise regression method is slightly larger than that of the partial least squares method at four of the five measuring points (0.9161 to 0.9523 versus 0.9118 to 0.9445 overall), while its residual standard deviation is much larger (0.3055 to 0.4013 versus 0.0778 to 0.0831).

From the process lines of the stepwise regression model and the partial least squares model, the temperature component changes most markedly, while the water pressure and aging components change little. In other words, the displacements of the measuring points are strongly affected by temperature change and less affected by water level and aging. This is also reflected in the order in which the stepwise regression method introduces the independent variables. The dam body rises when the temperature rises and sinks when it drops, and the changes are periodic, which is consistent with the actual operation of the dam.

Because the factors in the water pressure component are highly linearly correlated, the stepwise regression method removes the linear term of the upstream water level and the linear and cubic terms of the downstream water level, which affects the accuracy of the regression analysis to some extent. In this regard, the partial least squares regression method is more advantageous.
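
To make the factor set concrete, the sketch below builds one row of candidate factors and then drops the three water-level terms that the stepwise regression is reported to remove. The exact factor forms are assumptions for illustration: water-level powers up to the third, a single annual harmonic for temperature, and θ and ln θ for aging with θ measured in units of 100 days. All names are hypothetical.

```python
import math


def factor_row(upstream_level, downstream_level, day_index):
    """Return the ten candidate factors for one observation (assumed forms)."""
    h1, h2 = upstream_level, downstream_level
    phase = 2.0 * math.pi * day_index / 365.0
    theta = (day_index + 1) / 100.0  # aging time, shifted so that ln is defined
    return {
        "H1": h1, "H1^2": h1 ** 2, "H1^3": h1 ** 3,   # upstream water pressure
        "H2": h2, "H2^2": h2 ** 2, "H2^3": h2 ** 3,   # downstream water pressure
        "sin": math.sin(phase), "cos": math.cos(phase),  # temperature harmonic
        "theta": theta, "ln_theta": math.log(theta),     # aging
    }


# Terms reported as removed by the stepwise regression: the first power of
# the upstream level and the first and third powers of the downstream level.
REMOVED_BY_STEPWISE = {"H1", "H2", "H2^3"}

row = factor_row(upstream_level=100.0, downstream_level=50.0, day_index=0)
stepwise_factors = [name for name in row if name not in REMOVED_BY_STEPWISE]
```

Powers of the same water level are strongly correlated with one another over a limited operating range, which is the collinearity that leads the stepwise procedure to drop some of them.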

In the aging component, the fit of the partial least squares regression method indicates that the dam is settling and that the subsidence gradually slows with time, whereas the fit of the stepwise regression method indicates that the dam is rising and that the rise decreases with time. The statistical models start from the beginning of the monitoring series rather than from the time of dam construction, so the aging component is nonzero at the start. In general, the aging displacement of a normally operating dam changes sharply in the initial stage and tends to be stable later. However, the measured displacement curves show an obvious subsidence trend at the measuring points, so the aging components of both models have not stabilized, which agrees with the actual situation. On the whole, the aging component obtained by the partial least squares model better matches the real situation.

_{23} obtained by the partial least squares model (PLS) and the stepwise regression model (STEPWISE) and the test results obtained by the LSTM model. From

Two statistical models and a deep learning model of the vertical displacement at five typical measuring points on the crest of the Wuqiangxi dam are constructed with the partial least squares method, the stepwise regression method and the LSTM recurrent neural network. The fitting results and the influence of each component on the displacement in the statistical models are compared and analyzed, and the test curve of the LSTM model is compared with the fitting curves of the statistical models. The following conclusions are drawn:

When enough training data are available, the prediction accuracy of the LSTM model is higher than that of the statistical models, so the LSTM model is suggested in that case.

In terms of the multiple correlation coefficient, the fitting results of the partial least squares regression model and the stepwise regression model are similar, while the residual standard deviation of the partial least squares regression model is lower.

The stepwise regression model removes some factors and has a large residual standard deviation. The partial least squares regression model considers the factors comprehensively and explains each component more rigorously.

It is more appropriate to use the partial least squares regression model or the LSTM model to predict the subsidence of the measuring points on the Wuqiangxi dam.

In the deformation prediction of concrete gravity dams, the LSTM model is suggested when the training data are sufficient, and the partial least squares regression model is suggested when the training data are insufficient. In addition, all deformation, water level, and temperature data in this study can be accessed at:

The authors wish to express their appreciation to the reviewers for their helpful suggestions which greatly improved the presentation of this paper.