A large number of old arch bridges in rural regions have reached the stage at which maintenance demand peaks. Health monitoring technology developed for long-span bridges is hardly applicable to small-span bridges, because rural regions lack technical resources and sufficient funds. There is an urgent need for an economical, fast, and accurate damage identification solution. The authors propose a damage identification system for old arch bridges that is implemented with a machine learning algorithm and takes the vehicle-induced response as the excitation. A damage index was defined based on wavelet packet theory, and a machine learning sample database of the denoised responses was constructed. A comparison of three machine learning algorithms, Back-Propagation Neural Network (BPNN), Support Vector Machine (SVM), and Random Forest (R.F.), showed that the R.F. damage identification model had the best recognition ability. Finally, the Particle Swarm Optimization (PSO) algorithm was used to optimize the number of subtrees and the number of split features of the R.F. model. With the sensitive damage index, the PSO-optimized R.F. model was capable of identifying different damage levels of old arch bridges. The proposed framework is practical and promising for structural damage identification of old bridges in rural regions.

By the end of 2020, the number of highway bridges in China had reached 912,800, with a total length of 66,285,500 meters. Small and medium-span bridges, which account for 86.15%, are the predominant bridge types and include a large proportion of old double-curved arch bridges and stone arch bridges in rural regions [

Because the Machine Learning (ML) technique is capable of dealing with complex nonlinear structural systems under extreme action, along with the availability of large data sets, the use of ML in structural engineering has become increasingly popular in recent years. The applications of ML mainly include: (1) structural design and analysis; (2) structural health monitoring and damage detection; (3) structural fire resistance; (4) structural member resistance to various actions; (5) concrete mechanical properties and mix design [

In recent years, various cross-disciplinary bridge inspection and diagnosis technologies have emerged. In the field of data preprocessing, Fu et al. [

With the popularity of artificial intelligence (A.I.) cloud computing, ML technology has been applied to bridge health diagnosis and damage identification. The Convolutional Neural Network (CNN) algorithm and an autoencoder data system were used for structural health monitoring (SHM) of some long-span suspension bridges [

Types of classification | Features | Typical tasks | Common algorithms |
---|---|---|---|
Supervised learning | Scarcity of data and poor effect of the markup, good generalization performance | Regression: commonly used in the analysis of load effects | Decision trees, K-nearest neighbors, SVM, artificial neural networks |
Unsupervised learning | Poor stability of the algorithm, easy convergence at local optimal solutions | Clustering, correlation analysis: handling high-dimensional nonlinear data, feature selection, and extraction | K-means, Self-Organizing Map clustering, genetic algorithms |
Anomaly identification | Anomaly identification beyond the threshold of the normative model | Probability-based, distance-based, reconstruction-based | Extreme value statistics, likelihood ratio test, X-bar and various control charts |

In

Because there are a large number of bridges and culverts in the rural areas of northwest China and municipal funding is limited, it is difficult to install advanced bridge SHM systems in those undeveloped areas. For old bridges in rural areas, an affordable system is essential. The goal of this work is to provide a quick, inexpensive, and accurate system for identifying the deterioration of old arch bridges. The suggested approach is also generally applicable to assessing the overall condition of various bridge types in rural areas.

This paper integrates time-frequency domain feature extraction and Machine Learning (ML) for damage identification. Taking the vehicle’s live load as the environmental excitation, the acceleration sensor measuring points are arranged at different positions of the old arch bridge to collect acceleration mode signals containing damage information. The working principle is presented in

Wavelet analysis only targets the decomposition of the low-frequency part at each scale and the frequency domain resolution in the high-frequency part is insufficient, resulting in partial loss of structural damage information in the high-frequency signal. By further dividing the signal at several scales and identifying features in different time-frequency with stable orthogonal time-frequency properties, wavelet packet analysis allows for improved frequency resolution [

Parseval’s theorem states that signal energy is the sum of squares of varying amplitudes. The high and low frequencies are decomposed into frequency bands, and the sum of each band is the signal energy [

Wavelet packet coefficients can be expressed as:

The decomposed different frequency bands are the equations as follows:

In

The energy values of distinct signal bands are determined from the orthogonality (

The frequency content changes when the signal passes through the damaged part of the old arch bridge, and the energy value changes abruptly. Because the energy of the components is highly sensitive to changes in the signal, wavelet packet energy values respond better to damage-induced changes in the structure [
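The band-energy idea can be illustrated with a minimal sketch. It assumes an orthonormal Haar filter as a stand-in for the wavelet used in the paper, synthetic 100 Hz signals, and a hypothetical `relative_energy_change` measure; it is not the damage index defined by the authors. Because the transform is orthonormal, the band energies sum to the signal energy, in line with Parseval's theorem.

```python
import math

def haar_step(x):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def wavelet_packet_energies(signal, level):
    """Energy of each of the 2**level terminal nodes of a full packet tree.

    Unlike a plain wavelet transform, the detail branch is split again at
    every level. Nodes are returned in natural (not frequency) order.
    """
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]

def relative_energy_change(healthy, damaged):
    """Hypothetical measure: summed band-energy change over healthy energy."""
    total = sum(healthy)
    return sum(abs(d - h) for h, d in zip(healthy, damaged)) / total

# Synthetic records (assumed values): the "damaged" signal has a shifted
# dominant frequency and some added high-frequency content.
fs, n = 100.0, 256
healthy_sig = [math.sin(2 * math.pi * 5.0 * k / fs) for k in range(n)]
damaged_sig = [math.sin(2 * math.pi * 4.5 * k / fs)
               + 0.2 * math.sin(2 * math.pi * 30.0 * k / fs) for k in range(n)]

e_h = wavelet_packet_energies(healthy_sig, level=3)
e_d = wavelet_packet_energies(damaged_sig, level=3)
index = relative_energy_change(e_h, e_d)
print(len(e_h), round(index, 3))
```

A longer filter such as db20 would give sharper band separation than Haar; Haar is used here only to keep the sketch dependency-free.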

Defining the wavelet packet energy damage index as Damage Index (

The steps to obtain the wavelet energy damage index are as follows. To begin with, the wavelet packet energy values of health and damage conditions under decomposition scale

Wavelet packet analysis is considered because low-scale wavelet decomposition cannot depict the changes before and after structural degradation, according to engineering experience [

In the implementation of accelerometers, the inevitable interference of random white noise is mainly from the electromagnetic effect of the accelerometer and the observation error. The noise with a variance of

In

As the former reviews in

The R.F. algorithm works by randomly extracting multiple sub-samples from the sample database and building a CART decision tree for each sub-sample set. The sub-samples are drawn with the Bootstrap method, which uses random sampling with replacement to form different bootstrap sets. R.F. trains the base classifier’s sampling, and

Assuming that

According to

Each Cart classification decision tree model

When the random forest generates the Cart decision tree combination without pruning through Bootstrap sampling, the probability of not being extracted is

According to

All variables in the sample library are defined as features. Feature selection removes, from all features extracted from the sample library, those that are irrelevant to the target variable. It mitigates the curse of dimensionality, which reduces the difficulty of the learning task and the training time. For example, for the KNN algorithm, samples are sparsely distributed in space when the number of indicators is large, which makes the model less effective. At present, feature selection methods include the ReliefF algorithm, the neighborhood component analysis algorithm and the mRMR algorithm [

The R.F. is well suited to ranking the importance of indexes for feature selection. The Gini value is employed to evaluate the random forest algorithm. The concept is that noise is introduced into a feature, and the importance of that feature is then determined by the resulting change in the Gini value. The performance is first tested with the out-of-bag (OOB) data to obtain the original OOB accuracy. Noise is then artificially added to a characteristic variable, and the random forest is tested again to obtain the new OOB accuracy. The importance measure of the characteristic variable is the difference between the old and new OOB accuracy. The important indexes are screened out during the construction of the arch bridge damage model, so that a damage identification model with a large number of samples can be refined and its accuracy improved.
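The OOB-based importance measure can be sketched as follows. This is a minimal illustration under stated assumptions: a nearest-centroid classifier stands in for a CART tree, a single bootstrap set is drawn instead of a full forest, the data are synthetic, and the "noise" is introduced by permuting one feature column. All function names are hypothetical.

```python
import random

def fit_centroids(X, y):
    """Stand-in base learner (not a CART tree): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}

def predict(centroids, row):
    """Assign the class whose centroid is closest in squared distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))

def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)

rng = random.Random(0)

# Synthetic samples: feature 0 separates the classes, feature 1 is noise.
X, y = [], []
for i in range(400):
    label = i % 2
    X.append([3.0 * label + rng.gauss(0.0, 0.5), rng.gauss(0.0, 1.0)])
    y.append(label)

# Bootstrap: sample n indices with replacement; the rest are out-of-bag.
n = len(X)
in_bag = [rng.randrange(n) for _ in range(n)]
oob = sorted(set(range(n)) - set(in_bag))
oob_fraction = len(oob) / n  # tends to (1 - 1/n)**n, about 0.368 for large n

model = fit_centroids([X[i] for i in in_bag], [y[i] for i in in_bag])
X_oob = [X[i] for i in oob]
y_oob = [y[i] for i in oob]
old_acc = accuracy(model, X_oob, y_oob)

def permutation_importance(feature):
    """Old OOB accuracy minus OOB accuracy after shuffling one feature."""
    column = [row[feature] for row in X_oob]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:]
              for row, v in zip(X_oob, column)]
    return old_acc - accuracy(model, X_perm, y_oob)

importances = [permutation_importance(j) for j in range(2)]
print(round(oob_fraction, 3), [round(v, 3) for v in importances])
```

In a full random forest the same difference would be computed per tree on that tree's own OOB samples and then averaged over the trees.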

The damage of two old arch bridges was assessed using the damage assessment method based on the R.F. algorithm, and the damage identification process of the old arch bridge under vehicle-induced response was simulated by the finite element software CSiBridge.

The entire bridge was discretized into beam elements, as demonstrated in

For the materials of the old arch bridge, C50 concrete was used in the main arch ribs, arch tile, bridge deck, and cross-links, and stone was used for the web arch cross-wall. The material parameters, such as modulus of elasticity and Poisson’s ratio, and the bridge type parameters are summarized in

Parameter | Value | Parameter | Value |
---|---|---|---|
Width of bridge (m) | 7.0 | Modulus of stone (N·mm^{−2}) | 5.50 × 10^{4} |
Bridge length (m) | 50.0 | Poisson’s ratio of stone | 0.168 |
Ceiling clearance height (m) | 12.6 | Coefficient of linear expansion of stone (°C^{−1}) | 1.17 × 10^{−5} |
Depth of arch (m) | 0.8 | Modulus of C50 concrete (N·mm^{−2}) | 3.45 × 10^{4} |
Arch rib’s width (m) | 0.9 | Poisson’s ratio of C50 concrete | 0.20 |
Thickness of bridge deck (m) | 0.3 | Coefficient of linear expansion of C50 concrete (°C^{−1}) | 1.00 × 10^{−5} |
Load (kN) | 12.0 | Density (kN·m^{−3}) | 25 |

The second arch bridge was made up of two 13 m approach spans and a 45 m main span. The main arch had a constant section with simply-supported pillars on the arch. The arch was a single box section with five chambers, with a height of 1.2 m, a total width of 7 m and a net height of 5.4 m. The arch was made of C30 concrete, and the roadway was 12 m wide, with one 3.5 m wide motor vehicle lane, one 1.5 m wide non-motorized lane and one sidewalk in each direction. The specific modeling process is similar to that of the first bridge, and

CSiBridge provides a vehicle load function. Two types of vehicle were defined as concentrated-force live loads, both with a front-to-rear axle spacing of 3 m: in the first, both axles weighed 750 kg; in the second, the front axle weighed 1560 kg and the rear axle weighed 1820 kg. These vehicles are typical of the traffic flow at the bridge location, so the vehicle selection is consistent with the actual situation. In the load section, the vehicle was set to pass over the arch bridge at constant speeds of 30, 40 and 50 km/h.

The vehicle-bridge coupling analysis of a 50 m-span double-curved arch bridge is carried out in the finite element software CSiBridge to plan the test layout and determine the control locations of the acceleration sensors. The arch bridge is loaded with a moving two-axle vehicle with axle loads of 7.5 kN and an axle spacing of 1.5 m. The vehicle passes at speeds of 10, 15 and 20 m/s. The fast Fourier transform is applied to the acceleration time-history response, and the amplitude peaks of each point are compared. The control position of the sensor measurement point is set to the node with the maximum energy.
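The peak-comparison step can be sketched as below. The example assumes three hypothetical node IDs with synthetic single-frequency responses and uses a direct DFT in place of an FFT library call; it only illustrates how a spectral peak per node can be compared to pick a sensor location.

```python
import cmath
import math

def amplitude_spectrum(x, fs):
    """Single-sided amplitude spectrum from a direct DFT.

    O(n^2), which is acceptable for short records; an FFT routine gives the
    same values faster.
    """
    n = len(x)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # DC and Nyquist bins are not doubled in a single-sided spectrum.
        scale = 1.0 if (k == 0 or 2 * k == n) else 2.0
        freqs.append(k * fs / n)
        amps.append(scale * abs(s) / n)
    return freqs, amps

# Synthetic node responses (assumed): same 5 Hz mode, different amplitudes.
fs, n = 100.0, 200
node_amplitudes = {4: 0.2, 10: 1.0, 16: 0.5}  # hypothetical node IDs
responses = {
    node: [amp * math.sin(2 * math.pi * 5.0 * t / fs) for t in range(n)]
    for node, amp in node_amplitudes.items()
}

# Find the dominant spectral peak (frequency, amplitude) at every node.
peaks = {}
for node, acc in responses.items():
    freqs, amps = amplitude_spectrum(acc, fs)
    k = max(range(len(amps)), key=amps.__getitem__)
    peaks[node] = (freqs[k], amps[k])

# The node with the largest peak is the candidate sensor position.
best_node = max(peaks, key=lambda node: peaks[node][1])
print(best_node, peaks[best_node])
```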

In the entire bridge, 19 nodes are picked for comparison.

FFT is applied to the time-domain curve for a vehicle speed of 15 m/s and a sampling time of 10 s, which satisfies the sampling theorem. As shown in

In most cases, the meso-damage of an in-service arch bridge is hard to detect. After decades of weathering, material degradation and accumulation of meso-damage, the local stiffness of the arch bridge in the damaged region changes dramatically. From this perspective, the damage level of a component and different damage conditions can be simulated by degrading the stiffness of the element. The position with the greatest amplitude peak in the spectrum is chosen as the measuring point of the acceleration sensor. On the arch bridge, the five measuring points with the highest response amplitude in the acceleration spectrum map are determined, as demonstrated in

Each measurement point in the model has an acceleration sampling time of 10 s and a sampling frequency of 100 Hz. Damage conditions of 10%, 20% and 30% were simulated at the No. 3 measurement point, together with its healthy working condition. The vehicle with a front axle load of 1560 kg and a rear axle load of 1820 kg passed over the arch bridge at a speed of 30 km/h and produced six-dimensional accelerations: the translational accelerations along the X, Y and Z axes and the rotational accelerations about each axis. These records were exported to MATLAB for time-frequency domain analysis, and the acceleration of each dimension was operated as

The combined acceleration time-history response curve under different damage conditions is plotted in

The signal is frequently disrupted by noise during field testing and recording. Gaussian white noise with signal-to-noise ratios of about 33 and 53 dB was applied to the acceleration response signals so that the trained numerical model reflects the same characteristics as the real field test [

To keep the samples balanced and the noise amplitude consistent with the order of magnitude of the original signal, 96 groups of white Gaussian noise with SNRs of 53, 73 and 93 dB were introduced for the various damage conditions, vehicle speeds and load conditions, and the wavelet packet energy damage values were calculated under the corresponding conditions.
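Adding white Gaussian noise at a prescribed SNR can be sketched as follows. The noise variance is derived from the mean signal power through SNR(dB) = 10·log10(P_signal / P_noise). The clean signal here is a synthetic sine, and the function names are illustrative rather than taken from the paper's MATLAB workflow.

```python
import math
import random

def add_white_noise(signal, snr_db, rng):
    """Return signal plus zero-mean Gaussian noise scaled to the target SNR."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]

def measured_snr_db(clean, noisy):
    """SNR actually realized by one noise draw."""
    p_signal = sum(v * v for v in clean)
    p_noise = sum((b - a) ** 2 for a, b in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)

rng = random.Random(1)
# Assumed stand-in for a 10 s acceleration record sampled at 100 Hz.
clean = [math.sin(2 * math.pi * 5.0 * k / 100.0) for k in range(1000)]

noisy = {snr: add_white_noise(clean, snr, rng) for snr in (53, 73, 93)}
for snr, record in noisy.items():
    print(snr, round(measured_snr_db(clean, record), 2))
```

The realized SNR of a finite record fluctuates slightly around the target because the noise power is itself a random sample estimate.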

The damage conditions of each point and the remaining parameters were kept constant by uniformly applying two types of vehicle load and three types of speed through the arch bridge as demonstrated in

The wavelet packet 2N decomposition of DB20 was performed in Matlab on the 5 measuring points representing health status and three damage conditions under various vehicle and speed conditions. After that, the corresponding layers were superimposed by

The four typical classification indexes T.P., F.N., F.P., and T.N. that define the positive class are shown in

Indexes | Meanings |
---|---|
T.P. (True Positive) | Number of positive samples classified as positive |
F.N. (False Negative) | Number of positive samples classified as negative |
T.N. (True Negative) | Number of negative samples classified as negative |
F.P. (False Positive) | Number of negative samples classified as positive |

The most common evaluation metrics for classification models include accuracy, Micro/Macro F1 score, True Positive Rate (TPR), False Positive Rate (FPR), Receiver Operating Characteristic (ROC) curve, and Area Under Curve (AUC) [

With the true positive rate (TPR) as the vertical coordinate and the false positive rate (FPR) as the horizontal coordinate, the ROC curve is drawn from a series of different classification thresholds. The larger the FPR, the more actual negatives are predicted as positive, while the larger the TPR, the more actual positives are predicted as positive. Ideally, the TPR is close to 1 and the FPR is close to 0, which means that the closer the ROC curve is to (0, 1) and the more it deviates from the 45-degree diagonal, the better the classification effect of the model. In the multi-classification problem of damage identification, the ROC curves obtained by pairing the different damage levels two by two are finally averaged. In the actual analysis, if multiple ROC curves intersect, it is difficult to judge which model is better. Hence, the AUC value is introduced, which is the area enclosed by the ROC curve and the horizontal axis below it (the value range is [0, 1]).
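The metrics above can be computed directly from the T.P., F.P. and F.N. counts. The sketch below uses small made-up label vectors (not results from this study) and computes the binary AUC through its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, which equals the area under the ROC curve.

```python
def class_counts(y_true, y_pred, cls):
    """T.P., F.P., F.N. when `cls` is treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    return tp, fp, fn

def f1_score(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(y_true, y_pred):
    classes = sorted(set(y_true) | set(y_pred))
    counts = [class_counts(y_true, y_pred, c) for c in classes]
    # Macro: average the per-class F1 scores.
    macro = sum(f1_score(*c) for c in counts) / len(classes)
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1_score(sum(c[0] for c in counts),
                     sum(c[1] for c in counts),
                     sum(c[2] for c in counts))
    return micro, macro

def binary_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Illustrative three-class labels (e.g. three damage conditions).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
micro, macro = micro_macro_f1(y_true, y_pred)

auc = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(round(accuracy, 4), round(micro, 4), round(macro, 4), auc)
```

For single-label multi-class problems the micro F1 coincides with the overall accuracy, which is why those two columns tend to agree in such comparisons.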

In this study, the numerical simulation results were used as the training and test data to construct a sample library of old arch bridge damage identification for classification learning, testing and evaluation.

It is necessary to clean and standardize the sample data source because it will appear in various discrete and continuous forms. As shown in

Field name | Feature name | Field name | Feature name |
---|---|---|---|
Arch_Length | Arch bridge length | Speed | Loaded vehicle speed |
Arch_Width | Arch bridge width | Vehicle_Weight | Vehicle weight |
Arch_Span | Arch bridge span length | Energy | Noiseless wavelet packet energy damage value |
Rise | Calculated rise of arch | Noise_Energy | Wavelet packet energy damage value with noise introduced |
Materials | Material parameters | Condition | Damage conditions |
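One possible way to clean and standardize such a sample library is sketched below, assuming z-score scaling for continuous fields and integer encoding for discrete ones. The field names follow the table, but the records, the energy values and the condition labels are invented placeholders, and the paper does not state which standardization it applied.

```python
import statistics

CONTINUOUS = ["Speed", "Vehicle_Weight", "Noise_Energy"]
CATEGORICAL = ["Materials", "Condition"]

# Placeholder records; the Noise_Energy values are not from the study.
samples = [
    {"Speed": 30.0, "Vehicle_Weight": 1500.0, "Noise_Energy": 0.02,
     "Materials": "C50", "Condition": "healthy"},
    {"Speed": 40.0, "Vehicle_Weight": 3380.0, "Noise_Energy": 0.11,
     "Materials": "C50", "Condition": "damage_10"},
    {"Speed": 50.0, "Vehicle_Weight": 1500.0, "Noise_Energy": 0.19,
     "Materials": "C30", "Condition": "damage_20"},
    {"Speed": 30.0, "Vehicle_Weight": 3380.0, "Noise_Energy": 0.31,
     "Materials": "C30", "Condition": "damage_30"},
    {"Speed": 40.0, "Vehicle_Weight": None, "Noise_Energy": 0.10,
     "Materials": "C50", "Condition": "damage_10"},  # incomplete record
]

def drop_incomplete(rows, fields):
    """Cleaning step: discard records with a missing value in any field."""
    return [r for r in rows if all(r.get(f) is not None for f in fields)]

def zscore(values):
    """Scale a continuous column to zero mean and unit (population) std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [0.0 if std == 0 else (v - mean) / std for v in values]

def label_encode(values):
    """Map each distinct category to an integer code in sorted order."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

clean = drop_incomplete(samples, CONTINUOUS + CATEGORICAL)
columns = {f: zscore([r[f] for r in clean]) for f in CONTINUOUS}
mappings = {}
for f in CATEGORICAL:
    columns[f], mappings[f] = label_encode([r[f] for r in clean])

print(len(clean), mappings["Condition"])
```

Tree-based models such as R.F. are insensitive to feature scaling, whereas BPNN and SVM usually benefit from it, so a shared standardized library keeps the model comparison fair.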

In the following section, a random forest is trained on the sample library and compared with other algorithmic models.

The training results of the random forest are compared with those of the BPNN and SVM models. The BPNN is trained by gradient descent, using cross-entropy as the loss function. A smaller learning rate is preferable, but it lengthens training. The SVM uses the Radial Basis Function (RBF) as its kernel function [

Type | Model parameters | Classification accuracy | Micro F1 | Macro F1 |
---|---|---|---|---|
BPNN | Iteration cycle: 1000; Neurons: 27 | 56.7% | 56.8% | 57.0% |
SVM | Kernel function: RBF; Penalty factor: 64 | 83.3% | 83.3% | 83.5% |
R.F. | Number of splitting features: 4 | 91.7% | 91.7% | 91.7% |

The ROC curves of the three are plotted and the corresponding AUC values have been derived. As displayed in

To improve the accuracy of the random forest damage recognition model, two important hyperparameters of the random forest, the number of split features and the number of sub-trees, are tuned. Traditional grid search traverses every parameter combination, so the optimization is relatively blind and the calculation time is long, whereas heuristic search algorithms can accelerate the search for the optimal solution.

Four popular algorithms have been compared: PSO (Particle Swarm Optimization), WOA (Whale Optimization Algorithm), MFO (Moth Flame Optimization), and G.A. (Genetic Algorithm). Firstly, MFO tends to become trapped in local optima, and its convergence rate is unsatisfactory [

Secondly, the G.A. cannot use the feedback information of the network in time. In addition, its three operators involve many parameters, such as the crossover rate and mutation rate, and the selection of these parameters seriously affects the solution. At present, these parameters are mostly selected based on experience. The PSO algorithm implemented in this paper can use the inertia weight and learning factor gradient to prevent dependence on empirical parameters [

Thirdly, WOA converges slowly in the search process and easily falls into local optima in its update mechanism, which restricts the classification performance and dimensionality reduction of the algorithm [

The particle search process is divided into three parts: search inertia, self, and other group search experiences. Particle local optimal position

By changing the inertia weights and the individual or group learning factors [

The arch bridge’s damage output consisted of three damage conditions. The PSO optimized R.F. damage identification model was designed to evaluate hyperparameter selection and characterize the model’s damage identification performance. The damage identification was evaluated in terms of overall accuracy and the proportion of sample data with accurate identification of the three damage conditions to all samples.
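A PSO search over the two R.F. hyperparameters can be sketched as follows. The fitness function here is a toy surrogate with an arbitrarily placed peak, standing in for the validation accuracy of a trained R.F. model; the bounds, swarm size, inertia weight `w` and learning factors `c1`, `c2` are assumed values, not those used in the paper.

```python
import random

def surrogate_accuracy(n_trees, n_split):
    """Toy stand-in for validation accuracy; peak placed arbitrarily at (120, 4)."""
    return 1.0 - ((n_trees - 120) / 300.0) ** 2 - ((n_split - 4) / 10.0) ** 2

def pso(fitness, bounds, n_particles=20, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over integer-rounded positions inside `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)

    def score(position):
        return fitness(*(int(round(v)) for v in position))

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # particle-local best
    pbest_val = [score(p) for p in pos]
    g = max(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # swarm-global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # inertia + own experience + group experience
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = score(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return [int(round(v)) for v in gbest], gbest_val

# Search ranges (assumed): number of subtrees and number of split features.
bounds = [(10.0, 300.0), (1.0, 10.0)]
best_params, best_val = pso(surrogate_accuracy, bounds)
print(best_params, round(best_val, 4))
```

In practice the surrogate would be replaced by a function that trains an R.F. with the given subtree count and split-feature count and returns its accuracy on held-out data, which makes each fitness evaluation far more expensive than in this sketch.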

The data sample pool was divided into training and test sets with an 8:2 ratio, and the model was tested with random sequence labels before PSO optimization. Observed from

The iterative process of optimizing parameters of the R.F. algorithm is depicted in

After the iterations, the parameter combinations of multiple optimal R.F. models were obtained. The most frequently occurring values, 3 split features and 149 subtrees, were taken, and the optimal fitness of the particles was 98.3%. The classification accuracy, Micro F1 and Macro F1 of PSO-RF were all 95.6%, and all indicators were higher than those of the other three models.

Feature selection has been utilized to identify the significant characteristics. The selected features contain physical quantities such as damage indications. Comparing the important features of R.F. and PSO-RF models in

In this research, a fast structural damage identification framework based on machine learning is proposed to address the problem of maintaining old bridges in rural regions. Three machine learning approaches (BPNN, SVM and R.F.) have been compared in evaluating the damage status of old arch bridges. The following conclusions can be drawn:

The study simulates a numerical damage model in the environment with field noise, then trains an R.F. model according to the simulated 960 sets of data. The proposed damage identification index performs optimally in terms of feature importance in the R.F. damage identification model, and it plays an important role during the selection of important features.

The classification accuracy of the R.F. model is 35.0 and 8.4 percentage points higher than that of BPNN and SVM, respectively, and its Micro F1 score is 34.9 and 8.4 percentage points higher. R.F.’s AUC value is 2 percent and 17.7 percent higher than those of SVM and BPNN, respectively. The constructed R.F. damage identification model has a better recognition capability than BPNN and SVM.

The two hyperparameters of R.F., the number of split features and the number of subtrees, have been optimized by the PSO algorithm, which avoids the unreliable classification caused by empirically selected parameters and improves the accuracy of the model.

The proposed framework can also be easily extended to other bridge types, such as prestressed concrete girder bridges, and is generally valid for the rapid and accurate assessment of the overall state of different bridge types in rural regions.