Time series analysis, which establishes appropriate mathematical models for bridge health monitoring data and forecasts future structural behavior, stands out as a novel and viable research direction for bridge state assessment. However, outliers inevitably exist in the monitoring data due to various interventions, which reduce the precision of model fitting and affect the forecasting results. Therefore, the identification of outliers is crucial for the accurate interpretation of the monitoring data. In this study, a time series model combined with outlier information for bridge health monitoring is established using intervention analysis theory, and the structural responses are forecast. We focus on three techniques: (1) the modeling of the seasonal autoregressive integrated moving average (SARIMA) model; (2) the methodology for outlier identification and amendment, both when the occurrence time and type of outliers are known and when they are unknown; (3) forecasting with the model that includes outlier effects. The method was tested in a case study using monitoring data from a real bridge. The establishment of the original SARIMA model without considering outliers is first discussed, including the stationarity, order determination, parameter estimation and diagnostic checking of the model. Then the time-by-time iterative procedure for outlier detection, which is implemented through appropriate test statistics of the residuals, is performed. The SARIMA-outlier model is subsequently built. Finally, a comparative analysis of the forecasting performance between the original model and the SARIMA-outlier model is carried out. The results demonstrate that proper time series models are effective in mining the characteristic law of bridge monitoring data. When the influence of outliers is taken into account, the fitted precision of the model is significantly improved and the accuracy and reliability of the forecast are strengthened.
In recent years, bridge health monitoring (BHM) systems have become an integral part of not only super-major and major bridges but also small and medium-sized bridges. Vast amounts of monitoring data, which contain a variety of characteristic information about the structure during the operation phase, flow into BHM systems every day. How to make effective use of the monitoring data for in-depth mining is of vital importance for bridge early warning and assessment [
Time series techniques, originally developed for analyzing long sequences of regularly sampled data, are inherently suitable for BHM. Time series forecasting methods can be broadly classified into two main categories, namely statistical methods and deep learning methods. The performance of each method depends on multiple factors such as trend, seasonality and noise in the data, as well as external conditions and internal damages [
Deep learning (DL) techniques, which can automatically learn the temporal dependencies present in time series and effectively reduce the complexity of the forecasting pipeline, have proved to be an effective solution in time series forecasting, and have gained considerable prominence [
Time series analysis is one of the statistical procedures applied to simulate the degradation mechanism of bridge structures and make predictions by establishing various time series data mining models and algorithms that can reflect the variation of variables in the time domain [
It can be seen that an appropriate time series model or combination model does better explain the characteristics of nonlinearity, nonstationarity, high dimensionality and heteroscedasticity of monitoring data. However, these models were built on the assumption that the monitoring data were accurate and reliable [
However, from the existing literature, most of the studies focus on the selection of the most convenient type of time series model and its parameter recognition and estimation [
The main purpose of outlier correction is to adjust the data so that the normality hypothesis of the SARIMA model is better satisfied. Moreover, by including the outlier intervention effects in the SARIMA model, the residual variance of the model is reduced, the fitting precision of the model is significantly improved, and the accuracy and reliability of the forecast are strengthened.
Under the influence of external loads and structural performance degradation, the monitoring responses of bridge structures exhibit randomness over short periods of time. Over longer periods, however, the monitoring time sequences present certain regularities, such as long-term trend, periodicity, random fluctuation, and abrupt change. As can be seen from the sequence diagram of observed data, some of the monitoring series are significantly nonstationary, while others have conspicuous seasonal characteristics due to the effect of seasonal temperature differences. Therefore, the seasonal ARIMA model is introduced to fit and analyze the nonstationary bridge monitoring data [
Assume {
The model in
In particular, when there is an additive relationship between the seasonal effect and other effects in the series, the seasonal information can be fully extracted by taking a first difference with periodic step length. At this point, the model can be simplified into
Generally speaking, the measured responses of the structure can be considered as the linear superposition of various load effects when the structure is in the normal operation state. Therefore, for the observed time series with seasonal effect, the seasonal additive model shown in
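As a minimal sketch of the differencing that underlies the additive seasonal model, the Python snippet below applies one seasonal difference followed by one regular difference to a plain list of observations. The function names `difference` and `seasonal_and_regular_difference` are illustrative assumptions and are not taken from the original study.

```python
def difference(series, lag=1):
    """Return the lag-differenced series x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_and_regular_difference(series, period):
    """Apply (1 - B^period) and then (1 - B) to the series.

    Hypothetical helper: it assumes one seasonal and one regular
    difference, as in an ARIMA (p, 1, q) x (0, 1, 0)_period model.
    """
    return difference(difference(series, period), 1)


# A linear trend plus a repeating pattern of period 4 is removed completely.
example = [2 * t + [0, 3, -1, 5][t % 4] for t in range(12)]
print(seasonal_and_regular_difference(example, 4))
```

Each differencing step shortens the series by its lag, so a period of 48 applied to 105 calibration points would leave 105 − 48 − 1 = 56 values for estimation.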
To consider the influence of different types of intervention outliers, the indicator function is introduced. Thus, the general model of time series {
where
When outliers are detected in monitoring data, the abnormal behavior can be explained by intervention analysis techniques if the timing and causes of the interventions are known. However, the timing of interventions is usually unknown, so it is necessary to detect them and estimate their possible effects. Here, two common outlier models, the innovational outlier (IO) and the additive outlier (AO), are introduced [
Let
If the error
where
If
Hence, an AO affects only the
If a time series is influenced by
where, for SARIMA model
When outliers exist, the estimated parameters are biased. In this case, appropriate test statistics are constructed from the residuals, which are the discrepancies between the observed and the estimated values, for the detection and correction of outliers. Outlier mining based on time series analysis aims to identify the time, size and type of outliers, estimate their impacts, and modify the original time series model affected by outliers so as to improve the accuracy of the model.
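To make the difference between the two outlier types concrete, the following sketch injects an outlier of size `omega` into a simple AR(1) process. The AR(1) process and the function name `simulate_ar1` are assumptions chosen for illustration only; the case study itself uses a seasonal model.

```python
def simulate_ar1(phi, innovations, outlier_time=None, omega=0.0, kind=None):
    """Simulate x[t] = phi * x[t-1] + a[t] with an optional outlier.

    kind="IO": omega is added to the innovation at outlier_time, so the
               disturbance propagates through the model dynamics.
    kind="AO": omega is added to the observation at outlier_time only.
    """
    shocks = list(innovations)
    if kind == "IO":
        shocks[outlier_time] += omega
    series, previous = [], 0.0
    for shock in shocks:
        previous = phi * previous + shock
        series.append(previous)
    if kind == "AO":
        series[outlier_time] += omega
    return series


quiet = [0.0] * 5
print(simulate_ar1(0.5, quiet, outlier_time=2, omega=1.0, kind="IO"))
print(simulate_ar1(0.5, quiet, outlier_time=2, omega=1.0, kind="AO"))
```

With zero innovations, the IO leaves a geometrically decaying trace after the outlier time, whereas the AO changes a single observation.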
Let {
The process of outlier detection is mainly discussed in the following two cases [
(1) Only one type of outlier is included, and the timing of the outlier occurrence is known.
The residual series of IO and AO models can be written respectively as
where
According to the least-squares principle, when
The test statistics for IO and AO at time
Under the null hypothesis
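The residual-based test statistics can be sketched as follows. This is a hedged illustration that follows the formulation commonly given in time series textbooks, which may differ in notation from the equations of this paper. It assumes that the residuals `e[t]` come from the fitted model written as π(B) = 1 − π_1 B − π_2 B² − …, that `pi_weights` holds π_1, π_2, …, and that `sigma` is the residual standard deviation. The name `outlier_statistics` is hypothetical.

```python
import math


def outlier_statistics(residuals, pi_weights, sigma):
    """Return (lambda_io, lambda_ao), one standardized statistic per time.

    IO at time T: the effect estimate is e[T], so lambda = e[T] / sigma.
    AO at time T: least-squares estimate
        omega = (e[T] - sum_j pi_j * e[T+j]) / (1 + sum_j pi_j**2),
    standardized by sigma / sqrt(1 + sum_j pi_j**2).
    """
    n = len(residuals)
    lambda_io, lambda_ao = [], []
    for T in range(n):
        lambda_io.append(residuals[T] / sigma)
        horizon = min(n - 1 - T, len(pi_weights))
        numerator, denominator = residuals[T], 1.0
        for j in range(1, horizon + 1):
            numerator -= pi_weights[j - 1] * residuals[T + j]
            denominator += pi_weights[j - 1] ** 2
        omega_ao = numerator / denominator
        lambda_ao.append(omega_ao * math.sqrt(denominator) / sigma)
    return lambda_io, lambda_ao


# Residual pattern left by an AO of size 2 at index 2 in an AR(1) with pi_1 = 0.5.
lam_io, lam_ao = outlier_statistics([0.0, 0.0, 2.0, -1.0, 0.0], [0.5], 1.0)
print(lam_io[2], lam_ao[2])
```

In this example the AO statistic at index 2 exceeds the IO statistic in absolute value, which is the comparison used to decide the outlier type.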
(2) The timing and type of the outliers are unknown.
Case (1) is applicable to situations where the timing and type of the outliers are known. However, more often in practice, the type, timing and number of outliers are unknown and have to be estimated. An iterative detection procedure is proposed to handle the situation when an unknown number of AO or IO may exist [
The initial estimate of the standard deviation of the residuals is
Bonferroni correction is used to control the overall error rate of multiple tests. Based on 0.05 significance level, if the
It should be noted that the maximum likelihood estimation of
According to different types of outliers, the effect of IO/AO at time
where
The new estimate of the standard deviation of the residuals
Repeat Step 2 to Step 4 until no new outlier is identified.
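One pass of the flagging step can be sketched as below. The sketch assumes a two-sided test, so the Bonferroni critical value is taken as the upper 0.025/n quantile of the standard normal distribution at an overall 0.05 level. Refitting the model and re-estimating the residual standard deviation between passes are deliberately omitted, and the function names are hypothetical.

```python
from statistics import NormalDist


def bonferroni_critical_value(n, alpha=0.05):
    """Two-sided critical value for n simultaneous standard normal tests."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * n))


def flag_outliers(lambda_io, lambda_ao, alpha=0.05):
    """Flag times whose larger absolute statistic exceeds the critical value.

    Returns a list of (time index, type, statistic). The type is chosen
    as the one with the larger absolute statistic at that time.
    """
    critical = bonferroni_critical_value(len(lambda_io), alpha)
    flagged = []
    for t, (stat_io, stat_ao) in enumerate(zip(lambda_io, lambda_ao)):
        if max(abs(stat_io), abs(stat_ao)) > critical:
            if abs(stat_io) > abs(stat_ao):
                flagged.append((t, "IO", stat_io))
            else:
                flagged.append((t, "AO", stat_ao))
    return flagged


print(flag_outliers([0.1, 4.0, 0.2, -1.0, 0.3], [0.2, 3.0, 0.1, -5.0, 0.0]))
```

In a full procedure, the flagged effects would be added to the model as intervention terms and the pass repeated on the new residuals until nothing further is flagged.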
One of the most important aims of time series analysis is to forecast the future development trends of observed data using the fitted models. This is also an important purpose of bridge health monitoring: to determine the current state of the bridge structure and predict its long-term development trends from the observed monitoring data.
Consider the general additive seasonal ARIMA (
Let time
According to the minimum mean square error principle, the MSE shown in
The forecast error is
It can be seen that the variance of the forecast is only related to step size
where
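The dependence of the forecast variance on the step size can be illustrated with the ψ weights of the model. The sketch below assumes the Box-Jenkins sign convention (AR polynomial 1 − φ_1 B − …, MA polynomial 1 − θ_1 B − …) and non-seasonal ARMA coefficients combined with regular and seasonal differencing. The function names are hypothetical and the convention may differ from the one used in this paper.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def psi_weights(ar, ma, d, D, period, n_weights):
    """Psi weights of phi(B) (1 - B)^d (1 - B^period)^D x_t = theta(B) a_t."""
    gen_ar = [1.0] + [-c for c in ar]
    for _ in range(d):
        gen_ar = poly_mul(gen_ar, [1.0, -1.0])
    for _ in range(D):
        gen_ar = poly_mul(gen_ar, [1.0] + [0.0] * (period - 1) + [-1.0])
    ma_poly = [1.0] + [-c for c in ma]
    psi = []
    for j in range(n_weights):
        value = ma_poly[j] if j < len(ma_poly) else 0.0
        for i in range(1, min(j, len(gen_ar) - 1) + 1):
            value -= gen_ar[i] * psi[j - i]
        psi.append(value)
    return psi


def forecast_variance(psi, sigma2, steps):
    """Variance of the steps-ahead forecast error: sigma2 * sum of psi_j^2."""
    return sigma2 * sum(w * w for w in psi[:steps])


psi = psi_weights([0.5], [], 0, 0, 1, 3)
print(psi, forecast_variance(psi, 1.0, 3))
```

Under a normality assumption, an approximate 95% interval is the point forecast plus or minus 1.96 times the square root of this variance.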
Kunshan Yufeng bridge, located in Kunshan city, Jiangsu Province, is a non-thrusting leaning-type arch bridge with a main span of 110 m (see
The monitoring data chosen as the modeling basis are taken from the stress measuring point at the bottom of the vault of the southwest main arch rib of Kunshan Yufeng bridge (Gauge S12), with dates ranging from August 23rd, 2011 to February 28th, 2014. These data are presented in the form of weekly-cycle mean values (M values). The weekly cycle here does not follow the calendar week. We uniformly divide each month into four weeks, namely, 1st∼7th, 8th∼15th, 16th∼23rd and 24th∼30th (31st) (for February, the 7th, 14th and 21st are taken as the split nodes). In this way, a year is always divided into 48 weeks, giving a total of 117 sample data. The first 105 monitoring data are used for model calibration, while the remaining 12 are used for verification. The weekly-cycle stress M-value of the first 105 data of the sample series is drawn in
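The four-weeks-per-month division described above can be expressed as a small mapping from a calendar date to a week index between 1 and 48. This is only a sketch of the stated split nodes; the function name `weekly_cycle_index` is hypothetical, and how the original study aggregated the raw readings within each cycle is not shown here.

```python
from datetime import date


def weekly_cycle_index(day: date) -> int:
    """Return the 1-based week index (1..48) within the year.

    Split nodes are the 7th, 15th and 23rd of each month, except in
    February, where the 7th, 14th and 21st are used.
    """
    split_nodes = (7, 14, 21) if day.month == 2 else (7, 15, 23)
    week_in_month = 1 + sum(day.day > node for node in split_nodes)
    return (day.month - 1) * 4 + week_in_month


print(weekly_cycle_index(date(2011, 8, 23)))   # third cycle of August
print(weekly_cycle_index(date(2014, 2, 28)))   # fourth cycle of February
```

Grouping daily readings by `(year, weekly_cycle_index)` and averaging each group would give one mean value per cycle.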
The series shown in
Type of test  PP lag  PP statistic  PP p-value  KPSS lag  KPSS statistic  KPSS p-value  Conclusion
No drift no trend  3  −68.4  ≤0.01  1  0.0493  ≥0.1  Stationary
With drift no trend  3  −68.4  ≤0.01  1  0.0522  ≥0.1  Stationary
With drift and trend  3  −68.2  ≤0.01  1  0.0282  ≥0.1  Stationary
As illustrated in
The values of orders
Function  AR (p)  MA (q)  ARMA (p, q)
ACF  Tails off as exponential decay or damped sine wave  Cuts off after lag q  Tails off after lag (q − p)
PACF  Cuts off after lag p  Tails off as exponential decay or damped sine wave  Tails off after lag (p − q)
From
AR/MA  0  1  2  3  4  5  6  7
0  X  0  X  0  0  0  0  0 
1  X  X  X  0  0  0  0  0 
2  0  0  0  0  0  0  0  0 
3  X  0  0  0  0  0  0  0 
4  X  X  0  0  0  0  0  0 
5  X  X  X  0  0  0  0  0 
6  X  0  0  0  0  0  0  0 
7  X  0  0  0  0  0  0  0 
Note from
To quantitatively evaluate the accuracy and stability of the preliminary proposed models, the residual sum of squares (
where
These statistics are based on summary statistics of the residuals computed from the fitted model. For RSS, sigma^{2}, AICc and BIC, lower values indicate a better model, whereas for the log likelihood a higher value is better. Adjusted R-squared values range from 0 to 1, and a value closer to 1 indicates a better fit. The statistics of the four models are estimated in
Model  AICc  BIC  sigma^{2}  Log likelihood  Adjusted R-squared  RSS

ARIMA (2, 1, 0) × (0, 1, 0)_{48}  178.25  184.20  1.294  −86.06  0.9626  69.898 
ARIMA (0, 1, 1) × (0, 1, 0)_{48}  180.28  184.29  1.370  −88.12  0.9612  75.368 
ARIMA (1, 1,(2)) × (0, 1, 0)_{48}  181.47  187.31  1.319  −87.61  0.9614  73.883 
ARIMA (1, 1,(2, 3)) × (0, 1, 0)_{48}  179.30  187.00  1.193  −85.45  0.9636  66.816 
According to the statistical information illustrated in
The residual series of the fitted model (see
The first iteration  Model parameters  ——  ——  
−0.6896  −0.4919  1.2481  
Detected outliers  Time  Type  Time  Type  
91  AO  3.564134  88  IO  5.226359  
63  IO  4.256304  9  IO  −4.007655  
87  IO  −6.214973  ——  
The second iteration  Model parameters  
−0.5461  −0.4204  1.2106  2.3466  
0.6093  
−3.5170  3.5502  −2.3406  
Detected outliers  Time  Type  ——  
77  AO  −4.088277  ——  
The final results  Model parameters  
−0.5478  −0.4078  1.2286  2.3443  
0.5945  
−3.5260  3.5541  −2.3106  −0.7850  
Detected outliers  None  
Model statistics with outlier interventions  AICc  BIC  Log likelihood  Adjusted R-squared  RSS  
149.83  166.56  −65.16  0.9793  33.30 
The fitted SARIMA-outlier model can be presented as follows in the form of the main SARIMA model plus outlier interventions:
After two rounds of iteration, as shown in
The fitted results of the original model and the SARIMA-outlier model are illustrated in
Use the original model (
Indices  Mean Absolute Error (MAE)  Root Mean Squared Error (RMSE)  Mean Absolute Percentage Error (MAPE)  Theil Inequality Coefficient (TIC) 

Original model  0.5594  0.7678  5.8111%  0.0492 
SARIMA-outlier model  0.5481  0.7566  5.6864%  0.0485 
As can be seen from
On the other hand, the difference between the predicted values of the two models at each time point is not significant, and the maximum absolute value of the difference is only 0.036. The overall prediction results of the two models are quite close. Therefore, although the existence of outliers has a significant impact on the parameter identification and fitted precision of the time series models, the forecasting results are insensitive to outliers when there are only a small number of them and the value of outlier effect
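The four accuracy indices used in the comparison can be computed as in the sketch below. The definitions are the commonly used ones and are an assumption here, in particular the Theil inequality coefficient taken as the RMSE divided by the sum of the root mean squares of the predicted and observed series. The function name `forecast_accuracy` is hypothetical.

```python
import math


def forecast_accuracy(actual, predicted):
    """Return MAE, RMSE, MAPE (in percent) and TIC for two equal-length series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    rms_predicted = math.sqrt(sum(p * p for p in predicted) / n)
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    tic = rmse / (rms_predicted + rms_actual)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "TIC": tic}


print(forecast_accuracy([10.0, 20.0], [11.0, 18.0]))
```

TIC lies between 0 and 1, with values near 0 indicating a close match between forecasts and observations.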
In order to ensure the reliability of monitoring data and improve the accuracy of forecasts, the time series model with outliers for bridge monitoring is established using the intervention analysis theory. IOs and AOs are diagnosed and extracted from observed data. Through comparative analysis with the original model without considering outliers, some conclusions are drawn as follows:
(1) The additive seasonal ARIMA model is suitable for fitting bridge monitoring data with obvious seasonal effects. The residuals of the fitted model are white noise, which indicates that the fitted model is effective. The forecast errors of this model meet the requirements for prediction and assessment of bridge structures.
(2) The outlier detection algorithm presented can rapidly and efficiently identify the outliers existing in BHM data. The model parameter estimates are sensitive to the detected IOs and AOs. After the influence of IOs and AOs is considered, the model parameters vary greatly and the fitting accuracy improves significantly, which also verifies the effectiveness and accuracy of the outlier identification.
(3) In comparison with the original model, the prediction confidence interval of the SARIMA-outlier model is narrower, indicating a more reliable forecasting result. In addition, the accuracy measurement indices are smaller and the prediction accuracy is higher.
(4) The forecasting results are insensitive to the existence of outliers when the number of outliers is small and the test statistics for outlier effects are not large. The original time series model has strong prediction robustness.