Time-series data provide important information in many fields, and their processing and analysis have been the focus of much research. However, detecting anomalies in such data is difficult because of data imbalance, temporal dependence, and noise. Methodologies that augment time-series data and convert them into images for analysis have therefore been studied. This paper proposes a fault detection model that uses time-series data augmentation and transformation to address data imbalance, temporal dependence, and sensitivity to noise. Data are augmented by noise injection: Gaussian noise with a noise level of 0.002 is added to improve the generalization performance of the model. In addition, the Markov Transition Field (MTF) method is used to convert the time-series data into images that visualize the dynamic transitions of the data. This makes it possible to identify patterns in the time-series data and helps capture their sequential dependencies. For anomaly detection, the PatchCore model is applied, and the detected anomaly areas are represented as heat maps. Overlaying the anomaly map on the original image shows the areas where anomalies occur. The performance evaluation shows that both F1-score and accuracy are high when time-series data are converted to images. Additionally, processing the data as images rather than as time series significantly reduced both the data size and the training time. The proposed method can serve as a springboard for research on anomaly detection using time-series data, and it helps make the analysis of complex patterns in data more lightweight.

Time series data is complex and volatile, making it challenging to identify anomalies. Recently, deep learning technology has been widely used for anomaly detection [

Transforming time-series data into images has many advantages [

Concrete structures are among the main components of the infrastructure of modern society. Such structures are exposed to diverse environmental factors and physical loads, which may reduce durability, degrade performance, and cause defects over time. The resulting failures can lead to major accidents with extensive casualties, property damage, and social loss [

Transforming time-series data into images makes the complex patterns and structures of the data intuitive to examine and allows outliers to be detected effectively from 2D patterns. It also makes it possible to apply diverse image-processing technologies and models to time-series data.

Because the PatchCore-based image deep learning model analyzes patterns at the level of small patches within an image, it can effectively detect fine-grained abnormal patterns. In addition, since the model is robust against noise and temporary data changes, anomaly detection is stable.

Among data augmentation methods, noise injection is applied to enhance the generalization ability of the model and to alleviate the data imbalance problem. As a result, the model becomes less sensitive to noise.

Transforming time-series data into images makes anomaly detection as lightweight as possible.

This paper is organized as follows. In

Time-series data consists of a series of data points aligned in time order. Although there are many conventional methods used to explore and analyze the hidden patterns and structures of such data, recently, a method that analyzes time-series data by transforming it into images has been attracting attention [

Gramian Angular Field–The Gramian angular field method transforms time-series data into polar coordinates and constructs a matrix using the cosine and sine values [

Recurrence Plot–The recurrence plot method visualizes the repeated patterns of time-series data into 2D images. It creates a 2D plot that visualizes when certain patterns within time-series data will re-appear, and transforms such patterns into images [

Continuous Wavelet Transform–This method provides information on the time-frequency domains of time-series data [

Markov Transition Field–The Markov transition field method performs image transformation based on the Markov transition probability of time-series data [

Grey Scale Encoding–This method transforms time-series data into images in the most intuitive manner [

Spectrogram–This method is mainly used in audio data analysis, and uses FFT (Fast Fourier Transform) to visualize the frequency components of time-series data per time period [

As far as pre-existing methods for detecting concrete structure defects are concerned, in general, the NDT (Non-Destructive Testing) method is used. This method includes diverse technologies such as ultrasonic tests, radiographic tests, magnetic-inductive inspection, acoustic-impact methods, and infrared thermographic tests [

Image-based defect detection mainly uses CNNs (Convolutional Neural Networks). It detects defects by feeding surface photos or radar images of concrete structures into a CNN. Since a CNN captures local patterns within images well, it is suitable for defect detection. However, Gaur et al. [

Sensor data-based defect detection monitors the status of structures and detects anomalies through the use of data collected from diverse sensors such as vibration, temperature, and humidity sensors [

Since concrete structures are exposed to diverse environments, problems with durability, defects, and the like occur over time, and such problems lead not only to casualties but also to losses such as property damage. Therefore, a method is needed that predicts defects by monitoring the internal and external condition of concrete. Time-series data used for concrete defect detection make it possible to detect, at an early stage, internal defects that are difficult to confirm directly. However, since there is not enough defect data to sufficiently train a deep learning model, a data augmentation method is required.

In

The data used in this study is data obtained by simulating the external defects of model piles included in concrete, analyzing the propagation characteristics of the reflected electromagnetic pulses, and using the non-destructive method on the piles [

In

ROUNDTRIP | Normal | Air | Dsoil | Wsoil10 | Wsoil20 | Wsoil30 | Clay |
---|---|---|---|---|---|---|---|
0 | 0.00818 | 0.00328 | 0.00397 | 0.03703 | 0.00111 | 0.02824 | 0.01774 |
0.01 | 0.01656 | 0.01217 | 0.01508 | 0.07215 | 0.00724 | 0.05231 | 0.04161 |
0.02 | 0.03810 | 0.04116 | 0.02749 | 0.13325 | 0.02408 | 0.08812 | 0.07660 |
… | … | … | … | … | … | … | … |
81.9 | 1.03056 | 1.03802 | 1.03073 | 1.03272 | 1.03272 | 1.01399 | 1.01935 |
81.91 | 1.03020 | 1.03858 | 1.03257 | 1.02806 | 1.03171 | 1.01406 | 1.02176 |

In

Since there is 1 normal condition and 6 abnormal conditions, the data are imbalanced, and the total amount of data is also insufficient, so data augmentation is necessary. In addition, the measurement duration differs between individual tests, so each time series has a different length. For proper training, these lengths must be standardized. The file with the fewest lines is taken as the standard, and the other files are trimmed to the same number of lines. Only the end of each longer series is deleted, because the last part of the measurement is a value measured past the piles and is therefore unnecessary.
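
The length standardization described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the function name `truncate_to_shortest` and the toy readings are hypothetical, and each measurement file is assumed to have been loaded already as a plain list of values.

```python
def truncate_to_shortest(series_by_case):
    """Trim every series to the length of the shortest one by dropping its tail."""
    min_len = min(len(values) for values in series_by_case.values())
    return {name: values[:min_len] for name, values in series_by_case.items()}


# Hypothetical toy readings; the real files hold far more samples per case.
raw = {
    "Normal": [0.008, 0.017, 0.038, 0.051, 0.064],
    "Air": [0.003, 0.012, 0.041, 0.049],
    "Clay": [0.018, 0.042, 0.077, 0.081, 0.090, 0.094],
}

aligned = truncate_to_shortest(raw)
for name, values in aligned.items():
    print(name, len(values), values)
```

Trimming from the end, rather than resampling, matches the reasoning in the text that the final samples are measured past the piles and carry no useful information.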

Data augmentation is a method that increases the diversity of training data by applying diverse transformations or adding noise to the given original data to enhance the generalization performance of a model. In addition, in the field of anomaly detection, since the amount of abnormal data is significantly less than that of normal data, the data imbalance problem occurs. Therefore, generalization performance and accuracy can be enhanced through data augmentation. In this study, taking into consideration the attributes of time-series data, the following 6 augmentation methods are applied: time slicing, window slicing, noise injection, time warping, smoothing, and trend/cycle shifting.

In

For noise injection, the level of Gaussian noise must be set. Given the magnitude of the actual data, a noise level of 0.002 adds sufficient variability without distorting the original attributes of the data, whereas an excessive noise level would distort them. A low noise level keeps the training data diverse without making the model overreact to minute fluctuations. Moreover, measurements taken in a real environment contain small amounts of noise, and a low noise level is suitable for imitating such fluctuations. For these reasons, the level of Gaussian noise in this study is set to 0.002, which helps enhance the generalization ability of the model while preserving the original attributes of the data.
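
A minimal sketch of noise injection is given below. It assumes that the "noise level" is the standard deviation of zero-mean Gaussian noise added independently to every sample; the text does not state this definition, so it is an assumption. The helper names `add_gaussian_noise` and `augment` are hypothetical, and the sample values are taken from the Normal column of the data table.

```python
import random


def add_gaussian_noise(series, noise_level=0.002, seed=None):
    """Return a copy of the series with zero-mean Gaussian noise added.

    noise_level is treated as the standard deviation (an assumption).
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_level) for x in series]


def augment(series, n_copies, noise_level=0.002, seed=0):
    """Create n_copies noisy variants of one series, each with its own seed."""
    return [
        add_gaussian_noise(series, noise_level, seed + i) for i in range(n_copies)
    ]


original = [0.00818, 0.01656, 0.03810, 1.03056, 1.03020]
copies = augment(original, n_copies=3)
for noisy in copies:
    print([round(v, 5) for v in noisy])
```

Because the perturbation is small relative to the signal range of roughly 0 to 1, each copy keeps the overall shape of the original curve while differing in its fine detail.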

The result in

Transforming time-series data into images intuitively visualizes the complex patterns and structures of data, and such a process enables utilizing the advantages of advanced image processing techniques such as CNN. In addition, since the dimensions and shapes of data are diversified through transformation, enhancement and lightweight effects on the learning performance of a model can be expected. Due to such reasons, time-series data is transformed into images. To transform time-series data into images, the following methods can be taken into consideration: Gramian angular field, recurrence plot, continuous wavelet transform, MTF (Markov Transition Field), grey-scale encoding, and spectrogram [
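
As a rough illustration of the MTF transformation named above, the sketch below follows the commonly described construction: the series is split into quantile bins, a first-order transition matrix between bins is estimated from consecutive samples, and pixel (i, j) of the image holds the transition probability from the bin of sample i to the bin of sample j. The function name, the bin count of 4, and the rank-based tie handling are assumptions made for illustration and are not taken from the paper.

```python
def markov_transition_field(series, n_bins=4):
    """Build an n x n Markov Transition Field from a 1D series."""
    n = len(series)

    # Quantile binning: rank the samples, then map ranks to equally sized bins.
    # Ties are broken by position, which is a simplification.
    order = sorted(range(n), key=lambda i: series[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n

    # Count first-order transitions between the bins of consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1

    # Row-normalize the counts into transition probabilities.
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])

    # Pixel (i, j) is the probability of moving from bin(x_i) to bin(x_j).
    return [[probs[bins[i]][bins[j]] for j in range(n)] for i in range(n)]


signal = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
mtf = markov_transition_field(signal, n_bins=4)
for row in mtf:
    print(row)
```

The resulting matrix can be rendered as a grayscale or color image. For long series the image is usually downsampled, which this sketch omits.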

Transforming time-series data into images enables anomaly detection through the transformed images. In this paper, anomaly detection is performed using a PatchCore model [

The anomaly score is measured from the distance between patches: each test patch is compared with the learned patches, and the distance to its first (nearest) neighbor is used. In this way, the anomaly score and anomaly map for each image are obtained. The anomaly score indicates the intensity of abnormal patterns within the image; the higher the score, the more abnormal patterns the image contains. The anomaly map, on the other hand, visually expresses anomalies at specific locations within an image, so users can easily confirm the specific regions where abnormal patterns occur.
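
The nearest-neighbor scoring described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the PatchCore implementation: patch features are represented by hypothetical 2D tuples instead of features from a pretrained network, the memory bank is used without coreset subsampling, and the image-level score is taken as the plain maximum of the patch scores.

```python
import math


def nearest_distance(patch, memory_bank):
    """Distance from one patch feature to its first (nearest) neighbor."""
    return min(math.dist(patch, stored) for stored in memory_bank)


def score_image(patch_grid, memory_bank):
    """Return (image-level anomaly score, per-patch anomaly map)."""
    anomaly_map = [
        [nearest_distance(patch, memory_bank) for patch in row]
        for row in patch_grid
    ]
    image_score = max(max(row) for row in anomaly_map)
    return image_score, anomaly_map


# Hypothetical patch features learned from normal images.
memory_bank = [(0.0, 0.0), (1.0, 0.0)]

# A 2 x 2 grid of patch features from a test image; the last patch is unusual.
patch_grid = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (1.0, 3.0)],
]

image_score, anomaly_map = score_image(patch_grid, memory_bank)
print(image_score)
print(anomaly_map)
```

Upsampling the per-patch map to the image resolution and overlaying it on the input gives the kind of heat map used to localize abnormal regions.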

An accuracy evaluation is performed to evaluate the model. For a relative evaluation, a comparison of performance between noise levels, a comparison of performance before and after the transformation of time-series data into images, and a comparison of performance before and after data augmentation are performed. As aforementioned, data collected in

The performance evaluation is performed through measuring accuracy and F1-score. Accuracy and F1-score are measured based on the confusion matrix.

TP (True Positive) is the case where an observation is predicted positive and is actually positive. FP (False Positive) is the case where an observation is predicted positive but is actually negative. FN (False Negative) is the case where an observation is predicted negative but is actually positive. TN (True Negative) is the case where an observation is predicted negative and is actually negative. Precision is the percentage of actually true images among the images the model predicts as true. Recall is the percentage of true images that the model predicts as true. F1-score is the harmonic mean of precision and recall; it has the advantages of evaluating a model accurately when the data labels are imbalanced and of expressing performance as a single number.
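
The definitions above translate directly into code. The sketch below uses the standard formulas for accuracy, precision, recall, and F1-score; the function name and the confusion-matrix counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        # Harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The zero checks avoid division by zero when a model predicts no positives or when the test set contains none.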

To examine the effects of time-series data anomaly detection depending on diverse levels of Gaussian noise, levels of Gaussian noise of 0.001, 0.002, 0.003, 0.004, 0.005, and 0.01 are used to perform the test. Since combining the advantages of LSTM with the re-construction ability of AE is known to demonstrate high efficiency and performance in learning the continuity and patterns of time-series data, the test is performed using an LSTM-AE (Long Short-Term Memory AutoEncoder) model.

Noise level | F1-score (LSTM-AE) |
---|---|
0.001 | 0.8781 |
0.002 | |
0.003 | 0.8791 |
0.004 | 0.8784 |
0.005 | 0.8713 |
0.01 | 0.8761 |

The test confirms that performance is highest when the level of Gaussian noise is set to 0.002. It also shows which noise levels allow the model to detect anomalies effectively and how changes in noise conditions affect its stability.

To evaluate the effects of data augmentation, a comparison of performance before and after data augmentation is performed. Data augmentation increases the diversity of a training dataset and is a methodology that plays an important role in enhancing the generalization performance of a model. The test was performed using the two models of LSTM-AE and PatchCore.

Data augmentation | F1-score (LSTM-AE) | F1-score (PatchCore) |
---|---|---|
Before | 0.7793 | 0.8757 |
After | | |

Based on the performance evaluation results before data augmentation shown in

The difference in performance between the two different models of LSTM-AE and PatchCore is compared, and the lightweight effects are evaluated as well. In particular, such comparison is focused on the changes in the training time and amount of data of the models.

Lightweight effects | Time series (LSTM-AE) | Image (PatchCore) |
---|---|---|
Training time | 3375.5 s | |
Data size | 71 MB | |

Based on the test results shown in

To make a quantitative comparison of performance between LSTM-AE and PatchCore, two main indicators are used.

Based on the test results, LSTM-AE showed an accuracy of 0.8923 and an F1-score of 0.88, whereas PatchCore showed an accuracy of 0.9843 and an F1-score of 0.9898. Both accuracy and F1-score are therefore higher for the PatchCore model than for the LSTM-AE model. One cause of this improvement is the information compression and pattern recognition that take place when time-series data are transformed into images. The transformation lets PatchCore capture more of the unique patterns and features of the data, and it allows diverse dimensions and patterns of the data to be considered at the same time, which raises performance indicators such as F1-score. These results demonstrate the effectiveness of anomaly detection by converting time-series data into images. However, the model's performance depends heavily on the characteristics and noise level of the data used, so further research is necessary on its applicability in real-world settings with various types of noise. The approach of this study can be applied in numerous fields, such as industrial process monitoring, defect detection in architectural structures, and medical data analysis, where it can help analyze complex patterns and structures in time-series data and detect anomalies. Nonetheless, the model's performance may vary depending on the characteristics and environment of the data in these application domains.

To effectively detect defects within concrete, this study proposes a defect detection model using time-series data augmentation and transformation. The proposed method uses time-series data measured with TDR (Time Domain Reflectometry). Since the data had a class imbalance between normal and abnormal data, they were augmented using noise injection. Diverse levels of Gaussian noise were applied to the time-series data to determine which noise level lets the model detect anomalies effectively. The experiments showed that a Gaussian noise level of 0.002 performed best, so this level was applied. In this way, it was confirmed that the model is robust against uncertainties and potential external noise in an actual environment and is capable of maintaining stable performance. However, since the shape of the augmented data was similar to that of the original data, the time-series data were transformed into images using MTF (Markov Transition Field), and the anomaly detection model PatchCore was applied to detect and visualize abnormal patterns effectively. In addition, heat maps were used to express abnormal regions visually, which improves both user understanding and practical usability. This paper has thus presented a new methodology for analyzing complex patterns and features in time-series data. The performance evaluation confirmed the effects of transforming time-series data into images: compared with the results of the LSTM-AE model, the PatchCore model on image data showed higher F1-score and accuracy. This paper also examined how lightweight anomaly detection becomes when time-series data are converted to images. Therefore, anomalies such as defects within concrete can be detected with the proposed method. However, the method is limited to concrete data and is therefore difficult to generalize.

In future work, the proposed methods will be applied to various environments and datasets to enhance generalization performance. The PatchCore model exhibits strengths in transforming the unique patterns and characteristics of time series data into images, but it may not be equally effective for all types of time series data. Therefore, it is necessary to further explore the applicability and limitations of the model across various datasets. In particular, future research will focus on the model’s performance with datasets that have high variability or rapidly changing data, as well as any potential performance degradation when using long-term data. Finally, the study will explore methods to analyze the transformed images and apply them to various deep learning models.

Not applicable.

This research was financially supported by the Ministry of Trade, Industry, and Energy (MOTIE), Korea, under the “Project for Research and Development with Middle Markets Enterprises and DNA (Data, Network, AI) Universities” (AI-based Safety Assessment and Management System for Concrete Structures) (Reference Number P0024559) supervised by the Korea Institute for Advancement of Technology (KIAT).

The authors confirm contribution to the paper as follows: study conception and design: G. I. Kim, K. Chung; data collection: G. I. Kim; analysis and interpretation of results: G. I. Kim, H. Yoo; draft manuscript preparation: G. I. Kim, H. J. Cho. All authors reviewed the results and approved the final version of the manuscript.

Data available on request from the authors. The data that support the findings of this study are available from the corresponding author, G. I. Kim, upon reasonable request.

The authors declare that they have no conflicts of interest to report regarding the present study.