Time series segmentation, which aims to split a time series into segments that each reflect a state of the monitored object, has attracted growing interest in recent years. Although there are many surveys on time series segmentation, most focus on change point detection (CPD) methods and overlook advances in boundary detection (BD) and state detection (SD) methods. In this paper, we categorize time series segmentation methods into CPD, BD, and SD methods, with a specific focus on recent advances in BD and SD methods. Within the scope of BD and SD, we subdivide the methods based on their underlying models/techniques and focus on the milestones that have shaped the development trajectory of each category. We conclude that: (1) existing methods provide insufficient support for online operation, with only a few methods supporting online deployment; (2) most existing methods require parameters to be specified, which hinders their ability to work adaptively; (3) existing SD methods do not attach importance to the accurate detection of boundary points in evaluation, which may limit their boundary detection performance. We highlight the ability to work online and adaptively as an important attribute of segmentation methods, and boundary detection accuracy as a neglected metric for SD methods.

In recent years, data-driven services (e.g., autonomous driving) and IoT applications (e.g., wearable sensing) have become an indispensable part of daily life, constantly generating large amounts of time series data, also known as KPIs (Key Performance Indicators) in the AIOps (Artificial Intelligence for IT Operations) domain. These co-evolving multivariate time series carry rich information capturing different behaviors and facets of the monitored objects. Exploiting these latent states by segmenting time series could be beneficial for many real-world tasks, e.g., pattern recognition and event detection [

Time series segmentation, as a classic problem, has been extensively studied and there are several reviews that systematically summarize these works. However, the term “time series segmentation” is somewhat overloaded in the context of time series analysis and may refer to at least three categories of work. To distinguish between these three categories, we formally name them

Before the formal review, let us highlight the scope of this paper by clarifying the concepts of CPD, BD, and SD. The concepts of CPD and BD are somewhat similar: both aim to find the points at which the characteristics of a time series change, thereby dividing the time series into segments. Gharghabi et al. [

Existing surveys mainly focus on change point detection, thus overlooking the recent advancements in boundary detection and state detection. This paper differs from existing surveys in that it focuses on the recent advances in boundary detection and state detection rather than on change point detection methods, which have already been thoroughly reviewed. An example of BD and CPD on the MoCap dataset is shown in

The motivations and contributions of this paper are three-fold:

Existing surveys mainly focus on change point detection (CPD) methods and fail to capture recent advances in boundary detection (BD) and state detection (SD) methods. This paper aims to fill this gap by reviewing recent advances in BD and SD methods;

The term “segmentation” is overloaded in the context of time series analysis and can refer to CPD, BD, and SD methods, yet no unified framework exists for describing these methods. This paper aims to clarify their differences and relationships, as well as provide a unified description under the framework of time series segmentation;

The inherent principles and evaluation practices of SD and BD differ greatly. We therefore categorize BD and SD methods based on their underlying techniques/models, discuss the differences in evaluating BD and SD methods, and call for more comprehensive evaluation of BD and SD methods.

The rest of this paper is organized as follows. In

There are multiple forms of definitions for time series in the context of time series analysis. In the topic of time series segmentation, we usually consider multivariate time series, which is generally treated as a sequence of multivariate observations. In this paper, we use the definitions presented in [

With the above definition, a subsequence starts from time

The concept of state is abstract and encompassing: it depends on contextual interests and cannot be strictly defined by explicit rules. Otherwise, states could arguably be identified using handcrafted features. An abstract definition provided by [

The concept of state sequence is established by indexing the states of a multivariate time series in [

With the above definitions, we can now formally formulate the boundary detection and state detection problem. Given a multivariate time series

Before reviewing time series segmentation methods, it is essential to introduce the criteria for assessing these methods and the datasets used. Understanding the details of the evaluation criteria and datasets helps practitioners test the methods at hand before real-world deployment. In general, the datasets used for evaluating BD and SD methods are shared, but the evaluation metrics differ. This section provides an introduction to the primary metrics and commonly used datasets for evaluating BD and SD methods. The evaluation metrics include the F-measure, error, and several clustering metrics, each measuring a different aspect of performance. Regarding datasets, we discuss several widely used public datasets that cover various application domains such as EEG and activity data. The selection of these metrics and datasets is crucial for a fair and comprehensive evaluation of different methods.

The evaluation of time series segmentation mainly revolves around two core aspects. Firstly, it entails evaluating the performance in identifying boundary points, and secondly, the performance in assigning states to time points. Subsequently, we will introduce the evaluation metrics for each of these aspects separately.

Many works use the F-Measure (i.e.,

However, as pointed out by Lin et al. [

A simple way to mitigate the above problem is to add a window that brackets each ground truth boundary. When a predicted boundary falls within the specified time window of a ground truth boundary, it is counted as a TP; when a predicted boundary falls outside the time windows of all ground truth boundaries, it is counted as an FP; a ground truth boundary with no prediction inside its window is counted as an FN; when multiple predicted boundaries fall within the time window of a ground truth boundary, only the closest prediction is counted as a TP and the others are counted as FPs. However, Gharghabi et al. [

Although the above methods can further alleviate the problems in evaluation, there are still shortcomings. As pointed out in [

Based on the above introduction and analysis, we believe that a more reasonable approach would be to simultaneously use F-Measure and the scoring function proposed in [

For state detection evaluation, the metrics are more intuitive. Two commonly adopted metrics are the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI), both clustering evaluation metrics that measure the point-wise agreement between prediction and ground truth [

We notice that SD methods are rarely evaluated with the metrics used for BD methods; instead, their evaluation focuses on point-wise state assignment performance, measured with clustering metrics. We believe that incorporating BD metrics will make the evaluation of SD methods more comprehensive.
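As a rough illustration of how a boundary metric could be applied to the output of an SD method, the following sketch derives boundaries from a point-wise state sequence and scores them with a windowed F-measure. The function names, the greedy closest-match rule, and the `margin` parameter are assumptions made for this example; they are not taken from any of the reviewed methods.

```python
def boundaries_from_states(states):
    """Return the indices at which the state label changes."""
    return [t for t in range(1, len(states)) if states[t] != states[t - 1]]


def windowed_f_measure(predicted, truth, margin):
    """Windowed precision, recall, and F-measure for boundary detection.

    Each ground truth boundary is matched to at most one prediction that
    lies within `margin` time points (the closest one). Unmatched
    predictions count as FP, unmatched ground truth boundaries as FN.
    """
    unused = sorted(predicted)
    tp = 0
    for g in sorted(truth):
        in_window = [p for p in unused if abs(p - g) <= margin]
        if in_window:
            closest = min(in_window, key=lambda p: abs(p - g))
            unused.remove(closest)  # a prediction can be matched only once
            tp += 1
    fp = len(predicted) - tp
    fn = len(truth) - tp
    precision = tp / (tp + fp) if predicted else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Toy example: the predicted state sequence has one spurious short state.
truth_states = [0] * 10 + [1] * 10 + [0] * 10
pred_states = [0] * 12 + [1] * 7 + [2] * 1 + [0] * 10

truth_bounds = boundaries_from_states(truth_states)
pred_bounds = boundaries_from_states(pred_states)
precision, recall, f1 = windowed_f_measure(pred_bounds, truth_bounds, margin=2)
```

In this toy case the spurious one-point state adds an extra boundary that lowers precision, an effect that a point-wise clustering metric would barely register.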

Understanding the details of the evaluation criteria and datasets helps in comprehending the performance and applicability of different methods, and is especially helpful for practitioners who want to test their methods before real-world deployment. Having introduced the evaluation metrics and datasets, we now move on to discuss the recent advances in BD and SD methods. By analyzing these methods in detail, we aim to highlight their strengths, weaknesses, and suitable application scenarios, thus aiding researchers and practitioners in selecting the appropriate methods for their specific problems.

In this section, we introduce existing methods and categorize them based on their foundational models; boundary detection and state detection methods are introduced separately. Within each method type, we focus on highlighting key milestones and introducing notable techniques that have shaped the evolutionary path of these methodologies. We also analyze the advantages and disadvantages inherent in each of these methods, and summarize the key attributes of these methods that matter to users in segmentation scenarios, such as whether a method requires the number of states/segments to be specified, and whether a method is designed for univariate or multivariate time series.

| Type | Category | Method | Non-parametric | N-free | K-free | Online |
|---|---|---|---|---|---|---|
| Boundary detection methods | Profile-based | FLOSS [ | | | – | |
| | | ClaSP [ | | | – | |
| | Entropy-based | IGTS [ | | | – | |
| | | ESPRESSO [ | | | – | |
| State detection methods | Statistical | HMM [ | | | | |
| | | HSMM [ | | | | |
| | | HDP-HSMM [ | | | | |
| | Hybrid | Time2State [ | | | | |
| | | E2USD [ | | | | |
| | Compression-based | AutoPlait [ | | | | |
| | | StreamScope [ | | | | |

Note: N denotes the number of segments, K denotes the number of states, – means not applicable.

The traditional statistical time series segmentation methods are mainly based on Markov models, e.g., Hidden Semi-Markov Model (HSMM) [

A Hidden Markov Model (HMM) can be formulated as a 5-tuple

There are three core problems of HMM, which are the evaluation, decoding, and learning problem, respectively:

In the context of time series segmentation, we mainly focus on the decoding and learning problems. Given an observation sequence, the first step is to fit an HMM to it; the corresponding hidden state sequence is then the desired state sequence for segmenting the time series. On homogeneous datasets, fitting the model once typically suffices, and the fitted model can be used to decode other data. With heterogeneous datasets, however, performance may decline, often necessitating a distinct HMM for each time series.
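To make the decoding step concrete, the sketch below implements the standard Viterbi algorithm for an HMM with discrete emissions and reads the segment boundaries off the decoded state path. It assumes the model parameters are already fitted; the parameter values and the names `init`, `trans`, and `emit` are illustrative only, and continuous multivariate observations would require a different emission model.

```python
import math


def viterbi(obs, init, trans, emit):
    """Most likely hidden state path of a discrete-emission HMM.

    obs: list of observation symbol indices
    init[i]: P(first state = i)
    trans[i][j]: P(next state = j | current state = i)
    emit[i][o]: P(observing symbol o | state = i)
    """
    n_states = len(init)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Log-probability of the best path ending in each state at time 0.
    score = [log(init[i]) + log(emit[i][obs[0]]) for i in range(n_states)]
    back = []  # back[t][j]: best predecessor of state j at time t + 1
    for o in obs[1:]:
        prev, score, pointers = score, [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log(trans[i][j]))
            score.append(prev[best_i] + log(trans[best_i][j]) + log(emit[j][o]))
            pointers.append(best_i)
        back.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Illustrative two-state model with "sticky" transitions.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]
obs = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

path = viterbi(obs, init, trans, emit)
# Segment boundaries are the positions where the decoded state changes.
boundaries = [t for t in range(1, len(path)) if path[t] != path[t - 1]]
```

With these parameters the isolated symbol at position 3 is absorbed into the first segment, because switching states twice is less likely than one atypical emission.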

Among these variants, a noteworthy model is the Hidden Semi-Markov Model (HSMM) [

In summary, the motivation behind HSMM stems from the limitations of the traditional HMMs in accurately capturing temporal dynamics with variable durations. HSMMs aim to address this limitation by allowing states to have variable durations. By introducing explicit modeling of state durations, HSMMs provide a more flexible framework for capturing complex temporal patterns in sequential data.
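The duration limitation can be made explicit with a small calculation. In a standard HMM, the time spent in a state before leaving it is geometrically distributed, with the state's self-transition probability as the only free parameter. The sketch below computes this implicit duration distribution; the function names are chosen for this example only.

```python
def hmm_duration_pmf(self_transition, d):
    """P(staying exactly d steps in a state) implied by a standard HMM.

    The state is kept for d - 1 steps (probability self_transition each)
    and then left (probability 1 - self_transition).
    """
    return self_transition ** (d - 1) * (1.0 - self_transition)


def hmm_expected_duration(self_transition):
    """Mean of the geometric duration distribution."""
    return 1.0 / (1.0 - self_transition)


a_ii = 0.9
pmf = [hmm_duration_pmf(a_ii, d) for d in range(1, 6)]
mean_duration = hmm_expected_duration(a_ii)
```

Because this distribution always decreases with the duration, the shortest duration is always the most likely one, whatever the self-transition probability. An HSMM replaces this implicit distribution with an explicit duration model per state.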

The HDP-HSMM is a non-parametric Bayesian extension of HSMM, which uses the hierarchical Dirichlet process (HDP) [

Representation learning-based methods [

Time2State utilizes the advancements in time series representation learning for effective and efficient time series state detection. The architecture of Time2State is shown in

However, as pointed out by Lai et al. [

Compression-based hybrid methods are also a noteworthy category, which utilize the minimum description length (MDL) [

Tatti et al. utilize MDL to segment discrete event sequences, using a code table as the model, but their approach cannot handle multivariate time series with continuous values [

Among the recent advances in BD methods, two categories are noteworthy: profile-based methods and entropy-based methods.

Profile-based methods utilize the profile curve to split time series into non-overlapping segments, which usually can only detect the boundaries of segments. As a general pipeline, profile-based methods first extract the profile curve from raw data; then, they search for boundaries by finding the local maxima or minima on the extracted profile curve. An example of a profile curve is shown in

Gharghabi et al. [

Schäfer et al. [

A major improvement of ClaSP over earlier profile-based methods is its recursive strategy for finding boundary points on the profile curve. In FLOSS, a simple method is used to extract the locations of local maxima/minima from the profile curve. This method only needs to run FLOSS once, but requires setting an exclusion zone to prevent values near an already selected peak from being selected as well. In ClaSP, a recursive strategy is adopted in which only the global maximum is selected each time. The time series is then divided into two parts at this point, and ClaSP is recursively run on the newly obtained parts until the desired number of boundaries is generated.
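The two extraction strategies can be contrasted with a simplified sketch. It assumes that higher profile values indicate stronger boundary evidence (negate the profile when searching for minima). `mean_shift_profile` is a hypothetical stand-in for a real profile computation, and the priority queue that decides which part to split next is one possible way to organize the recursion. Neither function reproduces the actual FLOSS or ClaSP implementations.

```python
import heapq


def peaks_with_exclusion(profile, n_boundaries, zone):
    """One-pass strategy: take the highest remaining value, then mask
    `zone` points on each side so that nearby values are not selected."""
    remaining = list(profile)
    found = []
    for _ in range(n_boundaries):
        idx = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[idx] == float("-inf"):
            break  # everything is masked
        found.append(idx)
        for i in range(max(0, idx - zone), min(len(remaining), idx + zone + 1)):
            remaining[i] = float("-inf")
    return sorted(found)


def recursive_split(series, profile_fn, n_boundaries, min_len=2):
    """Recursive strategy: split at the global maximum of the profile,
    then recompute the profile on each part and repeat."""
    heap = []

    def push(start, end):
        if end - start < 2 * min_len:
            return  # part too short to split again
        profile = profile_fn(series[start:end])
        candidates = range(min_len, end - start - min_len + 1)
        k = max(candidates, key=lambda i: profile[i])
        heapq.heappush(heap, (-profile[k], start + k, start, end))

    push(0, len(series))
    boundaries = []
    while heap and len(boundaries) < n_boundaries:
        _, pos, start, end = heapq.heappop(heap)  # best-scoring split first
        boundaries.append(pos)
        push(start, pos)
        push(pos, end)
    return sorted(boundaries)


def mean_shift_profile(segment):
    """Hypothetical toy profile: |mean(left) - mean(right)| per split point."""
    n = len(segment)
    profile = [0.0] * n
    for k in range(1, n):
        profile[k] = abs(sum(segment[:k]) / k - sum(segment[k:]) / (n - k))
    return profile


one_pass = peaks_with_exclusion([0, 1, 5, 4, 0, 0, 3, 0], n_boundaries=2, zone=1)
series = [0] * 10 + [5] * 10 + [1] * 10
recursive = recursive_split(series, mean_shift_profile, n_boundaries=2)
```

In the one-pass example, the value next to the first peak would be chosen as the second boundary without the exclusion zone. The recursive variant needs no such zone, at the cost of recomputing the profile for every new part.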

To optimize the above loss function, the authors propose a TopDown approach and a dynamic programming approach. According to [

ESPRESSO [

Time series segmentation plays a key role in time series analysis and facilitates many downstream tasks, such as pattern recognition and event detection. This paper categorizes time series segmentation methods into three types: change point detection, boundary detection, and state detection methods. Although many literature reviews have addressed time series segmentation, they mainly focus on change point detection methods, neglecting the advancements in boundary detection and state detection. This paper fills this gap by focusing on the advances in boundary detection and state detection methods, subdividing both groups of methods, and reviewing the milestones that have shaped the development trajectory of each category. In addition, we have reviewed the evaluation metrics and datasets for boundary detection and state detection methods.

The authors would like to express their gratitude to the editors and reviewers for their detailed review and insightful advice.

This work is supported by the National Key Research and Development Program of China (2022YFF1203001), the National Natural Science Foundation of China (Nos. 62072465, 62102425), and the Science and Technology Innovation Program of Hunan Province (Nos. 2022RC3061, 2023RC3027).

Study conception and design: Chengyu Wang, Xionglve Li; draft manuscript preparation: Chengyu Wang, Xionglve Li; supervision: Tongqing Zhou, Zhiping Cai. All authors reviewed the results and approved the final version of the manuscript.

This article does not involve data availability and this section is not applicable.

The authors declare that they have no conflicts of interest to report regarding the present study.