Traffic flow prediction plays a key role in building intelligent transportation systems. However, the complex spatio-temporal dependence and uncertainty of traffic flow make the task challenging. Most existing studies model traffic flow with graph neural networks and use a fixed graph structure to represent the relationships between nodes. Because the spatial correlation of a traffic network varies over time, the relationships between nodes are not fixed, and these methods cannot effectively integrate temporal and spatial features. This paper proposes a novel temporal-spatial dynamic graph convolutional network (TSADGCN). The dynamic time warping (DTW) algorithm is introduced to calculate the similarity of traffic flow sequences between network nodes in the time dimension, and a spatio-temporal graph of traffic flow is constructed to capture the spatio-temporal characteristics and dependencies of traffic flow. By combining a graph attention network and a temporal attention network, a spatio-temporal convolution block is constructed to capture the spatio-temporal characteristics of traffic data. Experiments on the open data sets PeMSD4 and PeMSD8 show that TSADGCN achieves higher prediction accuracy than well-known traffic flow prediction algorithms.

With rising levels of modernization and the continuous advance of urbanization, large numbers of new vehicles are put on the road. They cause serious traffic safety problems, severe traffic jams, and longer travel times. To deal with these complex traffic problems, intelligent transportation systems (ITS) have emerged. Traffic flow prediction is a key task of ITS. Timely and accurate traffic predictions can not only relieve congestion, increase traffic efficiency, and reduce energy consumption and pollution, but also serve as the basis for new applications across the traffic field.

Traffic forecasting research has been going on for decades. Early works focus on statistical methods, such as historical average (HA) [

Most of the existing studies are based on graph neural networks to model traffic flow graphs, and the fixed graph structure is usually applied to deal with the relationship between nodes. However, due to the time-varying spatial correlation of the traffic network, there is no fixed relationship between the traffic nodes, and these methods cannot effectively integrate the spatio-temporal characteristics.

To address these issues, this paper proposes TSADGCN, a novel traffic flow prediction model based on a temporal-spatial relationship graph and an attention mechanism network. TSADGCN explores the complex temporal and spatial characteristics of traffic flow data and establishes their dependencies by combining an attention mechanism with a dilated gated convolution network (DGCN) [

The main contributions of this work are described as follows:

A dilated gated convolution is proposed to extract deep temporal features with long receptive fields, and the node correlation of the real spatial relationship is calculated with a spatial self-attention mechanism.

A set of adaptive graphs is designed: a time graph with an adjacency matrix as prior information and a spatial graph with a sequential association matrix, to capture dynamic real node dependencies. The DTW algorithm is used to calculate the similarity of traffic flow sequences between nodes of the road network in the time dimension, and the concept of a time graph is proposed. Based on the spatio-temporal relation graph, graph convolution and temporal convolution are carried out in each branch, combined with the attention mechanism, to capture the spatio-temporal characteristics of traffic flow and their dependencies and to model the spatio-temporal relationships in road network traffic flow data.

The proposed model is experimentally verified on the public data sets PeMSD4 and PeMSD8, and its performance is compared with common prediction models. Experiments show that TSADGCN achieves lower MAE and RMSE than ARIMA, STGCN, and ASTGCN.

The remainder of this article is organized as follows.

Traditional prediction methods aim to mine temporal patterns from the traffic flow sequence. These methods include parametric and non-parametric models. Parametric models include ARIMA, the Kalman filter, etc. Non-parametric models include k-nearest neighbor (KNN), support vector machine (SVM), and Bayesian networks [

Many researchers have employed Euclidean convolutional networks to forecast traffic flow. Combining the Residual network with CNN, Zhang et al. [

Recently, GCN has obtained wide attention in traffic flow prediction and graphs have been used to construct models of non-Euclidean traffic flow data.

Guo et al. [

There are many transport management solutions to resolve the challenges resulting in traffic congestion. Jain et al. [

In summary, existing graph convolutional models usually focus on either the temporal or the spatial dimension and do not fully consider the possible dependencies between different nodes at different time intervals. In addition, most algorithms model traffic graphs with graph neural networks and use fixed graph structures to obtain the relationships between nodes.

In this paper, the traffic prediction task is to learn a function that maps signals from T historical graphs to future T’ graphs, i.e., to predict the traffic state in a future time interval based on the traffic information in the past time interval.

To represent the forecasting process, we give the following definitions:

Definition 1 (Road graph). A road graph

Definition 2 (Adjacency Matrix). The adjacency matrix AM is defined as the collective characteristic of the road map, representing the connectivity of the road network, which can be formalized as

Let

The proposed TSADGCN framework in this paper is depicted in

Input flow is a traffic data set.

Considering the periodicity, proximity, and other characteristics of traffic flow data in the temporal dimension, the input flow is divided into three parts: Adjacent data fragment, recent data fragment, and historical data fragment to extract features. As shown in

Adjacent data fragment

To capture the structure features at the temporal dimension and spatial dimension, in this paper, three spatial-temporal components are used to explore and analyze the spatial-temporal relationships between historical data sequence, recent data sequence, and adjacent data sequence. The spatial-temporal component consists of the attention mechanism, DGCN, and graph attention network.

Traffic flow at different time steps within a period is correlated. The attention model allocates different weights and obtains dynamic temporal relationships from the traffic flow data, which also reduces the influence of noise. Different kinds of data are fed into the attention layer for deep fusion to obtain more reliable attention weights. The attention weight can be calculated as follows:

where

In this paper, DGCN is mainly applied to extract deep temporal features and dependencies. DGCN is a temporal convolutional layer that combines the advantages of dilated convolution and gating mechanisms. The gating mechanism is a key component of DGCN. The residual structure is used to prevent gradients from vanishing as the network deepens, while allowing more information to be transmitted across multiple features. Dilated convolution is another key component of DGCN, whose internal process is shown in
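The combination of dilation, gating, and a residual connection can be illustrated with a minimal pure-Python sketch. It assumes a causal 1-D convolution and a tanh/sigmoid gate, which is one common form of gated temporal convolution and not necessarily the exact DGCN layer used in this paper; the function names and kernels are hypothetical.

```python
import math


def dilated_causal_conv(x, kernel, dilation):
    """1-D causal convolution: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zeros.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def gated_dilated_block(x, filter_kernel, gate_kernel, dilation):
    """Gated dilated convolution with a residual connection (assumed form).

    output = x + tanh(conv_filter(x)) * sigmoid(conv_gate(x))
    """
    f = dilated_causal_conv(x, filter_kernel, dilation)
    g = dilated_causal_conv(x, gate_kernel, dilation)
    gated = [math.tanh(a) * (1.0 / (1.0 + math.exp(-b))) for a, b in zip(f, g)]
    # Residual connection: add the input back so gradients have a direct path.
    return [xi + hi for xi, hi in zip(x, gated)]


if __name__ == "__main__":
    flow = [1.0, 2.0, 3.0, 4.0, 5.0]
    # With dilation 2 and kernel [1, 1], each output sums x[t] and x[t - 2].
    print(dilated_causal_conv(flow, [1.0, 1.0], dilation=2))
    print(gated_dilated_block(flow, [0.5, 0.5], [0.1, 0.1], dilation=2))
```

Stacking such blocks with dilations 1, 2, and 4 and a kernel of size 2 gives a receptive field of 8 time steps, which is how dilation lengthens the receptive field without adding parameters.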

DTW is a typical algorithm for calculating the similarity of two time sequences, and this paper uses the idea of DTW to construct a time association matrix of traffic flow data between different nodes. For any two nodes

where

Euclidean distance between middle points of two sequences is used to measure the similarity, let

where

Let path

The whole path meets the constraint conditions:

where

The total matching cost between two sequences is defined as

In this paper, the minimum total matching cost is defined recursively as

Finally, the minimum matching cost

Definition 3 (Temporal graph) A temporal graph

Definition 4 (Time sequence association matrix). Given the timing sequence similarity threshold, the time sequence association matrix is a location characterization of nodes, which can be formalized as

where

Definition 5 (Spatial-temporal relation graph) A spatial-temporal relation graph

The spatial-temporal diagram is shown in

Considering the influence of each node on the traffic flow and the interaction between different nodes, the attention mechanism is added to measure the importance of input features of different nodes and input features at different times.

Bilinear attention is used to design the score function, which is given as

where

Taking the spatial attention weight matrix of the adjacent-information branch as an example, and based on the spatial correlation matrix, the expression of the spatial attention weight between node

The spatial attention weight matrices of the other two branches

In this paper, SGCN is mainly applied to capture not only the spatial relationships in the road network structure but also the temporal relationships and the spatial-temporal correlations given by the sequence similarity between nodes. Once the spatial-temporal graph is established, the spatial features of its nodes are extracted by graph convolution. The entire graph is represented by its corresponding Laplacian matrix, and the Laplacian matrix L of the spatial-temporal graph is defined as follows:

where

Performing the eigenvalue decomposition on L, then

where

The first-order Chebyshev polynomial is used to approximately reduce the

where

Using the above graph convolution calculation method, combined with the spatial attention weight matrix in

where

In the spatial dimension, the graph convolution captures the neighboring information of each node on the graph. Next, a standard convolution layer is stacked in the temporal dimension, and then the information on neighboring time slices is merged to update the signal of the node.
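How a first-order graph convolution aggregates neighboring information can be sketched as follows. This is a minimal illustration that assumes the widely used renormalized form, in which self-loops are added and the adjacency matrix is symmetrically normalized by node degree. It omits the learnable weights and the spatial attention matrix, and it is not claimed to be the exact operator of this paper; all names are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def graph_conv(adj, features):
    """Aggregate each node's own and neighboring features (no learnable weights)."""
    a_norm = normalized_adjacency(adj)
    n = len(adj)
    n_feat = len(features[0])
    return [[sum(a_norm[i][k] * features[k][c] for k in range(n))
             for c in range(n_feat)]
            for i in range(n)]


if __name__ == "__main__":
    # Path graph 0 - 1 - 2; only node 0 carries a signal.
    adj = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
    x = [[1.0], [0.0], [0.0]]
    print(graph_conv(adj, x))  # node 1 receives part of node 0's signal, node 2 none
```

After one such layer a node only sees its direct neighbors; stacking layers, or using higher-order polynomial terms, widens the spatial neighborhood.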

For instance, the operation on the layer in the recent component is calculated as

where

In each branch, the temporal attention weight matrix is introduced separately, and the feature extraction of the temporal dimension is realized by combining two-dimensional convolution.

Through the convolution of the time dimension, the output of 3 branches can be obtained:

Finally, a fully connected layer with ReLU as the activation function is used to ensure that the output of each component has the same size and shape as the prediction target.

Because the three components influence the prediction to different degrees, their outputs are combined by linearly weighted fusion to obtain the predicted flow. The fused prediction is defined as

where
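The linearly weighted fusion of the three branch outputs can be sketched as below. The sketch assumes element-wise weights with the same shape as each branch output (nodes × prediction steps); whether the paper uses element-wise or scalar weights is not stated here, and the function name and the numbers are hypothetical.

```python
def fuse(branch_outputs, branch_weights):
    """Element-wise weighted sum of branch predictions.

    branch_outputs: list of matrices, one per branch, each [n_nodes][n_steps].
    branch_weights: list of weight matrices with the same shapes (assumed
    to be learned during training; fixed here for illustration).
    """
    n_nodes = len(branch_outputs[0])
    n_steps = len(branch_outputs[0][0])
    fused = [[0.0] * n_steps for _ in range(n_nodes)]
    for y, w in zip(branch_outputs, branch_weights):
        for i in range(n_nodes):
            for t in range(n_steps):
                fused[i][t] += w[i][t] * y[i][t]
    return fused


if __name__ == "__main__":
    y_hist, y_recent, y_adj = [[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]
    w_hist, w_recent, w_adj = [[0.2, 0.2]], [[0.3, 0.3]], [[0.5, 0.5]]
    print(fuse([y_hist, y_recent, y_adj], [w_hist, w_recent, w_adj]))
```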

To evaluate the proposed model, comparative experiments were carried out on the PeMSD4 and PeMSD8 data sets. PeMSD4 contains data collected from January to February 2018 at 307 monitoring sites in the San Francisco Bay Area, USA. PeMSD8 contains traffic data from San Bernardino collected from July to August 2016. The data are aggregated into one record every 5 min and include flow, vehicle speed, and lane occupancy. A summary is given in

Data set | PeMSD4 | PeMSD8 |
---|---|---|
Data type | Traffic flow | Traffic flow |
Nodes | 307 | 170 |
Edges | 341 | 195 |
Time steps | 16992 | 17856 |
Features | 3 | 3 |
Sampling | 5 min | 5 min |

To eliminate the adverse effects of very large or very small traffic volumes on the overall prediction, this article standardizes the data with the Z-score method, so that the standardized data have zero mean and unit variance.
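A minimal sketch of Z-score standardization follows. It assumes that the mean and standard deviation are estimated on the training portion only and then reused for the other splits, which is common practice but not stated in the text; the function names are hypothetical.

```python
import math


def zscore_fit(values):
    """Return the mean and (population) standard deviation of a sequence."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std


def zscore_apply(values, mean, std):
    """Standardize values with previously fitted statistics."""
    if std == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    train_flow = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    mean, std = zscore_fit(train_flow)
    print(mean, std)
    print(zscore_apply(train_flow, mean, std))
```

The standardized values are centered on zero and are not confined to a fixed interval, so predictions are mapped back with `x * std + mean` before the errors are computed.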

The mean absolute error (MAE) function and the root mean square error (RMSE) function are used as evaluation functions. MAE can reflect the actual error of the predicted value, while RMSE reflects the square root of the arithmetic mean of the squared error.

The specific formula is defined as
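In their standard form, the two metrics can be computed as in the following minimal sketch. The function and variable names are illustrative and do not come from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |y_true - y_pred|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: the square root of the mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


if __name__ == "__main__":
    observed = [3.0, 5.0, 2.0]
    predicted = [2.0, 5.0, 4.0]
    print(mae(observed, predicted), rmse(observed, predicted))
```

Because the errors are squared before averaging, RMSE penalizes large deviations more heavily than MAE and is never smaller than MAE on the same data.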

The experimental data set is divided into a training set, a validation set, and a test set in a ratio of 6:2:2. In this paper, 32 convolution kernels of size 1 × 3 are employed in the graph convolution and the temporal convolution, the prediction time step c is 12, the learning rate is set to 0.001, the batch size is set to 64, the mean square error is used as the loss function, Adam is used as the optimizer, and the number of training iterations is set to 100. For each of the three spatial-temporal blocks, we consider 12 historical data:

In this paper, dynamic programming is used to find the minimum total matching cost recursively. When constructing the temporal graph, a threshold is defined to establish the temporal correlation between nodes. The threshold value affects the prediction performance of the whole network. Therefore, different temporal graphs are built with different thresholds and the network is trained multiple times to find the threshold that gives the best prediction performance, as shown in

When the threshold is set to 3200, the number of neighboring nodes of each node in the temporal graph is around five. If the threshold is set too low, each node has few neighbors and the associations in the time dimension cannot be effectively extracted. If it is set too high, each node has too many neighbors, which results in poor overall prediction.
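The dynamic-programming recursion and the thresholding step described above can be sketched as follows. This is a minimal illustration that assumes the absolute difference as the pointwise distance, the three standard DTW moves, and an edge whenever the DTW cost does not exceed the threshold. The function names are hypothetical, and the toy threshold is unrelated to the value 3200 used in the experiments.

```python
import math


def dtw_cost(a, b):
    """Minimum total matching cost between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j]: minimum cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # pointwise distance
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def temporal_adjacency(series, threshold):
    """Build a 0/1 matrix linking nodes whose DTW cost is within the threshold."""
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dtw_cost(series[i], series[j]) <= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


if __name__ == "__main__":
    flows = [
        [1.0, 2.0, 3.0],
        [1.0, 2.0, 2.0, 3.0],   # same shape as node 0, stretched in time
        [10.0, 10.0, 10.0],     # very different flow level
    ]
    print(temporal_adjacency(flows, threshold=1.0))
```

Raising the threshold adds edges and lowering it removes them, which matches the trade-off described above between too few and too many neighbors.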

The PyTorch 1.7.2 framework was used to implement the model and run the experiments. The models were trained and evaluated on an NVIDIA GeForce RTX 3090 with 16 GB of memory. Other results refer to ASTGCN [

To evaluate the prediction performance of the model proposed in this paper, the following models are selected for comparative analysis.

HA: Predict the traffic value for the next time stamp based on the average traffic flow in the past 1 h.

VAR: The kernel function selected in this article is the radial basis function, with the kernel coefficient set to 0.1.

ARIMA: A classic model for time sequence forecasting analysis, which is simple and does not require other variables.

STGCN: A model based on spatial methods for spatial-temporal data analysis.

ASTGCN: The model fully considers the periodic characteristics of time, and combines graph convolution operation to extract spatial-temporal networks with spatial characteristics.
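As an illustration of the simplest baseline in the list above, HA can be sketched as the mean of the most recent observations. The sketch assumes a window of 12 steps, which corresponds to 1 h at the 5 min sampling interval; the function name is hypothetical.

```python
def historical_average(history, window=12):
    """Predict the next value as the mean of the last `window` observations."""
    if not history:
        raise ValueError("history must contain at least one observation")
    recent = history[-window:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    # 24 observations at 5 min intervals; only the last 12 (1 h) are averaged.
    flow = [float(v) for v in range(1, 25)]
    print(historical_average(flow))
```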

We used RMSE and MAE to evaluate the proposed model on the PeMSD4 and PeMSD8 data sets. The model presented in this paper achieves lower MAE and RMSE than all other comparison models, as shown in

Dataset | Evaluation metrics | HA | VAR | ARIMA | STGCN | ASTGCN | TSADGCN |
---|---|---|---|---|---|---|---|
PeMSD4 | MAE | 36.76 | 33.76 | 32.11 | 27.28 | 23.29 | 22.82 |
PeMSD4 | RMSE | 54.14 | 51.73 | 68.13 | 38.41 | 36.88 | 35.34 |
PeMSD8 | MAE | 29.52 | 21.41 | 24.04 | 20.99 | 17.95 | 17.73 |
PeMSD8 | RMSE | 44.03 | 31.21 | 43.30 | 30.78 | 27.35 | 27.17 |

From

As can be seen from

As can be seen from

Compared with most models, TSADGCN has certain differences in the traffic flow prediction results of different nodes.

Node | MAE | RMSE |
---|---|---|
0 | 29.26 | 14 |
1 | 27.26 | 16.67 |
… | … | … |
137 | 43.45 | 31.08 |
196 | 6.29 | 10.58 |
306 | 14.28 | 11.58 |

Combining

To further understand the influence of each module in TSADGCN, ablation comparison experiments are conducted. Based on the TSADGCN model developed in this paper, three variants are designed, as shown below. (1) TSADGCN-T: Remove the TAL module. (2) TSADGCN-R: Remove the residual network. (3) TSADGCN-G: Remove the DGCN module.

It can be seen from

Method | Training (s/epoch) | Inference |
---|---|---|
STGCN | 29.26 | 14 |
ASTGCN | 27.26 | 16.67 |
TSADGCN | 43.45 | 31.08 |

The model presented in this paper is based on temporal graphs, spatial graphs, and spatial-temporal graphs, and adds GRU and self-attention layers, which increases the complexity of the model to some extent. Because each graph is processed independently of the others, the complexity of the model can be reduced.

This paper proposes TSADGCN, which combines a graph convolution network and DGCN to build spatial-temporal convolutional blocks and capture the spatial-temporal features of traffic data simultaneously. The DTW algorithm is used to measure the similarity of time series. Compared with HA, VAR, ARIMA, STGCN, and ASTGCN, TSADGCN fully considers the spatial-temporal characteristics of traffic flow and their correlation, and its prediction accuracy is higher than that of the other models. In future work, we will further verify the multi-scale information of traffic flow forecasting.

The authors would like to express their gratitude for the valuable feedback and suggestions provided by all the anonymous reviewers and the editorial team.

This work was supported by the National Natural Science Foundation of China (Grant: 62176086).

Conceptualization, Yunchang Liu; Data curation, Fei Wan; Formal analysis, Yunchang Liu; Methodology, Yunchang Liu and Chengwu Liang; Software, Yunchang Liu and Fei Wan; Writing–original draft, Yunchang Liu and Chengwu Liang. All authors reviewed the results and approved the final version of the manuscript.

The training data used in this paper were obtained from the Caltrans performance measurement system. Available online via the following link:

The authors declare that they have no conflicts of interest to report regarding the present study.