# Spatiotemporal Graph Convolutional Network for Multi-Scale Traffic Forecasting


## Abstract


## 1. Introduction

- To optimize feature extraction, we propose a novel spatiotemporal graph neural network model that simultaneously considers temporal periodicity, multi-scale spatiotemporal features, the connection method, and node pattern embedding.
- Based on Res2Net, we design hierarchical temporal attention layers and hierarchical adaptive graph convolution to learn multi-scale spatiotemporal features. To the best of our knowledge, this is the first study to apply the idea of Res2Net to spatiotemporal graph neural networks for traffic forecasting.
- Systematic experiments compare our approach with existing state-of-the-art methods on two publicly available real-world traffic volume datasets. The results show that our model achieves high accuracy, outperforming existing methods by up to 9.4%.

## 2. Related Work

#### 2.1. Short-Term Traffic Volume Forecasting Models

#### 2.2. Spatiotemporal Graph Neural Network for Traffic Forecasting

## 3. Methodology

#### 3.1. Preliminary

#### 3.2. Design of TRes2GCN

- Through a multi-component approach, our model learns and fuses spatiotemporal features from different time periods and explores vehicle travel patterns.
- The TRes2GCN submodule connects spatiotemporal blocks in the manner of DenseNet, which increases feature width by fusing spatiotemporal features of different levels and effectively mitigates network degradation and oversmoothing.
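As a rough sketch, the DenseNet-style connection of spatiotemporal blocks described above can be written as follows. The feature-axis concatenation used here is an assumption for illustration; the paper's exact fusion details may differ.

```python
import numpy as np

def dense_connect(x, blocks):
    """DenseNet-style stacking of blocks (sketch).

    Each block receives the concatenation of the original input and all
    previous block outputs, widening the features and easing gradient
    flow, which helps against degradation and oversmoothing.
    Concatenating along the last axis is an illustrative assumption.
    """
    feats = [x]
    for block in blocks:
        feats.append(block(np.concatenate(feats, axis=-1)))
    return np.concatenate(feats, axis=-1)
```

Each later block thus sees every earlier feature map directly, rather than only the previous block's output as in a plain ResNet chain.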

#### 3.3. Multiple Temporal Periods

Suppose the current moment is $T_{0}$, the forecast window size is $T_{p}$, and the number of samples per day is $q$. We intercept three time series, $T_{h}$, $T_{d}$ and $T_{w}$, on the time axis as the time-component inputs for the recent, daily and weekly periods, where $T_{h}$, $T_{d}$ and $T_{w}$ are of the same length as $T_{p}$. The different time periods of the input are illustrated in Figure 1a. The details of these three periods are described as follows:

- (1) Recent period: the historical data immediately preceding the forecast window, denoted as ${X}_{h}=\left({X}_{{T}_{0}-{T}_{h}+1},{X}_{{T}_{0}-{T}_{h}+2},\dots ,{X}_{{T}_{0}}\right)\in {R}^{N\times F\times {T}_{h}}$. Since sudden changes in traffic flow have precursors, the segment nearest to the forecast window is particularly important. The specific slice is shown in Figure 1a; the green segment, relative to the black, is the recent period.
- (2) Daily period: the historical data from one day earlier at the same time of day as the forecast segment, denoted as ${X}_{d}=\left({X}_{{T}_{0}-q+1},{X}_{{T}_{0}-q+2},\dots ,{X}_{{T}_{0}-q+{T}_{p}}\right)\in {R}^{N\times F\times {T}_{d}}$. It covers the same time interval as the forecast period on the previous day. Traffic data tend to repeat parts of the same pattern over time; for example, every weekday has a morning peak and an evening peak. We therefore include this segment in the joint input to capture the similar characteristics of the daily period. The specific slice is shown in Figure 1a; the orange segment, relative to the black, is the daily period.
- (3) Weekly period: the historical data from one week earlier at the same time of day as the forecast segment, denoted as ${X}_{w}=\left({X}_{{T}_{0}-7\ast q+1},{X}_{{T}_{0}-7\ast q+2},\dots ,{X}_{{T}_{0}-7\ast q+{T}_{p}}\right)\in {R}^{N\times F\times {T}_{w}}$. It covers the same time interval as the forecast period in the previous week, for the same reason as the daily period. For example, this Friday's flow pattern is very similar to next Friday's, but differs somewhat from the weekend's. We therefore use it to capture the similar characteristics of the weekly period. The specific slice is shown in Figure 1a; the blue segment, relative to the black, is the weekly period.

#### 3.4. Spatiotemporal Feature Capture Method

**Temporal Attention Layer (TAL).** Traffic flow is dynamic and correlated in the time dimension, and such dynamic features cannot be learned by an ordinary CNN or RNN. Therefore, this paper uses the attention mechanism from the NLP domain [42] to focus on the importance of information at different time steps within a complete period and to assign greater weights to informative time points [30]. This not only adds dynamic, adaptive temporal relevance to our model, but also expands the receptive field, enabling longer-horizon prediction. Its mathematical formula is shown as follows:
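As an illustration of the mechanism, here is a minimal NumPy sketch of a multiplicative temporal attention: pairwise affinities between time steps are pooled over nodes, softmax-normalized, and used to reweight the sequence. The single-matrix parameterisation `W` is an assumption for brevity; the paper's exact factorisation may differ.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention(X, W):
    """Reweight time steps by learned pairwise affinities.

    X : (N, F, T) node signals; W : (F, F) learned parameter
    (hypothetical minimal parameterisation). Each output step is a
    convex combination of all input steps, so the temporal receptive
    field spans the whole window.
    """
    Q = np.einsum('nft,fg->ngt', X, W)       # project features
    S = np.einsum('ngt,ngs->ts', Q, X)       # (T, T) affinity, pooled over nodes
    A = softmax(S, axis=-1)                  # normalise over source steps
    return np.einsum('nfs,ts->nft', X, A)    # mix time steps
```

Because each row of `A` sums to one, a signal that is constant in time passes through unchanged, while time steps with strong affinities receive larger weights.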

**Hierarchical temporal attention layer (Res2TAL).** Res2Net [43] has proven its ability in computer vision (Euclidean convolution) to divide a single convolution into multiple interrelated convolutions, allowing multi-scale feature information to be extracted from a fine-grained perspective. We build on the temporal attention layer by deploying the Res2Net framework with the residual structure removed, thus proposing a hierarchical temporal attention layer, as shown in Figure 2b. The hierarchical temporal attention layer not only focuses on the importance of each moment, but also learns the similarity of temporal trends over a period of time; both temporal value weights and temporal trend changes are thus learned at a finer granularity. Moreover, under specific settings, Res2Net can be used without increasing the number of parameters, at the cost of more training time. Therefore, Res2TAL enables us to extract temporal feature information at different scales without increasing the parameter count. Its mathematical formula is shown as follows:
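The Res2Net-style layering itself can be sketched independently of the attention operator: the features are split into scale groups, and each group is processed together with the previous scale's output, so the effective receptive field grows with the scale index. The per-scale operators below are stand-ins for the per-scale temporal attention layers; the split axis is an assumption.

```python
import numpy as np

def res2_split(X, ops):
    """Res2Net-style hierarchical split along the feature axis.

    X   : (N, F, T) input, with F divisible by len(ops) + 1
    ops : len(ops) callables, one per higher scale, each mapping a
          group of F // (len(ops) + 1) features to the same shape.
    The first group passes through untouched, as in Res2Net.
    """
    s = len(ops) + 1
    chunks = np.split(X, s, axis=1)          # s groups of F // s features
    outs, prev = [chunks[0]], chunks[0]
    for chunk, op in zip(chunks[1:], ops):
        prev = op(chunk + prev)              # fuse with previous scale
        outs.append(prev)
    return np.concatenate(outs, axis=1)      # restore (N, F, T)
```

With scale $s$, each per-scale operator works on $1/s$ of the channels, which is why the total parameter count need not grow.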

**Adaptive adjacency matrix generation.** In the spatial dimension, traffic patterns differ across nodes, which strongly affects traffic prediction. Current graph-construction approaches mostly start from individual attributes (e.g., road network, points of interest (POI), traffic similarity), which are highly interpretable but do not contain complete spatial dependency information. In addition, most approaches require a predefined graph, and the models cannot work when it is missing. To address these problems, we apply a node embedding approach to construct an adaptive adjacency matrix [29,40]. The formula is shown as follows:
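A minimal sketch of this construction, using the ReLU(EE^T) form with E ∈ R^{N×D} that Table 4 identifies as the variant used, followed by row-wise softmax normalisation as in [29,40]:

```python
import numpy as np

def adaptive_adjacency(E):
    """Self-adaptive adjacency matrix from learnable node embeddings.

    E : (N, D) node-embedding matrix, trained end to end.
    Returns the row-normalised (N, N) matrix softmax(ReLU(E @ E.T)):
    no predefined graph is needed, since similarity of learned
    embeddings stands in for spatial dependency.
    """
    S = np.maximum(E @ E.T, 0.0)              # ReLU of embedding similarity
    S = S - S.max(axis=1, keepdims=True)      # stable softmax
    P = np.exp(S)
    return P / P.sum(axis=1, keepdims=True)   # rows sum to 1
```

Because `E` is a trained parameter rather than fixed road-network data, the resulting graph can encode dependencies (e.g., similar daily profiles) that a distance-based adjacency misses.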

**Adaptive Graph Convolution Layer (AGCN).** Since the convolution kernel of a GCN is shared across nodes, it can capture the dominant traffic patterns of the whole traffic graph, but it struggles to learn the variability among nodes and the traffic patterns of individual nodes in a fine-grained manner. AGCN [40] accomplishes fine-grained feature capture of nodes without requiring known node attribute data. It adds the node embedding parameter from the adjacency matrix above to the GCN; this parameter helps train the adaptive adjacency matrix while influencing the learned parameters during graph convolution and scaling each node's features differently. It therefore accounts for node pattern differences, yielding dynamic, adaptive spatial features at fine granularity. Its equation is shown as follows:
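A sketch of the node-adaptive idea, modeled after the AGCRN-style layer [40]: each node draws its own convolution weights from a shared weight pool via the embedding matrix, instead of all nodes sharing one kernel. The pool shapes and single-step input are illustrative assumptions.

```python
import numpy as np

def agcn_layer(X, E, W_pool, b_pool):
    """Adaptive graph convolution with node-adaptive parameters (sketch).

    X      : (N, F_in) node features at one time step
    X, E   : E is the (N, D) node embedding, shared with the
             adaptive adjacency construction
    W_pool : (D, F_in, F_out) weight pool; each node derives its own
             weights as E[n] @ W_pool, so kernels are not shared
    b_pool : (D, F_out) bias pool
    """
    S = np.maximum(E @ E.T, 0.0)                  # softmax(ReLU(E E^T))
    S = np.exp(S - S.max(axis=1, keepdims=True))
    A = S / S.sum(axis=1, keepdims=True)
    H = (np.eye(len(E)) + A) @ X                  # aggregate with self-loop
    W = np.einsum('nd,dio->nio', E, W_pool)       # per-node weights
    b = E @ b_pool                                # per-node bias
    return np.einsum('ni,nio->no', H, W) + b
```

The embedding `E` thus plays a double role: it generates the adjacency and it indexes the weight pool, which is what lets the layer scale each node's features differently.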

**Hierarchical Adaptive Graph Convolution Layer (Res2GCN).** Res2Net [43] has been verified to extract multi-scale features well in Euclidean space. We apply its construction idea to AGCN to propose a hierarchical adaptive graph convolution layer, which likewise captures multi-scale spatial features of traffic flow (non-Euclidean space) at a finer granularity; the internal operation is shown in Figure 2c. The multi-scale features of Res2GCN not only enlarge the receptive field of the convolution, but also alleviate, to a certain extent, the performance limitations caused by the fixed and shared GCN kernel, because the AGCNs at different scales do not share parameters. On the other hand, since the AGCNs at different scales use the same embedding vector during back propagation, the adjacency matrix also fuses node patterns learned at multiple scales. In a later section, we verify that different adjacency matrix constructions with node embedding parameters have different effects on Res2GCN. The equation is described as follows:
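Combining the two ideas, a layering-only sketch of Res2GCN: features are split into scales, each scale runs its own graph convolution (non-shared parameters) on its chunk plus the previous scale's output, while a single adjacency built from the shared embedding `E` serves every scale, so `E` receives gradients from all scales. The per-scale callables are stand-ins for full AGCN layers.

```python
import numpy as np

def res2gcn(X, E, convs):
    """Hierarchical adaptive graph convolution (layering sketch).

    X     : (N, F) node features, F divisible by len(convs)
    E     : (N, D) node embeddings shared by every scale, so back-
            propagation fuses node patterns from all scales into E
    convs : one callable conv(H, A) -> H' per scale, each with its
            own (non-shared) parameters
    """
    # one self-adaptive adjacency, softmax(ReLU(E E^T)), reused by all scales
    S = np.maximum(E @ E.T, 0.0)
    S = np.exp(S - S.max(axis=1, keepdims=True))
    A = S / S.sum(axis=1, keepdims=True)

    chunks = np.split(X, len(convs), axis=1)
    outs, prev = [], None
    for chunk, conv in zip(chunks, convs):
        h = chunk if prev is None else chunk + prev  # fuse previous scale
        prev = conv(h, A)
        outs.append(prev)
    return np.concatenate(outs, axis=1)
```

Chaining the scales this way is what enlarges the receptive field: the last scale's output has passed through every per-scale convolution once.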

#### 3.5. Dense Connection

#### 3.6. Multi-Component Fusion

## 4. Experiment and Result

#### 4.1. Datasets

#### 4.2. Baseline Methods

- **VAR**: Vector Auto-Regression, a forecasting model that captures spatiotemporal dependencies in traffic data.
- **SVR**: Support Vector Regression, which utilizes a support vector machine to perform regression.
- **LSTM**: The long short-term memory network, a variant of the RNN that better handles time-series tasks.
- **DCRNN**: The diffusion convolutional recurrent neural network, an encoder–decoder framework that uses diffusion graph convolution to obtain spatial features and Seq2Seq to encode temporal information.
- **STGCN**: The spatiotemporal graph convolutional network, which uses ChebNet to capture spatial correlation and a CNN with a gating mechanism to capture temporal correlation.
- **MSTGCN**: The multi-component spatiotemporal graph convolutional network, which extracts and fuses spatiotemporal information from different time periods by modeling different temporal patterns; it obtains temporal features with CNNs and spatial features with ChebNet.
- **ASTGCN**: The attention-based spatiotemporal graph convolutional network, which adds temporal and spatial attention to MSTGCN to extract dynamic spatiotemporal information.
- **Graph WaveNet**: Combines GCN and dilated convolution to obtain spatial and temporal correlations separately, and utilizes node embedding to adaptively learn the adjacency matrix from the data.
- **STSGCN**: The spatiotemporal synchronous graph convolutional network, which employs GCN to construct spatiotemporal synchronous convolutional blocks, capturing temporal and spatial correlations simultaneously by stacking these modules.
- **AGCRN**: The adaptive graph convolutional recurrent network, which proposes a novel adaptive graph convolution to capture fine-grained spatial features and embeds it into a GRU to capture temporal features.

#### 4.3. Experiment Settings

#### 4.4. Results Comparison

## 5. Discussion

#### 5.1. Influence of Connection Method and Multi-Scale Feature

- (1) Two Blocks-ResNet: This model is the basis of our study. It consists of two spatiotemporal blocks stacked in the currently most common ResNet structure. Each spatiotemporal block contains no hierarchical structure, i.e., only one TAL layer and one AGCN layer.
- (2) Two Blocks-DenseNet: This model is based on the first variant, with the ResNet structure replaced by the DenseNet structure.
- (3) Three Blocks-ResNet: This model is based on the first variant, with one more spatiotemporal block stacked and the rest unchanged.
- (4) Three Blocks-DenseNet: This model is based on the third variant, with the ResNet structure replaced by the DenseNet structure.
- (5) Four Blocks-ResNet: This model is based on the third variant, with one more spatiotemporal block stacked and the rest unchanged.
- (6) Four Blocks-DenseNet: This model is based on the fifth variant, with the ResNet structure replaced by the DenseNet structure.
- (7) Four Blocks-DenseNet + Res2GCN: This model is based on the sixth variant, with the addition of hierarchical adaptive graph convolution (Res2GCN).
- (8) TRes2GCN: The full version of TRes2GCN. It adds the hierarchical temporal attention layer on top of the seventh variant.

#### 5.2. Influence of Node Embedding Vector and Adaptive Adjacency Matrix

#### 5.3. Forecasting Capability over Different Spans

#### 5.4. Temporal Periodic Analysis

#### 5.5. Spatial Correlation Analysis

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Ahmed, M.S.; Cook, A.R. Analysis of Freeway Traffic Time-Series Data by Using Box-Jenkins Techniques. Transp. Res. Rec. 1979, 722, 1–9.
2. Chen, P.; Ding, C.; Lu, G.; Wang, Y. Short-term traffic states forecasting considering spatial–temporal impact on an urban expressway. Transp. Res. Rec. 2016, 2594, 61–72.
3. Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222.
4. Li, H.; Liu, J.; Liu, R.W.; Xiong, N.; Wu, K.; Kim, T.-H. A dimensionality reduction-based multi-step clustering method for robust vessel trajectory analysis. Sensors 2017, 17, 1792.
5. Aqib, M.; Mehmood, R.; Alzahrani, A.; Katib, I.; Albeshri, A.; Altowaijri, S.M. Smarter traffic prediction using big data, in-memory computing, deep learning and GPUs. Sensors 2019, 19, 2206.
6. Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
7. Van Lint, J.; Hoogendoorn, S.; van Zuylen, H.J. Freeway travel time prediction with state-space neural networks: Modeling state-space dynamics with recurrent neural networks. Transp. Res. Rec. 2002, 1811, 30–39.
8. Tian, Y.; Pan, L. Predicting short-term traffic flow by long short-term memory recurrent neural network. In Proceedings of the 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity), Chengdu, China, 21 December 2015; pp. 153–158.
9. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. In Proceedings of the NIPS 2014 Workshop on Deep Learning, Montréal, QC, Canada, 13 December 2014.
10. Sato, R. A survey on the expressive power of graph neural networks. arXiv 2020, arXiv:2003.04078.
11. Ye, J.; Zhao, J.; Ye, K.; Xu, C. How to build a graph-based deep learning architecture in traffic domain: A survey. IEEE Trans. Intell. Transp. Syst. 2020, 1–21.
12. Jiang, W.; Luo, J. Graph neural network for traffic forecasting: A survey. arXiv 2021, arXiv:2101.11174.
13. Liu, J.; Guan, W. A summary of traffic flow forecasting methods. J. Highw. Transp. Res. Dev. 2004, 3, 82–85.
14. Li, W.; Wang, J.; Fan, R.; Zhang, Y.; Guo, Q.; Siddique, C.; Ban, X.J. Short-term traffic state prediction from latent structures: Accuracy vs. efficiency. Transp. Res. Part C Emerg. Technol. 2020, 111, 72–90.
15. Kumar, K.; Parida, M.; Katiyar, V. Short term traffic flow prediction for a non urban highway using artificial neural network. Procedia-Soc. Behav. Sci. 2013, 104, 755–764.
16. Wu, Y.; Tan, H. Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework. arXiv 2016, arXiv:1612.01022.
17. Liu, Y.; Zheng, H.; Feng, X.; Chen, Z. Short-term traffic flow prediction with Conv-LSTM. In Proceedings of the 2017 9th International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, 11–13 October 2017; pp. 1–6.
18. Zhang, J.; Zheng, Y.; Qi, D.; Li, R.; Yi, X.; Li, T. Predicting citywide crowd flows using deep spatio-temporal residual networks. Artif. Intell. 2018, 259, 147–166.
19. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
20. Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. Adv. Neural Inf. Processing Syst. 2016, 29, 3844–3852.
21. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
22. Atwood, J.; Towsley, D. Diffusion-convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 4–9 December 2016; pp. 1993–2001.
23. Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 2019, 21, 3848–3858.
24. Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 1–3 May 2018.
25. Cui, Z.; Henrickson, K.; Ke, R.; Wang, Y. Traffic graph convolutional recurrent neural network: A deep learning framework for network-scale traffic learning and forecasting. IEEE Trans. Intell. Transp. Syst. 2019, 21, 4883–4894.
26. Guo, K.; Hu, Y.; Qian, Z.; Liu, H.; Zhang, K.; Sun, Y.; Gao, J.; Yin, B. Optimized graph convolution recurrent neural network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 2020, 22, 1138–1149.
27. Bai, J.; Zhu, J.; Song, Y.; Zhao, L.; Hou, Z.; Du, R.; Li, H. A3T-GCN: Attention temporal graph convolutional network for traffic forecasting. ISPRS Int. J. Geo-Inf. 2021, 10, 485.
28. Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 3634–3640.
29. Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Zhang, C. Graph WaveNet for Deep Spatial-Temporal Graph Modeling. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China, 10–16 August 2019.
30. Guo, S.; Lin, Y.; Feng, N.; Song, C.; Wan, H. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January 2019; pp. 922–929.
31. Zheng, C.; Fan, X.; Wang, C.; Qi, J. GMAN: A graph multi-attention network for traffic prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 1234–1241.
32. Guo, K.; Hu, Y.; Qian, Z.; Sun, Y.; Gao, J.; Yin, B. Dynamic graph convolution network for traffic forecasting based on latent network of Laplace matrix estimation. IEEE Trans. Intell. Transp. Syst. 2020, 1–10.
33. Hu, J.; Chen, L. Multi-Attention Based Spatial-Temporal Graph Convolution Networks for Traffic Flow Forecasting. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Online, 18–22 July 2021; pp. 1–7.
34. Liang, Y.; Zhao, Z.; Sun, L. Dynamic spatiotemporal graph convolutional neural networks for traffic data imputation with complex missing patterns. arXiv 2021, arXiv:2109.08357.
35. Bai, L.; Yao, L.; Wang, X.; Li, C.; Zhang, X. Deep spatial–temporal sequence modeling for multi-step passenger demand prediction. Future Gener. Comput. Syst. 2021, 121, 25–34.
36. Zhu, J.; Wang, Q.; Tao, C.; Deng, H.; Zhao, L.; Li, H. AST-GCN: Attribute-Augmented Spatiotemporal Graph Convolutional Network for Traffic Forecasting. IEEE Access 2021, 9, 35973–35983.
37. Yu, B.; Yin, H.; Zhu, Z. ST-UNet: A spatio-temporal U-network for graph-structured time series modeling. arXiv 2019, arXiv:1903.05631.
38. Song, C.; Lin, Y.; Guo, S.; Wan, H. Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 914–921.
39. Guo, K.; Hu, Y.; Sun, Y.; Qian, S.; Gao, J.; Yin, B. Hierarchical Graph Convolution Networks for Traffic Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2021; pp. 151–159.
40. Bai, L.; Yao, L.; Li, C.; Wang, X.; Wang, C. Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting. arXiv 2020, arXiv:2007.02842.
41. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
42. Feng, X.; Guo, J.; Qin, B.; Liu, T.; Liu, Y. Effective deep memory networks for distant supervised relation extraction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 4002–4008.
43. Gao, S.; Cheng, M.-M.; Zhao, K.; Zhang, X.-Y.; Yang, M.-H.; Torr, P.H. Res2Net: A new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 652–662.
44. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
45. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
46. Chen, C.; Petty, K.; Skabardonis, A.; Varaiya, P.; Jia, Z. Freeway performance measurement system: Mining loop detector data. Transp. Res. Rec. 2001, 1748, 96–102.
47. Huber, P.J. Robust estimation of a location parameter. In Breakthroughs in Statistics; Springer: Berlin/Heidelberg, Germany, 1992; pp. 492–518.
48. Li, Q.; Han, Z.; Wu, X.-M. Deeper insights into graph convolutional networks for semi-supervised learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
49. Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Chang, X.; Zhang, C. Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, CA, USA, 6–10 July 2020; pp. 753–763.

**Figure 2.** The (**a**) ST Block of TRes2GCN, and the framework of the (**b**) hierarchical temporal attention layer and (**c**) hierarchical adaptive graph convolution layer.

**Figure 3.** Model comparison under different time spans: (**a**) PeMS04-RMSE, (**b**) PeMS04-MAE, (**c**) PeMS04-MAPE, (**d**) PeMS08-RMSE, (**e**) PeMS08-MAE and (**f**) PeMS08-MAPE.

**Figure 4.** Visualization of the impact of components on the prediction error (PeMS08 as an example): (**a**) RMSE, (**b**) MAE and (**c**) MAPE. The error bars represent the fluctuation of the predicted metrics over 60 min; the red, blue, and purple lines represent the outcome metrics predicted at 5 min, 30 min, and 60 min, respectively.

**Figure 6.** Part of the adjacency matrix at different scales and hourly average traffic flow variation (with PeMS08 as an example): (**a**) predefined adjacency matrix; (**b**) self-adaptive adjacency matrix at a single scale; (**c**) self-adaptive adjacency matrix at multiple scales; (**d**) hourly flow variation for the first fifteen nodes; (**e**) hourly traffic variation of nodes 3 and 4; (**f**) hourly traffic variation of nodes 10 and 11.

| Dataset | Nodes (Sensors) | Edges | Time Steps | Time Range | Data Range (Per Time Period) | Average (Per Time Period) |
|---|---|---|---|---|---|---|
| PeMS04 | 307 | 341 | 16,992 | 1/1/2018–2/28/2018 | 0–919 | 91.74 |
| PeMS08 | 170 | 295 | 17,856 | 7/1/2016–8/31/2016 | 0–1147 | 98.17 |

| Dataset | Evaluation Metric | VAR | SVR | LSTM | DCRNN | STGCN | MSTGCN | ASTGCN | Graph WaveNet | STSGCN | AGCRN | TRes2GCN (ours) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PeMS04 | RMSE | 36.66 | 44.59 | 40.74 | 37.12 | 37.07 | 35.48 | 34.50 | 39.70 | 33.83 | 32.26 | 31.98 |
| | MAE | 23.75 | 28.66 | 26.81 | 23.65 | 24.43 | 22.65 | 21.97 | 25.45 | 21.08 | 19.83 | 19.62 |
| | MAPE (%) | 18.09 | 19.15 | 22.33 | 16.05 | 18.34 | 16.32 | 15.47 | 17.29 | 13.88 | 12.97 | 12.96 |
| PeMS08 | RMSE | 33.83 | 36.15 | 33.59 | 28.29 | 30.11 | 28.27 | 26.91 | 31.05 | 26.83 | 25.22 | 24.52 |
| | MAE | 22.32 | 23.25 | 22.19 | 18.22 | 19.95 | 18.54 | 17.37 | 19.13 | 17.10 | 15.95 | 14.45 |
| | MAPE (%) | 14.47 | 14.71 | 18.74 | 11.56 | 14.27 | 13.04 | 12.28 | 12.68 | 10.90 | 10.09 | 9.50 |
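For reference, the three evaluation metrics reported above are commonly computed as follows; masking zero ground-truth values in MAPE is a common convention in this literature, and the paper may handle zeros differently.

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    """Mean absolute percentage error, in percent."""
    mask = y != 0                 # avoid division by zero flows
    return float(np.mean(np.abs((y - yhat)[mask] / y[mask])) * 100)
```

RMSE penalizes large errors more heavily than MAE, while MAPE normalizes by the ground truth, which is why the three columns can rank models slightly differently.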

| Res2GCN Scale | Res2GCN Width | Res2TAL Scale | Res2TAL Width | With SE-Block | Average RMSE | Average MAE | Average MAPE (%) |
|---|---|---|---|---|---|---|---|
| 4 | 26 | 4 | 26 | No | 24.88 | 14.63 | 9.56 |
| 4 | 26 | 3 | 26 | No | 24.52 | 14.45 | 9.50 |
| 3 | 26 | 3 | 26 | No | 24.68 | 14.57 | 9.64 |
| 4 | 26 | 3 | 26 | Yes | 25.52 | 15.13 | 10.82 |

**Table 4.**Influence of the adaptive adjacency matrix construction method on prediction (with PeMS08 as an example).

| Method of Generating Self-Adaptive Adjacency Matrix | Node Embedding Vector | Average RMSE | Average MAE | Average MAPE (%) |
|---|---|---|---|---|
| Without self-adaptive adjacency matrix | Without embedding | 26.97 | 15.91 | 11.82 |
| ReLU(EE^{T}) | Without embedding | 26.91 | 15.86 | 11.80 |
| ReLU(EE^{T}) | E∈R^{N×D} | 24.52 | 14.45 | 9.50 |
| ReLU(E_{1}E_{2}) | MLP(E_{1} + E_{2})∈R^{N×D} | 25.43 | 14.84 | 10.62 |
| ReLU(tanh(α(tanh(αMLP(E))·tanh(αMLP(E))^{T}))) | tanh(αMLP(E))∈R^{N×D} | 26.09 | 15.23 | 11.00 |
| ReLU(tanh(α(tanh(αMLP(E_{1}))·tanh(αMLP(E_{2}))^{T}-α(tanh(αMLP(E_{2}))·tanh(αMLP(E_{1}))^{T}))) | MLP(tanh(αMLP(E_{1})) + tanh(αMLP(E_{2})))∈R^{N×D} | 26.42 | 15.61 | 11.18 |

| Recent Period | Daily Period | Weekly Period | With Res2Net Structure | Average RMSE | Average MAE | Average MAPE (%) |
|---|---|---|---|---|---|---|
| Yes | No | No | No | 29.35 | 18.83 | 12.35 |
| Yes | Yes | No | No | 27.52 | 17.58 | 11.44 |
| Yes | No | Yes | No | 26.40 | 15.23 | 10.96 |
| Yes | Yes | Yes | No | 25.01 | 14.78 | 10.50 |
| Yes | No | No | Yes | 26.04 | 16.48 | 10.33 |
| Yes | Yes | Yes | Yes | 24.52 | 14.45 | 9.50 |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, Y.; Jing, C.
Spatiotemporal Graph Convolutional Network for Multi-Scale Traffic Forecasting. *ISPRS Int. J. Geo-Inf.* **2022**, *11*, 102.
https://doi.org/10.3390/ijgi11020102
