# A Novel Trajectory Based Prediction Method for Urban Subway Design

## Abstract

## 1. Introduction

- We employed the weighted LeaderRank algorithm to rank subway stations. It considers the direction of passenger travel and the impact on each station of the dynamic, complex network of links formed by passenger trips;
- We employed a clustering method based on taxi trajectories to predict the locations of subway stations. The experimental results show that our method helps to determine station locations;
- We proposed a method for selecting the best clustering parameters in order to better predict station locations. In addition, we characterized the land use around the predicted sites to identify land-use patterns for potential subway stations.

## 2. Related Work

#### 2.1. Study on Traffic Data

#### 2.2. Ranking Nodes in Complex Networks

#### 2.3. Transport System Planning and Modeling

## 3. Definitions

**Definition 1.** Let $tf=\left\{t{f}_{i}\right\}$ and $mf=\left\{m{f}_{i}\right\}$ be the sets of taxi OD flows and subway passenger flows, respectively. A taxi flow $t{f}_{i}=\{{O}_{i},{D}_{i},T{O}_{i},T{D}_{i}\}$ is a directed flow that moves from the origin ${O}_{i}$ at time $T{O}_{i}$ to the destination ${D}_{i}$ at time $T{D}_{i}$, where ${O}_{i}=(oln{g}_{i},ola{t}_{i})$ and ${D}_{i}=(dln{g}_{i},dla{t}_{i})$ are the coordinates of the origin and the destination, respectively. A subway flow $m{f}_{i}=\{O{S}_{i},D{S}_{i}\}$ is a directed flow from an origin station to a destination station, where $O{S}_{i}$ and $D{S}_{i}$ are the names of the two subway stations.
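The two flow types in Definition 1 map naturally onto small record types. The following is a minimal sketch under stated assumptions: the class and field names (`TaxiFlow`, `SubwayFlow`, `duration_minutes`) and the sample coordinates and times are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple

# (longitude, latitude) in degrees, matching (olng_i, olat_i) / (dlng_i, dlat_i)
Coord = Tuple[float, float]


@dataclass(frozen=True)
class TaxiFlow:
    """One taxi OD flow tf_i = {O_i, D_i, TO_i, TD_i}."""
    origin: Coord            # O_i
    destination: Coord       # D_i
    t_origin: datetime       # TO_i, pick-up time
    t_destination: datetime  # TD_i, drop-off time

    def duration_minutes(self) -> float:
        """Travel time of the flow in minutes (hypothetical helper)."""
        return (self.t_destination - self.t_origin).total_seconds() / 60.0


@dataclass(frozen=True)
class SubwayFlow:
    """One subway passenger flow mf_i = {OS_i, DS_i}."""
    origin_station: str       # OS_i, station name
    destination_station: str  # DS_i, station name


# Illustrative values only; not taken from the data set used in the paper.
trip = TaxiFlow(
    origin=(116.434, 39.941),
    destination=(116.461, 39.909),
    t_origin=datetime(2022, 1, 1, 8, 0),
    t_destination=datetime(2022, 1, 1, 8, 25),
)
ride = SubwayFlow("Dongzhimen", "Guomao")
```

Freezing the records keeps each flow immutable, which makes the flows safe to use as dictionary keys or set members when aggregating them.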

**Definition 2.** There are two main topological descriptions for building complex transportation networks: space L and space P [28]. The space L method treats each traffic station as a graph node; if two stations are adjacent on a traffic line, an edge connects the two nodes. The space P method also treats each station as a graph node, but an edge connects two nodes whenever a traffic line is accessible between the two stations. Figure 1 shows a simple subway network represented by the two descriptive methods. The network constructed by the space L method is a subnetwork of the one constructed by the space P method. Space L reflects the physical state of the subway station network, that is, only whether two stations are truly adjacent on a traffic line, while space P better reflects how stations are connected through the traffic network. Therefore, this paper uses the space P method to construct a weighted subway network.
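The difference between the two descriptions can be sketched in a few lines. This example assumes the usual reading of space P, in which two stations are linked when at least one line serves both of them; the function names and the toy network are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def _edge(a: str, b: str) -> Edge:
    """Return an undirected edge as a sorted pair so (a, b) == (b, a)."""
    return (a, b) if a <= b else (b, a)


def space_l_edges(lines: Dict[str, List[str]]) -> Set[Edge]:
    """Space L: connect stations that are consecutive stops on some line."""
    edges: Set[Edge] = set()
    for stops in lines.values():
        for a, b in zip(stops, stops[1:]):
            if a != b:
                edges.add(_edge(a, b))
    return edges


def space_p_edges(lines: Dict[str, List[str]]) -> Set[Edge]:
    """Space P: connect every pair of stations served by the same line."""
    edges: Set[Edge] = set()
    for stops in lines.values():
        for a, b in combinations(stops, 2):
            if a != b:
                edges.add(_edge(a, b))
    return edges


# Toy network (illustrative): two lines that share station B.
toy_lines = {"Line 1": ["A", "B", "C"], "Line 2": ["B", "D"]}
l_edges = space_l_edges(toy_lines)
p_edges = space_p_edges(toy_lines)
```

In the toy network, A and C are not adjacent, so the edge between them appears only in space P; every space L edge is also a space P edge.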

**Definition 3.**

## 4. Methodology

- Classify the taxi OD data according to the coverage of subway stations.
- Apply GMM clustering to the OD data that fall outside the coverage, and study the influence of different parameters on the prediction results.
- Use the LeaderRank algorithm to select a set of special stations, then take the O or D data that fall within the coverage of each such station as experimental input to predict the future extension of the station.
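The first step can be sketched as a distance test against the station list. This is a sketch under stated assumptions: coverage is modelled as a circle of radius `radius_m` around each station (the radius is a free parameter here, not a value from the paper), distance is great-circle (haversine), and the category numbers follow the O/D category table in this paper (1 = inside/inside, 2 = outside/outside, 3 = inside/outside, 4 = outside/inside).

```python
import math
from typing import Iterable, Tuple

Coord = Tuple[float, float]  # (longitude, latitude) in degrees
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(a: Coord, b: Coord) -> float:
    """Great-circle distance between two (lng, lat) points in metres."""
    lng1, lat1, lng2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def is_covered(p: Coord, stations: Iterable[Coord], radius_m: float) -> bool:
    """True if p lies within radius_m of at least one station (assumed model)."""
    return any(haversine_m(p, s) <= radius_m for s in stations)


def od_category(o: Coord, d: Coord, stations: Iterable[Coord],
                radius_m: float) -> int:
    """Classify an OD pair into categories 1-4 by station coverage."""
    stations = list(stations)  # allow the iterable to be scanned twice
    o_in = is_covered(o, stations, radius_m)
    d_in = is_covered(d, stations, radius_m)
    if o_in and d_in:
        return 1  # inside -> inside
    if not o_in and not d_in:
        return 2  # outside -> outside
    return 3 if o_in else 4  # inside -> outside, or outside -> inside


# Illustrative points: one station, one nearby point, one distant point.
demo_stations = [(116.0, 40.0)]
near, far = (116.001, 40.0), (116.5, 40.0)
```

The linear scan over stations is adequate for a few hundred stations; a spatial index would be the natural replacement for larger inputs.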

#### 4.1. GMM Model
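
For orientation, the standard form of a Gaussian mixture model is recalled here; this is the textbook formulation rather than a reconstruction of this section's own equations, and the symbols $K$, ${\pi}_{k}$, ${\mu}_{k}$, ${\Sigma}_{k}$ and ${\gamma}_{ik}$ are notation assumed for this note. For an OD point $x\in {\mathbb{R}}^{2}$ (longitude, latitude), a mixture of $K$ components has the density

$$p(x)=\sum_{k=1}^{K}{\pi}_{k}\,\mathcal{N}\left(x\mid {\mu}_{k},{\Sigma}_{k}\right),\qquad {\pi}_{k}\ge 0,\qquad \sum_{k=1}^{K}{\pi}_{k}=1.$$

Such a model is typically fitted with the EM algorithm. Given $N$ points ${x}_{1},\dots ,{x}_{N}$, the E-step computes the responsibility of component $k$ for point ${x}_{i}$,

$${\gamma}_{ik}=\frac{{\pi}_{k}\,\mathcal{N}\left({x}_{i}\mid {\mu}_{k},{\Sigma}_{k}\right)}{\sum_{j=1}^{K}{\pi}_{j}\,\mathcal{N}\left({x}_{i}\mid {\mu}_{j},{\Sigma}_{j}\right)},$$

and the M-step updates the parameters with ${N}_{k}=\sum_{i=1}^{N}{\gamma}_{ik}$:

$${\pi}_{k}=\frac{{N}_{k}}{N},\qquad {\mu}_{k}=\frac{1}{{N}_{k}}\sum_{i=1}^{N}{\gamma}_{ik}\,{x}_{i},\qquad {\Sigma}_{k}=\frac{1}{{N}_{k}}\sum_{i=1}^{N}{\gamma}_{ik}\left({x}_{i}-{\mu}_{k}\right){\left({x}_{i}-{\mu}_{k}\right)}^{\top}.$$

Under this reading, each fitted mean ${\mu}_{k}$ would play the role of a cluster center, that is, a candidate station location.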

#### 4.2. Weighted LeaderRank Passenger Flow Model
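
As a point of reference, the sketch below implements a generic weighted LeaderRank iteration on a directed flow network. It is not the paper's exact model: the function name, the choice of passenger counts as edge weights, and the unit weight on the links to and from the ground node (`ground_weight`) are all assumptions made for illustration.

```python
from typing import Dict, Tuple

GROUND = "__ground__"  # artificial ground node linked to every station


def weighted_leaderrank(flows: Dict[Tuple[str, str], float],
                        ground_weight: float = 1.0,
                        tol: float = 1e-10,
                        max_iter: int = 10_000) -> Dict[str, float]:
    """Rank nodes of a weighted directed graph with a LeaderRank-style walk.

    flows maps (origin, destination) to a positive weight. Score moves along
    edge direction, so heavily visited destinations accumulate score.
    """
    nodes = sorted({s for od in flows for s in od})
    out: Dict[str, Dict[str, float]] = {n: {} for n in nodes}
    out[GROUND] = {}
    for (src, dst), w in flows.items():
        if src != dst and w > 0:
            out[src][dst] = out[src].get(dst, 0.0) + w
    for n in nodes:  # bidirectional links to the ground node (assumed weight)
        out[n][GROUND] = ground_weight
        out[GROUND][n] = ground_weight

    scores = {n: 1.0 for n in nodes}  # one unit of score per real node
    scores[GROUND] = 0.0
    for _ in range(max_iter):
        new = dict.fromkeys(out, 0.0)
        for src, targets in out.items():
            total = sum(targets.values())
            for dst, w in targets.items():
                new[dst] += scores[src] * w / total  # weight-proportional split
        delta = max(abs(new[n] - scores[n]) for n in out)
        scores = new
        if delta < tol:
            break

    share = scores[GROUND] / len(nodes)  # hand the ground score back evenly
    return {n: scores[n] + share for n in nodes}


# Toy flows (illustrative): most passengers travel towards station B.
toy_scores = weighted_leaderrank({("A", "B"): 10, ("C", "B"): 5, ("B", "A"): 1})
```

Because every node starts with one unit of score and the walk conserves the total, the final scores average to 1 across stations, so a score well above 1 marks a station that attracts a disproportionate share of flow.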

## 5. Analysis and Mining of Future Station Areas Based on Taxi OD

#### 5.1. Trajectory Data Cleanup

- Because GPS positioning is subject to error, we corrected the raw taxi GPS data to improve positioning accuracy. The longitude and latitude offsets are −0.002456 degrees and 0.002241 degrees, respectively;
- We sorted the taxi GPS data by license plate and removed abnormal data: attributes unrelated to this study, records with an abnormal GPS status, records with a wrong license plate ID, and records with a wrong record time;
- Beijing lies approximately between 115.7 and 117.4 degrees east longitude and between 39.4 and 41.6 degrees north latitude. We removed GPS trajectory records that fall outside these longitude and latitude boundaries to reduce data redundancy;
- We extracted the taxi travel OD data and excluded trips whose travel time falls outside subway operating hours (6:00–22:00). Each row of the final data set contains the vehicle ID, the pick-up and drop-off times, and the pick-up and drop-off locations.
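
The coordinate correction, bounding-box filter and operating-hours filter above can be sketched as follows. The constants come from the text; how the offsets are applied (added to the raw coordinates here), the inclusive treatment of the 6:00 and 22:00 boundaries, and all function names are assumptions.

```python
from datetime import datetime, time
from typing import Tuple

Coord = Tuple[float, float]  # (longitude, latitude) in degrees

LNG_OFFSET = -0.002456  # degrees, from the text
LAT_OFFSET = 0.002241   # degrees, from the text
LNG_RANGE = (115.7, 117.4)  # approximate extent of Beijing, east longitude
LAT_RANGE = (39.4, 41.6)    # approximate extent of Beijing, north latitude
SERVICE_START, SERVICE_END = time(6, 0), time(22, 0)  # subway operating hours


def correct(p: Coord) -> Coord:
    """Apply the GPS offsets (assumption: offsets are added to raw values)."""
    return (p[0] + LNG_OFFSET, p[1] + LAT_OFFSET)


def in_beijing(p: Coord) -> bool:
    """True if the point lies inside the bounding box given in the text."""
    return (LNG_RANGE[0] <= p[0] <= LNG_RANGE[1]
            and LAT_RANGE[0] <= p[1] <= LAT_RANGE[1])


def in_service_hours(ts: datetime) -> bool:
    """True if the timestamp falls within 6:00-22:00 (inclusive, assumed)."""
    return SERVICE_START <= ts.time() <= SERVICE_END


def keep_trip(pickup: Coord, dropoff: Coord,
              t_pickup: datetime, t_dropoff: datetime) -> bool:
    """Keep a trip only if both corrected ends are in range and in hours."""
    return (in_beijing(correct(pickup)) and in_beijing(correct(dropoff))
            and in_service_hours(t_pickup) and in_service_hours(t_dropoff))


# Illustrative trips only.
ok = keep_trip((116.4, 39.9), (116.5, 40.0),
               datetime(2022, 1, 1, 8, 0), datetime(2022, 1, 1, 8, 30))
late = keep_trip((116.4, 39.9), (116.5, 40.0),
                 datetime(2022, 1, 1, 23, 0), datetime(2022, 1, 1, 23, 30))
outside = keep_trip((118.0, 39.9), (116.5, 40.0),
                    datetime(2022, 1, 1, 8, 0), datetime(2022, 1, 1, 8, 30))
```

Applying the correction before the bounding-box test keeps points near the boundary from being judged on their uncorrected position.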

#### 5.2. Beijing Subway Station

#### 5.3. Taxi ODs Analysis

#### 5.4. Location Prediction of Subway Stations

#### 5.5. Determining GMM Model Parameters

**Algorithm 1.** GetCoords.

#### 5.6. Results and Case Study

## 6. Conclusions and Further Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Li, M.; Dong, L.; Shen, Z.; Lang, W.; Ye, X. Examining the Interaction of Taxi and Subway Ridership for Sustainable Urbanization. Sustainability **2017**, 9, 242.
2. Xu, X.; Zhou, J.; Liu, Y.; Xu, Z.; Zhao, X. Taxi-RS: Taxi-Hunting Recommendation System Based on Taxi GPS Data. IEEE Trans. Intell. Transp. Syst. **2015**, 16, 1716–1727.
3. Mazimpaka, J.D.; Timpf, S. Exploring the Potential of Combining Taxi GPS and Flickr Data for Discovering Functional Regions; Springer International Publishing: Berlin/Heidelberg, Germany, 2015.
4. Zhang, C.; Xia, H.; Song, Y. Rail Transportation Lead Urban Form Change: A Case Study of Beijing. Urban Rail Transit **2017**, 3, 15–22.
5. Liu, C.; Wang, S.; Cuomo, S.; Mei, G. Data analysis and mining of traffic features based on taxi GPS trajectories: A case study in Beijing. Concurr. Comput. Pract. Exp. **2021**, 33, e5332.
6. Croce, A.; Musolino, G.; Rindone, C.; Vitetta, A. Estimation of Travel Demand Models with Limited Information: Floating Car Data for Parameters’ Calibration. Sustainability **2021**, 13, 8838.
7. Zhu, D.; Wang, N.; Wu, L.; Liu, Y. Street as a big geo-data assembly and analysis unit in urban studies: A case study using Beijing taxi data. Appl. Geogr. **2017**, 86, 152–164.
8. Ibrahim, R.; Shafiq, M.O. Mining Trajectory Data and Identifying Patterns for Taxi Movement Trips. In Proceedings of the Thirteenth International Conference on Digital Information Management (ICDIM 2018), Berlin, Germany, 24–26 September 2018.
9. Comi, A.; Rossolov, A.; Polimeni, A.; Nuzzolo, A. Private Car O-D Flow Estimation Based on Automated Vehicle Monitoring Data: Theoretical Issues and Empirical Evidence. Information **2021**, 12, 493.
10. Jiang, S.; Guan, W.; He, Z.; Yang, L. Exploring the Intermodal Relationship between Taxi and Subway in Beijing, China. J. Adv. Transp. **2018**, 2018, 3981845.
11. Zhang, J.; Lai, T.; Fan, Z.; Huang, B. A Real-Time Bus Transfer Scheme Recommendation Systems. In Proceedings of the International Conference on Advanced Cloud & Big Data, Shanghai, China, 13–16 August 2017.
12. Ren, X.-L.; Lü, L. Review of ranking nodes in complex networks. Chin. Sci. Bull. **2014**, 59, 1175–1197.
13. Kitsak, M.; Gallos, L.K.; Havlin, S.; Liljeros, F.; Muchnik, L.; Stanley, H.E.; Makse, H.A. Identification of influential spreaders in complex networks. Nat. Phys. **2010**, 6, 888–893.
14. Lü, L.; Zhang, Y.C.; Yeung, C.H.; Zhou, T. Leaders in Social Networks, the Delicious Case. PLoS ONE **2011**, 6, e21202.
15. Zhou, B.; Lei, Y.; Li, C.; Fang, B.; Wu, Q.; Li, L.; Li, Z. Electrical LeaderRank method for node importance evaluation of power grids considering uncertainties of renewable energy. Int. J. Electr. Power Energy Syst. **2019**, 106, 45–55.
16. Huang, C.; Wen, S.; Li, M.; Wen, F.; Yang, X. An empirical evaluation of the influential nodes for stock market network: Chinese A-shares case. Financ. Res. Lett. **2020**, 38, 101517.
17. Huang, M.; Zou, G.; Zhang, B.; Gan, Y.; Jiang, S.; Jiang, K. Identifying influential individuals in microblogging networks using graph partitioning. Expert Syst. Appl. **2018**, 102, 70–82.
18. Li, Q.; Zhou, T.; Lü, L.; Chen, D. Identifying influential spreaders by weighted LeaderRank. Phys. Stat. Mech. Its Appl. **2014**, 404, 47–55.
19. Church, R.; Clifford, T. Discussion of environmental optimization of power lines by Economides and Sharifi. J. Environ. Eng. Div. ASCE **1979**, 105, 438–439.
20. Gendreau, M.; Laporte, G.; Mesa, J.A. Locating rapid transit lines. J. Adv. Transp. **2010**, 29, 145–162.
21. Chien, S.I.J.; Qin, Z. Optimization of bus stop locations for improving transit accessibility. Transp. Plan. Technol. **2004**, 27, 211–227.
22. Polzin, S.E. Transportation/Land-Use Relationship: Public Transit’s Impact on Land Use. J. Urban Plan. Dev. **1999**, 125, 135–151.
23. Cascetta, E. Transportation Systems Analysis: Models and Applications; Springer: Berlin/Heidelberg, Germany, 2009.
24. Birgillito, G.; Rindone, C.; Vitetta, A. Passenger Mobility in a Discontinuous Space: Modelling Access/Egress to Maritime Barrier in a Case Study. J. Adv. Transp. **2018**, 2018, 6518329.
25. Yang, J.; Chen, J.; Le, X.; Zhang, Q. Density-oriented versus development-oriented transit investment: Decoding metro station location selection in Shenzhen. Transp. Policy **2016**, 51, 93–102.
26. Zhang, Y.; Xue, W.; Wei, W.; Nazif, H. A public transport network design using a hidden Markov model and an optimization algorithm. Res. Transp. Econ. **2021**, 101095, in press.
27. Król, A.; Król, M. The Design of a Metro Network Using a Genetic Algorithm. Appl. Sci. **2019**, 9, 433.
28. Sen, P.; Dasgupta, S.; Chatterjee, A.; Sreeram, P.A.; Mukherjee, G.; Manna, S.S. Small-world properties of the Indian Railway network. Phys. Rev. Stat. Nonlinear Soft Matter Phys. **2002**, 67, 036106.
29. Dempster, A.P. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. **1977**, 39, 1–22.

**Figure 1.** Diagrams of two descriptive methods for a simple subway network. (**a**) Space L, (**b**) Space P.

**Figure 4.** Example of clustering rectangular regions (red represents the cluster regions and cluster centers; green and blue represent the newly built stations and the planned future stations, respectively).

**Figure 13.** Case study of selected experimental results. (**a**) Jinsong, (**b**) Dongzhimen, (**c**) Dawanglu, (**d**) Sanyuanqiao, (**e**) ChongWenmen, (**f**) Region 4 in (**c**).

| Field | Definition |
|---|---|
| ID | terminal ID |
| VEHICLE CODE | license plate number |
| GPS TIME | generation time |
| POSITION | real-time latitude and longitude |
| SPEED | real-time speed of vehicle |
| VEH STATE | vehicle status: 0 = empty, 1 = loaded, 2 = parking, 3 = stop operation |
| GPS STATE | GPS status: 0 = invalid, 1 = valid |

| Category | O | D | % |
|---|---|---|---|
| 1 | Inside | Inside | 52 |
| 2 | Outside | Outside | 15 |
| 3 | Inside | Outside | 17 |
| 4 | Outside | Inside | 16 |

| Rank | Station | WLR |
|---|---|---|
| 1 | Dong Zhimen | 4.460489 |
| 2 | Guo Mao | 4.294859 |
| 3 | Da Wanglu | 3.839321 |
| 4 | Xi Erqi | 3.829502 |
| 5 | Xi Zhimen | 3.659599 |
| 6 | Xi Dan | 3.1957 |
| 7 | Chao Yangmen | 3.0331 |
| 8 | San Yuanqiao | 2.861383 |
| 9 | Tian Tongyuan | 2.770496 |
| 10 | Ji Shuitan | 2.650263 |
| 11 | Chong Wenmen | 2.636904 |
| 12 | Fu Chengmen | 2.591404 |
| 13 | Si Hui | 2.572391 |
| 14 | Wu Daokou | 2.565621 |
| 15 | Jin Song | 2.539041 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Cai, Z.; Wang, J.; Li, T.; Yang, B.; Su, X.; Guo, L.; Ding, Z.
A Novel Trajectory Based Prediction Method for Urban Subway Design. *ISPRS Int. J. Geo-Inf.* **2022**, *11*, 126.
https://doi.org/10.3390/ijgi11020126