# Mobile Phone Data Feature Denoising for Expressway Traffic State Estimation

## Abstract

## 1. Introduction

## 2. Literature Review

## 3. Data Feature Extraction and Denoising

#### 3.1. Data Preprocessing

- Processing of Mobile Identifiers

- Processing of Invalid Data

- Processing of Duplicate Data

- Processing of Ping-Pong Data

- Processing of Drift Data

#### 3.2. Feature Extraction

#### 3.3. Origin of Data Noise

#### 3.4. Feature Data Denoising Method Based on the Improved DPCA

- Optimization of the adaptive selection of the cut-off distance ${\mathrm{d}}_{\mathrm{c}}$

- Optimization of the automatic selection of cluster centers

- Optimization of valid data type selection

## 4. Construction of an LSTM-Based Traffic State Estimation Model

## 5. Case Study

#### 5.1. Source Data

#### 5.2. Mobile Phone Feature Data Denoising Based on the Improved DPCA

#### Denoising Effect Analysis

#### 5.3. Traffic State Estimation Based on Denoising Data

#### Accuracy Evaluation of the Traffic State Estimation Model

## 6. Summary

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

**Figure 1.** (**a**) Example of the original speed characteristic distribution of mobile phones; (**b**) Comparison of the original quantity characteristics of mobile phones and the corresponding measured traffic.

**Figure 2.** Decision map of clustering centers based on DPCA for 2D data. (**A**) The actual distribution of the 2D data points on a plane. (**B**) A 2D cluster-decision map drawn from the parameters $\rho$ and $\delta$ of each sample point.
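To make the decision map in panel (B) concrete, the following is a minimal sketch of how $\rho$ (local density) and $\delta$ (distance to the nearest higher-density point) are computed in the standard density-peak clustering formulation. It does not reproduce the improved DPCA of Section 3.4. The function name `density_peaks`, the hard cut-off kernel, the value of `d_c`, and the sample points are all illustrative assumptions.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) for each point, using a hard cut-off kernel.

    rho[i]   = number of other points closer than d_c to point i
    delta[i] = distance to the nearest point with strictly higher rho;
               a point with no denser neighbour gets its largest distance
               to any other point (a common convention, assumed here).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical 2D sample: two compact groups and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
    (5, 20),
]
rho, delta = density_peaks(points, d_c=1.0)

# Candidate cluster centers combine high rho with high delta;
# a point with low rho but high delta behaves like noise.
for p, r, d in zip(points, rho, delta):
    print(p, r, round(d, 2))
```

In this sample the two middle points have the highest density and a large $\delta$, so they stand out in the upper-right region of a decision map, while the isolated point has zero density and a large $\delta$.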

**Figure 8.** (**a**) Correlation diagram of the cut-off distance and the corresponding entropy value; (**b**) BDD based on the local density and high-density distance.

**Figure 10.** Comparison of the original speed feature value of mobile phones and the radar speed: (**a**) Speed comparison line chart; (**b**) Speed scatter plot.

**Figure 11.** Comparison of the denoised speed feature value of mobile phones and the radar speed: (**a**) Speed comparison line chart; (**b**) Speed scatter plot.

**Figure 12.** Comparison of the quantity feature values of mobile phones before and after denoising and the corresponding radar-measured traffic.

**Figure 13.** Comparison and verification of traffic flow speed estimation results: (**a**) Comparison of the estimated and the actual traffic flow speed on 5 October; (**b**) Comparison of the estimated and the actual traffic flow speed on 6 October.

Degree of Congestion | Design Speed 120 km/h | Design Speed 100 km/h | Design Speed 80 km/h
---|---|---|---
Free-flowing | [90, max) | [75, max) | [60, max)
Smooth | [60, 90) | [50, 75) | [40, 60)
Congested | [30, 60) | [25, 50) | [20, 40)
Severely congested | [0, 30) | [0, 25) | [0, 20)
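The thresholds in the table above can be read as a simple lookup. The sketch below assumes the intervals are speed ranges in km/h, closed on the left and open on the right as the bracket notation suggests. The names `THRESHOLDS` and `congestion_level` are hypothetical and introduced only for illustration.

```python
# Lower speed bounds (km/h) per congestion level, keyed by design speed (km/h).
# Values are copied from the table; each list is ordered from fastest to slowest.
THRESHOLDS = {
    120: [(90, "Free-flowing"), (60, "Smooth"), (30, "Congested"), (0, "Severely congested")],
    100: [(75, "Free-flowing"), (50, "Smooth"), (25, "Congested"), (0, "Severely congested")],
    80: [(60, "Free-flowing"), (40, "Smooth"), (20, "Congested"), (0, "Severely congested")],
}


def congestion_level(speed_kmh, design_speed_kmh):
    """Map a traffic speed to a congestion level for a given design speed."""
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    if design_speed_kmh not in THRESHOLDS:
        raise ValueError("design speed must be 120, 100, or 80 km/h")
    # Return the first level whose lower bound the speed reaches.
    for lower_bound, label in THRESHOLDS[design_speed_kmh]:
        if speed_kmh >= lower_bound:
            return label


print(congestion_level(72.0, 120))
print(congestion_level(72.0, 80))
```

The same speed can fall into different levels depending on the design speed: 72 km/h counts as smooth on a 120 km/h expressway but as free-flowing on an 80 km/h one.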

Evaluation Index | 5 October 2015, Raw Data | 5 October 2015, Filtered Data | 6 October 2015, Raw Data | 6 October 2015, Filtered Data
---|---|---|---|---
MSE (km²/h²) | 134.41 | 57.34 | 268.61 | 87.59
MAE (km/h) | 9.73 | 4.98 | 13.01 | 6.32
ME (km/h) | −5.16 | −1.61 | −4.78 | −0.52
MAPE (%) | 17.37 | 10.71 | 21.32 | 11.32
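For reference, the four evaluation indices can be computed as in the sketch below, which uses their conventional definitions. The sign convention for ME (estimate minus actual) is an assumption, as are the function name `error_metrics` and the sample speeds, which are not data from the study.

```python
def error_metrics(estimated, actual):
    """Return MSE, MAE, ME, and MAPE for paired speed series (km/h).

    Assumes every actual speed is non-zero, since MAPE divides by it.
    """
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [e - a for e, a in zip(estimated, actual)]  # estimate minus actual
    n = len(errors)
    return {
        "MSE": sum(err ** 2 for err in errors) / n,    # km^2/h^2
        "MAE": sum(abs(err) for err in errors) / n,    # km/h
        "ME": sum(errors) / n,                         # km/h, signed bias
        "MAPE": 100 * sum(abs(err) / a for err, a in zip(errors, actual)) / n,  # %
    }


# Hypothetical speeds for illustration only.
metrics = error_metrics(estimated=[80, 100, 50], actual=[100, 100, 40])
for name, value in metrics.items():
    print(name, round(value, 2))
```

Under this sign convention, a negative ME means the estimates are on average lower than the measured speeds.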

**Table 3.** Comparison of the quantitative indicators of traffic estimation between different denoising methods.

Evaluation Index | 5 October 2015, Gaussian | 5 October 2015, k-Means | 5 October 2015, Improved DPCA | 6 October 2015, Gaussian | 6 October 2015, k-Means | 6 October 2015, Improved DPCA
---|---|---|---|---|---|---
MSE (km²/h²) | 100.60 | 85.67 | 57.34 | 196.80 | 219.18 | 87.59
MAE (km/h) | 8.72 | 7.57 | 4.98 | 12.08 | 11.13 | 6.32
ME (km/h) | −5.18 | −2.16 | −1.61 | −4.85 | −1.78 | −0.52
MAPE (%) | 15.95 | 14.47 | 10.71 | 19.39 | 18.26 | 11.32

