TD3-Based EMS Using Action Mask and Considering Battery Aging for Hybrid Electric Dump Trucks
Abstract
1. Introduction
- (1) A TD3-based EMS is proposed to extend battery life and reduce the total usage cost. Battery aging reduces vehicle range, and a costly battery replacement is required once the battery reaches the end of its life.
- (2) Most EMSs ignore safety issues during the exploration stage, such as MG1 overloading, which can cause serious problems in automotive control and is unacceptable in industrial applications. Action masks are therefore used to eliminate invalid actions that exceed the physical limits and to improve the training efficiency of the policy.
- (3) The TD3 algorithm reduces the overestimation bias of DDPG, so it is applied as the EMS for hybrid electric dump trucks and trained through the self-learning capability of DRL. A comparison with a DDPG-based EMS is also presented.
- (4) A reward function that includes the battery aging cost and the fuel consumption cost is designed to extend battery life and reduce fuel consumption.
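To make contribution (2) concrete, the following is a minimal sketch of how an action mask could keep a continuous MG1 torque command inside its physical limits. It is not the authors' implementation. The limit values come from the vehicle parameter table (MG1: 340 Nm, 110 kW, 7500 rpm); the function names `mg1_torque_limit` and `mask_action`, the choice of torque as the action variable, and the simple constant-torque/constant-power envelope are assumptions made for illustration.

```python
import math

# MG1 limits taken from the vehicle parameter table.
MG1_MAX_TORQUE_NM = 340.0
MG1_MAX_POWER_KW = 110.0
MG1_MAX_SPEED_RPM = 7500.0


def mg1_torque_limit(speed_rpm):
    """Largest torque magnitude (Nm) allowed at this speed.

    Assumption: a simple envelope, constant torque at low speed and
    constant power at high speed. A real motor map would replace this.
    """
    if abs(speed_rpm) > MG1_MAX_SPEED_RPM:
        return 0.0  # overspeed: no torque command is valid
    omega = abs(speed_rpm) * 2.0 * math.pi / 60.0  # rad/s
    if omega == 0.0:
        return MG1_MAX_TORQUE_NM
    power_limited = MG1_MAX_POWER_KW * 1000.0 / omega
    return min(MG1_MAX_TORQUE_NM, power_limited)


def mask_action(raw_torque_nm, speed_rpm):
    """Clamp the policy's raw torque command to the valid range."""
    limit = mg1_torque_limit(speed_rpm)
    return max(-limit, min(limit, raw_torque_nm))


if __name__ == "__main__":
    for torque, speed in [(300.0, 1000.0), (300.0, 7500.0), (-500.0, 1000.0)]:
        print(torque, speed, round(mask_action(torque, speed), 2))
```

Clamping is only one way to realise a mask in a continuous action space. Its appeal is that an invalid command never reaches the plant model during exploration, so the replay buffer contains only physically feasible transitions.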
2. Vehicle Modeling and Optimization Problem
2.1. Vehicle Model
2.2. Battery Aging Model and Optimization Problem for EMS
3. Method and Design of TD3-Based EMS
3.1. Reinforcement Learning
3.2. TD3 Algorithm
Algorithm 1: TD3
Initialization: critic networks Qθ1, Qθ2 with random parameters θ1, θ2; actor network πφ with random parameters φ; target critic networks θ′1 ← θ1, θ′2 ← θ2; target actor network φ′ ← φ; replay buffer
for t = 1 to T do
  observe state s
  choose action with exploration noise, a ~ πφ(s) + ε, and observe reward r and new state s′
  store transition tuple (s, a, r, s′, d) in the replay buffer
  randomly sample a mini-batch of N transitions {(s, a, r, s′, d)} from the replay buffer
  update critic networks
  if t mod d then
    update φ by deterministic policy gradient
    update target networks by soft update
  end if
end for
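The update steps named in Algorithm 1 can be sketched in plain Python. This is an illustrative sketch of the standard TD3 building blocks (clipped double-Q target, target policy smoothing, delayed actor update, soft target update), not the authors' code. The discount factor 0.99 matches the hyperparameter table; `noise_std`, `noise_clip`, `tau`, and `policy_delay` are assumed values that the source does not state, and all function names are hypothetical.

```python
import random


def smoothed_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise to the target
    actor's action, then clip to the action bounds (assumed [-1, 1])."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, target_action + noise))


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target critics to counter overestimation."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def should_update_actor(t, policy_delay=2):
    """Delayed policy update: actor and targets change every
    policy_delay critic steps (one reading of 'if t mod d')."""
    return t % policy_delay == 0


def soft_update(target_params, online_params, tau=0.005):
    """Soft update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


if __name__ == "__main__":
    rng = random.Random(0)
    a_next = smoothed_target_action(0.9, rng=rng)
    y = td3_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=3.0)
    print(a_next, y, should_update_actor(4), soft_update([0.0], [1.0], tau=0.1))
```

Taking the minimum of the two target critics is the part that addresses the overestimation bias of DDPG mentioned in contribution (3). The delay and the soft update slow down how quickly errors in the critics propagate into the policy.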
3.3. Action Mask
3.4. Design of TD3-Based EMS
4. Results
4.1. The Impact of Action Mask
4.2. Battery Capacity Loss and Fuel Consumption
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Ali, A.; Söffker, D. Towards Optimal Power Management of Hybrid Electric Vehicles in Real-Time: A Review on Methods, Challenges, and State-Of-The-Art Solutions. Energies 2018, 11, 476.
2. Saiteja, P.; Ashok, B. Critical Review on Structural Architecture, Energy Control Strategies and Development Process towards Optimal Energy Management in Hybrid Vehicles. Renew. Sust. Energ. Rev. 2022, 157, 112038.
3. Tran, D.; Vafaeipour, M.; El Baghdadi, M.; Barrero, R.; Van Mierlo, J.; Hegazy, O. Thorough State-of-the-Art Analysis of Electric and Hybrid Vehicle Powertrains: Topologies and Integrated Energy Management Strategies. Renew. Sust. Energ. Rev. 2020, 119, 109596.
4. Padmarajan, B.; McGordon, A.; Jennings, P. Blended Rule-Based Energy Management for PHEV: System Structure and Strategy. IEEE Trans. Veh. Technol. 2016, 65, 8757–8762.
5. Zhou, W.; Yang, L.; Cai, Y.; Ying, T. Dynamic Programming for New Energy Vehicles Based on Their Work Modes Part I: Electric Vehicles and Hybrid Electric Vehicles. J. Power Sources 2018, 406, 151–166.
6. Rezaei, A.; Burl, J.; Zhou, B.; Rezaei, M. A New Real-Time Optimal Energy Management Strategy for Parallel Hybrid Electric Vehicles. IEEE Trans. Control Syst. Technol. 2019, 27, 830–837.
7. East, S.; Cannon, M. Scenario Model Predictive Control for Data-Based Energy Management in Plug-In Hybrid Electric Vehicles. IEEE Trans. Control Syst. Technol. 2022, 30, 2522–2533.
8. Yu, P.; Li, M.; Wang, Y.; Chen, Z. Fuel Cell Hybrid Electric Vehicles: A Review of Topologies and Energy Management Strategies. World Electr. Veh. J. 2022, 13, 172.
9. Zhang, F.; Wang, L.; Coskun, S.; Pang, H.; Cui, Y.; Xi, J. Energy Management Strategies for Hybrid Electric Vehicles: Review, Classification, Comparison, and Outlook. Energies 2020, 13, 3352.
10. Hu, Y.; Li, W.; Xu, K.; Zahid, T.; Qin, F.; Li, C. Energy Management Strategy for a Hybrid Electric Vehicle Based on Deep Reinforcement Learning. Appl. Sci. 2018, 8, 187.
11. Zou, Y.; Liu, T.; Liu, D.; Sun, F. Reinforcement Learning-Based Real-Time Energy Management for a Hybrid Tracked Vehicle. Appl. Energy 2016, 171, 372–382.
12. Xiong, R.; Cao, J.; Yu, Q. Reinforcement Learning-Based Real-Time Power Management for Hybrid Energy Storage System in the Plug-in Hybrid Electric Vehicle. Appl. Energy 2018, 211, 538–548.
13. Liu, T.; Zou, Y.; Liu, D.; Sun, F. Reinforcement Learning of Adaptive Energy Management with Transition Probability for a Hybrid Electric Tracked Vehicle. IEEE Trans. Ind. Electron. 2015, 62, 7837–7846.
14. Li, Y.; He, H.; Peng, J.; Wang, H. Deep Reinforcement Learning-Based Energy Management for a Series Hybrid Electric Vehicle Enabled by History Cumulative Trip Information. IEEE Trans. Veh. Technol. 2019, 68, 7416–7430.
15. Wu, J.; He, H.; Peng, J.; Li, Y.; Li, Z. Continuous Reinforcement Learning of Energy Management with Deep Q Network for a Power Split Hybrid Electric Bus. Appl. Energy 2018, 222, 799–811.
16. Han, X.; He, H.; Wu, J.; Peng, J.; Li, Y. Energy Management Based on Reinforcement Learning with Double Deep Q-Learning for a Hybrid Electric Tracked Vehicle. Appl. Energy 2019, 254, 113708.
17. Li, Y.; He, H.; Khajepour, A.; Wang, H.; Peng, J. Energy Management for a Power-Split Hybrid Electric Bus via Deep Reinforcement Learning with Terrain Information. Appl. Energy 2019, 255, 113762.
18. Tan, H.; Zhang, H.; Peng, J.; Jiang, Z.; Wu, Y. Energy Management of Hybrid Electric Bus Based on Deep Reinforcement Learning in Continuous State and Action Space. Energy Conv. Manag. 2019, 195, 548–560.
19. Wu, Y.; Tan, H.; Peng, J.; Zhang, H.; He, H. Deep Reinforcement Learning of Energy Management with Continuous Control Strategy and Traffic Information for a Series-Parallel Plug-in Hybrid Electric Bus. Appl. Energy 2019, 247, 454–466.
20. Zhou, J.; Xue, S.; Xue, Y.; Liao, Y.; Liu, J.; Zhao, W. A Novel Energy Management Strategy of Hybrid Electric Vehicle via an Improved TD3 Deep Reinforcement Learning. Energy 2021, 224, 120118.
21. Li, T.; Cui, W.; Cui, N. Soft Actor-Critic Algorithm-Based Energy Management Strategy for Plug-In Hybrid Electric Vehicle. World Electr. Veh. J. 2022, 13, 193.
22. Cheng, Y.; Xu, G.; Chen, Q. Research on Energy Management Strategy of Electric Vehicle Hybrid System Based on Reinforcement Learning. Electronics 2022, 11, 1933.
23. Wang, J.; Liu, P.; Hicks-Garner, J.; Sherman, E.; Soukiazian, S.; Verbrugge, M.; Tataria, H.; Musser, J.; Finamore, P. Cycle-Life Model for Graphite-LiFePO4 Cells. J. Power Sources 2011, 196, 3942–3948.
24. Tang, L.; Rizzoni, G.; Onori, S. Energy Management Strategy for HEVs Including Battery Aging Optimization. IEEE Trans. Transp. Electrif. 2015, 1, 211–222.
25. Xu, D.; Cui, Y.; Ye, J.; Cha, S.W.; Li, A.; Zheng, C. A Soft Actor-Critic-Based Energy Management Strategy for Electric Vehicles with Hybrid Energy Storage Systems. J. Power Sources 2022, 524, 231099.
26. Fujimoto, S.; Hoof, H.; Meger, D. Addressing Function Approximation Error in Actor-Critic Methods. In Proceedings of the PMLR/35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018; pp. 1587–1596.
27. Zhou, G.; Huang, F.; Liu, W.; Zhao, C.; Xiang, Y.; Wei, H. Comprehensive Control Strategy of Fuel Consumption and Emissions Incorporating the Catalyst Temperature for PHEVs Based on DRL. Energies 2022, 15, 7523.
28. Nam, H.; Kim, Y.; Bae, J.; Lee, J. GateRL: Automated Circuit Design Framework of CMOS Logic Gates Using Reinforcement Learning. Electronics 2021, 10, 1032.
29. Wu, Y.; Tseng, B.; Rasmussen, C. Improving Sample-Efficiency in Reinforcement Learning for Dialogue Systems by Using Trainable-Action-Mask. In Proceedings of the ICASSP/2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 1 May 2020; pp. 8024–8028.
30. Tang, C.; Liu, C.; Chen, W.; You, S.D. Implementing Action Mask in Proximal Policy Optimization (PPO) Algorithm. ICT Express 2020, 6, 200–203.
Ref. | Author | Year | Category | Continuous State | Continuous Action | Main Topic |
---|---|---|---|---|---|---|
[4] | Padmarajan et al. | 2016 | Rule-based | | | System structure and strategy |
[5] | Zhou et al. | 2018 | Optimization-based | | | Improvement of DP-based EMS for different HEVs |
[7] | East et al. | 2022 | Optimization-based | | | Scenario MPC for data-based EMS |
[13] | Liu et al. | 2015 | Learning-based | | | Reinforcement learning of adaptive EMS |
[15] | Wu et al. | 2018 | Learning-based | x | | Continuous RL-based EMS |
[16] | Han et al. | 2019 | Learning-based | x | | DDQL-based EMS avoids falling into policy value overestimation |
[17] | Li et al. | 2019 | Learning-based | x | | EMS with terrain information |
[18] | Tan et al. | 2019 | Learning-based | x | x | Continuous state and action spaces |
[19] | Wu et al. | 2019 | Learning-based | x | x | Continuous control and traffic information |
[21] | Li et al. | 2022 | Learning-based | x | x | SAC-AET-based EMS to improve the control effects |
Part | Parameter Name | Value |
---|---|---|
Vehicle | Gross weight (kg) | 31,000 |
Vehicle | Dimensions (mm) | 9662 × 2495 × 3450 |
Vehicle | Dimensions of cargo box (mm) | 6800 × 2350 × 1500 |
Vehicle | Drive form | 8 × 4 |
Vehicle | Drag coefficient | 0.56 |
Vehicle | Frontal area (m²) | 8.24 |
Vehicle | Rolling resistance coefficient | 0.0041 + 0.0000256v |
ICE | Max. power (kW) | 243 |
ICE | Max. torque (Nm) | 1400 |
ICE | Max. speed (rpm) | 2200 |
MG1 | Max. power (kW) | 110 |
MG1 | Max. torque (Nm) | 340 |
MG1 | Max. speed (rpm) | 7500 |
MG2 | Max. power (kW) | 196 |
MG2 | Max. torque (Nm) | 375 |
MG2 | Max. speed (rpm) | 15,000 |
MG2 | Gear ratio | 6.7 |
Transmission | Transmission ratio of PG | 4.4 |
Transmission | AMT gear ratios | 6.3/2.1/1/0.86 |
Transmission | Final drive ratio | 5.1 |
Battery | Capacity (Ah) | 70 |
Battery | Voltage (V) | 576 |
Parameters | Value |
---|---|
Actor network learning rate | 0.0001 |
Critic network learning rate | 0.0002 |
Discount factor | 0.99 |
Mini-batch size | 256 |
Experience buffer size | 1 × 10⁶ |
EMS | Considers Battery Aging | F.C. (L/100 km) | F.C. Cost (CNY) | Battery Capacity Loss (%) | Battery Aging Cost (CNY) | Total Cost (CNY) | Performance |
---|---|---|---|---|---|---|---|
TD3-based | No | 24.81 | 178.63 | 0.0423 | 28.99 | 207.62 | 95.82% |
TD3-based | Yes | 25.06 | 180.43 | 0.0270 | 18.51 | 198.94 | 100% |
DDPG-based | No | 24.74 | 178.13 | 0.0434 | 29.75 | 207.88 | 95.70% |
DDPG-based | Yes | 25.27 | 181.94 | 0.0286 | 19.60 | 201.54 | 98.71% |
EMS | Considers Battery Aging | F.C. (L/100 km) | F.C. Cost (CNY) | Battery Capacity Loss (%) | Battery Aging Cost (CNY) | Total Cost (CNY) | Performance |
---|---|---|---|---|---|---|---|
TD3-based | No | 30.14 | 217.01 | 0.0293 | 20.08 | 237.09 | 97.51% |
TD3-based | Yes | 30.31 | 218.23 | 0.0189 | 12.95 | 231.18 | 100% |
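The cost columns in the two results tables can be checked with a few lines of arithmetic. The sketch below assumes that Total Cost is the sum of F.C. Cost and Battery Aging Cost, and that Performance is the lowest total cost in a table divided by the row's total cost. Both readings are inferred from the tabulated numbers rather than stated in the text, and the names `summarize`, `FIRST_TABLE`, and `SECOND_TABLE` are hypothetical.

```python
def summarize(rows):
    """Return {row key: (total cost in CNY, performance in %)}.

    Assumption: performance = best (lowest) total cost in the table
    divided by this row's total cost.
    """
    totals = {key: round(fuel + aging, 2) for key, (fuel, aging) in rows.items()}
    best = min(totals.values())
    return {key: (total, round(100.0 * best / total, 2))
            for key, total in totals.items()}


# (F.C. cost, battery aging cost) in CNY, copied from the results tables.
FIRST_TABLE = {
    ("TD3", "no aging"): (178.63, 28.99),
    ("TD3", "aging"): (180.43, 18.51),
    ("DDPG", "no aging"): (178.13, 29.75),
    ("DDPG", "aging"): (181.94, 19.60),
}
SECOND_TABLE = {
    ("TD3", "no aging"): (217.01, 20.08),
    ("TD3", "aging"): (218.23, 12.95),
}

if __name__ == "__main__":
    for table in (FIRST_TABLE, SECOND_TABLE):
        for key, (total, perf) in summarize(table).items():
            print(key, total, perf)
```

Under these assumptions the computed totals and percentages match the Total Cost and Performance columns. They also show the trade-off in the tables: considering battery aging raises the fuel cost slightly but lowers the aging cost by more, so the total falls.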
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mo, J.; Yang, R.; Zhang, S.; Zhou, Y.; Huang, W. TD3-Based EMS Using Action Mask and Considering Battery Aging for Hybrid Electric Dump Trucks. World Electr. Veh. J. 2023, 14, 74. https://doi.org/10.3390/wevj14030074