Trajectory Planning for UAV-Assisted Data Collection in IoT Network: A Double Deep Q Network Approach
Abstract
1. Introduction
1.1. Background
1.2. Related Work and Contributions
 Our work considers a complex and realistic urban environment in order to study the effects of obstacles and jammers on 3D UAV trajectory planning. In particular, during the simulation phase we randomly generate the positions of the jammers and ground devices for each iteration, which makes the scenario more uncertain and complicates the design of the Markov decision process (MDP).
 In this paper, the environmental information is not predetermined: the UAV dynamically senses and navigates around obstacles in real time using onboard sensors such as cameras. It also learns from historical environmental information stored in a memory bank to speed up its decision-making.
 To address the limited onboard computing power of UAVs, we develop a DDQN-based UAV trajectory optimization algorithm. The algorithm shapes the reward according to the scenario and converges faster. We also provide flight results under different scene parameters, together with comparison experiments against several reinforcement learning algorithms; these simulations and algorithm comparisons validate the effectiveness and superiority of our approach.
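The double estimator at the heart of the DDQN approach can be illustrated with a short sketch. The Q-values, reward, and action space below are hypothetical placeholders, not the paper's networks; the sketch only shows how Double DQN decouples action selection (online network) from action evaluation (target network), which reduces the overestimation bias of vanilla DQN:

```python
# Sketch of the Double DQN target versus the vanilla DQN target.
# All Q-values here are hypothetical placeholders, not the paper's networks.

GAMMA = 0.99  # discount factor (matches the hyperparameter table below)

def ddqn_target(reward, q_online_next, q_target_next, done):
    """Double DQN: y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        return reward
    # online network selects the action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... target network evaluates it
    return reward + GAMMA * q_target_next[best_action]

def dqn_target(reward, q_target_next, done):
    """Vanilla DQN: y = r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + GAMMA * max(q_target_next)
```

For instance, when the online network prefers an action that the target network values only modestly, the Double DQN target stays conservative, whereas the vanilla DQN target always takes the target network's maximum.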
2. System Model and Problem Formulation
2.1. Channel Model
2.2. Throughput Maximization Problem Formulation
3. DDQN-Based UAV Trajectory Optimization Algorithm
3.1. Markov Decision Process
3.2. DDQN-Based UAV Trajectory Optimization Algorithm
Algorithm 1: DDQN-Based UAV Trajectory Optimization Algorithm
Initialize replay memory $\mathcal{D}$, the online network parameters $\theta $, the target network parameters ${\theta}^{\prime}=\theta $, and the target network update period ${N}_{freq}$. 
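The initialization step above can be expanded into a training-loop skeleton. This is a hedged sketch, not the paper's implementation: the environment, network objects, and optimizer step are placeholder stubs, and only the replay memory $\mathcal{D}$, the $\epsilon$-greedy exploration, minibatch sampling, and the periodic target-network synchronization every ${N}_{freq}$ steps follow Algorithm 1 as stated:

```python
import random
from collections import deque

# Hypothetical skeleton of Algorithm 1's outer loop. The env, net, and
# optimizer_step arguments are placeholder stubs, not the paper's code.

N_FREQ = 100    # target network update period N_freq (paper's value)
BATCH = 32      # minibatch size (paper's value)
EPSILON = 0.01  # exploration rate (paper's value)

def train(env, net, target_net, episodes, optimizer_step):
    memory = deque(maxlen=10_000)  # replay memory D (capacity is an assumption)
    step = 0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy action from the online network
            if random.random() < EPSILON:
                action = random.randrange(env.n_actions)
            else:
                q = net(state)
                action = max(range(len(q)), key=lambda a: q[a])
            next_state, reward, done = env.step(action)
            memory.append((state, action, reward, next_state, done))
            # gradient step on a sampled minibatch once D is large enough
            if len(memory) >= BATCH:
                optimizer_step(random.sample(list(memory), BATCH), net, target_net)
            step += 1
            if step % N_FREQ == 0:
                target_net.load_from(net)  # theta' <- theta
            state = next_state
```

The hyperparameter values mirror those listed in the simulation tables ($\epsilon = 0.01$, minibatch 32, ${N}_{freq} = 100$); the replay capacity of 10,000 is an assumption.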

4. Simulation Results and Discussion
4.1. Parameter Initialization
4.2. Result Analysis with Different Numbers of IoT Devices
 1. The result analysis for the scenario with three devices and five obstacles (3D+5O for short) is as follows:
 2. The result analysis for the scenario with five devices and five obstacles (5D+5O for short) is as follows:
4.3. Result Analysis with Different Numbers of Obstacles
5. Conclusions
6. Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
 Li, B.; Fei, Z.; Zhang, Y. UAV Communications for 5G and Beyond: Recent Advances and Future Trends. IEEE Internet Things J. 2019, 6, 2241–2263.
 Cao, L.; Wang, H. Research on UAV Network Communication Application Based on 5G Technology. In Proceedings of the 2022 3rd International Conference on Electronic Communication and Artificial Intelligence (IWECAI), Zhuhai, China, 14–16 January 2022; pp. 125–129.
 Geraci, G.; Garcia-Rodriguez, A.; Azari, M.M.; Lozano, A.; Mezzavilla, M.; Chatzinotas, S.; Chen, Y.; Rangan, S.; Di Renzo, M. What Will the Future of UAV Cellular Communications Be? A Flight from 5G to 6G. IEEE Commun. Surv. Tutorials 2022, 24, 1304–1335.
 Li, P.; Liu, Y.; Deng, X.; Wu, B.; Fu, R. Self-organized Cooperative Electronic Jamming by UAV Swarm Based on Contribution Priority and Cost Competition. In Proceedings of the 2021 IEEE 15th International Conference on Electronic Measurement & Instruments (ICEMI), Nanjing, China, 29–31 October 2021; pp. 49–53.
 Chen, B.W.; Rho, S. Autonomous Tactical Deployment of the UAV Array Using Self-Organizing Swarm Intelligence. IEEE Consum. Electron. Mag. 2020, 9, 52–56.
 Hu, J.; Wu, H.; Zhan, R.; Rafik, M.; Zhou, X. Self-organized search-attack mission planning for UAV swarm based on wolf pack hunting behavior. J. Syst. Eng. Electron. 2021, 32, 1463–1476.
 Xu, X.; Zhao, H.; Yao, H.; Wang, S. A Blockchain-Enabled Energy-Efficient Data Collection System for UAV-Assisted IoT. IEEE Internet Things J. 2021, 8, 2431–2443.
 Cheng, N.; Wu, S.; Wang, X.; Yin, Z.; Li, C.; Chen, W.; Chen, F. AI for UAV-Assisted IoT Applications: A Comprehensive Review. IEEE Internet Things J. 2023, 10, 14438–14461.
 Kosmerl, J.; Vilhar, A. Base stations placement optimization in wireless networks for emergency communications. In Proceedings of the 2014 IEEE International Conference on Communications Workshops (ICC), Sydney, Australia, 10–14 June 2014; pp. 200–205.
 Zeng, Y.; Zhang, R.; Lim, T.J. Wireless communications with unmanned aerial vehicles: Opportunities and challenges. IEEE Commun. Mag. 2016, 54, 36–42.
 Wang, J.; Jiang, C.; Han, Z.; Ren, Y.; Maunder, R.G.; Hanzo, L. Taking drones to the next level: Cooperative distributed unmanned aerial vehicular networks: Small and mini drones. IEEE Trans. Veh. Technol. 2016, 12, 73–82.
 Pikner, I.; Sivasundaram, S. New Approaches to the Development and Employment of the UAV. AIP Conf. Proc. 2012, 1493, 752–757.
 Jakaria, A.H.M.; Rahman, M.A.; Asif, M.; Khalil, A.A.; Kholidy, H.A.; Anderson, M.; Drager, S. Trajectory Synthesis for a UAV Swarm Based on Resilient Data Collection Objectives. IEEE Trans. Netw. Serv. Manag. 2023, 20, 138–151.
 Ding, T.; Liu, N.; Yan, Z.M.; Liu, L.; Cui, L.Z. An efficient reinforcement learning game framework for UAV-enabled wireless sensor network data collection. J. Comput. Sci. Technol. 2022, 37, 1356–1368.
 Caposciutti, G.; Bandini, G.; Marracci, M.; Buffi, A.; Tellini, B. Capacity Fade and Aging Effect on Lithium Battery Cells: A Real Case Vibration Test with UAV. IEEE J. Miniaturization Air Space Syst. 2021, 2, 76–83.
 Pan, Y.; Chen, Q.; Zhang, N.; Li, Z.; Zhu, T.; Han, Q. Extending Delivery Range and Decelerating Battery Aging of Logistics UAVs Using Public Buses. IEEE Trans. Mob. Comput. 2023, 22, 5280–5295.
 Youn, W.; Choi, H.; Cho, A.; Kim, S.; Rhudy, M.B. Accelerometer Fault-Tolerant Model-Aided State Estimation for High-Altitude Long-Endurance UAV. IEEE Trans. Instrum. Meas. 2020, 69, 8539–8553.
 Pan, H.; Liu, Y.; Sun, G.; Fan, J.; Liang, S.; Yuen, C. Joint power and 3D trajectory optimization for UAV-enabled wireless powered communication networks with obstacles. IEEE Trans. Commun. 2023, 71, 2364–2380.
 Zhang, S.; Zeng, Y.; Zhang, R. Cellular-Enabled UAV Communication: A Connectivity-Constrained Trajectory Optimization Perspective. IEEE Trans. Commun. 2019, 67, 2580–2604.
 Li, B.; Qi, X.; Yu, B.; Liu, L. Trajectory Planning for UAV Based on Improved ACO Algorithm. IEEE Access 2020, 8, 2995–3006.
 Tang, Q.; Liu, L.; Jin, C.; Wang, J.; Liao, Z.; Luo, Y. An UAV-assisted mobile edge computing offloading strategy for minimizing energy consumption. Comput. Netw. 2022, 207, 108857.
 Wang, Y.; Gao, Z.; Zhang, J.; Cao, X.; Zheng, D.; Gao, Y.; Ng, D.W.K.; Di Renzo, M. Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach. IEEE Internet Things J. 2022, 9, 3899–3912.
 Zhang, B.; Liu, C.H.; Tang, J.; Xu, Z.; Ma, J.; Wang, W. Learning-Based Energy-Efficient Data Collection by Unmanned Vehicles in Smart Cities. IEEE Trans. Ind. Inform. 2018, 14, 1666–1676.
 Bayerlein, H.; Theile, M.; Caccamo, M.; Gesbert, D. UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach. In Proceedings of the GLOBECOM 2020—2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–6.
 Zeng, Y.; Xu, X. Path design for cellular-connected UAV with reinforcement learning. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019; pp. 1–6.
 Khamidehi, B.; Sousa, E.S. A double Q-learning approach for navigation of aerial vehicles with connectivity constraint. In Proceedings of the ICC 2020—2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
 Yin, S.; Zhao, S.; Zhao, Y.; Yu, F.R. Intelligent Trajectory Design in UAV-Aided Communications with Reinforcement Learning. IEEE Trans. Veh. Technol. 2019, 68, 8227–8231.
 Esrafilian, O.; Bayerlein, H.; Gesbert, D. Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6.
 Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with deep reinforcement learning. arXiv 2013, arXiv:1312.5602.
 Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence 2016, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
 Wei, X.; Cai, L.; Wei, N.; Zou, P.; Zhang, J.; Subramaniam, S. Joint UAV Trajectory Planning, DAG Task Scheduling, and Service Function Deployment Based on DRL in UAV-Empowered Edge Computing. IEEE Internet Things J. 2023, 10, 12826–12838.
 Theile, M.; Bayerlein, H.; Nai, R.; Gesbert, D.; Caccamo, M. UAV Path Planning using Global and Local Map Information with Deep Reinforcement Learning. In Proceedings of the 2021 20th International Conference on Advanced Robotics (ICAR), Ljubljana, Slovenia, 6–10 December 2021; pp. 539–546.
 Hou, X.; Liu, F.; Wang, R.; Yu, Y. A UAV Dynamic Path Planning Algorithm. In Proceedings of the 2020 35th Youth Academic Annual Conference of Chinese Association of Automation (YAC), Zhanjiang, China, 16–18 October 2020; pp. 127–131.
 Pehlivanoğlu, Y.V.; Bekmezci, İ.; Pehlivanoğlu, P. Efficient Strategy for Multi-UAV Path Planning in Target Coverage Problems. In Proceedings of the 2022 International Conference on Theoretical and Applied Computer Science and Engineering (ICTASCE), Ankara, Turkey, 29 September–1 October 2022; pp. 110–115.
Parameters  Value 

$\mathcal{N}$  3 
$\mathcal{J}$  3 
$B$  100 MHz 
$P$  43 W 
${P}_{j}$  [10, 50] 
${\sigma}^{2}$  −60 dBm/Hz 
${\beta}_{LoS}$  −30 dB 
${\beta}_{NLoS}$  −35 dB 
${\eta}_{LoS}$  1.41 [28] 
${\eta}_{NLoS}$  2.23 [28] 
${\alpha}_{LoS}$  −2.5 [28] 
${\alpha}_{NLoS}$  −3.04 [28] 
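As a hedged illustration of how such LoS/NLoS parameters are commonly combined in the UAV air-to-ground literature, the sketch below computes a channel gain and SNR from the table's values. The mapping of $\beta$ (reference gain at 1 m), $\alpha$ (path-loss exponent), and $\eta$ (excess attenuation) onto the paper's exact channel model in Section 2.1 is an assumption, not the paper's formulation:

```python
import math

# Hedged sketch of a generic air-to-ground channel using the table's
# parameters. The exact formula in the paper's Section 2.1 may differ;
# this follows one common parameterization from the literature [28].

BETA_LOS_DB, BETA_NLOS_DB = -30.0, -35.0   # reference gains at 1 m (dB)
ALPHA_LOS, ALPHA_NLOS = -2.5, -3.04        # path-loss exponents
ETA_LOS_DB, ETA_NLOS_DB = 1.41, 2.23       # excess attenuation (dB, assumed)

def channel_gain_db(distance_m, los=True):
    """Channel gain in dB: beta + 10*alpha*log10(d) - eta (assumed form)."""
    if los:
        return BETA_LOS_DB + 10 * ALPHA_LOS * math.log10(distance_m) - ETA_LOS_DB
    return BETA_NLOS_DB + 10 * ALPHA_NLOS * math.log10(distance_m) - ETA_NLOS_DB

def snr_db(p_tx_w, distance_m, noise_dbm_hz=-60.0, bandwidth_hz=100e6, los=True):
    """Received SNR in dB for transmit power p_tx_w, using the table's
    noise density and bandwidth (sketch only)."""
    p_tx_dbm = 10 * math.log10(p_tx_w * 1000)            # W -> dBm
    noise_dbm = noise_dbm_hz + 10 * math.log10(bandwidth_hz)
    return p_tx_dbm + channel_gain_db(distance_m, los) - noise_dbm
```

Under this assumed form, the NLoS gain falls off faster with distance than the LoS gain, matching the larger magnitude of ${\alpha}_{NLoS}$ in the table.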
Parameters  Value 

Learning rate $\alpha $  0.0004 
Discount factor $\gamma $  0.99 
$\epsilon $  0.01 
${N}_{freq}$  100 
Minibatch size  32 
Iterations  80,000 
Method  3D+5O  5D+5O  3D+8O 

Q-learning  1.34460  1.28929  1.39889 
PPO  1.03520  1.63133  1.27002 
DQN  1.75778  4.26592  1.80001 
Dueling DQN  2.05745  4.35702  2.13843 
DDQN  1.63612  3.95510  1.71541 
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, S.; Qi, N.; Jiang, H.; Xiao, M.; Liu, H.; Jia, L.; Zhao, D. Trajectory Planning for UAV-Assisted Data Collection in IoT Network: A Double Deep Q Network Approach. Electronics 2024, 13, 1592. https://doi.org/10.3390/electronics13081592