Open Access Article

Peer-Review Record

Research on Scheme Design and Decision of Multiple Unmanned Aerial Vehicle Cooperation Anti-Submarine Based on Knowledge-Driven Soft Actor-Critic

Appl. Sci. 2023, 13(20), 11527; https://doi.org/10.3390/app132011527

by Xiaoyong Zhang, Wei Yue * and Wenbin Tang

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 25 September 2023 / Revised: 14 October 2023 / Accepted: 19 October 2023 / Published: 20 October 2023

(This article belongs to the Special Issue Intelligent Control of Unmanned Aerial Vehicles)

Round 1

Reviewer 1 Report

This article deals with an important and interesting topic: using UAVs and AI approaches to support security/defense actions. Specifically, it presents a knowledge-driven cooperative framework to support anti-submarine missions. It uses a reinforcement learning algorithm called Knowledge Driven Soft Actor-Critic (KD-SAC) consisting of a UAV Group Search Knowledge Base (UGSKB) and Rule-based Deductive Inference Return Visit (RDIRV).

To support the novelty of the proposed framework, I suggest that authors create a table comparing the related work to their proposal at the end of the Related Work section.

The problem description is at a suitable level of detail and formalism, presenting the mathematical models and the concepts involved. Figure 2 shows the workflow of the framework at a suitable level of detail. Figure 4 further details the part of the process related to path planning.

Scenarios were prepared/parametrized to carry out tests using UAVs and subsequently, the results were presented and analyzed.

In Figures 8, 9, and 10, why compare just Scenarios 2 and 3? Note that in lines 495 and 496 you affirmed: “The comparison of success rates among the four algorithms in three scenarios is illustrated in Figure 8”. For each one of the mentioned figures each of the compared algorithm’s curves can be seen, but there are just 2 scenarios. Did the proposed approach not perform well in scenarios 1 and 4? Please, provide graphs and details with performance comparisons for the four planned scenarios.

Another thing to be considered about your performance analysis: did you look for other similar/related work results to compare them? This type of comparison is important to further validate what you are proposing. If this has not been done, I suggest considering it for future work and making a comment about it.

In Section 6 there is just a brief discussion about the work. This section deserves to be expanded detailing Theoretical and Practical Implications, both in dedicated subsections and a final subsection for Limitations and Difficulties faced during the research.

Please, create a final section (7) for conclusions or final considerations, also talking about future work, expanding what you already mentioned in the last part of the current section 6 paragraphs: “In the next step, we will use the knowledge-driven search path planning system framework to solve existing problems”. My suggestion here is to use the current section 6 text for the conclusive section (7), expand future work, and develop a new section 6 keeping it as a Discussion but adding the subsection I previously commented on.

In general, I reiterate that this is an interesting work that, with a little more work, will be ready.

Author Response

To Reviewer 1#:

Special thanks for your comments. We have revised the manuscript as you requested and marked the modifications in green.

Q1: In Figures 8, 9, and 10, why compare just Scenarios 2 and 3? Note that in lines 495 and 496 you affirmed: “The comparison of success rates among the four algorithms in three scenarios is illustrated in Figure 8”. For each one of the mentioned figures each of the compared algorithm’s curves can be seen, but there are just 2 scenarios. Did the proposed approach not perform well in scenarios 1 and 4? Please, provide graphs and details with performance comparisons for the four planned scenarios.

Reply: Thanks for your suggestion. We apologize for the confusion caused by our incorrect statement. We added a note in Section 5.1.1 and corrected the inaccurate statements in the simulation experiment. The paper provides only three simulation scenarios. Figures 8, 9 and 10 are used to assess the overall capabilities of the four algorithms in Scenario 2 and Scenario 3. The purpose of Scenario 1 is to validate the efficacy of the event-triggering strategy proposed in this paper. The UAV can select its search mode based on the fundamental search rules (refer to Table 1) to adapt to different cases, such as employing the spiral search or conducting a search along the central vertical axis. Specifically, the UAV applies different search rules from the knowledge base depending on the number of targets discovered, such as rule 1 when one target is detected and rule 2 when two targets are detected.
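
The event-triggered rule selection described in this reply can be sketched as a simple lookup. This is a minimal illustration, not the authors' implementation: the function name `select_search_rule`, the dictionary name, and the fallback for target counts other than one or two are assumptions. Only the mapping of one target to rule 1 and two targets to rule 2 comes from the reply; the content of each rule is defined in Table 1 of the paper and is not reproduced here.

```python
from typing import Optional

# Mapping stated in the reply: one detected target -> rule 1, two -> rule 2.
# The labels are placeholders for the rule definitions in Table 1 of the paper.
RULE_BY_TARGET_COUNT = {1: "rule 1", 2: "rule 2"}


def select_search_rule(targets_detected: int) -> Optional[str]:
    """Return the knowledge-base rule for the number of detected targets.

    Returns None when the reply states no rule for that count
    (a hypothetical fallback, not taken from the paper).
    """
    if targets_detected < 0:
        raise ValueError("targets_detected must be non-negative")
    return RULE_BY_TARGET_COUNT.get(targets_detected)
```

A caller would treat a `None` result as "no event triggered" and keep the current search mode; how the paper handles that case is not stated in the reply.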

Q2: Another thing to be considered about your performance analysis: did you look for other similar/related work results to compare them? This type of comparison is important to further validate what you are proposing. If this has not been done, I suggest considering it for future work and making a comment about it.

Reply: Thanks for your comment. The present paper introduces a knowledge-driven collaborative anti-submarine strategy for multiple UAVs. In addition to an automated decision mode that applies search rules based on different situations, we propose a path planning strategy (KD-SAC) that uses the SAC approach to enhance the knowledge base. Because little literature is available on establishing a search knowledge base for anti-submarine tasks, we focus solely on comparing path planning strategies. Specifically, in Section 5.1.2, we present the comparative results between KD-SAC, RI-MAC from [21], PSO from [22] and SAC from [23]. The four algorithms are then analyzed from three perspectives: the success rate of path planning, cooperation, and average reward. Detailed results are presented in Section 5.2.

Q3: In Section 6 there is just a brief discussion about the work. This section deserves to be expanded detailing Theoretical and Practical Implications, both in dedicated subsections and a final subsection for Limitations and Difficulties faced during the research.

Reply: Thanks for your suggestion. In Section 6, we summarize the proposed strategies based on the experimental results. Additionally, we provide a detailed explanation of KD-SAC’s mechanism, as well as the limitations encountered during the research process. The supplementary information is highlighted in green.

Q4: Please, create a final section (7) for conclusions or final considerations, also talking about future work, expanding what you already mentioned in the last part of the current section 6 paragraphs: “In the next step, we will use the knowledge-driven search path planning system framework to solve existing problems”. My suggestion here is to use the current section 6 text for the conclusive section (7), expand future work, and develop a new section 6 keeping it as a Discussion but adding the subsection I previously commented on.

Reply: Thanks for your comment. We have added a description of future work in Section 7 on page 23 of the paper. It mainly includes the following four parts: (i) The next step involves validating the proposed strategy's path planning and search capabilities in a larger cluster of UAVs and a more complex environment that includes dynamic obstacles and irregular obstacles with overlapping polygons. (ii) The impact of the environment on sensor performance will be further investigated. (iii) We will develop specific search rules for different formations to enhance the search efficiency of UAVs upon locating the initial target. (iv) The knowledge-driven search path planning system framework will be used to address existing anti-submarine problems.

Reviewer 2 Report

The authors propose a reinforcement learning algorithm called KD-SAC that interacts with real-time environmental information to enhance the search capabilities of multiple UAV groups. By comparing it with several existing algorithms, the authors have shown the effectiveness of their algorithm across different test scenarios. Generally, the paper is well structured and designed, and the results obtained are interesting. I have the following comments:

1) Please mention the main findings and results of KD-SAC in the Abstract section.

2) English should be revised. For instance, “motive”-> “motivation”, “Relate Work” -> “Related Work”, etc.

3) I miss a study for the internal and external factors that affect the UAV motion in the sub-marine environment. Did you consider the ideal scenario in your work?

4) Figure sizes should be increased to make them more visible.

5) Give more explanation for the variables and equations in the pseudocode of KD-SAC.

6) Please justify the selection of the threshold values used in Table 3.

7) The authors consider a small number of UAVs in the simulation. Please explain what happens in the case of dense UAV deployment.

8) In Figures 5 to 11, the authors write what they observe in the figures without giving any interpretation of the obtained results. Why did KD-SAC outperform the other algorithms?

9) Replace the Discussion section with a Conclusion section.

Moderate editing of English language required

Author Response

To Reviewer 2#:

Special thanks for your comments. We have revised the manuscript as you requested and marked the modifications in cyan.

Q1: Please mention the main findings and results of KD-SAC in the Abstract section.

Reply: Thanks for your comment. We have added the experimental results of KD-SAC to the abstract. The supplementary content is as follows: The final results demonstrate that the proposed method achieves a success rate of 73.63% in multi-UAV flight path planning within complex environments, surpassing the other three algorithms by 17.27%, 29.88%, and 33.51% respectively. In addition, the KD-SAC algorithm outperforms the other three algorithms in terms of synergy and average search reward.

Q2: English should be revised. For instance, “motive”-> “motivation”, “Relate Work” -> “Related Work”, etc.

Reply: Thanks for your suggestion. We have carefully checked and corrected grammatical errors and typos throughout the manuscript.

Q3: I miss a study for the internal and external factors that affect the UAV motion in the sub-marine environment. Did you consider the ideal scenario in your work?

Reply: Thanks for your comment. For the underwater target search process, we describe the underwater target as a moving target whose motion state is completely unknown. When the water target is found, UAVs initiate a search for the underwater target in the vicinity of the water target, employing predefined search rules from the knowledge base. In addition, the performance of the sensor employed for underwater target detection in this paper is ideal, meaning that when an underwater target appears within the sensor's detection range, it is considered to be successfully detected. In Section 3.2, the underwater target detection sensor is utilized as an action template to establish the UAV's event attributes and associated conditions. In future work, the impact of the environment on sensor performance will be further investigated.
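
The ideal-sensor assumption in this reply amounts to a deterministic range check. The sketch below illustrates that reading only; the function name `is_detected`, the use of planar coordinates, and a circular detection footprint measured by Euclidean distance are assumptions, since the reply does not specify the sensor geometry.

```python
import math
from typing import Sequence


def is_detected(
    uav_xy: Sequence[float],
    target_xy: Sequence[float],
    detection_range: float,
) -> bool:
    """Ideal sensor: a target inside the detection range is always detected.

    Assumes a circular footprint and Euclidean distance in the plane
    (hypothetical geometry; the reply only states the certainty of detection).
    """
    if detection_range < 0:
        raise ValueError("detection_range must be non-negative")
    return math.dist(uav_xy, target_xy) <= detection_range
```

A non-ideal sensor, which the authors list as future work, would replace the boolean with a detection probability that depends on distance and environmental conditions.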

Q4: Figure sizes should be increased to make them more visible.

Reply: Thanks for your suggestion. The sizes of the figures have been adjusted, especially in Figures 8, 9 and 10.

Q5: Give more explanation for the variables and equations in the pseudocode of KD-SAC.

Reply: Thanks for your comment. The variables and formulas in KD-SAC's pseudo-code have been further elucidated, as outlined in Table 2 on page 13.

Q6: Please justify the selection of the threshold values used in Table 3.

Reply: Thanks for your suggestion. The experimental introduction in Section 5 now describes the pertinent parameters, which are presented in Table 3. This paper is inspired by [23], and the KD-SAC strategy is proposed on the basis of the knowledge base. To verify and compare the learning performance of KD-SAC and SAC, we use the same simulation hyper-parameter values as in [23], as shown in Table 3. In addition, to accelerate the learning of KD-SAC, we increase the value of the learning rate parameter.

Q7: The authors consider a small number of UAVs in the simulation. Please explain what happens in the case of dense UAV deployment.

Reply: Thanks for your comment. The collaboration between UAVs in this paper primarily takes the form of a jointly maintained knowledge base. Each UAV conducts path planning and self-decision-making based on either the KD-SAC algorithm or the shared knowledge base, and subsequently enriches the latter with its optimal decisions. Therefore, when operating on a large scale, the knowledge base will expand faster as the number of UAVs grows, and the number of action templates available to the UAVs will increase. Large-scale UAV clusters will also give rise to challenges such as collision avoidance and cooperative communication. These aspects will be the focus of our future research. We sincerely appreciate your valuable suggestion, which has served as a source of inspiration for our future work.
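
The jointly maintained knowledge base described in this reply can be pictured with a small sketch. Everything here is illustrative: the class `SharedKnowledgeBase`, the function `decide`, and the `planner` callable (a stand-in for a KD-SAC policy) are hypothetical names, and keying templates by an event identifier is an assumption about how event attributes are matched.

```python
from typing import Callable, Dict, Hashable, Optional


class SharedKnowledgeBase:
    """Action templates shared by all UAVs, keyed by an event identifier."""

    def __init__(self) -> None:
        self._templates: Dict[Hashable, str] = {}

    def lookup(self, event: Hashable) -> Optional[str]:
        """Return the stored action template for this event, if any."""
        return self._templates.get(event)

    def contribute(self, event: Hashable, action: str) -> None:
        """Store a UAV's decision as a template for later reuse."""
        self._templates[event] = action

    def __len__(self) -> int:
        return len(self._templates)


def decide(
    kb: SharedKnowledgeBase,
    event: Hashable,
    planner: Callable[[Hashable], str],
) -> str:
    """Reuse a stored template when one matches; otherwise plan and share."""
    action = kb.lookup(event)
    if action is None:
        # No template yet: fall back to the planner (placeholder for KD-SAC)
        # and add the resulting decision to the shared knowledge base.
        action = planner(event)
        kb.contribute(event, action)
    return action
```

Because every UAV writes to the same store, each additional UAV that encounters a new event adds a template the others can reuse, which is the growth effect the reply describes. Concurrent access and communication limits, which the authors flag as open challenges, are not modeled.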

Q8: In Figures 5 to 11, the authors write what they observe in the figures without giving any interpretation of the obtained results. Why did KD-SAC outperform the other algorithms?

Reply: Thanks for your suggestion. We have added explanatory notes (marked in cyan) above Figure 5, Figure 7, Figure 8, Figure 9, and Figure 11 to explain the expanding and calling mechanisms of KD-SAC. These notes clarify why KD-SAC outperforms the other algorithms.

Q9: Replace the Discussion section with a Conclusion section.

Reply: Thanks for your comment. The strategies presented in this article are summarized in Section 6, highlighted in green. Additionally, we have moved the original discussion of future work from Section 6 to Section 7. In Section 7, drawing on your insights, we state that future work will explore the performance of KD-SAC in large-scale UAV swarm anti-submarine missions, while also considering the influence of environmental factors on sensor capabilities.

Reviewer 3 Report

The paper introduces an approach, the Knowledge Driven Soft Actor-Critic (KD-SAC) algorithm, designed to enhance the anti-submarine and search capabilities of multiple Unmanned Aerial Vehicle (UAV) groups in complex marine environments. The algorithm incorporates two key components: the UAV Group Search Knowledge Base (UGSKB) and path planning strategy. The UGSKB serves as a continually updated database containing valuable information for decision-making, including prior data, search rules, real-time detection details, and historical detection information.

The proposed cooperation search framework for multiple UAVs is structured into three layers of information models. The data layer provides essential background information and fundamental search rules. The knowledge layer enriches these rules and the database during ongoing search processes. The decision layer utilizes the information from the previous two layers to facilitate autonomous decision-making by UAVs.

Additionally, the Rule-based Deductive Inference Return Visit (RDIRV) strategy is introduced to continuously expand the UGSKB, enabling the generation of new knowledge and search rules by storing optimal decisions as exemplary search cases in the cognitive layer. The paper also outlines an event-based UGSKB calling mechanism in the decision-making layer to select appropriate action templates from the knowledge base based on matching event attributes and tasks. Real-time adjustments to UAV actions and states are achieved through interactions with the environment using the SAC algorithm to generate optimal decisions.

The paper presents an approach to improving UAV search capabilities, backed by experimental comparisons with alternative methods. The proposed KD-SAC algorithm showcases promise for enhancing anti-submarine and search operations in complex marine environments. Further validation and testing will be crucial to confirm its practicality and superiority in real-world scenarios. The contribution is not so clear. The paper is not well-written. Many typos are found. These should be revised again. The literature review should be substantially expanded.

Overall, the paper attempts to offer insights into the development of advanced search and decision-making algorithms for UAVs, with potential applications in critical marine operations. However, the quality of the paper is not yet sufficient. Therefore, I recommend that this paper be completely revised and resubmitted for review.

Author Response

To Reviewer 3#:

Special thanks for your comments. We have revised the manuscript as you requested and marked the modifications in yellow.

Q1: The proposed KD-SAC algorithm showcases promise for enhancing anti-submarine and search operations in complex marine environments. Further validation and testing will be crucial to confirm its practicality and superiority in real-world scenarios. The contribution is not so clear. The paper is not well-written. Many typos are found. These should be revised again.

Reply: Thanks for your suggestion. We have thoroughly reviewed and corrected grammatical errors and spelling mistakes throughout the manuscript. In the abstract, we have distilled our contribution by eliminating redundant content from the previous version, and have added an analysis of experimental data to substantiate the superiority of the proposed strategy. In addition, in Section 1.1 we elaborate on the contributions of this paper and mark them in yellow. In the simulation experiment section, we compare four algorithms under distinct scenarios. The analysis of the experimental results confirms KD-SAC's capability in path planning and cooperative anti-submarine missions. Furthermore, Section 5.3 compares KD-SAC and SAC to validate KD-SAC's learning ability, where SAC is a strategy-free artificial intelligence algorithm.

Round 2

Reviewer 2 Report

The authors answered all my comments. I recommend the manuscript for publication.

Minor editing of English language required

Author Response

To Reviewer 2#:

Special thanks for your comments. We have revised the manuscript as you requested.

Q1: Minor editing of English language required.

Reply: Thanks for your suggestion. We have thoroughly reviewed and rectified grammatical errors and spelling mistakes throughout the manuscript.

Author Response File: Author Response.pdf

Reviewer 3 Report

The paper delineates the scheme design and decision-making process for a multi-UAV system, incorporating knowledge-driven algorithms. However, it falls short in providing adequate detail concerning the specific algorithms, models, or methodologies utilized. To bolster the paper's credibility and reproducibility, it is imperative to furnish more comprehensive information about the technical facets of the knowledge-driven system, including the underlying AI or machine learning techniques. Regrettably, the revisions made to the paper have not addressed these concerns and the previous issues adequately. I would recommend a substantial rewriting of the paper prior to submission to another venue.

Author Response

To Reviewer 3#:

Special thanks for your comments. We have revised the manuscript as you requested and marked the modifications in cyan.

Q1: The paper delineates the scheme design and decision-making process for a multi-UAV system, incorporating knowledge-driven algorithms. However, it falls short in providing adequate detail concerning the specific algorithms, models, or methodologies utilized. To bolster the paper's credibility and reproducibility, it is imperative to furnish more comprehensive information about the technical facets of the knowledge-driven system, including the underlying AI or machine learning techniques. Regrettably, the revisions made to the paper have not addressed these concerns and the previous issues adequately. I would recommend a substantial rewriting of the paper prior to submission to another venue.

Reply: Thanks for your suggestion. The specific algorithms and models presented in this paper have been supplemented and highlighted in cyan. In Section 2, we primarily propose two models: (1) the cooperative search framework (Section 2.3), for which a detailed description of the three-layer structure and initial search rules is provided; (2) the knowledge base model (Section 2.4), which elaborates on the introduction of multiple variables in the mapping space. Section 3 introduces the process of knowledge-driven autonomous decision-making of UAVs; its initial paragraph provides a comprehensive description of the entire working mechanism of RDIRV. Section 3.2 further illustrates the event-based template invocation mechanism through relevant examples. In Section 4, we explain how KD-SAC determines the optimal strategy using formulas (7)-(15). This section also elaborates on the advantages offered by KD-SAC.

Author Response File: Author Response.pdf
