Peer-Review Record

A Multi-Stream Attention-Aware Convolutional Neural Network: Monitoring of Sand and Dust Storms from Ordinary Urban Surveillance Cameras

Remote Sens. 2023, 15(21), 5227; https://doi.org/10.3390/rs15215227
by Xing Wang 1,2,3, Zhengwei Yang 1,*, Huihui Feng 4, Jiuwei Zhao 5, Shuaiyi Shi 6 and Lu Cheng 7
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 26 September 2023 / Revised: 26 October 2023 / Accepted: 28 October 2023 / Published: 3 November 2023

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript presents a study using surveillance cameras to monitor sand and dust storms (SDS). The authors propose a Multi-Stream Attention-aware Convolutional Neural Network (MA-CNN) that learns SDS image features at different scales and employs an attention mechanism to enhance detection performance. They construct a dataset of SDS and non-SDS scenes to train and test the MA-CNN. The experimental results show that the MA-CNN outperforms other deep learning models in terms of accuracy, precision, and F1 score. The authors conclude that surveillance cameras can effectively observe SDS events and provide a valuable supplement to existing SDS observation networks. However, the following concerns need to be addressed before the manuscript can be published:

 

Formatting:

1. In Line 242, "For a input" should be "For an input".

 

Content:

1. What computer configuration was used for the model's training and testing experiments?

2. Line 325 mentions "the MA-CNN model taking more extended time in the training process", so how does the time overhead of the MA-CNN algorithm compare to that of the other algorithms during real-world scenario testing? Could the authors expand the comparison to the testing process accordingly?

3. Line 318 of the manuscript mentions, "Comparing table 2 and table 3, the performance of VGG16, Mobile Net V2, Inception V3, and DenseNet121 algorithms has decreased after the attention layers were added." Why, then, did the authors choose the algorithms without attention mechanisms for comparison in the real-world scenario testing of Section 3.2? It is hoped that the authors will instead compare, in Section 3.2, those algorithms whose accuracy increased after the attention mechanism was added in the training process. Or, according to Tables 7 and 8, did the authors add the attention mechanism to all comparison algorithms? If so, could the addition of the attention mechanism adversely affect the accuracy of the algorithms mentioned earlier?

4. The paper lacks detailed implementation details, such as the architecture and hyperparameters of the MA-CNN model, making it difficult to reproduce the study. Could the authors provide more details about the architecture and hyperparameters of the MA-CNN model? This information would help in reproducing the study and understanding the design choices. Alternatively, the authors could consider open-sourcing the code on GitHub.

5. The study does not discuss the limitations or potential challenges of using surveillance cameras for SDS monitoring, such as the impact of environmental conditions (e.g., nighttime) on image quality. How could this problem be addressed?

Comments on the Quality of English Language

The authors still need to polish the English writing of this manuscript because there are some obvious grammar errors.

Author Response

Formatting:

  1. In Line 242, "For a input" should be "For an input".

Reply: Accepted and revised.

Content:

  1. What computer configuration was used for the model's training and testing experiments?

Reply: We have added the experimental environment in Section 2.3.1; please see lines 257-266.

 

  2. Line 325 mentions "the MA-CNN model taking more extended time in the training process", so how does the time overhead of the MA-CNN algorithm compare to that of the other algorithms during real-world scenario testing? Could the authors expand the comparison to the testing process accordingly?

Reply: Thank you for your suggestion. We have added more details about the time cost in the discussion (Section 4); please see lines 403-412.

 

  3. Line 318 of the manuscript mentions, "Comparing table 2 and table 3, the performance of VGG16, Mobile Net V2, Inception V3, and DenseNet121 algorithms has decreased after the attention layers were added." Why, then, did the authors choose the algorithms without attention mechanisms for comparison in the real-world scenario testing of Section 3.2? It is hoped that the authors will instead compare, in Section 3.2, those algorithms whose accuracy increased after the attention mechanism was added in the training process. Or, according to Tables 7 and 8, did the authors add the attention mechanism to all comparison algorithms? If so, could the addition of the attention mechanism adversely affect the accuracy of the algorithms mentioned earlier?

Reply: We have added the performance of the comparison algorithms with the attention mechanism added in Tables 7, 8, and 9; please see lines 366-382.
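For context, the sketch below shows one common way an attention layer can be appended to a backbone's feature maps, assuming a squeeze-and-excitation (SE) style channel-attention module. The record does not specify which attention variant the authors used, and the `ChannelAttention` class, the `reduction` ratio, and the 512-channel example are illustrative only.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention (illustrative; the actual attention
    layers added to the comparison backbones are not specified here)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # excitation: per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights  # reweight the backbone's feature maps

# Example: reweight a hypothetical 512-channel feature map from a backbone.
features = torch.randn(4, 512, 14, 14)
attended = ChannelAttention(512)(features)
```

Because such a module changes which channels dominate the downstream classifier, it is plausible for it to help some backbones and hurt others, which is the behavior the reviewer asks about.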

 

  4. The paper lacks detailed implementation details, such as the architecture and hyperparameters of the MA-CNN model, making it difficult to reproduce the study. Could the authors provide more details about the architecture and hyperparameters of the MA-CNN model? This information would help in reproducing the study and understanding the design choices. Alternatively, the authors could consider open-sourcing the code on GitHub.

Reply: The training details (learning rate, batch size, and training epochs) of MA-CNN have been added and are presented in Section 2.3.4. The dataset and code of this project will be available after 1 December 2023.
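As a purely illustrative sketch of how such training details fit together, the minimal loop below wires a learning rate, batch size, and epoch count into an optimizer and training step. The hyperparameter values, the stand-in classifier, and the synthetic batch are placeholders, not the MA-CNN settings reported in Section 2.3.4.

```python
import torch
import torch.nn as nn

# Placeholder hyperparameters -- NOT the values reported in Section 2.3.4.
learning_rate, batch_size, epochs = 1e-4, 32, 50

# Stand-in binary classifier (SDS vs. non-SDS); the real model is MA-CNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
loss_fn = nn.CrossEntropyLoss()

# Synthetic batch standing in for a DataLoader over the SDSI training split.
images = torch.randn(batch_size, 3, 224, 224)
labels = torch.randint(0, 2, (batch_size,))

for epoch in range(epochs):
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)  # forward pass + cross-entropy loss
    loss.backward()                        # backpropagate gradients
    optimizer.step()                       # update parameters
```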

 

  5. The study does not discuss the limitations or potential challenges of using surveillance cameras for SDS monitoring, such as the impact of environmental conditions (e.g., nighttime) on image quality. How could this problem be addressed?

Reply: Thank you for your suggestion. We have added this point in the discussion (Section 4); please see lines 399-429.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors,

 

Thank you for sending us your paper titled MA-CNN: Monitoring sand and dust storms from ordinary urban surveillance cameras. I have the following observations:

 

- References are not cited according to MDPI rules.

- Equations must be written in the equation editor.

- Figures 1 and 2 must have better size and quality to check the study scenarios. 

- What are the characteristics of your dataset? Dataset building should be another subsection. It is unclear how you built your dataset. What is the percentage of training and testing labels?

- What libraries, programming language (R, Python, or other), and IDE were used for your implementation?

- Results are not presented in a discussion section. What is the importance of your study in comparison with other studies?

- You show Table 7 with images; what does the color mean? There is no discussion about it.

- Some English sections must be improved.

 

Your result on distinguishing sand and dust seems interesting, but it is not described clearly in the manuscript. It would be best if you worked on the writing and order of your draft. I suggest checking the authors' section.

 

Thank you.


Author Response

Thank you for sending us your paper titled MA-CNN: Monitoring sand and dust storms from ordinary urban surveillance cameras. I have the following observations:

- References are not cited according to MDPI rules.

Reply: Accepted and revised.

- Equations must be written in the equation editor.

Reply: Accepted and revised.

- Figures 1 and 2 must have better size and quality to check the study scenarios.

Reply: Accepted and revised.

- What are the characteristics of your dataset? Dataset building should be another subsection. It is unclear how you built your dataset. What is the percentage of training and testing labels?

Reply: We have added more description of the dataset in Section 2.3.3; please see lines 273-292.

 

- What libraries, programming language (R, Python, or other), and IDE were used for your implementation?

Reply: We have added the experimental environment in Section 2.3.1; please see lines 257-266.

 

- Results are not presented in a discussion section. What is the importance of your study in comparison with other studies?

Reply: We have added this point in the discussion (Section 4); please see lines 398-429.

 

- You show Table 7 with images; what does the color mean? There is no discussion about it.

Reply: The legend of Table 7 (Table 10 in the revised manuscript) has been added. Please see lines 461 and 463.

 

- Some English sections must be improved.

Reply: We have used the MDPI English editing service to improve our manuscript.

 

Your result on distinguishing sand and dust seems interesting, but it is not described clearly in the manuscript. It would be best if you worked on the writing and order of your draft. I suggest checking the authors' section. Thank you.

Reply: Thank you for your suggestion. The authors believe that rapid change over time is an important distinction between SDS and similar weather events (e.g., fog, haze, and smoke). However, limited by the lack of sufficient data on other weather types for comparison, the manuscript focuses on analyzing the monitoring accuracy of the MA-CNN method for SDS in real monitoring scenarios, which we explain in the discussion section. Please see lines 421-429.

Reviewer 3 Report

Comments and Suggestions for Authors

I have several questions. The dataset has been divided into three subsets: training, validation, and test.

1. Could you give more details about the validation method used?

2. How has the system been tested on the test subset?

3. Are the results obtained from applying the algorithm to the training, the validation, or the test subset?

4. How could you justify that better results are obtained in the three different scenarios than on the subset with which the system was trained?

Author Response

I have several questions. The dataset has been divided into three subsets: training, validation, and test.

1. Could you give more details about the validation method used?

Reply: We have added more information in lines 294-298.

2. How has the system been tested on the test subset?

Reply: We split the SDSI dataset into training, test, and validation subsets (please see Table 1). The trained models were tested on the test subset using the same metrics as in the training stage (please see Eqs. (4), (5), and (6)).
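For reference, assuming Eqs. (4), (5), and (6) follow the standard definitions for a binary SDS/non-SDS classifier (the record itself does not reproduce them), with TP, TN, FP, and FN the true/false positives and negatives:

```latex
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},
\quad \text{where } \text{Recall} = \frac{TP}{TP + FN}.
```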

3. Are the results obtained from applying the algorithm to the training, the validation, or the test subset?

Reply: The results in Tables 2 and 3 were obtained on the test subset. We have added this in lines 288 and 306.

4. How could you justify that better results are obtained in the three different scenarios than on the subset with which the system was trained?

Reply: The SDS monitoring model employed in the real scenarios was trained on the SDSI dataset, yet it achieved better performance in the real scenarios than on the dataset itself. The authors believe this is because the selected real-world surveillance scenarios have a high degree of internal consistency, whereas the SDSI dataset contains complex monitoring scenarios, especially similar weather events (as described in Section 2.1, Figures 2 and 6). This is the main reason for the higher accuracy obtained in the real scenarios.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for your improvement

Comments on the Quality of English Language

Thank you for your improvement
