Article
Peer-Review Record

Imbalanced Underwater Acoustic Target Recognition with Trigonometric Loss and Attention Mechanism Convolutional Network

Remote Sens. 2022, 14(16), 4103; https://doi.org/10.3390/rs14164103
by Yanxin Ma 1,2, Mengqi Liu 3,*, Yi Zhang 2,3, Bingbing Zhang 1,2, Ke Xu 4, Bo Zou 5 and Zhijian Huang 6
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 13 July 2022 / Revised: 8 August 2022 / Accepted: 19 August 2022 / Published: 21 August 2022
(This article belongs to the Special Issue Advancement in Undersea Remote Sensing)

Round 1

Reviewer 1 Report

This paper studies imbalanced datasets for underwater acoustic target recognition. To address the imbalance and poor robustness, two strategies are proposed for learning distinguishing features: (i) an attention mechanism is used to fuse the learned multi-scale features, and the attention-based fusion module highlights the dominant features to suppress high-intensity noise; (ii) a trigonometric cos x weighted cross entropy loss function (CFWCEL) is designed to deal with the imbalanced data. As a result, a multi-scale residual convolutional neural network with an embedded attention mechanism (named MR-CNN-A) and CFWCEL is proposed for target recognition. CFWCEL adds an impact factor to the standard cross entropy loss according to the predicted probability of each sample.
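
The idea of an impact factor on the cross entropy can be illustrated with a minimal sketch. This record does not give the CFWCEL formula, so the weight cos(a·p)^r below is an assumption, chosen only because it shrinks as the predicted probability p of the true class grows. The function names are hypothetical, and `a` and `r` merely borrow the symbols mentioned in the reviewer comments.

```python
import math

def standard_ce(p_true):
    """Standard cross entropy for one sample with true-class probability p_true."""
    return -math.log(p_true)

def cos_weighted_ce(p_true, a=math.pi / 2, r=2):
    """Cross entropy scaled by an ASSUMED cosine impact factor cos(a * p)**r.

    This is an illustrative form only, not the formula from the paper.
    With a = pi/2 the factor stays in [0, 1) for p_true in (0, 1].
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    weight = math.cos(a * p_true) ** r  # close to 1 for hard samples, close to 0 for easy ones
    return weight * standard_ce(p_true)

for p in (0.1, 0.5, 0.9):
    print(f"p={p:.1f}  CE={standard_ce(p):.4f}  weighted={cos_weighted_ce(p):.4f}")
```

Under this assumed form, well-classified samples (p close to 1) contribute almost nothing to the loss, so the poorly classified samples dominate the gradient.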

The following shortcomings and omissions should be pointed out:

1. The English must be improved and carefully checked throughout the text, because there are many errors. For example:

(Lines 126-127) “…Mel-Frequency Cepstral Coefficients (MFCC) [18-20], which has…”

(Line 242) “…expermrnts…”

2. Fig. 2 should be larger, because the labels are very small.

3. Is there permission for Fig. 3?

4. What does “relu” mean in Fig. 3?

5. (Lines 242-243) Why is r = 2 the most suitable value for your imbalanced data recognition?

Results are given only for r = 1 and r = 2. How does r = 3 compare?

6. Figs. 4(a-c) must be larger. Which curve in Fig. 4(b) corresponds to the case r = 2, a = π/2, marked by a point in the figure legend?

7. There are 6 curves in Fig. 4(c), but the figure legend lists 7 curves. Color and dashed lines alone are insufficient for understanding Figs. 4(b, c).

8. The conclusion should be expanded to account for specific results of the paper.

9. The reference list should be revised according to the MDPI rules for references (fonts).

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors consider the problem of data imbalance in underwater target recognition using acoustic methods. They propose an approach for overcoming the consequences of imbalance and demonstrate its effectiveness. The paper is interesting for a broad community connected to underwater monitoring. The only considerable shortcoming is the very clumsy English. In my opinion, the paper can be published in Remote Sensing after the linguistic quality is significantly improved.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

+ In general, my recommendation is that the paper needs to be reworded in several paragraphs/pages. The authors use the same word several times in a single row/column (e.g. lines 251, 252, 253… "Figure 4"; lines 207-208 “is shown in Figure 3. As shown in Figure 3,”). It is pretty boring to read the obvious over and over again. This denotes a lack of knowledge of technical writing in English.

 

+ Please check your typos. Some examples are:

 

Figure 1. “Signal acquisitio" -> Signal acquisition

 

Figure 1. Reverse the arrow between signal acquisition and passive sonar

 

Line 389: “EQ.(2). where” w->W

 

Why are the Figure and Table references in your text in bold?

 

Increase the size of Figure 2. I cannot clearly read the text inside.

 

Line 140: orphan title (line)

 

After reading line 213, a reader cannot understand what g(dot) is. Remember that this paragraph explains Figure 3.

 

Does cos x mean the cosine function? (Line 112) Why is it in bold?

 

In most of the lines, the variables, functions and other elements referring to the images are not aligned with the text baseline. See lines 194-200.

 

Equation 1 and line 234. If you write “where” with a lower-case ‘w’, you must add a comma after the equation. If you do not add a comma after the equation, you must write “Where” (upper case).

 

Line 242… “expermrnts”

 

Line 243. Why do you write equation 1 again just to include the 2?

 

Lines 237 and 238… why is “a” not the same size in both lines? Are you copying and pasting images of "a"?

 

In general, all figures look horrible (poor resolution). If you zoom in on the images, you will see JPEG compression loss. Do not use bitmap formats (see the steps in the lines of Figure 5); use vector graphics.

 

 

Experiments

 

At first glance:

 

A reader can easily get lost in your explanation. When you read a paper with comparisons, you expect the classic “literature vs. ours”. Although you cite 28 literature references, none of them is used for comparison.

 

Mainly:

 

A reader who looks at Table 1 thinks “OK, SVM... SVM?” The first place in your paper where this appears is line 310. What are the conditions and the implementation used (where is its reference)? And of course, under some unexplained conditions, DEMON is better for SVM and MFCC for your proposal... No, for Base-CNN. Where is the literature reference for Base-CNN?

 

Then, in Table 2, your proposal is MR-CNN-A and the literature approaches are SVM and Base-CNN. A reader can see in this table that your model size is 340 times greater than that of the SVM. The model size of Base-CNN is also 15 times smaller than yours.

 

Either the SVM is not comparable in terms of model size, or the SVM used is very badly dimensioned.

 

On the other hand, Table 2 shows that the CNN method has 10 times fewer parameters than your proposal, which also requires 10 times more computational effort just to increase the accuracy by 2.63%. Really? You must take into consideration that you are simulating both algorithms under certain conditions. In general, we can assume an error of ±2.5% in any simulation. You must improve your results or justify them better; otherwise, you cannot claim that a 2.63% gain in accuracy is a good deal for a 10-fold increase in computational effort.
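
The comparison in this comment can be written out as a short check. The numbers are the ones quoted in the report (a 2.63% gain and an assumed ±2.5% simulation error); treating the error as an interval around each method's accuracy is an assumption made here for illustration.

```python
gain = 2.63    # reported accuracy gain, in percentage points
margin = 2.5   # simulation error assumed in the report for each method, in points

# Two accuracies x and x + gain, each known only to within +/- margin,
# have overlapping intervals whenever gain < 2 * margin, whatever x is.
intervals_overlap = gain < 2 * margin

print(f"gain = {gain} points, combined margin = {2 * margin} points")
print(f"intervals overlap: {intervals_overlap}")
```

Under this reading, the reported gain lies inside the combined uncertainty of the two simulations, which is why the comment asks for a stronger justification.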

 

Figure 6 says “alpha” and shows numbers, while your text says “a” and pi/2. Choose one and use it consistently; do not change the nomenclature. In the same way, you use both “pi” and its Greek letter…

 

Looking at Figure 6, a reader can see that the curves show nothing clear. That is, the figure shows the variation of accuracy as a function of alpha. This variable ranges between 0 and 3; however, the accuracy changes overall from 0.935 to 0.982 (approx.), so the overall variation is <5%. Objectively, the curves for the two DI values used (0.14 and 0.53) have the same shape and the same maximum location (close to 1.5? Pi/2?), and the difference between them is close to 0.01, that is, 1%. At this point, a reader can conclude that the accuracy is independent of the DI.
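
The arithmetic behind this comment can be checked in a few lines. The values are the approximate figures the reviewer reads off Figure 6; the variable names are illustrative only.

```python
acc_min, acc_max = 0.935, 0.982  # approximate accuracy range read from Figure 6
di_gap = 0.01                    # approximate gap between the DI = 0.14 and DI = 0.53 curves

spread_points = (acc_max - acc_min) * 100              # spread in percentage points
spread_relative = (acc_max - acc_min) / acc_max * 100  # spread relative to the maximum

print(f"accuracy spread over alpha: {spread_points:.1f} points "
      f"({spread_relative:.1f}% of the maximum)")
print(f"gap between the two DI curves: {di_gap * 100:.0f} point")
```

Both quantities stay below 5%, which is consistent with the "<5%" figure stated in the comment.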

 

The main problem with the authors' approach is that its advantage over the other methods is small (2.63%). In addition, this small margin requires fine-tuning many parameters related to DI in order not to lose accuracy. The 10-fold computational effort required in comparison with other methods suggests that those methods, given similar computational effort, would probably achieve better accuracy than the proposed one.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have made the necessary improvements to the manuscript, and the paper can be accepted in this form.

Reviewer 3 Report

Thanks to the authors for their effort. No more comments from my side.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.

