An Improved U-Net Model Based on Multi-Scale Input and Attention Mechanism: Application for Recognition of Chinese Cabbage and Weed
Round 1
Reviewer 1 Report
The authors propose an MSECA-Unet model for identifying weeds and vegetables, which has high application value. My specific comments are as follows.
1. The authors' motivation for choosing VGG16 should be introduced in Chapter 1.
2. There are many works that mix VGG16 with U-Net, and the authors should discuss the pros and cons of these works.
3. At the end of Chapter 1, the authors should briefly introduce the framework of the manuscript.
4. It is recommended that the parameters of the image acquisition equipment be displayed in a table.
5. A simple ablation experiment can show the effect of each module.
6. The conclusion should be extended with more future work.
7. Many grammar and syntax issues need correction.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The manuscript presents an improved U-Net model based on multi-scale input and attention mechanism for application in the recognition of Chinese cabbage and weed. In contrast to the original U-Net model and the current commonly used semantic segmentation models PSPNet and DeepLab V3+, the improved model has the best segmentation effect on the Chinese cabbage and weed, which can offer strong technical support for the development of intelligent spraying robots and intelligent weeding robots. It is a topic of interest to researchers in related areas. In general, the manuscript is well organized and its presentation is also good. However, the manuscript still needs some minor improvements before acceptance for publication. My detailed comments are as follows:
1. The expression "Application to semantic segmentation of Chinese cabbage and weed" in the title of the manuscript is not clear enough. Semantic segmentation is a relatively small concept that is not easy for the reader to understand, so I suggest changing it to "Application to recognition of Chinese cabbage and weed".
2. The references in the introduction are improperly cited. For example, Fully Convolutional Networks (FCN) are mentioned several times in the introduction without a reference. A reference to FCN should be added, e.g., Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015, 39, 640-651.
3. Some parts of the manuscript are not expressed rigorously enough. For example, of the four types of data enhancement used in Section 2.2, "a. Gaussian noise: The image is augmented with random Gaussian noise;" is formulated less rigorously than the other three. The manuscript should give the specific parameters of the randomly added Gaussian noise, e.g., its mean and variance.
4. Some of the references listed in the last part of the manuscript are in the wrong format, e.g., the name of the journal "Journal of Chinese Agricultural Mechanization" in reference 6 is not in the right format, and it should be italicized as "Journal of Chinese Agricultural Mechanization". In addition, the journal name "Ieee Robotics and Automation Letters" in reference 22 should be written in its abbreviated form "IEEE Robot. Autom. Lett." The journal name "Cogent Engineering" in reference 24 should also be written in its abbreviated form "Cogent Eng.". Therefore, please recheck the format of all your references, paying particular attention to the format and abbreviation of journal names.
5. There are some typos in the manuscript. It is noted that your manuscript needs to be carefully edited by someone with technical English editing expertise, paying particular attention to English grammar, spelling, and sentence structure so that the reader has a clearer understanding of the process, goals, and results of your study.
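Comment 3 above asks for the mean and variance of the added Gaussian noise to be stated. As a minimal sketch of what such a specification pins down (not the authors' implementation), the following adds zero-mean Gaussian noise to a grayscale image held as nested lists of 0-255 values. The function name `add_gaussian_noise` and the defaults `mean=0.0` and `variance=25.0` are assumptions chosen for illustration; the manuscript's actual values are not given in this report.

```python
import random


def add_gaussian_noise(image, mean=0.0, variance=25.0, seed=None):
    """Return a noisy copy of a grayscale image (rows of 0-255 ints).

    mean and variance are the two parameters the reviewer asks the
    authors to report; the defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)      # seeded generator for reproducibility
    sigma = variance ** 0.5        # standard deviation from variance
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            value = pixel + rng.gauss(mean, sigma)
            # Round and clamp back into the valid 8-bit range.
            noisy_row.append(min(255, max(0, int(round(value)))))
        noisy.append(noisy_row)
    return noisy


if __name__ == "__main__":
    flat_gray = [[128] * 4 for _ in range(3)]
    print(add_gaussian_noise(flat_gray, mean=0.0, variance=25.0, seed=0))
```

Reporting the mean, the variance, and (where applicable) the random seed is enough for a reader to reproduce this augmentation step.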
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
In this study, the segmentation effects of test set images in various models were compared to solve the problem of efficient and accurate identification of vegetables and weeds in the field and to realize accurate spraying of herbicides and smart weed removal operations. The proposed MSECA-Unet model segments weeds that are close to the crop or overlap the plant more accurately than the other models (PSPNet, DeepLab V3+, and U-Net). In conclusion, the proposed MSECA-Unet model is a good work that provides strong technical support for the development of intelligent spraying robots and intelligent weeding robots, for which accurate recognition is a necessary prerequisite for correct spraying and weeding.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Dear Authors,
the work was written correctly and logically, you have prepared an interesting introduction referring meaningfully to the research conducted. A big plus of the manuscript is the preparation of a thorough methodological description.
A shortcoming in my opinion is the lack of discussion of the results, which clearly excludes this text from the status of "scientific article". Please consider the following comments:
Abstract
Please indicate the location of the experiment in the text.
Material and methods
Please paste a map of China indicating the experimental location. Please provide geographical coordinates.
Please give a very brief description of the climatic conditions in the area.
Results and discussion
The paper is devoid of a discussion of the results, hence the name of point 3 is incorrect. You have not compared the results of your own research with the results obtained by other authors in similar studies. I ask you to revise the article thoroughly and cite at least 15 papers from the last 10 years on similar topics.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 5 Report
Reviewer’s Report on the manuscript entitled:
An improved U-Net model based on multi-scale input and attention mechanism: Application to semantic segmentation of Chinese cabbage and weed
The authors proposed a semantic segmentation model based on improved U-Net for an accurate spraying of herbicides and intelligent mechanical weeding operations. Though the topic and results are important and interesting, the presentation can be improved. Please see below my comments.
The first sentence of the abstract is too long. Please break it into two sentences.
Please define VGG. All abbreviations must be defined in both abstract and Introduction.
Page 2. Second paragraph. Also, Deep transfer learning, ResNet50, and VGG can be added there:
https://doi.org/10.3390/s21238083
and fuzzy k-means:
https://doi.org/10.3390/rs15030548
At the end of the Introduction, please highlight the main contributions preferably using bullet points.
The quality and font size of the figures should be improved.
Please enlarge the font size of Figures 3 and 7.
Section 2.4. Did you use optimization methods, such as early stopping, to prevent overfitting and reduce computational cost? Please see the first article that I suggested above.
The results section needs another table where you can show the performance (accuracy, F1-score, etc.) of different architectures.
The limitations of the study should also be mentioned.
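The early stopping raised in the comment on Section 2.4 can be sketched as follows. This is a generic, framework-independent illustration under stated assumptions, not the authors' training code: the class name `EarlyStopping`, the helper `epochs_run`, and the `patience` and `min_delta` parameters are hypothetical names, and the validation losses are made-up numbers.

```python
class EarlyStopping:
    """Stop training once validation loss stops improving.

    patience:  number of consecutive non-improving epochs tolerated.
    min_delta: minimum decrease in loss that counts as an improvement.
    """

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience=3, min_delta=0.0):
    """Return how many epochs run before early stopping triggers."""
    stopper = EarlyStopping(patience, min_delta)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)


if __name__ == "__main__":
    # Illustrative validation losses only; not results from the paper.
    losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.60, 0.5]
    print(epochs_run(losses, patience=3))
```

In a real training loop the model weights from the best epoch would normally be saved and restored when the stop condition fires, which is what saves computation while limiting overfitting.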
Regards,
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 4 Report
I accept
Reviewer 5 Report
I thank the authors for addressing my comments and improving their manuscript. Please carefully proofread the manuscript before publication if accepted by the editor.
Please also see a minor editorial comment (you may address during the proofreading):
Line 69. Please remove "where the authors work"
Regards,