Article
Peer-Review Record

RDD-YOLO: Road Damage Detection Algorithm Based on Improved You Only Look Once Version 8

Appl. Sci. 2024, 14(8), 3360; https://doi.org/10.3390/app14083360
by Yue Li, Chang Yin *, Yutian Lei, Jiale Zhang and Yiting Yan
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Submission received: 7 March 2024 / Revised: 13 April 2024 / Accepted: 15 April 2024 / Published: 16 April 2024
(This article belongs to the Special Issue Deep Learning for Object Detection)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Comment 1. The authors explain several open problems regarding damage detection algorithms in the introduction. Nevertheless, their proposal must clarify how they overcome these problems with their results.

 

Comment 2. The authors need to explain the dataset fully. This is mandatory because the results must show which kinds of damage the proposal handles better and which conditions decrease the F1 score. An intra- and interclass comparison should explain the behavior of the proposed algorithm. Is the dataset balanced?

 

Comment 3. Why did the authors choose bounding boxes instead of semantic segmentation to detect the damages? 

 

Comment 4. Table 1 has not been referenced in the text. 

 

Comment 5. It is unclear if the authors solved the detection as a binary identification or if they can identify the kind of damage detected. It is essential to understand this approach to interpret the metrics used. 

 

Comment 6. The metrics results could be more satisfactory, since several current works have presented better metrics. I suggest that the authors carry out a more thorough state-of-the-art review and an in-depth analysis of their proposal versus current works, focusing on their differences, advantages, and limitations.

 

Comment 7. Section 2 should be replaced with a better state-of-the-art section focused on how several works have attempted to solve the same problem.

 

Comment 8. The contributions need to be better supported by the results shown since the authors only compare the other Yolo solutions against their modification but do not contrast them, showing how they overcome the problems shown and comparing them with the state of the art. 

Author Response

Thank you for your comment. Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1. **SimAM Attention Mechanism:**

   - How does the SimAM attention mechanism enhance the focus on important information in the input image, and what impact does it have on the model's performance and generalization ability?

   - What distinguishes the SimAM attention mechanism from existing attention modules, particularly in terms of flexibility and complexity?

   - Could you elaborate on how SimAM simultaneously considers both channel and spatial dimensions to estimate three-dimensional weights?

   - What are the main components of the energy function used in SimAM, and how does it contribute to refining features in the network?
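
For orientation on the energy-function question above, the following is a minimal NumPy sketch of the closed-form SimAM weighting as it is commonly described: each activation receives a weight derived from its squared deviation from the channel mean, normalized by the channel variance, and passed through a sigmoid. The function name `simam`, the `(C, H, W)` layout, and the regularizer value `lam=1e-4` are assumptions made for illustration, not details taken from the manuscript under review.

```python
import numpy as np

def simam(x, lam=1e-4):
    """Apply SimAM-style 3-D attention weights to a (C, H, W) feature map.

    Illustrative sketch only; `lam` is an assumed regularization constant.
    """
    c, h, w = x.shape
    n = h * w - 1  # number of "other" neurons per channel
    mu = x.mean(axis=(1, 2), keepdims=True)           # per-channel mean
    d = (x - mu) ** 2                                 # squared deviation of each neuron
    var = d.sum(axis=(1, 2), keepdims=True) / n       # per-channel variance estimate
    inv_energy = d / (4.0 * (var + lam)) + 0.5        # inverse of the minimal energy
    weights = 1.0 / (1.0 + np.exp(-inv_energy))       # sigmoid gives one weight per neuron
    return x * weights
```

Because one weight is computed per position in every channel, the channel and spatial dimensions are handled together, and no learnable parameters are added in this formulation.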

 

2. **GhostConv Convolution Module:**

   - How does GhostConv address the issue of feature redundancy in conventional convolutional modules?

   - Could you explain how GhostConv reduces the number of parameters and computational complexity while maintaining superior recognition performance?

   - What distinguishes GhostConv from regular convolutional neural networks in terms of its approach to generating feature maps?

   - How does GhostConv preserve the spatial size of the output feature map while applying linear operations to intrinsic features?
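
The parameter-reduction question above can be made concrete with a small arithmetic sketch. It assumes the usual Ghost module layout: a primary convolution produces `c_out / s` intrinsic feature maps, and cheap depthwise `d x d` operations generate the remaining ones. The helper names, the defaults `s=2` and `d=3`, and the omission of bias and normalization parameters are illustrative assumptions, not values taken from the manuscript.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_conv_params(c_in, c_out, k, s=2, d=3):
    """Weight count of a Ghost-style module (bias ignored).

    s: assumed ratio between total and intrinsic feature maps.
    d: assumed kernel size of the cheap depthwise operations.
    """
    m = c_out // s                    # intrinsic maps from the primary convolution
    primary = c_in * m * k * k        # ordinary convolution producing m maps
    cheap = m * (s - 1) * d * d       # depthwise ops producing the remaining maps
    return primary + cheap

standard = conv_params(64, 128, 3)
ghost = ghost_conv_params(64, 128, 3)
print(standard, ghost, round(standard / ghost, 2))
```

With these assumed settings the Ghost-style layer needs roughly half the weights of the standard layer, which is the effect the question refers to.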

 

3. **Bilinear Interpolation:**

   - What shortcomings of nearest interpolation does bilinear interpolation aim to overcome?

   - Can you explain the differences between corner alignment and edge alignment methods in bilinear interpolation?

   - How does bilinear interpolation calculate the final interpolated value of a pixel based on the four nearest original image points?

   - In what ways does bilinear interpolation contribute to obtaining smoother and higher-quality enlarged images compared to nearest interpolation?
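
As a reference point for the interpolation questions above, here is a minimal NumPy sketch of bilinear resizing for a single-channel image. It shows how each output pixel is a weighted mix of its four nearest source pixels, and how corner alignment differs from edge (half-pixel) alignment in the mapping from output to input coordinates. The function name `bilinear_resize` and its `align_corners` flag are hypothetical names chosen for this sketch.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w, align_corners=False):
    """Resize a 2-D array with bilinear interpolation (illustrative sketch)."""
    in_h, in_w = img.shape
    if align_corners:
        # Corner alignment: the corner pixels of input and output coincide.
        ys = np.linspace(0, in_h - 1, out_h)
        xs = np.linspace(0, in_w - 1, out_w)
    else:
        # Edge alignment: pixels are treated as cells and their edges coincide.
        ys = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
        xs = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]   # vertical weight of the lower neighbor
    wx = (xs - x0)[None, :]   # horizontal weight of the right neighbor
    # Interpolate along x on the two neighboring rows, then along y.
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy
```

Nearest interpolation would instead copy a single source pixel to each output position, which is what produces blocky edges in enlarged feature maps.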

 

4. **Experimental Results and Analysis:**

   - What dataset was used for model training and evaluation, and what were its key characteristics?

   - Could you describe the experimental environment, including hardware, software, and parameter configurations?

   - What evaluation metrics were used to assess the performance of the proposed model, and how were they calculated?
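
The metric question above refers to quantities that follow directly from detection counts. The sketch below computes precision, recall, and F1 from true positives, false positives, and false negatives, using the standard definitions. The function name and the example counts are hypothetical and are not results from the manuscript.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts.

    Zero denominators yield 0.0 instead of raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical counts for one damage class.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(p, r, f1)
```

Computing these values per damage class, rather than only in aggregate, is one way to expose the per-class behavior that the reviewers ask about.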

Author Response

Thank you for your comment. Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1.     Summary

The presented article introduces an extension/modification of the YOLOv8 algorithm for road damage detection. The final RDD-YOLO algorithm adds a) a receptive field module that focuses on specific objects by using the SimAM attention mechanism (an attention mechanism that combines spatial and channel attention), and b) convolution modules based on GhostConv to reduce redundant information in the YOLOv8 neck architecture.

The article provides a good overview of all steps and explains the methodology and experimental evaluation with respect to the target application in automated road surface damage detection.

 

2.     Citations and resources

The cited references are relevant. In my opinion, two important references are missing: a reference to the original SimAM attention mechanism and one to the GhostConv convolution module.

Some minor information could be provided including: 

reference to sources for images,

reference to the original YOLOv8 article.

 

3.     Manuscript

The article has a good structure, reads very well, and provides all the details necessary to understand the problem, methodology, and application. The experimental design follows the standards defined for similar types of research and presents the results in a clear structure. All the necessary evaluation criteria are well described and presented in the article. Moreover, all the datasets and data preparation steps are described in the references or in the article itself.

 

3.1  General comments

Based on the review criteria, I have the following comments on the submitted article.

(L: denotes the line number)

1.     L:35: Could you provide sources for statistics, please? (infrastructure in China)

2.   L:99: Even though it is quite clear, could you describe in one or two sentences why you chose YOLOv8 as the basic model for your work, please?

3.   L:175 – Figure 3: The last column seems to state the resulting map size instead of the defined r (ratio), which is 2.0, 2.0, 1.5, 1.0, and 1.0 here.

The article itself delivers a good message and good results. Please make the necessary changes for the final version of the paper.

 

3.2  Specific comments

 

1.     L:298 – Formula 12: The statement in brackets on the 3rd line is, in fact, a single line. A final + sign is missing at the end of the first line; better still, format it as one line only so that it is not confused with a matrix.

2.     L:368 – Figure 12: The colors (green and red) inside the image are very hard to read.

3.     L:371-374: The whole sentence is very complex and hard to read. Could you, please, split it into multiple simple sentences?

 

4.     Reproducibility

 

The datasets used for the model preparation and experimental work are publicly available and referenced in the article.

Author Response

Thank you for your comment. Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

 

The paper presents an innovative approach to enhancing road damage detection using deep learning techniques. The work introduces an optimized version of the You Only Look Once (YOLO) algorithm, specifically YOLOv8, for road damage detection, incorporating three main improvements: the Simple Attention Mechanism (SimAM) in the backbone network, optimization of the neck structure with GhostConv modules, and the substitution of nearest interpolation with bilinear interpolation in the up-sampling algorithm. This enhanced model, the RDD-YOLO, demonstrates superior performance in detecting various types of road damage with improved accuracy and computational efficiency.

 

- Originality / Novelty: The incorporation of SimAM, GhostConv, and bilinear interpolation to YOLOv8 for road damage detection is a novel approach that addresses specific challenges in the field, such as the diverse and irregular nature of road damage and the impact of environmental conditions on detection accuracy.

- Significance of Content: This work is highly relevant and significant, offering advancements in automated road maintenance and safety measures, potentially reducing manual labor and improving the timeliness and accuracy of road damage assessments.

- Quality of Presentation: The paper is well-structured and clearly presents the methodology, experimental setup, results, and implications of the findings, making it accessible to readers with varying levels of expertise in the field. However, the paper lacks a dedicated “Related work” section, which is crucial for situating the study within the literature, comparing it with previous works, and highlighting its novelty and improvements.

 

 

 

- Scientific Soundness: The proposed RDD-YOLO model is rigorously tested on the RDD2022 dataset, demonstrating improved performance over the baseline YOLOv8 model and other contemporary models in terms of accuracy and efficiency.

- Interest to the Readers: The topic is of broad interest to researchers in computer vision, civil engineering, urban planning, and related fields, as well as to governmental and maintenance organizations looking for efficient road assessment tools.

- In summary: This study is a valuable contribution to the field of automated road damage detection, offering a practical solution with potential for real-world application and further development.

Suggestions for improvement:
- Inclusion of a "Related work" section covering the field of road damage detection.
- Further enhancements can include a dataset with a broader range of environmental conditions and road types. In addition, investigating the model's real-time performance in scenarios that include variable lighting and weather conditions could offer more insight into the robustness of the model.

 

Author Response

Thank you for your comment. Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have improved the paper, and now it reads more fluently. Besides, they answered my questions appropriately. Nevertheless, the authors could include a better discussion of the work's limitations to point out the problems found for each class of damage, which could allow them to solve the open problems. Also, a confusion matrix could help better understand each class's classification. 

 

Author Response

Thank you for this comment. Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have taken into account the comments from the previous review to introduce improvements to the work. The “SimAM Attention Mechanism” section includes more information on how SimAM acts as an attention mechanism, directing the model to prioritize salient information and suppress irrelevant data with improved processing efficiency.
The authors appropriately added the limitations of the nearest interpolation method used in YOLOv8 compared to bilinear interpolation. The information on the dataset used in the experiments was completed.

Author Response

Thank you for this comment. Please see the attachment.

Author Response File: Author Response.pdf
