Article
Peer-Review Record

Multi-Scale Lightweight Neural Network for Steel Surface Defect Detection

Coatings 2023, 13(7), 1202; https://doi.org/10.3390/coatings13071202
by Yichuan Shao 1, Shuo Fan 2, Haijing Sun 1, Zhenyu Tan 2, Ying Cai 2, Can Zhang 2 and Le Zhang 1,*
Reviewer 1:
Reviewer 2:
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 4 June 2023 / Revised: 30 June 2023 / Accepted: 3 July 2023 / Published: 4 July 2023
(This article belongs to the Special Issue Solid Surfaces, Defects and Detection)

Round 1

Reviewer 1 Report

1. An introduction to defect detection should not be included in the abstract; instead, a description of the case study should be included.

2. Please rewrite the abstract by identifying the purpose, the problem, the methodology, and the important results (not all of them) and conclusions of your work.

3. All equations must be accompanied by references.

4. Figures 2 and 3 need to be clearer.

5. The conclusion section is extremely short. The conclusions are very weak and require a deeper analysis of the results.

The quality of the English language is fine.

Author Response

  • Please see the attached document.

Author Response File: Author Response.docx

Reviewer 2 Report

The comments have been provided in the attached file.

In this study, a multiscale lightweight neural network model has been used to predict six types of cracks in steel surfaces. Advantages include inspection for different types of cracks as well as a good visualization of the results. However, to improve the quality of the work, the following comments are offered:

1.     The reference of the dataset used to train the model must be provided. Was it collected by field experiments or from public datasets?

2.     Figure 2 should be explained further. Batch normalization and Ada could be discussed in more detail in this context. Also, why isn't Leaky ReLU used instead of ReLU?

3.     Since batch normalization itself reduces dropout rates, this should be discussed in more detail in Section 2.

4.     Which dataset (training, validation, or testing) do the results presented in Table 3 relate to? This should be stated.

5.     What method or software was used to annotate the images?

6.     What does MM stand for? Do you mean Multiscale Artificial Neural Network (MsANN)? This should be corrected throughout the paper.

7.     A concise introduction should be provided for Section 3. In addition, further explanation of the CCD technology in Section 3.1 is needed.

8.     The title number of the introduction should be corrected.

9.     Please use a multiplication sign instead of an asterisk (Line 184, Page 5). Check this throughout the paper.

10.  The titles of the six crack categories in Table 1 differ from those in Table 3! The titles must be corrected and unified.

11.  The title of the first column in Table 3 needs to be corrected.

12.  At the start of the conclusion section, the research gap should be introduced. In addition, the conclusion section is weak, and the research contribution should be discussed.
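
For reference, the distinction behind the question in comment 2 can be sketched as follows. This is a minimal NumPy illustration of the two activation functions, not the authors' implementation; the negative slope of 0.01 is an assumed, commonly used default and does not come from the manuscript.

```python
import numpy as np

def relu(x):
    # Standard ReLU: negative inputs are clamped to zero,
    # so they contribute no gradient during training.
    return np.maximum(x, 0.0)

def leaky_relu(x, negative_slope=0.01):
    # Leaky ReLU: negative inputs are scaled by a small slope instead
    # of being zeroed. The slope value here is an assumption.
    return np.where(x >= 0, x, negative_slope * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
relu_out = relu(x)
leaky_out = leaky_relu(x)
print(relu_out)
print(leaky_out)
```

The two functions agree for non-negative inputs and differ only in how they treat negative ones, which is why the choice can matter for units that would otherwise stay inactive.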

Author Response

  • Please see the attached document.

Author Response File: Author Response.docx

Reviewer 3 Report

The paper presents an innovative solution for automated detection of surface defects in steel. The paper focuses on using a multi-scale lightweight neural network model (MM) to improve on existing solutions in terms of both classification accuracy and computational cost. In general, the research findings are significant; the research method is appropriate; and the experimental results are well reported. The performance of the proposed MM model was compared to that of the ResNet-50, ResNet-101, VGG, AlexNet, MobileNetV2, and MobileNetV3 networks. The MM model achieves an outstanding classification accuracy of 98.06%. The paper is easy to follow and can be a meaningful contribution to the body of knowledge. After reading the paper, the reviewer has the following comments:

 

1) Abstract: “secondly, using the key point feature set for mapping fusion to combine the generated features with the original network, thus effectively solving the problem of key point feature loss”, this sentence is vague in the context of the abstract. Please provide only general information on the research findings in the abstract; more details regarding the technical aspects should be given in the body of the paper.

 

2) Introduction: “The task of classifying defects on steel surfaces using computer vision techniques poses a significant challenge due to the effects of illumination and material variations on defect images. In addition, the appearance of defects varies dramatically not only within categories of steel surfaces, but also between categories, thus further complicating the classification process.”, please add references to support the point of discussion.

 

3) Introduction: “ Various complex situations are faced in practical defect classification applications, and it is difficult to achieve the requirements in terms of accuracy using traditional image processing methods. ”, please add references to support the point of discussion.

 

4) Introduction: the current literature review section is not comprehensive; more papers related to the research topic should be covered to provide a better view of the state-of-the-art methods. For instance, more papers regarding detecting welding defects in steel plates using computer vision, image processing-based pipe corrosion detection, artificial neural network-based detection of defects on rolled steel, automated pitting corrosion detection using image processing and machine learning, deep learning-based defect detection for steel/metal structures, … should be reviewed and discussed.

 

5) On page 2, line 45, “The identification of surface defects in steel undergoes three processes: manual human detection, prediction using machine learning algorithms and automatic detection through deep learning, translated into English.” Should be revised as: “The identification of surface defects in steel undergoes three processes: manual human detection, prediction using machine learning algorithms and automatic detection through deep learning.” In addition, ref. [8] seems irrelevant for the context of detection of steel surface defects. Please double-check this reference.

 

6) Although lightweight neural network models [16] have been shown to achieve good performance with low computational cost, what are their disadvantages or limitations? What must be sacrificed to reduce the model complexity and computational expense? Please elaborate on these points in the introductory section of the paper.

 

7) Although various advantages and innovative points of the paper (e.g., a fusion coding module as the core, a Gaussian difference pyramid, etc.) are mentioned in scattered places throughout the first section, the main contributions of the paper should be summarized in several bullet points at the end of that section.

 

8) On line 103, regarding “Gaussian differential pyramids provide an effective scale-space representation that captures the features of an image at different scales. This is important for dealing with real-world image problems, as real-world objects can appear at a variety of different scales.”, please add references to support the point of discussion. In addition, more advantages and features of the Gaussian differential pyramids should be provided in that paragraph.

 

9) On line 108: “Key point feature convolution is a group convolution of each key point feature set into a separate set, which reduces the number of parameters and computational cost. Using the idea of multi-core combination, convolution using different key points feature sets can enhance the model’s ability to adapt to varying levels of detail. The network needs both fuzzy key point features to capture high-resolution patterns and fine key point features to capture low-resolution patterns for better model accuracy and efficiency.”, please add references to support the point of discussion.

 

10) On line 115 “To address issues related to information exchange and loss, and performance degradation of key point features in multi-scale space, The paper suggests using key points to map and encode information from different scales, then combining it with the original network through matching transformations in order to enhance the discriminative features” , please add references to support the point of discussion.

 

11) References are required for the algorithmic steps on lines 131-145.

 

12) In 1.2 and 1.3, please highlight the differences between conventional convolution layers in Convolutional Neural Networks and the ones used in the current paper (Convolution of key point features and Key point feature set mapping fusion module). Are the weights of the convolution of key point features and of the key point feature set mapping fusion module learnable or handcrafted? Please elaborate.

 

13) In Figure 2 (MM network construction), please provide a note explaining the abbreviations used (e.g., BN).

 

14) Regarding “The data sets used in this paper are obtained through such image acquisition units, taken from a hot-rolled strip mill specializing in the production of cover hot-rolled strip surfaces, with more than 20,000 original sample images taken from different production batches.”, please provide the source of the dataset.

 

15) Regarding “In this paper, a sliding window of 128*128 pixels is used to intercept the whole CCD-captured hot-rolled strip surface defect images, and the complete defect, as well as defect-free images, are selected to establish a standard image dataset of hot-rolled strip surface defects, which reaches more than 60,000 standard sample images.”, is the model performance sensitive to the size of the sliding window?

 

16) In 3.1, why were Region Proposal Networks used instead of sliding windows? Please provide some discussion.

 

17) Report the computer platform used for training and testing the MM model.

 

18) The proposed MM has achieved outstanding results. However, more details regarding its performance should be reported. Please provide confusion matrices for the training and testing phases.

 

19) Report several classification cases of the MM model with output class probabilities.

 

20) Report several misclassifications of the MM model and provide explanations if possible.

Please double-check the grammar and typos to improve the presentation of the paper.
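
As an illustration of the kind of report requested in comment 18, a confusion matrix and per-class recall can be computed from true and predicted labels along the following lines. This is a minimal sketch under stated assumptions: the three-class labels below are made-up toy values for demonstration, not results from the manuscript, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    # Rows index the true class, columns the predicted class.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def per_class_recall(cm):
    # Fraction of each true class that was predicted correctly;
    # classes with no samples get a recall of 0.
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)

# Toy labels (hypothetical, for illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
recall = per_class_recall(cm)
print(cm)
print(recall)
```

Off-diagonal entries show which classes are confused with which, which is also the natural starting point for the misclassification cases asked for in comment 20.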

Author Response

  • Please see the attached document.

Author Response File: Author Response.docx

Reviewer 4 Report

This article presents a method based on a multi-scale lightweight neural network for the detection of steel surface defects. The subject of the paper fits the topics of the journal. I recommend acceptance of the article after major revision:

1. I recommend extending the introduction with a paragraph that presents the 3-4 major contributions of the article, such that each bullet point corresponds to one contribution.

2. Please revise the paragraph on lines 72-80, describing what is presented in each section. For example: “Section … presents the construction of …, Section … describes the overfitting problem, and so on”.

3. I recommend extending the article with a new section called “Related Work” or “Research Background” in which around 6-10 papers on topics similar to the one addressed in the article are presented in more detail.

4. Regarding Table 1, it looks like the data is imbalanced: Horizontal Cracking has 20000 samples in the training set, while the other labels have 15000 samples. Please describe how you dealt with the data imbalance. Would the precision results differ greatly if each label had the same number of samples in each dataset: training, validation, and test?

5. Please describe the properties of the machine(s) on which the experiments were conducted such as: RAM, CPU, and so on.

6. Please extend the conclusions with 3 possible research directions.
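
One common way of handling the imbalance raised in comment 4 is inverse-frequency class weighting in the loss, sketched below. This is an illustration only and not necessarily what the authors did. The counts follow the figures quoted in comment 4 (20000 for Horizontal Cracking, 15000 for each other label), the total of six classes follows Reviewer 2's summary, and the names `class_2` to `class_6` are placeholders because the other category titles are not given in this record.

```python
# Training-set counts as quoted in comment 4. Only "Horizontal Cracking"
# is named in the review; the other five labels are hypothetical placeholders.
train_counts = {
    "Horizontal Cracking": 20000,
    "class_2": 15000,
    "class_3": 15000,
    "class_4": 15000,
    "class_5": 15000,
    "class_6": 15000,
}

def balanced_class_weights(counts):
    # "Balanced" heuristic: weight = n_samples / (n_classes * class_count),
    # so over-represented classes get weights below 1.
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {name: n_samples / (n_classes * c) for name, c in counts.items()}

weights = balanced_class_weights(train_counts)
for name, w in weights.items():
    print(f"{name}: {w:.4f}")
```

With these counts the majority class receives a weight somewhat below 1 and the remaining classes a weight slightly above 1, so the correction is mild; resampling to equal class sizes would be an alternative way to answer the reviewer's question.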

Author Response

  • Please see the attached document.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The manuscript has been sufficiently improved to warrant publication in Coatings.

Reviewer 2 Report

All comments have been considered.

Reviewer 3 Report

I have no further comments.

Reviewer 4 Report

All my comments were addressed. 
