Article

Research on an Intelligent Identification Method for Wind Turbine Blade Damage Based on CBAM-BiFPN-YOLOV8

Hang Yu, Jianguo Wang, Yaxiong Han, Bin Fan and Chao Zhang
1 School of Mechanical Engineering, Inner Mongolia University of Science and Technology, Baotou 014010, China
2 Inner Mongolia Key Laboratory of Intelligent Diagnosis and Control of Mechatronic System, Baotou 014010, China
3 College of Mechanical & Electrical Engineering, Inner Mongolia Agricultural University, Hohhot 010018, China
4 Inner Mongolia Engineering Research Center of Intelligent Equipment for the Entire Process of Forage and Feed Production, Hohhot 010018, China
* Author to whom correspondence should be addressed.
Processes 2024, 12(1), 205; https://doi.org/10.3390/pr12010205
Submission received: 27 December 2023 / Revised: 15 January 2024 / Accepted: 15 January 2024 / Published: 18 January 2024

Abstract

To address challenges in the detection of wind turbine blade damage images, characterized by complex backgrounds and multiscale feature distribution, we propose a method based on an enhanced YOLOV8 model. Our approach focuses on three key aspects: First, we enhance the extraction of small target features by integrating the CBAM attention mechanism into the backbone network. Second, the feature fusion process is refined using the Weighted Bidirectional Feature Pyramid Network (BiFPN) to replace the path aggregation network (PANet). This modification prioritizes small target features within the deep features and facilitates the fusion of multiscale features. Lastly, we improve the loss function from CIoU to EIoU, enhancing sensitivity to small targets and the perturbation resistance of bounding boxes, thereby reducing the gap between computed predictions and real values. Experimental results demonstrate that compared with the YOLOV8 model, the CBAM-BiFPN-YOLOV8 model exhibits improvements of 1.6%, 1.0%, 1.4%, and 1.1% in precision rate, recall rate, mAP@0.5, and mAP@0.5:.95, respectively. This enhanced model achieves substantial performance improvements comprehensively, demonstrating the feasibility and effectiveness of our proposed enhancements at a lower computational cost.

1. Introduction

Amidst the growing global imperative for renewable energy, wind power has emerged as a progressively vital source of clean and sustainable energy [1]. However, the impairment of wind turbine blades during wind power generation not only results in diminished efficiency in power generation but also poses significant safety concerns [2,3,4,5]. A comprehensive review of the past literature underscores that damage to wind turbine blades stands out as a primary factor influencing the operational availability of wind farms. Consequently, the precise and efficient detection of wind turbine blade damage becomes an urgent imperative.
Traditional wind turbine blade damage detection methods include vibration analysis, thermal imaging, and ultrasonic testing [6,7,8]. Although these traditional methods have achieved a degree of success, they still have some problems and limitations. First, they often rely on manual experience and professional operation and offer little automation or intelligence. Second, they have limited effectiveness in detecting small, hidden damage. In addition, their application in large-scale wind farms is constrained by site access and cost. UAV-based wind turbine blade damage detection technology has therefore emerged; its flexibility, efficiency, and broad coverage make it far easier to acquire image data of wind turbine blades.
In the realm of wind turbine blade damage detection, notable advancements have been achieved through deep learning methodologies, specifically employing Convolutional Neural Networks (CNNs) and the YOLO algorithm. Yu et al. [9] introduced a method for recognizing wind turbine blade damage grounded in the semantic features of defects; their approach constructs a deep learning network with a transfer feature extractor, enabling the automatic extraction of image features for precise and efficient detection of wind turbine blade damage. Yang et al. [10] employed transfer learning and ensemble learning classifiers to enhance the recognition performance of deep learning models. Nevertheless, these methods exhibit a notable reliance on mathematical models, rendering them susceptible to false estimations. Addressing this concern, Cheng et al. [11] proposed a Temporal Attention-based Convolutional Neural Network (TACNN), specifically designed to adapt to the diverse signal characteristics within the complex environment of Wind Turbine Generators (WTGs). Tian et al. [12] explored the application of a Multilevel Convolutional Recurrent Neural Network (MCRNN) for detecting icing on wind turbine blades, proposing an effective multilevel convolutional recurrent structure for accurate detection of blade icing. In response to the low-salience issue associated with damage cracks on wind turbine blades, Gao et al. [13] introduced a multimodal target detection Convolutional Neural Network that fuses infrared and visible light images, designed to enhance the accuracy of crack detection. Liu et al. [14] enhanced the original YOLOv5 model by introducing a Weighted Bidirectional Feature Pyramid Network module to replace the feature fusion component of the neck network, achieving the effective fusion of multiscale target features and enhancing the focus on potentially informative features. Despite the advantages of deep learning algorithms, including high accuracy, robustness, and efficiency, they encounter specific challenges. Firstly, traditional deep learning algorithms exhibit reduced efficacy in detecting small targets and struggle to accurately identify minute blade damage. Secondly, these algorithms prove less effective in processing wind turbine blade images against complex backgrounds, potentially producing false positives and missed detections. Consequently, ongoing research efforts are directed toward refining algorithms to improve detection capabilities for small targets and image processing capabilities in complex backgrounds. This avenue of research remains pivotal in advancing the field of wind turbine blade damage detection.
Addressing the aforementioned challenges, this study presents a wind turbine blade damage detection model, denoted as CBAM-BiFPN-YOLOV8, aimed at mitigating the easy loss and difficult localization of small target features in the current task of detecting damage in wind turbine blade images characterized by complex backgrounds and diverse defect scales. The primary contributions of this paper include the following:
(1)
In addressing the current algorithm’s deficiencies in the weak and incomplete extraction of features related to wind turbine blade damage, we introduce the Convolutional Block Attention Module (CBAM) attention mechanism into the backbone network. This integration effectively boosts the extraction of small target features associated with damaged wind turbine blades. CBAM dynamically adjusts the network’s attention to better perceive small targets. By intensifying the network’s focus on defective regions, it not only refines feature extraction but also aids in the precise localization and identification of damage on the blades. This feature enhancement proves vital in handling complex backgrounds and multiscale defects in wind turbine blade images, offering robust support for improving detection performance and accuracy. This optimization strategy significantly reduces the missed-detection rate in wind turbine blade defect detection while enhancing detection efficiency and accuracy.
(2)
Addressing the suboptimal utilization of information when fusing shallow features and deep semantic features, we optimized the feature fusion process by introducing the Weighted Bidirectional Feature Pyramid Network (BiFPN) to replace the original Path Aggregation Network (PANet). The objective of this optimization is to achieve a significant improvement in detection accuracy through a more effective fusion of multiscale features, thereby enhancing the comprehensive utilization of small target features within deep features. BiFPN introduces bidirectional connections and a weighting mechanism, enhancing the thoroughness of feature fusion, particularly in handling deep semantic features. This renders the model more adaptable and efficient in managing features with varying scales and semantic information. By effectively fusing multiscale features, the model captures essential information in wind turbine blade images more comprehensively, improving the utilization of small target features in deeper features while enhancing detection performance and reducing the missed-detection rate.
(3)
To address the height and width constraints in the original model’s loss function during the shape convergence task, we optimized the loss function. In order to minimize the difference between the height and width of the predicted and real frames, we replace the Complete Intersection over Union (CIoU) component of the original model’s loss function with the Effective Intersection over Union (EIoU). The aim of this optimization is to expedite the model’s convergence speed and enhance prediction accuracy. The application of EIoU renders the model more robust, enabling it to learn and predict the shape characteristics of wind turbine blade damage more accurately. This enhancement significantly elevates the model’s performance in real-world tasks, providing robust support for improved detection accuracy and shape matching.
The subsequent sections of this paper are structured as follows. Section 2 delves into the principles of YOLOV8 and delineates the architecture of the enhanced YOLOV8 model. Section 3 provides a comprehensive overview of the experimental design and result analysis, presenting a detailed account of the dataset, experimental setup, and parameter configurations. This section incorporates ablation experiments to validate the feasibility and effectiveness of the improved model, along with the comparative testing of various models against practical engineering application models. Section 4 succinctly summarizes the main findings and conclusions of this study and provides insights for future research directions based on these findings and conclusions.

2. Theoretical Background

2.1. YOLOV8 Algorithm Principles

YOLOV8 stands as the latest advancement in single-stage target detection algorithms, renowned for its rapid and efficient target localization and recognition within images. It seamlessly accommodates all YOLO series algorithms [15,16,17,18,19,20], offering users the flexibility to switch between different versions as needed. The model structure of the YOLOV8 algorithm can be delineated into distinct components: Input, Backbone, Neck, and Head. The comprehensive architecture is visually represented in the YOLOV8 structure diagram, as illustrated in Figure 1. This hierarchical structure underscores the algorithm’s robust capabilities in target detection, marking a significant stride in the evolution of single-stage detection methodologies.
YOLOV8 mainly draws inspiration from recently proposed algorithms, such as YOLOX and YOLOV7, incorporating improvements and optimizations into the network structure and loss function. While the backbone of YOLOV8 shares similarities with YOLOV5, adopting the Cross Stage Partial (CSP) approach, it introduces innovative changes by replacing the C3 module with the C2f module. The C2f module combines the C3 module with the Efficient Layer Aggregation Network (ELAN) structure introduced in YOLOV7. This strategic integration aims to address limitations in the effective transfer of gradients and enhance the extraction of contextual information for improved detection, especially in scenarios involving fuzzy or occluded targets. The neck still employs the concept of PANet, enhancing the fusion and utilization of feature information at different scales, while removing the convolutional layers before the upsampling layer and replacing the C3 module with the C2f module. This modification aids in the fusion of shallow information into deep features, compresses the algorithm size, and improves detection performance. The main function of the Head section is target detection processing. The detection module borrows the idea of head decoupling from YOLOX, splitting the classification and regression tasks into separate branches; each independent branch focuses on the feature information it is responsible for, alleviating conflicts between classification and localization tasks. Additionally, the model transitions from YOLOV5’s Anchor-Based approach to an Anchor-Free one. For the loss function, the model uses CIoU Loss as the bounding box error loss and further improves regression accuracy by minimizing the Distribution Focal Loss (DFL). The model also employs the Task-Aligned sample assignment strategy, using a high-order combination of classification scores and Intersection over Union (IoU) as the metric to guide positive and negative sample selection. This encourages high alignment between classification scores and IoU, effectively improving the model’s detection accuracy, especially for smaller targets.
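For reference, the following is a minimal PyTorch sketch of a C2f-style block as publicly documented for YOLOV8. The class names and layer composition are illustrative simplifications under our reading of the design, not the exact Ultralytics implementation.

```python
import torch
import torch.nn as nn

class ConvBNSiLU(nn.Module):
    """Conv2d + BatchNorm + SiLU, the basic unit used throughout YOLOV8."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class Bottleneck(nn.Module):
    """Two 3x3 convolutions with a residual shortcut, as used inside C2f."""
    def __init__(self, c):
        super().__init__()
        self.cv1 = ConvBNSiLU(c, c, 3)
        self.cv2 = ConvBNSiLU(c, c, 3)

    def forward(self, x):
        return x + self.cv2(self.cv1(x))

class C2f(nn.Module):
    """C2f-style block: split the feature map, pass one half through n
    bottlenecks, and concatenate every intermediate output (ELAN-like
    gradient paths) before a final 1x1 convolution."""
    def __init__(self, c_in, c_out, n=1):
        super().__init__()
        self.c = c_out // 2
        self.cv1 = ConvBNSiLU(c_in, 2 * self.c, 1)
        self.cv2 = ConvBNSiLU((2 + n) * self.c, c_out, 1)
        self.m = nn.ModuleList(Bottleneck(self.c) for _ in range(n))

    def forward(self, x):
        y = list(self.cv1(x).chunk(2, dim=1))  # split into two halves
        for m in self.m:
            y.append(m(y[-1]))                 # each bottleneck feeds the next
        return self.cv2(torch.cat(y, dim=1))   # fuse all branches
```

The multiple concatenated branches are what give C2f its richer gradient flow compared with the C3 module it replaces.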

2.2. Improved YOLOV8 Model Structure

In addressing the challenges posed by complex backgrounds and the varied scale of defects in wind turbine blade defect image target detection, we propose an intelligent identification method, namely CBAM-BiFPN-YOLOV8. The network structure of the proposed algorithmic model is depicted in Figure 2. This approach aims to overcome the easy loss and difficult localization of small target features in wind turbine blade defect images. Building upon the YOLOV8 architecture, the CBAM-BiFPN-YOLOV8 method integrates the Convolutional Block Attention Module for enhanced feature extraction and the Weighted Bidirectional Feature Pyramid Network for effective feature fusion. This combination of components is designed to bolster the intelligent identification of wind turbine blade damage and to improve the accuracy and efficiency of target detection in this context.

2.2.1. CBAM Attention Mechanism

In the context of handling wind turbine blade defect images, in addition to containing information about the wind turbine blade, such images typically have complex background information. The convolution operation may accumulate redundant information from the background, leading to decreased target attention and affecting detection accuracy. To enhance the model’s focus on key regions and mitigate the impact of unimportant background information, this paper introduces the CBAM attention mechanism [21], as depicted in Figure 3. The introduction of the CBAM attention mechanism aims to refine the model’s attention allocation, ensuring a more concentrated focus on key regions related to wind turbine blade information. By selectively emphasizing important features and suppressing less relevant background information, the CBAM mechanism contributes to an enhancement in detection accuracy. This strategic integration represents a significant step toward optimizing the processing of wind turbine blade defect images, crucial for accurate and efficient detection tasks.
As shown in Figure 3, CBAM comprises two integral components: the Channel Attention Module (CAM) [22] and the Spatial Attention Module (SAM) [23]. This dual-module structure adeptly considers both channel and spatial dimensions, facilitating the network in acquiring the precise location and detailed information of the target region. The CAM module initiates the process by conducting global Max-Pooling and global Average-Pooling on the input feature map, denoted as F. Subsequently, the outcomes of these two pooling operations undergo processing by a shared Multilayer Perceptron (MLP) neural network. The MLP output features are then amalgamated, and the resulting feature map Mc is generated through the activation operation of the Sigmoid function. This process effectively accentuates the target region within the feature map, enhancing the model’s ability to discern relevant details. The mathematical expression for the channel attention is provided below:
$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) = \sigma\big(W_1(W_0(F_{\mathrm{avg}}^{c})) + W_1(W_0(F_{\mathrm{max}}^{c}))\big)$$
Here, $\sigma$ represents the Sigmoid activation function, and $W_0$ and $W_1$ correspond to the weights of the Multilayer Perceptron (MLP), with $W_0 \in \mathbb{R}^{C/r \times C}$ and $W_1 \in \mathbb{R}^{C \times C/r}$. The parameter $r$ denotes the reduction ratio.
Subsequently, Mc is multiplied with the input feature map F to yield the modified input feature F′ required for the SAM module, i.e.,
$$F' = M_c(F) \otimes F$$
Building upon the output feature map F′ from CAM, global Max-Pooling and global Average-Pooling are performed along the channel dimension. The pooled maps are concatenated and passed through a 7 × 7 convolution, producing the spatial attention map Ms, which accentuates the target location and detailed information. The mathematical expression for spatial attention is as follows:
$$M_s(F') = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big) = \sigma\big(f^{7\times 7}([F_{\mathrm{avg}}^{s};\ F_{\mathrm{max}}^{s}])\big)$$
Here, $f^{7\times 7}$ denotes a convolution with a kernel size of 7 × 7.
Ultimately, the spatial attention map Ms is multiplied with the modified input feature map F′ to obtain the final feature map F″, i.e.,
$$F'' = M_s(F') \otimes F'$$
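To make the equations above concrete, here is a compact PyTorch sketch of CBAM, a straightforward transcription of Woo et al. [21] rather than the exact code used in this work; the reduction ratio r = 16 is a common default and an assumption here.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CAM: global avg/max pooling -> shared MLP -> sigmoid gate Mc(F)."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.mlp = nn.Sequential(                 # shared MLP: W1(W0(.))
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))         # AvgPool branch
        mx = self.mlp(x.amax(dim=(2, 3)))          # MaxPool branch
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """SAM: channel-wise avg/max maps -> 7x7 conv -> sigmoid gate Ms(F')."""
    def __init__(self, k=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, k, padding=k // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)          # average over channels
        mx, _ = x.max(dim=1, keepdim=True)         # maximum over channels
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """F' = Mc(F) * F, then F'' = Ms(F') * F', per the equations above."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.ca = ChannelAttention(channels, r)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = self.ca(x) * x      # channel-refined feature F'
        return self.sa(x) * x   # spatially refined feature F''
```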

2.2.2. Weighted Bidirectional Feature Pyramid Network

The diverse shapes and sizes of wind turbine blade surface defect targets during the training process result in features with varying resolution sizes. The conventional linear superposition of these features in PANet can lead to an unequal weighting of features associated with different wind turbine blade surface defect targets in the fused output. This imbalance may cause large-scale features to dominate the postfusion output, overshadowing smaller features and potentially leading to missed detections. To address this issue, consideration must be given to preventing the loss of shallow features, which contain crucial detailed information for detecting small target objects. Consequently, the feature fusion component of YOLOV8 incorporates BiFPN [24], replacing PANet. The network structures of BiFPN and PANet are illustrated in Figure 4. By integrating BiFPN, the fusion of multiscale features is realized, fortifying the representation of small target information within deep features and thereby enhancing overall detection accuracy.
BiFPN stands out as an optimized alternative to PANet [25] across various dimensions. Firstly, it introduces bidirectional feature propagation, encompassing both top-down and bottom-up feature flow. This facilitates a more thorough and nuanced cross-level transfer and fusion of information, enriching the model’s capability to capture diverse features at different levels. Secondly, the merging results undergo optimization through feature adjustment and selection operations. This process ensures a superior representation of crucial features and dynamically selects the most valuable features, contributing to an enhancement in the accuracy and effectiveness of target detection. Moreover, in terms of network topology, BiFPN incorporates neural network architecture search. This strategic approach seeks to identify the most suitable irregular feature network topology tailored to diverse tasks and resource constraints, thereby showcasing greater flexibility in network design. Finally, through well-designed feature fusion and tuning operations, BiFPN not only improves accuracy but also successfully reduces computational demands. This is achieved through feature selection and the integration of a flexible network topology, ultimately enhancing detection efficiency.
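A minimal sketch of the learnable weighting that distinguishes BiFPN from plain concatenation or addition, following the “fast normalized fusion” of EfficientDet [24]; the tensor shapes in the usage example are illustrative assumptions.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """One BiFPN fusion node with fast normalized fusion:
    out = sum_i (w_i * in_i) / (sum_j w_j + eps), with learned w_i >= 0."""
    def __init__(self, n_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))  # one weight per input path
        self.eps = eps

    def forward(self, inputs):
        w = torch.relu(self.w)               # keep weights non-negative
        w = w / (w.sum() + self.eps)         # normalize to a convex combination
        return sum(wi * x for wi, x in zip(w, inputs))

# Usage: fuse a lateral feature with a deeper feature upsampled to the same size.
fuse = WeightedFusion(n_inputs=2)
p4_td = fuse([torch.randn(1, 256, 40, 40),    # lateral feature at P4
              torch.randn(1, 256, 40, 40)])   # P5 upsampled to P4 resolution
```

Because the weights are learned per node, the network can suppress a scale that contributes little at a given fusion point instead of weighting all inputs equally.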

2.2.3. Loss Function

In the target detection process, the accurate positioning of the target bounding box plays a pivotal role, particularly for subsequent width calculations. Small targets exhibit heightened sensitivity to perturbations in the bounding box. To minimize the disparity between the calculated predicted value and the true value, the loss function is refined. The original YOLOV8 model utilizes CIoU as the loss function for detection box regression.
CIoU is an extension of DIoU (Distance IoU Loss) [26] with the addition of aspect ratio. The calculation formula is expressed as follows:
$$\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^2(b, b^{gt})}{C^2} - \alpha\gamma$$
$$\alpha = \frac{\gamma}{(1 - \mathrm{IoU}) + \gamma}$$
$$\gamma = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$
In this context, IoU refers to the Intersection over Union, a crucial metric for assessing the performance of object detection algorithms; it gauges the overlap between detection results and the actual target. Notably, b and bgt signify the centroids of the prediction frame and the actual frame, respectively. The variable ρ represents the Euclidean distance between these two centroids. Additionally, C denotes the length of the diagonal of the smallest outer bounding rectangle encompassing both the prediction frame and the target frame. The symbol α signifies a weighting coefficient, while γ measures the consistency of the width-to-height ratio between the prediction frame and the actual frame. Furthermore, wgt and hgt represent the width and height of the actual frame, while w and h represent the width and height of the prediction frame, respectively.
The ultimate CIoU Loss is computed as follows:
$$L_{\mathrm{CIoU}} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{C^2} + \alpha\gamma$$
Within this context, LCIoU denotes the CIoU Loss. CIoU indirectly addresses aspect ratio through a parameter; however, due to the relative nature of width and height descriptions, accurate localization is challenging, and a true reflection of width, height, and confidence is compromised. To tackle this issue, researchers have proposed Efficient IoU Loss (EIoU) [27]. EIoU builds upon CIoU by separately considering the horizontal and vertical ratios, addressing challenges in width and height calculations. The formula is articulated as follows:
$$L_{\mathrm{EIoU}} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{C^2} + \frac{\rho^2(w, w^{gt})}{C_w^2} + \frac{\rho^2(h, h^{gt})}{C_h^2}$$
In this equation, LEIoU represents the EIoU Loss, while Cw and Ch denote the width and height of the smallest outer rectangle for the prediction and actual frames, respectively.
EIoU enhances the convergence speed and regression accuracy of the prediction frame by separately considering the differences in width and height. This approach addresses the challenges associated with height and width constraints in the shape convergence task of CIoU, leading to an improved ability to detect small targets to some extent.
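As a concrete reference, the sketch below computes the EIoU Loss of the last equation for boxes in (x1, y1, x2, y2) format. This is a generic transcription of Zhang et al. [27], not code from the authors of this paper.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """EIoU: L = 1 - IoU + rho^2(b,b_gt)/C^2 + (w-w_gt)^2/Cw^2 + (h-h_gt)^2/Ch^2,
    with boxes given as (..., 4) tensors in (x1, y1, x2, y2) format."""
    # Intersection and union
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    w1, h1 = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    w2, h2 = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Smallest enclosing box: width Cw, height Ch, squared diagonal C^2
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between box centers, rho^2(b, b_gt)
    dx = (pred[..., 0] + pred[..., 2] - target[..., 0] - target[..., 2]) / 2
    dy = (pred[..., 1] + pred[..., 3] - target[..., 1] - target[..., 3]) / 2
    rho2 = dx ** 2 + dy ** 2

    # Width and height penalized separately against the enclosing box
    return (1 - iou + rho2 / c2
            + (w1 - w2) ** 2 / (cw ** 2 + eps)
            + (h1 - h2) ** 2 / (ch ** 2 + eps))
```

The final two terms are what replace CIoU’s coupled aspect-ratio penalty, so a prediction that matches the target width but not its height is penalized on height alone.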

3. Experimental Design and Result Analysis

3.1. Dataset

The dataset employed in this experiment comprises images of wind turbine blades captured at high altitude by a UAV, generously provided by a wind power company. The wind turbine model examined in this study is the Goldwind GW87-1.5MW, featuring a blade length of 43 m and a blade width of 2 m. In accordance with the wind farm UAV operation protocol, the weather conditions for this flight operation were favorable, with no adverse weather effects. The images encompass three types of defects: breakages, cracks, and scratches. Data augmentation techniques, such as brightness adjustment, contrast adjustment, and noise perturbation, were applied to diversify the dataset, enhance model robustness, and mitigate the risk of overfitting. The final dataset encompasses a total of 4500 images. To ensure a comprehensive evaluation of the model’s performance, all images were randomly partitioned into training, validation, and test sets in a ratio of 7:2:1. A labeling tool was used for defect annotation, generating XML files compatible with the COCO128 format that provide precise labeling information. These XML files are then converted to TXT files through a format conversion program, supplying suitable input data for neural network training.
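The XML-to-TXT conversion step described above can be sketched as follows, assuming Pascal-VOC-style XML fields; the class list and output directory are illustrative assumptions, not details given in the paper.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed class ordering -- adjust to the actual annotation scheme.
CLASSES = ["breakage", "crack", "scratch"]

def xml_to_yolo_txt(xml_path: str, out_dir: str = "labels") -> None:
    """Convert one VOC-style XML annotation to a YOLO TXT label file:
    one line per box, 'class_id x_center y_center width height', normalized."""
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        cls_id = CLASSES.index(obj.find("name").text)
        box = obj.find("bndbox")
        x1, y1 = float(box.find("xmin").text), float(box.find("ymin").text)
        x2, y2 = float(box.find("xmax").text), float(box.find("ymax").text)
        xc, yc = (x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h  # normalized center
        w, h = (x2 - x1) / img_w, (y2 - y1) / img_h            # normalized size
        lines.append(f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    out = Path(out_dir) / (Path(xml_path).stem + ".txt")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n".join(lines))
```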

3.2. Experimental Environment and Parameter Settings

The experiments were conducted using the PyTorch deep learning framework, with a Python 3.8 development environment on the Windows 11 operating system. The GPU employed was an NVIDIA RTX 4060, and the machine was equipped with 32 GB of memory. The specific experimental parameters are as follows: the wind turbine blade images were standardized to a consistent size of 640 × 640 pixels; the hyperparameters were configured with a batch size of 16, an initial learning rate of 0.01, and a stochastic gradient descent (SGD) momentum of 0.937; and training ran for 300 epochs. All experiments were carried out within this specified environment.
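Using the Ultralytics YOLO API, the stated configuration might look like the sketch below. The dataset YAML path and the choice of yolov8n.yaml are placeholders; the authors’ modified CBAM-BiFPN variant would require its own model definition file.

```python
from ultralytics import YOLO

# Minimal training sketch with the hyperparameters stated above.
model = YOLO("yolov8n.yaml")          # or a custom CBAM-BiFPN model YAML
model.train(
    data="blade_damage.yaml",         # hypothetical dataset config
    imgsz=640,                        # 640 x 640 input size
    batch=16,                         # batch size
    epochs=300,                       # training cycles
    lr0=0.01,                         # initial learning rate
    momentum=0.937,                   # SGD momentum
    optimizer="SGD",
)
```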

3.3. Evaluation Metrics

Precision (P), Recall (R), mean Average Precision (mAP), the number of parameters, and GFLOPs served as the metrics for assessing detection performance in the experiments.
The precision rate assesses the accuracy of positive predictions made by the classifier. It is calculated using the following formula:
$$P = \frac{TP}{TP + FP}$$
Here, TP represents True Positives, and FP represents False Positives.
Recall evaluates the proportion of correctly predicted positive examples by the classifier among all true positive examples. It is computed using the following formula:
$$R = \frac{TP}{TP + FN}$$
Here, TP represents True Positives, and FN represents False Negatives.
The mean Average Precision assesses the classifier’s performance in target detection: the average precision (AP) is calculated for each category and then averaged across categories using the following formula:
$$\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N} AP_i = \frac{1}{N}\sum_{i=1}^{N}\int_0^1 \mathrm{Precision}(\mathrm{Recall})\, d(\mathrm{Recall})$$
Here, N represents the number of categories, and APi is the average precision for category i, obtained by integrating precision over recall. The average precision for each category is determined by calculating the Intersection over Union between the predicted results and the actual labeled results, helping to ascertain the error rate and correctness. The metric mAP@0.5 represents the average accuracy when the Intersection over Union threshold is set to 0.5. Moreover, mAP@0.5:.95 signifies the average accuracy across IoU thresholds ranging from 0.5 to 0.95, with a step size of 0.05. These metrics offer a comprehensive assessment of accuracy under different IoU conditions.
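A VOC-style computation of these quantities can be sketched as follows; the toy precision-recall values are invented purely for illustration.

```python
import numpy as np

def average_precision(recall, precision):
    """Area under a precision-recall curve for one class, i.e. the integral
    in the mAP formula above, using the monotone precision envelope."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]   # precision envelope
    idx = np.where(r[1:] != r[:-1])[0]         # points where recall changes
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# Toy example: counts at one confidence threshold for a single class.
tp, fp, fn = 8, 2, 4
precision_pt = tp / (tp + fp)   # P = TP / (TP + FP) = 0.8
recall_pt = tp / (tp + fn)      # R = TP / (TP + FN) ~ 0.667
ap = average_precision(np.array([0.3, 0.5, recall_pt]),
                       np.array([0.9, 0.85, precision_pt]))
# mAP is the mean of per-class APs; mAP@0.5 fixes the IoU threshold at 0.5,
# and mAP@0.5:.95 averages APs over IoU thresholds 0.5, 0.55, ..., 0.95.
```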
The parameter metric denotes the number of parameters in the model, serving as an indicator of model complexity. Generally, a higher number of parameters signifies increased model complexity. The number of parameters can be calculated using the following formula:
$$\mathrm{Parameters} = C_{in} \times C_{out} \times K \times K$$
Here, K represents the convolutional kernel size, and Cin and Cout indicate the number of input and output feature channels, respectively.
GFLOPs measures the computational cost of the model; for a convolutional layer, it can be estimated as follows:
$$\mathrm{GFLOPs} = W \times H \times K \times K \times C_{in} \times C_{out}$$
Here, W and H denote the width and height of the input feature map, respectively; K represents the convolution kernel size, and Cin and Cout indicate the number of input and output feature channels, respectively.
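Both formulas are per-layer estimates (bias, stride, and padding effects are ignored); a small helper makes the arithmetic concrete.

```python
def conv_params(c_in, c_out, k):
    """Weights of one conv layer (bias omitted): Cin * Cout * K * K."""
    return c_in * c_out * k * k

def conv_gflops(w, h, c_in, c_out, k):
    """Multiply count of one conv layer on a W x H output map, in GFLOPs,
    matching the formula above (no stride/padding/bias refinements)."""
    return w * h * k * k * c_in * c_out / 1e9

# Example: a 3x3 conv from 128 to 256 channels on an 80 x 80 feature map.
print(conv_params(128, 256, 3))          # 294912 parameters
print(conv_gflops(80, 80, 128, 256, 3))  # ~1.89 GFLOPs
```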

3.4. Ablation Experiment

To validate the viability and efficacy of the enhanced model, the improved approach is dissected into three key components: firstly, the integration of the CBAM attention mechanism module into the backbone network; secondly, the substitution of the PANet module with BiFPN in the feature fusion module; and thirdly, the incorporation of the EIoU Loss function. To comprehensively evaluate the impact of individual module alterations and combinations on the algorithm’s performance, eight distinct ablation experiments were conducted. The outcomes of these experiments are amalgamated and summarized in Table 1, where “√” denotes the incorporation of the corresponding method.
Comparing Experiment 1 with Experiment 2 in Table 1, the incorporation of the CBAM attention mechanism into the backbone network leads to a marginal decrease in detection precision of 0.5%, a slight decrease in the recall rate of 0.1%, an elevation of mAP@0.5 by 0.3%, and a reduction in mAP@0.5:.95 of 0.2%. In the comparison between Experiment 1 and Experiment 3, it is evident that the introduction of BiFPN in the feature fusion part enhances multiscale feature fusion. This results in a notable improvement in detection precision of 1.8%, an increase in the recall rate of 0.7%, a marginal rise of 0.2% in mAP@0.5, and a gain of 0.2% in mAP@0.5:.95. Comparing Experiment 1 with Experiment 4, the integration of the EIoU Loss function enhances detection precision by 1.6%, albeit with a trade-off as the recall rate decreases by 1.6%; mAP@0.5 increases by 0.9% and mAP@0.5:.95 by 0.3%. Different strategies contribute varied improvements to the mAP value, all yielding positive optimization results compared with the original YOLOV8 model.
Contrasting Experiment 5 with Experiment 3 and Experiment 6 with Experiment 4 shows that adding the CBAM attention mechanism module improves the recall rate both when combined with BiFPN and when combined with the EIoU Loss function. The introduction of the attention mechanism enhances the model’s classification performance for diverse defect types, with an increased focus on the target itself. Comparing Experiment 6 with Experiment 8, the introduction of the BiFPN structure leads to improvements in the P, R, mAP@0.5, and mAP@0.5:.95 values. This confirms that feature fusion is more effective when incorporated into the CBAM-BiFPN-YOLOV8 model.
The experimental findings demonstrate that the CBAM-BiFPN-YOLOV8 network model, incorporating the three principal enhancement methods, exhibits a notable advancement in target detection. The rational and effective design of these three improvement methods contributes to enhanced detection accuracy, reduces the missed-detection rate for small targets, and improves the overall detection efficacy in the task of wind turbine blade defect detection.

3.5. Comparison Testing of Various Models

To assess the superiority of the CBAM-BiFPN-YOLOV8 model in comparison with other algorithms, extensive training and testing were conducted using SSD, YOLOV3, YOLOV3-tiny, and YOLOV8n on an identical wind turbine blade damage dataset. The model performances were evaluated based on the P, R, mAP@0.5, mAP@0.5:.95, Parameters, and GFLOPs metrics. The comparative results are summarized in Table 2.
Table 2 highlights that the proposed CBAM-BiFPN-YOLOV8 model achieves an impressive mAP@0.5 value of 83.3%. In comparison with the SSD model, the CBAM-BiFPN-YOLOV8 model uses 20,967,837 fewer parameters and 265.1 fewer GFLOPs; detection precision improves by 11.4% and the recall rate by 9.7%, yielding an increase of 12.8% in mAP@0.5 and a substantial 14.5% increase in mAP@0.5:.95. In contrast to the YOLOV3-tiny model, the CBAM-BiFPN-YOLOV8 model uses 9,088,275 fewer parameters and 10.7 fewer GFLOPs; detection precision improves by 0.3% and the recall rate by 7%, leading to a 3.8% increase in mAP@0.5 and a noteworthy 4.8% increase in mAP@0.5:.95. Similarly, compared with the YOLOV3 model, the CBAM-BiFPN-YOLOV8 model uses 1,013,598 fewer parameters and 3.8 fewer GFLOPs; detection precision improves by a notable 2.7%, accompanied by a 4.5% increase in the recall rate, contributing to a 1.5% increase in mAP@0.5 and a commendable 1.6% increase in mAP@0.5:.95. In comparison with the YOLOV8 model, the CBAM-BiFPN-YOLOV8 model incurs a slight increase of 33,962 parameters and 0.1 GFLOPs; detection precision improves by 1.6%, the recall rate rises by 1%, mAP@0.5 increases by 1.4%, and mAP@0.5:.95 by a respectable 1.1%.
In summary, the CBAM-BiFPN-YOLOV8 model demonstrates significant advantages and strong performance across various indicators, achieving superior results at a lower computational cost. Comparative analysis with other algorithms underscores the model’s overall excellence. To provide a more intuitive evaluation, a visual comparison of the detection performance before and after the proposed improvements is presented in Figure 5. Comparing Figure 5a,b reveals missed detections in the second and third columns of the fourth row in Figure 5a. In Figure 5b, as indicated in the first row third column, second row third column, and fourth row first column, the improved CBAM-BiFPN-YOLOV8 model captures more small targets. The original YOLOV8 algorithm model exhibits cases of both missed detection and false detection for certain defective targets. In contrast, the CBAM-BiFPN-YOLOV8 algorithm model attends to both global and local features, demonstrating increased sensitivity to small target objects and thus providing a better solution to the problem of missed detection.
Through comparison, it is evident that the CBAM-BiFPN-YOLOV8 algorithm effectively addresses issues related to susceptibility to loss and challenges in localizing small target features with diverse scales, particularly in complex backgrounds. This visual evidence reinforces the algorithm’s superior performance and enhancement in practical applications compared with the YOLOV8 model.

4. Conclusions

This study endeavors to address the challenges posed by the complex backgrounds of wind turbine blade damage images and the multiscale distribution of features, which often result in the easy loss of small target features and difficulties in localization. In pursuit of this goal, we made enhancements to the YOLOV8 model. The outcomes are outlined as follows:
(1)
By incorporating the CBAM attention mechanism, the backbone network effectively enhances attention to shallow damage features, thereby improving the capability to capture small target features. In the feature fusion segment, the Weighted Bidirectional Feature Pyramid Network (BiFPN) is employed to achieve the optimal fusion of multiscale information, enhancing global sensing and analytical abilities. The introduction of the Effective Intersection over Union (EIoU) narrows the gap between computed prediction values and real values and strengthens the perturbation resistance of bounding boxes for small targets, providing a novel quantitative assessment approach for wind turbine blade damage detection.
(2)
Experimental results reveal that the CBAM-BiFPN-YOLOV8 model proposed in this paper attains a precision rate of 82.9%. In comparison with the YOLOV8 model, the enhanced model exhibits significant improvements across key indicators. To be precise, the precision rate increased by 1.6%, the recall rate rose by 1.0%, mAP@0.5 improved by 1.4%, and mAP@0.5:.95 improved by 1.1%. These findings distinctly illustrate the substantial positive impact of the introduced enhancements on the model’s performance. The cumulative improvements underscore the superior performance of the CBAM-BiFPN-YOLOV8 model in the intelligent identification of wind turbine blade damage. The refined model’s precision, recall, and mAP metrics hold paramount significance for the accurate identification and quantitative assessment of wind turbine blade damage, offering reliable technical support for practical applications in wind turbine health monitoring.
(3)
Through comparative experiments, we thoroughly validated the remarkable effectiveness and outstanding performance of the CBAM-BiFPN-YOLOV8 model in improving detection accuracy and reducing the False-Negative rate for small targets. The enhanced model not only achieved substantial performance improvements in key metrics but also realized significant advancements in overall performance at a lower cost. This further substantiates its practical feasibility. This study provides positive guidance for the practical application of wind turbine blade damage detection in engineering and presents a viable and efficient solution for related fields.
(4)
Nonetheless, we acknowledge the existence of potential limitations and opportunities for enhancement in this study. Firstly, despite the notable progress in feature extraction and detection accuracy achieved by our method, more complex weather conditions may still degrade algorithm performance on damaged wind turbine blades. Secondly, greater emphasis will be directed toward the detection of damage dimensions in future research. Considering the causative factors leading to fatigue cracks on the small surfaces of wind turbine blades, detecting damage dimensions is crucial for ensuring the continued normal operation of wind turbine blades [28]. Lastly, in selecting the most suitable technique, a careful balance between detection speed and accuracy should be considered. For instance, the CBAM-BiFPN-YOLOV8-based wind turbine blade damage detection model exhibited an accuracy of 82.9%, while the improved ResNet-50 + TCCS + SS wind turbine blade damage detection model attained an accuracy of 81.2%; although there is a numerical distinction, the margin is not substantial. The model effectively detects low-resolution and indistinct wind turbine blade features, along with small defects in the target. It is essential to note that an increase in accuracy does not necessarily correspond to an increase in detection speed. Hence, future research can further optimize the algorithm to enhance its robustness and generalizability.
In summary, our investigation has delved deeply into the realm of wind turbine blade damage detection using the enhanced YOLOv8 model, addressing issues present in the original algorithm and yielding noteworthy outcomes. It furnishes a practical and effective solution for wind turbine blade damage detection. We contend that these findings hold substantial significance for real-world applications in wind turbine blade inspection and offer valuable insights for future research and practical implementations.

Author Contributions

Conceptualization, B.F. and H.Y.; methodology, J.W.; software, H.Y.; validation, H.Y.; formal analysis, B.F. and C.Z.; investigation, H.Y.; resources, B.F. and Y.H.; data curation, B.F.; writing—original draft preparation, B.F. and H.Y.; writing—review and editing, B.F. and H.Y.; project administration, B.F.; funding acquisition, B.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the 2023 Autonomous Region Key R&D and Achievement Transformation Plan (2023YFSW0003, 2023YFJM0001, 2023YFSH0050); the National Natural Science Foundation of China Regional Science Fund Project (51965054); the Inner Mongolia Autonomous Region Military-Civilian Integration Key Scientific Research Project and Soft Science Research Project (JMZD202202, JMRHZX20220009, JMZD202303); the Program for Young Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region (NJYT23044); and the Inner Mongolia Key Laboratory of Intelligent Diagnosis and Control of Mechatronic System and Inner Mongolia Engineering Research Center of Intelligent Equipment for the Entire Process of Forage and Feed Production.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We express our gratitude for the support received from the 2023 Autonomous Region Key R&D and Achievement Transformation Plan (2023YFSW0003, 2023YFJM0001, 2023YFSH0050); the National Natural Science Foundation of China Regional Science Fund Project (51965054); the Inner Mongolia Autonomous Region Military-Civilian Integration Key Scientific Research Project and Soft Science Research Project (JMZD202202, JMRHZX20220009, JMZD202303); the Program for Young Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region (NJYT23044); and the funding and technical support from the Inner Mongolia Key Laboratory of Intelligent Diagnosis and Control of Mechatronic System and Inner Mongolia Engineering Research Center of Intelligent Equipment for the Entire Process of Forage and Feed Production.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Summerfield-Ryan, O.; Park, S. The power of wind: The global wind energy industry’s successes and failures. Ecol. Econ. 2023, 210, 107841. [Google Scholar] [CrossRef]
  2. Jensen, F.; Aoun, E.A.; Focke, O.; Krenz, A.; Tornow, C.; Schlag, M.; Lester, C.; Herrmann, A.; Mayer, B.; Sorg, M.; et al. Investigation of the Causes of Premature Rain Erosion Evolution in Rotor Blade-like GFRP Structures by Means of CT, XRM, and Active Thermography. Appl. Sci. 2022, 12, 11307. [Google Scholar] [CrossRef]
  3. Magalhães, G.M.C.; Souza, J.A.; dos Santos, E.D. A Constructal-Theory-Based Methodology to Determine the Configuration of Empty Channels Used in the Resin Impregnation of a Square Porous Plate. Fluids 2023, 8, 317. [Google Scholar] [CrossRef]
  4. Masita, K.; Hasan, A.; Shongwe, T. Defects Detection on 110 MW AC Wind Farm’s Turbine Generator Blades Using Drone-Based Laser and RGB Images with Res-CNN3 Detector. Appl. Sci. 2023, 13, 13046. [Google Scholar]
  5. Alnutayfat, A.; Sutin, A. Wideband Vibro-Acoustic Modulation for Crack Detection in Wind Turbine Blades. Appl. Sci. 2023, 13, 9570. [Google Scholar]
  6. Ding, S.; Yang, C.; Zhang, S. Acoustic-Signal-Based Damage Detection of Wind Turbine Blades—A Review. Sensors 2023, 23, 4987. [Google Scholar] [CrossRef]
  7. Kyungil, K.; Kirsten, D.; Christopher, P.; Ian, H.; Weaver, P.M. Progress and Trends in Damage Detection Methods, Maintenance, and Data-driven Monitoring of Wind Turbine Blades–A Review. Renew. Energy Focus 2023, 44, 390–412. [Google Scholar]
  8. Wang, W.; Xue, Y.; He, C.; Zhao, Y. Review of the Typical Damage and Damage-Detection Methods of Large Wind Turbine Blades. Energies 2022, 15, 5672. [Google Scholar] [CrossRef]
  9. Yu, Y.; Cao, H.; Yan, X.; Wang, T.; Ge, S.S. Defect identification of wind turbine blades based on defect semantic features with transfer feature extractor. Neurocomputing 2020, 376, 1–9. [Google Scholar] [CrossRef]
  10. Yang, X.; Zhang, Y.; Lv, W.; Wang, D. Image recognition of wind turbine blade damage based on a deep learning model with transfer learning and an ensemble learning classifier. Renew. Energy 2021, 163, 386–397. [Google Scholar] [CrossRef]
  11. Cheng, X.; Shi, F.; Zhao, M.; Li, G.; Zhang, H.; Chen, S. Temporal Attention Convolutional Neural Network for Estimation of Icing Probability on Wind Turbine Blades. IEEE Trans. Ind. Electron. 2022, 69, 6371–6380. [Google Scholar] [CrossRef]
  12. Tian, W.; Cheng, X.; Li, G.; Shi, F.; Chen, S.; Zhang, H. A Multilevel Convolutional Recurrent Neural Network for Blade Icing Detection of Wind Turbine. IEEE Sens. J. 2021, 21, 20311–20323. [Google Scholar] [CrossRef]
  13. Gao, Y.; Dai, S.; Ji, W.; Wang, R. Low saliency crack detection based on improved multimodal object detection network: An example of wind turbine blade inner surface. J. Electron. Imaging 2023, 32, 033033. [Google Scholar] [CrossRef]
  14. Liu, C.; An, C.; Yang, Y. Wind Turbine Surface Defect Detection Method Based on YOLOv5s-L. NDT 2023, 1, 46–57. [Google Scholar] [CrossRef]
  15. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  16. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  17. Fang, J.; Lin, X.; Zhou, F.; Tian, Y.; Zhang, M. Safety Helmet Detection Based on Optimized YOLOv5. In Proceedings of the Prognostics and Health Management Conference (PHM), Paris, France, 31 May–2 June 2023; pp. 117–121. [Google Scholar]
  18. Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W. YOLOv6: A single-stage object detection framework for industrial applications. arXiv 2022, arXiv:2209.02976. [Google Scholar]
  19. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. IEEE YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar]
  20. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. Yolox: Exceeding yolo series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
  21. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  22. Tejashwini, P.; Thriveni, J.; Venugopal, K. A Novel SLCA-UNet Architecture for Automatic MRI Brain Tumor Segmentation. arXiv 2023, arXiv:2307.08048. [Google Scholar]
  23. Zhu, F.; Cui, J.; Zhu, B.; Li, H.; Liu, Y. Semantic segmentation of urban street scene images based on improved U-Net network. Optoelectron. Lett. 2023, 19, 179–185. [Google Scholar] [CrossRef]
  24. Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10778–10787. [Google Scholar]
  25. Wang, C.; He, W.; Nie, Y.; Guo, J.; Liu, C.; Han, K.; Wang, Y. Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism. arXiv 2023, arXiv:2309.11331. [Google Scholar]
  26. Števuliáková, P.; Hurtik, P. Intersection over Union with smoothing for bounding box regression. arXiv 2023, arXiv:2303.15067. [Google Scholar]
  27. Zhang, Y.-F.; Ren, W.; Zhang, Z.; Jia, Z.; Wang, L.; Tan, T. Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing 2022, 506, 146–157. [Google Scholar] [CrossRef]
  28. Banaszek, A.; Losiewicz, Z.; Jurczak, W. Corrosion influence on safety of hydraulic pipelines installed on decks of contemporary product and chemical tankers. Pol. Marit. Res. 2018, 25, 71–77. [Google Scholar] [CrossRef]
Figure 1. Structure of the YOLOV8 network model.
Figure 2. Structure of the CBAM-BiFPN-YOLOV8 network model.
Figure 3. Comprehensive CBAM structure for target localization.
Figure 4. (a) Schematic representation of the PANet feature fusion structure; (b) schematic representation of the BiFPN feature fusion structure.
Figure 5. Detection effect before and after algorithm improvement: (a) YOLOV8 detection effect diagram; (b) CBAM-BiFPN-YOLOV8 detection effect diagram.
Table 1. Results of the ablation experiments.

| Experiment | CBAM | BiFPN | EIoU | P/% | R/% | mAP@0.5/% | mAP@0.5:.95/% |
|---|---|---|---|---|---|---|---|
| 1 | | | | 81.3 | 76.4 | 81.9 | 53.6 |
| 2 | √ | | | 80.8 | 76.3 | 82.2 | 53.4 |
| 3 | | √ | | 83.1 | 77.1 | 82.1 | 53.8 |
| 4 | | | √ | 82.9 | 74.8 | 82.8 | 53.9 |
| 5 | √ | √ | | 80.9 | 77.8 | 82.8 | 53.7 |
| 6 | √ | | √ | 80.9 | 77.0 | 83.0 | 54.2 |
| 7 | | √ | √ | 79.8 | 78.1 | 82.5 | 53.7 |
| 8 | √ | √ | √ | 82.9 | 77.4 | 83.3 | 54.7 |
Table 2. Comparative experimental results.

| Model | Parameters | GFLOPs | P/% | R/% | mAP@0.5/% | mAP@0.5:.95/% |
|---|---|---|---|---|---|---|
| SSD | 24,013,232 | 273.4 | 71.5 | 67.7 | 70.5 | 40.2 |
| YOLOV3 | 4,058,993 | 12.1 | 80.2 | 72.9 | 81.8 | 53.1 |
| YOLOV3-tiny | 12,133,670 | 19.0 | 82.6 | 70.4 | 79.5 | 49.9 |
| YOLOV8n | 3,011,433 | 8.2 | 81.3 | 76.4 | 81.9 | 53.6 |
| Ours | 3,045,395 | 8.3 | 82.9 | 77.4 | 83.3 | 54.7 |