LBFNet: A Tomato Leaf Disease Identification Model Based on Three-Channel Attention Mechanism and Quantitative Pruning

Chen, Hailin; Wang, Yi; Jiang, Ping; Zhang, Ruofan; Peng, Jialiang

doi:10.3390/app13095589


by Hailin Chen ¹, Yi Wang ¹,*, Ping Jiang ², Ruofan Zhang ¹ and Jialiang Peng ¹

¹ College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China

² College of Mechanical and Electrical Engineering, Hunan Agricultural University, Changsha 410128, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(9), 5589; https://doi.org/10.3390/app13095589

Submission received: 10 April 2023 / Revised: 26 April 2023 / Accepted: 27 April 2023 / Published: 30 April 2023


Abstract

Current neural networks for tomato leaf disease recognition suffer from large parameter counts, long training times, and low accuracy. To address these problems, this paper proposes a lightweight convolutional neural network, LBFNet. First, LBFNet is established as the base model. Second, a three-channel attention mechanism module is introduced to learn the disease features in tomato leaf disease images and reduce the interference of redundant features. Finally, a cascade module is introduced to increase the depth of the model, mitigate the vanishing and exploding gradient problem, and reduce the loss caused by increasing the depth of the model. Quantized pruning is also used to further compress the model parameters and optimize model performance. The results show that LBFNet achieves 99.06% accuracy on the LBFtomato dataset, with a training time of 996 s and a per-class accuracy of over 94%. Further training from the weight file saved after quantized pruning brings the model accuracy to 97.66%. Compared with the base model, accuracy improved by 28%, and compared with the traditional ResNet50, the model parameters were reduced by 96.7%. LBFNet can quickly and accurately identify tomato leaf diseases in complex environments, providing effective assistance to agricultural producers.

Keywords: artificial intelligence; three-channel attention mechanism; tomato leaf disease; convolution neural network; deep learning

1. Introduction

With the outbreak of COVID-19 in 2019, many major food-producing countries have taken measures to restrict grain exports. As tomatoes are the most widely planted and consumed vegetable crop globally, and China is one of the world’s largest producers and consumers of tomatoes, tomato production is an important means for farmers to increase income and generate export revenue. However, various diseases severely affect tomato yields, especially considering the increasingly serious issue of food safety. Therefore, accurate identification of tomato diseases and timely treatment have become an urgent issue that needs to be addressed [1,2,3,4].

Many deep learning-based methods have been proposed for crop disease identification [5,6,7,8,9,10,11,12,13,14,15,16,17,18]. For example, Li et al. [5] proposed the OplusVNet, a 13-layer convolutional neural network that achieved 99% prediction accuracy on a dataset collected from the field using VGG16 network modules for transfer learning. Nguyen et al. [6] proposed a neural network model that combines image segmentation with transfer learning, segments the image, uses HSV to extract the original leaf area and black background, and feeds it into a VGG-19 model for transfer learning, achieving an accuracy of 99.72%, with a training time of 275,000 s. These networks have effectively improved the recognition of crop diseases and pests. However, due to their complex structure and large model size, they are difficult to deploy on current mainstream devices for real-time disease and pest identification.

Many researchers have realized the inconvenience brought by complex models and began designing models with simple structures but powerful functions. For example, Ding et al. [19] proposed a model similar to the VGG inference time backbone, consisting of a series of 3 × 3 convolutions and ReLU, and proposed a multi-branch topology efficient model to reduce training time. Zeng et al. [20] proposed a self-attention convolutional neural network (SACNN) to address the confusion caused by small disease areas, low contrast between disease areas and backgrounds, and background complexity in crop disease images. The recognition accuracy on AES-CD9214 and MK-D2 was 95.33% and 98%, respectively. These studies focus on the structure and performance of the model to ensure that the model has efficient performance under a simple structure, but they do not fully consider the impact of data on the model. Deng et al. [21] explored this issue from the perspective of data and found that the difficulty in obtaining data samples is the main challenge to improving disease recognition performance. Therefore, they proposed a new data augmentation method based on generative adversarial networks (GAN), called RAHC_GAN, for tomato leaf data augmentation and disease recognition. The results showed that RAHC_GAN can generate leaves with clear disease features, and the generated extended dataset can significantly improve the classifier’s recognition performance. Data augmentation is a commonly used method in deep learning to prevent model overfitting. To address the problem of noise samples that may be introduced by data augmentation, which may damage the performance of unorganized data during the inference process, Gong et al. [22] proposed KeepAugment, which uses saliency maps to detect important areas in the original image and then preserves these information areas during the augmentation process to generate more realistic training images. 
The results showed that this method can improve the training effect on different datasets. Many studies have shown that combining attention mechanisms with deep learning models can effectively improve model performance [23,24,25,26,27]. However, most studies did not investigate how different attention mechanisms affect the model. Different attention mechanisms have different characteristics, so their impact on the model is likely to differ. One objective of this study is therefore to investigate the impact of different attention mechanisms on LBFNet in order to improve the model’s generalization and practicality [28].

Current neural network models for identifying tomato leaf diseases suffer from large model size, complex structure, slow inference speed, and insufficient accuracy. In addition, most publicly available datasets do not include tomato leaves under complex conditions, and the processed data in these datasets leave models trained on them with weak generalization ability. To address these issues, we propose LBFNet, a convolutional neural network model based on the VGG series that has a simple structure but powerful functions. We use different attention mechanisms to extract deep features from images and reduce the influence of factors such as background information on model accuracy, and we use a cascade structure to preserve the original information of the image and improve model performance. We apply various data augmentation techniques to the ten types of tomato leaf data in PlantVillage while maintaining sample balance, and we add tomato leaf image data from different sources to construct the LBFtomato dataset. The training set and test set were divided in a 7:3 ratio. The model trained on LBFtomato exhibited improved generalization and accuracy. Compared with previous studies on tomato disease and pest identification, the model proposed in this paper considers more diverse influencing factors and is applicable to the identification of tomato leaf diseases in practical environments [29,30,31,32,33,34,35,36].

2. Materials and Methods

2.1. LBFtomato Leaf Image Datasets

The original experimental data came from the publicly available Plant Village dataset, which comprises ten categories of tomato leaves: nine categories of tomato diseases and one category of healthy tomato leaves. The categories are early blight, late blight, powdery mildew, leaf mold, Septoria leaf spot, bacterial spot, spider mites, yellow leaf curl virus, brown spot, and healthy leaves, totaling 18,835 images. The dataset was divided into a training set and a test set in a 7:3 ratio. To increase the number of samples in each category to approximately 1000, data augmentation techniques such as flipping, translation, and brightness adjustment were first applied to the Plant Village training set. To ensure data balance, the dataset was then cleaned by removing some interfering samples, resulting in a Plant Village training set of 13,062 images. Finally, a new dataset named LBFtomato was created by adding real tomato leaf images taken at the Changsha tomato planting base and real-world tomato leaf images obtained from Kaggle and GitHub. LBFtomato comprises the same ten categories and includes real-world tomato leaf data, which can better verify the effectiveness of the model. The numbers of images in the Plant Village dataset and the LBFtomato dataset are shown in Table 1 and Table 2, respectively. The initial tomato leaf images were resized to 256 × 256, as depicted in Figure 1 [37].

Data augmentation was performed using the ImageDataGenerator class in Keras with the following parameter settings: rotation of 40 degrees; horizontal and vertical translation of 0.2; perspective transformation of 0.2; scaling of 0.2; horizontal flip; filling in the nearest mode; and random brightness.
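The settings above can be written down as keyword arguments for Keras. The following is a minimal sketch, not the authors' code: mapping "perspective transformation" to `shear_range` is an assumption, the brightness range is left out because the text does not state it, and `build_generator` is a hypothetical helper name.

```python
# Keyword arguments for keras ImageDataGenerator, read off the listed settings.
augmentation_kwargs = {
    "rotation_range": 40,        # rotation in degrees
    "width_shift_range": 0.2,    # horizontal translation (fraction of width)
    "height_shift_range": 0.2,   # vertical translation (fraction of height)
    "shear_range": 0.2,          # assumption: stands in for "perspective transformation"
    "zoom_range": 0.2,           # scaling
    "horizontal_flip": True,
    "fill_mode": "nearest",      # how newly exposed pixels are filled
    # "brightness_range" is omitted: the text gives no numeric range.
}


def build_generator(kwargs=augmentation_kwargs):
    """Hypothetical helper: TensorFlow is imported lazily, only when called."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**kwargs)
```

Keeping the settings in a plain dictionary makes them easy to log alongside each training run.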

2.2. Test Environment

The experiments were run on a 64-bit Windows system with a 500 GB solid-state drive, a 2 TB mechanical hard disk, a Core i5-11400H processor, 16.0 GB of RAM, and an RTX 3050Ti graphics card. The software environment comprised Anaconda 4.8.4, CUDA 11.0, and Python 3.7, and TensorFlow 2.2 was used for model construction and training.

2.3. Use Cascading Structures to Reduce Model Loss

Convolutional neural network (CNN) models often encounter the issues of gradient vanishing and explosion during the training process. To address this problem, the traditional solution is to initialize and regularize the data. However, as the depth of the network increases, this approach can lead to network performance degradation and an increase in error rate. To solve this issue, a cascading structure has been introduced. The cascading structure not only resolves the problems of gradient vanishing and explosion but also prevents the loss of disease information caused by deepening and small tomato leaf images during the training of the tomato leaf disease classifier. By inputting the original information into a specific layer for information supplementation, the model can effectively supplement disease information during training. This improves the model’s accuracy and performance while avoiding gradient vanishing and explosion caused by deepening. Figure 2 shows an illustration of the cascading structure.

2.4. Using Three-Channel Attention Mechanism to Enhance Model Robustness

Tomato leaf diseases pose several challenges, such as small disease targets, subtle differences between different types of diseases, and unclear disease features. To address these issues, we developed a three-channel attention mechanism module that enables the model to capture subtle disease features and focus on specific disease locations. This module effectively suppresses the negative impact of interfering information, such as leaves and background, on the model, which in turn enhances the model’s robustness. The three-channel attention mechanism module, depicted in Figure 3, consists of a spatial attention module, a channel attention module, and a coordinate attention module, which are connected to produce the final output.

The proposed three-channel attention mechanism is built on the idea of CBAM, which establishes interdependencies between channels and extracts location attention information through convolution. However, convolution can only capture local relationships and lacks the ability to extract long-distance relationships. To overcome this limitation, we added the coordinate attention mechanism to CBAM, enabling the model to encode horizontal and vertical location information into channel attention based on the time-channel relationship. This approach allows the mobile network to effectively attend to a wide range of location information without excessive computational effort. Moreover, it compensates for the drawback of the original CBAM by enabling the model to accurately locate and identify disease information in images without rapid degradation of the model’s accuracy due to the complex environment, thus improving the model’s robustness.

The channel attention mechanism is an adaptive spatial region selection mechanism that selects the positions the model needs to focus on to obtain the specific location of the disease in the image. The channel attention mechanism weights the convolutional features of the channels, thereby enhancing the expression ability of the disease parts. The channel attention mechanism can be expressed as Equation (1), as shown in detail in Figure 4.

The Spatial Attention Module is a channel compression technique that performs average pooling and max pooling separately in the channel dimension. Specifically, the MaxPool operation extracts the maximum value in the channel, with the number of extractions being the product of the height and width. Meanwhile, the AvgPool operation extracts the average value in the channel, with the number of extractions also being the product of the height and width. Subsequently, the feature maps extracted earlier (each with a single channel) are combined to obtain a 2-channel feature map that is utilized to locate the specific position of the disease. The module structure is illustrated in Figure 5. The Spatial Attention Module can be mathematically expressed as Equation (2).

\begin{aligned} M_{c}(F) &= \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))) \\ &= \sigma(W_{1}(W_{0}(F_{avg}^{c})) + W_{1}(W_{0}(F_{max}^{c}))) \end{aligned}

(1)

\begin{aligned} M_{s}(F) &= \sigma(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])) \\ &= \sigma(f^{7 \times 7}([F_{avg}^{s}; F_{max}^{s}])) \end{aligned}

(2)
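Equations (1) and (2) can be illustrated with a small pure-Python sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: the feature map `F` and the weights `W0` and `W1` are made-up values, and the spatial part shows only the channel-wise pooling step, leaving out the 7 × 7 convolution and the final sigmoid.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def channel_attention(F, W0, W1):
    """Equation (1): F is a C x H x W nested list; returns one weight per channel."""
    flat = [[v for row in ch for v in row] for ch in F]
    f_avg = [sum(ch) / len(ch) for ch in flat]   # global average pool per channel
    f_max = [max(ch) for ch in flat]             # global max pool per channel

    def mlp(v):
        hidden = [max(0.0, h) for h in matvec(W0, v)]  # shared layer W0 with ReLU
        return matvec(W1, hidden)                      # shared layer W1

    return [sigmoid(a + m) for a, m in zip(mlp(f_avg), mlp(f_max))]


def spatial_pooling(F):
    """Pooling step of Equation (2): mean and max over channels at each of the H x W positions."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    f_avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    f_max = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    return f_avg, f_max


# Illustrative input: 2 channels, 2 x 2 spatial size.
F = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 8.0]]]
W0 = [[0.5, 0.5]]      # made-up weights: 2 channels -> 1 hidden unit
W1 = [[1.0], [-1.0]]   # made-up weights: 1 hidden unit -> 2 channels
weights = channel_attention(F, W0, W1)
f_avg, f_max = spatial_pooling(F)
```

The two pooled maps returned by `spatial_pooling` are what would be stacked into the 2-channel feature map before the convolution.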

Y_{c}(i, j) = X_{c}(i, j) \times G_{c}^{h}(i) \times G_{c}^{w}(j)

(3)

The Coordinate Attention Module operates in three steps. First, the input feature maps are pooled in horizontal and vertical directions using 1D global pooling to generate two separate feature maps through aggregation. Next, the two feature maps with specific direction information are encoded to generate attention maps. Finally, the two attention maps are multiplied with the input feature maps to achieve the goal of emphasizing the coordinate information representation. The Coordinate Attention Module is depicted in Figure 6.

When applied in mobile environments, the new transformation should be designed to be as simple as possible while still being able to leverage the captured position information to accurately identify the region of interest and efficiently capture the relationship between channels. To achieve this, a three-channel attention mechanism is constructed to fuse the information between different channels. The output of the Coordinate Attention Module is demonstrated in Equation (3).
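The three steps of the Coordinate Attention Module and the output in Equation (3) can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the encoding step is reduced to a plain sigmoid on the pooled values (the learned convolutions are omitted), and the function name is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_attention(X):
    """X is a C x H x W nested list.

    Returns Y with Y[c][i][j] = X[c][i][j] * g_h[i] * g_w[j], where the gates
    g_h and g_w are computed separately for each channel c.
    """
    Y = []
    for ch in X:
        H, W = len(ch), len(ch[0])
        # Step 1: 1D global pooling along each direction.
        pooled_h = [sum(row) / W for row in ch]                              # one value per row i
        pooled_w = [sum(ch[i][j] for i in range(H)) / H for j in range(W)]   # one value per column j
        # Step 2 (simplified): turn the pooled values into attention gates.
        g_h = [sigmoid(v) for v in pooled_h]
        g_w = [sigmoid(v) for v in pooled_w]
        # Step 3: reweight the input with both gates.
        Y.append([[ch[i][j] * g_h[i] * g_w[j] for j in range(W)] for i in range(H)])
    return Y


X = [[[2.0, 2.0], [2.0, 2.0]]]   # illustrative input: 1 channel, 2 x 2
Y = coordinate_attention(X)
```

Because each position is scaled by one row gate and one column gate, a response that is strong along both its row and its column is preserved, while positions off those rows and columns are damped.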

2.5. Reducing Model Parameters Using VGG-Style Convolutional Neural Network

Current convolutional neural network models are well designed but extremely complex. These complex models achieve high accuracy, but they occupy a large amount of memory and have slow inference. The classical VGG convolutional neural network uses a simple architecture consisting of convolutional layers, ReLU activation functions, and pooling layers, which gives it extremely fast inference speed and good detection ability. However, VGG loses some original information as the structure becomes deeper and cannot fully obtain the original image information. Therefore, we propose a simple but powerful convolutional neural network structure called LBFNet, which has a linear structure similar to VGG and adds the advantages of the cascading structure and the attention mechanism. This allows deep models to obtain complementary original information and learn the important parts of disease images, reducing the influence of background noise and improving model performance. To further improve the model’s generalization ability, a Dropout layer is added to the model to prevent overfitting and enhance model robustness. The basic module of LBFNet, LBFB, as shown in Figure 6, uses a 1 × 1 convolutional kernel to obtain image information. The 1 × 1 convolutional kernel can observe finer features of diseases, which is beneficial for recognizing small targets such as disease spots on tomato leaves. At the same time, a BN layer is added to prevent model overfitting, and ReLU6 is used as the activation function. ReLU6 is the same as ordinary ReLU but limits the maximum output value to 6 (clipping the output value), which gives the model good numerical resolution even with low-precision float16 and allows it to perform better when deployed on mobile devices. Meanwhile, to supplement more information and improve the representation of disease information, we add a deconvolution layer at the end to map the low-dimensional features into high-dimensional features, further improving the performance of the model and enabling low-cost, high-efficiency tomato leaf disease recognition under natural conditions. The structure of LBFNet is shown in Figure 7.
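The ReLU6 activation described above is simple enough to state directly. The snippet below is a minimal scalar sketch for illustration. In a framework it would be applied element-wise to a tensor.

```python
def relu(x):
    """Ordinary ReLU: negative inputs become 0, positive inputs pass through."""
    return max(0.0, x)


def relu6(x):
    """ReLU6: same as ReLU, but the output is clipped at 6."""
    return min(max(0.0, x), 6.0)


samples = [-2.0, 0.5, 3.0, 6.0, 10.0]
clipped = [relu6(x) for x in samples]   # outputs stay inside the range [0, 6]
```

Bounding the activations to a fixed range is what keeps the values representable with good resolution in low-precision formats.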

3. Results

3.1. Research on Tomato Leaf Disease Classification Based on LBFNet Model

3.1.1. The Impact of Different Optimizers on the Model

The optimizer guides the parameters of the loss function to update in the correct direction with an appropriate step size during backpropagation, so that the updated parameters continuously approach the global minimum of the loss function. To achieve the minimum loss value and ensure optimal performance of the model in identifying tomato leaf diseases, different optimizers were used to train the LBFNet model, with the training results shown in Figure 8.

3.1.2. The Impact of Different Learning Rate Parameters on the Model

The learning rate is an important hyperparameter of convolutional neural networks. A learning rate that is too large can cause the optimization to overshoot the global optimum of the loss function, while a learning rate that is too small slows the convergence of the network. Therefore, to explore the optimal learning rate for LBFNet in tomato disease recognition, we conducted experiments with learning rates set to 0.0001, 0.001, 0.01, and 0.1.

3.1.3. The Impact of Different Attention Mechanisms on the Model

The attention mechanism has evolved in recent years, further improving the performance of deep learning models. To explore the influence of different attention mechanisms and modules on the LBFNet model, we added the SENet, three-channel, DUAL, CA, ECA, and CBAM attention mechanism modules, the cascade module, and a hybrid structure in ablation experiments.

Table 3 shows that every added module except the SE attention mechanism, including the cascade module, greatly improves the model’s performance. The cascade module has a significant impact on improving the model, whereas for the SE attention mechanism the loss of information during Global Information Embedding causes the model performance to decrease. The rest of the data show that the attention mechanisms and the cascade module bring a large improvement to the model. The three-channel attention mechanism achieves the best results at a small cost.

3.2. Model Performance Comparison

3.2.1. Parameter Settings

The parameters for the comparative experiment are set as follows: the original size of the image is 256 × 256 pixels, so the model input is also adjusted to 256 × 256 × 3. The training set and test set are set at a ratio of 7:3. The batch size is set to 32, the number of epochs is set to 100, and the initial learning rate is adjusted by comparison and finally set to 0.0001. The optimizer used is RMSprop, the loss function used is categorical_crossentropy, and the softmax activation function is used. Categorical_crossentropy is shown in Equation (4):

\mathrm{Loss} = -\sum_{i=1}^{\text{output size}} Y_{i} \cdot \log \hat{Y}_{i}

(4)

In the formula, “output size” represents the number of classification categories, “Y_i” represents the true label for the i-th category, and “Ŷ_i” represents the predicted probability for the i-th category.
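Equation (4) can be checked by hand on a single sample. The sketch below is illustrative only: the label and prediction vectors are made-up values, and the small `eps` clamp that avoids log(0) is an assumption added for numerical safety.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Equation (4) for one sample: minus the sum over classes of Y_i * log(Y_hat_i)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0.0, 1.0, 0.0]   # one-hot true label (class 1 of 3), made-up example
y_pred = [0.1, 0.8, 0.1]   # softmax output, made-up example
loss = categorical_crossentropy(y_true, y_pred)
```

With a one-hot label only the true class contributes, so the loss here reduces to -log(0.8); it approaches 0 as the predicted probability of the true class approaches 1.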

3.2.2. Evaluation Indicators

In this study, we used precision, F1 score, accuracy, and recall as evaluation metrics to assess the effectiveness of different network models in the classification and recognition tasks targeting tomato leaf disease images, where Equations (5)–(8) are the formulas for F1, Accuracy, Precision, and Recall, respectively.

F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(5)

\mathrm{Accuracy} = \frac{\text{Number of correctly identified disease and pest images}}{\text{Total number of disease and pest images}}

(6)

\mathrm{Precision} = \frac{TP}{TP + FP}

(7)

\mathrm{Recall} = \frac{TP}{TP + FN}

(8)

where TP is the number of positive samples predicted by the classifier with a positive true result, i.e., the number of positive samples correctly identified; FP is the number of negative samples predicted by the classifier with a positive result but a negative true result, i.e., the number of negative samples incorrectly predicted; FN is the number of positive samples for which the classifier predicts a negative outcome but the true outcome is a positive sample, i.e., the number of missed positive samples. Accuracy is the ratio of the total number of correctly identified disease images to the total number of disease images.
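Equations (5)–(8) translate directly into a few lines of code. The counts used below are hypothetical and serve only to show the arithmetic. They are not taken from the paper's tables.

```python
def precision_recall_f1(tp, fp, fn):
    """Equations (7), (8), and (5) for a single class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(correct, total):
    """Equation (6): share of all images that were identified correctly."""
    return correct / total


# Hypothetical counts for one disease class.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
overall = accuracy(correct=950, total=1000)
```

In this example precision is 0.9 and recall is 0.75, and F1 falls between them at about 0.818, closer to the lower of the two.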

3.2.3. Comparative Analysis Result

To explore the effect of expanding the data on model accuracy, different models were trained on the PlantVillage and LBFtomato datasets, and the results are presented in Table 4 and Table 5. The performance of all models improved on the LBFtomato dataset, indicating that balancing positive and negative samples optimizes the dataset and improves model performance. Meanwhile, LBFNet achieves excellent results on both datasets in terms of convergence speed and model accuracy. Although its accuracy is lower than that of the two large models, ConvNeXt and vit-transformer, the gap is only 0.01, while its training time is two hundred times shorter than theirs and the shortest of all models, and it has the smallest number of parameters. In addition, for the problems of large parameter counts, long training time, and low accuracy of traditional models such as VGG and ResNet, LBFNet is a good solution that can be flexibly applied to tomato leaf disease identification in modern agriculture.

As shown in Figure 9, GoogleNet and MobileNet, as early lightweight models, have fewer parameters and simple structures, but their training time increases accordingly and they do not reach a complete fit after 100 epochs. VGG16, although simple, has a huge number of parameters and large fluctuations and also does not reach a fit, and ResNet is at a disadvantage in test time. The vit-transformer, ConvNeXt, and LBFNet models reach saturation within very few epochs, but vit-transformer and ConvNeXt have a large number of parameters, a complex structure, and a long training time. LBFNet, by contrast, reaches an accuracy of 0.99 with the smallest number of parameters and a very simple structure, and can therefore effectively address the current problems of tomato leaf disease recognition models: low accuracy, large model size, long training time, and difficulty of deployment on mobile devices.

3.2.4. Reduce Model Size Using Quantitative Pruning

To make the model deployable on various devices, we further processed it with quantized pruning. We first trained LBFNet normally until basic convergence, then pruned the lower-weight layers with sparsity increasing from 0.5 to 0.9, and finally quantized and compressed the model. The pruned network was then retrained until convergence to recover accuracy. This approach can effectively reduce the model’s complexity and memory usage and, to some extent, overfitting. The results after quantized pruning are presented in Table 6, which shows that the original model is 6.85 MB and that after quantized pruning the model size is reduced to only 3.46 MB. Training from the quantized model weight file brought the model’s accuracy to 97.66%, a decrease of only 1.4 percentage points, while the model size is reduced by about half, which is an acceptable cost.
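The pruning stage can be illustrated with a toy sketch. It makes several assumptions that the text does not confirm: pruning is shown as element-wise magnitude pruning on a flat list of weights, the sparsity ramps linearly between the stated endpoints 0.5 and 0.9, and the weights are made-up values. Quantization and retraining are not shown.

```python
def sparsity_schedule(step, total_steps, initial=0.5, final=0.9):
    """Assumed linear ramp from the initial to the final sparsity."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return initial + (final - initial) * frac


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero."""
    k = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])   # indices of the k smallest-magnitude weights
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Made-up weights for illustration.
weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.4, 0.08, 0.6, -0.15]
pruned_start = magnitude_prune(weights, sparsity_schedule(0, 10))    # sparsity 0.5
pruned_end = magnitude_prune(weights, sparsity_schedule(10, 10))     # sparsity 0.9
```

Raising the sparsity gradually rather than jumping straight to 0.9 gives the retraining steps in between a chance to compensate for the removed weights.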

4. Discussion

The optimizers differ significantly in performance. SGD performs unsatisfactorily because its updates scale directly with the learning rate, and the original learning rate of 0.0001 results in slow convergence. To verify this, we trained with an initial learning rate of 0.01, and the results are shown in Figure 10. Adagrad makes it difficult for the model to converge because of its dynamically decreasing learning rate. Using a higher initial learning rate can solve this problem, but RMSprop still outperforms Adagrad under the same conditions. RMSprop was developed to address the rapid learning rate decrease in Adagrad, and it gives the fastest fitting speed and highest accuracy. The idea behind Adam is to set the initial learning rate to a larger value and dynamically decrease it as the number of iterations increases, achieving a balance between efficiency and effectiveness; it therefore also achieves good results. Considering all factors, we chose RMSprop as the optimizer for LBFNet.
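The difference between Adagrad and RMSprop described above can be seen by tracking the effective step size under a constant gradient. The sketch uses the standard textbook update rules. The learning rate, decay factor `rho`, and gradient sequence are illustrative assumptions, not the paper's settings.

```python
import math


def adagrad_step_sizes(grads, lr=0.01, eps=1e-7):
    """Adagrad divides lr by the root of the running SUM of squared gradients."""
    acc, sizes = 0.0, []
    for g in grads:
        acc += g * g
        sizes.append(lr / (math.sqrt(acc) + eps))
    return sizes


def rmsprop_step_sizes(grads, lr=0.01, rho=0.9, eps=1e-7):
    """RMSprop divides lr by the root of a moving AVERAGE of squared gradients."""
    avg, sizes = 0.0, []
    for g in grads:
        avg = rho * avg + (1 - rho) * g * g
        sizes.append(lr / (math.sqrt(avg) + eps))
    return sizes


grads = [1.0] * 100                      # constant gradient, for illustration
adagrad = adagrad_step_sizes(grads)
rmsprop = rmsprop_step_sizes(grads)
```

After 100 steps the Adagrad step size has shrunk to about a tenth of the learning rate and keeps falling, while the RMSprop step size settles near the learning rate itself, which matches the slower convergence attributed to Adagrad.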

Figure 11 and Figure 12 show that when the learning rate is set to 0.0001, LBFNet achieves the best fitting speed, the highest accuracy, and the smallest loss, and the expected effect is achieved. Therefore, 0.0001 is chosen as the learning rate of LBFNet. With the RMSprop optimizer, an excessively high learning rate cannot meet actual production needs in tomato leaf disease identification.

The confusion matrix is a table used in machine learning to summarize the prediction results of a classification model. Figures 13–19 show the confusion matrices of the seven models, and it can be seen that LBFNet performs well for diseases with both large and small differences between classes. Table 7 shows that for LBFNet, the classification and recognition results for all types of diseases meet practical requirements, with all indicators close to 1. Even for mosaic virus, which has the fewest samples, it achieves an accuracy of 94%, which meets the needs of agricultural production, and there are no extreme cases.

5. Conclusions

In this study, we propose a novel VGG-like convolutional neural network model called LBFNet for identifying tomato leaf diseases. The LBFNet model has a simple structure and efficient performance, overcoming the issues of previous models with complex architecture and poor accuracy. The LBFNet model combines the strengths of VGG networks, cascade networks, and attention networks. After balancing the data, the model achieves 99.06% accuracy on the LBFtomato dataset. Furthermore, after quantitative pruning and saving, followed by further training, the model achieves 97.66% accuracy with half of the parameter size, making it easy to deploy on mobile devices. The experimental results demonstrate that the model addresses the challenges of large model parameters, slow inference time, and low accuracy of current neural network models in tomato leaf disease identification. Compared with other models, LBFNet exhibits high accuracy, fast inference time, and fewer parameters, making it outstanding in the field of tomato leaf disease recognition. It can be applied to agricultural production activities to effectively improve agricultural production efficiency.

Author Contributions

Conceptualization, H.C. and R.Z.; methodology, H.C.; software, H.C. and J.P.; validation, H.C., R.Z. and J.P.; resources, H.C.; data curation, Y.W.; writing—original draft preparation, H.C. and Y.W.; writing—review and editing, H.C., J.P., R.Z. and Y.W.; supervision, Y.W.; project administration, Y.W. and P.J.; funding acquisition, Y.W.; investigation, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China under the sub-project “Research and System Development of Navigation Technology for Harvesting Machine of Special Economic Crops” (No. 2022YFD2002001) within the key program “Engineering Science and Comprehensive Interdisciplinary Research”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The LBFtomato dataset is available at https://www.kaggle.com/datasets/jingxiche/lbftomato (accessed on 20 April 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Liu, J.; Wang, X. Plant diseases and pests detection based on deep learning: A review. Plant Methods 2021, 17, 22.
2. Abade, A.; Ferreira, P.A.; de Barros Vidal, F. Plant diseases recognition on images using convolutional neural networks: A systematic review. Comput. Electron. Agric. 2021, 185, 106125.
3. Kamilaris, A.; Prenafeta-Boldú, F.X. A review of the use of convolutional neural networks in agriculture. J. Agric. Sci. 2018, 156, 312–322.
4. Gui, P.; Dang, W.; Zhu, F.; Zhao, Q. Towards automatic field plant disease recognition. Comput. Electron. Agric. 2021, 191, 106523.
5. Yang, C.; Teng, Z.; Dong, C.; Lin, Y.; Chen, R.; Wang, J. In-Field Citrus Disease Classification via Convolutional Neural Network from Smartphone Images. Agriculture 2022, 12, 1487.
6. Nguyen, T.H.; Nguyen, T.N.; Ngo, B.V. A VGG-19 Model with Transfer Learning and Image Segmentation for Classification of Tomato Leaf Disease. AgriEngineering 2022, 4, 871–887.
7. Chen, S.; Xiong, J.; Jiao, J.; Xie, Z.; Huo, Z.; Hu, W. Citrus fruits maturity detection in natural environments based on convolutional neural networks and visual saliency map. Precis. Agric. 2022, 23, 1515–1531.
8. Mishra, S.; Sachan, R.; Rajpal, D. Deep convolutional neural network based detection system for real-time corn plant disease recognition. Procedia Comput. Sci. 2020, 167, 2003–2010.
9. Dong, C.; Zhang, Z.; Yue, J.; Zhou, L. Automatic recognition of strawberry diseases and pests using convolutional neural network. Smart Agric. Technol. 2021, 1, 100009.
10. Xu, W.; Zhao, L.; Li, J.; Shang, S.; Ding, X.; Wang, T. Detection and classification of tea buds based on deep learning. Comput. Electron. Agric. 2022, 192, 106547.
11. Liu, C.; Zhu, H.; Guo, W.; Han, X.; Chen, C.; Wu, H. EFDet: An efficient detection method for cucumber disease under natural complex environments. Comput. Electron. Agric. 2021, 189, 106378.
12. Chen, J.; Liu, Q.; Gao, L. Visual tea leaf disease recognition using a convolutional neural network model. Symmetry 2019, 11, 343.
13. Singh, A.K.; Sreenivasu, S.V.N.; Mahalaxmi, U.S.B.K.; Sharma, H.; Patil, D.D.; Asenso, E. Hybrid feature-based disease detection in plant leaf using convolutional neural network, bayesian optimized SVM, and random forest classifier. J. Food Qual. 2022, 2022, 2845320.
14. Zhang, S.W.; Shang, Y.J.; Wang, L. Plant disease recognition based on plant leaf image. J. Anim. Plant Sci. 2015, 25, 42–45.
15. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the AAAI, San Francisco, CA, USA, 4–9 February 2017; Volume 4, p. 12.
16. Abbas, A.; Jain, S.; Gour, M.; Vankudothu, S. Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput. Electron. Agric. 2021, 187, 106279.
17. Huang, G.; Liu, Z.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, Honolulu, HI, USA, 21–26 July 2017; Volume 1, p. 3.
18. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
19. Ding, X.; Zhang, X.; Ma, N.; Han, J.; Ding, G.; Sun, J. Repvgg: Making vgg-style convnets great again. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Montreal, QC, Canada, 11–17 October 2021; pp. 13733–13742.
20. Zeng, W.; Li, M. Crop leaf disease recognition based on Self-Attention convolutional neural network. Comput. Electron. Agric. 2020, 172, 105341.
21. Deng, H.; Luo, D.; Chang, Z.; Li, H.; Yang, X. RAHC_GAN: A Data Augmentation Method for Tomato Leaf Disease Recognition. Symmetry 2021, 13, 1597.
Gong, C.; Wang, D.; Li, M.; Chandra, V.; Liu, Q. Keep Augment: A simple information-preserving data augmentation approach. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Montreal, QC, Canada, 11–17 October 2021; pp. 1055–1064. [Google Scholar]
Wang, H.; Song, H.; Wu, H.; Zhang, Z.; Deng, S.; Feng, X. Multilayer feature fusion and attention-based network for crops and weeds segmentation. J. Plant Dis. Prot. 2022, 129, 1475–1489. [Google Scholar] [CrossRef]
Fang, W.; Guan, F.; Yu, H.; Bi, C.; Guo, Y.; Cui, Y.; Su, L.; Zhang, Z. Identification of wormholes in soybean leaves based on multi-feature structure and attention mechanism. J. Plant Dis. Prot. 2022, 130, 401–412. [Google Scholar] [CrossRef]
Wang, S.H.; Fernandes, S.L.; Zhu, Z.; Zhang, Y.D. AVNC: Attention-based VGG-style network for COVID-19 diagnosis by three-channel attention mechanism. IEEE Sens. J. 2021, 22, 17431–17438. [Google Scholar] [CrossRef] [PubMed]
Fukui, H.; Hirakawa, T.; Yamashita, T.; Fujiyoshi, H. Attention branch network: Learning of attention mechanism for visual explanation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10705–10714. [Google Scholar]
Chefer, H.; Gur, S.; Wolf, L. Transformer interpretability beyond attention visualization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Montreal, QC, Canada, 11–17 October 2021; pp. 782–791. [Google Scholar]
Guo, M.H.; Xu, T.X.; Liu, J.J.; Liu, Z.-N.; Jiang, P.-T.; Mu, T.-J.; Zhang, S.-H.; Martin, R.R.; Cheng, M.-M. Attention mechanisms in computer vision: A survey. Comput. Vis. Media 2022, 8, 331–368. [Google Scholar] [CrossRef]
Gadekallu, T.R.; Rajput, D.S.; Reddy, M.P.K.; Lakshmanna, K.; Bhattacharya, S.; Singh, S.; Jolfaei, A. A novel PCA–whale optimization-based deep neural network model for classification of tomato plant diseases using GPU. J. Real-Time Image Process. 2021, 18, 1383–1396. [Google Scholar] [CrossRef]
Gonzalez-Huitron, V.; León-Borges, J.A.; Rodriguez-Mata, A.E.; Amabilis-Sosa, L.E.; Ramírez-Pereda, B.; Rodriguez, H. Disease detection in tomato leaves via CNN with lightweight architectures implemented in Raspberry Pi 4. Comput. Electron. Agric. 2021, 181, 105951. [Google Scholar] [CrossRef]
Rangarajan, A.K.; Purushothaman, R.; Ramesh, A. Tomato crop disease classification using pre-trained deep learning algorithm. Procedia Comput. Sci. 2018, 133, 1040–1047. [Google Scholar] [CrossRef]
Wang, Q.; Qi, F.; Sun, M.; Qu, J.; Xue, J. Identification of tomato disease types and detection of infected areas based on deep convolutional neural networks and object detection techniques. Comput. Intell. Neurosci. 2019, 2019, 9142753. [Google Scholar] [CrossRef]
Zhang, Y.; Song, C.; Zhang, D. Deep learning-based object detection improvement for tomato disease. IEEE Access 2020, 8, 56607–56614. [Google Scholar] [CrossRef]
Chakravarthy, A.S.; Raman, S. Early blight identification in tomato leaves using deep learning. In Proceedings of the 2020 International conference on contemporary computing and applications (IC3A), Lucknow, India, 5–7 February 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 154–158. [Google Scholar]
Ahmed, S.; Hasan, M.B.; Ahmed, T.; Sony, R.K.; Kabir, H. Less is more: Lighter and faster deep neural architecture for tomato leaf disease classification. IEEE Access 2022, 10, 68868–68884. [Google Scholar] [CrossRef]
Yu, H.; Liu, J.; Chen, C.; Heidari, A.A.; Zhang, Q.; Chen, H. Optimized deep residual network system for diagnosing tomato pests. Comput. Electron. Agric. 2022, 195, 106805. [Google Scholar] [CrossRef]
Singh, D.; Jain, N.; Jain, P.; Kayal, P.; Kumawat, S.; Batra, N. PlantDoc: A dataset for visual plant disease detection. In Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, Hyderabad, India, 5–7 January 2020; pp. 249–253. [Google Scholar]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. arXiv 2014, arXiv:1409.4842.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022; pp. 11966–11976.

Figure 1. Panels (a–j) show the ten tomato leaf categories: (a) Tomato___Bacterial_spot; (b) Tomato___Early_blight; (c) Tomato___Late_blight; (d) Tomato___Leaf_Mold; (e) Tomato___Septoria_leaf_spot; (f) Tomato___Spider_mites Two-spotted_spider_mite; (g) Tomato___Target_Spot; (h) Tomato___Tomato_Yellow_Leaf_Curl_Virus; (i) Tomato___Tomato_mosaic_virus; (j) Tomato___healthy.

Figure 2. Cascading structure.

Figure 3. Diagram of the Three-Channel Attention Mechanism Structure.

Figure 4. Channel attention module.

Figure 5. Spatial attention module.

Figure 6. Coordinate attention module.

Figure 7. LBFNet structure diagram.

Figure 8. Accuracy comparison of different optimizers.

Figure 9. The comparison of the accuracy of different models on LBFtomato.

Figure 10. The accuracy graph using the SGD optimizer with a learning rate of 0.01.

Figure 11. Accuracy comparison of different learning rates.

Figure 12. Loss comparison of different learning rates.

Figure 13. Confusion matrix of LBFNet. Class indices: 0: ‘Bacterial_spot’, 1: ‘Early_blight’, 2: ‘healthy’, 3: ‘Late_blight’, 4: ‘Leaf_Mold’, 5: ‘Septoria_leaf_spot’, 6: ‘Spider_mites’, 7: ‘Target_Spot’, 8: ‘mosaic_virus’, 9: ‘yellow_Leaf_Curl_Virus’.

Figure 14. Confusion matrix of vit-transformer.

Figure 15. Confusion matrix of GoogleNet.

Figure 16. Confusion matrix of MobileNet.

Figure 17. Confusion matrix of VGG16.

Figure 18. Confusion matrix of ResNet50.

Figure 19. Confusion matrix of ConvNeXt.

Table 1. The PlantVillage dataset.

Tomato Picture Category Name	Train Images	Validation Images
Tomato___Bacterial_spot	1410	717
Tomato___Early_blight	670	330
Tomato___healthy	940	651
Tomato___Late_blight	1140	769
Tomato___Leaf_Mold	570	382
Tomato___Septoria_leaf_spot	1060	711
Tomato___Spider_mites Two-spotted_spider_mite	1060	616
Tomato___Target_Spot	950	454
Tomato___Tomato_mosaic_virus	270	103
Tomato___Tomato_Yellow_Leaf_Curl_Virus	3810	1547

Table 2. The LBFtomato dataset.

Tomato Picture Category Name	Train Images	Validation Images
Tomato___Bacterial_spot	1071	340
Tomato___Early_blight	1000	200
Tomato___healthy	1081	254
Tomato___Late_blight	925	381
Tomato___Leaf_Mold	1000	192
Tomato___Septoria_leaf_spot	1083	355
Tomato___Spider_mites Two-spotted_spider_mite	1115	335
Tomato___Target_Spot	1029	284
Tomato___Tomato_mosaic_virus	1000	74
Tomato___Tomato_Yellow_Leaf_Curl_Virus	1085	258
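
As a rough check on how the two datasets differ in class balance, the per-class training counts in Tables 1 and 2 can be compared directly. The following is only an illustrative sketch: the list names and the `imbalance_ratio` helper are assumptions introduced here and are not part of the authors' code; the counts are copied from the tables.

```python
# Per-class image counts copied from Table 1 (PlantVillage) and
# Table 2 (LBFtomato). Class order follows the rows of the tables.
plantvillage_train = [1410, 670, 940, 1140, 570, 1060, 1060, 950, 270, 3810]
lbftomato_train = [1071, 1000, 1081, 925, 1000, 1083, 1115, 1029, 1000, 1085]
lbftomato_val = [340, 200, 254, 381, 192, 355, 335, 284, 74, 258]


def imbalance_ratio(counts):
    """Largest class size divided by smallest class size (hypothetical helper)."""
    return max(counts) / min(counts)


print("PlantVillage train total:", sum(plantvillage_train))
print("LBFtomato train total:   ", sum(lbftomato_train))
print("LBFtomato val total:     ", sum(lbftomato_val))
print(f"PlantVillage train imbalance: {imbalance_ratio(plantvillage_train):.2f}")
print(f"LBFtomato train imbalance:    {imbalance_ratio(lbftomato_train):.2f}")
```

On these counts, the largest PlantVillage training class is roughly 14 times the size of the smallest, whereas the LBFtomato training classes differ by a factor of about 1.2.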

Table 3. Effect of different key modules on model performance.

Module	Accuracy	Loss	Parameters	Train Time/s
LBFB	0.6267	1.0625	689,034	4633
LBFB + cascade	0.9567	0.1513	955,722	2158
LBFB + three-channel attention mechanism	0.9688	0.1034	798,276	1347
LBFB + cascade + three-channel attention mechanism	0.9906	0.0408	897,188	966
LBFB + SE	0.5578	1.2754	691,098	5194
LBFB + cascade + SE	0.9465	0.1703	957,786	2879
LBFB + CA	0.8922	0.3146	776,914	1552
LBFB + CA + cascade	0.9683	0.1220	962,386	2312
LBFB + ECA	0.8745	0.3650	773,584	1432
LBFB + ECA + cascade	0.9615	0.1405	955,728	1786
LBFB + DUAL	0.8853	0.3411	794,060	2434
LBFB + DUAL + cascade	0.9588	0.1261	976,204	2755
LBFB + CBAM	0.9089	0.3053	794,940	1537
LBFB + cascade + CBAM	0.9790	0.0815	777,468	1172

Table 4. Evaluation metrics of the compared models on the LBFtomato dataset. Train time: time taken to train the model for 100 epochs; Test time: prediction time for a single image.

Model	Accuracy	Loss	Parameters	Train Time/s	Test Time/s	F1-Score	Recall	Precision
Resnet50 [38]	0.9482	0.1579	23,608,202	28,377	0.51	0.92	0.91	0.92
Vgg16 [39]	0.9590	0.0891	165,758,794	41,577	0.23	0.96	0.96	0.96
Mobilenet [18]	0.9492	0.1449	2,279,714	10,142	0.40	0.90	0.91	0.91
Googlenet [40]	0.8633	0.3947	10,360,590	7857	0.32	0.87	0.87	0.87
LBFNet	0.9906	0.0408	897,188	966	0.21	0.98	0.98	0.98
vit-transformer [41]	1.0	0.012	85,806,346	365,320	0.28	1.0	0.97	0.98
ConvNeXt [42]	0.9884	0.071	27,827,818	197,320	0.42	0.99	0.99	0.98
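
To make the parameter column of Table 4 easier to read, the relative reduction of LBFNet against each comparison model can be computed from the listed counts. This is a minimal sketch under the assumption that "reduction" means one minus the ratio of parameter counts; the `params` dictionary and the `reduction_vs` function are names introduced here for illustration only.

```python
# Parameter counts copied from the "Parameters" column of Table 4.
params = {
    "Resnet50": 23_608_202,
    "Vgg16": 165_758_794,
    "Mobilenet": 2_279_714,
    "Googlenet": 10_360_590,
    "LBFNet": 897_188,
    "vit-transformer": 85_806_346,
    "ConvNeXt": 27_827_818,
}


def reduction_vs(baseline, model="LBFNet"):
    """Fractional parameter reduction of `model` relative to `baseline`.

    Assumed definition: 1 - params[model] / params[baseline].
    """
    return 1.0 - params[model] / params[baseline]


for name in params:
    if name != "LBFNet":
        print(f"{name:16s} {reduction_vs(name):.1%}")
```

These figures are derived only from the table and may differ slightly from rounded values quoted elsewhere in the text. For Resnet50 the computed reduction is about 96%.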

Table 5. PlantVillage Model Cluster Evaluation Metrics.

Model	Accuracy	Loss	Parameters	Train Time/s	Test Time/s	F1-Score	Recall	Precision
Resnet50	0.8965	0.3025	23,608,202	27,837	0.54	0.81	0.79	0.80
Vgg16	0.8175	0.5938	165,758,794	41,926	0.25	0.80	0.79	0.77
Mobilenet	0.7920	0.5924	2,279,714	15,858	0.45	0.77	0.79	0.80
Googlenet	0.8281	0.5588	10,360,590	7172	0.36	0.82	0.84	0.82
LBFNet	0.9756	0.2696	897,188	1420	0.23	0.97	0.98	0.98
vit-transformer	0.9943	0.015	85,806,346	412,702	0.41	0.99	0.98	0.99
ConvNeXt	0.978	0.089	27,827,818	277,456	0.52	0.97	0.98	0.97

Table 6. Performance comparison between the original model and quantitative pruning model.

Model	Size	Accuracy	Loss	F1-Score	Recall	Precision
LBFNet	6.85 MB	0.9906	0.0408	0.98	0.98	0.98
pruned_quantized_model	3.46 MB	0.9766	0.0712	0.97	0.97	0.97
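
The trade-off reported in Table 6 can be summarized as a size reduction and an accuracy change. The sketch below only restates the arithmetic on the table's values; the variable names are assumptions made for illustration, and it does not reproduce the pruning or quantization procedure itself.

```python
# Values copied from Table 6.
original = {"size_mb": 6.85, "accuracy": 0.9906}
compressed = {"size_mb": 3.46, "accuracy": 0.9766}

# Fraction of the file size removed by quantitative pruning.
size_reduction = 1.0 - compressed["size_mb"] / original["size_mb"]

# Absolute accuracy difference between the two models.
accuracy_drop = original["accuracy"] - compressed["accuracy"]

print(f"Size reduction: {size_reduction:.1%}")
print(f"Accuracy drop:  {accuracy_drop * 100:.2f} percentage points")
```

On the reported values the model file shrinks by roughly half, at a cost of 1.40 percentage points of accuracy.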

Table 7. Per-class F1-score, recall, and precision of LBFNet, with overall accuracy and averages.

Class	F1-Score	Recall	Precision	Image Numbers
Bacterial_spot	0.96	0.98	0.97	340
Early_blight	0.97	0.96	0.97	200
healthy	0.98	0.98	0.98	381
Late_blight	0.99	0.99	0.99	192
Leaf_Mold	0.99	0.99	0.99	355
Septoria_leaf_spot	0.98	0.99	0.99	335
Spider_mites	0.99	0.96	0.99	284
Target_Spot	0.99	0.98	0.99	258
mosaic_virus	0.94	1.0	0.97	74
yellow_Leaf_Curl_Virus	0.99	0.99	0.99	254
accuracy			0.98	2673
macro avg	0.98	0.98	0.98	2673
weighted avg	0.98	0.98	0.98	2673
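
The "macro avg" and "weighted avg" rows of Table 7 can be checked against the per-class rows. The sketch below assumes the usual definitions (macro: unweighted mean over classes; weighted: mean weighted by the number of images per class). The data layout and helper names are illustrative assumptions, not the authors' code.

```python
# Per-class rows copied from Table 7: (F1-score, recall, precision, images).
rows = {
    "Bacterial_spot": (0.96, 0.98, 0.97, 340),
    "Early_blight": (0.97, 0.96, 0.97, 200),
    "healthy": (0.98, 0.98, 0.98, 381),
    "Late_blight": (0.99, 0.99, 0.99, 192),
    "Leaf_Mold": (0.99, 0.99, 0.99, 355),
    "Septoria_leaf_spot": (0.98, 0.99, 0.99, 335),
    "Spider_mites": (0.99, 0.96, 0.99, 284),
    "Target_Spot": (0.99, 0.98, 0.99, 258),
    "mosaic_virus": (0.94, 1.0, 0.97, 74),
    "yellow_Leaf_Curl_Virus": (0.99, 0.99, 0.99, 254),
}

total_images = sum(r[3] for r in rows.values())


def macro_avg(col):
    """Unweighted mean of one metric column over all classes."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(col):
    """Mean of one metric column weighted by images per class."""
    return sum(r[col] * r[3] for r in rows.values()) / total_images


for col, name in enumerate(["F1-score", "recall", "precision"]):
    print(f"{name:10s} macro={macro_avg(col):.2f} weighted={weighted_avg(col):.2f}")
print("total images:", total_images)
```

With the per-class values as listed, both averages round to 0.98 for all three metrics, and the image counts sum to 2673, in agreement with the summary rows of the table.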

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, H.; Wang, Y.; Jiang, P.; Zhang, R.; Peng, J. LBFNet: A Tomato Leaf Disease Identification Model Based on Three-Channel Attention Mechanism and Quantitative Pruning. Appl. Sci. 2023, 13, 5589. https://doi.org/10.3390/app13095589
