Article

Image Information Contribution Evaluation for Plant Diseases Classification via Inter-Class Similarity

Jiachen Yang, Yue Yang, Yang Li, Shuai Xiao and Sezai Ercisli
1 School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China
2 College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003, China
3 Department of Horticulture, Faculty of Agriculture, Ataturk University, 25240 Erzurum, Turkey
* Author to whom correspondence should be addressed.
Sustainability 2022, 14(17), 10938; https://doi.org/10.3390/su141710938
Submission received: 9 July 2022 / Revised: 19 August 2022 / Accepted: 31 August 2022 / Published: 1 September 2022

Abstract

Combining plant disease identification with deep learning algorithms can achieve cost-effective prevention and has been widely adopted. However, intelligent plant disease identification still faces the problems of insufficient data and inaccurate classification. To resolve these problems, the present research proposes an image information contribution evaluation method based on the analysis of inter-class similarity. Combining this method with an active learning image selection strategy can guide the collection and annotation of plant disease identification datasets, improving recognition performance while reducing cost. The proposed method comprises two modules: an inter-class similarity evaluation module and an image information contribution evaluation module. Images located on the decision boundary between highly similar classes are evaluated as high-information-contribution images, since they provide more information for plant disease classification. To verify the effectiveness of this method, experiments were carried out on a fine-grained classification dataset of tomato diseases. The experimental results confirm the superiority of this method over others. This research addresses plant disease classification; for detection and segmentation, further research is advisable.

1. Introduction

At present, smart agriculture and digital agricultural technology provide considerable help in improving crop production and scientific planting [1]. One important application is the identification of plant diseases: pests and diseases seriously affect the world’s food production, and timely control and prevention can avoid losses to the greatest extent. Generally speaking, identifying pests and diseases must be entrusted to experts, which is costly; moreover, if a disease is not discovered, referred, and treated in time, the losses grow [2]. For these reasons, intelligent and digital agriculture have developed rapidly. Combining plant disease identification with computer vision methods enables timely and accurate prevention and ultimately reduces costs [3,4]. The combination of deep learning and plant disease identification has been widely applied [5,6,7,8,9] and has achieved good results. However, deep-learning-based plant disease identification still faces the problems of insufficient data and inaccurate classification of fine-grained, similar classes [10,11].
To address the problem of insufficient data, scholars have proposed many deep learning methods based on few-shot learning. Active learning is a technique for mitigating image shortages in fields where labeling costs are high [12,13,14]. In active learning, the network selects, from the unlabeled images, those most valuable for improving model performance according to an image selection strategy [15,16,17,18]. The most widely used selection strategies are based on uncertainty [19,20], diversity [21,22,23], and model parameter change [24]. Uncertainty-based methods select the images whose network predictions are least certain. Diversity-based methods select the images with the lowest redundancy relative to the existing images. Methods based on model parameter change select the images that would most affect the model parameters. In addition, existing few-shot learning methods are mainly based on data augmentation and metric learning [25,26].
Data-augmentation-based methods create additional images from existing ones so that the neural network generalizes better, and they are usually applied in the training stage [27]. Metric-learning-based few-shot methods complete the classification task by measuring the distance between images in the train set and images in the test set. The prototypical network [28], proposed by Snell et al., averages the features of all images in each class to form a feature prototype representing that class, and predicts a test image’s label from the Euclidean distance between the image’s features and each prototype. The relation network [29], proposed by Sung et al., computes a similarity score between two images through a network composed of two modules and uses that score to judge whether the two images belong to the same class.
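To make the prototype-based idea concrete, the following minimal NumPy sketch (the function and variable names are ours and purely illustrative, not from [28]) classifies a query image by the Euclidean distance from its feature vector to each class’s mean-feature prototype:

```python
import numpy as np

def prototype_classify(support_feats, support_labels, query_feat):
    """Predict a query's class as the nearest class prototype (mean feature)
    in Euclidean distance, following the prototypical-network idea."""
    prototypes = {}
    for c in np.unique(support_labels):
        # Prototype = mean of all support features belonging to class c
        prototypes[c] = support_feats[support_labels == c].mean(axis=0)
    # Assign the query to the class whose prototype is closest
    return min(prototypes, key=lambda c: np.linalg.norm(query_feat - prototypes[c]))
```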
Fine-grained image classification subdivides coarse-grained classes in finer detail. Take the identification of tomato leaf diseases as an example: all images belong to the coarse-grained class of tomato leaves, and they must be further subdivided into healthy leaves or specific disease classes. Plant disease classification and recognition mostly rely on nuances within the same species, which complicates the classification problem [30,31,32]. To solve such fine-grained classification of subclasses under a large class, researchers have proposed model-based methods. The initial approach was to widen and deepen the network to improve its representation ability for fine-grained image classification. On this basis, network ensembles [33,34] have been used to improve fine-grained accuracy and discrimination with multiple neural networks, and high-order encoding of convolutional features [32,35,36] converts CNN features to a higher order before classification.
In general, improving the accuracy of intelligent plant disease and pest recognition requires attending to few-shot learning and fine-grained classification at the same time. Existing model-based fine-grained classification methods do not take the number of images or the image information quality into account. Fine-grained classification further separates samples with high similarity; from the perspective of image information quality, adding images near the decision boundary of the more confusable classes is more helpful for establishing a clearer decision boundary. The present research therefore proposes an image contribution evaluation method based on the analysis of inter-class similarity. The method first calculates the similarity between image classes and then evaluates the information contribution of images according to the inter-class similarity relationships. Combined with an active learning strategy, this few-shot learning method for plant diseases can achieve a better recognition effect with fewer images.
Our contributions are as follows:
(1)
We propose an image information contribution evaluation method that focuses on inter-class similarity and defines images located on the decision boundary between highly similar classes as high-contribution images. This effectively alleviates inaccurate fine-grained classification in plant disease identification.
(2)
We combine the image information contribution evaluation method with an active learning image selection strategy, which effectively addresses insufficient data in plant disease identification.
(3)
We carried out experiments on plant disease datasets. The proposed method achieves better results than traditional active learning methods and reaches comparable accuracy with fewer data, which can guide the collection and annotation of plant disease datasets.

2. Materials

2.1. Dataset

Plant disease classification mostly involves fine-grained image classification. To verify the effect of the proposed method on fine-grained plant disease classification, we selected a tomato disease dataset whose data were selected and sorted from the PlantVillage dataset [37]. For clarity, we call the newly established dataset the tomato10 (T10) dataset. Some data from the T10 dataset are shown in Figure 1.
The T10 dataset contains 10 categories of tomato leaves: 1 category of healthy leaves, 4 fungal diseases (tomato early blight, tomato leaf mold, tomato Septoria leaf spot, tomato target spot), 1 bacterial disease (tomato bacterial spot), 1 mold disease (tomato late blight), 1 mite pest (two-spotted spider mite, Tetranychus urticae), and 2 viral diseases (tomato mosaic virus, tomato yellow leaf curl virus). Each class contains 500 samples, for a total of 5000 images, and every image is resized to 128 × 128 pixels.
The characteristics of tomato diseases appear mainly in the color and shape of the leaves. Tomato early blight and tomato late blight are two typical tomato diseases. Early blight spots show obvious concentric rings, appear dark green or yellow at first, and grow a black mold layer in wet conditions. Late blight spots are dark green at first, with irregular water-soaked edges, and gradually expand and turn brown. From the color and shape of the leaf spots, the type of tomato leaf disease can be distinguished. Figure 1 also shows that each tomato leaf disease has its own characteristics.

2.2. Dataset Segmentation

First, we divide the 5000 images into a pool set and a test set at a ratio of 4:1, take 20% of the pool set as the initially labeled dataset, and set each labeling budget to 20% of the pool set. In the first round, 800 images are randomly selected for training; in each subsequent round, 800 images are screened by the different methods and added to the train set. In other words, the test set contains 1000 images, and the five train sets contain 800, 1600, 2400, 3200, and 4000 images.
Note that the test set and the initially labeled dataset are class-balanced. The remaining image selection methods operate without label information by default, so the number of images added per class may differ. This better matches the actual dataset collection and labeling process. A sketch of this split appears below.
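The split can be expressed as a short index-level sketch. This is our illustrative reconstruction of the protocol (all names are hypothetical), assuming only the counts given in this section: a 4:1 pool/test split of the 5000 images and a per-round budget of 20% of the pool.

```python
import numpy as np

def split_pool_and_test(n_images=5000, seed=0):
    """Pool/test split (4:1) with an initial labeled set and a per-round
    labeling budget of 20% of the pool, as described above.
    (Class balance of the test/initial sets, noted in the text, is not
    enforced in this simplified sketch.)"""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_images)
    test_idx = idx[:n_images // 5]        # 1000 test images
    pool_idx = idx[n_images // 5:]        # 4000 pool images
    budget = len(pool_idx) // 5           # 800 images labeled per round
    init_labeled = pool_idx[:budget]      # first 800 chosen at random
    unlabeled = pool_idx[budget:]         # remaining 3200 pool images
    return test_idx, init_labeled, unlabeled, budget
```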

3. The Proposed Method

We propose an image information contribution evaluation method for the fine-grained classification of plant diseases. The method fully considers the similarity differences between image classes in fine-grained image classification and evaluates the similarity contribution of unlabeled images according to those differences. Pool images are then selected and labeled by combining the evaluation results with the labeling budget. The final training set consists of the initially labeled dataset and the labeled images selected from the pool set. The image selection strategy is shown in Figure 2.
Next, we introduce the construction of an active learning fine-grained image classification dataset based on inter-class similarity. Section 3.1 introduces the inter-class similarity evaluation method, and Section 3.2 introduces the image information contribution evaluation method.

3.1. Inter-Class Similarity Evaluation

Inter-class similarity evaluation obtains a similarity matrix between image classes from the statistical characteristics of the initially labeled dataset. First, the core area of each class is calculated; then, statistics are collected on images of other classes entering that core area. After normalization, the similarity matrix between classes is obtained, from which the overall similarity between each class and all other classes is calculated. The specific calculation is as follows:
First, the feature prototype $c_i$ of each class in the initially labeled dataset is calculated from the features $f_i^{(k)}$ of its images and serves as the center of that class:

$$c_i = \frac{1}{n_i} \sum_{k=1}^{n_i} f_i^{(k)}$$

where $f_i^{(k)}$ represents the feature of the k-th image in class i, $c_i$ represents the center of class i, and $n_i$ represents the number of images in class i.
The core area radius of a class is the mean distance from its image features to the class prototype:

$$cr_i = \frac{1}{n_i} \sum_{k=1}^{n_i} \left\| c_i - f_i^{(k)} \right\|_2$$
For each image in the initially labeled dataset, determine whether it enters the core areas of the other classes:

$$d\left(f_i^{(k)}, c_j\right) = \left\| f_i^{(k)} - c_j \right\|_2$$

$$m\left(x_i^{(k)}, I_j\right) = \begin{cases} 1, & d\left(f_i^{(k)}, c_j\right) < cr_j \\ 0, & d\left(f_i^{(k)}, c_j\right) \ge cr_j \end{cases}$$

where $I_j$ denotes class j.
Get the statistical characteristics of inter-class similarity:
$$m_{ij} = \begin{cases} \sum_{k_i=1}^{n_i} m\left(x_i^{(k_i)}, I_j\right) + \sum_{k_j=1}^{n_j} m\left(x_j^{(k_j)}, I_i\right), & i \ne j \\ 0, & i = j \end{cases}$$
Secondly, normalize the inter-class similarity matrix of images in the dataset:
$$m_{ij} \leftarrow \frac{m_{ij}}{\sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij}}, \qquad M = \left( m_{ij} \right)_{n \times n}$$
where n is the number of image classes in the dataset.
Finally, measure the similarity between each class of images and all other classes of images:
$$s_i = \sum_{j=1}^{n} m_{ij}, \qquad s_i \leftarrow \frac{s_i}{\sum_{i=1}^{n} s_i}, \qquad S = \left( s_i \right)_{n \times 1}$$
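For clarity, the following NumPy sketch (our illustration; the function name and signature are hypothetical) implements the calculations of this subsection end to end: class prototypes, core-area radii, the symmetric entry-count matrix, its normalization into M, and the per-class similarity scores S.

```python
import numpy as np

def interclass_similarity(feats, labels, n_classes):
    """feats: (N, d) features of the initially labeled set; labels: (N,).
    Returns prototypes, core radii, similarity matrix M, and scores S."""
    protos = np.stack([feats[labels == i].mean(axis=0) for i in range(n_classes)])
    radii = np.array([
        np.linalg.norm(feats[labels == i] - protos[i], axis=1).mean()
        for i in range(n_classes)
    ])
    M = np.zeros((n_classes, n_classes))
    for i in range(n_classes):
        for j in range(n_classes):
            if i == j:
                continue  # m_ii = 0 by definition
            # Count class-i images falling inside class j's core area
            d = np.linalg.norm(feats[labels == i] - protos[j], axis=1)
            M[i, j] = (d < radii[j]).sum()
    M = M + M.T          # m_ij sums crossings in both directions
    M = M / M.sum()      # normalize over all entries
    S = M.sum(axis=1)
    S = S / S.sum()      # overall similarity score of each class
    return protos, radii, M, S
```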

3.2. Image Information Contribution Evaluation

A high-information-contribution image is one that provides more information to the network; in task-oriented terms, it is an image that better serves model training. Conversely, a low-information-contribution image overlaps substantially with the images already in the train set, or offers little help for classification by itself; in task-oriented terms, it contributes little to improving the model’s classification performance.
The core idea of unlabeled image information contribution evaluation is to estimate, without supervised information, how much adding an image to the train set would contribute to establishing the class decision boundaries. The method prefers to label images that enter the core areas of more classes, and especially the core areas of classes that are easily confused with others. The implementation is as follows:
First, calculate the distance between the unlabeled image feature vector $g_i^{(k)}$ and each class center:

$$d\left(g_i^{(k)}, I_j\right) = \left\| g_i^{(k)} - c_j \right\|_2$$
Second, compare this distance with the core area radius of each class to determine which core areas the image enters:

$$B = \left( b_j \right)_{n \times 1}, \qquad b_j = \begin{cases} 1, & d\left(g_i^{(k)}, I_j\right) < cr_j \\ 0, & d\left(g_i^{(k)}, I_j\right) \ge cr_j \end{cases}$$
Finally, sum the similarity scores of the classes at whose decision boundaries the image lies, obtaining the image’s similarity contribution:

$$D = B^{\mathrm{T}} S = \sum_{j=1}^{n} b_j s_j$$
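The scoring and selection step can be sketched as follows (our hypothetical helper names); it reuses the prototypes, radii, and scores S from Section 3.1 and returns the indices of the highest-scoring unlabeled images up to the labeling budget.

```python
import numpy as np

def contribution_score(g, protos, radii, S):
    """Score one unlabeled image feature g by summing the similarity scores
    of every class whose core area it falls into (D = B^T S)."""
    d = np.linalg.norm(protos - g, axis=1)  # distance to each class center
    b = (d < radii).astype(float)           # core-area membership vector B
    return float(b @ S)

def select_for_labeling(unlabeled_feats, protos, radii, S, budget):
    """Pick the `budget` images with the highest contribution scores."""
    scores = np.array([contribution_score(g, protos, radii, S)
                       for g in unlabeled_feats])
    return np.argsort(scores)[::-1][:budget]
```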

4. Experiments

4.1. Experimental Parameter Setting

In the present research, training is carried out on a 3.2-GHz CPU and a Titan Xp GPU. In each cycle, the model is trained for 200 epochs with a cosine annealing learning rate. ResNet-18 [38] is used as both the feature extraction network and the training network. The batch size and training augmentation are kept consistent across the comparative experiments to ensure their validity.
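A minimal PyTorch sketch of this training configuration is given below. The ResNet-18 backbone, 200 epochs per cycle, and cosine annealing follow the text; the optimizer choice and its hyperparameters are our assumptions, since they are not specified here.

```python
import torch
import torchvision

def train_one_cycle(train_loader, epochs=200, num_classes=10, device="cuda"):
    """Train ResNet-18 for one active-learning cycle with cosine annealing.
    train_loader: a DataLoader over the currently labeled images (assumed)."""
    model = torchvision.models.resnet18(num_classes=num_classes).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # assumed
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), targets.to(device))
            loss.backward()
            optimizer.step()
        scheduler.step()  # advance the cosine annealing schedule once per epoch
    return model
```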

4.2. Validation Experiment of Image Contribution Evaluation Method

In the present research, we propose an image information contribution evaluation method built on the results of inter-class similarity evaluation. When calculating image contribution, we regard an image lying between two classes with high similarity as a high-information-contribution image, and an image lying between two classes with low similarity, or within a single class, as a low-contribution image. To verify the effectiveness of the proposed evaluation method, we compare, under the same budget, the results of selecting high-contribution images, low-contribution images, and randomly selected images. Each experiment is repeated 3 times, and the reported result is the average of the three runs. The experimental results are shown in Table 1.
Analysis of Table 1 shows that selecting high-contribution images yields the best results and selecting low-contribution images yields the worst.
In addition, screening 40% of the pool for high-contribution images outperforms selecting 60% of the pool for low-contribution images, a difference of more than 800 labeled training images. By selecting and labeling data with this method, higher test accuracy can be achieved with less labeled data, which fully demonstrates the effectiveness of the proposed inter-class-similarity-based image information contribution evaluation method.

4.3. Comparative Experiment

The proposed inter-class-similarity-based image information contribution evaluation method can be combined with active learning strategies, so we compare it with traditional active learning methods, including an uncertainty-based method [20] and a diversity-based method [21]. Each experiment is repeated 3 times, and the reported result is the average of the three runs. The comparative results are shown in Figure 3.
The comparison shows that every active learning image selection method achieves higher test accuracy than random selection, and that the image selection strategy built on the proposed inter-class-similarity-based contribution evaluation performs best.

5. Discussion

This section discusses the motivation for the method and the reasons for its effectiveness.

5.1. Motivation

Agricultural plant disease control needs digital and intelligent methods such as deep learning. However, deep-learning-based methods face the problems of insufficient data and inaccurate fine-grained classification. Most existing fine-grained classification methods work by enhancing the network, and image information quality remains insufficiently considered. To complement this line of research, an image information quality evaluation method for fine-grained classification is needed to guide the collection and annotation of data.

5.2. Reasons

The method achieves good results because it fully considers inter-class image similarity and assigns higher information contribution scores to images near the decision boundaries of highly similar classes. Likewise, images near more class decision boundaries receive higher contribution scores. This section analyzes these points in combination with the experimental results.

5.2.1. Class Similarity Calculation Results

Figure 4 shows the inter-class similarity evaluation results calculated from the initially labeled dataset. To interpret them more intuitively, we analyze the similarity between classes from the perspective of maximum and minimum values; the analysis results are shown in Figure 5.
The similarity evaluation exhibits the following prominent characteristics. For “Healthy” (Class 0) images, the inter-class similarity with “Tetranychus urticae” (Class 9) images is the largest, so these two classes are the most similar. For “Target spot” (Class 4) images, the similarity with “Tetranychus urticae” (Class 9) is likewise the largest. For “Tetranychus urticae” (Class 9) images, the similarity with “Target spot” (Class 4) is the largest. The similarity between “Healthy” (Class 0) and “Septoria leaf spot” (Class 3) images is the smallest, making them the most dissimilar pair.
Figure 6 shows actual tomato leaves, whose appearance is consistent with the inter-class similarity evaluation results.

5.2.2. Model Test Results

Figure 7 shows the test results of the models trained on the initially labeled dataset. Figure 8 shows the mean test results, over three training runs, for three image classes in the initially labeled dataset, marking the maximum and minimum numbers of false predictions.
Considering the mean of the three models’ test results: Class 0 (“Healthy”) images are easily predicted as Class 9 (“Tetranychus urticae”); Class 4 (“Target spot”) images are easily predicted as Class 9 (“Tetranychus urticae”); Class 9 (“Tetranychus urticae”) images are easily predicted as Class 4 (“Target spot”); and there is no misclassification between Class 0 (“Healthy”) and Class 3 (“Septoria leaf spot”). This is basically consistent with the inter-class similarity evaluation results.

5.2.3. Different Budget Test Results

Figure 9 shows the test results of models trained on train sets of different sizes built with the random method and with the method proposed in this paper. Table 2 gives the mean number of images misclassified between the confusable classes above under different budgets, comparing the random method with the proposed method.
The analysis shows that, compared with random addition, adding images with the proposed method yields higher recognition accuracy on the confusable classes. As analyzed above, our method evaluates inter-class similarity well: adding images near the decision boundary between Class 4 (“Target spot”) and Class 9 (“Tetranychus urticae”) is more conducive to fine-grained image classification than adding images near the decision boundary between Class 0 (“Healthy”) and Class 3 (“Septoria leaf spot”). The images near the Class 4/Class 9 decision boundary are high-information-contribution images.

6. Conclusions

In the present research, to address insufficient data and fine-grained classification accuracy in plant disease classification, we propose a new image information contribution evaluation method based on inter-class similarity analysis. To verify its effectiveness, active learning image selection experiments were carried out. Experiments on the fine-grained tomato classification dataset show that the proposed method achieves better fine-grained classification than existing methods under the same budget. Its effectiveness demonstrates that fine-grained image classification can be studied not only from the perspective of strengthening network performance but also from the perspective of data analysis. In the future, we will continue to study few-shot learning and fine-grained classification and develop data-centric image information quality evaluation methods, striving to achieve better results with less labeled data, to provide efficient sampling guidance for digital agricultural data acquisition and labeling, to improve the overall quality of datasets, and to propose more image information quality evaluation methods that promote the development of smart and digital agriculture.

Author Contributions

Conceptualization, J.Y. and Y.Y.; methodology, Y.Y. and Y.L.; software, Y.Y. and S.X.; validation, Y.Y. and S.E.; formal analysis, J.Y., Y.Y. and Y.L.; resources, data curation, and writing original draft preparation, Y.Y.; writing review and editing, Y.Y. and Y.L.; visualization, supervision, project administration, funding acquisition, J.Y. and S.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 32101612).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data were selected and sorted from the PlantVillage dataset, which is cited in the present research.

Acknowledgments

The authors would like to thank the Tianjin University Laboratory of Artificial Intelligence and Marine Information Processing for its support of this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Śliwiński, D.; Konieczna, A.; Roman, K. Geostatistical resampling of LiDAR-derived DEM in wide resolution range for modelling in SWAT: A case study of Zgłowiączka River (Poland). Remote Sens. 2022, 14, 1281.
  2. Nuthalapati, S.V.; Tunga, A. Multi-domain few-shot learning and dataset for agricultural applications. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 19–25 June 2021; pp. 1399–1408.
  3. Yang, J.; Ni, J.; Li, Y.; Wen, J.; Chen, D. The intelligent path planning system of agricultural robot via reinforcement learning. Sensors 2022, 22, 4316.
  4. Mahlein, A.K.; Heim, R.H.; Brugger, A.; Gold, K.; Li, Y.; Bashir, A.K.; Paulus, S.; Kuska, M.T. Digital plant pathology for precision agriculture. J. Plant Dis. Prot. 2022, 129, 455–456.
  5. Li, Y.; Chao, X. Toward sustainability: Trade-off between data quality and quantity in crop pest recognition. Front. Plant Sci. 2021, 12, 811241.
  6. Dhaka, V.S.; Meena, S.V.; Rani, G.; Sinwar, D.; Ijaz, M.F.; Woźniak, M. A survey of deep convolutional neural networks applied for prediction of plant leaf diseases. Sensors 2021, 21, 4749.
  7. Atila, Ü.; Uçar, M.; Akyol, K.; Uçar, E. Plant leaf disease classification using EfficientNet deep learning model. Ecol. Inform. 2021, 61, 101182.
  8. Li, Y.; Nie, J.; Chao, X. Do we really need deep CNN for plant diseases identification? Comput. Electron. Agric. 2020, 178, 105803.
  9. Ferentinos, K.P. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 2018, 145, 311–318.
  10. Too, E.C.; Yujian, L.; Njuki, S.; Yingchun, L. A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. Agric. 2019, 161, 272–279.
  11. Li, Y.; Chao, X. Semi-supervised few-shot learning approach for plant diseases recognition. Plant Methods 2021, 17, 68.
  12. Beluch, W.H.; Genewein, T.; Nürnberger, A.; Köhler, J.M. The power of ensembles for active learning in image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9368–9377.
  13. Wang, K.; Zhang, D.; Li, Y.; Zhang, R.; Lin, L. Cost-effective active learning for deep image classification. IEEE Trans. Circuits Syst. Video Technol. 2016, 27, 2591–2600.
  14. Aghdam, H.H.; Garcia, A.G.; Weijer, J.; López, A.M. Active learning for deep detection neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 3672–3680.
  15. Li, Y.; Chao, X. Distance-entropy: An effective indicator for selecting informative data. Front. Plant Sci. 2021, 12, 818895.
  16. Tang, Y.P.; Huang, S.J. Self-paced active learning: Query the right thing at the right time. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 5117–5124.
  17. Yang, Y.; Li, Y.; Yang, J.; Wen, J. Dissimilarity-based active learning for embedded weed identification. Turk. J. Agric. For. 2022, 46, 390–401.
  18. Yang, Y.; Zhang, Z.; Mao, W.; Li, Y.; Lv, C. Radar target recognition based on few-shot learning. Multimedia Systems; Springer: Berlin/Heidelberg, Germany, 2021; pp. 1–11.
  19. Wang, H.; Zhou, R.; Shen, Y.D. Bounding uncertainty for active batch selection. Proc. AAAI Conf. Artif. Intell. 2019, 33, 5240–5247.
  20. Li, Y.; Yang, J.; Wen, J. Entropy-based redundancy analysis and information screening. Digital Communications and Networks; Elsevier: Amsterdam, The Netherlands, 2021.
  21. Li, Y.; Chao, X.; Ercisli, S. Disturbed-entropy: A simple data quality assessment approach. ICT Express; Elsevier: Amsterdam, The Netherlands, 2022.
  22. Siddiqui, Y.; Valentin, J.; Nießner, M. ViewAL: Active learning with viewpoint entropy for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 9433–9443.
  23. Yang, J.; Ma, S.; Li, Y.; Zhang, Z. Efficient data-driven crop pest identification based on edge distance-entropy for sustainable agriculture. Sustainability 2022, 14, 7825.
  24. Yoo, D.; Kweon, I.S. Learning loss for active learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 93–102.
  25. Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.; Huang, J.B. A closer look at few-shot classification. arXiv 2019, arXiv:1904.04232.
  26. Li, F.F.; Fergus, R.; Perona, P. One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 594–611.
  27. Yang, J.; Guo, X.; Li, Y.; Marinello, F.; Ercisli, S.; Zhang, Z. A survey of few-shot learning in smart agriculture: Developments, applications, and challenges. Plant Methods 2022, 18, 28.
  28. Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. In Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
  29. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1199–1208.
  30. Xiao, T.; Xu, Y.; Yang, K.; Zhang, J.; Peng, Y.; Zhang, Z. The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 842–850.
  31. Simon, M.; Rodner, E. Neural activation constellations: Unsupervised part model discovery with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1143–1151.
  32. Lin, T.Y.; RoyChowdhury, A.; Maji, S. Bilinear CNN models for fine-grained visual recognition. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1449–1457.
  33. Berg, T.; Belhumeur, P.N. POOF: Part-based one-vs.-one features for fine-grained categorization, face verification, and attribute estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 955–962.
  34. Ge, Z.; McCool, C.; Sanderson, C.; Corke, P. Subset feature learning for fine-grained category classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA, 7–12 June 2015; pp. 46–52.
  35. Gao, Z.; Wu, Y.; Zhang, X.; Dai, J.; Jia, Y.; Harandi, M. Revisiting bilinear pooling: A coding perspective. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 3954–3961.
  36. Perronnin, F.; Dance, C. Fisher kernels on visual vocabularies for image categorization. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8.
  37. Hughes, D.; Salathé, M. An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv 2015, arXiv:1511.08060.
  38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Figure 1. T10 dataset: leaves of the ten tomato classes. The mild form of each disease is shown above, and the severe form below.
Figure 2. Method overview: the image information contribution evaluation method combined with the active learning image selection strategy.
Figure 3. Comparative experiment on the T10 dataset. The proposed inter-class-similarity-based image information contribution evaluation method, combined with active learning, is compared with traditional active learning methods: a random method, an uncertainty-based method [20], and a diversity-based method [21].
Figure 4. Inter-class similarity evaluation results calculated from the initially labeled dataset.
Figure 5. Statistics of the maximum and minimum values of the class similarity evaluation results.
Figure 6. Comparison and analysis between figures.
Figure 7. Test results of the three training runs on the initially labeled dataset.
Figure 8. Mean test results of the three training runs for three image classes in the initially labeled dataset. The numbers of false predictions are marked from small to large, focusing on the maximum and minimum values.
Figure 9. Test results of models trained under different budgets. The labeling budget grows from left to right. The first row shows test results with images added randomly according to the budget; the second row shows results with images added according to the budget using the proposed method.
Table 1. Validation experiment on T10 (test accuracy under different labeling budget fractions of the pool set).

| Method | 20% | 40% | 60% | 80% | 100% |
|---|---|---|---|---|---|
| Randomly | | 88.27% | 91.17% | 92.70% | |
| High information contribution | 84.13% | 90.87% | 93.03% | 93.87% | 93.97% |
| Low information contribution | | 87.03% | 89.73% | 91.77% | |
Table 2. Comparison of the mean number of images misclassified between confusable classes by different methods.

| Misclassification | Randomly | Proposed |
|---|---|---|
| Truth = 0, Prediction = 9 | 15.00 | 6.00 |
| Truth = 4, Prediction = 9 | 10.67 | 7.67 |
| Truth = 9, Prediction = 4 | 2.67 | 2.67 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
