Article

A Study on Multiple Factors Affecting the Accuracy of Multiclass Skin Disease Classification

Jiayi Fan, Jongwook Kim, Insu Jung and Yongkeun Lee

1 Graduate School of Nano IT Design Fusion, Seoul National University of Science and Technology, Seoul 01811, Korea
2 Free Style Works Korea, Inc., Seoul 06634, Korea
3 Medical Accelerator Research Team, Department of RI Application, Korea Institute of Radiological & Medical Sciences (KIRAMS), Seoul 01812, Korea
* Authors to whom correspondence should be addressed.
Appl. Sci. 2021, 11(17), 7929; https://doi.org/10.3390/app11177929
Submission received: 16 July 2021 / Revised: 13 August 2021 / Accepted: 23 August 2021 / Published: 27 August 2021
(This article belongs to the Topic Applied Computer Vision and Pattern Recognition)

Abstract

Diagnosis of skin diseases by human experts is a laborious task prone to subjective judgment. Aided by computer technology and machine learning, it is possible to improve the efficiency and robustness of skin disease classification. Deep transfer learning using off-the-shelf deep convolutional neural networks (CNNs) has huge potential in the automation of skin disease classification tasks. However, complicated architectures seem to be too heavy for the classification of only a few skin disease classes. In this paper, multiple factors are investigated in order to study potential ways to improve the classification accuracy of skin diseases. First, two different off-the-shelf architectures, namely AlexNet and ResNet50, are evaluated. Then, models built with transfer learning are compared with models trained from scratch. In order to reduce the complexity of the network, the effects of shortening the depths of deep CNNs are investigated. Furthermore, different data augmentation techniques based on basic image manipulation are compared. Finally, the choice of mini-batch size is studied. Experiments were carried out on the HAM10000 skin disease dataset. The results show that the ResNet50-based model is more accurate than the AlexNet-based model. The knowledge transferred from the ImageNet database helps to improve the accuracy of the model. Reducing the number of stages in the ResNet50-based model can reduce complexity while maintaining good accuracy. Additionally, the type of data augmentation technique and the choice of mini-batch size can also affect the classification accuracy of skin diseases.

1. Introduction

Artificial intelligence (AI), which has profoundly changed our everyday lives, has been extensively studied for several decades [1,2,3,4]. Some prominent applications are: AI-empowered autonomous driving, which has been employed in numerous electric vehicles from various automakers such as Tesla and Ford [5]; AlphaGo developed by Google using artificial neural networks [6]; and the prevailing TikTok app, which has succeeded greatly due to its recommendation algorithms [7].
An important aspect of AI is machine learning. Artificial neural networks, especially convolutional neural networks (CNNs), are extending the reach of machine learning to a broad range of applications [8,9]. CNNs are particularly useful for analyzing visual imagery, because their convolution kernels are well suited to extracting image features. Classifying images is a challenging task. However, with the aid of powerful graphics processing units, the classification of images has become efficient using deep CNN architectures. Numerous off-the-shelf architectures are available for use and can be easily adjusted to suit new tasks by using deep transfer learning [10,11].
Training deep CNNs usually requires large annotated image datasets. However, in specific areas such as the medical image domain, large publicly available datasets do not always exist, and collecting and annotating samples can be difficult and costly. Therefore, data augmentation techniques have been extensively studied.
In this paper, multiple factors are studied for potential improvement in the classification accuracy of skin diseases using deep transfer learning. Due to the good generalization ability of deep CNNs, the off-the-shelf architectures AlexNet and ResNet50 are compared for use in the building of transfer learning models for skin disease classification. These deep CNN architectures are initially trained to classify 1000 classes in the ImageNet database. However, as only a few classes are needed for classification in the skin disease dataset, complex deep CNNs seem too heavy. Therefore, the effects of trimming the depths of deep CNNs to reduce the complexity and number of parameters in the networks are evaluated. Comparison is made between using transfer learning and training from scratch. Additionally, different image manipulation techniques for data augmentation are compared. The choice of mini-batch sizes is also discussed. The aforementioned factors are studied through experiments on the HAM10000 skin disease dataset.

2. Related Work

Skin diseases are a common type of illness worldwide. It is important to diagnose and treat skin diseases early, as some severe skin diseases can be fatal [12]. With the help of deep neural networks, skin disease classification has been actively studied. A multiclass skin disease classification scheme that uses a pretrained AlexNet for feature extraction and an error-correcting output codes support vector machine as the classifier achieved 86% accuracy on five skin lesion categories [13]. However, the dataset was not balanced across classes, and further work is needed. The combination of four popular machine learning algorithms, namely artificial neural networks, linear discriminant analysis (LDA), naïve Bayes, and support vector machines (SVMs), with two feature sets, namely color and texture, was explored [14]. It was found that LDA and SVMs show better accuracy. Automatic classification of clinical skin disease images was proposed using high-level position information [15]. Instead of hand-crafted features, the work used high-level position information to generate better deep visual features and outperformed state-of-the-art clinical skin disease classification methods. A methodology that uses the rough set method to extract the best features and a feedforward neural network to predict the existence of skin disease was proposed [16]. Morphological- and wavelet-based fractal texture features were used along with stacked auto-encoder-based features to classify four skin diseases [17]. The combined feature set greatly improved the identification of melanoma, nevus, basal cell carcinoma, and seborrheic keratosis, with at least 96% accuracy. A survey of skin disease classification from images was conducted [18]. Traditional techniques and deep learning-based skin disease classification were compared, and it was concluded that the deep learning approach is more efficient and faster for extracting features.

3. Skin Disease Classification

In this section, the multiclass skin disease dataset studied in this paper is first introduced. Then, the potential ways to improve the classification accuracy of skin diseases are discussed, including the choice of the network architecture, the depths of the deep network, the use of transfer learning, the types of data augmentation techniques, and the selection of mini-batch size.

3.1. Skin Disease Dataset

The skin disease dataset used in this paper is taken from the HAM10000 dataset [19], a collection of dermatoscopic images of pigmented skin lesions released to tackle the lack of diversity and the small size of available dermatoscopic image datasets for training neural networks for automated classification. It includes a representative collection of important diagnostic categories of pigmented lesions. The goal of this paper is to classify four skin disease classes, namely basal cell carcinoma, benign keratosis, melanoma, and melanocytic nevi. The images are of size 600 × 450 pixels. Each class contains 500 images. Example images from the dataset are shown in Figure 1.
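
The paper does not say how the 500 images per class were chosen. As a minimal sketch, assuming uniform random sampling and assuming the diagnosis codes `bcc`, `bkl`, `mel`, and `nv` used in the HAM10000 metadata, a balanced four-class subset could be drawn as follows. The function name `balanced_subset` and the `(image_id, dx)` record layout are illustrative, not taken from the paper.

```python
import random
from collections import defaultdict

# Diagnosis codes assumed to follow the HAM10000 metadata convention.
TARGET_CLASSES = {
    "bcc": "basal cell carcinoma",
    "bkl": "benign keratosis",
    "mel": "melanoma",
    "nv": "melanocytic nevi",
}


def balanced_subset(records, per_class=500, seed=0):
    """Pick `per_class` image ids per target class from (image_id, dx) pairs."""
    by_class = defaultdict(list)
    for image_id, dx in records:
        if dx in TARGET_CLASSES:  # ignore lesion types outside the four classes
            by_class[dx].append(image_id)

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    subset = {}
    for dx in TARGET_CLASSES:
        ids = sorted(by_class[dx])
        if len(ids) < per_class:
            raise ValueError(f"class {dx!r} has only {len(ids)} images")
        subset[dx] = rng.sample(ids, per_class)  # sampling without replacement
    return subset
```

Given the `(image_id, dx)` pairs read from the metadata file, `balanced_subset(records)` returns a dictionary with 500 distinct image ids for each of the four classes, i.e., 2000 images in total.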

3.2. Choice of Network Architectures

Using off-the-shelf deep networks, which have been proven effective in massive and challenging datasets, is common practice in image classification. In this paper, two network architectures, AlexNet and ResNet50, were used to construct our models for skin disease classification.
AlexNet [20] is a well-known CNN that won the ImageNet contest in 2012. It achieves top-1 and top-5 error rates of 37.5% and 17.0%, respectively, on test data. It consists of five convolutional layers and three fully connected layers, as shown in Figure 2. It has over 60 million parameters and 650,000 neurons. The network uses the ReLU nonlinearity to accelerate training and employs techniques such as dropout and data augmentation to prevent overfitting.
ResNet [21] is a very deep residual learning framework developed to reduce the effort of network training. It is easy to optimize and can gain accuracy from increased network depth. The original ResNet with 152 layers won first place in the ILSVRC 2015 classification task and achieves 3.57% error on an ImageNet test set. ResNet introduces a shortcut connection technique to overcome the degradation problem, i.e., the accuracy gets saturated and degrades rapidly when the network depth increases.
ResNet50, a smaller version of ResNet, consists of four stages, as shown in Figure 3. It was initially trained to classify 1000 classes on the ImageNet dataset and has a fully connected layer with a size of 1000.
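
The shortcut connection described above can be written compactly. Following the formulation in the ResNet paper [21], a residual block with input $\mathbf{x}$ learns a residual mapping $\mathcal{F}$ and adds the input back to it. The symbols below are those of [21], not notation introduced in this paper.

```latex
\begin{align}
  \mathbf{y} &= \mathcal{F}(\mathbf{x}, \{W_i\}) + \mathbf{x}
    && \text{(identity shortcut, equal dimensions)} \\
  \mathbf{y} &= \mathcal{F}(\mathbf{x}, \{W_i\}) + W_s \mathbf{x}
    && \text{(projection shortcut, dimensions differ)}
\end{align}
```

If the stacked layers contribute nothing useful, $\mathcal{F}$ can be driven towards zero and the block reduces to an identity mapping, which is why adding more blocks need not degrade accuracy.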

3.3. Transfer Learning

The approach of using already pretrained off-the-shelf deep networks as the starting point to construct a model for a new task, and transfer the learned features from the ImageNet database to the new task, is known as transfer learning. It is highly effective and easier to train, as only a small number of training images is required, while training the network completely from scratch with randomly initialized weights is much harder.
In this paper, the AlexNet architecture is used first. Its 1000-way fully connected (FC) layer is replaced with a four-way FC layer for the classification of four skin diseases. The constructed model is named Model I. Similarly, the original ResNet50 model is modified by replacing its last FC layer, as shown in Figure 4a. This four-stage ResNet50 network is denoted Model II.
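
To give a sense of what replacing the head changes, the following sketch counts the parameters of the final FC layer before and after the replacement. It assumes the standard input widths of that layer (4096 for AlexNet, 2048 for ResNet50); the helper name `fc_params` is illustrative.

```python
def fc_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Input widths of the final FC layer in the standard architectures (assumed).
FEATURES = {"AlexNet": 4096, "ResNet50": 2048}

for name, width in FEATURES.items():
    original = fc_params(width, 1000)  # 1000-way ImageNet head
    replaced = fc_params(width, 4)     # four-way skin disease head
    print(f"{name}: {original:,} -> {replaced:,} head parameters")
```

Under these assumptions the head shrinks from 4,097,000 to 16,388 parameters for AlexNet and from 2,049,000 to 8,196 for ResNet50, so the newly added layer is only a small part of what has to be learned for the new task.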

3.4. Network Depth

Even though ResNet50 is much simpler than ResNet152, with far fewer layers, it is still a complex network. Model II, built directly from ResNet50, still seems heavy for a task that only requires classifying four skin diseases. Therefore, we reduced the layers by removing the last stage, and Model III was constructed with only three stages, as shown in Figure 4b. Further, we constructed Model IV with only two stages, as shown in Figure 4c.
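
The effect of removing stages on model size can be estimated with a short calculation. The sketch below assumes that the four stages correspond to the four groups of bottleneck blocks in the standard ResNet50 (3, 4, 6, and 3 blocks with bottleneck widths 64, 128, 256, and 512), that trailing stages are removed whole, and that the four-way FC layer is attached to the output of the last remaining stage. Only convolution weights and the FC head are counted; batch normalization parameters are ignored. The function names are illustrative.

```python
# (bottleneck width, number of blocks) for the four residual stages of ResNet50.
STAGES = [(64, 3), (128, 4), (256, 6), (512, 3)]
STEM_WEIGHTS = 7 * 7 * 3 * 64  # initial 7x7 convolution with 64 filters


def stage_conv_weights(in_ch, mid, blocks):
    """Convolution weights of one stage of bottleneck blocks (no biases, no BN)."""
    out_ch = 4 * mid
    total = 0
    for b in range(blocks):
        c_in = in_ch if b == 0 else out_ch
        # 1x1 reduce, 3x3 convolution, 1x1 expand
        total += c_in * mid + 9 * mid * mid + mid * out_ch
        if b == 0:
            total += c_in * out_ch  # 1x1 projection on the shortcut
    return total, out_ch


def truncated_model(num_stages, num_classes=4):
    """Conv weights, feature width, and head size when keeping the first stages."""
    weights, channels = STEM_WEIGHTS, 64
    for mid, blocks in STAGES[:num_stages]:
        w, channels = stage_conv_weights(channels, mid, blocks)
        weights += w
    head = channels * num_classes + num_classes
    return weights, channels, head


for label, n in (("II", 4), ("III", 3), ("IV", 2)):
    weights, channels, head = truncated_model(n)
    print(f"Model {label}: {weights:,} conv weights, "
          f"{channels}-d features, {head:,} head parameters")
```

Under these assumptions, the convolutional part drops from about 23.5 million weights with four stages to about 8.5 million with three stages and about 1.4 million with two, while the feature vector fed to the FC layer narrows from 2048 to 1024 and 512 channels.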

3.5. Data Augmentation Techniques

Data augmentation is commonly used in deep learning when there are limited data. It includes various techniques that can enhance the size and quality of the training dataset. It is highly useful to mitigate the overfitting problem. Through data augmentation, more information can be extracted from the original dataset. Basic image manipulation techniques for data augmentation include geometric transformations, color space transformations, and kernel filters, etc., which are applied to images in the input space. Deep-learning-based augmentation techniques are also widely studied. A typical example is the generative adversarial network, which can generate plausible new images from original images. In this paper, multiple image manipulation techniques, including random scale, random rotation, random reflection, random shear, and a combination of different techniques, are compared for skin disease image classification.
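
The geometric transformations compared in this paper can all be expressed as affine matrices. The following is a minimal sketch, assuming an isotropic scale factor, independent left-right and top-bottom flips, and horizontal and vertical shears applied one after the other; the parameter ranges are those reported in the experiments (scale 0.6 to 1.4, rotation 0 to 360 degrees, shear 0 to 45 degrees). The function names are illustrative, and the paper's actual augmentation pipeline may differ.

```python
import math
import random


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def random_affine(rng, scale=True, rotate=True, reflect=True, shear=True):
    """Sample a 3x3 homogeneous matrix combining the enabled transforms."""
    m = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if scale:
        s = rng.uniform(0.6, 1.4)
        m = matmul3([[s, 0, 0], [0, s, 0], [0, 0, 1]], m)
    if rotate:
        t = math.radians(rng.uniform(0.0, 360.0))
        c, si = math.cos(t), math.sin(t)
        m = matmul3([[c, -si, 0], [si, c, 0], [0, 0, 1]], m)
    if reflect:
        fx = rng.choice((-1.0, 1.0))  # left-right flip
        fy = rng.choice((-1.0, 1.0))  # top-bottom flip
        m = matmul3([[fx, 0, 0], [0, fy, 0], [0, 0, 1]], m)
    if shear:
        hx = math.tan(math.radians(rng.uniform(0.0, 45.0)))
        hy = math.tan(math.radians(rng.uniform(0.0, 45.0)))
        m = matmul3([[1, hx, 0], [0, 1, 0], [0, 0, 1]], m)  # horizontal shear
        m = matmul3([[1, 0, 0], [hy, 1, 0], [0, 0, 1]], m)  # vertical shear
    return m


def transform_point(m, x, y):
    """Apply the affine matrix to one point, relative to the image centre."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

Calling `random_affine(random.Random(0))` samples the combined augmentation, while passing `rotate=False, reflect=False, shear=False` leaves only the random scale. In practice the sampled matrix would be handed to an image-warping routine, which is outside the scope of this sketch.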

3.6. Batch Normalization and Mini-Batch Size

In order to overcome the overfitting problem, where the network performs very well on the training set but poorly on the test set, batch normalization layers are used. Batch normalization helps train the network faster and more stably, makes training less sensitive to the initial random weights, and is said to alleviate the internal covariate shift problem. In Models II, III, and IV, a batch normalization layer is attached to each of the convolutional layers. The skip connection in ResNet50, as well as in Models II, III, and IV, is used to mitigate the gradient explosion problem caused by using batch normalization. In training the deep neural network, batch normalization standardizes the inputs to a layer for each mini-batch. By transforming the data to have a mean of 0 and a standard deviation of 1, the distribution of the inputs does not change dramatically during the weight updates, and the training of the network can be stabilized and accelerated. The generalization error can also be reduced.
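
The standardization step can be illustrated on a single feature. This is a minimal sketch of the usual training-time computation, with the learnable scale `gamma`, shift `beta`, and stabilizing constant `eps` given conventional default values that are assumptions here; it ignores the running statistics that are kept for inference.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardize one feature over a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased mini-batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)
print([round(v, 3) for v in normalized])
```

With `gamma = 1` and `beta = 0`, the output has a mean of 0 and a standard deviation very close to 1, regardless of the scale of the incoming activations.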
The choice of the mini-batch size hyperparameter is also important. If the mini-batch size is too small, the distribution of each mini-batch can differ substantially from that of the whole dataset. The resulting differences in the standardized inputs between training and inference can cause noticeable differences in performance. In this paper, different mini-batch sizes are compared for the evaluation of the deep networks on a skin disease dataset.
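
The dependence of mini-batch statistics on the mini-batch size can be seen in a small simulation. The sketch below uses synthetic Gaussian values as a stand-in for one activation over a 1500-image training set (an assumption for illustration only, not data from the paper) and measures how much the mini-batch mean fluctuates for the mini-batch sizes studied here.

```python
import random
import statistics


def batch_mean_spread(data, batch_size, num_batches, rng):
    """Standard deviation of mini-batch means drawn without replacement."""
    means = [statistics.fmean(rng.sample(data, batch_size))
             for _ in range(num_batches)]
    return statistics.pstdev(means)


rng = random.Random(0)
# Synthetic stand-in for one activation over a 1500-image training set.
data = [rng.gauss(0.0, 1.0) for _ in range(1500)]

for size in (5, 10, 20, 40, 80):
    spread = batch_mean_spread(data, size, 500, rng)
    print(f"mini-batch size {size:2d}: spread of batch means = {spread:.3f}")
```

The spread shrinks roughly in proportion to the inverse square root of the mini-batch size, so the statistics used to standardize a mini-batch of 5 are far noisier than those of a mini-batch of 40 or 80.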

4. Experimental Results and Discussion

In order to evaluate the effects of different factors on the classification accuracy of skin diseases, experiments were carried out on the custom four-class subset of the HAM10000 dataset. Four-fold cross-validation was used for all experiments to reduce bias in the estimates of model performance. In each fold, the training set contained 1500 images and the validation set contained 500 images.
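
The split sizes follow directly from dividing 2000 images into four folds. The sketch below assumes a stratified split, in which each fold receives the same number of images per class; the paper does not state whether the folds were stratified, and the function name `stratified_folds` is illustrative.

```python
import random


def stratified_folds(labels, k=4, seed=0):
    """Split sample indices into k folds with equal class counts per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal indices round-robin into folds
    return folds


labels = [c for c in range(4) for _ in range(500)]  # four classes, 500 images each
folds = stratified_folds(labels)
for i, val in enumerate(folds):
    train = [idx for j, f in enumerate(folds) if j != i for idx in f]
    print(f"fold {i}: {len(train)} training / {len(val)} validation images")
```

Each fold serves once as the 500-image validation set while the remaining three folds form the 1500-image training set, and the reported accuracy is the average over the four runs.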
The choice of the base architecture was first evaluated. The AlexNet-based Model I was tested. The images were resized to 227 × 227 pixels to fit the input size of the AlexNet input layer. The transferred layers were frozen by assigning small learning rates to ensure only the newly added FC layer was trained and the features learned from the ImageNet database could be appropriately transferred. For each fold of cross-validation, the Adam solver was used to train the network for 10 epochs. A mini-batch size of 10 was used. No data augmentation was applied during training. The obtained average cross-validation accuracy was 0.7100, as listed in Table 1.
An experiment was carried out on Model I again with the same experimental setup as above, except that all the weights in the transferred layers were removed, and the model was trained from scratch. The average cross-validation accuracy obtained was 0.6640.
The ResNet50-based Model II was tested. The four-stage Model II was trained using the Adam solver for 10 epochs for each fold, with a mini-batch size of 10. Transfer learning was used to transfer the weights of ResNet50 pretrained on the ImageNet database. No data augmentation techniques were used. The average classification accuracy for four folds was 0.7640. The Model II architecture with all the pretrained weights removed was trained from scratch and showed an average accuracy of 0.6605. Because the ResNet50-based Model II clearly has higher accuracy than the AlexNet-based Model I when using transfer learning, the former was chosen to be further studied for network depth reduction and potential accuracy improvement.
The ResNet50-based models, III and IV, with different network depths were then evaluated with the same four-fold cross-validation setup. Similarly, transfer learning was applied to Model III and Model IV, and the average cross-validation accuracy values for the two models were 0.7700 and 0.7835, respectively.
The above results are summarized in Table 1. In the context of transfer learning, the choice of architecture has a large impact on the classification accuracy of skin diseases. The ResNet50-based models outperform the AlexNet-based model by at least 0.05, which shows the superiority of ResNet50 over AlexNet in skin disease classification. As for transfer learning versus training from scratch, transfer learning is clearly better for both the AlexNet-based model and the ResNet50-based models, as compared in Table 1. For the ResNet50-based models with different depths, using the transferred weights learned from the large ImageNet database improves the accuracy by at least 0.08 compared with training from scratch. As for the complexity of the ResNet50-based network, the classification accuracy is observed to increase as the network depth is reduced. The two-stage Model IV with the shortest depth has the best accuracy, the complete four-stage Model II has the lowest accuracy, and the three-stage Model III ranks in the middle. All three transfer learning models had an accuracy above 0.76. This suggests that a more complex network does not necessarily give better accuracy for classification tasks on a simple skin disease dataset with only a few classes. Instead, simplifying the network can retain sufficient accuracy while reducing the training effort. The confusion matrix of Model IV using transfer learning, which has the best accuracy, and the confusion matrix of Model II trained from scratch are shown in Figure 5.
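
The gains quoted above can be checked directly against Table 1. The short calculation below copies the average cross-validation accuracies from the table and computes the difference between transfer learning and training from scratch for each model; the helper name `transfer_gain` is illustrative.

```python
# Average cross-validation accuracies copied from Table 1:
# model -> (transfer learning, trained from scratch)
TABLE_1 = {
    "I (AlexNet-based)": (0.7100, 0.6640),
    "II (four-stage ResNet50-based)": (0.7640, 0.6605),
    "III (three-stage ResNet50-based)": (0.7700, 0.6415),
    "IV (two-stage ResNet50-based)": (0.7835, 0.6960),
}


def transfer_gain(transfer_acc, scratch_acc):
    """Accuracy gained by transfer learning over training from scratch."""
    return round(transfer_acc - scratch_acc, 4)


for model, (transfer, scratch) in TABLE_1.items():
    print(f"Model {model}: +{transfer_gain(transfer, scratch):.4f}")
```

The gains are 0.0460 for Model I and 0.1035, 0.1285, and 0.0875 for Models II, III, and IV, which is consistent with the statement that transfer learning improves the ResNet50-based models by at least 0.08.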
Next, the effects of the types of augmentation techniques on the accuracy of skin disease classification were evaluated using Model IV. With no augmentation applied, the model yields an accuracy of 0.7835. When a random scale with a factor from 0.6 to 1.4 is applied, an average accuracy of 0.7785 is obtained. When a random rotation from 0 to 360 degrees is applied, the model shows an accuracy of 0.7620. When random reflection in the left–right and top–bottom directions is applied, the accuracy is 0.7735. An accuracy of 0.7765 is obtained when applying random horizontal and vertical shear from 0 to 45 degrees. With all the above types of augmentation combined, the model gives an accuracy of 0.7430. The comparison of augmentation types is summarized in Table 2. The worst result of 0.7430 comes from combining the augmentation techniques, using no augmentation at all yields the best accuracy of 0.7835, and each individual augmentation technique reduces the classification accuracy. The results suggest that for skin disease classification, basic image manipulation techniques for data augmentation do not necessarily improve the classification performance of the networks and should be used cautiously. The confusion matrices of Model IV without data augmentation and with combined data augmentation are shown in Figure 6.
Experiments were carried out on Model II to compare mini-batch sizes ranging from 5 to 80, and the results are summarized in Table 3. The smallest mini-batch size of 5 has the lowest accuracy of 0.7560, while a mini-batch size of 40 yields the best accuracy of 0.7905. The corresponding confusion matrices are shown in Figure 7. The results show that the accuracy can be slightly improved by choosing a proper mini-batch size.

5. Conclusions

In this paper, potential approaches to improve the classification accuracy of skin diseases are investigated based on deep learning. Transfer learning using off-the-shelf deep networks is common practice for image classification. However, a complete off-the-shelf network can be unnecessarily complex for a task with only a few classes. Multiple factors are studied regarding the classification accuracy of skin diseases. Experiments were carried out using the HAM10000 skin disease dataset. The results show that the choice of the network architecture has a large impact on accuracy. The ResNet50-based model clearly outperforms the AlexNet-based model. A model trained from scratch is much less accurate than one that uses transfer learning and is pretrained on the massive ImageNet database. In order to reduce the complexity of the models, the depths of the networks were reduced, and the two-stage model shows the best accuracy compared with the three-stage and four-stage models, which suggests that reducing the network depth does not necessarily sacrifice accuracy in skin disease classification. Additionally, the data augmentation techniques tested here lower the accuracy compared with no augmentation at all and should be used with caution. Furthermore, carefully choosing the mini-batch size can also help in improving accuracy.

Author Contributions

Conceptualization, Y.L.; methodology, J.F. and Y.L.; investigation, J.F.; writing—original draft preparation, J.F.; supervision, J.K., I.J. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by internal project funds by the Seoul National University of Science and Technology and by a grant from the Korea Institute of Radiological and Medical Sciences (KIRAMS), funded by the Ministry of Science and ICT (MSIT), Republic of Korea (No. 50538-2021).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Restrictions apply to the availability of these data. Data was obtained from https://doi.org/10.7910/DVN/DBW86T.

Acknowledgments

We acknowledge the support of the Seoul National University of Science and Technology and the Korea Institute of Radiological and Medical Sciences (KIRAMS).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ong, Y.-S.; Gupta, A. AIR5: Five Pillars of Artificial Intelligence Research. IEEE Trans. Emerg. Top. Comput. Intell. 2019, 3, 411–415.
  2. Wang, Y.; Kinsner, W.; Zhang, D. Contemporary Cybernetics and Its Facets of Cognitive Informatics and Computational Intelligence. IEEE Trans. Syst. Man Cybern. Part B 2009, 39, 823–833.
  3. Dong, C.; Loy, C.C.; He, K.; Tang, X. Image Super-Resolution Using Deep Convolutional Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 295–307.
  4. Lei, Y.; Jia, F.; Lin, J.; Xing, S.; Ding, S.X. An Intelligent Fault Diagnosis Method Using Unsupervised Feature Learning Towards Mechanical Big Data. IEEE Trans. Ind. Electron. 2016, 63, 3137–3147.
  5. Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B. Scalability in Perception for Autonomous Driving: Waymo Open Dataset. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 2443–2451.
  6. Li, Z.; Zhu, C.; Gao, Y.-L.; Wang, Z.-K.; Wang, J. AlphaGo Policy Network: A DCNN Accelerator on FPGA. IEEE Access 2020, 8, 203039–203047.
  7. Zhao, Y. Analysis of TikTok’s Success Based on Its Algorithm Mechanism. In Proceedings of the 2020 International Conference on Big Data and Social Sciences (ICBDSS), Xi’an, China, 14–16 June 2020; pp. 19–23.
  8. Almakky, I.; Palade, V.; Ruiz-Garcia, A. Deep Convolutional Neural Networks for Text Localisation in Figures from Bio-medical Literature. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–5.
  9. Qayyum, A.B.A.; Arefeen, A.; Shahnaz, C. Convolutional Neural Network (CNN) Based Speech-Emotion Recognition. In Proceedings of the 2019 IEEE International Conference on Signal Processing, Information, Communication & Systems (SPICSCON), Dhaka, Bangladesh, 28–30 November 2019; pp. 122–125.
  10. Cote-Allard, U.; Fall, C.L.; Drouin, A.; Campeau-Lecours, A.; Gosselin, C.; Glette, K.; Laviolette, F.; Gosselin, B. Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 760–771.
  11. Nalini, M.; Radhika, K. Comparative analysis of deep network models through transfer learning. In Proceedings of the 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam, India, 7–9 October 2020; pp. 1007–1012.
  12. Manoorkar, P.B.; Kamat, D.K.; Patil, P.M. Analysis and classification of human skin diseases. In Proceedings of the 2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT), Pune, India, 9–10 September 2016; pp. 1067–1071.
  13. Hameed, N.; Shabut, A.M.; Hossain, M.A. Multi-Class Skin Diseases Classification Using Deep Convolutional Neural Network and Support Vector Machine. In Proceedings of the 2018 12th International Conference on Software, Knowledge, Information Management & Applications (SKIMA), Phnom Penh, Cambodia, 3–5 December 2018; pp. 1–7.
  14. Hegde, P.R.; Shenoy, M.M.; Shekar, B. Comparison of Machine Learning Algorithms for Skin Disease Classification Using Color and Texture Features. In Proceedings of the 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Bangalore, India, 19–22 September 2018; pp. 1825–1828.
  15. Lin, J.; Guo, Z.; Li, D.; Hu, X.; Zhang, Y. Automatic Classification of Clinical Skin Disease Images with Additional High-Level Position Information. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; pp. 8606–8610.
  16. Hasan, Z.; Shoumik, S.; Zahan, N. Integrated Use of Rough Sets and Artificial Neural Network for Skin Cancer Disease Classification. In Proceedings of the 2019 International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2), Rajshahi, Bangladesh, 11–12 July 2019; pp. 1–4.
  17. Chatterjee, S.; Dey, D.; Munshi, S. Morphological, Texture and Auto-encoder based Feature Extraction Techniques for Skin Disease Classification. In Proceedings of the 2019 IEEE 16th India Council International Conference (INDICON), Rajkot, India, 13–15 December 2019; pp. 1–4.
  18. Goswami, T.; Dabhi, V.K.; Prajapati, H.B. Skin Disease Classification from Image—A Survey. In Proceedings of the 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 6–7 January 2020; pp. 599–605.
  19. Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 2018, 5, 1–9.
  20. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
  21. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.

Figure 1. Sample images from the skin disease dataset.
Figure 2. Architecture of AlexNet.
Figure 3. Architecture of ResNet50.
Figure 4. (a) Architecture of Model II; (b) architecture of Model III; (c) architecture of Model IV.
Figure 5. (a) Confusion matrix of Model IV using transfer learning; (b) confusion matrix of Model II trained from scratch.
Figure 6. (a) Confusion matrix of Model IV without data augmentation; (b) confusion matrix of Model IV using combined data augmentation.
Figure 7. (a) Confusion matrix of Model II with a mini-batch size of 5; (b) confusion matrix of Model II with a mini-batch size of 40.

Table 1. Average cross-validation results of different models.

| Model | Architecture | Training | Data Augmentation | Accuracy |
| --- | --- | --- | --- | --- |
| I | AlexNet-based | Transfer learning | No | 0.7100 |
| I | AlexNet-based | From scratch | No | 0.6640 |
| II | Four-stage ResNet50-based | Transfer learning | No | 0.7640 |
| II | Four-stage ResNet50-based | From scratch | No | 0.6605 |
| III | Three-stage ResNet50-based | Transfer learning | No | 0.7700 |
| III | Three-stage ResNet50-based | From scratch | No | 0.6415 |
| IV | Two-stage ResNet50-based | Transfer learning | No | 0.7835 |
| IV | Two-stage ResNet50-based | From scratch | No | 0.6960 |

Table 2. Comparison of types of augmentation techniques.

| Model | Architecture | Training | Data Augmentation | Accuracy |
| --- | --- | --- | --- | --- |
| IV | Two-stage ResNet50-based | Transfer learning | No | 0.7835 |
| IV | Two-stage ResNet50-based | Transfer learning | Random scale | 0.7785 |
| IV | Two-stage ResNet50-based | Transfer learning | Random rotation | 0.7620 |
| IV | Two-stage ResNet50-based | Transfer learning | Random reflection | 0.7735 |
| IV | Two-stage ResNet50-based | Transfer learning | Random shear | 0.7765 |
| IV | Two-stage ResNet50-based | Transfer learning | Combined | 0.7430 |

Table 3. Comparison of different mini-batch sizes.

| Model | Architecture | Training | Mini-Batch Size | Accuracy |
| --- | --- | --- | --- | --- |
| II | Four-stage ResNet50-based | Transfer learning | 5 | 0.7560 |
| II | Four-stage ResNet50-based | Transfer learning | 10 | 0.7640 |
| II | Four-stage ResNet50-based | Transfer learning | 20 | 0.7765 |
| II | Four-stage ResNet50-based | Transfer learning | 40 | 0.7905 |
| II | Four-stage ResNet50-based | Transfer learning | 80 | 0.7855 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Citation

Fan, Jiayi, Jongwook Kim, Insu Jung, and Yongkeun Lee. 2021. "A Study on Multiple Factors Affecting the Accuracy of Multiclass Skin Disease Classification" Applied Sciences 11, no. 17: 7929. https://doi.org/10.3390/app11177929