Article

Multi-Models of Analyzing Dermoscopy Images for Early Detection of Multi-Class Skin Lesions Based on Fused Features

by Ibrahim Abdulrab Ahmed 1,*, Ebrahim Mohammed Senan 2,*, Hamzeh Salameh Ahmad Shatnawi 1, Ziad Mohammad Alkhraisha 1 and Mamoun Mohammad Ali Al-Azzam 1

1 Computer Department, Applied College, Najran University, Najran 66462, Saudi Arabia
2 Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana’a, Yemen
* Authors to whom correspondence should be addressed.
Processes 2023, 11(3), 910; https://doi.org/10.3390/pr11030910
Submission received: 8 February 2023 / Revised: 13 March 2023 / Accepted: 15 March 2023 / Published: 16 March 2023
(This article belongs to the Special Issue Machine Learning in Biomaterials, Biostructures and Bioinformatics)

Abstract: Melanoma is a cancer that threatens life and leads to death. Effective detection of skin lesion types from images is a challenging task. Dermoscopy is an effective technique for detecting skin lesions. Early diagnosis of skin cancer is essential for proper treatment. Skin lesions are similar in their early stages, so manual diagnosis is difficult. Thus, artificial intelligence techniques can analyze images of skin lesions and discover hidden features not seen by the naked eye. This study developed hybrid techniques based on hybrid features to effectively analyze dermoscopic images and classify two skin lesion datasets, HAM10000 and PH2. The images were optimized for all techniques, and the class imbalance in both datasets was resolved. The HAM10000 and PH2 datasets were first classified by pre-trained MobileNet and ResNet101 models. For effective detection of early-stage skin lesions, the hybrid techniques SVM-MobileNet, SVM-ResNet101 and SVM-MobileNet-ResNet101 were then applied, which performed better than the pre-trained CNN models. Next, handcrafted features describing color, texture and shape were combined with the features of the MobileNet and ResNet101 models to form highly representative feature vectors. Finally, the MobileNet-handcrafted and ResNet101-handcrafted features were fed to an ANN for classification with high accuracy. For the HAM10000 dataset, the ANN with MobileNet and handcrafted features achieved an AUC of 97.53%, accuracy of 98.4%, sensitivity of 94.46%, precision of 93.44% and specificity of 99.43%. Using the same technique, the PH2 dataset achieved 100% for all metrics.

1. Introduction

Cancer is a disease caused by the abnormal growth of cells and can spread and multiply through the lymph nodes, tissues and blood. Skin cancer is among the harmful and life-threatening types of cancer that cannot be cured if not detected and treated in the initial stages [1]. The skin is the outer cover and the first line of defense for the body against viruses and external shocks. It is also a source of beauty for a person. If the performance of the skin changes slightly, it will affect the body system. Skin diseases have many causes, including fungi, bacteria, allergies and pigment formation. Some skin diseases are chronic and may grow into malignant tissues. The affected area is called a skin lesion, and each lesion is classified according to the cells it comprises. Melanocytes are responsible for producing melanin, and a failure to produce melanin leads to a lack of pigment in the skin. Irregular production of non-pigmented cells, such as squamous and basal cells, leads to non-pigmented lesions such as basal and squamous cell carcinomas. A distinction is made between the two types of lesions based on the presence or absence of certain features, such as the pigment network. A lesion is then distinguished as malignant or benign based on the characteristics of dermoscopic images; if it is benign, it is classified into other types of skin lesions. Melanoma and non-melanoma skin cancers are two types of life-threatening skin cancers. In the United States of America, about 5.4 million people develop skin cancer annually [2]. According to World Health Organization reports, skin cancer cases increase by 53% annually, and the percentage continues to grow [3]. If skin cancer is detected in its early stages, the survival rate is more than 95%; in contrast, the survival rate is less than 10% if early diagnosis fails. Non-melanocyte carcinomas are more common but less harmful than melanomas [4]. Although melanoma accounts for 5% of skin cancers, it causes more than 70% of deaths [5]. Identifying melanoma and non-melanoma lesions requires highly experienced dermatologists because of the high similarity between classes and within the non-melanocyte classes. Thus, automated diagnosis will increase accuracy in recognizing the type of skin lesion [6]. In manual diagnosis, the type of skin cancer is determined by clinical examination, which is the initial analysis of the lesion, followed by a biopsy, histopathological examination and lesion evaluation [7]. The biopsy is done by scraping the skin, taking a tissue sample and subjecting it to several laboratory tests; this method is painful and takes a long time. Dermoscopy involves examination of the skin to assess the type of skin lesion, and the dermoscopic technique was developed to assist dermatologists in effectively assessing the lesion [8]. Detecting the lesion depends on many features, such as color, geometry, lesion border similarity and texture. Because of the high visual similarity between skin lesion types, visual classification is difficult and prone to misclassification [9]. Therefore, classification through deep learning techniques is an alternative to microscopy and an effective solution for classifying skin lesions. Computer-assisted automated systems help detect skin cancer in its early stages. Dermoscopic images contain artifacts such as skin lines, darkness, hair and air bubbles, which make automated diagnosis difficult.
Computational methods, however, offer many ways to remove these artifacts, improve the images and highlight lesion edges, which is important given the similar characteristics of different skin lesion types, especially in the initial stages. Therefore, this work focuses on extracting hidden features not visible to the naked eye using deep learning models and integrating them with handcrafted features.
The main contributions of this study are as follows:
  • Removal of artifacts from dermoscopic images of skin lesions using an averaging filter and then inputting the images into a Laplacian filter to show the edges of the low-contrast lesions.
  • Selection of important features and removal of redundant ones by PCA.
  • Classification of dermoscopic images of two datasets, HAM10000 and PH2, by a hybrid technique, namely SVM-MobileNet-ResNet101.
  • Classification of dermoscopic images of two datasets, HAM10000 and PH2, by ANN based on the combination of CNN (MobileNet and ResNet101) and handcrafted (discrete wavelet transform (DWT), gray-level co-occurrence matrix (GLCM), fuzzy color histogram (FCH) and local binary patterns (LBP)) features.
The rest of the paper is organized as follows: Section 2 discusses the methods and results of previous studies; Section 3 describes the methods and tools for classifying dermoscopic images of the two datasets, HAM10000 and PH2; Section 4 summarises and discusses the results of systems performance; Section 5 discusses and compares the performance of methods; and Section 6 concludes the study.

2. Related Work

Many researchers have devoted their time and effort to developing automated systems capable of diagnosing skin lesions and distinguishing between their types in the early stages.
Parvathaneni et al. [10] presented a lightweight technique for classifying skin lesions by MobileNet V2 and LSTM. The method is characterized by its ability to quickly identify the lesion area with fewer computations than the traditional MobileNet method. The systems were evaluated on the HAM10000 dataset and outperformed other methods, with an accuracy of up to 85%. Malik et al. [11] introduced deep learning models to extract patterns from PH2 images using the ABCD rule. Attributes were selected by Greedy Stepwise with dataset balancing; the random forest classifier reached a sensitivity of 93% and a kappa of 78%. Joanna et al. [12] developed algorithms to classify skin lesions based on their anatomical locations on the face, limbs and trunk, using pre-trained deep learning models to extract features with a modified classifier. The developed algorithms classified images of skin lesions based on their location with an accuracy of 91.45%. Talha et al. [13] proposed a skin lesion detector based on deep learning models to evaluate the unbalanced HAM10000 dataset. The model was tuned with several hyperparameters. RegNetY-320 achieved better results than the other models on both the balanced and unbalanced datasets, reaching an accuracy of 91%, an F1-score of 88% and an AUC of 95%. Ibrahim et al. [14] developed automated systems to analyze two datasets, ISIC and PH2, for skin cancer detection. The features were extracted by several methods, combined to form representative features and fed to neural networks for classification. A feed-forward neural network (FFNN) reached accuracies of 97.91% and 95.24% for diagnosing the ISIC and PH2 datasets, respectively. Aditya et al. [15] presented an internet-of-health-things-driven deep learning system for skin cancer classification based on pre-trained CNN models. The features were extracted from the CNN models and sent to a fully connected layer to classify skin lesions. Walaa et al. [16] applied CNNs to analyze ISIC 2018 images to detect melanoma. The Enhanced Super-Resolution Generative Adversarial Network method optimized the images, and the number of images was increased during training. CNN models reached an average skin lesion classification accuracy of 83.2% on the ISIC 2018 dataset. Imran et al. [17] developed a deep learning model for the multiclass classification of ISIC datasets. The model was carefully designed with specific layers, various filter sizes and fewer parameters and filters to improve performance, and it reached an accuracy of 94%, a specificity of 91% and a sensitivity of 93%. Murugan et al. [18] presented a methodology for diagnosing skin lesions in which the images were improved and the lesion was isolated from the rest of the image. Features were extracted from the lesion area by the gray-level run-length matrix (GLRLM) and classified by SVM and random forests. Muhammad et al. [19] proposed an approach for segmenting and classifying dermoscopic images from two datasets, HAM10000 and PH2. Histogram intensity values were used to improve the images, and lesions were segmented by a saliency method based on CNN. Thresholding converted the image into binary images, and the moth flame optimization method selected the most important features. Maximum correlation analysis was used to combine the features, which were classified by a kernel extreme learning machine. The method achieved an accuracy of 90.67% on the HAM10000 dataset. Lina et al. [20] leveraged multiscale information to make robust predictions: the CNN uses this information to predict the edges of the lesion, and a connection layer module feeds the features for each task into a sub-block to focus the network on the lesion edges. Lina et al. [21] proposed an enhanced CNN model based on a fully convolutional network to analyze and segment skin lesions. A probabilistic model was used to refine the lesion boundaries; the model achieved an accuracy of 98% on the ISBI 2017 dataset. Farhat et al. [22] presented a five-step method for classifying the HAM10000 dataset. First, the images were enhanced and features were extracted by transfer learning. A feature selection method was then applied to choose the best features, which were combined using the canonical correlation approach. The selected features were classified by an extreme learning machine, which achieved an accuracy of 93.4%.
As these studies show, reaching high accuracy in diagnosing types of skin lesions is still the goal of all researchers. Many studies did not focus on extracting features in several ways and integrating them. Therefore, this study focused on extracting subtle and hidden features that are not visible to the naked eye using CNN models and various traditional methods, and on incorporating them to produce highly representative feature vectors for each image.

3. Methods and Materials

This section discusses the methods and materials used for analyzing dermoscopic images for the early classification of two skin lesion datasets, HAM10000 and PH2, as shown in Figure 1. The images were optimized for all systems, and the minority-class images were increased to balance the HAM10000 and PH2 datasets. CNN models were used for classifying the two datasets. The hybrid techniques SVM-MobileNet, SVM-ResNet101 and SVM-MobileNet-ResNet101 were used to classify skin lesions efficiently. Finally, the features were extracted from the CNN, combined with the handcrafted features and fed into the ANN.

3.1. Description of Dermoscopic Images Dataset

In this study, the developed systems were evaluated on dermoscopic images from two skin lesion datasets, HAM10000 and PH2.

3.1.1. Description of HAM10000 Dataset

The HAM10000 (Human Against Machine) dataset is considered one of the most extensive for dermoscopy. The goal is to make the dataset available to researchers and experts to save lives from the most dangerous and deadly skin diseases, and to compete on the accuracy with which skin lesions can be diagnosed early. Dermoscopic images were acquired by various devices and under different conditions [23]. The proposed system was evaluated on the HAM10000 dataset, collected over twenty years from the Department of Dermatology at the Medical University of Vienna, Austria, and the skin cancer practice of Cliff Rosendahl in Queensland, Australia. The HAM10000 dataset contains 10,015 images spread over seven skin diseases: melanocytic nevi (nv) with 6705 images, melanoma (mel) with 1113 images, benign keratosis lesions (bkl) with 1099 images, basal cell carcinoma (bcc) with 514 images, actinic keratoses (akiec) with 327 images, vascular lesions (vasc) with 142 images and dermatofibroma (df) with 115 images. The images were obtained from both sexes, with 5406 images from males and 4552 images from females, across different body locations and ages. The dataset is unbalanced due to large differences in the number of images per class; notably, the nv class contains about 58 times as many images as the df class.

3.1.2. Description of PH2 Dataset

The PH2 dataset is available for researchers and experts to conduct research and establish standards to reach high diagnostic accuracy. The PH2 dataset was obtained from the Dermatology Service of Pedro Hispano Hospital (Matosinhos, Portugal) [24]. All images were acquired under the same conditions by the Tuebinger Mole Analyzer system, which has a 20× magnification lens. PH2 images were acquired in RGB color as BMP files with a resolution of 768 × 560 pixels. The PH2 dataset contains 200 images distributed among three classes: 80 common nevi, 80 atypical nevi and 40 melanomas.

3.2. Pre-Processing Dermoscopic Images

Image pre-processing techniques make the information in an image easier to interpret for experts and others working in this field, and they provide essential input to subsequent image-processing stages by optimizing specific features.

3.2.1. Enhancement of Dermoscopic Images for the HAM10000 and PH2 Datasets

Image enhancement aims to change some image data (attributes) and make them clearer for easier processing in later tasks. Image enhancement is used to remove image artifacts, highlight lesion edges by processing low contrast between the edges of lesions and healthy skin, remove air bubbles and skin lines, and improve dark images. The images were optimized using two filters: the average filter and the Laplacian filter [25]. First, the images were enhanced by an average filter. An average filter improves an image by reducing the intensity contrast between adjacent pixels. The filter moves over the entire image and replaces the value of each pixel in the image with the average value of neighboring pixels.
Given an input image f(x, y) and a filter (kernel) of size N = m × n = 5 × 5, the output of the averaging filter at (x, y) is
g(x, y) = \frac{1}{N} \sum_{i=-a}^{a} \sum_{j=-b}^{b} f(x + i, y + j)
where x and y vary so that the center (origin) of the filter visits every pixel in f once, and m = 2a + 1, n = 2b + 1.
Next, the Laplacian filter was applied to detect the edges of the skin lesions, defined as:
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
In the x direction, we have
\frac{\partial^2 f}{\partial x^2} = f(x + 1, y) + f(x - 1, y) - 2 f(x, y)
and, similarly, in the y direction,
\frac{\partial^2 f}{\partial y^2} = f(x, y + 1) + f(x, y - 1) - 2 f(x, y)
It follows from the preceding three equations that
\nabla^2 f(x, y) = f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4 f(x, y)
Finally, the two enhanced images are merged by subtracting the output of the Laplacian filter from the output of the averaging filter:
O(x, y) = g(x, y) - \nabla^2 f(x, y)
where O(x, y) is the output enhanced image.
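To make this enhancement step concrete, the following is a minimal Python sketch, assuming OpenCV (cv2); the 5 × 5 averaging kernel follows the text, while the clipping and data-type handling are implementation details not specified in the paper.

```python
import cv2
import numpy as np

def enhance(image_bgr: np.ndarray) -> np.ndarray:
    """Average-filter smoothing, Laplacian edge detection, then
    O(x, y) = g(x, y) - Laplacian, as described in the text."""
    smoothed = cv2.blur(image_bgr, (5, 5))            # g(x, y), 5x5 kernel
    lap = cv2.Laplacian(image_bgr, cv2.CV_64F)        # Laplacian of f(x, y)
    enhanced = smoothed.astype(np.float64) - lap      # O(x, y)
    return np.clip(enhanced, 0, 255).astype(np.uint8)
```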
Figure 2 shows samples from the HAM10000 dataset before and after enhancement. Figure 3 shows samples from the PH2 dataset before and after enhancement.

3.2.2. Hair Removal Method

Some dermoscopic images contain hair, which challenges subsequent image-processing methods. Extracting features from images that contain hair causes hair features to be included among the essential features of the lesion, leading to an inaccurate diagnosis [26]. Therefore, the DullRazor method was used for hair removal in the proposed methods. DullRazor removes hair in three steps: (1) dermoscopic images are passed to a morphological closing operation to select dark hair; (2) bilinear interpolation is used to check the hair pixels and then replace them; (3) the new pixels are smoothed with a median filter.
Figure 4 shows images of skin lesions before and after hair removal.
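The sketch below outlines a DullRazor-style pipeline, assuming OpenCV. The inpainting call stands in for the paper's bilinear-interpolation replacement step, and the structuring-element size and threshold are illustrative assumptions rather than the paper's exact settings.

```python
import cv2
import numpy as np

def remove_hair(image_bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # (1) morphological black-hat (closing minus original) highlights
    # dark, thin structures such as hair
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 17))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)
    _, hair_mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
    # (2) replace hair pixels from their surroundings (inpainting used
    # here in place of the bilinear-interpolation step)
    clean = cv2.inpaint(image_bgr, hair_mask, 3, cv2.INPAINT_TELEA)
    # (3) smooth the replaced pixels with a median filter
    return cv2.medianBlur(clean, 3)
```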

3.3. SVM Based on CNN Features

This section introduces a hybrid technique based on extracting the features of dermoscopic images of two datasets, HAM10000 and PH2, by CNN models and classifying them by the SVM algorithm. The idea of this technique is to replace the last layers in the CNN models with the SVM algorithm.

3.3.1. Extract Deep Feature Maps

A layered CNN is designed to learn visual features automatically by training on the dermoscopic images of the two skin lesion datasets. Several CNN structures can be constructed to diagnose skin lesions, differing in the number and order of layers, training steps and learning rate. This section describes the CNN architecture, including convolutional and pooling layers [27].
Dermoscopic images of skin lesions are fed into the CNN structure and, as they pass through the convolutional layers, are transformed into feature maps (with the number of images and the height, width and depth of the features). A convolutional layer implements a set of filters of given width and height, learns the weights from the training set, processes the input and passes it on to the next layer [28]. The weights connect neighboring neurons of the same depth, forming a convolutional filter. Three parameters determine the size of convolutional layers: filter size, stride and zero padding. The filter size determines the number of pixels of the image E(t) that the filter G(t) wraps around, as in Equation (6). The stride sets the number of positions the filter shifts over the input neurons; a stride of 1 shifts the filter by 1 pixel, and a stride of 2 shifts it by 2 pixels.
W(t) = (E * G)(t) = \int E(a) \, G(t - a) \, da
where W(t) refers to the output layer, G(t) refers to the filter and E(t) refers to the input dermoscopy image.
Zero padding fills the edges with zeros to keep the output neurons the same size as the input neurons. When the zero padding is 1, one row and one column of zeros are added around the edges; when the zero padding is 2, two rows and two columns of zeros are added around the edges. The number of output neurons is calculated according to Equation (8) [29]:
\text{Output Neurons} = \frac{N - K + 2P}{S} + 1
where N indicates the size of the input neurons, K is the size of the filter, P is the size of padding and S indicates the stride.
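As a quick numeric check of Equation (8), the snippet below computes the output size for two illustrative configurations; the input sizes, filter sizes, padding and stride values here are examples, not the paper's settings.

```python
def output_neurons(N: int, K: int, P: int, S: int) -> int:
    """Equation (8): number of output neurons per spatial dimension."""
    return (N - K + 2 * P) // S + 1

print(output_neurons(N=224, K=3, P=1, S=1))  # -> 224 (size preserved)
print(output_neurons(N=224, K=7, P=0, S=2))  # -> 109 (downsampled)
```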
The pooling layer reduces dimensionality by pooling the outputs of a group of neurons in one layer into a single cell in the next. The calculation is performed by either maximum or average pooling. Max pooling takes the maximum value of each group of neurons in the previous layer, as in Equation (9); average pooling takes the average value of each group, as in Equation (10) [30]:
H(a, b) = \max_{q, r = 1, \ldots, k} G\big((a - 1) p + q, \; (b - 1) p + r\big)
H(a, b) = \frac{1}{k^2} \sum_{q, r = 1}^{k} G\big((a - 1) p + q, \; (b - 1) p + r\big)
where G is the filter wrapped around a group of pixels, q and r are the matrix positions, p is the filter step and k is the number of pixels.
Finally, for the HAM10000 dataset, the MobileNet and ResNet101 models produce features of dimensions 10,015 × 1024 and 10,015 × 2048, respectively; for the PH2 dataset, they produce features of dimensions 200 × 1024 and 200 × 2048. PCA removes the redundant, high-dimensional features, keeping the essential features at sizes 10,015 × 475 and 200 × 475 for the HAM10000 and PH2 datasets, respectively.
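The following is a minimal sketch of this feature-extraction stage, assuming Keras pretrained models and scikit-learn PCA, with `images` an (n, 224, 224, 3) array of preprocessed dermoscopy images; the names are illustrative, and model-specific input preprocessing is omitted for brevity.

```python
import numpy as np
from tensorflow.keras.applications import MobileNet, ResNet101
from sklearn.decomposition import PCA

def cnn_features(images: np.ndarray):
    """Extract one feature vector per image from each pretrained model,
    then reduce each feature block to 475 components with PCA."""
    # pooling="avg" removes the classifier head: MobileNet yields 1024-D
    # vectors and ResNet101 yields 2048-D vectors, as in the text
    mobilenet = MobileNet(weights="imagenet", include_top=False, pooling="avg")
    resnet = ResNet101(weights="imagenet", include_top=False, pooling="avg")
    feat_mobile = mobilenet.predict(images)   # shape (n, 1024)
    feat_resnet = resnet.predict(images)      # shape (n, 2048)
    # PCA keeps 475 components per model; note that n_components cannot
    # exceed the number of samples, so a very small dataset would cap this
    reduce_to_475 = lambda f: PCA(n_components=475).fit_transform(f)
    return reduce_to_475(feat_mobile), reduce_to_475(feat_resnet)
```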

3.3.2. Machine Learning (SVM)

The goal of SVM is to create decision lines or boundaries that separate a set of data into different classes (similar data in the same category). The best hyperplane is the one with the maximum margin between the support vectors. SVM has two types: linear and non-linear. A linear SVM is used for linearly separable data; that is, if the dataset can be classified into two classes using one hyperplane, the data are called linearly separable and the classifier is called a linear SVM. A non-linear SVM is used when the data are not linearly separable. Support vectors are the data points closest to the hyperplane from both categories [31]. The margin is the distance from the support vectors to the hyperplane. Although the SVM method separates the data into two categories, it can also classify data belonging to multiple categories. The SVM algorithm receives the features of MobileNet and ResNet101 and divides the dataset into 80% for training the system and validating its generalization and 20% for testing.
In the new technique reported here, the biggest challenge was the similarity between skin lesions, especially in the initial stages. Therefore, the features of the MobileNet and ResNet101 models were combined to overcome this challenge. This technique extracts the features of the MobileNet and ResNet101 models separately and inputs them into the PCA method to select the representative features. The features of the two models were then combined and saved in vectors of size 10,015 × 950 and 200 × 950 for the HAM10000 and PH2 datasets, respectively. The SVM receives the hybrid features of MobileNet and ResNet101 and analyzes, interprets and classifies them with high accuracy. Figure 5 shows the system architecture for analyzing dermoscopic images of the HAM10000 and PH2 datasets using SVM-CNN.
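A sketch of this SVM-MobileNet-ResNet101 stage is given below, assuming the 475-D PCA-reduced feature blocks produced above; the RBF kernel and the regularization constant are assumptions, since the paper does not state the SVM hyperparameters.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def train_hybrid_svm(feat_mobile, feat_resnet, labels):
    """Fuse the two PCA-reduced feature blocks and train an SVM."""
    fused = np.hstack([feat_mobile, feat_resnet])   # (n, 475 + 475) = (n, 950)
    X_train, X_test, y_train, y_test = train_test_split(
        fused, labels, test_size=0.20, stratify=labels, random_state=0)
    svm = SVC(kernel="rbf", C=10.0).fit(X_train, y_train)
    print("test accuracy:", svm.score(X_test, y_test))
    return svm
```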

3.4. ANN with Hybrid Features of CNN and Handcrafted

This technique is a hybrid of artificial neural networks and fused CNN and handcrafted features. It aims to achieve superior results for classifying dermoscopic images of the skin lesion datasets.
An ANN is a group of internally connected and interconnected neurons. It has a superior ability to obtain information from complex data, interpret it, analyze it and produce clear patterns. Well-trained and validated neural networks lead to accurate results for classifying skin lesions. The success of artificial neural network training depends on the internal structure of the network, which consists of the numbers of neurons in the input and output layers and the number of hidden layers and their neurons, and on the algorithm used to train on the input data. Here, the network has 475 input units, a hidden layer of size 20 and an output layer of 7 neurons for HAM10000 or 3 neurons for PH2, determined by the problem, as shown in Figure 6. The best network configuration is chosen by trial and error, in which the weights are updated continuously and the process repeats until the minimum error ratio is obtained. Artificial neural network algorithms aim to update the weights so as to minimize the mean squared error (MSE) between the actual class labels x_i and the predicted class labels y_i, as in Equation (11).
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2
where n is the number of data points.
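The Keras sketch below mirrors this description, assuming a single hidden layer of 20 neurons (one reading of the text's hidden-layer size) and the MSE objective of Equation (11) applied to one-hot class labels; the tanh activation and Adam optimizer are assumptions not stated in the paper.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

def build_ann(n_features: int = 475, n_classes: int = 7):
    """Feed-forward ANN: 475 inputs, one hidden layer of 20 neurons,
    and 7 (HAM10000) or 3 (PH2) softmax outputs."""
    model = Sequential([
        Input(shape=(n_features,)),
        Dense(20, activation="tanh"),            # hidden layer (assumed size 20)
        Dense(n_classes, activation="softmax"),  # one neuron per lesion class
    ])
    # the loss is the mean squared error between actual and predicted
    # class labels, matching Equation (11)
    model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])
    return model
```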
The novelty of this technique is the extraction of features from CNN models and their fusion with handcrafted features extracted by the DWT, GLCM, FCH and LBP methods, as shown in Figure 7. This technique trains quickly on the dataset and does not require high-specification hardware.
The first step is to remove artifacts and optimize the images before feeding them to the MobileNet and ResNet101 models. The two models receive the skin lesion images of the two datasets, HAM10000 and PH2, and analyze them through convolutional layers to extract accurate, hidden features; each convolutional layer extracts features that differ from those of the other layers. The MobileNet and ResNet101 models produce high-dimensional features saved in vectors of 10,015 × 1024 and 10,015 × 2048 for the HAM10000 dataset, and 200 × 1024 and 200 × 2048 for the PH2 dataset. To eliminate redundant and unimportant features, the MobileNet and ResNet101 features are passed to PCA and reduced to 10,015 × 475 and 200 × 475 for the HAM10000 and PH2 datasets, respectively.
The wavelet transform method provides many advantages over other transform methods, offering an easy-to-interpret representation. Two-dimensional skin disease images are decomposed using the bi-dimensional wavelet transform, which applies one-dimensional transforms along each dimension. In a multi-resolution analysis, the DWT decomposes the input signal into two new signals with different frequencies [32]. These two new signals correspond to low- and high-pass filters representing the wavelet transformation functions. Approximation coefficients are obtained by the low-pass filters (LL), while detail coefficients (horizontal, vertical and diagonal) are obtained by the high-pass filters (LH, HL and HH, respectively). Decomposing a bi-dimensional signal using the DWT thus yields four sub-bands per level of decomposition. The standard deviation, mean and variance were extracted as features from each sub-band (the approximation coefficients and the horizontal, vertical and diagonal detail coefficients). The DWT texture features of the skin lesions are saved in vectors of size 10,015 × 12 and 200 × 12 for the HAM10000 and PH2 datasets, respectively.
The GLCM is a matrix that shows the composition of gray levels in the region of interest. Extracting the texture features of the lesion area by GLCM helps identify the various areas of the lesion region. Since a smooth lesion area has pixel values close to each other while a rough region has varied pixel values, statistical measures of the lesion area histogram are appropriate texture measures [33]. These texture metrics contain spatial information based on the spatial gray-level co-occurrence matrix. The spatial information determines the relationship between pixel pairs in terms of relative position, defined by a distance d and an orientation θ describing the location of the second pixel relative to the first. The four values of θ covering the orientations are 0°, 45°, 90° and 135°; the common value is d = 1 when θ is vertical or horizontal (θ = 0° or θ = 90°) and d = √2 when θ = 45° or θ = 135°. The GLCM texture features of the skin lesions are saved in vectors of size 10,015 × 24 and 200 × 24 for the HAM10000 and PH2 datasets, respectively.
Color is a strong feature in the diagnosis of skin lesions. Because each local color is represented by a histogram bin, the color histogram of the target area gives only a coarse color distribution: two colors are treated as similar if they fall in the same histogram bin, and as different if they fall in different bins, even if they are somewhat alike [34]. The FCH addresses this by modeling color similarity using a total membership value for each pixel and distributing pixels over all histogram bins; it distributes color from a probabilistic viewpoint. The FCH color features of the skin lesions are saved in vectors of size 10,015 × 16 and 200 × 16 for the HAM10000 and PH2 datasets, respectively.
LBP is a simple and effective feature extraction method, owing to its robustness and mathematical simplicity. The method selects a central pixel to analyze together with its neighboring pixels, using a 4 × 4 neighborhood for each central pixel; the parameter R (radius) determines the extent of the neighborhood around each central pixel. The basic idea of LBP is to describe the texture of two-dimensional surfaces: the image is converted to an array, each central pixel is analyzed according to its neighboring pixels and replaced according to the algorithm's expression, and the process is repeated for every pixel inside the skin lesion. The algorithm compares the gray level of the central pixel g_c with that of each neighboring pixel g_p, as in Equation (12) [34]. The LBP features of the skin lesions are saved in vectors of size 10,015 × 203 and 200 × 203 for the HAM10000 and PH2 datasets, respectively.
LBP_{R, P} = \sum_{p=0}^{P-1} s(g_p - g_c) \, 2^p
where g_c refers to the central pixel, R refers to the neighborhood radius, g_p refers to the neighboring pixels and P refers to the number of neighboring pixels, with
s(c) = \begin{cases} 0, & c < 0 \\ 1, & c \geq 0 \end{cases}
The DWT, GLCM, FCH and LBP features are combined into the handcrafted features, stored in vectors of size 10,015 × 255 and 200 × 255 for the HAM10000 and PH2 datasets, respectively.
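A condensed sketch of the handcrafted extractors is shown below, assuming PyWavelets and scikit-image, with `gray` a 2-D uint8 grayscale lesion image; the FCH features are omitted for brevity, and the wavelet choice, GLCM settings and LBP parameters are illustrative, so the feature counts here are smaller than those of the vectors described above.

```python
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

def handcrafted_features(gray: np.ndarray) -> np.ndarray:
    # DWT: std, mean and variance of the LL, LH, HL and HH sub-bands (12 values)
    ll, (lh, hl, hh) = pywt.dwt2(gray.astype(float), "haar")
    dwt = [f(band) for band in (ll, lh, hl, hh) for f in (np.std, np.mean, np.var)]
    # GLCM texture properties at d = 1 and θ = 0°, 45°, 90°, 135°
    glcm = graycomatrix(gray, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4], levels=256)
    texture = [graycoprops(glcm, prop).ravel()
               for prop in ("contrast", "homogeneity", "energy", "correlation")]
    # LBP histogram describing local texture patterns
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([dwt, np.concatenate(texture), hist])
```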
Due to the similarity between skin lesions, especially in the early stages, the novelty of this technique lies in integrating the features of the CNN models (MobileNet and ResNet101) with the handcrafted features. The hybrid MobileNet-handcrafted features are saved in vectors of size 10,015 × 730 and 200 × 730 for the HAM10000 and PH2 datasets, respectively; likewise, the hybrid ResNet101-handcrafted features are saved in vectors of size 10,015 × 730 and 200 × 730.
The ANN receives hybrid features of CNN (MobileNet and ResNet101) and handcrafted features, then analyzes, interprets and classifies them with high accuracy.
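The fusion itself is a simple per-image concatenation, as the following self-contained sketch illustrates with placeholder arrays standing in for the PCA-reduced CNN features and the handcrafted features.

```python
import numpy as np

rng = np.random.default_rng(0)
# placeholders standing in for the real features: 475 PCA-reduced CNN
# features and 255 handcrafted (DWT + GLCM + FCH + LBP) features per image
cnn_pca = rng.normal(size=(100, 475))
handcrafted = rng.normal(size=(100, 255))

# the hybrid vector is a simple concatenation of the two blocks
fused = np.hstack([cnn_pca, handcrafted])
print(fused.shape)   # (100, 730), matching the 730-D vectors in the text
```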

4. Results of System Performance

4.1. Split of HAM10000 and PH2 Data Sets

In this work, the performance of the systems was tested on dermoscopic images from two datasets, HAM10000 and PH2. The HAM10000 dataset consists of 10,015 images distributed unevenly among seven types of melanocytic and non-melanocytic diseases. The PH2 dataset consists of 200 images distributed unevenly among three types of melanocytic lesions. For all systems, the two datasets were split identically into 80% for training and validating the systems (itself split 80:20) and 20% for measuring the systems’ performance, as shown in Table 1.
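A sketch of this split, assuming scikit-learn, is shown below; the stratification and random seed are illustrative choices not stated in the paper.

```python
from sklearn.model_selection import train_test_split

def split_dataset(X, y, seed: int = 0):
    """Hold out 20% for testing, then split the remaining 80% again
    80:20 into training and validation sets."""
    X_trainval, X_test, y_trainval, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed)
    X_train, X_val, y_train, y_val = train_test_split(
        X_trainval, y_trainval, test_size=0.20, stratify=y_trainval,
        random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```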

4.2. Metrics of Systems Evaluation

The performance of all systems in classifying the skin lesion images of the two datasets, HAM10000 and PH2, was measured using the metrics shown in Equations (14)–(18). In these equations, correctly classified samples are denoted TN and TP, and incorrectly classified samples are denoted FN and FP. Each system produces a confusion matrix containing the correctly and incorrectly classified samples [34].
AUC (Equation (14)) is the area under the Receiver Operating Characteristic (ROC) curve, which plots the true positive rate (sensitivity) against the false positive rate (1 − specificity).
\text{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP} \times 100\% \quad (15)
\text{Sensitivity} = \frac{TP}{TP + FN} \times 100\% \quad (16)
\text{Precision} = \frac{TP}{TP + FP} \times 100\% \quad (17)
\text{Specificity} = \frac{TN}{TN + FP} \times 100\% \quad (18)
where
\text{True Positive Rate} = \frac{TP}{TP + FN}, \qquad \text{False Positive Rate} = \frac{FP}{FP + TN}
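The snippet below computes the metrics of Equations (15)–(18) from the one-vs-rest confusion-matrix counts of a single class; the counts in the example call are made up.

```python
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Per-class (one-vs-rest) metrics from confusion-matrix counts."""
    return {
        "accuracy":    100 * (tp + tn) / (tp + tn + fp + fn),   # Eq. (15)
        "sensitivity": 100 * tp / (tp + fn),                    # Eq. (16), TPR
        "precision":   100 * tp / (tp + fp),                    # Eq. (17)
        "specificity": 100 * tn / (tn + fp),                    # Eq. (18), 1 - FPR
    }

print(metrics(tp=90, tn=950, fp=10, fn=5))  # illustrative counts only
```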

4.3. Data Augmentation with Data Balancing

Many datasets suffer from a lack of images and from class imbalance, both of which negatively affect model performance. Therefore, techniques for balancing the classes during training are necessary. Augmentation increases the data for the minority classes, which enlarges the dataset and thus reduces overfitting. The augmentation technique increases the training samples using various methods such as rotation, size re-scaling, horizontal and vertical flipping, horizontal and vertical shifting, and cropping [35]. For the HAM10000 dataset, augmentation was applied to six classes, namely mel with 1113 images, bkl with 1099 images, bcc with 514 images, akiec with 327 images, vasc with 142 images and df with 115 images, due to the lack of data in them, while augmentation was not applied to the nv class, which has 6705 images. The number of training images for the six classes increased from 2119 to 24,762, while the number of training images in the nv class remained the same (4291 images) before and after augmentation.
For the PH2 dataset, augmentation was applied to all three classes, namely nv with 51 images, atypical with 51 images and mel with 26 images, increasing the number of training images from 128 to 4360. Table 2 shows the number of training images before and after augmentation.
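A sketch of such an augmentation configuration, assuming the Keras ImageDataGenerator, is given below; the transform ranges are illustrative, as the paper does not report them.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# randomized transforms applied to the minority-class training images
augmenter = ImageDataGenerator(
    rotation_range=30,        # rotation
    zoom_range=0.15,          # size re-scaling
    horizontal_flip=True,     # horizontal flipping
    vertical_flip=True,       # vertical flipping
    width_shift_range=0.10,   # horizontal shifting
    height_shift_range=0.10,  # vertical shifting
)
# augmenter.flow(minority_images, minority_labels, batch_size=32) would then
# yield randomly transformed copies to enlarge the minority classes
```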
Figure 8 describes the HAM10000 dataset samples after applying the data augmentation technique to the images.

4.4. Results of Pre-Trained CNN Models

This section discusses the performance of the pre-trained MobileNet and ResNet101 models for classifying the skin lesion images of the two datasets, HAM10000 and PH2. Notably, the ImageNet database on which these models were pre-trained does not contain skin lesion images. The images were enhanced before being fed to the MobileNet and ResNet101 models. The pre-trained MobileNet yielded the results shown in Table 3 for the classification of skin lesions in the HAM10000 and PH2 datasets. First, for the HAM10000 dataset, the pre-trained MobileNet reached an AUC of 94.73%, accuracy of 91.80%, sensitivity of 91.67%, precision of 90.06% and specificity of 98.24%. Second, for the PH2 dataset, the pre-trained MobileNet reached an AUC of 92.87%, accuracy of 90%, sensitivity of 92.10%, precision of 88.67% and specificity of 95.60%.
In contrast, the performance of pre-trained ResNet101 is shown in Table 4. First, for the HAM10000 dataset, the pre-trained ResNet101 reached an AUC of 95.01%, accuracy of 91.40%, sensitivity of 89.31%, precision of 88.73% and specificity of 98.31%. Second, for the PH2 dataset, the pre-trained ResNet101 reached an AUC of 93.13%, accuracy of 92.50%, sensitivity of 92.07%, precision of 91.90% and specificity of 96.23%.
MobileNet produces the confusion matrix as shown in Figure 9 for the HAM10000 and PH2 datasets. First, for the HAM10000 dataset, MobileNet achieved accuracy for each class as follows: 87.7% for akiec class, 89.3% for bcc class, 93.6% for bkl class, 91.3% for df class, 96.9% for mel class, 91.5% for nv class and 71.4% for vasc class. Secondly, for the PH2 data set, MobileNet achieved accuracy for each class as follows: 87.5% for atypical class, 100% for nv class and 87.5% for mel class.
ResNet101 produces the confusion matrix shown in Figure 10 for the two datasets, HAM10000 and PH2. First, for the HAM10000 dataset, ResNet101 achieved accuracy for each class as follows: 93.8% for akiec class, 92.2% for bcc class, 93.2% for bkl class, 91.3% for df class, 93.3% for mel class, 91.1% for nv class and 71.4% for vasc class. Secondly, for the PH2 dataset, ResNet101 achieved accuracy for each class as follows: 93.8% for atypical class, 87.5% for nv class and 93.8% for mel class.

4.5. Results of SVM Based on CNN Features

This section discusses the performance of the SVM-CNN technique for classifying the skin lesion images of the two datasets, HAM10000 and PH2. The technique consists of two parts: first, the CNN models (MobileNet and ResNet101), whose task is to extract features, with the important features then selected through PCA; second, the SVM classifier, which receives the selected features, using 80% of them for classifier training and 20% for testing classifier performance.
The hybrid technique yielded good results for the classification of skin lesions for the HAM10000 and PH2 datasets, as shown in Table 5. The SVM based on MobileNet features achieved slightly better results than its performance with ResNet101 features. First, for the HAM10000 dataset, SVM-MobileNet reached an AUC of 92.57%, accuracy of 95%, sensitivity of 86.17%, precision of 84.41% and specificity of 99.01%. Secondly, for the PH2 data set, SVM-MobileNet gave an AUC of 96.50%, accuracy of 95%, sensitivity of 95.77%, precision of 94.33% and specificity of 97.70%.
In contrast, the performance of SVM-ResNet101 is shown in Table 6. First, for the HAM10000 dataset, SVM-ResNet101 reached an AUC of 90.06%, accuracy of 94.80%, sensitivity of 85.56%, precision of 85.21% and specificity of 98.80%. Secondly, for the PH2 data set, SVM-ResNet101 gave an AUC of 95.70%, accuracy of 95%, sensitivity of 94.07%, precision of 95.97% and specificity of 97.20%.
The SVM-MobileNet produced the confusion matrix as shown in Figure 11 for the HAM10000 and PH2 datasets. First, for the HAM10000 dataset, SVM-MobileNet achieved accuracy for each class as follows: 84.6% for akiec class, 91.3% for bcc class, 93.6% for bkl class, 65.2% for df class, 89.7% for mel class, 97.8% for nv class and 78.6% for vasc class. Secondly, for the PH2 data set, the SVM-MobileNet showed accuracy for each class as follows: 87.5% for atypical class, 100% for nv class and 100% for mel class.
The SVM-ResNet101 technique produced the confusion matrix shown in Figure 12 for the two datasets, HAM10000 and PH2. First, for the HAM10000 dataset, SVM-ResNet101 achieved accuracy for each class as follows: 80% for akiec class, 86.4% for bcc class, 95.5% for bkl class, 87% for df class, 85.2% for mel class, 98.4% for nv class and 67.9% for vasc class. Secondly, for the PH2 dataset, SVM-ResNet101 gave accuracy for each class as follows: 100% for atypical class, 87.5% for nv class and 93.8% for mel class.
Finally, the results of the hybrid technique based on combining the features of the MobileNet and ResNet101 models are presented. Due to the similarity of skin lesions in their early stages, this technique focused on integrating the features of the MobileNet and ResNet101 models and then classifying them using SVM. The technique yielded good results for the classification of dermoscopic images of the HAM10000 and PH2 datasets, as shown in Table 7. First, for the HAM10000 dataset, SVM-MobileNet-ResNet101 gave an AUC of 98.69%, accuracy of 97%, sensitivity of 91.3%, precision of 91.16% and specificity of 99.26%. Secondly, for the PH2 dataset, SVM-MobileNet-ResNet101 gave an AUC of 97.97%, accuracy of 97.5%, sensitivity of 97.87%, precision of 98.03% and specificity of 98.57%.
The SVM-MobileNet-ResNet101 technique produced the confusion matrix as shown in Figure 13 for the HAM10000 and PH2 datasets. First, for the HAM10000 dataset, the SVM-MobileNet-ResNet101 technique achieved accuracy for each class as follows: 90.8% for akiec class, 91.3% for bcc class, 96.4% for bkl class, 87% for df class, 93.3% for mel class, 98.9% for nv class and 82.1% for vasc class. Secondly, for the PH2 data set, the SVM-MobileNet-ResNet101 technique reached accuracy for each class as follows: 93.8% for atypical class, 100% for nv class and 100% for mel class.

4.6. Results of ANN with Hybrid Features of CNN and Handcrafted

This section discusses the performance of a hybrid technique for classifying the skin lesion images of the two datasets, HAM10000 and PH2, based on extracting features from CNN models, fusing them with handcrafted features and classifying with an ANN. Because the features extracted by the CNN are high-dimensional, they were input to the PCA method to select the most important features before being fused with the handcrafted features. The ANN receives the hybrid features and distributes them into 80% for training and 20% for measuring system performance.
This technique achieved superior results for the classification of dermoscopic images of the HAM10000 and PH2 datasets of skin lesions, as shown in Table 8. The ANN based on hybrid features of MobileNet and handcrafted achieved slightly better results than its performance with hybrid features of ResNet101 and handcrafted features. First, for the HAM10000 dataset, the ANN with fused features of MobileNet-handcrafted gave an AUC of 97.53%, accuracy of 98.4%, sensitivity of 94.46%, precision of 93.44% and specificity of 99.43%. Second, for the PH2 dataset, ANN with fused features of MobileNet-handcrafted gave an AUC of 100%, accuracy of 100%, sensitivity of 100%, precision of 100% and specificity of 100%.
In contrast, the performance of ANN with hybrid features of ResNet101 and handcrafted are shown in Table 9. First, for the HAM10000 dataset, the ANN with fused features of ResNet101-handcrafted gave an AUC of 99.11%, accuracy of 97.60%, sensitivity of 92%, precision of 93.01% and specificity of 99.46%. Second, for the PH2 dataset, ANN with fused features of ResNet101-handcrafted gave an AUC of 99.53%, accuracy of 97.50%, sensitivity of 97.90%, precision of 96.30% and specificity of 98.93%.
The ANN with hybrid features of MobileNet and handcrafted produced the confusion matrix as shown in Figure 14 for the HAM10000 and PH2 datasets. First, for the HAM10000 dataset, ANN achieved accuracy for each class as follows: 90.8% for akiec class, 95.1% for bcc class, 97.3% for bkl class, 100% for df class, 98.7% for mel class, 99.5% for nv class and 82.1% for vasc class. Secondly, for the PH2 data set, ANN gave accuracy for each class as follows: 100% for atypical class, 100% for nv class and 100% for mel class.
The ANN with hybrid features of ResNet101 and handcrafted produced the confusion matrix as shown in Figure 15 for the HAM10000 and PH2 datasets. First, for the HAM10000 dataset, ANN gave an accuracy for each class as follows: 89.2% for akiec class, 91.3% for bcc class, 96.4% for bkl class, 87% for df class, 94.2% for mel class, 99.6% for nv and 85.7% for vasc class. Secondly, for the PH2 data set, ANN gave accuracy for each class as follows: 100% for atypical class, 100% for nv class and 93.8% for mel class.
This section also summarizes some ANN evaluation tools for dermoscopic image analysis to classify the two datasets of skin lesions, HAM10000 and PH2.
The gradient and validation check is one of the ANN measurements used when analyzing dermoscopic images to classify the two skin lesion datasets, HAM10000 and PH2. This tool examines the samples that failed during each epoch. The failure and gradient values for the HAM10000 and PH2 datasets are given in Table 10.
Cross-entropy is an ANN measure used when analyzing dermoscopic images to classify the two skin lesion datasets, HAM10000 and PH2. This tool checks the minimum error between the actual and expected values during each epoch and at each stage of the system’s evaluation on the training, validation and test data. Table 10 shows the best performance of the ANN based on the fused CNN and handcrafted features.
The error histogram is another ANN measurement used when analyzing dermoscopic images to classify the two skin lesion datasets, HAM10000 and PH2. It checks the minimum error between actual and expected values per instance during each epoch and at each stage of the system’s evaluation on the training, validation and test data [36]. Table 10 shows the best performance of the ANN based on the fused CNN and handcrafted features.
Regression is one of the measures of ANN when analyzing dermoscopic images to classify two datasets, HAM10000 and PH2, of skin lesions. The tool calculates continuous values (output) based on other values (target) during each stage of system evaluation of training, validation and test data [37]. Table 10 shows the average value of the ANN regression based on the fused features of CNN and handcrafted.

5. Discussion of the Results of the Systems

Skin lesions are chronic and serious diseases that threaten people’s lives. There is a high rate of recovery from serious skin lesions, such as melanoma, if they are diagnosed early; otherwise, they lead to death if not recognized in time [38]. The characteristics of skin lesions are similar, so it is difficult to identify them in their early stages. Here, several hybrid methods have been developed to analyze images of skin lesions for early diagnosis [39]. Because of artifacts in the dermoscopic images, the images were enhanced with the same optimization methods for all techniques. A data augmentation technique was applied in all methods to address overfitting and the imbalance of the two datasets, HAM10000 and PH2.
The first strategy of dermoscopic image analysis classifies the two skin lesion datasets, HAM10000 and PH2, using the pre-trained MobileNet and ResNet101 models. The second strategy classifies the skin lesions of the two datasets using a hybrid technique combining the CNN models (MobileNet and ResNet101) with an SVM classifier; the MobileNet and ResNet101 models extract features with high accuracy and pass them to PCA to select the important features, which are fed into the SVM classifier for classification. To increase system efficiency, the features of MobileNet and ResNet101 are also integrated and fed into the SVM for high-accuracy classification. The third strategy classifies skin lesions using a hybrid technique based on fused CNN and handcrafted features; the two datasets were classified by an ANN with the features of MobileNet and ResNet101 separately fused with the handcrafted features.
Table 11 and Figure 16 summarize the results of the systems’ classification of the two skin lesion datasets, HAM10000 and PH2. First, for the HAM10000 dataset, ResNet101 achieved the best per-class accuracy for the akiec class (93.8%). The ANN with hybrid MobileNet and handcrafted features achieved the best classification accuracy for the bcc, bkl, df and mel classes, at 95.1%, 97.5%, 100% and 98.7%, respectively. The ANN with hybrid ResNet101 and handcrafted features achieved the best classification accuracy for the nv and vasc classes, at 99.6% and 85.7%, respectively.
Secondly, for the PH2 dataset, the ANN with hybrid MobileNet and handcrafted features achieved the best classification accuracy for the mel, nv and atypical classes, at 100% for each class.

6. Conclusions

Melanoma is one of the deadliest diseases and leads to death if diagnosed only in the late stages. Many artificial intelligence applications help doctors in the early detection of skin lesions. This study aimed to achieve satisfactory results for diagnosing skin lesions, including melanoma, through hybrid techniques based on hybrid features. The challenge of imbalanced dataset classes was also solved. The first technique diagnoses skin lesions with the pre-trained MobileNet and ResNet101 models. The second technique uses the hybrid methods SVM-MobileNet, SVM-ResNet101 and SVM-MobileNet-ResNet101, which achieved satisfactory results. The third technique is the novelty of this study: skin lesions were detected and distinguished into classes by integrating the MobileNet features with the handcrafted features, and the ResNet101 features with the handcrafted features, and then feeding them to the ANN for classification. With the MobileNet and handcrafted features, the ANN achieved an AUC of 97.53%, accuracy of 98.4%, sensitivity of 94.46%, precision of 93.44% and specificity of 99.43%.

Author Contributions

Conceptualization, I.A.A., E.M.S., H.S.A.S. and Z.M.A.; methodology, I.A.A., E.M.S. and M.M.A.A.-A.; software, E.M.S. and I.A.A.; validation, M.M.A.A.-A., H.S.A.S., Z.M.A., E.M.S. and I.A.A.; formal analysis, I.A.A., M.M.A.A.-A., E.M.S. and H.S.A.S.; investigation, I.A.A., E.M.S. and Z.M.A.; resources, Z.M.A., H.S.A.S., E.M.S. and M.M.A.A.-A.; data curation, I.A.A., E.M.S. and Z.M.A.; writing—original draft preparation E.M.S.; writing—review and editing, I.A.A. and H.S.A.S.; visualization, H.S.A.S., M.M.A.A.-A. and I.A.A.; supervision, I.A.A. and E.M.S.; project administration, I.A.A. and E.M.S.; funding acquisition, I.A.A. and H.S.A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been funded by the Deputy for Research and Innovation—Ministry of Education, Kingdom of Saudi Arabia for this research, through Project Code: NU/DRP/SERC/12/19.

Data Availability Statement

In this study, data were collected to support the classification systems of the HAM10000 and PH2 skin lesion datasets, available to the public at the following links: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T and https://www.fc.up.pt/addi/ph2%20database.html (accessed on 17 November 2022).

Acknowledgments

The authors would like to acknowledge the support of the Deputy for Research and Innovation, Ministry of Education, Kingdom of Saudi Arabia, for this research through a grant (NU/IFC/02/002) under the Institutional Funding Committee at Najran University, Kingdom of Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bristow, R.E.; Chang, J.; Ziogas, A.; Campos, B.; Chavez, L.R. Anton-Culver, H. Impact of National Cancer Institute Comprehensive Cancer Centers on ovarian cancer treatment and survival. J. Am. Coll. Surg. 2015, 220, 940–950. Available online: https://www.sciencedirect.com/science/article/pii/S1072751515001258 (accessed on 15 November 2022). [CrossRef] [PubMed] [Green Version]
  2. Ferlay, J.; Colombet, M.; Soerjomataram, I.; Parkin, D.M.; Piñeros, M.; Znaor, A.; Bray, F. Cancer statistics for the year 2020: An overview. Int. J. Cancer 2021, 149, 778–789. [Google Scholar] [CrossRef] [PubMed]
  3. Kadampur, M.A.; Al Riyaee, S. Skin cancer detection: Applying a deep learning based model driven architecture in the cloud for classifying dermal cell images. Inform. Med. Unlocked 2020, 18, 100282. [Google Scholar] [CrossRef]
  4. Bilal, H.; Xiao, Y.; Khan, M.N.; Chen, J.; Wang, Q.; Zeng, Y.; Lin, X. Stabilization of Acne Vulgaris-Associated Microbial Dysbiosis with 2% Supramolecular Salicylic Acid. Pharmaceuticals 2023, 16, 87. [Google Scholar] [CrossRef]
  5. Singaporean Journal of Scientific Research (SJSR). Computer aided melanoma skin cancer detection using artificial neural network classifier. J. Sel. Areas Microelectron. 2016, 8, 35–42. Available online: http://www.sjsronline.com/Papers/Papers/sjsrvol8no22016-5.pdf (accessed on 22 December 2022).
  6. Hosny, K.M.; Kassem, M.A.; Foaud, M.M. Classification of skin lesions using transfer learning and augmentation with Alex-net. PLoS ONE 2019, 14, e0217293. [Google Scholar] [CrossRef] [Green Version]
  7. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef] [PubMed]
  8. Vestergaard, M.E.; Menzies, S.W. Automated diagnostic instruments for cutaneous melanoma. In Seminars in Cutaneous Medicine and Surgery; WB Saunders: Philadelphia, PN, USA, 2008; Volume 27, pp. 32–36. [Google Scholar] [CrossRef]
  9. Codella, N.C.; Gutman, D.; Celebi, M.E.; Helba, B.; Marchetti, M.A.; Dusza, S.W.; Halpern, A. Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (isbi), hosted by the international skin imaging collaboration (isic). In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging, Washington, DC, USA, 4–7 April 2018; IEEE: Piscatvie, NJ, USA, 2018; pp. 168–172. Available online: https://ieeexplore.ieee.org/abstract/document/8363547/ (accessed on 15 November 2022).
  10. Srinivasu, P.N.; SivaSai, J.G.; Ijaz, M.F.; Bhoi, A.K.; Kim, W.; Kang, J.J. Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM. Sensors 2021, 21, 2852. [Google Scholar] [CrossRef] [PubMed]
  11. Alazzam, M.B.; Alassery, F.; Almulihi, A. Diagnosis of melanoma using deep learning. Math. Probl. Eng. 2021, 2021, 1423605. Available online: https://www.hindawi.com/journals/mpe/2021/1423605/ (accessed on 15 November 2022). [CrossRef]
  12. Jaworek-Korjakowska, J.; Brodzicki, A.; Cassidy, B.; Kendrick, C.; Yap, M.H. Interpretability of a Deep Learning Based Approach for the Classification of Skin Lesions into Main Anatomic Body Sites. Cancers 2021, 13, 6048. [Google Scholar] [CrossRef] [PubMed]
  13. Alam, T.M.; Shaukat, K.; Khan, W.A.; Hameed, I.A.; Almuqren, L.A.; Raza, M.A.; Aslam, M.; Luo, S. An Efficient Deep Learning-Based Skin Cancer Classifier for an Imbalanced Dataset. Diagnostics 2022, 12, 2115. [Google Scholar] [CrossRef] [PubMed]
  14. Abunadi, I.; Senan, E.M. Deep Learning and Machine Learning Techniques of Diagnosis Dermoscopy Images for Early Detection of Skin Diseases. Electronics 2021, 10, 3158. [Google Scholar] [CrossRef]
  15. Khamparia, A.; Singh, P.K.; Rani, P.; Samanta, D.; Khanna, A.; Bhushan, B. An internet of health things-driven deep learning framework for detection and classification of skin cancer using transfer learning. Trans. Emerg. Telecommun. Technol. 2021, 32, e3963. [Google Scholar] [CrossRef]
  16. Gouda, W.; Sama, N.U.; Al-Waakid, G.; Humayun, M.; Jhanjhi, N.Z. Detection of Skin Cancer Based on Skin Lesion Images Using Deep Learning. Healthcare 2022, 10, 1183. [Google Scholar] [CrossRef]
  17. Iqbal, I.; Younus, M.; Walayat, K.; Kakar, M.U.; Ma, J. Automated multi-class classification of skin lesions through deep convolutional neural network with dermoscopic images. Comput. Med. Imaging Graph. 2021, 88, 101843. [Google Scholar] [CrossRef]
  18. Murugan, A.; Nair, S.A.H.; Preethi, A.A.P.; Kumar, K.S. Diagnosis of skin cancer using machine learning techniques. Microprocess. Microsyst. 2021, 81, 103727. [Google Scholar] [CrossRef]
  19. Khan, M.A.; Sharif, M.; Akram, T.; Damaševičius, R.; Maskeliūnas, R. Skin Lesion Segmentation and Multiclass Classification Using Deep Learning Features and Improved Moth Flame Optimization. Diagnostics 2021, 11, 811. [Google Scholar] [CrossRef] [PubMed]
  20. Liu, L.; Tsui, Y.Y.; Mandal, M. Skin Lesion Segmentation Using Deep Learning with Auxiliary Task. J. Imaging 2021, 7, 67. [Google Scholar] [CrossRef] [PubMed]
  21. Adegun, A.A.; Viriri, S.; Yousaf, M.H. A Probabilistic-Based Deep Learning Model for Skin Lesion Segmentation. Appl. Sci. 2021, 11, 3025. [Google Scholar] [CrossRef]
  22. Afza, F.; Sharif, M.; Khan, M.A.; Tariq, U.; Yong, H.-S.; Cha, J. Multiclass Skin Lesion Classification Using Hybrid Deep Features Selection and Extreme Learning Machine. Sensors 2022, 22, 799. [Google Scholar] [CrossRef]
  23. Tschandl, P.; Rinner, C.; Apalla, Z.; Argenziano, G.; Codella, N.; Halpern, A.; Kittler, H. Human–computer collaboration for skin cancer recognition. Nat. Med. 2020, 26, 1229–1234. [Google Scholar] [CrossRef]
  24. ADDI—Automatic Computer-Based Diagnosis System for Dermoscopy Images. Available online: https://www.fc.up.pt/addi/ph2%20database.html (accessed on 30 December 2022).
  25. Ahmed, I.A.; Senan, E.M.; Rassem, T.H.; Ali, M.A.; Shatnawi, H.S.A.; Alwazer, S.M.; Alshahrani, M. Eye Tracking-Based Diagnosis and Early Detection of Autism Spectrum Disorder Using Machine Learning and Deep Learning Techniques. Electronics 2022, 11, 530. [Google Scholar] [CrossRef]
  26. Lyakhov, P.A.; Lyakhova, U.A.; Nagornov, N.N. System for the Recognizing of Pigmented Skin Lesions with Fusion and Analysis of Heterogeneous Data Based on a Multimodal Neural Network. Cancers 2022, 14, 1819. [Google Scholar] [CrossRef] [PubMed]
  27. Fati, S.M.; Senan, E.M.; ElHakim, N. Deep and Hybrid Learning Technique for Early Detection of Tuberculosis Based on X-ray Images Using Feature Fusion. Appl. Sci. 2022, 12, 7092. [Google Scholar] [CrossRef]
  28. Al-Mekhlafi, Z.G.; Senan, E.M.; Rassem, T.H.; Mohammed, B.A.; Makbol, N.M.; Alanazi, A.A.; Ghaleb, F.A. Deep Learning and Machine Learning for Early Detection of Stroke and Haemorrhage. Comput. Mater. Contin. 2022, 72, 775–796. Available online: http://eprints.bournemouth.ac.uk/36721/ (accessed on 15 November 2022). [CrossRef]
  29. Fati, S.M.; Senan, E.M.; Azar, A.T. Hybrid and Deep Learning Approach for Early Diagnosis of Lower Gastrointestinal Diseases. Sensors 2022, 22, 4079. [Google Scholar] [CrossRef]
  30. Mohammed, B.A.; Senan, E.M.; Al-Mekhlafi, Z.G.; Rassem, T.H.; Makbol, N.M.; Alanazi, A.A.; Almurayziq, T.S.; Ghaleb, F.A.; Sallam, A.A. Multi-Method Diagnosis of CT Images for Rapid Detection of Intracranial Hemorrhages Based on Deep and Hybrid Learning. Electronics 2022, 11, 2460. [Google Scholar] [CrossRef]
  31. Mohammed, B.A.; Senan, E.M.; Rassem, T.H.; Makbol, N.M.; Alanazi, A.A.; Al-Mekhlafi, Z.G.; Almurayziq, T.S.; Ghaleb, F.A. Multi-Method Analysis of Medical Records and MRI Images for Early Diagnosis of Dementia and Alzheimer’s Disease Based on Deep Learning and Hybrid Methods. Electronics 2021, 10, 2860. [Google Scholar] [CrossRef]
  32. Al-Mekhlafi, Z.G.; Senan, E.M.; Mohammed, B.A.; Alazmi, M.; Alayba, A.M.; Alreshidi, A.; Alshahrani, M. Diagnosis of Histopathological Images to Distinguish Types of Malignant Lymphomas Using Hybrid Techniques Based on Fusion Features. Electronics 2022, 11, 2865. [Google Scholar] [CrossRef]
  33. Senan, E.M.; Jadhav, M.E. Diagnosis of dermoscopy images for the detection of skin lesions using SVM and KNN. In Proceedings of the Third International Conference on Sustainable Computing, Jaipur, India, 19–20 March 2021; Springer: Singapore, 2022; pp. 125–134. [Google Scholar] [CrossRef]
34. Senan, E.M.; Jadhav, M.E.; Kadam, A. Classification of PH2 images for early detection of skin diseases. In Proceedings of the 2021 6th International Conference for Convergence in Technology (I2CT), Mumbai, India, 2–4 April 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–7. [Google Scholar] [CrossRef]
35. Senan, E.M.; Jadhav, M.E. Techniques for the Detection of Skin Lesions in PH2 Dermoscopy Images Using Local Binary Pattern (LBP). In Proceedings of the International Conference on Recent Trends in Image Processing and Pattern Recognition, Aurangabad, India, 3–4 January 2020; Springer: Singapore, 2020; pp. 14–25. [Google Scholar] [CrossRef]
  36. Senan, E.M.; Abunadi, I.; Jadhav, M.E.; Fati, S.M. Score and Correlation Coefficient-Based Feature Selection for Predicting Heart Failure Diagnosis by Using Machine Learning Algorithms. Comput. Math. Methods Med. 2021, 2021, 8500314. [Google Scholar] [CrossRef] [PubMed]
  37. Senan, E.M.; Mohammed, J.M.E.; Rassem, T.H.; Aljaloud, A.S.; Mohammed, B.A.; Al-Mekhlafi, Z.G. Early Diagnosis of Brain Tumour MRI Images Using Hybrid Techniques between Deep and Machine Learning. Comput. Math. Methods Med. 2022, 2022, 8330833. [Google Scholar] [CrossRef] [PubMed]
  38. Mohammed, B.A.; Senan, E.M.; Al-Mekhlafi, Z.G.; Alazmi, M.; Alayba, A.M.; Alanazi, A.A.; Alreshidi, A.; Alshahrani, M. Hybrid Techniques for Diagnosis with WSIs for Early Detection of Cervical Cancer Based on Fusion Features. Appl. Sci. 2022, 12, 8836. [Google Scholar] [CrossRef]
  39. Fraiwan, M.; Faouri, E. On the Automatic Detection and Classification of Skin Cancer Using Deep Transfer Learning. Sensors 2022, 22, 4963. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The methodological framework for dermoscopic image analysis to classify the HAM10000 and PH2 datasets for early detection of skin lesions.
Figure 2. Samples of dermoscopic images for all classes of the HAM10000 data set (a) before enhancement and (b) after enhancement.
Figure 3. Samples of dermoscopic images for all classes of the PH2 data set (a) before enhancement and (b) after enhancement.
Figure 4. Sample dermoscopic images of skin lesions before and after hair removal.
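Hair occlusion (Figure 4) is typically removed before feature extraction. Below is a minimal sketch of a DullRazor-style hair-removal step, assuming OpenCV; the kernel size, threshold and inpainting radius are illustrative choices, not necessarily the settings used in this study.

```python
import cv2
import numpy as np

def remove_hair(image_bgr: np.ndarray) -> np.ndarray:
    """Detect thin dark hair structures and inpaint them away."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Black-hat morphology highlights dark, hair-like structures.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 17))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)
    # Binarize the hair mask and fill the masked pixels by inpainting.
    _, mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
    return cv2.inpaint(image_bgr, mask, 3, cv2.INPAINT_TELEA)
```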
Figure 5. Structure of the hybrid technique for dermoscopic image analysis of the HAM10000 and PH2 datasets.
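The hybrid technique in Figure 5 replaces the CNN's classification layers with an SVM. The sketch below is one way to reproduce it with a pre-trained MobileNet feature extractor; the input size, average pooling, PCA dimensionality and RBF kernel are assumptions for illustration rather than the exact configuration reported here.

```python
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import preprocess_input
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Pre-trained backbone with the classifier head removed; global average
# pooling yields one 1024-dimensional feature vector per image.
backbone = MobileNet(weights="imagenet", include_top=False,
                     pooling="avg", input_shape=(224, 224, 3))

def deep_features(images: np.ndarray) -> np.ndarray:
    # images: float array of shape (n, 224, 224, 3) with values in [0, 255]
    return backbone.predict(preprocess_input(images.copy()), verbose=0)

# Assumed workflow (X_train, y_train, X_test are user-supplied):
# feats = deep_features(X_train)
# pca = PCA(n_components=512).fit(feats)          # dimensionality reduction
# svm = SVC(kernel="rbf").fit(pca.transform(feats), y_train)
# preds = svm.predict(pca.transform(deep_features(X_test)))
```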
Figure 6. The basic structure of the ANN for the classification of dermoscopic images of the two datasets, HAM10000 and PH2.
Figure 7. Methodology for classifying dermoscopic images from the HAM10000 and PH2 datasets by ANN with hybrid features from CNN models and handcrafted descriptors.
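A hedged sketch of the fusion step in Figure 7: handcrafted color and texture descriptors, here an LBP histogram and per-channel color histograms standing in for the paper's color/texture/shape features, are concatenated with CNN features and fed to an ANN, approximated by scikit-learn's MLPClassifier.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neural_network import MLPClassifier

def handcrafted_features(rgb: np.ndarray) -> np.ndarray:
    """Illustrative texture + color descriptor for one RGB image."""
    gray = rgb.mean(axis=2).astype(np.uint8)
    # Texture: histogram of uniform LBP codes (values 0..9 for P=8).
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    # Color: 16-bin intensity histogram per RGB channel.
    col_hist = np.concatenate(
        [np.histogram(rgb[..., c], bins=16, range=(0, 255), density=True)[0]
         for c in range(3)])
    return np.concatenate([lbp_hist, col_hist])

# Assumed fusion (cnn_feats, images and labels y are user-supplied):
# hand = np.array([handcrafted_features(im) for im in images])
# fused = np.hstack([cnn_feats, hand])
# ann = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500).fit(fused, y)
```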
Figure 8. Sample dermoscopic images of skin lesions after data augmentation.
Figure 9. Confusion matrices for the classification of the two skin lesion datasets by MobileNet: (a) HAM10000 and (b) PH2.
Figure 10. Confusion matrices for the classification of the two skin lesion datasets by the ResNet101 model: (a) HAM10000 and (b) PH2.
Figure 11. Confusion matrices for the classification of the two skin lesion datasets by SVM with MobileNet features: (a) HAM10000 and (b) PH2.
Figure 12. Confusion matrices for the classification of the two skin lesion datasets by SVM with ResNet101 features: (a) HAM10000 and (b) PH2.
Figure 13. Confusion matrices for the classification of the two skin lesion datasets by SVM with fused features of MobileNet and ResNet101: (a) HAM10000 and (b) PH2.
Figure 14. Confusion matrices for the classification of the two skin lesion datasets by ANN with fused MobileNet and handcrafted features: (a) HAM10000 and (b) PH2.
Figure 15. Confusion matrices for the classification of the two skin lesion datasets by ANN with fused ResNet101 and handcrafted features: (a) HAM10000 and (b) PH2.
Figure 16. Performance of all systems for dermoscopic image analysis in classifying the two skin lesion datasets, HAM10000 and PH2.
Table 1. Splitting the HAM10000 and PH2 data sets of skin lesions (80% of each dataset is split 80:20 into training and validation; the remaining 20% is reserved for testing).

| Class | HAM10000 Training | HAM10000 Validation | HAM10000 Testing | PH2 Training | PH2 Validation | PH2 Testing |
|---|---|---|---|---|---|---|
| Melanocytic nevi (nv) | 4291 | 1073 | 1341 | 51 | 13 | 16 |
| Atypical | - | - | - | 51 | 13 | 16 |
| Melanoma (mel) | 712 | 178 | 223 | 26 | 6 | 8 |
| Benign keratosis lesions (bkl) | 703 | 176 | 220 | - | - | - |
| Basal cell carcinoma (bcc) | 329 | 82 | 103 | - | - | - |
| Actinic keratoses (akiec) | 210 | 52 | 65 | - | - | - |
| Vascular (vasc) | 91 | 23 | 28 | - | - | - |
| Dermatofibroma (df) | 74 | 18 | 23 | - | - | - |
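The counts in Table 1 follow a nested split: 20% of each dataset is held out for testing, and the remaining 80% is split again 80:20 into training and validation. A sketch with scikit-learn (the random seed is an assumption) reproduces the arithmetic:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the 1113 HAM10000 melanoma samples; only the split
# sizes matter here, so labels and stratification are omitted.
X = np.arange(1113)

# Hold out 20% for testing, then split the remaining 80% again 80:20.
X_dev, X_test = train_test_split(X, test_size=0.20, random_state=0)
X_train, X_val = train_test_split(X_dev, test_size=0.20, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 712 178 223, as in Table 1
```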
Table 2. Balancing and augmentation of dermoscopic image data for the two skin lesion datasets, HAM10000 and PH2 (training sets only).

| Dataset | Class | Before augmentation | After augmentation |
|---|---|---|---|
| HAM10000 | nv | 4291 | 4291 |
| HAM10000 | mel | 712 | 4272 |
| HAM10000 | bkl | 703 | 4218 |
| HAM10000 | bcc | 329 | 4277 |
| HAM10000 | akiec | 210 | 4200 |
| HAM10000 | vasc | 91 | 4095 |
| HAM10000 | df | 74 | 3700 |
| PH2 | nv | 51 | 1530 |
| PH2 | atypical | 51 | 1530 |
| PH2 | mel | 26 | 1300 |
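Table 2 balances the training sets by augmenting minority classes more heavily. A minimal sketch using Keras's ImageDataGenerator is shown below; the specific transforms are assumptions that mirror common dermoscopy augmentation rather than the authors' exact settings.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Geometric augmentations that preserve lesion appearance.
augmenter = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=True,
)

# df is the rarest HAM10000 class: its 74 training images need ~50 augmented
# copies each to reach the 3700 samples listed in Table 2.
# df_images: float array of shape (74, H, W, 3), assumed given.
# batches = augmenter.flow(df_images, batch_size=74, shuffle=False)
# augmented = np.concatenate([next(batches) for _ in range(50)])  # (3700, H, W, 3)
```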
Table 3. Performance of MobileNet for dermoscopic image analysis to classify skin lesions in the HAM10000 and PH2 datasets.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 92.5 | 87.7 | 88.4 | 90.5 | 99.6 |
| HAM10000 | bcc | 93.4 | 89.3 | 89.3 | 88.5 | 98.7 |
| HAM10000 | bkl | 96.2 | 93.6 | 94.1 | 97.6 | 99.7 |
| HAM10000 | df | 94.6 | 91.3 | 90.8 | 100 | 99.6 |
| HAM10000 | mel | 96.1 | 96.9 | 97.2 | 64.5 | 93.2 |
| HAM10000 | nv | 95.7 | 91.5 | 90.6 | 98.4 | 97.4 |
| HAM10000 | vasc | 94.6 | 71.4 | 91.3 | 90.9 | 99.5 |
| HAM10000 | average | 94.73 | 91.80 | 91.67 | 90.06 | 98.24 |
| PH2 | atypical | 92.2 | 87.5 | 88.2 | 100 | 99.6 |
| PH2 | mel | 94.5 | 100 | 99.7 | 72.7 | 91.4 |
| PH2 | nv | 91.9 | 87.5 | 88.4 | 93.3 | 95.8 |
| PH2 | average | 92.87 | 90.00 | 92.10 | 88.67 | 95.60 |
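The per-class values in Tables 3–9 can be derived one-vs-rest from the multi-class confusion matrices of Figures 9–15; the sketch below shows the standard arithmetic: sensitivity = TP/(TP + FN), precision = TP/(TP + FP), specificity = TN/(TN + FP).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_metrics(y_true, y_pred, labels):
    """One-vs-rest sensitivity, precision and specificity per class."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    for i, name in enumerate(labels):
        tp = cm[i, i]
        fn = cm[i, :].sum() - tp          # same row: missed cases
        fp = cm[:, i].sum() - tp          # same column: false alarms
        tn = cm.sum() - tp - fn - fp      # everything else
        print(f"{name}: sensitivity={100 * tp / (tp + fn):.1f}%, "
              f"precision={100 * tp / (tp + fp):.1f}%, "
              f"specificity={100 * tn / (tn + fp):.1f}%")
```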
Table 4. Performance of ResNet101 for dermoscopic image analysis to classify skin lesions in the HAM10000 and PH2 datasets.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 95.4 | 93.8 | 94.3 | 87.1 | 99.5 |
| HAM10000 | bcc | 93.7 | 92.2 | 91.6 | 92.2 | 99.6 |
| HAM10000 | bkl | 96.6 | 93.2 | 93.4 | 96.2 | 99.5 |
| HAM10000 | df | 93.6 | 91.3 | 90.8 | 95.5 | 99.8 |
| HAM10000 | mel | 96.1 | 93.3 | 93.2 | 94.2 | 93.4 |
| HAM10000 | nv | 98.2 | 91.1 | 90.7 | 98.8 | 97.8 |
| HAM10000 | vasc | 91.5 | 71.4 | 71.2 | 57.1 | 98.6 |
| HAM10000 | average | 95.01 | 91.40 | 89.31 | 88.73 | 98.31 |
| PH2 | atypical | 94.9 | 93.8 | 94.1 | 100 | 99.6 |
| PH2 | mel | 90.8 | 87.5 | 87.7 | 87.5 | 97.2 |
| PH2 | nv | 93.7 | 93.8 | 94.4 | 88.2 | 91.9 |
| PH2 | average | 93.13 | 92.50 | 92.07 | 91.90 | 96.23 |
Table 5. Performance of SVM with MobileNet features for dermoscopic image analysis to classify HAM10000 and PH2 dataset skin lesions.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 92.1 | 84.6 | 85.4 | 87.3 | 99.5 |
| HAM10000 | bcc | 97.6 | 91.3 | 90.8 | 93.1 | 99.6 |
| HAM10000 | bkl | 98.2 | 93.6 | 93.7 | 90.4 | 98.7 |
| HAM10000 | df | 76.4 | 65.2 | 65.4 | 34.9 | 99.2 |
| HAM10000 | mel | 96.4 | 89.7 | 90.3 | 95.2 | 99.1 |
| HAM10000 | nv | 99.1 | 97.8 | 98.2 | 98.3 | 97.4 |
| HAM10000 | vasc | 88.2 | 78.6 | 79.4 | 91.7 | 99.6 |
| HAM10000 | average | 92.57 | 95.00 | 86.17 | 84.41 | 99.01 |
| PH2 | atypical | 95.6 | 87.5 | 88.1 | 100 | 99.5 |
| PH2 | mel | 96.1 | 100 | 99.7 | 88.9 | 97.2 |
| PH2 | nv | 97.8 | 100 | 99.5 | 94.1 | 96.4 |
| PH2 | average | 96.50 | 95.00 | 95.77 | 94.33 | 97.70 |
Table 6. Performance of SVM with ResNet101 features for dermoscopic image analysis to classify HAM10000 and PH2 dataset skin lesions.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 84.2 | 80 | 80.4 | 77.6 | 99.2 |
| HAM10000 | bcc | 87.5 | 86.4 | 85.6 | 83.2 | 98.8 |
| HAM10000 | bkl | 94.3 | 95.5 | 94.7 | 88.6 | 97.9 |
| HAM10000 | df | 91.2 | 87 | 87.2 | 90.9 | 99.5 |
| HAM10000 | mel | 89.4 | 85.2 | 85.1 | 91.8 | 99.3 |
| HAM10000 | nv | 97.6 | 98.4 | 97.7 | 98.9 | 98.1 |
| HAM10000 | vasc | 86.2 | 67.9 | 68.2 | 65.5 | 98.8 |
| HAM10000 | average | 90.06 | 94.80 | 85.56 | 85.21 | 98.80 |
| PH2 | atypical | 98.1 | 100 | 99.7 | 94.1 | 96.2 |
| PH2 | mel | 93.5 | 87.5 | 88.4 | 100 | 99.6 |
| PH2 | nv | 95.5 | 93.8 | 94.1 | 93.8 | 95.8 |
| PH2 | average | 95.70 | 95.00 | 94.07 | 95.97 | 97.20 |
Table 7. Performance of SVM with fused features of MobileNet and ResNet101 for dermoscopic image analysis to classify HAM10000 and PH2 dataset skin lesions.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 98.4 | 90.8 | 90.5 | 89.4 | 99.5 |
| HAM10000 | bcc | 99.1 | 91.3 | 91.2 | 89.5 | 99.1 |
| HAM10000 | bkl | 98.4 | 96.4 | 95.8 | 96.8 | 99.5 |
| HAM10000 | df | 98.6 | 87 | 87.3 | 87 | 99.7 |
| HAM10000 | mel | 98.1 | 93.3 | 93.2 | 94.4 | 98.8 |
| HAM10000 | nv | 98.9 | 98.9 | 98.7 | 98.9 | 98.6 |
| HAM10000 | vasc | 99.3 | 82.1 | 82.4 | 82.1 | 99.6 |
| HAM10000 | average | 98.69 | 97.00 | 91.30 | 91.16 | 99.26 |
| PH2 | atypical | 97.6 | 93.8 | 94.3 | 100 | 99.8 |
| PH2 | mel | 98.4 | 100 | 99.5 | 100 | 99.5 |
| PH2 | nv | 97.9 | 100 | 99.8 | 94.1 | 96.4 |
| PH2 | average | 97.97 | 97.50 | 97.87 | 98.03 | 98.57 |
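The fusion in Table 7 can be implemented as serial concatenation of the two CNN feature vectors before the SVM. A brief sketch, assuming the standard output sizes of the two backbones after global average pooling:

```python
import numpy as np
from sklearn.svm import SVC

# mobilenet_feats: (n, 1024) and resnet101_feats: (n, 2048), extracted as in
# the earlier MobileNet sketch; y_train holds the lesion labels.
# fused = np.hstack([mobilenet_feats, resnet101_feats])   # (n, 3072)
# svm = SVC(kernel="rbf").fit(fused, y_train)
```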
Table 8. Performance of ANN with fused features of MobileNet and handcrafted for dermoscopic image analysis to classify HAM10000 and PH2 dataset skin lesions.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 96.7 | 90.8 | 91.2 | 95.2 | 99.5 |
| HAM10000 | bcc | 98.2 | 95.1 | 94.6 | 96.1 | 99.6 |
| HAM10000 | bkl | 96.6 | 97.3 | 96.7 | 98.2 | 99.5 |
| HAM10000 | df | 98.2 | 100 | 99.5 | 95.8 | 99.7 |
| HAM10000 | mel | 99.3 | 98.7 | 98.7 | 97.3 | 99.6 |
| HAM10000 | nv | 99.5 | 99.5 | 98.1 | 99.6 | 98.7 |
| HAM10000 | vasc | 94.2 | 82.1 | 82.4 | 71.9 | 99.4 |
| HAM10000 | average | 97.53 | 98.40 | 94.46 | 93.44 | 99.43 |
| PH2 | atypical | 100 | 100 | 100 | 100 | 100 |
| PH2 | mel | 100 | 100 | 100 | 100 | 100 |
| PH2 | nv | 100 | 100 | 100 | 100 | 100 |
| PH2 | average | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Table 9. Performance of ANN with fused features of ResNet101 and handcrafted for dermoscopic image analysis to classify HAM10000 and PH2 dataset skin lesions.

| Dataset | Class | AUC % | Accuracy % | Sensitivity % | Precision % | Specificity % |
|---|---|---|---|---|---|---|
| HAM10000 | akiec | 99.1 | 89.2 | 89.2 | 90.6 | 99.6 |
| HAM10000 | bcc | 99.6 | 91.3 | 91.4 | 92.2 | 99.5 |
| HAM10000 | bkl | 98.7 | 96.4 | 96.2 | 97.2 | 99.7 |
| HAM10000 | df | 99.4 | 87 | 87.4 | 87 | 100 |
| HAM10000 | mel | 98.7 | 94.2 | 93.8 | 96.3 | 99.5 |
| HAM10000 | nv | 98.9 | 99.6 | 99.5 | 98.9 | 98.2 |
| HAM10000 | vasc | 99.4 | 85.7 | 86.5 | 88.9 | 99.7 |
| HAM10000 | average | 99.11 | 97.60 | 92.00 | 93.01 | 99.46 |
| PH2 | atypical | 99.5 | 100 | 99.7 | 100 | 99.9 |
| PH2 | mel | 99.7 | 100 | 99.6 | 88.9 | 97.4 |
| PH2 | nv | 99.4 | 93.8 | 94.4 | 100 | 99.5 |
| PH2 | average | 99.53 | 97.50 | 97.90 | 96.30 | 98.93 |
Table 10. ANN performance measures for classification of the HAM10000 and PH2 datasets for skin lesions.

| Dataset | Features | Gradient and Validation Check | Cross-Entropy | Error Histogram | Regression % |
|---|---|---|---|---|---|
| HAM10000 | MobileNet with handcrafted | 0.0065521 at epoch 47 | 0.049723 at epoch 41 | −0.9382 to 0.9502 | 98.11 |
| HAM10000 | ResNet101 with handcrafted | 0.009407 at epoch 44 | 0.057363 at epoch 38 | −0.9498 to 0.9499 | 97.88 |
| PH2 | MobileNet with handcrafted | 0.00090003 at epoch 46 | 0.02453 at epoch 40 | −0.9062 to 0.9067 | 95 |
| PH2 | ResNet101 with handcrafted | 0.026715 at epoch 23 | 0.015564 at epoch 17 | −0.9363 to 0.9387 | 97.04 |
Table 11. Results of all systems for classifying the two skin lesion datasets, HAM10000 and PH2 (per-class accuracy, %).

| Dataset | Technique | Features | akiec | bcc | bkl | df | mel | nv | vasc | atypical | Accuracy % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HAM10000 | MobileNet | - | 87.7 | 89.3 | 93.6 | 91.3 | 96.9 | 91.5 | 71.5 | - | 91.8 |
| HAM10000 | ResNet-101 | - | 93.8 | 92.2 | 93.2 | 91.3 | 93.3 | 91.3 | 71.4 | - | 91.4 |
| HAM10000 | SVM | MobileNet-PCA | 84.6 | 91.3 | 93.6 | 65.2 | 89.7 | 97.8 | 78.6 | - | 95 |
| HAM10000 | SVM | ResNet101-PCA | 80 | 86.4 | 95.5 | 87 | 85.2 | 98.4 | 67.9 | - | 94.8 |
| HAM10000 | SVM | Fused MobileNet with ResNet-101 | 90.8 | 91.3 | 96.4 | 87 | 93.3 | 98.9 | 82.1 | - | 97 |
| HAM10000 | ANN | Fused MobileNet and handcrafted | 90.8 | 95.1 | 97.5 | 100 | 98.7 | 99.5 | 82.1 | - | 98.4 |
| HAM10000 | ANN | Fused ResNet-101 and handcrafted | 89.2 | 91.3 | 96.4 | 87 | 94.2 | 99.6 | 85.7 | - | 97.6 |
| PH2 | MobileNet | - | - | - | - | - | 100 | 87.5 | - | 87.5 | 90 |
| PH2 | ResNet-101 | - | - | - | - | - | 87.5 | 93.8 | - | 93.8 | 92.5 |
| PH2 | SVM | MobileNet-PCA | - | - | - | - | 100 | 100 | - | 87.5 | 95 |
| PH2 | SVM | ResNet101-PCA | - | - | - | - | 87.5 | 93.8 | - | 100 | 95 |
| PH2 | SVM | Fused MobileNet with ResNet-101 | - | - | - | - | 100 | 100 | - | 93.8 | 97.5 |
| PH2 | ANN | Fused MobileNet and handcrafted | - | - | - | - | 100 | 100 | - | 100 | 100 |
| PH2 | ANN | Fused ResNet-101 and handcrafted | - | - | - | - | 100 | 93.8 | - | 100 | 97.5 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
