Deep and Hybrid Learning Techniques for Diagnosing Microscopic Blood Samples for Early Detection of White Blood Cell Diseases

Almurayziq, Tariq S.; Senan, Ebrahim Mohammed; Mohammed, Badiea Abdulkarem; Al-Mekhlafi, Zeyad Ghaleb; Alshammari, Gharbi; Alshammari, Abdullah; Alturki, Mansoor; Albaker, Abdullah

doi:10.3390/electronics12081853


by Tariq S. Almurayziq ¹,*, Ebrahim Mohammed Senan ², Badiea Abdulkarem Mohammed ³, Zeyad Ghaleb Al-Mekhlafi ¹, Gharbi Alshammari ¹, Abdullah Alshammari ¹, Mansoor Alturki ⁴ and Abdullah Albaker ⁴

¹ Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Ha’il 81481, Saudi Arabia
² Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana’a, Yemen
³ Department of Computer Engineering, College of Computer Science and Engineering, University of Ha’il, Ha’il 81481, Saudi Arabia
⁴ Department of Electrical Engineering, College of Engineering, University of Ha’il, Ha’il 81481, Saudi Arabia
* Author to whom correspondence should be addressed.

Electronics 2023, 12(8), 1853; https://doi.org/10.3390/electronics12081853

Submission received: 13 February 2023 / Revised: 10 April 2023 / Accepted: 11 April 2023 / Published: 13 April 2023

(This article belongs to the Section Artificial Intelligence)


Abstract:

The immune system is one of the most critical systems in humans that resists all diseases and protects the body from viruses, bacteria, etc. White blood cells (WBCs) play an essential role in the immune system. To diagnose blood diseases, doctors analyze blood samples to characterize the features of WBCs. The characteristics of WBCs are determined based on the chromatic, geometric, and textural characteristics of the WBC nucleus. Manual diagnosis is subject to many errors and differing opinions of experts and takes a long time; however, artificial intelligence techniques can help to solve all these challenges. Determining the type of WBC using automatic diagnosis helps hematologists to identify different types of blood diseases. This work aims to overcome manual diagnosis by developing automated systems for classifying microscopic blood sample datasets for the early detection of diseases in WBCs. Several proposed systems were used: first, neural network algorithms, such as artificial neural networks (ANNs) and feed-forward neural networks (FFNNs), were applied to diagnose the dataset based on the features extracted using the hybrid method between two algorithms, the local binary pattern (LBP) and gray-level co-occurrence matrix (GLCM). All algorithms attained superior accuracy for WBC diagnosis. Second, the pre-trained convolutional neural network (CNN) models AlexNet, ResNet-50, GoogLeNet, and ResNet-18 were applied for the early detection of WBC diseases. All models attained exceptional results in the early detection of WBC diseases. Third, the hybrid technique was applied, consisting of a pair of blocks: the CNN models block for extracting deep features and the SVM algorithm block for the classification of deep features with superior accuracy and efficiency. These hybrid techniques are named AlexNet with SVM, ResNet-50 with SVM, GoogLeNet with SVM, and ResNet-18 with SVM. 
All techniques achieved promising results when diagnosing the dataset for the early detection of WBC diseases. The ResNet-50 model achieved an accuracy of 99.3%, a precision of 99.5%, a sensitivity of 99.25%, a specificity of 99.75%, and an AUC of 99.99%.

Keywords:

CNN; neural networks; WBCs; hybrid techniques; SVM; LBP; GLCM

1. Introduction

Blood contains a great amount of information that can be used to evaluate and analyze a person’s health. It consists of 45% red blood cells (RBCs) and 55% plasma, in addition to less than 1% white blood cells (WBCs) and platelets [1]. Plasma transports nutrients, proteins, hormones, and minerals to all body parts through blood vessels and removes harmful elements in the form of waste. The RBC count ranges between four and six million per microliter. Hemoglobin is the most important component of RBCs, carrying oxygen to all regions of the body [2]. Platelets are responsible for blood clotting and range from 150,000 to 450,000 per microliter in a normal person [3]. The WBC count ranges from 4500 to 11,000 per microliter in a normal person. WBCs are the main cells of the immune system; they provide the body with immunity to resist diseases and protect it from infections caused by fungi, bacteria, and viruses [4]. WBCs are created by the bone marrow, lymphoid tissue, and some important glands. There are five types of WBCs: eosinophils, lymphocytes, monocytes, neutrophils, and basophils. An increase or decrease in the count of these WBCs causes various chronic and fatal diseases [5]. Additionally, many conditions, such as bacterial infections, leukemia, immunodeficiency syndrome, and other infections, appear due to an increase or decrease in four types of WBC: eosinophils, lymphocytes, monocytes, and neutrophils. Few images are available for basophils due to their low incidence (0–3%). Due to this lack of images and their low importance, we focus on the other four types of WBCs in this work. Increased neutrophil cells in the blood can occur due to bacteria, endotoxins, exotoxins, and fungi [6,7]. Lymphocyte cells increase due to diseases such as hepatitis, whooping cough, viruses, bordetella, leukemia, and brucellosis; other diseases, such as HIV, chickenpox, rubeola, and tuberculosis, reduce lymphocyte cells. 
Malaria, listeriosis, and viral and bacterial infections cause an increase in monocyte cells [8]. Eosinophil cells increase due to diseases such as allergies, parasites, and atopic diseases [9]. The WBC type and number are diagnosed with a blood test (hemogram) on a peripheral smear. The examination is based on spreading blood on a microscope slide and then evaluating the WBCs under the microscope [10]. For ease in the diagnosis of blood samples under a microscope, WBCs are divided into granulocytes, which are polymorphonuclear (eosinophils, neutrophils, and basophils), and agranulocytes, which are mononuclear (lymphocytes and monocytes). Common procedures for counting WBC types include determining the type and location of cells in the micrographs and the shape and color of each cell. Thus, the process requires great effort, takes a long time, and is prone to errors because blood experts differ in their diagnostic opinions. Computer-aided automated diagnostic systems can help to solve these challenges, reduce manual errors, and obtain reliable diagnostic accuracy [11].

The following are the main contributions of this work:

Enhanced all images using overlapping average and Laplacian filters to produce high-quality images.
Extracted texture features using LBP and GLCM algorithms, combined all features into feature vectors, and classified them using ANN and FFNN networks.
Adjusted and modified the parameters of CNN models to extract deep feature maps for highly accurate diagnostic performance.
Applied hybrid techniques, including two parts of CNN models for extracting deep features and SVM for classifying deep features to obtain highly efficient diagnostic performance.
Developed high-performance diagnostic systems for diagnosing microscopic blood sample types for early detection of WBC diseases to help hematologists in decision-making.

The rest of the work is organized as follows: Section 2 describes a set of related works. Section 3 analyzes the techniques and materials for processing and exploring a dataset. Section 4 presents the outcomes for all systems proposed. Section 5 presents a discussion. Section 6 concludes the work.

2. Related Work

Many researchers have conducted a great number of studies on the discovery of WBCs, their colors, shape, and texture. However, there are still shortcomings and challenges in the process of segmentation and extracting features with high accuracy, which were addressed in this study with image enhancement using overlapping filters and extracting features using hybrid methods.

Hedge et al. presented an algorithm for detecting WBCs using light color difference. They implemented a hybrid technique between SVM and a neural network to classify the characteristics of shape, texture, and color features [12]. Arslan et al. proposed an algorithm based on the peripheral blood smear; the algorithm achieved an accuracy of 94% for WBC segmentation [13]. Prinyakupt et al. proposed a method for preimage processing, segmentation of WBC cells, feature extraction, and classification. Their system achieved good accuracy in diagnosing eosinophil, lymphocyte, monocyte, neutrophil, and basophil images [14]. Rezatofighi et al. presented the snake and Gram–Schmidt orthogonalization algorithm for separating the cytoplasm and nucleus of cells. With this algorithm, selected features are selectively extracted from the cytoplasm and nucleus. They evaluated the features selected with ANN and SVM, and the two classifiers achieved good results [15]. Der-Chen et al. discussed the herbaceous method for diagnosing WBC nuclei. The GLCM algorithm was applied to extract the essential features, and the PCA algorithm was used to reduce the dimensions of the extracted features. Finally, the features were classified, based on the genetic method, using the k-mean clustering algorithm [16]. Nazlıbilek et al. presented a method for detecting the region of pixels and longitudinal features on the major and minor axis and bounding box of grayscale WBC images.

The neural network classifier with PCA achieved an accuracy of 95% [17]. Ioannis et al. discussed a semi-supervised learning method for diagnosing WBC types. Several experiments were conducted to accurately and reliably predict the images of blood samples under a microscope [18]. Maxim et al. proposed deep learning approaches to evaluate two sets of blood sample data under a microscope to diagnose WBCs and eosinophils in the active and resting state. The deep learning models achieved 70.3% accuracy for the WBC dataset; for the eosinophil dataset, the models achieved an accuracy of 87.1% and 85.6%, respectively [19]. Justin et al. presented the PatternNet-fused ensemble of CNNs (PECNN) method for diagnosing WBCs. The method uses PatternNet to combine random CNN outputs. PatternNet captures the strengths of patterns and ignores outliers [20]. Nizar et al. presented an approach to diagnosing leukemia using a CNN model that requires a large dataset. Thus, the augmentation method was applied to obtain artificial images throughout the training phase.

The ALL-IDB dataset was also evaluated with machine learning algorithms. All algorithms achieved good results, while the CNN models performed better than machine learning [21]. Goutam et al. proposed a system that consists of four stages, namely, preprocessing, denoising, and segmentation, to determine the cell region and split it from the remainder of the image, for feature extraction and classification algorithms. Features are categorized using K-mean clustering, SVM, and local directional path (LDP) algorithms [22]. Rawat et al. presented an approach to distinguish normal and malignant blood smears. The technique separates the nucleus from the rest of the WBCs and extracts 331 distinct color, texture, and geometrical features. The features were categorized using SVM, which achieved good results in detecting malignant smears [23]. Jyoti et al. presented a technique to distinguish malignant lymphocytes from healthy cells by separating the white cell segregation from the rest of the blood cells. Texture features were extracted using the GLCM method, which was also used to extract shape features; then, these were categorized using SVM.

The system achieved 86.7% accuracy with texture features and 56.1% accuracy with shape features [24]. Morteza et al. proposed a blood and bone marrow image collection system for diagnosing samples as malignant or normal. Enhanced images used for the process and cell nuclei were segmented using the k-means method. Statistical and geometrical characteristics of cell nuclei were extracted and classified using SVM [25]. Himali et al. presented an approach to detect leukemia cells based on blast cells and to determine whether they indicate an acute or chronic condition. Several methods were applied, such as linear contrast stretching, morphology, watershed transform, histogram equalization, and K-means; the system achieved good results in detecting leukemia [26].

It can be noted that all methodologies used in the previous studies aimed at an effective classification of WBC types and that high accuracy was the goal of all previous studies. Due to the similarity in the characteristics of WBC types, this study aimed to extract the features, combine them, and apply hybrid techniques between CNN models and ANN and SVM networks to achieve promising accuracy.

3. Materials and Methodology

Figure 1 shows the materials and methodology for analyzing and diagnosing the microscopic blood sample dataset for the detection of WBC types. A dataset containing four WBC types was collected. Every image underwent an image enhancement process to eliminate noise and artifacts.

The optimized images were then subjected to three proposed methods. The first suggested method was to diagnose microscopic blood sample images using ANN and FFNN based on the hybrid features using both LBP and GLCM methods. The second proposed method was to analyze microscopic blood sample images using four CNN models: AlexNet, ResNet-50, ResNet-18, and GoogLeNet. The third proposed method was a hybrid method of CNN models for deep feature extraction and SVM for deep feature classification.

3.1. Description of the Dataset

This study assessed all the proposed systems using a WBC microscopic sample dataset. The dataset contained 12,507 augmented images, almost equally divided among 4 types of blood cells, and was split into 2 phases: 80% for training and validation and 20% for testing. The dataset comprised 3133 eosinophil images, 3108 lymphocyte images, 3095 monocyte images, and 3171 neutrophil images. Figure 2 shows dataset samples for the four types of WBCs [27].

3.2. Average and Laplacian Filters

The slide images in the dataset contain noise introduced when the blood samples were mixed with staining components such as Wright’s stain or a methylene blue or eosin (red) dye mixture. They also contain noise resulting from the diversity of microscope devices, from their accuracy and optical reflections, or from the methods of storing the dataset. This noise poses a challenge for obtaining an accurate diagnosis of the input images. Therefore, the first step in biomedical image processing is preprocessing; in this study, the noise was removed, and WBC edge contrast was increased with two filters: average and Laplacian [28].

First, the images were optimized with an average filter of 4 × 4 pixels and passed sequentially to process all image pixels. The average filter smooths the image by eliminating disparities between nearby pixels and replacing each center (target) pixel with an average of 15 neighboring pixels. The process is carried out continuously until the whole image is processed. Equation (1) shows how the average filter works [29]:

\mathrm{Average}(M) = \frac{1}{L} \sum_{i = 0}^{M - 1} y(M - 1)

(1)

where Average (M) denotes the optimized image (output), y(M − 1) denotes the previous input, and M denotes the image’s pixel count.

Because of the blurred edges between WBCs and other cells, the Laplacian filter was used, which detects the edges of WBCs, shows them clearly, and distinguishes them from other blood cells. The Laplacian filter’s action mechanism on the region of interest (WBCs) is described in Equation (2):

\nabla^{2} f = \frac{\partial^{2} f}{\partial x^{2}} + \frac{\partial^{2} f}{\partial y^{2}}

(2)

where ∇² f refers to the second-order differential (Laplacian) operator applied to the image f, and x and y refer to the coordinates in the 2D image matrix.

Lastly, to obtain an enhanced and clear image, the image produced using the Laplacian filter is subtracted from the image produced using the averaging filter, as described in Equation (3):

\mathrm{Image}_{\mathrm{optimized}} = \mathrm{Average}(M) - \nabla^{2} f

(3)

Figure 3 describes the dataset samples after they were optimized and then inputted for all the proposed systems.
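As a rough illustration of Equations (1)–(3), the following Python sketch smooths a grayscale image with a 4 × 4 mean filter, computes a discrete Laplacian, and subtracts one from the other. It is a minimal sketch under stated assumptions, not the authors' MATLAB implementation: the function names, the clamped border handling, the 4-neighbor Laplacian kernel, and the choice to apply the Laplacian to the original image are all assumptions made for illustration.

```python
def _clamped(img, r, c):
    """Read pixel (r, c), clamping coordinates to the image border (assumed border rule)."""
    h, w = len(img), len(img[0])
    return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]


def mean_filter(img, k=4):
    """Replace each pixel with the mean of a k x k window placed around it."""
    half = k // 2
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            total = 0.0
            for dr in range(-half, k - half):
                for dc in range(-half, k - half):
                    total += _clamped(img, r + dr, c + dc)
            row.append(total / (k * k))
        out.append(row)
    return out


def laplacian(img):
    """4-neighbor discrete Laplacian: sum of the four neighbors minus 4x the center."""
    return [
        [
            _clamped(img, r - 1, c) + _clamped(img, r + 1, c)
            + _clamped(img, r, c - 1) + _clamped(img, r, c + 1)
            - 4 * img[r][c]
            for c in range(len(img[0]))
        ]
        for r in range(len(img))
    ]


def enhance(img, k=4):
    """Equation (3): averaged image minus Laplacian image."""
    avg = mean_filter(img, k)
    lap = laplacian(img)  # assumption: Laplacian of the original, not of the averaged image
    return [[a - l for a, l in zip(ra, rl)] for ra, rl in zip(avg, lap)]
```

On a perfectly flat image the Laplacian is zero everywhere, so `enhance` returns the image unchanged; edges are the only places where the subtraction changes pixel values.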

3.3. Neural Network Algorithm

3.3.1. Adopted Region Growth Algorithm (Segmentation)

Images of the microscopic blood samples consist of WBCs and other blood cells; thus, extracting features from the whole image does not represent WBCs. Therefore, in this study, one of the segmentation techniques, the adopted region growth algorithm, was applied to all images in the dataset to segment the region of interest (WBCs) and, for further processing, separate it from the remainder of the image. The algorithm is based on grouping similar pixels into the same regions. For successful segmentation using the adopted region growth algorithm, the following requirements must be fulfilled [30]:

$\bigcup_{i = 1}^{m} y_{i} = y$, where m is the number of regions;
$y_{i}$, $i = 1, 2, \dots, m$, is connected;
$P(y_{i}) = \mathrm{TRUE}$ for $i = 1, 2, \dots, m$;
$P(y_{i} \cup y_{j}) = \mathrm{FALSE}$ for $i \neq j$, where $y_{i}$ and $y_{j}$ are neighboring regions.

First, the segmentation must be complete: the union of all regions must represent the entire image. Second, each region must be connected. Third, all pixels in the same region must satisfy the similarity predicate P. Fourth, no two neighboring regions may contain similar pixels; that is, merging two neighboring regions must violate P. The algorithm works bottom-up (from least to most), starting with a pixel and ending with a region of pixels. The primary idea of the method is that some pixels are distributed as initial seeds for many regions. After that, all regions start growing and collecting similar pixels, and the border regions of the WBCs grow with similar pixels that represent the borders of the cell nucleus and isolate them from the remainder of the image. Figure 4 displays examples of the dataset after the segmentation procedure.
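The seed-and-grow idea described above can be sketched in a few lines of Python. This is an illustrative single-seed version only: the function name `region_grow`, the 4-connectivity, and the similarity predicate (absolute intensity difference from the seed within a tolerance `tol`) are assumptions, since the text does not specify the predicate P used in the study.

```python
from collections import deque


def region_grow(img, seed, tol):
    """Grow one region from `seed`, adding 4-connected pixels whose
    intensity is within `tol` of the seed intensity (assumed predicate P)."""
    h, w = len(img), len(img[0])
    sr, sc = seed
    ref = img[sr][sc]
    mask = [[0] * w for _ in range(h)]
    mask[sr][sc] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < h and 0 <= nc < w
            if inside and not mask[nr][nc] and abs(img[nr][nc] - ref) <= tol:
                mask[nr][nc] = 1  # pixel satisfies P, so it joins the region
                queue.append((nr, nc))
    return mask
```

For a toy image whose top-left corner is dark and whose remainder is bright, growing from the corner with a small tolerance marks only the dark pixels, which is the binary mask that a later morphological step would clean up.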

3.3.2. Morphological Operation

The segmentation process produces some images that contain holes; therefore, these holes are considered one of the challenges for obtaining high-efficiency diagnostic accuracy. Because these holes do not belong to the area of interest for white blood samples, they must be treated, and the holes must be filled. Therefore, this study applied a morphological method to improve the binary images and obtain more improved binary images [31]. The method creates a structure element template with a value of ones and zeros. The process works by placing the structure element on the image, comparing it with neighboring pixels, and moving the structure element over the entire image. Each time, the process tests the structure element to determine whether it “fits” or not, while the process tests the intersection of the neighborhood, which is named “hits”.

The structure element plays a vital role in image processing as it slides over the image like a convolution filter. When the structure element is placed on the binary image, each pixel of the structure element has a corresponding pixel in the image. The structure element is said to “fit” when every one of its pixels set to one covers an image pixel set to one, whereas it is said to “hit” when at least one of its pixels set to one covers an image pixel set to one. Morphological operations include opening, erosion, closing, and dilation; all of these operations work by moving the structure element over the image.

The structure element continues moving until all the image pixels are processed, and then an enhanced image is produced. Figure 5 shows a set of microscopy images before and after the morphological operations: the images before the operations contain some holes, which were filled after the operations.
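To make the "fits" and "hits" tests concrete, here is a small Python sketch of binary erosion, dilation, and closing with a 3 × 3 structure element of ones. It is an illustration under assumptions: the 3 × 3 element, the rule that pixels outside the image are ignored, and the use of closing (dilation followed by erosion) to fill holes are choices made here, as the text does not state which element or operation sequence was used.

```python
def _scan(img, test):
    """Slide a 3 x 3 all-ones structure element over a binary image and
    apply `test` to the in-bounds pixels it covers."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                img[rr][cc]
                for rr in range(r - 1, r + 2)
                for cc in range(c - 1, c + 2)
                if 0 <= rr < h and 0 <= cc < w
            ]
            out[r][c] = 1 if test(window) else 0
    return out


def dilate(img):
    """Output is 1 where the element 'hits' (covers at least one foreground pixel)."""
    return _scan(img, any)


def erode(img):
    """Output is 1 where the element 'fits' (covers only foreground pixels)."""
    return _scan(img, all)


def close_holes(img):
    """Morphological closing: dilation followed by erosion fills small holes."""
    return erode(dilate(img))
```

A one-pixel hole in the middle of a filled 5 × 5 mask disappears after `close_holes`, while the outer shape of the mask is preserved.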

3.3.3. Feature Extraction

Feature extraction is the most critical step in biomedical image processing. A microscopic blood sample dataset contains color, texture, and shape features. Hybrid feature extraction from several algorithms is a powerful tool for WBC diagnosis. LBP and GLCM were used to extract features. Then, the features were combined in a feature vector to form representative features for each class. Feature fusion is an effective and efficient method for obtaining powerful features that lead to good diagnosis results.

Firstly, the LBP algorithm describes surface texture with binary codes by measuring contrast and the local texture pattern. The algorithm selects a central pixel (g_c) and adjacent pixels (g_p) according to the radius R, using 16 pixels adjacent to each central pixel. The process is repeated, and in each iteration, a new central pixel is selected and processed according to its neighboring pixels [32]. Equation (4) shows how the algorithm works, whereby the value of each central pixel is replaced with a code computed from the neighboring pixels. This algorithm produces 203 distinct features for every image:

LBP_{R, P}(x_{c}, y_{c}) = \sum_{p = 0}^{P - 1} s(g_{p} - g_{c}) \cdot 2^{p}

(4)

where g_c denotes the central pixel, g_p denotes the neighboring pixel, R denotes the radius around the central pixel, and P denotes the number of neighbors. The binary threshold function s(x) is defined according to Equation (5):

s(x) = \begin{cases} 0, & x < 0 \\ 1, & x \geq 0 \end{cases}

(5)
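Equations (4) and (5) can be traced by hand on a single 3 × 3 patch. The Python sketch below does this for P = 8 neighbors at radius R = 1, which is an assumption chosen to keep the example small (the study reports 16 neighbors). The neighbor ordering and the function names are likewise illustrative.

```python
# Clockwise neighbor offsets starting at the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """LBP code of pixel (r, c): threshold each neighbor g_p against the
    center g_c with s(x), then weight bit p by 2**p (Equation (4))."""
    gc = img[r][c]
    code = 0
    for p, (dr, dc) in enumerate(OFFSETS):
        s = 1 if img[r + dr][c + dc] - gc >= 0 else 0  # Equation (5)
        code += s * 2 ** p
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels; such a histogram is
    one common way to turn the codes into a texture feature vector."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

For the patch `[[6, 5, 2], [7, 6, 1], [9, 8, 7]]` the center value is 6; the neighbors at bit positions 0, 4, 5, 6, and 7 are greater than or equal to 6, so the code is 1 + 16 + 32 + 64 + 128 = 241.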

Secondly, the GLCM algorithm is used to extract texture features from the ROI (WBCs). The algorithm distinguishes between a smooth texture and a rough texture using spatial information: the texture is smooth when neighboring pixels have similar values and coarse when they have different values. The pairwise correlations between a pixel and its neighbor are determined by the distance d and the direction θ between them, where θ takes four values: 0°, 45°, 90°, and 135°. Distance and direction are related as follows: when d = 1, θ = 0° or θ = 90°; when d = √2, θ = 45° or θ = 135°. This algorithm extracted 13 representative features.
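The co-occurrence idea can be illustrated with a short Python sketch that counts gray-level pairs for one direction and derives two texture measures from the normalized matrix. The symmetric counting, the row/column offsets chosen for each angle, and the two features shown (contrast and homogeneity) are assumptions for illustration; the text does not list which 13 features were extracted.

```python
# Row/column offsets for the four directions (assumed convention: rows grow downward).
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(img, angle, levels):
    """Normalized, symmetric gray-level co-occurrence matrix for one direction."""
    dr, dc = DIRECTIONS[angle]
    h, w = len(img), len(img[0])
    m = [[0.0] * levels for _ in range(levels)]
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                i, j = img[r][c], img[rr][cc]
                m[i][j] += 1  # count the pair in both orders (symmetric GLCM)
                m[j][i] += 1
    total = sum(map(sum, m))
    return [[v / total for v in row] for row in m]


def contrast(p):
    """High when co-occurring gray levels differ strongly (coarse texture)."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def homogeneity(p):
    """High when co-occurring gray levels are similar (smooth texture)."""
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(len(p)) for j in range(len(p)))
```

A constant image gives contrast 0 and homogeneity 1 (smooth), whereas a two-level checkerboard scanned at 0° gives contrast 1 and homogeneity 0.5 (coarse), matching the smooth-versus-rough distinction described above.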

Finally, the features extracted from the two algorithms, LBP and GLCM, are combined into one vector. The 203 features extracted from the LBP algorithm are concatenated with the 13 features extracted from the GLCM algorithm; thus, the hybrid method produces 216 features for each image.

3.3.4. Classification

In this section, the characteristics extracted from the WBC microscopic blood sample dataset were evaluated using two neural network classifiers, ANN and FFNN.

The input, hidden, and output layers are interconnected with specified weights in neural networks. The input layer is the first layer that receives the input, and it consists of many neurons according to the inputs. In this study, the number of external inputs is the features, and their number is 216. As a result, the input layer’s total number of input units is 216. The hidden layers are the most important layers that perform all the arithmetic operations to solve complex problems.

They receive the inputs from the input layer and process them based on the problem they are dealing with. In this study, the network contains 51 interconnected hidden layers, connected between each layer and the other, and between neurons in the same layer, with specific weights calculated to produce the best performance and the lowest error rate. Finally, the output layer receives the inputs from the last hidden layer and classifies each input according to the appropriate class. In this study, the output layer contains four neurons, the same number as the number of classes in the dataset. The algorithm analyzes complex data to interpret clear patterns. Each neuron in the network is connected to another neuron, either in the same layer or in another layer, by a weight w. The network updates the weights in each iteration to reduce the error between the actual data X and the predicted Y to obtain a minimum mean squared error (MSE), as shown in Equation (6) [33]:

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(X_{i} - Y_{i})}^{2}

(6)

The ANN classifier was evaluated on the microscopic blood sample dataset for WBC type diagnosis. Figure 6 describes the structure of the ANN, which has an input layer of 216 neurons trained with 51 hidden layers to produce 4 neurons in the output layer.

FFNN is a neural network used to solve complex computational problems. In this study, it consists of an input and output layer, similar to the ANN classifier structure, but the FFNN uses 22 hidden layers in order to produce the best network performance. All neurons are interconnected through connections named weights w. The algorithm works by feeding signals forward between neurons, named the forward stage, and each neuron produces its output based on the weighted outputs of the previous neurons plus the bias value related to the neuron. The weights are continuously updated and adjusted forward from the hidden to the output layers, and on each iteration, the error between the actual X and predicted Y values is calculated. The error is calculated using the mean squared error (MSE) described in Equation (6).
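The forward stage and the error of Equation (6) can be illustrated with a minimal Python sketch. It uses a single hidden layer with a tanh activation, which is an assumption made only to keep the example short; the layer sizes, activation functions, and training procedure of the networks in the study are not reproduced here, and all names are illustrative.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: each neuron sums its weighted inputs plus its bias.
    The hidden layer applies tanh (assumed activation); the output is linear."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]


def mse(actual, predicted):
    """Equation (6): mean of the squared differences between X and Y."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

For a four-class problem, `actual` would be a one-hot vector such as `[1, 0, 0, 0]` and `predicted` the four output-neuron values; training adjusts the weights so that `mse(actual, predicted)` decreases across iterations.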

3.4. Convolutional Neural Networks (CNNs)

CNNs are modern techniques used in many fields. The use of numerous layers, each of which has a specific role, distinguishes CNNs from other machine-learning techniques. CNNs work on obtaining multiple levels of representation, starting with a simple modular representation that transforms patterns from one level to a deeper level [34]. For diagnostic purposes, representative layers amplify necessary inputs and suppress irrelevant differences. The number of levels of nonlinear operations that the model learns indicates the depth of the architecture, whereas conventional machine-learning algorithms are regarded as shallow architectures. Therefore, researchers devoted their efforts to developing CNN architectures and succeeded in developing CNN models that have proven effective in classification, dimensionality reduction, regression, pattern recognition, texture modeling, robotics, biomedical image processing, signal processing, and other fields. CNNs are made up of convolutional layers, max- or average-pooling layers, fully connected layers, and auxiliary layers [35].

The convolutional layer is the first layer in CNNs, from which the network name is derived, and it consists of filters of different sizes. This layer applies a linear operation, named convolution, between the image x(t) and the filter w(t) of a specific size, as shown in Equation (7) [36]. The performance of the convolutional layers is governed by three parameters: filter size, p-step (stride), and zero-padding. The filter size controls how much of the image each convolution step covers. The convolutional layer contains many filters, each with a specific function. For example, one filter detects chromatic features, whereas some filters detect edges, some filters extract geometrical features, etc. This ability of CNNs is named translation invariance. The p-step parameter sets the step size when the filter moves over the image. The zero-padding parameter maintains the original input size [37].

s (t) = (x * w) (t) = \int x (a) w (t - a) d a

(7)

where s(t) represents the output after convolution. If t takes only discrete values, the convolution is defined using Equation (8):

s(t) = (x * w)(t) = \sum_{a = - \infty}^{\infty} x(a) w(t - a)

(8)

After a number of convolutional layers, a ReLU is used (rectified linear unit); this layer suppresses the negative output, converts it to zero, and passes the positive value, as described in Equation (9):

ReLU(x) = \max(0, x) = \begin{cases} x, & x \geq 0 \\ 0, & x < 0 \end{cases}

(9)
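Equations (8) and (9) can be checked numerically with a short Python sketch. It evaluates the discrete convolution sum for finite 1-D sequences (values outside the sequences are treated as zero) and then applies ReLU element-wise. The 1-D setting and the function names are simplifications for illustration; CNN layers apply the same sum over 2-D image windows.

```python
def convolve1d(x, w):
    """Full discrete convolution s(t) = sum_a x(a) * w(t - a) for finite sequences."""
    s = [0] * (len(x) + len(w) - 1)
    for a, xa in enumerate(x):
        for b, wb in enumerate(w):
            s[a + b] += xa * wb  # term x(a) * w(t - a) with t = a + b
    return s


def relu(values):
    """Equation (9): pass positive values, replace negative values with zero."""
    return [max(0, v) for v in values]
```

For example, convolving `[1, 2, 3]` with the difference filter `[1, -1]` gives `[1, 1, 1, -3]`, and applying `relu` to that result gives `[1, 1, 1, 0]`.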

In CNNs, the problem of overfitting appears because convolutional layers create millions of parameters. CNNs address this issue using a dropout layer. In each iteration, the dropout layer passes 50% of the neurons (parameters), and the process repeats until all parameters are passed. One disadvantage of this layer is that it doubles the training time [38].

The training process is slowed down by the high-dimensional feature maps of the convolutional layers. Therefore, pooling layers are used to solve this problem and accelerate training. Pooling layers function similarly to convolutional layers, where a group of pixels is selected in accordance with the filter size. Pooling layers work with two techniques, max and average pooling. In max pooling, the maximum pixel value within the given group is chosen to represent the group, as shown in Equation (10), while in average pooling, the average of the pixel values in the group is calculated and used to represent the group, as shown in Equation (11).

P(i; j) = \max_{m, n = 1, \dots, k} A[(i - 1)p + m; (j - 1)p + n]

(10)

P(i; j) = \frac{1}{k^{2}} \sum_{m, n = 1, \dots, k} A[(i - 1)p + m; (j - 1)p + n]

(11)

where A indicates the input to the pooling layer, m and n index the positions within the filter window, p is the filter step size, and k indicates the filter size.
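Equations (10) and (11) translate directly into code once the 1-based indices are shifted to 0-based ones. The following Python sketch is illustrative only: the function name `pool` and its `mode` argument are assumptions, and the window is assumed to stay entirely inside the input (no padding).

```python
def pool(a, k, p, mode="max"):
    """Pool matrix `a` with a k x k window moved with step p.
    mode="max" follows Equation (10); any other value averages (Equation (11))."""
    rows = (len(a) - k) // p + 1
    cols = (len(a[0]) - k) // p + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # 0-based form of A[(i - 1)p + m; (j - 1)p + n]
            window = [a[i * p + m][j * p + n] for m in range(k) for n in range(k)]
            row.append(max(window) if mode == "max" else sum(window) / (k * k))
        out.append(row)
    return out
```

With a 4 × 4 input holding the values 1 to 16 row by row, k = 2, and p = 2, max pooling yields `[[6, 8], [14, 16]]` and average pooling yields `[[3.5, 5.5], [11.5, 13.5]]`; in both cases the feature map shrinks from 16 values to 4.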

The fully connected layer (FCL) is responsible for classification. In the FCL, all neurons are interconnected with each other. The FCL converts features stored in feature maps from a two-dimensional representation to a one-dimensional representation. The FCL classifies each input image into the appropriate class. Finally, the Softmax function produces four outputs, one for each class: eosinophils, lymphocytes, monocytes, and neutrophils. Equation (12) describes the mechanism of action of Softmax:

z (x_{i}) = \frac{\exp x_{i}}{\sum_{j = 1}^{n} \exp x_{j}}

(12)

where 0 ≤ z(x) ≤ 1.
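A short Python sketch of Equation (12) shows how four raw scores become class probabilities. Subtracting the maximum score before exponentiating is a standard numerical-stability step that leaves the result unchanged; the function name and the example scores are illustrative assumptions.

```python
import math


def softmax(scores):
    """Equation (12): exponentiate each score and divide by the sum of exponentials."""
    top = max(scores)  # shifting by the maximum avoids overflow without changing the ratios
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["eosinophil", "lymphocyte", "monocyte", "neutrophil"]
probs = softmax([2.0, 1.0, 0.1, -1.0])  # hypothetical scores from the last layer
predicted = classes[probs.index(max(probs))]
```

Each output lies between 0 and 1 and the four outputs sum to 1, so the largest one can be read as the predicted class.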

In this study, we discuss four CNN models: AlexNet, ResNet-50, GoogLeNet, and ResNet-18.

3.5. Hybrid of CNN Models and SVM

This section will examine an approach that combines machine learning and deep learning. The hybrid method is divided into two parts: first, deep learning models are tasked with extracting deep feature maps, storing them in feature vectors, and sending them to the second section (see Figure 7).

Second, the SVM is tasked with classifying the feature maps extracted from the first part. A reason for using these hybrid technologies is that they require medium-cost computer specifications unlike CNN models, which require high-cost computer specifications. Additionally, hybrid techniques are rapid when training the dataset, and their computations are simple unlike CNN models, which consume time to train the dataset and utilize complex computations [39].
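The second block of the hybrid pipeline can be sketched as a classifier trained on feature vectors that a CNN has already produced. The Python example below is a toy illustration under strong assumptions: the 2-D vectors stand in for deep features, the problem is binary rather than four-class, and the linear SVM is trained with plain sub-gradient descent on the hinge loss. The kernel, solver, and multi-class scheme used in the study are not stated in the text, and all names here are hypothetical.

```python
def train_linear_svm(features, labels, lr=0.1, lam=0.01, epochs=50):
    """Sub-gradient descent on the L2-regularized hinge loss.
    `labels` must contain +1 or -1 (binary case only)."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # sample violates the margin: move the hyperplane toward classifying it
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # sample is safely classified: apply only the regularization shrinkage
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical "deep feature" vectors for two classes.
deep_features = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -1.0)]
class_labels = [1, 1, -1, -1]
weights, bias = train_linear_svm(deep_features, class_labels)
```

Because only the small linear model is trained on top of fixed feature vectors, this stage is much cheaper than training a CNN end to end, which is the cost argument made above. A four-class version would typically train one such classifier per class (one-vs-rest) or use a multi-class SVM formulation.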

4. Experimental Result

4.1. Splitting Dataset

All the systems proposed in this work were evaluated using the microscopic blood sample dataset for the early diagnosis of WBCs. The augmented dataset contains 12,507 images divided into 4 types of WBC disease: eosinophils, containing 3133 images (25.05%); lymphocytes, containing 3108 images (24.85%); monocytes, containing 3095 images (24.75%); and neutrophils, containing 3171 images (25.35%).

The dataset was divided into 80% for the training and validation phases (80:20%, respectively) and 20% for the testing phase. Table 1 describes how the dataset was split during the phases of training, validation, and testing for all types of WBC disease. All systems were executed with an Intel® i5 processor with 4 GB GPU, 12 GB RAM, and a MATLAB 2018b operating environment.
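The class shares and the nested 80/20 split can be reproduced with simple arithmetic. In the Python sketch below, the image counts are taken from the text, whereas the rounding rule inside `split_counts` is an assumption, so the per-phase counts it produces may differ by an image or two from those reported in Table 1.

```python
counts = {"eosinophil": 3133, "lymphocyte": 3108, "monocyte": 3095, "neutrophil": 3171}
total = sum(counts.values())

# Percentage of the dataset in each class, rounded to two decimals.
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}


def split_counts(n, test_frac=0.2, val_frac=0.2):
    """Hold out `test_frac` for testing, then `val_frac` of the rest for validation.
    Rounding to the nearest image is an assumed rule."""
    test = round(n * test_frac)
    val = round((n - test) * val_frac)
    train = n - test - val
    return train, val, test


per_class_split = {name: split_counts(n) for name, n in counts.items()}
```

The computed shares match the percentages quoted in Section 4.1, and for every class the three phase counts add back up to the class total.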

4.2. Evaluation Metrics

In this study, all the proposed systems, namely the neural network algorithms, the deep learning models (AlexNet, ResNet-50, ResNet-18, and GoogLeNet), and the hybrid techniques (AlexNet with SVM, ResNet-50 with SVM, GoogLeNet with SVM, and ResNet-18 with SVM), were evaluated on the blood microscopy dataset for WBC diagnosis using statistical measures. Equations (13)–(17) define the accuracy, precision, sensitivity, specificity, and AUC used in this work to assess the performance of the systems. Each proposed method yields a confusion matrix that contains all correctly classified test images (TP and TN) as well as the misclassified images (FP and FN) [40,41,42,43,44]. The performance of the systems is then calculated from the confusion matrix using the equations below:

Accuracy = \frac{TN + TP}{TN + TP + FN + FP} \times 100\%

(13)

Precision = \frac{TP}{TP + FP} \times 100\%

(14)

Sensitivity = \frac{TP}{TP + FN} \times 100\%

(15)

Specificity = \frac{TN}{TN + FP} \times 100\%

(16)

AUC = \int_{0}^{1} TPR \, d(FPR), \quad TPR = Sensitivity, \quad FPR = 1 - Specificity

(17)

where true positives (TP) are unhealthy WBCs that were correctly diagnosed, true negatives (TN) are healthy WBCs that were correctly diagnosed as normal, false negatives (FN) are blast WBCs that were diagnosed as normal, and false positives (FP) are normal WBCs that were diagnosed as blast WBCs.
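For a four-class problem, Equations (13)–(16) can be applied per class by treating one class as positive and the other three as negative. The sketch below shows this one-vs-rest computation in Python. The confusion matrix is made up for illustration and does not reproduce any result of this study; its orientation (rows for predicted classes, columns for target classes) is an assumption of the sketch.

```python
def one_vs_rest_counts(cm, k):
    """Return (TP, FP, FN, TN) for class k.

    cm[i][j] is the number of images predicted as class i whose target
    class is j (rows: predicted, columns: target).
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fp = sum(cm[k]) - tp                 # predicted k, target is another class
    fn = sum(row[k] for row in cm) - tp  # target k, predicted as another class
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity, and specificity in percent."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "precision": 100.0 * tp / (tp + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


# Made-up 4 x 4 confusion matrix (not the paper's results).
cm = [
    [45, 2, 1, 2],
    [1, 48, 1, 0],
    [0, 1, 49, 0],
    [3, 0, 1, 46],
]
per_class = [metrics(*one_vs_rest_counts(cm, k)) for k in range(4)]
```

Averaging the per-class values gives one summary figure per metric; whether the reported values were averaged in this way is not stated in the text.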

4.3. Results of Neural Network Algorithms

Neural network algorithms are among the most effective tools for diagnosing medical images. Neural network classifiers depend on the quality of the preceding stages of biomedical image processing, such as segmentation of the region of interest (ROI) and extraction of the essential features. The neural network algorithm divides the dataset into two parts: training and validation, used to build the network, and testing, used to determine how accurate the system is. Figure 8 describes the training mechanism of the ANN and FFNN, which consist of an input layer containing 216 neurons and hidden layers for processing and diagnosing the dataset. The ANN contains 51 hidden layers, while the FFNN contains 22 hidden layers. Finally, the output layer contains four neurons corresponding to the types of leukocytes in the dataset.

4.3.1. Performance Analysis

Cross-entropy loss is a network performance metric that measures the error between the actual and predicted values; many other computational tools can also assess network performance. Figure 9 shows the errors that occurred during the training, validation, and testing phases of the ANN and FFNN on the microscopic blood sample dataset. The crossed lines in the figure mark the system’s best performance, while blue denotes training, green denotes validation, and red denotes testing. The ANN algorithm in Figure 9a achieved its best performance at epoch 22, with a value of 0.11491. The FFNN algorithm in Figure 9b achieved its best performance at epoch 64, with a value of 0.065534. Training ends when the validation performance reaches its optimum for setting the network parameters.
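To make the cross-entropy measure concrete, the following Python sketch computes the mean cross-entropy between one-hot targets and predicted class probabilities. The probability values are invented for illustration, and the exact normalization used by the MATLAB toolbox may differ from this plain average.

```python
import math


def cross_entropy(targets, predictions, eps=1e-12):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    total = 0.0
    for t_row, p_row in zip(targets, predictions):
        # Only the probability assigned to the true class contributes.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(targets)


# Two images with one-hot targets over four classes (made-up values).
targets = [[1, 0, 0, 0], [0, 0, 1, 0]]
confident = [[0.90, 0.05, 0.03, 0.02], [0.05, 0.05, 0.85, 0.05]]
uncertain = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]

loss_confident = cross_entropy(targets, confident)
loss_uncertain = cross_entropy(targets, uncertain)
```

A network that assigns high probability to the correct class obtains a loss near zero, whereas a network that spreads its probability evenly over the four classes obtains a loss of ln 4.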

4.3.2. Gradient

The gradient is a neural network training diagnostic: it is the rate of change of the error between the target and predicted outputs with respect to the network weights, so small values indicate that training has nearly converged. Figure 10 shows the gradient value and the validation measurement for the microscopic blood sample dataset for WBC diagnosis. Figure 10a shows the ANN algorithm, where the best gradient value was 0.026926 at epoch 117 and the validation check value was 6 at epoch 117. Figure 10b shows the FFNN algorithm, where the best gradient value was 0.0042712 at epoch 70 and the validation measurement value was 1 × 10⁻⁶ at epoch 7.

4.3.3. Regression

Regression is a tool used to evaluate the performance of neural networks by measuring how closely the network outputs match the target values. The best regression value is close to one, where the error between the outputs and targets is at its minimum. Figure 11 shows the regression for the microscopic blood sample dataset using the FFNN algorithm. The FFNN achieved a regression of 93.41% during the training phase, 81.59% during the validation phase, 81.30% during the testing phase, and 89.42% overall. Therefore, the relationship between the output and target values is strong.
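The text does not give the formula behind the regression value. Assuming it is the correlation coefficient R between network outputs and targets, which is how regression plots are commonly summarized, it could be computed as in the following Python sketch. The output and target values are invented for illustration.

```python
import math


def pearson_r(outputs, targets):
    """Pearson correlation coefficient between outputs and targets."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    return cov / math.sqrt(var_o * var_t)


# Made-up targets (0 or 1) and network outputs for one output neuron.
targets = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
outputs = [0.1, 0.2, 0.8, 0.9, 0.3, 0.7]
r = pearson_r(outputs, targets)
```

Values of R close to one mean that the outputs follow the targets closely; a value of exactly one is obtained only when the outputs are a perfect increasing linear function of the targets.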

4.3.4. Receiver Operating Characteristic (ROC)

ROC is a tool used to evaluate the performance of neural networks by plotting the true positive rate against the false positive rate. The ANN algorithm was evaluated on the microscopic blood sample dataset during the training, validation, and testing phases. Each class (disease) in the dataset is represented by a curve of a specific color. The x-axis represents the false positive rate (1 − specificity), and the y-axis represents the true positive rate (sensitivity). Figure 12 shows the ROC curves of the ANN algorithm for the four WBC types during all phases. The ANN algorithm averaged 94.38% overall across the four classes.
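A ROC curve for one class can be traced by sorting the images by their predicted score and lowering the decision threshold step by step; the area under the resulting curve is the AUC. The Python sketch below illustrates this with invented scores and labels, where the false positive rate equals 1 − specificity and the true positive rate equals sensitivity. It assumes all scores are distinct and is not the toolbox routine used in the study.

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) for one class, sweeping the threshold downward.

    labels are 1 for the class of interest and 0 otherwise.
    Assumes distinct scores (ties are not merged in this sketch).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores for six images; label 1 marks the class of interest.
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))
```

In a four-class setting, one such curve is drawn per class, and the per-class areas can then be averaged.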

4.3.5. Confusion Matrix

The confusion matrix is an important evaluation criterion for classification networks. It takes the form of a square matrix containing all the images classified correctly and incorrectly. Each row and column of the matrix represents a class (disease), with the columns denoting the target (actual) classes and the rows denoting the output (predicted) classes. The confusion matrix includes images that were correctly classified (TP and TN) as well as images that were incorrectly classified (FP and FN). The microscopic blood sample dataset was assessed using the ANN and FFNN during all stages. Figure 13 illustrates the confusion matrices for the four classes of the dataset, with Class 1 representing eosinophils, Class 2 representing lymphocytes, Class 3 representing monocytes, and Class 4 representing neutrophils. Figure 13a shows the confusion matrix produced using the ANN algorithm, which achieved an accuracy of 91.6%, whereas Figure 13b shows the confusion matrix produced using the FFNN algorithm, which achieved an accuracy of 93.6%.
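The construction of such a matrix can be sketched in a few lines of Python. The label lists are invented for illustration, classes are numbered from 0 rather than from 1, and the matrix is oriented with rows for output (predicted) classes and columns for target classes.

```python
def confusion_matrix(targets, outputs, n_classes):
    """Rows index the predicted (output) class, columns the target class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, o in zip(targets, outputs):
        cm[o][t] += 1
    return cm


def overall_accuracy(cm):
    """Percentage of images on the main diagonal (correctly classified)."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return 100.0 * correct / total


# Made-up labels: 0 eosinophil, 1 lymphocyte, 2 monocyte, 3 neutrophil.
targets = [0, 0, 1, 1, 2, 2, 3, 3, 0, 3]
outputs = [0, 3, 1, 1, 2, 2, 3, 0, 0, 3]

cm = confusion_matrix(targets, outputs, 4)
acc = overall_accuracy(cm)
```

Off-diagonal entries show which classes are confused with each other; in this toy example, one eosinophil is predicted as a neutrophil and one neutrophil as an eosinophil.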

The results of the ANN and FFNN for the detection of WBC diseases are shown in Table 2. The FFNN algorithm outperformed the ANN algorithm. The ANN algorithm achieved accuracy, precision, sensitivity, specificity, and AUC values of 91.6%, 91.5%, 97.0%, 92.0%, and 94.3%, respectively, while the FFNN algorithm achieved 93.6%, 93.66%, 97.79%, 93.92%, and 95.77%, respectively.

4.4. Results of CNN Models

This section presents the performance of the pre-trained CNN models AlexNet, ResNet-50, GoogLeNet, and ResNet-18 on the microscopic blood sample dataset for the early detection of WBC diseases [41,42]. All dataset images were resized to fit each model: to 227 × 227 × 3 pixels for AlexNet and to 224 × 224 × 3 pixels for the other CNN models. For all models, the Softmax activation function produced four classes: eosinophils, lymphocytes, monocytes, and neutrophils. Each model was tuned to obtain its best performance.

The performance of each model depends on the filter size, zero-padding, and stride, and the diagnostic accuracy of each model is also affected by the pooling layer. Table 3 lists the tuned parameters of each model: the optimizer, maximum epochs, training time, learning rate, mini-batch size, and validation frequency. The table also shows the time each model took to train on the dataset. Although the number of epochs was reduced to 4 and, for some models, to 2, training the ResNet-50 model took as long as 2117 min and 21 s, as shown in Figure 14.
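The way filter size, zero-padding, and stride determine the size of a feature map can be illustrated with the standard output-size formula, floor((W − F + 2P) / S) + 1. In the Python sketch below, the layer hyperparameters are the commonly cited values for the first layers of AlexNet and ResNet and are assumptions here; they are not taken from Table 3.

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# AlexNet first convolution (assumed: 11 x 11 filters, stride 4, no padding)
# applied to a 227 x 227 input.
alexnet_conv1 = conv_output_size(227, 11, stride=4, padding=0)

# ResNet first convolution (assumed: 7 x 7 filters, stride 2, padding 3)
# applied to a 224 x 224 input, followed by 3 x 3 max pooling
# (assumed: stride 2, padding 1).
resnet_conv1 = conv_output_size(224, 7, stride=2, padding=3)
resnet_pool1 = conv_output_size(resnet_conv1, 3, stride=2, padding=1)
```

A larger stride or a larger filter without padding shrinks the feature map faster, which reduces computation but also discards spatial detail.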

Table 4 describes the evaluation results for the AlexNet, ResNet-50, GoogLeNet, and ResNet-18 models used on the microscopic blood sample dataset. Notably, all models achieved excellent results for the early diagnosis of white blood cell types. The AlexNet model achieved accuracy, precision, sensitivity, specificity, and AUC values of 98%, 98.4%, 98.2%, 99.25%, and 99.74%, respectively. The ResNet-50 model reached accuracy, precision, sensitivity, specificity, and AUC values of 99.25%, 99.5%, 99.25%, 99.75%, and 99.99%, respectively. The GoogLeNet model achieved accuracy, precision, sensitivity, specificity, and AUC values of 99.31%, 99.2%, 99.34%, 99.5%, and 99.95%, respectively. The ResNet-18 model achieved accuracy, precision, sensitivity, specificity, and AUC values of 99.12%, 99%, 99.21%, 99.75%, and 99.98%, respectively.

Figure 15 plots the performance of the proposed CNN models for white blood cell diagnosis on the microscopic blood sample dataset.

Figure 16 presents the confusion matrix produced using the CNN models for the early detection of white blood cell types. Here, we discuss the percentage of each model for each class (type) in the dataset. The eosinophil cell type was diagnosed with 94.9%, 97.7%, 98.4%, and 97.3% accuracy using the AlexNet, ResNet-50, GoogLeNet, and ResNet-18 models, respectively. The lymphocyte cell type was diagnosed with 99.5%, 100%, 100%, and 99.8% accuracy using the AlexNet, ResNet-50, GoogLeNet, and ResNet-18 models, respectively. The monocyte cell type was diagnosed with 100%, 100%, 100%, and 100% accuracy using the AlexNet, ResNet-50, GoogLeNet, and ResNet-18 models, respectively. Finally, the neutrophil cell type was diagnosed with 97.5%, 99.4%, 97.8%, and 99.4% accuracy using the AlexNet, ResNet-50, GoogLeNet, and ResNet-18 models, respectively.

Figure 17 shows the performance of the CNN models in terms of AUC: the AlexNet model achieved 99.74%, the ResNet-50 model achieved 99.99%, the GoogLeNet model achieved 99.95%, and the ResNet-18 model achieved 99.98%.

4.5. Results of the Hybrid CNN Models with the SVM Algorithm

In this section, we discuss the performance of the hybrid techniques that combine the CNN models (AlexNet, ResNet-50, GoogLeNet, and ResNet-18) with a machine learning algorithm (SVM). These techniques were applied because of their modest hardware requirements and their speed in training on the dataset. Each technique consists of two combined (hybrid) blocks: the first block is the CNN, which extracts deep feature maps and sends them to the second block; the second block is the SVM algorithm, which classifies the deep feature maps. Table 5 describes the performance of the AlexNet + SVM, ResNet-50 + SVM, GoogLeNet + SVM, and ResNet-18 + SVM techniques used to diagnose the microscopic blood sample dataset for the early detection of white blood cell diseases. Notably, the GoogLeNet + SVM network achieved slightly better results than the other hybrid networks. The AlexNet + SVM network reached accuracy, precision, sensitivity, specificity, and AUC values of 93.5%, 93.25%, 93.5%, 97.75%, and 98.25%, respectively. The ResNet-50 + SVM network achieved accuracy, precision, sensitivity, specificity, and AUC values of 93.1%, 93.25%, 93%, 97.7%, and 99.21%, respectively. The GoogLeNet + SVM network achieved accuracy, precision, sensitivity, specificity, and AUC values of 94.4%, 94.25%, 94.5%, 98.25%, and 99.52%, respectively. Finally, the ResNet-18 + SVM network achieved accuracy, precision, sensitivity, specificity, and AUC values of 93.8%, 94%, 94%, 98%, and 99.34%, respectively.

Figure 18 plots the performance of the proposed hybrid CNN and SVM techniques for white blood cell diagnosis on the microscopic blood sample dataset.

Figure 19 presents the confusion matrices produced using the hybrid techniques that combine the CNN models with the SVM algorithm for the early detection of white blood cell types. Here, we discuss the performance of each hybrid model for each class (type) of the dataset. The eosinophil cell type was diagnosed with 88.8%, 89.3%, 89.8%, and 87.9% accuracy using the AlexNet + SVM, ResNet-50 + SVM, GoogLeNet + SVM, and ResNet-18 + SVM models, respectively. The lymphocyte cell type was diagnosed with 99.2%, 97.1%, 98.4%, and 99.2% accuracy using the AlexNet + SVM, ResNet-50 + SVM, GoogLeNet + SVM, and ResNet-18 + SVM models, respectively. The monocyte cell type was diagnosed with 97.9%, 98.2%, 99%, and 98.7% accuracy using the AlexNet + SVM, ResNet-50 + SVM, GoogLeNet + SVM, and ResNet-18 + SVM models, respectively. Finally, the neutrophil cell type was diagnosed with 88.2%, 87.9%, 90.7%, and 89.6% accuracy using the AlexNet + SVM, ResNet-50 + SVM, GoogLeNet + SVM, and ResNet-18 + SVM models, respectively.

5. Discussion

Three different types of artificial intelligence techniques for microscopic blood sample diagnosis were examined in this study. The first includes neural network algorithms (ANN and FFNN), the second includes CNN models (AlexNet, ResNet-50, GoogLeNet, and ResNet-18), and the third includes hybrid methods using CNN and SVM (AlexNet with SVM, ResNet-50 with SVM, GoogLeNet with SVM, and ResNet-18 with SVM) [43,44].

It is worth noting that all the images of the microscopic blood samples were enhanced by removing noise and artifacts using two filters: the average filter, used to raise the contrast and remove the noise of the blood samples, and the Laplacian filter, used to sharpen the edges in the blood sample images.

In total, 80% of the dataset was used for training and validation, and 20% was used for testing.

The first proposed system uses the ANN and FFNN classifiers with features extracted by the LBP and GLCM algorithms and merged into a feature vector of 216 features per image to diagnose the microscopic blood samples. When fed these features, the ANN achieved an accuracy of 91.6%, whereas the FFNN achieved an accuracy of 93.6%.

The second proposed system is composed of four CNN models, namely AlexNet, ResNet-50, GoogLeNet, and ResNet-18, in which all parameters were adjusted for the best performance in microscopic blood sample type diagnosis. These models took a long time to train on the dataset, as described above. The ResNet-50 model consumed 2117 min 21 s, the ResNet-18 model consumed 582 min 18 s, the GoogLeNet model consumed 515 min 16 s, and the AlexNet model consumed 43 min 32 s, making it the fastest model to train. All CNN models achieved promising accuracy: AlexNet, ResNet-50, GoogLeNet, and ResNet-18 reached an accuracy of 98%, 99.25%, 99.31%, and 99.12%, respectively.

The third proposed system uses four hybrid techniques that combine CNN models with SVM: AlexNet with SVM, ResNet-50 with SVM, GoogLeNet with SVM, and ResNet-18 with SVM. This system also achieved high accuracy in diagnosing the dataset. The hybrid techniques achieved 93.5%, 93.1%, 94.4%, and 93.8% accuracy for AlexNet with SVM, ResNet-50 with SVM, GoogLeNet with SVM, and ResNet-18 with SVM, respectively.

Table 6 and Figure 20 summarize the per-class results of the best-performing networks in the proposed systems for diagnosing the microscopic blood sample dataset. For eosinophils, the best diagnostic accuracy, 98.4%, was achieved by the GoogLeNet model. For lymphocytes, the ResNet-50 and GoogLeNet models both achieved the best diagnostic accuracy, 100%. For monocytes, the AlexNet, ResNet-50, GoogLeNet, and ResNet-18 models all achieved 100% accuracy.

Table 7 shows a comparison between the results of the proposed methods and the relevant previous studies. It is noted from the table that the performance of the proposed methods is superior to previous studies in all measures.

6. Conclusions

When a blood test indicates problems with the immune system, the doctor orders a peripheral smear test, which contains essential information about the health of the immune system and related diseases. When a doctor analyzes the tests, all subtypes of WBCs are also examined. These procedures are performed manually and are, therefore, error-prone and time-consuming. Artificial intelligence techniques can help to solve these challenges. This study presented several systems to diagnose microscopic blood sample datasets with high efficiency. Because the microscopic blood samples contain noise, the average and Laplacian filters were applied to increase the contrast of WBCs and remove the noise. The proposed methods are divided into three types, and each type contains more than one algorithm. The first proposed method includes neural network algorithms (ANN and FFNN) based on the hybrid features extracted using the LBP and GLCM algorithms. These two algorithms achieved promising results for dataset diagnosis. In the second proposed method, the dataset was diagnosed using four pre-trained CNN models, AlexNet, ResNet-50, GoogLeNet, and ResNet-18, with transfer learning used to extract deep features. All models achieved superior results for the early detection of WBC diseases. In the third proposed method, the dataset was diagnosed using hybrid techniques that combine CNN models with SVM, including AlexNet with SVM, ResNet-50 with SVM, GoogLeNet with SVM, and ResNet-18 with SVM. These techniques consist of two blocks: in the first block, CNN models extract the feature maps; in the second block, the SVM classifies the deep features. For the detection of specific WBC diseases, all hybrid techniques produced superior results.

In future work, this dataset will be classified together with another dataset using hybrid systems based on fused CNN features. Handcrafted features will be combined with the CNN features to achieve superior accuracy when classifying WBC types.

Author Contributions

Conceptualization, B.A.M., E.M.S., Z.G.A.-M. and T.S.A.; methodology, T.S.A., E.M.S., Z.G.A.-M., A.A. (Abdullah Albaker) and B.A.M.; software, E.M.S., A.A. (Abdullah Albaker) and A.A. (Abdullah Alshammari); validation, E.M.S., B.A.M., A.A. (Abdullah Alshammari), M.A. and Z.G.A.-M.; formal analysis, E.M.S., A.A. (Abdullah Albaker) and A.A. (Abdullah Alshammari); investigation, T.S.A. and G.A.; resources, E.M.S., G.A., A.A. (Abdullah Albaker) and M.A.; data curation, E.M.S. and G.A.; writing—original draft preparation, E.M.S.; writing—review and editing, T.S.A., G.A., M.A. and B.A.M.; visualization, E.M.S., B.A.M. and Z.G.A.-M.; supervision, B.A.M. and Z.G.A.-M.; project administration, T.S.A., B.A.M., M.A. and Z.G.A.-M.; funding acquisition, T.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

The Scientific Research Deanship at the University of Ha’il in Saudi Arabia provided funding for this study under project number BA-2031.

Data Availability Statement

In this study, the dataset of microscopic blood samples for WBC diagnosis used to obtain the findings was collected from this link: https://www.kaggle.com/paultimothymooney/blood-cells/version/6 (accessed on 20 May 2022).

Acknowledgments

For funding this study, we acknowledge the Scientific Research Deanship at the University of Ha’il in Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

References

Kuan, D.H.; Wu, C.C.; Su, W.Y.; Huang, N.T. A Microfluidic Device for Simultaneous Extraction of Plasma, Red Blood Cells, and On-Chip White Blood Cell Trapping. Sci. Rep. 2018, 8, 15345.
Al-Hafiz, F.; Al-Megren, S.; Kurdi, H. Red blood cell segmentation by thresholding and Canny detector. Procedia Comput. Sci. 2018, 141, 327–334.
van der Meijden, P.E.J.; Heemskerk, J.W.M. Platelet biology and functions: New concepts and clinical perspectives. Nat. Rev. Cardiol. 2018, 16, 166–179.
Rahadi, I.; Choodoung, M.; Choodoung, A. Red blood cells and white blood cells detection by image processing. J. Phys. Conf. Ser. 2020, 1539, 012025.
Veronelli, A.; Laneri, M.; Ranieri, R.; Koprivec, D.; Vardaro, D.; Paganelli, M.; Folli, F.; Pontiroli, A.E. White Blood Cells in Obesity and Diabetes Effects of weight loss and normalization of glucose metabolism. Diabetes Care 2004, 27, 2501–2502.
Shah, S.S.; Shofer, F.S.; Seidel, J.S.; Baren, J.M. Significance of extreme leukocytosis in the evaluation of febrile children. Pediatr. Infect. Dis. J. 2005, 24, 627–630.
Nohynek, H.; Valkeila, E.; Leinonen, M.; Juhani, E. Erythrocyte sedimentation rate, white blood cell count and serum C-reactive protein in assessing etiologic diagnosis of acute lower respiratory infections in children. Pediatr. Infect. Dis. J. 1995, 14, 484–490.
Tsukahara, T.; Yaguchi, A.; Horiuchi, Y. Significance of Monocytosis in Varicella and Herpes Zoster. J. Dermatol. 1992, 19, 94–98.
Kovalszki, A.; Weller, P.F. Eosinophilia. Prim Care 2016, 43, 607.
Tomari, R.; Zakaria, W.N.W.; Jamil, M.M.A.; Nor, F.M.; Fuad, N.F.N. Computer Aided System for Red Blood Cell Classification in Blood Smear Image. Procedia Comput. Sci. 2014, 42, 206–213.
Reta, C.; Altamirano, L.; Gonzalez, J.A.; Diaz-Hernandez, R.; Peregrina, H.; Olmos, I.; Alonso, J.E.; Lobato, R. Segmentation and Classification of Bone Marrow Cells Images Using Contextual Information for Medical Diagnosis of Acute Leukemias. PLoS ONE 2015, 10, e0130805.
Hegde, R.B.; Prasad, K.; Hebbar, H.; Singh, B.M.K. Development of a robust algorithm for detection of nuclei of white blood cells in peripheral blood smear images. Multimed. Tools Appl. 2019, 78, 17879–17898.
Arslan, S.; Ozyurek, E.; Gunduz-Demir, C. A color and shape based algorithm for segmentation of white blood cells in peripheral blood and bone marrow images. Cytom. Part A 2014, 85, 480–490.
Hiremath, P.S.; Bannigidad, P.; Geeta, S. Automated identification and classification of white blood cells (leukocytes) in digital microscopic images. IJCA 2010, 2, 59–63.
Rezatofighi, S.H.; Soltanian-Zadeh, H. Automatic recognition of five types of white blood cells in peripheral blood. Comput. Med. Imaging Graph. 2011, 35, 333–343.
Huang, D.C.; Hung, K.D.; Chan, Y.K. A computer assisted method for leukocyte nucleus segmentation and recognition in blood smear images. J. Syst. Softw. 2012, 85, 2104–2118.
Nazlibilek, S.; Karacor, D.; Ercan, T.; Sazli, M.H.; Kalender, O.; Ege, Y. Automatic segmentation, counting, size determination and classification of white blood cells. Measurement 2014, 55, 58–65.
Livieris, I.E.; Pintelas, E.; Kanavos, A.; Pintelas, P. Identification of blood cell subtypes from images using an improved SSL algorithm. Biomed. J. Sci. Tech. Res. 2018, 9, 6923–6929.
Lippeveld, M.; Knill, C.; Ladlow, E.; Fuller, A.; Michaelis, L.J.; Saeys, Y.; Filby, A.; Peralta, D. Classification of Human White Blood Cells Using Machine Learning for Stain-Free Imaging Flow Cytometry. Cytom. Part A 2020, 97, 308–319.
Wang, J.L.; Li, A.Y.; Huang, M.; Ibrahim, A.K.; Zhuang, H.; Ali, A.M. Classification of White Blood Cells with PatternNet-fused Ensemble of Convolutional Neural Networks (PECNN). In Proceedings of the 2018 IEEE International Symposium on Signal Processing and Information Technology, ISSPIT 2018, Louisville, KY, USA, 6–8 December 2018; IEEE: New York, NY, USA, 2019; pp. 325–330.
Ahmed, N.; Yigit, A.; Isik, Z.; Alpkocak, A. Identification of Leukemia Subtypes from Microscopic Images Using Convolutional Neural Network. Diagnostics 2019, 9, 104.
Goutam, D.; Sailaja, S. Classification of acute myelogenous leukemia in blood microscopic images using supervised classifier. In Proceedings of the 2015 IEEE International Conference on Engineering and Technology, ICETECH 2015, Coimbatore, India, 20 March 2015; IEEE: New York, NY, USA, 2015.
Rawat, J.; Singh, A.; HS, B.; Virmani, J.; Devgun, J.S. Computer assisted classification framework for prediction of acute lymphoblastic and acute myeloblastic leukemia. Biocybern. Biomed. Eng. 2017, 37, 637–654.
Rawat, J.; Singh, A.; Bhadauria, H.S.; Virmani, J. Computer Aided Diagnostic System for Detection of Leukemia Using Microscopic Images. Procedia Comput. Sci. 2015, 70, 748–756.
Amin, M.M.; Kermani, S.; Talebi, A.; Oghli, M.G. Recognition of Acute Lymphoblastic Leukemia Cells in Microscopic Images Using K-Means Clustering and Support Vector Machine Classifier. J. Med. Signals Sens. 2015, 5, 49.
Vaghela, H.P.; Modi, H.; Pandya, M.; Potdar, M.B. Leukemia Detection using Digital Image Processing Techniques. Int. J. Appl. Inf. Syst. 2015, 10.
Blood Cell Images | Kaggle. Available online: https://www.kaggle.com/paultimothymooney/blood-cells/version/6 (accessed on 1 January 2022).
Senan, E.M.; Abunadi, I.; Jadhav, M.E.; Fati, S.M. Score and Correlation Coefficient-Based Feature Selection for Predicting Heart Failure Diagnosis by Using Machine Learning Algorithms. Comput. Math. Methods Med. 2021, 2021, 8500314.
Tsuchimoto, S.; Shibusawa, S.; Iwama, S.; Hayashi, M.; Okuyama, K.; Mizuguchi, N.; Kato, K.; Ushiba, J. Use of common average reference and large-Laplacian spatial-filters enhances EEG signal-to-noise ratios in intrinsic sensorimotor activity. J. Neurosci. Methods 2021, 353, 109089.
Abunadi, I.; Senan, E.M. Deep Learning and Machine Learning Techniques of Diagnosis Dermoscopy Images for Early Detection of Skin Diseases. Electronics 2021, 10, 3158.
Mishra, S.; Majhi, B.; Sa, P.K.; Sharma, L. Gray level co-occurrence matrix and random forest based acute lymphoblastic leukemia detection. Biomed. Signal Process. Control. 2017, 33, 272–280.
Senan, E.M.; Jadhav, M.E. Techniques for the Detection of Skin Lesions in PH² Dermoscopy Images Using Local Binary Pattern (LBP). In Communications in Computer and Information Science, Proceedings of the Third International Conference Recent Trends in Image Processing and Pattern Recognition, RTIP2R 2020, Aurangabad, India, 3–4 January 2020; Santosh, K.C., Gawali, B., Eds.; Springer: Singapore, 2021; Volume 1381, pp. 14–25.
Roshni, T.; Jha, M.K.; Drisya, J. Neural network modeling for groundwater-level forecasting in coastal aquifers. Neural Comput. Appl. 2020, 32, 12737–12754.
Senan, E.M.; Alzahrani, A.; Alzahrani, M.Y.; Alsharif, N.; Aldhyani, T.H.H. Automated Diagnosis of Chest X-Ray for Early Detection of COVID-19 Disease. Comput. Math. Methods Med. 2021, 2021, 6919483.
Senan, E.M.; Alsaade, F.W.; Al-Mashhadani, M.I.A.; Aldhyani, T.H.H.; Al-Adhaileh, M.H. Classification of Histopathological Images for Early Detection of Breast Cancer Using Deep Learning. J. Appl. Sci. Eng. 2021, 24, 323–329.
Ouyang, N.; Wang, W.; Ma, L.; Wang, Y.; Chen, Q.; Yang, S.; Xie, J.; Su, S.; Cheng, Y.; Cheng, Q.; et al. Diagnosing acute promyelocytic leukemia by using convolutional neural network. Clin. Chim. Acta 2021, 512, 1–6.
Li, C.; Zhang, H.; Wu, P.; Yin, Y.; Liu, S. A complex junction recognition method based on GoogLeNet model. Trans. GIS 2020, 24, 1756–1778.
Al-Adhaileh, M.H.; Senan, E.M.; Alsaade, F.W.; Aldhyani, T.H.H.; Alsharif, N.; Alqarni, A.A.; Uddin, M.I.; Alzahrani, M.Y.; Alzain, E.D.; Jadhav, M.E. Deep Learning Algorithms for Detection and Classification of Gastrointestinal Diseases. Complexity 2021, 2021, 6170416.
Mohammed, B.A.; Senan, E.M.; Rassem, T.H.; Makbol, N.M.; Alanazi, A.A.; Al-Mekhlafi, Z.G.; Almurayziq, T.S.; Ghaleb, F.A. Multi-Method Analysis of Medical Records and MRI Images for Early Diagnosis of Dementia and Alzheimer’s Disease Based on Deep Learning and Hybrid Methods. Electronics 2021, 10, 2860.
Senan, E.M.; Jadhav, M.E.; Kadam, A. Classification of PH2 Images for Early Detection of Skin Diseases. In Proceedings of the 2021 6th International Conference for Convergence in Technology, Maharashtra, India, 2–4 April 2021.
Ghaleb Al-Mekhlafi, Z.; Mohammed Senan, E.; Sulaiman Alshudukhi, J.; Abdulkarem Mohammed, B. Hybrid Techniques for Diagnosing Endoscopy Images for Early Detection of Gastrointestinal Disease Based on Fusion Features. Int. J. Intell. Syst. 2023, 2023, 8616939.
Mohammed, B.A.; Al-Mekhlafi, Z.G. Optimized Stacking Ensemble Model to Detect Phishing Websites. In Communications in Computer and Information Science, Proceedings of the Third International Conference, Advances in Cyber Security, ACeS 2021, Penang, Malaysia, 24–25 August 2021; Abdullah, N., Manickam, S., Anbar, M., Eds.; Springer: Singapore, 2021; Volume 1487, pp. 379–388.
Olayah, F.; Senan, E.M.; Ahmed, I.A.; Awaji, B. AI Techniques of Dermoscopy Image Analysis for the Early Detection of Skin Lesions Based on Combined CNN Features. Diagnostics 2023, 13, 1314.
Al-Mekhlafi, Z.G.; Mohammed, B.A. Using Genetic Algorithms to Optimized Stacking Ensemble Model for Phishing Websites Detection. In Communications in Computer and Information Science, Proceedings of the Third International Conference, Advances in Cyber Security, ACeS 2021, Penang, Malaysia, 24–25 August 2021; Abdullah, N., Manickam, S., Anbar, M., Eds.; Springer: Singapore, 2021; Volume 1487, pp. 447–456.
Wu, W.; Liao, S.; Lu, Z. White Blood Cells Image Classification Based on Radiomics and Deep Learning. IEEE Access 2022, 10, 124036–124052.
Kadry, S.; Rajinikanth, V.; Taniar, D.; Damaševičius, R.; Valencia, X.P.B. Automated segmentation of leukocyte from hematological images—A study using various CNN schemes. J. Supercomput. 2022, 78, 6974–6994.
Baydilli, Y.Y.; Atila, U.; Elen, A. Learn from one data set to classify all–A multi-target domain adaptation approach for white blood cell classification. Comput. Methods Programs Biomed. 2020, 196, 105645.
Kutlu, H.; Avci, E.; Özyurt, F. White blood cells detection and classification based on regional convolutional neural networks. Med. Hypotheses 2020, 135, 109472.
Sahlol, A.T.; Kollmannsberger, P.; Ewees, A.A. Efficient classification of white blood cell leukemia with improved swarm optimization of deep features. Sci. Rep. 2020, 10, 2536.
Chola, C.; Muaad, A.Y.; Bin Heyat, M.B.; Benifa, J.V.B.; Naji, W.R.; Hemachandran, K.; Mahmoud, N.F.; Samee, N.A.; Al-Antari, M.A.; Kadah, Y.M.; et al. BCNet: A Deep Learning Computer-Aided Diagnosis Framework for Human Peripheral Blood Cell Identification. Diagnostics 2022, 12, 2815.
Bayat, N.; Davey, D.D.; Coathup, M.; Park, J.-H. White Blood Cell Classification Using Multi-Attention Data Augmentation and Regularization. Big Data Cogn. Comput. 2022, 6, 122.
Baydilli, Y.Y.; Atila, Ü. Classification of white blood cells using capsule networks. Comput. Med. Imaging Graph. 2020, 80, 101699.
Hegde, R.B.; Prasad, K.; Hebbar, H.; Singh, B.M.K. Comparison of traditional image processing and deep learning approaches for classification of white blood cells in peripheral blood smear images. Biocybern. Biomed. Eng. 2019, 39, 382–392.
Cheuque, C.; Querales, M.; León, R.; Salas, R.; Torres, R. An Efficient Multi-Level Convolutional Neural Network Approach for White Blood Cells Classification. Diagnostics 2022, 12, 248.

Figure 1. Morphology for diagnosing the WBC sample datasets.

Figure 2. Samples in the dataset.

Figure 3. Sample dataset after the optimization process.

Figure 4. Samples of the dataset after segmentation.

Figure 5. Some images in the dataset that were produced using the morphological algorithm.

Figure 6. The basic structure of ANN and FFNN.

Figure 7. Hybrid method using CNN models and SVM. (a) AlexNet with SVM. (b) ResNet-50 with SVM. (c) GoogLeNet with SVM. (d) ResNet-18 with SVM.

Figure 8. Training of the WBC dataset. (a) ANN algorithm and (b) FFNN algorithm.

Figure 9. Performance diagram showing the training of the microscopic blood sample dataset. (a) ANN algorithm and (b) FFNN algorithm.

Figure 10. Gradient and validation measurement for the microscopic blood sample dataset produced using the (a) ANN algorithm and (b) FFNN algorithm.

Figure 11. Display of the regression values of the microscopic blood sample dataset calculated using the FFNN algorithm.

Figure 12. The ROC values for the microscopic blood samples dataset calculated using the ANN algorithm.

Figure 13. Confusion matrix of microscopic blood sample dataset produced using the (a) ANN algorithm and (b) FFNN algorithm.

Figure 14. Display of the training process for the dataset using the ResNet-50 model.

Figure 15. Display showing the performance of the CNN models.

Figure 16. Confusion matrix display for the white blood cell type dataset using (a) AlexNet, (b) ResNet-50, (c) GoogLeNet, and (d) ResNet-18.

Figure 17. Display showing the ROC for the evaluation of the white blood cell dataset using (a) AlexNet, (b) ResNet-50, (c) GoogLeNet, and (d) ResNet-18.

Figure 18. Display showing the performance of the hybrid CNN models and SVM.

Figure 19. Confusion matrix display showing the white blood cell type dataset calculated using (a) AlexNet + SVM, (b) ResNet-50 + SVM, (c) GoogLeNet + SVM, and (d) ResNet-18 + SVM.

Figure 20. A comparison between the proposed systems for detecting WBC diseases.

Table 1. Splitting of the WBC dataset.

Dataset	ALL_IDB1
Phase	Training with validation (80%, split 80:20)		Testing (20%)
Classes	Training (80%)	Validation (20%)	Testing (20%)
Eosinophils	2005	501	627
Lymphocytes	1989	497	622
Monocytes	1981	495	619
Neutrophils	2030	507	634
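
The counts in Table 1 can be reproduced with a short calculation. The following is a minimal sketch, not the authors' code: the per-class totals are obtained by summing each row of Table 1, and the rounding rule (round the 20% test share first, then round 20% of the remainder for validation) is an assumption that happens to match the published figures. The names `CLASS_TOTALS` and `split_counts` are illustrative.

```python
# Per-class image totals, obtained by summing each row of Table 1
# (training + validation + testing).
CLASS_TOTALS = {
    "Eosinophils": 3133,
    "Lymphocytes": 3108,
    "Monocytes": 3095,
    "Neutrophils": 3171,
}


def split_counts(total, test_pct=20, val_pct=20):
    """Return (train, validation, test) counts for one class.

    Assumed rule: hold out test_pct of all images for testing, then
    hold out val_pct of the remaining images for validation.
    """
    test = round(total * test_pct / 100)
    remaining = total - test
    validation = round(remaining * val_pct / 100)
    train = remaining - validation
    return train, validation, test


splits = {name: split_counts(n) for name, n in CLASS_TOTALS.items()}

for name, (train, validation, test) in splits.items():
    print(f"{name}\t{train}\t{validation}\t{test}")
```

Under these assumptions the printed rows agree with the Training, Validation, and Testing columns of Table 1.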

Table 2. The performance of ANN and FFNN used on the microscopic blood samples dataset.

Measure	ANN	FFNN
Accuracy %	91.6	93.6
Precision %	91.58	93.66
Sensitivity %	97.08	97.79
Specificity %	92.03	93.92
AUC %	94.38	95.77

Table 3. Training parameter options for the CNN models.

Options	AlexNet	ResNet-50	GoogLeNet	ResNet-18
Training options	adam	adam	adam	adam
Mini-batch size	20	10	20	15
Max. epochs	10	5	4	3
Initial learn rate	0.0001	0.0001	0.0003	0.0001
Validation frequency	50	5	3	5
Training time	43 min 32 s	2117 min 21 s	515 min 16 s	582 min 18 s
Execution environment	GPU	GPU	GPU	GPU

Table 4. The results of the CNN models used on the microscopic blood sample dataset.

Measure	AlexNet	ResNet-50	GoogLeNet	ResNet-18
Accuracy %	98	99.3	99	99.1
Precision %	98.4	99.5	99.2	99
Sensitivity %	98.2	99.25	99.34	99.21
Specificity %	99.25	99.75	99.5	99.75
AUC %	99.74	99.99	99.95	99.98
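
Measures such as those in Table 4 are conventionally derived from a multi-class confusion matrix by treating each class one-vs-rest. The sketch below uses the standard definitions with macro (unweighted) averaging over classes; the averaging scheme is an assumption, since this section does not state which one the authors used, and the matrix `EXAMPLE_CM` is hypothetical data chosen only for illustration.

```python
def macro_metrics(cm):
    """Compute accuracy and macro-averaged precision, sensitivity, specificity.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class is predicted at least once (no zero denominators).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precision = sensitivity = specificity = 0.0
    for k in range(n):
        tp = cm[k][k]                                # class k predicted as k
        fn = sum(cm[k]) - tp                         # class k predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp    # other predicted as k
        tn = total - tp - fn - fp                    # everything else
        precision += tp / (tp + fp)
        sensitivity += tp / (tp + fn)
        specificity += tn / (tn + fp)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return {
        "accuracy": accuracy,
        "precision": precision / n,
        "sensitivity": sensitivity / n,
        "specificity": specificity / n,
    }


# Hypothetical 4-class confusion matrix (not taken from the paper).
EXAMPLE_CM = [
    [9, 0, 0, 1],
    [0, 10, 0, 0],
    [0, 0, 10, 0],
    [2, 0, 0, 8],
]

metrics = macro_metrics(EXAMPLE_CM)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f} %")
```

AUC is not covered here because it requires the per-class scores of the classifier rather than the confusion matrix alone.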

Table 5. The results of the hybrid CNN models with SVM used on the microscopic blood sample dataset.

Measure	AlexNet + SVM	ResNet-50 + SVM	GoogLeNet + SVM	ResNet-18 + SVM
Accuracy %	93.5	93.1	94.4	93.8
Precision %	93.25	93.25	94.25	94
Sensitivity %	93.5	93	94.5	94
Specificity %	97.75	97.7	98.25	98
AUC %	98.25	99.21	99.52	99.34

Table 6. The accuracy reached using all proposed systems in the diagnosis of each disease.

Diseases	Neural Networks		CNN				Hybrid
Diseases	ANN	FFNN	AlexNet	ResNet-50	GoogLeNet	ResNet-18	AlexNet with SVM	ResNet-50 with SVM	GoogLeNet with SVM	ResNet-18 with SVM
Eosinophils	91.7	93.2	94.9	97.7	98.4	97.3	88.8	89.3	89.8	87.9
Lymphocytes	90.4	90.1	99.5	100	100	99.8	99.2	97.1	98.4	99.2
Monocytes	93.7	96	100	100	100	100	97.9	96.2	99	98.7
Neutrophils	90.5	95.3	97.5	99.4	97.8	99.4	88.2	87.9	90.7	89.6
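
The per-class values in Table 6 can be related to the overall accuracies in Table 4 by a support-weighted average. The following sketch assumes that each per-class value is the fraction of that class's test images classified correctly and that the test counts are those in Table 1; neither assumption is stated in this section. Only the four CNN columns are checked, and the names `TEST_COUNTS`, `PER_CLASS`, and `overall_accuracy` are illustrative.

```python
# Test-set sizes per class, taken from the testing column of Table 1
# (order: Eosinophils, Lymphocytes, Monocytes, Neutrophils).
TEST_COUNTS = [627, 622, 619, 634]

# Per-class accuracy (%) of the CNN models, copied from Table 6.
PER_CLASS = {
    "AlexNet": [94.9, 99.5, 100.0, 97.5],
    "ResNet-50": [97.7, 100.0, 100.0, 99.4],
    "GoogLeNet": [98.4, 100.0, 100.0, 97.8],
    "ResNet-18": [97.3, 99.8, 100.0, 99.4],
}


def overall_accuracy(per_class, counts):
    """Weight each class accuracy by its number of test images."""
    weighted = sum(acc * n for acc, n in zip(per_class, counts))
    return weighted / sum(counts)


overall = {
    model: overall_accuracy(values, TEST_COUNTS)
    for model, values in PER_CLASS.items()
}

for model, value in overall.items():
    print(f"{model}: {value:.1f} %")
```

Rounded to one decimal place, the weighted averages agree with the accuracy row of Table 4 (98, 99.3, 99, and 99.1).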

Table 7. Results from the implementation of the proposed methods compared to those of relevant studies.

Previous Research	Accuracy %	Sensitivity %	Specificity %	Precision %	AUC %
Wu et al. [45]	98.8	98.23	98.83	98.68	99.9
Kadry et al. [46]	97.44	98.94	98.42	-	-
Baydilli et al. [47]	98.08	95.35	98.8	95.35	-
Kutlu et al. [48]	97.21	89.6	96.06	-	-
Sahlol et al. [49]	96.11	93	95	-	-
Chola et al. [50]	97.67	96.51	98.83	-	-
Bayat et al. [51]	98.99	97.99	-	-	-
Baydilli et al. [52]	96.9	92.5	-	-	-
Liang et al. [53]	95.4	96.9	-	-	-
Cheuque et al. [54]	98.4	98.4	-	-	-
Proposed model	99.3	99.25	99.75	99.3	99.99

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
