Article

Efficient Gastrointestinal Disease Classification Using Pretrained Deep Convolutional Neural Network

1 Department of Computer Science, HITEC University, Taxila 47080, Pakistan
2 Department of Software Engineering, Foundation University Islamabad, Islamabad 44000, Pakistan
3 Department of Software, Sejong University, Seoul 05006, Republic of Korea
4 Department of Computer Engineering, HITEC University, Taxila 47080, Pakistan
* Author to whom correspondence should be addressed.
Electronics 2023, 12(7), 1557; https://doi.org/10.3390/electronics12071557
Submission received: 2 February 2023 / Revised: 21 March 2023 / Accepted: 23 March 2023 / Published: 26 March 2023
(This article belongs to the Special Issue Medical Image Processing Using AI)

Abstract

Gastrointestinal (GI) tract diseases are on the rise in the world. These diseases can have fatal consequences if not diagnosed in the initial stages. Wireless capsule endoscopy (WCE) is the advanced technology used to inspect gastrointestinal diseases such as ulcerative colitis, polyps, esophagitis, and ulcers. WCE produces thousands of frames for a single patient’s procedure, for which manual examination is tiresome, time-consuming, and prone to error; therefore, an automated procedure is needed. WCE images suffer from low contrast, which increases inter-class and intra-class similarity and reduces the anticipated performance. In this paper, an efficient GI tract disease classification technique is proposed which utilizes an optimized brightness-controlled contrast-enhancement method to improve the contrast of the WCE images. The proposed technique applies a genetic algorithm (GA) for adjusting the values of contrast and brightness within an image by modifying the fitness function, which improves the overall quality of WCE images. This quality improvement is reported using quantitative measures, such as peak signal to noise ratio (PSNR), mean square error (MSE), visual information fidelity (VIF), similarity index (SI), and image quality index (IQI). As a second step, data augmentation is performed on the WCE images by applying multiple transformations, and then, transfer learning is used to fine-tune a modified pre-trained model on the WCE images. Finally, for the classification of GI tract disease, the extracted features are passed through multiple machine-learning classifiers. To show the efficacy of the proposed technique in the improvement in classification performance, the results are reported for the original dataset as well as the contrast-enhanced dataset. The results show an overall improvement of 15.26% in accuracy, 13.3% in precision, 16.77% in recall rate, and 15.18% in F-measure. Finally, a comparison with the existing techniques shows that the proposed framework outperforms the state-of-the-art techniques.

1. Introduction

Gastrointestinal (GI) tract diseases affect the digestive system, and medical imaging is a key component of their diagnosis. The huge volume of image data is difficult for radiologists and medical professionals to process, which makes the assessment susceptible to error [1]. The most prevalent digestive system disorders include ulcerative colitis, esophagitis, ulcers, and polyps that can develop into colorectal cancer. Colorectal cancer is one of the leading causes of death worldwide [2].
According to a survey of the disease, 11% of women and 26% of men worldwide have been diagnosed with colorectal cancer [3]. In the US, 338,090 new cases of colorectal cancer were detected in 2021, with a 44% increase in fatalities [4]. Each year, 0.7 million new cases are recorded globally [5]. Given the high death rate, early diagnosis is critically important, yet it remains exceedingly challenging. The development of ulcers in the GI tract is a serious condition that goes hand in hand with GI malignancy. As reported in [6], the highest annual prevalence of ulcers was recorded in Spain, at 141.9 per 100,000 individuals, and the lowest in Sweden, at 57.75 per 100,000.
Many lesions are overlooked during a typical endoscopic examination because of the presence of feces and the organs’ complex architecture. The rate of missed polyps is very high, ranging from 21.4% to 26.8% [7], even when the bowel is cleansed to facilitate the diagnosis of cancer or its precursor lesions. Furthermore, it can be difficult to identify lesions because of their similarities between classes. A relatively new procedure called wireless capsule endoscopy (WCE) [8] allows medical professionals to view the stomach, which was previously exceedingly difficult to access via standard endoscopy. In WCE, patients ingest a camera-encased capsule that records several images as it travels through the GI system. Experienced gastroenterologists stitch these images together to create a film, which is then analyzed to look for deformations. However, this approach has some drawbacks, including its time requirements and a dearth of expertise. The primary issue with this procedure is the time commitment required for a manual diagnostic. Additionally, the poor contrast in WCE images makes it difficult to see an ulcer properly [9]. As a result, there is a possibility that a doctor will overlook the ulcer throughout the detection phase. Another issue that arises during diagnosis with the naked eye has to do with the similarity of color, texture, and shape variations [10,11].
Different techniques for the diagnosis of colorectal cancer and its precursor lesions utilizing WCE images have been developed by numerous researchers [4,11]. These techniques share some basic steps, such as contrast enhancement and noise removal from the image, followed by segmentation of the diseased area within the image, extraction of important features, and finally, classification into a specified class. Contrast enhancement is an integral part of a computerized process. The primary goal of this phase is to increase an infected region’s intensity range to improve accuracy and retrieve pertinent features [12]. The resultant images are then passed into the segmentation phase for the detection of disease and subsequently for feature extraction, but this step faces several difficulties (such as the alteration in the topology of the infected lesion and the similarity in color between the healthy and infected parts) that lower the segmentation accuracy. In turn, a disease can be incorrectly assigned to an inappropriate class as a result of the decrease in segmentation accuracy.
Recently, deep convolutional neural networks (CNNs) have shown increasing performance in the identification and classification of medical conditions [13,14], outperforming traditional machine learning models. These techniques extract features using pre-trained CNN models and then optimize the extracted features. Due to memory and time constraints, the pre-trained CNN models are trained via transfer learning (TL). Most researchers focus on using complex CNN models, fusing multiple models, and optimizing the features to achieve better accuracy. As WCE images are subject to several challenges and limitations, image improvement is an area upon which focus is needed. The quality of WCE images is not good because of volume and power limitations; as a result, WCE images exhibit weak contrast [15]. Furthermore, the great similarity between normal and abnormal frames further complicates the process of disease classification [16]. To address these difficulties, we present an automated framework for the classification of gastrointestinal diseases using a novel contrast-enhancement and deep transfer learning method.
The main contributions of this paper are summarized as follows:
  • An optimized brightness-controlled contrast-enhancement methodology based on the genetic algorithm is proposed;
  • A lightweight pre-trained deep CNN model is deployed for significant feature extraction and classification;
  • An analysis to quantify the impact of improvement of the proposed technique in terms of PSNR, MSE, VIF, SI, and IQI was performed;
  • The significance of the proposed technique for the improvement in the classification of GI diseases was evaluated based on various performance metrics, such as accuracy, precision, recall, and F-measure.
The remainder of this paper is structured as follows: the literature review is presented in Section 2, followed by the details of the methodology in Section 3. The experimental results are detailed in Section 4, along with the discussion and analysis of the results. Finally, Section 5 concludes this work.

2. Literature Review

The use of medical imaging to identify diseases has gained popularity in recent years, particularly in the field of the gastrointestinal system. The classification of digestive illnesses is another active field of research. Although traditional machine learning (ML) algorithms have demonstrated impressive performance in the literature [15,16], CNN-based approaches outperform ML techniques and produce superior results [17].
Several research works [18,19,20,21,22,23,24] have examined the detection of abnormal frames in capsule endoscopy images, including the detection of tumors, Crohn’s disease, polyps, hemorrhages, ulcers, lymphangiectasia, and other intestinal lesions. Existing techniques often start with feature extraction and then apply a detection technique. The detection techniques either use multi-label approaches to find and classify various types of abnormality [19] or distinguish between frames with a lesion and normal ones [18]. These methods frequently extract features from the images’ texture and morphological analysis, statistical feature analysis, and color descriptors. These techniques use either region-based [20,21] or pixel-based [18,19] methodologies. Artificial neural networks (ANNs) and support vector machines (SVMs) are the two commonly utilized classification techniques in the literature [22,23,24]. Many other studies related to gastrointestinal diseases, such as polyp, colon, and capsule endoscopy studies, have been conducted in the literature [25,26,27,28,29,30].
DL models such as ResNet-50, VGG16, and Inception-V3 were used by Lee et al. [31] to categorize normal and ulcer GI images; ResNet-50 outperformed the other deep networks in this technique. A saliency-based strategy was presented by Khan et al. [32] to segment GI diseases, whereas a DL architecture was employed for classification. They combined a YIQ color space with an HSI color space, which was then fed into a contour-based segmentation method. An automated technique for identifying an ulcer from WCE frames was introduced by Yuan et al. [33]. To begin with, a saliency technique based on superpixels is used to define the boundaries of the ulcer zone. The level-by-level texture and color properties are then computed and combined to produce the final saliency map. Then, the saliency max-pooling (SMP) and locality-constrained linear coding (LLC) techniques are combined to achieve a recognition rate of 92.65%. With a dataset of 854 images, the authors of [34] proposed a VGGNet model, built on a convolutional neural network (CNN), to accurately identify gastrointestinal ulcers, although these tests used images from standard endoscopy. The dataset used in [35] comprised 5360 WCE images with ulcers and erosions and only 440 normal images; the authors created a model based on a CNN, whose detection accuracy was 90.8%. An attention-based DL architecture for classifying and localizing stomach diseases from WCE images was presented by Jain et al. [36]. They started by effectively classifying stomach illnesses using a CNN; later, for the localization of contaminated areas, they combined Grad-CAM++ and a unique SegNet. The proposed technique was tested on a KID dataset and showed enhanced accuracy. For WCE video summarization, Lan et al. [37] developed a combination of unsupervised DL techniques, employing several networks, including an LSTM and an autoencoder. This research’s major goal was to assist medical professionals in their examination of complete WCE videos. In [38], a gastrointestinal disease classification framework is proposed in which deep features are selected and fused from two deep models, i.e., ResNet-50 and ResNet-152; the features were subsequently optimized, achieving 96.43% classification performance. In another work [25], alimentary diseases such as Barrett’s esophagus, polyps, and esophagitis were classified by applying a discrete wavelet transform and a CNN. This framework achieved 96.65% accuracy on the Hyper-Kvasir dataset.
The literature review reveals that GI tract disease detection and classification are actively researched. Though the existing work demonstrates reasonable accuracy, there is still an opportunity for performance enhancement. Existing approaches suffer from issues such as unsuitable evaluation criteria, evaluation on a dataset with too few images, or a focus on a single disease. We would also like to emphasize that although accuracy is a crucial performance parameter, it is less significant for multi-class classification problems than other performance metrics, particularly for unbalanced datasets. In this domain, f-measure, precision, and recall rate are important performance indicators, especially when human life is at risk. In this work, a novel technique is proposed that can accurately classify various GI tract diseases for a publicly available dataset. We report results for various training–testing ratios, as well as for 10-fold cross-validation, and not only for accuracy, but also for precision, recall, and f-measure.

3. Methodology

Numerous illnesses can affect the human GI tract, including colorectal cancer and its precursor polyps, as well as esophagitis and ulcerative colitis. WCE images are essential for the diagnosis of these disorders, and DL techniques are effective for diagnosing them. As a result, in this research, we have created a deep learning-based model for the multi-class classification of GI tract diseases. Our major objective is to present an optimized brightness-controlled contrast-enhancement approach for WCE images, as well as a TL approach for classification.
The major steps of our proposed methodology are shown in Figure 1. Initially, publicly available datasets, namely Kvasir V-2 and Hyper-Kvasir, were collected, and then the contrast of the original images was enhanced using an optimized brightness-controlled contrast-enhancement approach. After that, data augmentation was performed on the images by applying rotation with different angles, shifting, horizontal flip, vertical flip, and zoom-in and zoom-out. Subsequently, a pre-trained DL model was modified and fine-tuned on the WCE images dataset using the TL approach. The features of the model were extracted, and finally, multiple machine learning classifiers were applied for the classification of GI tract diseases.

3.1. Dataset Collection and Preparation

For experimentation, we used publicly available datasets such as Kvasir V-2 [34] and Hyper-Kvasir [35]. The pathological findings in the Kvasir-V2 dataset have 1000 images for each class, such as polyp, esophagitis, ulcerative-colitis, and normal, and are used along with the ulcer class of the Hyper-Kvasir dataset, which has 854 images. Hence, our final dataset used for classification contains a total of 4854 images, with 5 classes, which are polyp, ulcer, esophagitis, ulcerative colitis, and normal. Samples of our dataset are shown in Figure 2.
The Kvasir dataset was collected in Norway at the Vestre Viken Health Trust (VV). Endoscopic equipment, such as a capsule containing a camera, is used to collect the data. Four hospitals make up the VV, which offers medical care to 470,000 people. A sizable gastroenterology department at one of these hospitals, the Baerum Hospital, has provided training data and will expand the dataset in the future. Additionally, one or more medical professionals from VV and the Cancer Registry of Norway thoroughly annotated the images.

3.2. Contrast Enhancement

Let Ǖ be a dataset that contains the endoscopic images. Assume Ɋ(i, j) is an endoscopic image of dimension P × L × Q, where P = L = 256 and Q = 3. The value Q = 3 specifies that the image is colored and contains three channels, i.e., RGB (red, green, and blue). Initially, the image is divided into red, green, and blue channels, and CLAHE-RGB, as well as CLAHE-HSV, is applied to each channel separately. First, the results of CLAHE-RGB are normalized using Equation (1).
R°, G°, B° = R_Cl / (R_Cl + G_Cl + B_Cl), G_Cl / (R_Cl + G_Cl + B_Cl), B_Cl / (R_Cl + G_Cl + B_Cl)   (1)
The conversion between the RGB and HSV representations requires the chroma and related intermediate quantities, which are computed using Equations (2)–(4).
Ç = V × S   (2)
H′ = H / 60°   (3)
ɣ = Ç × (1 − |H′ mod 2 − 1|)   (4)
The conversion back from HSV to RGB, with the result denoted by R1, G1, B1, is given by Equation (5).
(R1, G1, B1) =
  (0, 0, 0)   if H is undefined
  (Ç, ɣ, 0)   if 0 ≤ H′ < 1
  (ɣ, Ç, 0)   if 1 ≤ H′ < 2
  (0, Ç, ɣ)   if 2 ≤ H′ < 3
  (0, ɣ, Ç)   if 3 ≤ H′ < 4
  (ɣ, 0, Ç)   if 4 ≤ H′ < 5
  (Ç, 0, ɣ)   if 5 ≤ H′ < 6   (5)
The colored contrast-enhanced image is then obtained by merging the two results using Equation (6).
Ɋ_c(i, j) = (R° + R1)/2 + (G° + G1)/2 + (B° + B1)/2   (6)
where R°, G°, B° are the values of the red, green, and blue channels of the image after applying CLAHE-RGB and R1, G1, B1 are the values of the red, green, and blue channels of the image after applying CLAHE-HSV. Ɋ_c(i, j) represents the contrast-enhanced image after applying the stated transforms. The ʚ and ʋ values of the contrast-enhanced image are adjusted using the genetic algorithm for the final enhanced image. The mathematical form of this transform is given in Equation (7).
Ɋ_fe(i, j) = ʚ · Ɋ_c(i, j) + ʋ   (7)
where Ɋ_fe represents the final enhanced image, and the ʚ and ʋ values are adjusted using the genetic algorithm with the help of the fitness function in Equation (8).
Fitness(ʚ, ʋ) = (Ƿ + ǫ + ʧ) / (Ƨ · Ʀ) > MAX_§ [ (Ƿ + ǫ + ʧ) / (Ƨ · Ʀ) ]   (8)
where Ƿ is the value of the peak signal to noise ratio (PSNR), Ƨ is the value of the similarity index (SI), ǫ is the value of the image quality index (IQI), Ʀ is the value of the mean squared error (MSE), ʧ is the visual information fidelity (VIF), and § is the set of calculated values for these performance metrics. A step-by-step process of contrast enhancement is shown in Algorithm 1.
Algorithm 1: Proposed Contrast Enhancement
START
Step 1: Let Ɋ(i, j) be an RGB image
Step 2: Separate the RGB channels of the image: R, G, B = Ɋ(i, j)
Step 3: Apply CLAHE-RGB on each channel: R_Cl, G_Cl, B_Cl = CLAHE-RGB()
Step 4: Normalize each channel using Equation (1):
R°, G°, B° = R_Cl / (R_Cl + G_Cl + B_Cl), G_Cl / (R_Cl + G_Cl + B_Cl), B_Cl / (R_Cl + G_Cl + B_Cl)
Step 5: Convert the image to HSV: CONVERTRGBTOHSV(Ɋ(i, j))
Step 6: Apply CLAHE-HSV: CLAHE-HSV()
Step 7: Convert the image back to RGB form from CLAHE-HSV: R1, G1, B1 = CONVERTHSVTORGB()
Step 8: Merge the two images using the following transform: MERGEIMAGES(R°, G°, B°, R1, G1, B1)
Ɋ_c(i, j) = (R° + R1)/2 + (G° + G1)/2 + (B° + B1)/2
Step 9: Apply the given ʚ and ʋ to the contrast-enhanced image:
Ɋ_fe(i, j) = ʚ · Ɋ_c(i, j) + ʋ
Step 10: Adjust the values of ʚ and ʋ using the genetic algorithm (GA) with the following fitness function:
Fitness(ʚ, ʋ) = (Ƿ + ǫ + ʧ) / (Ƨ · Ʀ) > MAX_§ [ (Ƿ + ǫ + ʧ) / (Ƨ · Ʀ) ]
Step 11: Final enhanced image: Ɋ_fe(i, j)
END
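To make the above steps concrete, the following is a minimal Python sketch of the enhancement pipeline built on OpenCV. It is not the authors' implementation: the CLAHE parameters, the input file name, and the GA details (population size, mutation scale) are assumptions, and the fitness is simplified to a PSNR/MSE ratio standing in for the full five-metric criterion of Equation (8).

```python
import cv2
import numpy as np

def clahe_rgb(img_bgr, clip=2.0, tiles=(8, 8)):
    # CLAHE applied to each RGB channel independently (CLAHE-RGB step).
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    return cv2.merge([clahe.apply(c) for c in cv2.split(img_bgr)])

def clahe_hsv(img_bgr, clip=2.0, tiles=(8, 8)):
    # CLAHE applied to the V channel in HSV space, then converted back to RGB (CLAHE-HSV step).
    h, s, v = cv2.split(cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV))
    v = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles).apply(v)
    return cv2.cvtColor(cv2.merge([h, s, v]), cv2.COLOR_HSV2BGR)

def merge_enhanced(img_bgr):
    # Channel-wise average of the CLAHE-RGB and CLAHE-HSV results (Equation (6), Step 8).
    a = clahe_rgb(img_bgr).astype(np.float32)
    b = clahe_hsv(img_bgr).astype(np.float32)
    return ((a + b) / 2.0).astype(np.uint8)

def fitness(original, candidate):
    # Simplified fitness: PSNR divided by MSE stands in for the full
    # (PSNR + IQI + VIF) / (SI * MSE) criterion of Equation (8).
    diff = original.astype(np.float32) - candidate.astype(np.float32)
    mse = float(np.mean(diff ** 2)) + 1e-6
    return cv2.PSNR(original, candidate) / mse

def ga_brightness_contrast(original, merged, pop_size=20, generations=30, seed=0):
    # Tiny genetic algorithm over (alpha, beta): keep the fitter half of the
    # population, then produce children by Gaussian mutation (Step 10).
    rng = np.random.default_rng(seed)
    pop = np.column_stack([rng.uniform(0.8, 2.0, pop_size),      # alpha (contrast)
                           rng.uniform(-30.0, 30.0, pop_size)])  # beta (brightness)
    for _ in range(generations):
        scores = [fitness(original, cv2.convertScaleAbs(merged, alpha=a, beta=b)) for a, b in pop]
        best = pop[np.argsort(scores)[-pop_size // 2:]]            # survivors
        children = best + rng.normal(0.0, [0.05, 2.0], best.shape)  # mutated offspring
        pop = np.vstack([best, children])
    alpha, beta = max(pop, key=lambda p: fitness(original, cv2.convertScaleAbs(merged, alpha=p[0], beta=p[1])))
    return cv2.convertScaleAbs(merged, alpha=alpha, beta=beta), (alpha, beta)

img = cv2.imread("wce_frame.png")  # hypothetical input WCE frame
enhanced, (alpha, beta) = ga_brightness_contrast(img, merge_enhanced(img))
```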
The results of our proposed contrast-enhancement technique are analyzed using the multiple quantitative performance measures available for checking the quality of an image. These performance measures are peak signal to noise ratio (PSNR), similarity index (SI), image quality index (IQI), mean squared error (MSE), and visual information fidelity (VIF). It is evident from Table 1 that our proposed model surpasses the results of the available state-of-the-art techniques such as histogram equalization (HE), adaptive histogram equalization (AHE), and contrast-limited adaptive histogram equalization (CLAHE).
PSNR between two images is used to compare the original and reconstructed images’ quality. The high value of PSNR indicates the better quality of the reconstructed image. SI compares the fraction of pixels in the enhanced image that coincides with pixels in the primary or original image. The SI scales from 0 to 1, with 0 denoting 0% similarity and 1 denoting 100%. IQI measures the pixel disparities between two images, which has a value range of −1 to 1. The quality of the enhanced image is better if the IQI is close to 1. This measure is opposite to SI. The MSE is a measurement of the squared cumulative error between the original and enhanced image. The quality of an image is better when the value of MSE is low. The VIF investigates the relationship between image information and visual quality and calculates the amount of image information lost due to distortion. A higher value of VIF means better quality.
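A small sketch of how such a quality report can be computed with scikit-image is shown below. It covers only a subset of the reported metrics: SSIM is used here as a stand-in for the similarity index and VIF is omitted (it is not provided by scikit-image), so this is an illustrative approximation rather than the paper's exact evaluation code. A recent scikit-image version (with `channel_axis`) is assumed.

```python
from skimage.metrics import peak_signal_noise_ratio, mean_squared_error, structural_similarity

def quality_report(original, enhanced):
    # original, enhanced: uint8 RGB arrays of identical shape.
    return {
        "PSNR": peak_signal_noise_ratio(original, enhanced, data_range=255),
        "MSE": mean_squared_error(original, enhanced),
        # SSIM used as a stand-in for the similarity index (SI) of Table 1.
        "SSIM": structural_similarity(original, enhanced, channel_axis=-1, data_range=255),
    }
```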
The resultant images after applying our proposed contrast-enhancement technique are shown in Figure 3.

3.3. Data Augmentation

Deep learning models are prone to over-fitting and need more data for generalization. We performed dataset augmentation to increase the number of images in each class and improve the generalizability of the model. Additionally, data augmentation is a very effective strategy to lower both the training and validation errors [39]. The primary transformations used during data augmentation include rotation by 20 and 30 degrees, width shifting with factors of 0.3 and 0.4, height shifting, zooming in and out, and horizontal and vertical flipping. Following the application of data augmentation, the overall dataset rose to 30,000 images, with 6000 images for each class.
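A minimal Keras sketch of this augmentation setup is given below. It mirrors the transformations listed above, but the directory layout and the exact ranges (e.g., a single rotation range covering both the 20 and 30 degree settings) are assumptions, not the authors' configuration.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=30,       # rotations up to 30 degrees (covers the 20/30 settings)
    width_shift_range=0.4,   # width shifting (0.3 and 0.4 settings)
    height_shift_range=0.2,  # height shifting (assumed range)
    zoom_range=0.2,          # zoom in / zoom out (assumed range)
    horizontal_flip=True,
    vertical_flip=True,
    rescale=1.0 / 255,
)

train_gen = augmenter.flow_from_directory(
    "data/contrast_enhanced/train",  # hypothetical layout: one folder per class
    target_size=(224, 224),
    batch_size=50,
    class_mode="categorical",
)
```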

3.4. Transfer Learning

Transfer learning is a method for transferring the knowledge of a deep CNN model from one domain to another, meaning that training does not need to restart from scratch for every new task. Because training new machine learning models is resource-intensive, transfer learning saves time and resources, and precisely labeling large datasets takes a long time. In TL, a model that is pre-trained on one dataset is fine-tuned on another dataset. In our case, the pre-trained DL model is already trained on the ImageNet dataset with 1000 item classes; that model is fine-tuned on the WCE images dataset, which has 5 classes, through knowledge transfer. The source data are therefore much larger than the target data. The primary goal of TL is to retrain an improved DL model on a smaller dataset. Moreover, it is evident from research that pre-trained models outperform models trained from scratch [40,41]. The transfer learning process is illustrated in Figure 4.
Mathematically, the process of TL is as follows: a source domain Ďŝ with learning task Ĺŝ and a target domain Ďƫ with learning task Ĺƫ are given. The goal of transfer learning is to enhance the learning of the conditional probability distribution P(Yƫ|Xƫ) in the target domain Ďƫ using the knowledge acquired from the source domain Ďŝ, where Ďƫ ≠ Ďŝ and Ĺŝ ≠ Ĺƫ.

3.5. Features Extraction and Classification

The pre-trained lightweight CNN model utilized for classification is MobileNet-V2 [42]. MobileNet-V2 has an inverted residual block and a linear bottleneck frame [43], due to which it has a greater ability to address the gradient vanishing problem than the V1 version. This model has only 3.4 million parameters. An expansion layer of 1 × 1 convolution that expands the channels before the depth-wise convolution operation is one of MobileNet-V2’s new features. The network (a “DagNetwork”) receives input with the dimensions 224 × 224 × 3 and has 154 layers in total with 163 × 2 connections. The network was initially trained on the ImageNet dataset, and its final layer contains 1000 classes. For the extraction of features, the model is fine-tuned by removing the top layer and placing one dropout layer with a 0.2 probability value and another fully connected layer. Moreover, the initial 20 layers of the model are frozen and the remaining layers are fine-tuned on GI WCE images. The total number of features extracted from this model is 1210. The extracted features are then passed to multiple classifiers, such as softmax, linear support vector machine (SVM), quadratic SVM, cubic SVM, and Bayesian, for accurate prediction of disease.
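A possible Keras sketch of this modification is shown below. The exact head is an assumption: the placement of the dropout layer, the 1210-unit dense layer (matching the reported feature count), and the relu activation are inferred from the description rather than taken from the authors' code.

```python
import tensorflow as tf

NUM_CLASSES = 5

# ImageNet-pretrained MobileNet-V2 backbone without its 1000-class top layer.
base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)

# Freeze the initial 20 layers; the remaining layers stay trainable for fine-tuning.
for layer in base.layers[:20]:
    layer.trainable = False

x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
x = tf.keras.layers.Dropout(0.2)(x)
# Dense feature layer; 1210 units matches the reported feature count (an assumption
# about where exactly those features are taken from).
features = tf.keras.layers.Dense(1210, activation="relu", name="features")(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(features)

model = tf.keras.Model(base.input, outputs)

# Separate model used to dump features for the external SVM / Bayesian classifiers.
feature_extractor = tf.keras.Model(base.input, features)
```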
Python was used for the experiments, and details about the hyperparameters are given below for reproducing the results. The Adam optimizer was used during model training; because of its great performance and adaptive learning rate, the Adam optimizer is currently the most popular optimizer for CNN training [44]. Additionally, the categorical cross entropy (CCE) loss function was employed. The loss function calculates the discrepancy between the target label and the model’s predicted label during the training of the deep learning model and modifies the CNN’s weights to create a model that better fits the data [45]. A batch size of 50 with 250 epochs was used to train the model.
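A short sketch of this training configuration, assuming the model and the augmented generator from the sketches above (the batch size of 50 is set on the generator), might look as follows; the default Adam learning rate is an assumption.

```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(),  # adaptive learning rate, default settings assumed
    loss="categorical_crossentropy",       # CCE loss, Equation (9)
    metrics=["accuracy"],
)

# train_gen is the augmented generator from the earlier sketch (batch size 50).
history = model.fit(train_gen, epochs=250)
```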
CCE loss is a great way to gauge loss by figuring out how different two discrete probability distributions are from one another. The mathematical Equation (9) of this loss is as follows:
CCE = − Σ_{i = 1}^{Output Size} γ_i · log(γ̂_i)   (9)
where γ_i is the original (target) value, γ̂_i is the value predicted by the model, and the summation runs over the output size, i.e., the number of classes.
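As a small worked check of Equation (9), the snippet below computes the loss by hand for one hypothetical prediction and compares it with the Keras implementation; the class probabilities are made up for illustration.

```python
import numpy as np
import tensorflow as tf

y_true = np.array([[0.0, 1.0, 0.0, 0.0, 0.0]])       # one-hot target (class 2 of 5)
y_pred = np.array([[0.05, 0.80, 0.05, 0.05, 0.05]])  # hypothetical softmax output

manual = -np.sum(y_true * np.log(y_pred))             # Equation (9): -ln(0.8) ≈ 0.2231
keras_cce = tf.keras.losses.CategoricalCrossentropy()(y_true, y_pred).numpy()
print(manual, keras_cce)                               # both ≈ 0.2231
```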

4. Results and Discussion

For the experimentation, the dataset is divided into multiple train-to-test ratios, and the main tools and libraries utilized are Python, Keras, PyTorch, TensorFlow, and Matplotlib. This study uses several performance metrics that are frequently used in medical image classification approaches to assess the proposed approach: precision, recall, accuracy, and f-measure. For classification problems that are evenly distributed, not skewed, or in which there is no class imbalance, accuracy is a reliable evaluation parameter. In general, however, accuracy can present dangerously over-optimistic, inflated results, especially on unbalanced datasets. Since the f-measure is the harmonic mean of precision and recall, it keeps the balance between both for the classifier. The f-measure is a performance metric that accounts for both false positives and false negatives; for imbalanced classification, it is typically more valuable than accuracy. Precision and recall are two further crucial criteria for evaluating models [46,47,48]. Recall evaluates the capacity to discriminate between the classes, whereas precision reflects the likelihood of correctly detecting positive values [49]. Equations (10)–(13) define these performance measures:
Precision = TP / (TP + FP)   (10)
Recall (Sensitivity) = TP / (TP + FN)   (11)
Accuracy = (TP + TN) / (TP + FN + FP + TN)   (12)
F-Measure = 2 × (Precision × Recall) / (Precision + Recall)   (13)
These equations are based on the four quantities described in Table 2.
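In practice these metrics can be obtained directly from scikit-learn, as in the sketch below. The macro averaging over the five classes is an assumption, since the paper does not state the averaging mode used.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def report(y_true, y_pred):
    # Macro averaging treats the five classes equally (assumed averaging mode).
    return {
        "Precision": precision_score(y_true, y_pred, average="macro"),
        "Recall": recall_score(y_true, y_pred, average="macro"),
        "F-Measure": f1_score(y_true, y_pred, average="macro"),
        "Accuracy": accuracy_score(y_true, y_pred),
    }
```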
Classification results are collected at various stages: initially on the input WCE images, then on the contrast-enhanced images, then on the contrast-enhanced images after the stated augmentation steps, and finally after mixing the input and augmented contrast-enhanced images. The results for the input images are shown in Table 3. The softmax classifier outperformed the other classifiers, with a precision of 84.27%, a recall of 76.25%, an f-measure of 80.06%, and an accuracy of 81.14%, as presented in Table 3. However, given that the Bayesian classifier’s performance is close to the softmax, it cannot be disregarded.
Moreover, accuracy comparison of different train-to-test ratios is also provided in Figure 5. The best accuracy was achieved by 10-fold cross-validation for all classifiers.
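The following sketch illustrates how such a 10-fold comparison of the classifiers on the extracted deep features could be run with scikit-learn. It is an approximation of the setup described above: the quadratic and cubic SVMs are realized as polynomial-kernel SVMs, the Bayesian classifier as Gaussian naive Bayes, the softmax classifier as multinomial logistic regression, and the feature/label files are hypothetical dumps from the feature extractor sketched earlier.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# X: deep features (e.g., from feature_extractor.predict(...)); y: integer labels.
X, y = np.load("features.npy"), np.load("labels.npy")  # hypothetical feature dump

classifiers = {
    "Softmax": LogisticRegression(max_iter=1000),  # softmax-style multinomial logistic regression
    "Linear SVM": SVC(kernel="linear"),
    "Quadratic SVM": SVC(kernel="poly", degree=2),
    "Cubic SVM": SVC(kernel="poly", degree=3),
    "Bayesian": GaussianNB(),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.4f} ± {scores.std():.4f}")
```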
Figure 6 displays the confusion matrix of 10-fold cross-validation produced by the softmax classifier on the input images. Looking at the statistics, it is clear that 4% of polyp cases, 7% of ulcer cases, 5% of ulcerative-colitis cases, and 8% of esophagitis cases are treated as normal, which is a concerning outcome. Additionally, 13% of healthy cases are handled as diseased instances.
The results collected on contrast-enhanced images without augmentation are shown in Table 4. The softmax classifier outperformed the other classifiers with a precision of 91.14%, a recall of 83.92%, an f-measure of 87.38%, and an accuracy of 90.23%, as presented in Table 4. However, given that the quadratic SVM’s performance is close to the softmax, it cannot be disregarded. Moreover, a comparison of Table 3 and Table 4 shows that accuracy, as well as the other performance measures, improved after implementing the contrast-enhancement technique, with accuracy increasing by nearly 9%. Hence, it is concluded that our proposed contrast-enhancement technique helped the model generalize better on WCE images.
Moreover, an accuracy comparison of different train-to-test ratios is also provided in Figure 7. The best accuracy was achieved by 10-fold cross-validation for all the classifiers.
Figure 8 displays the confusion matrix of 10-fold cross-validation produced by the softmax classifier on the contrast-enhanced images. Looking at the statistics, it is clear that the results are much improved as compared with the input images, as no polyp case was treated as normal; however, 2% of ulcer cases, 5% of ulcerative-colitis cases, and 1% of esophagitis cases were still treated as normal, which remains unacceptable. Furthermore, it can be seen that the proportion of ulcerative-colitis cases predicted as normal did not change and remained at 5%, while the other classes’ misclassifications as normal were reduced.
The results collected on contrast-enhanced images after performing augmentation are shown in Table 5. The softmax classifier outperformed the other classifiers with a precision of 97.57%, a recall of 93.02%, an f-measure of 95.24%, and an accuracy of 96.40%, as presented in Table 5. However, given that the Bayesian and quadratic SVM performances are close to the softmax, they cannot be overlooked. It is evident from the results of Table 3 and Table 5 that accuracy increased by almost 15% after implementing our proposed methodology.
Moreover, an accuracy comparison of different train-to-test ratios is also provided in Figure 9. The best accuracy was achieved by 10-fold cross-validation for all the classifiers.
A confusion matrix of 10-fold cross-validation produced by the softmax classifier on the augmented contrast-enhanced images is shown in Figure 10. Looking at the statistics, it is clear that the results are much improved and adequate, as compared with previously produced results, because only 1% of the diseased cases (i.e., ulcerative-colitis) was treated as normal by the model.
The results collected by mixing the augmented contrast-enhanced images and input images are shown in Table 6. The softmax classifier outperformed the other classifiers with a precision of 95.33%, a recall of 90.22%, an f-measure of 92.71%, and an accuracy of 93.21%, as shown in Table 6. However, given that the Bayesian and quadratic SVM performances are close to the softmax, they cannot be overlooked. It is evident from the results that the model performed better on the augmented contrast-enhanced images alone than on the mixture of augmented contrast-enhanced and input images. It is therefore concluded that the proposed configuration yields the better performance and results.
Moreover, an accuracy comparison on different train-to-test ratios is also provided in Figure 11. The best accuracy was achieved by 10-fold cross-validation for all the classifiers.
A confusion matrix of 10-fold cross-validation produced by the softmax classifier on the mixed dataset (augmented contrast-enhanced images and input images) is shown in Figure 12. Looking at the statistics, it is clear that the results were less positive as compared with our proposed model of augmented contrast-enhanced images.
The classification performances of the different classifiers are shown in Table 3, Table 4, Table 5 and Table 6. Moreover, the results for different train-to-test ratios are provided in Figure 5, Figure 7, Figure 9 and Figure 11, and the confusion matrices of 10-fold validation using the softmax classifier are shown in Figure 6, Figure 8, Figure 10 and Figure 12. To highlight the improvement in performance due to the proposed contrast-enhancement scheme, we compared the performance obtained with and without the contrast-enhancement step. We also highlight a comparison of the performance after augmenting our contrast-enhanced dataset. Figure 13 depicts a summary of this performance comparison. The results clearly show a significant improvement in performance in terms of the various metrics due to contrast enhancement, with and without dataset augmentation. Overall, compared with the original dataset, accuracy is improved by 15%, precision by 13%, recall by approximately 17%, and f-measure by 15%. The proposed technique outperforms the other state-of-the-art methods because most of the existing studies lack a contrast-enhancement step [38] as well as an exploration of multiple classifiers [25] on the extracted features. We performed these steps as well as applying cross-validation, which shows the better generalizability of our technique.

Ablation Study

To highlight the importance of contrast enhancement, an ablation study was added to show the results for the original images (without contrast enhancement) after dataset augmentation. The results were collected with the 10-fold cross-validation technique and are shown in Table 7.
To highlight the importance of the modified MobileNet-V2, an ablation study was added to show the performance of the original MobileNet-V2 under our proposed methodology (contrast-enhanced dataset after dataset augmentation). The results were collected with the 10-fold cross-validation technique and are shown in Table 8. Comparing the results of Table 5 (modified MobileNet-V2) and Table 8 (original MobileNet-V2), it can be seen that the results of the original MobileNet-V2 are lower than those of the modified MobileNet-V2. Moreover, the training time of the original MobileNet-V2 was also higher.
Furthermore, to check the generalizability of the model, we added the results for 3-fold and 5-fold cross-validation using the proposed methodology. Figure 14 displays the training accuracy curves and Figure 15 displays the training loss curves.
Finally, a comparison with existing GI tract disease classification techniques is presented in Table 9. It can be observed that the proposed technique outperforms the state-of-the-art methodologies and there is a significant improvement in terms of accuracy.

5. Conclusions

Manual classification of GI tract diseases from WCE images is a difficult task, and therefore, an automated solution is required to achieve better results. In this study, we propose an architecture based on deep learning to accurately classify GI tract diseases. The main idea is to enhance the contrast of images using an optimized brightness-controlled contrast-enhancement approach and emphasize the significance of this step regarding WCE images. The performance of contrast-enhanced images is evaluated using various performance measures such as MSE, PSNR, IQI, SI, and VIF. Furthermore, data augmentation is performed to increase the size of the dataset and better generalize the model. Then, a modified pre-trained model is fine-tuned for WCE images using the TL approach, and finally, multiple machine learning classifiers are applied to the extracted features from the fine-tuned model. The proposed model achieved an accuracy of 96.40%, a recall of 93.02%, a precision of 97.57%, and an f-measure of 95.24% using the softmax classifier. The results show that the proposed model treated only 1% of the diseased cases as normal, and that too for the ulcerative-colitis class only, which is a critical measure when human life is concerned. Although the proposed model outperformed the most advanced techniques currently in use, there are still several intriguing issues that require more exploration. For instance, this research did not consider the impact of training numerous models and fusing them, or the optimization of features. Future work will concentrate on these actions because they could lead to better performance.

Author Contributions

Conceptualization, M.N.N.; methodology, M.N.N. and M.N.; software, M.N.; validation, M.N.N., M.N. and S.A.K.; formal analysis, M.N.N., S.A.K.; investigation, M.N., S.A.K., O.-Y.S. and I.A.; resources, M.N. and O.-Y.S.; writing—original draft preparation, M.N.N., M.N. and S.A.K.; writing—review and editing, O.-Y.S.; visualization, M.N.N., S.A.K. and I.A.; supervision, M.N.; project administration, O.-Y.S. and I.A.; funding acquisition, O.-Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was financially supported by the Ministry of Trade, Industry, and Energy (MOTIE) and the Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program. (Project No. P0016038) and Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No.2021-0-01188, Non-face-to-face Companion Plant Sales Support System Providing Realistic Experience) and the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2022-RS-2022-00156354) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation) and the faculty research fund of Sejong University in 2022.

Data Availability Statement

This research was conducted on publicly available datasets.

Acknowledgments

We are grateful to all of those with whom we have had the pleasure to work during this project.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

  1. Ling, T.; Wu, L.; Fu, Y.; Xu, Q.; An, P.; Zhang, J.; Hu, S.; Chen, Y.; He, X.; Wang, J.; et al. A deep learn-ing-based system for identifying differentiation status and delineating the margins of early gastric cancer in magnifying nar-row-band imaging endoscopy. Endoscopy 2020, 53, 469–477. [Google Scholar] [CrossRef] [PubMed]
  2. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249. [Google Scholar] [CrossRef] [PubMed]
  3. Korkmaz, M.F. Artificial Neural Network by using HOG Features HOG_LDA_ANN. In Proceedings of the 2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY), Subotica, Serbia, 14–16 September 2017; pp. 327–332. [Google Scholar]
  4. Li, S.; Cao, J.; Yao, J.; Zhu, J.; He, X.; Jiang, Q. Adaptive aggregation with self-attention network for gastrointestinal image classification. IET Image Process. 2022, 16, 2384–2397. [Google Scholar] [CrossRef]
  5. Siegel, R.L.; Miller, K.; Jemal, A. Cancer statistics, 2015. CA Cancer J. Clin. 2015, 65, 5–29. [Google Scholar] [CrossRef]
  6. Azhari, H.; King, J.; Underwood, F.; Coward, S.; Shah, S.; Ho, G.; Chan, C.; Ng, S.; Kaplan, G. The Global Incidence of Peptic Ulcer Disease at the Turn of the 21st Century: A Study of the Organization for Economic Co—Operation and Development (OECD). Am. J. Gastroenterol. 2018, 113, S682–S684. [Google Scholar] [CrossRef]
  7. Kim, N.H.; Jung, Y.S.; Jeong, W.S.; Yang, H.-J.; Park, S.-K.; Choi, K.; Park, D.I. Miss rate of colorectal neoplastic polyps and risk factors for missed polyps in consecutive colonoscopies. Intest. Res. 2017, 15, 411–418. [Google Scholar] [CrossRef] [Green Version]
  8. Iddan, G.; Meron, G.; Glukhovsky, A.; Swain, P. Wireless capsule endoscopy. Nature 2000, 405, 417. [Google Scholar] [CrossRef]
  9. Muruganantham, P.; Balakrishnan, S.M. Attention Aware Deep Learning Model for Wireless Capsule Endoscopy Lesion Classification and Localization. J. Med Biol. Eng. 2022, 42, 157–168. [Google Scholar] [CrossRef]
  10. Khan, M.A.; Khan, M.A.; Ahmed, F.; Mittal, M.; Goyal, L.M.; Hemanth, D.J.; Satapathy, S.C. Gastrointestinal diseases segmentation and classification based on duo-deep architectures. Pattern Recognit. Lett. 2019, 131, 193–204. [Google Scholar] [CrossRef]
  11. Khan, M.A.; Sarfraz, M.S.; Alhaisoni, M.; Albesher, A.A.; Wang, S.; Ashraf, I. StomachNet: Optimal Deep Learning Features Fusion for Stomach Abnormalities Classification. IEEE Access 2020, 8, 197969–197981. [Google Scholar] [CrossRef]
  12. Amiri, Z.; Hassanpour, H.; Beghdadi, A. Feature extraction for abnormality detection in capsule endoscopy images. Biomed. Signal Process. Control. 2021, 71, 103219. [Google Scholar] [CrossRef]
  13. Khan, M.; Ashraf, I.; Alhaisoni, M.; Damaševičius, R.; Scherer, R.; Rehman, A.; Bukhari, S. Multimodal Brain Tumor Classification Using Deep Learning and Robust Feature Selection: A Machine Learning Application for Radiologists. Diagnostics 2020, 10, 565. [Google Scholar] [CrossRef]
  14. Cicceri, G.; De Vita, F.; Bruneo, D.; Merlino, G.; Puliafito, A. A deep learning approach for pressure ulcer prevention using wearable computing. Human-Centric Comput. Inf. Sci. 2020, 10, 5. [Google Scholar] [CrossRef]
  15. Wong, G.L.H.; Ma, A.J.; Deng, H.; Ching, J.Y.L.; Wong, V.W.S.; Tse, Y.K.; Yip, T.C.-F.; Lau, L.H.-S.; Liu, H.H.-W.; Leung, C.M.; et al. Machine learning model to predict recurrent ulcer bleeding in patients with history of idiopathic gastroduodenal ulcer bleeding. APT—Aliment. Pharmacol. Therapeutics 2019, 49, 912–918. [Google Scholar] [CrossRef] [PubMed]
  16. Wang, S.; Xing, Y.; Zhang, L.; Gao, H.; Zhang, H. Second glance framework (secG): Enhanced ulcer detection with deep learning on a large wireless capsule endoscopy dataset. In Proceedings of the Fourth International Workshop on Pattern Recognition, Nanjing, China, 28–30 June 2019; Volume 11198. [Google Scholar]
  17. Majid, A.; Khan, M.A.; Yasmin, M.; Rehman, A.; Yousafzai, A.; Tariq, U. Classification of stomach infections: A paradigm of convolutional neural network along with classical features fusion and selection. Microsc. Res. Tech. 2020, 83, 562–576. [Google Scholar] [CrossRef]
  18. Usman, M.A.; Satrya, G.; Shin, S.Y. Detection of small colon bleeding in wireless capsule endoscopy videos. Comput. Med Imaging Graph. 2016, 54, 16–26. [Google Scholar] [CrossRef] [PubMed]
  19. Iakovidis, D.; Koulaouzidis, A. Automatic lesion detection in capsule endoscopy based on color saliency: Closer to an essential adjunct for reviewing software. Gastrointest Endosc. 2014, 80, 877–883. [Google Scholar] [CrossRef]
  20. Noya, F.; Alvarez-Gonzalez, M.A.; Benitez, R. Automated angiodysplasia detection from wireless capsule endoscopy. In Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Jeju, South Korea, 11–15 July 2017; pp. 3158–3161. [Google Scholar] [CrossRef]
  21. Li, B.; Meng, M.Q.-H. Texture analysis for ulcer detection in capsule endoscopy images. Image Vis. Comput. 2009, 27, 1336–1342. [Google Scholar] [CrossRef]
  22. Fu, Y.; Zhang, W.; Mandal, M.; Meng, M.Q.-H. Computer-Aided Bleeding Detection in WCE Video. IEEE J. Biomed. Heal. Inform. 2013, 18, 636–642. [Google Scholar] [CrossRef]
  23. Pan, G.; Yan, G.; Qiu, X.; Cui, J. Bleeding Detection in Wireless Capsule Endoscopy Based on Probabilistic Neural Network. J. Med Syst. 2010, 35, 1477–1484. [Google Scholar] [CrossRef]
  24. Li, B.; Meng, M.Q.-H. Computer-Aided Detection of Bleeding Regions for Capsule Endoscopy Images. IEEE Trans. Biomed. Eng. 2009, 56, 1032–1039. [Google Scholar] [CrossRef] [PubMed]
  25. Mohapatra, S.; Pati, G.K.; Mishra, M.; Swarnkar, T. Gastrointestinal abnormality detection and classification using empirical wavelet transform and deep convolutional neural network from endoscopic images. Ain Shams Eng. J. 2023, 14, 101942. [Google Scholar] [CrossRef]
  26. Koyama, S.; Okabe, Y.; Suzuki, Y.; Igari, R.; Sato, H.; Iseki, C.; Tanji, K.; Suzuki, K.; Ohta, Y. Differing clinical features between Japanese siblings with cerebrotendinous xanthomatosis with a novel compound heterozygous CYP27A1 mutation: A case report. BMC Neurol. 2022, 22, 193. [Google Scholar] [CrossRef] [PubMed]
  27. Higuchi, N.; Hiraga, H.; Sasaki, Y.; Hiraga, N.; Igarashi, S.; Hasui, K.; Ogasawara, K.; Maeda, T.; Murai, Y.; Tatsuta, T.; et al. Automated evaluation of colon capsule endoscopic severity of ulcerative colitis using ResNet50. PLoS ONE 2022, 17, e0269728. [Google Scholar] [CrossRef] [PubMed]
  28. Ji, X.; Xu, T.; Li, W.; Liang, L. Study on the classification of capsule endoscopy images. EURASIP J. Image Video Process. 2019, 2019, 1–7. [Google Scholar] [CrossRef] [Green Version]
  29. Szczypiński, P.; Klepaczko, A.; Strzelecki, M. An Intelligent Automated Recognition System of Abnormal Structures in WCE Images. In Proceedings of the Hybrid Artificial Intelligent Systems: 6th International Conference, HAIS 2011, Wroclaw, Poland, 23–25 May 2011; Lecture Notes in Computer Science 6678; Springer: Berlin/Heidelberg, Germany, 2011; Part I, pp. 140–147. [Google Scholar] [CrossRef]
  30. Patel, V.; Armstrong, D.; Ganguli, M.P.; Roopra, S.; Kantipudi, N.; Albashir, S.; Kamath, M.V. Deep Learning in Gastrointestinal Endoscopy. Crit. Rev. Biomed. Eng. 2016, 44, 493–504. [Google Scholar] [CrossRef]
  31. Lee, J.H.; Kim, Y.J.; Kim, Y.W.; Park, S.; Choi, Y.-I.; Park, D.K.; Kim, K.G.; Chung, J.-W. Spotting malignancies from gastric endoscopic images using deep learning. Surg. Endosc. 2019, 33, 3790–3797. [Google Scholar] [CrossRef]
  32. Khan, M.A.; Lali, M.I.U.; Sharif, M.; Javed, K.; Aurangzeb, K.; Haider, S.I.; Altamrah, A.S.; Akram, T. An Optimized Method for Segmentation and Classification of Apple Diseases Based on Strong Correlation and Genetic Algorithm Based Feature Selection. IEEE Access 2019, 7, 46261–46277. [Google Scholar] [CrossRef]
  33. Yuan, Y.; Wang, J.; Li, B.; Meng, M.Q.-H. Saliency Based Ulcer Detection for Wireless Capsule Endoscopy Diagnosis. IEEE Trans. Med Imaging 2015, 34, 2046–2057. [Google Scholar] [CrossRef]
  34. Pogorelov, K.; Randel, K.R.; Griwodz, C.; Eskeland, S.L.; de Lange, T.; Johansen, D.; Spampinato, C.; Dang-Nguyen, D.-T.; Lux, M.; Schmidt, P.T.; et al. KVASIR: A multi-class image dataset for computer aided gastrointestinal disease detection. In Proceedings of the 8th ACM on Multimedia Systems Conference, Taipei, Taiwan, 20–23 June 2017; pp. 164–169. [Google Scholar]
  35. Borgli, H.; Thambawita, V.; Smedsrud, P.H.; Hicks, S.; Jha, D.; Eskeland, S.L.; Randel, K.R.; Pogorelov, K.; Lux, M.; Nguyen, D.T.D.; et al. HyperKvasir: A comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Sci. Data 2020, 7, 283. [Google Scholar] [CrossRef]
  36. Jain, S.; Seal, A.; Ojha, A.; Yazidi, A.; Bures, J.; Tacheci, I.; Krejcar, O. A deep CNN model for anomaly detection and localization in wireless capsule endoscopy images. Comput. Biol. Med. 2021, 137, 104789. [Google Scholar] [CrossRef]
  37. Lan, L.; Ye, C. Recurrent generative adversarial networks for unsupervised WCE video summarization. Knowledge-Based Syst. 2021, 222, 106971. [Google Scholar] [CrossRef]
  38. Alhajlah, M.; Noor, M.N.; Nazir, M.; Mahmood, A.; Ashraf, I.; Karamat, T. Gastrointestinal Diseases Classification Using Deep Transfer Learning and Features Optimization. Comput. Mater. Contin. 2023, 75, 2227–2245. [Google Scholar] [CrossRef]
  39. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  40. Noor, M.N.; Khan, T.A.; Haneef, F.; Ramay, M.I. Machine Learning Model to Predict Automated Testing Adoption. Int. J. Softw. Innov. 2022, 10, 1–15. [Google Scholar] [CrossRef]
  41. Noor, M.N.; Nazir, M.; Rehman, S.; Tariq, J. Sketch-Recognition using Pre-Trained Model. In Proceedings of the National Conference on Engineering and Computing Technology, Islamabad, Pakistan, 8 January 2021. [Google Scholar]
  42. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18-22 June 2018; pp. 4510–4520. [Google Scholar]
  43. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. 2017. Available online: https://arxiv.org/abs/1704.04861 (accessed on 13 January 2023).
  44. Bae, K.; Ryu, H.; Shin, H. Does Adam optimizer keep close to the optimal point? arXiv 2019, arXiv:1911.00289. [Google Scholar]
  45. Ho, Y.; Wookey, S. The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of Mislabeling. IEEE Access 2020, 8, 4806–4813. [Google Scholar] [CrossRef]
  46. Shafiq, M.; Tian, Z.; Bashir, A.K.; Du, X.; Guizani, M. IoT malicious traffic identification using wrapper-based feature selection mechanisms. Comput. Secur. 2020, 94, 101863. [Google Scholar] [CrossRef]
  47. Bhattacharya, S.; Maddikunta, P.K.R.; Hakak, S.; Khan, W.Z.; Bashir, A.K.; Jolfaei, A.; Tariq, U. Antlion re-sampling based deep neural network model for classification of imbalanced multimodal stroke dataset. Multimedia Tools Appl. 2020, 81, 41429–41453. [Google Scholar] [CrossRef]
  48. Feng, L.; Ali, A.; Iqbal, M.; Bashir, A.K.; Hussain, S.A.; Pack, S. Optimal haptic communications over nanonetworks for e-health systems. IEEE Trans. Ind. Inform. 2019, 15, 3016–3027. [Google Scholar] [CrossRef] [Green Version]
  49. Seo, S.; Kim, Y.; Han, H.-J.; Son, W.C.; Hong, Z.-Y.; Sohn, I.; Shim, J.; Hwang, C. Predicting Successes and Failures of Clinical Trials With Outer Product–Based Convolutional Neural Network. Front. Pharmacol. 2021, 12, 670670. [Google Scholar] [CrossRef] [PubMed]
  50. Kumar, C.; Mubarak, D.M.N. Classification of Early Stages of Esophageal Cancer Using Transfer Learning. IRBM 2021, 43, 251–258. [Google Scholar] [CrossRef]
  51. Ahmed, A. Classification of Gastrointestinal Images Based on Transfer Learning and Denoising Convolutional Neural Networks. In Proceedings of the International Conference on Data Science and Applications: ICDSA 2021; Springer: Singapore, 2022; pp. 631–639. [Google Scholar]
  52. Escobar, J.; Sanchez, K.; Hinojosa, C.; Arguello, H.; Castillo, S. Accurate Deep Learning-based Gastrointestinal Disease Classification via Transfer Learning Strategy. In Proceedings of the 2021 XXIII Symposium on Image, Signal Processing and Artificial Vision (STSIVA), Popayán, Colombia, 15–17 September 2021; pp. 1–5. [Google Scholar]
  53. Bang, C.S. Computer-Aided Diagnosis of Gastrointestinal Ulcer and Hemorrhage Using Wireless Capsule Endoscopy: Systematic Review and Diagnostic Test Accuracy Meta-analysis. J. Med. Internet Res. 2021, 23, e33267. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Proposed methodology.
Figure 2. Sample images from dataset.
Figure 3. Images after applying our proposed contrast-enhancement technique.
Figure 4. Transfer learning process.
Figure 5. Accuracy results for various train-to-test ratios using original dataset.
Figure 6. Confusion matrix using softmax classifier for original dataset.
Figure 7. Accuracy results using contrast-enhanced images for various train-to-test ratios.
Figure 8. Confusion matrix using softmax classifier for contrast-enhanced dataset.
Figure 9. Accuracy results using contrast-enhanced images for various train-to-test ratios of augmented dataset.
Figure 10. Confusion matrix using softmax classifier for contrast-enhanced augmented dataset.
Figure 11. Accuracy results using contrast-enhanced images for various train-to-test ratios of mixed dataset (original images + augmented contrast-enhanced images).
Figure 12. Confusion matrix using softmax classifier for mixed dataset (original images + augmented contrast-enhanced images).
Figure 13. Summarized performance comparison.
Figure 14. Training accuracy on 3-fold, 5-fold and 10-fold cross validation.
Figure 15. Training loss on 3-fold, 5-fold and 10-fold cross validation.
Table 1. Numerical results of contrast enhancement.

Performance Measure | Histogram Equalization | Adaptive Histogram Equalization | Contrast-Limited Adaptive Histogram Equalization | Our Proposed Method
PSNR | 19.22 | 22.67 | 24.48 | 37.44
Similarity Index | 0.99 | 0.90 | 0.88 | 0.86
Image Quality Index | 0.69 | 0.71 | 0.72 | 0.95
Mean Squared Error | 343.92 | 287.45 | 231.79 | 117.21
Visual Information Fidelity | 0.99 | 0.98 | 1.10 | 1.19
Table 2. Description of parameters used in performance measures.

Symbol | Definition | Description
TP | True Positive | The number of diseased images which are accurately classified.
FP | False Positive | The number of diseased images which are incorrectly classified.
FN | False Negative | The number of normal images which are incorrectly classified.
TN | True Negative | The number of normal images which are accurately classified.
Table 3. Classification performance using original dataset.

Classifier | Precision | Recall | F-Measure | Accuracy
Softmax | 84.27% | 76.25% | 80.06% | 81.14%
Linear SVM | 74.16% | 67.23% | 70.53% | 71.39%
Quadratic SVM | 82.37% | 71.17% | 76.36% | 77.84%
Cubic SVM | 77.31% | 69.64% | 73.27% | 75.05%
Bayesian | 81.22% | 73.37% | 77.10% | 79.21%
Table 4. Classification performance using contrast-enhanced dataset.

Classifier | Precision | Recall | F-Measure | Accuracy
Softmax | 91.14% | 83.92% | 87.38% | 90.23%
Linear SVM | 87.25% | 81.37% | 84.20% | 84.58%
Quadratic SVM | 89.19% | 83.44% | 86.22% | 87.94%
Cubic SVM | 85.11% | 77.64% | 81.20% | 81.22%
Bayesian | 82.71% | 74.51% | 78.39% | 79.31%
Table 5. Classification performance using contrast-enhanced images after dataset augmentation.

Classifier | Precision | Recall | F-Measure | Accuracy
Softmax | 97.57% | 93.02% | 95.24% | 96.40%
Linear SVM | 91.11% | 84.39% | 87.62% | 87.94%
Quadratic SVM | 95.27% | 87.41% | 91.17% | 94.77%
Cubic SVM | 92.02% | 86.20% | 89.02% | 90.45%
Bayesian | 95.18% | 90.63% | 92.85% | 94.98%
Table 6. Classification results for mixed dataset (original images + augmented contrast-enhanced images).

Classifier | Precision | Recall | F-Measure | Accuracy
Softmax | 95.33% | 90.22% | 92.71% | 93.21%
Linear SVM | 90.21% | 83.27% | 86.60% | 88.94%
Quadratic SVM | 92.27% | 87.41% | 89.77% | 90.12%
Cubic SVM | 89.34% | 83.31% | 86.22% | 86.59%
Bayesian | 93.84% | 86.87% | 90.22% | 91.58%
Table 7. Classification performance for original images with dataset augmentation.

Classifier | Precision | Recall | F-Measure | Accuracy
Softmax | 87.29% | 79.33% | 83.12% | 83.95%
Linear SVM | 77.20% | 72.56% | 74.81% | 75.18%
Quadratic SVM | 84.64% | 73.44% | 78.64% | 79.87%
Cubic SVM | 79.77% | 72.42% | 75.92% | 76.52%
Bayesian | 83.41% | 75.32% | 79.16% | 79.98%
Table 8. Classification performance on contrast-enhanced images after dataset augmentation using original MobileNet-V2.

Classifier | Precision | Recall | F-Measure | Accuracy
Softmax | 93.24% | 89.45% | 91.31% | 94.60%
Linear SVM | 87.35% | 81.57% | 84.36% | 85.04%
Quadratic SVM | 92.06% | 83.87% | 87.77% | 91.13%
Cubic SVM | 91.84% | 83.11% | 87.26% | 88.78%
Bayesian | 92.31% | 88.20% | 90.21% | 91.14%
Table 9. Performance comparison with the state-of-the-art techniques.

Technique | Accuracy
Applied logistic and ridge regression on multiple extracted features [15] | 83.3%
Second glance framework based on CNN [16] | 85.69%
Different classifiers such as SVM, softmax, and decision trees applied to colored, geometric, and texture features [17] | 93.64%
Different pre-trained models used for feature extraction with multiple classifiers applied [50] | 94.46%
Initially preprocessed images and applied a modified VGGNet model [34] | 86.6%
Initially de-noised images and applied pre-trained CNN, i.e., AlexNet [51] | 90.17%
DenseNet201 [52] | 78.55%
ResNet-50 [53] | 90.42%
Proposed Model | 96.40%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nouman Noor, M.; Nazir, M.; Khan, S.A.; Song, O.-Y.; Ashraf, I. Efficient Gastrointestinal Disease Classification Using Pretrained Deep Convolutional Neural Network. Electronics 2023, 12, 1557. https://doi.org/10.3390/electronics12071557

AMA Style

Nouman Noor M, Nazir M, Khan SA, Song O-Y, Ashraf I. Efficient Gastrointestinal Disease Classification Using Pretrained Deep Convolutional Neural Network. Electronics. 2023; 12(7):1557. https://doi.org/10.3390/electronics12071557

Chicago/Turabian Style

Nouman Noor, Muhammad, Muhammad Nazir, Sajid Ali Khan, Oh-Young Song, and Imran Ashraf. 2023. "Efficient Gastrointestinal Disease Classification Using Pretrained Deep Convolutional Neural Network" Electronics 12, no. 7: 1557. https://doi.org/10.3390/electronics12071557
