Article

EdgeSVDNet: 5G-Enabled Detection and Classification of Vision-Threatening Diabetic Retinopathy in Retinal Fundus Images

1 College of Information Science and Technology, Hainan Normal University, Haikou 571158, China
2 School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610056, China
3 School of Science and Technology, University of Management and Technology, Lahore 54770, Pakistan
4 School of Information Engineering, Qujing Normal University, Qujing 655011, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(19), 4094; https://doi.org/10.3390/electronics12194094
Submission received: 31 August 2023 / Revised: 26 September 2023 / Accepted: 28 September 2023 / Published: 29 September 2023
(This article belongs to the Special Issue Advances in 5G Wireless Edge Computing)

Abstract

The rise of vision-threatening diabetic retinopathy (VTDR) underscores the imperative for advanced and efficient early detection mechanisms. With the integration of the Internet of Things (IoT) and 5G technologies, there is transformative potential for VTDR diagnosis, facilitating real-time processing of the burgeoning volume of fundus images (FIs). Combined with artificial intelligence (AI), this offers a robust platform for managing vast healthcare datasets and achieving unparalleled disease detection precision. Our study introduces a novel AI-driven VTDR detection framework that integrates multiple models through majority voting. This comprehensive approach encompasses pre-processing, data augmentation, feature extraction using a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model, and classification through an enhanced SVM-RBF combined with a decision tree (DT) and K-nearest neighbor (KNN). Validated on the IDRiD dataset, our model boasts an accuracy of 99.89%, a sensitivity of 84.40%, and a specificity of 100%, marking a significant improvement over traditional methods. The convergence of the IoT, 5G, and AI technologies heralds a transformative era in healthcare, ensuring timely and accurate VTDR diagnoses, especially in geographically underserved regions.

1. Introduction

The increasing prevalence of vision-threatening diabetic retinopathy (VTDR) underscores the need for effective early detection methods. The Internet of Things (IoT) and 5G technologies hold immense promise for enhancing VTDR diagnosis [1,2]. Although a surge in fundus image (FI) data has been observed, timely processing (crucial for early VTDR detection) remains a challenge [3]. Traditional barriers, such as privacy concerns, have inhibited the broader accessibility of medical images, stalling advancements in healthcare [4]. As 5G emerges, the significance of real-time diagnosis is accentuated, given that delays in VTDR detection can critically impact mortality rates [5]. Leveraging the IoT’s capability to connect medical devices to the cloud could revolutionize healthcare services, ranging from health monitoring to AI-powered diagnostics [6,7,8,9]. With 5G’s integration, swifter and more precise responses are anticipated, along with reduced research expenses.
For VTDR, the 5G-enabled IoT presents a sophisticated platform adept at managing vast healthcare datasets, predominantly FIs [10,11]. This system achieves superior disease detection accuracy by harnessing machine learning and advanced optimization [12]. Such optimization methods have notably refined AI models, especially in decreasing the diagnostic inaccuracies in retinal conditions [13,14,15]. Typically, medical facilities employ specialized apparatus to record eye images, which is crucial for VTDR identification. Analyzing these images via 5G-IoT technologies ensures prompt and accurate diagnoses, which are pivotal to preventing grave vision complications [16,17]. However, such advancements remain aspirational in geographically isolated regions with scarce ophthalmic resources. Despite this, the global affordability and accessibility of fundus photography, even among non-experts, fortifies AI-assisted remote evaluations [18]. AI is an established but still rapidly advancing technology, especially in computer-aided diagnosis of human diseases [19]. It has been effectively applied in the detection of skin cancer [20], Alzheimer’s disease [21], arrhythmia [22], HIV infection [23], intracranial diseases [24], lung cancer [25], and breast cancer [26]. In recent years, the convergence of computer-aided diagnostic-based Internet of Things (IoT) and artificial intelligence (AI) technologies has paved the way for innovative and efficient approaches to disease management. In retinal diseases, deep learning algorithms for AI-assisted diagnoses have been applied to screen for DR [3,13], AMD [27], DME [28], and glaucoma [29]. These AI-assisted diagnosis systems mostly focus on binary classification. In real life, especially in remote areas lacking specialized ophthalmologists, the capability to classify across multiple severity thresholds is needed. A severity-threshold detection system using fundus images should be developed to avoid missed diagnoses and delayed treatment. This paper introduces a 5G-enabled IoT framework by applying a nature-inspired machine learning algorithm in a customized two-step strategy that can classify VTDR based on color fundus images.
The paper is structured into six main sections: Section 1 introduces the topic and sets the context; Section 2 delves into the literature review, providing insights into previous research; Section 3 outlines the proposed methodology, detailing the approach and techniques used; Section 4 presents the results obtained from the study; Section 5 offers a discussion analyzing the findings; and Section 6 concludes the paper, summarizing the key takeaways and implications.

2. Literature Review

With the advent of computer-aided diagnosis (CAD) techniques, there has been a transformative shift in medical imaging and diagnostic procedures. Diabetic retinopathy (DR), a prevalent diabetic complication and a leading cause of vision impairment, has become a focal point for researchers in this revolution. Advancements in convolutional neural networks (CNNs) have paved the way for intricate image analysis, enabling the automatic identification and classification of DR from retinal scans. These networks, coupled with traditional machine learning techniques, offer the potential to discern the subtlest pathological changes that might be missed during manual examinations. Additionally, advanced image pre-processing techniques are being integrated to enhance image quality, correct artefacts, and emphasize DR-specific features. Yet, as the field rapidly advances, it is essential to critically analyze these methodologies’ performance, benefits, and limitations. A holistic understanding of their capabilities and constraints can guide future research and optimize clinical implementations. Table 1 provides a comprehensive summary of significant approaches in the DR detection landscape, detailing their methodologies, outcomes, and challenges.
Table 1 highlights the ongoing challenges in enhancing the precision of VTDR detection through automated mechanisms, even with the introduction of various advanced machine and deep learning techniques. A significant portion of these methods leverages the power of convolutional neural networks (CNNs) for efficient feature extraction and subsequent classification. In contrast, a subset integrates SVM alongside morphological and geometrical attributes. Yet, these strategies encounter challenges related to accuracy, operational efficiency, and adaptability to extensive datasets and intricate retinal imagery. To address these recognized shortcomings, our suggested methodology encompasses stages of pre-processing, data enhancement, feature extraction, and eventual classification, and all are fine-tuned for VTDR pinpointing. Central to this proposal is using a hybrid CNN-SVD model, specifically for extracting and condensing features from retinal fundus snapshots. This is paired with an advanced version of the improved support vector machine (ISVM) to classify the diverse DR stages. This investigation’s paramount importance is rooted in its potential to offer a refined, swifter approach towards VTDR detection, paving the way for timely detection and effective medical intervention against this predominant instigator of global visual disorders and blindness. To attest to the efficacy of the introduced method, the performance metrics encompass accuracy, sensitivity, specificity, and the F1-score, reinforcing the technique’s superiority and effectiveness.

3. Materials and Methods

This study introduces an advanced methodology for detecting vision-threatening diabetic retinopathy (VTDR) by harnessing retinal fundus images from the renowned IDRiD public dataset. We incorporated several image enhancement strategies to ensure these images were at their analytical best. Image resizing standardizes all images to a uniform dimension for consistent analysis, whereas histogram balancing accentuates intricate features by equalizing pixel intensity distributions. Additionally, contrast enhancement sharpens image details, significantly improving clarity. Robust data augmentation strategies are employed to counter the potential pitfalls of model underfitting and overfitting due to data imbalances. By expanding our dataset with minor transformations of the original images, we amplify the volume of data and its diversity. Venturing beyond conventional methods, the analysis introduces a composite model that combines the strengths of convolutional neural networks (CNN) and singular value decomposition (SVD). This innovative fusion effectively extracts salient features from the images and emphasizes the most informative attributes. For classification, an improved support vector machine with a radial basis function (RBF) kernel delineates DR into five nuanced stages, offering a detailed understanding of disease progression. The effectiveness and reliability of our approach are ascertained through a comprehensive set of performance metrics, including F1-score, accuracy, sensitivity, and specificity. The suggested procedure is shown in flowchart form in Figure 1 to represent our methodology visually.

3.1. Pre-Processing and Data Augmentation

Challenges such as blurred images or unclear features within the dataset necessitate robust pre-processing methods. These methods not only correct the imperfections but also enhance the overall quality of the images for subsequent analysis.

3.1.1. Pre-processing Techniques

  • FI Scaling: We transform retinal images to RGB through the inverse YCbCr transformation. This step is not just a color space conversion; it also ensures that all images conform to a standardized size, facilitating uniformity for the ensuing steps and ensuring no data are lost in the subsequent stages.
  • Histogram Equalization and Contrast Stretching: The intensity distribution within images often varies, which could mask vital details. By employing histogram equalization, we redistribute these intensities, ensuring a balanced representation of the image’s features. Moreover, with contrast-limited adaptive histogram equalization (CLAHE), we ensure that this redistribution does not lead to excessive noise, as it limits extreme enhancements, providing a more natural and clearer image (a minimal code sketch of this step follows the list). Data augmentation becomes crucial for the AI model to generalize well on unseen data and avoid overfitting. By creating new, varied images from the existing dataset, we ensure the model trains on a more comprehensive set of data [47].
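Before moving to augmentation, the pre-processing chain above can be made concrete. The sketch below is a minimal Python/OpenCV illustration (the paper’s experiments ran in MATLAB); the target size, clip limit, and tile grid are assumed values, not settings reported in the text.

```python
import cv2

def preprocess_fundus(path, size=(512, 512)):
    """Resize a fundus image and apply CLAHE to its luminance channel."""
    bgr = cv2.imread(path)                    # OpenCV loads images as BGR
    bgr = cv2.resize(bgr, size)               # standardize dimensions
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    y, cr, cb = cv2.split(ycrcb)
    # CLAHE: histogram equalization with a clip limit that prevents the
    # extreme enhancements (and noise blow-up) plain equalization can cause.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    y = clahe.apply(y)
    # Contrast stretching of the equalized channel to the full 8-bit range.
    y = cv2.normalize(y, None, 0, 255, cv2.NORM_MINMAX)
    return cv2.cvtColor(cv2.merge([y, cr, cb]), cv2.COLOR_YCrCb2BGR)
```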

3.1.2. Data Augmentation Techniques

  • Rotation: Rotating the images between 0 and 360 degrees ensures the model is not biased towards any specific orientation. In real-world scenarios, images can arrive at varied angles, and this step trains the model to recognize features irrespective of orientation.
  • Shearing: We mimic potential distortions by introducing shearing at angles ranging from 10 to 20 degrees. This ensures that minor changes in perspective or angle do not hinder the model’s recognition capabilities.
  • Flip: In medical imaging, mirror image variations are common due to different imaging angles. Flipping images horizontally and vertically introduces the model to these possibilities, making it robust against such variations.
  • Zoom: Zooming in and out within a range of (1/1.3, 1.3) simulates different focus levels. It trains the model to identify features even when they are not at the optimal focus, ensuring consistent performance across varied image quality.
  • Crop: Cropping images to 85% and 95% of their original size exposes the model to images where certain features might be partially off-frame or images taken with different resolutions.
  • Translation: Translating or shifting images between −25 and 25 pixels in all directions imitates potential misalignments during imaging. The model learns to recognize features even when they are slightly displaced (the sketch after this list shows how these transformations combine).
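Taken together, these transformations map naturally onto a standard augmentation pipeline. The following sketch expresses the ranges listed above with Keras’s ImageDataGenerator; random cropping has no built-in option there and would be applied separately, and the library choice itself is an assumption (the experiments ran in MATLAB).

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=360,          # rotate anywhere between 0 and 360 degrees
    shear_range=20,              # shear angle in degrees (10-20 used above)
    horizontal_flip=True,        # mirror-image variations
    vertical_flip=True,
    zoom_range=[1 / 1.3, 1.3],   # zoom in and out
    width_shift_range=25,        # translate up to +/-25 pixels
    height_shift_range=25,
    fill_mode="nearest",         # fill pixels exposed by the transforms
)

# flow() yields endlessly augmented batches from an image array x_train:
# batches = augmenter.flow(x_train, y_train, batch_size=64)
```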

3.2. Feature Extraction and Reduction by CNN-SVD from FIs

In this study, our primary goal was to harness the power of convolutional neural networks (CNNs) to delve deeply into the intricacies of fundus images (FIs). By doing so, we aimed to capture the intricate features that would allow us to differentiate between the varying stages of diabetic retinopathy (DR). The simplicity of our chosen CNN model, by design, enables a focused feature extraction specifically tailored for DR characteristics. A visual representation of our CNN-based feature extractor can be observed in Figure 1. As we traverse deeper into the model, each layer of the CNN helps dissect the image, homing in on features crucial for DR stage classification. To optimize the process, we have incorporated batch normalization. This streamlines the training process by ensuring consistent input distribution across layers and enhances the model’s generalization capabilities. In addition, we have integrated max-pooling, a technique that zeros in on the most pertinent information, distilling the images to their most relevant features. However, with deeper networks and more parameters, there is always the risk of the model becoming too closely fitted to the training data, a phenomenon known as overfitting. To counteract this, we introduced dropout layers. These layers randomly “turn off” certain neurons during training, ensuring that the model does not rely too heavily on any particular feature and promoting faster, more generalized learning. When optimizing the training process, we decided on the Adam optimizer. This decision was influenced by Adam’s proven track record of handling vast datasets efficiently, making it ideal for our expansive collection of FIs. Our CNN model culminates in a dense layer designed to capture 256 unique characteristics from every FI. However, recognizing that not every extracted feature would be equally important, we proceeded to a dimensionality reduction phase using singular value decomposition (SVD). This mathematical technique condenses our feature space, retaining only the most significant data patterns. The goal is straightforward: to simplify our dataset to its most essential elements while ensuring that the essence, which aids in DR classification, remains intact.
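To ground this pipeline, the sketch below pairs a small Keras CNN ending in a 256-unit dense layer with SVD-based reduction of the extracted features. The filter counts, input resolution, and number of retained components are assumptions for illustration only (the actual architecture is the one depicted in Figure 1, and the experiments ran in MATLAB).

```python
import numpy as np
from tensorflow.keras import layers, models
from sklearn.decomposition import TruncatedSVD

def build_feature_extractor(input_shape=(224, 224, 3)):
    """CNN feature extractor ending in a 256-unit dense layer."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.BatchNormalization(),           # consistent input distributions
        layers.MaxPooling2D(),                 # keep the most salient responses
        layers.Conv2D(64, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),                   # guard against overfitting
        layers.Dense(256, activation="relu"),  # 256 characteristics per FI
    ])

extractor = build_feature_extractor()
images = np.random.rand(8, 224, 224, 3).astype("float32")  # stand-in data
features = extractor.predict(images)           # shape: (num_images, 256)

# SVD condenses the 256-dim features to the r most significant components;
# r = 4 merely fits the tiny stand-in batch and would be tuned in practice.
svd = TruncatedSVD(n_components=4)
reduced = svd.fit_transform(features)
```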

3.3. In-Depth Exploration of the Proposed Classification Technique

The realm of machine learning is vast and varied. For classification problems, a plethora of algorithms exist, each with its unique strengths, weaknesses, and applications. Some of these myriad algorithms stand out due to their efficacy in specific scenarios or their versatility across varied datasets. Three such algorithms that have garnered attention in recent years are the improved support vector machine with radial basis function (ISVM-RBF), K-nearest neighbor (KNN), and decision tree (DT). An ensemble technique known as the voting method often finds its way into sophisticated machine learning pipelines.

3.3.1. Improved Support Vector Machine with Radial Basis Function

Before detailing the innovative SVM-RBF approach, it is worth appreciating the role of data scaling in SVM. This is more than just a preparatory step; it is a fundamental cornerstone. Data scaling ensures that each attribute is uniformly treated, irrespective of its original numerical range. This becomes critical, especially in scenarios where SVM relies heavily on kernel values rooted in the inner products of feature vectors, such as the linear and polynomial kernels. Without this harmonization, the risk of numerical problems stemming from large attribute values becomes significant.
Simulated annealing, when employed in the SVM, does more than optimize; it introduces a systematic, probabilistic technique to explore the vast solution space. The genius behind it lies in mimicking the annealing process: random variations are applied to the current solution. This probabilistic exploration prevents the algorithm from being trapped in local optima, a frequent challenge in many optimization strategies. To ensure the balance between exploration and exploitation, a temperature parameter within simulated annealing is judiciously adjusted, determining the likelihood of accepting solutions that might be worse than the current one.
The cross-validation aspect of SVM-SA, particularly its k-fold variation, is an analytical marvel. The dataset is divided into k subsets. In each iteration, one subset serves as the validation set whereas the remaining subsets form the training set. This rotation ensures that each data point has its turn in the validation set, granting a holistic evaluation. Optimizing its hyper-parameters, such as the penalty parameter C and the kernel coefficient γ, is crucial for the SVM with an RBF kernel. The cross-validation score acts as a feedback mechanism in the simulated annealing process, guiding the search for the best hyper-parameter values.
One unique strength of our SVM-RBF is its ability to refine the search space adaptively. As simulated annealing explores potential solutions, it can zoom in on promising regions and explore them with greater granularity. By employing a virtual window around the current best solution, finer searches are executed, ensuring that potential areas with optimal solutions are not overlooked. Finally, the SVM-RBF’s iterative nature demands a meticulous choice of starting point. The initial values for the SVM’s parameters C and γ are often chosen randomly; this randomness seeds the diversity required for simulated annealing’s exploration. As iterations proceed, the combination of these parameters is continually adjusted, seeking to maximize the cross-validation score and thus the overall accuracy (a minimal sketch of this loop is given below). In essence, SVM-RBF is a masterclass in optimization, synergizing the SVM’s structural robustness with the adaptive, exploratory strengths of simulated annealing, resulting in a classification tool that stands out in precision and adaptability.
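A minimal sketch of the loop just described, assuming log-space random starts for C and γ, Gaussian perturbations, and a geometric cooling schedule (none of which are specified in the text):

```python
import math
import random
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def anneal_svm(X, y, iters=50, t0=1.0, cooling=0.95, k=5, seed=0):
    """Simulated annealing over (C, gamma) guided by k-fold CV accuracy."""
    rng = random.Random(seed)
    # Random starting point in log-space seeds the exploration's diversity.
    logC, logG = rng.uniform(-2, 3), rng.uniform(-4, 1)

    def score(lc, lg):
        clf = SVC(kernel="rbf", C=10**lc, gamma=10**lg)
        return cross_val_score(clf, X, y, cv=k).mean()

    cur = best = score(logC, logG)
    best_params, t = (logC, logG), t0
    for _ in range(iters):
        # Random variation of the current solution (the "annealing" move).
        nC, nG = logC + rng.gauss(0, 0.5), logG + rng.gauss(0, 0.5)
        cand = score(nC, nG)
        # Always accept improvements; accept worse solutions with probability
        # exp(dE/t), which is what lets the search escape local optima.
        if cand > cur or rng.random() < math.exp((cand - cur) / t):
            logC, logG, cur = nC, nG, cand
            if cur > best:
                best, best_params = cur, (logC, logG)
        t *= cooling  # shift gradually from exploration to exploitation
    return 10 ** best_params[0], 10 ** best_params[1], best

# Usage: C, gamma, cv_acc = anneal_svm(X_train, y_train)
```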

3.3.2. K-Nearest Neighbor (KNN)

KNN operates on a simple premise: it leverages the power of community knowledge. For every new data point, it looks at the “k” nearest data points in the training set and decides based on the majority class among them. The rationale behind KNN’s effectiveness is that similar data points (regarding features) often belong to the same class. It is a non-parametric method useful when the underlying data distribution is unknown.

3.3.3. Decision Tree (DT)

DTs break down a dataset into smaller and smaller subsets based on specific criteria, leading to a set of decisions. The hierarchical nature of decision trees often mirrors the decision-making process in real-world scenarios. Techniques such as the chi-squared method can be employed to determine the significance of splits, ensuring that the tree is making meaningful decisions at every node.
In the vast landscape of machine learning, it is evident that no singular algorithm can be universally optimal for all scenarios. Even the most advanced algorithms can occasionally falter, showcasing blind spots or producing errors under unique situations. Against this backdrop, the voting method emerges as a beacon of collective intelligence. Instead of relying on the decision of one, the voting method amalgamates predictions from multiple algorithms to arrive at a more informed and consensus-driven decision. Several advantages drive its adoption for classification. Firstly, the method celebrates the diversity inherent in various algorithms. Each algorithm’s unique strength and decision-making pathway contributes to a more panoramic data view. Secondly, a recurring challenge in machine learning is the menace of overfitting, where models might fit too closely to a specific subset of data. By pooling predictions, the voting method considerably diminishes the risk of overfitting. Moreover, classification accuracy often improves, given that multiple “votes” or predictions on a classification tend to refine the final decision compared with relying solely on a single model.
Navigating through the myriad of algorithms, three have emerged as notably effective for certain scenarios: ISVM-RBF, KNN, and DT. ISVM-RBF showcases proficiency in managing extensive and intricate datasets. KNN simplifies the classification process by capitalizing on the power of similarity, and the decision tree (DT) offers methodical, tiered decision making. The versatility of these algorithms is evident in their applicability across diverse datasets, be they linear or nonlinear, small or vast, or straightforward or complex. True brilliance shines when their predictions are unified through the voting method, fortifying the ensemble system against the potential weaknesses of individual algorithms (sketched below). In summary, the strategic amalgamation of these three algorithms with the voting method provides a harmonious classification blend, effectively leveraging each component’s strengths while buffering their weaknesses.
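The ensemble described here maps directly onto scikit-learn’s hard-voting classifier; the sketch below is illustrative, with hyperparameter values (k, tree depth, C, γ) assumed rather than taken from the paper.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Scaling feeds the SVM uniformly ranged attributes, as Section 3.3.1 stresses.
svm_rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma=0.01))
knn = KNeighborsClassifier(n_neighbors=5)
dt = DecisionTreeClassifier(max_depth=10)

# Hard voting: each classifier casts one vote and the majority class wins.
ensemble = VotingClassifier(
    estimators=[("isvm_rbf", svm_rbf), ("knn", knn), ("dt", dt)],
    voting="hard",
)
# ensemble.fit(X_train, y_train); predictions = ensemble.predict(X_test)
```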

3.4. Comprehensive Performance and Complexity Evaluation of the Proposed Methodology

Evaluating the algorithms’ effectiveness and computational demands is paramount in machine learning and data analysis. We will delve into two critical components of our proposed methodology: performance evaluation metrics and the theoretical computational complexity assessment.

3.4.1. Performance Evaluation Metrics

In the arena of machine learning, the true capability of a model is mirrored not just by its training scores but also by how it performs on unseen data. We have chosen a series of evaluation metrics to assess the proposed model’s efficacy objectively. These indicators provide insights into the model’s various classification facets [48,49].
Sensitivity (or True Positive Rate) provides insights into the model’s adeptness at capturing all instances that truly belong to the positive class, often referred to as “recall”. It essentially quantifies the proportion of actual positives that were identified correctly.
Specificity, a counterpoint to sensitivity, focuses on the model’s skill in correctly distinguishing the instances that belong to the negative class. It gives a clear picture of the classifier’s accuracy when it predicts that an instance does not belong to the positive class.
Accuracy, perhaps the most straightforward of all metrics, offers a comprehensive overview of the model’s performance. It evaluates how many of the model’s classifications align with the actual labels, encompassing positives and negatives.
F1-Score acts as a bridge between precision and recall. This metric proves invaluable in datasets where class distribution is skewed. It amalgamates precision and recall, providing a single score that balances these metrics’ trade-offs.
MCC (Matthews correlation coefficient) considers true and false positives and negatives, and is generally regarded as a balanced measure that can be used even if the classes are of very different sizes.
AUC (Area Under the Curve) provides an aggregate performance measure across all possible classification thresholds. It quantifies the overall ability of the model to discriminate between positive and negative classes.
These meticulously chosen metrics form a formidable arsenal, granting us a panoramic view of the model’s real-world adaptability and performance. The mathematical intricacies of each of these metrics are encapsulated in Table 2.
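As a concrete companion to Table 2, the snippet below computes these metrics for a binary (one-vs-rest) case with scikit-learn; the toy labels and scores are placeholders only.

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, f1_score,
                             matthews_corrcoef, roc_auc_score)

def evaluate(y_true, y_pred, y_score):
    """Binary (one-vs-rest) versions of the metrics listed in Table 2."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "sensitivity": tp / (tp + fn),         # true positive rate / recall
        "specificity": tn / (tn + fp),         # true negative rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f1": f1_score(y_true, y_pred),        # precision-recall balance
        "mcc": matthews_corrcoef(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score), # threshold-free discrimination
    }

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                  # toy labels
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.2, 0.6, 0.3])  # toy scores
print(evaluate(y_true, (y_score >= 0.5).astype(int), y_score))
```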

3.4.2. Evaluation of Theoretical Computational Complexity

  • Pre-processing and Data Augmentation: These foundational steps predominantly exhibit a complexity of O(n), with n symbolizing image pixel count.
  • Feature Extraction via CNN-SVD: With the convolutional neural network (CNN) relying on its architectural depth (L layers) and breadth (F filters), its complexity can be estimated as O(L × F × n²). The singular value decomposition (SVD), on the other hand, has a complexity represented by O(r × m × n).
  • Working Mechanism of Novel ISVM-RBF: The intricacies of SVM-RBF revolve around its support vectors (S) and the dimensionality (D) of the feature space, leading to a complexity of O(S × D).
  • K-Nearest Neighbor (KNN): The simplicity of KNN does not shield it from computational demands, especially with a complexity driven by the number of training samples (N) and dimensionality (D), approximated as O(N × D).
  • Decision Tree (DT): For this structured algorithm, its complexity is determined by the feature count (F), computed as O(F).
Conclusively, the overarching computational complexity of our methodology hinges on the most resource-intensive segment. Here, the hybrid CNN-SVD for feature extraction emerges as the most demanding, considering its O(L × F × n²) complexity. However, a precise computational complexity estimation remains elusive without specifics concerning the L, F, and n values.

4. Results

4.1. IDRiD Dataset

To comprehensively assess the effectiveness of our proposed strategy, we have chosen to benchmark it against prior studies that have extensively utilized the IDRiD dataset [50]. The IDRiD dataset is a rich collection of 516 retinal images meticulously curated to represent various pathological stages associated with DR and DME. This dataset is structured with 413 training and 103 testing images, ensuring a robust evaluation framework, as detailed in Figure 2.
A distinguishing feature of the IDRiD dataset is its detailed annotation. Each image is accompanied by labels that not only signify the presence of VTDR damage but also indicate the severity of the damage. For clarity, the severity of DR in the dataset is systematically categorized on a scale that spans five distinct groups: normal VTDR, mild non-proliferative VTDR, moderate non-proliferative VTDR, severe non-proliferative VTDR, and proliferative VTDR.
Each group represents a progressive stage of DR severity, and our study’s classification methodology aligns with these five predefined classes.
Beyond the severity classification, the IDRiD dataset is invaluable for its segmentation masks. These masks are designed to achieve pinpoint spatial accuracy, especially concerning four predominant lesion types: hard exudates, soft exudates, hemorrhages, and microaneurysms. For a more illustrative understanding, Figure 1 showcases a selection of fundus images (FIs) from the dataset juxtaposed with their respective ground truth masks, highlighting the meticulous detail captured in the IDRiD dataset.

4.2. Experimental Setup

All experiments were run in the MATLAB programming environment on an Intel Core i7 7th-generation CPU with a 1 TB SSD and 32 GB of RAM. In this section, we emphasize the main outcomes of the classifier results, time complexity, and image pre-processing. The proposed work is contrasted with traditional approaches in a separate presentation. The hyperparameter configurations are optimized for performance. A batch size of 64 strikes a balance between computational efficiency and convergence stability. The learning rate is set at 0.001, ensuring consistent and effective model adjustments. A weight decay of 0.005 prevents overfitting by penalizing large individual weights. The ADAM optimizer, which combines the strengths of AdaGrad and RMSProp, dynamically adjusts the learning rate, offering efficient training. The categorical cross-entropy loss function is ideal for multiclass classification, promoting accurate predictions. The class weights, set at [−1,1], address dataset imbalances but must be used cautiously. The model runs for 100 epochs, a number chosen based on previous model behaviors and dataset specifics, ensuring neither underfitting nor overfitting.
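For orientation, these settings translate into the Keras-style configuration sketched below. This is illustrative only: the experiments ran in MATLAB, the small classifier head is a stand-in, and the non-negative class-weight dictionary is an assumption, since Keras does not accept the [−1,1] weighting as written (weight_decay on Adam also requires a recent Keras/TensorFlow version).

```python
from tensorflow.keras import layers, models, optimizers

# Tiny stand-in head; the real model is the CNN-SVD pipeline of Section 3.2.
model = models.Sequential([
    layers.Input(shape=(256,)),             # e.g., CNN-SVD feature vectors
    layers.Dense(64, activation="relu"),
    layers.Dense(5, activation="softmax"),  # five DR severity classes
])

model.compile(
    optimizer=optimizers.Adam(learning_rate=0.001, weight_decay=0.005),
    loss="categorical_crossentropy",        # multiclass objective
    metrics=["accuracy"],
)

# model.fit(x_train, y_train, batch_size=64, epochs=100,
#           class_weight={i: 1.0 for i in range(5)},  # assumed stand-in
#           validation_split=0.1)
```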

4.3. Image Processing Results

In this section, we delve deeper into the intricacies of the image processing results, with a specific emphasis on the segmentation and classification outcomes. As illustrated in Figure 1, our methodology effectively discerns the findings for illness grade 4 within the designated detection zones.
As detailed in Figure 1, the segmentation process is a multi-step procedure that begins with a pre-processing step. This step is crucial for enhancing the detection of microaneurysms (MAs) and eliminating both inherent and external noise present in the fundus images. The images undergo a series of transformations, including conversion from RGB to YCbCr, median filtering, and contrast stretching, to name a few. Following pre-processing, adaptive segmentation algorithms locate red and bright lesions. The red lesions, mostly hemorrhages, are recognized using adaptive segmentation at a sensitivity of 0.15. Bright lesions, which are suggestive of exudates, are recognized with a higher sensitivity value of 0.85. The segmentation masks are critical to our process. Their main function is to outline and emphasize the areas of interest, especially the four kinds of lesions: hard exudates, soft exudates, hemorrhages, and microaneurysms. The precision of these masks is critical because it guarantees that the ensuing feature extraction and classification algorithms are based on accurate and distinct lesion demarcations. In the context of our proposed technique, these masks serve as a fundamental layer, allowing the system to distinguish between distinct lesion kinds and severity levels.
Post-detection, feature extraction is performed for both red and bright lesions. Features such as the number of regions, mean area, mean perimeter, and mean solidity are extracted and stacked together (see the sketch below). These features are then fused lexicographically, forming the foundation for the subsequent classification process. The training algorithm constructs the image set and assigns target classes based on the dataset’s severity. Three classifiers, namely ISVM-RBF, KNN, and DT, are then trained using these data. The testing algorithm predicts the results through feature extraction by the given classifier, and a voting method is employed to finalize the output.
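The named region statistics correspond to standard connected-component properties; a minimal sketch, assuming binary lesion masks from the adaptive segmentation step and reading “lexicographic” fusion as fixed-order concatenation:

```python
import numpy as np
from skimage.measure import label, regionprops

def lesion_features(mask):
    """Region statistics for one binary lesion mask (red or bright)."""
    regions = regionprops(label(mask))        # connected-component analysis
    if not regions:
        return np.zeros(4)
    return np.array([
        len(regions),                             # number of regions
        np.mean([r.area for r in regions]),       # mean area
        np.mean([r.perimeter for r in regions]),  # mean perimeter
        np.mean([r.solidity for r in regions]),   # mean solidity
    ])

def fused_features(red_mask, bright_mask):
    # Fixed-order concatenation of the red- and bright-lesion vectors.
    return np.concatenate([lesion_features(red_mask),
                           lesion_features(bright_mask)])
```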
It is worth noting that the algorithm outputs three sets of features: red, bright, and fused. However, for the classification, the fused features are predominantly utilized. The classification encompasses five severity levels, ranging from “No DR” to “Proliferative DR”. The accuracy of each model is quantified by comparing the precisely classified labels against the total number of images per class. The segmentation results offer a comprehensive view of the lesions; however, due to the absence of ground truth in the disease classification database, direct quantification of these results remains challenging. Nevertheless, Table 3 provides a comparative analysis of the performance metrics across various classifiers, shedding light on the efficacy of our proposed method and the importance of the segmentation masks in achieving precise lesion detection and classification.
Figure 3 presents a detailed heatmap that visually represents the performance of various machine learning models across different severity thresholds. Along the x-axis, the heatmap displays severity thresholds, specifically “Normal”, “Mild”, “Moderate”, “Severe”, and “PDR”. Vertically, on the y-axis, different machine learning models such as “KNN”, “Binary Trees”, and “SVM-Linear” are listed. At the intersection of a model and a severity threshold, each heatmap cell provides the accuracy percentage of that model for the given threshold. The color intensity within each cell indicates its accuracy value, with the “coolwarm” color map employed: cooler shades of blue represent lower accuracy percentages, whereas warmer shades of red indicate higher accuracies. Each cell is also annotated with the exact accuracy percentage it represents for precise interpretation. Through this heatmap, one can effortlessly compare and discern the efficacy of each machine learning model at various severity levels.
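A heatmap of this kind is simple to reproduce; the sketch below uses matplotlib/seaborn with placeholder accuracy values standing in for the percentages reported in Figure 3.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

models = ["KNN", "Binary Trees", "SVM-Linear", "ISVM-RBF"]
thresholds = ["Normal", "Mild", "Moderate", "Severe", "PDR"]
acc = np.random.uniform(70, 100, size=(len(models), len(thresholds)))  # placeholders

ax = sns.heatmap(acc, annot=True, fmt=".1f", cmap="coolwarm",
                 xticklabels=thresholds, yticklabels=models,
                 cbar_kws={"label": "Accuracy (%)"})
ax.set_xlabel("Severity threshold")
ax.set_title("Model accuracy by severity threshold")
plt.tight_layout()
plt.show()
```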
Figure 4a–g provides a comprehensive visualization of the performance metrics of various machine learning models evaluated against different severity thresholds. The dataset encompasses models such as “KNN”, “Binary Trees”, “SVM-Linear”, and several others. These models are assessed against severity thresholds including “Normal”, “Mild”, “Moderate”, “Severe”, and “PDR”. Four primary performance metrics are considered: accuracy (“Acc%”), sensitivity (“Sen%”), specificity (“Spc%”), and F1-Score (“F1-Score%”). Among the models, ISVM-RBF stands out, consistently outperforming the others, including KNN, decision trees, and various SVM configurations.
In Figure 4a–d, the performance metrics are visualized using line charts. The x-axis of these charts represents the severity thresholds, whereas the y-axis denotes the percentage values of the respective metric. Each machine learning model’s performance is depicted as a unique line, making it straightforward to compare their performances across different severity thresholds. These charts are enhanced with gridlines and legends, and each is titled based on the metric it represents, ensuring clarity. Figure 4e shows the Matthews correlation coefficient (MCC%) for the machine learning models using a grouped bar graph. Each severity threshold group on the graph contains bars representing each model’s performance, providing a clear comparative view. Figure 4f delves into the receiver operating characteristic (ROC) curves for a range of machine learning classifiers using a synthetic dataset. After generating a dataset and dividing it into training and testing sets, classifiers such as K-nearest neighbor (KNN), decision trees, and various support vector machines (SVMs) are trained and evaluated. The resulting ROC curves plot the true positive rate (TPR) against the false positive rate (FPR) for each classifier, with distinct colors and line styles. The area under each curve (AUC) is also calculated and presented in the legend, measuring each classifier’s overall performance.
Lastly, Figure 4g is designed to provide a detailed view of the performance metrics of the machine learning models. The resulting line chart offers a clear perspective on each model’s performance across the severity thresholds, with each model–metric combination distinctly represented. The visualization has gridlines, a legend, and a title, ensuring a thorough and clear understanding of the evaluations.

4.4. Time Complexity Analysis

In the domain of image processing, especially when dealing with intricate medical images, the time complexity of an algorithm is pivotal for its real-world applicability. Our proposed methodology has been meticulously designed to ensure efficiency, but it is essential to delve deeper into its time complexity to understand its practical implications. The pre-processing and feature extraction phase is arguably the most time-intensive segment of our approach. This phase encompasses image conversion, noise removal, and adaptive segmentation. Although crucial for the method’s accuracy, these operations introduce specific computational bottlenecks. The intricacies involved in these steps, especially adaptive segmentation, can significantly elongate the processing time. For instance, although seemingly straightforward, converting images from RGB to other formats and vice versa can be computationally demanding when dealing with large datasets. Similarly, noise removal, essential for the clarity of medical images, requires intricate filtering processes that can be time-consuming.
Training the classifiers is another segment that demands attention. The time complexity in this phase is directly influenced by the dataset’s size and the nature of the classifiers employed. Although our chosen classifiers (SVM, KNN, and DT) have been optimized for speed, the sheer volume of data and the intricacies of the training process mean that this phase cannot be overlooked when considering the overall time complexity. Training involves iterative processes, and with a large dataset, even minor inefficiencies can compound, leading to extended training times.
The testing phase, in contrast, is relatively swift. It involves making predictions using the trained classifiers and then employing a voting mechanism to finalize the prediction. However, it is worth noting that the speed of this phase is contingent on the efficiency of the previous stages, especially the training phase. Although our proposed approach is designed for efficiency, certain inherent steps, especially pre-processing and feature extraction, can be computationally intensive. Recognizing and addressing these bottlenecks is crucial for enhancing the method’s speed and ensuring its suitability for real-time medical applications.

4.5. Comparison with State-of-the-Art Studies

Table 4 meticulously contrasts the proposed ISVM-RBF mixture model with a series of state-of-the-art studies conducted from 2019 to 2023. These studies, spearheaded by various authors, have employed various methodologies. For instance, in 2019, the GNN model in [51] achieved an accuracy of 78.3%, whereas the CNN + handcrafted features model in [52] had a reported 90.70% accuracy. As we progress through the years, there is a discernible trend of increasing accuracy, with models such as the DCNN in 2022 achieving 73.00% and the ELM model reaching an impressive 99.04%. However, the crowning jewel in this comparative analysis is undeniably the proposed work from 2023. The ISVM-RBF mixture model boasts an exceptional accuracy of 99.89% and demonstrates a sensitivity of 89.20% and a flawless specificity of 100.00%. This dual focus on accuracy and sensitivity sets the proposed model apart. Whereas many studies have showcased high accuracy rates, they often do so at the expense of sensitivity or vice versa. The proposed model’s balanced performance underscores its robustness and versatility.
Furthermore, a deeper dive into the metrics reveals that the proposed model’s performance, especially regarding sensitivity, is unparalleled. For instance, whereas the DLM model from 2023 reported an admirable sensitivity of 89.00%, it still falls short of the proposed model’s 89.20%. This minute yet significant difference highlights the proposed model’s superior capability in correctly identifying positive cases, which is crucial in medical diagnostics. On the computational front, the proposed algorithm is not just about achieving high scores but also about efficiency. Although the pre-processing technique encompasses all image processing steps and is the most time-intensive at 9.5935 s, it is a testament to the model’s comprehensive approach to ensure accuracy and reliability in its predictions. In conclusion, the proposed ISVM-RBF mixture model is a testament to the advancements in the field, setting a new benchmark for future research.
Table 4. Proposed and state-of-the-art work comparison.

| Author | Year | Dataset | Method | Acc (%) | Sen (%) | Spc (%) |
|---|---|---|---|---|---|---|
| [51] | 2019 | IDRiD | GNN | 78.30 | - | - |
| [52] | 2019 | IDRiD | CNN + handcrafted features | 90.70 | - | - |
| [53] | 2019 | IDRiD | R-CNN | - | 83.00 | 94.00 |
| [54] | 2020 | IDRiD | CANet | 92.60 | - | - |
| [55] | 2020 | IDRiD | CNN | 90.29 | 88.75 | 96.89 |
| [55] | 2020 | MESSIDOR | CNN | 90.89 | 88.75 | 96.30 |
| [56] | 2020 | IDRiD | RSNET | 86.33 | - | - |
| [57] | 2020 | IDRiD, Kaggle | CNN | 81.00 | - | - |
| [58] | 2021 | IDRiD | Fine KNN | 94.00 | - | - |
| [58] | 2021 | MESSIDOR | Fine KNN | 98.10 | - | - |
| [59] | 2022 | IDRiD | TL | 71.00 | - | 71.00 |
| [60] | 2022 | IDRiD | DCNN | 73.00 | - | - |
| [61] | 2022 | IDRiD | ELM | 99.04 | - | - |
| [62] | 2023 | IDRiD | GNN | 96.00 | - | - |
| [63] | 2023 | IDRiD | DLM | 96.65 | 89.00 | 99.00 |
| Proposed work | 2023 | IDRiD | ISVM-RBF mixture model | 99.89 | 89.20 | 100.00 |

5. Discussion

Although promising, the proposed methodology for VTDR detection and classification comes with challenges that need to be addressed for effective real-world implementation.
Data Collection and Pre-processing: One of the primary challenges is the collection of a comprehensive and diverse dataset of retinal images. The quality and diversity of data play a pivotal role in the model’s performance. The pre-processing of these images to ensure they are suitable for training can be intricate, especially when dealing with varied image qualities, lighting conditions, and artefacts.
Computing Resources: Deep learning models, especially those used for image processing, demand substantial computing resources. The need for GPUs or TPUs escalates the cost and poses challenges in terms of maintenance and scalability.
Hyperparameter Optimization: Although tuning hyperparameters is crucial for enhancing model performance, it is a task easier said than done. This process’s vast parameter space and time-intensive nature make it a significant challenge.
Interpretability and Explainability: The “black box” nature of AI models poses challenges in clinical settings. For healthcare professionals to trust and adopt these models, they need to understand how decisions are made, emphasizing the need for model interpretability.
Model Deployment: Deploying the AI model in real-world clinical settings introduces another layer of complexity. This encompasses ethical considerations, ensuring patient privacy, and adhering to stringent regulatory compliance standards.
In our study, as illustrated in Figure 1, the pre-processing technique accentuates the lesions, aiding in more accurate DR detection. We employed two distinct algorithms for lesion identification. Post-identification, we extracted features from these lesions, amalgamating them into a cohesive feature vector. The voting system’s cumulative performance, gauged against an escalating severity threshold for each classifier, outperformed individual classifiers. This can be attributed to the enhanced clarity of the lesion, facilitating more accurate classification. By setting a higher severity threshold, we observed an uptick in the accuracy across all classifiers. Notably, mixed models showcased an exemplary overall accuracy of 99.89% at a disease severity level of 4.

6. Conclusions

Timely intervention in cases of vision-threatening diabetic retinopathy (VTDR) is paramount for patient wellbeing, and the integration of modern technology is revolutionizing this diagnostic process. This research leveraged cutting-edge artificial intelligence models to categorize the severity of retinal lesions meticulously. The study’s core emphasis was precisely categorizing red and bright lesions, utilizing three distinct classifiers with a combined voting mechanism. The proposed methodology showcased exemplary results, achieving an astounding accuracy rate of 99.79% and sensitivity and specificity metrics of 85.4% and 100%, respectively, setting a new benchmark that eclipses existing state-of-the-art models. Nevertheless, it is imperative to recognize the inherent constraints of the proposed model. The pre-processing steps and feature extraction methodologies significantly shape the outcomes, underscoring the delicate balance between pivotal parameters. Although the hybrid approach adopted in this research has set high standards, the quest for perfection continues, indicating avenues for further refinement and enhancement. Looking ahead, there are several promising directions for future research. Integrating more advanced neural network architectures and exploring unsupervised learning techniques could enhance the model’s performance. Expanding the dataset to include more diverse retinal images from various ethnicities and age groups can improve the model’s generalizability. Collaborations with ophthalmologists and incorporating their expert insights can also lead to more clinically relevant models. Lastly, real-time application and testing of the model in clinical settings will be critical in transitioning from research to tangible patient benefits.

Author Contributions

Conceptualization, A.B. and X.L.; methodology, A.B.; software, A.B.; validation, A.B., H.L. and T.I.B.; formal analysis, H.L.; investigation, X.L., M.S. and T.I.B.; resources, H.L.; data curation, A.B.; writing—original draft preparation, A.B.; writing—review and editing, M.S.; visualization, A.B. and X.L.; funding acquisition, A.B. and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Foreign Young Talents Programme of the State Bureau of Foreign Experts Ministry of Science and Technology China (No. QN2023034001), National Natural Science Foundation of China (No. 71762010, No. 62262019), the Hainan Provincial Natural Science Foundation of China (No. 823RC488), and the Haikou Science and Technology Plan Project of China (No. 2022-016).

Data Availability Statement

The Indian Diabetic Retinopathy Image Dataset (IDRiD) is publicly available. The data supporting this study’s findings are available from the corresponding author or Anas Bilal (a.bilal19@yahoo.com) upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dao, N.N. Internet of Wearable Things: Advancements and Benefits from 6G Technologies. Futur. Gener. Comput. Syst. 2023, 138, 172–184. [Google Scholar] [CrossRef]
  2. Zhang, G.; Navimipour, N.J. A Comprehensive and Systematic Review of the IoT-Based Medical Management Systems: Applications, Techniques, Trends and Open Issues. Sustain. Cities Soc. 2022, 82, 103914. [Google Scholar] [CrossRef]
  3. Bilal, A.; Sun, G.; Mazhar, S.; Imran, A.; Latif, J. A Transfer Learning and U-Net-Based Automatic Detection of Diabetic Retinopathy from Fundus Images. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2022, 10, 663–674. [Google Scholar] [CrossRef]
  4. Grassmann, F.; Mengelkamp, J.; Brandl, C.; Harsch, S.; Zimmermann, M.E.; Linkohr, B.; Peters, A.; Heid, I.M.; Palm, C.; Weber, B.H.F. A Deep Learning Algorithm for Prediction of Age-Related Eye Disease Study Severity Scale for Age-Related Macular Degeneration from Color Fundus Photography. Ophthalmology 2018, 125, 1410–1420. [Google Scholar] [CrossRef] [PubMed]
  5. Bilal, A.; Sun, G.; Mazhar, S. Survey on Recent Developments in Automatic Detection of Diabetic Retinopathy. J. Fr. Ophtalmol. 2021, 44, 420–440. [Google Scholar] [CrossRef]
  6. Alahmari, F.; Naim, A.; Alqahtani, H. E-Learning Modeling Technique and Convolution Neural Networks in Online Education. In IoT-Enabled Convolutional Neural Networks: Techniques and Applications; Taylor & Francis: Oxford, UK, 2023. [Google Scholar]
  7. Krichen, M. Convolutional Neural Networks: A Survey. Computers 2023, 12, 151. [Google Scholar] [CrossRef]
  8. Sahoo, K.S.; Tiwary, M.; Luhach, A.K.; Nayyar, A.; Choo, K.K.R.; Bilal, M. Demand-Supply-Based Economic Model for Resource Provisioning in Industrial IoT Traffic. IEEE Internet Things J. 2022, 9, 10529–10538. [Google Scholar] [CrossRef]
  9. Singh, M.; Sahoo, K.S.; Nayyar, A. Sustainable IoT Solution for Freshwater Aquaculture Management. IEEE Sens. J. 2022, 22, 16563–16572. [Google Scholar] [CrossRef]
  10. Bilal, A.; Sun, G.; Li, Y.; Mazhar, S.; Khan, A.Q. Diabetic Retinopathy Detection and Classification Using Mixed Models for a Disease Grading Database. IEEE Access 2021, 9, 23544–23553. [Google Scholar] [CrossRef]
  11. Kukkar, A.; Gupta, D.; Beram, S.M.; Soni, M.; Singh, N.K.; Sharma, A.; Neware, R.; Shabaz, M.; Rizwan, A. Optimizing Deep Learning Model Parameters Using Socially Implemented IoMT Systems for Diabetic Retinopathy Classification Problem. IEEE Trans. Comput. Soc. Syst. 2022, 10, 1654–1665. [Google Scholar] [CrossRef]
  12. Karimi, D.; Warfield, S.K.; Gholipour, A. Transfer Learning in Medical Image Segmentation: New Insights from Analysis of the Dynamics of Model Parameters and Learned Representations. Artif. Intell. Med. 2021, 116, 102078. [Google Scholar] [CrossRef] [PubMed]
  13. Bilal, A.; Zhu, L.; Deng, A.; Lu, H.; Wu, N. AI-Based Automatic Detection and Classification of Diabetic Retinopathy Using U-Net and Deep Learning. Symmetry 2022, 14, 1427. [Google Scholar] [CrossRef]
  14. Bilal, A.; Sun, G.; Mazhar, S.; Imran, A. Improved Grey Wolf Optimization-Based Feature Selection and Classification Using CNN for Diabetic Retinopathy Detection. In Evolutionary Computing and Mobile Sustainable Networks: Proceedings of ICECMSN 2021; Springer: Singapore, 2022; Volume 116, pp. 1–14. [Google Scholar] [CrossRef]
  15. Bilal, A.; Sun, G.; Mazhar, S. Diabetic Retinopathy Detection Using Weighted Filters and Classification Using CNN. In Proceedings of the 2021 International Conference on Intelligent Technologies (CONIT), Hubli, India, 25–27 June 2021. [Google Scholar] [CrossRef]
  16. Hollon, T.C.; Pandian, B.; Adapa, A.R.; Urias, E.; Save, A.V.; Khalsa, S.S.S.; Eichberg, D.G.; D’Amico, R.S.; Farooq, Z.U.; Lewis, S.; et al. Near Real-Time Intraoperative Brain Tumor Diagnosis Using Stimulated Raman Histology and Deep Neural Networks. Nat. Med. 2020, 26, 52–58. [Google Scholar] [CrossRef] [PubMed]
  17. Neely, D.C.; Bray, K.J.; Huisingh, C.E.; Clark, M.E.; McGwin, G.; Owsley, C. Prevalence of Undiagnosed Age-Related Macular Degeneration in Primary Eye Care. JAMA Ophthalmol. 2017, 135, 570–575. [Google Scholar] [CrossRef]
  18. Balyen, L.; Peto, T. Promising Artificial Intelligence–Machine Learning–Deep Learning Algorithms in Ophthalmology. Asia-Pac. J. Ophthalmol. 2019, 8, 264–272. [Google Scholar]
  19. Topol, E.J. High-Performance Medicine: The Convergence of Human and Artificial Intelligence. Nat. Med. 2019, 25, 44–56. [Google Scholar] [CrossRef]
  20. Alam, T.M.; Shaukat, K.; Khan, W.A.; Hameed, I.A.; Almuqren, L.A.; Raza, M.A.; Aslam, M.; Luo, S. An Efficient Deep Learning-Based Skin Cancer Classifier for an Imbalanced Dataset. Diagnostics 2022, 12, 2115. [Google Scholar] [CrossRef]
  21. Duc, N.T.; Ryu, S.; Qureshi, M.N.I.; Choi, M.; Lee, K.H.; Lee, B. 3D-Deep Learning Based Automatic Diagnosis of Alzheimer’s Disease with Joint MMSE Prediction Using Resting-State FMRI. Neuroinformatics 2020, 18, 71–86. [Google Scholar] [CrossRef]
  22. Madan, P.; Singh, V.; Singh, D.P.; Diwakar, M.; Pant, B.; Kishor, A. A Hybrid Deep Learning Approach for ECG-Based Arrhythmia Classification. Bioengineering 2022, 9, 152. [Google Scholar] [CrossRef]
  23. Bilal, A.; Sun, G.; Mazhar, S.; Junjie, Z. Neuro-Optimized Numerical Treatment of HIV Infection Model. Int. J. Biomath. 2021, 14, 2150033. [Google Scholar] [CrossRef]
  24. Shi, Z.; Miao, C.; Schoepf, U.J.; Savage, R.H.; Dargis, D.M.; Pan, C.; Chai, X.; Li, X.L.; Xia, S.; Zhang, X.; et al. A Clinically Applicable Deep-Learning Model for Detecting Intracranial Aneurysm in Computed Tomography Angiography Images. Nat. Commun. 2020, 11, 6090. [Google Scholar] [CrossRef] [PubMed]
  25. Bilal, A.; Shafiq, M.; Fang, F.; Waqar, M.; Ullah, I.; Ghadi, Y.Y.; Long, H.; Zeng, R. IGWO-IVNet3: DL-Based Automatic Diagnosis of Lung Nodules Using an Improved Gray Wolf Optimization and InceptionNet-V3. Sensors 2022, 22, 9603. [Google Scholar] [CrossRef] [PubMed]
  26. Wetstein, S.C.; de Jong, V.M.T.; Stathonikos, N.; Opdam, M.; Dackus, G.M.H.E.; Pluim, J.P.W.; van Diest, P.J.; Veta, M. Deep Learning-Based Breast Cancer Grading and Survival Analysis on Whole-Slide Histopathology Images. Sci. Rep. 2022, 12, 15102. [Google Scholar] [CrossRef] [PubMed]
  27. Kermany, D.S.; Goldbaum, M.; Cai, W.; Valentim, C.C.S.; Liang, H.; Baxter, S.L.; McKeown, A.; Yang, G.; Wu, X.; Yan, F.; et al. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell 2018, 172, 1122–1131.e9. [Google Scholar] [CrossRef] [PubMed]
  28. Sundar, S.; Sumathy, S. An Effective Deep Learning Model for Grading Abnormalities in Retinal Fundus Images Using Variational Auto-Encoders. Int. J. Imaging Syst. Technol. 2023, 33, 92–107. [Google Scholar] [CrossRef]
  29. Latif, J.; Tu, S.; Xiao, C.; Ur Rehman, S.; Imran, A.; Latif, Y. ODGNet: A Deep Learning Model for Automated Optic Disc Localization and Glaucoma Classification Using Fundus Images. SN Appl. Sci. 2022, 4, 98. [Google Scholar] [CrossRef]
  30. Chen, W.; Yang, B.; Li, J.; Wang, J. An Approach to Detecting Diabetic Retinopathy Based on Integrated Shallow Convolutional Neural Networks. IEEE Access 2020, 8, 178552–178562. [Google Scholar] [CrossRef]
  31. Pan, X.; Jin, K.; Cao, J.; Liu, Z.; Wu, J.; You, K.; Lu, Y.; Xu, Y.; Su, Z.; Jiang, J.; et al. Multi-Label Classification of Retinal Lesions in Diabetic Retinopathy for Automatic Analysis of Fundus Fluorescein Angiography Based on Deep Learning. Graefe’s Arch. Clin. Exp. Ophthalmol. 2020, 258, 779–785. [Google Scholar] [CrossRef]
  32. Tymchenko, B.; Marchenko, P.; Spodarets, D. Deep Learning Approach to Diabetic Retinopathy Detection. In Proceedings of the ICPRAM 2020—9th International Conference on Pattern Recognition Applications and Methods, Valletta, Malta, 22–24 February 2020. [Google Scholar]
  33. Qummar, S.; Khan, F.G.; Shah, S.; Khan, A.; Shamshirband, S.; Rehman, Z.U.; Khan, I.A.; Jadoon, W. A Deep Learning Ensemble Approach for Diabetic Retinopathy Detection. IEEE Access 2019, 7, 150530–150539. [Google Scholar] [CrossRef]
  34. Pao, S.I.; Lin, H.Z.; Chien, K.H.; Tai, M.C.; Chen, J.T.; Lin, G.M. Detection of Diabetic Retinopathy Using Bichannel Convolutional Neural Network. J. Ophthalmol. 2020, 2020, 9139713. [Google Scholar] [CrossRef]
  35. de la Torre, J.; Valls, A.; Puig, D. A Deep Learning Interpretable Classifier for Diabetic Retinopathy Disease Grading. Neurocomputing 2020, 396, 465–476. [Google Scholar] [CrossRef]
  36. Gadekallu, T.R.; Khare, N.; Bhattacharya, S.; Singh, S.; Maddikunta, P.K.R.; Ra, I.H.; Alazab, M. Early Detection of Diabetic Retinopathy Using Pca-Firefly Based Deep Learning Model. Electronics 2020, 9, 274. [Google Scholar] [CrossRef]
  37. Zeng, X.; Chen, H.; Luo, Y.; Ye, W. Automated Diabetic Retinopathy Detection Based on Binocular Siamese-like Convolutional Neural Network. IEEE Access 2019, 7, 30744–30753. [Google Scholar] [CrossRef]
  38. Mateen, M.; Wen, J.; Nasrullah, N.; Sun, S.; Hayat, S. Exudate Detection for Diabetic Retinopathy Using Pretrained Convolutional Neural Networks. Complexity 2020, 2020, 5801870. [Google Scholar] [CrossRef]
  39. Zhang, W.; Zhong, J.; Yang, S.; Gao, Z.; Hu, J.; Chen, Y.; Yi, Z. Automated Identification and Grading System of Diabetic Retinopathy Using Deep Neural Networks. Knowl.-Based Syst. 2019, 175, 12–25. [Google Scholar] [CrossRef]
  40. Samanta, A.; Saha, A.; Satapathy, S.C.; Fernandes, S.L.; Zhang, Y.D. Automated Detection of Diabetic Retinopathy Using Convolutional Neural Networks on a Small Dataset. Pattern Recognit. Lett. 2020, 135, 293–298. [Google Scholar] [CrossRef]
  41. Bibi, I.; Mir, J.; Raja, G. Automated Detection of Diabetic Retinopathy in Fundus Images Using Fused Features. Phys. Eng. Sci. Med. 2020, 43, 1253–1264. [Google Scholar] [CrossRef]
  42. Math, L.; Fatima, R. Adaptive Machine Learning Classification for Diabetic Retinopathy. Multimed. Tools Appl. 2021, 80, 5173–5186. [Google Scholar] [CrossRef]
  43. Rekhi, R.S.; Issac, A.; Dutta, M.K. Automated Detection and Grading of Diabetic Macular Edema from Digital Colour Fundus Images. In Proceedings of the 2017 4th IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics, UPCON 2017, Mathura, India, 26–28 October 2017. [Google Scholar]
  44. Marin, D.; Gegundez-Arias, M.E.; Ponte, B.; Alvarez, F.; Garrido, J.; Ortega, C.; Vasallo, M.J.; Bravo, J.M. An Exudate Detection Method for Diagnosis Risk of Diabetic Macular Edema in Retinal Images Using Feature-Based and Supervised Classification. Med. Biol. Eng. Comput. 2018, 56, 1379–1390. [Google Scholar] [CrossRef]
  45. Kunwar, A.; Magotra, S.; Sarathi, M.P. Detection of High-Risk Macular Edema Using Texture Features and Classification Using SVM Classifier. In Proceedings of the 2015 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2015, Kochi, India, 10–13 August 2015. [Google Scholar]
  46. Perdomo, O.; Otalora, S.; Rodríguez, F.; Arevalo, J.; González, F.A. A Novel Machine Learning Model Based on Exudate Localization to Detect Diabetic Macular Edema. In Proceedings of the Ophthalmic Medical Image Analysis Third International Workshop, OMIA 2016, Athens, Greece, 21 October 2016; Volume 3. [Google Scholar]
  47. Tufail, A.B.; Ullah, K.; Khan, R.A.; Shakir, M.; Khan, M.A.; Ullah, I.; Ma, Y.K.; Ali, M. On Improved 3D-CNN-Based Binary and Multiclass Classification of Alzheimer’s Disease Using Neuroimaging Modalities and Data Augmentation Methods. J. Healthc. Eng. 2022, 2022, 1302170. [Google Scholar] [CrossRef]
  48. Qadri, S.F.; Shen, L.; Ahmad, M.; Qadri, S.; Zareen, S.S.; Khan, S. OP-ConvNet: A Patch Classification-Based Framework for CT Vertebrae Segmentation. IEEE Access 2021, 9, 158227–158240. [Google Scholar] [CrossRef]
  49. Qadri, S.F.; Shen, L.; Ahmad, M.; Qadri, S.; Zareen, S.S.; Akbar, M.A. SVseg: Stacked Sparse Autoencoder-Based Patch Classification Modeling for Vertebrae Segmentation. Mathematics 2022, 10, 796. [Google Scholar] [CrossRef]
  50. Porwal, P.; Pachade, S.; Kamble, R.; Kokare, M.; Deshmukh, G.; Sahasrabuddhe, V.; Meriaudeau, F. Indian Diabetic Retinopathy Image Dataset (IDRiD): A Database for Diabetic Retinopathy Screening Research. Data 2018, 3, 25. [Google Scholar] [CrossRef]
  51. Sakaguchi, A.; Wu, R.; Kamata, S. Fundus Image Classification for Diabetic Retinopathy Using Disease Severity Grading. In Proceedings of the ACM International Conference Proceeding Series, Tokyo, Japan, 28–30 March 2019. [Google Scholar]
  52. Harangi, B.; Toth, J.; Baran, A.; Hajdu, A. Automatic Screening of Fundus Images Using a Combination of Convolutional Neural Network and Hand-Crafted Features. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, Berlin, Germany, 23–27 July 2019. [Google Scholar]
  53. Kind, A.; Azzopardi, G. An Explainable AI-Based Computer Aided Detection System for Diabetic Retinopathy Using Retinal Fundus Images. In Lecture Notes in Computer Science, Proceedings of the CAIP 2019: Computer Analysis of Images and Patterns, Salerno, Italy, 3–5 September 2019; Springer: Cham, Switzerland, 2019; Volume 11678. [Google Scholar]
  54. Li, X.; Hu, X.; Yu, L.; Zhu, L.; Fu, C.W.; Heng, P.A. CANet: Cross-Disease Attention Network for Joint Diabetic Retinopathy and Diabetic Macular Edema Grading. IEEE Trans. Med. Imaging 2020, 39, 1483–1493. [Google Scholar] [CrossRef] [PubMed]
  55. Elswah, D.K.; Elnakib, A.A.; El-Din Moustafa, H. Automated Diabetic Retinopathy Grading Using Resnet. In Proceedings of the National Radio Science Conference, NRSC, Cairo, Egypt, 8–10 September 2020. [Google Scholar]
  56. Saranya, P.; Prabakaran, S. Automatic Detection of Non-Proliferative Diabetic Retinopathy in Retinal Fundus Images Using Convolution Neural Network. J. Ambient. Intell. Humaniz. Comput. 2020, 1–10. [Google Scholar] [CrossRef]
  57. Alcalá-Rmz, V.; Maeda-Gutiérrez, V.; Zanella-Calzada, L.A.; Valladares-Salgado, A.; Celaya-Padilla, J.M.; Galván-Tejada, C.E. Convolutional Neural Network for Classification of Diabetic Retinopathy Grade. In Advances in Soft Computing, Proceedings of the 19th Mexican International Conference on Artificial Intelligence, MICAI 2020, Mexico City, Mexico, 12–17 October 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; Volume 12468. [Google Scholar]
  58. Bhardwaj, C.; Jain, S.; Sood, M. Hierarchical Severity Grade Classification of Non-Proliferative Diabetic Retinopathy. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 2649–2670. [Google Scholar] [CrossRef]
  59. Shaukat, N.; Amin, J.; Sharif, M.; Azam, F.; Kadry, S.; Krishnamoorthy, S. Three-Dimensional Semantic Segmentation of Diabetic Retinopathy Lesions and Grading Using Transfer Learning. J. Pers. Med. 2022, 12, 1454. [Google Scholar] [CrossRef]
  60. Jiwani, N.; Gupta, K.; Afreen, N. A Convolutional Neural Network Approach for Diabetic Retinopathy Classification. In Proceedings of the 2022 IEEE 11th International Conference on Communication Systems and Network Technologies, CSNT 2022, Indore, India, 23–24 April 2022. [Google Scholar]
  61. Albadr, M.A.A.; Ayob, M.; Tiun, S.; AL-Dhief, F.T.; Hasan, M.K. Gray Wolf Optimization-Extreme Learning Machine Approach for Diabetic Retinopathy Detection. Front. Public. Health 2022, 10, 925901. [Google Scholar] [CrossRef]
  62. Chandran, J.J.G.; Jabez, J.; Srinivasulu, S. Auto-Metric Graph Neural Network Optimized with Capuchin Search Optimization Algorithm for Coinciding Diabetic Retinopathy and Diabetic Macular Edema Grading. Biomed. Signal Process. Control 2023, 80, 104386. [Google Scholar]
  63. Saranya, P.; Pranati, R.; Patro, S.S. Detection and Classification of Red Lesions from Retinal Images for Diabetic Retinopathy Detection Using Deep Learning Models. Multimed. Tools Appl. 2023, 1–21. [Google Scholar] [CrossRef]
Figure 1. Overview of the proposed methodology. The pipeline begins with blood glucose evaluations to distinguish between type 1 and type 2 diabetes. For type 2, fundus eye images are uploaded to the cloud: (a) images are preprocessed; (b) augmented; (c) ground truths are marked in red; (d) segmentation results for red lesions are displayed; (e) segmentation results for bright lesions are shown. Subsequently, (f) features are extracted via CNN and refined with SVD; and finally, (g) diabetic retinopathy (DR) is categorized into five stages using the ISVM-RBF hybrid voting method.
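To make step (f) of the pipeline concrete, the short Python sketch below shows how a truncated singular value decomposition can refine a CNN feature map by keeping only its dominant singular components. This is a minimal illustration, not the authors' implementation; the feature-map dimensions and the rank `k` are arbitrary assumptions.

```python
# Minimal sketch of step (f): refining a CNN feature map with truncated SVD.
# The rank k and the 128x128 feature-map size are illustrative assumptions.
import numpy as np

def svd_refine(feature_map: np.ndarray, k: int = 32) -> np.ndarray:
    """Keep only the k dominant singular components of a 2-D feature map."""
    u, s, vt = np.linalg.svd(feature_map, full_matrices=False)
    # Rank-k reconstruction: retains dominant structure, discards noise.
    return (u[:, :k] * s[:k]) @ vt[:k, :]

# Example with a random stand-in for a CNN feature map.
fmap = np.random.rand(128, 128).astype(np.float32)
refined = svd_refine(fmap, k=32)
print(refined.shape)  # (128, 128), but with rank <= 32
```

Keeping only the leading singular components reduces the effective dimensionality of the features passed to the downstream classifiers while preserving most of their variance.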
Figure 2. The number of augmented images for each severity grade in the training and testing sets.
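A hedged sketch of how per-severity image counts such as those in Figure 2 can be balanced with simple geometric augmentations is given below. The specific transforms (flips and 90-degree rotations) and the target count are assumptions for illustration, not the paper's exact augmentation recipe.

```python
# Illustrative class balancing via flips/rotations (assumed transforms).
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply one randomly chosen flip or 90-degree rotation."""
    ops = [
        lambda im: np.fliplr(im),
        lambda im: np.flipud(im),
        lambda im: np.rot90(im, k=1),
        lambda im: np.rot90(im, k=3),
    ]
    return ops[rng.integers(len(ops))](image)

def balance_class(images: list, target: int, seed: int = 0) -> list:
    """Grow one severity class to `target` images by augmenting random members."""
    rng = np.random.default_rng(seed)
    out = list(images)
    while len(out) < target:
        out.append(augment(images[rng.integers(len(images))], rng))
    return out

# Example: grow a hypothetical 'Severe' class of 40 fundus images to 200.
severe = [np.zeros((256, 256, 3), dtype=np.uint8) for _ in range(40)]
print(len(balance_class(severe, target=200)))  # 200
```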
Figure 3. Accuracy comparison of several classifiers.
Figure 4. Comparison of (a) accuracy, (b) sensitivity, (c) specificity, (d) F1-score, (e) MCC, (f) AUC, and (g) a combined comparison of all metrics.
Table 1. Role of AI in automatic VTDR detection.

| Reference | Proposed Methodology | Results | Limitations |
|---|---|---|---|
| [30] | Multi-scale shallow CNNs for retinal image classification | Enhanced model surpassed existing models | Not invariant to input data |
| [31] | Utilization of ResNet50 and VGG16 | Efficient DR lesion detection; computationally robust | Inaccurate identification of microaneurysms due to fluorescein |
| [32] | A DL-centric system with a three-CNN ensemble for DR stage detection | Improved and consistent results | Limited feature considerations affected the accuracy |
| [33] | Implementation of DenseNet121, DenseNet169, ResNet50, Inception V3, and Xception CNNs | Lesion identification based on severity | High computational cost |
| [34] | Entropy image from the fundus photo's green component with UM pre-processing | Higher accuracy and sensitivity | UM led to missing image edges |
| [35] | DR class prediction through deep learning, with pixel score assignment for final classification | — | Potential improvements via relevant action inclusion for algorithm assessment |
| [36] | Early DR detection with PCA and firefly algorithm for dimensionality reduction | Superior approach showcased | Spatial information was lost due to feature reduction |
| [37] | Siamese-like CNN architecture built on the weight-sharing layer idea from Inception V3 | Promising DR detection with a kappa value of 0.829 | Might not work well with matched fundus photo datasets |
| [38] | Pre-trained CNN-based framework for exudate detection using ROI localization and transfer learning from various architectures | Technique surpassed existing methods | High training time for the developed model |
| [39] | Introduction of DeepDR framework and labeled DR image dataset | Specificity and sensitivity of 97.7% and 97.5%, respectively | Requires testing on larger and more complicated datasets |
| [40] | CNN-based DR detection on a small dataset with CLAHE enhancement | Better kappa score achieved | Prediction accuracy affected by uneven gray levels |
| [41] | Two-stage preprocessing-centered model with various feature descriptors and SVM-based classification | The model was more generalized | Performance declined with noisier images |
| [42] | Adaptive ML classification with segment-level DR estimation using pre-trained CNN | Superior performance observed | High maintenance cost for the model |
| [43] | Incorporation of morphological, geometrical, and orientational properties with SVM classification | Achieved 92.11% accuracy for DR grading and classification | — |
| [44] | Feature-based and supervised classification for exudate detection | Predicted DR risks with 0.90 sensitivity | Needs further improvement in detection performance; future work could incorporate high-performance technologies |
| [45] | Utilization of texture characteristics with SVM classification | Achieved 86% accuracy for high-risk DR detection | Conducted using a limited dataset |
| [46] | Two-stage CNN approach | Identified areas of interest in the retinal picture and predicted the DR class | Computationally demanding method |
Table 2. Performance evaluation metrics.

| Metric Name | Mathematical Representation |
|---|---|
| Sensitivity | $\frac{TP}{TP + FN} \times 100\%$ |
| Specificity | $\frac{TN}{TN + FP} \times 100\%$ |
| Accuracy | $\frac{TP + TN}{TP + TN + FP + FN} \times 100\%$ |
| F1-Score | $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \times 100\%$ |
| MCC | $\frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \times 100\%$ |
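As an illustration of the Table 2 formulas, the following minimal Python sketch computes all five metrics from raw confusion-matrix counts. The counts in the usage example are arbitrary assumptions, not results reported in this paper, and all denominators are assumed to be nonzero.

```python
# Hedged sketch: Table 2 metrics from confusion-matrix counts.
# Assumes nonzero denominators; example counts are illustrative only.
from math import sqrt

def evaluation_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is identical to sensitivity
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + tn + fp + fn),
        "f1_score": 100 * 2 * precision * recall / (precision + recall),
        "mcc": 100 * (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Example with illustrative counts (not the paper's data):
print(evaluation_metrics(tp=84, tn=900, fp=0, fn=16))
```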
Table 3. Comparison of the outcomes of several classifiers.

| Model | Severity Threshold | Acc% | Sen% | Spc% | F1-Score% | MCC% |
|---|---|---|---|---|---|---|
| KNN | Normal | 99.12 | 99.50 | 99.02 | 99.31 | 98.52 |
| | Mild | 88.49 | 87.29 | 92.22 | 85.15 | 79.71 |
| | Moderate | 94.12 | 92.15 | 92.89 | 89.22 | 85.04 |
| | Severe | 94.08 | 88.25 | 94.43 | 83.77 | 82.68 |
| | PDR | 95.12 | 81.32 | 95.52 | 91.42 | 76.84 |
| Binary Trees | Normal | 98.91 | 99.04 | 98.86 | 98.95 | 97.90 |
| | Mild | 86.11 | 95.04 | 90.35 | 94.65 | 85.39 |
| | Moderate | 94.43 | 93.21 | 76.46 | 91.10 | 69.67 |
| | Severe | 91.06 | 88.32 | 84.36 | 92.10 | 72.68 |
| | PDR | 92.45 | 83.11 | 92.64 | 90.32 | 75.75 |
| SVM-Linear | Normal | 99.28 | 99.20 | 99.32 | 99.26 | 98.52 |
| | Mild | 92.18 | 92.29 | 92.04 | 94.30 | 84.33 |
| | Moderate | 92.12 | 93.66 | 90.06 | 93.92 | 83.72 |
| | Severe | 93.14 | 87.52 | 97.30 | 88.64 | 84.82 |
| | PDR | 96.05 | 73.52 | 99.25 | 81.17 | 72.77 |
| SVM-Polynomial | Normal | 99.38 | 99.30 | 99.42 | 99.36 | 98.72 |
| | Mild | 92.50 | 91.70 | 93.88 | 94.52 | 85.58 |
| | Moderate | 92.58 | 94.32 | 90.41 | 94.42 | 84.73 |
| | Severe | 94.55 | 82.33 | 99.85 | 89.88 | 82.18 |
| | PDR | 98.03 | 78.95 | 99.39 | 88.75 | 78.34 |
| SVM-RBF | Normal | 99.48 | 99.40 | 99.52 | 99.46 | 98.92 |
| | Mild | 96.19 | 95.78 | 97.07 | 97.33 | 92.95 |
| | Moderate | 97.16 | 98.09 | 95.60 | 97.90 | 93.69 |
| | Severe | 97.89 | 92.60 | 98.30 | 95.91 | 92.49 |
| | PDR | 98.49 | 87.20 | 99.10 | 92.11 | 86.30 |
| SVM-Polynomial (Mixed Models) | Normal | 99.40 | 99.32 | 99.45 | 99.38 | 98.72 |
| | Mild | 92.60 | 91.80 | 93.98 | 94.62 | 85.58 |
| | Moderate | 92.95 | 94.42 | 90.51 | 94.52 | 85.03 |
| | Severe | 94.65 | 82.48 | 99.95 | 89.98 | 82.43 |
| | PDR | 98.13 | 79.05 | 99.49 | 88.85 | 82.43 |
| SVM-Linear (Mixed Models) | Normal | 99.30 | 99.22 | 99.34 | 99.28 | 78.64 |
| | Mild | 92.28 | 92.49 | 92.24 | 94.60 | 98.56 |
| | Moderate | 92.28 | 94.06 | 90.26 | 94.22 | 84.78 |
| | Severe | 93.35 | 85.12 | 97.50 | 89.14 | 84.34 |
| | PDR | 96.25 | 74.32 | 99.45 | 81.57 | 82.65 |
| ISVM-RBF (Mixed Models) | Normal | 99.76 | 99.68 | 99.78 | 99.73 | 99.44 |
| | Mild | 97.39 | 97.10 | 97.87 | 97.48 | 94.96 |
| | Moderate | 98.26 | 98.20 | 97.90 | 98.05 | 96.10 |
| | Severe | 98.99 | 93.76 | 99.95 | 96.92 | 93.71 |
| | PDR | 99.89 | 89.20 | 100.0 | 94.31 | 89.09 |
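To make the classifier comparison in Table 3 concrete, the sketch below assembles a hard majority-voting ensemble from the base-model families the table compares (an RBF-kernel SVM, a decision tree, and KNN) using scikit-learn. This is an illustrative approximation under assumed hyperparameters, not the authors' exact ISVM-RBF configuration; the synthetic features and labels stand in for CNN-SVD feature vectors and the five DR severity grades.

```python
# Hedged sketch: hard majority-voting ensemble over the classifier
# families compared in Table 3. Hyperparameters are assumptions.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

ensemble = VotingClassifier(
    estimators=[
        ("svm_rbf", SVC(kernel="rbf", C=10.0, gamma="scale")),
        ("dtree", DecisionTreeClassifier(max_depth=10)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    voting="hard",  # each base model casts one vote per severity grade
)

# Synthetic stand-ins: 64-D feature vectors and severity grades 0-4.
X = np.random.rand(100, 64)
y = np.random.randint(0, 5, size=100)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```

Hard voting makes the ensemble's prediction the most frequent grade among the three base models, which tends to suppress the idiosyncratic errors of any single classifier.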