Article

Framework for Detecting Breast Cancer Risk Presence Using Deep Learning

by Mamoona Humayun 1,*, Muhammad Ibrahim Khalil 2, Saleh Naif Almuayqil 1 and N. Z. Jhanjhi 3,*

1 Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
2 Department of Computer Science, Bahria University, Islamabad 44000, Pakistan
3 School of Computer Science (SCS), Taylor’s University, Subang Jaya 47500, Malaysia
* Authors to whom correspondence should be addressed.
Electronics 2023, 12(2), 403; https://doi.org/10.3390/electronics12020403
Submission received: 24 November 2022 / Revised: 6 January 2023 / Accepted: 11 January 2023 / Published: 12 January 2023
(This article belongs to the Special Issue Deep Learning for Computer Vision)

Abstract:
Cancer is a complicated global health concern with a significant fatality rate, and breast cancer is among the leading causes of mortality each year. Owing to the rapid growth of high-throughput sequencing techniques and the variety of deep learning approaches that have emerged in recent years, prognoses are increasingly based on gene expression, offering insight into robust and appropriate healthcare decisions. Diagnostic-imaging disease indicators such as breast density and tissue texture are widely used by physicians and automated technology. The effective and specific identification of cancer risk presence can inform tailored screening and preventive decisions. Deep learning has increasingly emerged as an effective method for classification and prediction applications such as breast imaging. On this foundation, we present a deep learning approach for predicting breast cancer risk. The proposed methodology is based on transfer learning using the InceptionResNetV2 deep learning model. Our experimental work on a breast cancer dataset demonstrates high model performance, with 91% accuracy. The proposed model incorporates risk markers that improve breast cancer risk assessment scores and yields promising results compared to existing approaches. This article describes breast cancer risk indicators, defines the proper usage, features, and limits of each risk forecasting model, and examines the growing role of deep learning (DL) in risk detection. The proposed model could potentially be used to automate various types of medical imaging techniques.

1. Introduction

Breast cancer is the most common cancer in women, and it has a high death rate. The vast variation in breast cancer makes forecasting a patient’s cancer risk challenging. As a result, standardized, community-based approaches to cancer screening have been proposed and adopted. Among the various diagnostic screening techniques, mammography is presently the most cost-effective and clinically accepted method of detecting early breast cancer [1,2]. Although many cancer risk forecasting methods have been formulated and evaluated using risk factors extracted from molecular genetics, imaging, and public health data, they cannot correctly estimate an individual’s likelihood of breast cancer based on one or a series of critical diagnostic-imaging health screenings [3]. The discriminating power of breast density, as arbitrarily evaluated by physicians, as a signal of breast cancer risk is still unsatisfactory, and patients and practitioners cannot accept it as a means of determining who should be screened more frequently [4].
Breast cancer is discovered early using mammography inspection, which has been shown to lower mortality. Most histologic screening now utilizes age as the sole risk indicator to identify the target group, although there is a growing emphasis on customized screening [5]. Categorization using a disease risk forecast model can identify women at greater risk of tumors, allowing for monitoring to be customized to patients for maximum benefit [6]. The breast cancer detection factor is simply a technique used to detect the presence of cancer in histopathology images [7]. The modeling approach created the cancer risk model, which utilizes the foundation of increased probability of breast cancer associated with individual variables to predict potential losses, and it is among the most commonly used frameworks for cancer risk evaluation. The models incorporate numerous risk variables [8], while none of them are intrinsically linked to the mammography type.
Cancer is a serious public health problem all over the world. Breast cancer, the most common cancer and a primary cause of cancer death in women, is still on the rise in both the industrialized and developing worlds [9]. Breast disease is defined as the uncontrollable development of breast cells that might be malignant. Microscopic histopathology examination is usually performed visually, so it is only as trustworthy as the specialist’s competence [10]. Multi-class diagnosis of cancer from histological images is therefore a difficult process: it is highly subjective, relies on the observer’s training and knowledge, and is laborious and time-consuming. Furthermore, due to the scarcity of competent pathologists in most developing nations, a pathologist is required to analyze numerous types of tissue sections and patients every day. The pathologist’s limited capacity to evaluate a wide range of data and the intricacy of the images may lead to incorrect conclusions [11,12]. Misdiagnosis can occur as a result of either over- or under-interpretation, and individuals who do not have cancer may be subjected to potentially hazardous therapies and incur needless costs as a result.
To tackle the challenge of increasing screening mammography effectiveness, we examined a novel image-feature-based method of indicating breast cancer risk and a forecasting technique in several prior studies, one of which employs imaging characteristics such as bilateral screening-mammography density asymmetry as a signal of breast cancer risk [13]. The overall experiment aims to determine whether this deep learning-based approach can outperform the previously tested traditional framework for detecting cancer risk presence. It is normal for a patient to undergo repeated longitudinal mammography exams in breast cancer monitoring, and the long-term radiology data may give extra information to enhance the learning of a risk assessment model [14]. In this case, the screening task predicts the result of a single abnormal mammogram, while longitudinal mammography images are used to predict the risk of breast cancer. It is worth noting that we did not use numerous priors as inputs to the models [15].
The fast advancement in breast cancer smart detection technology has opened the door to studying biological-subtype smart forecasting, yet biological-subgroup intelligent forecasting remains a difficult issue. Increased chances of survival may be attributed to the early evaluation and classification of breast cancer.
Using an InceptionResNetV2 classifier, we present a classification strategy for identifying the presence and severity of metastatic breast cancer in digitized pathological images. A significant amount of research has previously been conducted in this area by medical professionals; however, their methods do not achieve very high accuracy. To address these obstacles, we attempted to enhance the method of accurately classifying breast cancer images by merging the concepts of DL and transfer learning, so that preliminary-stage cancer detection may be performed with high accuracy and promising results. Earlier approaches failed to achieve improved efficacy since they were unable to retrieve hidden features.
This article investigates the efficacy of clinical characteristics obtained during standard assessment for diagnosing breast cancer using the InceptionResNetV2 classifier. InceptionResNetV2 merges the concept of a very deep Inception model with residual connections. By employing InceptionResNetV2, which allows for fine-tuning and is centered on optimal activation functions, efficacy is increased. The proposed model is extensively trained and can also be applied to other medical disease prediction tasks; to quantify its reliability and robustness, it can be evaluated on other clinical datasets.
The remainder of this paper is structured as follows. Section 2 provides a comprehensive analysis of previous DL approaches designed for cancer prediction. Section 3 presents our proposed model for analyzing risk factors in breast cancer risk detection, along with the experiments and results. The last section concludes the study and gives future directions for our research.

2. Literature Review

With the advancement of biomedical research, various innovative technologies for the diagnosis of breast cancer have been discovered. The following is an overview of the studies on this subject.
Jing et al. [16] focused on a loss function composed of an enhanced squared-error loss plus a paired-ordering loss depending on the surviving data rating values. This error rate is used to improve a deep feed-forward network that can be used to analyze the observed data. The authors presented the methodology for the prediction of relapses in nasopharyngeal carcinoma using the RankDeepSurv model. RankDeepSurv used eight clinical parameters to forecast relapse and produced a greater C-index (0.681) compared to the normal survival concept.
Dmitrii Bychkov et al. [17] present a combination of CNN and recurrent models to train a DL network to predict colorectal cancer prognosis using images of tumor tissue samples. They examined a collection of digitized tumor tissues collected from 420 cancer patients. According to their findings, DL systems may be able to extract more prognostic information about colorectal cancer from tissue morphology than established human observation.
Katzman et al. [18] present DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival approach, to model relationships between an individual’s covariates and their clinical outcome in order to deliver individualized therapy prescriptions. Using its connection weights, DeepSurv, a DL feed-forward network, determines how a patient’s variables affect their level of risk. The results show that DeepSurv performs as well as or better than other state-of-the-art survival models and confirm that it effectively models increasingly complex relationships between a participant’s covariates and their risk of failure.
Pierre Courtiol et al. [19] present MesoNet, a technique based on deep convolutional neural networks that successfully predicts the survival of mesothelioma patients from whole-slide digitized images without requiring a pathologist to locally annotate regions. MesoNet identified regions that help determine patient outcomes. Interestingly, the authors discovered that these regions are mostly located in the stroma and are histologically associated with inflammation, cellular diversity, and vacuolization. The results indicate that DL algorithms may detect novel features predictive of clinical outcomes, possibly leading to the discovery of new biomarkers.
Jakob Nikolas et al. [20] present three procedures for evaluating CNN training quality. First, classification performance is validated on a held-out test batch. The second procedure uses t-distributed stochastic neighbor embedding of deep-layer activations to visualize the partitioning of classes. In the third step, DeepDream is used to visualize deep neuron activations on 46 layers of the VGG19 DL model, using a pyramid level of 12 with 75 iterations, a scale of 1.1, and histogram stretching of the produced image for best viewing.
Panagiotis Korfiatis et al. [21] present three alternative residual DNN models to test their ability to determine methylation status without requiring a separate tumor segmentation phase. The results show that the ResNet50 model performs best, with an accuracy of 94.9%, outperforming both the ResNet18 and ResNet34 designs with statistical significance. The authors provide an approach that eliminates the need for considerable pre-processing and serves as a proof of concept for using DNNs to identify molecular biomarkers from routine medical imaging. The existing cancer prediction techniques are shown in Table 1.
Despite significant differences among image classification tasks and feature-extraction techniques, DL models pre-trained on larger datasets, such as ImageNet, are useful for clinical imaging. It has been demonstrated that when the source and target tasks are learned on comparable datasets, task resemblance can enhance efficiency on the training set. As a result, a system that combines ImageNet transfer learning with learning algorithms from associated tasks might improve efficiency.

3. Proposed Methodology

The proposed methodology is based on transfer learning with InceptionResNetV2 as the base model. The approach consists of three stages: data pre-processing, model training, and model prediction, as shown in Figure 1. The dataset was enhanced using transformation, clarity, and scale improvements. The model is initialized with pre-trained weights learned on the 1000-class ImageNet dataset; these weights make the model more efficient and its predictions more accurate.
We used a transfer learning-based method to advance the procedure of breast cancer detection using a breast cancer dataset. With the advent of precision medicine programs, computerized breast cancer categorization based on histological images is important for clinical prediction and diagnosis. The objective of this study was to improve the diagnosis process by lowering erroneous diagnostic impressions of breast cancer, enabling clinicians to readily differentiate between patients, and strengthening medical practitioners’ ability to discriminate healthy persons. Figure 2 depicts the proposed methodology’s workflow.
Several characteristics derived from a single classifier may be concatenated to describe shape descriptors such as curvature, sphericity, compaction, etc. In histological pictures, the feature matrix is used to classify breast cancer. In the pre-training of models from different generic image features, these structures are used to extract useful features from small images employing the transfer learning method.
The most important aim of this research is to improve the precision of multi-class prediction with respect to the loss. Softmax with a cost function is a class predictor that generalizes the logistic regression approach.
The training set contains $n$ images $\{(u_i, v_i)\}_{i=1}^{n}$, where $u_i$ is the image and $v_i$ is its label. To determine the prediction, we use the probability $p(v_i = w \mid u_i)$ with respect to class $w$, which will be either 0 or 1 for binary classification, and the function $Z_\theta(u_i)$.
$$Z_\theta(u_i) = \begin{pmatrix} p(v_i = 1 \mid u_i;\, \theta) \\ \vdots \\ p(v_i = n \mid u_i;\, \theta) \end{pmatrix} = \frac{1}{\sum_{j=1}^{n} e^{\theta_j^{T} u_i}} \begin{pmatrix} e^{\theta_1^{T} u_i} \\ \vdots \\ e^{\theta_n^{T} u_i} \end{pmatrix}$$
The input characteristics may be used to learn certain parameters inside the hidden nodes. To determine a result from a collection of parameters, we assume in this instance that we hold the corresponding forecast for the labeled data for every entry of the dataset.
The probabilities sum to 1, and $\frac{1}{\sum_{j=1}^{n} e^{\theta_j^{T} u_i}}$ is the normalization term.
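As a minimal NumPy sketch of the softmax predictor $Z_\theta(u_i)$ above (the matrix shapes and seed are illustrative assumptions, not values from the paper):

```python
import numpy as np

def softmax_predict(theta, u):
    """Compute Z_theta(u): normalized class probabilities via softmax.

    theta: (n_classes, n_features) weight matrix (rows are theta_j)
    u:     (n_features,) image feature vector
    """
    scores = theta @ u                    # theta_j^T u for each class j
    scores = scores - scores.max()        # stabilize the exponentials
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()  # divide by sum_j e^{theta_j^T u}

rng = np.random.default_rng(0)
theta = rng.normal(scale=0.01, size=(2, 4))  # small weights seeded near zero
u = rng.normal(size=4)
p = softmax_predict(theta, u)                # probabilities sum to 1
```

The subtraction of the maximum score before exponentiating does not change the result but avoids floating-point overflow, a standard trick when implementing softmax.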
The goal is to maximize the score by adjusting the parameters; before updating the parameters, the weights are seeded with small random numbers around zero. The indicator function $\mathbb{1}\{v_i = w\}$ is defined as:

$$\mathbb{1}\{v_i = w\} = \begin{cases} 1, & v_i = w \\ 0, & v_i \neq w \end{cases}$$
When the degree of categorization error is assessed by the loss function, selection criteria for the training phase are applied. Throughout training, the system adjusts connection weights to drive the error toward zero. In contrast, in fine-grained classification, the equation attempts to compress the images from each class into a region of the feature map. Algorithm 1 depicts the proposed methodology’s working steps.
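Under the same notation, the loss minimized during training combines the indicator function with the softmax probabilities as a cross-entropy. A small sketch (the example arrays are illustrative, not from the paper):

```python
import numpy as np

def cross_entropy_loss(probs, labels):
    """Mean cross-entropy: -(1/n) * sum_i sum_w 1{v_i = w} * log p(v_i = w | u_i).

    probs:  (n_samples, n_classes) softmax outputs
    labels: (n_samples,) integer class labels v_i
    """
    n = labels.shape[0]
    # The indicator 1{v_i = w} selects only the true-class probability.
    picked = probs[np.arange(n), labels]
    return -np.log(picked).mean()

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
labels = np.array([0, 1])
loss = cross_entropy_loss(probs, labels)  # -(ln 0.9 + ln 0.8) / 2
```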
Algorithm 1. Proposed Methodology
Let ζϵ = dataset images, α = augmentation, i = image, pp = pre-processing, s = scaling, r = rotation, rf = reflection, sm = shifting methods, and IEA = image enhancement algorithm.
Begin
Step 1: Get(ζϵ)
Step 2: α(image) w.r.t. r, s, rf, sm
Step 3: Perform (pp (i))
   3.1. Execute (IEA)
   3.2. Resize
   3.3. Normalize (i)/interval[0, 1]
      3.3.1. Conversion
      3.3.2. Computation (mean)
      3.3.3. Scaling(i)
      3.3.4. Conversion back
   3.4. Dataset splitting for training/testing
   3.5. Feature Extraction InceptionResNetV2 pre-trained model
   3.6. Optimize (epochs, batch size, learning weights)
Step 4: Evaluation Metrics (accuracy, precision, F1 score, and recall)
End
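The normalization sub-steps of Algorithm 1 (3.3.1 to 3.3.4) can be sketched as follows; the helper name and the 8-bit pixel range are illustrative assumptions, since the paper does not list them explicitly:

```python
import numpy as np

def normalize_patch(img_uint8):
    """Normalize an image patch into the [0, 1] interval (Algorithm 1, step 3.3)."""
    img = img_uint8.astype(np.float32)    # 3.3.1 conversion: integer pixels to float
    mean = img.mean()                     # 3.3.2 computation of the mean (for centering)
    img = img / 255.0                     # 3.3.3 scaling: 8-bit pixels lie in [0, 255]
    img = np.clip(img, 0.0, 1.0)          # 3.3.4 conversion back: clamp into [0, 1]
    return img, mean

patch = np.array([[0, 128],
                  [255, 64]], dtype=np.uint8)
norm, mean = normalize_patch(patch)
```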
The DL model InceptionResNetV2 is used as a backbone model [22] and images are resized to 299 × 299 × 3 as per the model requirement. The InceptionResNetV2 architecture is shown in Figure 3. The proposed methodology is based on transfer learning using InceptionResNetV2, which is pre-trained on the ImageNet dataset, with the last layers fine-tuned on the breast cancer dataset to achieve a better score.
The method’s strength derives from its deep architecture and wide range of training parameters. The Inception architecture and residual connections are the foundations of InceptionResNetV2: multiple convolutional filters of various sizes are combined with residual connections in the Inception-ResNet module. In addition to avoiding the degradation issue brought on by deep structures, the inclusion of residual connections shortens training time. At the local scale, the optimum and overfitting settings must be addressed. With enough data, it is possible to build new models and train them from scratch; however, because there are not enough records, overfitting may occur, and transfer learning is used to counteract it. For this reason, transfer learning is the best method for learning the image classifier. By connecting appropriately sized variables under the transfer-learning scheme, the model’s base parameters are fully learned and adjusted, resulting in more distinct target features.
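A configuration sketch of the transfer-learning setup described above, in TensorFlow/Keras. The number of frozen layers, the head size, the dropout rate, and the learning rate are assumptions for illustration; the paper does not state them exactly (it does specify Adam and binary cross-entropy, used in Section 4):

```python
import tensorflow as tf

# Backbone pre-trained on the 1000-class ImageNet dataset, without its
# classification head, at the model's required 299 x 299 x 3 input size.
base = tf.keras.applications.InceptionResNetV2(
    weights="imagenet",
    include_top=False,
    input_shape=(299, 299, 3),
)
base.trainable = False  # freeze the backbone; fine-tune only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),     # assumed head size
    tf.keras.layers.Dropout(0.5),                      # assumed dropout rate
    tf.keras.layers.Dense(1, activation="sigmoid"),    # IDC-positive vs. negative
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # assumed rate
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```

After the head converges, some of the top backbone layers can be unfrozen and fine-tuned at a lower learning rate, which matches the "last layers fine-tuned" description above.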

4. Experiment and Results

The proposed model is trained over 70 epochs. On the cancer dataset, we employed performance measures to evaluate and validate the proposed methodology. Our approach achieves a cancer prediction accuracy of 91%; other evaluation metrics, including recall, precision, and F1 score, were also computed and compared to traditional deep learning approaches.
The dataset was collected from an open source library of breast histopathology images [23]. The dataset contains data from both cancer-affected and healthy patients, as shown in Figure 4; cancerous patches appear more violet and densely packed than normal patches. All patches, with a size of 50 × 50, were extracted from the specimens. The dataset, derived from invasive ductal carcinoma (IDC) slides, contains 78,786 IDC-positive and 198,738 IDC-negative patches. Human specialists labeled the dataset by specifying ground truths, producing a significant proportion of patch-level annotations. From the original dataset, we considered 157,572 labeled patches, and the dataset was divided 80% for training and 20% for testing, as shown in Figure 5.

Data Pre-Processing

Data transformation involves dataset pre-processing and data enrichment to diversify the training data and improve the model’s performance and generalizability. The deep learning repository facilitates digital image enhancement. Horizontal flip, vertical flip, shift range, and zoom range are enhancement characteristics available through the Image Augmentor module [24].
Prior segmentation, noise reduction, quantization, and morphological assessment may all be utilized to improve picture clarity and segmentation results. The purpose of data augmentation is to increase the variability of the training dataset by incorporating the augmentation features defined in Table 2, as well as visual improvements. Figure 6 shows the pixel density of a picture from the dataset; the red, green, and blue colors represent the distribution of pixels. The data are scaled between 0 and 255; however, we rescale them between 0 and 1 so that classification algorithms can use the data. The images are filled, resized, normalized, and reshaped to the correct proportions for processing.
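A minimal NumPy sketch of the flip and shift augmentations listed above (the shift amount of 4 pixels is an assumed illustrative value, not taken from the paper):

```python
import numpy as np

def augment(img, shift=4):
    """Yield simple augmented variants of a (H, W, C) image patch:
    horizontal flip, vertical flip, and a zero-filled horizontal shift."""
    yield img[:, ::-1]                    # horizontal flip (reverse columns)
    yield img[::-1, :]                    # vertical flip (reverse rows)
    shifted = np.zeros_like(img)          # shift right, zero-fill the gap
    shifted[:, shift:] = img[:, :-shift]
    yield shifted

img = np.random.default_rng(1).random((50, 50, 3))  # one 50 x 50 patch
variants = list(augment(img))
```

In practice a library augmentor (such as the Image Augmentor module cited above) applies such transforms randomly on the fly during training rather than materializing every variant.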
Data pre-processing is critical for detecting irregularities without excessive screening artifacts. Normalization converts the numeric attribute values to a common scale while preserving the variability in the dataset’s intervals; we need to normalize the data since the dataset has a variety of feature ranges. To reduce calculation time, the lesion patches are automatically extracted using feature extraction before the learning process. The proposed model’s accuracy for different parameters is shown in Figure 7.
Accuracy determines how effectively the model predicts by comparing the model’s projections to actual values, expressed as a percentage. Loss is a numerical value that reflects the sum of the model’s errors; it assesses how well or poorly the proposed model performs. The proposed model’s loss for different parameters is shown in Figure 8.
A confusion matrix is an approach for computing classifier success. It enables the assessment of the model’s recognition accuracy on a collection of test samples against the real values. Confusion matrices for different epochs are shown in Figure 9, and detailed results are shown in Table 3.
The following evaluation metrics are used for evaluating the proposed model, and the overall accuracy is measured.
$$\mathrm{Precision} = \frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive} + \mathrm{False\ Positive}}$$
$$\mathrm{Recall} = \frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive} + \mathrm{False\ Negative}}$$
$$F1\ \mathrm{Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
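A small sketch computing the three metrics above from binary predictions (the label arrays are illustrative, not experimental data):

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 from binary labels, per the equations above."""
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 1])
precision, recall, f1 = classification_metrics(y_true, y_pred)
```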
The proposed model’s results are consistent with recently proposed DL methods for automated cancer risk detection. The proposed model learns successfully with minimal losses and has the smallest gap between validation and training accuracy compared to other DL methodologies, as shown in Table 4. Overall, the results show that the DL model works effectively. Our research and the previous comparable studies with which our findings were compared all employed accuracy as a performance metric; to better communicate the findings of our own experiments, we additionally report precision, recall, and F1 score.
Experiments are carried out as part of an ablation study, adjusting various aspects of the presented InceptionResNetV2 based on the fine-tuned layers. To distinguish between cancer patients and healthy patients, the Adam optimizer and the binary cross-entropy loss function are used. Finally, we use a batch-rebalancing technique to improve the class distribution during batching. The recommended approach is observed to be more efficient. The proposed methodology based on InceptionResNetV2 achieves 91% accuracy in cancer risk detection, as shown in Table 3. The experimental findings demonstrate that it outperforms previous classifier approaches in the classification results.

5. Discussion

In this article, we present a deep learning technique for detecting cancer risk presence. Data augmentation is used during pre-processing to minimize overfitting and to enhance the model. Neuronal models are more amenable to the application of transfer learning to improve classification scores. The results of the experiments reveal that the suggested technique produces good accuracy, recall, and precision, and a good F1 score. Furthermore, from a clinical standpoint, joint sensitivities are beneficial because they offer a realistic estimate of the ratio of susceptible people, which is an essential aspect that clinicians evaluate when estimating the severity of the disease.
The results in the preceding section indicate that our technique learns high-level discriminative characteristics and achieves the highest accuracy among breast cancer classifiers. Although high-resolution breast cancer histopathology images exhibit fine-grained patterns that make multi-classification challenging, the discriminative capability of CNN methods is superior to conventional models, making the model well suited to the medical diagnosis of cancer. Because primary care clinics have a severe lack of skilled pathologists, our study should be expanded into a computerized breast cancer categorization system that provides accurate, factual indicators. We observed quicker training across various learning procedures. By subtracting the mean value from each input parameter, we normalized the input data; this procedure is also known as centering. Normalization also affects the speed of convergence: a neural network may converge more quickly if the mean values of the input variables are close to zero.
Breast cancer is now the most common and one of the most lethal diseases in women. Individuals with cancer have widely varying survival times, highlighting the need to identify predictive indicators for individualized diagnosis and therapy [34]. Knowledge on this topic has become more accessible with the emergence of techniques such as next-generation sequencing, allowing for a more complete examination of a medical illness. Treatment outcome analysis may be divided into two categories: binary classification and risk regression [35]. In the binary classification problem, patients are generally divided into two groups, short survival and long survival, depending on a specified threshold [36]. In risk regression research, a score is produced for each individual, generally using the proportional-hazards approach and its adaptations.
Adequate risk analysis for cancer is required to make educated decisions about individualized screening and preventive methods. Most prognosis algorithms give population-wide estimations but are less exact on an interpersonal basis. Existing clinical risk algorithms rely heavily on the data collected, and include menarche age, hormonal therapy treatment usage, and a family background of breast cancer [37,38]. Cancer risk detection has received much press recently due to recent breakthroughs in deep learning models. Scholars can extract diagnostic characteristics that are far more indicative than standard diagnostic risk measures such as breast densities or breast tissue structure using DL algorithms [39]. DL on radiographs has been proven in investigations to give statistical imaging characteristics associated with the prognosis of breast cancer [40].
The ability of DL to use convolutional neural networks to estimate specific disease risk values based on breast pictures is demonstrated in this problem of medicine. The increasing usage of big data and complex computer techniques, as well as greater processing power, has driven the modern increase of DL techniques [41]. DL employs detailed image processing features using big datasets and is built on systems of associated elements, as opposed to the usual machine learning technique for categorizing pictures, which is centered on handcrafted properties [42]. The units link to construct several stages, some of which are stored between source and destination nodes, which can produce extremely high-level interpretations of the supplied data. The terminology “neural network” is derived from the connection of nerve cells, and a deep neural network is a form of neural net that is used in the field of image processing [43].
The foregoing issues can be addressed by applying computer-aided diagnosis (CAD) tools for cancer pathology [44]. Innovative diagnostic methods can assist in improving accuracy rates, minimizing mistake rates in cancer classification and assessment, and reducing physicians’ efforts [45]. Designing appropriate approaches to histologic capturing, pre-processing, and the smart extraction of features for the computer-assisted screening of disease is a difficult challenge [46]. Various techniques used for predicting survival in big and diverse cancer databases have been established and multi-class segmentation models also diagnose chest diseases by segmenting the organs accurately [47,48].
Through extensive research, it has been discovered that early diagnosis increases the likelihood of proper medication and survival; yet this process is tedious and frequently results in disagreement amongst pathologists. In any scenario, early detection and prediction can drastically lower the risk of death, so breast cancer must be detected as soon as feasible. We propose a technique for predicting breast cancer since it is the most common disease in women, impacting 2.1 million women each year and causing an enormous number of deaths due to malignant development. In 2020, an estimated 276,480 new cases of breast cancer were expected, with breast cancer accounting for approximately 18% of all disease-related deaths among women [49].
When there is similar information among tasks, multiple-task learning enhances success by acquiring all tasks concurrently. In clinical uses, one typical multitask learning method is to perform a classification problem and a separation task at the same time utilizing a single statistic from each patient [50]. In addition, multi-task training has been effectively applied to continuous clinical datasets in recent years by creating associated tasks at distinct time points in the dataset.
Multi-class categorization has higher clinical relevance than binary classification since it offers more data about individuals’ health problems, relieves analyzers’ loads, and assists specialists in making more effective treatment regimens. Additionally, while CNNs have been utilized in feature extraction for edge detection, object recognition, and registration, healthcare information still has much room for development compared to the computer vision space [51,52,53]. As a result, an optimal training approach centered on transfer learning from natural pictures is utilized to fine-tune the multimodal classifier in this research, which is a popular method for DL models used in diagnostic medical interpretation.

6. Conclusions and Future Work

Clinicians and computerized technologies routinely utilize diagnostic medical imaging. This article presents a method for classifying breast cancer images using deep neural networks and transfer learning. The open source collection of breast histopathology images was utilized. To improve the categorization process, several picture magnification variables were investigated, as were data augmentation strategies. This article discusses the diagnosis and risk presence of breast cancer. We introduce a transfer learning-based automated classification technique for locating the objects of attention in breast cancer data. We propose a deep transfer learning model using InceptionResNetV2 for the diagnostic stage, which reached 91 percent accuracy. The experimental work shows that the proposed technique performed efficiently in the detection of cancer risk presence. This article depicts the risk factors for breast cancer, discusses the suitable use, features, and constraints of each risk presence model, and explores the rising role of DL in diagnosis. In order to increase accuracy and create a more reliable model, future research will focus on assessing these classifier complexes for the automated forecasting of new problems in medical imaging. Additionally, the availability of computing power based on graphics processing units (GPUs) in the cloud and the distribution structure encourage the creation of efficient parallel methods for creating such classifiers.

Author Contributions

Conceptualization, M.H., M.I.K. and N.Z.J.; methodology, M.H., M.I.K. and N.Z.J.; software, M.H., M.I.K., N.Z.J. and S.N.A.; validation, M.H., M.I.K., N.Z.J. and S.N.A.; formal analysis, M.H. and M.I.K.; investigation, M.H., M.I.K., N.Z.J. and S.N.A.; data curation, M.H., M.I.K., N.Z.J. and S.N.A.; writing—original draft preparation, M.H. and M.I.K.; writing—review and editing, M.H., M.I.K. and N.Z.J.; visualization, M.H., M.I.K. and N.Z.J.; supervision, M.H. and N.Z.J.; project administration, M.H. and N.Z.J.; funding acquisition, S.N.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Deanship of Scientific Research at Jouf University under grant No. DSR-2021-02-0329.

Data Availability Statement

Data can be provided upon request.

Acknowledgments

This work was funded by the Deanship of Scientific Research at Jouf University under grant No. DSR-2021-02-0329.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shah, S.H.; Iqbal, M.J.; Ahmad, I.; Khan, S.; Rodrigues, J.J.P.C. Optimized gene selection and classification of cancer from microarray gene expression data using deep learning. Neural Comput. Appl. 2020, 1–12. [Google Scholar] [CrossRef]
  2. Gouda, W.; Almurafeh, M.; Humayun, M.; Jhanjhi, N.Z. Detection of COVID-19 Based on Chest X-rays Using Deep Learning. Healthcare 2022, 10, 343. [Google Scholar] [CrossRef] [PubMed]
  3. Ismael, S.A.A.; Mohammed, A.; Hefny, H. An enhanced deep learning approach for brain cancer MRI images classification using residual networks. Artif. Intell. Med. 2020, 102, 101779. [Google Scholar] [CrossRef] [PubMed]
  4. Dif, N.; Elberrichi, Z. A new deep learning model selection method for colorectal cancer classification. Int. J. Swarm Intell. Res. 2020, 11, 72–88. [Google Scholar] [CrossRef]
  5. Brohi, S.N.; Pillai, T.R.; Brohi, N.N.; Jhanjhi, N.Z. A Multilayer Perceptron Model for the Classification of Breast Cancer Cells. Int. J. Comput. Digit. Syst. 2021. [Google Scholar]
  6. Khamparia, A.; Singh, P.K.; Rani, P.; Samanta, D.; Khanna, A.; Bhushan, B. An internet of health things-driven deep learning framework for detection and classification of skin cancer using transfer learning. Trans. Emerg. Telecommun. Technol. 2021, 32, e3963. [Google Scholar] [CrossRef]
  7. Welikala, R.A.; Remagnino, P.; Lim, J.H.; Chan, C.S.; Rajendran, S.; Kallarakkal, T.G.; Zain, R.B.; Jayasinghe, R.D.; Rimal, J.; Kerr, A.R.; et al. Automated detection and classification of oral lesions using deep learning for early detection of oral cancer. IEEE Access 2020, 8, 132677–132693. [Google Scholar] [CrossRef]
  8. Humayun, M.; Alsayat, A. Prediction Model for Coronavirus Pandemic Using Deep Learning. Comput. Syst. Sci. Eng. 2022, 40, 947–961. [Google Scholar] [CrossRef]
  9. Pacal, I.; Karaboga, D.; Basturk, A.; Akay, B.; Nalbantoglu, U. A comprehensive review of deep learning in colon cancer. Comput. Biol. Med. 2020, 126, 104003. [Google Scholar] [CrossRef]
  10. Murtaza, G.; Shuib, L.; Wahab, A.W.A.; Mujtaba, G.; Nweke, H.F.; Al-Garadi, M.A.; Zulfiqar, F.; Raza, G.; Azmi, N.A. Deep learning-based breast cancer classification through medical imaging modalities: State of the art and research challenges. Artif. Intell. Rev. 2020, 53, 1655–1720. [Google Scholar] [CrossRef]
  11. Chi, W.; Ma, L.; Wu, J.; Chen, M.; Lu, W.; Gu, X. Deep learning-based medical image segmentation with limited labels. Phys. Med. Biol. 2020, 65, 235001. [Google Scholar] [CrossRef]
  12. Qin, R.; Wang, Z.; Jiang, L.; Qiao, K.; Hai, J.; Chen, J.; Xu, J.; Shi, D.; Yan, B. Fine-grained lung cancer classification from PET and CT images based on multidimensional attention mechanism. Complexity 2020, 2020, 6153657. [Google Scholar] [CrossRef] [Green Version]
  13. Manne, R.; Kantheti, S.; Kantheti, S. Classification of Skin Cancer Using Deep Learning, Convolutional Neural Networks: Opportunities and Vulnerabilities. A Systematic Review. Int. J. Mod. Trends Sci. Technol. 2020, 2455–3778. [Google Scholar] [CrossRef]
  14. Shon, H.S.; Batbaatar, E.; Kim, K.O.; Cha, E.J.; Kim, K.-A. Classification of kidney cancer data using cost-sensitive hybrid deep learning approach. Symmetry 2020, 12, 154. [Google Scholar] [CrossRef] [Green Version]
  15. Shon, H.S.; Batbaatar, E.; Kim, K.O.; Cha, E.J.; Kim, K.-A. Automated Detection and Classification of Oral Lesions Using Deep Learning to Detect Oral Potentially Malignant Disorders. Cancers 2021, 13, 2766. [Google Scholar]
  16. Jing, B.; Zhang, T.; Wang, Z.; Jin, Y.; Liu, K.; Qiu, W.; Ke, L.; Sun, Y.; He, C.; Hou, D.; et al. A deep survival analysis method based on ranking. Artif. Intell. Med. 2019, 98, 1–9. [Google Scholar] [CrossRef] [PubMed]
  17. Bychkov, D.; Linder, N.; Turkki, R.; Nordling, S.; Kovanen, P.E.; Verrill, C.; Walliander, M.; Lundin, M.; Haglund, C.; Lundin, J. Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci. Rep. 2018, 8, 3395. [Google Scholar] [CrossRef] [Green Version]
  18. Katzman, J.L.; Shaham, U.; Cloninger, A.; Bates, J.; Jiang, T.; Kluger, Y. DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol. 2018, 18, 24. [Google Scholar] [CrossRef] [PubMed]
  19. Alanazi, S.A.; Kamruzzaman, M.M.; Sarker, N.I.; Alruwaili, M.; Alhwaiti, Y.; Alshammari, N.; Siddiqi, M.H. Boosting breast cancer detection using convolutional neural network. J. Healthc. Eng. 2021, 2021, 5528622. [Google Scholar] [CrossRef] [PubMed]
  20. Kather, J.N.; Krisam, J.; Charoentong, P.; Luedde, T.; Herpel, E.; Weis, C.-A.; Gaiser, T.; Marx, A.; Valous, N.A.; Ferber, D.; et al. Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study. PLoS Med. 2019, 16, e1002730. [Google Scholar] [CrossRef] [PubMed]
  21. Korfiatis, P.; Kline, T.L.; Lachance, D.H.; Parney, I.F.; Buckner, J.C.; Erickson, B.J. Residual deep convolutional neural network predicts MGMT methylation status. J. Digit. Imaging 2017, 30, 622–628. [Google Scholar] [CrossRef] [PubMed]
  22. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017. [Google Scholar]
  23. Mooney, P. Breast Histopathology Images. 2018. Available online: https://www.kaggle.com/datasets/paultimothymooney/breast-histopathology-images (accessed on 13 July 2022).
  24. PypI Image-Augmenter. 2019. Available online: https://pypi.org/project/image-augmenter (accessed on 13 July 2022).
  25. Priego-Torres, B.M.; Lobato-Delgado, B.; Atienza-Cuevas, L.; Sanchez-Morillo, D. Deep learning-based instance segmentation for the precise automated quantification of digital breast cancer immunohistochemistry images. Expert Syst. Appl. 2022, 193, 116471. [Google Scholar] [CrossRef]
  26. Liu, T.; Huang, J.; Liao, T.; Pu, R.; Liu, S.; Peng, Y. A hybrid deep learning model for predicting molecular subtypes of human breast cancer using multimodal data. Irbm 2022, 43, 62–74. [Google Scholar] [CrossRef]
  27. Hirra, I.; Ahmad, M.; Hussain, A.; Ashraf, M.U.; Saeed, I.A.; Qadri, S.F.; Alghamdi, A.M.; Alfakeeh, A.S. Breast cancer classification from histopathological images using patch-based deep learning modeling. IEEE Access 2021, 9, 24273–24287. [Google Scholar] [CrossRef]
  28. Abdollahi, S.; Lin, P.-C.; Chiang, J.-H. WinBinVec: Cancer-Associated Protein-Protein Interaction Extraction and Identification of 20 Various Cancer Types and Metastasis Using Different Deep Learning Models. IEEE J. Biomed. Health Inform. 2021, 25, 4052–4063. [Google Scholar] [CrossRef]
  29. Wang, X.; Chen, H.; Gan, C.; Lin, H.; Dou, Q.; Tsougenis, E.; Huang, Q.; Cai, M.; Heng, P.-A. Weakly Supervised Deep Learning for Whole Slide Lung Cancer Image Analysis. IEEE Trans. Cybern. 2020, 50, 3950–3962. [Google Scholar] [CrossRef]
  30. Chen, X.; Lin, X.; Shen, Q.; Qian, X. Combined Spiral Transformation and Model-Driven Multi-Modal Deep Learning Scheme for Automatic Prediction of TP53 Mutation in Pancreatic Cancer. IEEE Trans. Med. Imaging 2021, 40, 735–747. [Google Scholar] [CrossRef]
  31. Yang, T.; Liang, N.; Li, J.; Yang, Y.; Li, Y.; Huang, Q.; Li, R.; He, X.; Zhang, H. Intelligent Imaging Technology in Diagnosis of Colorectal Cancer Using Deep Learning. IEEE Access 2019, 7, 178839–178847. [Google Scholar] [CrossRef]
  32. Trivizakis, E.; Manikis, G.C.; Nikiforaki, K.; Drevelegas, K.; Constantinides, M.; Drevelegas, A.; Marias, K. Extending 2-D Convolutional Neural Networks to 3-D for Advancing Deep Learning Cancer Classification With Application to MRI Liver Tumor Differentiation. IEEE J. Biomed. Health Inform. 2019, 29, 923–930. [Google Scholar] [CrossRef]
  33. Alfian, G.; Syafrudin, M.; Fahrurrozi, I.; Fitriyani, N.L.; Atmaji, F.T.D.; Widodo, T.; Bahiyah, N.; Benes, F.; Rhee, J. Predicting Breast Cancer from Risk Factors Using SVM and Extra-Trees-Based Feature Selection Method. Computers 2022, 11, 136. [Google Scholar] [CrossRef]
  34. Pang, T.; Wong, J.H.D.; Ng, W.L.; Chan, C.S. Deep learning radiomics in breast cancer with different modalities: Overview and future. Expert Syst. Appl. 2020, 158, 113501. [Google Scholar] [CrossRef]
  35. Gaur, L.; Bhatia, U.; Jhanjhi, N.Z.; Muhammad, G.; Masud, M. Medical image-based detection of COVID-19 using Deep Convolution Neural Networks. Multimed. Syst. 2021, 1–10. [Google Scholar] [CrossRef] [PubMed]
  36. Girum, K.B.; Créhange, G.; Hussain, R.; Lalande, A. Fast interactive medical image segmentation with weakly supervised deep learning method. Int. J. Comput. Assist. Radiol. Surg. 2020, 15, 1437–1444. [Google Scholar] [CrossRef]
  37. Adegun, A.; Viriri, S. Deep learning techniques for skin lesion analysis and melanoma cancer detection: A survey of state-of-the-art. Artif. Intell. Rev. 2021, 54, 811–841. [Google Scholar] [CrossRef]
  38. Ahmad, F.; Almuayqil, S.N.; Humayun, M.; Naseem, S.; Khan, W.A.; Junaid, K. Prediction of COVID-19 cases using machine learning for effective public health management. Comput. Mater. Contin. 2021, 66, 2265–2282. [Google Scholar] [CrossRef]
  39. Tabares-Soto, R.; Orozco-Arias, S.; Romero-Cano, V.; Bucheli, V.S.; Rodríguez-Sotelo, J.L.; Jiménez-Varón, C.F. A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data. PeerJ Comput. Sci. 2020, 6, e270. [Google Scholar] [CrossRef] [Green Version]
  40. Stoean, R. Analysis on the potential of an EA–surrogate modelling tandem for deep learning parametrization: An example for cancer classification from medical images. Neural Comput. Appl. 2020, 32, 313–322. [Google Scholar] [CrossRef]
  41. Chen, C.-L.; Chen, C.-C.; Yu, W.-H.; Chen, S.-H.; Chang, Y.-C.; Hsu, T.-I.; Hsiao, M.; Yeh, C.-Y. An annotation-free whole-slide training approach to pathological classification of lung cancer types using deep learning. Nat. Commun. 2021, 12, 1193. [Google Scholar] [CrossRef]
  42. Acharya, S.; Alsadoon, A.; Prasad, P.W.C.; Abdullah, S.; Deva, A. Deep convolutional network for breast cancer classification: Enhanced loss function (ELF). J. Supercomput. 2020, 76, 8548–8565. [Google Scholar] [CrossRef]
  43. Chaturvedi, S.S.; Tembhurne, J.V.; Diwan, T. A multi-class skin Cancer classification using deep convolutional neural networks. Multimed. Tools Appl. 2020, 79, 28477–28498. [Google Scholar] [CrossRef]
  44. Chand, S. A comparative study of breast cancer tumor classification by classical machine learning methods and deep learning method. Mach. Vis. Appl. 2020, 31, 46. [Google Scholar]
  45. Kriegsmann, M.; Haag, C.; Weis, C.-A.; Steinbuss, G.; Warth, A.; Zgorzelski, C.; Muley, T.; Winter, H.; Eichhorn, M.E.; Eichhorn, F.; et al. Deep learning for the classification of small-cell and non-small-cell lung cancer. Cancers 2020, 12, 1604. [Google Scholar] [CrossRef]
  46. Kadampur, M.A.; Al Riyaee, S. Skin cancer detection: Applying a deep learning based model driven architecture in the cloud for classifying dermal cell images. Inform. Med. Unlocked 2020, 18, 100282. [Google Scholar] [CrossRef]
  47. Khalil, M.I.; Humayun, M.; Jhanjhi, N.Z.; Talib, M.N.; Tabbakh, T.A. Multi-class Segmentation of Organ at Risk from Abdominal CT Images: A Deep Learning Approach. In Intelligent Computing and Innovation on Data Science; Springer: Berlin/Heidelberg, Germany, 2021; pp. 425–434. [Google Scholar]
  48. Khalil, M.I.; Tehsin, S.; Humayun, M.; Jhanjhi, N.; AlZain, M.A. Multi-Scale Network for Thoracic Organs Segmentation. Cmc-Comput. Mater. Contin. 2022, 70, 3251–3265. [Google Scholar] [CrossRef]
  49. Musa, A.; Aliyu, U. Application of Machine Learning Techniques in Predicting of Breast Cancer Metastases Using Decision Tree Algorithm. Sokoto Northwestern Nigeria. J. Data Min. Genom. Proteom. 2020, 11, 1–5. [Google Scholar]
  50. Alzubaidi, L.; Al-Shamma, O.; Fadhel, M.A.; Farhan, L.; Zhang, J.; Duan, Y. Optimizing the performance of breast cancer classification by employing the same domain transfer learning from hybrid deep convolutional neural network model. Electronics 2020, 9, 445. [Google Scholar] [CrossRef] [Green Version]
  51. Zheng, J.; Lin, D.; Gao, Z.; Wang, S.; He, M.; Fan, J. Deep learning assisted efficient AdaBoost algorithm for breast cancer detection and early diagnosis. IEEE Access 2020, 8, 96946–96954. [Google Scholar] [CrossRef]
  52. Jinnai, S.; Yamazaki, N.; Hirano, Y.; Sugawara, Y.; Ohe, Y.; Hamamoto, R. The development of a skin cancer classification system for pigmented skin lesions using deep learning. Biomolecules 2020, 10, 1123. [Google Scholar] [CrossRef]
  53. Hassan, M.; Mollick, S.; Yasmin, F. An unsupervised cluster-based feature grouping model for early diabetes detection. Healthc. Anal. 2022, 2, 100112. [Google Scholar] [CrossRef]
Figure 1. Proposed methodology workflow.
Figure 2. A brief flow chart of the proposed methodology.
Figure 3. InceptionResNetV2 architecture.
Figure 4. Breast cancer data: (a) false positive samples and (b) true positive samples.
Figure 5. Dataset distribution.
Figure 6. Patch pixel intensity.
Figure 7. Model accuracy over 70 epochs.
Figure 8. Model loss over 70 epochs.
Figure 9. Model confusion matrix over 70 epochs.
Table 1. Existing cancer prediction techniques.

Article | Cancer Type | Data Size | Methodology | Validation | Result | Performance
[16] | Breast cancer | METABRIC, 1903 samples | Deep neural network | METABRIC, 951 samples | Cancer prediction | C-index 0.704
[17] | Colorectal cancer | 420 patients' data | VGG16 | 140 test samples, 60 validation samples | Cancer prediction | HR 2.3; 95% CI 1.79–3.03; AUC 0.69
[18] | Breast cancer | 1546 training, 686 testing | Neural network | 20% test data | Cancer prediction | CI 0.67
[19] | Breast cancer | 275,000 50 × 50-pixel RGB patches | CNNs | - | Cancer prediction | Accuracy 87%
[20] | Colorectal cancer | 7180 slides | VGG19, GoogLeNet, ResNet50, AlexNet, SqueezeNet | 409 validation samples | Classification of 9 tissue types | CI 95
[21] | Glioblastoma multiforme | 458,951 MRI images from 262 patients | Deep neural network | Variation in k-fold | Cancer prediction | ResNet50: 94.90% (±3.92%); ResNet34 (34 layers): 80.72% (±13.61%)
Table 2. Augmentation properties.

Augmentation Property | Value
Perspective Rotation | 0.1
Hue/Saturation | 1
Perspective Rotation | 0.1
Horizontal Flip | 0.5
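The flip probability in Table 2 can be applied with a simple library-agnostic sketch (pure Python, with an image represented as a list of pixel rows; the helper names are illustrative and unrelated to the image-augmenter package cited in [24]):

```python
import random

def hflip(image):
    """Mirror an image (list of pixel rows) left-to-right."""
    return [row[::-1] for row in image]

def maybe_hflip(image, p=0.5, rng=random.random):
    """Apply a horizontal flip with probability p (Table 2 uses p = 0.5)."""
    return hflip(image) if rng() < p else image

img = [[1, 2, 3],
       [4, 5, 6]]
always = maybe_hflip(img, rng=lambda: 0.0)   # draw below p: flipped
never = maybe_hflip(img, rng=lambda: 0.9)    # draw above p: unchanged
print(always)  # [[3, 2, 1], [6, 5, 4]]
```

In practice, an image-augmentation library applies the magnitude-valued transforms (rotation, hue/saturation) as well, but the probabilistic gating shown here is the common pattern.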
Table 3. Classification results.

Metric | Healthy | Cancer
Recall | 0.93 | 0.76
Precision | 0.96 | 0.68
F1-Score | 0.94 | 0.72
Overall accuracy: 0.907
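The per-class F1 scores in Table 3 follow from the listed precision and recall via F1 = 2PR/(P + R); a quick check in Python (values copied from the table):

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.96, 0.93), 2))  # healthy class -> 0.94
print(round(f1(0.68, 0.76), 2))  # cancer class -> 0.72
```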
Table 4. Result comparison with other papers.

Article | Accuracy | Methodology
[25] | 0.85 | Deep neural network
[26] | 0.88 | Hybrid DL
[27] | 0.86 | Patch-based deep belief network (DBN)
[28] | 0.78 | DL-based window-based binary vectors
[29] | 0.82 | Patch-based fully convolutional network (FCN)
[30] | 0.83 | Model-driven multi-modal deep learning
[31] | 0.79 | DL with gray-level size zone matrix (GLSZM)
[32] | 0.83 | 3D CNN
[33] | 0.80 | SVM with randomized trees
Proposed model | 0.91 | TL with InceptionResNetV2