Article

Taylor Bird Swarm Algorithm Based on Deep Belief Network for Heart Disease Diagnosis

by Afnan M. Alhassan 1,2,* and Wan Mohd Nazmee Wan Zainon 1

1 School of Computer Science, Universiti Sains Malaysia, George Town 11800, Malaysia
2 College of Computing and Information Technology, Shaqra University, Shaqra 11961, Saudi Arabia
* Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(18), 6626; https://doi.org/10.3390/app10186626
Submission received: 27 July 2020 / Revised: 13 September 2020 / Accepted: 18 September 2020 / Published: 22 September 2020
(This article belongs to the Special Issue Medical Informatics and Data Analysis)

Abstract

Contemporary medicine depends on a huge amount of information contained in medical databases. Thus, extracting valuable knowledge and making scientific decisions for the treatment of disease have progressively become necessary to attain effective diagnosis. The availability of a large amount of medical data creates the need for effective data analysis tools for extracting constructive knowledge. This paper proposes a novel method for heart disease diagnosis. Here, the medical data are pre-processed using log transformation, which maps the data to a uniform value range. Then, feature selection is performed using sparse fuzzy-c-means (FCM) to select the significant features for classifying medical data. Incorporating sparse FCM for feature selection makes the models easier to interpret, as this sparse technique identifies the features that are important for detection and can handle high-dimensional data. Then, the selected features are given to the deep belief network (DBN), which is trained using the proposed Taylor-based bird swarm algorithm (Taylor-BSA) for detection. Here, the proposed Taylor-BSA is designed by combining the Taylor series and the bird swarm algorithm (BSA). The proposed Taylor-BSA–DBN outperformed other methods, with a maximal accuracy of 93.4%, maximal sensitivity of 95%, and maximal specificity of 90.3%.

1. Introduction

Contemporary medicine depends on a large amount of information accumulated in medical datasets. The extraction of such constructive knowledge can help when making scientific decisions to diagnose disease. Medical data can enhance the management of hospital information and support the growth of telemedicine. Medical data primarily focus on patient care first, and research resources second. The main rationale for collecting medical data is to improve patient health conditions [1]. The availability of large volumes of medical data causes redundancy, which requires effective techniques for processing data to extract beneficial knowledge. However, the diagnostics of various diseases pose significant issues in data analysis [2]. Clinical diagnosis is guided by a doctor's judgment rather than by patterns in the medical dataset; thus, there is the possibility of incorrect diagnosis [3]. Cloud-based services can assist with managing medical data, including compliance management, policy integration, access controls, and identity management [4].
Nowadays, heart disease is a foremost cause of death. We are moving towards a new industrial revolution; thus, lifestyle changes should take place to prevent risk factors of heart disease, such as obesity, diabetes, hypertension, and smoking [5]. The treatment of disease is a complex task in the medical field. The discovery of heart disease, with different risk factors, is considered a multi-layered issue [6]. Thus, patient medical data are collected to simplify the diagnosis process. Offering a valuable service at low cost is a major challenge in the healthcare industry. In [7], valuable quality service refers to precise diagnosis and effective treatment of patients. Poor clinical decisions cause disasters, which may affect the health of patients. Automated approaches, such as machine learning [8,9] and data mining [10], assist with attaining clinical tests, or diagnoses, at reduced risks [11,12]. Classification and pattern recognition by machine learning algorithms are widely used in prognostic and diagnostic monitoring. The machine learning approach supports decision-making, which increases the safety of patients and avoids medical errors, so it can be used in clinical decision support systems (CDSS) [13,14].
Several methods are devised for automatic heart disease detection to evaluate the efficiency of the decision tree and Naive Bayes [15]. Moreover, optimization with the genetic algorithm is employed for minimizing the number of attributes without forfeiting accuracy and efficiency to diagnose heart disease [16]. Data mining methods for heart disease diagnosis include the bagging algorithm, neural network, support vector machine, and automatically defined groups [17]. In [18], the study acquired 493 samples from a cerebrovascular disease prevention program, and utilized three classification techniques (the Bayesian classifier, decision tree, and backpropagation neural network) for constructing classification models. In [19], a method is devised for diagnosing coronary artery disease. The method utilized 303 samples by adapting the feature creation technique. In [20], a methodology is devised for automatically detecting the efficiency of features to reveal heart rate signals. In [21], a hybrid algorithm is devised with K-Nearest Neighbour (KNN), and the genetic algorithm for effectual classification. The method utilized a genetic search as a decency measure for ranking attributes. Then, the classification algorithm was devised on evaluated attributes for heart disease diagnosis. The extraction of valuable information from huge data is a time-consuming task [22]. The size of the medical dataset is increasing in a rapid manner and advanced techniques of data mining help physicians make effective decisions. However, the issues of heart disease data involve feature selection, in which the imbalance of samples and the lack of magnitude of features are just some of the issues [23]. Although there are methods for heart disease detection with real-world medical data, these methods are devised to improve accuracy and time for computation in disease detection [24]. In [25], a hybrid model with the cuckoo search (CS)—and a rough set—is adapted for diagnosing heart disease. 
The drawback is that a rough set produces an excessive number of rules. To address these challenges in heart disease diagnosis, a novel method named the Taylor-based bird swarm algorithm–deep belief network (Taylor-BSA–DBN) is proposed for medical data classification.
The purpose of the research is to present a heart disease diagnosis strategy, for which the proposed Taylor-BSA–DBN is employed. The major contribution of the research is the detection of heart disease using selected features. Here, the feature selection is performed using sparse FCM for selecting imperative features. In addition, DBN is employed for detecting heart disease data using the features. Here, the DBN is trained by the proposed Taylor-BSA, in such a way that the model parameters are learned optimally. The proposed Taylor-BSA is developed through the inheritance of the high global convergence property of BSA in the Taylor series. Hence, the proposed Taylor-BSA–DBN renders effective accuracy, sensitivity, and specificity while facilitating heart disease diagnosis.
The major portion of the paper focuses on:
  • Proposed Taylor-BSA–DBN for heart disease diagnosis: Taylor-BSA–DBN (a classifier) is proposed by modifying the training algorithm of the DBN with the Taylor-BSA algorithm, which is newly derived by combining the Taylor series and the BSA algorithm, for the optimal tuning of weights and biases. The proposed Taylor-BSA–DBN is adapted for heart disease diagnosis.
Other sections of the paper are arranged as follows: Section 2 elaborates the descriptions of conventional heart disease detection strategies utilized in the literature, as well as challenges faced, which are considered as the inspiration for developing the proposed technique. The proposed method for heart disease diagnosis using modified DBN is portrayed in Section 3. The outcomes of the proposed strategy with other methods are depicted in Section 4; Section 5 presents the conclusion.

2. Motivations

This section reviews eight strategies employed for heart disease diagnosis, along with their challenges.

Literature Survey

Reddy, G.T. et al. [22] devised an adaptive genetic algorithm with fuzzy logic (AGAFL) model for predicting heart disease, which assists clinicians in treating heart disease at earlier phases. The model comprises rough sets with a fuzzy rule-based classification module and heart disease feature selection module. The obtained rules from fuzzy classifiers are optimized by adapting an adaptive genetic algorithm. Initially, the significant features that affect heart disease are chosen using the rough set theory. Then, the second step predicts heart disease with the AGAFL classifier. The method is effective in handling noisy data and works effectively with large attributes. Nourmohammadi-Khiarak et al. [23] devised a method for selecting features and reducing the number of features.
Here, the imperialist competitive algorithm was devised to choose important features from heart disease. This algorithm offers an optimal response in selecting features. Moreover, the k-nearest neighbor algorithm was utilized for classification. The method showed that the accuracy of feature selection was enhanced. However, the method failed to utilize incomplete or missed data. Magesh, G. and Swarnalatha, P. [26] devised a model using Cleveland heart samples for heart disease diagnosis. The method employed cluster-based Decision Tree learning (CDTL) for diagnosing heart disease. Here, the original set was partitioned using target label distribution. From elevated distribution samples, the possible class was derived. For each class set, the features were detected using entropy for diagnosing heart disease. Thiyagaraj, M. and Suseendran, G. [27] developed Particle Swarm Optimization and Rough Sets with Transductive Support Vector Machines (PSO and RS with TSVM) for heart disease diagnosis. This method improved data integrity to minimize data redundancy. The normalization of data was carried out using Zero-Score (Z-Score). Then, the PSO was employed for selecting the optimal subset of attributes, reduce computational overhead, and enhance prediction performance. The Radial Basis Function-Transductive Support Vector Machines (RBF-TSVM) classifier was employed for heart disease prediction. Abdel-Basset, M. et al. [28] devised a model using Internet of Things (IoT) for determining and monitoring heart patients. The goal of the healthcare model was to obtain improved precision for diagnosis. The neutrosophic multi-criteria decision-making (NMCDM) technique was employed for aiding patients (i.e., for observing patients suffering from heart failure). Moreover, the model provided an accurate solution that decreases the rate of death and the cost of treatment. Nilashi, M. et al. [24] devised a predictive technique for heart disease diagnosis with machine learning models. 
Here, the method adopted unsupervised and supervised learning for diagnosing heart disease. In addition, the method employed Self-Organizing Map, Fuzzy Support Vector Machine (FSVM), and Principal Component Analysis (PCA) for missing value imputation. Moreover, incremental PCA and FSVM were devised for incremental learning of data to minimize the computation time in disease prediction. Shah, S.M.S. et al. [29] devised an automatic diagnostic technique for diagnosing heart disease. The method evaluated the pertinent feature subset by employing the benefits of feature selection and extraction models. For feature selection, two algorithms were used: the accuracy-based feature selection algorithm (AFSA) and the mean Fisher-based feature selection algorithm (MFFSA). However, the method failed to employ PCA for dimension reduction. Acharjya, D.P. [25] devised a hybrid method for diagnosing heart disease. The method combined the cuckoo search (CS) and rough set to infer decision rules. Moreover, the CS was employed for discovering essential features. In addition, three major features were evaluated with rough set rules. The method improved feasibility, but failed to induce an intuitionistic fuzzy rough set and CS for diagnosing heart disease.

3. Proposed Taylor-BSA–DBN for Medical Data Classification

The availability of a large amount of medical data has created the need for strong data analysis tools for extracting valuable knowledge. Researchers are adopting data mining and statistical tools to improve the analysis of large datasets. Disease diagnosis is a foremost application in which data mining tools have offered successful results. Medical data tend to be rich in information, but poor in knowledge; that is, there is a deficiency of effective analysis tools for discovering hidden relations and trends in medical data generated from clinical records. Processing medical data yields useful results only when powerful methods are available. Thus, the proposed Taylor-BSA–DBN is devised to process medical data for attaining effective heart disease diagnosis. Figure 1 portrays the schematic view of the proposed Taylor-BSA–DBN for heart disease diagnosis. The complete process of the proposed model comprises pre-processing, feature selection, and detection. At first, the medical data are fed as input to the pre-processing phase, wherein log transformation is applied to minimize skew and normalize the data. The pre-processed data are then subjected to the feature selection phase, in which the imperative features are selected with sparse FCM. After the imperative features are obtained, detection is performed with the DBN, which is trained using Taylor-BSA. The proposed Taylor-BSA is devised by combining the Taylor series and BSA. The output produced by the classifier is the classified medical data.
Consider the input medical data $A$, with various attributes, expressed as
$$A = \{A_{G,H}\}; \quad 1 \le G \le B; \; 1 \le H \le C \tag{1}$$
where $A_{G,H}$ denotes the $H$-th attribute in the $G$-th data record, $B$ specifies the total number of data records, and $C$ specifies the total number of attributes in each record. The dimension of the database is $B \times C$.

3.1. Pre-Processing

Pre-processing facilitates smoother processing of the input data and eliminates noise and artefacts contained in the data. In this method, pre-processing is carried out using log transformation, in which each data value is replaced with its logarithm; the base of the log is set by the analyst (for example, 2 or 10). The transformation compresses the range of large values. In addition, log transformation is a widely adopted method for handling skewed data and assisting data normalization. The log transformation is formulated as,
$$D = \log_{10}(A) \tag{2}$$
The pre-processed dataset $D$ retains the dimension $B \times C$.
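As a minimal sketch of this step, the snippet below applies the base-10 log transformation element-wise with NumPy. The function name `log_transform`, the optional `offset` argument, and the sample values are illustrative assumptions and not part of the original method; the paper does not state how zero-valued attributes are handled, so the sketch simply rejects non-positive inputs unless an offset is supplied.

```python
import numpy as np

def log_transform(data, offset=0.0):
    """Return D = log10(A + offset), applied element-wise.

    `offset` is a hypothetical convenience for attributes that contain zeros;
    with offset=0.0 this is the plain transformation D = log10(A).
    """
    shifted = np.asarray(data, dtype=float) + offset
    if np.any(shifted <= 0):
        raise ValueError("log10 needs strictly positive inputs; use a larger offset")
    return np.log10(shifted)

# Illustrative B x C matrix (2 records, 3 attributes); the values are made up.
A = np.array([[63.0, 145.0, 233.0],
              [37.0, 130.0, 250.0]])
D = log_transform(A)
print(D.shape)  # same B x C shape as the input
```

The transformation changes only the values, not the layout, so the feature selection stage receives a matrix with the same rows and columns as the raw data.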

3.2. Selection of Features with Sparse FCM Clustering

The pre-processed data are fed to the feature selection module, which uses the sparse FCM algorithm [30], a modification of the standard FCM. The benefit of using sparse FCM is that it supports the clustering of high-dimensional data. The pre-processed data contain different types of attributes, each indicating an individual value. In the medical data classification strategy, sparse FCM is applied for determining the features from the data. The sparse FCM clustering algorithm clusters nodes, to attain communication between nodes through the cluster head, and facilitate effective detection of the attacker node. Generally, in sparse FCM, dimensional reduction is effective, and the algorithm possesses the ability to handle disease diagnosis without delay and is easier to combine with optimization techniques.

3.3. Classification of Medical Data with Proposed Taylor-BSA-Based DBN

In this section, medical data classification using the proposed Taylor-BSA method is presented, and the classification is progressed using the feature vector.

3.3.1. Proposed Taylor-BSA Algorithm

The proposed Taylor-BSA is the combination of the Taylor series and BSA. The Taylor series [31] explains the functions of complex variables, and it is the expansion of a function into an infinite sum of terms. It not only serves as a powerful tool, but also helps in evaluating integrals and infinite sums. Moreover, the Taylor series is a one-step process, and it can deal with higher-order terms. The Taylor series is advantageous for derivations, and can be used to obtain theoretical error bounds. It is also a simple method for solving complex functions. BSA [32] is based on the social behaviors of birds that follow some idealistic rules. BSA is reported to be more accurate than other standard optimization algorithms, with efficient and robust performance, and it balances exploration and exploitation. The DBN has recently become a popular approach in machine learning for its promised advantages, such as fast inference and the ability to encode richer and higher-order network structures. DBN is used to extract better feature representations, and several related tasks are solved simultaneously by using shared representations. Moreover, it has the advantages of a multi-layer structure, and pre-training with the fine-tuning learning method. The algorithmic steps of the proposed Taylor-BSA are described below:
Step 1. Initialization: the first step is the initialization of the population and other algorithmic parameters, including $F_{i,j};\ 1 \le i \le j$, where the population size is denoted as $j$, $h_{\max}$ represents the maximal iteration, $prob$ indicates the probability of foraging for food, and the frequency of flight behavior of birds is expressed as $F_t$.
Step 2. Determination of objective function: the selection of the best position of the bird is treated as a minimization problem. The minimal value of error defines the optimal solution.
Step 3. Position update of the birds: for updating the positions, birds have three phases, which are decided using probability. Whenever the random number $\mathrm{Rand}(0,1) < prob$, the update is based on foraging behavior; otherwise, the vigilance behavior commences. On the other hand, the swarm splits as scroungers and producers, which is modeled as flight behaviors. Finally, the feasibility of the solutions is verified and the best solution is retrieved.
Step 4. Foraging behavior of birds: the individual bird searches for food based on its own experience and the behavior of the swarm. The standard equation of the foraging behavior of birds [32] is given by,
$$F_{i,j}^{h+1} = F_{i,j}^{h} - F_{i,j}^{h}\,\mathrm{Rand}(0,1)\,[Z + T] + \mathrm{Rand}(0,1)\,[P_{i,j}Z + Y_{j}T] \tag{3}$$
where $F_{i,j}^{h+1}$ and $F_{i,j}^{h}$ denote the location of the $i$-th bird in the $j$-th dimension at iterations $h+1$ and $h$, $P_{i,j}$ refers to the previous best position of the $i$-th bird, $\mathrm{Rand}(0,1)$ denotes independent uniformly distributed numbers, $Y_{j}$ indicates the best previous location shared by the bird swarm, $Z$ denotes the cognitive accelerated coefficient, and $T$ denotes the social accelerated coefficient. Here, $Z$ and $T$ are positive numbers.
According to the Taylor series [31], the update equation is expressed as,
$$F_{i,j}^{h+1} = 0.5F_{i,j}^{h} + 1.3591F_{i,j}^{h-1} - 1.359F_{i,j}^{h-2} + 0.6795F_{i,j}^{h-3} - 0.2259F_{i,j}^{h-4} + 0.0555F_{i,j}^{h-5} - 0.0104F_{i,j}^{h-6} + 1.38\times10^{-3}F_{i,j}^{h-7} - 9.92\times10^{-5}F_{i,j}^{h-8} \tag{4}$$
$$F_{i,j}^{h} = \frac{1}{0.5}\left[F_{i,j}^{h+1} - 1.3591F_{i,j}^{h-1} + 1.359F_{i,j}^{h-2} - 0.6795F_{i,j}^{h-3} + 0.2259F_{i,j}^{h-4} - 0.0555F_{i,j}^{h-5} + 0.0104F_{i,j}^{h-6} - 1.38\times10^{-3}F_{i,j}^{h-7} + 9.92\times10^{-5}F_{i,j}^{h-8}\right] \tag{5}$$
Substituting Equation (5) in Equation (3),
$$F_{i,j}^{h+1} = F_{i,j}^{h} - \left[2F_{i,j}^{h+1} - 2.7182F_{i,j}^{h-1} + 2.718F_{i,j}^{h-2} - 1.359F_{i,j}^{h-3} + 0.4518F_{i,j}^{h-4} - 0.111F_{i,j}^{h-5} + 0.0208F_{i,j}^{h-6} - 0.00276F_{i,j}^{h-7} + 0.0001984F_{i,j}^{h-8}\right]\mathrm{Rand}(0,1)\,[Z + T] + \mathrm{Rand}(0,1)\,[P_{i,j}Z + Y_{j}T] \tag{6}$$
Collecting the $F_{i,j}^{h+1}$ terms on the left-hand side, with the $2F_{i,j}^{h+1}$ term carried over without its random multiplier, gives
$$F_{i,j}^{h+1} + 2F_{i,j}^{h+1} = F_{i,j}^{h} + \left[2.7182F_{i,j}^{h-1} - 2.718F_{i,j}^{h-2} + 1.359F_{i,j}^{h-3} - 0.4518F_{i,j}^{h-4} + 0.111F_{i,j}^{h-5} - 0.0208F_{i,j}^{h-6} + 0.00276F_{i,j}^{h-7} - 0.0001984F_{i,j}^{h-8}\right]\mathrm{Rand}(0,1)\,[Z + T] + \mathrm{Rand}(0,1)\,[P_{i,j}Z + Y_{j}T] \tag{7}$$
$$3F_{i,j}^{h+1} = F_{i,j}^{h} + \left[2.7182F_{i,j}^{h-1} - 2.718F_{i,j}^{h-2} + 1.359F_{i,j}^{h-3} - 0.4518F_{i,j}^{h-4} + 0.111F_{i,j}^{h-5} - 0.0208F_{i,j}^{h-6} + 0.00276F_{i,j}^{h-7} - 0.0001984F_{i,j}^{h-8}\right]\mathrm{Rand}(0,1)\,[Z + T] + \mathrm{Rand}(0,1)\,[P_{i,j}Z + Y_{j}T] \tag{8}$$
$$F_{i,j}^{h+1} = \frac{1}{3}\left\{F_{i,j}^{h} + \left[2.7182F_{i,j}^{h-1} - 2.718F_{i,j}^{h-2} + 1.359F_{i,j}^{h-3} - 0.4518F_{i,j}^{h-4} + 0.111F_{i,j}^{h-5} - 0.0208F_{i,j}^{h-6} + 0.00276F_{i,j}^{h-7} - 0.0001984F_{i,j}^{h-8}\right]\mathrm{Rand}(0,1)\,[Z + T] + \mathrm{Rand}(0,1)\,[P_{i,j}Z + Y_{j}T]\right\} \tag{9}$$
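To make the final foraging update concrete, the sketch below evaluates it for one bird with NumPy. It is an illustration under stated assumptions rather than the authors' implementation: the function name, the argument layout (`history[0]` holds $F^{h}$ and `history[1..8]` hold $F^{h-1},\dots,F^{h-8}$), and the choice to pass the two uniform draws `r1` and `r2` in explicitly are all hypothetical. The coefficient signs are taken as twice the bracketed terms of the rearranged Taylor expression, with the sign flipped when moved across the equality.

```python
import numpy as np

# Multipliers of F^{h-1} ... F^{h-8} in the Taylor-based foraging update.
TAYLOR_COEFFS = np.array([2.7182, -2.718, 1.359, -0.4518,
                          0.111, -0.0208, 0.00276, -0.0001984])

def taylor_foraging_update(history, p_best, y_best, Z, T, r1, r2):
    """One foraging step: F^{h+1} = (F^h + S*r1*(Z+T) + r2*(P*Z + Y*T)) / 3.

    history : array with 9 entries along axis 0, history[k] = F^{h-k}
    p_best  : previous best position of this bird (P)
    y_best  : best previous position shared by the swarm (Y)
    r1, r2  : uniform random draws in [0, 1), supplied by the caller (assumption)
    """
    history = np.asarray(history, dtype=float)
    # S = weighted sum of the eight past positions
    past_sum = np.tensordot(TAYLOR_COEFFS, history[1:9], axes=1)
    return (history[0] + past_sum * r1 * (Z + T)
            + r2 * (p_best * Z + y_best * T)) / 3.0

# Usage with a constant one-dimensional history, purely for illustration.
step = taylor_foraging_update(np.ones(9), p_best=1.0, y_best=1.0,
                              Z=1.0, T=1.0, r1=0.5, r2=0.5)
print(step)
```

Passing the random draws as arguments keeps the function deterministic, which makes it easier to check against the equation by hand.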
Step 5. Vigilance behavior of birds: the birds move towards the center of the swarm, during which they compete with each other; the vigilance behavior of birds is modeled as,
$$F_{i,j}^{h+1} = F_{i,j}^{h} + V_{1}\left(\mu_{j} - F_{i,j}^{h}\right)\times\mathrm{Rand}(0,1) + V_{2}\left(U_{oj} - F_{i,j}^{h}\right)\times\mathrm{Rand}(-1,1) \tag{10}$$
$$V_{1} = w_{1}\times\exp\left(-\frac{RQU_{i}}{RQ + \psi}\times v\right) \tag{11}$$
$$V_{2} = w_{2}\times\exp\left(\left(\frac{RQU_{i} - RQU_{T}}{\left|RQU_{T} - RQU_{i}\right| + \psi}\right)\frac{v\times RQU_{T}}{RQ + \psi}\right) \tag{12}$$
where $v$ represents the number of birds, $w_{1}$ and $w_{2}$ are positive constants lying in the range $[0, 2]$, $RQU_{i}$ denotes the optimal fitness value of the $i$-th bird, and $RQ$ corresponds to the sum of the best fitness values of the swarm. $\psi$ is a constant that keeps the optimization away from zero-division error, and $T$ signifies a positive integer.
Step 6. Flight behavior: the birds fly to another site in response to threatening events or for foraging. When the birds reach a new site, they search for food. Some birds in the group act as producers and others as scroungers. The behavior of producers and scroungers is modeled, respectively, as,
$$F_{i,j}^{h+1} = F_{i,j}^{h} + \mathrm{Randr}(0,1)\times F_{i,j}^{h} \tag{13}$$
$$F_{i,j}^{h+1} = F_{i,j}^{h} + \left(F_{\gamma,j}^{h} - F_{i,j}^{h}\right)\times F_{l}\times\mathrm{Rand}(0,1) \tag{14}$$
where $\mathrm{Randr}(0,1)$ refers to a Gaussian distributed random number with zero mean and unit standard deviation.
Step 7. Determination of best solution: the best solution is evaluated based on the error function. If the newly computed solution is better than the previous one, the previous solution is replaced by the new one.
Step 8. Terminate: the optimal solutions are derived in an iterative manner until the maximum number of iterations is reached. The pseudo-code of the proposed Taylor-BSA algorithm is illustrated in Algorithm 1.
Algorithm 1. Pseudocode for the proposed Taylor-BSA algorithm
Input: Bird swarm population $W_{k,l};\ 1 \le k \le b$
Output: Best solution
Procedure:
Begin
 Population initiation: $F_{i,j};\ 1 \le i \le p$
 Read the parameters: $b$ - population size; $h_{\max}$ - maximal iteration; $prob$ - probability of foraging food; $F_{l}$ - frequency of flight behavior of birds
 Determine the fitness of the solutions
 While $h < h_{\max}$
  If $h \bmod F_{l} \ne 0$
   For $k = 1 : b$
    If $\mathrm{Rand}(0,1) < prob$
     Foraging behavior using Equation (3)
    Else
     Vigilance behavior using Equation (12)
    End if
   End for
  Else
   Split the swarm as scroungers and producers
   For $k = 1 : b$
    If $k$ is a producer
     Update using Equation (13)
    Else
     Update using Equation (14)
    End if
   End for
  End if
  Check the feasibility of the solutions
  Retrieve the best solution found so far
  $h = h + 1$
 End while
 Optimal solution is obtained
End
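The pseudocode can be turned into a compact, runnable sketch as follows. This is a hedged illustration and not the authors' Java implementation: the function name `taylor_bsa`, all default parameter values, the random producer/scrounger split (with the best bird always a producer), the clipping used as the feasibility check, and the non-negative test objective are assumptions. Foraging uses the Taylor-based update derived in Step 4, and the vigilance weights assume a non-negative error function.

```python
import numpy as np

# Multipliers of F^{h-1} ... F^{h-8} in the Taylor-based foraging update.
TAYLOR = np.array([2.7182, -2.718, 1.359, -0.4518,
                   0.111, -0.0208, 0.00276, -0.0001984])

def taylor_bsa(objective, dim, pop=20, h_max=100, prob=0.8, flight_every=10,
               Z=1.5, T=1.5, w1=1.0, w2=1.0, follow=0.5,
               bounds=(-5.0, 5.0), seed=0):
    """Minimise a non-negative `objective`; returns (best_position, best_error)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    eps = np.finfo(float).eps                      # guards against division by zero
    F = rng.uniform(lo, hi, size=(pop, dim))       # current positions F^h
    past = np.repeat(F[None, :, :], 8, axis=0)     # past[k] = F^{h-1-k}, seeded with F^0
    P = F.copy()                                   # previous best position of each bird
    p_fit = np.array([objective(x) for x in F])    # error at each previous best
    for h in range(1, h_max + 1):
        g = P[p_fit.argmin()]                      # best position shared by the swarm
        new = F.copy()
        if h % flight_every != 0:
            mean, total = F.mean(axis=0), p_fit.sum()
            for i in range(pop):
                if rng.random() < prob:            # foraging (Taylor-based update)
                    s = np.tensordot(TAYLOR, past[:, i, :], axes=1)
                    new[i] = (F[i] + s * rng.random(dim) * (Z + T)
                              + rng.random(dim) * (P[i] * Z + g * T)) / 3.0
                else:                              # vigilance
                    k = (i + rng.integers(1, pop)) % pop   # some other bird, k != i
                    V1 = w1 * np.exp(-p_fit[i] / (total + eps) * pop)
                    V2 = w2 * np.exp((p_fit[i] - p_fit[k])
                                     / (abs(p_fit[k] - p_fit[i]) + eps)
                                     * pop * p_fit[k] / (total + eps))
                    new[i] = (F[i] + V1 * (mean - F[i]) * rng.random(dim)
                              + V2 * (P[k] - F[i]) * rng.uniform(-1, 1, dim))
        else:                                      # flight: producers and scroungers
            producer = rng.random(pop) < 0.5
            producer[p_fit.argmin()] = True        # best bird always produces
            leaders = np.flatnonzero(producer)
            for i in range(pop):
                if producer[i]:
                    new[i] = F[i] + rng.standard_normal(dim) * F[i]
                else:
                    lead = rng.choice(leaders)
                    new[i] = F[i] + (F[lead] - F[i]) * follow * rng.random(dim)
        new = np.clip(new, lo, hi)                 # feasibility check (assumption)
        past = np.concatenate([F[None, :, :], past[:-1]], axis=0)
        F = new
        fit = np.array([objective(x) for x in F])
        better = fit < p_fit                       # keep a solution only if it improves
        P[better], p_fit[better] = F[better], fit[better]
    best = p_fit.argmin()
    return P[best], p_fit[best]

sphere = lambda x: float(np.sum(x ** 2))           # toy error function
best_pos, best_err = taylor_bsa(sphere, dim=3)
print(best_err)
```

In the proposed classifier the objective would be the DBN training error over a weight vector rather than the toy `sphere` function used here.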

3.3.2. Architecture of Deep Belief Network

The DBN [33] is a type of Deep Neural Network (DNN) and comprises layers of Restricted Boltzmann Machines (RBMs) followed by a Multilayer Perceptron (MLP). Each RBM consists of visible and hidden units that are connected by weights. The basic structural design of the DBN is illustrated in Figure 2.

Training of Deep Belief Network

This section elaborates on the training process of the proposed Taylor-BSA–DBN classifier. An RBM is trained by unsupervised learning based on the gradient descent method, whereas the MLP is trained by supervised learning using the standard backpropagation algorithm. Therefore, the training of the DBN is based on a gradient descent–backpropagation algorithm. Here, the most appropriate weights are chosen optimally for the update. The training procedure of the proposed DBN classifier is described below,
  • Training of RBM Layers
    A training sample N is given as the input to the first layer of RBM. It computes the probability distribution of the data and encodes it into the weight parameters. The steps involved in the training process of RBM are illustrated below.
    • The input training sample is read and the weight vector is produced randomly.
    • The probability function of each hidden neuron in the first RBM is calculated.
    • The positive gradient is computed using a visible vector and the probability of the hidden layer.
    • The probability of each visible neuron is obtained by reconstructing the visible layer from the hidden layer.
    • The probability of reconstruction of hidden neurons is obtained by resampling the hidden states.
    • The negative gradient is computed.
    • Weights are updated by subtracting the negative gradient from the positive gradient.
    • Weights are updated for the next iteration, using the steepest or gradient descent algorithm.
    • Energy is calculated for a joint configuration of the neurons in the visible and the hidden layers.
  • Training of MLP
    The training procedure in MLP is based on a backpropagation approach by feeding the training data, which are the hidden output of the second RBM layer through the network. Analyzing the data, the network is adjusted iteratively until the optimal weights are chosen. Moreover, Taylor-BSA is employed to compute the optimal weights, which are determined using the error function. The training procedure is summarized below.
    • Randomly initialize the weights.
    • Read the input sample from the result of the preceding layer.
    • Obtain the average error, based on the difference between the obtained output and the desired output.
    • Calculate the weight updates in the hidden and the visible layers.
    • Obtain the new weights from the hidden and the visible layers by applying gradient descent.
    • Identify the new weights using the updated equation of Taylor-BSA.
    • Estimate the error function using gradient descent and Taylor-BSA.
    • Choose the minimum error and repeat the steps.
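The RBM training steps listed above appear to follow one step of contrastive divergence (CD-1). The NumPy sketch below illustrates that reading; the function names, learning rate, array shapes, and toy input are assumptions for illustration, and the sketch covers neither the MLP fine-tuning nor the Taylor-BSA weight selection.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.1, rng=None):
    """One CD-1 update for a binary RBM.

    v0: (n_samples, n_visible), W: (n_visible, n_hidden).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h0_prob = sigmoid(v0 @ W + b_hid)              # hidden probabilities
    pos_grad = v0.T @ h0_prob                      # positive gradient
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(h0 @ W.T + b_vis)            # reconstructed visible layer
    h1_prob = sigmoid(v1_prob @ W + b_hid)         # resampled hidden probabilities
    neg_grad = v1_prob.T @ h1_prob                 # negative gradient
    n = v0.shape[0]
    W = W + lr * (pos_grad - neg_grad) / n         # positive minus negative gradient
    b_vis = b_vis + lr * (v0 - v1_prob).mean(axis=0)
    b_hid = b_hid + lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_vis, b_hid

def energy(v, h, W, b_vis, b_hid):
    """Energy of a joint visible/hidden configuration."""
    return float(-(v @ b_vis) - (h @ b_hid) - (v @ W @ h))

rng = np.random.default_rng(42)
v0 = rng.integers(0, 2, size=(4, 6)).astype(float)   # toy binary inputs
W0 = 0.01 * rng.standard_normal((6, 3))
W1, bv, bh = cd1_step(v0, W0, np.zeros(6), np.zeros(3), rng=rng)
print(W1.shape)
```

In a stacked DBN, the hidden probabilities of one trained RBM would serve as the visible input of the next layer.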

4. Results and Discussion

This section elaborates on the assessment of the proposed strategy with classical strategies for medical data classification using accuracy, sensitivity, and specificity. The analysis is done by varying training data. In addition, the effectiveness of the proposed Taylor-BSA–DBN is analyzed.

4.1. Experimental Setup

The implementation of the proposed strategy is carried out using Java libraries via Java Archive (JAR) files, utilizing a PC, Windows 10 OS, 2GB RAM, and an Intel i3 core processor. The simulation setup of the proposed system is depicted in Table 1.

4.2. Dataset Description

The experimentation is done using the Cleveland, Hungarian, and Switzerland datasets taken from the University of California Irvine (UCI) machine learning repository [34], which are commonly used for both detection and classification. The Cleveland database is taken from the Cleveland Clinical Foundation and was contributed by David W. Aha. The Hungarian dataset is obtained from the Hungarian Institute of Cardiology. The Switzerland dataset is obtained from the University Hospital, Basel, Switzerland. The dataset comprises 303 instances and 75 attributes, of which 13 attributes are employed for experimentation. Furthermore, the dataset is characterized as multivariate with integer and real attributes. The attributes (features) are resting blood pressure (trestbps), maximum heart rate achieved (thalach), the slope of the peak exercise ST segment (slope), age (age), sex (sex), fasting blood sugar (fbs), ST depression induced by exercise relative to rest (oldpeak), chest pain (cp), serum cholesterol (chol), exercise-induced angina (exang), resting electrocardiographic results (restecg), number of major vessels (0–3) colored by fluoroscopy (ca), and thal (3 = normal; 6 = fixed defect; 7 = reversible defect).

4.3. Evaluation Metrics

The performance of the proposed Taylor-BSA–DBN and the comparative methods is analyzed using three metrics: accuracy, sensitivity, and specificity.

4.3.1. Accuracy

The accuracy is described as the degree of closeness of an estimated value with respect to its original value in optimal medical data classification, and it is represented as,
$$\mathrm{Accuracy} = \frac{T_{p} + T_{n}}{T_{p} + T_{n} + F_{p} + F_{n}}$$
where $T_{p}$ represents true positives, $F_{p}$ indicates false positives, $T_{n}$ indicates true negatives, and $F_{n}$ represents false negatives.

4.3.2. Sensitivity

This measure is described as the ratio of positives that are correctly identified by the classifier, and it is represented as,
$$\mathrm{Sensitivity} = \frac{T_{p}}{T_{p} + F_{n}}$$

4.3.3. Specificity

This measure is defined as the ratio of negatives that are correctly identified by the classifier, and is formulated as,
$$\mathrm{Specificity} = \frac{T_{n}}{T_{n} + F_{p}}$$
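As a small worked example of the three metrics, the sketch below computes them from confusion-matrix counts. The function name and the counts are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)    # share of actual positives correctly identified
    specificity = tn / (tn + fp)    # share of actual negatives correctly identified
    return accuracy, sensitivity, specificity

# Hypothetical counts: 40 true positives, 45 true negatives,
# 5 false positives, 10 false negatives.
acc, sens, spec = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(acc, sens, spec)  # 0.85 0.8 0.9
```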

4.4. Comparative Methods

The methods employed for the analysis include the Support Vector Machine (SVM) [35], Naive Bayes (NB) [36], DBN [33], and the proposed Taylor-BSA–DBN.

4.5. Comparative Analysis

The analysis of the proposed Taylor-BSA–DBN, with the conventional methods, with accuracy, sensitivity, and specificity parameters, is evaluated. The analysis is performed by varying the training data using Cleveland, Hungarian, and Switzerland databases.

4.5.1. Analysis with Cluster Size = 5

The analysis of the methods with cluster size = 5, using the Cleveland, Hungarian, and Switzerland databases, is given below:

Analysis Considering Cleveland Database

Table 2 elaborates the analysis of methods using the Cleveland database for varying training data, in terms of accuracy, sensitivity, and specificity. Higher accuracy, sensitivity, and specificity indicate better performance. Here, the proposed system offers better performance than the existing methods (SVM, NB, and DBN).

Analysis Considering Hungarian Database

Table 3 elaborates the analysis of methods using the Hungarian database, considering training data with accuracy, sensitivity, and specificity parameters. The proposed system offers the best performance when considering 90% of training data.

Analysis Considering Switzerland Database

Table 4 elaborates the analysis of methods using the Switzerland database for varying training data, in terms of accuracy, sensitivity, and specificity. The proposed system attains the best performance, with an accuracy of 0.8462, a sensitivity of 0.8571, and a specificity of 0.8333.

4.5.2. Analysis with Cluster Size = 9

The analysis of the methods with cluster size = 9, using the Cleveland, Hungarian, and Switzerland databases, is given below:

Analysis Considering Cleveland Database

Table 5 presents the accuracy, sensitivity, and specificity of the methods on the Cleveland database for varying percentages of training data. Higher values of accuracy, sensitivity, and specificity indicate better performance. Here, the proposed system outperforms the existing methods (SVM, NB, and DBN) at every training percentage.

Analysis Considering Hungarian Database

Table 6 presents the accuracy, sensitivity, and specificity of the methods on the Hungarian database for varying percentages of training data. The proposed system offers the best performance when 90% of the data is used for training.

Analysis Considering Switzerland Database

Table 7 presents the accuracy, sensitivity, and specificity of the methods on the Switzerland database for varying percentages of training data. With 90% of the data used for training, the proposed system achieves an accuracy of 0.7778, a sensitivity of 0.7857, and a specificity of 0.7692.

4.5.3. Analysis Based on Receiver Operating Characteristic (ROC) Curve

Table 8 presents the comparative analysis based on the ROC curve for the Cleveland, Hungarian, and Switzerland databases. For the Cleveland dataset, when the false positive rate (FPR) is 5, the corresponding true positive rates (TPR) of SVM, NB, DBN, and the proposed Taylor-BSA–DBN are 0.8857, 0.9119, 0.9535, and 0.9684, respectively. For the Hungarian dataset, when the FPR is 4, the proposed method attains the highest TPR, 0.9348, whereas SVM, NB, and DBN attain 0.9030, 0.9130, and 0.9233, respectively. For the Switzerland dataset, when the FPR is 6, the TPRs of SVM, NB, DBN, and the proposed Taylor-BSA–DBN are 0.9105, 0.9443, 0.9569, and 0.9794, respectively.
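For readers unfamiliar with how ROC points are obtained, the sketch below computes (FPR, TPR) pairs by sweeping a decision threshold over classifier scores. This is the standard construction, not necessarily the procedure used to produce Table 8; the function name `roc_points` and the example scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return a list of (FPR, TPR) pairs, one per distinct score threshold.

    `scores` are classifier outputs (higher means more likely positive) and
    `labels` are the true classes (1 = disease, 0 = no disease).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores and labels, for illustration only.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
example_labels = [1, 1, 0, 1, 0, 0]
example_curve = roc_points(example_scores, example_labels)
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing along the returned list and the curve ends at (1, 1).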

4.5.4. Analysis Based on k-Fold

Table 9 presents the comparative analysis based on k-fold validation for the Cleveland, Hungarian, and Switzerland databases with cluster size = 5. The Hungarian dataset yields the maximum accuracy of 0.9021 when k-fold = 8. For k-fold = 7, the sensitivities obtained on the Cleveland dataset by SVM, NB, DBN, and the proposed Taylor-BSA–DBN are 0.8032, 0.8189, 0.8256, and 0.8321, respectively. The proposed Taylor-BSA–DBN offers its maximum accuracy, sensitivity, and specificity when k-fold = 8.
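The following sketch illustrates how sample indices can be partitioned for k-fold validation with the k values listed in Table 9. The article does not say whether the folds are shuffled or stratified, so the contiguous, unshuffled folds, the helper name `k_fold_indices`, and the dataset size of 30 are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more sample
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Fold sizes for a hypothetical dataset of 30 samples and the k values of Table 9.
fold_sizes = {k: [len(test) for _, test in k_fold_indices(30, k)] for k in (5, 6, 7, 8)}
```

Every sample appears in exactly one test fold, and the metrics of the k runs are typically averaged to give one figure per method.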

4.6. Comparative Discussion

Table 10 summarizes the accuracy, sensitivity, and specificity of the methods obtained with varying training data on the Cleveland, Switzerland, and Hungarian databases. With cluster size = 5 on the Cleveland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.871, which is 13.43%, 12.17%, and 11.14% better than SVM, NB, and DBN, respectively. Among the existing methods, the DBN offers the highest sensitivity, 0.771; the proposed method is 12.29% better than the DBN. The proposed method has a maximum specificity of 0.862, an improvement of 12.99%, 12.06%, and 9.40% over SVM, NB, and DBN, respectively. On the Hungarian database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.913, a maximal sensitivity of 0.933, and a maximal specificity of 0.875. On the Switzerland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.846, which is 19.98%, 16.78%, and 15.60% better than SVM, NB, and DBN, respectively. Similarly, the proposed system has a maximum sensitivity of 0.857, an improvement of 19.72%, 19.25%, and 16.69% over SVM, NB, and DBN, respectively. Likewise, the proposed Taylor-BSA–DBN showed a maximal specificity of 0.833.
With cluster size = 9 on the Cleveland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.934, which is 16.92%, 11.13%, and 3.96% better than SVM, NB, and DBN, respectively. Among the existing methods, the DBN offers the highest sensitivity, 0.913; the proposed method is 3.89% better than the DBN. The proposed method has a maximum specificity of 0.903, an improvement of 23.15%, 15.28%, and 3.10% over SVM, NB, and DBN, respectively. On the Hungarian database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.902, a maximal sensitivity of 0.909, and a maximal specificity of 0.893. On the Switzerland database, the proposed Taylor-BSA–DBN showed a maximal accuracy of 0.840, which is 19.17%, 10.12%, and 2.38% better than SVM, NB, and DBN, respectively. Similarly, the proposed system has a maximum sensitivity of 0.846, an improvement of 19.74%, 11.35%, and 1.89% over SVM, NB, and DBN, respectively. Likewise, the proposed Taylor-BSA–DBN showed a maximal specificity of 0.833.
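The article does not state how the improvement percentages are computed. The quoted figures are consistent with the difference between the proposed and the baseline value divided by the proposed value, which is an inference from Table 10 rather than a formula given by the authors. Under that assumption, the sketch below reproduces the Cleveland accuracy gains for cluster size = 5; the function name `improvement_percent` is hypothetical.

```python
def improvement_percent(proposed: float, baseline: float) -> float:
    """Gain of `proposed` over `baseline`, expressed as a percentage of `proposed`."""
    return (proposed - baseline) / proposed * 100


# Accuracy values for cluster size = 5 on the Cleveland database (Table 10).
proposed_accuracy = 0.871
baseline_accuracy = {"SVM": 0.754, "NB": 0.765, "DBN": 0.774}
gains = {name: round(improvement_percent(proposed_accuracy, value), 2)
         for name, value in baseline_accuracy.items()}
```

Note that normalizing by the proposed value gives smaller percentages than the more common convention of dividing by the baseline value.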
Table 11 shows the computational time of the proposed system and of the existing methods (SVM, NB, and DBN); the proposed Taylor-BSA–DBN has the lowest computation time, 6.31 s.
Table 12 shows the statistical analysis of the proposed work and the existing methods based on mean and variance.

5. Conclusions

Contemporary medicine depends on a huge amount of information contained in medical databases. The availability of large amounts of medical data creates a need for effective data analysis tools that extract useful knowledge. This paper proposes a novel, fully automated DBN-based method for heart disease diagnosis using medical data. Sparse FCM is employed to select significant features. It aids the interpretation of the models, as this sparse technique identifies the features that are important for detection, and it can handle high-dimensional data. The selected features are fed to the DBN, which is trained by the proposed Taylor-BSA. The Taylor-BSA is designed by integrating the Taylor series with the BSA in order to find the optimal weights for effective medical data classification. The proposed Taylor-BSA–DBN outperformed the other methods, with a maximal accuracy of 93.4%, a maximal sensitivity of 95%, and a maximal specificity of 90.3%. The proposed method does not classify the type of heart disease. In the future, other medical datasets will be employed to evaluate the efficiency of the proposed method. In addition, the proposed system will be further improved to classify heart diseases such as congenital heart disease, coronary artery disease, and arrhythmia.

Author Contributions

Conceptualization, A.M.A.; methodology, A.M.A.; software, A.M.A.; validation, A.M.A.; resources, A.M.A.; data curation, A.M.A.; writing (original draft preparation), A.M.A. and W.M.N.W.Z.; writing (review and editing), A.M.A. and W.M.N.W.Z.; visualization, A.M.A. and W.M.N.W.Z.; supervision, W.M.N.W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Abdel-Basset, M.; Gamal, A.; Manogaran, G.; Long, H.V. A novel group decision making model based on neutrosophic sets for heart disease diagnosis. Multimed. Tools Appl. 2019, 79, 9977–10002.
  2. Acharjya, D.P. A Hybrid Scheme for Heart Disease Diagnosis Using Rough Set and Cuckoo Search Technique. J. Med. Syst. 2020, 44, 27.
  3. Ahn, G.J.; Hu, H.; Lee, J.; Meng, Y. Representing and Reasoning about Web Access Control Policies. In Proceedings of the IEEE 34th Annual Computer Software and Applications Conference, Seoul, Korea, 19–23 July 2010; pp. 137–146.
  4. Alizadehsani, R.; Habibi, J.; Hosseini, M.J.; Mashayekhi, H.; Boghrati, R.; Ghandeharioun, A.; Bahadorian, B.; Sani, Z.A. A data mining approach for diagnosis of coronary artery disease. Comput. Methods Programs Biomed. 2013, 11, 52–61.
  5. Alzahani, S.M.; Althopity, A.; Alghamdi, A.; Alshehri, B.; Aljuaid, S. An overview of data mining techniques applied for heart disease diagnosis and prediction. Lect. Notes Inf. Theory 2014, 2, 310–315.
  6. Babič, F.; Olejár, J.; Vantová, Z.; Paralič, J. Predictive and descriptive analysis for heart disease diagnosis. In Proceedings of the Federated Conference on Computer Science and Information Systems (FedCSIS), Prague, Czech Republic, 3–6 September 2017; pp. 155–163.
  7. Chang, X.; Wang, Q.; Liu, Y.; Wang, Y. Sparse Regularization in Fuzzy c-Means for High-Dimensional Data Clustering. IEEE Trans. Cybern. 2017, 47, 2616–2627.
  8. Fatima, M.; Pasha, M. Survey of Machine Learning Algorithms for Disease Diagnostic. J. Intell. Learn. Syst. Appl. 2017, 9, 1–16.
  9. Ghumbre, S.; Patil, C.; Ghatol, A. Heart disease diagnosis using support vector machine. In Proceedings of the International Conference on Computer Science and Information Technology (ICCSIT), Mumbai, India, 10–12 June 2011.
  10. Ghumbre, S.U.; Ghatol, A.A. Heart Disease Diagnosis Using Machine Learning Algorithm. In Proceedings of the International Conference on Information Systems Design and Intelligent Applications, Visakhapatnam, India, 5–7 January 2012; Volume 132, pp. 217–225.
  11. Giri, D.; Acharya, U.R. Automated diagnosis of Coronary Artery Disease affected patients using LDA, PCA, ICA and Discrete Wavelet Transform. Knowl. Based Syst. 2013, 37, 274–282.
  12. Heart Disease Data Set. Available online: http://archive.ics.uci.edu/ml/datasets/Heart+Disease (accessed on 22 April 2020).
  13. Jabbar, M.A.; Deekshatulu, B.L.; Chandra, P. Classification of Heart Disease Using K-Nearest Neighbor and Genetic Algorithm. Procedia Technol. 2013, 10, 85–94.
  14. Jabbar, M.A.; Deekshatulu, B.L.; Chandra, P. Heart disease classification using nearest neighbor classifier with feature subset selection. An. Ser. Inform. 2013, 11, 47–54.
  15. Kukar, M.; Kononenko, I.; Groselj, C.; Kralj, K.; Fettich, J. Analysing and Improving the Diagnosis of Ischaemic Heart Disease with Machine Learning. Artif. Intell. Med. 1999, 16, 25–50.
  16. Magesh, G.; Swarnalatha, P. Optimal feature selection through a cluster-based DT learning (CDTL) in heart disease prediction. Evol. Intell. 2020, 1–11.
  17. Mangai, S.A.; Sankar, B.R.; Alagarsamy, K. Taylor Series Prediction of Time Series Data with Error Propagated by Artificial Neural Network. Int. J. Comput. Appl. 2014, 89, 41–47.
  18. Mannepalli, K.; Sastry, P.N.; Suman, M. A novel Adaptive Fractional Deep Belief Networks for speaker emotion recognition. Alex. Eng. J. 2017, 56, 485–497.
  19. Medhekar, D.S.; Bote, M.P.; Deshmukh, S.D. Heart disease prediction system using naive Bayes. Int. J. Enhanc. Res. Sci. Technol. Eng. 2013, 2, 1–5.
  20. Meng, X.; Gao, X.Z.; Lu, L.; Liu, Y.; Zhang, H. A new bio-inspired optimisation algorithm: Bird Swarm Algorithm. J. Exp. Theor. Artif. Intell. 2016, 28, 673–687.
  21. Mohan, S.; Thirumalai, C.; Srivastava, G. Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques. IEEE Access 2019, 7, 81542–81554.
  22. Nilashi, M.; Ahmadi, H.; Manaf, A.A.; Rashid, T.A.; Samad, S.; Shahmoradi, L.; Aljojo, N.; Akbari, E. Coronary Heart Disease Diagnosis Through Self-Organizing Map and Fuzzy Support Vector Machine with Incremental Updates. Int. J. Fuzzy Syst. 2020, 23, 1376–1388.
  23. Nourmohammadi-Khiarak, J.; Feizi-Derakhshi, M.R.; Behrouzi, K.; Mazaheri, S.; Zamani-Harghalani, Y.; Tayebi, R.M. New hybrid method for heart disease diagnosis utilizing optimization algorithm in feature selection. Health Technol. 2019, 10, 667–678.
  24. Oyyathevan, S.; Askarunisa, A. An expert system for heart disease prediction using data mining technique: Neural network. Int. J. Eng. Res. Sports Sci. 2014, 1, 1–6.
  25. Palaniappan, S.; Awang, R. Intelligent heart disease prediction system using data mining techniques. In Proceedings of the International Conference on Computer Systems and Applications, Doha, Qatar, 31 March–4 April 2008.
  26. Palaniappan, S.; Awang, R. Intelligent Heart Disease Prediction System Using Data Mining Techniques. Int. J. Comput. Sci. Netw. Secur. 2008, 8, 343–350.
  27. Patil, S.B.; Kumaraswamy, Y.S. Extraction of significant patterns from heart disease warehouses for heart attack prediction. Int. J. Comput. Sci. Netw. Secur. 2009, 9, 228–235.
  28. Pattekari, S.A.; Parveen, A. Prediction system for Heart Disease using Naive Bayes. Int. J. Adv. Comput. Math. Sci. 2012, 3, 290–294.
  29. Ranganatha, S.; Raj, H.P.; Anusha, C.; Vinay, S.K. Medical data mining and analysis for heart disease dataset using classification techniques. In Proceedings of the National Conference on Challenges in Research & Technology in the Coming Decades (CRT), Ujire, India, 27–28 September 2013.
  30. Reddy, G.T.; Reddy, M.P.K.; Lakshmanna, K.; Rajput, D.S.; Kaluri, R.; Srivastava, G. Hybrid genetic algorithm and a fuzzy logic classifier for heart disease diagnosis. Evol. Intell. 2019, 13, 185–196.
  31. Safdar, S.; Zafar, S.; Zafar, N.; Khan, N. Machine learning based decision support systems (DSS) for heart disease diagnosis: A review. Artif. Intell. Rev. 2018, 50, 597–623.
  32. Shah, S.M.S.; Shah, F.A.; Hussain, S.A.; Batool, S. Support Vector Machines-based Heart Disease Diagnosis using Feature Subset, Wrapping Selection and Extraction Methods. Comput. Electr. Eng. 2020, 84, 106628.
  33. Shouman, M.; Turner, T.; Stocker, R. Using data mining techniques in heart disease diagnosis and treatment. In Proceedings of the IEEE Japan-Egypt Conference on Electronics, Communications and Computers, Alexandria, Egypt, 6–9 March 2012; pp. 173–177.
  34. Subbalakshmi, G. Decision support in heart disease prediction system using naive bayes. Indian J. Comput. Sci. Eng. 2011, 2, 170–174.
  35. Thiyagaraj, M.; Suseendran, G. Enhanced Prediction of Heart Disease Using Particle Swarm Optimization and Rough Sets with Transductive Support Vector Machines Classifier. In Data Management, Analytics and Innovation; Springer: Singapore, 2020; Volume 2, pp. 141–152.
  36. Yeh, D.Y.; Cheng, C.H.; Chen, Y.W. A predictive model for cerebrovascular disease using data mining. Expert Syst. Appl. 2011, 38, 8970–8977.
Figure 1. Schematic view of the proposed Taylor-based bird swarm algorithm (Taylor-BSA)–deep belief network (DBN) for heart disease diagnosis.
Figure 2. Architectural view diagram of DBN classifier.
Table 1. Simulation setup.

| Parameter | Value |
| --- | --- |
| Number of input layers | 2 |
| Number of hidden layers | 2 |
| Number of output layers | 1 |
| Cluster size | 5 to 9 |
| Number of selected features in Cleveland dataset | 123 |
| Number of selected features in Hungarian dataset | 139 |
| Number of selected features in Switzerland dataset | 139 |
| Learning rate | 0.1 |
Table 2. Analysis of methods with cluster size = 5 using the Cleveland database. Abbreviations: SVM, Support Vector Machine; NB, Naive Bayes; DBN, deep belief network.

| Metric | Training percentage | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Accuracy | 50 | 0.7590 | 0.7603 | 0.7874 | 0.8625 |
| | 60 | 0.7143 | 0.7682 | 0.7851 | 0.8632 |
| | 70 | 0.7460 | 0.7627 | 0.8122 | 0.8531 |
| | 80 | 0.7236 | 0.7619 | 0.7869 | 0.8644 |
| | 90 | 0.7538 | 0.7647 | 0.7742 | 0.8710 |
| Sensitivity | 50 | 0.7535 | 0.7613 | 0.7908 | 0.8693 |
| | 60 | 0.7120 | 0.7611 | 0.7886 | 0.8699 |
| | 70 | 0.7473 | 0.7558 | 0.8172 | 0.8602 |
| | 80 | 0.7167 | 0.7656 | 0.7903 | 0.8710 |
| | 90 | 0.7576 | 0.7667 | 0.7714 | 0.8788 |
| Specificity | 50 | 0.7566 | 0.7667 | 0.7838 | 0.8551 |
| | 60 | 0.7165 | 0.7750 | 0.7815 | 0.8559 |
| | 70 | 0.7447 | 0.7692 | 0.8068 | 0.8452 |
| | 80 | 0.7302 | 0.7581 | 0.7833 | 0.8571 |
| | 90 | 0.7500 | 0.7576 | 0.7813 | 0.8621 |
Table 3. Analysis of methods with cluster size = 5 using the Hungarian database.

| Metric | Training percentage | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Accuracy | 50 | 0.7810 | 0.8043 | 0.8428 | 0.9200 |
| | 60 | 0.7906 | 0.7976 | 0.8595 | 0.8907 |
| | 70 | 0.7674 | 0.8143 | 0.8182 | 0.8551 |
| | 80 | 0.7576 | 0.8095 | 0.8710 | 0.8710 |
| | 90 | 0.6957 | 0.7500 | 0.7647 | 0.9130 |
| Sensitivity | 50 | 0.8160 | 0.8456 | 0.8776 | 0.9388 |
| | 60 | 0.8300 | 0.8347 | 0.8908 | 0.9160 |
| | 70 | 0.8052 | 0.8539 | 0.8571 | 0.8876 |
| | 80 | 0.8065 | 0.8400 | 0.8795 | 0.9000 |
| | 90 | 0.7500 | 0.8000 | 0.8125 | 0.9333 |
| Specificity | 50 | 0.7294 | 0.7326 | 0.7805 | 0.8846 |
| | 60 | 0.7143 | 0.7500 | 0.8030 | 0.8438 |
| | 70 | 0.7115 | 0.7451 | 0.7500 | 0.7959 |
| | 80 | 0.6757 | 0.7647 | 0.8182 | 0.8182 |
| | 90 | 0.6111 | 0.6667 | 0.6842 | 0.8750 |
Table 4. Analysis of methods with cluster size = 5 using the Switzerland database.

| Metric | Training percentage | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Accuracy | 50 | 0.7619 | 0.7710 | 0.7895 | 0.8644 |
| | 60 | 0.7009 | 0.7374 | 0.7800 | 0.8557 |
| | 70 | 0.7073 | 0.7568 | 0.7895 | 0.8904 |
| | 80 | 0.7551 | 0.7647 | 0.7778 | 0.8400 |
| | 90 | 0.6774 | 0.7037 | 0.7143 | 0.8462 |
| Sensitivity | 50 | 0.7656 | 0.7742 | 0.7818 | 0.8710 |
| | 60 | 0.6981 | 0.7292 | 0.7843 | 0.8627 |
| | 70 | 0.7073 | 0.7500 | 0.7948 | 0.8974 |
| | 80 | 0.7500 | 0.7692 | 0.7857 | 0.8462 |
| | 90 | 0.6875 | 0.6923 | 0.7143 | 0.8571 |
| Specificity | 50 | 0.7581 | 0.7667 | 0.7966 | 0.8571 |
| | 60 | 0.7037 | 0.7451 | 0.7755 | 0.8478 |
| | 70 | 0.7073 | 0.7632 | 0.7838 | 0.8823 |
| | 80 | 0.7600 | 0.7600 | 0.7692 | 0.8333 |
| | 90 | 0.6667 | 0.7143 | 0.7143 | 0.8333 |
Table 5. Analysis of methods with cluster size = 9 using the Cleveland database.

| Metric | Training percentage | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Accuracy | 50 | 0.7590 | 0.7603 | 0.7993 | 0.8690 |
| | 60 | 0.7166 | 0.7430 | 0.7851 | 0.8632 |
| | 70 | 0.7354 | 0.7363 | 0.8123 | 0.8857 |
| | 80 | 0.7419 | 0.7607 | 0.7619 | 0.8475 |
| | 90 | 0.7460 | 0.7910 | 0.8710 | 0.9016 |
| Sensitivity | 50 | 0.7535 | 0.7613 | 0.8039 | 0.8758 |
| | 60 | 0.7107 | 0.7440 | 0.7886 | 0.8699 |
| | 70 | 0.7303 | 0.7368 | 0.8172 | 0.8925 |
| | 80 | 0.7419 | 0.7544 | 0.7656 | 0.8548 |
| | 90 | 0.7419 | 0.8000 | 0.8788 | 0.9091 |
| Specificity | 50 | 0.7566 | 0.7667 | 0.7945 | 0.8613 |
| | 60 | 0.7222 | 0.7419 | 0.7815 | 0.8559 |
| | 70 | 0.7340 | 0.7419 | 0.8068 | 0.8780 |
| | 80 | 0.7419 | 0.7581 | 0.7667 | 0.8393 |
| | 90 | 0.7500 | 0.7813 | 0.8621 | 0.8929 |
Table 6. Analysis of methods with cluster size = 9 using the Hungarian database.

| Metric | Training percentage | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Accuracy | 50 | 0.7500 | 0.7957 | 0.8515 | 0.9200 |
| | 60 | 0.7513 | 0.7870 | 0.8075 | 0.8907 |
| | 70 | 0.7777 | 0.8273 | 0.8500 | 0.9118 |
| | 80 | 0.7755 | 0.8298 | 0.8974 | 0.9341 |
| | 90 | 0.7674 | 0.8000 | 0.8298 | 0.8696 |
| Sensitivity | 50 | 0.7907 | 0.8389 | 0.8844 | 0.9388 |
| | 60 | 0.8017 | 0.8218 | 0.8487 | 0.9160 |
| | 70 | 0.8242 | 0.8652 | 0.8732 | 0.9326 |
| | 80 | 0.8226 | 0.8667 | 0.9130 | 0.9500 |
| | 90 | 0.8077 | 0.8438 | 0.8667 | 0.9000 |
| Specificity | 50 | 0.6897 | 0.7209 | 0.7927 | 0.8846 |
| | 60 | 0.6667 | 0.7353 | 0.7353 | 0.8438 |
| | 70 | 0.6981 | 0.7600 | 0.8163 | 0.8723 |
| | 80 | 0.6944 | 0.7647 | 0.8750 | 0.9032 |
| | 90 | 0.7059 | 0.7222 | 0.7647 | 0.8125 |
Table 7. Analysis of methods with cluster size = 9 using the Switzerland database.

| Metric | Training percentage | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Accuracy | 50 | 0.7460 | 0.7479 | 0.8017 | 0.8644 |
| | 60 | 0.7170 | 0.7624 | 0.7684 | 0.8947 |
| | 70 | 0.7368 | 0.7500 | 0.7662 | 0.8904 |
| | 80 | 0.6786 | 0.7551 | 0.8200 | 0.8400 |
| | 90 | 0.7333 | 0.7600 | 0.7679 | 0.7778 |
| Sensitivity | 50 | 0.7414 | 0.7500 | 0.8065 | 0.8710 |
| | 60 | 0.7170 | 0.7609 | 0.7647 | 0.9020 |
| | 70 | 0.7297 | 0.7561 | 0.7692 | 0.8974 |
| | 80 | 0.6786 | 0.7500 | 0.8300 | 0.8462 |
| | 90 | 0.7300 | 0.7500 | 0.7667 | 0.7857 |
| Specificity | 50 | 0.7419 | 0.7541 | 0.7966 | 0.8571 |
| | 60 | 0.7170 | 0.7600 | 0.7755 | 0.8864 |
| | 70 | 0.7436 | 0.7436 | 0.7632 | 0.8824 |
| | 80 | 0.6786 | 0.7600 | 0.8200 | 0.8333 |
| | 90 | 0.7143 | 0.7300 | 0.7556 | 0.7692 |
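Table 8. Analysis based on ROC. Method columns give the TPR at each FPR.

| Database | FPR | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- |
| Cleveland | 1 | 0 | 0 | 0 | 0 |
| | 2 | 0.7913 | 0.7949 | 0.8429 | 0.8761 |
| | 3 | 0.7961 | 0.8330 | 0.8523 | 0.8798 |
| | 4 | 0.8462 | 0.8753 | 0.9149 | 0.9284 |
| | 5 | 0.8857 | 0.9119 | 0.9535 | 0.9684 |
| | 6 | 0.9153 | 0.9569 | 0.9788 | 0.9847 |
| | 7 | 0.9710 | 0.9783 | 0.9895 | 0.9975 |
| | 8 | 0.9952 | 0.9989 | 1 | 1 |
| | 9 | 1 | 1 | 1 | 1 |
| | 10 | 1 | 1 | 1 | 1 |
| Hungarian | 1 | 0 | 0 | 0 | 0 |
| | 2 | 0.8233 | 0.8410 | 0.8553 | 0.8941 |
| | 3 | 0.8286 | 0.8647 | 0.8734 | 0.8953 |
| | 4 | 0.9030 | 0.9130 | 0.9233 | 0.9348 |
| | 5 | 0.9246 | 0.9417 | 0.9596 | 0.9789 |
| | 6 | 0.9521 | 0.9697 | 0.9803 | 0.9999 |
| | 7 | 0.9793 | 0.9800 | 0.9946 | 1 |
| | 8 | 0.9981 | 0.9985 | 1 | 1 |
| | 9 | 1 | 1 | 1 | 1 |
| | 10 | 1 | 1 | 1 | 1 |
| Switzerland | 1 | 0 | 0 | 0 | 0 |
| | 2 | 0.7593 | 0.7682 | 0.8024 | 0.8258 |
| | 3 | 0.7620 | 0.7923 | 0.8399 | 0.8682 |
| | 4 | 0.8452 | 0.8735 | 0.8781 | 0.9101 |
| | 5 | 0.8725 | 0.9184 | 0.9194 | 0.9564 |
| | 6 | 0.9105 | 0.9443 | 0.9569 | 0.9794 |
| | 7 | 0.9701 | 0.9722 | 0.9865 | 0.9924 |
| | 8 | 0.9946 | 0.9953 | 0.9994 | 1 |
| | 9 | 1 | 1 | 1 | 1 |
| | 10 | 1 | 1 | 1 | 1 |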
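Table 9. Analysis based on k-fold.

| Database | Metric | k-Fold | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- | --- |
| Cleveland | Accuracy | 5 | 0.7021 | 0.7088 | 0.7126 | 0.7239 |
| | | 6 | 0.7122 | 0.7189 | 0.7245 | 0.7365 |
| | | 7 | 0.7345 | 0.7543 | 0.7634 | 0.7843 |
| | | 8 | 0.7528 | 0.7843 | 0.7965 | 0.8132 |
| | Sensitivity | 5 | 0.7567 | 0.7678 | 0.7898 | 0.7956 |
| | | 6 | 0.7834 | 0.8045 | 0.8156 | 0.8232 |
| | | 7 | 0.8032 | 0.8189 | 0.8256 | 0.8321 |
| | | 8 | 0.8145 | 0.8229 | 0.8365 | 0.8448 |
| | Specificity | 5 | 0.7586 | 0.7656 | 0.7699 | 0.7865 |
| | | 6 | 0.7854 | 0.7745 | 0.7965 | 0.8043 |
| | | 7 | 0.7940 | 0.8088 | 0.8124 | 0.8227 |
| | | 8 | 0.8021 | 0.8178 | 0.8249 | 0.8339 |
| Hungarian | Accuracy | 5 | 0.7960 | 0.8758 | 0.8791 | 0.8822 |
| | | 6 | 0.8071 | 0.8413 | 0.8838 | 0.8854 |
| | | 7 | 0.7985 | 0.8030 | 0.8324 | 0.8917 |
| | | 8 | 0.7982 | 0.8626 | 0.8948 | 0.9021 |
| | Sensitivity | 5 | 0.7959 | 0.8027 | 0.8197 | 0.8231 |
| | | 6 | 0.7857 | 0.7891 | 0.7925 | 0.8367 |
| | | 7 | 0.7891 | 0.7925 | 0.7959 | 0.8393 |
| | | 8 | 0.7857 | 0.7925 | 0.8061 | 0.8458 |
| | Specificity | 5 | 0.7494 | 0.7513 | 0.7645 | 0.7656 |
| | | 6 | 0.7146 | 0.7343 | 0.7645 | 0.7760 |
| | | 7 | 0.7433 | 0.7535 | 0.7645 | 0.7719 |
| | | 8 | 0.7530 | 0.7645 | 0.7697 | 0.7873 |
| Switzerland | Accuracy | 5 | 0.7528 | 0.7789 | 0.7799 | 0.7896 |
| | | 6 | 0.7658 | 0.7828 | 0.7925 | 0.8012 |
| | | 7 | 0.7712 | 0.7958 | 0.8028 | 0.8156 |
| | | 8 | 0.7828 | 0.8128 | 0.8159 | 0.8259 |
| | Sensitivity | 5 | 0.7428 | 0.7689 | 0.7847 | 0.7956 |
| | | 6 | 0.7625 | 0.7758 | 0.7858 | 0.8028 |
| | | 7 | 0.7748 | 0.7896 | 0.7952 | 0.8125 |
| | | 8 | 0.7828 | 0.7986 | 0.8078 | 0.8225 |
| | Specificity | 5 | 0.7658 | 0.7589 | 0.7750 | 0.7896 |
| | | 6 | 0.7758 | 0.7832 | 0.7962 | 0.8020 |
| | | 7 | 0.7841 | 0.7911 | 0.8025 | 0.8196 |
| | | 8 | 0.7958 | 0.8002 | 0.8178 | 0.8219 |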
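Table 10. Comparative analysis.

| Cluster Size | Database | Metrics | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- | --- | --- |
| Cluster size = 5 | Cleveland | Accuracy | 0.754 | 0.765 | 0.774 | 0.871 |
| | | Sensitivity | 0.758 | 0.767 | 0.771 | 0.879 |
| | | Specificity | 0.750 | 0.758 | 0.781 | 0.862 |
| | Hungarian | Accuracy | 0.696 | 0.750 | 0.765 | 0.913 |
| | | Sensitivity | 0.750 | 0.800 | 0.813 | 0.933 |
| | | Specificity | 0.611 | 0.667 | 0.684 | 0.875 |
| | Switzerland | Accuracy | 0.677 | 0.704 | 0.714 | 0.846 |
| | | Sensitivity | 0.688 | 0.692 | 0.714 | 0.857 |
| | | Specificity | 0.667 | 0.714 | 0.714 | 0.833 |
| Cluster size = 9 | Cleveland | Accuracy | 0.776 | 0.830 | 0.897 | 0.934 |
| | | Sensitivity | 0.823 | 0.867 | 0.913 | 0.950 |
| | | Specificity | 0.694 | 0.765 | 0.875 | 0.903 |
| | Hungarian | Accuracy | 0.746 | 0.791 | 0.871 | 0.902 |
| | | Sensitivity | 0.742 | 0.800 | 0.879 | 0.909 |
| | | Specificity | 0.750 | 0.781 | 0.862 | 0.893 |
| | Switzerland | Accuracy | 0.679 | 0.755 | 0.820 | 0.840 |
| | | Sensitivity | 0.679 | 0.750 | 0.830 | 0.846 |
| | | Specificity | 0.679 | 0.760 | 0.820 | 0.833 |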
Table 11. Computational Time.

| Methods | SVM | NB | DBN | Proposed Taylor-BSA–DBN |
| --- | --- | --- | --- | --- |
| Time (s) | 10.08 | 8.79 | 7.56 | 6.31 |
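Table 12. Statistical Analysis.

| Cluster Size | Dataset | Methods | Accuracy | Mean | Variance | Sensitivity | Mean | Variance | Specificity | Mean | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cluster size = 5 | Cleveland | SVM | 0.754 | 0.752 | 0.002 | 0.758 | 0.754 | 0.004 | 0.750 | 0.748 | 0.002 |
| | | NB | 0.765 | 0.761 | 0.004 | 0.767 | 0.765 | 0.002 | 0.758 | 0.754 | 0.004 |
| | | DBN | 0.774 | 0.771 | 0.003 | 0.771 | 0.768 | 0.003 | 0.781 | 0.779 | 0.002 |
| | | Proposed Method | 0.871 | 0.869 | 0.002 | 0.879 | 0.878 | 0.001 | 0.862 | 0.860 | 0.002 |
| | Hungarian | SVM | 0.696 | 0.693 | 0.003 | 0.750 | 0.748 | 0.002 | 0.611 | 0.608 | 0.003 |
| | | NB | 0.750 | 0.746 | 0.004 | 0.800 | 0.799 | 0.001 | 0.667 | 0.665 | 0.002 |
| | | DBN | 0.765 | 0.763 | 0.002 | 0.813 | 0.810 | 0.003 | 0.684 | 0.682 | 0.002 |
| | | Proposed Method | 0.913 | 0.911 | 0.002 | 0.933 | 0.932 | 0.001 | 0.875 | 0.873 | 0.002 |
| | Switzerland | SVM | 0.677 | 0.675 | 0.002 | 0.688 | 0.684 | 0.004 | 0.667 | 0.665 | 0.002 |
| | | NB | 0.704 | 0.702 | 0.003 | 0.692 | 0.690 | 0.002 | 0.714 | 0.711 | 0.003 |
| | | DBN | 0.714 | 0.711 | 0.003 | 0.714 | 0.713 | 0.001 | 0.714 | 0.712 | 0.002 |
| | | Proposed Method | 0.846 | 0.844 | 0.002 | 0.857 | 0.855 | 0.002 | 0.833 | 0.831 | 0.002 |
| Cluster size = 9 | Cleveland | SVM | 0.776 | 0.773 | 0.003 | 0.823 | 0.822 | 0.001 | 0.694 | 0.691 | 0.003 |
| | | NB | 0.830 | 0.826 | 0.004 | 0.867 | 0.865 | 0.002 | 0.765 | 0.761 | 0.004 |
| | | DBN | 0.897 | 0.895 | 0.002 | 0.913 | 0.911 | 0.002 | 0.875 | 0.873 | 0.002 |
| | | Proposed Method | 0.934 | 0.932 | 0.002 | 0.950 | 0.948 | 0.002 | 0.903 | 0.901 | 0.002 |
| | Hungarian | SVM | 0.746 | 0.743 | 0.003 | 0.742 | 0.740 | 0.002 | 0.750 | 0.748 | 0.002 |
| | | NB | 0.791 | 0.790 | 0.001 | 0.800 | 0.797 | 0.003 | 0.781 | 0.780 | 0.001 |
| | | DBN | 0.871 | 0.868 | 0.003 | 0.879 | 0.878 | 0.001 | 0.862 | 0.860 | 0.002 |
| | | Proposed Method | 0.902 | 0.900 | 0.002 | 0.909 | 0.907 | 0.002 | 0.893 | 0.891 | 0.002 |
| | Switzerland | SVM | 0.679 | 0.677 | 0.002 | 0.679 | 0.675 | 0.004 | 0.679 | 0.677 | 0.002 |
| | | NB | 0.755 | 0.752 | 0.003 | 0.750 | 0.748 | 0.002 | 0.760 | 0.758 | 0.002 |
| | | DBN | 0.820 | 0.818 | 0.002 | 0.830 | 0.827 | 0.003 | 0.820 | 0.819 | 0.001 |
| | | Proposed Method | 0.840 | 0.838 | 0.002 | 0.846 | 0.844 | 0.002 | 0.833 | 0.832 | 0.001 |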

Share and Cite

MDPI and ACS Style

Alhassan, A.M.; Wan Zainon, W.M.N. Taylor Bird Swarm Algorithm Based on Deep Belief Network for Heart Disease Diagnosis. Appl. Sci. 2020, 10, 6626. https://doi.org/10.3390/app10186626
