EFFICIENT HEART DISEASE DIAGNOSIS BASED ON TWIN SUPPORT VECTOR MACHINE

Heart disease is the leading cause of death in the world according to the World Health Organization (WHO). Researchers are increasingly interested in using machine learning techniques to help medical staff diagnose or detect heart disease early. In this paper, we propose an efficient medical decision support system based on twin support vector machines (Twin-SVM) for heart disease diagnosis with a binary target (i.e. presence or absence of disease). Unlike the conventional support vector machine (SVM), which finds only one optimal hyperplane for separating the data points of the first class from those of the second class and can therefore produce inaccurate decisions, Twin-SVM finds two non-parallel hyper-planes so that each one is close to one class and as far as possible from the other. Our experiments are conducted on a real heart disease dataset, and several evaluation metrics are considered to assess the performance of the proposed method. Furthermore, the proposed method is compared with several well-known classifiers as well as state-of-the-art methods. The obtained results show that the proposed Twin-SVM-based method outperforms the state-of-the-art. This improvement can reduce time, materials, and labor in healthcare services while increasing the accuracy of the final decision.


INTRODUCTION
Heart disease is one of the main causes of disability and premature death worldwide. According to the World Health Organization, about 17.9 million deaths occurred worldwide due to heart disease in 2016 [1]. However, some key factors help reduce the risk of heart disease, such as controlled blood pressure and lower cholesterol [2]. Diagnosing heart disease is therefore a delicate, risky, and very important task [3]. If done properly, it can be used by medical staff to save lives. This process can be realized by exploring registered patient data. Usually, existing healthcare systems use electronic health records to store those data [4]. Advances in computer and information technologies make it possible to use this routine data to make critical medical decisions [5].
Machine learning (ML), which is part of artificial intelligence, is the study of algorithms and statistical techniques that build a mathematical model from sample data in order to make decisions or diagnoses without being explicitly programmed to do so. Many researchers have worked on heart disease prediction and diagnosis using ML approaches in order to achieve an accurate diagnosis. In [6], several data mining classifiers such as Naïve Bayes, Decision Tree, Rule-based and Artificial Neural Network have been examined on different healthcare data, including heart disease prediction. Also, Shouman et al. [7] have applied a range of Decision Tree techniques to obtain better performance in heart disease diagnosis. Chaurasia and Pal [8] have explored the WEKA data mining tool for heart disease detection. This tool includes several machine learning algorithms for mining purposes, such as bagging, Naïve Bayes, and J48. Bagging provided better classification results than the other techniques.
The authors in [9] have considered two systems based on Artificial Neural Network (ANN) and Neuro-Fuzzy approaches in order to develop an automatic heart disease diagnosis system. Xiong et al. [10] have developed the RhythmNet system for the classification of heart disease from a single-lead electrocardiogram (ECG). This system uses a residual convolutional recurrent neural network as the classifier.
Furthermore, the authors in [11] have developed heart sound classification using a combination of a convolutional neural network (CNN) and majority voting for cardiovascular disease prediction. According to Amine et al. [12], the prediction accuracy for cardiovascular disease can be significantly improved by combining different features and classification techniques. The best-performing classifier was obtained using the vote technique, which combines Naïve Bayes and Logistic Regression. Besides, Padmanabhan and his colleagues [13] have proposed an Auto Machine Learning (AutoML) approach to evaluate cardiovascular disease diagnosis. The performance evaluation of their system was conducted using the Auto-Sklearn library.
DIAGNOSTYKA, Vol. 22, No. 3 (2021) Brik Y, Djerioui M, Attallah B.: Efficient heart disease diagnosis based on twin support vector machine
The authors in [14] have presented a one-dimensional deep CNN to classify multiple heart diseases, where a modified ECG signal is used as the input signal. In [15], an automated diagnostic system for the prediction of heart disease has been proposed. This system uses a statistical model for feature refinement and a Deep Neural Network (DNN) for classification. Sellami et al. [16] have presented a deep CNN based on state-of-the-art deep learning techniques for accurate heartbeat classification using ECG signals.
However, deep learning techniques are often time-consuming and costly in terms of parameter optimization. Furthermore, these techniques require a lot of structured data [5], [11]. These drawbacks are alleviated by the SVM technique, which is chosen for three reasons. First, it is fast in the learning phase and performs well [17]. Second, thanks to structural risk minimization theory, SVM effectively handles the local minimum and high dimensionality problems. Third, SVM is a very powerful tool for solving binary problems [18].
In fact, SVM classifies binary data by finding the optimal hyper-plane that separates the data points of the first class from those of the second class. In clinical decision support systems, SVM has attracted much attention, especially in heart disease diagnosis. In Tan et al. [19], SVM has been combined with a Genetic Algorithm using a wrapper approach to classify five heart disease data sets. In [20], SVM and several other machine learning techniques, such as Bayesian Network, Decision Tree, Artificial Neural Network, and Fuzzy pattern tree, have been used to classify the Cleveland heart disease data set using 10-fold cross validation. SVM achieved the highest prediction accuracy among these classifiers. Otoom et al. [21] have presented a system for coronary artery disease detection and monitoring in which three machine learning techniques are applied: Bayes Net, SVM, and Functional Trees. The authors used the WEKA tool for feature selection and detection. SVM provided the best accuracy, with 85.1%.
However, SVM faces considerable difficulties when dealing with complex, unstructured data that is not linearly separable, where a single hyper-plane cannot efficiently maximize the margin between the classes [22]. Furthermore, SVM is very sensitive to noisy data, which makes it prone to over-fitting [23], [24]. In order to remedy these drawbacks, Khemchandani and Chandra [25] have proposed a new SVM variant called Twin Support Vector Machine (Twin-SVM) for binary classification. Unlike traditional SVM, Twin-SVM finds two non-parallel hyper-planes, such that each one is close to one class and as far as possible from the other. Therefore, Twin-SVM provides lower computational complexity and better generalization ability compared to conventional SVM [26]. All of these advantages make Twin-SVM well suited to a heart disease diagnosis system that contains patient records with a binary target, i.e. the presence or absence of heart disease. Furthermore, SVM can be trained efficiently for heart disease diagnosis without optimizing a large number of hyper-parameters [18], [27].
In this paper, we propose a Twin-SVM-based method for heart disease diagnosis, in which a better diagnosis is obtained by using two non-parallel hyper-planes. Our work focuses on the following points:
− setting up a system architecture based on Twin-SVM in order to make decisions adapted to heart disease diagnosis;
− a comparative study between Twin-SVM and other SVM variants;
− a comparison of Twin-SVM results against those of many well-known classifiers such as Multilayer Perceptron (MLP), Logistic Regression, Decision Trees, Random Forest, and K-Nearest Neighbors (KNN);
− a comparison of our proposed method with the state-of-the-art methods that used the same datasets, the same experimental protocol, and the same performance measurements.
The remainder of this paper is organized as follows: Section 2 presents the heart disease diagnosis system and gives an overview of the conventional SVM and Twin-SVM. In Section 3, we describe the dataset used and the evaluation metrics, and discuss the obtained results. In Section 4, we compare the proposed method with some well-known classifiers for diagnosis. Section 5 reports the comparison of the proposed method with the state-of-the-art techniques that used the same heart disease dataset in evaluation. Finally, the conclusions drawn from this work are presented in Section 6.

PROPOSED METHOD
The present work was conducted on a real heart disease dataset using machine learning techniques. The flowchart given in Fig. 1 presents the proposed heart disease diagnosis system. Next, we focus on the theoretical background of SVM and Twin-SVM.

Support Vector Machine
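The SVM method proposed by Vapnik has been studied extensively for classification and regression [17], [18]. The SVM algorithm was developed for prediction by using an $\varepsilon$-insensitive loss function. The goal of SVM is to identify a function $f(x)$ that, for all training patterns, deviates from the target values by at most $\varepsilon$ and has a maximum margin [24]. The estimating function is taken in the form:

$$f(x) = w \cdot \phi(x) + b, \tag{1}$$

where $w$ and $b$ are the coefficients that have to be estimated from data and $\phi(x)$ is the non-linear function in feature space. The objective is to find the values of $w$ and $b$ such that $f(x)$ can be determined by minimizing the following cost function:

$$R(C) = C \sum_{i=1}^{N} L_\varepsilon\big(y_i, f(x_i)\big) + \frac{1}{2}\|w\|^2, \tag{2}$$

where $L_\varepsilon$ is the $\varepsilon$-insensitive loss function defined as:

$$L_\varepsilon\big(y, f(x)\big) = \begin{cases} |y - f(x)| - \varepsilon, & |y - f(x)| \ge \varepsilon \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$

After introducing slack variables, the risk function can be expressed in the following constrained form:

$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \tag{4}$$

subject to:

$$y_i - w \cdot \phi(x_i) - b \le \varepsilon + \xi_i, \qquad w \cdot \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*,$$

where $\xi_i, \xi_i^* \ge 0$.
Solving the above problem (4) using the primal-dual method leads to the following dual problem:

$$\max_{\alpha, \alpha^*} \; \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^*) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j)$$

subject to

$$\sum_{i=1}^{N} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C,$$

where $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers that act as forces pushing the predictions towards the target value $y_i$. The computation in input space can be performed using a kernel function in feature space as follows:

$$K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j).$$

Note that any function that satisfies Mercer's theorem [28] can be used as a kernel function. The parameters $C$ and $\varepsilon$ are user-defined: $C$ controls the smoothness of the approximating function and $\varepsilon$ determines the margin within which the error is tolerated. Finally, the estimating function can be expressed as:

$$f(x) = \sum_{i=1}^{N_{sv}} (\alpha_i - \alpha_i^*) K(x, x_i) + b,$$

where $N_{sv}$ is the number of support vectors.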
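As a minimal illustration of the ε-insensitive loss described above, the following Python sketch evaluates it for a few residuals. The function name `eps_insensitive_loss`, the default `eps=0.1`, and the sample values are assumptions chosen for illustration only; they are not taken from the experiments reported in this paper.

```python
def eps_insensitive_loss(target, prediction, eps=0.1):
    """Return |target - prediction| - eps when the error exceeds eps, else 0."""
    error = abs(target - prediction)
    return max(0.0, error - eps)


# Errors inside the eps-tube cost nothing; larger errors are penalised linearly.
# The (target, prediction) pairs below are made-up illustrative values.
for target, prediction in [(1.0, 1.05), (1.0, 1.5), (0.0, -0.3)]:
    print(target, prediction, eps_insensitive_loss(target, prediction))
```

Widening `eps` tolerates larger deviations before any penalty is incurred, which is the role of ε in the margin described above.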
We present in Tab. I five well-known types of SVM kernels that we use in this work including linear, polynomial, radial basis function (RBF), sigmoid (hyperbolic tangent), and Laplace kernels.

Tab. I. SVM kernels used in this work (columns: Kernel name, Mathematical function).
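For readers who want to experiment with the five kernel families named above, the sketch below gives one common textbook form of each in plain Python. The exact parameterizations (`degree`, `coef0`, `sigma`, `scale`) and their default values are assumptions; the forms and parameter values used in this work may differ.

```python
import math


def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    # (x . y + coef0) ** degree; degree and coef0 are assumed hyper-parameters.
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 * sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, scale=1.0, coef0=0.0):
    # tanh(scale * x . y + coef0), the hyperbolic tangent kernel.
    return math.tanh(scale * dot(x, y) + coef0)


def laplace_kernel(x, y, sigma=1.0):
    # exp(-||x - y|| / sigma): like RBF but with the unsquared distance.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / sigma)
```

Each function takes two feature vectors and returns a scalar similarity, so any of them can be substituted for $K(x_i, x_j)$ in a kernel-based classifier.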

Twin Support Vector Machine
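Twin-SVM is one of the new emerging machine learning approaches suitable for both classification and regression problems [25]. The target of Twin-SVM is to generate two non-parallel hyper-planes in the n-dimensional real space $\mathbb{R}^n$, such that each plane is close to one of the two classes and as far as possible from the other [26]. For the linear case, the two non-parallel hyper-planes can be formulated as:

$$f_1(x) = (w_1 \cdot x) + b_1 = 0 \quad \text{and} \quad f_2(x) = (w_2 \cdot x) + b_2 = 0,$$

where $w_1, w_2 \in \mathbb{R}^n$ are normal vectors and $b_1, b_2 \in \mathbb{R}$ are bias terms. Let the rows of the matrices $A$ and $B$ hold the training samples of classes "+1" and "-1", respectively. The linear classifiers are obtained by solving the following optimization problems:

$$\min_{w_1, b_1, \xi} \; \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^T \xi \tag{14}$$

subject to

$$-(B w_1 + e_2 b_1) + \xi \ge e_2, \qquad \xi \ge 0,$$

and

$$\min_{w_2, b_2, \eta} \; \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_2 e_1^T \eta \tag{16}$$

subject to

$$(A w_2 + e_1 b_2) + \eta \ge e_1, \qquad \eta \ge 0,$$

where $c_1$ and $c_2$ are penalty parameters, $\xi$ and $\eta$ are positive slack vectors, and $e_1$ and $e_2$ are vectors of ones of appropriate dimensions.
By introducing Lagrangian multipliers, the dual quadratic programming problems (QPPs) of (14) and (16) can be represented as follows:

$$\max_{\alpha} \; e_2^T \alpha - \frac{1}{2} \alpha^T G (H^T H)^{-1} G^T \alpha \tag{18}$$

subject to

$$0 \le \alpha \le c_1 e_2,$$

and

$$\max_{\gamma} \; e_1^T \gamma - \frac{1}{2} \gamma^T H (G^T G)^{-1} H^T \gamma \tag{20}$$

subject to

$$0 \le \gamma \le c_2 e_1,$$

where $H = [A \;\; e_1]$ and $G = [B \;\; e_2]$. After solving the dual problems (18) and (20), the two non-parallel hyper-planes are recovered from $[w_1; b_1] = -(H^T H)^{-1} G^T \alpha$ and $[w_2; b_2] = (G^T G)^{-1} H^T \gamma$. Twin-SVM can then easily assign a label "+1" or "-1" to a testing instance $x$ by

$$\text{class}(x) = \arg\min_{k=1,2} \; |w_k \cdot x + b_k|, \tag{23}$$

where $|\cdot|$ is the absolute value, $k = 1$ corresponds to the label "+1" and $k = 2$ to the label "-1".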
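To make Twin-SVM non-linear, the kernel functions reported in Tab. I can be used to map the original data samples into a new non-linear feature space, where the decision function of equation (23) is evaluated on two kernel-generated surfaces instead of two hyper-planes:

$$K(x^T, M^T)\,u_1 + b_1 = 0 \quad \text{and} \quad K(x^T, M^T)\,u_2 + b_2 = 0,$$

where the rows of $M$ are all training samples, $K$ is the chosen kernel, and $u_1, u_2$ and $b_1, b_2$ are the coefficient vectors and bias terms of the two surfaces. For a new input sample, its distance from both kernel surfaces is measured and it is assigned to the class whose surface is nearer.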
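The linear Twin-SVM procedure can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the MATLAB implementation used in this work: `A` and `B` are assumed to hold the samples of classes "+1" and "-1" row-wise, the two box-constrained dual problems are solved with a simple projected-gradient loop (a stand-in for a proper QP solver), a small ridge term `reg` is added before matrix inversion for numerical stability, and all function names are hypothetical.

```python
import numpy as np


def _solve_box_qp(Q, c, n_iter=2000):
    """Minimise 0.5*a'Qa - sum(a) subject to 0 <= a <= c (projected gradient)."""
    Q = 0.5 * (Q + Q.T)                      # enforce exact symmetry
    step = 1.0 / max(np.linalg.eigvalsh(Q)[-1], 1e-12)
    a = np.zeros(Q.shape[0])
    for _ in range(n_iter):
        a = np.clip(a - step * (Q @ a - 1.0), 0.0, c)
    return a


def fit_linear_twin_svm(A, B, c1=1.0, c2=1.0, reg=1e-6):
    """Return ((w1, b1), (w2, b2)) for class +1 samples A and class -1 samples B."""
    H = np.hstack([A, np.ones((A.shape[0], 1))])   # H = [A e1]
    G = np.hstack([B, np.ones((B.shape[0], 1))])   # G = [B e2]
    ridge = reg * np.eye(H.shape[1])               # assumed stabiliser

    # First plane: close to class +1, at least unit distance from class -1.
    HtH_inv_Gt = np.linalg.solve(H.T @ H + ridge, G.T)
    alpha = _solve_box_qp(G @ HtH_inv_Gt, c1)
    u = -HtH_inv_Gt @ alpha

    # Second plane: close to class -1, at least unit distance from class +1.
    GtG_inv_Ht = np.linalg.solve(G.T @ G + ridge, H.T)
    gamma = _solve_box_qp(H @ GtG_inv_Ht, c2)
    v = GtG_inv_Ht @ gamma

    return (u[:-1], u[-1]), (v[:-1], v[-1])


def predict(planes, X):
    """Assign +1 or -1 according to the smaller absolute plane value."""
    (w1, b1), (w2, b2) = planes
    d1 = np.abs(X @ w1 + b1)
    d2 = np.abs(X @ w2 + b2)
    return np.where(d1 <= d2, 1, -1)
```

Calling `fit_linear_twin_svm(A, B)` on two arrays of feature vectors returns both planes, and `predict` applies the nearest-plane rule to new samples. The projected-gradient solver was chosen only to keep the sketch dependency-free; a dedicated QP solver would normally converge faster.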

EXPERIMENTAL RESULTS
In this section, we describe the dataset used in this study as well as the evaluation metrics involved in the performance assessment. Furthermore, we present the results of our proposed method against other SVM variants applied to the Heart UCI dataset. Then, we compare our work to the state-of-the-art methods. It should be noted that the evaluations were performed in the MATLAB environment on a 1.9 GHz CPU with 8 GB of RAM.

Dataset Description
The Heart UCI dataset was obtained from the UCI machine learning repository [29]. This dataset contains a total of 303 patient records with 76 attributes each, but only 14 of them are used in our evaluation to make our scores comparable to previous works. In particular, the Cleveland dataset is the only one that has been used by ML researchers to date [6], [7], [12], [13], [22], [23], [30]-[33]. Tab. II provides a brief description of the selected attributes and their properties. The last attribute serves as the prediction target that indicates the absence or presence of heart disease in a patient (value 0 or 1, respectively). Of the 303 records, 138 belong to patients with target 0 and 165 to patients with target 1.

Evaluation Metrics
Usually, accuracy is the most commonly used performance metric for evaluating classifiers such as the proposed model. However, due to the imbalanced nature of our dataset, typical measures such as accuracy or error rate are heavily biased and do not reflect the real performance of the system. For this reason, metrics that are insensitive to class imbalance and based on the confusion matrix are also used (see Fig. 2).
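The confusion-matrix-based metrics discussed in this work (sensitivity, specificity, balanced accuracy, and Matthews Correlation Coefficient) can be computed as in the sketch below. It uses the standard textbook definitions, which are assumed here to match the ones used in the paper; the function name `diagnosis_metrics` and the example counts are hypothetical and are not results from the experiments.

```python
import math


def diagnosis_metrics(tp, tn, fp, fn):
    """Compute imbalance-aware metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is set to 0 by convention when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
print(diagnosis_metrics(tp=40, tn=30, fp=10, fn=20))
```

Because balanced accuracy averages the per-class rates and MCC uses all four cells of the confusion matrix, both remain informative when one class outnumbers the other.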

Results and discussion
We carried out many experiments to demonstrate the effectiveness of the proposed method. We recall that 5-fold cross validation was employed to evaluate the performance of our system. The performance of Twin-SVM for heart disease diagnosis was evaluated with the kernel functions mentioned in Tab. I. The results obtained in the training and testing phases of Twin-SVM with different kernel functions are reported in Tab. III. It should be noted that all the parameters of the kernel functions were set empirically by minimizing the loss. Because the attributes have different ranges, we normalize the values of each column to the [0, 1] scale without distorting the differences in the ranges of values.
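The preprocessing and evaluation protocol described above (column-wise scaling to [0, 1] followed by 5-fold cross validation) can be sketched as follows. The helper names `min_max_normalize` and `k_fold_indices`, the shuffling, and the fixed seed are assumptions made for illustration; the text does not state whether the folds were shuffled or stratified.

```python
import random


def min_max_normalize(rows):
    """Rescale every column of a list-of-lists to the [0, 1] range."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]     # k nearly equal-sized folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Toy feature rows (illustrative values, not taken from the Heart UCI data).
print(min_max_normalize([[1, 10], [2, 20], [3, 30]]))
for train_idx, test_idx in k_fold_indices(303, k=5):
    print(len(train_idx), len(test_idx))
```

With 303 records, each fold holds 60 or 61 test samples. In a stricter protocol the minimum and maximum would be estimated on the training folds only and then applied to the held-out fold.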
From Tab. III, we observe that the linear kernel-based Twin-SVM outperforms the other Twin-SVM variants in both training and testing accuracies. Furthermore, Twin-SVM with the linear kernel consumes less time than the other variants. Next, we discuss in more detail the results obtained in terms of the evaluation metrics given in equations (25), (26), (27) and (28). Fig. 3 shows the results in terms of balanced accuracy. The linear Twin-SVM performs better than all other variants. In addition, Twin-SVM with the Laplacian kernel performs worse than with the RBF and polynomial kernels. Twin-SVM-Sigmoid performs very poorly.
Fig. 4 shows the sensitivity of the proposed method with the five different kernels. Twin-SVM-Linear achieves a better sensitivity than all other kernels, with Twin-SVM-Polynomial a close second. Twin-SVM-Sigmoid provides the worst performance.
Fig. 5 shows the specificity of the proposed method. The linear Twin-SVM achieves a high specificity of 90.32%, followed by Twin-SVM-Polynomial with 89.55%. Twin-SVM-Sigmoid again gives the worst performance.
Fig. 6 shows the performance of the proposed method in terms of the Matthews Correlation Coefficient. We recall here that the Matthews Correlation Coefficient is a correlation value between the actual and predicted classes that varies from -1 to +1. A value of +1 means perfect prediction, 0 means random prediction, and -1 means complete disagreement. It is clear that Twin-SVM-Linear far outperforms all other Twin-SVM variants, by a margin of 11.43% over the second best. Besides, the Sigmoid and Laplacian variants perform poorly.
Fig. 6. Twin-SVM performance on Matthews Correlation Coefficient rate.
In order to verify the credibility of these results, we performed further experiments using conventional SVM with the same kernel functions; the results are reported in Tab. IV. We can clearly see that Twin-SVM with the linear kernel outperforms conventional SVM with the different kernels in terms of accuracy. Moreover, conventional SVM with the linear kernel is more accurate than with the non-linear kernels and takes less time, which confirms the superiority of the linear kernel-based Twin-SVM. Hence, in our case (i.e. heart disease data), Twin-SVM-Linear minimizes the empirical risk on the training samples and thus provides more precise and faster results than conventional SVM with several kernels.

COMPARISON WITH OTHER WELL-KNOWN CLASSIFIERS
More importantly, the proposed method is benchmarked against some well-known classifiers such as Multilayer Perceptron (MLP), Logistic Regression, Decision Trees, Random Forest, and k-Nearest Neighbors (kNN). These algorithms have been widely used for automatic medical diagnosis [4], [5], [34]. The hyperparameters of each algorithm were set by trial and error. For MLP, we used the Levenberg-Marquardt function to fit a network with one hidden layer of 12 neurons. As activation functions, the symmetric sigmoid and linear transfer functions were used for the hidden and output layers, respectively. For the Decision Tree, the maximum depth equals the training sample size minus 1, the minimum leaf size was 1, and the minimum parent size was 10. For Logistic Regression, we used maximum likelihood to fit the model. Random Forest, however, was implemented with Bayesian optimization for tuning its hyperparameters, where the minimum number of observations per leaf was 6 and the number of predictors to sample at each node was 7.
From Tab. V, we clearly see that the highest accuracy and balanced accuracy are achieved when applying Twin-SVM to the heart disease data, with 90.72% and 90.66%, respectively. The MLP classifier comes in second place with 86.15% and 86.34% in terms of accuracy and balanced accuracy, respectively. kNN is third, while the Decision Tree, Logistic Regression and Random Forest classifiers are not competitive. Therefore, Twin-SVM can deal better with the binary classification problem by finding two non-parallel hyper-planes, such that each one is close to one class and as far as possible from the other. With this improvement, Twin-SVM with the linear kernel has lower computational complexity and better generalization ability compared to conventional SVM and its variants.

COMPARISON WITH STATE-OF-THE-ART METHODS
In order to show where our proposed method ranks performance-wise, we compared it with several state-of-the-art methods that used the same heart disease dataset, the same experimental protocol, and the same performance metrics. The diagnosis results obtained for the proposed method and the other approaches are presented in Tab. VI. It is worth noting that this comparison is based only on the accuracy metric because the other evaluation metrics (balanced accuracy, sensitivity, specificity, and Matthews Correlation Coefficient) are not available for those methods. As observed from Tab. VI, the proposed Twin-SVM-based diagnosis model outperforms the methods reported in the literature. The other techniques also give reasonably good results, but their prediction accuracy is slightly lower than that of the Twin-SVM method.

CONCLUSION
In this study, we proposed an effective heart disease diagnosis method based on twin support vector machines. The performance evaluation of the proposed system was conducted on a real cardiovascular disease dataset, which contains clinical data from trial subjects and indicates whether or not they have heart disease. Given a new subject's data, our system can predict the presence or absence of heart disease with good accuracy. The proposed diagnostic system demonstrated its superiority on different performance evaluation metrics. This superiority is explained by the ability of Twin-SVM to deal with complex data (i.e. data containing imbalanced continuous and discrete attributes) that is not linearly separable, where a single hyper-plane cannot efficiently maximize the margin between the classes. Furthermore, the proposed method was compared with several well-known classifiers as well as state-of-the-art methods. This comparison showed that the proposed Twin-SVM-based method performs better than the state-of-the-art in heart disease diagnosis. In future work, we plan to apply feature selection algorithms to find the most pertinent features and determine which algorithm is most suitable for our purpose. Likewise, it would be very interesting to integrate the proposed method into medical diagnostic systems, where it could have an economic and life-saving impact on healthcare services.