Classification of Imbalanced Cardiac Arrhythmia Data


Ecemiş C., Avcu N., Sarı Z.

EUROPEAN JOURNAL OF SCIENCE AND THEOLOGY, no.34, pp.546-552, 2022 (Scopus)

Abstract

Arrhythmias are irregularities in the heartbeat and can be life-threatening, so early diagnosis of cardiac arrhythmia is crucial for saving patients' lives. The main goal of this study is to detect the presence of cardiac arrhythmia from ECG recordings and classify it into 16 groups. The arrhythmia dataset from the UCI repository is used to evaluate different network structures for classification. The number of samples per class is not uniform: the dataset has a highly skewed class distribution, and some classes have no samples at all. This imbalance between classes degrades classifier performance, for example by lowering classification accuracy. To mitigate this, in the cross-validation steps the data are divided into groups, each of which contains the same number of samples from every class. The samples of each class are divided into five groups to satisfy that condition, and the training and test datasets are obtained as combinations of these groups. To deal with the imbalance in the dataset, several typical classification algorithms, namely Multilayer Perceptron (MLP), Support Vector Machine (SVM), Radial Basis Function (RBF), and Random Forest (RF), are first used to classify the data. Based on the precision and accuracy of these classifiers for each data class, nested classifier structures are then constructed to improve the overall accuracy, and several such structures are tried in search of better classifier performance. The results of the classical classifiers and of the four proposed ensemble networks are presented for comparison. The results show that the random forest classifier has the best accuracy, and that the ensemble network with the highest accuracy achieves almost the same performance.
For this reason, future work will enlarge the dataset and apply different network structures to improve classifier performance.
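
The per-class splitting into five groups described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`stratified_folds`, `train_test_splits`, `per_class_precision`), the random shuffling, and the round-robin assignment are assumptions made for illustration only. The sketch uses only the Python standard library. It spreads each class as evenly as possible over the folds, combines the folds into training and test sets, and computes the per-class precision that the abstract names as one basis for building the nested classifiers.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into n_folds groups, spreading each class evenly.

    Hypothetical helper: per-class fold counts differ by at most one.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    offset = 0  # running offset so rare classes do not all land in fold 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[(offset + position) % n_folds].append(idx)
        offset += len(indices)
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold once as the test set."""
    for k, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != k for idx in fold]
        yield train, test


def per_class_precision(y_true, y_pred):
    """Precision per class: correct predictions / all predictions of that class."""
    predicted = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            correct[guess] += 1
    # Classes that were never predicted have undefined precision and are omitted.
    return {c: correct[c] / predicted[c] for c in predicted}
```

With a class that has fewer than five samples, some folds necessarily contain no sample of that class, so the "same number of samples from every class" condition can only be met approximately for rare classes in this sketch.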