Atrial Fibrillation Discrimination for Real-Time ECG Monitoring Based On QT Interval Variation

Background/Objectives: Occasional Atrial Fibrillation (AF) events in the heart rhythm should be monitored regularly, at continuous intervals. Timely detection of these anomalies is required to protect patients from sudden cardiac arrest. Method: A long-duration ECG categorization algorithm named AFECOC is proposed. For this, 71 one-minute-long signals are obtained from Physionet's "MIT-BIH arrhythmia (MA)" and "AF" databases. Two-stage filtering of the noisy signals is employed before signal analysis. Four algorithms, i.e. Error-Correcting Output Code (ECOC), Naïve Bayes, Decision Tree, and K-Nearest Neighbor (K-NN), are applied to reduce the feature set, and the signals are then classified with the ECOC classifier. Findings: The ECOC algorithm gives the highest accuracy, 81.95%, on the complete feature set. To exclude irrelevant features, ECOC, as the highest-performing algorithm, was used to find the combination of features that is most affected during AF. The combination of 'heartbeat' and 'mean QT interval' was found to be the most relevant. The accuracy of these two features was evaluated with four classifiers, namely ECOC, Naïve Bayes, Decision Tree, and K-NN, and the accuracies obtained were 89.6%, 76.19%, 76.19%, and 61% respectively. The proposed methodology thus achieved its highest accuracy of 89.6% with the ECOC classifier. Finally, all the AF rhythms were checked against the annotated labels for spontaneous change in the QT interval to verify the designed methodology. Novelty: Instead of relying on missing P-waves and RR-interval variation, the AF event detection algorithm based on mean QT interval variation gives better accuracy for longer signals. Hence, it can be implemented in real-time continuous monitoring.


Introduction
Sudden cardiac arrest or stroke in patients is probably attributable to a history of irregular heartbeats. These irregular heartbeats can be caused by a higher atrial rate (400-600) (1). Such high atrial rates arise from Atrial Fibrillation (AF), one of the most sustained and fastest-growing arrhythmia conditions. AF occurs when "action potentials fire very rapidly within the pulmonary veins or atrium in a chaotic manner". This results in several impulses trying to travel through the AV node, and the atrial contraction loses coordination (2). AF is one of the prominent causes of stroke and thrombus formation in in-hospital and out-of-hospital patients. It is chronic and progressive (3), and its severity increases with age. Indian patients affected by AF have a mean age of 51.24 ± 15.36 years (3). Hence, treatment and prioritization of patients with a history of AF rhythms are important.
The atrial activity of the heart is mainly reflected by the P-waves in the ECG (4), but during fibrillation these P-waves are replaced by fluctuating waveforms (f-waves). Hence, for the prediction of AF, the spotting of missed P-waves or f-waves in combination with heartbeat variability was considered in (4,5). In addition, irregularity in the RR interval due to the varying heartbeat can also be considered to predict an AF rhythm. However, the peaks of P-waves or f-waves are weak during AF, which makes them difficult to locate most of the time (6), and irregularity in the RR interval can also be caused by many other heart arrhythmias. Hence these features cannot always forecast an AF rhythm. Research on AF prediction is generally limited to the estimation of missed P-waves, irregular RR intervals, and variable heartbeat. When the heartbeat rises above the normal range, the QT interval duration also shortens. In recent years, some of the literature has also suggested tracking QT intervals during AF (7,8). For this purpose, the QT interval correction formulas given by Bazett and Fridericia (9) are used to correct the QT interval for the heartbeat during AF. Apart from this, averaging the QT interval over consecutive beats is suggested in (6). The methods present in the literature involve cumbersome calculations and need more time to implement. In the proposed work, we suggest calculating the mean QT interval during a rhythm to forecast an AF event.
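For reference, the two heart-rate correction formulas named in this paragraph are commonly written as QTc = QT / RR^(1/2) (Bazett) and QTc = QT / RR^(1/3) (Fridericia), with QT and RR in seconds. The short sketch below only illustrates these standard forms alongside the plain mean QT interval that this work uses instead; the function names are hypothetical and are not part of the AFECOC implementation.

```python
# Illustrative helpers; names are hypothetical, formulas are the standard
# Bazett and Fridericia forms with QT and RR expressed in seconds.

def qtc_bazett(qt_s, rr_s):
    """Bazett: QTc = QT / sqrt(RR)."""
    return qt_s / (rr_s ** 0.5)


def qtc_fridericia(qt_s, rr_s):
    """Fridericia: QTc = QT / cube_root(RR)."""
    return qt_s / (rr_s ** (1.0 / 3.0))


def mean_qt(qt_values_s):
    """Plain average of the QT intervals measured over a rhythm."""
    return sum(qt_values_s) / len(qt_values_s)
```

At an RR interval of exactly 1 s (60 bpm) both corrections leave the QT interval unchanged; they differ only when the heartbeat departs from that rate.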
An ideal ECG beat and a sample AF rhythm (10) are shown in Figure 1 and Figure 2 respectively. As shown in Figure 2, AF rhythms are irregular and appear occasionally. Therefore, it is recommended to monitor the ECG of such patients for a longer duration to avoid delayed management (5). Consequently, the proposed work uses 1-minute-long ECG signals. The raw ECG signals require pre-processing to expose the characteristic features of the AF rhythm, which may be buried in various noises. Noise cancellation can be done with filters such as adaptive filtering, wavelet thresholding, Wavelet Transform (WT) (11), Stationary Wavelet Transform (SWT) with dynamic thresholding (12), Fourier transforms, Bayesian filters, and Kalman filters (13). However, decomposition through wavelets is a time-consuming and complex process (14) and is thus not appropriate for filtering here. For this reason, the paper proposes a novel two-stage filtering for noise cancellation and smoothing of the ECG signal. After the signal is cleaned, feature extraction can be carried out with techniques suggested in the literature, such as WT (12,15), Discrete Cosine Transform (DCT) (16), a wave detector using discrete wavelet detection, or adaptive thresholding or windowing methods (3). The extracted features are grouped into time-domain, spectral, temporal, and morphological features for convenience (12). To select the relevant features from the full set of extracted features, different feature selection algorithms are discussed in (17) on various medical data sets.
These selected features can be classified with deep learning (DL) and machine learning (ML) using various classification algorithms. Deep learning models are limited by data dependency: they are often trained and evaluated on the same, limited number of datasets, which can reduce performance when the model is tested on external datasets (18). By contrast, ML algorithms are less prone to this data dependency. ML and DL algorithms such as Artificial Neural Networks, Markov models, Support Vector Machine (SVM) (19), K-Nearest Neighbour (K-NN) (20), Stationary Wavelet Transform (SWT) (21), Naïve Bayes, Linear Discriminant Analysis (LDA) (10), tree methods (14), Convolutional Neural Network (CNN) (19), Multilayer Perceptron (MLP) (22), and the linear discriminant classifier (23) have been employed in the literature for classification.
The CNN classifier algorithm in (20) classifies the rhythm into three classes, i.e. AF, normal, and other types. The accuracy for the other types of rhythm is, however, lower than for AF and normal rhythms. A deep CNN approach was used in (4), in which AF signals were classified with an accuracy of 98.29%, a sensitivity of 98.34%, and a specificity of 98.24%. In (24), a deep CNN-BLSTM network model was used for AF signal discrimination on the AF database with 96.59% accuracy. Besides these, various other classification algorithms have been applied to AF event detection in the literature. A comparison of the classification performance of recent approaches available in the literature is presented in Table 1.

Table 1. Comparison of the classification performance of recent approaches (columns: Database, Features Selected, Rhythm Type)
It is a challenging task to design an algorithm that can predict the presence of AF rhythms, as there is a lot of similarity between the different arrhythmias. This paper presents a new machine learning-based algorithm (AFECOC) intended to overcome gaps in the existing state-of-the-art algorithms. In (4,5,23), small databases with few signals were employed, which may lead to data dependency and biased classification results. The researchers in (18) used short-duration signals of 10 s. Such short rhythms cannot capture all AF incidents. To address these issues, the proposed AFECOC algorithm uses a larger dataset of 71 signals from Physionet (48 MA signals and 23 AF signals), each 1 minute long, to locate more AF events. To clean the signal precisely, we propose a novel two-stage filtering. Most studies, such as (3,12,15,17), have implemented unsupervised learning for feature selection. Instead, this paper proposes supervised learning through the Error-Correcting Output Code (ECOC) algorithm. The selected features are classified with the ECOC classifier, which is a multi-class Support Vector Machine (SVM) classifier. It was used because it can be trained for more than two classes, which is not the case with some other classifiers. After classification, a 10-fold cross-validation strategy was applied with four different classifiers. The methodology also introduces a 'loss' calculation in predicting the true classes. Finally, we have manually examined the relation between the occurrence of AF rhythms and the simultaneous change in the mean QT interval values. For this, we have considered only the annotated AF rhythms from both databases acquired from Physionet.
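To make the ECOC idea concrete, the following minimal sketch shows how an error-correcting output code turns several binary learners into a multi-class decision. It is an illustration under stated assumptions, not the AFECOC implementation: the three class labels, the one-vs-one coding matrix, and the function name ecoc_decode are hypothetical, and the binary learner outputs (+1 or -1, for example from SVMs) are taken as given.

```python
# Hypothetical one-vs-one coding matrix for three classes.
# Rows are classes, columns are binary learners.
# +1 / -1: the class is the positive / negative class for that learner.
#  0: the learner was not trained on that class, so its vote is ignored.
CODE_MATRIX = {
    "AF":     [+1, +1,  0],
    "normal": [-1,  0, +1],
    "other":  [ 0, -1, -1],
}


def ecoc_decode(learner_outputs, code_matrix=CODE_MATRIX):
    """Return the class whose codeword disagrees least with the learner outputs."""

    def disagreements(codeword):
        # Count mismatches only in columns where this class took part in training.
        return sum(1 for c, o in zip(codeword, learner_outputs) if c != 0 and c != o)

    return min(code_matrix, key=lambda label: disagreements(code_matrix[label]))
```

For a two-class problem such as AF versus other rhythms, the coding matrix reduces to a single column, so the decision is that of one binary learner.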

Model Architecture
The proposed model framework is illustrated in Figure 3. Its main modules are pre-processing, feature extraction, feature selection, and classification. Initially, a 1-minute-long ECG signal is taken, and the impact of noise on that signal is reduced using a two-stage filtering process, with a Butterworth bandpass filter as the first stage and a Savitzky-Golay (sgolay) filter as the second stage. The peaks of the ECG waveform are then detected and marked. Once the peaks are located, the work focuses on two types of feature extraction, i.e. temporal and time-domain. Subsequently, feature selection is performed to remove the irrelevant features, followed by classification. Four different classifiers, namely Naïve Bayes, Decision Tree, ECOC, and K-NN, were applied to the relevant set of features. The complete methodology is depicted in Algorithm 1, and its details are given in the following subsections.

Preprocessing
Preprocessing consists of filtering the acquired signals with a novel two-stage filtering. The first stage is a third-order Butterworth bandpass filter with a low cutoff frequency of 0.5 Hz (to remove baseline wander) and a high cutoff frequency of 45 Hz (to discard power-line interference (PLI) noise). The second stage is a Savitzky-Golay (SG) filter, an FIR (finite impulse response) smoothing filter, for which a filter order of 7 and a frame length of 21 were used. The sampling frequency f_s was kept at 250 Hz. The output of the SG filter appears smoother than the output of the first stage.

Locating Peaks
Feature extraction and selection start with identifying the prominent peaks of the signal, viz. P, Q, R, S, and T. The P-peaks are generally absent during AF rhythms and are sometimes replaced by false f-waves. In this case a false location of this wave may get marked, so P-peak marking is omitted in this work. In the marking process, the R-peaks are located and marked first; since they are the dominant peaks, they can be located easily. A window of size 13 was set to locate the R-peaks. The 'Q-wave' is the lowest inverted peak before the 'R-wave' and is determined by inverting the signal and finding the minimum-amplitude points between the expected 'Q-wave' peaks and the baseline. Similarly, the 'S-wave', which is the lowest inverted peak after the 'R-wave', is determined. Finally, the 'T-wave' is located by first tracing all the P and T wave peaks and then extracting the T-peak locations from the combined peak points. A more detailed explanation is given in the AFECOC algorithm.
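The R, Q, and S marking described above can be sketched roughly as follows. This is a simplified illustration, not the authors' routine: the text does not say exactly how the window of 13 is used, so the sketch assumes a centred 13-sample neighbourhood, and the amplitude threshold, the search span for Q and S, and both function names are hypothetical. T-peak location is not covered.

```python
def find_r_peaks(signal, window=13, threshold=0.5):
    """Indices of samples that are the maximum of their centred `window`-sample
    neighbourhood and reach the (assumed) amplitude threshold."""
    half = window // 2
    peaks = []
    for i in range(half, len(signal) - half):
        segment = signal[i - half:i + half + 1]
        is_local_max = signal[i] == max(segment)
        far_from_last = not peaks or i - peaks[-1] > half  # avoids marking a plateau twice
        if signal[i] >= threshold and is_local_max and far_from_last:
            peaks.append(i)
    return peaks


def find_q_s(signal, r_index, search=10):
    """Q = lowest sample within `search` samples before R, S = lowest after R.
    Assumes r_index is not the first or last sample of the signal."""
    start = max(0, r_index - search)
    stop = min(len(signal), r_index + search + 1)
    before = signal[start:r_index]
    after = signal[r_index + 1:stop]
    q_index = start + before.index(min(before))
    s_index = r_index + 1 + after.index(min(after))
    return q_index, s_index
```

Because the R-peak search skips the first and last six samples, every returned R index has samples on both sides for the Q and S search.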

Feature Selection
To select the feature set for AF rhythm prediction, we first determined some temporal and time-domain features of the ECG wave. These features include the intervals between the peaks, viz. the RR interval, QT interval, and QRS complex, and the mean values of all these intervals. The mean RR interval was used to calculate the heartbeat in bpm using the relation 60/(average RR interval). Hence a total of 11 features have been extracted, viz. the peaks Q, R, S, and T; the intervals RR, QT, and QRS; the mean intervals RR, QT, and QRS; and the heartbeat. The peak values (Q, R, S, and T) and their intervals (RR, QT, and QRS duration) are grouped as temporal features of the ECG, while the means of these intervals (mean RR interval, mean QT interval, and mean QRS duration) are grouped as time-domain features.
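The interval features above can be sketched as follows, assuming the peak locations are available as sample indices at the 250 Hz sampling frequency used in this work. The function name, the dictionary keys, and the choice to measure the QT interval from the marked Q-peak to the marked T-peak are assumptions made for illustration.

```python
def interval_features(r_idx, q_idx, t_idx, fs=250):
    """Mean RR interval, mean QT interval (both in seconds) and heartbeat (bpm).

    r_idx, q_idx, t_idx: sample indices of the R, Q and T peaks, one Q and one T
    per beat. QT is taken here as T-peak minus Q-peak, which is an assumption.
    """
    rr = [(b - a) / fs for a, b in zip(r_idx, r_idx[1:])]
    qt = [(t - q) / fs for q, t in zip(q_idx, t_idx)]
    mean_rr = sum(rr) / len(rr)
    mean_qt = sum(qt) / len(qt)
    heartbeat = 60.0 / mean_rr  # relation given in the text: 60 / (average RR interval)
    return {"mean_rr": mean_rr, "mean_qt": mean_qt, "heartbeat": heartbeat}
```

For example, R-peaks spaced 200 samples apart at 250 Hz give a mean RR interval of 0.8 s and therefore a heartbeat of 75 bpm.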
The focus of the work is to search the complete feature set for novel features that are affected during AF rhythms. This is done through dimensionality reduction, by removing the irrelevant features from the feature set extracted previously. To accomplish this, four algorithms, viz. ECOC, Decision Tree, Naïve Bayes, and K-NN, are first applied to the dataset of 71 signals to select the best-performing algorithm. The ECOC algorithm gave 81.95% accuracy on the complete feature set, as shown in Table 2. As ECOC performed best, it is used to reduce the dimensionality of the feature set. The feature set was divided into pairs of features such that each feature was combined with another only once, e.g. RR interval + heartbeat, mean RR interval + heartbeat, mean RR interval + mean QT interval, etc., and ECOC was then applied to these pairs one by one. It was found that combining the mean values of the peak intervals with the heartbeat gives the highest accuracy. The results are depicted in Table 3 and Figure 4. According to these, the mean QT interval + heartbeat, the mean RR interval + heartbeat, and the mean QRS duration + heartbeat are affected the most during an AF rhythm. Of these, the combination of mean QT interval and heartbeat gives the highest accuracy in detecting an AF event, which means it is the most affected during an AF event.
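The pairwise search described above can be sketched as below. The feature names follow the 11 features listed in this section; the scoring function is left to the caller because the classifier training itself is not reproduced here, and the helper names are hypothetical. Whether every possible pair was actually evaluated is not stated in the text, so exhaustive enumeration is an assumption.

```python
from itertools import combinations

FEATURES = [
    "Q peak", "R peak", "S peak", "T peak",
    "RR interval", "QT interval", "QRS duration",
    "mean RR interval", "mean QT interval", "mean QRS duration",
    "heartbeat",
]


def feature_pairs(features=FEATURES):
    """Every unordered pair of features, each pair appearing exactly once."""
    return list(combinations(features, 2))


def best_pair(pairs, score):
    """Pair with the highest value of the caller-supplied `score(pair)`,
    e.g. the accuracy of a classifier trained on that pair."""
    return max(pairs, key=score)
```

With 11 features, exhaustive pairing yields 55 candidate pairs, so the search stays small enough to evaluate one pair at a time.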

Dataset
Physionet is an open-source online repository that provides many kinds of databases for analysis. Of these, the "MIT-BIH Arrhythmia database" and the "AF database" are taken for analysis. The arrhythmia database consists of "48 half-hour two-channel ambulatory ECG recordings related to 47 subjects. The recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10 mV range". The AF database contains 23 records, each including two ECG signals; the recordings are 10 hours long, with a sampling frequency of 250 samples per second and 12-bit resolution over a range of ±10 mV. A 1-minute-long segment of each signal is considered for recognition of the arrhythmia present. These databases, which cover a large number of subjects, are detailed in Table 4.
Table 4 provides information about the data sets used in this work, while Table 5 gives information about each data set separately, based on attributes common to both. It reports the total duration of AF rhythms and of other rhythm types, the variation of heartbeat, the ECG leads used to read the signal, the total number of participants, etc.

Experimental Settings
The whole setup is simulated on an Intel(R) Core(TM) i5-6200U CPU @ 2.30 GHz processor with 4.00 GB RAM and a 64-bit operating system. The development environment was MATLAB 2018a with a machine learning toolbox. The sampling frequency was fixed at 250 Hz for all signals from each dataset, and the signals are re-sampled to this frequency.

Performance Metrics
The performance metrics of an algorithm are the criteria by which its effectiveness and usability can be measured. For the proposed algorithm, the following performance metrics are included to calculate the efficiency, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, and N is the total number of cases.
1. Accuracy = (TP + TN)/N: accuracy is defined as the ratio of the number of correctly classified cases to the total number of cases. (1)
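The metrics used in this work can be computed directly from the confusion-matrix counts, as in the sketch below. Accuracy, sensitivity, specificity, and PPV use their standard definitions; the 'loss' entry is assumed here to be the plain misclassification rate (FP + FN)/N, since the text does not define it explicitly. The function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    n = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "loss": (fp + fn) / n,           # assumed: misclassification rate
    }


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

Under this assumed definition, loss and accuracy always sum to one, so the loss adds information only when it is estimated on different data (for example, under cross-validation) than the accuracy.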

Results and Discussion
The selected features, i.e. 'mean QT interval + heartbeat', were used to classify ECG rhythms as AF rhythms or other rhythms, which is a binary classification. A total of 71 records were used, comprising all 48 records from the Arrhythmia database and 23 records from the AF database. Both databases provide annotated rhythms, which makes training easier. The combined set of samples is split into training and testing data in a ratio of 70% to 30% respectively. Four classifiers, i.e. ECOC, K-NN, Decision Tree, and Naïve Bayes, were evaluated on the obtained feature set (mean QT interval + heartbeat). 10-fold cross-validation was applied, along with calculation of the classification error rate, or loss, on the classification results. The results obtained after classification (given in Table 6) show the importance of using the ECOC classifier to classify the rhythms into AF or other kinds.
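The 70%/30% split and the 10-fold cross-validation described above can be sketched on record indices as follows. This is a generic illustration, not the authors' MATLAB procedure: the shuffling seed, the rounding of the test-set size, the interleaved fold assignment, and both function names are assumptions, and the text does not state whether cross-validation was run on all 71 records or only on the training portion.

```python
import random


def train_test_split(indices, test_fraction=0.3, seed=0):
    """Shuffle the record indices and hold out `test_fraction` of them for testing."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k=10):
    """Yield (training, held_out) index lists; each index is held out exactly once."""
    indices = list(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, folds[i]


train_ids, test_ids = train_test_split(range(71))
```

With 71 records, this rounding gives 50 training and 21 test records, and the ten folds hold out seven or eight records each.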
The ECOC classifier performed best in all aspects, viz. accuracy, specificity, and PPV, and with the lowest classification error. Figure 6 portrays the results in graphical form. From the above results, we can conclude that whenever an AF event occurs there is a change in the QT interval value and the heartbeat value. The change in the QT interval value can also be seen through the annotated QT interval values in the respective databases (whenever an AF rhythm appears). Hence, to validate the acquired results, the abrupt changes in the QT interval values for annotated AF rhythms are calculated and depicted in Table 7. The table includes four successive beats of a particular rhythm showing an AF event annotation in the database at that point. It shows that there is always a change in the QT interval values whenever an AF event appears in an ECG rhythm.
The AFECOC algorithm attained the highest sensitivity with good classification accuracy using the ECOC classifier. Besides accuracy and sensitivity, it also reports other performance parameters, such as specificity, PPV, and cross-validation, that are generally absent from the literature. In ECOC, unlike SVM, the input data can be classified into more than two output classes with the same accuracy as the SVM classifier (20). The results suggest that AF rhythms can be differentiated using input features other than missing P-waves, namely the mean QT interval and the heartbeat. As the heartbeat increases or decreases during atrial fibrillation, the QT interval decreases or increases respectively (7). The mean value of this varying QT interval is considered in this method. Additionally, the work reports more performance indicators than the compared methods in the literature, viz. accuracy, sensitivity, specificity, PPV, and loss, which helped us choose a suitable classifier for the signals.

Conclusion
The methodology is designed in view of the ever-increasing number of stroke cases due to AF rhythms. This work aims at timely recognition of heart arrhythmia, which enables suitable treatment and can ultimately reduce heart attacks. The goal of the research is threefold: monitoring continuous long-duration signals, discriminating AF rhythms from other rhythms, and evaluating the effect of AF rhythm on the variation of the mean QT interval. The method helps in detecting probable AF events based on the varying mean QT interval. The AFECOC algorithm can also be implemented with ECG sensors in a home monitoring environment. It can be used in clinics where a large amount of ECG data is produced every day and manual tracing of each beat is a cumbersome job. Such computer-based processing is faster and also reduces human error. The ECOC classifier gave the highest classification accuracy, PPV, Se, and Sp, i.e. 89.6%, 1, 0.9, and 1 respectively, with the lowest classification error, i.e. 16.33%. The results suggest the use of the mean QT interval + heartbeat feature set to detect AF in ECG waves. The outcome also shows that the chosen method outperforms many existing algorithms. In future, the work shall be extended by incorporating more databases from online and offline sources and by a hardware implementation of the designed AFECOC algorithm. The designed algorithm shall also be evaluated on long-duration real-time ECG signals.

Fig 3. Framework of the proposed real-time ECG monitoring system based on QT interval variation

Figure 4 illustrates the difference in the performance of different sets of features through graphical representation.

Fig 4. A statistical representation of the feature selection process by applying different sets of features on the ECOC classifier
Hence the dimensions of the feature set were reduced, and for the classification process only two features are considered, on which different classifiers are tested. The whole process and the set of rules followed to classify the signals are depicted in Figure 5.

Fig 5. The proposed AFECOC algorithm for Atrial Fibrillation rhythm classification

2. Sensitivity = TP/(TP + FN): sensitivity is the true positive rate of the classes. (2)
3. Specificity = TN/(TN + FP): specificity is defined as the proportion of negative values correctly identified. (3)
4. Positive predictive value (PPV) = TP/(TP + FP): the proportion of cases predicted as a given class of arrhythmia that actually belong to that class.
5. Loss: the loss due to the classification.

Figures 7, 8 and 9 illustrate example signals from both databases. Each figure contains two waveforms, i.e. a raw waveform and a filtered waveform with the peaks located. Figure 7 represents record no. 109 of the MA database, which is a signal other than an AF rhythm. Figure 8 is record no. 203 of the MA database; it contains AF rhythms and is contaminated with baseline wander noise in its raw form. Figure 9 shows record no. 04126 from the AF database.


Fig 7. Illustration of a raw normal rhythm from the MA database (record no. 109) along with its filtered and peak-located signal

Fig 8. Illustration of a raw AF rhythm from the MA database (record no. 203) along with its filtered and peak-located signal
According to the literature, a few researchers have already experimented with AF rhythm event classification. The comparative performance of other state-of-the-art algorithms and the designed AFECOC algorithm is given in Table 8.

Fig 9. Illustration of a raw AF rhythm from the AF database (record no. 04126) along with its filtered and peak-located signal

Table 6. A statistical representation of classification performance using different classifiers for mean QT interval + Heartbeat (columns: Classifiers, Accuracy, Sensitivity, Specificity, PPV)