Analysis and classification of ECG beat based on wavelet decomposition and SVM

Objectives: To extract the features of a single arrhythmic ECG beat and to develop efficient algorithms for automated detection of arrhythmia from the ECG. Methods/Statistical analysis: The methodology includes pre-processing and segmentation of the ECG. ECG features are extracted to support beat classification and the analysis of cardiac abnormalities using machine learning techniques. Wavelet decomposition is used for feature extraction and a multiclass support vector machine for classification. Findings: This work evaluates the suitability of the wavelet features of the ECG as classifier inputs. The proposed arrhythmia classifier achieves an accuracy of up to 98% for the classes of arrhythmia considered in this work. Novelty/Applications: This work is an assistive tool that helps medical practitioners examine the ECG in limited time and, together with their expertise, make an accurate diagnosis of arrhythmia.


Introduction
Cardiac diseases are the most common cause of death around the globe. The design of health monitoring systems to support cardiac patients is a topic of active research. The electrocardiogram (ECG) provides detailed information on the condition of the heart (1,2). Cardiologists can infer heart conditions from ECG wave patterns and inter-wave intervals. To assist medical doctors, researchers have proposed many algorithms for precise real-time segmentation and classification of ECG signals (3). An arrhythmia classification system includes pre-processing of the ECG signal, abnormal beat segmentation, extraction of wavelet-domain features and beat classification (4)(5)(6). The objective is to identify the various ECG arrhythmias as per the AAMI standard, thereby assisting the cardiologist in the early diagnosis of heart disease. Arrhythmia detection procedures differ in the size of the ECG signal window and in the feature extraction and classification approaches (7). Heart performance has been assessed and future complications predicted using a linear prediction method, grid partitioning and fuzzy C-means clustering for ECG classification (8). ECG feature selection has been implemented with Bacterial Foraging Optimization (BFO) and Particle Swarm Optimization (PSO), with classification by a Levenberg-Marquardt neural network classifier (9). QRS detection has been done with the KNN algorithm, with a recurrent neural network for classification (3). This work aims to identify the ECG features that allow a machine learning classifier to recognise different arrhythmic ECG beats. Table 1 lists the characteristics of the significant waves of a normal human heart and their origin (10). Figure 1 shows the typical normal sinus rhythm of a healthy adult with a heart rate of 82 bpm (11).

Materials and Methods
To detect cardiac abnormalities with machine learning, the primary need is to analyse the abnormal beat segmented from the continuous ECG signal. The overall block diagram of the proposed system is given in Figure 2. The entire process is divided into five stages: pre-processing of the ECG signal, segmentation, ECG feature extraction in the wavelet domain, beat classification and performance analysis of the classifier. Pre-processing aims to improve the ECG signal quality by removing baseline wander and eliminating noise. The signal is then segmented into single beats with the R-point at the centre. The feature extraction step extracts wavelet features from the segmented ECG beats to construct the feature matrix. The feature matrix has N rows ($F_{11}$-$F_{1N}$) indicating the various features of a single ECG beat and n columns ($F_{11}$-$F_{n1}$) indicating the same feature for every ECG beat under consideration. This matrix serves as input to train and test the classifier model that performs the required classification. The identification of the appropriate input features for classification is part of the performance analysis stage. This paper considers wavelet coefficients and relative wavelet energy as the input features for classification.
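The segmentation stage described above (single beats with the R-point at the centre) can be sketched as follows. This is only an illustration: the function name `segment_beats`, the `half_width` parameter and the assumption that R-peak sample indices are already available are not taken from the paper.

```python
import numpy as np

def segment_beats(ecg, r_peaks, half_width):
    """Cut fixed-length windows centred on each R-peak index.

    R-peaks too close to either end of the record are skipped, so every
    returned beat has exactly 2 * half_width + 1 samples.
    """
    ecg = np.asarray(ecg, dtype=float)
    beats = []
    for r in r_peaks:
        start, stop = r - half_width, r + half_width + 1
        if start >= 0 and stop <= ecg.size:
            beats.append(ecg[start:stop])
    if not beats:
        return np.empty((0, 2 * half_width + 1))
    # One beat per row, with the R-point in column `half_width`.
    return np.vstack(beats)
```

Each row of the returned array is one beat, which makes it straightforward to compute per-beat features and stack them into a feature matrix.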

ECG data acquisition
The  (11) . In this work, seven different arrhythmias are considered in accordance with the standard of the Association for the Advancement of Medical Instrumentation (AAMI). Table 2 lists the classes of arrhythmia considered in this work along with their timing information. Several noise sources whose spectra overlap the bandwidth of the ECG signal are shown in Figure 3 (12).

Pre-processing of ECG signal
The ECG pre-processing stage, which precedes feature extraction, involves segmentation, baseline correction and alignment of the ECG (4,13,14). Segmentation separates the ECG beat under consideration. Figure 4 shows a segmented ECG with 1025 samples for a duration of 3.075 s. Each segment may contain 3 to 5 beats depending on the heart rate and abnormality. Pre-processing of the beats includes the following steps: 1. Normalization, to compensate for probable errors introduced during acquisition of the ECG signal. 2. AC interference elimination with a second-order notch filter. 3. Noise filtering: the noise present in the ECG is filtered by a band-pass IIR filter. 4. Baseline correction: the PQ segment of the ECG beat is aligned with the reference (zero) line by subtracting the 10th-level approximation signal, obtained with the discrete wavelet transform, from the filtered ECG beat. In this work, the Daubechies wavelet (db6) is used because of its morphological similarity to the normal ECG beat (15). The pre-processed ECG is further smoothed by an averaging filter to eliminate residual irregularities and to aid the peak detection process. Data acquisition and segmentation are done using Cygwin, an open-source tool, and pre-processing, feature extraction and classification are implemented in the MATLAB 2018a computing environment. Finally, the wavelet-domain ECG features are extracted from the ECG beats.
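A minimal Python sketch of the normalization, notch, band-pass and smoothing steps is given below for illustration; the paper's own implementation is in MATLAB and is not reproduced here. The mains frequency (60 Hz), the notch quality factor, the 0.5-40 Hz pass band, the Butterworth design, the zero-phase filtering with `filtfilt` and the 5-sample averaging window are all assumptions, since the text does not state them. The wavelet-based baseline correction (step 4) is deliberately left out.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess_segment(x, fs=360.0, mains_hz=60.0,
                       band_hz=(0.5, 40.0), smooth_len=5):
    """Normalise, notch-filter, band-pass and smooth one ECG segment."""
    x = np.asarray(x, dtype=float)

    # Step 1 (assumed form): remove the mean and scale to unit peak.
    x = x - x.mean()
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak

    # Step 2: second-order IIR notch at the assumed mains frequency.
    b, a = iirnotch(mains_hz, 30.0, fs=fs)
    x = filtfilt(b, a, x)

    # Step 3: band-pass IIR filter (assumed Butterworth, assumed band).
    b, a = butter(2, band_hz, btype="bandpass", fs=fs)
    x = filtfilt(b, a, x)

    # Final smoothing with a moving-average (averaging) filter.
    kernel = np.ones(smooth_len) / smooth_len
    return np.convolve(x, kernel, mode="same")
```

Applying each filter forwards and backwards avoids phase distortion of the QRS complex, at the cost of doubling the effective filter order; a causal filter would be the alternative if real-time operation is required.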

ECG Feature Extraction
An ECG feature can be any information extracted from the ECG beat that discriminates its type from others. ECG features can be extracted in various forms directly from the signal's morphology and from timing information such as the PQRST inter-wave intervals. Frequency-domain features can be extracted with the Fourier transform or with wavelet descriptors. Feature extraction describes the ECG beat, whereas feature selection chooses a subset of the most significant features to improve the performance of the classifier. Discrete wavelet decomposition requires selecting an appropriate wavelet descriptor and deciding the number of decomposition levels needed to extract the time and frequency information of the ECG beat. The levels are chosen based on the regions of the ECG beat that are correlated with the frequencies required for arrhythmia classification. The decomposition results in the approximation and detailed coefficients, which together represent the original ECG beat. In this work, the time-frequency representation uses single-level one-dimensional Daubechies wavelet filters (5,8).

Discrete Wavelet Transform
The Discrete Wavelet Transform (DWT) is an effective method for representing ECG beats in the wavelet domain in order to find abnormalities in the beat. The DWT decomposes the ECG beat into a set of mutually orthogonal wavelets, so that ECG features can be extracted in both time and frequency at high and low frequencies. The ECG beat is passed through a high-scale, low-pass filter (LPF) to yield the "Approximation coefficients" (A). Simultaneously, the ECG beat is passed through a low-scale, high-pass filter (HPF) to yield the "Detailed coefficients" (D). The process is repeated on the approximation coefficients to increase the frequency resolution. Hence, a set of filters is used to decompose the ECG beats with the DWT (13,15,16). Figure 5(a) shows the wavelet decomposition process. The sampling frequency (Fs) of the ECG is 360 Hz, so the signal contains frequency components up to 180 Hz (Fs/2). An eight-level decomposition is used in order to include all the useful frequency components. Figure 5(b) shows the DWT decomposition of the ECG signal, covering all frequencies from 180 Hz down to 0.5 Hz.
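The nominal frequency band of each decomposition level follows from the halving of the bandwidth at every stage: the detailed coefficients at level j cover roughly Fs/2^(j+1) to Fs/2^j, and the final approximation covers 0 to Fs/2^(J+1). The short sketch below tabulates these ideal dyadic bands for Fs = 360 Hz and eight levels; the function name `dwt_bands` is illustrative, and real wavelet filters overlap somewhat at the band edges.

```python
def dwt_bands(fs, levels):
    """Return (name, f_low, f_high) in Hz for D1..D_levels and A_levels."""
    bands = []
    for j in range(1, levels + 1):
        # Detail band at level j: upper half of the remaining spectrum.
        bands.append((f"D{j}", fs / 2 ** (j + 1), fs / 2 ** j))
    # Final approximation band: everything below the last detail band.
    bands.append((f"A{levels}", 0.0, fs / 2 ** (levels + 1)))
    return bands

for name, lo, hi in dwt_bands(360.0, 8):
    print(f"{name}: {lo:.3f} to {hi:.3f} Hz")
```

With these values D1 spans 90 to 180 Hz and D8 spans about 0.7 to 1.4 Hz, so eight levels reach down to the sub-hertz range mentioned in the text.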
The energy distributions of the ECG signal for different arrhythmia classes are not similar across the frequency bands under consideration. The energy associated with the wavelet coefficients indicates the energy of the signal at each decomposition level and can be singled out at distinct resolution levels. The energy associated with the detailed coefficients at the j-th level is computed as (16):

$$E_{D_j} = \sum_{k=1}^{N/2^j} \left| D_j(k) \right|^2, \qquad j = 1, 2, \ldots, J \tag{1}$$

where j = 1, 2, ..., J indexes the decomposition levels, k = 1, 2, ..., N/2^j indexes the decomposition coefficients and N is the number of samples in the ECG beat.
The energy of the approximation coefficients at the J-th level is calculated as:

$$E_{A_J} = \sum_{k=1}^{N/2^J} \left| A_J(k) \right|^2 \tag{2}$$

The total energy of the ECG beat is:

$$E_{total} = E_{A_J} + \sum_{j=1}^{J} E_{D_j} \tag{3}$$

The time-scale density, or relative wavelet energy (RWE), of the ECG beat is computed for all wavelet coefficients as a percentage of the total energy:

$$RWE_{D_j} = \frac{E_{D_j}}{E_{total}} \times 100, \qquad RWE_{A_J} = \frac{E_{A_J}}{E_{total}} \times 100 \tag{4}$$

so that $\sum_{j=1}^{J} RWE_{D_j} + RWE_{A_J} = 100$. This is the representation of the energy corresponding to the different frequency bands of the ECG beat.
The normalized relative wavelet energy at each level is computed for all wavelet coefficients as given in equation (4). The mean ± standard deviation (std) of the coefficients is calculated and serves as a meaningful feature of the ECG beat. Finally, the variance of the coefficients is computed at each decomposition level in order to extract the features from the ECG beats for abnormality classification (17).
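The per-level features described here (mean, variance and relative wavelet energy) can be sketched in Python as below. The sketch assumes the energy of a coefficient set is the sum of its squared values and that RWE is expressed as a percentage of the total energy, so that the values sum to 100. The function name `wavelet_features`, the use of population variance and the input layout (a list of detail arrays plus one approximation array, however they were obtained) are illustrative assumptions.

```python
import numpy as np

def wavelet_features(details, approx):
    """Per-level mean, variance and relative wavelet energy (in percent).

    details: list [D1, ..., DJ] of 1-D coefficient arrays.
    approx:  1-D array of approximation coefficients A_J.
    The returned arrays are ordered D1, ..., DJ, A_J.
    """
    coeffs = [np.asarray(d, dtype=float) for d in details]
    coeffs.append(np.asarray(approx, dtype=float))

    # Energy of each coefficient set: sum of squared coefficients.
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    total = energies.sum()

    # Relative wavelet energy as a percentage of the total energy.
    rwe = 100.0 * energies / total if total > 0 else np.zeros_like(energies)

    means = np.array([c.mean() for c in coeffs])
    variances = np.array([c.var() for c in coeffs])  # population variance
    return {"mean": means, "var": variances, "rwe": rwe}
```

For an eight-level decomposition this yields nine values per feature type, matching the nine coefficient sets D1 to D8 and A8.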

ECG feature Selection and Classification
In this work, the support vector machine (SVM) is used to classify the different arrhythmias. SVMs are statistical, supervised learning models, primarily described by separating hyperplanes in a multidimensional space, and are used for both classification and regression problems. An SVM is a binary classifier that takes input features paired with class labels and separates the cases with different labels (5,13,18,19). It analyses the input features, trains itself and then classifies the data by generating the support vectors. The support vectors are the training data points that lie closest to the separating hyperplane and therefore define it. In this work, the SVM model adopts an RBF kernel, the one-versus-one encoding scheme, 10-fold cross-validation and a simplex optimization routine.
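Because an SVM is a binary classifier, the one-versus-one scheme trains one binary SVM for every pair of classes and lets each of them vote; the class with the most votes is the prediction. The sketch below illustrates only this bookkeeping for seven classes, not the SVM training itself. The class labels are placeholders and the helper names `ovo_pairs` and `ovo_vote` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def ovo_pairs(labels):
    """All unordered class pairs; one binary classifier is trained per pair."""
    return list(combinations(labels, 2))

def ovo_vote(pair_winners):
    """Majority vote over the winners reported by the binary classifiers.

    Ties are broken in favour of the class that was seen first.
    """
    counts = Counter(pair_winners)
    return counts.most_common(1)[0][0]

labels = [f"class_{i}" for i in range(1, 8)]  # seven placeholder classes
pairs = ovo_pairs(labels)
print(len(pairs))  # 21 binary classifiers for 7 classes
```

For K classes the scheme needs K(K-1)/2 binary classifiers, which is 21 for the seven arrhythmia classes considered here.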

Feature Extraction
The typical single ECG beats for the 7 classes are shown in Figure 6. On visual observation, the morphology of the ECG beats appears similar. These beats serve as input to the feature extractor. Each input ECG beat is decomposed with the db6 wavelet. Nine wavelet features, viz. D1, D2, …, D8 and A8, are extracted from every ECG beat of the database. The frequency components of the ECG beat at each level of wavelet decomposition used to generate the wavelet coefficients are given in Table 3. The mean ± stdev of the mean and the variance of the wavelet coefficients at each level for all 7 classes are tabulated in Table 4(a) and Table 4(b), respectively. They are included row-wise in the feature matrix. The relative wavelet energy at each level is computed as per equation (1) and is indicated in

Classification
The extracted features are plotted for the various classes of arrhythmia under consideration. Not all the features support perfect classification of the ECG abnormalities. Typical plots of the extracted features across the classes of arrhythmia are shown in Figures 7, 8 and 9. Plots 7(a)-9(d) indicate that the extracted features can be categorized as giving perfect, partially perfect or poor classification. This gives an approximate estimate of classification. Table 5 summarizes the features useful for further classification. The perfect category comprises the class pairs (i) NL and NP, (ii) APB and NP, (iii) NP and NE, (iv) NP and RB and (vi) NP and fPN, which are perfectly separable with respect to the mean of the D8 coefficients, as indicated in Figure 7(d). NL and NE are likewise separable with respect to the variance of D4, and NE and NP with respect to the variance of D6, as shown in Figure 8(b) and Figure 8(c), respectively. NP and the other classes are perfectly separable with respect to the RWE of the D4 and D5 coefficients, as indicated in Figures 9(b) and 9(c). The partially perfect category shows partial overlap of the mean ± std of different classes: the mean of D3 for the NP and RB classes, as shown in Figure 7(a), and the NL and NP classes, as indicated in Figure 7(c). Total overlap indicates poor classification, as seen in Figure 8(a) for NL and APB with the mean of D3, in Figure 7(c) for NL and NE with the mean of D6, and in Figure 9(a) for NP and RB with the RWE of the D3 coefficients. These plots are only indicative and do not give the accuracy of classification.

Classification with SVM
The classification can be further quantified by implementing the multiclass support vector machine (MSVM), which gives a measure of classification in terms of accuracy. The SVM takes two features at a time (column-wise) from the feature matrix as input, trains itself with the training dataset and classifies the testing dataset. The training and test datasets are selected at random from the feature database, but their ratio can be set by the programmer. The accuracy of the classifier is the benchmark for selecting a particular feature pair for ECG classification. Different inputs to the MSVM lead to different classification accuracies. For the selected input feature pairs, accuracies above 90% are considered good, 80-90% moderate and below 80% poor. Table 6 and Figure 10 show the segregation of the input pairs by the resulting range of MSVM accuracy. Table 8 (a & b) lists the input feature pairs and the corresponding accuracies for the good and moderate categories, with accuracy greater than 80%. Only those input pairs with a resulting accuracy of more than 90% are considered for further processing, to obtain better classification accuracy. The input features are fed to the SVM with different ratios of training and testing data.
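The screening of feature pairs by accuracy range can be sketched as follows. The thresholds are those stated above; treating exactly 90% as moderate is an assumption, since the text only says "above 90%". The helper names are hypothetical, and the three feature names are simply the ones quoted in this section, not the full feature set.

```python
from itertools import combinations

def accuracy_category(accuracy_percent):
    """Map a classifier accuracy (in percent) to good / moderate / poor."""
    if accuracy_percent > 90.0:
        return "good"
    if accuracy_percent >= 80.0:
        return "moderate"
    return "poor"

def candidate_pairs(feature_names):
    """All unordered pairs of features that could be fed to the classifier."""
    return list(combinations(feature_names, 2))

features = ["D4Var", "D4RWE", "D7Var"]  # illustrative subset only
for pair in candidate_pairs(features):
    print(pair)
```

Only pairs whose measured accuracy falls in the good category would then be carried forward to the experiments with different training and testing ratios.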
Observations indicate that classification accuracy improves for the 7:3 and 7.5:2.5 ratios with 10 successive iterations. Table 8(a) and Table 8(b) list the accuracies for the typical input pairs D4Var-D4RWE and D4Var-D7Var, respectively.
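A random train/test split at a chosen ratio, as used in these experiments, might be sketched as below. The function name `random_split`, the fixed seed and the figure of 200 beats are illustrative assumptions; the ratios in the loop are the ones quoted in this section.

```python
import numpy as np

def random_split(n_samples, train_fraction, seed=None):
    """Randomly partition sample indices into training and testing sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

# Ratios 7:3, 7.5:2.5, 8:2 and 8.5:1.5 expressed as training fractions.
for fraction in (0.70, 0.75, 0.80, 0.85):
    train_idx, test_idx = random_split(200, fraction, seed=0)
    print(fraction, train_idx.size, test_idx.size)
```

Repeating the split with different seeds and averaging the resulting accuracies is one way to reduce the dependence of the reported figure on a single random partition.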
The performance of the SVM is measured in terms of classification accuracy. A classification accuracy of 97.67% is obtained for the input feature pair D4Var-D4RWE with an 8.5:1.5 ratio of training to testing data, as indicated in Table 8(a). An accuracy of 98% is achieved for the input feature pair D4Var-D7Var with an 8:2 ratio of training to testing data, as seen in Table 8(b). The MSVM output for the selected input feature sets is shown in Figure 11(a) and Figure 12(a). As the training and test datasets are chosen at random, the MSVM is tested over multiple iterations with the same input feature pairs and testing datasets. The summary of the performance analysis of the SVM for typical input features is given in Table 9. An accuracy of up to 98.63% is achieved for the database under consideration. Figure 13 shows the overall accuracy of the SVM for different training and testing data ratios. The performance of the proposed method is compared with other related works. Seven arrhythmia classification methods were considered for comparison. These methods differ in the method of ECG feature extraction, the number of arrhythmia beat types and the type of classifier considered. The performance of the methods is given in Table 10. The results show that the proposed method, which uses a larger number of ECG wavelet features with an SVM classifier, achieves considerable classification accuracy over a larger number of arrhythmia beat types than other methods using the same database.

Conclusion
Automatic detection of cardiac abnormalities is the need of the day for medical professionals for early diagnosis of heart disease.
The study of cardiac activity through the ECG provides a means for researchers to develop various algorithms using machine learning techniques. This work presents the classification of cardiac arrhythmia by single-beat ECG analysis. It focuses on ECG feature extraction with multiresolution analysis in the wavelet domain and on the identification of suitable ECG features for arrhythmia classification with a multiclass support vector machine. A set of input feature pairs for the classifier model is identified to assess the correctness of the arrhythmia classifier with different training and test dataset ratios and successive iterations of the MSVM. A classification accuracy of up to 98% is achieved for the selected input feature pairs. Higher classification accuracy may be achieved by incorporating feature reduction techniques and by considering a larger database for each class of abnormality.