Combining Wavelet Transform and Hidden Markov Models for ECG Segmentation

This work aims at providing new insights on the electrocardiogram (ECG) segmentation problem using wavelets. The wavelet transform has been originally combined with a hidden Markov models (HMMs) framework in order to carry out beat segmentation and classiﬁcation. A group of ﬁve continuous wavelet functions commonly used in ECG analysis has been implemented and compared using the same framework. All experiments were realized on the QT database, which is composed of a representative number of ambulatory recordings of several individuals and is supplied with manual labels made by a physician. Our main contribution relies on the consistent set of experiments performed. Moreover, the results obtained in terms of beat segmentation and premature ventricular beat (PVC) detection are comparable to others works reported in the literature, independently of the type of the wavelet. Finally, through an original concept of combining two wavelet functions in the segmentation stage, we achieve our best performances.


INTRODUCTION
Electrocardiogram (ECG) analysis requires powerful signal processing tools to emphasize signal contents. This issue is still more important when working on signals corrupted by different sources of noise, which is the case of the ambulatory ECG.
Since the wavelet transform became popular, many works in the ECG signal processing field have adopted this technique in order to perform ECG analysis. For the particular problem of beat detection and waveform segmentation, we can mention the works of Li et al. [1] and Sahambi et al. [2] who employed the quadratic b-spline and the derivative of the Gaussian functions, respectively. Other works like those of Senhadji et al. [3], Chazal and Reilly [4], and Andreão et al. [5] used the wavelet transform to extract parameters which represented better the ECG signal contents. Finally, the wavelet transform has also been proposed in the ECG data compression problem [6][7][8]. In the article of Addison [9], one can find a good review about the spread use of the wavelet transform in ECG analysis.
More recently, wavelet transform has been successfully combined with hidden Markov models (HMMs) providing reliable beat segmentation results [10]. The approach proposed by Andreão et al. [10] offers some advantages on current methods, mainly on those based on heuristics rules.
However, it can be noted that very few works provide comparative results between different wavelet functions. Actually, the choice of the wavelet function remains in general based on qualitative arguments. This is particularly true to almost all the systems proposed for the ECG segmentation problem. We observe that since most of these systems are based on heuristic rules which are too dependent on the selected wavelet function, it is difficult to provide the right framework for comparison.
In this context, this work presents new insights on the choice of the wavelet function for ECG segmentation. To this purpose, five continuous wavelet functions are evaluated and compared using a hidden Markov model framework. It is the first time that different wavelet functions are compared in the context of ECG segmentation. Moreover, we originally explore the combination of two wavelet functions to segment and classify beats.
This article is organized as follows: Section 2 presents the wavelet functions used in our experiments. The HMM framework is described in Section 3, followed by the experiments in Section 4. Finally, the article ends with the discussions and conclusions.

WAVELET TRANSFORMS
The wavelet transform represents the signal in a scale-time space, where each scale can be seen as the result of a passband filtering. The frequency bands depend on the scale and also on the type of the chosen wavelet function.
From the mother wavelet ψ ∈ L 2 (R) with zero mean, the class of wavelets is then [11] where ψ s (t) is a wavelet in the scale s and ψ * represents the wavelet complex conjugate. Thus, the wavelet transform is given by where f is the signal and u is translation parameter. Equation (2) shows that the wavelet transform is the convolution between the signal and the wavelet function at scale s. Moreover, it can also be viewed as the correlation computation between the wavelet function and the ECG signal, which underlines the importance of the mother wavelet shape.
For implementation purposes, both s and u parameters are discretized. In this work, the scale s is called dyadic, since it is a power of 2. On the other hand, the translation parameter u varies along the samples of the signal f .
In the Appendix, we describe five selected continuous wavelets used for ECG analysis [12]: Paul function, Morlet, first and second derivatives of the Gaussian functions, and quadratic b-spline function [11,13]. It is important to note that some wavelet functions had their parameters defined after simulations [12].
Most of the selected wavelets were already employed for ECG segmentation and parameter extraction [1][2][3][4][5]. Furthermore, some works in the field [1][2][3] have shown that the continuous wavelets are suitable for the ECG segmentation problem. Indeed, the segmentation problem requires a localized and transitory event analysis which consists in pursuing the time evolution of the signal spectrum contents. In this case, the sampling rate must be the same to keep the timeinvariance and the temporal resolution at different scales.
It is important to remark that real-valued functions are better detecting peaks and sharp variations whereas complex functions evidence oscillations.

HIDDEN MARKOV MODEL FRAMEWORK
We have developed an original ECG analysis system based on hidden Markov models divided in three parts: wavelet transform, ECG segmentation using HMMs, and premature ventricular beat detection through a rule-based system as described in Figure 1. A complete description of our system is given in the following.

Parameter extraction using wavelet transform
An HMM requires a feature extraction stage for the construction of the observation sequence O. At this point, the wavelet transform is implemented.
In the segmentation problem, the works based on wavelets usually select some dyadic scales where one can find most of the signal contents [1,2]. Furthermore, the selection of the scales which are the least affected by noise can lead to a robust signal representation without necessarily removing useful information. This is also true in our approach. Thus, we have decided to select three consecutive dyadic wavelet scales s = 2 j for each wavelet transform [5,12]: (i) Paul function: scales j= 3, 4, and 5; (ii) Morlet: scales j = 3, 4, and 5, (iii) DOG: scales j = 1, 2, and 3; (iv) MHat: scales j = 2, 3, and 4; and (v) Quadratic b-spline function: scales j = 2, 3, and 4. Figure 2 shows an example of an ECG signal 4 s long processed by the parameter extraction stage using the Mexican hat wavelet function.
As a result, the observation sequence generated after the parameter extraction is of the form where T is the signal length in number of samples and each observation o t is a vector of size three, that is, the wavelet scales have the same time resolution as the original signal.

HMM-based ECG segmentation
The heartbeat is seen as a sequence of elementary waves and segments ( Figure 3), which respects some heart electrical activity constraints. Each wave (P, QRS, T) and segment (PQ, ST) of a heartbeat is then modeled by a specific left-right HMM. The isoelectric line between two consecutive beats is R. V. Andreão and J. Boudy also modeled by an HMM. The concatenation of the individual HMMs models the whole ECG signal.
In general, an HMM can be defined as a stochastic machine characterized by the following parameter set [14]: where A is the state-transition probability matrix, B represents the continuous observation densities functions, and π is the initial state probability vector. The structure of a leftright HMM is given in Figure 4. The observations generated in the parameter extraction part O = (o1 o 2 · · · o T ) are continuous signal representations modeled by a Gaussian probability density function of the form where o t is the observation vector at time t, μ t is the mean vector, and U j is the covariance matrix at state j. In this work, the continuous observations are modeled by a single Gaussian probability density function. In fact, histograms of the  Figure 3: An illustration of a heartbeat observed in an electrocardiogram. Waveform and segment boundaries are also indicated. observations values for every state of every HMM were constructed, and it was observed that most of them appeared to be well fitted to a single Gaussian [10]. The HMMs must be previously trained, that is, the model parameters are estimated aiming at learning the waveform morphologies. Hence, each waveform model was trained on a set of approximately 700 patterns for each waveform extracted from the QT database (see Section 4). The training stage is based on the Baum-Welch method [14], also called forward-backward algorithm, which is considered as a standard for HMM training. It locally maximizes the likelihood P(O | λ) of the model λ using an iterative procedure, which is a particular case of the expectation-maximization method [14].
Finally, we constructed multiple models for each waveform, using the HMM likelihood clustering algorithm, since it improves modeling of complex patterns [10,15]: four 3state HMMs were trained for QRS complex waveforms; two 2-state HMMs for the PQ and ST segments; two 3-state HMMs for the P wave; two 6-state HMMs for the T wave, and one 3-state HMM for the isoelectric line (ISO). We note that the number of states of each model was specified empirically after some simulations, following a compromise on complexity versus performance.
The beat segmentation procedure consists in matching the HMMs to the ECG waveform patterns. This is typically performed by the Viterbi decoding algorithm [10,14], which relates each observation O t to an HMM state following a maximum likelihood criteria and respecting the beat model structure (see Figure 5).
In fact, the beat model of Figure 5 is a connected HMM structure, where the transition probability among models is 4 EURASIP Journal on Advances in Signal Processing uniform. Thus, during segmentation, the Viterbi algorithm works in two dimensions: time and waveform model. As a result, the Viterbi algorithm provides not only the best state sequence of each model but also the best sequence of connected waveform models.
We have also implemented a nonsupervised training procedure in order to keep tracking of new shapes and to adapt accordingly the HMM to them during real-time analysis in an ambulatory ECG. Thus, during ECG segmentation, the waveforms detected are grouped according to their label. Then, the HMM is retrained when enough number of segmented examples is achieved. More details on our HMM training procedure can be found in [10].

Beat classification
This last stage is intended to give more elements to compare the wavelet functions. Indeed, besides the fact that the wavelet must be good in performing waveform segmentation, the wavelet function must also be appropriate to characterize different waveform morphologies.
The beat classification stage implemented here only classifies premature ventricular beats which are characterized by an interval between two consecutive QRS-complexes (R-R interval) shorter than the normal and a QRS-complex larger than the normal of the patient. For that, two main pieces of information generated by the segmentation stage are taken into account: QRS-complex morphology and the R-R interval. The later is directly calculated after segmentation. The distinction between normal and premature R-R intervals is done through a threshold τ RR calculated as follows: where RR NORMAL [k] is the vector of the last five intervals classified as normal, μ RR is the average of the normal R-R intervals, and α QRS is a constant. On the other hand, QRS-complex morphology is represented as a likelihood measure. First of all, the HMM which segments most of QRS-complexes in the beginning of the recording is taken as the individual dominant model. Hence, when the QRS-complex of a normal beat is segmented, its likelihood according to the dominant HMM will be high, while the likelihood for the ventricular beat will be low. To obtain the border line separating both QRS-complex classes, we define a likelihood threshold τ QRS . The threshold τ QRS is calculated through the equations below: where o n represents the observation sequence of the nth QRS-complex segmented by the individual dominant model λ d , P μ is the mean likelihood computed over the ten last examples segmented by λ d , and α QRS is a constant. All constant values are determined after simulations on a training set aiming at reducing the error rate. Besides the fact that each stage is performed in only one lead, we took advantage on the two-lead database used in this work by implementing a simple fusion strategy. Thus, the QRS-complexes detected in each lead are first compared in order to remove false positives. Then, the beats are classified considering the information of both leads. The beat is classified as PVC if there is at least one lead where such a phenomenon occurs, otherwise the beat is detected as normal.

EXPERIMENTS
All experiments have been realized on the QT database [16], which is composed of a representative number of ambulatory recordings of several individuals and is supplied with manual labels made by a physician. The manual labels are used as a reference to compute the global performance of each wavelet function. It contains 105 two-channel records of 15 minutes, sampled at 250 Hz. All files include annotations of waveform peaks and boundaries in at least 30 beats. Moreover, 82 files have their all QRS complexes annotated and the beat class specified.
A set of experiments has been performed in order to evaluate each wavelet function in a real world application. Two main indicators have been taken into account: (i) waveform detection and precision, and (ii) premature ventricular beat (PVC) detection accuracy.
For the first experiment, the results are presented in terms of the following: (i) precision: the onset and offset points of every waveform detected are compared to the manual labeled ones. As a result, we obtain a mean μ and a standard deviation σ in millisecond of the errors between our approach and the physician (the standard deviation for all recordings is computed as the average of the standard deviations of each recording); (ii) waveform detection: it gives the percentage of waveforms correctly detected, according to the labels made by a physician.  The performance results for each wavelet using all 105 recordings as a test set are shown in Tables 1 and 2. It is important to remark that the results consider only the channel 1 of each recording, differently from [10] where the best channel was retained. In addition, for the Paul complex wavelet, we have performed the experiments using both the real part and the magnitude of the wavelet transform. For the second experiment, the performance is measured by two criteria in accordance the recommendations to assess performance of beat detection approaches [17]: sensibility and positive predictivity. Sensibility (Se) is related to the fraction of events correctly detected, where TP (true positive) is the number of matched events and FN (false negative) is the number of events that were not detected by the approach. The positive predictivity (PP) gives the ability of detecting true events, where FP (false positive) is the number of events detected by the approach and nonmatched to the manual labels. In this case, 59 recordings composed the test set, since the other 19 ones were required to estimate the variables for the PVC detection approach. The results for all wavelets are given in Table 3.
In general, the Morlet wavelet presents a very good stability in the waveform position, except for the T wave, since the standard deviation σ is one of the lowest. On the other hand, its mean detection error μ is very high for all waveforms. Moreover, the P wave, QRS complex, and PVC detection results are the worst comparing with the other wavelets.
The DOG wavelet (first derivative of the Gaussian function) shows a high QRS complex and PVC detection sensibility.
The real component of the Paul wavelet transform has obtained a high P wave detection rate, while the PVC wave detection performance is the poorest. On the other hand, exactly the opposite has been achieved by the magnitude of this wavelet transform.
The MHat wavelet (second derivative of the Gaussian function) is better for PVC detection and presents the best overall performance.
Finally, fusing both DOG and MHat wavelets in the parameter extraction stage leaded to the best performance results. In this particular case, the fusion strategy consisted in combining the observation vector of each wavelet, that is, the observation vector at time t o t is a vector of size six.

DISCUSSION
The P wave detection requires a localized analysis both in time and frequency. There is no wavelet with both properties. Comparing the Morlet wavelet (best localization in frequency) with MHat and the real part of the Paul wavelet (best localization in time), the latest outperforms the first one, particularly in the number of detected P waves.
The best function for QRS complex detection must have a similar shape and presents noise robustness. Comparing Paul, MHat, and DOG wavelets, we observe that these functions are highly correlated with the QRS complex. Moreover, in both cases, the use of multiple scales increased noise robustness. The T wave detection depends on the QRS detection and suffers from noise interference much less than the P wave. The difficulty relies on the T wave offset detection, since it is easily confused with the baseline. In this case, the MHat achieved the best result and the Morlet wavelet achieved the worst one.
From this preliminary discussion, we conclude that each wavelet function has its own particularities. As the DOG and MHat wavelets seemed to be the most attractive ones, we have combined both in the parameter extraction stage. The wavelet combination results shown in Tables 1, 2, and 3 are better than the individual ones, especially for P wave and PVC detection. Indeed, improving the parameter extraction strategy, the ECG signal is better represented, and hence the waveforms are better discriminated.

CONCLUSION
This work has addressed the problem of choosing the right wavelet transform for ECG segmentation and classification using an HMM framework. For that, a group of five continuous wavelet functions commonly used in the ECG field has been compared: quadratic b-spline, first and second derivatives of the Gaussian function, Morlet, and Paul function. The wavelet transform acted as the parameter extraction stage necessary to build the observation sequence of our original HMM-based segmentation approach. We have also implemented a beat classification stage, through a simple set of rules, with the purpose of giving more elements on the choice of the wavelet function.
All experiments have been carried out using the QT database, which is composed of manual labels of beat waveforms and beat classes. The Mexican hat wavelet has been the one which presented the best overall performance. Additionally, the results obtained in terms of beat detection and segmentation and PVC beat detection are also comparable to others works reported in the literature [10,18,19].
At last, we have explored the complementarity of two selected wavelet functions, namely, the first derivative of the Gaussian and the Mexican hat functions, by combining their wavelet transforms in the parameter extraction stage, achieving our best performance.

Paul wavelet
The Paul wavelet has the form where m is the function order and t ∈ R. This function has the best time resolution among the continuous wavelet functions. Figure 6 shows the real and imaginary components of the Paul function with m = 4.

Morlet wavelet
The Morlet function ( Figure 7) is a complex wavelet defined as where w o is the central frequency. Figure 7 illustrates the real part of the Morlet function with w o = 6 as used in our experiments.

Derivative of Gaussian wavelet
This family of functions is built from the Gaussian function where f (p) is the pth derivative of f . Figure 8 shows the first and the second derivatives of the Gaussian function, which will be called in this work DOG and MHat, respectively.

Quadratic b-spline wavelet
The quadratic b-spline wavelet (see Figure 9) has its Fourier transform of the form It corresponds to the derivative of a box spline of degree m = 3. The box spline is obtained convolving the basic unit step function 1 [0,1) with itself (m + 1) times.