doi:10.1155/2010/853434 Research Article Combination of EEG Complexity and Spectral Analysis for Epilepsy Diagnosis and Seizure Detection

Approximately 1% of the world's population has epilepsy, and 25% of epilepsy patients cannot be treated sufficiently by any available therapy. If an automatic seizure-detection system were available, it could reduce the time required by a neurologist to perform an off-line diagnosis by reviewing electroencephalogram (EEG) data. It could also produce an on-line warning signal to alert healthcare professionals or to drive a treatment device, such as an electrical stimulator, to enhance the patient's safety and quality of life. This paper describes a systematic evaluation of current approaches to seizure detection in the literature. This evaluation was then used to suggest a reliable, practical epilepsy detection method. The combination of complexity analysis and spectrum analysis of the EEG provides a robust evaluation of the collected data. Principal component analysis (PCA) and genetic algorithms (GAs) were applied to various linear and nonlinear methods. The best linear models resulted from using all of the features without other processing. For the nonlinear models, applying PCA for feature reduction provided better results than applying GAs. The feasibility of executing the proposed methods on a personal computer for on-line processing was also demonstrated.


Introduction
Epilepsy is one of the most common neurological disorders. Approximately 1% of the world's population has epilepsy, and up to 5% of people may have at least one seizure during their lifetime [1]. Epilepsy is characterized by a sudden and recurrent malfunction of the brain, a "seizure" [2]. An electroencephalogram (EEG) is a record of the electrical potentials generated by the cerebral cortex's nerve cells, and it has been an especially valuable clinical tool for the evaluation and treatment of epilepsy [3].
A continuous recording of the EEG lasting as long as one week is required to detect epilepsy. Examining the entire length of such EEG recordings by a well-trained neurologist is both tedious and time consuming. The time required by the neurologist to review extensive EEG data could be greatly reduced via assistance from a reliable automated seizure detection system. In addition, if an online seizure detection method was available, the system could signal healthcare professionals to provide immediate care when a seizure occurs, or it could drive a treatment device (e.g., an electrical stimulator or a drug delivery device) to suppress the seizure and to enhance the patient's quality of life [4,5].
Several methods have been proposed to automatically detect epileptic seizures by analyzing EEG data. During seizures, the scalp EEG of patients with epilepsy is characterized by high-amplitude, synchronized EEG waveforms. Therefore, analysis of EEG data using chaotic nonlinear dynamics (e.g., Lyapunov exponents) [6,7] and complexity analysis (e.g., entropy) [8-12] has been proposed to analyze seizure discharges. Time-frequency analysis approaches that analyze the fundamental frequencies and the harmonic frequencies of seizure events have also been proposed for seizure detection. These types of analyses include short-time Fourier transforms [13-15] and wavelet transforms [16].
Combinations of wavelet analysis with entropy analysis or with Lyapunov exponents have also been proposed for seizure detection by analyzing the complexity of specific EEG subbands [17,18]. Because of their self-training capability, neural networks [12,14,17-20] and adaptive neurofuzzy inference systems [21] have also been utilized for classification of normal events and seizure events by analyzing the spectra and/or the complexity of the EEG recordings.
EURASIP Journal on Advances in Signal Processing
The performance of an EEG-based seizure detection model may be affected by at least four factors: (1) the EEG features, (2) the feature extraction/reduction methods, (3) the classifiers, and (4) the number of data classes to be classified. The objective of this study was to systematically evaluate the performance of current approaches described in the literature and to suggest a reliable, practical epilepsy detection method based on the findings of the evaluation. Because the dataset from the Department of Epileptology, University of Bonn, Germany [21,22], has been widely used for performance demonstration in many studies, the evaluation presented here focused on analyzing methods with this dataset for fair comparisons [8,12,14,18,23-34].
For EEG features, Srinivasan et al. successfully combined an approximate entropy (ApEn) analysis with neural networks to discriminate between normal and ictal EEG signals, and the overall accuracy was as high as 100% [12]. The study presented here examined the capability of ApEn analysis to analyze multiclass EEGs, and its performance was enhanced by combining ApEn analysis with the power of EEG subbands or with autoregressive models [30]. Genetic algorithms [27] and principal component analysis [18] were compared for their ability to select useful features when various linear and nonlinear methods were used for classification.
For the number of data classes to be classified, there are five datasets in [21,22]: normal eyes-open EEGs (Set A), normal eyes-closed EEGs (Set B), interictal EEGs at the opposite zone (Set C) and at the epileptogenic zone (Set D), and ictal EEGs at the epileptogenic zone (Set E). For two-class classification, the studies described in [8,12,14,23-25] classified Set A and Set E, and their accuracies ranged from 92% to 100%. The studies described in [26,27] classified Set E against Sets A, B, C, and D, and their accuracies ranged from 96% to 97%. For three-class classification, the studies described in [18,28-30] classified Set A, Set D (Set C in [30]), and Set E, and the accuracies ranged from 85% to 96%. Finally, the studies described in [31-34] classified the five datasets, and the accuracies ranged from 89% to 99%. In our experiments, the seizure detection approaches were also used to classify two additional combinations of the datasets: Set D versus Set E, and Sets A and D versus Set E.
The motivation of this study is to present a comprehensive evaluation of state-of-the-art methods for seizure detection and to propose a reliable and practical epilepsy detection method that balances computational complexity and detection accuracy. A combination of complexity analysis and spectrum analysis of the EEG is proposed to provide a robust evaluation of the collected data. For multiclass EEG discrimination, principal component analysis (PCA) is applied for feature reduction, and a radial-basis-function support vector machine is used as the classifier. For online seizure detection, the temporal and spectral features are integrated with linear classifiers, which can be easily implemented on current processing platforms and achieve high accuracy at low computational cost. In addition, it is feasible for the system to responsively drive a treatment device, such as an electrical stimulator or a drug delivery device, to suppress the seizures.

Dataset
This study utilized the dataset from the Department of Epileptology at the University of Bonn, Germany [22,23], because many seizure detection methods have been proposed and evaluated with this dataset. It contains normal EEGs of healthy subjects, interictal EEGs of epileptic subjects, and ictal EEGs of epileptic subjects. Each set contains 100 single-channel EEG segments with a duration of 23.6 s and a sampling frequency of 173.61 Hz. Each segment consists of 4,096 samples. Segments from Sets A and B contain normal EEGs taken using surface electrodes on five healthy people with their eyes open and closed, respectively. Sets C and D contain interictal EEGs taken from five epileptic patients using intracranial electrodes placed at the opposite and epileptogenic zones, respectively. Set E contains ictal EEGs taken from five epileptic patients using intracranial electrodes placed at the epileptogenic zone. The type of epilepsy was diagnosed as temporal lobe epilepsy. Exemplary EEG segments from each set are shown in Figure 1.

Methods
A typical EEG-based seizure detection model may contain (1) the EEG features, (2) the feature extraction/reduction, and (3) the classifier. This study systematically evaluated the performance of current approaches with respect to these three elements. The results of the evaluation were used to suggest methods for achieving a practical epilepsy detection method.
For EEG features, the approximate entropy (ApEn) analysis was evaluated for its ability to analyze multiclass EEGs, and the performance of ApEn analysis was enhanced by incorporating the power of EEG subbands or autoregressive models [30]. Genetic algorithms [27] and principal component analysis [18] were compared for their ability to select features while using various linear and nonlinear methods for classification.

Spectral and Entropy Analysis
3.1.1. Approximate Entropy (ApEn). Approximate entropy (ApEn) is a measure that quantifies the regularity or the predictability of a time series [35]. ApEn accounts for the temporal order of points in a time sequence and is therefore a preferred measure of randomness or regularity. It has also been used recently for the detection of epilepsy [8,12]. Smaller values of ApEn imply a greater likelihood that similar patterns of measurements will be followed by additional similar measurements [35].
In addition to diagnosis, such as discriminating between the EEGs of healthy people and the epileptic-seizure EEGs of patients [12], the ability to discriminate between the ictal and nonictal EEGs of epileptic patients is also important for some practical applications, such as seizure warning systems or closed-loop seizure control systems [5]. Figure 2(a) shows the distribution, the means, and the standard deviations of ApEn values corresponding to three datasets: the normal scalp EEGs of healthy subjects (Set A), the interictal EEGs of epileptic subjects (Set D), and the ictal EEGs of epileptic subjects (Set E). The data length of each nonoverlapping window for ApEn calculation was 512 points, and the other ApEn parameters were r = 0 and m = 1 [12]. Based on these data, ApEn was a good index for discriminating between normal (Set A) and ictal EEGs (Set E). However, the ApEn values of the interictal EEGs overlapped with those of the normal and the ictal EEGs. Figure 2(b) shows the results from the sample entropy analysis method [36], which was proposed to eliminate the bias caused by self-matching. The values from sample entropy analysis corresponding to the interictal EEGs overlap with those of the normal EEGs, and ApEn analysis performs better than the sample entropy method for discriminating between normal and ictal EEGs. Thus, to improve the performance of epileptic seizure detection, additional, complementary features must be combined with ApEn analysis. Spectral features and the autoregressive model are compared for this purpose.
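As an illustration of the ApEn measure discussed above, the following is a minimal pure-Python sketch of the standard definition (template matching with the Chebyshev distance, self-matches included). It is not the authors' implementation. The function names, the test signals, and the tolerance of 0.2 times the standard deviation used in the example are assumptions chosen for illustration only; the embedding dimension m = 1 follows the text.

```python
import math
import random
import statistics


def _phi(x, m, r):
    """Mean log-fraction of length-m templates lying within tolerance r."""
    n = len(x) - m + 1
    templates = [x[i:i + m] for i in range(n)]
    total = 0.0
    for a in templates:
        # Chebyshev distance; the self-match is counted, which is the
        # bias that sample entropy was proposed to eliminate.
        matches = sum(
            1 for b in templates
            if max(abs(p - q) for p, q in zip(a, b)) <= r
        )
        total += math.log(matches / n)
    return total / n


def approximate_entropy(x, m, r):
    """ApEn(m, r) = phi(m) - phi(m + 1); smaller means more regular."""
    return _phi(x, m, r) - _phi(x, m + 1, r)


# Illustrative signals (hypothetical, not EEG data).
rng = random.Random(0)
sine = [math.sin(2 * math.pi * i / 32) for i in range(256)]
noise = [rng.gauss(0.0, 1.0) for _ in range(256)]

# The tolerance below is an assumed convention, not the paper's setting.
apen_sine = approximate_entropy(sine, 1, 0.2 * statistics.pstdev(sine))
apen_noise = approximate_entropy(noise, 1, 0.2 * statistics.pstdev(noise))
apen_flat = approximate_entropy([1.0] * 50, 1, 0.0)
```

In this sketch the periodic signal yields a smaller ApEn than the random one, which matches the interpretation that smaller values indicate greater regularity.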

Spectral Analysis.
The EEG power spectra calculated using a fast Fourier transform (FFT) were normalized to a logarithmic scale. The 0-60 Hz frequency range was divided into 15 contiguous subbands, and the averaged EEG log-power spectrum of each subband (with a 4 Hz bandwidth) was extracted to generate the spectral features used by the classifiers. The feature for each subband i (for 0 ≤ i ≤ 14) was calculated by
$$PS_i = \frac{1}{|F_i|} \sum_{f \in F_i} PS(f),$$
where PS(f) is the log power of frequency f and F_i denotes the set of FFT frequencies that fall within the ith 4 Hz subband. Combined with a time-domain feature (i.e., the approximate entropy), a total of 16 features were used. Because EEG signals are noisy and nonstationary, each EEG segment was divided into several subwindows, and the approximate entropy analysis and the spectral analysis were applied to each subwindow. The feature values of the EEG segment were the median values of the ApEn and of the band powers over the subwindows. Assuming that the number of data points of an EEG segment to be classified is N, the resultant ApEn is
$$ApEn = \operatorname{median}_{k}\,(ApEn_k),$$
and the power of the ith band is
$$PS_i = \operatorname{median}_{k}\,(PS_{i,k}),$$
where ApEn_k is the ApEn of the kth subwindow of the N-point segment and PS_{i,k} represents the power of the ith band at the kth subwindow (the subband average computed on that subwindow) [37].
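The spectral feature extraction described above can be sketched as follows. This is a hedged NumPy illustration rather than the authors' code: the base-10 logarithm, the small constant added before taking the logarithm, the half-open band edges, the subwindow length of 128 samples, and all function names are assumptions. The 15 subbands of 4 Hz, the median over subwindows, and the 173.61 Hz sampling frequency are taken from the text.

```python
import numpy as np

FS = 173.61  # sampling frequency of the dataset (Hz)


def band_log_powers(window, fs=FS, n_bands=15, bandwidth=4.0):
    """Mean log-power in each 4 Hz subband of the 0-60 Hz range."""
    window = np.asarray(window, dtype=float)
    power = np.abs(np.fft.rfft(window)) ** 2
    freqs = np.fft.rfftfreq(window.size, d=1.0 / fs)
    log_power = np.log10(power + 1e-12)  # epsilon avoids log(0); assumed
    feats = np.empty(n_bands)
    for i in range(n_bands):
        in_band = (freqs >= i * bandwidth) & (freqs < (i + 1) * bandwidth)
        feats[i] = log_power[in_band].mean()
    return feats


def segment_band_features(segment, sub_len):
    """Median of the subband features over nonoverlapping subwindows."""
    segment = np.asarray(segment, dtype=float)
    n_sub = segment.size // sub_len
    subs = segment[: n_sub * sub_len].reshape(n_sub, sub_len)
    per_window = np.array([band_log_powers(w) for w in subs])
    return np.median(per_window, axis=0)


# Illustrative input: a 10 Hz sinusoid, which falls in subband i = 2 (8-12 Hz).
t = np.arange(512) / FS
features = segment_band_features(np.sin(2 * np.pi * 10.0 * t), sub_len=128)
```

Appending the ApEn value of the same segment to this 15-element vector would give the 16 features used by the classifiers.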

Autoregressive Model.
The autoregressive (AR) model is a parametric model used to describe stationary time series, and it is also a popular tool for EEG analysis [38]. AR models represent the current signal value as the weighted sum of its previous values plus a white-noise term. The weights are determined according to the least mean square (LMS) criterion. For the analysis presented here, Akaike's information criterion [39] was used to determine the appropriate order of the AR model, which was 20. The weights of the AR model and the ApEn value (a total of 21 features) were combined as the set of features for use by the classifiers and were compared with the spectral features.
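A minimal sketch of fitting AR weights by ordinary least squares is given below to make the model concrete. It is an assumption that a batch least-squares fit is an acceptable stand-in for the estimation procedure used in the study, and the `aic` helper uses one common form of Akaike's criterion (n ln σ² + 2p); the function names and the simulated AR(2) signal are hypothetical.

```python
import numpy as np


def fit_ar(x, order):
    """Least-squares AR fit: x[t] ~ sum_k weights[k-1] * x[t-k]."""
    x = np.asarray(x, dtype=float)
    # Column k-1 of the design matrix holds the lag-k values x[t-k].
    lagged = np.column_stack(
        [x[order - k : x.size - k] for k in range(1, order + 1)]
    )
    target = x[order:]
    weights, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    residual = target - lagged @ weights
    return weights, residual.var()


def aic(x, order):
    """One common form of Akaike's information criterion (assumed form)."""
    _, noise_var = fit_ar(x, order)
    n = len(x) - order
    return n * np.log(noise_var) + 2 * order


# Simulated AR(2) process with known weights (illustrative data only).
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
x = np.zeros(5000)
for t in range(2, 5000):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + noise[t]

weights, noise_var = fit_ar(x, order=2)
```

For the feature set described in the text, `fit_ar(window, order=20)` would supply 20 weights per window, to which the ApEn value is added.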

Feature Reduction.
Principal component analysis and genetic algorithms have been utilized to perform feature reduction in seizure detection methods [18,27]. Here, these two methods were used for feature reduction in conjunction with the combination of ApEn analysis and spectral power analysis, and the results were compared to those obtained with the original 16 features without any processing.

Principal Component Analysis.
Principal component analysis transforms a set of correlated variables into a new set of uncorrelated variables that can use fewer dimensions to express the relevant information contained in the observation data. It has also been widely used in EEG analysis for dimension reduction or for feature extraction [18,37]. For this analysis, PCA was applied to the 16 features (ApEn and the powers of the 15 subbands), and the resultant principal components (PCs) were fed to the classifiers for evaluation. The number of PCs was determined based on the best performance of each classifier [18].
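The transformation described above can be sketched with a singular value decomposition. This is an illustrative NumPy example under stated assumptions: the features are only mean-centered (whether the study also scaled them is not stated), the data are random stand-ins for the 16 features, and the choice of 5 components and the function names are hypothetical.

```python
import numpy as np


def pca_fit(features, n_components):
    """Return the feature mean and the leading principal directions."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project (centered) features onto the principal directions."""
    return (features - mean) @ components.T


# Random correlated stand-in for a 60-trial, 16-feature training matrix.
rng = np.random.default_rng(0)
train = rng.standard_normal((60, 16)) @ rng.standard_normal((16, 16))

mean, components = pca_fit(train, n_components=5)
scores = pca_transform(train, mean, components)
cov = np.cov(scores, rowvar=False)
```

The covariance matrix of `scores` is diagonal up to rounding error, which is the "uncorrelated variables" property stated in the text. Test data would be projected with the mean and components fitted on the training data.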

Genetic Algorithm.
A genetic algorithm (GA) is an adaptive heuristic search algorithm. It starts with an initial population of fixed-length individuals (chromosomes). The evolution process is governed by selection, crossover, and mutation [40] of the parents to generate the next generation. A fitness function is defined to evaluate how well a solution (i.e., an individual) solves the problem. Here, the settings and the procedure for the GA followed the approach proposed in [27] for seizure detection. At initialization, the population size was set at 20. Each individual consisted of 16 genes, one for each of the 16 features (i.e., the ApEn and the powers of the 15 subbands). The genes were allowed to have a value of 0 or 1. A value of 1 implied that the corresponding feature was selected, and a value of 0 implied that the corresponding feature was excluded. The fitness value of an individual was defined as the inverse of the classification error. Two of the individuals (the elites) in the current generation were guaranteed to survive to the next generation without any modifications. In each generation, 80% of the individuals in the population, excluding the elites, were created through a crossover operation. The remaining 20% were generated through mutation. The algorithm was allowed to run for a maximum of 100 generations. The stopping criterion was set such that the algorithm would halt if there were no improvements in the fitness values over 20 consecutive generations [27].
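The GA settings listed above can be put together in a short sketch. The population size, gene encoding, elitism, crossover/mutation split, generation limit, and stall criterion follow the text; everything else is an assumption made for illustration, including fitness-proportional parent selection, uniform crossover, the per-gene mutation rate of 1/16, the rounding of 80% of 18 children to 14, and the toy `error_fn`, which stands in for the classification error of a real classifier.

```python
import numpy as np


def ga_select(error_fn, n_genes=16, pop_size=20, n_elite=2,
              crossover_frac=0.8, max_gen=100, stall_gen=20, seed=0):
    """Binary feature-selection GA; returns (best mask, fitness history)."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_genes))
    history, stall = [], 0
    for gen in range(max_gen):
        # Fitness is the inverse of the error (epsilon guards against 1/0).
        fitness = np.array([1.0 / (error_fn(ind) + 1e-9) for ind in pop])
        order = np.argsort(fitness)[::-1]
        pop, fitness = pop[order], fitness[order]
        stall = stall + 1 if history and fitness[0] <= history[-1] else 0
        history.append(fitness[0])
        if stall >= stall_gen:
            break
        n_children = pop_size - n_elite
        n_cross = int(round(crossover_frac * n_children))
        probs = fitness / fitness.sum()  # assumed selection scheme
        children = []
        for _ in range(n_cross):
            pa, pb = pop[rng.choice(pop_size, size=2, p=probs)]
            take_a = rng.integers(0, 2, size=n_genes).astype(bool)
            children.append(np.where(take_a, pa, pb))  # uniform crossover
        for _ in range(n_children - n_cross):
            child = pop[rng.choice(pop_size, p=probs)].copy()
            flip = rng.random(n_genes) < 1.0 / n_genes  # assumed rate
            child[flip] ^= 1
            children.append(child)
        # The elites occupy the first rows and survive unchanged.
        pop = np.vstack([pop[:n_elite], np.array(children)])
    return pop[0], history


# Toy error: fraction of genes that differ from a hypothetical ideal mask.
target = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1])
mask, history = ga_select(lambda ind: float(np.mean(ind != target)))
```

Because the elites are carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.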

Classification.
To evaluate and compare the performance of the analysis methods, four linear and nonlinear methods were utilized to classify the extracted features as a seizure or a nonseizure event. The four evaluated methods were linear least squares [41], linear discriminant analysis [42], a backpropagation (BP) neural network [43], and the support vector machine with either the linear (LISVM) or the radial-basis-function kernel (RBFSVM) [44].
The linear least squares (LLS) method finds the best-fitting linear model that minimizes the mean square error between the system output and the desired output. Mathematically, it can be stated as finding an approximate solution to an overdetermined system of linear equations. Because the model output is only the weighted sum of the input features, it is suitable for implementation using processors without high computing power or for use in online processing.
Linear discriminant analysis (LDA) uses a hyperplane to find the linear combination of features that best separates two or more classes of objects or events. Usually, the within-class, between-class, and mixture scatter matrices are used to formulate the criteria for searching for the hyperplane so that the distance between the class means is maximized and the within-class variance is minimized [45,46].
Backpropagation (BP) neural networks are widely used nonlinear models for pattern recognition and classification problems. A BP network is a multilayer perceptron that is composed of several layers of neurons. The error between the desired output and the network output is backpropagated from the output layer to the hidden and input layers to update the weights for network training based on the gradient-descent method. Here, a 3-layer feedforward neural network was utilized, and the number of neurons in the hidden layer was 20. The log-sigmoid function was used as the activation function of the hidden and output layers. The learning constant was 0.1, and the number of training iterations was 2000.
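The two linear classifiers described above can be sketched for the two-class case as follows. This is a hedged NumPy illustration, not the study's implementation: the function names, the synthetic four-dimensional data, the 0.5 decision threshold for LLS, and the midpoint bias for LDA are assumptions.

```python
import numpy as np


def fit_lls(X, y):
    """Least-squares weights (with bias) mapping features to 0/1 targets."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def lls_score(X, w):
    """Model output: a weighted sum of the input features plus a bias."""
    return np.column_stack([X, np.ones(len(X))]) @ w


def fit_lda(X, y):
    """Two-class Fisher discriminant from the within-class scatter matrix."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    bias = -0.5 * w @ (m0 + m1)  # assumed: boundary midway between means
    return w, bias


# Synthetic two-class data (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((100, 4)),
               rng.standard_normal((100, 4)) + 3.0])
y = np.repeat([0, 1], 100)

w_lls = fit_lls(X, y.astype(float))
lls_acc = np.mean((lls_score(X, w_lls) > 0.5) == (y == 1))

w_lda, b_lda = fit_lda(X, y)
lda_acc = np.mean(((X @ w_lda + b_lda) > 0) == (y == 1))
```

At prediction time both models need only one dot product per input vector, which is why they suit online processing on modest hardware.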
The support vector machine (SVM) also uses a hyperplane to identify classes. The hyperplane that maximizes the margin (i.e., the distance from the nearest training points) is selected by the SVM. Maximizing the margin is known to increase the method's generalization capabilities. The SVM performs structural risk minimization and creates a classifier with a minimized Vapnik-Chervonenkis (VC) dimension. When the VC dimension is low, the expected probability of error is low, which ensures good generalization. The SVM can also simultaneously minimize the empirical risk and the expected risk of pattern classification problems [47]. For the analysis presented here, two kinds of kernels, the linear kernel and the radial-basis function (RBF), were used. The RBF kernel nonlinearly maps samples into a higher dimensional space to handle cases where the relation between class labels and attributes is nonlinear. The parameter settings used were the following: a penalty parameter, C = 2, and a variance, σ = 0.5, for RBFSVM; and C = 1 for LISVM.
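The two kernels mentioned above can be written out explicitly. The sketch below assumes the Gaussian form exp(-||x - y||² / (2σ²)) with σ = 0.5; the exact parametrization used in the study is not stated, so this form, like the function names and sample vectors, is an assumption.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma=0.5):
    """Gaussian RBF kernel (assumed parametrization, see lead-in)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Similarity decays with distance and equals 1 for identical vectors.
k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
k_near = rbf_kernel([1.0, 2.0], [1.2, 2.0])
k_far = rbf_kernel([1.0, 2.0], [3.0, 2.0])
```

A smaller σ makes the kernel more local, so each training point influences a smaller neighborhood of the feature space.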

Results
The experiments consisted of two parts: (1) epilepsy diagnosis based on classification of three or five EEG datasets, including normal EEGs, interictal EEGs, and ictal EEGs, and (2) seizure detection based on classifying the windowed EEG trials as ictal or nonictal. The results of the second part demonstrated the feasibility of the seizure detection method for use in online seizure detection. This part contained three subexperiments. The first was to distinguish ictal EEGs (Set E) from nonictal EEGs (Sets A and D). The second was to distinguish ictal EEGs (Set E) from interictal EEGs (Set D). The third was to distinguish ictal EEGs (Set E) from nonictal EEGs (Sets A, B, C, and D) when all of the datasets provided in [22] are used.
Table 1 shows the average accuracies obtained by combining various feature extraction methods and classifiers to classify the three EEG datasets, including the normal EEGs (Set A), the interictal EEGs (Set D), and the ictal EEGs (Set E). Most of the seizure-detection methods in the literature were evaluated with Sets A, D, and E without using Sets B and C [18,28,29]. Set B consists of normal EEGs taken from subjects with closed eyes, which induces specific alpha rhythms. Set C consists of interictal EEGs taken with intracranial electrodes at the side opposite the epileptic zones, but a temporal-lobe seizure is regarded as a focal seizure.

Classification.
In the experiments, 60% of the data were randomly selected for training, and the remaining data were used for testing the performance of the methods. The procedure was repeated 10 times to obtain the average performance results and their standard deviations (noted in parentheses).
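The evaluation protocol described above (a random 60/40 split repeated 10 times, reporting the mean and the standard deviation) can be sketched as follows. The function names, the toy one-dimensional data, and the midpoint-threshold classifier are hypothetical; whether the split in the study was stratified by class is not stated, and this sketch does not stratify.

```python
import random
import statistics


def repeated_holdout(samples, labels, train_and_score,
                     train_frac=0.6, repeats=10, seed=0):
    """Mean and standard deviation of accuracy over random splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_train = int(round(train_frac * len(indices)))
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train_idx, test_idx = indices[:n_train], indices[n_train:]
        accuracies.append(train_and_score(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx],
        ))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def midpoint_classifier(x_train, y_train, x_test, y_test):
    """Toy classifier: threshold midway between the two class means."""
    mean0 = statistics.mean(x for x, y in zip(x_train, y_train) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(x_train, y_train) if y == 1)
    threshold = 0.5 * (mean0 + mean1)
    hits = sum(int(x > threshold) == y for x, y in zip(x_test, y_test))
    return hits / len(x_test)


# Toy data: 100 scalar samples, labeled 1 when the value exceeds 0.5.
samples = [i / 99 for i in range(100)]
labels = [int(s > 0.5) for s in samples]
mean_acc, std_acc = repeated_holdout(samples, labels, midpoint_classifier)
```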
By combining the temporal and spectral features of the EEG signals, the average accuracies of the linear models were close to 97%. The use of PCA enhanced the best average classification accuracies of BP and SVM. Figure 3 shows the average variation in classification accuracies versus the number of principal components with respect to various classifiers. Applying the PCA method did not improve the average accuracy of LLS or LDA, and it required more computation for the linear classifiers, so PCA was not combined with the linear classifiers in the following experiments. However, for BP and SVM, PCA either reduced the feature dimension or improved the accuracy. The RBFSVM method had the highest average accuracy when combined with PCA. The GA did not improve the average accuracies for all classifiers. The results obtained by combining ApEn analysis with the AR model were worse than the results of combining ApEn analysis and EEG spectral features. Table 2 shows the average accuracies of applying the linear and nonlinear classifiers to classify all five EEG datasets. The results show that the nonlinear classifiers perform much better than the linear classifiers when the number of data classes increases. In our experiments, the best result among the ten tests for each classifier can be higher than 80%, and the best result of RBFSVM can reach 90%.

Toward an Online Seizure Detection System.
To evaluate the feasibility of online operation of the seizure detection methods, all of the segments of the datasets were divided into nonoverlapping windows with 173-, 256-, and 512-point window lengths. The trial numbers corresponding to various window lengths for training and for testing are shown in Table 3. Three combinations of the datasets were set to distinguish between ictal and nonictal EEGs, including (1) Set D versus Set E, (2) Sets A and D versus Set E, and (3) Sets A, B, C, and D versus Set E. Tables 4-6 show the results from combining various feature extraction methods with various classifiers and applying them to the three cases, respectively. Sensitivity (SE), specificity (SP), and accuracy (AC) are defined as
$$SE = \frac{TP}{TP + FN}, \qquad SP = \frac{TN}{TN + FP}, \qquad AC = \frac{TP + TN}{TP + TN + FP + FN},$$
where TP is the true positive, the total number of correctly detected positive events; TN is the true negative, the total number of correctly detected negative events; FP is the false positive, the total number of erroneously positive detections (i.e., false alarms); and FN is the false negative, the total number of erroneously negative detections (i.e., missed detections).
Table 4 shows that the performance of the linear classifiers was similar to that of the nonlinear models, and the average accuracy of seizure detection could reach roughly 96% for the 173-pt EEG windows (a data window of approximately 1 second). The detection accuracy increased to 98% for the 512-pt EEG windows (a data window of approximately 3 seconds). For the linear models, utilizing all of the features gave the best accuracy, whereas for the nonlinear models, the features extracted by PCA gave the best accuracy. The RBFSVM had the highest average accuracy when combined with PCA.
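The three measures defined above can be computed from predicted and true labels as in the following short sketch, where label 1 denotes an ictal window and 0 a nonictal window. The function name and the example labels are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # detected seizures
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed detections
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    ac = (tp + tn) / (tp + tn + fp + fn)
    return se, sp, ac


# Illustrative labels: 3 ictal and 5 nonictal windows.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
se, sp, ac = detection_metrics(y_true, y_pred)
```

With unbalanced classes, accuracy is dominated by the larger class, which is why sensitivity and specificity are reported separately.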
The average accuracies in Tables 5 and 6 are somewhat higher than those in Table 4 because the specificity was improved by adding more datasets (i.e., Set A or Sets A, B, and C) to Set D in the nonictal class. However, the sensitivity values were reduced because of the uneven numbers of data points in the two classes.
LDA had a more stable sensitivity than LLS, and its sensitivity was close to that of RBFSVM. For these experiments, the BP network had two output nodes representing ictal and nonictal events, respectively. The class belonging to the output node that had the higher output value was regarded as the classified result for each input vector. For the linear models and the SVMs, the threshold was set as the "knee point" of the receiver operating characteristic (ROC) curve corresponding to the training data (Figure 4). To avoid bias, no testing data were used to determine the threshold.
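The text does not define the "knee point" formally. One common reading, assumed here purely for illustration, is the threshold whose ROC point lies closest to the ideal corner (false-positive rate 0, true-positive rate 1). The sketch below applies that reading to training scores only; the function name and the example scores are hypothetical.

```python
def knee_threshold(scores, labels):
    """Threshold whose ROC point is nearest to (FPR = 0, TPR = 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_dist = None, float("inf")
    for threshold in sorted(set(scores)):
        # A sample is called positive when its score reaches the threshold.
        tpr = sum(1 for s, y in zip(scores, labels)
                  if y == 1 and s >= threshold) / positives
        fpr = sum(1 for s, y in zip(scores, labels)
                  if y == 0 and s >= threshold) / negatives
        dist = (fpr ** 2 + (1.0 - tpr) ** 2) ** 0.5
        if dist < best_dist:
            best_threshold, best_dist = threshold, dist
    return best_threshold


# Illustrative classifier outputs on training data (1 = ictal).
train_scores = [0.1, 0.2, 0.3, 0.65, 0.5, 0.55, 0.7, 0.8, 0.9]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]
threshold = knee_threshold(train_scores, train_labels)
```

The selected threshold would then be applied unchanged to the test windows, in keeping with the statement that no testing data were used to determine it.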
The feasibility of executing the proposed method in real time was also examined. The windowed EEGs were analyzed by using FFT and the approximate entropy to extract the 16 features, and the features were fed to the LLS/LDA classifier for discrimination. The program was coded using the C programming language. The method was implemented on a personal computer with an Intel Core 2 6600 operated at 2.4 GHz with 2 GB of RAM. The time consumed during processing of the EEGs using different window lengths is presented in Table 7. The first row of Table 7 is used as an example. The time consumed by processing 1000 173-point windows was 547 milliseconds, and the average time consumed to process 1 window (173 points, 1 second) was only 0.547 milliseconds. The proposed method can easily be implemented using the current processing platform. When a seizure occurs, the online seizure detection system could generate a warning signal so that healthcare professionals could provide immediate care. Alternatively, the system could drive a treatment device, such as an electrical stimulator or a drug delivery device, designed to suppress the seizure and enhance the patient's quality of life.
Table 6: Sensitivity, specificity, and accuracy of combining various feature extraction methods with various classifiers to distinguish between Set E (ictal windows) and all nonictal windows (Sets A, B, C, and D).
Table 8 shows the performance data for the various seizure detection methods applied to the same dataset [21,22] (reproduced from [26,34]). Only methods evaluated using the same dataset were included so that a comparison between the results was feasible.
Although combining ApEn analysis with neural networks to discriminate between the EEGs of healthy people and the epileptic-seizure EEGs of patients has high accuracy [12], when only ApEn analysis was used to determine the features of the probabilistic neural network (PNN) for the three-class classification (Sets A, D, E), the detection rates for Sets A, D, and E were 89.75%, 39.75%, and 96.00%, respectively. The average accuracy also decreased to 74.42%. Each of the classification methods combined with the corresponding best features in Table 1 was significantly better than ApEn analysis combined with PNN (P < .001). However, ApEn analysis was still a good index for discriminating between normal (Set A) and ictal EEGs (Set E) because the normal EEGs were not classified as ictal EEGs (or vice versa). For the three-class discrimination, combining median-filtered ApEn data and the multiband EEG power spectra led to average accuracies ranging from 96.83% to 98.67% while using the linear and nonlinear classifying methods. These accuracies were superior to those from the related methods that utilize recurrent networks [28], adaptive neural fuzzy networks [29], or radial-basis-function neural networks [18].

Discussion
For the five-class discrimination, the average accuracy of the proposed features combined with RBFSVM is 85.9%. Among the ten tests, our best result reaches 90% (close to the results in [34]), which is still not satisfactory; the best result in the literature is 99.28%, reported in [33]. For our approach, most of the errors were misclassifications between Sets C and D, both of which are interictal EEGs of epileptic patients recorded by intracranial electrodes. These misclassifications therefore do not much affect the applicability of the developed methods to epilepsy diagnosis (discriminating between the EEGs of healthy people and the epileptic-seizure EEGs of patients) or to seizure detection (discriminating between the ictal and nonictal EEGs of epileptic patients).
For the two-class classification that distinguishes between Set E and Sets (A, B, C, D) [26,27], the approach presented here achieved the best average accuracies, which were 1%-2% higher than the methods described in the literature. The comparisons showed that integrating temporal and spectral features with linear classifiers can perform with high accuracy and with low computational cost to achieve epilepsy diagnosis or seizure detection.
For determining the EEG features, the experimental results in Tables 2 and 4-6 show that, when combined with ApEn analysis, multiband EEG spectra achieve better performance than the AR model with all classification methods. For feature extraction, using all 16 features without other processing produced the best results with the linear models. With the nonlinear models, applying PCA to the ApEn and the powers of all of the frequency bands produced better results than using the features selected by the GA. Figure 5 shows the number of times each feature was selected by the GA. The most frequently selected features were the ApEn and the band powers of 0-4 Hz, 4-8 Hz, 12-16 Hz, 48-52 Hz, and 56-60 Hz. However, even the least-selected feature, the band power of 16-20 Hz, was still selected more than 35 times. Figure 6 shows the power distributions of the 15 subbands corresponding to Set A (normal), Set D (interictal), and Set E (ictal). The figure also shows that no single subband could linearly separate the three classes by itself. Therefore, utilizing all of the subbands can lead to better performance.
Real-time operation is also an issue for an online seizure warning or seizure control system. The operations, including FFT analysis, approximate entropy analysis, and the LLS/LDA classification method, could be easily implemented on current processing platforms designed for various online applications. The results described here and in the literature (Table 8) were obtained using a database selected and cut out from continuous, multichannel EEG recordings after a visual inspection for artifacts such as muscle activity or eye movement. Further study to evaluate the performance of the seizure detection methods using continuous EEG recordings encompassing various behaviors and physiological states is required for the development of clinical applications.
In addition, these methods may require modification if they are applied to other types of seizures, such as absence seizures that have rhythmic oscillations on fundamental and harmonic frequency bands of the scalp EEG [48,49].