Spectral information of EEG signals with respect to epilepsy classification
EURASIP Journal on Advances in Signal Processing volume 2019, Article number: 10 (2019)
Abstract
Background
The spectral information of the EEG signal with respect to epilepsy is examined in this study.
Method
To assess the impact of alternative definitions of the analysed frequency subbands, a number of spectral thresholds are defined and the respective frequency subband combinations are generated. For each of these frequency subband combinations, the EEG signal is analysed and a vector of spectral characteristics is defined. Based on this feature vector, a classification scheme is used to measure the appropriateness of the specific frequency subband combination in terms of epileptic EEG classification accuracy.
Results
The obtained results indicate that additional frequency band analysis is beneficial towards epilepsy detection.
Conclusions
This work includes the first systematic assessment of the impact of the frequency subbands on epileptic EEG classification accuracy. The obtained results reveal several frequency subband combinations that achieve high classification accuracy and have not previously been reported in the literature.
Introduction
Signal processing of the electroencephalogram (EEG) is a field that has drawn significant attention in recent years. As a result, numerous EEG processing methodologies have been presented in the literature. One of the most popular fields in EEG signal processing is epilepsy detection and classification. Being one of the most common neurological disorders [1], epilepsy has been the focus of hundreds of EEG analysis studies. Epilepsy is a chronic brain disorder, characterized by recurrent seizures, which cannot be predicted. The severity of the condition can vary greatly, while seizures may fall into a large variety of types [2].
Most of the studies on epileptic activity detection/classification using EEG signal processing formulate methodologies that analyse the EEG signal by extracting informative features from it [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]. To this end, spectral analysis of the EEG signal is essential, since epileptic activity interrupts normal brain functionality. Analysing the EEG signal frequency patterns in order to extract spectral characteristics is one of the most common types of EEG analysis, either by itself (i.e. by focusing on the frequency domain) or combined with other types of analysis (such as nonlinear analysis), thus resulting in a vector of features. These features are then used as input to a classifier, which classifies the epileptic signals.
The EEG spectral analysis is based on a set of frequency subbands. Researchers have mainly used the wavelet transform (WT) [3,4,5,6,7,8,9,10,11,12,13,14,15,16] and time-frequency distributions (TFD) [17,18,19,20] to analyse the EEG spectral patterns. However, although spectral analysis is a well-known approach, with numerous studies including spectral characteristics in the features extracted from the EEG, the importance of the frequency subbands that are used to analyse the signal has never been thoroughly investigated in the literature. It is medically established that brainwaves are divided based on their frequency into several subbands, namely delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz) and gamma (30–80 Hz) [21]. Several researchers therefore roughly focus on these subbands [3,4,5,6,7,8,9,10,11,12,13,14, 17, 18], within the technical limitations that the analysis technique (i.e. WT) imposes. As a result, the importance of the frequency subbands and their limits has not been analysed in the literature: in WT-based approaches the frequency subbands are set automatically [3,4,5,6,7,8,9,10,11,12,13,14,15,16], while among TFD-based methodologies an attempt to compare the impact of different subbands has been presented [17], which however was not systematic, since only four different subband combinations were analysed.
The main focus of this study is the impact of frequency subband selection on EEG epilepsy classification. To this end, a methodology has been developed, which initially defines the number of spectral thresholds (which determines the number of frequency subbands that are created) from 0 to 12. A value of 0 means that the overall frequency spectrum of the EEG is considered as a single frequency subband, while each other value (1–12) creates one more frequency subband than the number of thresholds (i.e. for five spectral thresholds, six frequency subbands are created). Then, all possible combinations of these subbands are created, subject to a simple limitation (i.e. the range of each subband is forced to be ≥ 2 Hz). From each combination, a set of features is extracted and used in a classifier. The Bonn EEG database has been employed and results are obtained in terms of classification accuracy, indicating the importance of this study. To the best of the author’s knowledge, this is the first systematic analysis of the impact of the number and range of frequency subbands presented in the literature. Furthermore, the results reveal frequency subbands that achieved high classification accuracy and have never been studied in the literature before.
Related work
Dataset
The Bonn EEG database [22], a well-known benchmark dataset for this problem, has been employed in this study. The database includes recordings for both healthy and epileptic subjects, divided into five subsets (denoted as A–E and named Z, O, N, F and S, respectively), each of them containing 100 single-channel EEG recordings. Sets A and B (Z and O files) are recordings from five healthy volunteers with eyes open and eyes closed, respectively. These recordings are made extracranially, using the standard 10–20 electrode positioning system. Sets C and D (N and F files) are seizure-free recordings from five epileptic patients, from the epileptogenic zone (set D) and the hippocampal formation of the opposite brain hemisphere (set C), while set E (S files) contains seizure activity, selected from several recording sites exhibiting ictal activity. Sets C, D and E are recorded intracranially, using depth electrodes implanted symmetrically into the hippocampal formation and strip electrodes implanted onto the lateral and basal regions (middle and bottom) of the neocortex. An example recording of each set is illustrated in Fig. 1. The sampling rate of the EEG data is 173.61 Hz, and each recording has a duration of 23.6 s (4096 samples), recorded using 12-bit resolution, while the spectral bandwidth is 0.5 to 85 Hz.
Methods using wavelet transform
The WT-based methods presented in the literature for the analysis of epilepsy in EEG mainly apply the discrete wavelet transform (DWT) or wavelet packet decomposition (WPD). WT is a time-frequency technique, which provides both time and frequency views of a signal [23]. Thus, it can accurately capture and localize transient features in the data, such as epileptic spikes. In wavelet analysis, a linear combination of specific functions represents the initial signal. These functions are obtained by dilation and translation of the mother wavelet. The signal is decomposed into segments of half its size and spectrum with the use of the mother wavelet. Particularly, in DWT the scaling and translating parameters are powers of two. A series of quadrature mirror filters (QMF) is used, serving as highpass and lowpass filters. In the first level, the conjugate filters (highpass and lowpass) are applied to the input signal, resulting in a set of coefficients named wavelet coefficients. The “approximation” is the output of the lowpass filter and is decomposed further in the next level, whereas the output of the highpass filter (“detail”) is not further decomposed. The procedure is repeated only for the approximation at each subsequent level, until the signal is decomposed to reveal the band of interest.
WPD is a wavelet transform that can also be interpreted as an expansion of the DWT, wherein the signal is analysed with a set of QMFs that divide the frequency axis into separate intervals of various sizes [24]. However, in the WPD, the signal is passed through more filters than in the DWT and both the detail and approximation coefficients are decomposed. In the first level of decomposition, the obtained wavelet packet coefficients are referred to as the first-level approximation and detail, respectively. In the second level, the approximation of the approximation (AA), the detail of the approximation (DA), the approximation of the detail (AD) and the detail of the detail (DD) coefficients are computed, and this recursive algorithm renders each newly computed wavelet packet coefficient the root of its own analysis tree. This recurrent splitting is represented as a binary tree. The steps of the methodological approaches presented in the literature are common to both cases: the EEG signal is decomposed into several frequency subbands and features are extracted, creating a feature vector that is most commonly used as input to a classifier.
DWT-based studies
The sampling frequency of the EEG recordings in the Bonn database is 173.61 Hz, and thus the frequency range is 0–86.8 Hz. In the majority of methods, the entire spectrum of the EEG recordings was analysed. However, frequencies higher than 60 Hz are often characterized as noise and are subsequently discarded. For that reason, some researchers have initially applied a bandpass filter, which removes the redundant frequencies and focuses only on the spectrum that corresponds to the five medically established EEG rhythms, i.e. delta (0–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz) and gamma (30–60 Hz or 30–80 Hz).
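The recursion described above can be sketched in a few lines. The example below is only an illustration that assumes the Haar wavelet, the simplest QMF pair; the studies reviewed here use various mother wavelets, and the function names `haar_step` and `haar_dwt` are hypothetical.

```python
import math

def haar_step(samples):
    """One DWT level with the Haar filters: returns (approximation, detail),
    each half the length of the input (a trailing odd sample is dropped)."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(samples) - 1, 2)
    approximation = [(samples[i] + samples[i + 1]) * scale for i in pairs]  # lowpass output
    detail = [(samples[i] - samples[i + 1]) * scale for i in pairs]         # highpass output
    return approximation, detail

def haar_dwt(samples, levels):
    """Repeat the split on the approximation only; details are kept as they are."""
    approximation = list(samples)
    details = []
    for _ in range(levels):
        approximation, detail = haar_step(approximation)
        details.append(detail)  # details[0] is D1, details[1] is D2, ...
    return approximation, details

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approximation, details = haar_dwt(signal, levels=3)
```

Each level halves the number of coefficients, so an 8-sample input yields details of length 4, 2 and 1 plus a single approximation coefficient. Because the Haar pair is orthonormal, the total energy of the coefficients equals the energy of the input.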
Subasi [3] used DWT to decompose the EEG signals into six frequency subbands. However, only the wavelet coefficients that correspond to the frequency range of interest, 0–21.7 Hz, meaning the details D3–D5 and the approximation A5, were used to calculate the features and train a mixture of experts (ME)-based classifier. Guo et al. [4] also used the DWT to analyse the EEG signals, applying a four-level decomposition that divides the selected EEG recordings into five frequency subbands. The line length feature was extracted from each of the five subsignals (D1–D4 and A4), forming the feature vector that trained a multilayer perceptron neural network (MLP). Ocak [5] applied a decomposition of three levels to the entire spectrum (0–86.8 Hz). Approximate entropy (ApEn) values, calculated for all the frequency bands, were used to define a threshold which classified the EEG segments. Kumar et al. [6] applied a five-level decomposition and calculated the ApEn in each decomposition level. The generated feature vector was fed to an MLP classifier. In a subsequent study, the same group applied a decomposition of five levels (as they previously suggested in [6]), using the fuzzy approximate entropy (fApEn) and support vector machines (SVM) for classification.
A comparison of three feature extraction techniques, principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA), was presented in [8]. The EEG recordings were subjected to a five-level decomposition, and statistical features were extracted only from the subsignals D3, D4, D5 and A5, which correspond to the frequency range of 0–21.7 Hz. The dimension of the resulting feature set was reduced by using PCA, ICA and LDA, and the feature vector was used as input to an SVM classifier. In another DWT-based study [9], the authors’ main target was the implementation of a feature extraction system based on genetic programming. They applied a four-level decomposition to analyse the signal into subsignals and then used genetic programming to reduce the dimension of the extracted feature vector. Both the full extracted feature set and the reduced one were used to train a k-nearest neighbour (KNN) classifier. Results indicated that the reduced feature vector improved the classifier’s performance. A comprehensive methodology based on an optimized extreme learning machine (OELM) was proposed in [10]. In this methodology, wavelet-based statistical features were extracted from a four-level decomposition and the OELM classifier was trained with features extracted from the entire spectrum (0–86.8 Hz). Five classification problems were addressed (among them the five-class problem Z-O-N-F-S), and the performance was measured with accuracy, which exceeded 94% for all of the problems.
Another approach is to isolate the frequency band that contains the five EEG rhythms from the redundant frequencies of the signal by applying a bandpass filter. A wavelet-chaos methodology was presented by Adeli et al. [11], where a lowpass finite impulse response (FIR) filter was used to limit the EEG signal to the 0–60 Hz band. The EEG recordings were then subjected to a four-level decomposition, and the average values and standard deviations of two parameters (namely the correlation dimension and the largest Lyapunov exponent) were calculated in each wavelet subsignal (D1–D4 and A4), representing the system’s chaoticity. In a subsequent study [12], the same authors applied wavelet analysis and decomposed the signals into the same frequency subbands, evaluating different methods of classification. A similar approach is described in [13], wherein the authors applied a bandpass filter and cut off all the signal activity outside the 0–60 Hz range to prepare the EEG signals for further processing. In the next stage, a four-level decomposition was applied and the calculated autoregressive (AR) parameters of each subband were fed to an MLP classifier. Wang et al. [14] presented a novel classification algorithm based on a voting strategy and a hardware implementation. The authors used a bandpass filter to focus only on the 0–32 Hz range and then applied a three-level decomposition and extracted the sample entropy (SampEn) only from the detail coefficients (D1, D2, D3).
WPD-based studies
Ocak [15] divided the EEG segments through a four-level wavelet packet decomposition. ApEn values of the wavelet coefficients of all 31 nodes of the decomposition tree were used as a feature vector, while a genetic algorithm was employed to reduce the number of features and find the optimal feature subset that maximizes the classification performance of a learning vector quantization (LVQ) scheme. Swami et al. [16] used wavelet packet decomposition to extract valuable information from the EEG signal. A six-level wavelet packet decomposition yielding 64 nodes was performed, and several statistical features were extracted from each node. The authors tested seven different combinations of the feature vector and identified the best pair, reaching high levels of accuracy. Table 1 summarizes the WT-based methods (DWT and WPD) presented in the literature.
Methods using time-frequency analysis
The smoothed pseudo Wigner-Ville distribution (SPWVD) was applied in study [17]. Various lengths of time-frequency resolutions (64, 128, 256 and 512), time windows (3 and 5) and frequency subbands (4, 5, 7 and 13) were analysed, aiming to extract several features from the spectrum of the signal reflecting the energy distribution over the time-frequency plane. PCA was applied to the obtained features, and then an artificial neural network (ANN) was employed for classification. In [18], the same group presented a comprehensive study wherein the short-time Fourier transform (STFT) and 12 other TFDs were evaluated. The power spectrum density (PSD) of each segment was also extracted and used as input to an ANN classifier.
A methodology based on the fast Fourier transform (FFT) and ApEn was proposed in [19]. The average power spectrum was extracted in each subband of 4 Hz along with the ApEn. In total, 16 features were extracted, and the ability of genetic programming and PCA to reduce the dimension of the feature vector was examined. An SVM classifier with linear and radial basis function kernels was also employed.
In study [20], EEG analysis using TFDs, particularly the spectrogram (SP), the Choi-Williams distribution (CWD) and the SPWVD, is performed. The purpose of the study was both the identification of the seizure peaks and the classification of the EEG signals. For the identification of the seizure peaks, the TFDs were calculated and the maximum values were found. The normalized Renyi marginal entropy (RME) was extracted for various window lengths (11, 17, 27, 41, 49, 93, 151, 205, 255) for SP and SPWVD, and the best value for CWD was obtained from the best window lengths of SP and SPWVD. The SPWVD with the RME provided the best results in terms of time-frequency resolution for the peak identification problem. Each signal of the entire dataset was segmented into six subbands, and the energy of the subbands B1, B2 and B3, corresponding to the frequency range of interest of 0.5 to 12 Hz, was extracted. A vector of 200 energy values for the three subbands of interest was obtained, and the moving averages were extracted. The classification of the signals was performed using a threshold defined by the mean of the moving average of energy for each band. The obtained results were used as input to a score function to classify each signal. Methods based on TFD analysis are summarized in Table 2.
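The frequency ranges quoted for the DWT subsignals follow from repeatedly halving the 0–86.8 Hz range. The short sketch below reproduces that arithmetic for the Bonn sampling rate; the function name `dwt_bands` is hypothetical, and the band edges are the nominal ones, ignoring filter roll-off.

```python
def dwt_bands(sampling_rate, levels):
    """Nominal frequency range (Hz) of each DWT subsignal D1..DL and AL."""
    bands = {}
    upper = sampling_rate / 2.0  # Nyquist frequency
    for level in range(1, levels + 1):
        lower = upper / 2.0
        bands[f"D{level}"] = (lower, upper)  # detail keeps the upper half
        upper = lower                        # approximation is split again
    bands[f"A{levels}"] = (0.0, upper)
    return bands

bands = dwt_bands(173.61, levels=5)
for name, (low, high) in bands.items():
    print(f"{name}: {low:6.2f} to {high:6.2f} Hz")
```

With five levels, D3 tops out at about 21.7 Hz, so D3, D4, D5 and A5 together cover the 0–21.7 Hz range mentioned above.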
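The node counts quoted for the WPD-based studies can be checked with simple arithmetic on a full binary tree. The sketch below assumes that the 31 nodes of the four-level tree include the root and that the 64 nodes of the six-level decomposition are its terminal nodes; `wpd_summary` is a hypothetical helper name.

```python
def wpd_summary(sampling_rate, levels):
    """Terminal nodes, total nodes (root included) and nominal leaf bandwidth in Hz."""
    leaves = 2 ** levels
    nodes_with_root = 2 ** (levels + 1) - 1
    leaf_bandwidth = (sampling_rate / 2.0) / leaves
    return leaves, nodes_with_root, leaf_bandwidth

for depth in (4, 6):
    leaves, nodes, width = wpd_summary(173.61, depth)
    print(f"{depth} levels: {leaves} leaves, {nodes} nodes, {width:.2f} Hz per leaf")
```

Unlike the DWT, every leaf of a full WPD tree has the same nominal bandwidth, which is why the subband limits in these studies are fixed by the decomposition depth rather than chosen freely.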
Method
The flowchart of the methodology followed in this study in order to assess the spectral characteristics of the EEG signals is presented in Fig. 2.
Select number of thresholds
Initially, the number of spectral thresholds is selected, which determines the number of frequency subbands that are created; for N spectral thresholds, N + 1 frequency subbands are analysed. The number of spectral thresholds examined in this study varies from N = 0 (thus considering the entire EEG spectrum as a single subband) to N = 12 (thus creating 13 spectral subbands).
Create combinations
For each number of thresholds, all possible threshold combinations are generated, subject to a single constraint: no two consecutive thresholds can be closer than 2 Hz. The limits for the spectral analysis are set to [0, 42] Hz. For N spectral thresholds, the threshold set T^{N} is defined as:
T^{N} = {t_{0}, t_{1}, …, t_{N}, t_{N + 1}}
with t_{0} = 0 Hz and t_{N + 1} = 42 Hz, thus:
T^{N} = {0, t_{1}, …, t_{N}, 42} Hz
while each frequency subband is defined as:
f_{i} = [t_{i − 1}, t_{i}], i = 1:N + 1
and the frequency subbands set F^{N} is defined as:
F^{N} = {f_{1}, f_{2}, …, f_{N + 1}}
with |F^{N}| = N + 1. For example, for N = 5, F^{5} = {[0, t_{1}], [t_{1}, t_{2}], [t_{2}, t_{3}], [t_{3}, t_{4}], [t_{4}, t_{5}], [t_{5}, 42]} Hz.
In order to create all different threshold combinations C^{N} that satisfy the above limitation, only integer values of thresholds are considered. Thus, t_{i} ∈ [2, 40] Hz, i = 1:N, since all frequency subbands must be ≥ 2 Hz, and:
t_{i} − t_{i − 1} ≥ 2 Hz, i = 1:N + 1
The number of combinations varies greatly as N increases; N vs C^{N} is presented in Fig. 3.
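The enumeration of threshold combinations can be sketched as follows. This is a minimal illustration of the stated rules (integer thresholds, limits of 0 and 42 Hz, every subband at least 2 Hz wide), not the author's implementation; the names `threshold_sets`, `subbands` and `count_combinations` are hypothetical, and the closed-form count is a standard stars-and-bars argument rather than a formula taken from the paper.

```python
from math import comb

F_MIN, F_MAX, MIN_WIDTH = 0, 42, 2  # spectral limits and minimum subband range in Hz

def threshold_sets(n):
    """Yield every valid tuple (t_1, ..., t_n) of integer thresholds in increasing order."""
    def extend(prefix, last, remaining):
        if remaining == 0:
            if F_MAX - last >= MIN_WIDTH:
                yield tuple(prefix)
            return
        # leave room for the remaining - 1 later thresholds and the last subband
        upper = F_MAX - MIN_WIDTH * remaining
        for t in range(last + MIN_WIDTH, upper + 1):
            yield from extend(prefix + [t], t, remaining - 1)
    yield from extend([], F_MIN, n)

def subbands(thresholds):
    """Turn a threshold tuple into the list of (low, high) subband limits."""
    edges = [F_MIN, *thresholds, F_MAX]
    return list(zip(edges[:-1], edges[1:]))

def count_combinations(n):
    """Closed-form count: distribute the slack left after reserving MIN_WIDTH per subband."""
    slack = (F_MAX - F_MIN) - MIN_WIDTH * (n + 1)
    return comb(slack + n, n) if slack >= 0 else 0

for n in range(4):
    assert sum(1 for _ in threshold_sets(n)) == count_combinations(n)
total = sum(count_combinations(n) for n in range(13))
print(subbands((4, 8, 13, 30)), total)
```

Brute-force enumeration and the closed form agree for small N, and the sum over N = 0 to 12 comes to roughly 1.32 × 10^8, the same order as the total number of combinations reported for this study. For N = 3 the generator starts at (2, 4, 6) and advances the last threshold first.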
Spectral feature extraction
Subband energy
All EEG signals are initially filtered using a lowpass filter with cutoff frequency of 42 Hz. Then, each threshold set combination C^{N} is used in order to define a set of filters for the EEG signal, one low pass for the [0, t_{1}] Hz subband, one high pass for the [t_{N}, 42] Hz subband and N − 1 bandpass filters for the [t_{i}, t_{i + 1}] Hz, i = 1:N − 1 subbands. All EEG filters are designed as Elliptic IIR filters, with t_{i} ± 0.5 Hz values as fstop and fpass thresholds, respectively. The overall procedure is illustrated in Fig. 4.
The energy of each of the N + 1 filtered signals (e_{i}) is then calculated, and the vector of energies (E^{N}) is used for the classification.
Total EEG energy
The total EEG energy (TE) is also calculated as the sum of all subband energies:
TE = \sum_{i=1}^{N+1} e_{i}
Subband fractional energy
Besides the energy of each subband, the fractional energy (fe_{i}) is also calculated:
fe_{i} = e_{i} / TE, i = 1:N + 1
The vector of fractional energies (FE^{N}) is also used as input for the classification step.
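As an illustration of the three energy features, the sketch below computes subband energies, total energy and fractional energies for a synthetic signal. It deliberately swaps in a plain periodogram (naive DFT) for the elliptic IIR filter bank used in this study, so it approximates the idea rather than reproducing the method; the 128 Hz sampling rate, the test signal and the names `periodogram` and `subband_features` are assumptions made for the example.

```python
import cmath
import math

def periodogram(signal, sampling_rate):
    """Naive one-sided DFT power spectrum (fine for short illustrative signals)."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        freqs.append(k * sampling_rate / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def subband_features(freqs, power, thresholds, f_max=42.0):
    """Return (E, TE, FE): subband energies, their sum and the fractional energies."""
    edges = [0.0, *thresholds, f_max]
    energies = []
    for j, (low, high) in enumerate(zip(edges[:-1], edges[1:])):
        is_last = j == len(edges) - 2  # the last subband also includes its upper edge
        energies.append(sum(p for f, p in zip(freqs, power)
                            if low <= f < high or (is_last and f == high)))
    total = sum(energies)
    return energies, total, [e / total for e in energies]

# Assumed test signal: a 6 Hz component plus a weaker 20 Hz component, 2 s at 128 Hz.
fs, n = 128.0, 256
signal = [math.sin(2 * math.pi * 6 * i / fs) + 0.5 * math.sin(2 * math.pi * 20 * i / fs)
          for i in range(n)]
freqs, power = periodogram(signal, fs)
energies, total_energy, fractional = subband_features(freqs, power, [4, 8, 13, 30])
print([round(f, 3) for f in fractional])
```

With the thresholds {4, 8, 13, 30} Hz, the [4, 8] Hz subband holds about 80% of the energy and the [13, 30] Hz subband about 20%, matching the 4:1 power ratio of the two components.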
Spectral entropy
The spectral entropy (SEn) is the Shannon entropy of the power spectrum density of each EEG signal, calculated as:
SEn = −\sum_{k=1}^{M} P_{k} \log P_{k}
with P_{k} being the normalized spectral power of frequency bin k (so that ΣP_{k} = 1) and M being the number of frequency bins.
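A minimal sketch of the spectral entropy computation is given below. The natural logarithm and the convention that empty bins contribute zero are assumptions, since the text does not state the base of the logarithm; `spectral_entropy` is a hypothetical function name.

```python
import math

def spectral_entropy(power):
    """Shannon entropy of a power spectrum after normalizing it to sum to one."""
    total = sum(power)
    probabilities = [p / total for p in power if p > 0]  # 0 * log(0) is taken as 0
    return -sum(p * math.log(p) for p in probabilities)

flat = spectral_entropy([1.0, 1.0, 1.0, 1.0])    # power spread evenly over M = 4 bins
peaked = spectral_entropy([0.0, 5.0, 0.0, 0.0])  # all power in a single bin
print(flat, peaked)
```

A flat spectrum gives the maximum value, log M, while a spectrum concentrated in one bin gives zero, so the feature summarizes how widely the signal power is spread across frequencies.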
Classification
The spectral feature vector created in the previous step is FV = {E^{N}, TE, FE^{N}, SEn}. Thus, the size of FV is 2N + 4 (N + 1 subband energies, TE, N + 1 fractional energies and SEn), except in the case of N = 0 (i.e. when the entire EEG spectrum is considered as a single subband), where the size of FV is 1 (i.e. a single feature is included). The number of spectral subbands (F^{N}), spectral threshold combinations (C^{N}) and the size of the feature vector (FV^{N}) with respect to the number of spectral thresholds (N) are presented in Table 3. Classification is based on a random forest classifier [25], which is an ensemble learning method based on the construction of a multitude of decision trees. In this study, random forests were constructed with standard parameters, i.e. each forest contains 100 decision trees, which are grown to full depth.
The overall methodology is presented in Algorithm 1.
Results
The study focused on two different classification problems: the five-class problem (i.e. classifying all Z, O, N, F and S categories), with the main objective being to identify the spectral subbands that carry the maximum information, and the three-class problem (i.e. the ZO-NF-S categories), which is a well-known, medically established problem in this area. The obtained results are reported in terms of classification accuracy. The 10-fold stratified cross-validation technique has been employed: the dataset is divided into 10 equally sized subsets, each having the same number of EEG recordings from each of the categories; nine of them are used for training the classifier and the remaining one for testing. This procedure is applied 10 times, so that each subset is used once for testing, resulting in 10 confusion matrices; the final confusion matrix (used to calculate the classification accuracy) is their sum.
In Table 4, the best obtained accuracy for the five-class problem is presented for each number of thresholds (N) (max accuracy). Also, the average value of the top 10 classification accuracies for each number of thresholds (N) is calculated (average accuracy). The results are illustrated in Fig. 5.
The obtained accuracy results for N = 1 are presented in Fig. 6. The value of the threshold (t_{1}) is on the x-axis; thus, the respective accuracy result is obtained using features extracted from frequency subbands F^{1} = {[0, t_{1}], [t_{1}, 42]} Hz, with the size of the feature vector FV^{1} = 6. For example, for t_{1} = 4 Hz, the frequency subbands are {[0, 4], [4, 42]} Hz and the accuracy result is 73.60%. Also, the accuracy result of N = 0 (F^{0} = [0, 42] Hz, FV^{0} = 1), being 44.80%, is depicted in Fig. 6 (black line) as a baseline result.
The obtained accuracy results for N = 2 are presented in Fig. 7. Since two spectral thresholds are used, the obtained results form a matrix M (with M(t_{1}, t_{2}) = accuracy obtained using these spectral thresholds) and are depicted in a 3D image. The value of the t_{1} threshold (Hz) is on the x-axis and the value of the t_{2} threshold (Hz) is on the y-axis. Thus, each point is the accuracy result for frequency subbands F^{2} = {[0, t_{1}], [t_{1}, t_{2}], [t_{2}, 42]} Hz (with size of feature vector FV^{2} = 8). For example, for t_{1} = 4 Hz and t_{2} = 6 Hz, the frequency subbands are {[0, 4], [4, 6], [6, 42]} Hz.
For values of N greater than 2, the obtained results cannot be plotted with respect to the t_{i} values. Thus, results for N > 2 are presented in Fig. 8a–j with respect to the overall number of combinations (C^{N}). Vertical lines represent the changes of t_{1}. For example, the first part of Fig. 8a (denoted in gray) presents the results of all C^{3} combinations with t_{1} = 2 Hz (which is the first valid value for t_{1}, since t_{1} − t_{0} must be ≥ 2 Hz) and thus t_{2} ∈ [4, 38] Hz and t_{3} ∈ [6, 40] Hz. The sequence of C^{3} combinations for t_{1} = 2 Hz is {{[0, 2], [2, 4], [4, 6], [6, 42]}, {[0, 2], [2, 4], [4, 7], [7, 42]}, …, {[0, 2], [2, 4], [4, 40], [40, 42]}, {[0, 2], [2, 5], [5, 7], [7, 42]}, …, {[0, 2], [2, 38], [38, 40], [40, 42]}}.
To make the plots of Fig. 8 clearer, the results of the C^{2} combinations are also presented in this form (Fig. 9). The subplots (a) to (f) in Fig. 9 correspond to the parts of the main plot that are connected with the red lines, each for a specific value of t_{1}. Figure 9a (the first part of the main plot) corresponds to t_{1} = 2 Hz and thus t_{2} ∈ [4, 40] Hz, Fig. 9b (the second part of the main plot) corresponds to t_{1} = 3 Hz and thus t_{2} ∈ [5, 40] Hz, Fig. 9c corresponds to t_{1} = 4 Hz and t_{2} ∈ [6, 40] Hz, Fig. 9d corresponds to t_{1} = 6 Hz and t_{2} ∈ [8, 40] Hz, Fig. 9e corresponds to t_{1} = 7 Hz and t_{2} ∈ [9, 40] Hz and Fig. 9f corresponds to t_{1} = 8 Hz and t_{2} ∈ [10, 40] Hz.
The top five classification accuracy results obtained for each N value and the respective F^{N} are presented in Table 5.
Besides the five-class problem, the well-known three-class problem (ZO-NF-S) is also addressed. In this case, the main focus is a medically established problem, addressed by several researchers in the literature [17, 26,27,28,29,30,31]. Again, the results are in terms of classification accuracy, and the 10-fold stratified cross-validation technique has been employed. The obtained results are presented in Table 6.
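The validation procedure can be sketched as below: folds are built per class so that each holds the same number of recordings from every category, and the per-fold confusion matrices are accumulated into one matrix from which accuracy is computed. The nearest-mean classifier and the synthetic one-dimensional features are placeholders chosen only to keep the example self-contained; the study itself uses a random forest on the spectral feature vector, and all function names here are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds with the same number of samples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

def nearest_mean(train_x, train_y, test_x):
    """Placeholder classifier: assign each test value to the closest class mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(train_x, train_y):
        sums[y] += x
        counts[y] += 1
    means = {y: sums[y] / counts[y] for y in sums}
    return [min(means, key=lambda y: abs(means[y] - x)) for x in test_x]

def cross_validate(features, labels, fit_predict, k=10, seed=0):
    """Sum the k confusion matrices and return (accuracy, summed matrix)."""
    classes = sorted(set(labels))
    row = {c: i for i, c in enumerate(classes)}
    confusion = [[0] * len(classes) for _ in classes]
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predictions = fit_predict([features[i] for i in train_idx],
                                  [labels[i] for i in train_idx],
                                  [features[i] for i in test_idx])
        for i, predicted in zip(test_idx, predictions):
            confusion[row[labels[i]]][row[predicted]] += 1
    correct = sum(confusion[i][i] for i in range(len(classes)))
    return correct / len(labels), confusion

# Synthetic stand-in for the five sets of 100 recordings each (values are made up).
labels = [name for name in "ZONFS" for _ in range(100)]
features = [float("ZONFS".index(name)) + 0.1 * ((i % 7) - 3) / 3
            for i, name in enumerate(labels)]
accuracy, confusion = cross_validate(features, labels, nearest_mean)
print(accuracy)
```

With 500 recordings and 10 folds, each fold holds 50 recordings, 10 from each set, and every recording is tested exactly once, so the entries of the summed confusion matrix add up to 500.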
Discussion
A methodology for systematic analysis of the frequency subband definition regarding EEG analysis for epilepsy, is presented in this work, in order to assess the impact of different number and alternative definitions of frequency subbands in this problem. The methodology is based on the definition of a number of spectral thresholds, based on which a set of frequency subbands is created. Then, a set of spectral features are extracted and used to train a random forest classifier. For each specific number of spectral thresholds (ranging from 0 to 12), all combinations of subband definition are analysed, with the limitation that each subband range must be at least 2 Hz, resulting to a total of ~ 1.32 × 10^{8} frequency subband combinations. The methodology has been applied on a benchmark dataset, being the Bonn EEG database, for the fiveclass (ZONFS) and the threeclass (ZONFS) problems.
For the fiveclass problem, the maximum accuracy obtained for each N (presented in Table 4) ranges from 44.80% (for N = 0) to 91.20% (obtained for two combinations with N = 9). An important conclusion extracted from this analysis is that increasing the number of frequency subbands does not have a positive impact in the classification accuracy, since the results after peaking for N = 9 are slightly decreasing with respect to N (Fig. 5). The same conclusion is reached when the average accuracy of the top 10 results is taken under consideration; maximum average accuracy is 90.72% (obtained for N = 9), decreasing to 90.08% (for N = 12). It should be noted that evidence for this conclusion can be found in Tzallas et al. [17] and Liang et al. [19], where 13 and 15 frequency subbands were examined, respectively, however drawn from single experiments and not a systematic analysis. In [17], the results are decreasing for 13 frequency subbands compared to the results obtained for five and seven frequency subbands (although the fiveclass problem is not included in the analysis of [17]), while in [19] the obtained accuracy for the fiveclass problem is 85.90% using 15 frequency subbands. Furthermore, combinations with N = 5–12 achieved classification results ≥ 90%, being in accordance with the majority of researchers, using four to seven frequency subbands in their analysis (without however any justification for this selection).
Considering the delta, theta, alpha, beta and gamma frequency subbands (medically established rhythms) that correspond to the {[0–4], [4–8], [8–13], [13–30], [30–42]} Hz combination for four spectral thresholds (N = 4), the obtained accuracy is 82.80%, being 6.8% lower than the maximum classification accuracy obtained for N = 4 (89.60%) and 8.4% lower than the best classification accuracy obtained in this study (being 91.20%, obtained for two frequency subband combinations for N = 9). Several of the frequency subband combinations that achieved high classification accuracy (≥ 90%) include frequency subbands that correlate with the medically established rhythms, including also however subbands that clearly differentiate from them. For N = 4 spectral thresholds, the {[0–3], [3–8], [8–18], [18–33], [33–42]} Hz combination, which achieved the best classification accuracy (for N = 4), includes [0–3] Hz band (resembling delta) and [3–8] Hz (resampling theta); however, the other bands are somewhat different. Also, the {[0–2], [2–8], [8–16], [16–25], [25–35], [35–42]} Hz combination, which is one of the frequency subband combinations that achieved maximum classification accuracy for N = 5, includes [8–16] Hz band (alpha rhythm) but significant differences for all other rhythms. Furthermore, for N > 4, additional frequency subbands that carry significant information regarding this problem are revealed.
The frequency subband combinations that achieved maximum classification accuracy are in the first two lines for N = 9 in Table 5. Both include the [0–3] Hz and [3–7] Hz bands, closely related to delta and theta rhythms, but also an additional band [7, 8] Hz, between theta and alpha rhythms, is included. In both cases, beta rhythm is split into four and three smaller bands, for the first and second combination, respectively. Also, gamma rhythm is split into smaller bands (two for the first combination and three for the second). The lowfrequency bands [0–3] and [3–7] are the most common among the ones that achieved high classification accuracy (≥ 90%). This is in compliance with several works presented in the literature [5, 7, 8, 11,12,13,14, 17,18,19,20]. In higher frequencies, however, there are major differences in the frequency subband combinations that achieved maximum results in this study. Especially with the WPDbased studies [15, 16], the frequency subbands used are in complete disagreement with the results obtained in this study. A band (0–43.4 Hz), included in [15, 16] studies, carries little information for this problem, while lowfrequency subbands, extensively included in the highaccuracy achieving combinations in this study, are excluded from the WPDbased studies.
Considering the threeclass problem, the maximum accuracy obtained for each N (presented in Table 6) ranges from 56% (for N = 0) to 98.8% (obtained for several combinations with N = 8 and N = 9). Again, increasing the number of frequency subbands does not have a positive impact in the classification accuracy; the maximum values are obtained for N = 8 and then the results are decreasing with respect to N. In this case also, the combination that corresponds to the medically established rhythms obtained much lower classification accuracy. Among the frequency subband combinations that achieved high classification accuracy (≥ 90%), the lowfrequency bands [0–3] and [3–7] are the most common while there are significant differences in the highfrequency bands.
In Table 7, a comparison of methodologies presented in the literature for the fiveclass problem is presented. Although the focus of this study is to assess the impact of the number of frequency subbands and the different frequency subband combinations in the classification of EEG regarding epilepsy, the obtained results compare well with the ones reported in the literature. The works by Guler and Ubeyli [32, 33] and Murugavel and Ramakrishnan [10] reported high classification accuracy; however, they are validated using a 50% holdout technique and not a crossvalidation procedure. The obtained results using a crossvalidation technique [17, 19, 34, 35] range from 86.10 to 93.75%, with the best obtained results in this study being 91.20%.
Table 8 compares the methodologies presented in the literature for the three-class problem. The results reported in the literature range from 95.6 to 98.8%, with the proposed method achieving 98.8%. Again, some researchers used different validation techniques; however, the results of works employing a 10-fold cross-validation technique [29,30,31] range from 98.28 to 98.8%.
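The difference between the two validation schemes mentioned above can be sketched in a few lines. The example below only generates sample indices; the number of segments, the seeds and the function names are hypothetical, and no classifier from this study or from the cited works is reproduced.

```python
import random


def holdout_split(n_samples, test_fraction=0.5, seed=0):
    """Single random split: one part for training, the rest for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


n = 500  # hypothetical number of EEG segments
train, test = holdout_split(n)  # 50% holdout
tested_once = sorted(idx for _, fold in k_fold_splits(n) for idx in fold)
```

With a 50% holdout, half of the samples are never used for testing and the accuracy estimate depends on a single split. With 10-fold cross-validation, each sample is tested once by a model trained on the remaining folds, which is why accuracies obtained under the two schemes are not directly comparable.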
Conclusions
This study presents the first systematic analysis in the literature of the impact of the frequency subband definition on the epileptic EEG classification problem. The study revealed significant conclusions: some are in accordance with the majority of works presented in the literature, while others contradict published works. A major conclusion of this study is that examining additional frequency subbands (and not only focusing on the medically established rhythms) can greatly benefit studies focusing on EEG analysis for epilepsy detection.
A limitation of this study is that the range of each subband was forced to be ≥ 2 Hz, so the frequency subbands were not examined in greater detail. The main reason for this limit was the high number of spectral threshold combinations, which grows as the number of spectral thresholds increases. In the future, the results obtained in this study will be validated on additional EEG recordings and other well-known EEG databases [36], including different types of seizure activity; the latter is of major importance since different types of epileptic seizure activity may present different spectral patterns. Also, the application of frequency-based EEG analysis (as in this work) is advantageous compared to other types of EEG processing, since it is of low computational complexity and can be applied in real time. Furthermore, the author will exploit the conclusions from this study (i.e. the frequency subband combinations that achieve maximum classification accuracy) in the design of an EEG epilepsy classification procedure based on more complex signal processing techniques (such as using this combination for a time-frequency grid, as in [17]). Also, the employment of additional classification methods, such as neural networks and deep learning networks [37,38,39], will be studied in future communications.
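The growth in the number of spectral threshold combinations can be illustrated with a short enumeration. This is a sketch under stated assumptions: the 1 Hz candidate grid from 0 to 40 Hz and the function name are hypothetical, and only the minimum subband width of 2 Hz is taken from the text.

```python
from itertools import combinations


def count_combinations(candidates, n_thresholds, min_width):
    """Count the ways to choose interior thresholds so that every band is wide enough."""
    lo, hi = candidates[0], candidates[-1]
    interior = candidates[1:-1]
    count = 0
    for chosen in combinations(interior, n_thresholds):
        edges = (lo, *chosen, hi)
        if all(b - a >= min_width for a, b in zip(edges, edges[1:])):
            count += 1
    return count


# Hypothetical candidate grid: integer frequencies from 0 to 40 Hz.
candidates = list(range(0, 41))
counts = {n: count_combinations(candidates, n, min_width=2) for n in range(0, 5)}
```

Even on this coarse grid the count increases quickly with each additional threshold, and relaxing the minimum width enlarges the search space further, which is consistent with the practical motivation given for the limit.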
Abbreviations
ANN: Artificial neural network
ApEn: Approximate entropy
AR: Autoregressive
CWD: Choi-Williams distribution
DWT: Discrete wavelet transform
EEG: Electroencephalogram
fApEn: Fuzzy approximate entropy
FFT: Fast Fourier transform
FIR: Finite impulse response
ICA: Independent component analysis
KNN: k-nearest neighbor
LDA: Linear discriminant analysis
LVQ: Learning vector quantization
ME: Mixture of experts
MLP: Multilayer perceptron neural network
OELM: Optimized extreme learning machine
PCA: Principal component analysis
PSD: Power spectrum density
QMF: Quadrature mirror filters
RME: Normalized Renyi marginal entropy
SampEn: Sample entropy
SEn: Spectral entropy
SP: Spectrogram
SPWVD: Smoothed pseudo Wigner-Ville distribution
STFT: Short-time Fourier transform
SVM: Support vector machines
TFD: Time-frequency distributions
WPD: Wavelet packet decomposition
WT: Wavelet transform
References
 1.
D. Hirtz, D.J. Thurman, K. Gwinn-Hardy, M. Mohamed, A.R. Chaudhuri, R. Zalutsky, How common are the “common” neurologic disorders? Neurology 68(5), 326–337 (2007). https://doi.org/10.1212/01.wnl.0000252807.38124.a3
 2.
S.F. Robert, W.E. Boas, W. Blume, C. Elger, P. Genton, P.L.J. Engel, Epileptic seizures and epilepsy: Definitions proposed by the International League Against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE). Epilepsia 46(4), 470–472 (2005). https://doi.org/10.1111/j.0013-9580.2005.66104.x
 3.
A. Subasi, EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Syst. Appl. 32(4), 1084–1093 (2007). https://doi.org/10.1016/j.eswa.2006.02.005
 4.
L. Guo, D. Rivero, J. Dorado, J.R. Rabunal, A. Pazos, Automatic epileptic seizure detection in EEGs based on line length feature and artificial neural networks. J Neurosci Methods 191(1), 101–109 (2010). https://doi.org/10.1016/j.jneumeth.2010.05.020
 5.
H. Ocak, Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy. Expert Syst. Appl. 36(2), 2027–2036 (2009). https://doi.org/10.1016/j.eswa.2007.12.065
 6.
Y. Kumar, M.L. Dewal, R.S. Anand, Epileptic seizures detection in EEG using DWT-based ApEn and artificial neural network. SIViP 8(7), 1323–1334 (2014). https://doi.org/10.1007/s11760-012-0362-9
 7.
Y. Kumar, M.L. Dewal, R.S. Anand, Epileptic seizure detection using DWT based fuzzy approximate entropy and support vector machine. Neurocomputing 133, 271–279 (2014). https://doi.org/10.1016/j.neucom.2013.11.009
 8.
A. Subasi, M.I. Gursoy, EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Syst. Appl. 37(12), 8659–8666 (2010). https://doi.org/10.1016/j.eswa.2010.06.065
 9.
L. Guo, D. Rivero, J. Dorado, C.R. Munteanu, A. Pazos, Automatic feature extraction using genetic programming: An application to epileptic EEG classification. Expert Syst. Appl. 38(8), 10425–10436 (2011). https://doi.org/10.1016/j.eswa.2011.02.118
 10.
A.M. Murugavel, S. Ramakrishnan, An optimized extreme learning machine for epileptic seizure detection. IAENG Int J Comput Sci 41(4), 212–221 (2014)
 11.
H. Adeli, S. Ghosh-Dastidar, N. Dadmehr, A wavelet-chaos methodology for analysis of EEGs and EEG subbands to detect seizure and epilepsy. IEEE Trans Biomed Eng 54(2), 205–211 (2007). https://doi.org/10.1109/TBME.2006.886855
 12.
S. Ghosh-Dastidar, H. Adeli, N. Dadmehr, Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection. IEEE Trans Biomed Eng 54(9), 1545–1551 (2007). https://doi.org/10.1109/TBME.2007.891945
 13.
S.R. Mousavi, M. Niknazar, B.V. Vahdat, Epileptic seizure detection using AR model on EEG signals. Cairo International Biomedical Engineering Conference 2008 (CIBEC 2008) (IEEE, Cairo), p. 2008
 14.
Y. Wang, Z. Li, L. Feng, C. Wang, W. Jing, Y. Zhang, Hardware design of seizure detection based on wavelet transform and sample entropy. J Circuits Syst Comp 25(9), 1650101 (2016). https://doi.org/10.1142/S0218126616501012
 15.
H. Ocak, Optimal classification of epileptic seizures in EEG using wavelet analysis and genetic algorithm. Signal Process. 88(7), 1858–1867 (2008). https://doi.org/10.1016/j.sigpro.2008.01.026
 16.
P. Swami, A.K. Godiyal, J. Santhosh, B.K. Panigrahi, M. Bhatia, S. Anand, Robust expert system design for automated detection of epileptic seizures using SVM classifier. International Conference on Parallel, Distributed and Grid Computing (PDGC 2014). IEEE (2014). https://doi.org/10.1109/PDGC.2014.7030745
 17.
A.T. Tzallas, M.G. Tsipouras, D.I. Fotiadis, Automatic seizure detection based on time-frequency analysis and artificial neural networks. Comput Intell Neurosci, 80510 (2007). https://doi.org/10.1155/2007/80510
 18.
A.T. Tzallas, M.G. Tsipouras, D.I. Fotiadis, Epileptic seizure detection in EEGs using time–frequency analysis. IEEE Trans Inf Technol Biomed 13(5), 703–710 (2009). https://doi.org/10.1109/TITB.2009.2017939
 19.
S.F. Liang, H.C. Wang, W.L. Chang, Combination of EEG complexity and spectral analysis for epilepsy diagnosis and seizure detection. EURASIP J Adv Signal Process 1, 853434 (2010). https://doi.org/10.1155/2010/853434
 20.
A. Ridouh, D. Boutana, S. Bourennane, EEG signals classification based on time frequency analysis. J Circuits Syst Comp 26(12), 1750198 (2017). https://doi.org/10.1142/S0218126617501985
 21.
N.E. Crone, A. Korzeniewska, P.J. Franaszczuk, Cortical gamma responses: Searching high and low. Int. J. Psychophysiol. 79(1), 9–15 (2011). https://doi.org/10.1016/j.ijpsycho.2010.10.013
 22.
R.G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, C.E. Elger, Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Phys. Rev. E 64, 061907 (2001)
 23.
S.G. Mallat, A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7), 674–693 (1989). https://doi.org/10.1109/34.192463
 24.
S.G. Mallat, A wavelet tour of signal processing (Academic press, 1999)
 25.
L. Breiman, Random Forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/A:1010933404324
 26.
U.R. Acharya, S.V. Sree, S. Chattopadhyay, W. Yu, P.C.A. Ang, Application of recurrence quantification analysis for the automated identification of epileptic EEG signals. Int J of Neural Syst 21(3), 199–211 (2011). https://doi.org/10.1142/S0129065711002808
 27.
U. Orhan, M. Hekim, M. Ozer, EEG signals classification using the K-means clustering and a multilayer perceptron neural network model. Expert Syst. Appl. 38, 13475–13481 (2011). https://doi.org/10.1016/j.eswa.2011.04.149
 28.
U.R. Acharya, F. Molinari, S.V. Sree, S. Chattopadhyay, K.H. Ng, J.S. Suri, Automated diagnosis of epileptic EEG using entropies. Biomed Signal Process Control 7, 401–408 (2012). https://doi.org/10.1016/j.bspc.2011.07.007
 29.
M. Peker, B. Sen, D. Delen, A novel method for automated diagnosis of epilepsy using complex-valued classifiers. IEEE J Biomed Health Inform 20(1), 108–118 (2016). https://doi.org/10.1109/JBHI.2014.2387795
 30.
A.K. Tiwari, R.B. Pachori, V. Kanhangad, B. Panigrahi, Automated diagnosis of epilepsy using keypoint based local binary pattern of EEG signals. IEEE J Biomed Health Inform 21(4), 888–896 (2017). https://doi.org/10.1109/JBHI.2016.2589971
 31.
A. Bhattacharyya, R.B. Pachori, A. Upadhyay, U.R. Acharya, Tunable-Q wavelet transform based multiscale entropy measure for automated classification of epileptic EEG signals. Appl. Sci. 7, 385 (2017). https://doi.org/10.3390/app7040385
 32.
I. Guler, E.D. Ubeyli, Adaptive neuro-fuzzy inference system for classification of EEG signals using wavelet coefficients. J Neurosci Methods 148(2), 113–121 (2005). https://doi.org/10.1016/j.jneumeth.2005.04.013
 33.
E.D. Ubeyli, I. Guler, Features extracted by eigenvector methods for detecting variability of EEG signals. Pattern Recogn. Lett. 28(5), 592–603 (2007). https://doi.org/10.1016/j.patrec.2006.10.004
 34.
N. Nicolaou, J. Georgiou, Detection of epileptic electroencephalogram based on permutation entropy and support vector machines. Expert Syst. Appl. 39(1), 202–209 (2012). https://doi.org/10.1016/j.eswa.2011.07.008
 35.
N.S. Tawfik, S.M. Youssef, M. Kholief, A hybrid automated detection of epileptic seizures in EEG records. Comput Electr Eng 53, 177–190 (2016). https://doi.org/10.1016/j.compeleceng.2015.09.001
 36.
P. Fergus, A. Hussain, D. Hignett, D. Al-Jumeily, K. Abdel-Aziz, H. Hamdan, A machine learning system for automated whole-brain seizure detection. Appl Comput Inform 12(1), 70–89 (2016). https://doi.org/10.1016/j.aci.2015.01.001
 37.
U.R. Acharya, S.L. Oh, Y. Hagiwara, J.H. Tan, H. Adeli, Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med. 100(1), 270–278 (2018). https://doi.org/10.1016/j.compbiomed.2017.09.017
 38.
P. Thodoroff, J. Pineau, A. Lim, Learning robust features using deep learning for automatic seizure detection. arXiv:1608.00220 (2016)
 39.
O. Faust, Y. Hagiwara, T.J. Hong, O.S. Lih, U.R. Acharya, Deep learning for healthcare applications based on physiological signals: A review. Comput. Methods Prog. Biomed. 161 (2018). https://doi.org/10.1016/j.cmpb.2018.04.005
Funding
This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE (project code: T1EDK01958).
Availability of data and materials
All data used in this manuscript are publicly available in [22].
Author information
Contributions
Markos G. Tsipouras is the sole author. The author read and approved the final manuscript.
Corresponding author
Correspondence to Markos G. Tsipouras.
Ethics declarations
Author’s information
MGT was born in Athens, Greece, in 1977. He received the diploma degree in Computer Science from the University of Ioannina, Greece, in 1999, and the M.Sc. and Ph.D. degrees in computer science, in 2002 and 2008 respectively, from the same department. He also received a Natural Sciences diploma from the Hellenic Open University in 2013. He has participated in more than 15 European and National Research & Development Projects as a researcher/developer. He has published more than 40 papers in peer-reviewed scientific journals and more than 60 articles in peer-reviewed conference proceedings. He has also published 7 book chapters and co-authored one book. His research interests include digital signal and image processing, medical informatics, artificial intelligence, fuzzy logic, data mining, decision support systems and expert systems.
Competing interests
The author declares that he has no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Keywords
 EEG signal processing
 EEG spectral analysis
 EEG frequency subbands
 Epilepsy