Optimization of Weighting Factors for Multiple Window Spectrogram of Event-Related Potentials

EURASIP Journal on Advances in Signal Processing 2010, 2010:391798

https://doi.org/10.1155/2010/391798

Received: 22 December 2009

Accepted: 14 May 2010

Published: 10 June 2010

Abstract

This paper concerns the mean square error optimal weighting factors for multiple window spectrogram of different stationary and nonstationary processes. It is well known that the choice of multiple windows is important, but here we show that the weighting of the different multiple window spectrograms in the final average is as important to consider and that the equally averaged spectrogram is not mean square error optimal for non-stationary processes. The cost function for optimization is the normalized mean square error where the normalization factor is the multiple window spectrogram. This means that the unknown weighting factors will be present in the numerator as well as in the denominator. A quasi-Newton algorithm is used for the optimization. The optimization is compared for a number of well-known sets of multiple windows and common weighting factors and the results show that the number and the shape of the windows are important for a small mean square error. Multiple window spectrograms using these optimal weighting factors, from ElectroEncephaloGram data including steady-state visual evoked potentials, are shown as examples.

1. Introduction

Estimation and detection of frequency changes of shorter or longer duration in the ElectroEncephaloGram (EEG) connected to stimuli, for example, evoked or induced potentials, are often of great interest. To distinguish statistically between responses to different types of stimuli, it is important to choose a spectral estimator with small bias and low variance.

The idea of multiple windows, or multitapering, was introduced by Thomson [1], and in recent decades the Thomson method has been used in many different application areas. It has been shown to outperform the Welch method [2] in terms of leakage, resolution, and variance for a stationary, spectrally smooth process [3]. For nonsmooth spectra, however, the performance of the Thomson method degrades due to cross-correlation between subspectra [4]. Other appropriate choices are then, for example, [5–7]. A comparison of Hermite and Slepian functions (the Thomson method) has shown that in the case of time-varying signals and spectrogram estimation, Hermite functions are a better choice [8].

The choice of windows has been studied in the literature, but how to weight the different multiple window spectrograms in the final average has received much less attention. In [9], the weighting factors are optimized for the Peak-Matched Multiple Windows [6]. A criterion is used where normalized bias, variance, and mean square error are optimized for the predefined peaked spectrum. In the nonstationary case, different approaches to approximate a time-varying spectrum with a few windowed spectrograms have been taken, for example, [10–13].

In this paper we compare the Hermite functions, the Thomson windows, the Peak-Matched Multiple Windows, and the Welch windows and evaluate the performance with optimal weighting factors for different processes. The cost function for optimization is the normalized mean square error where the normalization factor is the multiple window spectrogram. This means that the unknown weighting factors will be present in the numerator as well as in the denominator. A quasi-Newton algorithm is used for the optimization. We compare the results with those of the usual equally weighted multiple window spectrogram and of a multiple window spectrogram adjusted by an optimal scaling factor. Preliminary results have been presented in [14]. A nonstationary process model, which could suit, for example, induced responses in the EEG, is studied. We illustrate the weighted multiple window spectrogram estimates by showing examples of steady-state visual evoked potential (SSVEP).

The paper is organized as follows. Section 2 presents the optimized weighting factors and in Section 3 the evaluation for different stationary and nonstationary processes is presented. In Section 4 examples of estimation of SSVEP are shown. Section 5 concludes the paper.

2. Optimization of Weighting Factors

The Multiple Window Spectrogram of the zero mean real-valued random process , is defined by
(1)

for and , where the assumption is that the data is stationary for the samples . Equation (1) is a weighted sum of spectrograms obtained by using the data windows , and the weighting factors , . The parameter is the step size and the number of values in the DFT.
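The weighted sum of spectrograms described above can be sketched as follows. This is a minimal illustration assuming NumPy; the function name, the argument names, and the frame indexing (frames start at multiples of the step size) are choices made for this sketch and are not the notation of (1).

```python
import numpy as np

def multiple_window_spectrogram(x, windows, weights, step, n_fft):
    """Weighted sum of windowed spectrograms (illustrative sketch).

    x       : 1-D real data sequence
    windows : array of shape (K, N), one data window per row
    weights : array of shape (K,), one weighting factor per window
    step    : step size between successive frames, in samples
    n_fft   : number of DFT points (at least N)
    Returns an array of shape (n_frames, n_fft // 2 + 1).
    """
    x = np.asarray(x, dtype=float)
    windows = np.atleast_2d(windows)
    weights = np.asarray(weights, dtype=float)
    n_win = windows.shape[1]
    starts = np.arange(0, len(x) - n_win + 1, step)
    # frames[l, n] = x[starts[l] + n]
    frames = x[starts[:, None] + np.arange(n_win)[None, :]]
    # tapered[k, l, n] = windows[k, n] * frames[l, n]
    tapered = windows[:, None, :] * frames[None, :, :]
    # One spectrogram per window: shape (K, n_frames, n_freqs)
    sub = np.abs(np.fft.rfft(tapered, n=n_fft, axis=-1)) ** 2
    # Weighted sum over the window index
    return np.tensordot(weights, sub, axes=(0, 0))
```

With a single window (K = 1 and weight one) the function reduces to an ordinary spectrogram, which makes the variance reduction from averaging several windows easy to inspect on simulated data.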

With only one window, , the spectrogram has too large variance to be useful in the analysis of a stochastic process, as the variance is approximately the squared Wigner spectrum .

2.1. Mean Square Error Optimization

The mean square error (MSE) is a natural choice of optimization since it includes both variance and squared bias. Optimizing the MSE for a model where the power varies with time, which might be the case for nonstationary processes, focuses too much on high-power parts of the process. To avoid this, the optimization can be done normalizing with the true Wigner spectrum at each time and frequency value. However, this might give a strange result if the Wigner estimate is biased. Therefore, we consider the normalized MSE (nMSE) where the expectation of the multiple window spectrogram is used for the normalization at each time and frequency value.

The nMSE, which is computed in the time interval and in the frequency interval , is the average of a number of time-frequency values, giving the cost function:
(2)
where the MSE for each time and frequency value is defined as
$$\mathrm{MSE}(t,f) = \mathrm{Variance}(t,f) + \mathrm{Bias}^{2}(t,f). \quad (3)$$
The variance is
(4)
where the covariance matrix with and and the superscript denotes conjugate transpose, according to [4]. Reduction of the variance is established if the correlation between the windowed periodogram (subspectra),
(5)

from the windows and , , is small for all frequency values .

The bias is
(6)

where is the known Wigner spectrum of the model. The optimization cost function of (2) includes the expressions of (4) and (6) where , are known windows and is the time-variable nonstationary covariance matrix. The unknown variables are , which appear both in the numerator and the denominator of (2). The minimization of the criterion is therefore done iteratively with a quasi-Newton algorithm [15]. The criterion and its derivative are used in the algorithm. The algorithm is described in [9]. Using these weights in the multiple window spectrogram is referred to as optimal weights (OPTWEI).
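As a hedged illustration of how such a cost function can be evaluated from a known covariance matrix, the sketch below computes the mean and covariance of the windowed periodograms for a zero-mean real Gaussian process (using the fourth-order moment property of Gaussian variables, in the spirit of [4]) and combines them into a normalized MSE. The function names are hypothetical, and dividing by the squared expectation of the multiple window spectrogram is an assumption of this sketch, since the exact form of (2) is not reproduced here.

```python
import numpy as np

def spectrogram_moments(R, windows, f):
    """Mean vector and covariance matrix of the K windowed periodograms
    at normalized frequency f, for a zero-mean real Gaussian process
    with N x N covariance matrix R over the analysis frame."""
    K, N = windows.shape
    a = windows * np.exp(-2j * np.pi * f * np.arange(N))  # modulated windows
    P = a @ R @ a.conj().T          # P[k, j] = E[Y_k conj(Y_j)]
    Q = a @ R @ a.T                 # Q[k, j] = E[Y_k Y_j]
    mean = np.real(np.diag(P))      # E[|Y_k|^2]
    cov = np.abs(P) ** 2 + np.abs(Q) ** 2   # cov(|Y_k|^2, |Y_j|^2)
    return mean, cov

def nmse(weights, moments, true_spectrum):
    """Average normalized MSE over a list of time-frequency points.
    moments is a list of (mean, cov) pairs, one per point."""
    total = 0.0
    for (mean, cov), s in zip(moments, true_spectrum):
        expected = weights @ mean            # expectation of the estimate
        variance = weights @ cov @ weights   # variance of the estimate
        bias = expected - s
        total += (variance + bias ** 2) / expected ** 2
    return total / len(moments)
```

Because the weights enter both the numerator and the denominator, the cost is not quadratic in the weights, which is why an iterative minimizer is needed.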

2.2. Averaging and Scale Optimization

Usually, the spectrograms from different windows are equally weighted and averaged in the final estimate, that is, each of the $K$ weighting factors is
$$\alpha_k = \frac{1}{K}, \quad k = 1, \ldots, K. \quad (7)$$

Using equal weights according to (7) is referred to as equal weights (EQWEI).

The mean square error could be optimized according to the nMSE criterion, using equal weights scaled with a constant factor, that is,
(8)
where a closed form expression for the factor is found from
(9)

The weighting factors are referred to as scaled weights (SCWEI).
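To illustrate why a closed-form scale factor can exist, the sketch below works under the assumption that the nMSE divides the MSE by the squared expectation of the estimate at each time-frequency point. Scaling the equal weights by a constant c then leaves the normalized variance term unchanged, and only the normalized squared bias depends on c. Minimizing over 1/c gives c = sum(r^2) / sum(r), where r is the ratio of the true spectrum to the expectation of the equally weighted estimate. This derivation and the names below are assumptions of the sketch and are not claimed to reproduce (9).

```python
import numpy as np

def optimal_scale(expected_eq, true_spectrum):
    """Scale factor minimizing the assumed nMSE for scaled equal weights."""
    r = np.asarray(true_spectrum, float) / np.asarray(expected_eq, float)
    return np.sum(r ** 2) / np.sum(r)

def nmse_scaled(c, expected_eq, variance_eq, true_spectrum):
    """Assumed nMSE when the equal weights are multiplied by c."""
    expected = c * np.asarray(expected_eq, float)
    variance = c ** 2 * np.asarray(variance_eq, float)
    s = np.asarray(true_spectrum, float)
    return np.mean((variance + (expected - s) ** 2) / expected ** 2)

# Hypothetical moments of an equally weighted estimate at three points
expected_eq = np.array([0.5, 0.8, 0.6])
variance_eq = np.array([0.05, 0.10, 0.07])
true_spectrum = np.array([1.0, 1.0, 0.9])

c_opt = optimal_scale(expected_eq, true_spectrum)
print(c_opt, nmse_scaled(c_opt, expected_eq, variance_eq, true_spectrum))
```

When the equally weighted estimate is unbiased at every point, r equals one everywhere and the sketch returns c = 1, so scaling brings no improvement in that case.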

3. Results

3.1. Bandlimited White Noise Process

The evaluation is done for different stationary and nonstationary processes. The bandlimited white noise process with the covariance function
(10)
generates a Toeplitz covariance matrix , which is shown in Figure 1(a) for , (Case ). The locally stationary process approach [16, 17], where the covariance function of a nonstationary process is defined by
(11)
gives a time-variable bandlimited spectrum where the time-variable power of the bandlimited white-noise process changes with a Gaussian envelope. Two examples are seen in Figure 1(b) as a long-event nonstationary process ( ) (Case ) and in Figure 1(c) as a short-event nonstationary process ( ) (Case ).
Figure 1

The three different test covariance matrices for bandlimited white noise processes: (a) stationary process, (b) long-event nonstationary process, and (c) short-event nonstationary process.
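Test covariance matrices of this kind can be generated along the following lines. The sketch assumes a flat unit-level spectrum on |f| < B/2 for the stationary part and uses the locally stationary form of [16], in which the covariance is the product of a function of the mean time and a stationary covariance of the time difference. The parameter names, the Gaussian width parameter `sigma`, and the example values are hypothetical and are not the settings of the three cases studied here.

```python
import numpy as np

def bandlimited_cov(N, B):
    """Toeplitz covariance matrix of white noise bandlimited to |f| < B/2
    (unit spectral level assumed)."""
    n = np.arange(N)
    tau = n[:, None] - n[None, :]
    return B * np.sinc(B * tau)      # np.sinc(x) = sin(pi x) / (pi x)

def locally_stationary_cov(N, B, sigma):
    """R[s, t] = q((s + t) / 2) * r(s - t) with a Gaussian q centred
    in the frame; sigma controls the event duration in samples."""
    t = np.arange(N) - (N - 1) / 2
    mid = (t[:, None] + t[None, :]) / 2
    q = np.exp(-mid ** 2 / (2 * sigma ** 2))
    return q * bandlimited_cov(N, B)

R_stationary = bandlimited_cov(64, 0.1)
R_long = locally_stationary_cov(64, 0.1, sigma=20.0)    # longer event
R_short = locally_stationary_cov(64, 0.1, sigma=5.0)    # shorter event
```

The diagonal of each nonstationary matrix traces the Gaussian power envelope, so plotting the matrices gives pictures comparable in structure to a stationary, a long-event, and a short-event case.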

The weighting factors are optimized using four different sets of multiple windows. The Thomson multiple windows (TH) [1] give uncorrelated subspectra and thereby low variance for a stationary white noise process and the window functions are given by the eigenvectors of the ( ) Toeplitz covariance matrix with elements given by (10) with
(12)

where is the number of multiple windows in the set.
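A minimal sketch of windows obtained as eigenvectors of a bandlimited Toeplitz covariance matrix is given below, assuming NumPy and a flat spectrum on |f| < B/2. The function name and the example values of N, B, and K are illustrative and do not follow from (12).

```python
import numpy as np

def thomson_windows(N, B, K):
    """Return the K eigenvectors (rows) with the largest eigenvalues of the
    N x N Toeplitz matrix of bandlimited white noise, and the eigenvalues."""
    n = np.arange(N)
    R = B * np.sinc(B * (n[:, None] - n[None, :]))
    eigval, eigvec = np.linalg.eigh(R)        # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:K]      # indices of the K largest
    return eigvec[:, order].T, eigval[order]

windows, concentration = thomson_windows(N=64, B=0.125, K=4)
print(concentration)   # values close to one for well-concentrated windows
```

The eigenvectors are orthonormal, which is the property behind the uncorrelated subspectra for a stationary white noise process.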

The Peak-Matched (PM) multiple windows [6] are designed to give small correlation between subspectra when the spectrum of the stationary process includes peaks and notches. The windows are given by the solution of the generalized eigenvalue problem where the number of windows satisfies (12). Other parameters to be defined are the peak height chosen as  dB and the sidelobe suppression chosen as  dB [6]. The number of windows is and is related to the bandwidth and window length as in (12).

The Welch method (WO) [2] utilizes time-shifted equal windows. In this paper we use a Hanning window of appropriate length so that the number of windows, , is fitted into the total window length with 50% overlap.
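One way to fit a given number of time-shifted Hanning windows with 50% overlap into a total window length is sketched below. The rounding of the sub-window length and the unit-energy normalization are assumptions of this sketch.

```python
import numpy as np

def welch_windows(N, K):
    """K time-shifted Hanning windows with 50% overlap inside a frame of
    length N. Each row is one window, zero outside its support."""
    hop = N // (K + 1)          # shift between successive windows
    length = 2 * hop            # sub-window length, so the overlap is 50%
    base = np.hanning(length)
    base = base / np.linalg.norm(base)     # unit energy
    W = np.zeros((K, N))
    for k in range(K):
        W[k, k * hop : k * hop + length] = base
    return W

W = welch_windows(N=128, K=7)
```

All rows share the same frequency shape and differ only in where their power is centered in time.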

A set of Hermite functions (HE) is computed as
(13)

with for . The parameter is chosen so that the first Hermite function is approximately equal to the first Slepian function of the Thomson method in each case (similar approach as in [8]).
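A set of sampled Hermite functions can be computed with the standard three-term recursion for orthonormal Hermite functions, as in the sketch below. The `scale` argument is a hypothetical stand-in for the parameter that sets the width of the first function; matching it to the first Slepian function is not shown here.

```python
import numpy as np

def hermite_windows(N, K, scale):
    """First K Hermite functions sampled at N points centred in the frame.
    scale is the number of samples per unit of the time variable."""
    t = (np.arange(N) - (N - 1) / 2) / scale
    H = np.zeros((K, N))
    H[0] = np.pi ** -0.25 * np.exp(-t ** 2 / 2)
    if K > 1:
        H[1] = np.sqrt(2.0) * t * H[0]
    for k in range(2, K):
        # psi_k = sqrt(2/k) t psi_{k-1} - sqrt((k-1)/k) psi_{k-2}
        H[k] = np.sqrt(2.0 / k) * t * H[k - 1] - np.sqrt((k - 1) / k) * H[k - 2]
    # Normalize each sampled function to unit energy
    return H / np.linalg.norm(H, axis=1, keepdims=True)

H = hermite_windows(N=128, K=4, scale=10.0)
```

Provided the frame is long enough for the functions to decay at the edges, the sampled functions are orthonormal to a very good approximation.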

The number of windows is chosen as for all different methods and the window lengths are in all cases giving for the Thomson and Peak-Matched multiple windows. For Case (stationary process), the nMSE is computed and optimized only for the frequency, , that is, and . For the nonstationary cases we choose, and with , ( values) for Case and with , ( values) for Case These choices include the whole covariance matrix in each case and give a balance between different time and frequency values in the average.

The nMSE for the stationary process is shown in Figure 2(a) for the different multiple window sets, where the nMSE from EQWEI is shown with circles, SCWEI with pluses, and OPTWEI with stars. The Thomson windows and Hermite functions are optimal for the stationary bandlimited white-noise process using EQWEI, and the optimization of the weighting factors (SCWEI and OPTWEI) therefore does not improve the nMSE. The Peak-Matched multiple windows do not give a small error using EQWEI, but with SCWEI and also OPTWEI, the nMSE decreases. The overall smallest error, however, is given by the Thomson and Hermite multiple windows, as expected, since these two sets are optimal for a stationary bandlimited process. The Peak-Matched multiple windows and the Welch method are not able to reach the same nMSE even when the weighting factors are optimized.
Figure 2

The normalized mean square error for the three bandlimited white noise processes with EQWEI (circles), SCWEI (pluses), and OPTWEI (stars) for different window sets, Thomson multiple windows (TH), Peak-Matched multiple windows (PM), Welch method (WO), and Hermite functions (HE): (a) stationary process, (b) long-event nonstationary process, and (c) short-event nonstationary process.

For the long-event nonstationary process, Figure 2(b), the results show the importance of using SCWEI or OPTWEI instead of EQWEI in the nonstationary case. The difference between these two sets of weights is, however, not that large. It could also be noted that the Hermite functions perform slightly better than the Thomson multiple windows, which is in concordance with the study of nonstationary processes in [8]. For the short-event nonstationary process, Figure 2(c), using EQWEI gives a very large error. Using SCWEI and OPTWEI gives a much lower nMSE.

The weighting factors for OPTWEI are depicted in Figure 3 for the different window sets and the different cases. For the stationary process, the optimal weighting factors for the Thomson multiple windows (stars) are all equal. This is almost also the case for the Hermite functions (crosses), whereas the Peak-Matched multiple windows as well as the Welch method give more irregular weighting factors. Overall, however, the optimal weighting factors result in almost equally averaged spectra in all cases, which coincides with theory for stationary processes. Of more interest is the long-event nonstationary process in Figure 3(b), where both the Thomson and the Hermite windows now give weights where more power is given to the spectrogram from the first window function, with decreasing power to the following ones. A similar appearance is seen for the Welch method, where we should remember that all the windows have the same frequency shape but have their power centered at different time points. Most of the power is placed on the resulting spectrograms of the middle windows, which intuitively seems quite natural. For the short-event nonstationary process, the resulting weighting factors behave differently, see Figure 3(c), where the windows located at the end points of the time interval for the Welch method are now given most power. This shows the importance of considering the weighting factors in the estimation procedure. However, for the bandlimited white noise process, we should remember that using SCWEI in all cases gave almost as small an error as OPTWEI.
Figure 3

The optimized weighting factors for the bandlimited white noise process when the different multiple window sets are applied to the different cases, Thomson windows (stars), Peak-Matched multiple windows (pluses), Welch method (circles), Hermite functions (crosses): (a) stationary process, (b) long-event nonstationary process, and (c) short-event nonstationary process.

As the optimization is made using a quasi-Newton algorithm, we cannot be sure of convergence to the global minimum. To verify this, we optimize the weighting factors in all cases using 100 different initial sets of weighting factors. Each set of initial values is randomly picked from a uniform (rectangular) distribution with values between zero and one, and the set is normalized so that its sum is one. For all three bandlimited white noise processes and for all sets of windows, the optimization converged to the same minimum error, with identical sets of weighting factors, for all 100 initial sets.
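The multi-start check described above can be sketched as follows, assuming NumPy and SciPy. Everything in the model part is a stand-in chosen for illustration: the test process is amplitude-modulated bandlimited white noise (used because it is always a valid covariance), the windows are leading eigenvectors of the stationary matrix, the cost divides by the squared expectation, the target spectrum is taken as one at the frame centre, and SciPy's BFGS with numerical gradients replaces the algorithm of [9].

```python
import numpy as np
from scipy.optimize import minimize

def moments(R, windows, f):
    """Means and covariances of the windowed periodograms (real Gaussian data)."""
    a = windows * np.exp(-2j * np.pi * f * np.arange(windows.shape[1]))
    P = a @ R @ a.conj().T
    Q = a @ R @ a.T
    return np.real(np.diag(P)), np.abs(P) ** 2 + np.abs(Q) ** 2

# Hypothetical test model: amplitude-modulated bandlimited white noise
N, K, B = 32, 4, 0.25
n = np.arange(N)
r = B * np.sinc(B * (n[:, None] - n[None, :]))      # stationary part
g = np.exp(-(((n - (N - 1) / 2) / 6.0) ** 2))       # Gaussian amplitude envelope
R = g[:, None] * g[None, :] * r

# Windows: K leading eigenvectors of the stationary matrix
eigval, eigvec = np.linalg.eigh(r)
windows = eigvec[:, ::-1][:, :K].T

freqs = [0.0, 0.04, 0.08]           # evaluation frequencies inside the band
target = np.ones(len(freqs))        # assumed true spectrum at the frame centre
means, covs = map(np.array, zip(*(moments(R, windows, f) for f in freqs)))

def cost(w):
    """Assumed nMSE averaged over the evaluation frequencies."""
    expected = means @ w
    variance = np.einsum("k,pkj,j->p", w, covs, w)
    return np.mean((variance + (expected - target) ** 2) / expected ** 2)

# Random initial weights: uniform on [0, 1], normalized to sum to one
rng = np.random.default_rng(1)
starts = rng.uniform(0.0, 1.0, size=(20, K))
starts /= starts.sum(axis=1, keepdims=True)

results = [minimize(cost, w0, method="BFGS") for w0 in starts]
final = np.array([res.fun for res in results])
print("nMSE range over starts:", final.min(), final.max())
```

A narrow range of final nMSE values across the starts suggests that the runs reached the same minimum, while a run that ends far from the others is easy to spot.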

3.2. Bandlimited Peaked Spectrum Process

Instead of using a bandlimited white noise process, the stationary covariance function in (10) is replaced with the covariance function of a bandlimited peaked spectrum according to [6]:
(14)

In (14), is a peaked spectrum with , and  dB, where  dB. The nonstationary processes are found from (11) with replaced with .

The results from these processes are presented in Figure 4. For the stationary peaked spectrum process, Figure 4(a), the EQWEI of the Welch method happens to give the smallest error among the equally weighted estimates. Using SCWEI we can lower the nMSE for all methods, but using OPTWEI combined with the Peak-Matched multiple windows gives the smallest nMSE of all methods, which is in concordance with [6, 9], where these windows and optimized weighting factors are shown to be optimal for this process. In Figure 4(b), for the long-event nonstationary process, the benefit of using windows with properties suitable for the process becomes visible, as the smallest nMSE is given by the Peak-Matched multiple windows combined with OPTWEI. In this case, SCWEI is far from giving the same result. A similar result is seen for the short-event nonstationary process.
Figure 4

The normalized mean square error for the three bandlimited peaked spectrum processes with EQWEI (circles), SCWEI (pluses), and OPTWEI (stars) for different window sets, Thomson multiple windows (TH), Peak-Matched multiple windows (PM), Welch method (WO), and Hermite functions (HE): (a) stationary process, (b) long-event nonstationary process, and (c) short-event nonstationary process.

The different weighting factors are depicted in Figure 5. For the stationary case, we see in Figure 5(a) the characteristic weighting for the PM windows, which is determined by the eigenvalues of the eigenvalue problem that gives the Peak-Matched multiple windows; see [9]. Of more interest are the nonstationary processes. The optimal weightings of the Peak-Matched and Thomson multiple windows are similar, and they all give most power to the spectrogram from the first window and decreasing power to the following spectrograms. It is also worth noting that this power increases for the short-event process; see Figure 5(c). Most important, however, is how the weighting changes between the peaked spectrum process and the bandlimited white noise process, and also how the weighting changes with the nonstationarity of the process.
Figure 5

The optimized weighting factors for the bandlimited peaked spectrum process when the different multiple window sets are applied to the different cases, Thomson windows (stars), Peak-Matched multiple windows (pluses), Welch method (circles), and Hermite functions (crosses): (a) stationary process, (b) long-event nonstationary process, and (c) short-event nonstationary process.

The convergence of the optimization of the weighting factors in the three cases of bandlimited peaked spectrum processes is also investigated using the same randomly picked initial values as for the bandlimited white noise process and all different window sets. The results show that a minimum of 90% (usually around 95%) of the initial values converge to the global minimum giving the true optimal weighting factors. In the cases where the algorithm did not converge, the final error and the weighting factors were very far away from the true values and the divergence was easily discovered.

4. Real Data Examples

To show the performance for real data, sampled ElectroEncephaloGram (EEG) data were studied, where a flickering light (Grass Photic stimulator Model PS22C) was introduced at different time points. The light stimulation lasted approximately 1 s or 5 s. For a repetitive periodic visual stimulus, a steady-state visual evoked potential (SSVEP) arises in the EEG. We assume the short stimulation (approximately 1 s) to introduce a short-event nonstationary process and the long stimulation (approximately 5 s) to introduce a long-event nonstationary process in the measured EEG. The subject was supine with closed eyes on a bed in a silent laboratory where ambient light was dimmed. The flickering light, with set frequency and time interval, was flashed at the subject from a distance of approximately 1 m. Data were recorded using a Neuroscan system with a digital amplifier (SYNAMP 5080, Neuro Scan, Inc.). Amplifier band-pass settings were 0.3 and 50 Hz. The sample rate was 256 Hz which was downsampled to a sample rate of  Hz in Matlab. In all examples channel PZ is chosen.

We illustrate the performance of the methods with examples of four different data sets. Example 1 is given from flickering light of 12 Hz and Example 2 is given from flickering light of 15 Hz. For both these examples the flickering lasts between time points 5 and 10 s, and we assume these responses to be two long-event nonstationary processes. The third and fourth examples are given from flickering of 9 Hz between time points 10.4 and 12.2 s and time points 4.7 and 5.7 s, respectively. These two examples are assumed to be responses of short-event nonstationary processes.

We assume that we can model the different long-event nonstationary SSVEPs of Examples 1 and 2 as bandlimited peaked spectrum processes; see [18]. Logarithmic spectrograms are depicted in Figure 6, where we compare the spectrograms using the single Hanning window with different weightings of the Peak-Matched multiple windows using OPTWEI from Figure 5(b) and EQWEI/SCWEI. In all cases, the window length is . Note that the spectrogram using EQWEI is equal to SCWEI, as the difference is only a gain factor and the coloring is adjusted between the minimum and maximum value of each plot. We should also remember that the bandwidth of this estimator is  Hz, which is also clearly seen in the examples in Figures 6(b) and 6(e). The spread of the power caused by the large frequency bandwidth makes it difficult to know where the actual response frequency is located. Equal weighting of the multiple window spectrograms is not appropriate for data where it is important to locate the maximum power at a certain frequency. Along the time axis, however, we see that the resulting responses show up in the time interval where they should be located according to the stimuli given. The single window Hanning spectrograms are well resolved in frequency, but the variance is too large for them to be reliable, which is seen in Figures 6(a) and 6(d).
Figure 6

Logarithmic spectrogram of SSVEP of long-event nonstationary character, estimated using a single window spectrogram with a Hanning window and the Peak-Matched multiple windows with EQWEI/SCWEI and with OPTWEI from Figure 5(b). Horizontal and vertical lines indicate flickering light of (a), (b), (c) 12 Hz, 5–10 s, (d), (e), (f) 15 Hz, 5–10 s. (a) Ex. 1, Hanning (b) Ex. 1, EQWEI/SCWEI (c) Ex. 1, OPTWEI (d) Ex. 2, Hanning (e) Ex. 2, EQWEI/SCWEI (f) Ex. 2, OPTWEI

In Figure 7, we compare the spectrogram estimates using the single Hanning window with different weightings of the Peak-Matched multiple windows using OPTWEI from Figure 5(c) and EQWEI/SCWEI. The single Hanning spectrogram in Figures 7(a) and 7(d) is difficult to interpret, and the spectrograms using EQWEI/SCWEI give an estimate that is too wide in frequency; see Figures 7(b) and 7(e). Using OPTWEI, the short-event nonstationary processes in Figures 7(c) and 7(f) are located correctly in the time interval as well as at the appropriate frequency. The last case, however, Figure 7(f), has a large amount of power outside the time interval of stimuli, around 6-7 s. This is explained by the fact that the stimulus sequence also activated the subject, which raised the alpha activity; see [18].
Figure 7

Logarithmic spectrogram of SSVEP of short-event nonstationary character, estimated using a single window spectrogram with a Hanning window and the Peak-Matched multiple windows with EQWEI/SCWEI and with OPTWEI from Figure 5(c). Horizontal and vertical lines indicate flickering light of (a), (b), (c) 9 Hz, 10.4–12.2 s, (d), (e), (f) 9 Hz, 4.7–5.7 s. (a) Ex. 3, Hanning (b) Ex. 3, EQWEI/SCWEI (c) Ex. 3, OPTWEI (d) Ex. 4, Hanning (e) Ex. 4, EQWEI/SCWEI (f) Ex. 4, OPTWEI

An even better approach is to estimate a model covariance function, using, for example, many trials from the same experiment to obtain a robust estimate. From the properties of this modeled covariance function, an appropriate set of multiple windows can be chosen and the weighting factors can be nMSE optimized to estimate the single stimulus response.

5. Conclusion

We compare the Hermite functions, the Thomson windows, the Peak-Matched Multiple Windows, and the Welch windows and compute the performance with optimal weighting factors for different stationary and nonstationary processes. The cost function for optimization is the normalized mean square error where the normalization factor is the multiple window spectrogram. This means that the unknown weighting factors will be present in the numerator as well as in the denominator. A quasi-Newton algorithm is used for the optimization. The results show that the weighting factors, as well as the shape of the windows, are important factors for a small error. It is also shown that a scaling optimization of the usual averaging could give almost as small mean square error as an optimization of the individual weighting factors in case of a smooth spectrum. For a peaked spectrum, a significant reduction of the normalized mean square error is achieved using individual optimization of the weights.

Declarations

Acknowledgment

This paper is supported by the Swedish Research Council.

Authors’ Affiliations

(1)
Division of Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Lund, Sweden

References

  1. Thomson DJ: Spectrum estimation and harmonic analysis. Proceedings of the IEEE 1982, 70(9):1055-1096.
  2. Welch PD: The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Transactions on Audio Electroacoustics 1967, 15(2):70-73. 10.1109/TAU.1967.1161901
  3. Bronez TP: On the performance advantage of multitaper spectral analysis. IEEE Transactions on Signal Processing 1992, 40(12):2941-2946. 10.1109/78.175738
  4. Walden AT, McCoy E, Percival DB: Variance of multitaper spectrum estimates for real Gaussian processes. IEEE Transactions on Signal Processing 1994, 42(2):479-482. 10.1109/78.275635
  5. Riedel KS, Sidorenko A: Minimum bias multiple taper spectral estimation. IEEE Transactions on Signal Processing 1995, 43(1):188-195. 10.1109/78.365298
  6. Hansson M, Salomonsson G: A multiple window method for estimation of peaked spectra. IEEE Transactions on Signal Processing 1997, 45(3):778-781. 10.1109/78.558503
  7. Farhang-Boroujeny B: Prolate filters for nonadaptive multitaper spectral estimators with high spectral dynamic range. IEEE Signal Processing Letters 2008, 15:457-460.
  8. Bayram M, Baraniuk RG: Multiple window time-frequency analysis. Proceedings of IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis, June 1996, 173-176.
  9. Hansson M: Optimized weighted averaging of peak matched multiple window spectrum estimates. IEEE Transactions on Signal Processing 1999, 47(4):1141-1146. 10.1109/78.752613
  10. Çakrak F, Loughlin PJ: Multiple window time-varying spectral analysis. IEEE Transactions on Signal Processing 2001, 49(2):448-453. 10.1109/78.902129
  11. Xiao J, Flandrin P: Multitaper time-frequency reassignment for nonstationary spectrum estimation and chirp enhancement. IEEE Transactions on Signal Processing 2007, 55(6):2851-2860.
  12. Williams WJ, Aviyente S: Spectrogram decompositions of time-frequency distributions. Proceedings of the 6th International Symposium on Signal Processing and its Applications (ISSPA '01), 2001, 587-590.
  13. Scharf LL, Friedlander B: Toeplitz and Hankel kernels for estimating time-varying spectra of discrete-time random processes. IEEE Transactions on Signal Processing 2001, 49(1):179-189. 10.1109/78.890359
  14. Hansson-Sandsten M, Sandberg J: Optimization of weighting factors for multiple window time-frequency analysis. Proceedings of the European Signal Processing Conference (EUSIPCO '09), 2009, Glasgow, UK.
  15. Fletcher R: Practical Methods for Optimization. John Wiley & Sons, New York, NY, USA; 1987.
  16. Silverman RA: Locally stationary random processes. IRE Transactions on Information Theory 1957, 3:182-187. 10.1109/TIT.1957.1057413
  17. Wahlberg P, Hansson M: Kernels and multiple windows for estimation of the Wigner-Ville spectrum of Gaussian locally stationary processes. IEEE Transactions on Signal Processing 2007, 55(1):73-84.
  18. Hansson M, Lindgren M: Multiple-window spectrogram of peaks due to transients in the electroencephalogram. IEEE Transactions on Biomedical Engineering 2001, 48(3):284-293. 10.1109/10.914791

Copyright

© M. Hansson-Sandsten and J. Sandberg. 2010

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.