EURASIP Journal on Applied Signal Processing 2003:7, 713–729
© 2003 Hindawi Publishing Corporation

Joint Time-Frequency-Space Classification of EEG in a Brain-Computer Interface Application

Brain-computer interface is a growing field of interest in human-computer interaction with diverse applications ranging from medicine to entertainment. In this paper, we present a system which allows for classification of mental tasks based on a joint time-frequency-space decorrelation, in which mental tasks are measured via electroencephalogram (EEG) signals. The efficiency of this approach was evaluated by means of real-time experiments on two subjects performing three different mental tasks. To do so, a number of protocols for visualization, as well as training with and without feedback, were also developed. The results show that good classification of simple mental tasks, with a view to command and control, can be obtained after a relatively small amount of training, with accuracies around 80%, and in real time.


INTRODUCTION
Research on human-computer interfaces (HCIs) for disabled people has led to the so-called brain-computer interface (BCI) systems, which use brain activity for communication purposes. When the brain activity is monitored through electroencephalogram (EEG) measurements, one has an EEG-based BCI, henceforth simply called BCI.
Current BCIs use the following noninvasive EEG signals.
(i) Event-related potentials (ERPs), which appear in response to some specific stimulus. ERPs can provide control when the BCI produces the appropriate stimuli. The advantage of an ERP-based BCI is that little training is necessary for a new subject to gain control of the system. The disadvantage is that the subject must wait for the relevant stimulus presentation [1].
(ii) Steady-state visual-evoked responses (SSVERs), which are elicited by a visual stimulus that is modulated at a fixed frequency. The SSVER is characterized by an increase in EEG activity at the stimulus frequency. With biofeedback training, subjects learn to voluntarily control their SSVER amplitude. Changes in the SSVER result in control actions occurring at fixed intervals of time [2].
(iii) Slow cortical potential shifts (SCPSs), which are shifts of cortical voltage lasting from a few hundred milliseconds up to several seconds. Subjects can learn to produce slow cortical amplitude shifts in an electrically positive or negative direction for binary control. This skill can be acquired if the subjects are provided with feedback on the course of their SCP and if they are positively reinforced for correct responses [3].
(iv) Spontaneous signals (SSs), which are recorded in the course of ordinary brain activity. These signals are spontaneous in the sense that they do not constitute the responses to a particular stimulus.
A BCI based on SSs generates a control signal at given intervals of time based on the classification of EEG patterns resulting from a particular mental activity (MA) [4,5]. The development in BCI research was mainly motivated by the hope that it could serve as an augmentative communication option for people with motor disabilities [6]. However, efficient BCIs can serve as additional command and control means when the hands are used for other tasks, as in the case of pilots. The application that motivated our research was the design of an immersive environment where people could interact, between themselves and the environment, by simply thinking.
The achievement of a successful BCI system depends on system design factors (classification algorithm, communication bit rate, and feedback strategy) as well as on subject motivation.
There is a subject dependency because the subject should learn how to control his EEG in order to interact with the system. Human factors such as fatigue, stress, or boredom are of great influence; one of the first questions when designing a BCI should be how to motivate the subject.
In this paper, we present a time-frequency SS-based BCI. We designed five operational modes (OMs) going from the simple real-time visualization of EEG in a 3D environment to object control. In this way, the subject can become familiar with the system and get motivated because of the 3D environment where the interaction takes place.

GENERAL CONCEPTS
A BCI can be defined as a communication system that involves two entities: a human subject and a machine. The subject communicates by producing EEG and the machine responds with "actions." In this research, the machine is a computer and the computer actions are dynamic multimedia signals (3D scenes, images, videos, or sounds).
The subject performs MAs to control the computer actions. These MAs are characterized by the presence of patterns in recorded EEG signals.
The correspondence between EEG patterns and computer actions constitutes a machine-learning problem since the computer should learn how to recognize a given EEG pattern. In order to solve this problem, a training phase is necessary, in which the subject is asked to perform MAs and a computer algorithm is in charge of extracting the EEG patterns characterizing them.
When the training phase is finished, the subject can start to control the computer actions with his thoughts. This is the application phase and constitutes the ultimate goal of our research.

EEG acquisition
EEG signals are measured at the scalp by affixing an array of electrodes according to the 10-20 international system (Figure 1) and with reference to digitally linked ears (DLE). DLE voltages are obtained by using the average of voltages at both earlobes as reference. The earlobes are selected because they constitute an almost quiet reference. In fact, they present small influences due to temporal activity [7].
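If we denote by $V_e$ the voltage at any of the electrodes, and by $V_{A1}$ and $V_{A2}$ the voltages at the left earlobe and right earlobe, respectively, then the DLE-referenced voltage of electrode $e$ is

\[ V_e^{\mathrm{DLE}} = (V_e - V_{A1}) - \tfrac{1}{2}\,(V_{A2} - V_{A1}) \tag{1} \]

when $V_{A1}$ is the physical reference and

\[ V_e^{\mathrm{DLE}} = (V_e - V_{A2}) - \tfrac{1}{2}\,(V_{A1} - V_{A2}) \tag{2} \]

when $V_{A2}$ is the physical reference. Both expressions equal $V_e - (V_{A1} + V_{A2})/2$, that is, the electrode voltage measured against the average of the two earlobes. An EEG signal is thus composed of the DLE signals of each electrode. When a measure is composed of such single composite measures, it is called multivariate [8].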
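The re-referencing step can be illustrated with a minimal Python sketch. The function name `dle_reference` and the numeric potentials are hypothetical and serve only to check that both physical references lead to the same DLE voltage.

```python
def dle_reference(v_e_ref, v_other_ear_ref):
    """Re-reference one electrode sample to digitally linked ears.

    v_e_ref:         electrode voltage measured against the physical reference earlobe
    v_other_ear_ref: other earlobe's voltage measured against the same reference
    """
    # Subtracting half the ear-to-ear difference moves the reference
    # to the average of the two earlobe potentials.
    return v_e_ref - 0.5 * v_other_ear_ref


# Hypothetical absolute potentials (microvolts), used only to check the identity.
v_e, v_a1, v_a2 = 12.0, 2.0, 4.0
with_a1_ref = dle_reference(v_e - v_a1, v_a2 - v_a1)  # left earlobe is the reference
with_a2_ref = dle_reference(v_e - v_a2, v_a1 - v_a2)  # right earlobe is the reference
expected = v_e - 0.5 * (v_a1 + v_a2)
```

Whichever earlobe serves as the physical reference, the result is the electrode voltage relative to the mean earlobe potential.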

Training phase
The objective of this phase is two-fold: to extract EEG patterns that uniquely characterize MAs, and to train the subject. The results of this phase are MA models that will serve as references for the application phase.
This phase can be performed with two approaches, namely, training without feedback and training with feedback.
In the case of training without feedback, the subject is asked to perform MAs during a given period of time (with repetitions if necessary) while his EEG signals are recorded for subsequent MA model construction.
In the case of training with feedback, cue information is provided to the subject that tells him if his EEG pattern was successfully identified (positive feedback) or not (negative feedback). According to neuroscience results [9,10], the human brain is able to modulate its activity in order to minimize the number of negative feedbacks. Training with feedback is possible only if an MA model exists, that is, the information of a previous training without feedback is available.

Application phase
The basic scheme of a BCI in its application phase is shown in Figure 2. The computer action at recognition time $t_k$ is generated by the classification of the EEG pattern present in the EEG signals ($S_k$) recorded during the $T$ seconds preceding the recognition time. In the sequel, we will call this EEG segment of duration $T$ a trial.
The time interval between two successive recognition times is denoted by $T_I$ (interaction period or computer actions period). The choice of $T_I$ and $T$ is the result of a tradeoff between computer actions rate, EEG pattern misclassification probability, and computational cost.
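The relation between trials and recognition times can be sketched as follows. This is an illustrative Python helper, not the system's code: the name `trial_windows` is hypothetical, and the example values (256 Hz sampling, $T = T_I = 0.5$ s) are those reported in the experimental section.

```python
def trial_windows(n_samples, fs, T, T_I):
    """Yield (start, stop) sample indices of the trial preceding each recognition time.

    fs is the sampling rate in Hz; T and T_I are in seconds.
    """
    length = int(round(T * fs))   # samples per trial
    step = int(round(T_I * fs))   # samples between recognition times
    stop = length                 # first recognition time: once T seconds are available
    while stop <= n_samples:
        yield stop - length, stop
        stop += step


# Two seconds of data at 256 Hz with T = T_I = 0.5 s give four adjacent trials.
windows = list(trial_windows(n_samples=512, fs=256, T=0.5, T_I=0.5))
```

With $T_I < T$ the trials overlap, which raises the action rate at a higher computational cost; with $T_I = T$ they tile the recording.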
As EEG signals are contaminated by noise, a preprocessing step is necessary. The trial $S_k$ is then passed through the preprocessing module, whose output is a clean trial $X_k$ or a special message if $S_k$ is too perturbed to be useful.
The pattern estimation module extracts the EEG patterns $F_k$ contained in $X_k$. The nature of $F_k$ is determined by the classification algorithm.
Finally, a classifier module decides which computer action to consider based on a distance measure between MA model representatives and the pattern $F_k$.

BCI-system modules
According to Figure 2, a BCI in its application phase is composed of the following modules: signal acquisition, preprocessing, pattern estimation, pattern classification, and computer actions generator.
In the training phase, the same modules are used plus an MA model builder. The role of the computer actions generator is however different here as it is used to display visual cues (indicating which MA to perform) and to provide feedback.
Since BCI technology is still in its experimental phase, these modules and their relationships should be as flexible as possible.

OMs of the BCI
Five OMs were implemented; they allow the subjects to perform various experiments from simple to more complex.

Visualization OM (VOM)
In this OM, the subject can watch a visual representation of his EEG in real time. Specific EEG features, such as the power values in the typical frequency bands (δ, θ, α, β), interelectrode coherences, and total power at a given electrode, are mapped to a 3D virtual environment and are regularly updated. The objectives of this OM are to familiarize the subject with the system as well as to calibrate the latter.

Training without feedback OM (NFOM)
In this OM, the subject is asked (by means of visual or audio cues) to perform a defined MA. The produced EEG is then recorded for offline MA model construction.

Training with feedback OM (FOM)
The subject is asked to perform an MA and a feedback is provided. This feedback is positive when the computer recognizes the MA and is negative otherwise. This is possible as MA models were calculated during a previous training without feedback. MA models can be updated in the course of a FOM (dynamic update) or at the end of it [11].

Control OM (COM)
Since the results of previous OMs are MA models, the subject can start to control the system by performing the MAs for which the system has been trained. In this OM, visual or sound cues are no longer necessary.

Multisubject simultaneous training OM (MUOM)
This is a particular form of the FOM. It consists in a multisubject game whose goal is to gain control of an object by performing an MA. This OM was chosen because of its more stimulating effect when compared to a simple feedback.

System architecture
We grouped the system modules listed before into three components: signal production, signal processing, and multimedia renderer.  We propose a distributed architecture in which each component offers specific services to the others in an efficient and transparent way. Figure 3 depicts the architecture diagram of our BCI system.
(1) The signal production component is responsible for signal acquisition, digitalization, and efficient data transmission through the network.
(2) The signal processing component is in charge of signal preprocessing, pattern extraction, MA model construction, and pattern classification.
(3) The rendering component is used to display multimedia cues in NFOM and FOM, as well as to provide the feedback for the FOM. Furthermore, it acts as renderer in the VOM, COM, and MUOM.
The communication rules between these components were designed over the CORBA specification [12], and implemented in JAVA (for networking) and C and Matlab (for processing).

EEG SIGNALS PREPROCESSING
The purpose of EEG signal preprocessing is to maximize the signal-to-noise ratio (SNR). Noise sources can be nonneural (eye movements, muscular activity, 50 Hz power-line noise) or neural (EEG features other than those used for control) [6].
In this research, we centered our analysis on nonneural noise such as eye-movement artefacts, muscular artefacts, and the 50 Hz power-line noise.
Since the frequencies of interest in EEG are mainly located below 40 Hz, we filtered the signals between 1 and 40 Hz. The 50 Hz power-line noise was therefore attenuated.
For eye-movement artefacts and muscular artefacts, we chose to reject a trial containing any of these artefacts and, consequently, such a trial could not generate any computer action.
In the case of muscular activity, one of the best approaches for detection consists in using independent component analysis (ICA) of EEG. However, ICA is basically an offline method since it is only meaningful when the amount of data is large enough [13].  A practical method for detecting muscular artefacts is based on the fact that these artefacts are characterized by high frequencies (above 20 Hz) and high amplitudes. In [14], muscle artefact detection is achieved by considering the absolute and relative power over 25 Hz. In this paper, we set a threshold on the power at this frequency band based on visual inspection and ICA during a calibration step.
For eye-movement artefacts detection, many methods have been proposed [15]. They are fundamentally offline because they are mainly oriented to clinical research.
We implemented a method based on the power at prefrontal electrodes (Fp1 and Fp2) because eye-movement artefacts are characterized by an abrupt change in amplitude mainly localized at Fp1 and Fp2 (Figure 4). The signal power at Fp1 and Fp2 is computed every half second and compared to the mean power of the preceding two seconds. If the current power exceeds this mean by more than some multiple of the standard deviation of the two-second power, the trial is marked as contaminated by an eye artefact and thus rejected. The threshold is determined in the calibration step.
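The eye-artefact test described above can be sketched in Python as follows. Several details are assumptions made for illustration: the two-second history is represented by its four half-second windows, power is taken as the mean squared amplitude, and the names `window_power`, `is_eye_artefact`, and `k` are hypothetical.

```python
from statistics import mean, pstdev


def window_power(samples):
    """Mean squared amplitude of one half-second window."""
    return sum(x * x for x in samples) / len(samples)


def is_eye_artefact(current, history, k):
    """Flag the current half-second window of a prefrontal channel.

    current: samples of the current half-second window
    history: the four preceding half-second windows (two seconds in total)
    k:       threshold multiple, fixed during calibration (hypothetical name)
    """
    past = [window_power(w) for w in history]
    # Reject when the power rise exceeds k standard deviations of the recent power.
    return window_power(current) - mean(past) > k * pstdev(past)


# Toy data: a quiet background followed by a large deflection.
background = [[1.0, -1.0], [1.1, -1.1], [0.9, -0.9], [1.0, -1.0]]
blink_flag = is_eye_artefact([5.0, -5.0], background, k=3.0)
quiet_flag = is_eye_artefact([1.0, -1.0], background, k=3.0)
```

In use, the test would be applied to both Fp1 and Fp2, and a trial would be rejected if either channel is flagged.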

EEG SIGNALS CLASSIFICATION
The classification of EEG signals based on the patterns characterizing the MAs constitutes a fundamental part of a BCI. As a matter of fact, the choice of the temporal parameters $T$ and $T_I$ is strongly dependent on the classification method.
An EEG signal is multivariate because it is composed of signals coming from several electrodes. In this paper, we propose a decomposition of the multivariate classification into univariate classifications. Figure 5 depicts the general scheme of our method.
In the following subsections, we first present the univariate classification algorithm and then the decomposition of the multivariate signals (MVSs) into univariate representative signals.

Univariate signal classification in the time-frequency domain
In this subsection, the objects to be classified are univariate signals (henceforth simply called signals).

Time-frequency representation
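Time-frequency representations (TFRs) of a signal can be divided into two groups according to the nature of their transformations: linear (short-time Fourier transform) and quadratic (based on the Wigner-Ville distribution). Here we focus on the quadratic representation. According to [16], all TFRs of a signal $s(t)$, here written $C(t, \omega)$, can be obtained from

\[ C(t, \omega) = \frac{1}{4\pi^2} \iiint s^*\!\left(u - \tfrac{\tau}{2}\right) s\!\left(u + \tfrac{\tau}{2}\right) \phi(\theta, \tau)\, e^{-j\theta t - j\tau\omega + j\theta u}\, du\, d\tau\, d\theta, \tag{3} \]

where $t$ is the time, $\omega$ is the frequency, $\tau$ is the time lag (usually called delay), $\theta$ is the frequency lag (usually called Doppler), and $\phi(\theta, \tau)$ is a two-dimensional function called the kernel.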
The choice of the kernel is guided by the desire to have a TFR satisfying some established properties with regard to the application. Here, we designed a kernel with the objective of efficient signal classification.
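There are several alternative ways of writing the general class of time-frequency distributions, some of which are more convenient for the classification application. One of them is the characteristic function (CF) formulation. We recall that the CF $M(\theta, \tau)$ is the double Fourier transform of the TFR, written $C(t, \omega)$ [16]:

\[ M(\theta, \tau) = \iint C(t, \omega)\, e^{j\theta t + j\tau\omega}\, dt\, d\omega. \tag{4} \]

Combining (3) and (4), we obtain

\[ M(\theta, \tau) = \phi(\theta, \tau)\, A(\theta, \tau), \tag{5} \]

where $A(\theta, \tau)$ is the symmetrical ambiguity function (AF) of $s(t)$, defined as

\[ A(\theta, \tau) = \int s^*\!\left(u - \tfrac{\tau}{2}\right) s\!\left(u + \tfrac{\tau}{2}\right) e^{j\theta u}\, du = \int \hat{s}^*\!\left(\omega + \tfrac{\theta}{2}\right) \hat{s}\!\left(\omega - \tfrac{\theta}{2}\right) e^{j\tau\omega}\, d\omega, \tag{6} \]

and $\hat{s}(\omega)$ is the Fourier transform of $s(t)$ (with the unitary normalization $1/\sqrt{2\pi}$). All integrals whose limits are not indicated span from −∞ to +∞.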
Equation (6) allows us to interpret the AF as a measure of the joint time-frequency auto-correlation of s(t). The θ − τ plane is commonly called ambiguity plane.
The kernel function that can be seen as a mask in the ambiguity plane has the goal of enhancing the regions in the plane θ−τ that better discriminate the signals to be classified.
In this research, we consider the classification problem with respect to the modulus of the CF. The kernel is then designed so as to enhance the regions where this modulus is more discriminative.
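For a sampled trial, the AF modulus on which the classification relies can be computed along the following lines. This is a minimal sketch under stated assumptions: the Doppler axis is discretized in steps of $2\pi/N$, the signal is taken as zero outside the trial, and the function name `ambiguity_modulus` is hypothetical. It uses the asymmetric lag product, which in the continuous case differs from the symmetrical AF only by the phase factor $e^{j\theta\tau/2}$ and therefore has the same modulus.

```python
import cmath


def ambiguity_modulus(s, max_doppler, max_delay):
    """Return |A(m, n)| for 0 <= m <= max_doppler and 0 <= n <= max_delay.

    Uses the asymmetric lag product conj(s[u]) * s[u + n]; only the modulus
    is returned, so the phase difference with the symmetric form is irrelevant.
    """
    N = len(s)
    table = []
    for m in range(max_doppler + 1):          # frequency-lag (Doppler) index
        row = []
        for n in range(max_delay + 1):        # time-lag (delay) index
            acc = 0j
            for u in range(N - n):            # samples where both factors exist
                acc += s[u].conjugate() * s[u + n] * cmath.exp(2j * cmath.pi * m * u / N)
            row.append(abs(acc))
        table.append(row)
    return table


# A constant signal concentrates its AF on the zero-Doppler axis.
af = ambiguity_modulus([1.0, 1.0, 1.0, 1.0], max_doppler=1, max_delay=1)
```

The triple loop is quadratic in the trial length for each Doppler index; an FFT over `u` would be the natural optimization for real-time use.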

Kernel design
Given a training set of labeled signals

\[ \Upsilon = \left\{ s^{q_k}_{w_k}(t) \;;\; 1 \le k \le W,\; 1 \le q_k \le Q_{w_k} \right\}, \]

where $W$ is the number of classes, $Q_{w_k}$ the number of labeled signals belonging to class $w_k$, and $s^{q_k}_{w_k}(t)$ the $q_k$th signal belonging to class $w_k$, we wish to determine a kernel function $\phi(\theta, \tau)$ so that we can compare the CF modulus of an unknown signal $s(t)$ to that of each class and assign $s(t)$ to its most likely class.
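We define the set $\vartheta(\Upsilon)$ as

[equation not recoverable from the source]

In order to detect the regions where the class differences are maximal, we define the contrast function $\Gamma(\theta, \tau)$ as

[equation not recoverable from the source]

where

\[ E\left[\,|A_{w_k}(\theta, \tau)|\,\right] = \frac{1}{Q_{w_k}} \sum_{q_k=1}^{Q_{w_k}} \left|A^{q_k}_{w_k}(\theta, \tau)\right| \]

is the mean AF modulus corresponding to class $w_k$, with $A^{q_k}_{w_k}(\theta, \tau)$ the AF of $s^{q_k}_{w_k}(t)$. The variance of the AF modulus corresponding to class $w_k$ is

\[ \mathrm{VAR}\left[\,|A_{w_k}(\theta, \tau)|\,\right] = \frac{1}{Q_{w_k}} \sum_{q_k=1}^{Q_{w_k}} \left( \left|A^{q_k}_{w_k}(\theta, \tau)\right| - E\left[\,|A_{w_k}(\theta, \tau)|\,\right] \right)^2. \]

The discrete version of the $\theta - \tau$ plane allows us to select the $\kappa$ points of maximum contrast.
We group these points in a max-contrast set $K$ defined as

\[ K = \left\{ (m_i \Delta\theta,\; n_i \Delta\tau) \;;\; 1 \le i \le \kappa \right\} \]

such that

\[ \Gamma(m_i \Delta\theta,\; n_i \Delta\tau) \ge \Gamma(m \Delta\theta,\; n \Delta\tau) \quad \text{for all } (m \Delta\theta,\; n \Delta\tau) \notin K, \]

where $\Delta\theta$ and $\Delta\tau$ are the discretization steps. We design the kernel as a discrete binary function, where the points in $K$ are set to "1" while the others are set to "0," as follows:

\[ \phi(\theta, \tau) = \begin{cases} 1 & \text{if } (\theta, \tau) \in K, \\ 0 & \text{otherwise.} \end{cases} \]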
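The selection of the max-contrast set and the resulting binary kernel can be sketched in Python. The contrast used here is an assumption: a Fisher-like ratio (pairwise squared difference of class means over summed variances), chosen only for illustration and possibly different from the contrast function $\Gamma$ of the paper. The function names are hypothetical.

```python
def max_contrast_set(means, variances, kappa):
    """Return the kappa (m, n) points with the largest assumed contrast.

    means[k][m][n] and variances[k][m][n] hold the mean and variance of the
    AF modulus of class k on the discrete ambiguity plane.
    """
    classes = range(len(means))
    rows, cols = len(means[0]), len(means[0][0])
    scored = []
    for m in range(rows):
        for n in range(cols):
            # Assumed contrast: pairwise squared mean difference over summed variances.
            gamma = sum(
                (means[a][m][n] - means[b][m][n]) ** 2
                / (variances[a][m][n] + variances[b][m][n] + 1e-12)
                for a in classes for b in classes if a < b
            )
            scored.append((gamma, (m, n)))
    scored.sort(reverse=True)
    return [point for _, point in scored[:kappa]]


def binary_kernel(points, rows, cols):
    """Kernel equal to 1 on the max-contrast set and 0 elsewhere."""
    return [[1 if (m, n) in points else 0 for n in range(cols)] for m in range(rows)]


# Two classes on a 2 x 2 plane; the classes differ most at point (0, 1).
means = [[[1.0, 2.0], [3.0, 0.0]], [[1.0, 5.0], [3.5, 0.0]]]
variances = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
K = max_contrast_set(means, variances, kappa=2)
kernel = binary_kernel(K, rows=2, cols=2)
```

Returning the points in decreasing order of contrast makes it easy to evaluate nested sets for increasing $\kappa$.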

Class model
The model of the class w k is composed of its mean AF modulus, its variance AF modulus, and the kernel. All these elements are considered in their discrete form.
By an adequate choice of units, we can set the discretization steps ∆θ and ∆τ to 1.
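The model of the class $w_k$ can be written as follows:

\[ \mathrm{Model}_{w_k} = \left\{ E\left[\,|A_{w_k}(m, n)|\,\right],\; \mathrm{VAR}\left[\,|A_{w_k}(m, n)|\,\right],\; \phi(m, n) \right\}. \]

Alternatively, the set $K$ can be included instead of $\phi(m, n)$:

\[ \mathrm{Model}_{w_k} = \left\{ E\left[\,|A_{w_k}(m, n)|\,\right],\; \mathrm{VAR}\left[\,|A_{w_k}(m, n)|\,\right],\; K \right\}. \]

Unlabeled signals classification
In order to assign an unknown signal $s(t)$ to a class, we need a distance measure between $s(t)$ and a class model. We take as a distance measure

\[ d^{s}_{w_k} = \sqrt{ \sum_{i=1}^{\kappa} \frac{ \left( |A(m_i, n_i)| - E\left[\,|A_{w_k}(m_i, n_i)|\,\right] \right)^2 }{ \mathrm{VAR}\left[\,|A_{w_k}(m_i, n_i)|\,\right] } }, \qquad (m_i, n_i) \in K, \]

where $|A(m_i, n_i)|$ is the discrete AF modulus of $s(t)$ at point $(m_i, n_i)$.
In fact, $d^{s}_{w_k}$ is the Mahalanobis distance between the AF modulus of $s(t)$ and the mean AF modulus of class $w_k$ at the points where the kernel $\phi(m, n)$ is different from zero.
The most likely class of $s(t)$ is given by its classification

\[ \mathrm{classification}\left(s(t)\right) = \arg\min_{w_k} d^{s}_{w_k}. \]

Classification error rate
The classification error rate is defined, with respect to a labeled signal set (test set), by the ratio between the number of incorrectly classified signals and the total number of signals in the labeled set. The choice of the parameter $\kappa$ (number of contrast points that we take into account) remains to be detailed. This parameter should be chosen so as to minimize the classification error in a test set of labeled signals. This can be achieved by increasing the value of $\kappa$ until a minimal classification error rate is obtained.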
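The classification rule and the choice of $\kappa$ can be sketched together in Python. The sketch assumes a diagonal-covariance Mahalanobis distance restricted to the kernel points, strictly positive variances, and a list of contrast points already ranked by decreasing contrast; all function names and the toy models are hypothetical.

```python
import math


def distance(af_modulus, model, points):
    """Diagonal Mahalanobis distance to a class model over the kernel points."""
    mean, var = model
    return math.sqrt(sum(
        (af_modulus[m][n] - mean[m][n]) ** 2 / var[m][n]
        for m, n in points
    ))


def classify(af_modulus, models, points):
    """Return the label of the class model closest to the signal."""
    return min(models, key=lambda label: distance(af_modulus, models[label], points))


def error_rate(test_set, models, points):
    """Fraction of labelled test signals that are misclassified."""
    wrong = sum(1 for af, label in test_set if classify(af, models, points) != label)
    return wrong / len(test_set)


def best_kappa(ranked_points, test_set, models):
    """Try kappa = 1, 2, ... over the ranked points; keep the smallest-error kappa."""
    rates = [error_rate(test_set, models, ranked_points[:k])
             for k in range(1, len(ranked_points) + 1)]
    return rates.index(min(rates)) + 1, min(rates)


# Toy models: (mean, variance) tables on a 1 x 2 ambiguity plane.
models = {"MA1": ([[0.0, 1.0]], [[1.0, 1.0]]), "MA2": ([[0.0, 3.0]], [[1.0, 1.0]])}
test_set = [([[0.2, 1.2]], "MA1"), ([[0.1, 2.9]], "MA2")]
kappa, rate = best_kappa([(0, 1), (0, 0)], test_set, models)
```

Ties are resolved in favour of the smallest $\kappa$, which keeps the model compact when several values give the same error rate.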

MVS classification in the time-frequency domain
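Given a training set of labeled MVSs

\[ \Upsilon = \left\{ S^{q_k}_{w_k}(t) \;;\; 1 \le k \le W,\; 1 \le q_k \le Q_{w_k} \right\}, \]

where $W$ is the number of classes, $Q_{w_k}$ the number of labeled MVSs belonging to class $w_k$, and $S^{q_k}_{w_k}(t)$ the $q_k$th MVS belonging to class $w_k$, we wish to characterize each class by a model so that we can compare an unknown MVS $S(t)$ to each class model and assign $S(t)$ to its most likely class.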

Time-frequency-space representation of MVS
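The multivariate ambiguity function (MAF) of an MVS $S(t)$ is defined by [17]

\[ MA(\theta, \tau) = \int S\!\left(u + \tfrac{\tau}{2}\right) S^{H}\!\left(u - \tfrac{\tau}{2}\right) e^{j\theta u}\, du, \tag{22} \]

where $H$ stands for conjugate transpose.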
In (22), the terms on the diagonal are the auto-ambiguity functions (commonly called AFs) and the off-diagonal terms are called cross-ambiguity functions.
In Section 5.1 we mentioned the fact that the AF could be interpreted as a measure of the joint time-frequency autocorrelation. The generalization of this interpretation implies that the MAF is an indicator of the joint time-frequency-space autocorrelation of an MVS. The space dimension is taken into account by the cross-ambiguity functions.

Spatial decorrelation
A common approach when dealing with multivariate data is to find a number of components satisfying some statistical properties that can generate the original multivariate data by applying a linear transformation. The most common techniques are principal component analysis (PCA), whose components are uncorrelated (linearly independent in the statistical sense), and ICA, whose components are statistically independent.
In both PCA and ICA the correlation between the new transformed components (TRCs) is zero. Therefore, PCA and ICA lead to MVSs whose components are spatially decorrelated.
Furthermore, the MAF matrix of a spatially decorrelated MVS is diagonal.
Besides PCA and ICA, other decorrelation methods can be used. As in the case of the kernel design, we can design a decorrelation method whose goal is to find components that maximally discriminate among the classes. A way to achieve this goal is to use a feature extraction based on eigenvector analysis [18]. An example of application in the BCI framework can be found in [19], where an interpretation in terms of spatial filters is presented. However, only two classes can be classified at a time.
We present below a decorrelation method based on the joint diagonalization of the autocorrelation matrices of each class.
We denote by $Z(t)$ the transformed MVS (TMVS) resulting from the premultiplication of $S(t)$ by the matrix $P$:

\[ Z(t) = P \cdot S(t). \tag{23} \]

The labeled MVSs belonging to the training set $\Upsilon$ are $P$-projected to generate the following transformed set $P\Upsilon$:

\[ P\Upsilon = \left\{ Z^{q_k}_{w_k}(t) \;;\; 1 \le k \le W,\; 1 \le q_k \le Q_{w_k} \right\}, \]

where $Z^{q_k}_{w_k}(t) = P \cdot S^{q_k}_{w_k}(t)$. The signal components (transformed components) of $Z^{q_k}_{w_k}(t)$ are $\{ {}^{\ell}z^{q_k}_{w_k}(t) \;;\; 1 \le \ell \le N \}$. The discrete version of the set $P\Upsilon$ is constituted of the matrices $Z^{q_k}_{w_k}$ whose elements are the values of $Z^{q_k}_{w_k}(t)$ at the sampling instants.
We wish to determine the matrix $P$ such that

\[ P\, R_{w_k}\, P^{H} = D_{w_k}, \qquad 1 \le k \le W, \]

where $D_{w_k}$ are diagonal matrices and $R_{w_k}$ are called the autocorrelation matrices of the class $w_k$. Thus the matrix $P$ simultaneously diagonalizes the set $\{ R_{w_k} \mid 1 \le k \le W \}$.
As a matter of fact, the matrix $P$ that exactly diagonalizes this set exists when the $R_{w_k}$ are normal commuting matrices [20]. According to (25), the $R_{w_k}$ are normal but they do not necessarily commute. However, it is possible to find a matrix that approximately diagonalizes the set $\{ R_{w_k} \mid 1 \le k \le W \}$ [20] by optimizing a joint-diagonality criterion (minimization of the square sum of the off-diagonal elements). An iterative procedure consisting in the application of plane rotations so as to satisfy the joint-diagonality criterion is presented in [20]. Because of the efficiency and the good results of this method, we used it in our work.
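The joint-diagonality criterion mentioned above (square sum of the off-diagonal elements) can be evaluated as in the following Python sketch. It only computes the cost for a candidate matrix and does not reproduce the iterative plane-rotation procedure of [20]. Real-valued matrices are assumed, so the conjugate transpose reduces to a transpose; the function names and test matrices are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two real matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def off_diagonal_cost(P, R_list):
    """Sum over classes of the squared off-diagonal entries of P R P^T."""
    cost = 0.0
    for R in R_list:
        D = matmul(matmul(P, R), transpose(P))
        cost += sum(D[i][j] ** 2 for i in range(len(D)) for j in range(len(D)) if i != j)
    return cost


# Two commuting symmetric matrices share eigenvectors, so an exact joint diagonalizer exists.
R1 = [[2.0, 1.0], [1.0, 2.0]]
R2 = [[3.0, -1.0], [-1.0, 3.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
s = 1.0 / math.sqrt(2.0)
rotation = [[s, s], [s, -s]]
cost_identity = off_diagonal_cost(identity, [R1, R2])
cost_rotation = off_diagonal_cost(rotation, [R1, R2])
```

When the class matrices do not commute, the cost cannot reach zero and the optimizer settles for the matrix that makes it smallest.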
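In order to characterize the discrimination potential of each of the components of $Z(t)$, we define the contrast function $\Omega(\ell)$, where $\ell = 1, \ldots, N$ is the TRC index, as follows:

[equation not recoverable from the source]

The model of the class $w_k$ is

\[ \mathrm{Model}_{w_k} = \left\{ P,\; \rho(\ell),\; E\left[\,|A^{\ell}_{w_k}(m, n)|\,\right],\; \mathrm{VAR}\left[\,|A^{\ell}_{w_k}(m, n)|\,\right],\; K^{\ell} \;;\; 1 \le \ell \le N \right\}, \]

where $\rho(\ell)$ is the classification weight of the $\ell$th TRC, and $E[\,|A^{\ell}_{w_k}(m, n)|\,]$, $\mathrm{VAR}[\,|A^{\ell}_{w_k}(m, n)|\,]$, and $K^{\ell}$ are, respectively, the mean AF modulus of the $\ell$th component associated to the class $w_k$, the variance of the AF modulus of the $\ell$th component associated to the class $w_k$, and the max-contrast set of the $\ell$th component.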

Unlabeled signals classification
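Given an MVS $S(t)$, we first compute its TMVS $Z(t)$ (see (23)) and obtain the TRCs $z_1(t), z_2(t), \ldots, z_N(t)$. Then the distances between each component and the model of each class associated with that component are calculated as follows:

\[ d^{\ell}_{w_k} = \sqrt{ \sum_{i=1}^{\kappa_\ell} \frac{ \left( |A^{\ell}(m_i, n_i)| - E\left[\,|A^{\ell}_{w_k}(m_i, n_i)|\,\right] \right)^2 }{ \mathrm{VAR}\left[\,|A^{\ell}_{w_k}(m_i, n_i)|\,\right] } }, \qquad (m_i, n_i) \in K^{\ell}, \]

where $|A^{\ell}(m_i, n_i)|$ is the modulus of the AF of $z_\ell(t)$ and $\kappa_\ell$ is the number of contrast points of the $\ell$th component. Finally, the global distance between $S(t)$ and the class $w_k$, denoted $D^{S(t)}_{w_k}$, is

[equation not recoverable from the source]

The most likely class of $S(t)$ is given by its classification, defined as

\[ \mathrm{classification}\left(S(t)\right) = \arg\min_{w_k} D^{S(t)}_{w_k}. \tag{31} \]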
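The final decision step can be sketched in Python. The rule that combines the per-component distances into a global distance is assumed here to be a weighted sum using the classification weights of the TRCs; this is an illustrative choice, and the function names and numbers are hypothetical.

```python
def global_distances(component_distances, weights):
    """Combine per-component distances into one distance per class.

    component_distances[label][l] is the distance of the l-th transformed
    component to the model of class `label`; weights[l] is its classification weight.
    """
    return {
        label: sum(w * d for w, d in zip(weights, dists))
        for label, dists in component_distances.items()
    }


def classify_mvs(component_distances, weights):
    """Return the class with the smallest global distance (assumed weighted sum)."""
    totals = global_distances(component_distances, weights)
    return min(totals, key=totals.get)


# Two components: the first favours MA1, the second favours MA2.
dists = {"MA1": [1.0, 4.0], "MA2": [2.0, 1.0]}
first_dominant = classify_mvs(dists, weights=[0.8, 0.2])
second_dominant = classify_mvs(dists, weights=[0.2, 0.8])
```

The example shows how the weights arbitrate between components that disagree: the decision follows whichever component carries the larger classification weight.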

EXPERIMENTAL METHODS AND PROTOCOL
Two healthy male volunteers (S1 and S2), 29 and 23 years old, participated in six sessions of 20 minutes distributed over five weeks. The subjects were comfortably sitting in an armchair and placed in front of a computer screen. The experimentation room was quiet and slightly illuminated. The subjects started each session with five minutes of the VOM. During the VOM, we controlled the recording conditions and set the threshold parameters for artefact rejection (see Section 4). Furthermore, the VOM allowed subjects to get familiar with the system.
The EEG signals were recorded with reference to DLE (see Section 2.1) and from electrodes Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, and O2 of the 10-20 international system, at a rate of 256 Hz per channel. The electrodes Fp1 and Fp2 were used only for eye-movement artefact detection and were not included in the classification analysis.
Both subjects were asked to perform the following imagined MAs: vertical movements of the left and right index fingers (MA1 and MA2) and incremental mental counting (MA3).
Visual cues were used to indicate which MA to perform. In the case of MA1 and MA2, a horizontal arrow pointing to the left or to the right was displayed on the computer screen; for MA3, the first two-digit number was displayed.
The first recording session was carried out without feedback and the next five with feedback. In the first session, the first MA models were calculated; this allowed us to provide feedback in the second session. During the feedback sessions, the MA models were updated incrementally as explained in Section 6.2.
The temporal parameters $T$ and $T_I$ were both set to 0.5 second (see Section 2.3). The goal was therefore to train MA models able to correctly classify half-second EEG segments (trials).

Protocol of a training-without-feedback session
The first five minutes were spent with the VOM. The remaining 15 minutes were divided into three five-minute slices in which, respectively, MA1, MA2, and MA3 were trained.
The five-minute slices were in turn divided into one-minute recordings and thirty-second breaks, as depicted in Figure 6. The one-minute recordings were organized in the following way. At the beginning, the corresponding visual cue was displayed and lasted five seconds. Then a break signal appeared, indicating a five-second break. This process was repeated throughout the one-minute recording (Figure 6).
At the end of this session, the MA models for the three MAs were computed. These models are calculated as explained in Section 5.
Theoretically, we have 180 trials per MA for training the MA models. However, the first trial after the presentation of the visual cue is rejected because of the presence of evoked potentials (due to visual stimulation), and about 20% of the trials are rejected because of artefacts. In practice, no more than 150 trials per MA were available.
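The trial count can be checked with a short calculation. The sketch below assumes three one-minute recordings per five-minute slice (a value inferred so that the total matches the 180 trials stated above) and applies the 20% artefact rejection after removing the first trial of each cue; the function name and parameter names are hypothetical.

```python
def trials_per_ma(recordings=3, cues_per_recording=6, cue_seconds=5.0, T=0.5,
                  artefact_fraction=0.20):
    """Count half-second trials available for one mental activity in a session."""
    per_cue = int(cue_seconds / T)                     # 10 trials per five-second cue
    theoretical = recordings * cues_per_recording * per_cue
    # The first trial after each visual cue is discarded (evoked potentials).
    after_cue_drop = theoretical - recordings * cues_per_recording
    usable = round(after_cue_drop * (1.0 - artefact_fraction))
    return theoretical, after_cue_drop, usable


theoretical, after_cue_drop, usable = trials_per_ma()
```

Under these assumptions the count goes from 180 to 162 and then to about 130, which is consistent with the stated bound of 150 trials per MA.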

Protocol of a training-with-feedback session
The twenty minutes are distributed between visualization, MAs, and breaks in the same way as in the preceding case (Figure 7).
During the MAs, feedback is provided to the subject in the form of a sphere that moves left, right, or upwards if MA1, MA2, or MA3, respectively, is correctly identified. If the MA is wrongly classified, the sphere does not move. Feedback is provided every half second, except for the first half second after the visual cue indicating which MA to perform (see Figure 7).
During the last break period of each five-minute slice, the MA models are updated with the newly recorded data. Table 1 shows the MAs that were trained in each five-minute slice of the sessions with feedback.

RESULTS AND DISCUSSIONS
We divided the results presentation into two parts: the results of the first session where no feedback was provided and those of sessions where feedback was provided (two to six).

First session (without feedback)
The number of retained trials in the first session (after artefact rejection) per subject and per MA is reported in Table 2. We used 100 trials to compute the matrix P, the mean AF modulus, and the variance AF modulus of each TRC (see Section 5.2). The remaining trials were used as a test set to determine the optimal number of contrast points, in the ambiguity plane, associated with each TRC.  In Figures 8 and 9 (for S1 and S2, respectively), the absolute values of the coefficients of the matrix P are represented in a comparative graph (left). This graph gives us the information about the composition of each TRC as a linear combination of signals coming from different electrodes. In the right part, the classification weights associated with each TRC, calculated according to (27), are depicted.
The results of Figures 8 and 9 show that for both subjects there are five TRCs that seem to be more important for the classification than the others (1, 2, 3, 5, and 8 for S1, and 1, 2, 3, 6, and 8 for S2). In order to confirm this impression, we computed the classification error associated with each TRC and the optimal number of contrast points. These results are shown in Figures 10 and 11 (for S1 and S2, respectively). From these results, we can say that the smaller error rates correspond to those components with largest classification weights.
We also present the optimal contrast points for the four TRCs that have the smallest error rate. Only the first quadrants were represented since the modulus of the AF is symmetric with respect to the origin.

Sessions from two to six (with feedback)
In Table 3, we report the number of retained trials after artefact elimination for each five-minute slice from sessions two to six.
During the second session, each MA was trained with feedback (see Table 1). Such feedback was produced by taking as reference the models built after the first session. In Table 4, we present the percentage of trials that were not correctly classified among the nonrejected trials (error rate).  At the end of the second session, new MA models were built using 100 trials (randomly chosen) to compute the matrix P, the mean AF modulus, and the variance AF modulus. The test set composed of the remaining trials was used to compute the optimal number of contrast points.
In Figures 12 and 13 (for S1 and S2, respectively), we represent the coefficients of the matrix P and the contrast points for each of the five TRCs that have the largest classification weights. We can see that the TRCs that have the largest classification weights are the same as in the case of the training without feedback (Figures 8 and 9). This suggests that the relationship between the coefficients of the matrix P remained the same. In the case of contrast points, we can remark that the general distribution found during the training without feedback is generally maintained in the training with feedback. On the other hand, the optimal number of contrast points has slightly changed with respect to the training without feedback.

Figure 10: Top: contrast points selected for the four TRCs with the smallest error rates (as the modulus of the AF is symmetric with respect to the origin, only the first quadrant is represented). Bottom: error rates associated with each TRC (S1).

It is important to note that we built new MA models at the end of the second session so that they would reflect feedback conditions. In this way, it is possible to update the models after each five-minute slice in sessions three to six.
In sessions from three to six, we updated the matrix P, the mean AF modulus, and the variance of the AF modulus of the trained MA for each five-minute slice. This procedure was performed by using 100 trials randomly chosen to update those parameters and to take the remaining trials as a test set for determining the optimal number of contrast points.
In Figures 14 and 15 (for S1 and S2, respectively), we represent the evolution of the error rate over the sessions from three to six. These results for each five-minute slice are reported in Table 1.
As can be seen, the error rate almost always decreased, except between sessions 3 and 4 for S1. Nevertheless, at the end of the sixth session, we achieved the lowest error rates for all the MAs. This result suggests that the feedback strategy improved the performance of the system. In fact, the subjects reported their general satisfaction with regard to feedback because of its stimulating effects.

CONCLUSIONS AND FUTURE WORK
In this paper, we proposed a BCI system and an associated network architecture between the components that can be used in different OMs. We stated that the relationship between these components should be flexible since the BCI technology is still in its experimental phase.  In order to familiarize the subjects with our BCI, we proposed to precede each session with a short real-time visualization of a projection of the EEG signals in a 3D environment.
We classified EEG signals from the point of view of the joint correlation in three dimensions: time, frequency, and space (as EEG signals are multivariate). In order to reduce the amount of data that results from such analysis, we decorrelated the EEG signals before moving to the time-frequency correlation part. The decorrelation process resulted in a set of TRCs. In this way, we divided the original problem of classification of MVSs into several univariate classifications.
The training was performed in two ways: with and without feedback. The obtained results show that the relationship between the TRCs remains essentially the same for both training types. Nevertheless, as noticed in [11], the structure of the MA models is different from person to person. Therefore, a BCI should be personalized.
The general reduction of the classification error rate over the sessions where feedback was provided shows that the feedback constituted an effective strategy for the training. Nevertheless, more experiments are necessary for confirming this hypothesis.
In the future, we plan to experiment with more subjects during more sessions.
As the goal is to control devices by thinking, it is necessary to add more MAs for making, at least, a two-dimensional control possible.
We will consider other spatial analysis techniques such as nonlinear PCA for extracting those TRCs that can be classified in the time-frequency domain.
Another possibility could be to perform a parametric time-frequency analysis (multivariate autoregressive models) first and then apply a spatial analysis technique.

Figure 12: Results for S1. TRCs with largest classification weights for the MA models built after the second session (first session with feedback). Top: contrast points in the Doppler-delay plane. Middle: rows of the matrix P associated with the TRCs. Bottom: classification error rate associated with each TRC.

Figure 14: Error rate evolution for S1 over the training sessions from three to six. We reported the error rate for each five-minute slice in Table 1.

Figure 15: Error rate evolution for S2 over the training sessions from three to six. We reported the error rate for each five-minute slice in Table 1.

Gary N. Garcia Molina was born in Sucre, Bolivia. He received his M.S. degree in electrical engineering from the Swiss Federal Institute of Technology, Lausanne (EPFL), in 2001. His diploma work was on the application of pattern recognition techniques to speech processing. He then worked as a Software Engineer in the design and implementation of distributed systems for multimedia content broadcasting. Since April 2001, he has been a Ph.D. student at EPFL where he is involved in several projects in image, video, and biomedical signal processing. Currently he develops an adaptive direct brain-computer communication device with strong emphasis on the models and interpretations of synchronized neural activity.