- Research Article
- Open Access
Combination of Adaptive Feedback Cancellation and Binaural Adaptive Filtering in Hearing Aids
© Anthony Lombard et al. 2009
- Received: 12 December 2008
- Accepted: 17 March 2009
- Published: 2 April 2009
We study a system combining adaptive feedback cancellation and adaptive filtering connecting inputs from both ears for signal enhancement in hearing aids. For the first time, such a binaural system is analyzed in terms of system stability, convergence of the algorithms, and possible interaction effects. As major outcomes of this study, a new stability condition adapted to the considered binaural scenario is presented, some already existing and commonly used feedback cancellation performance measures for the unilateral case are adapted to the binaural case, and possible interaction effects between the algorithms are identified. For illustration purposes, a blind source separation algorithm has been chosen as an example for adaptive binaural spatial filtering. Experimental results for binaural hearing aids confirm the theoretical findings and the validity of the new measures.
- Independent Component Analysis
- Blind Source Separation
- Forward Path
- Microphone Signal
- Unilateral Case
Traditionally, signal enhancement techniques for hearing aids (HAs) were mainly developed independently for each ear [1–4]. However, since the human auditory system is a binaural system combining the signals received from both ears for audio perception, providing merely bilateral systems (that operate independently for each ear) to the hearing-aid user may distort crucial binaural information needed to localize sound sources correctly and to improve speech perception in noise. Foreseeing the availability of wireless technologies for connecting the two ears, several binaural processing strategies have therefore been presented in the last decade [5–10]. In [5], a binaural adaptive noise reduction algorithm exploiting one microphone signal from each ear was proposed. Interaural time difference cues of speech signals were preserved by processing only the high-frequency components while leaving the low frequencies unchanged. Binaural spectral subtraction was proposed in [6]. It utilizes cross-correlation analysis of the two microphone signals for a more reliable estimation of the common noise power spectrum, without requiring stationarity of the interfering noise as the single-microphone versions do. Binaural multi-channel Wiener filtering approaches preserving binaural cues were also proposed, for example, in [7–9], and signal enhancement techniques based on blind source separation (BSS) were presented in [10].
Research on feedback suppression and control system theory in general has also given rise to numerous hearing-aid specific publications in recent years. The behavior of unilateral closed-loop systems and the ability of adaptive feedback cancellation algorithms to compensate for the feedback has been extensively studied in the literature (see, e.g., [11–15]). But despite the progress in binaural signal enhancement, binaural systems have not been considered in this context. In this paper, we therefore present a theoretical analysis of a binaural system combining adaptive feedback cancellation (AFC) and binaural adaptive filtering (BAF) techniques for signal enhancement in hearing aids.
The paper is organized as follows. An efficient binaural configuration combining AFC and BAF is described in Section 2. Generic vector/matrix notations are introduced for each part of the processing chain. Interaction effects concerning the AFC are then presented in Section 3. It includes a derivation of the ideal binaural AFC solution, a convergence analysis of the AFC filters based on the binaural Wiener solution, and a stability analysis of the binaural system. Interaction effects concerning the BAF are discussed in Section 4. Here, to illustrate our argumentation, a BSS scheme has been chosen as an example for adaptive binaural filtering. Experimental conditions and results are finally presented in Sections 5 and 6 before providing concluding remarks in Section 7.
AFC and BAF techniques can be combined in two different ways. The feedback cancellation can be performed directly on the microphone inputs, or it can be applied at a later stage, to the BAF outputs. The second variant generally requires fewer filters, but it also has several drawbacks. When the AFC comes after the BAF in the processing chain, the feedback cancellation task is complicated by the need to track the continuously time-varying BAF filters. This placement may also significantly increase the necessary length of the AFC filters. Moreover, the BAF cannot benefit from the feedback cancellation performed by the AFC in this case. Especially at high HA amplification levels, the presence of strong feedback components in the sensor inputs may, therefore, seriously disturb the functioning of the BAF. These are structurally the same effects as those encountered when combining adaptive beamforming with acoustic echo cancellation (AEC) .
In this paper, lower-case boldface characters represent (row) vectors capturing signals or the filters of single-input-multiple-output (SIMO) systems. Accordingly, multiple-input-single-output (MISO) systems are described by transposed vectors. Matrices denoting multiple-input-multiple-output (MIMO) systems are represented by upper-case boldface characters. The transposition of a vector or a matrix will be denoted by the superscript .
2.2. The Microphone Signals
We consider here multi-sensor hearing aid devices with microphones at each ear (see Figure 1), where typically ranges between one and three. Because of the reverberation in the acoustical environment, point source signals ( ) are filtered by a MIMO mixing system (one MIMO system for each ear in the figure) modeled by finite impulse response (FIR) filters. This can be expressed in the -domain as:
where is the -domain representation of the received source signal mixture at the th sensor of the left ( ) and right ( ) hearing aid, respectively. and denote the transfer functions (polynomials of order up to several thousands typically) between the th source and the th sensor at the left and right ears, respectively. One of the point sources may be seen as the target source to be extracted, the remaining being considered as interfering point sources. For the sake of simplicity, the -transform dependency will be omitted in the rest of this paper, as long as the notation is not ambiguous.
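As a minimal sketch of the convolutive mixing model described above, the following NumPy code sums, for each sensor, the source signals filtered by their FIR paths. The function name `mix_sources`, the filter lengths, and the random test data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def mix_sources(sources, mixing_filters):
    """Convolutive MIMO mixing with FIR filters (illustrative sketch).

    sources: list of P one-dimensional arrays of equal length.
    mixing_filters: nested list; mixing_filters[p][m] is the impulse
        response from source p to sensor m (equal lengths assumed).
    Returns an array of shape (M, len(source) + len(filter) - 1).
    """
    num_sources = len(sources)
    num_sensors = len(mixing_filters[0])
    out_len = len(sources[0]) + len(mixing_filters[0][0]) - 1
    sensors = np.zeros((num_sensors, out_len))
    for p in range(num_sources):
        for m in range(num_sensors):
            # Each sensor receives the sum of all filtered sources.
            sensors[m] += np.convolve(sources[p], mixing_filters[p][m])
    return sensors

# Hypothetical example: two sources, two sensors, 64-tap random filters.
rng = np.random.default_rng(0)
s = [rng.standard_normal(1000) for _ in range(2)]
h = [[0.1 * rng.standard_normal(64) for _ in range(2)] for _ in range(2)]
x = mix_sources(s, h)
```

In a real acoustic environment the filters would have several thousand taps, as noted above; the short random filters here only keep the example small.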
The acoustic feedback originating from the loudspeakers (LS) and at the left and right ears, respectively, is modeled by four SIMO systems of FIR filters. and represent the ( -domain) transfer functions (polynomials of order up to several hundreds typically) from the loudspeakers to the th sensor on the left side, and and represent the transfer functions from the loudspeakers to the th sensor on the right side. The feedback components captured by the th microphone of each ear can therefore be expressed in the -domain as
Note that as long as the energies of the two LS signals are comparable, the "cross" feedback signals (traveling from one ear to the other) are negligible compared to the "direct" feedback signals (occurring on each side independently). With the feedback paths (FBP) used in this study (see the description of the evaluation data in Section 5.3), an energy difference ranging from 15 to 30 dB has been observed between the "direct" and "cross" FBP impulse responses. When the HA gains are set at similar levels in both ears, the "cross" FBPs can then be neglected. But the impact of the "cross" feedback signals becomes more significant when a large difference exists between the two HA gains. Here, therefore, we explicitly account for the two types of feedback by modelling both the "direct" paths (with transfer functions and , ) and the "cross" paths (with transfer functions and , ) by FIR filters.
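The energy difference between a "direct" and a "cross" FBP impulse response can be computed as sketched below. The impulse responses used here are made-up placeholders, not the measured paths of the study, and the function name is hypothetical.

```python
import numpy as np

def energy_difference_db(direct_ir, cross_ir):
    """Energy of the direct impulse response relative to the cross one, in dB."""
    direct_ir = np.asarray(direct_ir, dtype=float)
    cross_ir = np.asarray(cross_ir, dtype=float)
    return 10.0 * np.log10(np.sum(direct_ir ** 2) / np.sum(cross_ir ** 2))

# Placeholder impulse responses: the cross path is the direct path
# attenuated by a factor of 10 in amplitude, i.e. 20 dB in energy.
direct = np.exp(-0.1 * np.arange(50)) * np.cos(0.5 * np.arange(50))
cross = 0.1 * direct
diff_db = energy_difference_db(direct, cross)
```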
Diffuse noise signals and , constitute the last microphone signal components on the left and right ears, respectively. The -domain representation of the th sensor signal at each ear is finally given by:
refers to the target source and is a subset of capturing the remaining interfering sources. is a row of which captures the transfer functions from the target source to the sensors and is a matrix containing the remaining rows of . Like the other vectors and matrices defined above, these four entities can be further decomposed into their left and right subsets, labeled with the indices and , respectively.
2.3. The AFC Processing
which is, ideally, free of any feedback components. (21) and (22) can be reformulated in matrix form as follows:
with the block-diagonal constraint
put on the AFC system. The vectors and , capturing the -domain representations of the residual and AFC output signals, respectively, are defined analogously to in (8). As can be seen from (21) and (22), we perform here bilateral feedback cancellation (as opposed to binaural operations) since AFC is performed for each ear separately. This is reflected in (24), where we force the off-diagonal terms to be zero instead of reproducing the acoustic feedback system with its set of four SIMO systems. The reason for this will become clear in Section 3.1. Guidelines regarding an arbitrary (i.e., unconstrained) AFC system (defined similarly to in this case) will also be provided at some points in the paper. The superscript is used to distinguish constrained systems defined by (24) from arbitrary (unconstrained) systems (with possibly non-zero off-diagonal terms).
2.4. The BAF Processing
The BAF filters perform spatial filtering to enhance the signal coming from one of the external point sources. This is performed here binaurally, that is, by combining signals from both ears (see Figure 1). The binaural filtering operations can be described by a set of four MISO systems of adaptive FIR filters. This can be expressed in the -domain as follows:
2.5. The Forward Paths
Note that for simplicity, we assumed that the number of sensors used on each device for digital signal processing was equal. The above notations as well as the following analysis are however readily applicable to asymmetrical configurations also, simply by resizing the above-defined vectors and matrices, or by setting the corresponding microphone signals and all the associated transfer functions to zero. In particular, the unilateral case can be seen as a special case of the binaural structure discussed in this paper, with one or more microphones used on one side, but none on the other side.
The structure depicted in Figure 1 for binaural HAs mainly deviates from the well-known unilateral case by the presence of binaural spatial filtering. The binaural structure is characterized by a significantly more complex closed-loop system, possibly with multiple microphone inputs, but most importantly with two connected LS outputs, which considerably complicates the analysis of the system. However, we will see in the following how, under certain conditions, we can exploit the compact matrix notations introduced in the previous section, to describe the behavior of the closed-loop system. We will draw some interesting conclusions on the present binaural system, emphasizing its deviation from the standard unilateral case in terms of ideal cancellation solution, convergence of the AFC filters and system stability.
3.1. The Ideal Binaural AFC Solution
In the unilateral and single-channel case, the adaptation of the (single) AFC filter tries to adjust the compensation signal (the filter output) to the (single-channel) acoustic feedback signal. Under ideal conditions, this approach guarantees perfect removal of the undesired feedback components and simultaneously prevents the occurrence of howling caused by system instabilities  (the stability of the binaural closed-loop system will be discussed in Section 3.3). The adaptation of the filter coefficients towards the desired solution is usually achieved using a gradient-descent-like learning rule, in its simplest form using the least mean square (LMS) algorithm . The functioning of the AFC in the binaural configuration shown in Figure 1 is similar.
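A minimal sketch of the LMS learning rule for a single AFC filter is given below. It makes strong simplifying assumptions that do not hold in a real hearing aid: the loop is open, the LS signal is white noise, and no external source is present, so the filter converges to the feedback path without bias. All names, the path coefficients, and the step size are illustrative.

```python
import numpy as np

def lms_identify(ls_signal, mic_signal, num_taps, step_size):
    """Adapt an FIR filter so that its output matches the feedback signal."""
    weights = np.zeros(num_taps)
    buffer = np.zeros(num_taps)  # most recent LS samples, newest first
    for n in range(len(ls_signal)):
        buffer = np.roll(buffer, 1)
        buffer[0] = ls_signal[n]
        # Residual after subtracting the compensation signal.
        residual = mic_signal[n] - weights @ buffer
        # Stochastic gradient step on the squared residual.
        weights += step_size * residual * buffer
    return weights

# Hypothetical feedback path and white-noise loudspeaker signal.
rng = np.random.default_rng(1)
true_fbp = np.array([0.0, 0.5, -0.3, 0.2, -0.1, 0.05, 0.0, 0.0])
u = rng.standard_normal(20000)
y = np.convolve(u, true_fbp)[:len(u)]
w_hat = lms_identify(u, y, num_taps=8, step_size=0.01)
```

In the closed-loop case the LS signal is correlated with the external sources, which is what leads to the biased Wiener solution analyzed in Section 3.2.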
denotes the ideal AFC solution in the unconstrained case. This is the binaural analogue of the ideal AFC solution in the unilateral case, where perfect cancellation is achieved by reproducing an exact replica of the acoustical FBP. In practice, however, this solution is very difficult to reach adaptively because it requires the two signals and to be uncorrelated, which is obviously not fulfilled in our binaural HA scenario since the two HAs are connected (the correlation is actually highly desirable since the HAs should form a spatial image of the acoustic scene, which implies that the two LS signals must be correlated to reflect interaural time and level differences). This problem has been extensively described in the literature on multi-channel AEC, where it is referred to as the "non-uniqueness problem". Several attempts have been reported in the literature to partly alleviate this issue (see, e.g., [18–20]). These techniques may be useful in the HA case also, but this is beyond the scope of the present work.
jointly capturing and the HA processing. Provided that and are linear, (41) (and hence (40)) is equivalent to assuming the existence of a linear dependency between the LS outputs, which we can express as follows:
This assumption implies that only one filter (instead of two, one for each LS signal) suffices to cancel the feedback components in each sensor channel. It corresponds to the constraint (24) mentioned in Section 2.3, which forces the AFC system matrix to be block-diagonal ( ). The required number of AFC filters reduces accordingly from to .
denotes the ideal AFC solution when is constrained to be block-diagonal ( ) and under the assumption (43). The operator is the block-wise counterpart of the operator. Applied to a list of vectors, it builds a block-diagonal matrix with the listed vectors placed on the main diagonal of the block-matrix, respectively.
To illustrate these results, we expand the ideal AFC solution (46) using (15) and (18):
For each filter, we can clearly identify two terms due to, respectively, the "direct" and "cross" FBPs (see Section 2.2). Contrary to the "direct" terms, the "cross" terms are identifiable only under the assumption (43) that the LS outputs are linearly dependent. Should this assumption not hold because of, for example, some non-linearities in the forward paths, the "cross" FBPs would not be completely identifiable. The feedback signals propagating from one ear to the other would then act as a disturbance to the AFC adaptation process. Note, however, that since the amplitude of the "cross" FBPs is negligible compared to the amplitude of the "direct" FBPs (Section 2.2), the consequences would be very limited as long as the HA gains are set to similar amplification levels, as can be seen from (47). It should also be noted that the forward path generally includes some (small) decorrelation delays and to help the AFC filters to converge to their desired solution (see Section 3.2). If those delays are set differently for each ear, causality of the "cross" terms in (47) will not always be guaranteed, in which case the ideal solution will not be achievable with the present scheme. This situation can be easily avoided by either setting the decorrelation delays equal for each ear (which appears to be the most reasonable choice to avoid artificial interaural time differences), or by delaying the LS signals (but using the non-delayed signals as AFC filter inputs). However, since it would further increase the overall delay from the microphone inputs to the LS outputs, the latter choice appears unattractive in the HA scenario.
3.2. The Binaural Wiener AFC Solution
In the configuration depicted in Figure 2, similar to the standard unilateral case (see, e.g., ), conventional gradient-descent-based learning rules do not lead to the ideal solution discussed in Section 3.1 but to the so-called Wiener solution . Actually, instead of minimizing the feedback components in the residual signals, the AFC filters are optimized by minimizing the mean-squared error of the overall residual signals (38).
In the following, we conduct therefore a convergence analysis of the binaural system depicted in Figure 2, by deriving the Wiener solution of the system in the frequency domain:
where the frequency dependency was omitted in (48) and (49) for the sake of simplicity, like in the rest of this section. is recognized as the (frequency-domain) ideal AFC solution discussed in Section 3.1, and denotes a (frequency-domain) bias term. The assumption (43) has been exploited in (48) to obtain the above final result. represents the (auto-) power spectral density of , , and , , is a vector capturing cross-power spectral densities. The cross-power spectral density vectors and are defined in a similar way.
By nature, the spatially uncorrelated diffuse noise components will be only weakly correlated with the LS outputs. The third bias term will therefore have only a limited impact on the convergence of the AFC filters. The diffuse noise sources will mainly act as a disturbance. Depending on the signal enhancement technique used, they might even be partly removed. But above all, the (multi-channel) BAF performs spatial filtering, which mainly affects the interfering point sources. Ideally, the interfering sources may even vanish from the LS outputs, in which case the second bias term would simply disappear. In practice, the interfering sources will never be completely removed. Hence the amount of bias introduced by the interfering sources will largely depend on the interference rejection performance of the BAF. However, as in unilateral hearing aids, the main source of estimation errors is the target source. Since the BAF aims at producing outputs which are as close as possible to the original target source signal, the first bias term due to the (spectrally colored) target source will be much more problematic.
One simple way to reduce the correlation between the target source and the LS outputs is to insert some delays and in the forward paths . The benefit of this method is however very limited in the HA scenario where only tiny processing delays (5 to 10 ms for moderate hearing losses) are allowed to avoid noticeable effects due to unprocessed signals leaking into the ear canal and interfering with the processed signals. Other more complicated approaches applying a prewhitening of the AFC inputs have been proposed for the unilateral case [21, 22], which could also help in the binaural case. We may also recall a well-known result from the feedback cancellation literature: the bias of the AFC solution decreases when the HA gain increases, that is, when the signal-to-feedback ratio (SFR) at the AFC inputs (the microphones) decreases. This statement also applies to the binaural case. This can be easily seen from (50), where the auto-power spectral density increases quadratically whereas the cross-power spectral densities increase only linearly with increasing LS signal levels.
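The scaling argument can be checked numerically. The sketch below replaces the power spectral densities of (50) by broadband (lag-zero) power estimates, which is an assumption made only to keep the example short: scaling the LS output by a gain multiplies its auto-power by the squared gain and its cross-power with the source by the gain, so their ratio shrinks inversely with the gain. All signal names and values are hypothetical.

```python
import numpy as np

def bias_magnitude(gain, source, processed):
    """Ratio of cross-power (LS output vs. source) to LS output auto-power."""
    ls_output = gain * processed
    cross_power = np.mean(ls_output * source)   # grows linearly with gain
    auto_power = np.mean(ls_output ** 2)        # grows quadratically with gain
    return abs(cross_power / auto_power)

# Hypothetical signals: the processed signal is strongly correlated
# with the source, as is the case for the target source in an HA.
rng = np.random.default_rng(2)
s = rng.standard_normal(50000)
v = s + 0.1 * rng.standard_normal(50000)
bias_gain_1 = bias_magnitude(1.0, s, v)
bias_gain_10 = bias_magnitude(10.0, s, v)
```

Under these assumptions a tenfold increase in gain reduces the ratio by a factor of ten.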
The bias term is identical to the one already obtained in (50), while the desired term is now split into two parts. The first one is related to the "direct" FBPs. The second term involves the "cross" FBPs and shows that gradient-based optimization algorithms will try to exploit the correlation of the LS outputs (when existing) to remove the feedback signal components traveling from one ear to the other. In the extreme case that the two LS signals are totally decorrelated (i.e., ), this term disappears and the "cross" feedback signals cannot be compensated. Note, however, that it would only have a very limited impact as long as the HA gains are set to similar amplification levels, as we saw in Section 3.1.
3.3. The Binaural Stability Condition
In this section, we formulate the stability condition of the binaural closed-loop system, starting from the general case before applying the block-diagonal constraint (24). We first need to express the responses and of the binaural system (Figure 1) on the left and right side, respectively, to an external excitation . This can be done in the -domain as follows:
Combining (52) and (53) finally yields the relations:
Here, the phase condition has been ignored, as usual in the literature on AFC . Note that the function in (57) and hence the stability of the binaural system, depend on the current state of the BAF filters.
Furthermore, it can easily be verified that the following relations are satisfied in this case:
The closed-loop response (56) of the binaural system simplifies, therefore, in this case to
The above results show that in the unconstrained (constrained, resp.) case, when the AFC filters reach their ideal solution ( , resp.), the function in (57) ((65), resp.) is equal to zero. Hence the stability condition (58) is always fulfilled, regardless of the HA amplification levels used, and the LS outputs become ideal, with as expected.
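The magnitude-only stability check can be illustrated with a simplified, single-channel stand-in for the function in (57). The loop gain used below (a frequency-independent forward gain multiplied by the mismatch between the feedback path and its estimate) is an assumption borrowed from the unilateral case and is not the binaural expression of the paper; the path coefficients are made up.

```python
import numpy as np

def loop_gain_magnitude(forward_gain, fbp, fbp_estimate, n_fft=512):
    """Magnitude of a simplified open-loop gain on a uniform frequency grid."""
    mismatch = np.asarray(fbp, dtype=float) - np.asarray(fbp_estimate, dtype=float)
    return np.abs(forward_gain * np.fft.rfft(mismatch, n_fft))

def is_stable(loop_magnitude):
    """Magnitude condition only; the phase condition is ignored."""
    return bool(np.max(loop_magnitude) < 1.0)

fbp = np.array([0.0, 0.05, -0.03, 0.01])          # hypothetical feedback path
without_afc = loop_gain_magnitude(100.0, fbp, np.zeros(4))
ideal_afc = loop_gain_magnitude(100.0, fbp, fbp)   # estimate equals true path
```

With a perfect estimate the loop gain vanishes at all frequencies, so the condition holds for any forward gain, in line with the statement above.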
The presence of feedback in the microphone signals is usually not taken into account when developing signal enhancement techniques for hearing aids. In this section, we consider the configuration depicted in Figure 1 and focus, as an example, on BSS techniques as possible candidates to implement the BAF, thereby analyzing the impact of feedback on BSS and discussing possible interaction effects with an AFC algorithm.
4.1. Overview on Blind Source Separation
The aim of blind source separation is to recover the original source signals from an observed set of signal mixtures. The term "blind" implies that the mixing process and the original source signals are unknown. In acoustical scenarios, like in the hearing-aid application, the source signals are mixed in a convolutive manner. The (convolutive) acoustical mixing system can be modeled as a MIMO system of FIR filters (see Section 2.2). The case where the number of (simultaneously active) sources is equal to the number of microphones (assuming channels for each ear (see Section 2.2)) is referred to as the determined case. The case where is called overdetermined, while is denoted as underdetermined.
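The three cases can be summarized by a small helper that follows the standard BSS terminology (fewer sources than microphones is overdetermined, more sources than microphones is underdetermined). The function name is hypothetical.

```python
def classify_bss_problem(num_sources: int, num_microphones: int) -> str:
    """Classify a BSS problem by comparing source and microphone counts."""
    if num_sources == num_microphones:
        return "determined"
    if num_sources < num_microphones:
        return "overdetermined"
    return "underdetermined"

# Two sources observed with one microphone per ear (two in total).
case = classify_bss_problem(num_sources=2, num_microphones=2)
```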
The underdetermined BSS problem can be handled based on time-frequency masking techniques, which rely on the sparseness of the sound sources (see, e.g., [23, 24]). In this paper, we assume that the number of sources does not exceed the number of microphones. Separation can then be performed using independent component analysis (ICA) methods, merely under the assumption of statistical independence of the original source signals . ICA achieves separation by applying a demixing MIMO system of FIR filters on the microphone signals, hence providing an estimate of each source at the outputs of the demixing system. This is achieved by adapting the weights of the demixing filters to force the output signals to become statistically independent. Because of the adaptation criterion exploiting the independence of the sources, a distinction between desired and undesired sources is unnecessary. Adaptation of the BSS filters is therefore possible even when all sources are simultaneously active, in contrast to more conventional techniques based on Wiener filtering  or adaptive beamforming .
One way to solve the BSS problem is to transform the mixtures to the frequency domain using the discrete Fourier transform (DFT) and apply ICA techniques in each DFT-bin independently (see e.g., [27, 28]). This approach is referred to as the narrowband approach, in contrast with broadband approaches which process all frequency bins simultaneously. Narrowband approaches are conceptually simpler but they suffer from a permutation and scaling ambiguity in each frequency bin, which must be tackled by additional heuristic mechanisms. Note however that to solve the permutation problem, information on the sensor positions is usually required and free-field sound wave propagation is assumed (see, e.g., [29, 30]). Unfortunately, in the binaural HA application, the distance between the microphones on each side of the head will generally not be known exactly and head shadowing effects will cause a disturbance of the wavefront. In this paper, we consider a broadband ICA approach [31, 32] based on the TRINICON framework . Separation is performed exploiting second-order statistics, under the assumption that the (mutually independent) source signals are non-white and non-stationary (like speech). Since this broadband approach does not rely on accurate knowledge of the sensor placement, it is robust against unknown microphone array deformations or disturbance of the wavefront. It has already been used for binaural HAs in [10, 34].
The BSS algorithm satisfies, therefore, the assumption (41) and the AFC-BSS combination can be equivalently described by Figure 2, with . In the following, refers to the selected BSS output presented (after amplification in the forward paths) to the HA user at both ears, and denotes the transfer functions of the selected BSS filters (common to both LS outputs). Note finally that post-processing filters may be used to recover spatial cues . They can be modelled as being part of the forward paths and .
In the HA scenario, since the LS output signals feed back into the microphones, the closed-loop system formed by the HAs participates in the source mixing process, together with the acoustical mixing system. Therefore, the BSS inputs result from a mixture of the external sources and the feedback signals coming from the loudspeakers. But because of the closed-loop system bringing the HA inputs to the two LS outputs, the feedback signals are correlated with the original external source signals. To understand the impact of feedback on the separation performance of a BSS algorithm, we describe below the overall mixing process.
where and refer to the AFC system and its ideal solution (46), respectively, under the block-diagonal constraint (24). characterizes the stability of the binaural closed-loop system and is defined by (65). From (68), we can identify two independent components and present in the BSS inputs and originating from the external point sources and from the diffuse noise, respectively. As mentioned in Section 4.1, the BSS algorithm can separate point sources, additional diffuse noise having only a limited impact on the separation performance . We therefore concentrate on the first term in (68):
which produces an additional mixing system introduced by the acoustical feedback (and the required AFC filters). Ideally, the BSS filters should converge to a solution which minimizes the contribution of the interfering point sources at the BSS output , that is,
refers to the acoustical mixing of the interfering sources , as defined in Section 2.2. can be defined in a similar way and describes the mixing of the interfering sources introduced by the feedback loop.
In the absence of feedback (and of AFC filters), the second term in (70) disappears and BSS can extract the target source by unraveling the acoustical mixing system , which is the desired solution. Note that this solution also allows the position of each source to be estimated, which is necessary to select the output of interest, as discussed in Section 4.1. However, when strong feedback signal components are present at the BSS inputs, the BSS solution becomes biased since the algorithm will try to unravel the feedback loop instead of targeting the acoustical mixing system only. The importance of the bias depends on the magnitude response of the filters captured by in (70), relative to the magnitude response of the filters captured by . Contrary to the AFC bias encountered in Section 3.2, the BSS bias therefore decreases with increasing SFR.
The above discussion concerning BSS algorithms can be generalized to any signal enhancement technique involving adaptive filters. The presence of feedback at the algorithm's inputs will always cause some adaptation problems. Fortunately, placing an AFC in front of the BAF as in Figure 1 can help to increase the SFR at the BAF inputs. In particular, when the AFC filters reach their ideal solution (i.e., ), then becomes zero and the bias term due to the feedback loop in (70) disappears, regardless of the amount of sound amplification applied in the forward paths.
To validate the theoretical analysis conducted in Sections 3 and 4, the binaural configuration depicted in Figure 3 was experimentally evaluated for the combination of a feedback canceler and the blind source separation algorithm introduced in Section 4.1.
The BSS processing was performed using a two-channel version of the algorithm introduced in Section 4.1, picking up the front microphone at each ear (i.e., ). Four adaptive BSS filters needed to be computed at each adaptation step. The output containing the target source (the most frontal one) was selected based on BSS-internal source localization (see Section 4.1, and ). To obtain meaningful results which are, as far as possible, independent of the AFC implementation used, the AFC filter update was performed based on the frequency-domain adaptive filtering (FDAF) algorithm . The FDAF algorithm allows for an individual step-size control for each DFT bin and a bin-wise optimum control mechanism of the step-size parameter, derived from [13, 37]. In practice, this optimum step-size control mechanism is inappropriate since it requires the knowledge of signals which are not available under real conditions, but it allows us to minimize the impact of a particular AFC implementation by providing useful information on the achievable AFC performance. Since we used two microphones, the (block-diagonal constrained) AFC consisted of two adaptive filters (see Figure 3).
Finally, to avoid other sources of interaction effects and concentrate on the AFC-BSS combination, we considered a simple linear time-invariant frequency-independent hearing-aid processing in the forward paths (i.e., and ). Furthermore, in all the results presented in Section 6, the same HA gains and decorrelation delays (see Section 3.2) were applied at both ears. The selected BSS output was therefore amplified by a factor , delayed by and played back at the two LS outputs.
5.2. Performance Measures
We saw in the previous sections that our binaural configuration significantly differs from what can usually be found in the literature on unilateral HAs. To be able to objectively evaluate the algorithms' performance in this context, especially concerning the AFC, we need to adapt some of the already existing and commonly used performance measures to the new binaural configuration. This issue is discussed in the following, based on the outcomes of the theoretical analysis presented in Sections 3 and 4.
5.2.1. Feedback Cancellation Performance Measures
In the conventional unilateral case, the feedback cancellation performance is usually measured in terms of misalignment between the (single) FBP estimate and the true (single) FBP (which is the ideal solution in the unilateral case), as well as in terms of Added Stable Gain (ASG) reflecting the benefit of AFC for the user .
The ideal binaural AFC solution has been defined in (39) for the general case, and in (46) under the block-diagonal constraint (24) and assumption (43). In the results presented in Section 6, the misalignment has been averaged over all AFC filters (two filters in our case).
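A sketch of how such an averaged misalignment could be computed is given below. It assumes the commonly used normalized form (error energy relative to the energy of the ideal solution, in dB) and assumes that the averaging is done over the dB values; neither detail is confirmed by the text above, and the filter coefficients are placeholders.

```python
import numpy as np

def misalignment_db(estimate, ideal):
    """Normalized misalignment between a filter estimate and its ideal solution."""
    estimate = np.asarray(estimate, dtype=float)
    ideal = np.asarray(ideal, dtype=float)
    return 10.0 * np.log10(np.sum((estimate - ideal) ** 2) / np.sum(ideal ** 2))

def mean_misalignment_db(estimates, ideals):
    """Average of the per-filter misalignments (assumed: mean of dB values)."""
    return float(np.mean([misalignment_db(e, i) for e, i in zip(estimates, ideals)]))

# Two hypothetical AFC filters, each with a 10 % amplitude error on one tap.
ideal_left = [1.0, 0.0, 0.0]
ideal_right = [0.0, 2.0, 0.0]
avg_db = mean_misalignment_db([[0.9, 0.0, 0.0], [0.0, 1.8, 0.0]],
                              [ideal_left, ideal_right])
```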
where has been defined in (57) and is the initial magnitude of , without AFC. Since the assumption (41) is valid in our case (with ) and since we force our AFC system to be block-diagonal, we can alternatively use the simplified expression of given by (65). Note that the initial stability margin as well as the margin with AFC , and hence the ASM, depend not only on the acoustical (binaural) FBPs, but also on the current state of the BAF filters. Also, when , becomes directly proportional to and the ASM can be interpreted as an ASG.
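As a hedged illustration, the stability margin can be taken as the negative peak magnitude of the loop function in dB, and the added stability margin as the difference between the margins with and without AFC. This particular form is an assumption consistent with the description above, not a reproduction of the paper's equation, and the magnitude values are made up.

```python
import numpy as np

def stability_margin_db(loop_magnitude):
    """Distance (in dB) of the peak loop magnitude from the stability limit of 1."""
    return -20.0 * np.log10(np.max(loop_magnitude))

def added_stability_margin_db(magnitude_with_afc, magnitude_without_afc):
    """Margin gained by the AFC relative to the initial margin without AFC."""
    return float(stability_margin_db(magnitude_with_afc)
                 - stability_margin_db(magnitude_without_afc))

# Hypothetical loop magnitudes on three frequency bins; the AFC is
# assumed to reduce the magnitude uniformly by a factor of 10.
initial = np.array([0.5, 2.0, 1.0])
with_afc = 0.1 * initial
asm_db = added_stability_margin_db(with_afc, initial)
```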
Additionally, the SFR measured at the BSS and AFC inputs should be taken into account when assessing the AFC-BSS combination, since it directly influences the performance of the algorithms. The SFR is defined in the following as the signal power ratio between the components coming from the external sources (without distinction between desired and interfering sources) and the components coming from the loudspeakers (i.e., the feedback signals).
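The SFR definition above translates directly into a power ratio in dB. The sketch below assumes a simulation in which the external-source component and the feedback component of a microphone signal are available separately; the function names and the test signals are hypothetical.

```python
import numpy as np


def power_db(x):
    """Average signal power in dB."""
    return 10.0 * np.log10(np.mean(np.square(x)))


def sfr_db(source_component, feedback_component):
    """Signal-to-feedback ratio in dB for one channel."""
    return power_db(source_component) - power_db(feedback_component)


# Hypothetical example: the feedback component is a delayed copy of the
# source component, attenuated to one tenth of its amplitude.
n = np.arange(16000)
source = np.sin(2.0 * np.pi * 0.01 * n)
feedback = 0.1 * np.roll(source, 40)
sfr = sfr_db(source, feedback)
print(f"SFR: {sfr:.1f} dB")
```

An amplitude ratio of 10 corresponds to a power ratio of 100, so this example yields an SFR of 20 dB.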
5.2.2. Signal Enhancement Performance Measures
The separation performance of the BSS algorithm is evaluated in terms of signal-to-interference ratio (SIR), that is, the signal power ratio between the components coming from the target source and the components coming from the interfering source(s). Although the feedback components and the AFC filter outputs (i.e., the compensation signals) contain some signal coming from the external sources (which causes a bias of the BSS solution, as discussed in Section 4), we ignore them in the SIR calculation since these components are undesired. An SIR gain can then be obtained as the difference between the SIR at the BSS inputs and the SIR at the BSS outputs. It reflects the ability of BSS to extract the desired components from the signal mixture, regardless of the amount of feedback (or background noise) present. Since only one BSS output is presented to the HA user (Section 4.1), we average the input SIR over all BSS input channels (here two), but we consider only the selected BSS output for the output SIR calculations.
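The SIR gain described above can be sketched as follows, assuming that the target and interferer components are available separately at the BSS inputs and at the selected BSS output (as is possible in a simulation). Averaging the input SIR over the channels in the dB domain is an assumption, and all names and signals are hypothetical.

```python
import numpy as np


def ratio_db(numerator, denominator):
    """Power ratio of two signals in dB."""
    return 10.0 * np.log10(np.mean(np.square(numerator)) / np.mean(np.square(denominator)))


def sir_gain_db(target_in, interf_in, target_out, interf_out):
    """SIR gain: output SIR of the selected output minus the mean input SIR.

    target_in, interf_in: arrays of shape (channels, samples).
    target_out, interf_out: 1-D components of the selected BSS output.
    """
    sir_in = np.mean([ratio_db(t, i) for t, i in zip(target_in, interf_in)])
    sir_out = ratio_db(target_out, interf_out)
    return float(sir_out - sir_in)


# Hypothetical example: white noise stands in for the two speech sources.
rng = np.random.default_rng(0)
target = rng.standard_normal(8000)
interferer = rng.standard_normal(8000)
target_in = np.stack([target, 0.5 * target])
interf_in = np.stack([interferer, 0.5 * interferer])
# The selected output keeps the target and attenuates the interferer by 20 dB.
gain = sir_gain_db(target_in, interf_in, target, 0.1 * interferer)
print(f"SIR gain: {gain:.1f} dB")
```

Because the feedback components and compensation signals are simply left out of all four arguments, the measure depends only on how the separation filters treat the two external sources.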
5.3. Experimental Conditions
Since a two-channel ICA-based BSS algorithm can only separate two point sources (Section 4.1), no diffuse noise was added to the sensor signal mixture, and only two point sources were considered (one target source and one interfering source).
The source signal components were then generated by convolving speech signals with the recorded HRIRs, with the target source placed in front of the HA user and the interfering source facing the right ear. The target and interfering sources had approximately equal (long-time) signal power.
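The generation of the source signal components can be sketched as below. This is only a minimal illustration under stated assumptions: white noise stands in for the speech signals, short random decaying sequences stand in for the recorded HRIRs, and the helper names are hypothetical.

```python
import numpy as np


def equalize_power(signal, reference):
    """Scale `signal` so that its average power matches that of `reference`."""
    return signal * np.sqrt(np.mean(reference ** 2) / np.mean(signal ** 2))


def spatialize(source, hrir_left, hrir_right):
    """Convolve one source with a left/right HRIR pair of equal length.

    Returns an array of shape (2, len(source) + len(hrir) - 1).
    """
    left = np.convolve(source, hrir_left)
    right = np.convolve(source, hrir_right)
    return np.stack([left, right])


rng = np.random.default_rng(1)
n_samples, hrir_len = 4000, 64
# Placeholders for the two speech signals, brought to equal long-time power.
target = rng.standard_normal(n_samples)
interferer = equalize_power(0.3 * rng.standard_normal(n_samples), target)
# Placeholder impulse responses; real HRIRs would be loaded from recordings.
decay = np.exp(-np.arange(hrir_len) / 8.0)
hrirs = {name: decay * rng.standard_normal(hrir_len)
         for name in ("target_left", "target_right", "interf_left", "interf_right")}

target_component = spatialize(target, hrirs["target_left"], hrirs["target_right"])
interf_component = spatialize(interferer, hrirs["interf_left"], hrirs["interf_right"])
# Without diffuse noise, the sensor mixture is the sum of the two components.
mixture = target_component + interf_component
```

Keeping the target and interferer components as separate arrays, rather than only their sum, is what later makes SIR and SFR measurements possible.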
left HA mounted on a manikin without obstruction,
right HA mounted on a manikin with a telephone as obstruction.
In the following, experimental results involving the combination of AFC and BSS are shown and discussed. BSS filters of 1024 coefficients each were applied, the AFC filter length was set to 256 and decorrelation delays of 5 ms were included in the forward paths.
6.1. Impact of Feedback on BSS
The discussion of Section 4 indicates that a deterioration of the BSS performance is expected at low input SFR, due to a bias introduced by the feedback loop. To determine to what extent the amount of feedback deteriorates the achievable source separation, the performance of the (adaptive) BSS algorithm was experimentally evaluated for different amounts of feedback by varying the amplification level. Preliminary tests in the absence of AFC showed that the feedback had almost no impact on the BSS performance as long as the system was stable, because the SFR at the BSS inputs remained high (greater than 20 dB). This basically validates the traditional way in which signal enhancement techniques for hearing aids have been developed, namely ignoring the presence of feedback.
At low gains, the feedback has very little impact on the SIR gain because the input SFR is sufficiently high in all tested scenarios. We also see that the interference rejection causes a decrease in SFR (from the BSS inputs to the output), since parts of the external source components are attenuated. This should be beneficial to an AFC algorithm since it reduces the bias of the AFC Wiener solution due to the interfering source, as discussed in Section 3.2. However, at high gains, where the input SFR is low (less than 10 dB), the large amount of feedback causes a significant deterioration of the interference rejection performance. Moreover, it should be noted that at low gains, the input SFR decreases proportionally to the gain, as expected. We see, however, from the figures that the input SFR suddenly drops at higher gains, when the amount of feedback becomes significant (see, e.g., the transition from 20 dB to 25 dB in Figure 6 for an open vent). Since BSS has no influence on the signal power of the external sources (the "S" component in the SFR), this means that BSS amplifies the LS signals (and hence the feedback signals at the microphones, that is, the "F" component in the SFR). This undesired effect is due to the bias introduced by the feedback loop (Section 4.2) and can be interpreted as the interplay of two mechanisms. The first one unravels the acoustical mixing system. It produces LS signals which are dominated by the target source (see the positive SIR gains in the figures), as desired. The second mechanism consists of amplifying the sensor signals. As long as the feedback level is small, this second mechanism is almost invisible since it would amplify signals coming from both sources. But at higher gains, where the amount of feedback in the BSS inputs becomes more significant, this second mechanism becomes more important since it acts mainly in favor of the target source.
This second mechanism illustrates the impact of the feedback loop on the BSS algorithm at high feedback levels. It shows the necessity to have the AFC placed before BSS, so that BSS can benefit from a higher input SFR.
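The proportional decrease of the input SFR at low gains, mentioned above, can be made explicit with a simple open-loop approximation. The notation is introduced only for this illustration and is an assumption: g denotes the HA gain, P_S the power of the external source components at a microphone, and P_F(g) the power of the feedback components at the same microphone. If the signal fed to the amplifier and the feedback paths are treated as independent of g (reasonable only while the feedback contribution is small), the LS signal scales with g, so that P_F(g) = g^2 P_F(1) and

```latex
\mathrm{SFR}_{\mathrm{dB}}(g)
  = 10 \log_{10} \frac{P_S}{P_F(g)}
  = 10 \log_{10} \frac{P_S}{g^{2} P_F(1)}
  = \mathrm{SFR}_{\mathrm{dB}}(1) - 20 \log_{10} g .
```

Under this approximation, every 5 dB increase of the HA gain lowers the input SFR by 5 dB. A faster drop, as observed at the higher gains, indicates that the feedback power grows faster than the gain alone would explain, which is consistent with the amplification of the LS signals described above.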
6.2. Overall Behavior of the AFC-BSS Combination
The results confirm the observations made in the previous section. With the AFC applied directly on the sensor signals, the BSS algorithm could indeed benefit from the ability of the AFC to keep the SFR at the BSS inputs at high levels for all considered HA gains. Therefore, BSS always provided SIR gains which were very close to the reference SIR gain obtained without feedback, even at high gains. This contrasts with the results obtained in Figures 5 and 6, where an ideal AFC was applied at the BSS output instead of being applied first.
Note that the SFR at the AFC outputs corresponds here to the SFR at the BSS inputs. The gain in SFR (i.e., the feedback attenuation) achieved by the AFC algorithm can therefore be directly visualized from Figure 7. As expected from the discussion presented in Section 3.1, the two AFC filters used were sufficient to efficiently compensate both the "direct" and "cross" feedback signals, and hence to avoid instability of the binaural closed-loop system. As in the unilateral case and as expected from the convergence analysis conducted in Section 3.2, the best AFC results were obtained at low input SFR levels, that is, at high gains. The AFC performance was also better in the low-reverberation chamber than in the living-room-like environment, as can be seen from the higher SFR levels at the BSS inputs, the higher ASM values and the lower misalignments. This result seems surprising at first sight, since the FBPs were identical in both environments. It can, however, easily be explained by the analytical results presented in Section 3.2. We saw that the correlation between the external source signals and the LS signals introduces a bias of the AFC Wiener solution. The bias due to the target source is barely influenced by the BSS results, since BSS left the target signal (almost) unchanged in both environments. But the BSS performance directly influences the amount of residual interfering signal present at the LS outputs, and hence the bias of the AFC Wiener solution due to the interfering source. In general, since reverberation increases the length of the acoustical mixing filters (and hence, typically, the necessary BSS filter length), the more reverberant the environment, the lower the achieved separation results (for a given BSS filter length). This is confirmed here by the SIR results shown in the figures.
The difference in AFC performance therefore comes from the higher amount of residual interfering signal present at the LS outputs in the living-room-like environment, which increases the bias of the AFC Wiener solution.
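The role of the correlation between external source signals and LS signals can be illustrated with a toy experiment. The sketch below is not the binaural closed-loop system studied here, but a strongly simplified single-channel, open-loop setup under stated assumptions: a first-order autoregressive process stands in for a speech source, the LS signal is an amplified and delayed copy of that source, the feedback path is a short hypothetical filter, and a batch least-squares fit approximates the Wiener solution. All names and parameter values are hypothetical.

```python
import numpy as np


def afc_bias_norm(gain, delay, ar_coeff=0.9, n_samples=20000, filt_len=8, seed=0):
    """Norm of the error of a least-squares feedback path estimate."""
    rng = np.random.default_rng(seed)
    # Correlated stand-in for a speech source: first-order autoregressive process.
    excitation = rng.standard_normal(n_samples)
    source = np.zeros(n_samples)
    for n in range(1, n_samples):
        source[n] = ar_coeff * source[n - 1] + excitation[n]
    # Open-loop forward path: the LS signal is the amplified, delayed source.
    ls_signal = np.zeros(n_samples)
    ls_signal[delay:] = gain * source[:n_samples - delay]
    # Hypothetical short feedback path; the microphone picks up source plus feedback.
    fbp_true = 0.01 * rng.standard_normal(filt_len)
    mic = source + np.convolve(ls_signal, fbp_true)[:n_samples]
    # Least-squares fit of the microphone signal from delayed copies of the LS signal.
    regressors = np.column_stack(
        [np.concatenate([np.zeros(k), ls_signal[:n_samples - k]]) for k in range(filt_len)]
    )
    fbp_est, *_ = np.linalg.lstsq(regressors, mic, rcond=None)
    return float(np.linalg.norm(fbp_est - fbp_true))


bias_short_delay = afc_bias_norm(gain=1.0, delay=1)
bias_long_delay = afc_bias_norm(gain=1.0, delay=50)
bias_high_gain = afc_bias_norm(gain=10.0, delay=1)
print(bias_short_delay, bias_long_delay, bias_high_gain)
```

In this toy setup the estimation error shrinks when the delay is long enough to decorrelate the LS signal from the source, and also when the gain is raised so that the feedback dominates the source at the microphone, which mirrors the qualitative trends described above.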
The AFC does not suffer from any particular negative interactions with the BSS algorithm since it comes first in the processing chain, but rather benefits from BSS, especially in the low-reverberation chamber, as we just saw. Note that the situation is very different when the AFC is applied after BSS. In this case, the AFC filters need to quickly follow the continuously time-varying BSS filters, which prevents proper convergence of the AFC filters, even with time-invariant FBPs.
An analysis of a system combining adaptive feedback cancellation and adaptive binaural filtering for signal enhancement in hearing aids was presented. To illustrate our study, a blind source separation algorithm was chosen as an example for adaptive binaural filtering. A number of interaction effects could be identified. Moreover, to correctly understand the behavior of the AFC, the system was described and analyzed in detail. A new stability condition adapted to the binaural configuration could be derived, and adequate performance measures were proposed which account for the specificities of the binaural system. Experimental evaluations confirmed and illustrated the theoretical findings.
The ideal AFC solution in the new binaural configuration could be identified, but a steady-state analysis showed that the AFC suffers from a bias in its optimum (Wiener) solution. This bias, similar to the unilateral case, is due to the correlation between feedback and external source signals. It was also demonstrated theoretically as well as experimentally that a signal enhancement algorithm could help reduce this bias. The correlation between feedback and external source signals also causes a bias of the BAF solution. But contrary to the bias encountered by the AFC, the BAF bias increases with increasing HA amplification levels. Fortunately, this bias can be reduced by applying AFC on the sensor signals directly, instead of applying it on the BAF outputs.
The analysis also showed that two SIMO AFC systems of adaptive filters can effectively compensate for the four SIMO FBP systems when the outputs are sufficiently correlated (see Section 3.1). Should this condition not be fulfilled because of, for example, some non-linearities in the forward paths, the "cross" feedback signals travelling from one ear to the other would not be completely identifiable. But we saw that since the amplitude of the "cross" FBPs is usually negligible compared to the amplitude of the "direct" FBPs, the consequences would be very limited as long as the HA gains are set to similar amplification levels.
This work has been supported by a grant from the European Union FP6 Project 004171 HEARCOM (http://hearcom.eu/main_de.html). We would also like to thank Siemens Audiologische Technik, Erlangen, for providing some of the hearing-aid recordings for evaluation.
- Doclo S, Moonen M: GSVD-based optimal filtering for single and multimicrophone speech enhancement. IEEE Transactions on Signal Processing 2002, 50(9):2230-2244. doi:10.1109/TSP.2002.801937
- Spriet A, Moonen M, Wouters J: Spatially pre-processed speech distortion weighted multi-channel Wiener filtering for noise reduction. Signal Processing 2004, 84(12):2367-2387. doi:10.1016/j.sigpro.2004.07.028
- Doclo S, Spriet A, Moonen M, Wouters J: Design of a robust multi-microphone noise reduction algorithm for hearing instruments. Proceedings of the International Symposium on Mathematical Theory of Networks and Systems (MTNS '04), July 2004, Leuven, Belgium 1-9.
- Vanden Berghe J, Wouters J: An adaptive noise canceller for hearing aids using two nearby microphones. Journal of the Acoustical Society of America 1998, 103(6):3621-3626. doi:10.1121/1.423066
- Welker DP, Greenberg JE, Desloge JG, Zurek PM: Microphone-array hearing aids with binaural output—part II: a two-microphone adaptive system. IEEE Transactions on Speech and Audio Processing 1997, 5(6):543-551. doi:10.1109/89.641299
- Doerbecker M, Ernst S: Combination of two-channel spectral subtraction and adaptive Wiener post-filtering for noise reduction and dereverberation. Proceedings of the 8th European Signal Processing Conference (EUSIPCO '96), September 1996, Trieste, Italy 995-998.
- Klasen TJ, van den Bogaert T, Moonen M, Wouters J: Binaural noise reduction algorithms for hearing aids that preserve interaural time delay cues. IEEE Transactions on Signal Processing 2007, 55(4):1579-1585.
- Doclo S, Dong R, Klasen TJ, Wouters J, Haykin S, Moonen M: Extension of the multi-channel Wiener filter with ITD cues for noise reduction in binaural hearing aids. Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '05), October 2005, New Paltz, NY, USA 70-73.
- Klasen TJ, Doclo S, van den Bogaert T, Moonen M, Wouters J: Binaural multi-channel Wiener filtering for hearing aids: preserving interaural time and level differences. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '06), May 2006, Toulouse, France 5: 145-148.
- Aichner R, Buchner H, Zourub M, Kellermann W: Multi-channel source separation preserving spatial information. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '07), May 2007, Honolulu, Hawaii, USA 1: 5-8.
- Egolf DP: Review of the acoustic feedback literature for a control systems point of view. In The Vanderbilt Hearing-Aid Report. York Press, London, UK; 1982:94-103.
- Siqueira MG, Alwan A: Steady-state analysis of continuous adaptation in acoustic feedback reduction systems for hearing-aids. IEEE Transactions on Speech and Audio Processing 2000, 8(4):443-453. doi:10.1109/89.848225
- Puder H, Beimel B: Controlling the adaptation of feedback cancellation filters—problem analysis and solution approaches. Proceedings of the 12th European Conference on Signal Processing (EUSIPCO '04), September 2004, Vienna, Austria 25-28.
- Spriet A: Adaptive filtering techniques for noise reduction and acoustic feedback cancellation in hearing aids, Ph.D. thesis. Katholieke Universiteit Leuven, Leuven, Belgium; 2004.
- Spriet A, Rombouts G, Moonen M, Wouters J: Combined feedback and noise suppression in hearing aids. IEEE Transactions on Audio, Speech, and Language Processing 2007, 15(6):1777-1790.
- Kellermann W: Strategies for combining acoustic echo cancellation and adaptive beamforming microphone arrays. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), April 1997, Munich, Germany 1: 219-222.
- Haykin S: Adaptive Filter Theory, Prentice Hall Information and System Sciences Series. 4th edition. Prentice Hall, Upper Saddle River, NJ, USA; 2002.
- Sondhi MM, Morgan DR, Hall JL: Stereophonic acoustic echo cancellation—an overview of the fundamental problem. IEEE Signal Processing Letters 1995, 2(8):148-151. doi:10.1109/97.404129
- Benesty J, Morgan DR, Sondhi MM: A better understanding and an improved solution to the specific problems of stereophonic acoustic echo cancellation. IEEE Transactions on Speech and Audio Processing 1998, 6(2):156-165. doi:10.1109/89.661474
- Morgan DR, Hall JL, Benesty J: Investigation of several types of nonlinearities for use in stereo acoustic echo cancellation. IEEE Transactions on Speech and Audio Processing 2001, 9(6):686-696. doi:10.1109/89.943346
- Spriet A, Proudler I, Moonen M, Wouters J: An instrumental variable method for adaptive feedback cancellation in hearing aids. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '05), March 2005, Philadelphia, Pa, USA 3: 129-132.
- van Waterschoot T, Moonen M: Adaptive feedback cancellation for audio signals using a warped all-pole near-end signal model. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '08), April 2008, Las Vegas, Nev, USA 269-272.
- Araki S, Sawada H, Mukai R, Makino S: A novel blind source separation method with observation vector clustering. Proceedings of the International Workshop on Acoustic Echo and Noise Control (IWAENC '05), September 2005, Eindhoven, The Netherlands 117-120.
- Cermak J, Araki S, Sawada H, Makino S: Blind source separation based on a beamformer array and time frequency binary masking. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), May 2007, Honolulu, Hawaii, USA 1: 145-148.
- Hyvärinen A, Karhunen J, Oja E: Independent Component Analysis. John Wiley & Sons, New York, NY, USA; 2001.
- Herbordt W: Sound Capture for Human/Machine Interfaces. Springer, Berlin, Germany; 2005.
- Parra L, Spence C: Convolutive blind separation of non-stationary sources. IEEE Transactions on Speech and Audio Processing 2000, 8(3):320-327. doi:10.1109/89.841214
- Sawada H, Mukai R, Araki S, Makino S: Polar coordinate based nonlinear function for frequency-domain blind source separation. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 1: 1001-1004.
- Kurita S, Saruwatari H, Kajita S, Takeda K, Itakura F: Evaluation of blind signal separation method using directivity pattern under reverberant conditions. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 5: 3140-3143.
- Sawada H, Mukai R, Araki S, Makino S: A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Transactions on Speech and Audio Processing 2004, 12(5):530-538. doi:10.1109/TSA.2004.832994
- Buchner H, Aichner R, Kellermann W: A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics. IEEE Transactions on Speech and Audio Processing 2005, 13(1):120-134.
- Aichner R, Buchner H, Yan F, Kellermann W: A real-time blind source separation scheme and its application to reverberant and noisy acoustic environments. Signal Processing 2006, 86(6):1260-1277. doi:10.1016/j.sigpro.2005.06.022
- Buchner H, Aichner R, Kellermann W: TRINICON: a versatile framework for multichannel blind signal processing. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '04), May 2004, Montreal, Canada 3: 889-892.
- Eneman K, Luts H, Wouters J, et al.: Evaluation of signal enhancement algorithms for hearing instruments. Proceedings of the 16th European Signal Processing Conference (EUSIPCO '08), August 2008, Lausanne, Switzerland 1-5.
- Buchner H, Aichner R, Stenglein J, Teutsch H, Kellermann W: Simultaneous localization of multiple sound sources using blind adaptive MIMO filtering. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '05), March 2005, Philadelphia, Pa, USA 3: 97-100.
- Shynk JJ: Frequency-domain and multirate adaptive filtering. IEEE Signal Processing Magazine 1992, 9(1):14-37. doi:10.1109/79.109205
- Mader A, Puder H, Schmidt GU: Step-size control for acoustic echo cancellation filters—an overview. Signal Processing 2000, 80(9):1697-1719. doi:10.1016/S0165-1684(00)00082-7
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.