A Bayesian network approach to linear and nonlinear acoustic echo cancellation
 Christian Huemmer^{1},
 Roland Maas^{1},
 Christian Hofmann^{1} and
 Walter Kellermann^{1}
https://doi.org/10.1186/s13634-015-0282-2
© Huemmer et al. 2015
 Received: 25 June 2015
 Accepted: 6 November 2015
 Published: 25 November 2015
Abstract
This article provides a general Bayesian approach to the tasks of linear and nonlinear acoustic echo cancellation (AEC). We introduce a state-space model with a latent state vector modeling all relevant information of the unknown system. Based on three cases for defining the state vector (to model a linear or nonlinear echo path) and its mathematical relation to the observation, it is shown that the normalized least mean square algorithm (with fixed and adaptive step-size), the Hammerstein group model, and a numerical sampling scheme for nonlinear AEC can be derived by applying fundamental techniques for probabilistic graphical models. As a consequence, the major contribution of this Bayesian approach is a unifying graphical-model perspective which may serve as a powerful framework for future work in linear and nonlinear AEC.
Keywords
 Bayesian networks
 Acoustic echo cancellation
 Graphical models
1 Introduction
The problem of acoustic echo cancellation (AEC) is one of the earliest applications of adaptive filtering to acoustic signals and yet is still an active research topic [1, 2]. Especially in applications like teleconferencing and hands-free communication systems, it is of vital importance to compensate acoustic echoes and thus prevent the users from listening to delayed versions of their own speech [3]. Since the invention of the normalized least mean square (NLMS) algorithm in 1960 [4], the acoustic coupling between loudspeakers and microphones has often been modeled by adaptive linear finite impulse response (FIR) filters. However, the statistical properties of speech signals (being wide-sense stationary only for short time frames) and challenging properties of the acoustic environment (such as speech signals as interference, non-stationary background noise, and time-varying acoustic echo paths) complicate the filter adaptation and have motivated various concepts for improving the performance of linear FIR filters in many practical scenarios [5–7]. Despite these challenges, single-channel linear AEC has already reached a mature state as a vital part of modern communication devices. On the other hand, the nonlinear distortions created by amplifiers and transducers in miniaturized loudspeakers require dedicated nonlinear echo-path models and are still a very active research topic [8, 9]. In this context, a variety of concepts for nonlinear AEC have been proposed based on artificial neural networks [10, 11], Volterra filters [12, 13], or kernel methods [14, 15]. A commonly used model, which is also considered in this article, is a cascade of a nonlinear memoryless preprocessor (to model the loudspeaker signal distortions) and an adaptive linear FIR filter (to model the acoustic sound propagation and the microphone) [9, 16–19].
Recently, the application of machine learning techniques to signal processing tasks has attracted increasing interest [20–22]. In particular, graphical models provide a powerful framework for deriving (links between) numerous existing algorithms based on probabilistic inference [23–25]. Besides the widely used factor graphs, which capture detailed information about the factorization of a joint probability distribution [23, 26, 27], especially directed graphical models, such as Bayesian networks, have been shown to be well-suited for modeling causal probabilistic relationships of sequential data like speech [28, 29].
This article provides a concise overview of different algorithms for linear and nonlinear AEC from a unifying Bayesian network perspective. For this, we consider a state-space model with a latent (unobserved) state vector capturing all relevant information of the unknown system. Depending on the definition of the state vector (modeling a linear or nonlinear echo path) and its mathematical relation to the observation, we illustrate that the application of different probabilistic inference techniques to the same graphical model straightforwardly leads to the NLMS algorithm with fixed/adaptive step-size value, the Hammerstein group model (considered from this perspective here for the first time), and a numerical sampling scheme for nonlinear AEC. This consistent Bayesian view on conceptually different algorithms highlights the probabilistic assumptions underlying the respective derivations and provides a powerful framework for further research in linear and nonlinear AEC.
Note that \(\mathbf{C}_{\mathbf{z},n} = C_{\mathbf{z},n}\mathbf{I}\) (with identity matrix \(\mathbf{I}\)) implies the elements of \(\mathbf{z}_{n}\) to be mutually statistically independent and of equal variance \(C_{\mathbf{z},n}\). Finally, we distinguish between the probability density function (PDF) \(p(\mathbf{z}_{n})\) of a random variable \(\mathbf{z}_{n}\) and its realizations \(\mathbf{z}_{n}^{(l)}\) (samples drawn from \(p(\mathbf{z}_{n})\)), where l is the sample index.
This article is structured as follows: First, we briefly review Bayesian networks and introduce a general state-space model in Section 2. This state-space model will be further specified in Section 3 for the tasks of linear and nonlinear AEC. This is followed by applying several fundamental probabilistic inference techniques for deriving the NLMS algorithm with fixed/adaptive step-size value (linear AEC, Section 4), as well as the Hammerstein group model and a numerical sampling scheme (nonlinear AEC, Section 5). Finally, the practical performance of the algorithms is illustrated in Section 6 and conclusions are drawn in Section 7.
2 Review of Bayesian networks and state-space modeling
This section provides a concise review of Bayesian networks and state-space modeling following the detailed discussions in [30].
2.1 Bayesian networks
The same property of conditional independence can be derived for the case of a tail-to-tail relationship in z _{2} as shown in Fig. 2 b. In contrast, two independent random variables z _{1} and z _{3} are conditionally dependent given z _{2} if they share a head-to-head relationship as in Fig. 2 c, which would, e.g., be the case if z _{2} was defined as z _{2}=z _{1}+z _{3}.

arrows meet head-to-tail or tail-to-tail, and the node is in the set C

arrows meet head-to-head and neither the node nor any of its descendants is in the set C [30].
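The head-to-head case can be checked numerically. The following minimal sketch assumes standard normal variables, the example definition z _{2}=z _{1}+z _{3} from above, and a narrow window around zero as a stand-in for conditioning on z _{2} (the window width and sample size are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two independent parents and their common child (head-to-head node).
z1 = rng.standard_normal(n)
z3 = rng.standard_normal(n)
z2 = z1 + z3

# Marginally, z1 and z3 are uncorrelated.
corr_marginal = np.corrcoef(z1, z3)[0, 1]

# Approximate conditioning on z2 by keeping only samples with z2 close to 0.
mask = np.abs(z2) < 0.1
corr_given_z2 = np.corrcoef(z1[mask], z3[mask])[0, 1]

print(f"corr(z1, z3)          = {corr_marginal:+.3f}")
print(f"corr(z1, z3 | z2 ~ 0) = {corr_given_z2:+.3f}")
```

Once z _{2} is (approximately) observed, a large z _{1} forces a small z _{3}, so the two parents become strongly negatively correlated although they are independent a priori.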
2.2 State-space modeling
In this part, we introduce a general probabilistic model (later applied to linear and nonlinear AEC) and review fundamental techniques which are commonly employed in Bayesian network modeling.

With respect to the latent state vector z _{ n−1}, the head-to-tail relationships of all paths from d _{1:n−2} to z _{ n } and the tail-to-tail relationship of the path from d _{ n−1} to z _{ n } together imply that the current state vector z _{ n } depends on all previous observations d _{1:n−1} only through z _{ n−1}. For the conditional PDF of z _{ n } given {z _{ n−1},d _{1:n−1}}, this leads to:
$$ p \left(\mathbf{z}_{n} \mid \mathbf{z}_{n-1}, d_{1:n-1}\right) = p \left(\mathbf{z}_{n} \mid \mathbf{z}_{n-1}\right). \tag{8} $$

The current observation d _{ n } depends on all previous observations d _{1:n−1} only through the head-to-tail relationship in the latent state vector z _{ n }. This allows us to reformulate the conditional PDF of d _{ n } given {z _{ n },d _{1:n−1}} as
$$ p \left(d_{n} \mid \mathbf{z}_{n}, d_{1:n-1}\right) = p \left(d_{n} \mid \mathbf{z}_{n}\right). \tag{9} $$

w _{ n } is normally distributed with mean vector 0 and covariance matrix \(\mathbf{C}_{\mathbf{w},n}\) defined by the scalar variance \(C_{\mathbf{w},n}\):
$$ \mathbf{w}_{n} \sim \mathcal{N} \{\mathbf{w}_{n} \mid \mathbf{0}, \mathbf{C}_{\mathbf{w}, n}\},\quad \mathbf{C}_{\mathbf{w},n} = C_{\mathbf{w},n} \mathbf{I}. \tag{10} $$

v _{ n } is assumed to be normally distributed with variance C _{ v,n } and zero mean:
$$ v_{n} \sim \mathcal{N} \{ v_{n} \mid 0, C_{v, n}\}. \tag{11} $$
To derive estimates for the state vector and the hyperparameters C _{ v,n } and C _{ w,n }, we recall the steps of probabilistic inference and learning in the next part.
where \(\|\cdot\|_{2}\) is the Euclidean norm and \(\mathcal {E}\{\cdot \}\) the expectation operator. Note that this MMSE estimate can be calculated in an analytically closed form in case of linear relations between the variables in (7) and is optimal in the Bayesian sense for jointly normally distributed random variables z _{ n } and d _{1:n}.
In the learning stage, the hyperparameters C _{ v,n } and C _{ w,n } of the state-space model in (7) are estimated by solving a maximum likelihood (ML) problem (see Section 4.1 for more details).
3 State-space model for linear and nonlinear AEC
where the latent length-M vector h _{ n } models the acoustic path between the loudspeaker and the microphone. Note that the observation equation in (20) is denoted as a model which is linear in the coefficients (LIC model) due to the linear relation between the elements of the state vector z _{ n } and the observation d _{ n }.
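To make the linear case concrete, the following minimal sketch draws one realization of the microphone signal for a given loudspeaker signal. It assumes a random-walk state equation h _{ n }=h _{ n−1}+w _{ n } and the linear observation d _{ n }=x _{ n }^T h _{ n }+v _{ n } with the Gaussian uncertainties of (10) and (11); the function name `simulate_linear_echo_path` and all numerical values are illustrative assumptions, not part of the article.

```python
import numpy as np

def simulate_linear_echo_path(x, h0, c_w, c_v, rng):
    """Simulate d_n = x_n^T h_n + v_n with h_n = h_{n-1} + w_n (assumed model)."""
    M = len(h0)
    h = np.array(h0, dtype=float)
    # Zero-pad so that x_n = [x_n, x_{n-1}, ..., x_{n-M+1}] exists for every n.
    x_pad = np.concatenate([np.zeros(M - 1), x])
    d = np.empty(len(x))
    for n in range(len(x)):
        # State equation: random walk with variance c_w per filter tap.
        h = h + np.sqrt(c_w) * rng.standard_normal(M)
        x_n = x_pad[n:n + M][::-1]
        # Observation equation: linear in the coefficients plus noise v_n.
        d[n] = x_n @ h + np.sqrt(c_v) * rng.standard_normal()
    return d, h

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)              # stand-in for the loudspeaker signal
h0 = np.array([0.8, -0.4, 0.2, 0.1])       # hypothetical initial echo path
d, h_final = simulate_linear_echo_path(x, h0, c_w=1e-6, c_v=1e-4, rng=rng)
```

With both variances set to zero, the model reduces to a time-invariant FIR filter, i.e., d equals the convolution of x with h0.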
This represents a LIC model as the output d _{ n } linearly depends on the coefficients of z _{ n }. The three previously described pairs of observation equations and state vector definitions represent special cases of the state-space model in (7) and will be employed in the subsequent sections to derive algorithms for linear and nonlinear AEC following the schematic overview in Fig. 5.
4 A Bayesian view on linear AEC
where tr{·} represents the trace of a matrix. This implies the filter taps to be uncorrelated and of equal estimation uncertainty. The assumption (27) will be the basis for deriving the NLMS algorithm with adaptive (Section 4.1) and fixed (Section 4.2) step-size value.
4.1 NLMS algorithm with adaptive step-size value [32]
However, the approximations in (29) often lead to oscillations which have to be addressed by limiting the absolute value of β _{ n } [36].
In the following, we employ Bayesian network modeling to derive the filter update of (28) (in the inference stage) and an estimation scheme for the adaptive step-size β _{ n } (in the learning stage).
Inserting (38) and (39) into (28) finally yields the identical expression for the filter update as in (34). Altogether, we have thus derived the adaptive step-size NLMS algorithm (initially heuristically proposed in 1982 [34]) by applying fundamental techniques of Bayesian network modeling to a special realization of the fundamental state-space model in (7). Next, we estimate the hyperparameters C _{ v,n } and C _{ w,n } in the learning stage to realize the adaptive step-size NLMS algorithm in (34) without exploiting the approximations of (28).
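The inference stage can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: it assumes that the hyperparameters C _{ v,n } and C _{ w,n } are known constants (`c_v`, `c_w`) instead of being learned, that the filter update has the form visible in the first line of (49) with C _{ h,n−1} replaced by the predicted uncertainty C _{ h,n−1}+C _{ w,n }, and that the scalar uncertainty is propagated by the trace of the Kalman covariance update divided by M. The function name and the initial value `c_h0` are hypothetical.

```python
import numpy as np

def nlms_adaptive_step(x, d, M, c_w, c_v, c_h0=1.0):
    """Diagonal-covariance Kalman filter, read as an NLMS with adaptive step size."""
    h_hat = np.zeros(M)
    c_h = c_h0                                   # scalar estimation uncertainty
    x_pad = np.concatenate([np.zeros(M - 1), x])
    e = np.empty(len(x))
    for n in range(len(x)):
        x_n = x_pad[n:n + M][::-1]               # [x_n, ..., x_{n-M+1}]
        c_pred = c_h + c_w                       # uncertainty after the state transition
        e[n] = d[n] - x_n @ h_hat                # a-priori error signal
        power = x_n @ x_n
        denom = power * c_pred + c_v
        h_hat = h_hat + c_pred * x_n * e[n] / denom
        # Trace of (I - k x^T) * c_pred * I, divided by M (assumed scalar update).
        c_h = c_pred * (1.0 - c_pred * power / (M * denom))
    return h_hat, e
```

The effective step size `c_pred * power / denom` is close to one while the estimation uncertainty dominates the observation noise and shrinks as the filter converges, which is the qualitative behavior expected from an adaptive step-size NLMS algorithm.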
Note that this approximated ML solution is only guaranteed to converge to a locally, but not necessarily globally, optimal solution [32].
4.2 NLMS algorithm with fixed step-size value [38]

The uncertainty w _{ n } is equal to zero by choosing C _{ w,n }=0 in (10).

The variance of the microphone signal uncertainty C _{ v,n } is proportional to the current loudspeaker power and the estimation uncertainty C _{ h,n−1}:
$$ C_{v,n} = \tilde{\alpha}\, \mathbf{x}_{n}^{\mathrm{T}} \mathbf{x}_{n} C_{\mathbf{h},n-1}, \quad \text{where} \quad \tilde{\alpha}\geq 0. \tag{48} $$
Inserting both assumptions into (34) leads to the filter update of the NLMS algorithm
$$ \begin{aligned} \hat{\mathbf{h}}_{n} &= \hat{\mathbf{h}}_{n-1} + \frac{C_{\mathbf{h},n-1} \mathbf{x}_{n} e_{n} }{\mathbf{x}^{\mathrm{T}}_{n}\mathbf{x}_{n} C_{\mathbf{h},n-1} + \tilde{\alpha}\, \mathbf{x}_{n}^{\mathrm{T}} \mathbf{x}_{n} C_{\mathbf{h},n-1}} \\ &=\hat{\mathbf{h}}_{n-1} + \frac{\alpha }{\mathbf{x}^{\mathrm{T}}_{n}\mathbf{x}_{n}} \mathbf{x}_{n} e_{n} \end{aligned} \tag{49} $$
with fixed step-size value
$$ \alpha = (1+\tilde{\alpha})^{-1}. \tag{50} $$
Interestingly, the resulting step-size α lies in the interval typically chosen for an NLMS algorithm: if the additive uncertainty is equal to zero (\(C_{v,n}\stackrel {(48)}{=}0\) for \(\tilde {\alpha }=0\)), the step-size reaches its maximum value of \(\alpha \stackrel {(50)}{=}1\). With increasing additive uncertainty (\(C_{v,n}\stackrel {(48)}{\rightarrow }\infty \) for \(\tilde {\alpha } \rightarrow \infty \)), the step-size decreases and tends to zero.
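The update (49) with the step size (50) translates directly into code. The sketch below is a minimal illustration; the function name `nlms_fixed_step` and the small regularization constant `eps` (added only to avoid a division by zero for an all-zero input vector) are assumptions that do not appear in the derivation.

```python
import numpy as np

def nlms_fixed_step(x, d, M, alpha_tilde=0.0, eps=1e-12):
    """NLMS filter update of (49) with step size alpha = 1 / (1 + alpha_tilde)."""
    alpha = 1.0 / (1.0 + alpha_tilde)            # Eq. (50)
    h_hat = np.zeros(M)
    x_pad = np.concatenate([np.zeros(M - 1), x])
    e = np.empty(len(x))
    for n in range(len(x)):
        x_n = x_pad[n:n + M][::-1]               # [x_n, ..., x_{n-M+1}]
        e[n] = d[n] - x_n @ h_hat                # a-priori error signal
        # eps is an illustrative safeguard, not part of (49).
        h_hat = h_hat + alpha * x_n * e[n] / (x_n @ x_n + eps)
    return h_hat, e

rng = np.random.default_rng(4)
h_true = rng.standard_normal(8)                  # hypothetical echo path
x = rng.standard_normal(2000)
d = np.convolve(x, h_true)[:len(x)]              # noise-free microphone signal
h_hat, e = nlms_fixed_step(x, d, M=8, alpha_tilde=0.0)
```

In this noise-free toy setup, \(\tilde{\alpha}=0\) gives α=1 and the estimate converges to the true coefficients; larger values of \(\tilde{\alpha}\) trade convergence speed for robustness against observation noise.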
5 A Bayesian view on nonlinear AEC
In this section, we consider the nonlinear AEC scenario of Fig. 4 and compare both realizations of the state vector in (22) and (23), i.e., models having a linear (LIC models) or nonlinear (NIC models) relation between the observation and the coefficients of the state vector.
5.1 LIC model: Hammerstein group models
5.2 NIC model: numerical sampling
which describe the likelihoods that the observation is obtained by the corresponding particle (as measures for the probability of the samples to be drawn from the true PDF [40]). To calculate the weights in (56), the particles are plugged into (18) to determine the estimated microphone samples \(d^{(l)}_{n}\).

Starting point: L particles \(\mathbf {z}^{(l)}_{n}\).

Measurement update: Calculate the weights \(\omega ^{(l)}_{n}\) and determine the posterior PDF \(p \left (\mathbf {z}_{n} \mid d_{1:n}\right)\) (see (56) and (55), respectively).

Time update: Replace all particles by L new samples drawn from the posterior PDF [30]
$$ p \left(\mathbf{z}_{n+1} \mid d_{1:n}\right) = \sum\limits_{l=1}^{L} \omega^{(l)}_{n}\, p \left(\mathbf{z}_{n+1} \mid \mathbf{z}^{(l)}_{n}\right), \tag{58} $$
which is equivalent to sampling from \(p \left (\mathbf {z}_{n} \mid d_{1:n}\right)\) and subsequently adding one realization of the uncertainty w _{ n+1} defined in (10)^{1}. This is the starting point for the next iteration step.
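One iteration of these three steps can be sketched as follows. The sketch makes several assumptions that go beyond the text: the weights are taken to be normalized Gaussian likelihoods of the observation given each particle, the estimated microphone sample of a particle is computed with a linear observation model (instead of the nonlinear model (18), which a nonlinear AEC system would use), and the function name `particle_filter_step` is hypothetical.

```python
import numpy as np

def particle_filter_step(particles, x_n, d_n, c_v, c_w, rng):
    """One measurement and time update of a classical particle filter (sketch)."""
    # Measurement update: Gaussian likelihood of d_n for every particle
    # (assumed weight definition), normalized to sum to one.
    d_hat = particles @ x_n                        # estimated microphone samples
    log_lik = -0.5 * (d_n - d_hat) ** 2 / c_v
    w = np.exp(log_lik - log_lik.max())            # subtract max for stability
    w /= w.sum()
    z_mmse = w @ particles                         # mean of the discrete posterior

    # Time update: draw L samples from the discrete posterior and add one
    # realization of the process uncertainty w_{n+1} to each of them.
    L = len(particles)
    idx = rng.choice(L, size=L, p=w)
    new_particles = particles[idx] + np.sqrt(c_w) * rng.standard_normal(particles.shape)
    return new_particles, w, z_mmse
```

For a low-dimensional toy state (e.g., two coefficients and a few thousand particles), repeating this step tracks the true coefficients well; the degradation discussed next appears when the dimension of the state vector grows.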
Unfortunately, the classical particle filter (initially proposed for tracking applications) is conceptually ill-suited for the task of nonlinear AEC: it is well known that the performance degrades with increasing search space and that the local optimization problem is solved without generalizing the instantaneous solution (see the weight calculation in (56)) [41–43]. These properties of the classical particle filter are severe limitations for the task of nonlinear AEC with its high-dimensional state vector (see (22)). To cope with these conceptual limitations without introducing sophisticated resampling methods [40, 44], the elitist particle filter based on evolutionary strategies (EPFES) has recently been proposed in [9]. As the major modification for the task of nonlinear AEC, an evolutionary selection process evaluates realizations of the state vector based on recursively calculated particle weights in order to generalize the instantaneous solution of the optimization problem [9]. These fundamental properties of the EPFES will be illustrated for the state-space model of (7) in the next part.

Starting point: L particles \(\mathbf {z}^{(l)}_{n}\) with weights \(\omega ^{(l)}_{n-1}\) determined in the previous time step.

Measurement update: Update weights \(\omega ^{(l)}_{n}\) in (59), select elitist particles, and determine \(p \left (\mathbf {z}_{n} \mid d_{1:n}\right)\) by inserting the set of elitist particles \(\bar {\mathbf {z}}^{(q_{n})}_{n}\) and weights \(\bar {\omega }^{(q_{n})}_{n}\) into (55).

Time update: Replace the non-elitist particles by new samples drawn from the posterior PDF \(p \left (\mathbf {z}_{n} \mid d_{1:n}\right)\). Furthermore, add realizations of w _{ n+1} (following (7)) to the set of particles (containing Q _{ n } elitist particles and L−Q _{ n } new samples). This is the starting point for the next iteration step^{2}.
It has been shown that these modifications of the classical particle filter generalize the instantaneous solution of the optimization problem and thus allow the nonlinear-linear cascade in Fig. 4 to be identified [9]. However, the EPFES evaluates realizations of the state vector based on long-term fitness measures. This leads to a high computational complexity due to the high dimension of the state vector in (22). Although many real-time implementations of particle filters have been proposed using parallel processing units [46, 47], it might be necessary for typical applications of nonlinear AEC (e.g., in mobile devices) to reduce the computational complexity to meet specific hardware constraints. Note that a very efficient solution for this problem is the so-called significance-aware EPFES (SA-EPFES) proposed in [19], where the NLMS algorithm (to estimate the linear component of the AEC scenario) is combined with the EPFES (to estimate the loudspeaker signal distortions) by applying significance-aware (SA) filtering. In short, the fundamental idea of SA filtering is to reduce the computational complexity by exploiting physical knowledge about the most significant part of the linear subsystem to estimate the coefficients of the nonlinear preprocessor [18]. Thus, the state vector in (22) underlying the derivation of the SA-EPFES models the coefficients of the nonlinear preprocessor and a small part of the impulse response around the highest energy peak (to capture estimation errors of the NLMS algorithm in the direct-path region).
6 Experimental performance
This overview article establishes a unifying Bayesian network view on linear and nonlinear AEC with the goal of driving future research by highlighting the idealizations and limitations in the probabilistic models of existing methods. Note that a detailed analysis of the adaptive algorithms described in the previous sections has already been performed in [18, 19]. Therefore, we briefly summarize the main findings without explicitly detailing the practical realizations of the algorithms (see [18, 19] for more details). For a recorded female speech signal (commercial smartphone placed on a table with display facing the desk) in a medium-size room with moderate background noise (SNR ≈ 40 dB), the NLMS algorithm (length-256 FIR filter at 16 kHz) achieved an average echo return loss enhancement (ERLE) of 8.2 dB in a time interval of 9 s [19]. Compared to this, the Hammerstein group model and the SA-EPFES improve the average ERLE by 34 and 68 % at a computational complexity increased by 27 and 50 %, respectively [19]. To achieve these results, the Hammerstein group model (termed SA-HGM in [18]) and the SA-EPFES are realized based on the concept of SA filtering [18] (11 filter taps for the direct-path region of the RIR) by using length-256 FIR filters and a third-order memoryless preprocessor (inserting odd-order Legendre functions into (18)).
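For readers who want to reproduce such a figure of merit on their own data, the ERLE is commonly computed as the ratio of microphone-signal power to residual-error power in decibels. The helper below is a minimal sketch of this usual definition; the exact averaging used in [19] is not stated here, so the function name `erle_db` and the plain time average are assumptions.

```python
import numpy as np

def erle_db(d, e):
    """Echo return loss enhancement in dB from microphone signal d and error e."""
    d = np.asarray(d, dtype=float)
    e = np.asarray(e, dtype=float)
    # Ratio of average powers over the whole interval (assumed averaging).
    return 10.0 * np.log10(np.mean(d ** 2) / np.mean(e ** 2))

# Toy check: an error signal attenuated to one tenth of the microphone
# amplitude corresponds to an ERLE of about 20 dB.
d = np.sin(2 * np.pi * 0.01 * np.arange(1000))
value = erle_db(d, 0.1 * d)
```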
7 Conclusions
In this article, we derived a set of conceptually different algorithms for linear and nonlinear AEC from a unifying graphical model perspective. Based on a concise review of Bayesian networks, we introduced a state-space model with a latent state vector capturing all relevant information of the unknown system. After this, we employed three combinations of state-vector definitions (to model a linear or nonlinear echo path) and observation equations (mathematical relation between state vector and observation) to apply fundamental techniques of machine learning research. Thereby, it is shown that the NLMS algorithm, the Hammerstein group model (considered from this perspective here for the first time), and a numerical sampling scheme for nonlinear AEC can be derived from a unifying Bayesian network perspective. This viewpoint highlights probabilistic assumptions underlying different derivations and serves as a basis for developing new algorithms for linear and nonlinear AEC and similar tasks. An example for future work is a Bayesian view on a nonlinear AEC scenario, where the nonlinear loudspeaker signal distortions are modeled by a nonlinear preprocessor with memory.
8 Endnotes
^{1} Note that sampling from the posterior PDF \(p\left (\mathbf {z}_{n+1} \mid d_{1:n}\right) \stackrel {(7)}{=} \sum \limits _{l=1}^{L} \omega ^{(l)}_{n} \mathcal {N} \left (\mathbf {z}_{n+1} \mid \mathbf {z}^{(l)}_{n},C_{\mathbf {w},n+1} \mathbf {I}\right)\) is equivalent to adding samples drawn from the discrete PDF \(p \left (\mathbf {z}_{n} \mid d_{1:n}\right)\) in (55) and the Gaussian PDF \(p \left (\mathbf {w}_{n+1}\right)\) in (10).
^{2} In practice, the weights of the new samples for the recursive update in (56) are initialized by the value ω _{th}.
Declarations
Acknowledgements
The authors would like to thank the Deutsche Forschungsgemeinschaft (DFG) for supporting this work (contract number KE 890/4-2).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
 [1] P Dreiseitel, E Hänsler, H Puder, in Proc. Conf. Europ. Signal Process. (EUSIPCO). Acoustic echo and noise control–a long lasting challenge (Rhodes, 1998), pp. 945–952.
 [2] E Hänsler, The hands-free telephone problem: an annotated bibliography. Signal Process. 27(3), 259–271 (1992).
 [3] E Hänsler, in IEEE Int. Symp. Circuits, Systems. The hands-free telephone problem, (1992), pp. 1914–1917.
 [4] B Widrow, ME Hoff, in IRE WESCON Conv. Rec. 4. Adaptive switching circuits (Los Angeles, CA, 1960), pp. 96–104.
 [5] C Breining, P Dreiseitel, E Hänsler, A Mader, B Nitsch, H Puder, T Schertler, G Schmidt, J Tilp, Acoustic echo control. An application of very-high-order adaptive filters. IEEE Signal Process. Mag. 16(4), 42–69 (1999).
 [6] E Hänsler, in IEEE Int. Symp. Circuits, Systems. Adaptive echo compensation applied to the hands-free telephone problem (New Orleans, LA, 1990), pp. 279–282.
 [7] E Hänsler, G Schmidt, Acoustic Echo and Noise Control: a Practical Approach (J. Wiley and sons, New Jersey, 2004).
 [8] A Stenger, W Kellermann, Adaptation of a memoryless preprocessor for nonlinear acoustic echo cancelling. Signal Process. 80(9), 1747–1760 (2000).
 [9] C Huemmer, C Hofmann, R Maas, A Schwarz, W Kellermann, in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP). The elitist particle filter based on evolutionary strategies as novel approach for nonlinear acoustic echo cancellation (Florence, Italy, 2014), pp. 1315–1319.
 [10] AN Birkett, RA Goubran, in Proc. IEEE Workshop Neural Networks Signal Process. (NNSP). Nonlinear echo cancellation using a partial adaptive time delay neural network (Cambridge, MA, 1995), pp. 449–458.
 [11] LSH Ngja, J Sjobert, in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP). Nonlinear acoustic echo cancellation using a Hammerstein model (Seattle, WA, 1998), pp. 1229–1232.
 [12] M Zeller, LA Azpicueta-Ruiz, J Arenas-Garcia, W Kellermann, Adaptive Volterra filters with evolutionary quadratic kernels using a combination scheme for memory control. IEEE Trans. Signal Process. 59(4), 1449–1464 (2011).
 [13] F Küch, W Kellermann, Orthogonalized power filters for nonlinear acoustic echo cancellation. Signal Process. 86(6), 1168–1181 (2006).
 [14] G Li, C Wen, WX Zheng, Y Chen, Identification of a class of nonlinear autoregressive models with exogenous inputs based on kernel machines. IEEE Trans. Signal Process. 59(5), 2146–2159 (2011).
 [15] J Kivinen, AJ Smola, RC Williamson, Online learning with kernels. IEEE Trans. Signal Process. 52(8), 165–176 (2004).
 [16] S Shimauchi, Y Haneda, in Proc. IEEE Int. Workshop Acoustic Signal Enhanc. (IWAENC). Nonlinear Acoustic Echo Cancellation Based on Piecewise Linear Approximation with Amplitude Threshold Decomposition (Aachen, Germany, 2012), pp. 1–4.
 [17] S Malik, G Enzner, in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP). Variational Bayesian inference for nonlinear acoustic echo cancellation using adaptive cascade modeling (Kyoto, 2012), pp. 37–40.
 [18] C Hofmann, C Huemmer, W Kellermann, in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP). Significance-aware Hammerstein group models for nonlinear acoustic echo cancellation (Florence, Italy, 2014), pp. 5934–5938.
 [19] C Huemmer, C Hofmann, R Maas, W Kellermann, in Proc. IEEE Global Conf. Signal Information Process. (GlobalSIP). The significance-aware EPFES to estimate a memoryless preprocessor for nonlinear acoustic echo cancellation (Atlanta, GA, 2014), pp. 557–561.
 [20] T Adali, D Miller, K Diamantaras, J Larsen, Trends in machine learning for signal processing. IEEE Signal Process. Mag. 28(6), 193–196 (2011).
 [21] R Talmon, I Cohen, S Gannot, RR Coifman, Diffusion maps for signal processing: a deeper look at manifold-learning techniques based on kernels and graphs. IEEE Signal Process. Mag. 30(4), 75–86 (2013).
 [22] KR Muller, T Adali, K Fukumizu, JC Principe, S Theodoridis, Special issue on advances in kernel-based learning for signal processing. IEEE Signal Process. Mag. 30(4), 14–15 (2013).
 [23] BJ Frey, Graphical Models for Machine Learning and Digital Communication (MIT Press, Cambridge, MA, USA, 1998).
 [24] SJ Rennie, P Aarabi, BJ Frey, Variational probabilistic speech separation using microphone arrays. IEEE Trans. Audio, Speech, Lang. Process. 15(1), 135–149 (2007).
 [25] S Malik, J Benesty, J Chen, A Bayesian framework for blind adaptive beamforming. IEEE Trans. Signal Process. 62(9), 2370–2384 (2014).
 [26] FR Kschischang, BJ Frey, HA Loeliger, Factor graphs and the sum-product algorithm. IEEE Trans. Inform. Theory. 47(2), 498–519 (2001).
 [27] P Mirowski, Y LeCun, in Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, 5782. Dynamic factor graphs for time series modeling (Springer, Berlin Heidelberg, 2009), pp. 128–143.
 [28] CW Maina, JM Walsh, Joint speech enhancement and speaker identification using approximate Bayesian inference. IEEE Trans. Audio, Speech, Lang. Process. 19(6), 1517–1529 (2011).
 [29] D Barber, AT Cemgil, Graphical models for time series. IEEE Signal Process. Mag. 27(6), 18–28 (2010).
 [30] CM Bishop, Pattern Recognition and Machine Learning (Springer, New York, 2006).
 [31] R Maas, C Huemmer, C Hofmann, W Kellermann, in ITG Conf. Speech Commun. On Bayesian networks in speech signal processing (Erlangen, Germany, 2014).
 [32] C Huemmer, R Maas, W Kellermann, The NLMS algorithm with time-variant optimum step-size derived from a Bayesian network perspective. IEEE Signal Process. Lett. 22(11), 1874–1878 (2015).
 [33] PAC Lopes, JB Gerald, in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP). New normalized LMS algorithms based on the Kalman filter (New Orleans, LA, 2007), pp. 117–120.
 [34] S Yamamoto, S Kitayama, An adaptive echo canceller with variable step gain method. Trans. IECE Japan. E65(1), 1–8 (1982).
 [35] S Haykin, Adaptive Filter Theory (Prentice Hall, New Jersey, 2002).
 [36] U Schultheiß, Über die adaption eines kompensators für akustische echos. VDI Verlag (1988).
 [37] C Breining, P Dreiseitel, E Hänsler, A Mader, B Nitsch, H Puder, T Schertler, G Schmidt, J Tilp, Acoustic echo control. Signal Process. 16(4), 42–69 (1999).
 [38] R Maas, C Huemmer, A Schwarz, C Hofmann, W Kellermann, in Proc. IEEE China Summit Int. Conf. Signal Information Process. (ChinaSIP). A Bayesian network view on linear and nonlinear acoustic echo cancellation (Xi’an, China, 2014), pp. 495–499.
 [39] K Uosaki, T Hatanaka, Nonlinear state estimation by evolution strategies based particle filters. 2003 Congr. Evolut. Comput. 3, 2102–2109 (2003).
 [40] T Schön. Estimation of nonlinear dynamic systems (PhD thesis, Linköpings universitetLiUTryck, 2006).
 [41] T Bengtsson, P Bickel, B Li, in Probability, Statistics: Essays Honor David A. Freedman, Vol. 2. Curse-of-dimensionality revisited: collapse of the particle filter in very large scale systems (Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2008), pp. 316–334.
 [42] F Gustafsson, F Gunnarsson, N Bergman, U Forssell, J Jansson, R Karlsson, PJ Nordlund, Particle filters for positioning, navigation, and tracking. IEEE Trans. Signal Process. 50(2), 425–437 (2002).
 [43] A Doucet, AM Johansen, A tutorial on particle filtering and smoothing: fifteen years later. Handbook Nonlinear Filtering. 12, 656–704 (2009).
 [44] A Doucet, N de Freitas, N Gordon, Sequential Monte Carlo Methods in Practice (Springer, New York, 2001).
 [45] T Bäck, HP Schwefel, in Proc. IEEE Int. Conf. Evolut. Comput. (ICEC). Evolutionary computation: an overview (Nagoya, Japan, 1996), pp. 20–29.
 [46] S Henriksen, A Wills, TB T. Schön, B Ninness, in Proc. 16th IFAC Symposium Syst. Ident, 16. Parallel implementation of particle MCMC methods on a GPU (Brussels, Belgium, 2012), pp. 1143–1148.
 [47] A Lee, C Yau, MB Giles, A Doucet, CC Holmes, On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. J. Comp. Graph. Stat. 19(4), 769–789 (2010).