Most cost functions for adaptive filtering algorithms include the square error, which depends on the current error signal. When the additive noise is impulsive, the square error can be expected to be very large. By contrast, the cross error, which is the correlation between the error signal and its delayed version, may be very small. Based on this fact, we propose a new cost function for adaptive filters, called the mean square cross error, and provide a detailed mean value and mean square performance analysis. Furthermore, we present a two-stage method to estimate the closed-form solution of the proposed method, and generalize it to estimate the closed-form solutions of information theoretic learning methods, including the least mean fourth, maximum correntropy criterion, generalized maximum correntropy criterion, and minimum kernel risk-sensitive loss. Simulations of the adaptive solutions and closed-form solutions show the effectiveness of the new method.
1 Introduction
The mean square error (MSE) is probably the most widely used cost function for adaptive linear filters [1–5]. The MSE relies heavily on Gaussianity assumptions and performs well for Gaussian noise. Recently, information theoretic learning (ITL) has been proposed to process non-Gaussian noise. ITL uses higher-order statistics of the data and may work well for non-Gaussian noise. Inspired by ITL, several cost functions have been presented, such as the maximum correntropy criterion (MCC) [6–11], improved least sum of exponentials (ILSE) [12], least mean kurtosis (LMK) [13], least mean fourth (LMF) [14–19], generalized MCC (GMCC) [20], and minimum kernel risk-sensitive loss (MKRSL) criterion [21, 22].
The LMK and LMF are robust to sub-Gaussian noise; one typical sub-Gaussian distribution is the uniform distribution. The MCC and ILSE are robust to large outliers and impulsive noise, which relatively often takes values that are either very close to zero or very large. This means that impulsive noise has a super-Gaussian distribution [23, 24].
Altogether, the distribution of the additive noise in linear filtering can be divided into three types: Gaussian, super-Gaussian, and sub-Gaussian. Super-Gaussian noise and sub-Gaussian noise are both non-Gaussian.
From the viewpoint of performance (e.g., the steady-state error), the MSE, MCC, and LMF work well for Gaussian, super-Gaussian, and sub-Gaussian noise, respectively. The MSE exhibits similar performance for the three types of noise under the same signal-to-noise ratio (SNR). For Gaussian noise, all the algorithms have similar steady-state errors. For super-Gaussian noise, the steady-state error comparison under the same SNR is MCC < MSE < LMF; for sub-Gaussian noise, it is LMF < MSE < MCC.
Note that the cost functions of the above algorithms all include the square error, which is the correlation of the error signal with itself. When impulsive noise is involved, we can expect the square error to be very large. By contrast, the cross error (CE), which is the correlation between the error signal and its delayed version, may be very small for impulsive noise.
In our earlier work [25–27], we proposed the mean square cross prediction error to extract the desired signal in blind source separation (BSS), where the square cross prediction error was much smaller than the square prediction error. In this paper, we propose a new cost function, the mean square CE (MSCE), for adaptive filtering to process non-Gaussian noise. We expect the proposed MSCE algorithm to perform well for non-Gaussian noise.
Note that the ITL methods can capture higher-order statistics of data. Thus, it is hard to directly obtain the corresponding closed-form solutions. We present a two-stage method to estimate the closed-form solutions for the LMF, MCC, GMCC, MKRSL, and MSCE.
The contributions of this paper are summarized as follows:
i) We present a new cost function, the MSCE, for adaptive filters, and provide a detailed mean value and mean square performance analysis.
ii) We propose a two-stage method to estimate the closed-form solution of the proposed MSCE algorithm.
iii) We generalize the two-stage method to estimate the closed-form solutions of the LMF, MCC, GMCC, and MKRSL algorithms.
The paper is organized as follows. In Section 2, the problem statement is explained in detail. In Section 3, the MSCE algorithm is presented with its adaptive algorithm and closed-form solution. In Section 4, the closed-form solutions of the LMF, MCC, GMCC, and MKRSL are estimated. In Section 5, the mean behavior and mean square behavior of the MSCE are analyzed. Simulations are provided in Section 6. Lastly, conclusions are given in Section 7.
2 Problem formulation
The absolute value of the normalized kurtosis may be considered a measure of the non-Gaussianity of the error signal. Several definitions for a zero-mean random variable are presented as follows:
Definition 1 (normalized kurtosis)
The normalized kurtosis of a random variable x is defined as
$$ {\kappa}_4(x)=\frac{E\left\{{x}^4\right\}}{{\left(E\left\{{x}^2\right\}\right)}^2}-3. $$
(1)
Definition 2 (sub-Gaussian or platykurtic)
A distribution with negative normalized kurtosis is called sub-Gaussian, platykurtic, or short-tailed (e.g., uniform).
Definition 3 (super-Gaussian or leptokurtic)
A distribution with positive normalized kurtosis is called super-Gaussian, leptokurtic, or heavy-tailed (e.g., Laplacian).
Definition 4 (mesokurtic)
A zero-kurtosis distribution is called mesokurtic (e.g., Gaussian).
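As a quick check of Definitions 1–4, the following sketch (illustrative, not from the paper) estimates the normalized kurtosis of uniform, Laplacian, and Gaussian samples; the expected values are approximately −1.2, +3, and 0, respectively.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def normalized_kurtosis(x):
    # kappa_4(x) = E{x^4} / (E{x^2})^2 - 3, for a zero-mean random variable
    return np.mean(x**4) / np.mean(x**2)**2 - 3

samples = {
    "uniform (sub-Gaussian)":     rng.uniform(-1, 1, n),
    "Laplacian (super-Gaussian)": rng.laplace(0, 1, n),
    "Gaussian (mesokurtic)":      rng.normal(0, 1, n),
}
for name, x in samples.items():
    print(f"{name}: {normalized_kurtosis(x):+.2f}")
```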
When the linear filtering problem is considered, there is an input vector u∈ℝM, an unknown parameter wo∈ℝM, and a desired response d∈ℝ1. Data d(i) are observed at each time point i through the linear regression model
$$ d(i)={\mathbf{w}}_o^T\mathbf{u}(i)+v(i),\kern0.5em i=1,2,\cdots, L, $$
(2)
where v is zero-mean background noise with variance \( {\sigma}_v^2 \) and L is the length of the sequence. The error signal for the linear filter is defined as
$$ e(i)=d(i)-{\mathbf{w}}^T\mathbf{u}(i), $$
(3)
where w is the estimate of wo.
In this research, we made the following assumptions:
A1) The additive noise is white, that is,
$$ E\left\{v(i)v(j)\kern0.1em \right\}=0,i\ne j. $$
(4)
A2) Inputs u at different time moments i and j are uncorrelated:
$$ E\left\{{\mathbf{u}}^H(i)\mathbf{u}(j)\right\}=E\left\{{\mathbf{u}}^T(i)\mathbf{u}(j)\right\}=0,i\ne j. $$
(5)
A3) The inputs and additive noise at different time moments (i, j) are uncorrelated:
$$ E\left\{{\mathbf{u}}^H(i)v(j)\right\}=0,i\ne j. $$
(6)
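For concreteness, the following sketch (illustrative; M, L, and the noise variance are assumed values consistent with Section 6) generates data from model (2) under assumptions A1)–A3) and evaluates the error signal (3):

```python
import numpy as np

rng = np.random.default_rng(1)
M, L = 5, 3000                 # filter order and sequence length (assumed, as in Section 6)
w_o = rng.normal(0, 1, M)      # unknown parameter w_o

u = rng.normal(0, 1, (L, M))   # white inputs, consistent with A2)
v = rng.normal(0, 0.5, L)      # white zero-mean noise, consistent with A1) and A3)
d = u @ w_o + v                # linear regression model (2)

w = np.zeros(M)                # an estimate of w_o
e = d - u @ w                  # error signal (3)
```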
The linear filtering algorithms of the MSE, MCC, and LMF are built on the following cost functions. The cost function based on the MSE is given by
$$ {J}_{MSE}\left(\mathbf{w}\right)=E\left\{{e}^2(i)\right\}, $$
the MCC maximizes the correntropy of the error,
$$ {J}_{MCC}\left(\mathbf{w}\right)=E\left\{\exp \left(-{e}^2(i)/2{\sigma}^2\right)\right\}, $$
with kernel width σ, and the LMF minimizes the fourth-order moment
$$ {J}_{LMF}\left(\mathbf{w}\right)=E\left\{{e}^4(i)\right\}. $$
The CE can be expressed as e(i)e(i − q), where q denotes the error delay. Because the CE may be negative, we propose a new cost function, the MSCE, as
$$ {J}_{MSCE}\left(\mathbf{w}\right)=E\left\{{e}^2(i){e}^2\left(i-q\right)\right\}. $$
Combining the ICA references [23, 24, 28–30] with those of the MCC [6–11], ILSE [12], and LMF [14–19], we note that three cost functions are used in the FastICA [30] algorithm:
$$ {G}_1(u)=\frac{1}{\alpha_1}\log \cosh \left({\alpha}_1u\right),\kern1em {G}_2(u)=-\exp \left(-{u}^2/2\right),\kern1em {G}_3(u)=\frac{1}{4}{u}^4, $$
where 1 ≤ α1 ≤ 2.
G2(u) is used to separate super-Gaussian sources in ICA and works as the cost function of the MCC. G3(u) is used to separate sub-Gaussian sources in ICA when there are no outliers and works as the cost function of the LMF. G1(u) has not been used in adaptive filtering, but cosh(α1u) works as the cost function of the ILSE.
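For reference, the three contrasts can be implemented directly (a sketch; α1 = 1 is an assumed value within the usual range 1 ≤ α1 ≤ 2):

```python
import numpy as np

a1 = 1.0  # alpha_1, assumed; FastICA recommends 1 <= alpha_1 <= 2

def G1(u): return np.log(np.cosh(a1 * u)) / a1  # robust log-cosh contrast
def G2(u): return -np.exp(-u**2 / 2)            # MCC-like Gaussian contrast
def G3(u): return u**4 / 4                      # LMF-like fourth-power contrast
```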
This motivated us to explore ICA and BSS algorithms to determine a suitable criterion for adaptive filtering. Here, we use G1(u) in place of the square error in the proposed GMSCE algorithm (30).
For impulsive noise, the variance E{v2(i)} may be very large, but the MSCE E{v2(i)v2(i − q)} may be small. Thus, the proposed MSCE algorithm may achieve a smaller steady-state error than the MSE for impulsive noise.
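The adaptive MSCE update (22) is not reproduced in this excerpt. As a rough illustration only, a stochastic-gradient descent on the instantaneous cost e2(i)e2(i − q) yields the following sketch (the exact form and step size μ are assumptions, not the paper's update):

```python
import numpy as np

def msce_sgd_step(w, u_i, u_iq, d_i, d_iq, mu=0.01):
    """One stochastic-gradient step on the instantaneous MSCE cost e^2(i)e^2(i-q)."""
    e_i = d_i - w @ u_i        # e(i)   = d(i)   - w^T u(i)
    e_iq = d_iq - w @ u_iq     # e(i-q) = d(i-q) - w^T u(i-q)
    # The gradient of e^2(i)e^2(i-q) w.r.t. w is
    # -2 e(i) e^2(i-q) u(i) - 2 e^2(i) e(i-q) u(i-q); descend along its negative:
    return w + 2 * mu * (e_i * e_iq**2 * u_i + e_i**2 * e_iq * u_iq)
```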
5.3 Selection of delay q
After obtaining the estimate of e(i) using (33), we can estimate the MSCE for q = 1, 2, ⋯, Q:
$$ {\hat{J}}_{MSCE}(q)=\frac{1}{L-q}\sum \limits_{i=q+1}^L{e}^2(i){e}^2\left(i-q\right), $$
(87)
where \( {\hat{J}}_{MSCE} \) is the estimate of JMSCE. Because the mean-square performance of the MSCE algorithm is proportional to E{v2(i)v2(i − q)} according to (85), we should select the q with the smallest \( {\hat{J}}_{MSCE} \) in (87).
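A minimal sketch of this selection rule, assuming e is a NumPy array holding the estimated error sequence from (33):

```python
import numpy as np

def select_delay(e, Q):
    """Return the delay q in 1..Q with the smallest estimated J_MSCE, per (87)."""
    L = len(e)
    j_hat = np.array([np.mean(e[q:]**2 * e[:L - q]**2) for q in range(1, Q + 1)])
    return int(np.argmin(j_hat)) + 1, j_hat
```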
6 Simulation results and discussion
In this section, the performance of the MSE, MSCE, GMSCE, LMF, MCC, GMCC, and MKRSL is evaluated by simulations. All the simulation points were averaged over 100 independent runs. The performance of the adaptive solution was measured by the steady-state mean-square deviation (MSD)
$$ \mathrm{MSD}=E\left\{{\left\Vert {\mathbf{w}}_o-\mathbf{w}\right\Vert}^2\right\}, $$
reported in dB.
The smaller the MSD, the better the performance.
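For reference, a minimal sketch of the MSD computation as used here, averaging over independent runs:

```python
import numpy as np

def msd_db(w_o, w_runs):
    """Steady-state MSD in dB; w_runs has shape (runs, M), one estimate per run."""
    return 10 * np.log10(np.mean(np.sum((w_runs - w_o)**2, axis=1)))
```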
6.1 Closed-form solutions comparison
The closed-form solutions of the MSE, MSCE, GMSCE, LMF, MCC, GMCC, and MKRSL are expressed by (9), (40), (43), (51), (57), (61), and (65), respectively. The GMCC with α = 2, 4, and 6 are denoted by GMCC1, GMCC2, and GMCC3, respectively. The MKRSL with λ = 0.1 and 32 are denoted by MKRSL1 and MKRSL2, respectively.
In the experiments, we compared the MSDs of the closed-form solutions of the ten algorithms under different non-Gaussian noises. The input filter order was M = 5, and the sample size was L = 3000. When the SNR ranged from −20 to 20 dB, we obtained similar performance comparisons; here, the SNR was set to 6 dB.
Figures 1 and 2 show segments of the four types of sub- and super-Gaussian noise, respectively.
Figure 1a–c shows the periodic noises, and Fig. 1d shows the noise with uniform distribution. The kurtoses of the noises shown in Fig. 1a–d are − 1.5, − 1.0, − 1.4, and − 1.2, respectively.
Figure 2a, b shows the periodic super-Gaussian noises, and Fig. 2c, d shows impulsive noise. The impulsive noise v(i) is generated as v(i) = b(i)G(i), where b(i) is a Bernoulli process with a probability of success P{b(i) = 1} = p. G(i) in Fig. 2c is zero-mean Rayleigh noise, and G(i) in Fig. 2d is zero-mean Gaussian noise. The kurtoses of the noises shown in Fig. 2a–d are 3.0, 4.1, 14.4, and 7.3, respectively.
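This impulsive noise model can be reproduced as follows (a sketch; the success probability p = 0.05 is an assumed value, not stated in this excerpt):

```python
import numpy as np

rng = np.random.default_rng(2)
L, p = 3000, 0.05                      # p is assumed; the paper does not give it here
b = (rng.random(L) < p).astype(float)  # Bernoulli process b(i), P{b(i) = 1} = p
G = rng.normal(0, 1, L)                # zero-mean Gaussian G(i) (the Fig. 2d case)
v = b * G                              # impulsive noise v(i) = b(i)G(i)
```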
The MSDs of the closed-form solutions for sub- and super-Gaussian noise are shown in Tables 1 and 2, respectively. From these two tables, we can draw three conclusions. First, none of the existing algorithms (LMF, MCC, GMCC, and MKRSL) outperforms the MSE method for sub- and super-Gaussian noise simultaneously. The MCC, GMCC1, MKRSL1, and MKRSL2 perform better (worse) than the MSE method for super-Gaussian (sub-Gaussian) noise, whereas the LMF and GMCC2 perform better (worse) than the MSE for sub-Gaussian (super-Gaussian) noise. The simulations demonstrate that the proposed MSCE and GMSCE algorithms may perform better than the MSE algorithm for both sub- and super-Gaussian noise. Second, the MCC performs as well as the MKRSL, whose parameters λ and σ did not influence the MSDs of the closed-form solution. Third, the parameters λ and α have a great influence on the GMCC: when α = 2 and λ = 0.031, GMCC1 performs better than the MSE for super-Gaussian noise; when α = 4 and λ = 0.005, GMCC2 performs better than the MSE for sub-Gaussian noise.
Table 1 The MSDs (dB) of the closed-form solutions of the MSCE, GMSCE, MSE, LMF, and MCC with different sub-Gaussian noises at SNR = 6 dB (L = 3000)
6.2 Adaptive solution for sub-Gaussian noise
In the simulations, the input filter order was M = 5, the sample size was L = 10,000, and the SNR was set to 6 dB. The proposed algorithms (22) and (30) are denoted by MSCE and GMSCE, respectively.
For the sub-Gaussian noise shown in Fig. 1a, d, we compared the performance of the LMS, LMF, MCC, GMCC1-3, MKRSL1-2, MSCE, and GMSCE. The step sizes were chosen such that all the algorithms had almost the same initial convergence speed, and the other parameters (if any) of each algorithm were selected experimentally to achieve desirable performance.
The comparisons are shown in Figs. 3 and 4. From the two figures, we can observe the following:
Fig. 3 Comparisons of the algorithms under the sub-Gaussian noise shown in Fig. 1a
First, the GMCC1-3, LMF, and MSCE performed better than the LMS for sub-Gaussian noise; GMCC1 and GMCC2 performed best among the algorithms.
Second, the MKRSL1-2 and MCC performed worse than the LMS. The performance curves of MKRSL1 and MCC almost overlapped.
Third, the performance of the adaptive solution was not always consistent with that of the closed-form solution. Table 1 shows that the closed-form solution of GMCC3 was worse than that of the MSE, whereas the adaptive solution of GMCC3 was better than that of the LMS. It may be hard for an algorithm to achieve both the same initial convergence speed and a desirable steady-state error.
6.3 Adaptive solution for super-Gaussian noise
In the simulations, the input filter order was M = 5, the sample size was L = 10,000, and the SNR was set to 6 dB. The step sizes were chosen such that all the algorithms had almost the same initial convergence speed.
For the super-Gaussian noise shown in Fig. 2a, d, we compared the performance of the LMS, LMF, MCC, GMCC1-3, MKRSL1-2, MSCE, and GMSCE. The comparisons are shown in Figs. 5 and 6. From the two figures, we can observe the following:
Fig. 5 Comparisons of the algorithms under the super-Gaussian noise shown in Fig. 2a
First, the proposed MSCE and GMSCE performed much better than the other algorithms for the periodic super-Gaussian noise shown in Fig. 2a. The MSCE performed a little better than the LMS for the impulsive noise shown in Fig. 2d.
Second, the MKRSL1 and MCC had almost the same performance, and both performed a little better than the LMS.
Third, the LMF, GMCC1-3, and MKRSL2 performed worse than the LMS, even though the closed-form solutions of the GMCC2 and MKRSL2 performed better than that of the MSE.
Combining the above simulations, we find that each algorithm has its strengths, and no single algorithm performs best for all kinds of noise. Dividing the additive noise into three types is helpful for selecting a suitable algorithm in real applications.
7 Conclusions
This paper proposed a new cost function called the MSCE for adaptive filters and provided a detailed mean value and mean square performance analysis. We also presented a two-stage method to estimate the closed-form solution of the MSCE method, and generalized the two-stage method to estimate the closed-form solutions of information theoretic learning methods such as the LMF, MCC, GMCC, and MKRSL.
The additive noise in adaptive filtering is divided into three types: Gaussian, sub-Gaussian, and super-Gaussian. The existing algorithms do not perform better than the MSE method for sub- and super-Gaussian noise simultaneously: the MCC, GMCC1, MKRSL1, and MKRSL2 perform better (worse) than the MSE method for super-Gaussian (sub-Gaussian) noise, whereas the LMF and GMCC2 perform better (worse) than the MSE for sub-Gaussian (super-Gaussian) noise. Simulations demonstrated that the proposed MSCE and GMSCE algorithms may perform better than the MSE algorithm for both sub- and super-Gaussian noise.
In future work, the MSCE algorithm may be extended to Kalman filtering, complex-valued filtering, distributed estimation, and nonlinear filtering.
Availability of data and materials
The datasets used during the current study are available from the corresponding author on reasonable request.
Abbreviations
MSE:
Mean square error
ITL:
Information theoretic learning
MCC:
Maximum correntropy criterion
ILSE:
Improved least sum of exponentials
LMK:
Least mean kurtosis
LMF:
Least mean fourth
GMCC:
Generalized maximum correntropy criterion
MKRSL:
Minimum kernel risk-sensitive loss
SNR:
Signal-to-noise ratio
CE:
Cross error
BSS:
Blind source separation
MSCE:
Mean square cross error
References
X. Li, T. Adali, Complex-valued linear and widely linear filtering using MSE and Gaussian entropy. IEEE Trans. Signal Process. 60(11), 5672–5684 (2012)
T. Adali, P.J. Schreier, Optimization and estimation of complex valued signals: theory and applications in filtering and blind source separation. IEEE Signal Process. Mag. 31(5), 112–128 (2014)
S.Y. Huang, C.G. Li, Y. Liu, Complex-valued filtering based on the minimization of complex-error entropy. IEEE Trans. Neural Netw. Learn. Syst. 24(5), 695–708 (2013)
J. Chen, A.H. Sayed, Diffusion adaptation strategies for distributed optimization and learning over networks. IEEE Trans. Signal Process. 60(8), 4289–4305 (2012)
W. Liu, P.P. Pokharel, J.C. Principe, Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans. Signal Process. 55(11), 5286–5298 (2007)
A. Singh, J.C. Principe, in Proceedings of the International Joint Conference on Neural Networks 2009, Using correntropy as a cost function in linear adaptive filters (IEEE, Atlanta, 2009), pp. 2950–2955. https://doi.org/10.1109/IJCNN.2009.5178823
R. He, W.S. Zheng, B.G. Hu, Maximum correntropy criterion for robust face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 33(8), 1561–1576 (2011)
A.I. Fontes, A.M. de Martins, L.F. Silveira, J. Principe, Performance evaluation of the correntropy coefficient in automatic modulation classification. Expert Syst. Appl. 42(1), 1–8 (2015)
W. Ma, B. Chen, J. Duan, H. Zhao, Diffusion maximum correntropy criterion algorithms for robust distributed estimation. Digit. Signal Process. 58, 10–19 (2016)
P.I. Hubscher, J.C.M. Bermudez, An improved statistical analysis of the least mean fourth (LMF) adaptive algorithm. IEEE Trans. Signal Process. 51(3), 664–671 (2003)
E. Eweda, Mean-square stability analysis of a normalized least mean fourth algorithm for a Markov plant. IEEE Trans. Signal Process. 62(24), 6545–6553 (2014)
E. Eweda, Dependence of the stability of the least mean fourth algorithm on target weights non-stationarity. IEEE Trans. Signal Process. 62(7), 1634–1643 (2014)
B. Chen, L. Xing, H. Zhao, N. Zheng, J.C. Príncipe, Generalized correntropy for robust adaptive filtering. IEEE Trans. Signal Process. 64(13), 3376–3387 (2016)
B. Chen, R. Wang, Risk-sensitive loss in kernel space for robust adaptive filtering. Proc. 2015 IEEE Int. Conf. Digit. Signal Process., 921–925 (2015)
B. Chen, L. Xing, B. Xu, H. Zhao, N. Zheng, J.C. Principe, Kernel risk-sensitive loss: definition, properties and application to robust adaptive filtering. IEEE Trans. Signal Process. 65(11), 2888–2901 (2017)
G. Wang, Y. Zhang, B. He, K.T. Chong, A framework of target detection in hyperspectral imagery based on blind source extraction. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 9(2), 835–844 (2016)
G. Wang, C. Li, L. Dong, Noise estimation using mean square cross prediction error for speech enhancement. IEEE Trans. Circuits Syst. I Reg. Pap. 57, 1489–1499 (2010)
G. Wang, N. Rao, S. Shepherd, C. Beggs, Extraction of desired signal based on AR model with its application to atrial activity estimation in atrial fibrillation. EURASIP J. Adv. Signal Process. 9, 728409 (2008)
Acknowledgements
Thanks to the anonymous reviewers and editors for their hard work.
Funding
This work was supported by the National Key Research and Development Program of China (Project No. 2017YFB0503400) and the National Natural Science Foundation of China under Grants 61371182 and 41301459.
Author information
Authors and Affiliations
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, P.R. China
Yunxiang Zhang & Gang Wang
School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, P.R. China
Yuyang Zhao
School of Electronics and Information Engineering, Beihang University, Beijing, P.R. China
Rui Xue
Contributions
Rui Xue and Gang Wang proposed the original idea of the full text. Rui Xue designed the experiment. Yunxiang Zhang, Yuyang Zhao, and Gang Wang performed the experiment and analyzed the results. Yunxiang Zhang and Yuyang Zhao drafted the manuscript. Rui Xue and Gang Wang wrote the manuscript. All authors read and approved this submission.
All authors agree to publish the submitted paper in this journal.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Zhang, Y., Zhao, Y., Wang, G. et al. Mean square cross error: performance analysis and applications in non-Gaussian signal processing. EURASIP J. Adv. Signal Process. 2021, 24 (2021). https://doi.org/10.1186/s13634-021-00733-7