An RMT-based generalized Bayesian information criterion for signal enumeration

Abstract

This paper provides a method for enumerating signals impinging on an array of sensors based on the generalized Bayesian information criterion (GBIC). The proposed method is motivated by a statistic for testing the sphericity of the covariance matrix when the sample size n is less than the dimension m. The statistic consists of the first four moments of the sample eigenvalue distribution and relaxes the Gaussian assumption. Using random matrix theory, we derive the asymptotic distribution of the statistic as m and n tend to infinity at the same rate, and we propose a GBIC expression for determining the signal number. Numerical simulations demonstrate that the proposed method has a high probability of detection in both Gaussian and non-Gaussian noise and performs better than the compared methods.

1 Introduction

Estimating the number of signals, also called signal enumeration, is an important problem in array signal processing, with applications in fields such as radar and wireless communication [1, 2]. There are many approaches to signal enumeration. Much attention has been paid to information theoretic criteria (ITC), including the Akaike information criterion (AIC) [3], the Bayesian information criterion (BIC) [4], the minimum description length (MDL) principle [5], and the predictive minimum description length (PMDL) principle [6]. These conventional methods rely heavily on the assumption that the noise is Gaussian and perform poorly when it is not. In practical applications, e.g., interference in indoor and outdoor mobile communication, radar clutter, and underwater noise, the noise follows a non-Gaussian distribution [7,8,9]. In [10], the signal number is estimated by the entropy estimation of eigenvalues (EEE) under both white Gaussian and non-Gaussian assumptions. Moreover, a signal subspace identification method is developed for heavy-tailed non-Gaussian noise [11]. The above methods work well when the sample size n tends to infinity with a fixed sensor number m. However, in certain scenarios the available sample size may be of the same order of magnitude as the sensor number, and even \(m\gg n\) can occur in practice. For instance, in a short-time stationary process, an array containing a large number of sensors receives only a limited number of observations in a MIMO radar system. In these cases, the above methods suffer performance degradation because the sample eigenvalues do not converge to the true eigenvalues when m and n both tend to infinity at the same rate [12].

The traditional methods, built on information theoretic criteria, have been analyzed and developed by random matrix theory (RMT) in the large-dimensional regime. There are many methods based on the sample eigenvalue properties of large-dimensional random matrices, respectively discussed in [13,14,15,16,17,18,19]. By the distribution of the largest sample eigenvalue of a Wishart matrix, the classic AIC is modified by increasing the penalty term [15]. In [16], an improved AIC is proposed when the high-dimensional signals are contained in the white Gaussian noise by the sample moments of the random matrix with the Wishart distribution. In [17], a linear shrinkage based on the MDL criterion is devised for detecting the signal number by the spherical structure of noise subspace components under Gaussian assumption. In [18], the BIC is reformulated by constructing a penalized likelihood function with a convergence of the largest sample eigenvalues and the corresponding eigenvectors. In [19], the signal enumeration is realized based on the MDL and sample eigenvalues of the covariance matrix for Gaussian observations. Furthermore, a generalized Bayesian Information Criterion (GBIC) is derived. The GBIC is constructed by considering additional information including the probability distribution or statistics of sample eigenvalues from available data [20]. In a large array and finite sample scenario, a signal enumeration method is proposed via the GBIC based on a test statistic [21]. In [22], a new algorithm has been presented to estimate the number of sources embedded in a correlated complex elliptically distributed noise in the context of a large dimensional regime. As a whole, these methods mainly focus on signal enumeration in large-dimensional and finite sample settings.

In practice, signal enumeration is often required in non-Gaussian noise when both the sample size and the sensor number are large. The ITC-like methods are difficult to implement for non-Gaussian observations because the probability density function (pdf) is difficult to express. Moreover, the existing distributional properties of sample eigenvalues apply only to Gaussian observations.

This paper deals with estimating the signal number in a large-dimensional regime when the noise is uncorrelated and non-Gaussian. The estimator is studied under complex elliptically symmetric (CES) noise [23], which better characterizes the noise in many applications. Motivated by [20], we construct a new test statistic consisting of high-order moments of the sample eigenvalue distribution. According to [25, 26], it has higher power against the spiked covariance model of [24]. The proposed approach differs from [27] and, from a technical point of view, is analogous to the one devised in [28].

We propose an approach for signal enumeration based on GBIC, whether the noises are Gaussian or non-Gaussian. The characteristics and advantages of our approach are explained as follows:

  1. The proposed approach is based on GBIC by utilizing RMT.

  2. We consider the problem of signal enumeration in the asymptotic situation where both the sample size n and the sensor number m tend to infinity with a ratio \(m/n \rightarrow c\in (0,\infty )\).

  3. We construct an estimator that is composed of the finite moments of the spectral distribution of the sample covariance matrix.

  4. The proposed estimator is evaluated numerically in scenarios with Gaussian and non-Gaussian noise and has a higher probability of detection than some existing estimators.

The rest of this paper is organized as follows. Section 2 introduces the signal model and the GBIC. Section 3 proposes a new test statistic and derives its asymptotic normality under general moment conditions and the null hypothesis. Section 4 provides numerical simulation results demonstrating the asymptotic behavior of the proposed statistic and its performance for high-dimensional covariance matrices in Gaussian and non-Gaussian noise settings. Section 5 concludes.

1.1 Notations

The notation \(\mathbb {R}\) is the real number set. The notation \(\mathbb {C}^{m\times n}\) is the set of all \(m\times n\) complex matrices. The notation \(\mathbb {C}^m\) is the set of all m-dimensional complex column vectors. The symbol E denotes the mathematical expectation. The symbol \(\varvec{I}_{m}\) denotes the identity matrix of order m. For a matrix \(\varvec{A}\), \(\varvec{A}^{T}\) and \(\varvec{A}^{H}\) respectively denote its transpose and conjugate transpose. For a square matrix \(\varvec{A}\), \(\text {tr}(\varvec{A})\) denotes its trace.

2 Signal model and the GBIC

2.1 Formulation

Consider K narrow-band, spatially incoherent signals \(s_1, \ldots , s_K\) impinging on an array of m sensors from distinct directions \(\theta _1, \ldots , \theta _K\). The received sample vector at discrete time t, denoted as \(\varvec{y}(t) \in \mathbb {C}^m\), can be modeled as

$$\begin{aligned} \varvec{y}(t) = \sum \limits ^K_{k=1}\varvec{a}(\theta _k)s_k(t) + \varvec{w}(t) = \varvec{A}(\varvec{\theta }) \varvec{s}(t) + \varvec{w}(t), \end{aligned}$$
(1)

where \(\varvec{A}(\varvec{\theta }) = [\varvec{a}(\theta _1), \ldots , \varvec{a}(\theta _K)]\in \mathbb {C}^{m \times K}\) is the unknown manifold matrix with unit norm steering vectors \(\varvec{a}(\theta _k)\in \mathbb {C}^{m}\) corresponding to \(\theta _k, k = 1, \ldots , K\), and \(\varvec{s}(t) = [s_1(t), \ldots , s_K(t)]^T \in \mathbb {C}^{K}\) is the signal vector, and \(\varvec{w}(t) \in \mathbb {C}^{m}\) is the additive white noise impinging on the sensor array at the time t. Suppose that there are n snapshots of sensor array signals. The observation matrix is denoted as

$$\begin{aligned} \varvec{Y}_n = \varvec{A}(\varvec{\theta }) \varvec{S}_n + \varvec{W}_n, \end{aligned}$$
(2)

where \(\varvec{Y}_n = [\varvec{y}(1), \ldots , \varvec{y}(n)], \varvec{S}_n = [\varvec{s}(1), \ldots , \varvec{s}(n)]\), and \(\varvec{W}_n = [\varvec{w}(1), \ldots , \varvec{w}(n)]\). We estimate the signal number K from \(\varvec{Y}_n\). Some statistical assumptions are made for the model as follows:

  • A1: The signal number is unknown and satisfies \(K < \min \{m,n\}\).

  • A2: The spatially incoherent signals \(s_1(t), \ldots , s_K(t)\) are stationary processes with 0 mean. The covariance matrix of signal vector \(\varvec{s}(t)\) is \(E\{\varvec{s}(t)\varvec{s}^H(t)\} = \text {diag}(\varvec{p}) \triangleq \varvec{P}_s\), where \(\varvec{p}= [p_1, \ldots , p_K]^T\) is the signal power vector.

  • A3: The noise vectors \(\varvec{w}(1), \ldots , \varvec{w}(n)\) are independent of \(\varvec{A}(\varvec{\theta }) \varvec{s}(1), \ldots , \varvec{A}(\varvec{\theta }) \varvec{s}(n)\) and follow a CES distribution with zero mean and covariance matrix \(\sigma ^2 \varvec{I}_m\), i.e., \(E\{\varvec{w}(t)\varvec{w}^H(t)\} = \sigma ^2 \varvec{I}_m\), where \(\sigma ^2\) is the unknown noise power. In addition, \(s_1(t), \ldots , s_K(t)\) are mutually independent.

Under the above assumptions, the observation vector \(\varvec{y}(t)\) in (1) at discrete time t has mean zero and covariance matrix

$$\begin{aligned} \varvec{R}\triangleq E[\varvec{y}(t)\varvec{y}^H(t) ] = \varvec{A}(\varvec{\theta }) \varvec{P}_s \varvec{A}(\varvec{\theta })^H + \sigma ^2 \varvec{I}_m. \end{aligned}$$
(3)

We denote the eigenvalues of \(\varvec{R}\) as \(\lambda _1 \ge \ldots \ge \lambda _K \ge \lambda _{K + 1} = \ldots = \lambda _m = \sigma ^2\) with the corresponding eigenvectors \(\varvec{u}_1, \ldots , \varvec{u}_m\). The sample covariance matrix (SCM) is

$$\begin{aligned} {\hat{\varvec{R}}} = \frac{1}{n} \sum \limits ^n_{t=1}\varvec{y}(t)\varvec{y}^H(t). \end{aligned}$$
(4)

The eigenvalues and eigenvectors of \({\hat{\varvec{R}}}\) are denoted as \({\hat{\lambda }}_{1} \ge \ldots \ge \hat{\lambda }_{m}\), \(\hat{\varvec{u}}_1, \ldots , \hat{\varvec{u}}_m\), also named the sample eigenvalues and the sample eigenvectors.
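As a minimal sketch of (4) and the eigen-decomposition above, the following NumPy snippet forms the SCM from a snapshot matrix and returns the sample eigenvalues in descending order with their eigenvectors. The function name `sample_eigendecomposition` and the toy dimensions are illustrative assumptions, not part of the method description.

```python
import numpy as np

def sample_eigendecomposition(Y):
    """Sample eigenvalues (descending) and eigenvectors of the SCM in (4).

    Y is the m x n snapshot matrix whose columns are y(1), ..., y(n).
    """
    m, n = Y.shape
    R_hat = (Y @ Y.conj().T) / n               # SCM, Hermitian m x m
    eigvals, eigvecs = np.linalg.eigh(R_hat)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    return eigvals[order], eigvecs[:, order]

# Toy example (assumed sizes): noise-only complex Gaussian snapshots.
rng = np.random.default_rng(0)
m, n = 8, 50
Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
lam_hat, U_hat = sample_eigendecomposition(Y)
```

Using `eigh` rather than a general eigen-solver exploits the Hermitian structure of the SCM and guarantees real eigenvalues.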

2.2 The review of generalized Bayesian information criterion

As a conventional method, the BIC deals with the model selection issue from a Bayesian point of view. However, the criterion does not work well under conditions such as small sample size and low signal-to-noise ratio (SNR). Moreover, the BIC relies heavily on the density of the observations. By incorporating the density functions of sample eigenvalues and related statistics, the GBIC is proposed to remedy the limitations of the BIC [20]. The GBIC has two expressions that relax the Gaussian assumption on the observations, denoted as \(\hbox {GBIC}_1\) and \(\hbox {GBIC}_2\). The \(\hbox {GBIC}_1\) is made up of two parts: the density of the observations and that of the sample eigenvalues or related statistics. To reduce the influence of the sample eigenvectors, the \(\hbox {GBIC}_2\) drops the density of the observations used in \(\hbox {GBIC}_1\), which removes the dependence on that density and makes the criterion convenient for non-Gaussian data. Suppose that \(\mathfrak {L}\) is a statistic based on the sample eigenvalues corresponding to an unknown vector \(\Theta _z^{(k)}\) with a possible signal number k. If the pdf of the statistic \(\mathfrak {L}\) is denoted as \(f(\mathfrak {L}|\Theta _z^{(k)})\), then the expression of the \(\hbox {GBIC}_2\) is

$$\begin{aligned} \text {GBIC}_2(k) = -2\log f(\mathfrak {L}|{\hat{\Theta }}_{z}^{(k)}) + \nu (k)\log n, \end{aligned}$$
(5)

where \({\hat{\Theta }}_z^{(k)}\) is the maximum likelihood estimate of \(\Theta _z^{(k)}\) and \(\nu (k)\) is the number of free parameters in \(\Theta _z^{(k)}\). In (5), the first term is \(-2\) times the log-likelihood of the statistic evaluated at \({\hat{\Theta }}_z^{(k)}\), and the second term is the penalty. In this paper, we suggest a sphericity test statistic to estimate the signal number, based on the \(\hbox {GBIC}_2\).

3 The signal enumeration method

Let \(\varvec{x}_1, \dots , \varvec{x}_n \in \mathbb {C}^p\) be an independent and identically distributed sample (circularly symmetric complex) with zero mean and covariance matrix \(\varvec{\Sigma }_p\). Consider the sphericity test for the population covariance matrix:

$$\begin{aligned} H_0: \varvec{\Sigma }_p = \sigma ^2 \varvec{I}_p \quad \text{ vs. } \quad H_1: \varvec{\Sigma }_p \ne \sigma ^2 \varvec{I}_p, \end{aligned}$$
(6)

where \(\sigma\) is an unknown but fixed positive constant.

3.1 Estimators of \(\text {tr}\varvec{\Sigma }_p^i/p\)

Suppose that \(H_p\), \(F_n\) are spectral distributions of \(\varvec{\Sigma }_p\), \(\varvec{S}_n\) respectively, where \(\varvec{S}_n = \frac{1}{n}\sum ^n_{i=1}\varvec{x}_i \varvec{x}_i^{ H }\) is the SCM. We define the integer-order moments of \(H_p\) and \(F_n\):

$$\begin{aligned} \alpha _i&:=\int {t^i \text {d}H_p(t)} = \frac{1}{p}\textrm{tr}(\varvec{\Sigma }_p^i), \\ {\hat{\beta }}_i&:=\int t^i \text {d}F_n(t) = \frac{1}{p}\textrm{tr}(\varvec{S}_n^i), \quad i = 1, 2, \dots \end{aligned}$$

When the observations are Gaussian, the estimators \({\hat{\alpha }}_i\) of \(\alpha _i\), \(i = 1, 2, 4\), are proved to be consistent, unbiased, and asymptotically normal as \((n, p) \rightarrow \infty\), and they are adopted in [25, 26, 29, 30]. These estimators can be expressed as polynomials in \({\hat{\beta }}_i\):

$$\begin{aligned} {\hat{\alpha }}_1 =&{\hat{\beta }}_1, \\ {\hat{\alpha }}_2 =&\frac{n^2 }{(n - 1)(n + 2)} \Big ( {\hat{\beta }}_2 - \frac{p}{n}{\hat{\beta }}_1^2 \Big ), \\ {\hat{\alpha }}_4 =&\tau _1 \Big ( {\hat{\beta }}_4 - \frac{4p}{n}{\hat{\beta }}_3{\hat{\beta }}_1 - \tau _2 {\hat{\beta }}_2^2 + \tau _3 {\hat{\beta }}_2{\hat{\beta }}_1^2 - \tau _4 {\hat{\beta }}_1^4 \Big ), \end{aligned}$$

where

$$\begin{aligned} \tau _1&= \frac{n^5(n^2 + n + 2)}{(n + 1)(n + 2)(n + 4)(n + 6)(n - 1)(n - 2)(n - 3)},\\ \tau _2&= \frac{p(2n^2 + 3n - 6)}{n(n^2 + n + 2)}, \\ \tau _3&= \frac{p^2(10n^2 + 12n)}{n^2(n^2 + n + 2)}, \\ \tau _4&= \frac{p^3(5n^2 + 6n)}{n^3(n^2 + n + 2)}. \end{aligned}$$
(7)

If the population distribution is non-Gaussian, the unbiasedness does not hold anymore for \({\hat{\alpha }}_k, k = 2, 3, 4\). However, the consistency and asymptotic normality are retained under some suitable assumptions in [30].
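The estimators above depend on the data only through the sample eigenvalues, so they can be sketched in a few lines of NumPy. The function name `moment_estimators` and the toy sizes are illustrative assumptions; the formulas follow the expressions for \({\hat{\alpha }}_1, {\hat{\alpha }}_2, {\hat{\alpha }}_4\) and (7) as printed.

```python
import numpy as np

def moment_estimators(sample_eigs, n):
    """alpha_hat_1, alpha_hat_2, alpha_hat_4 from the p eigenvalues of S_n, using (7)."""
    lam = np.asarray(sample_eigs, dtype=float)
    p = lam.size
    n = float(n)  # float arithmetic avoids integer overflow in n**5 * (...)
    # beta_hat_i = tr(S_n^i) / p = mean of the i-th powers of the eigenvalues
    b1, b2, b3, b4 = (np.mean(lam ** i) for i in (1, 2, 3, 4))
    q = n**2 + n + 2
    tau1 = n**5 * q / ((n + 1) * (n + 2) * (n + 4) * (n + 6)
                       * (n - 1) * (n - 2) * (n - 3))
    tau2 = p * (2 * n**2 + 3 * n - 6) / (n * q)
    tau3 = p**2 * (10 * n**2 + 12 * n) / (n**2 * q)
    tau4 = p**3 * (5 * n**2 + 6 * n) / (n**3 * q)
    a1 = b1
    a2 = n**2 / ((n - 1) * (n + 2)) * (b2 - p / n * b1**2)
    a4 = tau1 * (b4 - 4 * p / n * b3 * b1 - tau2 * b2**2
                 + tau3 * b2 * b1**2 - tau4 * b1**4)
    return a1, a2, a4

# Toy check (assumed setting): real Gaussian data with identity covariance,
# for which alpha_1 = alpha_2 = alpha_4 = 1.
rng = np.random.default_rng(0)
p, n = 50, 2000
X = rng.standard_normal((p, n))
a1, a2, a4 = moment_estimators(np.linalg.eigvalsh(X @ X.T / n), n)
```

In this toy setting the three estimates should be close to 1, whereas the raw moments \({\hat{\beta }}_2\) and \({\hat{\beta }}_4\) are inflated by terms of order \(p/n\).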

Assumption 1

The sample size n tends to infinity together with the dimension p, with a ratio \(c_n = p / n \rightarrow c \in (0, \infty )\).

Assumption 2

There exists a doubly infinite matrix composed of independent and identically distributed random variables \(w_{ij}\) satisfying

$$\begin{aligned} E(w_{ij}) = 0, \quad E(w_{ij}^2) = 1, \quad E(w_{ij}^4) < \infty , \quad i, j \ge 1. \end{aligned}$$

Denoting \(\varvec{W}_n = (w_{ij})_{1 \le i \le p, 1 \le j \le n}\), the observation vectors can be expressed as \({\varvec{x}}_j = \varvec{\Sigma }_p^{1 / 2} \varvec{w}_{\cdot j}\), where \(\varvec{w}_{\cdot j} = (w_{ij})_{1 \le i \le p}\) is the j-th column of \(\varvec{W}_n\).

Assumption 3

The spectral norm of \(\varvec{\Sigma }_p\) has a positive constant bound, and the population spectral distribution \(H_p\) weakly converges to a probability distribution H as \(p \rightarrow \infty\).

It is customary to write \(E(w_{11}^4) = 3 + \Delta\), where \(\Delta\) is a finite constant that equals 0 if the \(w_{ij}\) in Assumption 2 are Gaussian. Under these assumptions, we know that the estimators \(\hat{\alpha }_k\) converge almost surely to \(\alpha _k, k=1, 2, 3, 4\) [30].

3.2 Test procedure

Let \(\lambda _1, \ldots , \lambda _p \in \mathbb {R}\) be the eigenvalues of \(\varvec{\Sigma }_p\). From the Cauchy–Schwarz inequality, we know that

$$\begin{aligned} \left( \sum ^p_{i = 1} \lambda _i \right) ^2 \le p \times \sum ^p_{i = 1} \lambda _i^2, \end{aligned}$$
(8)

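with equality holding if and only if \(\lambda _1, \ldots , \lambda _p\) are equal. Dividing (8) by \(p^2\) gives \(\big (\sum _{i = 1}^p \lambda _i/p\big )^2 \le \sum _{i = 1}^p \lambda _i^2/p\). Both sides are positive for a positive definite \(\varvec{\Sigma }_p\), so squaring them, taking reciprocals, and multiplying by \(\sum _{i = 1}^p \lambda _i^4/p\) yields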

$$\begin{aligned} \frac{\sum _{i = 1}^p \lambda _i^4 / p}{\Big ( \sum _{i = 1}^p \lambda _i /p \Big )^4} \ge \frac{\sum _{i = 1}^p \lambda _i^4 /p}{\Big ( \sum _{i = 1}^p \lambda _i^2 /p \Big )^2}. \end{aligned}$$
(9)

Therefore, the moments of the distribution \(H_p\) satisfy

$$\begin{aligned} \frac{\alpha _4}{\alpha _1^4} \ge \frac{\alpha _4}{\alpha _2^2}. \end{aligned}$$
(10)

The equality holds if and only if the sphericity hypothesis is true. Denote

$$\begin{aligned} \phi :=\frac{\alpha _4}{\alpha _1^4} - \frac{\alpha _4}{\alpha _2^2} \ge 0, \end{aligned}$$

then \(\phi = 0\) iff \(\varvec{\Sigma }_p = \sigma ^2 \varvec{I}_p\). Therefore, the sphericity test (6) is equivalent to the following hypothesis

$$\begin{aligned} H_{0a}: \phi = 0 \quad \text{ vs. } \quad H_{1a}: \phi > 0. \end{aligned}$$
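As a small numerical illustration of the equivalence above, the following sketch evaluates \(\phi\) from a set of population eigenvalues. The function name `phi` and the two example spectra are assumptions chosen for illustration only.

```python
import numpy as np

def phi(eigs):
    """phi = alpha_4 / alpha_1^4 - alpha_4 / alpha_2^2 for population eigenvalues."""
    lam = np.asarray(eigs, dtype=float)
    a1, a2, a4 = lam.mean(), (lam ** 2).mean(), (lam ** 4).mean()
    return a4 / a1**4 - a4 / a2**2

phi_spherical = phi([2.0, 2.0, 2.0, 2.0])   # Sigma = 2 I: all eigenvalues equal
phi_spiked = phi([5.0, 1.0, 1.0, 1.0])      # one eigenvalue stands out
```

For the spherical spectrum \(\alpha _1 = 2\), \(\alpha _2 = 4\), \(\alpha _4 = 16\), so \(\phi = 0\). For the second spectrum \(\alpha _1 = 2\), \(\alpha _2 = 7\), \(\alpha _4 = 157\), so \(\phi = 157/16 - 157/49 > 0\).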

For convenience, we define a new variable

$$\begin{aligned} \gamma = \frac{\hat{\alpha }_4}{\hat{\alpha }_1^4} - \frac{\hat{\alpha }_4}{\hat{\alpha }_2^2}. \end{aligned}$$
(11)

The following proposition gives the asymptotic distribution of the proposed statistic \(\gamma\) under the null hypothesis \(H_{0a}\).

Proposition 1

Under Assumptions 1–3, when the null hypothesis \(H_{0}\) in (6) holds, we have

$$\begin{aligned} T = \frac{n {\gamma } - \tilde{\mu }}{\sqrt{V}} \sim \mathcal {N}(0, 1), \end{aligned}$$
(12)

where \(\tilde{\mu } = 2 \Delta\) and \(V = 16\). In particular, if \(E(w_{11}^4) = 3\), i.e., \(\Delta = 0\), then

$$\begin{aligned} T = \frac{n \gamma }{4} \sim \mathcal {N}(0, 1). \end{aligned}$$
(13)

Proof

From (2) in [30], we know

$$\begin{aligned} n ({\hat{\alpha }}_1 - 1, {\hat{\alpha }}_2 - 1, {\hat{\alpha }}_4 - 1)^{T} \sim \mathcal {N}_3 (\tilde{\varvec{m}}, \varvec{V}_1+\varvec{V}_2\Delta ), \end{aligned}$$

where the mean vector \(\tilde{\varvec{m}}\) and the terms \(\varvec{V}_1\) and \(\varvec{V}_2\) of the covariance matrix are, respectively,

$$\begin{aligned} \tilde{\varvec{m}}&= \Delta \cdot (0, 1, c +6)^T, \\ \varvec{V}_1&= \frac{1}{c}\begin{bmatrix} 2 &{} 4 &{} 8 \\ 4 &{} 4(c +2) &{} 8(3 c + 2) \\ 8 &{} 8(3 c + 2) &{} 8 (c^3 + 12 c^2 + 18 c + 4) \end{bmatrix}, \\ \varvec{V}_2&= \frac{1}{c}\begin{bmatrix} 1 &{} 2 &{} 4 \\ 2 &{} 4 &{} 8 \\ 4 &{} 8 &{} 16 \end{bmatrix}. \end{aligned}$$

Let

$$\begin{aligned} f(\varvec{t}) = \frac{z}{x^4} - \frac{z}{y^2}, \quad \varvec{t} = (x, y, z)^T. \end{aligned}$$

It is clear that \(f(\varvec{t})\) has continuous partial derivatives at \(\varvec{t}_0 = (1, 1, 1)^T\) and the Jacobian vector is

$$\begin{aligned} J({\varvec{t}}) = \frac{\partial f(\varvec{t})}{\partial \varvec{t}} = \Big (-\frac{4z}{x^5}, \frac{2z}{y^3}, \frac{1}{x^4} - \frac{1}{y^2}\Big )^T. \end{aligned}$$

Note that \(f(\varvec{t}_0) = 0\) and \(J({\varvec{t}_0}) = (-4, 2, 0)^T\). Based on the Delta method, we have

$$\begin{aligned} n \bigg ( \frac{{{\hat{\alpha }}}_4}{{{\hat{\alpha }}_1^4}} - \frac{{{\hat{\alpha }}}_4}{{{\hat{\alpha }}_2^2}} \bigg ) \xrightarrow {D} \mathcal {N}(\tilde{\mu }, V). \end{aligned}$$

By direct calculation, we know

$$\begin{aligned} {\tilde{\mu }}&= J(\varvec{t}_0)^T \tilde{\varvec{m}} = 2 \Delta ,\\ V&= J(\varvec{t}_0)^T (\varvec{V}_1+\varvec{V}_2\Delta ) J(\varvec{t}_0) = 16, \end{aligned}$$

where the \(\Delta\)-dependent part of V vanishes because \(\varvec{V}_2 = \frac{1}{c}\varvec{v}\varvec{v}^T\) with \(\varvec{v} = (1, 2, 4)^T\) and \(J(\varvec{t}_0)^T \varvec{v} = -4 + 4 + 0 = 0\).

Namely,

$$\begin{aligned} n \gamma \sim \mathcal {N}(\tilde{\mu }, V). \end{aligned}$$

Thus,

$$\begin{aligned} T = \frac{n \gamma - \tilde{\mu }}{\sqrt{V}} \sim \mathcal {N}(0, 1). \end{aligned}$$

\(\square\)
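The Gaussian case (13) can be examined with a short Monte Carlo sketch. The snippet below draws real Gaussian data with identity covariance (so that \(E(w_{11}^4) = 3\)), computes \(\gamma\) from (11) with the estimators of Sect. 3.1, and collects \(T = n\gamma /4\) over repeated trials. The sizes `p`, `n`, the number of repetitions, and the function name `gamma_statistic` are assumptions made for illustration, and the experiment is only a rough finite-sample check, not a proof.

```python
import numpy as np

def gamma_statistic(X):
    """gamma in (11) from a p x n real data matrix X with zero-mean columns."""
    p, n = X.shape
    lam = np.linalg.eigvalsh(X @ X.T / n)      # eigenvalues of the SCM
    b1, b2, b3, b4 = (np.mean(lam ** i) for i in (1, 2, 3, 4))
    n = float(n)
    q = n**2 + n + 2
    tau1 = n**5 * q / ((n + 1) * (n + 2) * (n + 4) * (n + 6)
                       * (n - 1) * (n - 2) * (n - 3))
    tau2 = p * (2 * n**2 + 3 * n - 6) / (n * q)
    tau3 = p**2 * (10 * n**2 + 12 * n) / (n**2 * q)
    tau4 = p**3 * (5 * n**2 + 6 * n) / (n**3 * q)
    a1 = b1
    a2 = n**2 / ((n - 1) * (n + 2)) * (b2 - p / n * b1**2)
    a4 = tau1 * (b4 - 4 * p / n * b3 * b1 - tau2 * b2**2
                 + tau3 * b2 * b1**2 - tau4 * b1**4)
    return a4 / a1**4 - a4 / a2**2

rng = np.random.default_rng(2024)
p, n, reps = 100, 200, 300                      # assumed toy sizes, c = 0.5
T = np.array([n * gamma_statistic(rng.standard_normal((p, n))) / 4.0
              for _ in range(reps)])
print("empirical mean and std of T:", T.mean(), T.std())
```

If the normal approximation is adequate at these sizes, the empirical mean and standard deviation of `T` should be roughly 0 and 1.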

Following [29], consider the statistic \(T = \alpha _2/\alpha _1^2\), where \(\alpha _1 = \text {tr}(\varvec{\Sigma }_p)/p\) and \(\alpha _2= \text {tr}(\varvec{\Sigma }^2_p)/p\), for testing the sphericity of a \(p\)-dimensional positive definite covariance matrix \(\varvec{\Sigma }_p\). If \(\varvec{\Sigma }_p\) is proportional to the identity matrix, then \(\alpha _1^2 = \alpha _2\) and \(T = 1\). Otherwise, \(T > 1\).

Therefore, T is a sphericity test statistic, which can be employed for testing the sphericity structure of a positive definite covariance matrix. In [29], \(\text {tr}(\varvec{\Sigma })\) and \(\text {tr}(\varvec{\Sigma }^2)\) are unbiasedly and consistently estimated under the asymptotic condition \(p, n \rightarrow \infty\) with \(n = O(p^\delta ), 1/2<\delta < 1\), which means that n is of the same order of magnitude as \(p^\delta\). In contrast, Proposition 1 ensures that the proposed statistic T performs well for testing the covariance matrix when the dimension exceeds the sample size under the asymptotic condition of Assumption 1.

The key problem is detecting the signal number using the proposed sphericity test statistic. We denote the presumptive covariance matrix of signals as \(\varvec{R}^{(k)}\) with eigenvalues \(\lambda _1\ge \cdots \ge \lambda _k>\lambda _{k+1}=\ldots =\lambda _m=\sigma ^2_k\), and the unknown parameter vector as \(\varvec{\Theta }^{(k)}=[\lambda _1, \ldots , \lambda _k, \sigma _k^2]^T\), where \(\sigma _k^2\) is the noise power. The signal eigenvalues of \(\varvec{R}^{(k)}\) are \(\lambda _1, \ldots , \lambda _k\) with the signal subspace \(\varvec{U}_k \triangleq \{\varvec{u}_1, \ldots , \varvec{u}_k\}\), and the noise eigenvalues of \(\varvec{R}^{(k)}\) are \(\lambda _{k+1}, \ldots , \lambda _m\) with the noise subspace \(\varvec{U}_{m-k}\triangleq \{\varvec{u}_{k+1}, \ldots , \varvec{u}_m\}\). By the unitary coordinate transformation, we decompose the received data \(\varvec{y}(t)\) as

$$\begin{aligned} \varvec{U}^H\varvec{y}(t)=\begin{bmatrix} \varvec{U}^H_k \\ \varvec{U}^H_{m-k} \end{bmatrix} \varvec{y}(t)\triangleq \begin{bmatrix} \varvec{y}_s^{(k)}(t)\\ \varvec{y}_w^{(k)}(t) \end{bmatrix} \end{aligned}$$
(14)

where \(\varvec{y}_s^{(k)}(t)=\varvec{U}^H_k\varvec{y}(t)\in \mathbb {C}^{k}\) and \(\varvec{y}_w^{(k)}(t)=\varvec{U}^H_{m-k}\varvec{y}(t)\in \mathbb {C}^{m-k}\) are the presumptive signal and noise subspace components, respectively [14, 17]. For the presumptive signal subspace components, the covariance matrix is

$$\begin{aligned} E\big (\varvec{y}_s^{(k)}(t)(\varvec{y}_s^{(k)}(t))^H\big )=\text {diag}(\lambda _1, \ldots , \lambda _k)\triangleq \varvec{R}_s^{(k)}. \end{aligned}$$
(15)

For the presumptive noise subspace component, the corresponding covariance matrix is

$$\begin{aligned} E\big (\varvec{y}_w^{(k)}(t)(\varvec{y}_w^{(k)}(t))^H\big )=\text {diag}(\lambda _{k+1}, \ldots , \lambda _m)\triangleq \varvec{R}_w^{(k)}. \end{aligned}$$
(16)

If there exists no signal in the presumptive noise subspace components \(\varvec{y}^{(k)}_w(t)\), \(\varvec{R}_w^{(k)}\) should be proportional to the identity matrix, i.e. \(\varvec{R}_w^{(k)}\) is spherical. Therefore, the problem of estimating the signal number turns out to be testing the sphericity of the presumptive noise subspace covariance matrices \(\varvec{R}_w^{(k)}, k=0, 1, \ldots , \min (m,n) - 1\). We define

$$\begin{aligned} \alpha _1^{(k)}&\triangleq \frac{1}{m-k}\text {tr}\big (\varvec{R}_w^{(k)}\big ), \\ \alpha _2^{(k)}&\triangleq \frac{1}{m-k}\text {tr}\big ((\varvec{R}_w^{(k)})^2\big ), \\ \alpha _4^{(k)}&\triangleq \frac{1}{m-k}\text {tr}\big ((\varvec{R}_w^{(k)})^4\big ), \end{aligned}$$
(17)

and

$$\begin{aligned} T^{(k)} = \frac{n \gamma ^{(k)} - \tilde{\mu }}{\sqrt{V}}, \end{aligned}$$
(18)

where

$$\begin{aligned} \gamma ^{(k)} =\frac{\alpha _4^{(k)}}{(\alpha _1^{(k)})^4}-\frac{\alpha _4^{(k)}}{(\alpha _2^{(k)})^2}. \end{aligned}$$

According to Proposition 1, when \(\alpha _1^{(k)}, \alpha _2^{(k)}, \alpha _4^{(k)}\) are estimated from the presumptive noise subspace components \(\varvec{y}_w^{(k)}(t)\), the resulting estimate \({\hat{T}}^{(k)}\) of \(T^{(k)}\) is asymptotically Gaussian provided that \(\varvec{R}_w^{(k)}\) is spherical. If the presumptive noise subspace contains no signal, the \(m-k\) smallest eigenvalues are equal to each other and \(\varvec{R}_w^{(k)}\) is spherical. If the presumptive noise subspace components contain signals, \(\varvec{R}_w^{(k)}\) is a diagonal matrix with unequal diagonal entries, and the distribution of \({\hat{T}}^{(k)}\) is non-Gaussian. Since \({\hat{T}}^{(k)}\) is a statistic of the \(m-k\) smallest sample eigenvalues, we employ it as the statistic \(\mathfrak {L}\) in the \(\hbox {GBIC}_2\). The signal number can thus be estimated via the GBIC, using \({\hat{T}}^{(k)}\) to test the sphericity of the kth presumptive noise subspace.

3.3 Implementation of the RMT-based GBIC

In this subsection, we apply \({\hat{T}}^{(k)}\) to signal enumeration via the GBIC. By the unitary coordinate transformation with sample eigenvalues and sample eigenvectors, we obtain the presumptive noise subspace components of the observations. The sample eigenvalues are \({\hat{\lambda }}_1>\cdots >{\hat{\lambda }}_m\) with the corresponding sample eigenvectors \({\hat{\varvec{u}}}_1, \ldots , {\hat{\varvec{u}}}_m\). For a presumed signal number k, we can estimate the noise subspace components as

$$\begin{aligned} {\hat{\varvec{y}}}_w^{(k)}(t) = {\hat{\varvec{U}}}^H_{m-k}\varvec{y}(t)\in \mathbb {C}^{m-k}, \end{aligned}$$
(19)

where \({\hat{\varvec{U}}}_{m-k}=[{\hat{\varvec{u}}}_{k+1}, \ldots , {\hat{\varvec{u}}}_{m}]\). The covariance matrix of the presumptive noise subspace components is unbiasedly estimated by

$$\begin{aligned} \hat{\varvec{S}}_w^{(k)}=\frac{1}{n}\sum \limits _{t=1}^n ({\hat{\varvec{y}}}_w^{(k)}(t))({\hat{\varvec{y}}}_w^{(k)}(t))^H. \end{aligned}$$
(20)
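Because \({\hat{\varvec{U}}}_{m-k}\) consists of eigenvectors of the SCM, the matrix in (20) is diagonal with the \(m-k\) smallest sample eigenvalues on its diagonal, so its trace moments can be read off the sample eigenvalues directly. The sketch below checks this numerically; the toy sizes and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 40, 2                                   # assumed toy sizes
Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)

R_hat = Y @ Y.conj().T / n                           # SCM as in (4)
lam, U = np.linalg.eigh(R_hat)                       # ascending order
lam, U = lam[::-1], U[:, ::-1]                       # descending order

U_noise = U[:, k:]                                   # [u_{k+1}, ..., u_m]
Y_w = U_noise.conj().T @ Y                           # noise components, (19)
S_w = Y_w @ Y_w.conj().T / n                         # their covariance, (20)

# S_w equals diag(lam_{k+1}, ..., lam_m) up to rounding error.
is_diagonal = np.allclose(S_w, np.diag(lam[k:]))
```

In an implementation this means that only one eigen-decomposition of the SCM is needed for all candidate values of k.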

Letting

$$\begin{aligned} {\hat{\beta }}_i^{(k)} = \frac{1}{m-k}\text {tr}\big (({\hat{\varvec{S}}}_w^{(k)})^i\big ), \quad i=1, 2, 3, 4, \end{aligned}$$
(21)

the quantities \(\alpha _1^{(k)}\), \(\alpha _2^{(k)}\) and \(\alpha _4^{(k)}\) can be unbiasedly and consistently estimated as follows:

$$\begin{aligned} {\hat{\alpha }}_1^{(k)} & = {\hat{\beta }}_1^{(k)}, \\ {\hat{\alpha }}_2^{(k)} & = \frac{n^2 }{(n - 1)(n + 2)} \Big ( {\hat{\beta }}_2^{(k)} - \frac{m}{n}({\hat{\beta }}_1^{(k)})^2 \Big ), \\ {\hat{\alpha }}_4^{(k)} & = \tau _1 \Big ( {\hat{\beta }}_4^{(k)} - \frac{4m}{n}{\hat{\beta }}_3^{(k)}{\hat{\beta }}_1^{(k)} - \tau _2({\hat{\beta }}_2^{(k)})^2 + \tau _3 {\hat{\beta }}_2^{(k)}({\hat{\beta }}_1^{(k)})^2 - \tau _4 ({\hat{\beta }}_1^{(k)})^4 \Big ), \end{aligned}$$
(22)

where \(\tau _1, \tau _2, \tau _3\) and \(\tau _4\) are shown in (7).

The statistic \(\gamma ^{(k)}\) is estimated by

$$\begin{aligned} {\hat{\gamma }}^{(k)}&=\frac{{\hat{\alpha }}_4^{(k)}}{({\hat{\alpha }}_1^{(k)})^4}-\frac{{\hat{\alpha }}_4^{(k)}}{({\hat{\alpha }}_2^{(k)})^2},\end{aligned}$$
(23)
$$\begin{aligned} {\hat{T}}^{(k)}&= \frac{n {\hat{\gamma }}^{(k)} - \tilde{\mu }}{\sqrt{V}}. \end{aligned}$$
(24)

If \(\varvec{R}_w^{(k)}\) is proportional to the identity matrix, the asymptotic distribution of \({\hat{T}}^{(k)}\) is

$$\begin{aligned} {\hat{T}}^{(k)} \sim \mathcal {N}(0, 1). \end{aligned}$$
(25)

The pdf \(f({\hat{T}}^{(k)}|{\hat{\varvec{\Theta }}}^{(k)})\) under the parameter estimate \({\hat{\varvec{\Theta }}}^{(k)}=[{\hat{\lambda }}_1, \ldots , {\hat{\lambda }}_k, {\hat{\sigma }}_k^2]\) with \({\hat{\sigma }}_k^2 = \frac{1}{m-k}\sum _{i=k+1}^{m}{\hat{\lambda }}_i\) is

$$\begin{aligned} f({\hat{T}}^{(k)}|{\hat{\varvec{\Theta }}}^{(k)}) = \frac{1}{\sqrt{2 \pi }}\exp \Big (-\frac{1}{2}({\hat{T}}^{(k)})^2 \Big ). \end{aligned}$$
(26)

Substituting (26) into (5) with \(f(\mathfrak {L}|{\hat{\varvec{\Theta }}}^{(k)}) =f({\hat{T}}^{(k)}|{\hat{\varvec{\Theta }}}^{(k)})\) and the number of parameters \(\nu (k)=k+1\), and ignoring the constant term, we obtain the proposed GBIC-based function

$$\begin{aligned} \text {GBIC}_{\text {new}}(k)=({\hat{T}}^{(k)})^2 + (k+1)\log n. \end{aligned}$$
(27)

By minimizing (27) with respect to k, we estimate the signal number as:

$$\begin{aligned} & \hat{k} = \arg \mathop {\min }\limits_{k} {\text{GBIC}}_{{{\text{new}}}} (k), \\ & k \in {\mathcal{N}},0 \le k \le \min (m,n) - 1. \\ \end{aligned}$$
(28)

In summary, we write the proposed signal enumeration method as the following algorithm (Table 1).

Table 1 Summary of the proposed algorithm
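The steps of Sect. 3.3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: the function name `gbic_new` is hypothetical, the defaults `mu_tilde=0.0` and `V=16.0` correspond to the Gaussian case (13), the dimension entering (22) and (7) is taken as m, as printed in (22), and the moments \({\hat{\beta }}_i^{(k)}\) are computed from the \(m-k\) smallest sample eigenvalues. The toy data use a diagonal covariance with two strong components, which is an assumed test case and not one of the scenarios of Sect. 4.

```python
import numpy as np

def gbic_new(sample_eigs, n, mu_tilde=0.0, V=16.0):
    """Estimate the signal number by minimizing (27) over k = 0, ..., min(m, n) - 1."""
    lam = np.sort(np.asarray(sample_eigs, dtype=float))[::-1]   # descending
    m = lam.size
    k_max = min(m, int(n)) - 1
    n = float(n)
    q = n**2 + n + 2
    # Constants of (7), with the dimension taken as m (assumption, following (22)).
    tau1 = n**5 * q / ((n + 1) * (n + 2) * (n + 4) * (n + 6)
                       * (n - 1) * (n - 2) * (n - 3))
    tau2 = m * (2 * n**2 + 3 * n - 6) / (n * q)
    tau3 = m**2 * (10 * n**2 + 12 * n) / (n**2 * q)
    tau4 = m**3 * (5 * n**2 + 6 * n) / (n**3 * q)

    scores = np.full(k_max + 1, np.nan)
    for k in range(k_max + 1):
        noise = lam[k:]                                   # m - k smallest eigenvalues
        b1, b2, b3, b4 = (np.mean(noise ** i) for i in (1, 2, 3, 4))   # (21)
        a1 = b1                                                        # (22)
        a2 = n**2 / ((n - 1) * (n + 2)) * (b2 - m / n * b1**2)
        a4 = tau1 * (b4 - 4 * m / n * b3 * b1 - tau2 * b2**2
                     + tau3 * b2 * b1**2 - tau4 * b1**4)
        gamma = a4 / a1**4 - a4 / a2**2                                # (23)
        T = (n * gamma - mu_tilde) / np.sqrt(V)                        # (24)
        scores[k] = T**2 + (k + 1) * np.log(n)                         # (27)
    return int(np.nanargmin(scores)), scores                          # (28)

# Toy example (assumed): complex Gaussian noise plus two strong components,
# i.e. population covariance diag(50, 30, 1, ..., 1).
rng = np.random.default_rng(1)
m, n = 30, 300
Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
Y[0] *= np.sqrt(50.0)
Y[1] *= np.sqrt(30.0)
eigs = np.linalg.eigvalsh(Y @ Y.conj().T / n)
k_hat, scores = gbic_new(eigs, n)
print("estimated signal number:", k_hat)
```

For non-Gaussian noise, `mu_tilde` and `V` would have to be set from Proposition 1 with a value of \(\Delta\) that is assumed known or estimated separately.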

4 Numerical evaluation

This section evaluates the performance of the proposed method by comparing it with several existing methods: those of [15] (denoted as BN-AIC), [16] (denoted as RMT-AIC), [18] (denoted as BIC-variant), [10] (denoted as EEE), and [21] (denoted as GBIC-SP). The comparison is set in the scenario of a large array in white Gaussian and non-Gaussian noise. The existing criterion functions are collected in Table 2.

Table 2 Competing criterion functions in the existing literature

4.1 The situations of no source signals

When no source signals are present, we mainly look at two cases in large dimensional regime:

  • Case 1: the additive noise is distributed as complex Gaussian \(\varvec{w}\sim \mathbb {C}\mathcal {N}(\varvec{0}, \varvec{I}_m)\);

  • Case 2: the noise is K-distributed \(\varvec{w}= \sqrt{\tau (t)}\varvec{n}(t)\) with \(\tau (t)\sim \text{ Gamma }(v, 1/v)\) and \(\varvec{n}\sim \mathbb {C}\mathcal {N}(\varvec{0}, \varvec{I}_m)\).
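The K-distributed noise of Case 2 can be generated as a compound Gaussian vector, as in the sketch below. It is assumed here that \(\text{Gamma}(v, 1/v)\) denotes shape v and scale 1/v, so that the texture \(\tau (t)\) has unit mean and the noise has unit power; the function name `k_distributed_noise` is hypothetical.

```python
import numpy as np

def k_distributed_noise(m, n, v, rng):
    """Return an m x n matrix whose columns are w(t) = sqrt(tau(t)) * n(t)."""
    # Texture: one positive scalar per snapshot, Gamma with shape v and scale 1/v
    # (assumed parameterization, giving E[tau] = 1).
    tau = rng.gamma(shape=v, scale=1.0 / v, size=n)
    # Speckle: circular complex Gaussian with identity covariance.
    speckle = (rng.standard_normal((m, n))
               + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    return np.sqrt(tau)[None, :] * speckle

rng = np.random.default_rng(0)
W = k_distributed_noise(m=16, n=20000, v=2.0, rng=rng)
mean_power = np.mean(np.abs(W) ** 2)
```

Smaller values of v give a heavier-tailed texture, while large v brings the noise close to the Gaussian Case 1.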

We compute the empirical probability of correct detection based on 3000 independent numerical runs in the two cases.

Fig. 1 Probability of detection versus the ratio c under the complex Gaussian and K-distributed noise

Figure 1 shows the correct detection probabilities versus the number of sensors in the Gaussian and K-distributed noise when no sources are present. Figure 1 indicates that our proposed method shows a good detection effect when the ratio c becomes large, as the probability of correct detection is very close to 1.

4.2 The presence of signals

We assume that \(K=5\) spatially incoherent narrow-band signals with identical power \(p_s\) from the directions of arrival (DOA) \(\varvec{\theta }= \{-55^{\circ }, -25^{\circ }, -5^{\circ }, 20^{\circ }, 40^{\circ }\}\) impinge upon a uniform linear array of m sensors with half-wavelength spacing. We generate the signals as independent complex Gaussian sequences, with steering vectors \(\varvec{a}(\theta _k) = (1/\sqrt{m})\big ( 1, e^{j(2\pi d/\lambda )\sin {\theta _k}}, \ldots , e^{j(2\pi d/\lambda )(m-1)\sin {\theta _k}} \big )^{T}, k=1, \ldots , K\). The signal-to-noise ratio (SNR) is defined as \(10\log _{10}(p_s/\sigma ^2)\), where the noise power is \(\sigma ^2=1\). The additive noise follows the two cases in Sect. 4.1.
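A minimal sketch of this data generation for the Gaussian noise case is given below. The function names `ula_steering` and `simulate_snapshots`, the array size, and the SNR value in the example are assumptions; the spacing is taken as \(d/\lambda = 1/2\) and the noise power as 1, as stated above.

```python
import numpy as np

def ula_steering(theta_deg, m, d_over_lambda=0.5):
    """Unit-norm steering vectors of a uniform linear array, one column per DOA."""
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))
    idx = np.arange(m)[:, None]                           # sensor index 0..m-1
    phase = 2 * np.pi * d_over_lambda * idx * np.sin(theta)[None, :]
    return np.exp(1j * phase) / np.sqrt(m)

def simulate_snapshots(theta_deg, m, n, snr_db, rng):
    """Y = A S + W with equal-power Gaussian signals and unit-power Gaussian noise."""
    A = ula_steering(theta_deg, m)
    K = A.shape[1]
    p_s = 10.0 ** (snr_db / 10.0)                         # SNR = 10 log10(p_s / 1)
    S = np.sqrt(p_s / 2) * (rng.standard_normal((K, n))
                            + 1j * rng.standard_normal((K, n)))
    W = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    return A @ S + W

doas = [-55, -25, -5, 20, 40]
rng = np.random.default_rng(0)
A = ula_steering(doas, m=16)
Y = simulate_snapshots(doas, m=16, n=100, snr_db=10.0, rng=rng)
```

The K-distributed case is obtained by replacing `W` with compound Gaussian noise of the form described in Case 2.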

Fig. 2 Probability of detection versus SNR under the complex Gaussian and K-distributed noise

Fig. 3 Probability of detection versus sample size under the complex Gaussian and K-distributed noise

Fig. 4 Probability of detection versus the number of sensors under the complex Gaussian and K-distributed noise

Fig. 5 Probability of detection versus the number of sensors under the complex Gaussian and K-distributed noise

Fig. 6 Probability of detection versus the ratio c under the complex Gaussian and K-distributed noise

Fig. 7 Probability of detection versus v under the K-distributed noise

Figure 2 respectively shows the empirical probability of the corrected detection versus SNR in the complex Gaussian and K-distributed noise for large sample size n and sensor size m. Figure 2a, b indicate that the empirical correct detection probabilities of all methods increase as the SNR becomes large. Meanwhile, we note that the proposed method has the best performance.

Figure 3 shows the empirical correct detection probabilities versus the number of samples in the complex Gaussian and K-distributed noise. Figure 3a, b indicate these estimators have consistency when sample size n becomes larger and larger except for EEE. As the SNR is low or the sample size is small, the EEE approach performs poorly.

Figure 4 shows the empirical probabilities of correct detection versus the number of sensors for the high-dimensional case with \(c = m/n = 1.5\). The proposed method is significantly better than the criteria collected in Table 2. The reason is that \({\hat{T}}^{(k)}\) is more sensitive to the structure of \({\hat{\varvec{R}}}^{(k)}\), and the accurate probabilistic model of \({\hat{T}}^{(k)}\) allows the proposed method to reach a higher correct detection probability. The BN-AIC and BIC-variant methods cannot work in this case because the denominators of their criterion functions equal 0.
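A general fact that is relevant in this regime, independent of any particular criterion, is that the sample covariance matrix built from n snapshots of dimension m has rank at most n, so for \(m>n\) it has \(m-n\) zero eigenvalues. The sketch below only illustrates this rank deficiency for \(c = m/n = 1.5\) with pure complex Gaussian noise; linking it to the vanishing denominators of BN-AIC and BIC-variant is an assumption, since their criterion functions are not restated here. The function name `sample_covariance_spectrum` is hypothetical.

```python
import numpy as np

def sample_covariance_spectrum(m, n, rng):
    """Return (ascending eigenvalues, rank) of the m x m sample covariance.

    The data are n snapshots of unit-power circular complex Gaussian noise.
    """
    X = (rng.standard_normal((m, n))
         + 1j * rng.standard_normal((m, n))) / np.sqrt(2.0)
    R_hat = X @ X.conj().T / n
    eigvals = np.linalg.eigvalsh(R_hat)  # real, ascending order
    rank = int(np.linalg.matrix_rank(R_hat))
    return eigvals, rank

m, n = 30, 20  # c = m / n = 1.5
eigvals, rank = sample_covariance_spectrum(m, n, np.random.default_rng(0))
num_zero = int(np.sum(np.abs(eigvals) < 1e-10))
```

Here `rank` equals n and `num_zero` equals m - n, so any quantity that divides by a product or sum of the smallest sample eigenvalues becomes degenerate when \(n<m\).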

Figure 5 shows the correct detection probabilities versus the number of sensors in the Gaussian and K-distributed noise. Figure 6 shows the correct detection probabilities versus the ratio c in the Gaussian and K-distributed noise. Figures 5 and 6 imply that the proposed method performs well in most situations as the ratio c becomes large. The RMT-AIC method performs better in some regions but is unstable.

Figure 7 shows the correct detection probabilities versus the parameter v of the Gamma distribution in K-distributed noise. The proposed method has the highest empirical correct detection probability among the compared methods for large n and m with \(n>m\).

From the above simulation results, we draw the following conclusions:

  1. All these methods perform well at high SNR and large sample size with \(n>m\) in the above types of noise.

  2. The BN-AIC and BIC-variant methods cannot work well in the high-dimensional case with \(n<m\).

  3. The EEE approach performs well in the Gaussian noise but degrades at low SNR or small sample size.

  4. The proposed method has the best detection performance among the compared methods in the Gaussian and non-Gaussian noise for the case of large dimensions.

5 Conclusion

This paper has developed an RMT-based method for signal enumeration in Gaussian and non-Gaussian noise when the sensor number and the sample size are both large. We estimate the signal number via an RMT-based GBIC built on a new test statistic for the sphericity hypothesis of the noise subspace covariance matrix. The proposed method provides higher correct detection probabilities than the existing methods in Gaussian and non-Gaussian noise settings. Moreover, the proposed method can achieve more accurate signal enumeration even when the sample size is less than the sensor number.

Availability of data and materials

All data generated during this study are included in this published article.

References

  1. J. Morales, S. Agaian, D. Akopian, Mitigating anomalous measurements for indoor wireless local area network positioning. IET Radar Sonar Navig. 10(7), 1220–1227 (2016)

  2. L. Xu, J. Li, P. Stoica, Target detection and parameter estimation for MIMO radar systems. IEEE Trans. Aerosp. Electron. Syst. 44(3), 927–939 (2008)

  3. H. Akaike, A new look at the statistical model identification. IEEE Trans. Autom. Control 19(6), 716–723 (1974)

  4. G. Schwarz, Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)

  5. J. Rissanen, Modeling by shortest data description. Automatica 14(5), 465–471 (1978)

  6. S. Valaee, P. Kabal, An information theoretic approach to source enumeration in array signal processing. IEEE Trans. Signal Process. 52(5), 1171–1178 (2004)

  7. P.L. Brockett, M. Hinich, G.R. Wilson, Nonlinear and non-Gaussian ocean noise. J. Acoust. Soc. Am. 82(4), 1386–1394 (1987)

  8. K.L. Blackard, T.S. Rappaport, C.W. Bostian, Measurements and models of radio frequency impulsive noise for indoor wireless communications. IEEE J. Sel. Areas Commun. 11(7), 991–1001 (1993)

  9. E. Ollila, D.E. Tyler, V. Koivunen, H.V. Poor, Complex elliptically symmetric distributions: survey, new results and applications. IEEE Trans. Signal Process. 60(11), 5597–5625 (2012)

  10. H. Asadi, B. Seyfe, Source number estimation via entropy estimation of eigenvalues (EEE) in Gaussian and non-Gaussian noise. arXiv preprint arXiv:1311.6051 (2013)

  11. G. Anand, P. Nagesha, Source number estimation in non-Gaussian noise. In: IEEE European Signal Processing Conference (EUSIPCO), pp. 1711–1715 (2014)

  12. F. Benaych-Georges, R.R. Nadakuditi, The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Adv. Math. 227(1), 494–521 (2011)

  13. S. Kritchman, B. Nadler, Non-parametric detection of the number of signals: hypothesis testing and random matrix theory. IEEE Trans. Signal Process. 57(10), 3930–3941 (2009)

  14. L. Huang, C. Qian, H.C. So, J. Fang, Source enumeration for large array using shrinkage-based detectors with small samples. IEEE Trans. Aerosp. Electron. Syst. 51(1), 344–357 (2015)

  15. B. Nadler, Nonparametric detection of signals by information theoretic criteria: performance analysis and an improved estimator. IEEE Trans. Signal Process. 58(5), 2746–2756 (2010)

  16. R.R. Nadakuditi, A. Edelman, Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples. IEEE Trans. Signal Process. 56(7), 2625–2638 (2008)

  17. L. Huang, H.C. So, Source enumeration via MDL criterion based on linear shrinkage estimation of noise subspace covariance matrix. IEEE Trans. Signal Process. 61(19), 4806–4821 (2013)

  18. L. Huang, Y. Xiao, K. Liu, H.C. So, J. Zhang, Bayesian information criterion for source enumeration in large-scale adaptive antenna array. IEEE Trans. Veh. Technol. 65(5), 3018–3032 (2016)

  19. E. Yazdian, S. Gazor, H. Bastani, Source enumeration in large arrays using moments of eigenvalues and relatively few samples. IET Signal Proc. 6(7), 689–696 (2012)

  20. Z. Lu, A.M. Zoubir, Generalized Bayesian information criterion for source enumeration in array processing. IEEE Trans. Signal Process. 61(6), 1470–1480 (2013)

  21. Y. Liu, X. Sun, S. Zhao, Source enumeration via GBIC with a statistic for sphericity test in white Gaussian and non-Gaussian noise. IET Radar Sonar Navig. 11(9), 1333–1339 (2017)

  22. E. Terreaux, J.-P. Ovarlez, F. Pascal, New model order selection in large dimension regime for complex elliptically symmetric noise. In: IEEE European Signal Processing Conference (EUSIPCO), pp. 1090–1094 (2017)

  23. D. Kelker, Distribution theory of spherical distributions and a location-scale parameter generalization. Sankhyā: Indian J. Stat. Ser. A, 419–430 (1970)

  24. I.M. Johnstone, On the distribution of the largest eigenvalue in principal components analysis. Ann. Stat. 29, 295–327 (2001)

  25. T.J. Fisher, X. Sun, C.M. Gallagher, A new test for sphericity of the covariance matrix for high dimensional data. J. Multivar. Anal. 101, 2554–2570 (2010)

  26. T.J. Fisher, On testing for an identity covariance matrix when the dimensionality equals or exceeds the sample size. J. Stat. Plan. Inference 142, 312–326 (2012)

  27. M.S. Srivastava, H. Yanagihara, T. Kubokawa, Tests for covariance matrices in high dimension with less sample size. J. Multivar. Anal. 130, 289–309 (2014)

  28. Z.D. Bai, D.D. Jiang, J.F. Yao, S.R. Zheng, Corrections to LRT on large-dimensional covariance matrix by RMT. Ann. Stat. 37, 3822–3840 (2009)

  29. M.S. Srivastava, Some tests concerning the covariance matrix in high dimensional data. J. Jpn. Stat. Soc. 35(2), 251–272 (2005)

  30. X.T. Tian, Y.T. Lu, W.M. Li, A robust test for sphericity of high-dimensional covariance matrices. J. Multivar. Anal. 141, 217–227 (2015)

Acknowledgements

None.

Funding

This work was partly supported by the Innovation Team of Pu’er University (CXTD019), the Guangxi Science and Technology Planning Project (2022AC21276), and the Young Backbone Teachers Training Project of Pu’er University.

Author information

Contributions

SY was involved in conceptualization, methodology, software, and writing. BZ contributed to supervision and review. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Bin Zhang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yuan, S., Zhang, B. An RMT-based generalized Bayesian information criterion for signal enumeration. EURASIP J. Adv. Signal Process. 2023, 50 (2023). https://doi.org/10.1186/s13634-023-01012-3

  • DOI: https://doi.org/10.1186/s13634-023-01012-3

Keywords