The upper bound of multi-source DOA information in sensor array and its application in performance evaluation

Direction of arrival (DOA) estimation has been discussed extensively in the array signal processing field. In this paper, the authors focus on the multi-source DOA information, which is defined as the mutual information between the DOA and the received signal contaminated by complex additive white Gaussian noise. A theoretical expression of DOA information with multiple sources is derived for the uniform linear array. At high SNRs and under the sparse-source assumption, an upper bound on the DOA information contained in K sparse sources is obtained; it can be regarded as the sum of all single-source information minus the uncertainty of the sources' order, log K!. Moreover, because of the uncertainty of the sources' order, the posterior probability distribution of the DOA no longer obeys a single-peak Gaussian distribution, so the mean square error (MSE) is unsuitable for evaluating the performance of multi-dimensional parameter estimation. Consequently, entropy error (EE) is used as a new performance evaluation metric, and its relationship with DOA information is given.

limit on estimation accuracy. Stoica and Nehorai [14] introduced stochastic and deterministic signal models and derived the general expressions for the corresponding CRBs in the multi-source case. Comparisons of multiple signal classification (MUSIC), ML, and CRB are presented in [15,16]. However, CRB is not a tight bound of MSE in the low SNR region [17]. When the received signal is given, the probability distribution of the DOA no longer obeys a Gaussian distribution in the low SNR region. Thus, using MSE (a second-order statistic) to evaluate the estimation results of an actual algorithm is insufficient when the SNR is low. In this paper, we use information theory to define a new performance evaluation metric for multi-source DOA estimation algorithms.
Information theory [18] was proposed by Shannon in 1948 and plays a fundamental role in information transmission, channel coding, data compression, etc. Like the communication system, the radar system and the sensor array system are both information acquisition systems. Woodward and Davies [19,20] utilized mutual information to investigate the measurement of a target's range. Xu [21] also employed the ideas and methods of Shannon's information theory to systematically establish an information theory for a radar system in the presence of complex Gaussian noise. However, existing investigations based on Shannon's information theory for DOA estimation mainly focus on the enumeration of source signals. Wax [22,23] introduced information-theoretic criteria into signal detection and proposed methods to estimate the number of sources. To the best of our knowledge, only a few researchers have employed information theory to address the performance analysis of DOA estimation. Xu and Yan [24] studied the spatial status estimation process with a sensor array from the perspective of information theory and provided the quantity of information obtained from the sensor array. In their study, the upper bound of DOA information in the single-source scenario is derived, the entropy error (EE) is defined to measure estimation performance, and the relationship between EE, MSE, and CRB is presented. However, their research is not yet complete in the multi-source scenario. In this paper, the study of DOA information is extended to the multi-source scenario.
The remainder of this paper is organized as follows. In Section 2, we review the DOA information theory, which includes the system model and the definition of DOA information. Then, a theoretical expression of DOA information in the multi-source scenario is obtained. At high SNRs and under the sparse-source assumption, an upper bound on the DOA information contained in K sparse sources is obtained; it can be regarded as the sum of all single-source information minus the uncertainty of the sources' order, log K!. Moreover, the expression of EE and its lower bound (EEB) in the multi-source scenario are obtained. We give the simulation comparison and discuss the obtained results in Section 3: the upper bound of DOA information is compared with the single-source DOA information, and the comparison between EE, EEB, MSE, and CRB in the dual-source case is presented. Section 4 concludes the paper.

System model
Suppose that K narrowband far-field sources impinge on a uniform linear antenna array with M elements, as shown in Fig. 1. The received signal at the mth array element, given by (1), is the superposition of the K source signals, each phase-shifted according to its propagation delay $\tau_m(\theta_k)$, plus noise. Here $s_k(t) = \alpha_k e^{j\phi_k}$ denotes the kth ($k = 1, 2, \cdots, K$) source signal, whose amplitude $\alpha_k$ is constant and whose phase $\phi_k$ is random; $w_0$ is the angular frequency of the carrier signal; and $w_m(t)$ stands for the complex additive white Gaussian noise (CAWGN) at the mth array element. The noise at different array elements is mutually independent. $\tau_m(\theta_k)$ represents the time delay of the kth source signal with DOA $\theta_k$ at the mth array element. If the distance between any two adjacent elements of the uniform linear array is d, the time delay can be expressed as $\tau_m(\theta_k) = m d \sin\theta_k / v$, where v is the propagation velocity of the signal.
Constructing a matrix equation based on (1) gives (2), in which $a(\theta_k)$ is the so-called transfer vector between the kth source and the received signal. Considering a single-snapshot scenario and omitting the time t, (2) can be rewritten without the time argument.
Assume that each source DOA is uniformly distributed within the observation interval $Q = [-\Theta/2, \Theta/2]$, where $\Theta$ is the width of the observation interval, so that the prior probability density function (PDF) of each DOA is $1/\Theta$ on Q. When the carrier frequency is very high, a small change in time delay leads to a large change in phase. Therefore, $\Phi$ is regarded as a random variable uniformly distributed on the interval $[0, 2\pi]$, so the prior PDF of each phase is $1/(2\pi)$ on that interval.
Next, note that the noise is CAWGN with covariance matrix $N_0 I$, i.e., $E\{w w^H\} = N_0 I$ for the noise vector $w$, where I is an identity matrix and $E\{\cdot\}$ denotes the expectation. $N_0$ is the power spectral density of the noise, which represents the noise power when the bandwidth is normalized. The signal-to-noise ratio (SNR) of the kth source is then defined as $\rho_k^2 = \alpha_k^2 / N_0$, where $\alpha_k^2$ is the power of the kth source.
We will derive the expression of DOA information in the following section.
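The single-snapshot model described above can be sketched numerically. The following is a minimal illustration, not the authors' code: the element indexing m = 0, ..., M-1, the sign of the phase term, the half-wavelength spacing, and the function names `steering_vector` and `snapshot` are all assumptions made here for illustration.

```python
import numpy as np

def steering_vector(theta, M, d_over_lambda=0.5):
    """Assumed transfer vector a(theta): element m carries phase m * 2*pi*d*sin(theta)/lambda."""
    m = np.arange(M)
    beta = 2.0 * np.pi * d_over_lambda * np.sin(theta)
    return np.exp(1j * m * beta)

def snapshot(thetas, alphas, M, N0, rng, d_over_lambda=0.5):
    """One snapshot x = A(theta) s + w with uniform random phases and CAWGN of power N0."""
    thetas = np.asarray(thetas, dtype=float)
    # Columns of A are the transfer vectors of the K sources (M x K matrix).
    A = np.stack([steering_vector(t, M, d_over_lambda) for t in thetas], axis=1)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=thetas.size)
    s = np.asarray(alphas, dtype=float) * np.exp(1j * phases)
    # E{w w^H} = N0 * I: real and imaginary parts each have variance N0 / 2.
    w = np.sqrt(N0 / 2.0) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    return A @ s + w, A

rng = np.random.default_rng(0)
x, A = snapshot(np.deg2rad([-5.0, 5.0]), [1.0, 1.0], M=32, N0=0.1, rng=rng)
```

With these settings the per-source SNR is $\alpha_k^2 / N_0 = 10$, i.e., 10 dB.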

DOA information
In this section, we will provide the theoretical expression of the DOA information. The DOA information is defined as the mutual information between the DOA and the received signal, i.e., $I(X; \Theta)$. We suppose the actual value of the DOA is $\theta_0 = [\theta_{10}, \theta_{20}, \cdots, \theta_{K0}]^T$. Considering CAWGN, the multi-dimensional PDF of X conditioned on $\Theta$ and $\Phi$ is given by (see (13)).
The joint probability density of X and $\Theta$ conditioned on $\Phi$ is obtained by multiplying (13) by the prior PDF of $\Theta$. Averaging over the uniform prior of $\Phi$ then gives the joint probability density of X and $\Theta$, and Bayes' rule gives the probability density of $\Theta$ conditioned on X. By omitting the terms independent of $\Theta$, this expression can be simplified to (17), where the function $g(x, \theta, \phi)$ is given by (18). Since the posterior probability density of $\Theta$ is given, the quantity of DOA information obtained from the multiple sources is the difference between the prior entropy and the conditional entropy of $\Theta$, i.e., $I(X; \Theta) = h(\Theta) - h(\Theta|X)$, where $h(\Theta)$ denotes the prior entropy of $\Theta$ and $h(\Theta|X)$ denotes the conditional entropy of $\Theta$ when X is obtained. Clearly, the DOA information is algorithm-independent. It therefore bounds the performance of any algorithm, which makes it a useful theoretical guide.
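The entropy difference above can be evaluated numerically once a posterior is available on a grid. The sketch below is an illustration under simplifying assumptions: it handles a one-dimensional posterior for a single realization x (the mutual information additionally averages the conditional entropy over X), and the helper name `doa_information_bits` is hypothetical. A Gaussian posterior is used only as a sanity check against its closed-form entropy.

```python
import numpy as np

def doa_information_bits(posterior, grid):
    """h(Theta) - h(Theta | x) in bits for a 1-D posterior sampled on a uniform grid."""
    dtheta = grid[1] - grid[0]
    p = posterior / (posterior.sum() * dtheta)      # normalise to a PDF
    prior_entropy = np.log2(grid[-1] - grid[0])     # uniform prior on the interval
    nz = p > 0                                      # skip zero-probability cells
    cond_entropy = -np.sum(p[nz] * np.log2(p[nz])) * dtheta
    return prior_entropy - cond_entropy

# Observation interval [-20 deg, 20 deg], expressed in radians.
grid = np.linspace(-np.pi / 9, np.pi / 9, 20001)
sigma = 1e-3                                        # illustrative posterior std (rad)
gauss = np.exp(-0.5 * (grid / sigma) ** 2)
info = doa_information_bits(gauss, grid)
closed_form = np.log2(grid[-1] - grid[0]) - 0.5 * np.log2(2 * np.pi * np.e * sigma**2)
```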

Upper bound of DOA information
The upper bound of DOA information in the single-source scenario was obtained in previous work [24]. In this section, we use some reasonable assumptions and approximations to derive the upper bound of DOA information in the multi-source scenario.

Posterior PDF of sparse multi-source
When the DOAs of the signal sources are close to each other, part of the DOA information is lost because of the interference between sources. Therefore, to obtain the maximum DOA information, we suppose there are K ($K \ll M$) independent sources with large spacing between any two sources to avoid this interference, i.e., the sparse sources assumption. Similar to the single-source scenario, $p(\theta|x)$ presents a Gaussian-like distribution centered on the actual location of the sources $\theta_0$. Thus, we derive $p(\theta|x)$ in the neighborhood of $\theta_0$.
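One way to see when the sparse sources assumption is reasonable is to compare the cross term $a^H(\theta_i) a(\theta_j)$ between two sources with the self term, which equals M. The sketch below does this for two sources at -5° and 5° with M = 32; the half-wavelength spacing and the element indexing m = 0, ..., M-1 are assumptions made for illustration.

```python
import numpy as np

M = 32
d_over_lambda = 0.5                      # assumed half-wavelength element spacing
thetas = np.deg2rad([-5.0, 5.0])
m = np.arange(M)[:, None]
# Columns are the assumed transfer vectors a(theta_k).
A = np.exp(1j * 2 * np.pi * d_over_lambda * m * np.sin(thetas)[None, :])
gram = A.conj().T @ A                    # entries a^H(theta_i) a(theta_j)
off_diag_ratio = np.abs(gram[0, 1]) / M  # cross term relative to the main lobe
```

For this geometry the cross term is a small fraction of M, which is the sense in which the interference between well-separated sources is neglected.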
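Clearly, in the multi-source case, $A^H(\theta) A(\theta)$ is the $K \times K$ matrix (20) whose $(i, j)$th entry is $a^H(\theta_i) a(\theta_j)$. Notice that when $i = j$, $a^H(\theta_i) a(\theta_j) = M$. Furthermore, for $i \neq j$, (21) behaves like a sinc function whose side lobes are quite small compared with the main lobe. Based on the sparse sources assumption, the off-diagonal entries are therefore negligible compared with M, so (20) can be approximated by $A^H(\theta) A(\theta) \approx M I$, which leads to (23). Substituting (23) in (18) gives a simplified expression for $g(x, \theta, \phi)$. In addition, for the actual received signal, we have $x = A(\theta_0) s_0 + w$, where $\theta_0$ is the actual value of the DOA and $s_0 = [\alpha_1 e^{j\phi_{10}}, \alpha_2 e^{j\phi_{20}}, \cdots, \alpha_K e^{j\phi_{K0}}]^T$. As with (20), $A^H(\theta) A(\theta_0)$ can be approximated when $\theta$ is in the neighborhood of $\theta_0$. In this case, $a^H(\theta_k) a(\theta_{k0})$ is the only element left in the kth row and the rest are approximated by 0, i.e., as in (27),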
where $*$ is the Hadamard product. Moreover, suppose that the signal amplitude of each source is equal, i.e., $\alpha_k = \alpha$. It follows that (see (28)), where $w'_m = w_m e^{-j\phi_{k0}}$ and $\phi'_k = \phi_k - \phi_{k0}$. We further obtain an expression in which $I_0\{\cdot\}$ is the zero-order modified Bessel function of the first kind [25], $\beta_k = 2\pi d \sin\theta_k / \lambda$, and $\beta_{k0} = 2\pi d \sin\theta_{k0} / \lambda$. $G(\theta_k)$ can be regarded as the influence of the signal on the posterior probability density of $\Theta$, and the accompanying noise term can be regarded as the influence of the noise on it. Therefore, under the sparse sources condition, Eq. (17) can be rewritten in terms of these signal and noise components.
In the single-source scenario, the DOA information is known to approach an upper bound as the SNR increases, and the closed-form expression of that upper bound was derived under the high-SNR condition. We therefore adopt the same condition to derive the upper bound of multi-source DOA information. Since the posterior PDF is composed of signal and noise components, at high SNR we can neglect the noise components to approximate $p(\theta|x)$ when $\theta$ is in the neighborhood of $\theta_0$. Moreover, $p(\theta|x)$ tends to 0 outside the neighborhood. Thus, we have a high-SNR approximation of $p(\theta|x)$ in which $\kappa$ is a normalizing constant. To obtain the approximation of DOA information, we approximate $|G(\theta_k)|$ using the Taylor series expansion at $\theta_k = \theta_{k0}$ (35), where $\mathcal{L}^2 = \pi^2 L^2 / 3$, $\mathcal{L}$ is the root mean square aperture width, $L = Md/\lambda$ denotes the normalized aperture width, and $\cos\theta_{k0}$ is the direction cosine of the sensor array. Substituting (35) in (34) and using the expansion of the Bessel function, the approximation of (34) is Gaussian in each $\theta_k$ with variance $\sigma_k^2 = (2 M \rho^2 \mathcal{L}^2 \cos^2\theta_{k0})^{-1}$, where $\rho^2 = \alpha^2 / N_0$.
Using the expression of $p(\theta|x)$, we can observe the posterior PDF through numerical calculation. Taking the dual-source scenario as an example, the actual value of the DOA is set as $\theta_0 = [\theta_{10}, \theta_{20}]^T$. As shown in Fig. 2, the posterior PDF presents a two-dimensional probability distribution with two peaks, which are located at $\theta = [\theta_{10}, \theta_{20}]^T$ and $\theta = [\theta_{20}, \theta_{10}]^T$, respectively.
Since the order of the sources is not determined and the K elements of $\theta_0 = [\theta_{10}, \theta_{20}, \cdots, \theta_{K0}]^T$ have K! different permutations, the posterior probability distribution presents a K-dimensional probability distribution with K! peaks when there are K sources.
To facilitate further derivation, we introduce the concept of a permutation matrix. Let $\pi_l$ represent one of the permutations of $[1, \cdots, K]$, where $l = 1, 2, \cdots, K!$. The permutation matrix of $\pi_l$ is written as $P_{\pi_l}$. Then, the corresponding permutation of $\theta_0$ can be represented as $P_{\pi_l} \theta_0$.
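The K! peak locations $P_{\pi_l} \theta_0$ can be enumerated directly. The following short sketch uses three illustrative DOA values (not taken from the paper) and builds each permutation matrix by reordering the rows of an identity matrix.

```python
import itertools
import math
import numpy as np

theta0 = np.deg2rad([-5.0, 5.0, 12.0])   # illustrative true DOAs, K = 3
K = theta0.size
peaks = []
for perm in itertools.permutations(range(K)):
    P = np.eye(K)[list(perm)]             # permutation matrix for this ordering
    peaks.append(P @ theta0)              # peak location P theta0
peaks = np.array(peaks)                   # shape (K!, K)
```

Each row of `peaks` is one reordering of the same K angles, so the posterior has K! = 6 peaks in this example.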
According to the numerical results, the posterior probability is mainly concentrated in the neighborhoods of the points $P_{\pi_l} \theta_0$. In the neighborhood of $P_{\pi_l} \theta_0$, $a^H(\theta_k) a(\theta_{\pi_l(k)0})$ is the only element left in the kth row and the rest are approximated by zero, so (27) can be rewritten accordingly. The subsequent derivation is the same as (28)-(37). Therefore, the corrected expression of the posterior PDF in the neighborhood of $P_{\pi_l} \theta_0$ is given by (39), where $\kappa \approx 1/K!$ because $p(\theta|x)$ presents a K-dimensional probability distribution with K! identical peaks when $\alpha_k = \alpha$. The remaining factor in (39) is the PDF of a K-dimensional Gaussian distribution with the corresponding covariance matrix.

Upper bound of DOA information
Then, we divide the domain of integration into K! domains centered on each peak. The PDF in the neighborhood of each peak is given by (39). For convenience, we extend the integral over each domain to the whole space. The error caused by this approximation is acceptable because the value of the Gaussian distribution outside the neighborhood of each peak is close to zero. The resulting conditional entropy is given by (41). By substituting (41) in (19), we obtain an approximation of the upper bound of DOA information (43), where the first term is the sum of the DOA information of every single source and the second term is the loss of information due to the uncertainty of the sources' order. This is our main result. Moreover, the upper bound of DOA information for source signals with random amplitudes can be obtained simply by treating the SNR in (43) as random and taking the expectation of (43). For example, the upper bound of DOA information for source signals with Rayleigh-distributed amplitudes is given by (44), where $\gamma$ is the Euler-Mascheroni constant.
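The structure of the bound (sum of single-source terms minus log K!) can be illustrated with a short calculation. This is a sketch under stated assumptions rather than a transcription of (43): each single-source term is taken to be the uniform prior entropy $\log_2 \Theta$ minus the entropy of a Gaussian with the variance $\sigma_k^2$ defined earlier, angles are in radians, and the function names are hypothetical.

```python
import math

def single_source_bound_bits(theta0, M, rho2, L, Theta):
    """Assumed single-source term: log2(Theta) - 0.5 * log2(2*pi*e*sigma^2)."""
    lrms2 = math.pi**2 * L**2 / 3.0                      # mean square aperture width
    sigma2 = 1.0 / (2.0 * M * rho2 * lrms2 * math.cos(theta0) ** 2)
    return math.log2(Theta) - 0.5 * math.log2(2.0 * math.pi * math.e * sigma2)

def multi_source_bound_bits(theta0s, M, rho2, L, Theta):
    """Sum of single-source terms minus the order uncertainty log2(K!)."""
    singles = sum(single_source_bound_bits(t, M, rho2, L, Theta) for t in theta0s)
    return singles - math.log2(math.factorial(len(theta0s)))

M, L = 32, 16.0                      # L = M * d / lambda with assumed d = lambda / 2
rho2 = 100.0                         # 20 dB per-source SNR (illustrative)
Theta = math.radians(40.0)           # observation interval [-20 deg, 20 deg]
theta0s = [math.radians(-5.0), math.radians(5.0)]
singles_sum = sum(single_source_bound_bits(t, M, rho2, L, Theta) for t in theta0s)
joint = multi_source_bound_bits(theta0s, M, rho2, L, Theta)
```

For K = 2 the gap `singles_sum - joint` is exactly $\log_2 2! = 1$ bit, regardless of the SNR chosen.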

Entropy error and MSE
We know that the conditional entropy h (Θ|X) represents the uncertainty of Θ when the received signal is given. As the SNR increases, the conditional entropy continues to shrink, indicating that the estimation is more accurate. In other words, the conditional entropy or DOA information (when the prior entropy of Θ is determined) can be considered as a performance metric of DOA estimation.
In the case of the single source [24], EE is defined as the entropy power of the posterior probability distribution to measure the theoretical performance of DOA estimation. In this section, we will discuss the relationship between EE and MSE in the multi-source case.
Firstly, EE is defined as the entropy power of $p(\theta|x)$, which is given by (45) in the K-source case. We can learn from (45) that once the sensor array obtains 1 bit of DOA information, the entropy deviation $\sigma_{EE}$ is reduced by half.
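The halving property can be checked with the standard one-dimensional entropy power, $\sigma_{EE}^2 = 2^{2h} / (2\pi e)$ with h in bits. This form is an assumption used here for illustration; the K-source normalization in (45) may differ.

```python
import math

def entropy_deviation(cond_entropy_bits):
    """sigma_EE from the assumed 1-D entropy power 2^(2h) / (2*pi*e), h in bits."""
    return math.sqrt(2.0 ** (2.0 * cond_entropy_bits) / (2.0 * math.pi * math.e))

sigma = 0.01                                                  # illustrative std (rad)
h_gauss = 0.5 * math.log2(2 * math.pi * math.e * sigma**2)    # Gaussian entropy, bits
sigma_ee = entropy_deviation(h_gauss)                         # recovers sigma
sigma_ee_after_one_bit = entropy_deviation(h_gauss - 1.0)     # one more bit of information
```

For a Gaussian posterior the entropy deviation equals the standard deviation, and lowering the conditional entropy by 1 bit halves it.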
In the previous section, we derived the conditional entropy of the DOA. Since (41) approximates the conditional entropy in the high SNR region, substituting (41) into (45) gives an approximation of EE, namely the EE lower bound (EEB), in which the factor K! reflects the uncertainty of the sources' order. As mentioned in the introduction, MSE is usually used to evaluate the performance of DOA estimation algorithms. Xu and Yan [24] pointed out the limitation of MSE at medium and low SNRs. Here, we discuss the limitations of MSE in the multi-source case. The MSE of N DOA estimates for K sources is calculated under the condition that the sources' order is determined. It therefore applies only to one-dimensional matching multi-source DOA estimation. When multi-dimensional matching estimation is used to improve the angular resolution [26], the DOA estimate $\hat{\theta}$ presents a K-dimensional probability distribution with K! peaks when there are K sources, similar to Fig. 2, because of the uncertainty of the sources' order. Consequently, MSE is no longer applicable for evaluating the estimation performance.
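The effect of the order ambiguity on MSE can be illustrated with a toy experiment. This is a hedged sketch, not the paper's simulation: estimates are drawn as the true DOAs plus small Gaussian errors, half of the trials are returned in swapped order, and the squared error summed over sources is used as an assumed form of the MSE.

```python
import numpy as np

rng = np.random.default_rng(1)
theta0 = np.array([-5.0, 5.0])                 # true DOAs in degrees
N, sigma = 10000, 0.2                          # trials and per-source error std (deg)
est = theta0 + sigma * rng.standard_normal((N, 2))
swap = rng.random(N) < 0.5                     # order unknown: swap about half the trials
est[swap] = est[swap][:, ::-1]

# MSE computed as if the sources' order were known.
mse_fixed_order = np.mean(np.sum((est - theta0) ** 2, axis=1))
# MSE after matching each estimate to the nearest true DOA (sorting suffices here).
mse_sorted = np.mean(np.sum((np.sort(est, axis=1) - theta0) ** 2, axis=1))
```

The fixed-order MSE is dominated by the swapped trials and says little about the actual angular accuracy, which is the limitation discussed above.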
CRB provides the best accuracy achievable by any unbiased estimator of the signal parameters and thus a fundamental physical limit on estimation accuracy. The expression of the CRB in the multi-source scenario is given in [15]. As the theoretical lower bound of MSE, the CRB can likewise only be used as the lower bound of multi-source DOA estimation accuracy in one-dimensional matching estimation. Moreover, the relationship between EEB and CRB is shown in Fig. 5.

Results and discussion
In this section, we provide numerical results to illustrate the theoretical results in the multi-source scenario with CAWGN. Taking the dual-source scenario as an example, we consider the reflection coefficients $\alpha_1 = \alpha_2 = 1$, and the phase follows a uniform distribution on the interval $[0, 2\pi]$. Since there are only two sources, the sparse-source assumption is still valid when the observation interval is small. To reduce the calculation time, the observation interval of the DOA is set as [−20°, 20°]. Moreover, $\theta_{10}$ and $\theta_{20}$ are located at −5° and 5°, respectively, and the number of array elements M is set to 32 in general. In the random source signal amplitudes scenario, the reflection coefficient follows a Rayleigh distribution. The other conditions are the same as in the constant source signal amplitudes scenario.

Upper bound of DOA information
In this subsection, we present the simulation results of DOA information and its upper bound in both the constant amplitudes scenario and the random amplitudes scenario, which are shown in Figs. 3 and 4, respectively.
It can be seen from the two figures that the theoretical value of DOA information approaches the derived upper bound in the high SNR region, which supports the correctness of the derivation. Numerically, the sum of the DOA information of two single sources is 1 bit more than the joint DOA information obtained by the two-dimensional search. As we mentioned in the explanation of (43), the 1 bit loss of information is caused by the uncertainty of the sources' order. It can be concluded from the simulation results that the DOA information of multiple independent sources can be calculated as the sum of all single-source information minus the uncertainty of the sources' order, log K!.

Fig. 3 DOA information in constant amplitudes scenario

Comparison of EE, EEB, MSE, and CRB
Next, we compare EE, EEB, MSE, and CRB for various SNRs through simulation to show their relationship in the dual-source scenario. The EE is calculated by substituting under the condition of the determined sources' order. The empirical EE of the ML algorithm is obtained based on the probability distribution of the DOA estimate $\hat{\theta}$. For the convenience of comparison, CRB will be further calculated by $|\mathrm{CRB}(\theta)|$.
According to the result of the simulation shown in Fig. 5, we find MSE decreases monotonically with increasing SNR and tends to CRB in the high SNR region. Similarly, EE decreases monotonically with increasing SNR and tends to EEB. However, unlike the single-source scenario, EEB does not coincide with CRB in the multi-source scenario. The difference is caused by the uncertainty of sources' order. The empirical EE of the ML algorithm is bigger than and close to the theoretical EE. Moreover, when the sources' order cannot be determined, CRB is unreachable, and EEB is a more reasonable theoretical bound.
So far, we have pointed out two limitations of MSE:
1) The posterior probability distribution of $\Theta$ no longer obeys a Gaussian distribution at medium and low SNRs, so MSE, as a second-order statistic, is invalid when the SNR is low [24]. This limitation still holds in the multi-source scenario.
2) MSE cannot reflect the uncertainty of the sources' order in multi-dimensional matching multi-source DOA estimation.
EE avoids both of these limitations; it is more suitable to be used as an evaluation metric in medium and low SNRs and the multi-source case.

Conclusions
One of the significant findings to emerge from this study is that the upper bound of DOA information contained in K sparse sources can be regarded as the sum of all single-source information minus the uncertainty of the sources' order, log K!. The second major finding is that MSE is no longer applicable for evaluating the estimation performance in multi-dimensional matching multi-source DOA estimation. Specifically, considering the uncertainty of the sources' order, the DOA estimate $\hat{\theta}$ presents a K-dimensional probability distribution with K! peaks when there are K sources. Consequently, entropy error (EE) is defined as a new performance evaluation metric, and its lower bound is given.
In addition, EEB can be regarded as the generalized CRB considering the sources' order in the multi-source scenario.
The main conclusions of this paper are obtained under the conditions of high SNR and sparse sources. Nevertheless, the findings provide guidance for further study of multi-source DOA information and estimation performance evaluation in more general scenarios. Further investigations will be undertaken in future work to complete the DOA information theory in other scenarios.