
Robust direct position determination methods in the presence of array model errors

Abstract

Direct position determination (DPD) methods are known to have many advantages over the traditional two-step localization method, especially for low signal-to-noise ratios (SNR) and/or short data records. However, similar to conventional direction-of-arrival (DOA) estimation methods, the performance of DPD estimators can be dramatically degraded by inaccuracies in the array model. In this paper, we present robust DPD methods that can mitigate the effects of these uncertainties in the array manifold. The proposed technique is related to the classical auto-calibration procedure under the assumption that prior knowledge of the array response errors is available. Localization is considered for the cases of both unknown and a priori known transmitted signals. The corresponding maximum a posteriori (MAP) estimators for these two cases are formulated, and two alternating minimization algorithms are derived to determine the source location directly from the observed signals. The Cramér-Rao bounds (CRBs) for position estimation are derived for both unknown and known signal waveforms. Simulation results demonstrate that the proposed algorithms are asymptotically efficient and very robust to array model errors.

1 Introduction

Emitter localization using direction-of-arrival (DOA) measurements [1,2,3] has received significant attention because of its importance in fields such as radar, sonar, seismology, vehicle navigation, and wireless communications. In this type of localization system, a single moving observer or multiple stationary observers are used to determine the emitter position. Generally, each base station is equipped with an antenna array that can be used to estimate the angle of arrival of the transmitted signals. With multiple DOA estimates from different observer locations, it is possible to locate the source [4,5,6]. This emitter localization procedure is called two-step location. In the first step, the signal parameters (e.g., DOA [1,2,3,4,5,6], time difference of arrival (TDOA) [7, 8], time of arrival (TOA) [9, 10], frequency difference of arrival (FDOA) [11, 12], frequency of arrival (FOA) [13], received signal strength (RSS) [14], gain ratios of arrival (GROA) [15]) are obtained independently from the intercepted signals by spatially separated sensors. In the second step, the measurements of all sensors are transferred to a central unit and the transmitter location is estimated. Note that this two-step procedure is also known as the decentralized approach [16].

It is worth mentioning that, although the two-step approach is extensively used in modern localization systems, it is only a suboptimal position determination technique. This is because the signal metrics extracted from the waveforms ignore the constraint that all measurements must correspond to the same transmitter. Consequently, information loss between the two steps is inevitable. Admittedly, the extended invariance principle (EXIP) [17] can be used to show that the two-step procedure provides an asymptotically efficient estimate under certain conditions; in practice, however, these conditions are not easily satisfied.

To improve the localization accuracy of two-step location methods, the direct position determination (DPD) technique has been proposed and extensively developed. DPD is a centralized and single-step location technique in which the estimator uses exactly the same signal samples as in two-step methods, but estimates the source location directly, skipping the intermediate (first) step. This can be viewed as searching for the emitter location that best explains the collected data. In general, the DPD method is superior to the classical two-step methods for low signal-to-noise ratios (SNR) and/or short data records. Moreover, DPD can be applied to many wireless positioning systems. In particular, DPD methods for locating a narrowband radio emitter based on Doppler shift [18, 19] and for locating a wideband source using a time delay metric [20,21,22] have been presented. Additionally, DPD estimators using both the Doppler effect and the relative delay have been developed [23,24,25,26]. In the DPD methods mentioned above, several platforms each carrying single-antenna receivers are used for source location; hence, the DOA information of the impinging signals cannot be exploited. To use such signal metrics, DPD methods based on multiple static stations, each equipped with an antenna array, have been proposed [27]. This single-step location technique models the array response as a function of the source position, requiring only a two-dimensional search for planar geometry and a three-dimensional search for the general case. Following the methods in [27], other DPD estimators for special localization scenarios have been reported in the literature. Specifically, DPD methods for multiple-source scenarios are studied in [28, 29], and some high-resolution DPD methods are proposed in [30,31,32]. Additionally, DPD methods tailored to special signal structures (e.g., orthogonal frequency division multiplexing signals, cyclostationary signals, noncircular signals, and intermittent emissions) have been developed [33,34,35,36,37]. Note that the experimental results in [18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37] demonstrate that the single-step approach outperforms the two-step method in scenarios with low SNR and/or relatively few data records.

High-resolution DOA estimation methods are known to be sensitive to errors in the array response model [38,39,40,41,42], because these approaches require exact knowledge of the array manifold. The DPD methods using an antenna array also rely on the accuracy of the array model. As a result, it is reasonable to expect their localization accuracy to deteriorate significantly in the event of an array model mismatch. In [43,44,45], a closed-form expression for the mean square error (MSE) of the DPD estimator in the presence of array model errors was derived for the case of a known signal waveform. In practice, however, the signal waveform is rarely known. Consequently, an alternative statistical analysis of the DPD method in the presence of array model errors was performed [46] under the assumption that the signal waveform is unknown. All the theoretical and experimental results in [43,44,45,46] demonstrate that the accuracy loss caused by array model errors is considerable. Hence, a new DPD technique that accounts for array model errors is needed to improve emitter location accuracy. However, to the best of our knowledge, very few studies have considered this topic. This paper presents robust DPD methods that can reduce the impact of uncertainties in the array manifold. The proposed technique is similar to the traditional auto-calibration procedure in the field of array signal processing and assumes that certain prior knowledge of the array response errors is available. This is a reasonable assumption in most applications, and it allows for more general perturbation models. We consider two different localization cases: the case of a priori known signal waveforms and the more realistic case where the transmitted signals are unknown to the location systems. The corresponding maximum a posteriori (MAP) estimators for the two cases are formulated, and two alternating minimization algorithms are developed to estimate the emitter position directly from the received signal samples. Hence, the proposed methods follow the Bayesian framework [47,48,49,50]. Additionally, to verify the asymptotic efficiency of the new methods, the Cramér-Rao bounds (CRBs) for position estimation are derived for both unknown and known signal waveforms.

The remainder of this paper is organized as follows. Section 2 describes the methods and experimental procedures used in this paper. In Section 3, the signal model for DPD is formulated and the array error model is discussed. Section 4 presents a robust DPD method in the presence of array model errors when the signal waveform is unknown. In Section 5, an alternative robust DPD method in the presence of uncertainties in the array manifold is proposed for the case of known transmitted signals. In Section 6, the CRB expressions for position estimation are derived for both unknown and known signal waveforms. Simulation results are reported in Section 7. Conclusions are drawn in Section 8. The proofs of the main results are given in Appendices 1, 2, 3, 4, and 5.

2 Methods/experimental

This study concerns the field of emitter localization. Its aim is to propose robust DPD methods that mitigate the effects of uncertainties in the array manifold. The methods are designed according to the MAP criterion. We consider two different localization cases: the case of a priori known signal waveforms and the more realistic case where the transmitted signals are unknown to the location systems. Numerical optimization techniques are applied, and two alternating minimization algorithms are presented to identify the unknown parameters. To verify the asymptotic efficiency of the new methods, we derive the CRBs on position estimation for both unknown and known signal waveforms.

We perform a set of Monte Carlo simulations to examine the behavior of the proposed robust DPD methods. The root mean square error (RMSE) of the position estimates is employed to assess and compare performance. All simulation results are averaged over 2000 independent runs. We compare our algorithms with the algorithms in [27] and the traditional two-step localization algorithms, as well as with the CRBs for unknown and known signal waveforms. All experiments are conducted for 3D localization.

All data and procedures in this paper comply with the ethical standards of the research community. This paper does not contain any studies with human participants or animals performed by any of the authors.

3 Problem formulation

3.1 Signal model for direct position determination

3.1.1 Time-domain signal model

Consider a stationary radio emitter and N base stations that can intercept the transmitted signal. Each base station is equipped with an antenna array composed of M sensors. The transmitter’s position is denoted by an L × 1 vector of coordinates p. In this paper, we consider the three-dimensional (3D) scenario, in which \( \mathbf{p}={\left[{p}_x\kern0.5em {p}_y\kern0.5em {p}_z\right]}^{\mathrm{T}} \) and L = 3. The complex envelope of the signal observed by the nth base station is modeled by

$$ {\mathbf{x}}_n(t)={\beta}_n{\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)s\left(t-{\tau}_n\left(\mathbf{p}\right)-{t}_0\right)+{\boldsymbol{\upvarepsilon}}_n(t)\kern1em \left(1\le n\le N\right) $$
(1)

where

  • \( {\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right) \) is the array response of the nth station to a signal transmitted from position p,

  • \( s\left(t-{\tau}_n\left(\mathbf{p}\right)-{t}_0\right) \) is the signal waveform transmitted at unknown time \( {t}_0 \) and delayed by \( {\tau}_n\left(\mathbf{p}\right) \),

  • \( {\tau}_n\left(\mathbf{p}\right) \) is the signal propagation time from the emitter to the nth base station (i.e., distance divided by signal propagation speed),

  • \( {\beta}_n \) is an unknown complex scalar representing the channel attenuation between the emitter and the nth base station,

  • \( {\boldsymbol{\upvarepsilon}}_n(t) \) denotes temporally white, circularly symmetric complex Gaussian noise with zero mean and covariance matrix \( {\sigma}_{\varepsilon}^2{\mathbf{I}}_M \),

  • \( {\boldsymbol{\upmu}}_n \) represents the perturbed array parameters and is used to model the uncertainty in the array steering vector.

We assume that the observation vector \( {\mathbf{x}}_n(t) \) is sampled with period T. The kth data sample can then be expressed as

$$ {\mathbf{x}}_{n,k}={\beta}_n{\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)s\left( kT-{\tau}_n\left(\mathbf{p}\right)-{t}_0\right)+{\boldsymbol{\upvarepsilon}}_{n,k}\kern1em \left(1\le k\le K\right) $$
(2)

where K is the number of snapshots.
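To make the sampled model concrete, the following minimal sketch (not the authors' code) simulates (2) for a single base station. The half-wavelength uniform linear array, the waveform s(t), and all numerical values are illustrative assumptions; the paper leaves the array geometry and waveform general.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, T = 8, 64, 1e-6                     # sensors, snapshots, sampling period
c = 3e8                                   # propagation speed (m/s)

p = np.array([300.0, 400.0, 50.0])        # emitter position (L = 3)
q = np.array([0.0, 0.0, 0.0])             # base station position (assumed)
tau = np.linalg.norm(p - q) / c           # propagation delay tau_n(p)
t0 = 2e-6                                 # unknown transmit time

def steering(p, mu):
    """Hypothetical half-wavelength ULA; mu holds per-sensor phase errors."""
    az = np.arctan2(p[1] - q[1], p[0] - q[0])
    phase = np.pi * np.arange(M) * np.sin(az)
    return np.exp(1j * (phase + mu))

def s(t):                                 # illustrative baseband waveform
    return np.exp(2j * np.pi * 5e4 * t)

mu = 0.05 * rng.standard_normal(M)        # perturbed array parameters mu_n
beta = 0.8 * np.exp(0.3j)                 # channel attenuation beta_n
sigma_eps = 0.1

k = np.arange(1, K + 1)
X = beta * np.outer(steering(p, mu), s(k * T - tau - t0))   # M x K data, (2)
X += sigma_eps / np.sqrt(2) * (rng.standard_normal((M, K))
                               + 1j * rng.standard_normal((M, K)))
```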

3.1.2 Frequency-domain signal model

In order to determine the emitter position directly from the received signal samples, it is desirable to separate the propagation delay τ n (p ) and transmit time t0 from the signal waveform. This is easy when using the frequency-domain representation of the problem. Taking the discrete Fourier transform (DFT) of (2), we get

$$ {\overline{\mathbf{x}}}_{n,k}={\beta}_n{\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right){\overline{s}}_k\cdot \exp \left\{-\mathrm{j}{\omega}_k\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\}+{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}\kern1em \left(1\le n\le N\kern0.5em ;\kern0.5em 1\le k\le K\right) $$
(3)

where

  • \( {\omega}_k=2\pi \left(k-1\right)/(KT) \) is the kth known discrete frequency point,

  • \( {\overline{s}}_k \) is the kth Fourier coefficient of the unknown signal corresponding to frequency ω k ,

  • \( {\overline{\boldsymbol{\upvarepsilon}}}_{n,k} \) is the kth Fourier coefficient of the random noise corresponding to frequency ω k .

As the DFT is an orthogonal linear transformation, the probability distribution of the noise vector \( {\overline{\boldsymbol{\upvarepsilon}}}_{n,k} \) is the same as that of \( {\boldsymbol{\upvarepsilon}}_{n,k} \), with first- and second-order moments given by

$$ \left\{\begin{array}{l}\mathrm{E}\left[{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}\right]={\mathbf{O}}_{M\times 1}\\ {}\mathrm{E}\left[{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}^{\mathrm{T}}\right]={\mathbf{O}}_{M\times M}\kern0.5em ,\kern0.5em \mathrm{E}\left[{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}^{\mathrm{H}}\right]={\sigma}_{\varepsilon}^2{\mathbf{I}}_M\\ {}\mathrm{E}\left[{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}{\overline{\boldsymbol{\upvarepsilon}}}_{n,l}^{\mathrm{T}}\right]=\mathrm{E}\left[{\overline{\boldsymbol{\upvarepsilon}}}_{n,k}{\overline{\boldsymbol{\upvarepsilon}}}_{n,l}^{\mathrm{H}}\right]={\mathbf{O}}_{M\times M}\end{array}\right.\kern1em \left(1\le k\kern0.5em ,\kern0.5em l\le K\kern0.5em ;\kern0.5em k\ne l\right) $$
(4)

Note that the robust DPD methods proposed in this paper are derived from the frequency-domain signal model (3). For notational convenience, we introduce the following three vectors

$$ \left\{\begin{array}{l}{\overline{\mathbf{x}}}_n={\left[{\overline{\mathbf{x}}}_{n,1}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{n,2}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{n,K}^{\mathrm{H}}\right]}^{\mathrm{H}}\kern0.5em ,\kern0.5em \boldsymbol{\upbeta} ={\left[{\boldsymbol{\upbeta}}_1\kern1.5em {\boldsymbol{\upbeta}}_2\kern1.5em \cdots \kern1.5em {\boldsymbol{\upbeta}}_N\right]}^{\mathrm{T}}\\ {}\overline{\mathbf{s}}={\left[{\overline{s}}_1\cdot \exp \left\{-\mathrm{j}{\omega}_1{t}_0\right\}\kern1.5em {\overline{s}}_2\cdot \exp \left\{-\mathrm{j}{\omega}_2{t}_0\right\}\kern1.5em \cdots \kern1.5em {\overline{s}}_K\cdot \exp \left\{-\mathrm{j}{\omega}_K{t}_0\right\}\right]}^{\mathrm{T}}\end{array}\right. $$
(5)
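The frequency-domain quantities in (3)–(5) can be obtained with a standard FFT. The sketch below uses a random stand-in for the time samples of (2); the unitary scaling is one convenient way to preserve the noise moments (4), and the stacking order matches (5):

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, T = 8, 64, 1e-6
X = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))  # stand-in
                                          # for the M x K time samples of (2)
Xf = np.fft.fft(X, axis=1) / np.sqrt(K)   # unitary DFT keeps the noise white
xbar_n = Xf.T.reshape(-1)                 # stacked MK x 1 vector of (5)
omega = 2 * np.pi * np.arange(K) / (K * T)   # omega_k = 2*pi*(k-1)/(KT)
```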

3.2 Array error model

This subsection describes the array error model. Following the discussion in [47, 48, 50], \( {\boldsymbol{\upmu}}_n \) is modeled as a real random vector composed of the array parameters that are subject to perturbations. A nominal value of \( {\boldsymbol{\upmu}}_n \), denoted by \( {\boldsymbol{\upmu}}_{n,0} \), is known, resulting in a nominal array manifold \( {\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right) \); the actual array response may therefore differ from the nominal one. We denote the length of \( {\boldsymbol{\upmu}}_n \) by \( {Q}_n \).

It is known that \( {\boldsymbol{\upmu}}_n \) may include either structured parameters, such as the sensor gain, phase, position, and/or mutual coupling, or unstructured parameters. Generally, the perturbation parameter vector \( {\boldsymbol{\upmu}}_n \) can be modeled as a Gaussian random vector with mean \( \mathrm{E}\left[{\boldsymbol{\upmu}}_n\right]={\boldsymbol{\upmu}}_{n,0} \) and covariance

$$ \mathrm{E}\left[\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right){\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}\right]={\boldsymbol{\Omega}}_n\kern1em \left(1\le n\le N\right) $$
(6)

Moreover, \( {\boldsymbol{\upmu}}_{n_1} \) and \( {\boldsymbol{\upmu}}_{n_2} \) are statistically independent for \( {n}_1\ne {n}_2 \). Note that the covariance matrices \( {\left\{{\boldsymbol{\Omega}}_n\right\}}_{1\le n\le N} \) are also assumed to be known. These matrices can be determined, for example, using sample statistics from a number of independent, identical calibration experiments or using tolerance data specified by the sensor manufacturer [51].
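For illustration, perturbed parameter vectors with the prior (6) can be drawn as in the following sketch; the number of stations, the parameter dimension, and the diagonal covariances are assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(1)
N, Q = 4, 8                                   # stations, parameters per array
mu0 = [np.zeros(Q) for _ in range(N)]         # nominal values mu_{n,0}
Omega = [0.05**2 * np.eye(Q) for _ in range(N)]   # known covariances Omega_n
mu = [rng.multivariate_normal(mu0[n], Omega[n]) for n in range(N)]  # eq. (6)
```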

In the next sections, the a priori information in the form of a probability distribution is used to develop two robust DPD methods with regard to array model mismatch. The first DPD method covers the common case of signals with unknown waveforms, and the second one is applicable to the less common case of signals with known waveforms. For notational simplicity, we define the following two vectors

$$ \left\{\begin{array}{l}{\boldsymbol{\upmu}}_0={\left[{\boldsymbol{\upmu}}_{1,0}^{\mathrm{H}}\kern1.5em {\boldsymbol{\upmu}}_{2,0}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\boldsymbol{\upmu}}_{N,0}^{\mathrm{H}}\right]}^{\mathrm{H}}\\ {}\boldsymbol{\upmu} ={\left[{\boldsymbol{\upmu}}_1^{\mathrm{H}}\kern1.5em {\boldsymbol{\upmu}}_2^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\boldsymbol{\upmu}}_N^{\mathrm{H}}\right]}^{\mathrm{H}}\end{array}\right. $$
(7)

4 Robust direct position determination method for case of unknown signal waveform

In this section, we propose a new DPD method that is robust to array model errors and assumes that the signal waveforms are unknown. The optimal MAP estimator for the problem at hand is formulated, and a computationally efficient numerical algorithm is derived.

4.1 Optimization criterion for maximum a posteriori estimator

If the transmitted signals are unknown, the parameters to be determined involve p, \( {\left\{{\overline{s}}_k\right\}}_{1\le k\le K} \), t0, β, and μ. From (5), both \( {\left\{{\overline{s}}_k\right\}}_{1\le k\le K} \) and t0 are contained in the vector \( \overline{\mathbf{s}} \). Hence, the estimation of \( {\left\{{\overline{s}}_k\right\}}_{1\le k\le K} \) and t0 can be replaced by the estimation of \( \overline{\mathbf{s}} \).

When deriving the MAP estimator, the a priori distribution of {μ n }1 ≤ n ≤ N must be exploited. Following [47,48,49,50], the joint estimation of p, \( \overline{\mathbf{s}} \), β, and μ is then obtained as

$$ \underset{\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu}}{\min }{J}_{\mathrm{U}\hbox{-} \mathrm{MAP}}\left(\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)=\underset{\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu}}{\min}\left\{{J}_{\mathrm{U}\hbox{-} \mathrm{ML}}\left(\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)+\frac{1}{2}\cdot \sum \limits_{n=1}^N{\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_n^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)\right\} $$
(8)

where

$$ {J}_{\mathrm{U}\hbox{-} \mathrm{MAP}}\left(\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)={J}_{\mathrm{U}\hbox{-} \mathrm{ML}}\left(\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)+\frac{1}{2}\cdot \sum \limits_{n=1}^N{\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_n^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right) $$
(9)

Here, \( {J}_{\mathrm{U}\hbox{-} \mathrm{ML}}\left(\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right) \) is the negative log-likelihood function, which is given by

$$ {\displaystyle \begin{array}{c}{J}_{\mathrm{U}\hbox{-} \mathrm{ML}}\left(\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)=\frac{1}{\sigma_{\varepsilon}^2}\cdot \sum \limits_{n=1}^N\sum \limits_{k=1}^K{\left\Vert {\overline{\mathbf{x}}}_{n,k}-{\beta}_n{\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right){\overline{s}}_k\cdot \exp \left\{-\mathrm{j}{\omega}_k\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\}\right\Vert}_2^2\\ {}=\frac{1}{\sigma_{\varepsilon}^2}\cdot \sum \limits_{n=1}^N{\left\Vert {\overline{\mathbf{x}}}_n-\left(\left(\overline{\mathbf{s}}\odot {\boldsymbol{\upgamma}}_n\left(\mathbf{p}\right)\right)\otimes {\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)\right){\beta}_n\right\Vert}_2^2\end{array}} $$
(10)

where ⊗ denotes the Kronecker product, ⊙ represents the Schur product (element-by-element multiplication), and

$$ {\boldsymbol{\upgamma}}_n\left(\mathbf{p}\right)={\left[\exp \left\{-\mathrm{j}{\omega}_1{\tau}_n\left(\mathbf{p}\right)\right\}\kern1.5em \exp \left\{-\mathrm{j}{\omega}_2{\tau}_n\left(\mathbf{p}\right)\right\}\kern1.5em \cdots \kern1.5em \exp \left\{-\mathrm{j}{\omega}_K{\tau}_n\left(\mathbf{p}\right)\right\}\right]}^{\mathrm{T}} $$
(11)

Setting \( {\boldsymbol{\Omega}}_{n,0}={\boldsymbol{\Omega}}_n/{\sigma}_{\varepsilon}^2 \), the optimization criterion can be finally formulated as

$$ \underset{\mathbf{p},\overline{\mathbf{s}},\boldsymbol{\upbeta}, \boldsymbol{\upmu}}{\min}\left\{\sum \limits_{n=1}^N{\left\Vert {\overline{\mathbf{x}}}_n-\left(\left(\overline{\mathbf{s}}\odot {\boldsymbol{\upgamma}}_n\left(\mathbf{p}\right)\right)\otimes {\mathbf{a}}_n\right(\mathbf{p},{\boldsymbol{\upmu}}_n\left)\right){\beta}_n\right\Vert}_2^2+\frac{1}{2}\cdot \sum \limits_{n=1}^N{\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_{n,0}^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)\right\} $$
(12)

where we assume, without loss of generality, that the first element in \( \overline{\mathbf{s}} \) is equal to one so that the solution is unique and unambiguous.
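For reference, the criterion (12) can be evaluated directly for candidate parameter values, as in the sketch below; here a_fun and gamma are hypothetical helpers returning \( {\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right) \) and \( {\boldsymbol{\upgamma}}_n\left(\mathbf{p}\right) \) of (11):

```python
import numpy as np

def map_cost_unknown(p, sbar, beta, mu, xbar, a_fun, gamma, mu0, Omega0):
    """MAP criterion (12); xbar[n] is the stacked MK x 1 vector of (5)."""
    J = 0.0
    for n in range(len(xbar)):
        u = np.kron(sbar * gamma(n, p), a_fun(n, p, mu[n]))  # (s ⊙ γ_n) ⊗ a_n
        J += np.linalg.norm(xbar[n] - u * beta[n])**2        # data term
        d = mu[n] - mu0[n]
        J += 0.5 * d @ np.linalg.solve(Omega0[n], d)         # prior term
    return J
```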

Obviously, (12) is a multidimensional nonlinear minimization problem, and no closed-form solution is available. In the next subsection, we present an efficient numerical algorithm to solve (12). For this purpose, the following matrices are introduced:

$$ \left\{\begin{array}{l}\mathbf{A}\left(\mathbf{p},\boldsymbol{\upmu} \right)={\left[{\left({\mathbf{A}}_1\left(\mathbf{p},{\boldsymbol{\upmu}}_1\right)\right)}^{\mathrm{H}}\kern1.5em {\left({\mathbf{A}}_2\left(\mathbf{p},{\boldsymbol{\upmu}}_2\right)\right)}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\left({\mathbf{A}}_N\left(\mathbf{p},{\boldsymbol{\upmu}}_N\right)\right)}^{\mathrm{H}}\right]}^{\mathrm{H}}\\ {}{\mathbf{A}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)=\mathrm{blkdiag}\left[{\mathbf{a}}_{n,1}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)\kern1.5em {\mathbf{a}}_{n,2}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)\kern1.5em \cdots \kern1.5em {\mathbf{a}}_{n,K}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)\right]\kern1em \left(1\le n\le N\right)\end{array}\right. $$
(13)

where

$$ {\mathbf{a}}_{n,k}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)={\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right)\cdot \exp \left\{-\mathrm{j}{\omega}_k{\tau}_n\left(\mathbf{p}\right)\right\} $$
(14)

4.2 Numerical algorithm

Recalling (12), it is apparent that the unknown parameter vectors p, \( \overline{\mathbf{s}} \), β, and μ cannot be completely separated. Hence, the straightforward joint minimization of (12) with respect to p, \( \overline{\mathbf{s}} \), β, and μ is rarely feasible. Given the variety of unknowns, we derive an alternating minimization algorithm that identifies the unknown parameters in an iterative manner. This numerical method has been successfully applied to a variety of signal processing problems [52, 53].

After careful analysis of the present problem, the unknown parameters can be grouped into two categories: one is composed of p and \( \overline{\mathbf{s}} \), and the other comprises β and μ. The two sets of variables are alternately optimized in succession. First, the minimization is performed with respect to p and \( \overline{\mathbf{s}} \), while β and μ are kept unchanged. Subsequently, we solve the problem for β and μ while keeping p and \( \overline{\mathbf{s}} \) constant. This procedure is repeated until the convergence criterion is satisfied. A detailed description of the two steps in each iteration is given in the following subsections.

4.2.1 Joint optimization of p and \( \overline{\mathbf{s}} \)

First, consider the joint optimization of p and \( \overline{\mathbf{s}} \), with β and μ given by \( {\widehat{\boldsymbol{\upbeta}}}^{\left(\mathrm{a}\right)} \) and \( {\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)} \), respectively. Fortunately, when β and μ are fixed, the estimation of p and \( \overline{\mathbf{s}} \) can be decoupled.

Using the result in [27], we have the following optimization problem for finding the position vector p.

$$ \underset{\mathbf{p}}{\max }{f}_1\left(\mathbf{p}\right)=\underset{\mathbf{p}}{\max }{\lambda}_{\mathrm{max}}\left\{{\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}=\underset{\mathbf{p}}{\max }{\lambda}_{\mathrm{max}}\left\{\mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} $$
(15)

where λmax{·} denotes the largest eigenvalue of its matrix argument and

$$ \left\{\begin{array}{l}\mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)={\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\\ {}\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)={\left(\mathbf{A}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\overline{\mathbf{X}}\end{array}\right. $$
(16)

with \( \overline{\mathbf{X}}=\mathrm{blkdiag}\left[{\overline{\mathbf{x}}}_1\kern1.5em {\overline{\mathbf{x}}}_2\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_N\right] \). Note that the cost function f1(p) in (15) is expressed as the maximal eigenvalue of some positive semidefinite matrix and, hence, the functional form of f1(p) with respect to p is not explicit. The most straightforward method for solving (15) may be a grid search, as recommended in [27]. However, this method is very computationally expensive when the area of interest is large and the grid step size is small. To avoid a multidimensional search, we derive an efficient Gauss-Newton algorithm, which has much faster convergence speed than the steepest ascent and steepest descent methods. For this purpose, we first introduce the following proposition, which is associated with the matrix eigenvalue perturbation result.

Proposition 1: Let \( \mathbf{Z}\in {\mathbf{C}}^{n\times n} \) be a positive semidefinite matrix with eigenvalues \( {\lambda}_1\le {\lambda}_2\le \cdots \le {\lambda}_n \), associated with unit eigenvectors \( {\mathbf{v}}_1,{\mathbf{v}}_2,\cdots, {\mathbf{v}}_n \), respectively. Moreover, λ j is distinct from all other eigenvalues. Assume Z is corrupted by a Hermitian error matrix \( \boldsymbol{\updelta} \mathbf{Z}\in {\mathbf{C}}^{n\times n} \), and denote the corresponding perturbed matrix by \( \widehat{\mathbf{Z}} \), i.e., \( \widehat{\mathbf{Z}}=\mathbf{Z}+\boldsymbol{\updelta} \mathbf{Z}\in {\mathbf{C}}^{n\times n} \). If the eigenvalues of \( \widehat{\mathbf{Z}} \) are denoted by \( {\widehat{\lambda}}_1\le {\widehat{\lambda}}_2\le \cdots \le {\widehat{\lambda}}_n \), then the relationship between \( {\widehat{\lambda}}_j \) and λ j can be described by

$$ {\widehat{\lambda}}_j={\lambda}_j+{\mathbf{v}}_j^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j+{\mathbf{v}}_j^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{V}}_j\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j+o\left({\left\Vert \boldsymbol{\updelta} \mathbf{Z}\right\Vert}_{\mathrm{F}}^2\right) $$
(17)

where

$$ {\mathbf{V}}_j=\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n\frac{{\mathbf{v}}_i{\mathbf{v}}_i^{\mathrm{H}}}{\lambda_j-{\lambda}_i} $$
(18)

The proof of Proposition 1 is given in Appendix 1. The result in Proposition 1 can be used to obtain the second-order Taylor series expansion of the cost function f1(p), from which the gradient and Hessian matrix of f1(p) can be obtained.
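The expansion (17)-(18) is also easy to check numerically. The following sketch (illustration only) compares the second-order approximation with the exact eigenvalue of a randomly perturbed Hermitian matrix; the reported gap is of third order in the perturbation:

```python
import numpy as np

rng = np.random.default_rng(2)
n, j = 5, 4                                   # check the largest eigenvalue
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z = A @ A.conj().T                            # PSD, distinct eigenvalues w.h.p.
E = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dZ = 1e-4 * (E + E.conj().T)                  # small Hermitian perturbation

lam, V = np.linalg.eigh(Z)                    # ascending eigenvalues
vj = V[:, j]
Vj = sum(np.outer(V[:, i], V[:, i].conj()) / (lam[j] - lam[i])
         for i in range(n) if i != j)         # V_j of (18)
approx = lam[j] + (vj.conj() @ dZ @ vj).real \
         + (vj.conj() @ dZ @ Vj @ dZ @ vj).real   # expansion (17)
exact = np.linalg.eigvalsh(Z + dZ)[j]
print(abs(exact - approx))                    # O(||dZ||^3), i.e., tiny
```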

Assume that vector \( \widehat{\mathbf{p}} \) belongs to some neighborhood of the true value p. The second-order Taylor series expansion of \( \mathbf{C}\left(\widehat{\mathbf{p}},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \) around p leads to

$$ \mathbf{C}\left(\widehat{\mathbf{p}},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+\sum \limits_{l=1}^L<\boldsymbol{\updelta} \mathbf{p}{>}_l\cdot {\dot{\mathbf{C}}}_l\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+\frac{1}{2}\cdot \sum \limits_{l_1=1}^L\sum \limits_{l_2=1}^L<\boldsymbol{\updelta} \mathbf{p}{>}_{l_1}\cdot <\boldsymbol{\updelta} \mathbf{p}{>}_{l_2}\cdot {\ddot{\mathbf{C}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+o\left({\left\Vert \boldsymbol{\updelta} \mathbf{p}\right\Vert}_2^2\right) $$
(19)

where

$$ {\dot{\mathbf{C}}}_l\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\frac{\partial \mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)}{\partial <\mathbf{p}{>}_l}={\left({\dot{\mathbf{B}}}_l\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+{\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{B}}}_l\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) $$
(20)
$$ {\displaystyle \begin{array}{c}{\ddot{\mathbf{C}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\frac{\partial^2\mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)}{\partial <\mathbf{p}{>}_{l_1}\partial <\mathbf{p}{>}_{l_2}}={\left({\ddot{\mathbf{B}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+{\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\ddot{\mathbf{B}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\\ {}+{\left({\dot{\mathbf{B}}}_{l_1}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{B}}}_{l_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+{\left({\dot{\mathbf{B}}}_{l_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{B}}}_{l_1}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\end{array}} $$
(21)

with \( {\dot{\mathbf{B}}}_l\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\frac{\partial \mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)}{\partial <\mathbf{p}{>}_l} \) and \( {\ddot{\mathbf{B}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\frac{\partial^2\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)}{\partial <\mathbf{p}{>}_{l_1}\partial <\mathbf{p}{>}_{l_2}} \). Let \( {\lambda}_1\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\le {\lambda}_2\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\le \cdots \le {\lambda}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \) and \( {\mathbf{v}}_1\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\kern0.5em ,\kern0.5em {\mathbf{v}}_2\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\kern0.5em ,\kern0.5em \cdots \kern0.5em ,\kern0.5em {\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \) be the eigenvalues and relevant unit eigenvectors of matrix \( \mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \), respectively. Combining (15), (17), and (19) leads to

$$ {\displaystyle \begin{array}{l}{f}_1\left(\widehat{\mathbf{p}}\right)={\lambda}_{\mathrm{max}}\left\{\mathbf{C}\left(\widehat{\mathbf{p}},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}={\lambda}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+\sum \limits_{l=1}^L<\boldsymbol{\updelta} \mathbf{p}{>}_l\cdot {\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{C}}}_l\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\\ {}\kern3.2em +\frac{1}{2}\cdot \sum \limits_{l_1=1}^L\sum \limits_{l_2=1}^L<\boldsymbol{\updelta} \mathbf{p}{>}_{l_1}\cdot <\boldsymbol{\updelta} \mathbf{p}{>}_{l_2}\cdot {\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\left(2{\dot{\mathbf{C}}}_{l_1}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{V}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\dot{\mathbf{C}}}_{l_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+{\ddot{\mathbf{C}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+o\left({\left\Vert \boldsymbol{\updelta} \mathbf{p}\right\Vert}_2^2\right)\end{array}} $$
(22)

where

$$ {\mathbf{V}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\sum \limits_{n=1}^{N-1}\frac{{\mathbf{v}}_n\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\left({\mathbf{v}}_n\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}}{\lambda_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)-{\lambda}_n\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)} $$
(23)

Define the following vector and matrices

$$ \left\{\begin{array}{l}\mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)=\left[\begin{array}{c}{\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{C}}}_1\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\\ {}{\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{C}}}_2\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\\ {}\vdots \\ {}{\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{C}}}_L\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\end{array}\right]\\ {}\mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)={\mathbf{H}}_1\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+{\mathbf{H}}_2\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\end{array}\right. $$
(24)

where

$$ \left\{\begin{array}{l}<{\mathbf{H}}_1\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){>}_{l_1{l}_2}=2{\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\dot{\mathbf{C}}}_{l_1}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{V}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\dot{\mathbf{C}}}_{l_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\\ {}<{\mathbf{H}}_2\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){>}_{l_1{l}_2}={\left({\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}{\ddot{\mathbf{C}}}_{l_1{l}_2}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\mathbf{v}}_N\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\end{array}\right.\kern1em \left(1\le {l}_1,{l}_2\le L\right) $$
(25)

Then, \( {f}_1\left(\widehat{\mathbf{p}}\right) \) in (22) can be rephrased as

$$ {\displaystyle \begin{array}{c}{f}_1\left(\widehat{\mathbf{p}}\right)={f}_1\left(\mathbf{p}\right)+{\boldsymbol{\updelta} \mathbf{p}}^{\mathrm{T}}\cdot \mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)+\frac{1}{2}\cdot {\boldsymbol{\updelta} \mathbf{p}}^{\mathrm{T}}\cdot \mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\cdot \boldsymbol{\updelta} \mathbf{p}+o\left({\left\Vert \boldsymbol{\updelta} \mathbf{p}\right\Vert}_2^2\right)\\ {}={f}_1\left(\mathbf{p}\right)+{\boldsymbol{\updelta} \mathbf{p}}^{\mathrm{T}}\cdot \operatorname{Re}\left\{\mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}+\frac{1}{2}\cdot {\boldsymbol{\updelta} \mathbf{p}}^{\mathrm{T}}\cdot \operatorname{Re}\left\{\mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}\cdot \boldsymbol{\updelta} \mathbf{p}+o\left({\left\Vert \boldsymbol{\updelta} \mathbf{p}\right\Vert}_2^2\right)\end{array}} $$
(26)

where the second equality follows from the fact that f1() is a real function and δp is a real vector. From (26), \( \operatorname{Re}\left\{\mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} \) and \( \operatorname{Re}\left\{\mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} \) are the gradient and Hessian matrix of f1(p), respectively. Consequently, the Gauss-Newton algorithm for solving (15) is given by

$$ {\widehat{\mathbf{p}}}^{\left(i+1\right)}={\widehat{\mathbf{p}}}^{\left(i\right)}-{\alpha}^{\left(i\right)}{\left(\operatorname{Re}\left\{\mathbf{H}\left({\widehat{\mathbf{p}}}^{(i)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}\right)}^{-1}\cdot \operatorname{Re}\left\{\mathbf{h}\left({\widehat{\mathbf{p}}}^{(i)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} $$
(27)

where α ∈ (0, 1) denotes the step size, and the superscript (i) indexes the ith iteration.
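A minimal sketch of the damped iteration (27) follows; grad_f1 and hess_f1 are hypothetical callables returning the gradient \( \operatorname{Re}\left\{\mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} \) and Hessian \( \operatorname{Re}\left\{\mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} \) from (24)-(26):

```python
import numpy as np

def newton_step(p, mu_hat, grad_f1, hess_f1, alpha=0.85):
    """One damped update of the iteration (27)."""
    g = grad_f1(p, mu_hat)                    # Re{h(p, mu_hat)}
    H = hess_f1(p, mu_hat)                    # Re{H(p, mu_hat)}
    return p - alpha * np.linalg.solve(H, g)

def solve_position(p0, mu_hat, grad_f1, hess_f1, tol=1e-6, max_iter=50):
    """Iterate (27) from an initial guess p0 until the step is small."""
    p = p0
    for _ in range(max_iter):
        p_new = newton_step(p, mu_hat, grad_f1, hess_f1)
        if np.linalg.norm(p_new - p) < tol:
            return p_new
        p = p_new
    return p
```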

Before proceeding, we make several remarks.

Remark 1: Clearly, (15) is a nonlinear least-squares optimization problem, so it can be effectively solved by the Gauss-Newton algorithm.

Remark 2: Note that the imaginary parts of \( \mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \) and \( \mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \) are ignored in (26) and (27). However, this does not lead to any information loss or performance degradation. The vanishing term in the second equality in (26) is \( {\boldsymbol{\updelta} \mathbf{p}}^{\mathrm{T}}\cdot \operatorname{Im}\left\{\mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}+{\boldsymbol{\updelta} \mathbf{p}}^{\mathrm{T}}\cdot \operatorname{Im}\left\{\mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\}\cdot \boldsymbol{\updelta} \mathbf{p}/2 \), which is equal to zero because \( {f}_1\left(\widehat{\mathbf{p}}\right) \), f1(p), and δp are real scalars or vectors. As a result, \( \operatorname{Re}\left\{\mathbf{h}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} \) and \( \operatorname{Re}\left\{\mathbf{H}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right\} \) are the true gradient and Hessian matrix of f1(p), respectively, and no approximation is necessary. Hence, (27) can be viewed as a standard Gauss-Newton algorithm.

Remark 3: It is worth noting that, apart from the Gauss-Newton algorithm, there are many other local search algorithms, such as the Newton, steepest descent, and Levenberg-Marquardt algorithms. However, the Newton algorithm is much more computationally demanding because it requires the second-order derivatives of the cost function; the steepest descent algorithm converges more slowly than the others; and the Levenberg-Marquardt algorithm requires a suitable damping factor, which is difficult to determine. For these reasons, we use the Gauss-Newton algorithm to solve (15). In the simulations in Section 7, this algorithm provides satisfactory results. It must be emphasized, however, that all of these algorithms guarantee only local convergence; that is, none of them is guaranteed to attain the globally optimal solution.

Remark 4: In the simulations described in Section 7, the step size α is set to 0.85. A number of simulation results indicate that this value for α provides good estimates of the source position.

Assuming that the convergence result of (27) is \( {\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)} \), the derivation in [27] implies that the optimal solution of \( \overline{\mathbf{s}} \) is given by

$$ {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}={\mathbf{v}}_{\mathrm{max}}\left\{\mathbf{B}\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right){\left(\mathbf{B}\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\mathrm{H}}\right\} $$
(28)

where vmax{·} denotes the eigenvector (normalized so that its first element equals one) corresponding to the largest eigenvalue of its matrix argument.
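The computations of this step can be summarized in a short sketch: the objective f1(p) of (15) is evaluated through an eigendecomposition of \( \mathbf{C}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{a}\right)}\right) \), and (28) then yields the signal estimate at the converged position. Here a_fun and tau are hypothetical helpers for the steering vector and the propagation delay:

```python
import numpy as np

def f1_and_s(p, mu_hat, xbar, a_fun, tau, omega, M, K, N):
    """Objective (15) via eigendecomposition, and signal estimate (28)."""
    B = np.zeros((K, N), dtype=complex)
    for n in range(N):
        a = a_fun(n, p, mu_hat[n])
        for k in range(K):
            ak = a * np.exp(-1j * omega[k] * tau(n, p))     # a'_{n,k} of (14)
            B[k, n] = ak.conj() @ xbar[n][k*M:(k+1)*M]      # entry of A^H Xbar
    C = B.conj().T @ B                                      # C(p, mu_hat), (16)
    f1 = np.linalg.eigvalsh(C)[-1]                          # lambda_max, (15)
    _, U = np.linalg.eigh(B @ B.conj().T)
    s_hat = U[:, -1] / U[0, -1]                             # first element = 1
    return f1, s_hat
```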

4.2.2 Joint optimization of β and μ

This subsection describes the minimization of the criterion in (12) with respect to β and μ, while p and \( \overline{\mathbf{s}} \) are fixed at \( {\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)} \) and \( {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)} \), respectively.

Likewise, the estimation of β and μ can be decoupled. First, the channel attenuation scalars \( {\left\{{\beta}_n\right\}}_{1\le n\le N} \) that minimize (12) are given by

$$ {\widehat{\beta}}_{n,\mathrm{opt}}=\frac{1}{{\left\Vert {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right\Vert}_2^2\cdot {\left\Vert {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right\Vert}_2^2}\cdot {\left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right)}^{\mathrm{H}}{\overline{\mathbf{x}}}_n\kern1em \left(1\le n\le N\right) $$
(29)
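The estimate (29) is simply the least-squares fit of the scalar β n given the regressor \( \left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n \); a hedged one-function sketch reads:

```python
import numpy as np

def beta_hat(xbar_n, sbar, gamma_n, a_n):
    """Closed-form attenuation estimate (29) for station n."""
    u = np.kron(sbar * gamma_n, a_n)          # regressor (s ⊙ γ_n) ⊗ a_n
    # the Kronecker structure factors the denominator ||u||^2 as in (29)
    return (u.conj() @ xbar_n) / (np.linalg.norm(a_n)**2
                                  * np.linalg.norm(sbar * gamma_n)**2)
```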

Inserting (29) back into (12), we obtain the following concentrated problem:

$$ \underset{\boldsymbol{\upmu}}{\min}\sum \limits_{n=1}^N\left\{{\left\Vert {\overline{\mathbf{x}}}_n\right\Vert}_2^2-\frac{{\left|{\left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right)}^{\mathrm{H}}{\overline{\mathbf{x}}}_n\right|}^2}{{\left\Vert {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right\Vert}_2^2\cdot {\left\Vert {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right\Vert}_2^2}+\frac{1}{2}\cdot {\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_{n,0}^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)\right\} $$
(30)

As each term in the sum only depends on the parameter of one array (i.e., μ n ), the minimization in (30) can be performed by solving the following N subproblems:

$$ \underset{{\boldsymbol{\upmu}}_n}{\min }{\left\Vert {\mathbf{g}}_n\left({\boldsymbol{\upmu}}_n\right)\right\Vert}_2^2\kern1em \left(1\le n\le N\right) $$
(31)

where

$$ {\mathbf{g}}_n\left({\boldsymbol{\upmu}}_n\right)=\left[\begin{array}{c}{\mathbf{g}}_{n,1}\left({\boldsymbol{\upmu}}_n\right)\\ {}{\mathbf{g}}_{n,2}\left({\boldsymbol{\upmu}}_n\right)\end{array}\right]=\left[\begin{array}{c}\left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right){\widehat{\beta}}_{n,\mathrm{opt}}-{\overline{\mathbf{x}}}_n\\ {}\frac{1}{\sqrt{2}}\cdot {\boldsymbol{\Omega}}_{n,0}^{-1/2}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)\end{array}\right] $$
(32)

Obviously, (31) is a nonlinear least-squares optimization problem, which can be efficiently solved by the Gauss-Newton algorithm. The corresponding iterative formula is given by

$$ {\widehat{\boldsymbol{\upmu}}}_n^{\left(i+1\right)}={\widehat{\boldsymbol{\upmu}}}_n^{\left(i\right)}-{\alpha}^{\left(i\right)}{\left(\operatorname{Re}\left\{{\left({\mathbf{G}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right)\right)}^{\mathrm{H}}{\mathbf{G}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right)\right\}\right)}^{-1}\cdot \operatorname{Re}\left\{{\left({\mathbf{G}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right)\right)}^{\mathrm{H}}{\mathbf{g}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right)\right\} $$
(33)

where i is the iteration number and \( {\mathbf{G}}_n\left({\boldsymbol{\upmu}}_n\right) \) is the Jacobian matrix of \( {\mathbf{g}}_n\left({\boldsymbol{\upmu}}_n\right) \), i.e.,

$$ {\mathbf{G}}_n\left({\boldsymbol{\upmu}}_n\right)=\frac{\partial {\mathbf{g}}_n\left({\boldsymbol{\upmu}}_n\right)}{\partial {\boldsymbol{\upmu}}_n^{\mathrm{T}}}=\left[\begin{array}{c}{\mathbf{G}}_{n,1}\left({\boldsymbol{\upmu}}_n\right)\\ {}{\mathbf{G}}_{n,2}\left({\boldsymbol{\upmu}}_n\right)\end{array}\right] $$
(34)

From (32), it can be checked that

$$ {\displaystyle \begin{array}{c}{\mathbf{G}}_{n,1}\left({\boldsymbol{\upmu}}_n\right)=\frac{{\left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right)}^{\mathrm{H}}{\overline{\mathbf{x}}}_n}{{\left\Vert {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right\Vert}_2^2\cdot {\left\Vert {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right\Vert}_2^2}\cdot \left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes \frac{\partial {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)}{\partial {\boldsymbol{\upmu}}_n^{\mathrm{T}}}\right)\\ {}+\frac{\left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right){\overline{\mathbf{x}}}_n^{\mathrm{T}}}{{\left\Vert {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right\Vert}_2^2\cdot {\left\Vert {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right\Vert}_2^2}\cdot \left({\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)}^{\ast}\otimes \frac{\partial {\left({\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right)}^{\ast }}{\partial {\boldsymbol{\upmu}}_n^{\mathrm{T}}}\right)\\ {}-\frac{2{\left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right)}^{\mathrm{H}}{\overline{\mathbf{x}}}_n}{{\left\Vert {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right\Vert}_2^4\cdot {\left\Vert {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right\Vert}_2^2}\cdot \left(\left({\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)}\odot {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right)\right)\otimes {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right){\left({\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)\right)}^{\mathrm{H}}\cdot \frac{\partial {\mathbf{a}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)},{\boldsymbol{\upmu}}_n\right)}{\partial {\boldsymbol{\upmu}}_n^{\mathrm{T}}}\end{array}} $$
(35)
$$ {\mathbf{G}}_{n,2}\left({\boldsymbol{\upmu}}_n\right)=\frac{1}{\sqrt{2}}\cdot {\boldsymbol{\Omega}}_{n,0}^{-1/2} $$
(36)

We conclude with the following remarks.

Remark 5: Similarly, as (31) is also a nonlinear least-squares optimization problem, we can use the Gauss-Newton algorithm to find its optimal solution.

Remark 6: The imaginary parts of \( {\left({\mathbf{G}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right)\right)}^{\mathrm{H}}{\mathbf{G}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right) \) and \( {\left({\mathbf{G}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right)\right)}^{\mathrm{H}}{\mathbf{g}}_n\left({\widehat{\boldsymbol{\upmu}}}_n^{(i)}\right) \) are neglected in (33). Nevertheless, this does not introduce any approximation into (33), because μ n is a real vector. The detailed derivation of (33) is given in Appendix 2.

Assuming that the convergence result of (33) is \( {\widehat{\boldsymbol{\upmu}}}_n^{\left(\mathrm{a}\right)} \), substituting it back into (29) yields the estimate of β, denoted by \( {\widehat{\boldsymbol{\upbeta}}}^{\left(\mathrm{a}\right)} \).

4.2.3 Summary of the alternating minimization algorithm

The ingredients of the previous two subsections can be combined to form the proposed alternating minimization algorithm, which is summarized as follows.

Proposed alternating minimization algorithm I

Step 1: Define a convergence threshold δ > 0 and choose the initial values \( {\widehat{\mathbf{p}}}^{(0)} \), \( {\widehat{\mathbf{s}}}^{(0)} \), \( {\widehat{\boldsymbol{\upmu}}}^{(0)} \) and \( {\widehat{\boldsymbol{\upbeta}}}^{(0)} \).

Step 2: Set the iteration counter m ← 0 and compute the cost function \( {J}_{\mathrm{U}}^{(m)} \) using (12).

Step 3: Calculate \( {\widehat{\mathbf{p}}}^{(m)} \) using the Gauss-Newton algorithm given in (27).

Step 4: Compute \( {\widehat{\mathbf{s}}}^{(m)} \) according to (28).

Step 5: Calculate \( {\widehat{\boldsymbol{\upmu}}}^{(m)} \) from (33).

Step 6: Compute \( {\widehat{\boldsymbol{\upbeta}}}^{(m)} \) via (29).

Step 7: Increment the iteration counter m ← m + 1 and compute the cost function \( {J}_{\mathrm{U}}^{(m)} \) using (12). If \( \mid {J}_{\mathrm{U}}^{(m)}-{J}_{\mathrm{U}}^{\left(m-1\right)}\mid \le \delta \), stop the procedure; otherwise, go to Step 3.
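A skeleton of Algorithm I, assuming the four update rules and the cost (12) are supplied as callables, might look as follows:

```python
def algorithm_I(p, s, mu, beta, cost, step_p, step_s, step_mu, step_beta,
                delta=1e-6, max_iter=20):
    """Alternating minimization skeleton; step_p, step_s, step_mu, step_beta
    stand for (27), (28), (33), and (29), respectively (assumed supplied)."""
    J_prev = cost(p, s, beta, mu)             # Steps 1-2: init and cost (12)
    for _ in range(max_iter):                 # ~20 iterations suffice here
        p = step_p(p, beta, mu)               # Step 3: Gauss-Newton on p
        s = step_s(p, mu)                     # Step 4: principal eigenvector
        mu = step_mu(p, s, mu)                # Step 5: Gauss-Newton on mu
        beta = step_beta(p, s, mu)            # Step 6: closed form
        J = cost(p, s, beta, mu)              # Step 7: convergence check
        if abs(J - J_prev) <= delta:
            break
        J_prev = J
    return p, s, mu, beta
```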

The following remarks concern the alternating minimization algorithm described above.

Remark 7: The algorithm iterates until the convergence criterion is satisfied. Similar to the analysis in [54], it follows that the cost function decreases at each step, so that \( {J}_{\mathrm{U}}^{(1)}>{J}_{\mathrm{U}}^{(2)}>\cdots >{J}_{\mathrm{U}}^{(m)}\ge 0 \). Therefore, \( {\left\{{J}_{\mathrm{U}}^{(m)}\right\}}_{m=1}^{+\infty } \) is a convergent sequence, and convergence is ensured. Based on our experimental results, 20 iterations are sufficient to guarantee convergence.

Remark 8: As in many iterative algorithms, convergence to the global optimum of the cost function depends on the initial estimate. If the initial values are properly chosen, the effects of initial errors can largely be removed. For the problem at hand, the initial value of μ can be chosen as its nominal value μ0, and the starting point of p can be obtained using the traditional two-step localization method, where the DOA parameters are estimated by subspace-based methods [1,2,3]. If the unknowns to be found are DOAs, the array manifold in (1)-(3) should be modeled as a function of direction, as in [1,2,3]; this is not difficult, because the direction relative to each array is itself determined by the position vector p. Once the DOAs relative to all arrays have been estimated, the initial source position can easily be determined using the closed-form method in [4]. In addition, \( \overline{\mathbf{s}} \) and β can be initialized from (28) and (29), respectively. The simulation results in Section 7 demonstrate that these initial estimates give satisfactory performance.

4.2.4 Complexity analysis

Here, the computational complexity of the proposed DPD method is studied in terms of the number of multiplications. Table 1 summarizes the numerical complexity.

Table 1 Complexity of proposed method

5 Direct position determination method for case of known signal waveform

The aim of this section is to propose an alternative robust DPD method that accounts for the uncertainties in the array manifold for the case of known signal waveforms. Applications of this study can be found in, for example, wireless communications, where some known preamble sequences are often transmitted for training or synchronization purposes [55, 56]. Similar to Section 4, the robust DPD algorithm developed here is also based on an optimal MAP estimator for the current problem.

5.1 Optimization criterion for maximum a posteriori estimator

If the transmitted signals are a priori known, the sole unknown parameter embedded in \( \overline{\mathbf{s}} \) is t0. Consequently, we can consider the estimation of t0 instead of \( \overline{\mathbf{s}} \). Using the a priori distribution of {μ n }1 ≤ n ≤ N, the MAP estimator with respect to p, t0, β, and μ can be written as

$$ \underset{\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu}}{\min }{J}_{\mathrm{K}\hbox{-} \mathrm{MAP}}\left(\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)=\underset{\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu}}{\min}\left\{{J}_{\mathrm{K}\hbox{-} \mathrm{ML}}\left(\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)+\frac{1}{2}\cdot \sum \limits_{n=1}^N{\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_n^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)\right\} $$
(37)

where

$$ {J}_{\mathrm{K}\hbox{-} \mathrm{MAP}}\left(\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)={J}_{\mathrm{K}\hbox{-} \mathrm{ML}}\left(\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)+\frac{1}{2}\cdot \sum \limits_{n=1}^N{\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_n^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right) $$
(38)

Here, JK ‐ ML(p, t0, β, μ) is the negative log-likelihood function for the case of known waveform, which can be formulated as

$$ {\displaystyle \begin{array}{c}{J}_{\mathrm{K}\hbox{-} \mathrm{ML}}\left(\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu} \right)=\frac{1}{\sigma_{\varepsilon}^2}\cdot \sum \limits_{n=1}^N\sum \limits_{k=1}^K{\left\Vert {\overline{\mathbf{x}}}_{n,k}-{\beta}_n{\mathbf{a}}_n\Big(\mathbf{p},{\boldsymbol{\upmu}}_n\Big){\overline{s}}_k\cdot \exp \left\{-\mathrm{j}{\omega}_k\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\}\right\Vert}_2^2\\ {}=\frac{1}{\sigma_{\varepsilon}^2}\cdot \sum \limits_{n=1}^N{\left\Vert {\overline{\mathbf{x}}}_n-\left(\left({\overline{\mathbf{s}}}^{\prime}\odot {\boldsymbol{\upvarphi}}_n\left(\mathbf{p},{t}_0\right)\right)\otimes {\mathbf{a}}_n\right(\mathbf{p},{\boldsymbol{\upmu}}_n\left)\right){\beta}_n\right\Vert}_2^2\end{array}} $$
(39)

where

$$ \left\{\begin{array}{l}{\overline{\mathbf{s}}}^{\prime }={\left[{\overline{s}}_1\kern1.5em {\overline{s}}_2\kern1.5em \cdots \kern1.5em {\overline{s}}_K\right]}^{\mathrm{T}}\\ {}{\boldsymbol{\upvarphi}}_n\left(\mathbf{p},{t}_0\right)={\left[\exp \left\{-\mathrm{j}{\omega}_1\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\}\kern1.5em \exp \left\{-\mathrm{j}{\omega}_2\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\}\kern1.5em \cdots \kern1.5em \exp \left\{-\mathrm{j}{\omega}_K\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\}\right]}^{\mathrm{T}}\end{array}\right. $$
(40)

Note that the vector \( {\overline{\mathbf{s}}}^{\prime } \) is known here. Combining (37)–(39), we obtain the following optimization problem:

$$ \underset{\mathbf{p},{t}_0,\boldsymbol{\upbeta}, \boldsymbol{\upmu}}{\min}\left\{\sum \limits_{n=1}^N{\left\Vert {\overline{\mathbf{x}}}_n-\left(\left({\overline{\mathbf{s}}}^{\prime}\odot {\boldsymbol{\upvarphi}}_n\left(\mathbf{p},{t}_0\right)\right)\otimes {\mathbf{a}}_n\right(\mathbf{p},{\boldsymbol{\upmu}}_n\left)\right){\beta}_n\right\Vert}_2^2+\frac{1}{2}\cdot \sum \limits_{n=1}^N{\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}_{n,0}^{-1}\left({\boldsymbol{\upmu}}_n-{\boldsymbol{\upmu}}_{n,0}\right)\right\} $$
(41)

It does not seem possible to solve (41) in closed form because of its nonlinear nature. In the next subsection, an efficient numerical algorithm for solving this problem is presented.

5.2 Numerical algorithm

The algorithm proposed in this subsection is similar to that in Subsection 4.2 and is likewise implemented via alternating minimization. After a close inspection of (41), we can divide the unknown parameters into two categories: one is composed of p and t0, and the other comprises β and μ. The two sets of variables are alternately updated by separate optimization procedures. A detailed description of the proposed algorithm is given in the following subsections.

5.2.1 Joint optimization of p and t 0

We perform joint optimization with respect to p and t0, with β and μ fixed to \( {\widehat{\boldsymbol{\upbeta}}}^{\left(\mathrm{b}\right)} \) and \( {\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)} \), respectively.

According to the derivation in [27], the joint estimates of p and t0 can be obtained by solving the following optimization problem:

$$ \underset{\mathbf{p},{t}_0}{\max }{f}_3\left(\mathbf{p},{t}_0\right)=\underset{\mathbf{p},{t}_0}{\max }{\left\Vert {\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{H}}{\overline{\mathbf{S}}}^{\prime}\mathbf{d}\left({t}_0\right)\right\Vert}_2^2 $$
(42)

where

$$ \left\{\begin{array}{l}{\overline{\mathbf{S}}}^{\prime }=\operatorname{diag}\left[{\overline{\mathbf{s}}}^{\prime}\right]=\operatorname{diag}\left[{\overline{s}}_1\kern1.5em {\overline{s}}_2\kern1.5em \cdots \kern1.5em {\overline{s}}_K\right]\\ {}\mathbf{d}\left({t}_0\right)={\left[\exp \left\{-\mathrm{j}{\omega}_1{t}_0\right\}\kern1.5em \exp \left\{-\mathrm{j}{\omega}_2{t}_0\right\}\kern1.5em \cdots \kern1.5em \exp \left\{-\mathrm{j}{\omega}_K{t}_0\right\}\right]}^{\mathrm{T}}\end{array}\right. $$
(43)

We define a permutation matrix Π such that

$$ \mathrm{vec}\left({\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{H}}\right)=\boldsymbol{\Pi} \cdot \mathrm{vec}\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right) $$
(44)
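To make the role of Π concrete, the following minimal numpy sketch constructs the standard commutation (vec-permutation) matrix, which realizes the rearrangement in (44); the Hermitian transpose in (44) additionally involves conjugation, as the last check illustrates. The function name commutation_matrix is ours, not the paper's.

```python
import numpy as np

def commutation_matrix(m, n):
    """Permutation K with K @ vec(A) = vec(A.T) for any m-by-n matrix A,
    where vec() stacks columns (Fortran order)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # vec(A)[j*m + i] = A[i, j] must land at vec(A.T)[i*n + j]
            K[i * n + j, j * m + i] = 1.0
    return K

vec = lambda X: X.flatten(order="F")
B = np.random.randn(4, 3) + 1j * np.random.randn(4, 3)
P = commutation_matrix(4, 3)
assert np.allclose(P @ vec(B), vec(B.T))                # pure permutation part
assert np.allclose(P @ vec(B).conj(), vec(B.conj().T))  # Hermitian case of (44)
```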

Using the identity vec(XYZ) = (ZT ⊗ X) vec(Y), (42) can then be written as

$$ \underset{\mathbf{p},{t}_0}{\max }{f}_3\left(\mathbf{p},{t}_0\right)=\underset{\mathbf{p},{t}_0}{\max }{\left\Vert \left({\left({\overline{\mathbf{S}}}^{\prime}\mathbf{d}\left({t}_0\right)\right)}^{\mathrm{T}}\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\Big(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\Big)\right\Vert}_2^2 $$
(45)

where \( \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)=\mathrm{vec}\left(\mathbf{B}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right) \). Note that, instead of finding p and t0 jointly, we can estimate them separately in sequence to reduce the computational complexity.
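The vec identity used above is easy to verify numerically; a minimal sketch (the matrix names are arbitrary, not the paper's) is:

```python
import numpy as np

vec = lambda A: A.flatten(order="F")  # column-stacking vec operator

X = np.random.randn(4, 3) + 1j * np.random.randn(4, 3)
Y = np.random.randn(3, 2) + 1j * np.random.randn(3, 2)
Z = np.random.randn(2, 5) + 1j * np.random.randn(2, 5)

# vec(X Y Z) = (Z^T kron X) vec(Y)
assert np.allclose(vec(X @ Y @ Z), np.kron(Z.T, X) @ vec(Y))
```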

First, we consider t0 as a nuisance parameter and rewrite (45) as an optimization problem for the position vector p as follows:

$$ \underset{\mathbf{p}}{\max }{f}_4\left(\mathbf{p}\right)=\underset{\mathbf{p}}{\max}\left\{\underset{t_0}{\max }{\left\Vert \left({\left({\overline{\mathbf{S}}}^{\prime}\mathbf{d}\left({t}_0\right)\right)}^{\mathrm{T}}\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\Big(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\Big)\right\Vert}_2^2\right\} $$
(46)

where

$$ {f}_4\left(\mathbf{p}\right)=\underset{t_0}{\max }{\left\Vert \left({\left({\overline{\mathbf{S}}}^{\prime}\mathbf{d}\left({t}_0\right)\right)}^{\mathrm{T}}\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\Big(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\Big)\right\Vert}_2^2 $$
(47)

In (47), the functional form of f4(p) is not explicit. Therefore, the most straightforward method for solving (46) may be a grid search. However, this is computationally intensive, as stated previously. Thus, we develop an iterative procedure for solving (46) based on the Gauss-Newton algorithm.

To obtain an iterative procedure, we need to derive the gradient and Hessian matrix of the cost function f4(p). For this purpose, a preliminary mathematical result is presented as follows.

Proposition 2: Consider a twice continuously differentiable function f(z, y) that depends on the independent variables y ∈ R and z ∈ Rm × 1. Define a vector function \( {f}_{\mathrm{m}}\left(\mathbf{z}\right)=\underset{y}{\max}\left\{f\left(\mathbf{z},y\right)\right\} \) and assume that, for a given z, the point that maximizes f(z, y) with respect to y is denoted as ym(z). The gradient and Hessian matrix of fm(z) are then given by

$$ \left\{\begin{array}{l}{\mathbf{f}}_{\mathrm{m}}\left(\mathbf{z}\right)=\frac{\partial {f}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}={\dot{\mathbf{f}}}_1\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)\\ {}{\mathbf{F}}_{\mathrm{m}}\left(\mathbf{z}\right)=\frac{\partial^2{f}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}\partial {\mathbf{z}}^{\mathrm{T}}}={\ddot{\mathbf{F}}}_{11}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)-\frac{{\ddot{\mathbf{f}}}_{21}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right){\ddot{\mathbf{f}}}_{21}^{\mathrm{T}}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)}{{\ddot{f}}_{22}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)}\end{array}\right. $$
(48)

where

$$ \left\{\begin{array}{l}{\dot{f}}_2\left(\mathbf{z},y\right)=\frac{\partial f\left(\mathbf{z},y\right)}{\partial y}\kern1em ,\kern1em {\ddot{f}}_{22}\left(\mathbf{z},y\right)=\frac{\partial^2f\left(\mathbf{z},y\right)}{\partial {y}^2}\\ {}{\dot{\mathbf{f}}}_1\left(\mathbf{z},y\right)=\frac{\partial f\left(\mathbf{z},y\right)}{\partial \mathbf{z}}\kern1em ,\kern1em {\ddot{\mathbf{f}}}_{21}\left(\mathbf{z},y\right)=\frac{\partial {\dot{f}}_2\left(\mathbf{z},y\right)}{\partial \mathbf{z}}=\frac{\partial^2f\left(\mathbf{z},y\right)}{\partial y\partial \mathbf{z}}\\ {}{\ddot{\mathbf{F}}}_{11}\left(\mathbf{z},y\right)=\frac{\partial^2f\left(\mathbf{z},y\right)}{\partial \mathbf{z}\partial {\mathbf{z}}^{\mathrm{T}}}\end{array}\right. $$
(49)

The proof of Proposition 2 is provided in Appendix 3. Note that the gradient and Hessian matrix of the criterion function f4(p) can be obtained from this result.
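As a sanity check of Proposition 2, consider the toy function f(z, y) = −(y − aTz)2/2 − zTz/2, for which the inner maximizer is ym(z) = aTz and fm(z) = −zTz/2, so the gradient is −z and the Hessian is −I. A minimal numpy sketch (all symbols here are ours) confirms that (48) reproduces these values:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(3)
z = rng.standard_normal(3)
ym = a @ z                            # inner maximizer y_m(z)

grad_z = (ym - a @ z) * a - z         # df/dz at (z, y_m(z)); equals -z here
F_zz = -np.outer(a, a) - np.eye(3)    # d^2 f / dz dz^T
f_zy = a                              # d^2 f / dy dz (cross derivative)
f_yy = -1.0                           # d^2 f / dy^2

# Proposition 2: Hessian of the partially maximized function
hess_fm = F_zz - np.outer(f_zy, f_zy) / f_yy
assert np.allclose(grad_z, -z) and np.allclose(hess_fm, -np.eye(3))
```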

First, for a given p, let the point that maximizes f3(p, t0) with respect to t0 be denoted as t0,m(p); it can be obtained by the fast Fourier transform (FFT) [27]. Using (45) and the first equality in (48), the gradient of f4(p) can be expressed as

$$ {\displaystyle \begin{array}{c}\mathbf{r}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)=\frac{\partial {f}_4\left(\mathbf{p}\right)}{\partial \mathbf{p}}={\left.\frac{\partial {f}_3\left(\mathbf{p},{t}_0\right)}{\partial \mathbf{p}}\right|}_{t_0={t}_{0,m}\left(\mathbf{p}\right)}\\ {}=2\cdot \operatorname{Re}\left\{{\left(\frac{\partial \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right)}^{\mathrm{H}}{\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\Big(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\Big)\right\}\end{array}} $$
(50)

Furthermore, using the second equality in (48), the Hessian matrix of f4(p) can be formulated as

$$ \mathbf{R}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)=\frac{\partial^2{f}_4\left(\mathbf{p}\right)}{\partial \mathbf{p}\partial {\mathbf{p}}^{\mathrm{T}}}={\mathbf{R}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)-\frac{1}{r_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}\cdot {\mathbf{r}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right){\left({\mathbf{r}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{T}} $$
(51)

where

$$ {\mathbf{R}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)={\left.\frac{\partial^2{f}_3\left(\mathbf{p},{t}_0\right)}{\partial \mathbf{p}\partial {\mathbf{p}}^{\mathrm{T}}}\right|}_{t_0={t}_{0,m}\left(\mathbf{p}\right)}\kern1em ,\kern1em {\mathbf{r}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)={\left.\frac{\partial^2{f}_3\left(\mathbf{p},{t}_0\right)}{\partial \mathbf{p}\partial {t}_0}\right|}_{t_0={t}_{0,m}\left(\mathbf{p}\right)}\kern1em ,\kern1em {r}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)={\left.\frac{\partial^2{f}_3\left(\mathbf{p},{t}_0\right)}{\partial {t}_0^2}\right|}_{t_0={t}_{0,m}\left(\mathbf{p}\right)} $$
(52)

Combining (45) and (50) leads to

$$ {\displaystyle \begin{array}{c}{\mathbf{R}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)=2\cdot \operatorname{Re}\left\{{\left(\frac{\partial \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right)}^{\mathrm{H}}{\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \cdot \frac{\partial \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right\}\\ {}+2\cdot \operatorname{Re}\left\{\left({\left({\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{T}}\otimes {\mathbf{I}}_L\right)\cdot \frac{\partial }{\partial {\mathbf{p}}^{\mathrm{T}}}\mathrm{vec}\left({\left(\frac{\partial \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right)}^{\mathrm{H}}\right)\right\}\end{array}} $$
(53)
$$ {\displaystyle \begin{array}{c}{\mathbf{r}}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)=2\cdot \operatorname{Re}\left\{{\left(\frac{\partial \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right)}^{\mathrm{H}}{\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\dot{\mathbf{d}}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\Big(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\Big)\right\}\\ {}+2\cdot \operatorname{Re}\left\{{\left(\frac{\partial \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right)}^{\mathrm{H}}{\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\dot{\mathbf{d}}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\Big(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\Big)\right\}\end{array}} $$
(54)
$$ {\displaystyle \begin{array}{c}{r}_0\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)=2\cdot \operatorname{Re}\left\{{\left(\mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{H}}{\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\ddot{\mathbf{d}}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\mathbf{d}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right\}\\ {}+2\cdot \operatorname{Re}\left\{{\left(\mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{H}}{\boldsymbol{\Pi}}^{\mathrm{T}}\left(\left({{\overline{\mathbf{S}}}^{\prime}}^{\ast }{\left(\dot{\mathbf{d}}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\ast }{\left(\dot{\mathbf{d}}\left({t}_{0,m}\left(\mathbf{p}\right)\right)\right)}^{\mathrm{T}}{{\overline{\mathbf{S}}}^{\prime}}^{\mathrm{T}}\right)\otimes {\mathbf{I}}_N\right)\boldsymbol{\Pi} \mathbf{b}\left(\mathbf{p},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right\}\end{array}} $$
(55)

At this point, the Gauss-Newton algorithm for solving (46) can be formulated as

$$ {\widehat{\mathbf{p}}}^{\left(i+1\right)}={\widehat{\mathbf{p}}}^{(i)}-{\alpha}^i{\left(\mathbf{R}\left({\widehat{\mathbf{p}}}^{(i)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{-1}\mathbf{r}\left({\widehat{\mathbf{p}}}^{(i)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right) $$
(56)

where α ∈ (0, 1) is the step size, and the superscript i denotes the iteration number.
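A minimal sketch of the damped iteration (56) follows; the callables grad and hess stand in for r(p, μ̂(b)) in (50) and R(p, μ̂(b)) in (51), and a fixed step size is used for simplicity. All names are ours, not the paper's:

```python
import numpy as np

def gauss_newton(p0, grad, hess, alpha=0.5, tol=1e-8, max_iter=50):
    """Damped Newton-type iteration p <- p - alpha * H^{-1} g, as in (56)."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(p), grad(p))  # H^{-1} g without explicit inverse
        p_new = p - alpha * step
        if np.linalg.norm(p_new - p) < tol:
            return p_new
        p = p_new
    return p
```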

Let the convergence result of (56) be \( {\widehat{\mathbf{p}}}^{\left(\mathrm{b}\right)} \). Then, t0 can be obtained from the following maximization problem:

$$ \underset{t_0}{\max }{f}_5\left({t}_0\right)=\underset{t_0}{\max }{\left\Vert {\left(\mathbf{B}\left({\widehat{\mathbf{p}}}^{\left(\mathrm{b}\right)},{\widehat{\boldsymbol{\upmu}}}^{\left(\mathrm{b}\right)}\right)\right)}^{\mathrm{H}}{\overline{\mathbf{S}}}^{\prime}\mathbf{d}\left({t}_0\right)\right\Vert}_2^2 $$
(57)

Note that the cost function in (57) can be maximized via the FFT algorithm, as discussed in [27].
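To illustrate the FFT evaluation, note that each component of B^H S̄′d(t0) is a sum of the form Σk ck e−jωkt0, which, for uniformly spaced frequencies, is a DFT of the coefficients ck. A hedged numpy sketch (uniform ωk assumed; all names are ours) that evaluates f5 on a grid of t0 values is:

```python
import numpy as np

def f5_on_grid(B, s_bar, omega, L=1024):
    """Evaluate f5(t0) = ||B^H diag(s_bar) d(t0)||^2 on a uniform t0 grid,
    assuming omega is uniformly spaced; B is K-by-N."""
    c = B.conj() * s_bar[:, None]          # c[k, n] = conj(B_{kn}) * s_k
    d_omega = omega[1] - omega[0]          # frequency spacing
    F = np.fft.fft(c, n=L, axis=0)         # DFT over the frequency index k
    t0_grid = 2 * np.pi * np.arange(L) / (L * d_omega)
    # the common unit-modulus factor exp(-j*omega_1*t0) does not affect the norm
    cost = np.sum(np.abs(F) ** 2, axis=1)
    return t0_grid, cost

# usage: t0_grid, cost = f5_on_grid(B, s_bar, omega); t0_hat = t0_grid[np.argmax(cost)]
```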

5.2.2 Joint optimization of β and μ

The objective function in (41) is now minimized with respect to β and μ, with p and t0 given by \( {\widehat{\mathbf{p}}}^{\left(\mathrm{b}\right)} \) and \( {\widehat{t}}_0^{\left(\mathrm{b}\right)} \), respectively. Note that the algorithm presented in Section 4.2.2 can still be applied here, but \( {\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)} \), \( {\boldsymbol{\upgamma}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{a}\right)}\right) \), and \( {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{a}\right)} \) must be replaced with \( {\widehat{\mathbf{p}}}^{\left(\mathrm{b}\right)} \), \( {\boldsymbol{\upvarphi}}_n\left({\widehat{\mathbf{p}}}^{\left(\mathrm{b}\right)},{\widehat{t}}_0^{\left(\mathrm{b}\right)}\right) \), and

$$ {\widehat{\overline{\mathbf{s}}}}^{\left(\mathrm{b}\right)}={\left[{\overline{s}}_1\cdot \exp \left\{-\mathrm{j}{\omega}_1{\widehat{t}}_0^{\left(\mathrm{b}\right)}\right\}\kern1.5em {\overline{s}}_2\cdot \exp \left\{-\mathrm{j}{\omega}_2{\widehat{t}}_0^{\left(\mathrm{b}\right)}\right\}\kern1.5em \cdots \kern1.5em {\overline{s}}_K\cdot \exp \left\{-\mathrm{j}{\omega}_K{\widehat{t}}_0^{\left(\mathrm{b}\right)}\right\}\right]}^{\mathrm{T}}, $$
(58)

respectively.

5.2.3 Summary of the alternating minimization algorithm

A possible implementation of the proposed alternating minimization algorithm is outlined as follows.

Proposed alternating minimization algorithm II

Step 1: Define a convergence threshold δ > 0 and choose the initial values \( {\widehat{\mathbf{p}}}^{(0)} \), \( {\widehat{t}}_0^{(0)} \), \( {\widehat{\boldsymbol{\upmu}}}^{(0)} \) and \( {\widehat{\boldsymbol{\upbeta}}}^{(0)} \).

Step 2: Set the iteration counter m ← 0 and compute the cost function \( {J}_{\mathrm{K}}^{(m)} \) using (41).

Step 3: Calculate \( {\widehat{\mathbf{p}}}^{(m)} \) using the Gauss-Newton algorithm given in (56).

Step 4: Compute \( {\widehat{t}}_0^{(m)} \) from (57) using the FFT algorithm.

Step 5: Calculate \( {\widehat{\boldsymbol{\upmu}}}^{(m)} \) from (33) and (58).

Step 6: Compute \( {\widehat{\boldsymbol{\upbeta}}}^{(m)} \) via (29) and (58).

Step 7: Increment the iteration counter m ← m + 1 and compute the cost function \( {J}_{\mathrm{K}}^{(m)} \) using (41). If \( \mid {J}_{\mathrm{K}}^{(m)}-{J}_{\mathrm{K}}^{\left(m-1\right)}\mid \le \delta \), stop the procedure; otherwise, go to Step 3.
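For concreteness, a hedged Python skeleton of Steps 1 to 7 is given below; the update_* callables stand in for the Gauss-Newton step (56), the FFT search in (57), and the closed-form updates via (33) and (29) with (58). This is a sketch, not the authors' implementation:

```python
def alternating_minimization_II(p, t0, mu, beta, cost_JK,
                                update_p, update_t0, update_mu, update_beta,
                                delta=1e-6, max_iter=100):
    """Alternating minimization of (41): iterate the four block updates
    until the decrease of the cost falls below delta (Step 7)."""
    J_prev = cost_JK(p, t0, beta, mu)      # Step 2: initial cost from (41)
    for _ in range(max_iter):
        p = update_p(p, mu, beta)          # Step 3: Gauss-Newton, (56)
        t0 = update_t0(p, mu)              # Step 4: FFT search, (57)
        mu = update_mu(p, t0, beta)        # Step 5: array-error update
        beta = update_beta(p, t0, mu)      # Step 6: channel-gain update
        J = cost_JK(p, t0, beta, mu)
        if abs(J_prev - J) <= delta:       # Step 7: convergence test
            break
        J_prev = J
    return p, t0, mu, beta
```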

In the following, we make two remarks concerning the alternating minimization algorithm described above.

Remark 9: Similar to the sequence \( {\left\{{J}_{\mathrm{U}}^{(m)}\right\}}_{m=1}^{+\infty } \), the sequence \( {\left\{{J}_{\mathrm{K}}^{(m)}\right\}}_{m=1}^{+\infty } \) is monotonically decreasing and, hence, convergence is guaranteed. Our simulation results indicate that 20 iterations are generally enough to satisfy the convergence criterion.

Remark 10: The selection of the initial values is again important. For the current problem, the initial value of μ can be chosen as its nominal value μ0, and the starting point of p can be obtained using the two-step localization method, where the DOA parameters are estimated by the algorithm proposed for the scenario of known waveforms [55, 56]. As stated in Remark 8, when the unknowns to be determined are DOAs, we need to express the array manifold in (1)–(3) as a function of direction, as in [1,2,3]. This is easily achieved because the position vector p is a function of DOA. Once the DOAs corresponding to all the arrays have been estimated, the initial position estimate is given by the closed-form method [4]. Further, the initial solution for t0 follows from (57) via the FFT algorithm, and the initial guess of β can be obtained from (29) and (58). The results in Section 7 show that these initial estimates allow the proposed algorithm to achieve satisfactory estimation accuracy.

5.2.4 Complexity analysis

We now address the computational complexity of the proposed DPD method in terms of the number of multiplications. Table 2 summarizes the numerical complexity.

Table 2 Complexity of the proposed method

6 Cramér-Rao bound on covariance matrix of localization errors

The CRB gives a lower bound on the asymptotic covariance matrix of any asymptotically unbiased estimator. The bound is obtained as the inverse of the Fisher information matrix (FIM). Relating the parameter estimates to this bound provides a natural measure of performance. The purpose of this section is to derive the CRBs for the estimates of the emitter's position. We consider both the case of an unknown signal waveform and that of a known signal waveform.

6.1 Cramér-Rao bound on position estimate for case of unknown signal waveform

This subsection is devoted to the derivation of the CRB for localization under the assumption that the signal waveform is unknown. In this case, the full parameter set contains the deterministic parameters \( {\sigma}_{\varepsilon}^2 \), p, β, and \( \overline{\mathbf{s}} \), as well as the stochastic parameter μ. Hence, the CRB derivation should follow the Bayesian framework [47,48,49,50], which also accommodates stochastic parameters, as in [47,48,49,50, 57, 58]. To this end, a parameter vector comprising all the deterministic and stochastic unknowns is introduced as

$$ {\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}={\left[{\sigma}_{\varepsilon}^2\kern1.5em {\mathbf{p}}^{\mathrm{T}}\kern.5em {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern.5em {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern.5em {\left(\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}\kern.5em {\left(\operatorname{Im}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}\kern.5em {\boldsymbol{\upmu}}^{\mathrm{T}}\right]}^{\mathrm{T}}={\left[{\sigma}_{\varepsilon}^2\kern1.5em {{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)\mathrm{T}}\right]}^{\mathrm{T}} $$
(59)

where

$$ {{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}={\left[{\mathbf{p}}^{\mathrm{T}}\kern.5em {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern.5em {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern.5em {\left(\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}\kern.5em {\left(\operatorname{Im}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}\kern.5em {\boldsymbol{\upmu}}^{\mathrm{T}}\right]}^{\mathrm{T}} $$
(60)

We proceed to define a data vector as

$$ \overline{\mathbf{x}}={\left[{\overline{\mathbf{x}}}_1^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_2^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_N^{\mathrm{H}}\right]}^{\mathrm{H}}={\left[{\overline{\mathbf{x}}}_{1,1}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{1,2}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{1,K}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{2,1}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{2,2}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{2,K}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{N,1}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{N,2}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{N,K}^{\mathrm{H}}\right]}^{\mathrm{H}} $$
(61)

The mean vector of \( \overline{\mathbf{x}} \) is given by

$$ {\overline{\mathbf{x}}}_0=\mathrm{E}\left[\overline{\mathbf{x}}\right]={\left[{\overline{\mathbf{x}}}_{1,1,0}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{1,2,0}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{1,K,0}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{2,1,0}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{2,2,0}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{2,K,0}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{N,1,0}^{\mathrm{H}}\kern1.5em {\overline{\mathbf{x}}}_{N,2,0}^{\mathrm{H}}\kern1.5em \cdots \kern1.5em {\overline{\mathbf{x}}}_{N,K,0}^{\mathrm{H}}\right]}^{\mathrm{H}} $$
(62)

where

$$ {\overline{\mathbf{x}}}_{n,k,0}=\mathrm{E}\left[{\overline{\mathbf{x}}}_{n,k}\right]={\beta}_n{\mathbf{a}}_n\left(\mathbf{p},{\boldsymbol{\upmu}}_n\right){\overline{s}}_k\cdot \exp \left\{-\mathrm{j}{\omega}_k\left({\tau}_n\left(\mathbf{p}\right)+{t}_0\right)\right\} $$
(63)

When deterministic and stochastic parameters coexist, the hybrid CRB should be used. Consequently, the FIM for the vector η(a) is given by [47,48,49,50, 57, 58]

$$ {\displaystyle \begin{array}{l}<\mathbf{FIM}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}\right){>}_{ij}=<{\mathbf{FIM}}_1\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}\right){>}_{ij}+<{\mathbf{FIM}}_2\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}\right){>}_{ij}\\ {}\kern6.12em ={\mathrm{E}}_{\overline{\mathbf{x}},\boldsymbol{\upmu}}\left[\frac{\partial^2{f}_{\mathrm{ML}}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}|\overline{\mathbf{x}}\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right]+{\mathrm{E}}_{\boldsymbol{\upmu}}\left[\frac{1}{2}\cdot \frac{\partial^2{\left(\boldsymbol{\upmu} -{\boldsymbol{\upmu}}_0\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}^{-1}\left(\boldsymbol{\upmu} -{\boldsymbol{\upmu}}_0\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right]\end{array}} $$
(64)

where

$$ \left\{\begin{array}{l}<{\mathbf{FIM}}_1\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}\right){>}_{ij}={\mathrm{E}}_{\overline{\mathbf{x}},\boldsymbol{\upmu}}\left[\frac{\partial^2{f}_{\mathrm{ML}}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}|\overline{\mathbf{x}}\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right]\\ {}<{\mathbf{FIM}}_2\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}\right){>}_{ij}={\mathrm{E}}_{\boldsymbol{\upmu}}\left[\frac{1}{2}\cdot \frac{\partial^2{\left(\boldsymbol{\upmu} -{\boldsymbol{\upmu}}_0\right)}^{\mathrm{T}}{\boldsymbol{\Omega}}^{-1}\left(\boldsymbol{\upmu} -{\boldsymbol{\upmu}}_0\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right]\end{array}\right. $$
(65)

in which \( {f}_{\mathrm{ML}}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}|\overline{\mathbf{x}}\right) \) is the maximum likelihood (ML) function of the compound data vector \( \overline{\mathbf{x}} \). Note that the first term, FIM1(η(a)), resembles the standard expression for the FIM in the unperturbed model augmented by the perturbation parameter μ. However, FIM1(η(a)) is not easily calculated, because the μ-parameter generally behaves in a nonlinear fashion, making the expectation with respect to μ difficult to compute. In [57], the above expression is examined for some special cases, but it is hard to generalize this result. Indeed, a more common approach is to ignore the expectation with respect to μ and calculate FIM1(η(a)) at μ0, as in [47,48,49,50, 58]. Following the analysis in [47,48,49,50, 58], it can be shown that this approximation is O(1), which implies that

$$ {\mathrm{E}}_{\overline{\mathbf{x}},\boldsymbol{\upmu}}\left[\frac{\partial^2{f}_{\mathrm{ML}}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}|\overline{\mathbf{x}}\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right]={\mathrm{E}}_{\overline{\mathbf{x}}}\left[{\left.\frac{\partial^2{f}_{\mathrm{ML}}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}|\overline{\mathbf{x}}\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right|}_{{\boldsymbol{\upmu}}_0}\right]+O(1) $$
(66)

Additionally, note that \( {\mathrm{E}}_{\overline{\mathbf{x}},\boldsymbol{\upmu}}\left[\frac{\partial^2{f}_{\mathrm{ML}}\left({\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}|\overline{\mathbf{x}}\right)}{\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_i\partial <{\boldsymbol{\upeta}}^{\left(\mathrm{a}\right)}{>}_j}\right]=O(NK) \), so we can still obtain an asymptotically valid CRB. A further justification for this approximation is that the distribution of μ is symmetric and, in this paper, the error μ − μ0 is assumed to be sufficiently small.

Combining (64) and the above discussion, the FIM for vector η′(a) can be approximately expressed as

$$ <\mathbf{FIM}\left({{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}\right){>}_{ij}=\frac{2}{\sigma_{\varepsilon}^2}\cdot <\operatorname{Re}\left\{{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}}^{\mathrm{H}}{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}}\right\}{>}_{ij}+<{\boldsymbol{\Omega}}^{-1}{>}_{ij}\cdot \delta \left(i,j\right) $$
(67)

where δ(i, j) is an indicator function such that δ(i, j) = 1 if both i and j correspond to elements in μ, and δ(i, j) = 0 otherwise; Ω = blkdiag[Ω1 Ω2 ⋯ ΩN], and

$$ {\displaystyle \begin{array}{c}{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)\mathrm{T}}}=\left[\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\mathbf{p}}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Im}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\boldsymbol{\upmu}}^{\mathrm{T}}}\right]\\ {}=\left[{\mathbf{T}}_{\mathbf{p}}\kern1.5em {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}\kern1.5em {\mathbf{T}}_{\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}}\kern1.5em {\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}\kern1.5em {\mathbf{T}}_{\operatorname{Im}\left\{\overline{\mathbf{s}}\right\}}\kern1.5em {\mathbf{T}}_{\boldsymbol{\upmu}}\right]\end{array}} $$
(68)

It follows from (67) that

$$ \mathbf{FIM}\left({{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}\right)=\frac{2}{\sigma_{\varepsilon}^2}\cdot \operatorname{Re}\left\{{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}}^{\mathrm{H}}{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{a}\right)}}\right\}+\left[\begin{array}{cc}\mathbf{O}& \mathbf{O}\\ {}\mathbf{O}& {\boldsymbol{\Omega}}^{-1}\end{array}\right] $$
(69)

Using (62) and (63), and after some algebraic manipulations, the sub-matrices in (68) can be written as

$$ \left\{\begin{array}{l}{\mathbf{T}}_{\mathbf{p}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\mathbf{p}}^{\mathrm{T}}}=\operatorname{diag}\left[\boldsymbol{\upbeta} \otimes {\mathbf{1}}_{MK\times 1}\right]\cdot \operatorname{diag}\left[{\mathbf{1}}_{N\times 1}\otimes \overline{\mathbf{s}}\otimes {\mathbf{1}}_{M\times 1}\right]\cdot \frac{\partial \left(\mathbf{A}\left(\mathbf{p},{\boldsymbol{\upmu}}_0\right){\mathbf{1}}_{K\times 1}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\\ {}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}}=\operatorname{diag}\left[{\mathbf{1}}_{N\times 1}\otimes \overline{\mathbf{s}}\otimes {\mathbf{1}}_{M\times 1}\right]\cdot \mathrm{blkdiag}\left[{\mathbf{A}}_1\left(\mathbf{p},{\boldsymbol{\upmu}}_{1,0}\right){\mathbf{1}}_{K\times 1}\kern1em {\mathbf{A}}_2\left(\mathbf{p},{\boldsymbol{\upmu}}_{2,0}\right){\mathbf{1}}_{K\times 1}\kern1em \cdots \kern1em {\mathbf{A}}_N\left(\mathbf{p},{\boldsymbol{\upmu}}_{N,0}\right){\mathbf{1}}_{K\times 1}\right]\\ {}{\mathbf{T}}_{\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}}=\mathrm{j}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}=\mathrm{j}\cdot \operatorname{diag}\left[{\mathbf{1}}_{N\times 1}\otimes \overline{\mathbf{s}}\otimes {\mathbf{1}}_{M\times 1}\right]\cdot \mathrm{blkdiag}\left[{\mathbf{A}}_1\left(\mathbf{p},{\boldsymbol{\upmu}}_{1,0}\right){\mathbf{1}}_{K\times 1}\kern1em {\mathbf{A}}_2\left(\mathbf{p},{\boldsymbol{\upmu}}_{2,0}\right){\mathbf{1}}_{K\times 1}\kern1em \cdots \kern1em {\mathbf{A}}_N\left(\mathbf{p},{\boldsymbol{\upmu}}_{N,0}\right){\mathbf{1}}_{K\times 1}\right]\\ {}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}}=\operatorname{diag}\left[\boldsymbol{\upbeta} \otimes {\mathbf{1}}_{MK\times 1}\right]\cdot \mathbf{A}\left(\mathbf{p},{\boldsymbol{\upmu}}_0\right)\\ {}{\mathbf{T}}_{\operatorname{Im}\left\{\overline{\mathbf{s}}\right\}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Im}\left\{\overline{\mathbf{s}}\right\}\right)}^{\mathrm{T}}}=\mathrm{j}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}=\mathrm{j}\cdot \operatorname{diag}\left[\boldsymbol{\upbeta} \otimes {\mathbf{1}}_{MK\times 1}\right]\cdot \mathbf{A}\left(\mathbf{p},{\boldsymbol{\upmu}}_0\right)\\ {}{\mathbf{T}}_{\boldsymbol{\upmu}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\boldsymbol{\upmu}}^{\mathrm{T}}}=\operatorname{diag}\left[\boldsymbol{\upbeta} \otimes {\mathbf{1}}_{MK\times 1}\right]\cdot \operatorname{diag}\left[{\mathbf{1}}_{N\times 1}\otimes \overline{\mathbf{s}}\otimes {\mathbf{1}}_{M\times 1}\right]\cdot \frac{\partial \left(\mathbf{A}\left(\mathbf{p},{\boldsymbol{\upmu}}_0\right){\mathbf{1}}_{K\times 1}\right)}{\partial {\boldsymbol{\upmu}}_0^{\mathrm{T}}}\end{array}\right. $$
(70)

Substituting (68) and (70) into (69) leads to

[Equation (71), the block-partitioned form of FIM(η′(a)) in terms of the matrices Z1(a), Z2(a), and Z3(a), is rendered as an image in the original.]

where

[Equation (72), defining the matrices Z1(a), Z2(a), and Z3(a), is rendered as an image in the original.]

The details of calculating the matrices in (72) are shown in Appendix 4.

Note that, in practice, only the p corner of the CRB matrix is of interest. Invoking the partitioned matrix inversion formula, the CRB matrix for position vector p can be written as

$$ {\mathbf{CRB}}^{\left(\mathrm{a}\right)}\left(\mathbf{p}\right)=\frac{\sigma_{\varepsilon}^2}{2}\cdot \left(\begin{array}{l}{\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{a}\right)}\right\}\right)}^{-1}+{\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{a}\right)}\right\}\right)}^{-1}\cdot \operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{a}\right)}\right\}\cdot {\left(\operatorname{Re}\left\{{\mathbf{Z}}_3^{\left(\mathrm{a}\right)}\right\}-\operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{a}\right)\mathrm{H}}\right\}\cdot {\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{a}\right)}\right\}\right)}^{-1}\cdot \operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{a}\right)}\right\}+\left[\begin{array}{cc}\mathbf{O}& \mathbf{O}\\ {}\mathbf{O}& {\sigma}_{\varepsilon}^2{\boldsymbol{\Omega}}^{-1}/2\end{array}\right]\right)}^{-1}\\ {}\times \operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{a}\right)\mathrm{H}}\right\}\cdot {\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{a}\right)}\right\}\right)}^{-1}\end{array}\right) $$
(73)

Note that the superscript “a” in (73) indicates that this CRB corresponds to the case of an unknown signal waveform. In addition, the trace of this CRB matrix gives the minimum achievable localization MSE.
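Numerically, (73) is just the position block of the inverse FIM obtained via the partitioned-matrix (Schur complement) inversion formula. A hedged numpy sketch, taking the blocks Z1(a), Z2(a), and Z3(a) of (71)–(72) and the prior covariance Ω as inputs (all argument names are ours), is:

```python
import numpy as np

def crb_position(Z1, Z2, Z3, Omega, sigma2):
    """Position block of the hybrid CRB, following the structure of (73)."""
    A = np.real(Z1)                      # Re{Z1}: position-position block
    B = np.real(Z2)                      # Re{Z2}: cross block
    D = np.real(Z3)                      # Re{Z3}: nuisance-nuisance block
    # the Gaussian prior on mu enters only the mu corner of the nuisance block
    P = np.zeros(D.shape)
    m = Omega.shape[0]
    P[-m:, -m:] = sigma2 * np.linalg.inv(Omega) / 2.0
    Ainv = np.linalg.inv(A)
    S = D - B.T @ Ainv @ B + P           # Schur complement of Re{Z1}
    return (sigma2 / 2.0) * (Ainv + Ainv @ B @ np.linalg.inv(S) @ B.T @ Ainv)
```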

6.2 Cramér-Rao bound on position estimate for case of known signal waveform

In this subsection, the CRB on the covariance matrix of the location estimate is derived for the case where the transmitted signals are known a priori. Accordingly, the complete parameter set comprises the deterministic parameters \( {\sigma}_{\varepsilon}^2 \), p, β, and t0, as well as the random parameter μ. Hence, the hybrid CRB should again be considered. To proceed, we collect all these parameters into the following vector:

$$ {\boldsymbol{\upeta}}^{\left(\mathrm{b}\right)}={\left[{\sigma}_{\varepsilon}^2\kern1.5em {\mathbf{p}}^{\mathrm{T}}\kern1.5em {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern1.5em {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern1.5em {t}_0\kern1.5em {\boldsymbol{\upmu}}^{\mathrm{T}}\right]}^{\mathrm{T}}={\left[{\sigma}_{\varepsilon}^2\kern1.5em {{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)\mathrm{T}}\right]}^{\mathrm{T}} $$
(74)

where

$$ {{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)}={\left[{\mathbf{p}}^{\mathrm{T}}\kern1.5em {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern1.5em {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}\kern1.5em {t}_0\kern1.5em {\boldsymbol{\upmu}}^{\mathrm{T}}\right]}^{\mathrm{T}} $$
(75)

Likewise, the FIM for vector η′(b) can be approximately expressed as

$$ \mathbf{FIM}\left({{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)}\right)=\frac{2}{\sigma_{\varepsilon}^2}\cdot \operatorname{Re}\left\{{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)}}^{\mathrm{H}}{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)}}\right\}+\left[\begin{array}{cc}\mathbf{O}& \mathbf{O}\\ {}\mathbf{O}& {\boldsymbol{\Omega}}^{-1}\end{array}\right] $$
(76)

where

$$ {\displaystyle \begin{array}{c}{\mathbf{T}}_{{{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)}}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {{\boldsymbol{\upeta}}^{\prime}}^{\left(\mathrm{b}\right)\mathrm{T}}}=\left[\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\mathbf{p}}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\left(\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}\right)}^{\mathrm{T}}}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {t}_0}\kern1.5em \frac{\partial {\overline{\mathbf{x}}}_0}{\partial {\boldsymbol{\upmu}}^{\mathrm{T}}}\right]\\ {}=\left[{\mathbf{T}}_{\mathbf{p}}\kern1.5em {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}\kern1.5em {\mathbf{T}}_{\operatorname{Im}\left\{\boldsymbol{\upbeta} \right\}}\kern1.5em {\mathbf{T}}_{t_0}\kern1.5em {\mathbf{T}}_{\boldsymbol{\upmu}}\right]\end{array}} $$
(77)

Note that matrix FIM(η′(b)) is also computed at the nominal value μ0. According to the analysis in Subsection 6.1, we can still obtain an asymptotically valid CRB. In addition, all the sub-matrices in (77) are given explicitly in (70), except for \( {\mathbf{T}}_{t_0} \). Therefore, we need only derive an expression for \( {\mathbf{T}}_{t_0} \). From (62) and (63), it can be checked that

$$ {\mathbf{T}}_{t_0}=\frac{\partial {\overline{\mathbf{x}}}_0}{\partial {t}_0}=-\mathrm{j}\cdot \operatorname{diag}\left[\boldsymbol{\upbeta} \otimes {\mathbf{1}}_{MK\times 1}\right]\cdot \operatorname{diag}\left[{\mathbf{1}}_{N\times 1}\otimes \left(\overline{\mathbf{s}}\odot \boldsymbol{\upomega} \right)\otimes {\mathbf{1}}_{M\times 1}\right]\cdot \left(\mathbf{A}\left(\mathbf{p},{\boldsymbol{\upmu}}_0\right){\mathbf{1}}_{K\times 1}\right) $$
(78)

where ω = [ω1 ω2 ⋯ ωK]T. Inserting (70), (77), and (78) into (76) produces

[Equation (79), the block-partitioned form of FIM(η′(b)) in terms of the matrices Z1(b), Z2(b), and Z3(b), is rendered as an image in the original.]

where

[Equation (80), defining the matrices Z1(b), Z2(b), and Z3(b), is rendered as an image in the original.]

In Appendix 5, the details of calculating the matrices in (80) are provided.

With the application of the partitioned matrix inversion formula, the CRB matrix for position vector p can be expressed as

$$ {\mathbf{CRB}}^{\left(\mathrm{b}\right)}\left(\mathbf{p}\right)=\frac{\sigma_{\varepsilon}^2}{2}\cdot \left(\begin{array}{l}{\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{b}\right)}\right\}\right)}^{-1}+{\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{b}\right)}\right\}\right)}^{-1}\cdot \operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{b}\right)}\right\}\cdot {\left(\operatorname{Re}\left\{{\mathbf{Z}}_3^{\left(\mathrm{b}\right)}\right\}-\operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{b}\right)\mathrm{H}}\right\}\cdot {\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{b}\right)}\right\}\right)}^{-1}\cdot \operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{b}\right)}\right\}+\left[\begin{array}{cc}\mathbf{O}& \mathbf{O}\\ {}\mathbf{O}& {\sigma}_{\varepsilon}^2{\boldsymbol{\Omega}}^{-1}/2\end{array}\right]\right)}^{-1}\\ {}\times \operatorname{Re}\left\{{\mathbf{Z}}_2^{\left(\mathrm{b}\right)\mathrm{H}}\right\}\cdot {\left(\operatorname{Re}\left\{{\mathbf{Z}}_1^{\left(\mathrm{b}\right)}\right\}\right)}^{-1}\end{array}\right) $$
(81)

where the superscript “b” indicates the case of a known signal waveform, and the trace of CRB(b)(p) is the minimum achievable MSE for any unbiased localization method.

7 Simulation results

This section presents a set of Monte Carlo simulations to examine the behavior of the proposed robust DPD methods. The root-mean-square error (RMSE) of the position estimate is employed to assess and compare performance. All simulation results are averaged over 2000 independent runs. We compare our algorithms with the algorithms in [27] and with traditional two-step localization algorithms, as well as with the CRBs for unknown and known signal waveforms. In addition, all experiments consider 3D localization.
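For reference, the RMSE metric used throughout this section can be computed as follows (a minimal sketch; the variable names are ours):

```python
import numpy as np

def position_rmse(estimates, p_true):
    """RMSE over Monte Carlo runs; `estimates` is (runs, 3) for 3D localization."""
    err = estimates - p_true[None, :]
    return np.sqrt(np.mean(np.sum(err ** 2, axis=1)))
```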

7.1 Simulation results for case of unknown signal waveform

In this subsection, the algorithm presented in Section 4 is evaluated and its estimation accuracy is compared with that of the first algorithm in [27], which assumes the signal waveform is unknown. Both of these algorithms are denoted as algorithm I in the figures below. In addition, the two-step localization algorithm used for comparison is realized by applying the robust Bayesian algorithm [48] to estimate the DOAs in the first step and the Taylor series (TS) algorithm [59] to locate the source in the second step. If the unknowns to be found are DOAs, the array manifold in (1)–(3) should be modeled as a function of direction, as in [1,2,3]; this is not difficult to achieve because the position vector p is a function of direction. The array model mismatch considered here is caused by sensor location perturbations. The sensor position errors are zero-mean Gaussian random variables in both the x and y directions, and are independent and identically distributed (IID) from sensor to sensor and from array to array. The standard deviation of the position errors is denoted by σL.
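A hedged sketch of this sensor-location error model (function and variable names are ours, not the paper's) is:

```python
import numpy as np

def perturb_sensor_positions(sensors, sigma_L, rng=None):
    """Add IID zero-mean Gaussian errors (std sigma_L) to the x and y
    coordinates of each sensor; `sensors` is (M, 3)."""
    rng = rng if rng is not None else np.random.default_rng()
    perturbed = np.asarray(sensors, dtype=float).copy()
    perturbed[:, :2] += sigma_L * rng.standard_normal((perturbed.shape[0], 2))
    return perturbed
```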

In the first set of experiments, we consider three base stations placed at [0, 2500, 0] m, [0, 0, 0] m, and [0, − 2500, 0] m and a single emitter located at [1500, 2000, 2000] m. The transmitted waveforms are realizations of a narrowband Gaussian random process, and are unknown to the receivers. Each base station is equipped with a uniform circular array (UCA). The channel attenuation magnitude is fixed at 1, and the channel phase is selected at random from a uniform distribution over [−π, π). Additionally, unless stated otherwise, we use the following settings: (1) K = 128 samples; (2) SNR = 10 dB; (3) M = 5 sensors; (4) σL = 0.03λ (where λ is the wavelength of the carrier signal); and (5) array radius equal to the wavelength. Figures 1, 2, 3, 4, and 5 display the RMSEs of the localization methods as functions of the SNR of the emitter signal, number of snapshots K, standard deviation of the sensor position errors σL, number of array elements M, and ratio of array radius to wavelength, respectively.

Fig. 1 RMSEs of localization methods and the corresponding CRBs as a function of the SNR of the emitter signal for the case of an unknown signal waveform

Fig. 2 RMSEs of localization methods and the corresponding CRBs as a function of the number of snapshots for the case of an unknown signal waveform

Fig. 3 RMSEs of localization methods and the corresponding CRBs as a function of the standard deviation of sensor position errors for the case of an unknown signal waveform

Fig. 4 RMSEs of localization methods and the corresponding CRBs as a function of the number of array elements for the case of an unknown signal waveform

Fig. 5 RMSEs of localization methods and the corresponding CRBs as a function of the ratio of array radius to wavelength for the case of an unknown signal waveform

It can be observed from Figs. 1, 2, 3, 4, and 5 that algorithm I in this paper outperforms algorithm I in [27] in terms of estimation accuracy. The performance gap is especially noticeable at high SNRs and for large sensor position errors, because in these cases the array model errors dominate the performance. Our technique accounts for these errors and incorporates the prior statistics of the array perturbations into the source localization. The results in Figs. 1, 2, 3, 4, and 5 also show that, as the SNR decreases and the array aperture increases, the proposed algorithm yields only a minor performance improvement; for lower SNRs and large array apertures, the measurement noise is the dominant error source, and the two algorithms perform similarly. In addition, the estimation accuracy of our algorithm attains the CRB given by (73) at moderate noise and error levels. Finally, it is worth noting that the DPD estimators perform significantly better than the two-step localization method; this improvement has been explained in the literature.

In the second experiment, we use the same simulation settings but vary the emitter location. The source coordinates are set as [1500, 2000, 2000] + α∙[200, 200, 200] m, where α ranges from 0 to 10. The distance between the emitter and the receivers increases with α. Figure 6 illustrates the RMSEs of the localization methods with respect to α.

Fig. 6 RMSEs of localization methods and the corresponding CRBs as a function of α for the case of an unknown signal waveform

From Fig. 6, we can draw conclusions similar to those for Figs. 1, 2, 3, 4, and 5; they are not repeated here for brevity. We only emphasize that the RMSE improvement of algorithm I in this paper over algorithm I in [27] becomes more significant as α increases. Hence, the gain in localization accuracy of our algorithm increases as the source moves farther away from the receivers.

7.2 Simulation results for case of known signal waveform

In this subsection, the positioning accuracy of the algorithm given in Section 5 is compared with that of the second algorithm in [27], which assumes the signal waveform is known. These are referred to as algorithm II in the following figures. Additionally, for a fair comparison, the DOAs in the simulated two-step localization algorithm are estimated by the algorithm presented in [56], which assumes a known waveform. As stated above, when the unknowns to be determined are DOAs, we need to express the array manifold in (1)–(3) as a function of direction, following [1,2,3]; this is easily achieved because the position vector p is a function of DOA. The array model error results from sensor gain and phase uncertainties, corresponding to case 2 in Section 4 of [39]. Both gain and phase perturbations are IID zero-mean Gaussian random variables. The standard deviations of the gain and phase errors are denoted as σG and σP, respectively. We assume σG = 0.01σP hereafter; thus, if σP changes, σG changes accordingly.
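A hedged sketch of this gain/phase error model is given below; modeling the gain error as an additive perturbation of a unit nominal gain is our assumption, as are all names:

```python
import numpy as np

def gain_phase_errors(M, sigma_P, rng=None):
    """IID zero-mean Gaussian gain and phase errors per sensor, with
    sigma_G = 0.01 * sigma_P; returns the complex factor multiplying
    each sensor's nominal response."""
    rng = rng if rng is not None else np.random.default_rng()
    sigma_G = 0.01 * sigma_P
    gain = 1.0 + sigma_G * rng.standard_normal(M)   # perturbed gain (nominal 1)
    phase = sigma_P * rng.standard_normal(M)        # phase error in radians
    return gain * np.exp(1j * phase)
```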

In the first set of experiments, the location system consists of three base stations located at [500, 2500, 0] m, [0, 0, 0] m, and [500, − 2500, 0] m. The emitter is positioned at [1500, 500, 2500] m. The signal waveforms are generated in the same way as described in Subsection 7.1, but they are known to the receivers. Each base station is equipped with a UCA. The channel attenuation magnitude is fixed at 1, and the channel phase is selected at random from a uniform distribution over [−π, π). Additionally, unless otherwise specified, we adopt the following simulation parameters: (1) K = 128 samples; (2) SNR = 10 dB; (3) M = 6 sensors; (4) σP = 0.1 rad; and (5) array radius equal to the wavelength. Figures 7, 8, 9, 10, and 11 plot the RMSEs of the localization methods against the SNR of the emitter signal, number of snapshots K, standard deviation of the sensor phase errors σP, number of array elements M, and ratio of array radius to wavelength, respectively.

Fig. 7 RMSEs of localization methods and the corresponding CRBs as a function of the SNR of the emitter signal for the case of a known signal waveform

Fig. 8 RMSEs of localization methods and the corresponding CRBs as a function of the number of snapshots for the case of a known signal waveform

Fig. 9 RMSEs of localization methods and the corresponding CRBs as a function of the standard deviation of sensor phase errors for the case of a known signal waveform

Fig. 10 RMSEs of localization methods and the corresponding CRBs as a function of the number of array elements for the case of a known signal waveform

Fig. 11 RMSEs of localization methods and the corresponding CRBs as a function of the ratio of array radius to wavelength for the case of a known signal waveform

The RMSEs in Figs. 7, 8, 9, 10, and 11 clearly demonstrate the superior performance of algorithm II in this paper over algorithm II in [27] and the two-step localization algorithm. Moreover, the performance improvement becomes more pronounced as the SNR and sensor model errors increase and as the array aperture decreases. The impact of array model mismatch is effectively mitigated by our algorithm. Additionally, the proposed algorithm yields a solution that attains the CRB accuracy provided by (81) at moderate noise and error levels.

In the second experiment, the same simulation parameters are used, but the source location changes. The emitter position coordinates are set as [1500, 500, 2500] + α∙[200, 200, 200] m, where α varies from 0 to 10. Again, the source moves farther away from the receivers as α increases. Figure 12 shows the RMSEs of the localization methods with respect to α.

Fig. 12 RMSEs of localization methods and the corresponding CRBs as a function of α for the case of a known signal waveform

The performance gain of the proposed algorithm over the other algorithms is corroborated by the results in Fig. 12. Moreover, the RMSE improvement of the new algorithm over algorithm II in [27] becomes more pronounced as the source moves farther away from the receivers. This observation is consistent with the results in Fig. 6.

7.3 Running time of the localization methods

In this subsection, the runtimes of all the localization methods considered in the experiments are compared. All simulations were implemented in MATLAB R2016b on a ThinkPad laptop equipped with an Intel Core i7-7500 CPU and 8 GB RAM.

First, we adopt the simulation settings used to produce Fig. 2. Figure 13 depicts the average runtime of the considered localization methods as a function of the number of snapshots. Second, the simulation parameters used to produce Fig. 8 are applied. Figure 14 compares the average runtime of the considered localization methods against the number of snapshots.

Fig. 13 Average running time as a function of the snapshot number (the first simulation results)

Fig. 14 Average running time as a function of the snapshot number (the second simulation results)

It is easily observed from Figs. 13 and 14 that the runtime of the proposed DPD methods is significantly higher than that of the other methods. This is because the model errors are taken into account, and a large number of nuisance variables must be optimized in addition to the source position vector. The two-step localization method has the shortest runtime among the compared methods, mainly because it is a decentralized method.

8 Discussion

From the simulation results described above, we observe that the proposed algorithms effectively mitigate the effects of array model mismatch and clearly outperform the algorithms in [27] at high SNRs and for large sensor position errors. Moreover, the gain in localization accuracy of our algorithms increases as the source moves farther away from the receivers. The estimation performance of the developed algorithms attains the corresponding CRB at moderate noise and error levels. However, the complexity of the proposed algorithms is higher than that of the algorithms in [27], because they account for the array model errors and involve additional computational procedures. Finally, it should be mentioned that the present study deals only with the single-source situation. In future work, we intend to extend the proposed methods to the multiple-source scenario.

9 Conclusions

This paper presents robust DPD methods that reduce the negative effect of array model mismatch. The idea behind the proposed technique is similar to that of array error auto-calibration methods, which assume that a prior statistical distribution of the array model errors is available. We consider two localization cases: the case of an a priori known signal waveform and the more realistic case where the transmitted signals are unknown to the location system. The corresponding MAP estimators for the two cases are formulated, and two effective alternating minimization algorithms are developed to locate the source directly from the signals captured simultaneously at several antenna arrays. The proposed methods follow the Bayesian framework given in [47,48,49,50]. Moreover, to verify the asymptotic efficiency of the new methods, the CRB expressions for position estimation are derived for both unknown and known signal waveforms. Simulation results confirm that the proposed algorithms attain the CRB accuracy and considerably reduce the effect of array model errors.

Abbreviations

3D: Three-dimensional
CRB: Cramér-Rao bound
DFT: Discrete Fourier transform
DOA: Direction-of-arrival
DPD: Direct position determination
EXIP: Extended invariance principle
FDOA: Frequency difference of arrival
FFT: Fast Fourier transform
FIM: Fisher information matrix
FOA: Frequency of arrival
GROA: Gain ratios of arrival
IID: Independent and identically distributed
MAP: Maximum a posteriori
ML: Maximum likelihood
MSE: Mean square error
RMSE: Root mean square error
RSS: Received signal strength
SNR: Signal-to-noise ratio
TDOA: Time difference of arrival
TOA: Time of arrival
TS: Taylor series
UCA: Uniform circular array

References

1. RO Schmidt, Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 34(3), 267–280 (1986)
2. P Stoica, A Nehorai, MUSIC, maximum likelihood, and Cramér-Rao bound. IEEE Trans. Acoust. Speech Signal Process. 37(5), 720–741 (1989)
3. M Viberg, B Ottersten, Sensor array processing based on subspace fitting. IEEE Trans. Signal Process. 39(5), 1110–1121 (1991)
4. SC Nardone, ML Graham, A closed-form solution to bearings-only target motion analysis. IEEE J. Ocean. Eng. 22(1), 168–178 (1997)
5. D Kutluyil, Bearings-only target localization using total least squares. Signal Process. 85(9), 1695–1710 (2005)
6. Z Lin, T Han, R Zheng, M Fu, Distributed localization for 2-D sensor networks with bearing-only measurements under switching topologies. IEEE Trans. Signal Process. 64(23), 6345–6359 (2016)
7. L Yang, KC Ho, An approximately efficient TDOA localization algorithm in closed-form for locating multiple disjoint sources with erroneous sensor positions. IEEE Trans. Signal Process. 57(12), 4598–4615 (2009)
8. W Jiang, C Xu, L Pei, W Yu, Multidimensional scaling-based TDOA localization scheme using an auxiliary line. IEEE Signal Process. Lett. 23(4), 546–550 (2016)
9. KW Cheung, HC So, YT Chan, Least squares algorithms for time-of-arrival-based mobile location. IEEE Trans. Signal Process. 52(4), 1121–1128 (2004)
10. H Shen, Z Ding, S Dasgupta, C Zhao, Multiple source localization in wireless sensor networks based on time of arrival measurement. IEEE Trans. Signal Process. 62(8), 1938–1949 (2014)
11. KC Ho, X Lu, L Kovavisaruch, Source localization using TDOA and FDOA measurements in the presence of receiver location errors: analysis and solution. IEEE Trans. Signal Process. 55(2), 684–696 (2007)
12. G Wang, Y Li, N Ansari, A semidefinite relaxation method for source localization using TDOA and FDOA measurements. IEEE Trans. Veh. Technol. 62(2), 853–862 (2013)
13. J Mason, Algebraic two-satellite TOA/FOA position solution on an ellipsoidal earth. IEEE Trans. Aerosp. Electron. Syst. 40(7), 1087–1092 (2004)
14. KC Ho, M Sun, An accurate algebraic closed-form solution for energy-based source localization. IEEE Trans. Audio Speech Lang. Process. 15(8), 2542–2550 (2007)
15. KC Ho, M Sun, Passive source localization using time differences of arrival and gain ratios of arrival. IEEE Trans. Signal Process. 56(2), 464–477 (2008)
16. M Wax, T Kailath, Decentralized processing in sensor arrays. IEEE Trans. Signal Process. 33(4), 1123–1129 (1985)
17. P Stoica, On reparametrization of loss functions used in estimation and the invariance principle. Signal Process. 17(4), 383–387 (1989)
18. A Amar, AJ Weiss, Localization of narrowband radio emitters based on Doppler frequency shifts. IEEE Trans. Signal Process. 56(11), 5500–5508 (2008)
19. D Wang, Y Wu, Statistical performance analysis of direct position determination method based on Doppler shifts in presence of model errors. Multidim. Syst. Sign. Process. 28(1), 149–182 (2017)
20. E Tzoreff, AJ Weiss, Expectation-maximization algorithm for direct position determination. Signal Process. 97(4), 32–39 (2017)
21. N Vankayalapati, S Kay, Q Ding, TDOA based direct positioning maximum likelihood estimator and the Cramer-Rao bound. IEEE Trans. Aerosp. Electron. Syst. 50(3), 1616–1634 (2014)
22. W Xia, W Liu, LF Zhu, Distributed adaptive direct position determination based on diffusion framework. J. Syst. Eng. Electron. 27(1), 28–38 (2016)
23. AJ Weiss, Direct geolocation of wideband emitters based on delay and Doppler. IEEE Trans. Signal Process. 59(6), 2513–5520 (2011)
24. M Pourhomayoun, ML Fowler, Distributed computation for direct position determination emitter location. IEEE Trans. Aerosp. Electron. Syst. 50(4), 2878–2889 (2014)
25. O Bar-Shalom, AJ Weiss, Emitter geolocation using single moving receiver. Signal Process. 94(12), 70–83 (2014)
26. JZ Li, L Yang, FC Guo, WL Jiang, Coherent summation of multiple short-time signals for direct positioning of a wideband source based on delay and Doppler. Digital Signal Process. 48(1), 58–70 (2016)
27. AJ Weiss, Direct position determination of narrowband radio frequency transmitters. IEEE Signal Process. Lett. 11(5), 513–516 (2004)
28. AJ Weiss, A Amar, Direct position determination of multiple radio signals. EURASIP J. Appl. Signal Process. 2005(1), 37–49 (2005)
29. A Amar, AJ Weiss, A decoupled algorithm for geolocation of multiple emitters. Signal Process. 87(10), 2348–2359 (2007)
30. T Tirer, AJ Weiss, High resolution direct position determination of radio frequency sources. IEEE Signal Process. Lett. 23(2), 192–196 (2016)
31. L Tzafri, AJ Weiss, High-resolution direct position determination using MVDR. IEEE Trans. Wireless Commun. 15(9), 6449–6461 (2016)
32. T Tirer, AJ Weiss, Performance analysis of a high-resolution direct position determination method. IEEE Trans. Signal Process. 65(3), 544–554 (2017)
33. O Bar-Shalom, AJ Weiss, Efficient direct position determination of orthogonal frequency division multiplexing signals. IET Radar Sonar Navig. 3(2), 101–111 (2009)
34. AM Reuven, AJ Weiss, Direct position determination of cyclostationary signals. Signal Process. 89(12), 2448–2464 (2009)
35. JX Yin, W Ying, W Ding, Direct position determination of multiple noncircular sources with a moving array. Circuits Syst. Signal Process. 36(10), 4050–4076 (2017)
36. JX Yin, W Ding, W Ying, XW Yao, ML-based single-step estimation of the locations of strictly noncircular sources. Digital Signal Process. 69(10), 224–236 (2017)
37. M Oispuu, U Nickel, Direct detection and position determination of multiple sources with intermittent emission. Signal Process. 90(12), 3056–3064 (2010)
38. B Friedlander, A sensitivity analysis of the MUSIC algorithm. IEEE Trans. Acoust. Speech Signal Process. 38(11), 1740–1751 (1990)
39. A Swindlehurst, T Kailath, A performance analysis of subspace-based methods in the presence of model errors, part I: the MUSIC algorithm. IEEE Trans. Signal Process. 40(7), 1758–1773 (1992)
40. A Ferréol, P Larzabal, M Viberg, Statistical analysis of the MUSIC algorithm in the presence of modeling errors, taking into account the resolution probability. IEEE Trans. Signal Process. 58(8), 4156–4166 (2010)
41. M Khodja, A Belouchrani, K Abed-Meraim, Performance analysis for time-frequency MUSIC algorithm in presence of both additive noise and array calibration errors. EURASIP J. Adv. Signal Process. 2012(1), 94–104 (2012)

    Article  Google Scholar 

  42. V Inghelbrecht, J Verhaevert, T van Hecke, H Rogier, The influence of random element displacement on DOA estimates obtained with (Khatri-Rao-) root-MUSIC[J]. Sensors 14(11), 21258–21280 (2014).

    Article  Google Scholar 

  43. A Amar, AJ Weiss, Analysis of the direct position determination approach in the presence of model errors[A]. Proceedings of the IEEE Convention on Electrical and Electronics Engineers[C] (IEEE Press, Tel Aviv, 2004), pp. 408–411.

    Google Scholar 

  44. A Amar, AJ Weiss, Analysis of direct position determination approach in the presence of model errors[A]. Proceedings of the IEEE Workshop on Statistical Signal Processing[C] (IEEE Press, Novosibirsk, 2005), pp. 521–524.

    Google Scholar 

  45. A Amar, AJ Weiss, Direct position determination in the presence of model errors—known waveforms[J]. Digital Signal Process. 16(1), 52–83 (2006).

    Article  Google Scholar 

  46. W Ding, HY Yu, ZD Wu, C Wang, Performance analysis of the direct position determination method in the presence of array model errors[J]. Sensors 17(7), 1550–1590 (2017).

    Article  Google Scholar 

  47. B Wahlberg, B Ottersten, M Viberg, Robust signal parameter estimation in the presence of array perturbations[A]. Proceedings of the IEEE International conference on acoustic, speech and signal processing[C], vol 5 (IEEE Press, Toronto, 1991), pp. 3277–3280.

    Google Scholar 

  48. M Viberg, AL Swindlehurst, A Bayesian approach to auto-calibration for parametric array signal processing[J]. IEEE Trans. Signal Process. 42(12), 3495–3507 (1994).

    Article  Google Scholar 

  49. A Flieller, A Ferréol, P Larzabal, H Clergeot, Robust bearing estimation in the presence of direction-dependent modeling errors: identifiability and treatment[A]. Proceedings of the IEEE International Conference on Acoustic, Speech and Signal Processing[C] (IEEE Press, Detroit, 1995), pp. 1884–1887.

    Google Scholar 

  50. M Jansson, AL Swindlehurst, B Ottersten, Weighted subspace fitting for general array error models[J]. IEEE Trans. Signal Process. 46(9), 2484–2498 (1998).

    Article  Google Scholar 

  51. EK Hung, Matrix-construction calibration method for antenna arrays[J]. IEEE Trans. Aerosp. Electron. Syst. 36(3), 819–828 (2000).

    Article  Google Scholar 

  52. QM Bao, C KO C, WJ Zhi, DOA estimation under unknown mutual coupling and multipath[J]. IEEE Trans. Aerosp. Electron. Syst. 41(2), 565–573 (2005).

    Article  Google Scholar 

  53. J Zheng, YC Wu, Joint time synchronization and localization of an unknown node in wireless sensor networks[J]. IEEE Trans. Signal Process. 58(3), 1309–1320 (2010).

    Article  MathSciNet  Google Scholar 

  54. B Friedlander, AJ Weiss, Direction finding in the presence of mutual coupling[J]. IEEE Trans. Antennas Propag. 39(3), 273–284 (1991).

    Article  Google Scholar 

  55. J Li, RT Compton, Maximum likelihood angle estimation for signals with known waveforms[J]. IEEE Trans. Signal Process. 41(9), 2850–2862 (1993).

    Article  MATH  Google Scholar 

  56. J Li, B Halder, P Stoica, M Viberg, Computationally efficient angle estimation for signals with known waveforms[J]. IEEE Trans. Signal Process. 43(9), 2154–2163 (1995).

    Article  Google Scholar 

  57. Y Rockah, Schultheiss, Array shape calibration using sources in unknown locations—part I: far-field sources[J]. IEEE Trans. Acoust. Speech Signal Process. 35(3), 286–299 (1987).

    Article  Google Scholar 

  58. JX Zhu, H Wang, Effects of sensor position and pattern perturbations on CRLB for direction finding of multiple narrowband sources[A]. Proceedings of the Fourth IEEE Workshop on Spectrum Estimation and Modeling[C] (IEEE Press, Minneapolis, 1988), pp. 98–102.

    Google Scholar 

  59. XN Lu, KC Ho, Taylor-series technique for source localization using AOAs in the presence of sensor location errors[A]. Proceedings of the Fourth IEEE Workshop on Sensor Array and Multichannel Processing[C] (IEEE Press, Waltham, 2006), pp. 190–194.

    Google Scholar 

  60. M Kaveh, AJ Barabell, The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise[J]. IEEE Trans. Acoust. Speech Signal Process. 34(2), 331–341 (1986).

    Article  Google Scholar 


Acknowledgements

The authors would like to thank the editorial board and the reviewers for their consideration and careful review of this manuscript. The authors also thank Stuart Jenkinson, PhD, from Liwen Bianji, Edanz Group China (www.liwenbianji.cn/ac), for editing the English text of a draft of this manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61201381, 61401513, and 61772548), the China Postdoctoral Science Foundation (Grant No. 2016M592989), the Self-Topic Foundation of Information Engineering University (Grant No. 2016600701), and the Outstanding Youth Foundation of Information Engineering University (Grant No. 2016603201).

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.

Author information


Contributions

DW and JY wrote the manuscript, and RL helped with the writing and the theoretical derivations. ZW and TT were in charge of the experiments and their results. All authors contributed significantly to the development of the early ideas and the design of the final algorithms. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jiexin Yin.

Ethics declarations

Ethics approval and consent to participate

All data and procedures in this paper comply with the ethical standards of the research community. This paper does not contain any studies with human participants or animals performed by any of the authors.

Consent for publication

Informed consent was obtained from all authors included in the study.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

1.1 Proof of Proposition 1

Assume that the normalized eigenvectors of the matrix \( \widehat{\mathbf{Z}} \) are \( {\widehat{\mathbf{v}}}_1,{\widehat{\mathbf{v}}}_2,\cdots, {\widehat{\mathbf{v}}}_n \). Applying the Hermitian matrix eigen-perturbation theory [60], we have

$$ \left\{\begin{array}{l}{\widehat{\lambda}}_j={\lambda}_j+{\delta \lambda}_j^{(1)}+{\delta \lambda}_j^{(2)}+\cdots \\ {}{\widehat{\mathbf{v}}}_j=\left(1-\frac{1}{2}\cdot \sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\left|{\xi}_{ji}^{(1)}\right|}^2\right){\mathbf{v}}_j+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}{\mathbf{v}}_i+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(2)}{\mathbf{v}}_i+\cdots \end{array}\right. $$
(82)

where \( {\delta \lambda}_j^{(1)} \) and \( {\xi}_{ji}^{(1)} \) are the first-order perturbation terms, i.e., \( {\delta \lambda}_j^{(1)}=O\left({\left\Vert \boldsymbol{\updelta} \mathbf{Z}\right\Vert}_2\right) \) and \( {\xi}_{ji}^{(1)}=O\left({\left\Vert \boldsymbol{\updelta} \mathbf{Z}\right\Vert}_2\right) \), and \( {\delta \lambda}_j^{(2)} \) and \( {\xi}_{ji}^{(2)} \) are the second-order perturbation terms, i.e., \( {\delta \lambda}_j^{(2)}=O\left({\left\Vert \boldsymbol{\updelta} \mathbf{Z}\right\Vert}_2^2\right) \) and \( {\xi}_{ji}^{(2)}=O\left({\left\Vert \boldsymbol{\updelta} \mathbf{Z}\right\Vert}_2^2\right) \). It follows from the matrix eigen-equation that

$$ {\displaystyle \begin{array}{c}{\widehat{\mathbf{Z}}\widehat{\mathbf{v}}}_j={\widehat{\lambda}}_j{\widehat{\mathbf{v}}}_j\iff \left(\mathbf{Z}+\boldsymbol{\updelta} \mathbf{Z}\right)\kern0.1em \left(\left(1-\frac{1}{2}\cdot \sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\left|{\xi}_{ji}^{(1)}\right|}^2\right){\mathbf{v}}_j+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}{\mathbf{v}}_i+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(2)}{\mathbf{v}}_i+\cdots \right)\\ {}=\left({\lambda}_j+{\delta \lambda}_j^{(1)}+{\delta \lambda}_j^{(2)}+\cdots \right)\;\left(\left(1-\frac{1}{2}\cdot \sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\left|{\xi}_{ji}^{(1)}\right|}^2\right){\mathbf{v}}_j+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}{\mathbf{v}}_i+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(2)}{\mathbf{v}}_i+\cdots \right)\end{array}} $$
(83)

Comparing the first-order perturbation terms on both sides of (83), we observe that

$$ \sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}{\lambda}_i{\mathbf{v}}_i+\boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j=\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}{\lambda}_j{\mathbf{v}}_i+{\mathbf{v}}_j\cdot {\delta \lambda}_j^{(1)} $$
(84)

Premultiplying both sides of (84) by \( {\mathbf{v}}_j^{\mathrm{H}} \) and by \( {\mathbf{v}}_i^{\mathrm{H}}\ \left(i\ne j\right) \), respectively, leads to

$$ \left\{\begin{array}{l}{\delta \lambda}_j^{(1)}={\mathbf{v}}_j^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j\\ {}{\xi}_{ji}^{(1)}=\frac{{\mathbf{v}}_i^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j}{\lambda_j-{\lambda}_i}\end{array}\right. $$
(85)

Furthermore, comparing the second-order perturbation terms on both sides of (83), it follows that

$$ \sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(2)}{\lambda}_i{\mathbf{v}}_i+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_i=\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(2)}{\lambda}_j{\mathbf{v}}_i+\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}\cdot {\delta \lambda}_j^{(1)}\cdot {\mathbf{v}}_i+{\delta \lambda}_j^{(2)}\cdot {\mathbf{v}}_j $$
(86)

Premultiplying both sides of (86) by \( {\mathbf{v}}_j^{\mathrm{H}} \) and using (85) yields

$$ {\delta \lambda}_j^{(2)}=\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n{\xi}_{ji}^{(1)}{\mathbf{v}}_j^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_i={\mathbf{v}}_j^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot \left(\sum \limits_{\begin{array}{c}i=1\\ {}i\ne j\end{array}}^n\frac{{\mathbf{v}}_i{\mathbf{v}}_i^{\mathrm{H}}}{\lambda_j-{\lambda}_i}\right)\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j={\mathbf{v}}_j^{\mathrm{H}}\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{V}}_j\cdot \boldsymbol{\updelta} \mathbf{Z}\cdot {\mathbf{v}}_j $$
(87)

Combining (82), (85) and (87) completes the proof.
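As a sanity check, Proposition 1 can be verified numerically. The following minimal Python sketch (our illustration, not part of the original derivation; all variable names are ours) draws a random Hermitian matrix Z and a small Hermitian perturbation dZ, and confirms that the eigenvalue residual drops by one order in the perturbation norm each time the first-order correction (85) and the second-order correction (87) are included.

import numpy as np

rng = np.random.default_rng(0)
n = 5

# random Hermitian Z with (almost surely) distinct eigenvalues
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z = (A + A.conj().T) / 2
# small Hermitian perturbation dZ
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dZ = 1e-3 * (B + B.conj().T) / 2

lam, V = np.linalg.eigh(Z)            # exact eigen-pairs of Z
lam_hat = np.linalg.eigvalsh(Z + dZ)  # exact eigenvalues of the perturbed matrix

j = 2
vj = V[:, j]

# first-order term (85): v_j^H dZ v_j
d1 = (vj.conj() @ dZ @ vj).real
# second-order term (87): v_j^H dZ V_j dZ v_j with V_j = sum_{i != j} v_i v_i^H / (lam_j - lam_i)
Vj = sum(np.outer(V[:, i], V[:, i].conj()) / (lam[j] - lam[i])
         for i in range(n) if i != j)
d2 = (vj.conj() @ dZ @ Vj @ dZ @ vj).real

print(abs(lam_hat[j] - lam[j]))            # O(||dZ||)
print(abs(lam_hat[j] - lam[j] - d1))       # O(||dZ||^2)
print(abs(lam_hat[j] - lam[j] - d1 - d2))  # O(||dZ||^3)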

Appendix 2

1.1 Proof of (33)

Assume that \( {\widehat{\boldsymbol{\upmu}}}_n^{(i)} \) is the estimate of \( {\boldsymbol{\upmu}}_n \) at the ith iteration. To obtain the updated estimate of \( {\boldsymbol{\upmu}}_n \) at the (i + 1)th iteration, we substitute the first-order Taylor series expansion of \( {g}_n\left({\boldsymbol{\upmu}}_n\right) \) around \( {\widehat{\boldsymbol{\upmu}}}_n^{(i)} \) into (31), which yields the linearized criterion (88). Noting that \( {\boldsymbol{\upmu}}_n-{\widehat{\boldsymbol{\upmu}}}_n^{(i)} \) is a real vector, we can reformulate (88) as (89). Since (89) is a linear least-squares problem, its optimal solution satisfies (90), which shows that (33) holds. [The displays (88)–(90) are rendered as images in the source and are not reproduced here.]
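The linearize-and-solve update in (88)–(90) is the standard Gauss-Newton step for a nonlinear least-squares problem. The short Python sketch below illustrates the mechanics with a hypothetical residual function r(mu) and its Jacobian; the actual \( {g}_n\left({\boldsymbol{\upmu}}_n\right) \) and criterion (31) are not reproduced here, so this is a schematic of the update rule only.

import numpy as np

def r(mu):
    # hypothetical nonlinear residual, R^2 -> R^3 (for illustration only)
    return np.array([np.sin(mu[0]) - 0.5,
                     mu[0] * mu[1] - 1.0,
                     np.cos(mu[1]) - 0.2])

def jac(mu):
    # analytic Jacobian of r
    return np.array([[np.cos(mu[0]), 0.0],
                     [mu[1], mu[0]],
                     [0.0, -np.sin(mu[1])]])

mu = np.array([0.3, 2.0])        # initial estimate, cf. mu-hat^(i)
for _ in range(20):
    J, res = jac(mu), r(mu)
    # linearized least-squares step, cf. (89)-(90): minimize ||res + J d||^2 over d
    d, *_ = np.linalg.lstsq(J, -res, rcond=None)
    mu = mu + d                  # updated estimate, cf. mu-hat^(i+1)

print(mu, np.linalg.norm(r(mu)))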

Appendix 3

1.1 Proof of Proposition 2

From the definition of \( {y}_{\mathrm{m}}\left(\mathbf{z}\right) \), it follows that

$$ {f}_{\mathrm{m}}\left(\mathbf{z}\right)=\underset{y}{\max}\left\{f\left(\mathbf{z},y\right)\right\}=f\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right) $$
(91)

Then, the gradient of \( {f}_{\mathrm{m}}\left(\mathbf{z}\right) \) can be written as

$$ {\mathbf{f}}_{\mathrm{m}}\left(\mathbf{z}\right)=\frac{\partial {f}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}={\dot{\mathbf{f}}}_1\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)+{\dot{f}}_2\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)\cdot \frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}} $$
(92)

From (92), we can obtain the Hessian matrix of \( {f}_{\mathrm{m}}\left(\mathbf{z}\right) \) as

$$ {\displaystyle \begin{array}{c}{\mathbf{F}}_{\mathrm{m}}\left(\mathbf{z}\right)=\frac{\partial^2{f}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}\partial {\mathbf{z}}^{\mathrm{T}}}=\frac{\partial {\mathbf{f}}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial {\mathbf{z}}^{\mathrm{T}}}={\ddot{\mathbf{F}}}_{11}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)+{\ddot{\mathbf{f}}}_{21}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)\cdot {\left(\frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}\right)}^{\mathrm{T}}\\ {}+\frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}\cdot {\ddot{\mathbf{f}}}_{21}^{\mathrm{T}}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)+{\ddot{f}}_{22}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)\cdot \frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}\cdot {\left(\frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}\right)}^{\mathrm{T}}+{\dot{f}}_2\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)\cdot \frac{\partial^2{y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}\partial {\mathbf{z}}^{\mathrm{T}}}\end{array}} $$
(93)

On the other hand, using the definition of \( {y}_{\mathrm{m}}\left(\mathbf{z}\right) \) again, we have

$$ {\dot{f}}_2\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)={\left.\frac{\partial f\left(\mathbf{z},y\right)}{\partial y}\right|}_{y={y}_{\mathrm{m}}\left(\mathbf{z}\right)}=0 $$
(94)

Substituting (94) into (92) proves the first equality in (48). Taking the derivative with respect to z on both sides of (94) yields

$$ {\ddot{\mathbf{f}}}_{21}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)+{\ddot{f}}_{22}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)\cdot \frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}={\mathbf{O}}_{m\times 1} $$
(95)

which implies

$$ \frac{\partial {y}_{\mathrm{m}}\left(\mathbf{z}\right)}{\partial \mathbf{z}}=-\frac{{\ddot{\mathbf{f}}}_{21}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)}{{\ddot{f}}_{22}\left(\mathbf{z},{y}_{\mathrm{m}}\left(\mathbf{z}\right)\right)} $$
(96)

The second equality in (48) is proved by inserting (94) and (96) into (93). This completes the proof of Proposition 2.
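Proposition 2 is an envelope-type result: at the inner maximizer the cross derivative in y contributes nothing to the gradient, and the Hessian of the concentrated function acquires a Schur-complement correction. The toy Python check below verifies both equalities in (48) for an assumed f(z, y) that is strictly concave in y (our illustration, not the paper's cost function).

import numpy as np

def g(z):
    return z[0]**2 + z[1]

def f(z, y):
    # toy objective, strictly concave in y; the inner maximizer is y_m(z) = g(z)
    return -0.5 * (y - g(z))**2 + np.sin(z[0])

def f_m(z):
    # concentrated function f(z, y_m(z)); here it reduces to sin(z[0])
    return f(z, g(z))

z0 = np.array([0.4, -0.7])
h = 1e-5

# finite-difference gradient of f_m versus the first equality in (48)
num_grad = np.array([(f_m(z0 + h*e) - f_m(z0 - h*e)) / (2*h) for e in np.eye(2)])
ana_grad = np.array([np.cos(z0[0]), 0.0])   # partial gradient of f in z at y = y_m(z)
print(num_grad - ana_grad)                  # ~0

# second equality in (48): Hessian of f_m via the Schur-type correction
gz = np.array([2*z0[0], 1.0])                             # gradient of g
F11 = -np.outer(gz, gz) + np.diag([-np.sin(z0[0]), 0.0])  # d^2 f / (dz dz^T) at y = y_m(z)
f21 = gz                                                  # d^2 f / (dz dy) at y = y_m(z)
f22 = -1.0                                                # d^2 f / dy^2
F_m = F11 - np.outer(f21, f21) / f22
print(F_m - np.diag([-np.sin(z0[0]), 0.0]))               # ~0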

Appendix 4

1.1 Detailed derivation of matrices in (72)

It can be seen from (72) that, in order to calculate matrices \( {\mathbf{Z}}_1^{\left(\mathrm{a}\right)} \), \( {\mathbf{Z}}_2^{\left(\mathrm{a}\right)} \), and \( {\mathbf{Z}}_3^{\left(\mathrm{a}\right)} \), we must derive the explicit expressions for matrices \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\mathbf{p}} \), \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}} \), \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}} \), \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \), and \( {\mathbf{T}}_{\boldsymbol{\upmu}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \).

Using (70), after some algebraic manipulation we have

$$ {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\mathbf{p}}=\sum \limits_{n=1}^N\sum \limits_{k=1}^K{\left|{\beta}_n\right|}^2\cdot {\left|{\overline{s}}_k\right|}^2\cdot {\left(\frac{\partial {\mathbf{a}}_{n,k}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}}\right)}^{\mathrm{H}}\cdot \frac{\partial {\mathbf{a}}_{n,k}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}} $$
(97)
The corresponding expressions for \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta}\right\}} \), \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}} \), \( {\mathbf{T}}_{\mathbf{p}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta}\right\}} \), \( {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}} \), and \( {\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \) are given in (98)–(103), respectively. [These displays are rendered as images in the source and are not reproduced here.] In addition,

$$ {\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}=\sum \limits_{n=1}^N{\left|{\beta}_n\right|}^2\cdot \operatorname{diag}\left[{\left\Vert {\mathbf{a}}_{n,1}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)\right\Vert}_2^2\kern1.5em {\left\Vert {\mathbf{a}}_{n,2}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)\right\Vert}_2^2\kern1.5em \cdots \kern1.5em {\left\Vert {\mathbf{a}}_{n,K}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)\right\Vert}_2^2\right] $$
(104)
The remaining terms \( {\mathbf{T}}_{\operatorname{Re}\left\{\overline{\mathbf{s}}\right\}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \) and \( {\mathbf{T}}_{\boldsymbol{\upmu}}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \) are given in (105) and (106), respectively. [These displays are likewise rendered as images in the source.]
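For orientation, the sketch below shows how Gram blocks of this kind typically enter the CRB computation: the blocks are stacked into one (scaled) Fisher matrix, and the position block of the bound is the inverse of a Schur complement over the nuisance parameters. The Jacobian blocks here are random stand-ins and the noise-dependent scale factor is omitted; the exact block definitions of \( {\mathbf{Z}}_1^{\left(\mathrm{a}\right)} \), \( {\mathbf{Z}}_2^{\left(\mathrm{a}\right)} \), and \( {\mathbf{Z}}_3^{\left(\mathrm{a}\right)} \) are those of (72).

import numpy as np

rng = np.random.default_rng(1)
m = 40   # number of stacked complex observations (illustrative)
# stand-in Jacobian blocks: derivatives w.r.t. p (2 parameters) and w.r.t. the
# nuisance parameters Re{beta}, Re{s-bar}, mu (7 parameters here, arbitrarily)
T_p   = rng.standard_normal((m, 2)) + 1j * rng.standard_normal((m, 2))
T_nui = rng.standard_normal((m, 7)) + 1j * rng.standard_normal((m, 7))

T = np.hstack([T_p, T_nui])
J = (T.conj().T @ T).real                # Gram blocks as in (97)-(106), up to a noise scale

Jpp, Jpn, Jnn = J[:2, :2], J[:2, 2:], J[2:, 2:]
J_eq = Jpp - Jpn @ np.linalg.solve(Jnn, Jpn.T)   # equivalent Fisher matrix for p
CRB_p = np.linalg.inv(J_eq)              # position block of the CRB (to scale)
print(CRB_p)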

Appendix 5

1.1 Detailed derivation of matrices in (80)

From (80) and the results in Appendix 4, in order to calculate matrices \( {\mathbf{Z}}_1^{\left(\mathrm{b}\right)} \), \( {\mathbf{Z}}_2^{\left(\mathrm{b}\right)} \), and \( {\mathbf{Z}}_3^{\left(\mathrm{b}\right)} \), we need to derive the closed-form expressions for matrices \( {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{t_0} \), \( {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{\mathbf{p}} \), \( {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta} \right\}} \), and \( {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \).

Combining (70) and (78), after some algebraic manipulation we obtain

$$ {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{t_0}=\sum \limits_{n=1}^N\sum \limits_{k=1}^K{\omega}_k^2\cdot {\left|{\beta}_n\right|}^2\cdot {\left|{\overline{s}}_k\right|}^2\cdot {\left\Vert {\mathbf{a}}_{n,k}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)\right\Vert}_2^2 $$
(107)
$$ {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{\mathbf{p}}=\sum \limits_{n=1}^N\sum \limits_{k=1}^K\mathrm{j}{\omega}_k\cdot {\left|{\beta}_n\right|}^2\cdot {\left|{\overline{s}}_k\right|}^2\cdot {\left({\mathbf{a}}_{n,k}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)\right)}^{\mathrm{H}}\cdot \frac{\partial {\mathbf{a}}_{n,k}^{\prime}\left(\mathbf{p},{\boldsymbol{\upmu}}_{n,0}\right)}{\partial {\mathbf{p}}^{\mathrm{T}}} $$
(108)
The remaining cross terms \( {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{\operatorname{Re}\left\{\boldsymbol{\upbeta}\right\}} \) and \( {\mathbf{T}}_{t_0}^{\mathrm{H}}{\mathbf{T}}_{\boldsymbol{\upmu}} \) are given in (109) and (110), respectively. [These displays are rendered as images in the source and are not reproduced here.]
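Appendix 5 makes a practical point: when the transmission time t0 is also unknown, the Fisher matrix of Appendix 4 is simply bordered by one new row and column built from (107)–(110), and the previously derived blocks are reused unchanged. A minimal illustration with random stand-in blocks (standard complex-Gaussian FIM conventions, which may differ from the paper's exact scaling):

import numpy as np

rng = np.random.default_rng(2)
m = 40
# stand-in for the stacked blocks [T_p | T_Re{beta} | T_Re{s-bar} | T_mu] of Appendix 4
T_old = rng.standard_normal((m, 9)) + 1j * rng.standard_normal((m, 9))
# stand-in for the new derivative column w.r.t. t0
T_t0 = rng.standard_normal((m, 1)) + 1j * rng.standard_normal((m, 1))

J_old = (T_old.conj().T @ T_old).real    # blocks already available from Appendix 4
j_tt = (T_t0.conj().T @ T_t0).real       # scalar block, cf. (107)
j_tx = (T_t0.conj().T @ T_old).real      # cross terms, cf. (108)-(110)

# bordered Fisher matrix for the extended parameter vector [t0, p, Re{beta}, Re{s-bar}, mu]
J_ext = np.block([[j_tt, j_tx],
                  [j_tx.T, J_old]])
print(J_ext.shape)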

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Wang, D., Yin, J., Liu, R. et al. Robust direct position determination methods in the presence of array model errors. EURASIP J. Adv. Signal Process. 2018, 38 (2018). https://doi.org/10.1186/s13634-018-0555-7


Keywords