EURASIP Journal on Applied Signal Processing 2003:4, 359–370. © 2003 Hindawi Publishing Corporation

Acoustic Source Localization and Beamforming: Theory and Practice

We consider the theoretical and practical aspects of locating acoustic sources using an array of microphones. A maximum-likelihood (ML) direct localization is obtained when the sound source is near the array, while in the far-field case, we demonstrate the localization via the cross bearing from several widely separated arrays. In the case of multiple sources, an alternating projection procedure is applied to determine the ML estimate of the DOAs from the observed data. The ML estimator is shown to be effective in locating sound sources of various types, for example, vehicle, music, and even white noise. From the theoretical Cramér-Rao bound analysis, we find that better source location estimates can be obtained for high-frequency signals than low-frequency signals. In addition, large range estimation error results when the source signal is unknown, but such unknown parameter does not have much impact on angle estimation. Much experimentally measured acoustic data was used to verify the proposed algorithms.


INTRODUCTION
Acoustic source localization has been an active research area for many years. Applications include unattended ground sensor (UGS) networks for military surveillance and reconnaissance, and for intrusion detection around the perimeter of a plant [1]. Many variations of algorithms using a microphone array for source localization in the near field as well as direction-of-arrival (DOA) estimation in the far field have been proposed [2]. Many of these techniques involve a relative time-delay estimation step that is followed by a least squares (LS) fit to the source DOA, or in the near-field case, an LS fit to the source location [3,4,5,6,7].
In our previous paper [8], we derived the "optimal" parametric maximum likelihood (ML) solution to locate acoustic sources in the near field and provided computer simulations to show its superiority in performance over other methods. This paper is an extension of [8], where both the far-and the near-field cases are considered, and the theoretical analysis is provided by the Cramér-Rao bound (CRB), which is useful for both performance comparison and basic understanding purposes. In addition, several experiments have been conducted to verify the usefulness of the proposed algorithm. These experiments include both indoor and outdoor scenarios with half a dozen microphones to locate one or two acoustic sources (sound generated by computer speaker(s)).
One major advantage of the proposed ML approach is that it avoids the intermediate relative time-delay estimation. This is made possible by transforming the wideband data to the frequency domain, where the signal spectrum can be represented by the narrowband model for each frequency bin. This allows a direct optimization for the source location(s) under the assumption of Gaussian noise instead of the two-step optimization that involves the relative time-delay estimation. The difficulty in obtaining relative time delays in the case of multiple sources is well known, and by avoiding this step, the proposed approach can then estimate multiple source locations. However, in practice, when we apply the discrete Fourier transform (DFT), several artifacts can result due to the finite length of the data frame (see Section 2.1.1). As a result, there does not exist an exact ML solution for data of finite length. Instead, we ignore these finite-length effects and derive the solution which we refer to as the approximated ML (AML) solution. Note that a similar solution has been derived independently in [9] for the far-field case.
In practice, the number of sources may be determined independently of or together with the localization algorithm, but here we assume that it is known for the purpose of this paper. For the single-source case, we have shown in [8] that the AML formulation is equivalent to maximizing the sum of the weighted cross-correlation functions between time-shifted sensor data. The optimization using all sensor pairs mitigates the ambiguity problem that often arises in the relative time-delay estimation between two widely separated sensors for the two-step LS methods. In the case of multiple sources, we apply an efficient alternating projection (AP) procedure, which avoids the multidimensional search by sequentially estimating the location of one source while fixing the estimates of other source locations from the previous iteration. In this paper, we demonstrate the localization results of applying the AML method to measured data, both in the near-field and far-field cases, and for various types of sound sources, for example, vehicle, music, and even white noise. The AML approach is shown to outperform the LS-type algorithms in the single-source case, and by applying AP, the proposed algorithm is able to locate two sound sources from the observed data.
The paper is organized as follows. In Section 2, the theoretical performances of DOA estimation and source localization with the CRB analysis are given. Then, we derive the AML solution for DOA estimation and source localization in Section 3. In Section 4, simulation examples and experimental results are given to demonstrate the usefulness of the proposed method. Finally, we give our conclusions.

THEORETICAL PERFORMANCE AND ANALYSIS
In this section, the theoretical performances of DOA estimation for the far-field case and of source localization for the near-field case are analyzed. First, we define the signal models for the far-and near-field cases. Then, the CRBs are derived and analyzed. The CRB is most often used as a theoretical lower bound for any unbiased estimator [10]. Most of the derivations of the CRB for wideband source localization found in the literature are in terms of relative time-delay estimation error. In the following, we derive a more general CRB directly from the signal model. By developing a theoretical lower bound in terms of signal characteristics and array geometry, we not only bypass the involvement of the intermediate time-delay estimator but also offer useful insights to the physical properties of the problem.
The DOA and source localization variances both depend on two separate parts, one that depends only on the signal and another that depends only on the array geometry. This suggests separate performance dependence on the signal and the geometry. Thus, for any given signal, the CRB can provide the theoretical performance of a particular geometry and helps the design of an array configuration for a particular scenario of interest. The signal dependence part shows that theoretically the DOA and source location root mean square (RMS) errors are linearly proportional to the noise level and the speed of propagation, and inversely proportional to the source spectrum and frequency. Thus, better DOA and source location estimates can be obtained for high-frequency signals than low-frequency signals. In further sensitivity analysis, large range estimation error is found when the source signal is unknown, but such an unknown parameter does not affect the angle estimation.

Figure 1: Far-field example with randomly distributed sensors.
The CRB analysis also shows that the uniformly spaced circular array provides an attractive geometry for good overall performance. When a circular array is used, the DOA variance bound is independent of the source direction, and it also does not degrade when the speed of propagation is unknown. An effective beamwidth for DOA estimation can also be given by the CRB. The beamwidth provides a measure of how densely the angles should be sampled for the AML metric evaluation, and thus avoids unneeded iterations of numerical techniques.
Throughout this paper, we denote superscript T as the transpose, H as the complex conjugate transpose, and * as the complex conjugate operation.

Signal model of the far- and near-field cases

The far-field case
When the source is in the far field of the array, the wavefront is assumed to be planar and only the angle information can be estimated. In this case, we use the array centroid as the reference point and define a signal model based on the relative time delays from this position. For simplicity, we assume a randomly distributed planar (2D) array of $R$ sensors, each at position $\mathbf{r}_p = [x_p, y_p]^T$, as depicted in Figure 1. The centroid position is given by $\mathbf{r}_c = \frac{1}{R}\sum_{p=1}^{R}\mathbf{r}_p = [x_c, y_c]^T$. The sensors are assumed to be omnidirectional and have identical responses. On the same plane as the array, we assume that there are $M$ sources ($M < R$), each at an angle $\phi_s^{(m)}$ from the array, for $m = 1, \ldots, M$. The angle convention is such that north is 0 degrees and east is 90 degrees. The relative time delay of the $m$th source is given by $t_{cp}^{(m)} = t_c^{(m)} - t_p^{(m)}$, where $t_c^{(m)}$ and $t_p^{(m)}$ are the absolute time delays from the $m$th source to the centroid and the $p$th sensor, respectively, and $v$ is the speed of propagation in length unit per sample. The data collected by the $p$th sensor at time $n$ can be given by

$$x_p(n) = \sum_{m=1}^{M} s_c^{(m)}\bigl(n - t_{cp}^{(m)}\bigr) + w_p(n), \tag{1}$$

for $n = 0, \ldots, L-1$ and $p = 1, \ldots, R$, where $s_c^{(m)}$ is the $m$th source signal arriving at the array centroid, $t_{cp}^{(m)}$ is allowed to be any real-valued number, and $w_p$ is the zero-mean white Gaussian noise with variance $\sigma^2$.
For the ease of derivation and analysis, the wideband signal model should be given in the frequency domain, where a narrowband model can be given for each frequency bin. A block of L samples in each sensor data can be transformed to the frequency domain by a DFT of length N. It is well known that the DFT creates a circular time shift when applying a linear phase shift in the frequency domain. However, the time delay in the array data corresponds to a linear time shift, thus creating a mismatch in the signal model, which we refer to as an edge effect. When N = L, severe edge effect results for small L, but it becomes a good approximation for large L. We can apply zero padding for small L to remove such edge effect, that is, N ≥ L + τ, where τ is the maximum relative time delay among all sensor pairs. However, the zero padding removes the orthogonality of the noise component across frequency. In practice, the size of L is limited due to the nonstationarity of the source location. In the following, we assume that either L is large enough or the noise is almost uncorrelated across frequency. Note that the CRB derived based on this frequency-domain model is idealistic and does not take this edge effect into account.
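The edge effect described above can be illustrated with a toy block of data. The following is a minimal sketch, not the authors' implementation: it delays a short block by an integer number of samples using a linear phase shift in the DFT domain, once with $N = L$ and once with zero padding to $N = L + \tau$, and compares both against the true linear delay. The function names (`dft`, `idft`, `phase_shift_delay`) and the toy values are hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath

def dft(x, n_fft):
    """Naive length-n_fft DFT of x, zero-padded when n_fft > len(x)."""
    x = list(x) + [0.0] * (n_fft - len(x))
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_fft) for n in range(n_fft))
            for k in range(n_fft)]

def idft(X):
    """Naive inverse DFT."""
    n_fft = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_fft) for k in range(n_fft)) / n_fft
            for n in range(n_fft)]

def phase_shift_delay(x, delay, n_fft):
    """Delay x by `delay` samples via a linear phase shift in the DFT domain."""
    X = dft(x, n_fft)
    Y = [X[k] * cmath.exp(-2j * cmath.pi * k * delay / n_fft) for k in range(n_fft)]
    return [y.real for y in idft(Y)]

L, delay = 8, 3
block = [float(n + 1) for n in range(L)]       # toy data block 1, 2, ..., 8
linear = [0.0] * delay + block                 # true linear delay, length L + delay

no_pad = phase_shift_delay(block, delay, L)            # N = L: samples wrap around
padded = phase_shift_delay(block, delay, L + delay)    # N = L + tau: no wrap-around

# Largest deviation from the true linearly delayed block in each case.
edge_error_no_pad = max(abs(a - b) for a, b in zip(no_pad, linear[:L]))
edge_error_padded = max(abs(a - b) for a, b in zip(padded, linear))
```

With $N = L$ the last samples of the block wrap around to the front (a circular shift), whereas the zero-padded transform reproduces the linear delay up to rounding error. The sketch does not model the loss of noise orthogonality across frequency that the zero padding introduces.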
In the frequency domain, the array signal model is given by

$$\mathbf{X}(k) = \mathbf{D}(k)\mathbf{S}_c(k) + \boldsymbol{\eta}(k), \tag{2}$$

for $k = 0, \ldots, N-1$, where the array data spectrum is given by $\mathbf{X}(k) = [X_1(k), \ldots, X_R(k)]^T$, the steering matrix is given by $\mathbf{D}(k) = [\mathbf{d}^{(1)}(k), \ldots, \mathbf{d}^{(M)}(k)]$, the steering vector is given by $\mathbf{d}^{(m)}(k) = [d_1^{(m)}(k), \ldots, d_R^{(m)}(k)]^T$ with $d_p^{(m)}(k) = e^{-j2\pi k t_{cp}^{(m)}/N}$, and the source spectrum is given by $\mathbf{S}_c(k) = [S_c^{(1)}(k), \ldots, S_c^{(M)}(k)]^T$. The noise spectrum vector $\boldsymbol{\eta}(k)$ is zero-mean complex white Gaussian, distributed with variance $L\sigma^2$. Note that, due to the transformation to the frequency domain, $\boldsymbol{\eta}(k)$ asymptotically approaches a Gaussian distribution by the central limit theorem even if the actual time-domain noise has an arbitrary i.i.d. distribution (with bounded variance) other than Gaussian. This asymptotic property in the frequency domain provides a more reliable noise model than the time-domain model in some practical cases. For convenience of notation, we define $\mathbf{S}(k) = \mathbf{D}(k)\mathbf{S}_c(k)$. By stacking up the $N/2$ positive frequency bins (the zero frequency bin is not important and the negative frequency bins are merely mirror images) of the signal model in (2) into a single column, we can rewrite the sensor data into an $NR/2 \times 1$ space-temporal frequency vector as $\mathbf{X} = \mathbf{G}(\Theta) + \boldsymbol{\xi}$, where $\mathbf{G}(\Theta) = [\mathbf{S}(1)^T, \ldots, \mathbf{S}(N/2)^T]^T$, and $\mathbf{R}_\xi = E[\boldsymbol{\xi}\boldsymbol{\xi}^H] = L\sigma^2\mathbf{I}_{NR/2}$.

The near-field case
In the near-field case, the range information can also be estimated in addition to the DOA. Denote $\mathbf{r}_{s_m}$ as the location of the $m$th source, and in this case we use this as the reference point instead of the array centroid. Since we consider near-field sources, the signal strength at each sensor can be different due to nonuniform spatial loss in the near-field geometry. The sensors are again assumed to be omnidirectional and have identical responses. In this case, the data collected by the $p$th sensor at time $n$ can be given by

$$x_p(n) = \sum_{m=1}^{M} a_p^{(m)} s_0^{(m)}\bigl(n - t_p^{(m)}\bigr) + w_p(n), \tag{3}$$

for $n = 0, \ldots, L-1$ and $p = 1, \ldots, R$, where $a_p^{(m)}$ is the signal-gain level of the $m$th source at the $p$th sensor (assumed to be constant within the block of data), $s_0^{(m)}$ is the source signal, and $t_p^{(m)}$ is allowed to be any real-valued number. The time delay is defined by $t_p^{(m)} = \|\mathbf{r}_{s_m} - \mathbf{r}_p\|/v$, and the relative time delay between the $p$th and the $q$th sensors is defined by $t_{pq}^{(m)} = t_p^{(m)} - t_q^{(m)}$. With the same edge-effect problem mentioned above, the frequency-domain model for the near-field case is given by

$$\mathbf{X}(k) = \mathbf{D}(k)\mathbf{S}_0(k) + \boldsymbol{\eta}(k), \tag{4}$$

for $k = 0, \ldots, N-1$, where each element of the steering vector now becomes $d_p^{(m)}(k) = a_p^{(m)} e^{-j2\pi k t_p^{(m)}/N}$, and the source spectrum is given by $\mathbf{S}_0(k) = [S_0^{(1)}(k), \ldots, S_0^{(M)}(k)]^T$.

Cramér-Rao bound for DOA estimation
In the following CRB derivation, we consider the single-source case ($M = 1$) under three conditions: known signal and known speed of propagation, known signal but unknown speed of propagation, and known speed of propagation but unknown signal. The comparison of the three conditions provides a sensitivity analysis of different parameters. Only the single-source case is considered since valuable analysis can be obtained using a single source while the analytic expression of the multiple-sources case becomes much more complicated. The far-field frequency-domain signal model for the single-source case is given by

$$\mathbf{X}(k) = \mathbf{d}(k) S_c(k) + \boldsymbol{\eta}(k), \tag{5}$$

for $k = 0, \ldots, N-1$. After considering all the positive frequency bins, we can construct the Fisher information matrix [10] by

$$\mathbf{F} = 2\,\mathrm{Re}\bigl[\mathbf{H}^H \mathbf{R}_\xi^{-1} \mathbf{H}\bigr] = \frac{2}{L\sigma^2}\,\mathrm{Re}\bigl[\mathbf{H}^H \mathbf{H}\bigr], \tag{6}$$

where $\mathbf{H} = \partial\mathbf{G}/\partial\phi_s$ for the case of known signal and known speed of propagation. In this case, the Fisher information matrix is indeed a scalar $F_{\phi_s} = \zeta\alpha$, where $\zeta = \frac{2}{L\sigma^2 v^2}\sum_{k=1}^{N/2}\bigl(2\pi k |S_c(k)|/N\bigr)^2$ is the scale factor that is proportional to the total power in the derivative of the source signal, and $\alpha = \sum_{p=1}^{R} b_p^2$ is the geometry factor that depends on the array and the source direction, where $b_p = v\,\partial t_{cp}/\partial\phi_s$. Hence, for any arbitrary array, the RMS error bound for DOA estimation is given by $\sigma_{\phi_s} \ge 1/\sqrt{\zeta\alpha}$. The geometry factor $\alpha$ provides a measure of geometric relations between the source and the sensor array. Poor array geometry may lead to a small $\alpha$, which results in large estimation variance. It is clear from the scale factor $\zeta$ that the performance depends not solely on the SNR but also on the signal bandwidth and spectral density. Thus, source localization performance is better for signals with more energy in the high frequencies.
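In the case of unknown source signal, the matrix $\mathbf{H} = [\partial\mathbf{G}/\partial\phi_s,\ \partial\mathbf{G}/\partial|\mathbf{S}_c|^T,\ \partial\mathbf{G}/\partial\boldsymbol{\Phi}_c^T]$, where $\mathbf{S}_c = [S_c(1), \ldots, S_c(N/2)]^T$, and $|\mathbf{S}_c|$ and $\boldsymbol{\Phi}_c$ are the magnitude and phase part of $\mathbf{S}_c$, respectively. The resulting bound after applying the well-known block matrix inversion lemma (see [11, Appendix]) on $\mathbf{F}_{\phi_s,\mathbf{S}_c}$ is given by $\sigma_{\phi_s}^2 \ge 1/[\zeta(\alpha - z_{\mathbf{S}_c})]$, where $z_{\mathbf{S}_c} = \frac{1}{R}\bigl[\sum_{p=1}^{R} b_p\bigr]^2$ is the penalty term due to the unknown source signal. It is known that the DOA performance does not degrade when the source signal is unknown; indeed, we can show that $z_{\mathbf{S}_c}$ is zero, that is, $\sum_{p=1}^{R} b_p = 0$, because the delays are measured relative to the array centroid. Note that the above analysis is valid for any arbitrary array. When the speed of propagation is unknown, the matrix $\mathbf{H} = [\partial\mathbf{G}/\partial\phi_s,\ \partial\mathbf{G}/\partial v]$, and the resulting bound after applying the matrix inversion lemma on $\mathbf{F}_{\phi_s,v}$ is given by $\sigma_{\phi_s}^2 \ge 1/[\zeta(\alpha - z_v)]$, where $z_v = \frac{1}{\sum_{p=1}^{R} t_{cp}^2}\bigl[\sum_{p=1}^{R} b_p t_{cp}\bigr]^2$ is the penalty term due to the unknown speed of propagation. This penalty term is not necessarily zero for any arbitrary array, but it becomes zero for a uniformly spaced circular array.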
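The geometry-dependent quantities in this section can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: a plane-wave delay of sensor $p$ relative to the centroid of the form $[(x_c - x_p)\sin\phi_s + (y_c - y_p)\cos\phi_s]/v$ (the sign convention is an assumption of this sketch), $b_p$ taken as $v$ times the derivative of that delay with respect to $\phi_s$, geometry factor $\alpha = \sum_p b_p^2$, and penalty terms $(\sum_p b_p)^2/R$ for the unknown signal and $(\sum_p b_p t_{cp})^2/\sum_p t_{cp}^2$ for the unknown speed. The function name `doa_geometry_terms` and the example arrays are hypothetical.

```python
import math

def doa_geometry_terms(sensors, doa_deg, v):
    """Return (alpha, z_signal, z_speed) for a single far-field source."""
    R = len(sensors)
    xc = sum(x for x, _ in sensors) / R
    yc = sum(y for _, y in sensors) / R
    phi = math.radians(doa_deg)
    # Assumed plane-wave delay of each sensor relative to the array centroid.
    t = [((xc - x) * math.sin(phi) + (yc - y) * math.cos(phi)) / v for x, y in sensors]
    # b_p = v * d(t_cp)/d(phi) under the same assumption.
    b = [(xc - x) * math.cos(phi) - (yc - y) * math.sin(phi) for x, y in sensors]
    alpha = sum(bp * bp for bp in b)
    z_signal = sum(b) ** 2 / R
    z_speed = sum(bp * tp for bp, tp in zip(b, t)) ** 2 / sum(tp * tp for tp in t)
    return alpha, z_signal, z_speed

v = 0.345                      # speed of propagation in m/sample (345 m/s at 1 kHz)
R, rho = 7, 0.1                # uniformly spaced circular array, radius in metres
circular = [(rho * math.sin(2 * math.pi * p / R), rho * math.cos(2 * math.pi * p / R))
            for p in range(R)]
random_array = [(1, 1), (2, 0.8), (3, 1.4), (1.5, 3), (1, 2.5)]

alpha_c, zs_c, zv_c = doa_geometry_terms(circular, 30.0, v)
alpha_r, zs_r, zv_r = doa_geometry_terms(random_array, 30.0, v)
```

Under these assumptions the circular array gives $\alpha = \rho^2 R/2$ with both penalty terms vanishing, while the irregular array keeps a zero signal penalty but a nonzero speed penalty.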

The circular-array case
In the following, we show the CRB for a uniformly spaced circular array. Not only can a simple analytic form be given, but this geometry is also optimal for DOA estimation. The variance of the DOA estimation is independent of the source direction, and also does not degrade when the speed of propagation is unknown. Without loss of generality, we pick the array centroid as the origin, that is, $\mathbf{r}_c = [0, 0]^T$. The location of the $p$th sensor is given by $\mathbf{r}_p = [\rho\sin\phi_p, \rho\cos\phi_p]^T$, where $\rho$ is the radius of the circular array, $\phi_p = 2\pi p/R + \phi_0$ is the angle of the $p$th sensor with respect to north, and $\phi_0$ is the angle that defines the orientation of the array. Then, $\alpha = \rho^2 R/2$. The DOA variance bound is given by $\sigma_{\phi_s}^2$ (circular array) $\ge 2/(\zeta\rho^2 R)$, which is independent of the source direction. It is useful to define two terms for a better interpretation of the CRB: the normalized root weighted mean squared (nrwms) source frequency $k_{\mathrm{nrwms}}$ and the effective beamwidth $\phi_{\mathrm{BW}}$. The RMS error bound for DOA estimation can then be given in terms of the effective beamwidth and the effective SNR. This shows that the effective beamwidth is proportional to the speed of propagation and inversely proportional to the circular array radius and the nrwms source frequency. For example, taking $v = 345/1000 = 0.345$ m/sample, $N = 256$, $\rho = 0.1$ m, and $k_{\mathrm{nrwms}} = 0.78$ gives $\phi_{\mathrm{BW}} = 2.8$ degrees. If we use a larger circular array where $\rho = 0.5$ m, then $\phi_{\mathrm{BW}} = 0.6$ degrees. The effective beamwidth is useful to determine the angular sampling for the AML maximization. This avoids excessive sampling in the angular space and also prevents further iterations on the AML maximization. Based on the angular sampling by the effective beamwidth, a quadratic polynomial interpolation (concave function) of three points can yield the DOA estimate easily (see Appendix A). The explicit analytical form of the CRB for the circular array is also applicable to a randomly distributed 2D array.
For instance, we can compute the RMS distance of the sensors from their centroid and use that as the radius $\rho$ in the circular array formula to obtain the effective beamwidth and thus estimate the performance of a randomly distributed 2D array. As an example, for a randomly distributed array of 5 sensors at positions {(1, 1), (2, 0.8), (3, 1.4), (1.5, 3), (1, 2.5)}, the RMS distance of the sensors to the centroid is 1.14. Since we cannot obtain an explicit analytical form for this random array, we can simply use the circular array formula with $\rho = 1.14$ to obtain the effective beamwidth $\phi_{\mathrm{BW}}$. For some random arrays, the DOA variance depends highly on the source direction, and an elliptical model is better than the circular one (see Appendix B).
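The equivalent radius quoted for the five-sensor example can be reproduced in a few lines. This is a small sketch with a hypothetical helper name (`rms_radius`); it only computes the RMS distance of the sensors from their centroid.

```python
import math

def rms_radius(sensors):
    """RMS distance of the sensors from their centroid."""
    R = len(sensors)
    xc = sum(x for x, _ in sensors) / R
    yc = sum(y for _, y in sensors) / R
    return math.sqrt(sum((x - xc) ** 2 + (y - yc) ** 2 for x, y in sensors) / R)

sensors = [(1, 1), (2, 0.8), (3, 1.4), (1.5, 3), (1, 2.5)]
rho_equivalent = rms_radius(sensors)   # equivalent circular-array radius
```

The centroid here is (1.7, 1.74), and the resulting radius rounds to the 1.14 stated in the text.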

CRB for source localization
For the near-field case, we also consider the CRB for a single source under three different conditions. The source signal $\mathbf{S}_c$ and steering vector in the far-field case are replaced by $\mathbf{S}_0$ and by the steering vector with signal-gain level $a_p$ in the signal component $\mathbf{G}$, respectively. For the first case, we can construct the Fisher information matrix by (6), where $\mathbf{H} = \partial\mathbf{G}/\partial\mathbf{r}_s^T$, assuming that $\mathbf{r}_s$ is the only unknown. In this case, $\mathbf{F}_{\mathbf{r}_s} = \zeta\mathbf{A}$, where $\mathbf{A} = \sum_{p=1}^{R} a_p^2\,\mathbf{u}_p\mathbf{u}_p^T$ is the array matrix and $\mathbf{u}_p = (\mathbf{r}_s - \mathbf{r}_p)/\|\mathbf{r}_s - \mathbf{r}_p\|$. The $\mathbf{A}$ matrix provides a measure of geometric relations between the source and the sensor array. Poor array geometry may lead to degeneration in the rank of matrix $\mathbf{A}$. Note that the near-field CRB has the same dependence $\zeta$ on the signal as the far-field case.
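When the speed of propagation is also unknown, that is, $\Theta = [\mathbf{r}_s^T, v]^T$, the matrix $\mathbf{H} = [\partial\mathbf{G}/\partial\mathbf{r}_s^T,\ \partial\mathbf{G}/\partial v]$. The Fisher information block matrix for this case is given by

$$\mathbf{F}_{\mathbf{r}_s,v} = \zeta\begin{bmatrix}\mathbf{A} & -\mathbf{U}\mathbf{A}_a\mathbf{t}\\ -\mathbf{t}^T\mathbf{A}_a\mathbf{U}^T & \mathbf{t}^T\mathbf{A}_a\mathbf{t}\end{bmatrix},$$

where $\mathbf{U} = [\mathbf{u}_1, \ldots, \mathbf{u}_R]$, $\mathbf{A}_a = \mathrm{diag}(a_1^2, \ldots, a_R^2)$, and $\mathbf{t} = [t_1, \ldots, t_R]^T$. By applying the block matrix inversion lemma, the leading $D \times D$ submatrix of the inverse Fisher information block matrix can be given by $[\mathbf{F}^{-1}_{\mathbf{r}_s,v}]_{11:DD} = \frac{1}{\zeta}(\mathbf{A} - \mathbf{Z}_v)^{-1}$, where the penalty matrix due to the unknown speed of propagation is defined by $\mathbf{Z}_v = \frac{1}{\mathbf{t}^T\mathbf{A}_a\mathbf{t}}\,\mathbf{U}\mathbf{A}_a\mathbf{t}\mathbf{t}^T\mathbf{A}_a\mathbf{U}^T$. The matrix $\mathbf{Z}_v$ is nonnegative definite; therefore, the source localization error bound in the unknown speed of propagation case is never smaller than that of the known case. When the source signal is also unknown, that is, $\Theta = [\mathbf{r}_s^T, |\mathbf{S}_0|^T, \boldsymbol{\Phi}_0^T]^T$, the matrix $\mathbf{H} = [\partial\mathbf{G}/\partial\mathbf{r}_s^T,\ \partial\mathbf{G}/\partial|\mathbf{S}_0|^T,\ \partial\mathbf{G}/\partial\boldsymbol{\Phi}_0^T]$, where $\mathbf{S}_0 = [S_0(1), \ldots, S_0(N/2)]^T$, and $|\mathbf{S}_0|$ and $\boldsymbol{\Phi}_0$ are the magnitude and phase part of $\mathbf{S}_0$, respectively. The Fisher information matrix can then be explicitly given by

$$\mathbf{F}_{\mathbf{r}_s,\mathbf{S}_0} = \begin{bmatrix}\zeta\mathbf{A} & \mathbf{B}\\ \mathbf{B}^T & \mathbf{D}\end{bmatrix},$$

where $\mathbf{B}$ and $\mathbf{D}$ are not explicitly given since they are not needed in the final expression. By applying the block matrix inversion lemma, the leading $D \times D$ submatrix of the inverse Fisher information block matrix can be given by $[\mathbf{F}^{-1}_{\mathbf{r}_s,\mathbf{S}_0}]_{11:DD} = \frac{1}{\zeta}(\mathbf{A} - \mathbf{Z}_{\mathbf{S}_0})^{-1}$, where the penalty matrix due to the unknown source signal is defined by $\mathbf{Z}_{\mathbf{S}_0} = \frac{1}{\sum_{p=1}^{R} a_p^2}\bigl(\sum_{p=1}^{R} a_p^2\mathbf{u}_p\bigr)\bigl(\sum_{p=1}^{R} a_p^2\mathbf{u}_p\bigr)^T$. The CRB with the unknown source signal is never smaller than that with the known source signal, as discussed below. It can be easily shown that $[\mathbf{F}^{-1}_{\mathbf{r}_s,\mathbf{S}_0}]_{11:DD} \ge \mathbf{F}^{-1}_{\mathbf{r}_s}$ since the penalty matrix $\mathbf{Z}_{\mathbf{S}_0}$ is nonnegative definite. The $\mathbf{Z}_{\mathbf{S}_0}$ matrix acts as a penalty term since it is the average of the square of weighted $\mathbf{u}_p$ vectors. The estimation variance is larger when the source is far away since the $\mathbf{u}_p$ vectors are similar in direction and generate a larger penalty matrix, that is, the $\mathbf{u}_p$ vectors add up. When the source is inside the convex hull of the sensor array, the estimation variance is smaller since $\mathbf{Z}_{\mathbf{S}_0}$ approaches zero, that is, the $\mathbf{u}_p$ vectors cancel each other. For the 2D case, the CRB for the distance error of the estimated location $[\hat{x}_s, \hat{y}_s]^T$ from the true source location can be given by $\sigma_d^2 \ge [\mathbf{F}^{-1}_{\mathbf{r}_s,\mathbf{S}_0}]_{11} + [\mathbf{F}^{-1}_{\mathbf{r}_s,\mathbf{S}_0}]_{22}$, where $d^2 = (\hat{x}_s - x_s)^2 + (\hat{y}_s - y_s)^2$.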
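The behaviour of the unknown-signal penalty with source position can be illustrated numerically. The following is a rough sketch under stated assumptions: the array matrix is taken as the sum of $a_p^2\mathbf{u}_p\mathbf{u}_p^T$ and the penalty matrix as the outer product of the weighted sum of $a_p^2\mathbf{u}_p$ divided by the total gain power, with uniform gains and a hypothetical six-sensor circular array. The helper `penalty_ratio` is not from the paper; it compares the traces of the two matrices.

```python
import math

def penalty_ratio(sensors, gains, source):
    """Return trace(Z) / trace(A) for a 2D near-field source (0 = no penalty)."""
    sx, sy = source
    wx = wy = 0.0          # weighted sum of a_p^2 * u_p
    trace_a = 0.0
    for (x, y), a in zip(sensors, gains):
        dx, dy = sx - x, sy - y
        dist = math.hypot(dx, dy)
        ux, uy = dx / dist, dy / dist          # unit vector from sensor to source
        trace_a += a * a                       # trace of a^2 u u^T, since |u| = 1
        wx += a * a * ux
        wy += a * a * uy
    gain_power = sum(a * a for a in gains)
    trace_z = (wx * wx + wy * wy) / gain_power
    return trace_z / trace_a

R = 6
sensors = [(math.sin(2 * math.pi * p / R), math.cos(2 * math.pi * p / R)) for p in range(R)]
gains = [1.0] * R

ratio_inside = penalty_ratio(sensors, gains, (0.0, 0.0))    # source at the array centre
ratio_far = penalty_ratio(sensors, gains, (50.0, 0.0))      # source far outside the array
```

For the source at the centre of the array the unit vectors cancel and the ratio is essentially zero; for the distant source the unit vectors nearly coincide and the ratio approaches one, so the difference between the two matrices becomes nearly singular along the line of sight.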
By further expanding the parameter space, the CRB for multiple source localization can also be derived, but its analytical expression is much more complicated and will not be considered here. The case of the unknown signal and the unknown speed of propagation is also not shown due to its complicated form and its numerical similarity to the unknown signal case. Note that when both the source signal and sensor gains are unknown, it is not possible to determine the values of the source signal and the sensor gains uniquely (they can only be estimated up to a scaled constant).

The circular-array case
In the following, we again consider the uniformly spaced circular array with radius $\rho$ for the near-field CRB. Assume that the source is at distance $r_s$ from the array centroid that is large enough so that the signal-gain levels are uniform, that is, $a_p = a$. Consider the 2D case of unknown source signal, and without loss of generality, let the line of sight (LOS) be the X-axis and let the cross line of sight (CLOS) be the Y-axis. Then, the error covariance matrix is given by the leading $2 \times 2$ block $[\mathbf{F}^{-1}_{\mathbf{r}_s,\mathbf{S}_0}]_{11:22}$ (circular array). The intermediate approximations are given in Appendix C.
The above result shows that as $r_s$ increases, the LOS error increases much faster than the CLOS error. For any arbitrary source location, the LOS error is always uncorrelated with the CLOS error. The variance of the DOA estimation is given by $\sigma_{\phi_s}^2 = \sigma_{\mathrm{CLOS}}^2/r_s^2 \approx 2/(\zeta R a^2 \rho^2)$, which is the same as the far-field case for $a = 1$. The ratio of the CLOS and LOS error can provide a quantitative measure to differentiate far field from near field. For example, define far field as the case when the ratio $r_s/\rho > \gamma$. Then, for a given circular array, we can define far field as the case when the source range exceeds the array radius $\gamma$ times. The explicit analytical form of the circular array CRB in the near-field case is again useful for a randomly distributed 2D array. In the near-field case, the location error bound can be represented by an ellipse, where its major axis represents the LOS error and its minor axis represents the CLOS error.

Derivation of the ML solution
The derivation of the AML solution for real-valued signals generated by wideband sources is an extension of the classical ML DOA estimator for narrowband signals. Due to the wideband nature of the signal, the AML metric results in a combination of each subband. In the following derivation, the near-field signal model is used for source localization, and the DOA estimation formulation is merely the result of a trivial substitution.
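We assume initially that the unknown parameter space $\Theta$ contains the source locations $\mathbf{r}_s$ and the source signal spectra $\mathbf{S}_0(k)$, $k = 1, \ldots, N/2$. Under the Gaussian noise model with $\mathbf{R}_\xi = L\sigma^2\mathbf{I}$, maximizing the log-likelihood $\mathcal{L}(\Theta) = -\|\mathbf{X} - \mathbf{G}(\Theta)\|^2$ (up to constants) is equivalent to finding $\min_{\mathbf{r}_s, \mathbf{S}_0(k)} f(k)$ for all $k$ bins, where $f(k) = \|\mathbf{X}(k) - \mathbf{D}(k)\mathbf{S}_0(k)\|^2$. The minima of $f(k)$, with respect to the source signal vector $\mathbf{S}_0(k)$, must satisfy $\partial f(k)/\partial\mathbf{S}_0^H(k) = 0$, hence the estimate of the source signal vector which yields the minimum residual at any source location is given by

$$\hat{\mathbf{S}}_0(k) = \mathbf{D}^{\dagger}(k)\mathbf{X}(k), \tag{21}$$

where $\mathbf{D}^{\dagger}(k) = (\mathbf{D}^H(k)\mathbf{D}(k))^{-1}\mathbf{D}^H(k)$ is the pseudoinverse of the steering matrix. Substituting (21) back into $f(k)$ gives the AML source location estimate

$$\hat{\mathbf{r}}_s = \arg\max_{\mathbf{r}_s} J(\mathbf{r}_s), \qquad J(\mathbf{r}_s) = \sum_{k=1}^{N/2}\bigl\|\mathbf{P}(k, \mathbf{r}_s)\mathbf{X}(k)\bigr\|^2, \tag{22}$$

where $\mathbf{P}(k, \mathbf{r}_s) = \mathbf{D}(k)\mathbf{D}^{\dagger}(k)$ is the orthogonal projection onto the column space of $\mathbf{D}(k)$. Note that the AML metric $J(\mathbf{r}_s)$ has an implicit form for the estimation of $\mathbf{S}_0(k)$, whereas the metric $\mathcal{L}(\Theta)$ shows the explicit form. Once the AML estimate of $\mathbf{r}_s$ is obtained, the AML estimate of the source signals can be given by (21). Similarly, in the far-field case, the unknown parameter vector contains only the DOAs, that is, $\boldsymbol{\phi}_s = [\phi_s^{(1)}, \ldots, \phi_s^{(M)}]^T$. Thus, the AML DOA estimation can be obtained by $\arg\max_{\boldsymbol{\phi}_s}\sum_{k=1}^{N/2}\|\mathbf{P}(k, \boldsymbol{\phi}_s)\mathbf{X}(k)\|^2$. It is interesting that, when zero padding is applied, the covariance matrix $\mathbf{R}_\xi$ is no longer diagonal and is indeed singular; thus, an exact ML solution cannot be derived without the inverse of $\mathbf{R}_\xi$. In the above formulation, we derive the AML solution using only a single block. A different AML solution using multiple blocks could also be formed with some possible computational advantages. When the speed of propagation is unknown, as in the case of seismic media, we may expand the unknown parameter space to include it, that is, $\Theta = [\mathbf{r}_s^T, v]^T$.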

Single-source case
In the single-source case, the AML metric in (22) becomes $J(\mathbf{r}_s) = \sum_{k=1}^{N/2}|B(k, \mathbf{r}_s)|^2$, where $B(k, \mathbf{r}_s) = \bar{\mathbf{d}}(k, \mathbf{r}_s)^H\mathbf{X}(k)$ is the beam-steered beamformer output in the frequency domain [12], $\bar{\mathbf{d}} = \mathbf{d}/\sqrt{\sum_{p=1}^{R} a_p^2}$ is the normalized steering vector, and $\bar{a}_p = a_p/\sqrt{\sum_{p=1}^{R} a_p^2}$ is the normalized signal-gain level at the $p$th sensor. It is interesting to note that in the near-field case, the AML beamformer output is the result of forming a focused spot (or area) on the source location rather than a beam since the range is also considered. In the far-field case, the AML metric becomes $J(\phi_s)$. In [8], the AML criterion is shown to be equivalent to maximizing the weighted cross correlations between sensor data, which is commonly used for estimating relative time delays.
The source location can be estimated as the location at which $J(\mathbf{r}_s)$ is maximized over a given set of candidate locations. A normalized version of the metric is useful to verify estimated peak values. Without any prior information on the possible region of the source location, the AML metric should be evaluated on a set of grid points. A nonuniform grid is suggested to reduce the number of grid points. For the 2D case, polar coordinates with nonuniform sampling of the range and uniform sampling of the angle can be transformed to Cartesian coordinates that are dense near the array and sparse away from the array. When the crude estimate of the source location is obtained from the grid-point search, iterative methods can be applied to reach the global maximum (without running into local maxima, given an appropriate choice of grid points). In some cases, grid-point search is not necessary since a good initial location estimate is available from, for example, the estimate of the previous data frame for a slowly moving source. In this paper, we consider the Nelder-Mead direct search method [13] for the purpose of performance evaluation.
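As an illustration of the grid-then-refine idea in the far-field single-source case, the sketch below evaluates a beamformer-energy metric on a coarse angular grid and refines the peak with a three-point quadratic interpolation. It is a simplified stand-in, not the authors' code: it assumes unit sensor gains, noise-free synthetic spectra, a plane-wave delay of the form $[(x_c - x_p)\sin\phi + (y_c - y_p)\cos\phi]/v$ (sign convention assumed), and it uses parabolic interpolation in place of the Nelder-Mead search. All function names and parameter values are hypothetical.

```python
import cmath
import math

def steering(sensors, doa_deg, k, n_fft, v):
    """Assumed far-field steering vector at bin k (delays relative to the centroid)."""
    R = len(sensors)
    xc = sum(x for x, _ in sensors) / R
    yc = sum(y for _, y in sensors) / R
    phi = math.radians(doa_deg)
    delays = [((xc - x) * math.sin(phi) + (yc - y) * math.cos(phi)) / v for x, y in sensors]
    return [cmath.exp(-2j * math.pi * k * t / n_fft) for t in delays]

def aml_metric(spectra, sensors, doa_deg, n_fft, v):
    """Sum over bins of |d(k)^H X(k)|^2 / R, i.e. the beamformer output energy."""
    R = len(sensors)
    total = 0.0
    for k, Xk in spectra.items():
        d = steering(sensors, doa_deg, k, n_fft, v)
        beam = sum(dp.conjugate() * xp for dp, xp in zip(d, Xk))
        total += abs(beam) ** 2 / R
    return total

def estimate_doa(spectra, sensors, n_fft, v, step_deg=5.0):
    """Coarse grid search followed by three-point quadratic interpolation."""
    grid = [i * step_deg for i in range(int(round(360.0 / step_deg)))]
    J = [aml_metric(spectra, sensors, a, n_fft, v) for a in grid]
    i = max(range(len(grid)), key=J.__getitem__)
    jm, j0, jp = J[i - 1], J[i], J[(i + 1) % len(grid)]   # neighbours wrap around 360
    denom = jm - 2.0 * j0 + jp
    offset = 0.0 if denom == 0.0 else 0.5 * step_deg * (jm - jp) / denom
    return (grid[i] + offset) % 360.0

n_fft, v = 256, 0.345
R, rho = 7, 0.5
sensors = [(rho * math.sin(2 * math.pi * p / R), rho * math.cos(2 * math.pi * p / R))
           for p in range(R)]
true_doa = 62.0                                   # deliberately off the 5-degree grid
# Noise-free synthetic data: unit source spectrum on bins 20..32.
spectra = {k: steering(sensors, true_doa, k, n_fft, v) for k in range(20, 33)}
doa_hat = estimate_doa(spectra, sensors, n_fft, v)
```

In this setup the array aperture is small relative to the wavelength, so the metric has a single broad peak and the parabola fit recovers the off-grid angle closely. With noise, a larger aperture, or higher frequencies, the grid spacing would need to be matched to the beamwidth.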

Multiple-sources case
For the multiple-sources case, the parameter estimation is a challenging task. Although iterative multidimensional parameter search methods such as the Nelder-Mead direct search method can be applied to avoid an exhaustive multidimensional grid search, finding the initial source location estimates is not trivial. Since iterative solutions for the singlesource case are more robust and the initial estimate is easier to find, we extend the AP method in [14] to the near-field problem. The AP approach breaks the multidimensional parameter search into a sequence of single-source-parameter search, and yields fast convergence rate. The following describes the AP algorithm for the two-sources case, but it can be easily extended to the case of M sources. Let Θ = [Θ T 1 , Θ T 2 ] T be either the source locations in the near-field case or the DOAs in the far-field case.
Step 1. Estimate the location/DOA of the stronger source on a single-source grid.
Step 2. Estimate the location/DOA of the weaker source on a single-source grid under the assumption of a two-source model while keeping the first source location estimate from Step 1 constant.
For i = 1, 2, . . . (repeat Steps 3 and 4 until convergence):
Step 3. Iterative AML parameter search (direct or gradient search) for the location/DOA of the first source while keeping the estimate of the second source location from the previous iteration constant.
Step 4. Iterative AML parameter search (direct or gradient search) for the location/DOA of the second source while keeping the estimate of the first source location from Step 3 constant.
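The four steps can be sketched for far-field DOA estimation with two sources. This is a simplified illustration under stated assumptions, not the authors' implementation: the metric is the energy of the data projected onto the span of the candidate steering vectors (computed by Gram-Schmidt), Steps 3 and 4 use a one-dimensional grid search instead of a direct or gradient search, the data are noise-free and synthetic, and the plane-wave delay convention is assumed. All names and values are hypothetical.

```python
import cmath
import math

def steering(sensors, doa_deg, k, n_fft, v):
    """Assumed far-field steering vector at bin k (delays relative to the centroid)."""
    R = len(sensors)
    xc = sum(x for x, _ in sensors) / R
    yc = sum(y for _, y in sensors) / R
    phi = math.radians(doa_deg)
    delays = [((xc - x) * math.sin(phi) + (yc - y) * math.cos(phi)) / v for x, y in sensors]
    return [cmath.exp(-2j * math.pi * k * t / n_fft) for t in delays]

def projected_energy(columns, x):
    """||P x||^2 where P projects onto span(columns), via Gram-Schmidt."""
    basis, energy = [], 0.0
    for col in columns:
        r = list(col)
        for q in basis:
            c = sum(qi.conjugate() * ri for qi, ri in zip(q, r))
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(sum(abs(ri) ** 2 for ri in r))
        if norm < 1e-9:
            continue                      # column already lies in the span
        q = [ri / norm for ri in r]
        basis.append(q)
        energy += abs(sum(qi.conjugate() * xi for qi, xi in zip(q, x))) ** 2
    return energy

def metric(spectra, table, angles):
    """Multi-source metric: projected energy summed over frequency bins."""
    return sum(projected_energy([table[(a, k)] for a in angles], Xk)
               for k, Xk in spectra.items())

def alternating_projection(spectra, table, grid, sweeps=5):
    th1 = max(grid, key=lambda a: metric(spectra, table, [a]))            # Step 1
    th2 = max(grid, key=lambda a: metric(spectra, table, [th1, a]))       # Step 2
    for _ in range(sweeps):
        new1 = max(grid, key=lambda a: metric(spectra, table, [a, th2]))   # Step 3
        new2 = max(grid, key=lambda a: metric(spectra, table, [new1, a]))  # Step 4
        if (new1, new2) == (th1, th2):
            break                          # converged on the grid
        th1, th2 = new1, new2
    return th1, th2

n_fft, v = 256, 0.345
R, rho = 7, 2.0
sensors = [(rho * math.sin(2 * math.pi * p / R), rho * math.cos(2 * math.pi * p / R))
           for p in range(R)]
bins = range(20, 33)
grid = list(range(0, 360, 4))
table = {(a, k): steering(sensors, a, k, n_fft, v) for a in grid for k in bins}

true_doas, amplitudes = (40, 200), (2.0, 1.0)     # stronger source listed first
spectra = {}
for k in bins:
    s1 = amplitudes[0] * cmath.exp(0.7j * k * k)  # arbitrary deterministic phases
    s2 = amplitudes[1] * cmath.exp(1.3j * k * k)
    d1, d2 = table[(true_doas[0], k)], table[(true_doas[1], k)]
    spectra[k] = [s1 * a + s2 * b for a, b in zip(d1, d2)]

doa1, doa2 = alternating_projection(spectra, table, grid)
```

Each update is a one-dimensional search, so the cost grows linearly rather than exponentially with the number of sources. The sketch does not address how to choose the grid density or how to detect convergence to a local maximum.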

Cramér-Rao bound example
In the following simulation examples, we consider a prerecorded tracked vehicle signal with significant spectral content of about 50-Hz bandwidth centered about a dominant frequency at 100 Hz. The sampling frequency is set to be 1 kHz and the speed of propagation is 345 m/s. The data length is $L = 200$ (which corresponds to 0.2 second), the DFT size is $N = 256$ (zero padding), and all positive frequency bins are considered. We consider a single-traveling-source scenario for a circular array of seven elements (uniformly spaced on the circumference), as depicted in Figure 2. In this case, we consider the spatial loss that is a function of the distance from the source location to each sensor location, thus the gains $a_p$ are not uniform. To compare the theoretical performance of source localization under different conditions, we compare the CRB for the known source signal and speed of propagation, for the unknown speed of propagation, and for the unknown source signal cases for this single-traveling-source scenario. As depicted in Figure 3, the unknown source signal is shown to be a much more significant factor than the unknown speed of propagation in source location estimation. However, these parameters are not significant in the DOA estimations.

Single-source experimental results
Several acoustic experiments were conducted at Xerox PARC, Palo Alto, Calif, USA. The experimental data was collected indoors as well as outdoors by half a dozen to a dozen omnidirectional microphones. A semianechoic chamber with sound absorbing foams attached to the walls and ceiling (shown to have a few dominant reflections) was used for the indoor data collection. An omnidirectional loud speaker was used as the sound source. In one indoor experiment, the source is placed in the middle of the rectangular room of dimension 3 × 5 m surrounded by six microphones (convex hull configuration), as depicted in Figure 4. The sound of a moving light-wheeled vehicle is played through the speaker and collected by the microphone array. Under 12 dB SNR, the speaker location can be accurately estimated (for every 0.2 second of data) with an RMS error of 73 cm using the near-field AML source localization algorithm. An RMS error of 127 cm is reported for the same data using the two-step LS method. This shows that both methods are capable of locating the source despite some minor reverberation effects.
In the outdoor experiment (next to the Xerox PARC building), three widely separated linear subarrays, each with four microphones (1 ft interelement spacing), are used. A stationary noise source (possibly air conditioning) is observed from an adjacent building. To demonstrate the effectiveness of the algorithms in handling wideband signals, a white Gaussian signal is played through the loud speaker placed at the two locations (from two independent runs) shown in Figure 5. In this case, each subarray estimates the DOA of the source independently using the AML method, and the bearing crossing (see Appendix D) from the three subarrays (labeled as A, B, and C in the figures) provides an estimate of the source location. The estimation is again performed for every 0.2 second of data. An RMS error of 32 cm is reported for the first location, and an RMS error of 97 cm is reported for the second location. Then, we apply the two-step LS DOA estimation to the same data, which involves relative time-delay estimation among the Gaussian signals. Poorer results are shown in Figure 6, where an RMS error of 152 cm is reported for the first location, and an RMS error of 472 cm is reported for the second location. This shows that when the source signal is truly wideband, the time-delay-based techniques can yield very poor results. In other outdoor runs, the AML method was also shown to yield good results for music signals.
Then, a moving source experiment is conducted by placing the loud speaker on a cart that moves on a straight line from the top to the bottom of Figure 7. The vehicle sound is again played through the speaker while the cart is moving. We assume that the source location is stationary within each data frame of about 0.1 second, and the DOA is estimated for each frame using the AML method. The source location is again estimated by the cross bearing of the three DOAs. As shown in Figure 7, the source can be well estimated to be very close to the actual traveled path. The results using the LS method (not shown) are much worse when the source is far away.
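The cross-bearing step used in these experiments can be sketched as a least-squares intersection of bearing lines. This is a generic formulation and not necessarily the procedure of Appendix D: each subarray contributes a line through its position along its estimated bearing (north = 0 degrees, east = 90 degrees), and the estimate is the point minimizing the sum of squared perpendicular distances to all lines. The function name `cross_bearing` and the example geometry are hypothetical.

```python
import math

def cross_bearing(array_positions, bearings_deg):
    """Least-squares crossing point of bearing lines (north = 0 deg, east = 90 deg)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (cx, cy), bearing in zip(array_positions, bearings_deg):
        ux = math.sin(math.radians(bearing))
        uy = math.cos(math.radians(bearing))
        # Projector onto the normal of the bearing line: I - u u^T.
        p11, p12, p22 = 1.0 - ux * ux, -ux * uy, 1.0 - uy * uy
        a11 += p11
        a12 += p12
        a22 += p22
        b1 += p11 * cx + p12 * cy
        b2 += p12 * cx + p22 * cy
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearings are (nearly) parallel; no unique crossing")
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

arrays = [(0.0, 0.0), (10.0, 0.0), (5.0, -3.0)]
source = (3.0, 4.0)
# Exact bearings from each array to the source, for a consistency check.
bearings = [math.degrees(math.atan2(source[0] - cx, source[1] - cy)) for cx, cy in arrays]
x_hat, y_hat = cross_bearing(arrays, bearings)
```

The determinant shrinks as the bearings become nearly parallel, which matches the observation that closely spaced or nearly collinear bearings give poor range estimates. The sketch treats bearings as lines rather than rays and applies no weighting by DOA accuracy.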

Two-source experimental results
In a different outdoor configuration, two linear subarrays (labeled as A and C), each consisting of four microphones, are placed on opposite sides of the road, and two omnidirectional loudspeakers are placed between them, as depicted in Figure 8. The two loudspeakers play two independent prerecorded sounds of light-wheeled vehicles of different kinds. By using the AP steps on the AML metric, the DOAs of the two sources are jointly estimated for each array under 11 dB SNR (with respect to the bottom array). Then, the cross bearing yields the location estimates of the two sources. The estimation is performed for every 0.2 second of data. An RMS error of 37 cm is observed for source 1 and an RMS error of 45 cm is observed for source 2. Note that the range estimate of the second source is slightly worse than that of the first source because the bearings from the two arrays are close to being collinear for the second source.
Another two-source localization experiment was also conducted inside the semianechoic chamber. In this setup, twelve microphones are placed in a line near one of the walls. Two speakers are placed inside the room, as depicted in Figure 9. The microphones are then divided into three nonoverlapping groups (subarrays, labeled as A, B, and C), each with four elements. Each subarray performs the AML DOA estimation using AP. The cross bearing of the DOAs again provides the location estimates of the two sources. The estimation is again performed for every 0.2 second of data. An RMS error of 154 cm is observed for the first source, and an RMS error of 35 cm is observed for the second source. Since the bearing angles are not too different across the three subarrays, the source range estimate becomes poor, especially for source 1. This again suggests that the geometry of the subarrays used in this experiment was far from ideal, and widely separated subarrays would have yielded better triangulation (cross bearing) results.
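The effect of subarray separation on triangulation can be illustrated with a small numerical sketch. The geometry below is hypothetical and is not the one used in the experiments. Two arrays sit on a common baseline, the source is 10 m away, and the bearing from the first array is perturbed by 1 degree. The function names `intersect` and `location_error` are introduced here for illustration only.

```python
import math

def intersect(p1, theta1, p2, theta2):
    """Crossing point of two bearing lines p_i + t * (cos theta_i, sin theta_i)."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < 1e-12:
        raise ValueError("bearings are (nearly) parallel")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    # Distance along the first bearing line to the crossing point.
    t1 = (rx * d2[1] - ry * d2[0]) / cross
    return p1[0] + t1 * d1[0], p1[1] + t1 * d1[1]

def location_error(half_baseline, source, bearing_error_deg=1.0):
    """Location error when the first array's bearing is off by a fixed angle."""
    a = (-half_baseline, 0.0)
    b = (half_baseline, 0.0)
    th_a = math.atan2(source[1] - a[1], source[0] - a[0])
    th_b = math.atan2(source[1] - b[1], source[0] - b[0])
    est = intersect(a, th_a + math.radians(bearing_error_deg), b, th_b)
    return math.hypot(est[0] - source[0], est[1] - source[1])

source = (0.0, 10.0)                     # hypothetical source position (m)
err_close = location_error(1.0, source)  # arrays 2 m apart
err_wide = location_error(10.0, source)  # arrays 20 m apart
```

In this sketch the same 1 degree bearing error produces a location error several times larger for the closely spaced pair than for the widely spaced pair, because the two bearing lines cross at a shallow angle when the arrays are close together.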

CONCLUSION
In this paper, the theoretical CRBs for source localization and DOA estimation are analyzed, and the AML source localization and DOA estimation methods are shown to be effective when applied to measured data. For the single-source case, the AML performance is shown to be superior to that of the two-step LS method for various types of signals, especially for the truly wideband ones. The AML algorithm is also shown to be effective in locating two sources using AP. The CRB analysis suggests the uniformly spaced circular array as the preferred array geometry for most scenarios. When a circular array is used, the DOA variance bound is independent of the source direction, and it also does not degrade when the speed of propagation is unknown. The CRB analysis also confirms the physical observation that high energy in the higher-frequency components of a signal is favorable. The sensitivity of source localization to different unknown parameters has also been analyzed. It has been shown that an unknown source signal results in a much larger range error than an unknown speed of propagation does, but neither parameter has a significant effect on DOA estimation.

A. DOA ESTIMATION USING INTERPOLATION
Denote the three data points $\{(x_1, y_1), (x_2, y_2), (x_3, y_3)\}$ as the angular samples and their corresponding AML function values, where $y_2$ is the overall maximum and the other two are the adjacent samples. By the Lagrange interpolation polynomial formula [15], we can obtain a quadratic polynomial that interpolates the three data points. The angle (or the DOA estimate) that yields the maximum value of the quadratic polynomial is given by
$$\hat{x} = \frac{c_1 (x_2 + x_3) + c_2 (x_1 + x_3) + c_3 (x_1 + x_2)}{2 (c_1 + c_2 + c_3)}, \tag{A.1}$$
where $c_1 = y_1 / [(x_1 - x_2)(x_1 - x_3)]$, $c_2 = y_2 / [(x_2 - x_1)(x_2 - x_3)]$, and $c_3 = y_3 / [(x_3 - x_1)(x_3 - x_2)]$. The interpolation step avoids further iterations on the AML maximization.
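The interpolation step can be sketched in a few lines. The Lagrange quadratic through the three samples is $c_1 (x - x_2)(x - x_3) + c_2 (x - x_1)(x - x_3) + c_3 (x - x_1)(x - x_2)$, and setting its derivative to zero gives the location of the peak. The function name `interpolate_peak` and the sample values are assumptions made for this example.

```python
def interpolate_peak(x1, y1, x2, y2, x3, y3):
    """Abscissa of the vertex of the quadratic through three samples.

    (x2, y2) is assumed to be the largest sample on the search grid and
    (x1, y1), (x3, y3) its two neighbors.
    """
    c1 = y1 / ((x1 - x2) * (x1 - x3))
    c2 = y2 / ((x2 - x1) * (x2 - x3))
    c3 = y3 / ((x3 - x1) * (x3 - x2))
    curvature = c1 + c2 + c3  # leading coefficient of the quadratic
    if curvature == 0.0:
        raise ValueError("samples are collinear; the quadratic has no peak")
    numerator = c1 * (x2 + x3) + c2 * (x1 + x3) + c3 * (x1 + x2)
    return numerator / (2.0 * curvature)

# Hypothetical metric values sampled from y = 5 - (x - 0.3)^2 at x = -1, 0, 1.
peak = interpolate_peak(-1.0, 3.31, 0.0, 4.91, 1.0, 4.51)
```

For samples drawn from an exact parabola, as in this usage example, the interpolated peak matches the true maximizer (0.3 here, up to rounding). For the actual AML metric the result is only an approximation whose accuracy depends on the grid spacing.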

B. THE ELLIPTICAL MODEL OF DOA VARIANCE
In Section 2.2.1, we show that we can conveniently define an effective beamwidth for a uniformly spaced circular array. This gives us one measure of the beamwidth that is independent of the source direction. When we have randomly distributed arrays, the circular CRB may be a reasonable approximation if the sensors are distributed uniformly in both the X and Y directions. However, in some cases, the sensors may span more in one direction than the other. In that case, we may model the effective beamwidth using an ellipse. The direction of the major axis indicates the best DOA performance, where a small beamwidth can be defined. The direction of the minor axis indicates the poorest DOA performance, and a large beamwidth is defined in that direction. This suggests the use of a variable beamwidth as a function of angle, which is useful for the AML metric evaluation.
First, we need to determine the orientation of the ellipse for an arbitrary 2D array. Without loss of generality, we define the origin at the array centroid $\mathbf{r}_c = [x_c, y_c]^T = [0, 0]^T$. Let there be a total of $R$ sensors. The location of the $p$th sensor is denoted as $\mathbf{r}_p = [x_p, y_p]^T$ in the coordinate system. Our objective is to find a rotation angle $\psi$ from the X-axis such that the cross terms of the new sensor locations sum to zero. The major and minor axes will be the new X- and Y-axes. Denote $[x'_p, y'_p]^T$ as the new coordinate of the $p$th sensor in the rotated coordinate system. The new coordinate has the following relation with the old coordinate:
$$x'_p = x_p \cos\psi + y_p \sin\psi, \qquad y'_p = -x_p \sin\psi + y_p \cos\psi. \tag{B.1}$$
The sum of the cross terms is then given by
$$\sum_{p=1}^{R} x'_p y'_p = c_1 \sin\psi \cos\psi + c_2 (\cos^2\psi - \sin^2\psi), \tag{B.2}$$
where $c_1 = \sum_{p=1}^{R} (y_p^2 - x_p^2)$ and $c_2 = \sum_{p=1}^{R} x_p y_p$. After double angle substitutions and some algebraic manipulation to equate the above to zero, we obtain the solution
$$\psi = -\frac{1}{2} \tan^{-1}\left(\frac{2 c_2}{c_1}\right) + \frac{\ell \pi}{2}, \tag{B.3}$$
for $\ell = 0$ and $1$, which means that two solutions that differ by 90 degrees exist.
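The rotation angle can be checked numerically. The sketch below computes the two candidate angles from the sums $c_1 = \sum_p (y_p^2 - x_p^2)$ and $c_2 = \sum_p x_p y_p$ over centroid-centered sensor coordinates, and then evaluates the sum of the cross terms in the rotated frame. The function names and sensor coordinates are hypothetical. As an implementation choice, `math.atan2` replaces the plain arctangent so that $c_1 = 0$ does not cause a division by zero. This can shift the first angle by 90 degrees relative to the principal arctangent, which only swaps the two solutions.

```python
import math

def center(sensors):
    """Shift sensor coordinates so that the array centroid is the origin."""
    n = len(sensors)
    xc = sum(x for x, _ in sensors) / n
    yc = sum(y for _, y in sensors) / n
    return [(x - xc, y - yc) for x, y in sensors]

def ellipse_axis_angles(sensors):
    """Two rotation angles (radians, 90 degrees apart) that zero the cross terms."""
    pts = center(sensors)
    c1 = sum(y * y - x * x for x, y in pts)
    c2 = sum(x * y for x, y in pts)
    # When c1 = c2 = 0 (for example, a uniform circular array), every
    # angle satisfies the condition and this returns 0 and pi/2.
    psi0 = -0.5 * math.atan2(2.0 * c2, c1)
    return psi0, psi0 + math.pi / 2.0

def cross_term_sum(sensors, psi):
    """Sum over sensors of x'_p * y'_p after rotating the axes by psi."""
    total = 0.0
    for x, y in center(sensors):
        xr = x * math.cos(psi) + y * math.sin(psi)
        yr = -x * math.sin(psi) + y * math.cos(psi)
        total += xr * yr
    return total

# Hypothetical irregular array (coordinates in meters).
sensors = [(2.0, 1.0), (-1.0, 0.5), (0.0, -2.0), (3.0, 3.0), (-4.0, -2.5)]
psi_a, psi_b = ellipse_axis_angles(sensors)
residuals = (cross_term_sum(sensors, psi_a), cross_term_sum(sensors, psi_b))
```

Both residuals vanish to within floating-point rounding. Which of the two angles corresponds to the major axis must still be decided separately, for example by comparing the sensor spread along each rotated axis.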
We have shown that, for a circular array, the DOA variance bound is given by 1/ζα, where α = ρ 2 R/2. For an ellipse with the center at the origin, the corresponding α = R p=1 b 2 p = cos 2 φ s