Acoustic source localization in mixed field using spherical microphone arrays
 Qinghua Huang^{1} and Tong Wang^{2}
https://doi.org/10.1186/1687-6180-2014-90
© Huang and Wang; licensee Springer. 2014
Received: 30 October 2013
Accepted: 22 May 2014
Published: 14 June 2014
Abstract
Spherical microphone arrays have recently been used for source localization in three-dimensional space. In this paper, a two-stage algorithm is developed to localize mixed far-field and near-field acoustic sources in a free-field environment. In the first stage, an array signal model is constructed in the spherical harmonics domain. The recurrent relation of spherical harmonics is independent of the far-field and near-field mode strengths. Therefore, it is used to develop an estimating signal parameters via rotational invariance technique (ESPRIT)-like approach for spherical arrays to estimate the directions of arrival (DOAs) of both far-field and near-field sources. In the second stage, based on the estimated DOAs, a simple one-dimensional MUSIC spectrum is exploited to distinguish far-field from near-field sources and to estimate the ranges of the near-field sources. The proposed algorithm avoids multidimensional search and parameter pairing. Simulation results demonstrate good performance in localizing far-field sources, near-field sources, and mixtures of the two.
1 Introduction
Acoustic source localization using microphone arrays has many applications, such as video conferencing, intelligent systems, and robotics. It has received great attention for almost four decades [1, 2]. In most array signal processing applications, the wavefront is assumed to be planar, that is, all the sources are located in the far field (FF) of the array. In this case, the parameter that characterizes a source location is its direction of arrival (DOA) [2]. In the near field (NF) of an array, the range information should be integrated into the array signal model to characterize sources accurately [3]. Although the plane-wave assumption simplifies the modeling and processing, it does not hold in near-field applications and results in analysis errors. Moreover, in some practical applications, the signals collected by microphone arrays are a mixture of far-field and near-field sources; each source may be located in either the near field or the far field of the array [4–9]. Localization methods for mixed field sources should first discriminate far-field from near-field sources. Then, for far-field sources, only the DOA information is estimated, while for near-field sources the range information is estimated as well.
If an acoustic source is located in three-dimensional (3D) space, its position is jointly described by range and bearing (azimuth and elevation). The geometric structure of a microphone array strongly affects the localization performance. Most existing localization techniques use a uniform linear array (ULA) or a uniform circular array (UCA) [1, 2, 5–7, 9–11] to estimate source positions. Planar arrays, such as the cross array and the uniform rectangular array (URA), are straightforward extensions of the ULA and can estimate both azimuth and elevation [8, 12, 13]. ULAs suffer from a 180° ambiguity in azimuth estimation. UCAs provide 360° azimuthal coverage due to their circular symmetry in the azimuth plane. The main drawback of planar arrays, including UCAs, is that they provide a smaller aperture in the elevation plane than in the azimuth plane, resulting in poor estimation of elevation angles [10]. Some arbitrary array configurations have also been investigated for source localization [14–17]. These depart from the uniform geometries that traditional localization approaches require; the array structure is instead chosen according to the specific application. Spherical microphone arrays have a symmetric 3D geometry and can capture higher-order sound field information. This 3D structure facilitates more accurate sound source localization. Moreover, spherical arrays can be analyzed within the mathematical framework of the spherical Fourier transform (SFT), which greatly simplifies processing in the space domain. Therefore, they have received considerable attention and have a wide variety of applications in source localization, beamforming, and acoustic analysis [18–20]. In this paper, we aim to develop a novel algorithm that accurately estimates the locations of mixed field sources using spherical arrays.
Many techniques have been proposed to estimate the DOAs of multiple acoustic sources. Multiple signal classification (MUSIC) and the estimating signal parameters via rotational invariance technique (ESPRIT) are two subspace techniques [21, 22]; the latter avoids a multidimensional search in the parameter space. Goossens and Rogier proposed a unitary spherical ESPRIT algorithm based on spherical phase-mode excitation that yielded accurate estimates with low computational complexity [23]. The eigenbeam (EB)-ESPRIT algorithm for spherical arrays was presented in [24], together with a performance analysis for robust localization in reverberant environments; it exploited only the relation between spherical harmonics of a fixed order. Many approaches are based on beamforming [22]. Argentieri and Danès proposed an online beamspace MUSIC method with a beamforming scheme to localize sound sources in robotics [25]. Sun et al. proposed several steered-beamformer-based and subspace-based localization techniques in the spherical EB domain [26], localizing early reflections in room acoustic environments. Wu et al. used sparse recovery to localize sources and formulated super-resolution beamforming in the spherical harmonic domain [27]. However, when a source is close to the spherical array, the array signal model based on the far-field assumption is no longer valid. Independent component analysis (ICA) has also been used to estimate source locations; it employs higher-order statistics and directly identifies basis vectors containing the source location information, and it has been applied in both near-field and far-field scenarios [28]. ICA-based methods have been used to estimate DOAs for spherical microphone arrays [29, 30], but they fail to localize sources that are not statistically independent. Source localization can also be cast as an overcomplete basis representation problem over a grid of spatial locations, and many sparse recovery methods have been used to estimate source DOAs [31, 32].
If the sources are in 3D space, the number of basis vectors is large and the computational complexity is high. Some approaches assume that one source is dominant over the others in certain time-frequency zones [11, 33]; they extend a single-source DOA algorithm over these zones to estimate multiple source locations, relying on the sparse representation of the observed signals in the time-frequency domain.
In many practical applications, the observations collected by an array may be a mixture of far-field and near-field signals, multiple far-field signals, or multiple near-field sources. Most of the above techniques localize sources in either the far field or the near field. In recent years, source localization in mixed near field and far field has been developed using the MUSIC algorithm [5–7], an ESPRIT-like technique [8], and a sparse signal reconstruction method [4], all based on a linear array. Jiang et al. proposed a 3D source localization algorithm with a cross array [9]. First, the elevation angles are obtained with the generalized ESPRIT method. Then, similar to the root-MUSIC method, the range parameters are estimated from the elevation estimates. Finally, a MUSIC pseudospectrum function is used to obtain the azimuth angles from the elevation and range estimates. Owing to their symmetric 3D structure, spherical arrays have been widely used for far-field source localization [23, 24]. A spherical microphone array was used in the near field, and a new close-talking microphone array was proposed in [34]; it can adaptively compensate for the distance and orientation of a near-field source. Fisher and Rafaely presented a near-field spherical microphone array and defined the near-field criterion in terms of the array order and radius [35]. They analyzed spherical microphone array capabilities in the near field and designed a radial filter discriminating the distances of sources incident from the same direction [36]. Although the aforementioned work considers near-field processing with spherical microphone arrays, near-field or mixed field source localization via a spherical microphone array has not yet been fully studied. Based on the recurrent relation of the spherical harmonics, only the DOAs of mixed field sources were estimated simultaneously in [37]; how to distinguish near-field from far-field sources and how to estimate the ranges of near-field sources were not considered.
The aim of our work is to develop a new method that localizes mixed far-field and near-field sources simultaneously using spherical arrays in a free-field environment, avoiding the parameter pairing problem and complex multidimensional search. A three-dimensional MUSIC method would scan the azimuth, elevation, and range parameter space and incur very high computational complexity; it is therefore impractical for direct source localization in 3D space. First, we construct the mixed near-field and far-field array signal model in the spherical harmonics domain. The mixed steering matrix in this domain contains only the source DOA and range information, and the DOA and range information is decoupled. Exploiting the recurrent relation between spherical harmonics, we extend the spherical ESPRIT method to simultaneously estimate the directions of multiple far-field and near-field sources. This avoids a two-dimensional parameter space search and azimuth-elevation pairing. Based on the estimated DOAs, the ranges of the near-field sources are then easily obtained using the one-dimensional MUSIC algorithm with high resolution.
The remainder of this paper is organized as follows. Section 2 introduces the mixed field array signal model in the spherical harmonics domain. A two-stage source localization algorithm with spherical arrays is developed in Section 3. Simulation results are presented in Section 4 to demonstrate the performance of the proposed algorithm. Conclusions are given in Section 5.
2 Array signal model for mixed field sources
To clarify the notations, scalars are denoted as italic letters (a, b, A, B, …), column vectors as lowercase boldface letters (a, b, …), and matrices as boldface capitals (A, B, …). The superscripts T, ∗, and H denote transpose, complex conjugation, and conjugate transpose, respectively. diag(⋅) defines a diagonal matrix and arg(⋅) calculates the phase.
where $\mathbf{x}(k)=[x_1(k)\; x_2(k)\; \cdots\; x_L(k)]^T$ denotes the observation vector composed of the pressure samples at each sensor at the frequency corresponding to the wavenumber $k$, $\mathbf{v}(k)=[v_1(k)\; v_2(k)\; \cdots\; v_L(k)]^T$ is a noise vector, and $\mathbf{s}_\mathrm{F}(k)=[s_1(k)\; s_2(k)\; \cdots\; s_{D_1}(k)]^T$ and $\mathbf{s}_\mathrm{N}(k)=[s_{D_1+1}(k)\; s_{D_1+2}(k)\; \cdots\; s_D(k)]^T$ are the far-field and near-field source signals, respectively. They are assumed to be statistically independent and well separated. $\mathbf{A}_\mathrm{F}\in\mathbb{C}^{L\times D_1}$ and $\mathbf{A}_\mathrm{N}\in\mathbb{C}^{L\times D_2}$ are the corresponding physical steering matrices, assuming the array is in free field. The $D$ sources are incident from DOAs $\Phi_d=(\vartheta_d,\varphi_d)$, $d=1,2,\ldots,D$. The range information $r_{d_2}$ is relevant only for the near-field sources, where $d_2=D_1+1, D_1+2,\ldots,D$.
The objective is to estimate the azimuths and elevations of the far-field sources and the joint azimuth-elevation-range parameters of the near-field sources. The following assumptions are made:
 1. The number of all sources is known.
 2. The incident source signals are statistically independent.
 3. The noise is zero-mean, complex circular Gaussian, spatially uniform white, and statistically independent of all the signals [6].
2.1 Spherical harmonic representation for spherical array processing
where δ_{nn′} = 1 for n = n′ and δ_{nn′} = 0 otherwise. Equation 8 expresses the orthonormality of the continuous spherical harmonics. However, spherical microphone arrays perform spatial sampling of continuous functions defined on a sphere. Spatial sampling, like time-domain sampling, requires a limited spatial bandwidth (limited harmonic order) to avoid aliasing [39].
where v_{nm}(k) is an (N + 1)^{2} × 1 transform coefficient vector in the spherical harmonics domain.
where ⊙ is the Hadamard product.
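As a quick numerical check, the orthonormality in Equation 8 can be verified by quadrature. The sketch below assumes SciPy's `sph_harm` convention (arguments: order m, degree n, azimuth, colatitude) and uses Gauss-Legendre nodes over the colatitude together with a uniform azimuth grid:

```python
import numpy as np
from scipy.special import sph_harm

def sh_inner_product(n, m, n2, m2, n_gl=32, n_az=64):
    """Quadrature approximation of the inner product of Y_nm and Y_n2m2
    over the unit sphere; should equal delta_{nn2} * delta_{mm2}."""
    x, w = np.polynomial.legendre.leggauss(n_gl)  # nodes/weights in cos(theta)
    theta = np.arccos(x)                          # colatitude samples
    phi = 2.0 * np.pi * np.arange(n_az) / n_az    # uniform azimuth samples
    T, P = np.meshgrid(theta, phi, indexing="ij")
    # SciPy convention: sph_harm(order m, degree n, azimuth, colatitude)
    y1 = sph_harm(m, n, P, T)
    y2 = sph_harm(m2, n2, P, T)
    # integrate azimuth by the rectangle rule, colatitude by Gauss-Legendre
    az_sum = (y1 * np.conj(y2)).sum(axis=1) * (2.0 * np.pi / n_az)
    return np.dot(w, az_sum)
```

For instance, |sh_inner_product(2, 1, 2, 1)| is approximately 1 while |sh_inner_product(2, 1, 3, 1)| is approximately 0, matching the Kronecker delta in Equation 8.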
2.2 Array signal model in the spherical harmonics domain
where $\mathbf{Y}_\mathrm{FN}(\Phi)=[\mathbf{Y}_\mathrm{F}(\Phi)\;\; \mathbf{Y}_\mathrm{N}(\Phi)]\in\mathbb{C}^{(N+1)^2\times D}$ is the new mixed steering matrix in the spherical harmonics domain and $\mathbf{s}_\mathrm{FN}(k)=[\mathbf{s}_\mathrm{F}^T(k)\;\; \mathbf{s}_\mathrm{N}^T(k)]^T$. In the spherical harmonics domain, the mixed far-field and near-field steering matrix is independent of the element positions of the sampling array. The localization algorithm in Section 3 is developed from the array signal model in (23).
3 Acoustic source localization algorithm
Source localization in mixed field aims to estimate the 2D parameters {ϑ_d, φ_d} of the far-field sources and the 3D parameters {r_d, ϑ_d, φ_d} of the near-field sources from the array observations x(k). Based on the spherical harmonic model in (23), the far-field and near-field sources share a common structure: in the mixed steering matrix, the DOA information is carried only by the spherical harmonics, so the mixed steering matrix contains the DOAs of all sources. The difference between the two source types is whether the mode strength depends on the source distance. Therefore, the DOAs can be estimated using the recurrent relation of spherical harmonics. Based on the estimated DOAs, the near-field ranges can then be easily computed by the conventional MUSIC algorithm.
3.1 DOA estimation
where $\mathbf{U}_\mathrm{s}\in\mathbb{C}^{(N+1)^2\times D}$ contains the D eigenvectors spanning the signal subspace of R_{nm}, and the diagonal matrix Σ_{s} contains the corresponding eigenvalues. Similarly, U_{v} denotes the noise subspace, and Σ_{v} is built from the remaining (N + 1)^{2} − D eigenvalues of R_{nm}.
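This subspace split can be sketched in a few lines (a minimal sketch; the covariance estimate R_nm and the source number D are assumed given):

```python
import numpy as np

def subspaces(R_nm, D):
    """Eigendecompose the SH-domain covariance and split the eigenvectors
    into the signal subspace (D largest eigenvalues) and noise subspace."""
    w, U = np.linalg.eigh(R_nm)      # Hermitian EVD, ascending eigenvalues
    order = np.argsort(w)[::-1]      # reorder to descending
    w, U = w[order], U[:, order]
    return U[:, :D], U[:, D:], w     # U_s, U_v, sorted eigenvalues
```

Under the spatially white noise assumption, the trailing eigenvalues cluster at the noise power, and the noise eigenvectors are orthogonal to the steering columns.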
where d = 1, 2, …, D. When N^{2} < 2D, this procedure fails. Hence, the maximum number of sources that can be accurately estimated by this algorithm is D = ⌊N^{2}/2⌋, where ⌊⋅⌋ denotes the floor operation. The three classical spatial sampling schemes for a spherical microphone array are the equiangular, Gaussian, and nearly uniform sampling schemes. For a spherical microphone array of a given order N, the equiangular sampling scheme requires 4(N + 1)^{2} sensors and the Gaussian sampling scheme demands 2(N + 1)^{2} sensors, while the uniform sampling scheme needs only (N + 1)^{2} sensors [43]. Therefore, when L sensors collect the acoustic source information, the dimension of the spherical harmonics space satisfies (N + 1)^{2} ≤ L; that is, the dimension of localization in the spherical harmonics domain is lower than that of the element space.
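The counting rules above can be written out directly (a small sketch of the stated relations):

```python
def max_sources(N):
    """Maximum number of uniquely estimable sources: floor(N^2 / 2)."""
    return N * N // 2

def sensors_required(N, scheme):
    """Sensor counts of the three classical spherical sampling schemes [43]."""
    return {
        "equiangular": 4 * (N + 1) ** 2,
        "gaussian": 2 * (N + 1) ** 2,
        "uniform": (N + 1) ** 2,
    }[scheme]
```

For the order-4 array used in Section 4, max_sources(4) gives 8, and uniform sampling needs 25 sensors, consistent with the 32-element array employed there.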
3.2 Range estimation
The near-field search extent is r_{d} ∈ (R, r_{N}). The range estimate of the dth source is obtained by

$\widehat{r}_d = \arg\max_{r_d}\, p\!\left(r_d, \widehat{\Phi}_d\right). \qquad (39)$
With the DOA estimates $\widehat{\Phi}_d=(\widehat{\vartheta}_d,\widehat{\varphi}_d)$, $d=1,2,\ldots,D$, from (36), the range estimator computes the MUSIC spectra according to (38) and (41). For a DOA estimate $\widehat{\Phi}_d$, we compare $p(\widehat{\Phi}_d)$ with the peak of $p(r_d,\widehat{\Phi}_d)$: if the former is larger, the source is a far-field one; otherwise, it is a near-field one, and the estimate $\widehat{r}_d$ in (39) is automatically paired with the DOA estimate $\widehat{\Phi}_d$.
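The 1D range search can be sketched as follows. This is a generic sketch: the steering callable (which would build the spherical-harmonics-domain near-field steering vector of (23) at the estimated DOA) and the noise subspace Uv are assumed supplied by the caller; only the MUSIC spectrum and peak search of (38) and (39) are shown.

```python
import numpy as np

def music_range_spectrum(Uv, steering, r_grid):
    """1D MUSIC pseudospectrum over candidate ranges (cf. (38)).

    Uv       : noise-subspace matrix from the EVD of R_nm
    steering : callable r -> steering vector at a fixed DOA estimate
               (hypothetical helper built from the model in (23))
    r_grid   : candidate ranges sampled over the near-field extent (R, r_N)
    """
    p = np.empty(len(r_grid))
    for i, r in enumerate(r_grid):
        a = steering(r)
        proj = Uv.conj().T @ a              # component in the noise subspace
        p[i] = 1.0 / np.real(proj.conj() @ proj)
    return p

def estimate_range(Uv, steering, r_grid):
    """Range estimate (39): the grid point maximizing the spectrum."""
    return r_grid[np.argmax(music_range_spectrum(Uv, steering, r_grid))]
```

The spectrum peaks where the candidate steering vector is (nearly) orthogonal to the noise subspace, which happens at the true source range.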
3.3 Algorithm summary
The proposed two-stage algorithm can be summarized as follows:
Step 1. Array signal modeling: Apply the spherical harmonic representation and construct the mixed field array signal model in the spherical harmonics domain as in (23).
Step 2. DOA estimation: Perform the EVD of R_{nm} in (32) and choose three submatrices from the signal subspace. Construct the recurrent relation of these submatrices in (33) and estimate the DOA information of all sources via (36).
Step 3. Range estimation: Based on the estimated DOAs, compute the far-field MUSIC spectrum in (41) and search the near-field MUSIC spectrum in (38) to discriminate far-field from near-field sources and obtain the paired ranges of the near-field sources.
Remarks
 1. For computational complexity, we mainly consider the eigenvalue decomposition (EVD) and the one-dimensional (1D) MUSIC spectral search. In the spherical harmonics domain, the dimension of the covariance matrix R_{nm} is (N + 1)^{2} × (N + 1)^{2}. The computational complexity of the proposed localization algorithm comprises (a) the eigendecomposition of R_{nm}, of order $\mathcal{O}\left((N+1)^6\right)$, and (b) the 1D spectral search, of order $\mathcal{O}\left((N+1)^4 g_r\right)$, where g_{r} is the number of search points along the range axis [45].
 2. Whether the incident sources are far-field, near-field, or a mixture of the two, the spherical ESPRIT-like algorithm in the first stage estimates all the DOAs. When all incident sources are far-field, the DOA information alone suffices. If all sources are located in the near field, the 1D MUSIC spectral search finds the paired range parameters. When the mixture contains both far-field and near-field sources, the MUSIC spectra in (38) and (41) are computed to determine whether each source is far-field or near-field.
 3. The proposed algorithm localizes mixed sources without parameter pairing or multidimensional search. In the first stage, it estimates the DOAs of the mixed far-field and near-field sources. In the second stage, the DOA estimates are used to compute the 1D MUSIC spectra according to (38) and (41), and the spectral peaks correspond to the paired ranges. Therefore, parameter pairing is avoided.
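The leading-order cost terms of Remark 1 can be made explicit (a sketch; constant factors omitted):

```python
def complexity_terms(N, g_r):
    """Leading-order operation counts from Remark 1:
    (a) EVD of the (N+1)^2 x (N+1)^2 covariance, O((N+1)^6);
    (b) 1D MUSIC range search over g_r grid points, O((N+1)^4 * g_r)."""
    return (N + 1) ** 6, (N + 1) ** 4 * g_r
```

For the order-4 array of Section 4 with, say, g_r = 100 range grid points, this gives 15,625 and 62,500 operations at leading order, both far below a direct 3D MUSIC scan over azimuth, elevation, and range.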
4 Simulations
In this section, we conduct simulations in a free-field environment to evaluate the proposed localization algorithm for narrowband and wideband sources. A 32-element spherical microphone array with uniform sampling [43] is used to estimate the source locations; its radius is 10 cm. The highest spherical harmonic order is N = 4, so the near-field extent of the array is (R, 0.64λ), where λ is the wavelength. The maximum number of sources that can be detected by the algorithm is D = N^{2}/2 = 8. The DOA (azimuth and elevation) and range estimates are reported in degrees and in wavelengths (or meters), respectively. The localization performance is measured by the root-mean-square error (RMSE) over 1,000 independent Monte Carlo trials. In addition, the Bayesian Cramér-Rao bound (CRB) provides a lower bound on the variance of any estimated parameter and defines the ultimate accuracy. The CRB analysis in [46] assumed that all sources were in the far field, and that in [47] assumed that all sources were in the near field; the CRB for coexisting far-field and near-field sources was provided in [6].
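The RMSE figure of merit used throughout this section is the standard one; a small sketch:

```python
import numpy as np

def rmse(estimates, true_value):
    """Root-mean-square error of one parameter over Monte Carlo trials."""
    err = np.asarray(estimates, dtype=float) - float(true_value)
    return float(np.sqrt(np.mean(err ** 2)))
```

Each RMSE curve below is this quantity computed per parameter (azimuth, elevation, or range) over the 1,000 trials.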
4.1 Narrowband source localization
 1. RMSE versus SNR: The number of snapshots is set to 128. The two sources are localized in free field and in a rectangular room with a floor area of 71 m^{2}, a ceiling height of 3 m, and a reverberation time of 0.7 s [48], respectively. The array is placed at the center of the rectangular room. When the SNR varies from 0 to 30 dB, the RMSEs of the azimuth, elevation, and range estimates are shown in Figure 3: they decrease as the SNR increases. The localization performance degrades in the reverberant environment compared with free field. When the two sources are incident from the same direction (45°, 122°), the RMSEs of the azimuth, elevation, and range estimates versus SNR in free field are shown in Figure 4. The figure shows that the proposed algorithm can localize a far-field and a near-field source incident from the same direction. Moreover, the proposed algorithm achieves comparable azimuth and elevation accuracy for the far-field and near-field sources. The elevation estimation accuracy is higher than the azimuth estimation accuracy because spherical arrays provide a larger aperture in the elevation plane than in the azimuth plane.
 2. RMSE versus number of snapshots: For comparison, a cross array placed in the XZ plane is used, following [9]. Each ULA branch of the array consists of 15 (M = 7) uniformly spaced omnidirectional sensors with inter-sensor spacing R/M. First, the elevation angles are obtained with the generalized ESPRIT method. Then, the range parameters are estimated using the root-MUSIC method based on the elevation estimates [9]. Finally, a 1D MUSIC pseudospectrum is used to search the azimuth angles from the elevation and range estimates. The SNR is fixed at 15 dB. When the number of snapshots varies from 100 to 1,100, the average performance over 1,000 Monte Carlo runs is shown in Figure 5. The RMSEs of the azimuth, elevation, and range estimates decrease as the number of snapshots increases. The estimation performance of the spherical array is better than that of the cross array, which may be due to the symmetric structure of the spherical array in 3D space and the simultaneous estimation of azimuth and elevation angles in the proposed method. Both our algorithm and the three-step estimation method in [9] are based on the eigendecomposition of the array covariance matrix, so the estimate of the covariance matrix affects the localization performance. Vershynin showed that a sample size Q ∼ (N + 1)^{2} suffices to estimate the covariance by the sample covariance matrix [49]. Therefore, the localization performance of both methods becomes more stable as the number of snapshots grows. The CRBs decrease in proportion to the number of snapshots and also become more stable as it increases.
 3. RMSE versus angular gap: The number of snapshots is set to 128 and the SNR is fixed at 15 dB. With the direction of the far-field source fixed, the azimuth and elevation of the near-field source both vary from 5° to 30°. The RMSEs of the direction and range estimates are shown in Figure 6. The DOA estimate of the far-field source and the range estimate of the near-field source are insensitive to the angular gap. The azimuth estimation performance for the near-field source varies as the angular gap increases, and the RMSE of its elevation estimate becomes slightly smaller.
 4. RMSE versus range: Let the number of snapshots and the SNR be 128 and 10 dB, respectively. When the range of the near-field source varies from 0.22λ to 0.58λ, the RMSEs of the DOA and range estimates are shown in Figure 7. The results show that both the DOA and range estimates of the near-field source are sensitive to the range variation: the RMSEs of the azimuth, elevation, and range estimates are smaller when the near-field source is closer to the spherical array than when it is farther from it. However, the location estimate of the far-field source is insensitive to the range variation of the near-field source.
RMSEs of azimuth, elevation, and range estimations for mixed field sources

Source               Azimuth (deg)   Elevation (deg)   Range (wavelength)
Near-field source 1  0.25            0.01              0.002
Near-field source 2  0.03            0.01              0.002
Near-field source 3  0.42            0.01              0.010
Near-field source 4  0.01            0.02              0.026
Far-field source 5   0.13            0.01              -
Far-field source 6   0.11            0.05              -
Far-field source 7   1.47            0.03              -
Far-field source 8   0.04            0.38              -
4.2 Wideband source localization
where $\widehat{\mathbf{x}}_{nm}(k,q)$ is the qth snapshot of x_{nm}(k). The operating frequency bandwidth is limited by aliasing at the higher frequencies and by measurement errors at the lower frequencies [43]. Therefore, the frequency bins used are limited to approximately 2 ≤ kr ≤ N. The SNR is 10 dB. When kr lies within the interval (2, 4), the RMSEs of azimuth and elevation for the two sources are smaller. Therefore, we choose the frequency bins satisfying the constraint 2 ≤ kr ≤ N to localize wideband sources.
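Selecting the usable wideband bins by the 2 ≤ kr ≤ N rule can be sketched as follows (the speed of sound c = 343 m/s is an assumed value):

```python
import numpy as np

def valid_bins(freqs_hz, radius_m, N, c=343.0, kr_min=2.0):
    """Keep frequency bins whose kr = 2*pi*f*r/c lies in [kr_min, N],
    avoiding spatial aliasing (high kr) and noise-dominated bins (low kr)."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    kr = 2.0 * np.pi * freqs_hz * radius_m / c
    return freqs_hz[(kr >= kr_min) & (kr <= N)]
```

For the 10-cm, order-4 array of Section 4, this rule keeps roughly the 1.1 to 2.2 kHz band.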
5 Conclusions
In this paper, we developed a two-stage source localization algorithm that jointly estimates the elevation and azimuth angles and the range of mixed far-field and near-field sources using a spherical array. In the first stage, the 3D localization algorithm estimates the DOAs of all mixed sources. In the second stage, the 1D MUSIC method is used to distinguish far-field from near-field sources and to provide the ranges of the near-field sources based on the estimated DOAs. The algorithm showed good performance for azimuth, elevation, and range estimation, and it has low computational cost because it avoids multidimensional search and requires no parameter pairing procedure. The estimation performance for the far-field sources was insensitive to the range variation of the near-field sources, whereas the RMSEs of the azimuth, elevation, and range for a near-field source were smaller when the source was closer to the spherical array. The spherical array performed better in elevation estimation than in azimuth estimation, owing to its larger aperture in the elevation plane. In future work, we will develop a range estimation algorithm without 1D search and incorporate a reverberant signal model to localize multiple sources in reverberant environments.
Declarations
Acknowledgements
The authors would like to thank the editor and anonymous reviewers for their valuable comments. The work was supported by the National Natural Science Foundation (61001160), Innovation Program of Shanghai Municipal Education Commission (12YZ023), and Visiting Scholar Funding of Shanghai Municipal Education Commission of China.
References
 Krim H, Viberg M: Two decades of array signal processing research: the parametric approach. IEEE Signal Process Mag 1996, 13(4):6794. 10.1109/79.526899View ArticleGoogle Scholar
 Chen JC, Yao K, Hudson RE: Acoustic source localization and beamforming: theory and practice. EURASIP J Appl Signal Process 2003, 4: 359370.View ArticleMATHGoogle Scholar
 Liang L, Liu D: Passive localization of nearfield sources using cumulant. IEEE Sensors J 2009, 9(8):953960.MathSciNetView ArticleGoogle Scholar
 Wang B, Liu J, Sun X: Mixed sources localization based on sparse signal reconstruction. IEEE Signal Process Lett 2012, 19(8):487490.MathSciNetView ArticleGoogle Scholar
 Liang L, Liu D: Passive localization of mixed nearfield and farfield sources using twostage MUSIC algorithm. IEEE Trans Signal Process 2010, 58(1):108120.MathSciNetView ArticleGoogle Scholar
 He J, Swamy MNS, Ahmad MO: Efficient application of MUSIC algorithm under the coexistence of farfield and nearfield sources. IEEE Trans Signal Process 2012, 60(4):20662070.MathSciNetView ArticleGoogle Scholar
 Wang B, Zhao Y, Liu J: Mixedorder MUSIC algorithm for localization of farfield and nearfield sources. IEEE Signal Process Lett 2013, 20(4):311314.MathSciNetView ArticleGoogle Scholar
 Jiang J, Duan F, Chen J, Li Y, Hua X: Mixed nearfield and farfield sources localization using the uniform linear sensor array. IEEE Sensors J 2013, 13(8):31363143.View ArticleGoogle Scholar
 Jiang J, Duan F, Chen J: Threedimensional localization algorithm for mixed nearfield and farfield sources based on ESPRIT and MUSIC method. Prog Electromagnetics Res 2013, 136: 435456.View ArticleGoogle Scholar
 Mathews CP, Zoltowski MD: Eigenstructure techniques for 2D angle estimation with uniform circular arrays. IEEE Trans Signal Process 1994, 42(9):23952407. 10.1109/78.317861View ArticleGoogle Scholar
 Pavlidi D, Griffin A, Puigt M, Mouchtaris A: Realtime multiple sound source localization and counting using a circular microphone array. IEEE Trans Audio Speech Lang Process 2013, 21(10):21932206.View ArticleGoogle Scholar
 Sommerkorn G, Hampicke D, Klukas R, Richter A, Schneider A, Thomä R: Uniform rectangular antenna array design and calibration issues for 2D ESPRIT application. In The 4th European Personal Mobile Communications Conference, Vienna. Morgan Kaufman, San Francisco; 2001:18.Google Scholar
 Ioannides P, Balanis CA: Uniform circular and rectangular arrays for adaptive beamforming applications. IEEE Antennas Wireless Propagation Lett 2005, 4: 351354. 10.1109/LAWP.2005.857039View ArticleGoogle Scholar
 Smaragdis P, Boufounos P: Position and trajectory learning for microphone arrays. IEEE Trans Audio Speech Lang Process 2007, 15(1):358368.View ArticleGoogle Scholar
 Costa M, Koivunen V, Richter A: Low complexity azimuth and elevation estimation for arbitrary array configurations. In IEEE Int. Conf. Acoust., Speech, Signal Process, Taipei. IEEE, Piscataway; 2009:21852188.Google Scholar
 Belloni F, Richter A, Koivunen V: D. O. A. Estimation, via manifold separation for arbitrary array structures. IEEE. Trans. Signal. Process. 2007, 55(10):48004810.MathSciNetView ArticleGoogle Scholar
 Filik T, Tuncer TE: A fast and automatically paired 2dimensional directionofarrival estimation using arbitrary array geometry. In IEEE 17th Signal Process and Communications Applications Conference, Antalya. IEEE, Piscataway; 2009:556559.Google Scholar
 Park M, Rafaely B: Sound-field analysis by plane-wave decomposition using spherical microphone array. J Acoust Soc Am 2005, 118(5):3094-3103. doi:10.1121/1.2063108
 Yan S, Sun H, Svensson UP, Ma X, Hovem JM: Optimal modal beamforming for spherical microphone arrays. IEEE Trans Audio Speech Lang Process 2011, 19(2):361-371.
 Khaykin D, Rafaely B: Acoustic analysis by spherical microphone array processing of room impulse response. J Acoust Soc Am 2012, 132(1):261-270. doi:10.1121/1.4726012
 Teutsch H: Modal Array Signal Processing: Principles and Applications of Acoustic Wavefield Decomposition. Springer-Verlag, Heidelberg; 2007:134-146.
 Cohen I, Benesty J, Gannot S: Speech Processing in Modern Communication. Springer-Verlag, Heidelberg; 2010:281.
 Goossens R, Rogier H: Unitary spherical ESPRIT: 2-D angle estimation with spherical arrays for scalar fields. IET Signal Process 2009, 3(2):221-231.
 Sun H, Kellermann W, Mabande E: Robust localization of multiple sources in reverberant environments using EB-ESPRIT with spherical microphone arrays. In IEEE Int. Conf. Acoust., Speech, Signal Process, Prague. IEEE, Piscataway; 2011:117-120.
 Argentieri S, Danes P: Broadband variations of the MUSIC high-resolution method for sound source localization in robotics. In IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego. IEEE, Piscataway; 2007:2009-2014.
 Sun H, Mabande E, Kowalczyk K, Kellermann W: Localization of distinct reflections in rooms using spherical microphone array eigenbeam processing. J Acoust Soc Am 2012, 131(4):2828-2840. doi:10.1121/1.3688476
 Wu PKT, Epain N, Jin C: A super-resolution beamforming algorithm for spherical microphone arrays using a compressive sensing approach. In IEEE Int. Conf. Acoust., Speech, Signal Process, Vancouver. IEEE, Piscataway; 2013:649-653.
 Sawada H, Mukai R, Araki S, Makino S: Multiple source localization using independent component analysis. In IEEE Antennas and Propagation Society International Symposium, Washington DC. IEEE, Piscataway; 2005:81-84.
 Epain N, Jin C: Independent component analysis using spherical microphone arrays. Acta Acustica United Acustica 2012, 98(1):91-102. doi:10.3813/AAA.918495
 Noohi T, Epain N, Jin C: Direction of arrival estimation for spherical microphone arrays by combination of independent component analysis and sparse recovery. In IEEE Int. Conf. Acoust., Speech, Signal Process, Vancouver. IEEE, Piscataway; 2013:346-349.
 Malioutov D, Cetin M, Willsky AS: A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans Signal Process 2005, 53(8):3010-3022.
 Wei X, Yuan Y, Ling Q: DOA estimation using a greedy block coordinate descent algorithm. IEEE Trans Signal Process 2012, 60(4):2066-2070.
 Swartling M, Sallberg B, Grbic N: Source localization for multiple speech sources using low complexity nonparametric source separation and clustering. Signal Process 2011, 91:1781-1788. doi:10.1016/j.sigpro.2011.02.002
 Meyer J, Elko GW: Position-independent close-talking microphone. Signal Process 2006, 86(6):1254-1259. doi:10.1016/j.sigpro.2005.05.036
 Fisher E, Rafaely B: The near-field spherical microphone array. In IEEE Int. Conf. Acoust., Speech, Signal Process, Las Vegas. IEEE, Piscataway; 2008:5272-5275.
 Fisher E, Rafaely B: Near-field spherical microphone array processing with radial filtering. IEEE Trans Audio Speech Lang Process 2011, 19(2):256-265.
 Huang Q, Song T: DOA estimation of mixed near-field and far-field sources using spherical array. In The 11th Int. Conf. on Signal Process, Beijing. IEEE, Piscataway; 2012:382-385.
 Williams EG: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Academic, New York; 1999:183.
 Rafaely B, Weiss B, Bachmat E: Spatial aliasing in spherical microphone arrays. IEEE Trans Signal Process 2007, 55(3):1003-1010.
 Meyer J, Elko G: A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield. In IEEE Int. Conf. Acoust., Speech, Signal Process, Orlando. IEEE, Piscataway; 2002:II-1781-II-1784.
 Abhayapala TD, Ward DB: Theory and design of high order sound field microphones using spherical microphone array. In IEEE Int. Conf. Acoust., Speech, Signal Process, Orlando. IEEE, Piscataway; 2002:II-1949-II-1952.
 Teutsch H, Kellermann W: Detection and localization of multiple wideband acoustic sources based on wavefield decomposition using spherical apertures. In IEEE Int. Conf. Acoust., Speech, Signal Process, Las Vegas. IEEE, Piscataway; 2008:5276-5279.
 Rafaely B: Analysis and design of spherical microphone arrays. IEEE Trans Speech Audio Process 2005, 13(1):135-143.
 Schmidt RO: Multiple emitter location and signal parameter estimation. IEEE Trans Antennas Propag 1986, AP-34:276-280.
 Wang Y, Chen J, Fang W: TST-MUSIC for joint DOA-delay estimation. IEEE Trans Signal Process 2001, 49(4):721-729. doi:10.1109/78.912916
 Stoica P, Larsson EG, Gershman AB: The stochastic CRB for array processing: a textbook derivation. IEEE Signal Process Lett 2001, 8(5):148-150.
 El Korso MN, Boyer R, Renaux A, Marcos S: Conditional and unconditional Cramér-Rao bounds for near-field source localization. IEEE Trans Signal Process 2010, 58(5):2901-2907.
 Gardner B: A real-time multichannel room simulator. J Acoust Soc Am 1992, 92(4):2395.
 Vershynin R: How close is the sample covariance matrix to the actual covariance matrix? J Theor Probabil 2012, 25:655-686. doi:10.1007/s10959-010-0338-z
 Lamel LF, Kassel RH, Seneff S: Speech database development: design and analysis of the acoustic-phonetic corpus. In Proc. of the DARPA Speech Recognition Workshop. IET, Glasgow; 1986:100-109.
 Puigt M, Vincent E, Deville Y: Validity of the independence assumption for the separation of instantaneous and convolutive mixtures of speech and music sources. In Int. Conf. ICA, LNCS, Brazil. Springer, New York; 2009:613-620.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.