
Double-Capon and double-MUSICAL for arrival separation and observable estimation in an acoustic waveguide



Recent developments in shallow water ocean acoustic tomography propose the use of an original configuration composed of two source-receiver vertical arrays and wideband sources. The recording space thus has three dimensions: two spatial dimensions and the frequency dimension. Using this recording space, it is possible to build a three-dimensional (3D) estimation space that gives access to the three observables associated with the acoustic arrivals: the direction of departure, the direction of arrival, and the time of arrival. The main interest of this 3D estimation space is its capability to separate acoustic arrivals that usually interfere in the recording space, due to multipath propagation. A 3D estimator called double beamforming has already been developed, although it has limited resolution. In this study, the new 3D high-resolution estimators of double Capon and double MUSICAL are proposed to achieve this task. In the ocean acoustic tomography configuration, only a single recording realization is available to estimate the cross-spectral data matrix that is necessary to build high-resolution estimators. 3D smoothing techniques are thus proposed to increase the rank of this matrix. The estimators developed are validated on real data recorded in an ultrasonic tank, and their detection performance is compared to that of existing 2D and 3D methods.


Estimation of sound speed variations in the ocean using acoustic waves and a linearized model is known as ocean acoustic tomography (OAT) [1]. This tomography process is classically divided into three steps: first, the estimation of observables extracted from the signal; then, the building of a forward model that links these observables to the sound speed variations; and finally, the inversion of this problem using the extracted observables. In this study, we are interested in the first step, i.e. observable extraction and estimation.

We focus here on shallow water environments, typically coastal environments, that can be modeled as waveguides. These environments are a subject of major interest in the ocean science community, since many physical phenomena occur there, such as mixing layers, streams, tides, human influence and pollution. Unfortunately, they are also complex environments where the acoustic propagation is multipath, due to reflections from the waveguide boundaries. These different paths lead to arrivals that interfere in the traditional recording space, and also in the traditional estimation spaces (e.g. the direction of arrival [DOA] space). The arrival separation, and consequently the observable estimation step, is thus difficult to achieve.

Several methods have been developed to improve this difficult task, traditionally using vertical line arrays (VLAs). These can be divided into two groups [2]: separation methods and high-resolution methods. The first group, the separation methods (which include beamforming [BF] techniques), was initiated by [3], who developed a matched filter to estimate the arrival times and amplitudes in a noisy signal. Improvements were then proposed based on the use of more adapted signals [4, 5]. An excellent review of BF methods is presented in [6]. Within this separation group, adaptive BF (including Capon) has been extensively studied in signal processing [6]. The second group, the high-resolution methods (or subspace-based methods), includes the classical MUSIC [7] and ESPRIT [8]. The asymptotic separation power of these methods is not limited by the experimental conditions, such as the signal frequency, the array length or the signal-to-noise ratio. These methods use an eigenvector decomposition of the cross-spectral density matrix. They were first proposed to estimate the DOA (see [9]) and the arrival times [10], separately and then jointly [11]. However, these methods have limitations due to the proximity of arrivals in the estimation spaces, which are one-dimensional (1D) or 2D, depending on the experimental configuration.

To overcome this difficulty, we are interested in an original experimental configuration composed of two VLAs in the water column: a source array and a receiver array [12]. The arrivals are thus characterized by three observables: their direction of departure (DOD), their DOA, and their time of arrival (TOA). A three-dimensional (3D) estimator is required to transform the 3D recording space (source-receiver-frequency) into the 3D estimation space (DOA-DOD-TOA). To date, this estimation has been achieved with double BF (D-BF) [12, 13] on the source and receiver arrays. This method has resolution limitations due to the limited size of the arrays and to the source signal. This drawback is particularly problematic for the separation of the first arrivals, which correspond to the shortest TOAs and are close in the three dimensions of the estimation space.

To be able to provide a better separation of the arrivals, we propose here two methods for the improvement of the resolution in the DOA-DOD-TOA space. These methods are generalizations of the traditional Capon and MUSIC methods to the 3D configuration.

This report is organized as follows: the context and signal model that correspond to the experimental configuration are first presented. Then, the conventional estimation methods of BF, Capon and MUSIC are briefly recalled. The double Capon (D-Capon) and double MUSICAL (D-MUSICAL) methods are then considered. The implementation issues of these methods are discussed, and 3D smoothing of the cross-spectral data matrix is introduced. We then show that similar results can be obtained using these two methods. Finally, we illustrate the performance of these methods, compared to the existing methods, on real data recorded in an ultrasonic tank. This experimental environment reproduces oceanic acoustic propagation at a small scale.



Context

Consider a shallow water environment. The experimental configuration is composed of two arrays: a VLA of N_e regularly spaced sources, and a VLA of M_r regularly spaced receivers. For the sake of simplicity, the inter-sensor distance is Δ on both arrays. We consider that N_e and M_r are odd, and that the reference source is located at the middle of the source array (index n_ref = (N_e + 1)/2), and the reference receiver at the middle of the receiver array (index m_ref = (M_r + 1)/2). The source signal is broadband, and the propagated signals are recorded on F frequency bins that cover the frequency band. The recording space thus has three dimensions: source-receiver-frequency. The recorded data contain the whole transfer matrices between each source and each receiver in the frequency domain, and form the data cube X (N_e × M_r × F). For a given signal emitted by the source array, the propagation in the waveguide is multipath, and it leads to several plane wave arrivals on the receiver array. Note that in our configuration, the horizontal distance between the arrays is larger than their lengths by at least a factor of 10; the plane wave approximation is thus realistic. Each arrival p, corresponding to a given raypath, is characterized by its three observables:

  • the DOA θ_p^r, also known as the reception angle: the angle between the raypath direction at the receiver array and the normal to the receiver array;

  • the DOD θ_p^e, also known as the emission angle: the angle between the raypath direction at the source array and the normal to the source array;

  • the TOA T_p, which is the travel time between the reference source and the reference receiver.

Note that in this article “source” designates the emitting sensor. To avoid ambiguity, the elementary contribution that we want to detect and estimate the parameters for, and which corresponds to a raypath in the waveguide, is designated by “arrival” (and not by “source” as it can be classically designated in array processing).

Motivation of 3D detection

Detection and estimation of arrivals are classically achieved in 1D and 2D configurations. In the 1D case, narrowband signals recorded on a receiver array lead to a DOA estimation space [7, 14, 15]. In the 2D case, two configurations have been studied: broadband signals recorded on a receiver array, with a DOA-TOA estimation space [11]; or narrowband signals emitted by a source array and recorded on a receiver array, with a DOA-DOD estimation space [16]. The resolution is limited by the size of the arrays and by the signal central frequency in DOA and DOD, and by the signal bandwidth in TOA (cf. Section “Conventional estimation methods”).

When combined with a receiver array and broadband signals, the source array adds a third dimension to the recording space, and consequently a new dimension to the estimation space: the DOD. For arrivals with different DODs, an estimator that includes the DOD dimension can better detect and estimate these arrivals. This improvement is particularly efficient for arrivals that are close in the DOA and TOA dimensions, but far apart in the DOD dimension.

We assume a configuration where two raypaths that start from the source propagate with close DOAs and close TOAs, but with two DODs that are significantly different (see Figure 1). The two arrivals might not be separated in the 2D DOA-TOA estimation space (see Figure 2, left). On the contrary, when adding the DOD dimension, the arrivals are well separated (see Figure 2, right). The use of a 3D estimation space that includes the DOD dimension thus improves the arrival separation and observable estimation by taking into account the propagation properties of the arrivals. This principle has been used to develop the D-BF in this experimental configuration [13]. Nevertheless, this method suffers from the inner limitation of BF-like methods, i.e. the limited resolution due to the limited size of the arrays, and to the signal central frequency and bandwidth.

Figure 1

Schematic representation of two raypaths propagating in the waveguide. The two raypaths have close DOAs (near θ_0^r) and close TOAs (proportional to the raypath length for a homogeneous medium), but DODs θ_1^e and θ_2^e that are significantly different.

Figure 2

The two arrivals that correspond to the raypaths of Figure 1 might not be detected in the 2D DOA-TOA space (left), but are detected in the 3D DOA-DOD-TOA space (right).

Signal model

At the frequency ν, the Fourier transform of the signal recorded on receiver m and coming from source n is the sum of the P arrivals:

x_{m,n,ν} = s_ν ∑_{p=1}^{P} a_p exp[φ_p] + b_{m,n,ν}        (1)


φ_p = j2πν( T_p + (m − m_ref) τ(θ_p^r) + (n − n_ref) τ(θ_p^e) )        (2)

where τ(θ_p^e) = Δ sin(θ_p^e)/v (respectively τ(θ_p^r) = Δ sin(θ_p^r)/v) is the delay associated with the DOD (respectively the DOA), assuming a constant sound speed v at the arrays; a_p is the amplitude of the p-th arrival; b_{m,n,ν} is the noise contribution, generally considered uncorrelated in space and frequency; and s_ν is the source spectrum, which is assumed to be known. Note that the number of arrivals, P, is typically around 10.
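As an illustration, the signal model of Equation 1 can be synthesized numerically. The NumPy sketch below builds a toy data cube X for P = 2 arrivals; all parameter values (array sizes, frequency band, angles, TOAs, noise level) are illustrative assumptions, not the experimental values.

```python
import numpy as np

# Synthetic data cube X (N_e x M_r x F) following Eq. (1); toy parameters.
Ne, Mr, F = 11, 11, 64            # sources, receivers, frequency bins (assumed)
delta, v = 0.75e-3, 1500.0        # inter-sensor spacing (m), sound speed (m/s)
freqs = np.linspace(0.5e6, 1.5e6, F)          # frequency band (Hz)
s_nu = np.ones(F)                              # flat source spectrum (assumed known)
n_ref, m_ref = (Ne + 1) // 2, (Mr + 1) // 2    # reference source and receiver

# P = 2 arrivals: (amplitude a_p, DOA theta_r, DOD theta_e, TOA T_p) in rad/s units
arrivals = [(1.0, 0.10, 0.05, 0.70e-3), (0.8, -0.08, 0.12, 0.72e-3)]

tau = lambda theta: delta * np.sin(theta) / v  # delay per sensor step
m = np.arange(1, Mr + 1)[None, :, None]        # receiver index, broadcast axis
n = np.arange(1, Ne + 1)[:, None, None]        # source index, broadcast axis
nu = freqs[None, None, :]

X = np.zeros((Ne, Mr, F), dtype=complex)
for a_p, th_r, th_e, T_p in arrivals:
    phase = 2j * np.pi * nu * (T_p + (m - m_ref) * tau(th_r) + (n - n_ref) * tau(th_e))
    X += s_nu * a_p * np.exp(phase)
# additive noise contribution b_{m,n,nu}
X += 0.01 * (np.random.randn(Ne, Mr, F) + 1j * np.random.randn(Ne, Mr, F))
```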

Experimental constraints

We want to achieve arrival separation and observable estimation for a time-evolving medium. It is thus a problem to record different realizations at different times, because the medium and its sound speed might change between two realizations. Consequently, we can only consider a single realization to perform the observable estimation. This constraint must be taken into account for the estimation of the cross-spectral matrix (cf. Section “Implementation”).

Conventional estimation methods

We briefly recall the 1D conventional estimation methods: BF, Capon and MUSIC, in the general context of array processing. The specific experimental constraints, and particularly the smoothing issues, are discussed in Section “Implementation”.

We consider a single source n_0 that emits a narrowband signal (at frequency ν_0), recorded on a VLA of M_r receivers. The recorded signal forms the vector x_{1D} (size M_r × 1), and we introduce the covariance data matrix R_{1D} = E{ x_{1D} x_{1D}^H }. We also introduce the steering vector d(θ^r)_{ν_0}, which represents the normalized contribution of a perfect plane wave of direction θ^r on the receiver array:

d(θ^r)_{ν_0} = [ e^{j2πν_0 (1−m_ref) τ(θ^r)}, e^{j2πν_0 (2−m_ref) τ(θ^r)}, …, e^{j2πν_0 (M_r−m_ref) τ(θ^r)} ]^T        (3)

The conventional BF can be expressed by the squared projection of the signal on the steering vectors:

P_BF(θ^r) = d(θ^r)_{ν_0}^H R̂_{1D} d(θ^r)_{ν_0}        (4)

where R̂_{1D} is the estimated covariance data matrix. The resolution of the BF is directly linked to the size l = (M_r − 1)Δ of the array: two sources with DOAs closer than θ_min^r ≈ λ/l, with λ the wavelength at the central frequency, will not be separated by BF.
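The BF estimator above can be sketched in a few lines. The following toy example, with one noiseless plane wave and illustrative parameter values, scans a DOA grid and recovers the arrival angle:

```python
import numpy as np

# 1D conventional beamforming, Eq. (4), on a toy single-arrival signal.
Mr, delta, v, nu0 = 11, 0.75e-3, 1500.0, 1.0e6   # illustrative values
m_ref = (Mr + 1) // 2
m = np.arange(1, Mr + 1)

def steer(theta):
    # normalized steering vector d(theta)_nu0 of Eq. (3)
    d = np.exp(2j * np.pi * nu0 * (m - m_ref) * delta * np.sin(theta) / v)
    return d / np.sqrt(Mr)

theta_true = 0.2                        # DOA in radians (assumed)
x = steer(theta_true)                   # noiseless plane wave
R = np.outer(x, x.conj())               # rank-1 covariance estimate

thetas = np.linspace(-0.5, 0.5, 201)
P_bf = np.array([np.real(steer(t).conj() @ R @ steer(t)) for t in thetas])
theta_hat = thetas[np.argmax(P_bf)]     # peaks at the true DOA
```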

Two types of techniques have been developed to overcome these limitations:

  • Adaptive estimators, like the Capon beamformer [15],

  • High-resolution estimators, like MUSIC [7, 14].

As for BF, the Capon estimator principle is to project the signal on steering vectors [15]. Capon steering vectors are calculated adaptively, so that they minimize the power contributed by the noise and by any signals coming from other directions than θr, while maintaining a unitary gain in the direction of interest θr. The Capon algorithm (which is also known as the minimum variance distortionless response) has already been successfully applied in underwater acoustics [17].

MUSIC is a subspace-based method [7, 14]: the recording space is divided into a signal subspace and a noise subspace. This subspace division is achieved using an eigenvalue decomposition (EVD) of R̂_{1D}. The signal subspace is spanned by the L eigenvectors corresponding to the L largest eigenvalues. The noise subspace is spanned by the other M_r − L eigenvectors. Finally, the estimator is the inverse of the projection of the signal on the noise subspace. For uncorrelated arrival amplitudes with spatially white noise, the MUSIC estimator is unbiased and its resolution is not limited.

D-Capon and D-MUSICAL

The 3D model and estimators are considered in this section, in the general context of array processing. The specific experimental constraints, and particularly the smoothing issues, are discussed in Section “Implementation”.

Data model

As shown in Section “Context”, the 3D recorded signal is the data cube X. We concatenate X into a long vector of size N_e M_r F. The contribution of source n at frequency ν on the M_r elements of the receiver array is expressed by the vector:

x_{n,ν} = [ x_{1,n,ν}, …, x_{M_r,n,ν} ]^T        (5)

The whole contribution at the frequency ν on the source and receiver arrays is expressed as the concatenation of all of the source contributions x_{n,ν} from n = 1 to n = N_e:

x_ν = [ x_{1,ν}^T, …, x_{N_e,ν}^T ]^T        (6)

The signal expressed on the long vector is finally the concatenation of all of the frequency contributions x_ν from ν = ν_1 to ν = ν_F:

x = [ x_{ν_1}^T, …, x_{ν_F}^T ]^T        (7)

The noise long vector b is built in the same way, starting from the noise contributions b_{m,n,ν}.
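The nested concatenation above maps element (m, n, ν) of the cube to a known position in the long vector: receivers vary fastest, then sources, then frequencies. A minimal NumPy sketch of this ordering, with toy sizes chosen for illustration:

```python
import numpy as np

# Concatenation of the data cube X (N_e x M_r x F) into the long vector x:
# receivers fastest, then sources, then frequency (toy sizes).
Ne, Mr, F = 3, 4, 2
X = np.arange(Ne * Mr * F).reshape(Ne, Mr, F)   # toy cube, X[n, m, f]

x_long = X.transpose(2, 0, 1).reshape(-1)        # iteration order: (f, n, m)

# Element (m, n, f) lands at index f*(Ne*Mr) + n*Mr + m (0-based indices):
assert x_long[1 * (Ne * Mr) + 2 * Mr + 3] == X[2, 3, 1]
```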

Using Equation 1, x is composed of P arrivals and can be written as a matrix product:

x = ∑_{p=1}^{P} a_p d(θ_p^r, θ_p^e, T_p) + b = D(θ^r, θ^e, T) a + b        (8)

where T = [T_1, …, T_P] is the vector of the TOAs, and θ^e = [θ_1^e, …, θ_P^e] and θ^r = [θ_1^r, …, θ_P^r] are, respectively, the vectors of the emission and reception angles. a = [a_1, …, a_P]^H is the vector of the arrival amplitudes. d(θ_p^r, θ_p^e, T_p) is the steering vector that corresponds to the arrival p. It is a long vector (N_e M_r F × 1) that corresponds to the contributions of a plane wave of parameters θ_p^r, θ_p^e, T_p emitted by the source array with spectrum {s_ν, ν = ν_1, …, ν_F} and recorded on the receiver array. To build it, we first define the long steering vector (N_e M_r × 1) corresponding to the single frequency ν_i for the N_e sources and the M_r receivers:

d_{ν_i}(θ_p^r, θ_p^e, T_p) = [ s_{ν_i} e^{j2πν_i (T_p + (1−n_ref) τ(θ_p^e))} d(θ_p^r)_{ν_i}^T, s_{ν_i} e^{j2πν_i (T_p + (2−n_ref) τ(θ_p^e))} d(θ_p^r)_{ν_i}^T, …, s_{ν_i} e^{j2πν_i (T_p + (N_e−n_ref) τ(θ_p^e))} d(θ_p^r)_{ν_i}^T ]^T        (9)

and finally, we concatenate these vectors for all of the frequencies ν1 to ν F :

d(θ_p^r, θ_p^e, T_p) = [ d_{ν_1}(θ_p^r, θ_p^e, T_p)^T, …, d_{ν_F}(θ_p^r, θ_p^e, T_p)^T ]^T        (10)
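The construction of Equations 9 and 10 can be sketched directly from the definitions; the nested loops below mirror the concatenation order (frequency, then source, then receiver). All parameter values are illustrative assumptions, and the vector is normalized for convenience:

```python
import numpy as np

# 3D steering vector d(theta_r, theta_e, T) of Eqs. (9)-(10); toy parameters.
Ne, Mr, F = 5, 5, 8
delta, v = 0.75e-3, 1500.0
freqs = np.linspace(0.8e6, 1.2e6, F)
s_nu = np.ones(F)                              # flat source spectrum (assumed)
n_ref, m_ref = (Ne + 1) // 2, (Mr + 1) // 2

def steering_3d(theta_r, theta_e, T):
    tau_r = delta * np.sin(theta_r) / v
    tau_e = delta * np.sin(theta_e) / v
    d = np.empty(Ne * Mr * F, dtype=complex)
    i = 0
    for k, nu in enumerate(freqs):             # frequency: slowest index
        for n in range(1, Ne + 1):             # then sources
            for m in range(1, Mr + 1):         # receivers: fastest index
                d[i] = s_nu[k] * np.exp(2j * np.pi * nu *
                        (T + (n - n_ref) * tau_e + (m - m_ref) * tau_r))
                i += 1
    return d / np.linalg.norm(d)               # normalized long vector

d0 = steering_3d(0.1, 0.05, 0.7e-3)            # one candidate (DOA, DOD, TOA)
```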

The matrix D(θ^r, θ^e, T) (N_e M_r F × P) contains the P steering vectors previously built:

D(θ^r, θ^e, T) = [ d(θ_1^r, θ_1^e, T_1), …, d(θ_P^r, θ_P^e, T_P) ]        (11)

We finally obtain the model of the 3D (N_e M_r F × N_e M_r F) cross-spectral data matrix R. It is the analog of the spatial covariance matrix of the 1D case, and can be expressed by:

R = E{ x x^H } = D(θ^r, θ^e, T) A D(θ^r, θ^e, T)^H + B        (12)

where A = E{ a a^H } is the arrival amplitude covariance matrix (P × P) and B = E{ b b^H } is the 3D cross-spectral noise matrix (N_e M_r F × N_e M_r F).

Note on the 3D steering vectors d(θ_p^r, θ_p^e, T_p):

  • two distinct steering vectors are linearly independent (non-collinear); the correspondence between a set of parameters and a 3D steering vector is thus unique;

  • two distinct steering vectors are never orthogonal;

  • given N = N_e M_r F, the dimension of the recording space, any n ≤ N distinct steering vectors form a free family; N distinct steering vectors thus form a basis;

  • assuming P ≤ N, the steering vectors associated with the signal thus form a free family and span a space of dimension P. The dimension of the signal subspace of R is thus equal to the rank of A. Consequently, the rank of R depends on the degree of correlation between the P arrival amplitudes.


D-Capon

Capon has already been extended to the 2D context [18]. The proposed D-Capon method consists of extending the conventional Capon method to the 3D OAT context. We create 3D Capon steering vectors g(θ^r, θ^e, T) which minimize the power contributed by the noise and by any signal coming from other ‘directions’ than (θ^r, θ^e, T), while maintaining a unitary gain in the direction of interest (θ^r, θ^e, T).

Assuming R ̂ is invertible, we obtain:

g(θ^r, θ^e, T)_Cap = R̂^{−1} d(θ^r, θ^e, T) / ( d(θ^r, θ^e, T)^H R̂^{−1} d(θ^r, θ^e, T) )        (13)

where R ̂ is the estimated cross-spectral data matrix and d the theoretical steering vectors built using Equations 9 and 10. The estimation of the cross-spectral matrix will be discussed in Section “Implementation”.

The D-Capon estimator in the 3D DOA-DOD-TOA estimation space is then:

P_{D-Capon}(θ^r, θ^e, T) = g(θ^r, θ^e, T)_Cap^H R̂ g(θ^r, θ^e, T)_Cap = 1 / ( d(θ^r, θ^e, T)^H R̂^{−1} d(θ^r, θ^e, T) )        (14)

Finally, the following information is necessary to calculate the D-Capon:

  1. the recorded data, to build R̂;

  2. the source spectrum s_ν;

  3. the environment information Δ and v, to calculate τ(θ^e) and τ(θ^r), and thus the steering vectors d(θ^r, θ^e, T).
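The D-Capon evaluation reduces to computing 1/(d^H R̂^{-1} d) on a grid of candidate parameters. The sketch below illustrates this on a small toy covariance, a 1D stand-in for the smoothed 3D matrix with diagonal loading already included; solving R y = d avoids forming the inverse explicitly. All values are illustrative assumptions:

```python
import numpy as np

# Toy Capon spectrum: 1 / (d^H R^-1 d) scanned over a 1D parameter grid.
N = 16
n = np.arange(N)
d_true = np.exp(2j * np.pi * 0.2 * n) / np.sqrt(N)       # true 'arrival'
R = 10 * np.outer(d_true, d_true.conj()) + np.eye(N)     # signal + loading

def capon(R, d):
    # solve R y = d rather than forming R^-1 explicitly (cheaper, stabler)
    y = np.linalg.solve(R, d)
    return 1.0 / np.real(d.conj() @ y)

grid = np.linspace(0.0, 0.5, 101)
spec = np.array([capon(R, np.exp(2j * np.pi * f * n) / np.sqrt(N)) for f in grid])
f_hat = grid[np.argmax(spec)]            # peaks at the true parameter, 0.2
```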


D-MUSICAL

The MUSIC algorithm has already been extended to the 2D configuration by Gounon et al. in a wideband context: MUSICAL (MUSIC Active Large Band) estimates the 2D (DOA-TOA) observables starting from the 2D recording space: receiver-frequency [11]. Recently, a 2D MUSIC was developed to estimate the DOD and DOA jointly [16]. In the same way, we develop a 3D MUSIC estimator, which we call D-MUSICAL, to extend the conventional MUSIC method.

The EVD of R̂ is expressed by:

R̂ = V Λ V^H        (15)

where V = [v_1, …, v_{N_e M_r F}] is the N_e M_r F × N_e M_r F matrix that contains the eigenvectors v_i, and Λ is the N_e M_r F × N_e M_r F diagonal matrix that contains the N_e M_r F eigenvalues λ_i.

As R̂ is a normal matrix (R̂ R̂^H = R̂^H R̂), its eigenvectors are orthogonal (⟨v_i, v_j⟩ = 0 for all i ≠ j). Selecting the L largest eigenvalues and their associated eigenvectors, we span a ‘signal’ subspace; the N_e M_r F − L other eigenvectors span the ‘noise’ subspace. These two subspaces are orthogonal. The signal projector (respectively the noise projector) is deduced from the L first (respectively the N_e M_r F − L last) eigenvectors: π_s = ∑_{i=1}^{L} v_i v_i^H (respectively π_n = ∑_{i=L+1}^{N_e M_r F} v_i v_i^H).

The D-MUSICAL estimator, in the 3D DOA-DOD-TOA estimation space, is then:

P_{D-MUSICAL}(θ^r, θ^e, T) = 1 / ( d(θ^r, θ^e, T)^H π̂_n d(θ^r, θ^e, T) )        (16)
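The same projection mechanics can be illustrated on a toy 1D example: build a low-rank covariance, split signal and noise subspaces with an EVD, and scan 1/||π_n d||^2 over a parameter grid. All values below are illustrative assumptions:

```python
import numpy as np

# Toy MUSIC-style spectrum: project candidate steering vectors on the
# estimated noise subspace; peaks of 1/||pi_n d||^2 give the parameters.
N, L = 16, 2                                # recording size, subspace size
n = np.arange(N)
d = lambda f: np.exp(2j * np.pi * f * n) / np.sqrt(N)

# rank-2 'signal' covariance for two arrivals, plus weak white noise
R = 5 * np.outer(d(0.12), d(0.12).conj()) + 3 * np.outer(d(0.31), d(0.31).conj())
R = R + 0.01 * np.eye(N)

w, V = np.linalg.eigh(R)                    # eigenvalues in ascending order
En = V[:, : N - L]                          # noise subspace: N - L smallest
music = lambda f: 1.0 / np.real(np.linalg.norm(En.conj().T @ d(f)) ** 2)

grid = np.linspace(0.0, 0.5, 501)
spec = np.array([music(f) for f in grid])   # sharp peaks at 0.12 and 0.31
```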

Note that a 3D MUSIC method has already been developed for multicomponent seismic signals [19]. However, its third dimension (the component dimension) does not address the same issue, and DOD estimation cannot be achieved with it.

The D-MUSICAL implementation needs the same information as the D-Capon (cf. Section “D-Capon”). Assuming that A is full rank and that the noise is white in the three dimensions with variance σ_n^2, we choose L = P: the signal is completely represented by the signal subspace composed of the P first eigenvalues, and the noise subspace contains only noise contributions. In this case, the D-MUSICAL estimation is unbiased.

In our practical case, arrival amplitudes are correlated, which means that A is not full rank. The implementation of D-Capon and D-MUSICAL thus needs preprocessing, which is detailed in the following Sections “Smoothing issue” and “Diagonal loading and estimation of L”.


Implementation

In this section, we discuss the implementation of the proposed D-Capon and D-MUSICAL algorithms in our OAT context.

Smoothing issue

The Capon and MUSIC detection methods need the signal subspace to be correctly represented to be efficient. This means that the arrivals must be uncorrelated, or at least not fully correlated, to generate a signal subspace of dimension P. Equivalently, the amplitude covariance matrix A must be full rank. To achieve this, classical methods assume that the arrivals are statistically uncorrelated, and estimate the cross-spectral data matrix R̂ by averaging a large number of realizations.

In our context, this type of estimation is not possible, for the two following reasons:

  • As explained in Section “Experimental constraints”, only one realization can be considered to perform the observable estimation. R ̂ must be determined using a single data realization x.

  • Moreover, even assuming a non-evolving medium, the arrival amplitudes remain correlated. Indeed, the arrivals are induced by the different raypaths that result from the acoustic propagation. The degree of correlation between the arrival amplitudes a_p is thus determined by the propagation, and not by the emitted source signals. The amplitude vector a is constant between realizations, up to a multiplying factor. The P arrival amplitudes are thus fully correlated.

Considering these two issues, the rank of R̂ will be 1 if it is estimated classically. Capon-like or MUSIC-like methods are then equivalent to the BF method.

To avoid this problem, a 3D smoothing method of the matrix is developed. Smoothing methods are used to increase the rank of R̂. They were developed in 1D configurations [20, 21] and then extended to 2D [22]. The principle is to divide the array into K different subarrays with the same characteristics (size, sampling). Each subarray yields a signal x_k. The diversity in realizations is replaced by a diversity in subarrays. The estimated matrix R̂ is the mean of the matrices R̂_k.

We extend this principle to the 3D context, forming subarrays in the three dimensions of the recording space. The source and receiver VLAs are divided into K_e (respectively K_r) vertical line subarrays of N_e^s = N_e − K_e + 1 (respectively M_r^s = M_r − K_r + 1) sensors. The inter-sensor distance is not changed. In the same way, the frequency band is divided into K_f subbands of F^s = F − K_f + 1 frequency bins. The combinations of the subarrays in the three dimensions lead us to consider K = K_e K_r K_f different 3D subarrays, and thus K different signals x_k. Figure 3 illustrates the construction of the subarrays and of the x_k starting from the data cube X. The estimated cross-spectral data matrix R̂ is finally expressed by:

R̂ = (1/(K_e K_r K_f)) ∑_{k=1}^{K} x_k x_k^H = (1/K) ∑_{k=1}^{K} R̂_k        (17)
Figure 3

Schematic representation of the subarray extraction for the smoothing.

Note that the size of R̂ is now N_e^s M_r^s F^s × N_e^s M_r^s F^s. The numbers of subarrays K_e, K_r and K_f depend on the configuration and on the number of arrivals P. In each dimension, the choice of the subarray size depends on the size of the original array and on the resolution we want to obtain in the corresponding estimation dimension (DOD for the source array, DOA for the receiver array, and TOA for the frequency band). The aim of the smoothing is to increase the number of significant eigenvalues of R̂, so that they accurately represent the signal subspace. This objective is achieved when the eigenstructure of the smoothed matrix (i.e. the repartition of its eigenvalues) is stable with K. We empirically observe that K must be chosen much larger than P. As P is a priori not precisely known, we approximately estimate it from the knowledge of the environment, and increase K until this objective is achieved.
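The 3D smoothing can be sketched as a sliding sub-cube extraction followed by an average of outer products, flattening each sub-cube with the same ordering as the long vector x. The sizes below are illustrative assumptions:

```python
import numpy as np

# 3D smoothing of Eq. (17): average outer products of K = Ke*Kr*Kf sub-cubes.
Ne, Mr, F = 7, 7, 16                         # toy cube sizes
Ke, Kr, Kf = 3, 3, 5                         # numbers of subarrays (assumed)
Ne_s, Mr_s, Fs = Ne - Ke + 1, Mr - Kr + 1, F - Kf + 1

rng = np.random.default_rng(1)
X = rng.standard_normal((Ne, Mr, F)) + 1j * rng.standard_normal((Ne, Mr, F))

K = Ke * Kr * Kf
R = np.zeros((Ne_s * Mr_s * Fs,) * 2, dtype=complex)
for ke in range(Ke):
    for kr in range(Kr):
        for kf in range(Kf):
            sub = X[ke:ke + Ne_s, kr:kr + Mr_s, kf:kf + Fs]
            xk = sub.transpose(2, 0, 1).reshape(-1)   # same ordering as x
            R += np.outer(xk, xk.conj())
R /= K                                        # smoothed estimate, rank <= K
```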

The 3D smoothing is a pre-processing that is necessary to achieve D-Capon or D-MUSICAL in our context. However, it has two limitations:

  • Assuming a perfectly plane wave in the 1D configuration, two different signals corresponding to two different subarrays are equal up to a multiplying factor. The amplitude of this factor is 1, and its phase depends on the delay between the subarrays and on the considered plane wave. They can thus be seen as two realizations of the same signal with different amplitudes. Considering now P plane waves, the 1D smoothing leads to vectors of arrival amplitudes a_k that are different for each subarray k. Finally, the estimated amplitude covariance matrix Â = (1/K) ∑_{k=1}^{K} a_k a_k^H is nonsingular [21] if K ≥ P. The estimated matrix R̂ thus has a rank equal to P. In the 3D case, two contributions of the same plane wave on two different 3D subarrays are not equal up to a multiplying factor. This produces a bias in the estimation of R̂, which prevents the 3D methods from having an unlimited resolution. In practical cases, however, this bias is small compared to the resolution needed.

  • As we smooth only one realization instead of averaging several realizations, we cannot recover the statistical characteristics of the additive noise. Consequently, the noise must be considered as a deterministic element. The rank of R̂ cannot exceed K, and the K eigenvectors that correspond to the K first eigenvalues span a subspace that contains the whole data: the signal, but also the noise and the bias induced by the smoothing.

Diagonal loading and estimation of L

D-Capon requires the inversion of the cross-spectral data matrix R̂ (Equation 14). Consequently, R̂ must be full rank. Usually, additive spatially and spectrally white noise is assumed and R̂ is estimated with a large number of realizations, so that the statistical properties of the noise are satisfied. The noise contribution to R̂ is then a diagonal matrix σ_n^2 I, where σ_n^2 is the noise variance, and the matrix is thus invertible. Here, due to the smoothing, the noise must be considered as deterministic, and R̂ is no longer invertible. To overcome this issue, diagonal loading is applied: a diagonal matrix is added to R̂, leading to the new estimated cross-spectral data matrix R̂_C = R̂ + σ^2 I, which is full rank and invertible. The diagonal loading introduces the parameter σ. As the diagonal loading can be seen as artificially adding white noise, the action of σ on the D-Capon estimation can be more reliably linked to an induced signal-to-noise ratio:

SNR_C = 10 log_10( ∑_i R̂(i,i) / (N_e^s M_r^s F^s σ^2) )        (18)

where R̂(i,i) is the i-th diagonal element of R̂. A natural choice is to take σ as small as possible, such that the condition ‘R̂_C invertible’ is satisfied on the computational platform. As we will see, this choice does not generally give the best result. Recent methods estimate the optimal diagonal loading [23, 24], but they are not adapted to our context.
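A minimal sketch of the diagonal loading and of the induced SNR_C, using a random rank-deficient matrix as a stand-in for a smoothed R̂; the loading level chosen here is an arbitrary illustration:

```python
import numpy as np

# Diagonal loading R_C = R + sigma^2 I and induced SNR_C of Eq. (18).
N, K = 32, 8                               # long-vector size, subarray count
rng = np.random.default_rng(2)
Xs = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
R = (Xs @ Xs.conj().T) / K                 # rank K < N: not invertible

sigma2 = 1e-3 * np.real(np.trace(R)) / N   # small loading level (a choice)
R_C = R + sigma2 * np.eye(N)               # full rank, invertible

snr_c = 10 * np.log10(np.real(np.trace(R)) / (N * sigma2))   # here 30 dB
```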

D-MUSICAL requires the estimation of the signal subspace dimension L. For an unbiased estimation of R̂, L would be equal to P, the number of expected arrivals. However, the smoothing biases the estimation of R̂, so a classical estimator of L based on the statistical properties of the noise [14, 25] cannot be applied. Moreover, we empirically observe that the bias introduced by the 3D smoothing leads to a difference between P and the number L of significant eigenvectors that actually span the signal subspace; this number L is larger than P. L is thus determined from the decreasing curve of the eigenvalues calculated with the EVD. The eigenvectors that correspond to all of the significant eigenvalues (chosen with a threshold of 0.5% of the first eigenvalue) are selected and span the signal subspace, whereas all of the other ones span the noise subspace. Note that L is thus lower than K (in practice, much lower) and larger than P.

Computational cost

One drawback of these methods compared to D-BF is the processing cost. D-Capon is divided into three computational steps:

  1. R̂ estimation by smoothing;

  2. R̂ inversion;

  3. projection of the steering vectors on R̂^{−1} (cf. Equation 14).

D-MUSICAL is divided into three computational steps:

  1. R̂ estimation by smoothing;

  2. EVD of R̂ to estimate the noise projector π_n;

  3. projection of the steering vectors on π_n (cf. Equation 16).

The smoothing (step 1) is common to the two methods. A natural way to compute R̂ is to follow the equation formulation: extract the subarray signals x_k from the data, compute the corresponding cross-spectral data matrix R̂_k, loop this step over all of the subarrays k = 1, …, K, and finally average the R̂_k. An alternative way consists of building the matrix X_s = [x_1, …, x_K] and noting that R̂ = X_s X_s^H / K. This matrix product is less computationally expensive than the loop of the natural computation.
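The two computations give the same matrix; the sketch below checks the loop formulation against the X_s X_s^H product on random toy data:

```python
import numpy as np

# Loop of outer products vs. single matrix product for the smoothed estimate.
N, K = 50, 12                              # toy subarray size and count
rng = np.random.default_rng(3)
Xs = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))

R_loop = sum(np.outer(Xs[:, k], Xs[:, k].conj()) for k in range(K)) / K
R_fast = (Xs @ Xs.conj().T) / K            # same result, fewer operations
assert np.allclose(R_loop, R_fast)
```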

Knowing that π_n = I − π_s, the projection step for D-MUSICAL (step 3) can be considerably sped up, noting that the size of the signal subspace L (which is of the same order of magnitude as the number of arrivals P; typically around 10) is much smaller than the size of the noise subspace N_e^s M_r^s F^s − L (which depends on the size of the arrays and the smoothing parameters; typically > 100).

As only the signal projector π_s is needed, the EVD can be realized by calculating only the first M eigenvalues and eigenvectors. M must be sufficiently large to allow the estimation of the subspace size L following the procedure given in Section “Diagonal loading and estimation of L”. The programming platform provides a function that calculates only the first M elements of the EVD. Moreover, introducing the matrix T = X_s^H X_s (K × K) and realizing an EVD T = W Θ W^H, it can be shown [19] that the K eigenvalues θ_i of T are equal (up to the 1/K normalization) to the K first eigenvalues of R̂, and that the corresponding eigenvectors are recovered as v_i = X_s w_i / √θ_i. As the EVD of an N × N matrix is O(N^3), the EVD of T will be preferred (under the condition K < N_e^s M_r^s F^s). Finally, step 2 of D-MUSICAL is computed as follows:

  1. Calculation of X_s (already available from the smoothing processing) and T.

  2. Calculation of the M first eigenvalues θ_i and eigenvectors w_i of T.

  3. Estimation of L from the behavior of the decreasing eigenvalues, and deduction of the L first eigenvectors v_i = X_s w_i / √θ_i of R̂.

In contrast, steps 2 and 3 of D-Capon cannot be sped up by algorithmic optimization. The computational gain obtained with the optimized algorithm, compared to the conventional one, depends on the size of the arrays, on the smoothing parameters, and on the choice of M and L (itself depending on P). Table 1 gives an example of the computational gain on the experimental data, which are presented and analyzed in the next section.
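This small-EVD trick is easy to verify numerically: the eigenvalues of the K × K matrix T match the K largest eigenvalues of X_s X_s^H, and the large eigenvectors are recovered by v_i = X_s w_i / √θ_i. The sizes below are illustrative assumptions:

```python
import numpy as np

# Small-EVD trick: diagonalize T = Xs^H Xs (K x K) instead of Xs Xs^H (N x N).
N, K = 60, 6
rng = np.random.default_rng(4)
Xs = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))

T = Xs.conj().T @ Xs                         # K x K
lam, W = np.linalg.eigh(T)                   # eigenvalues ascending, all > 0
V = Xs @ W / np.sqrt(lam)                    # big eigenvectors v_i = Xs w_i / sqrt(lam_i)

big = Xs @ Xs.conj().T                       # N x N (never needed in practice)
lam_big = np.linalg.eigvalsh(big)[-K:]       # its K largest eigenvalues
assert np.allclose(np.sort(lam), np.sort(lam_big))
assert np.allclose(big @ V, V * lam)         # columns of V are eigenvectors
```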

Table 1 Optimization gain of the D-MUSICAL algorithm for the experimental data (see values of parameters in Table 2 )
Table 2 Values of the parameters for D-MUSICAL in the experimental data


Detection performances on real tank data

D-MUSICAL has previously been validated on simulated data [26]. We apply D-MUSICAL and D-Capon to real data recorded in an ultrasonic tank that reproduces the oceanic acoustical propagation at a small scale.

Two coplanar source and receiver arrays of 11 transducers face each other in a 1.1-m-long, 5.2-cm-deep ultrasonic waveguide (Figure 4). The 11 transducers are regularly spaced, with Δ = 0.75 mm, and the arrays are centered in the water column. The central frequency of the transducers is 1 MHz (wavelength λ = 1.5 mm), with a 1 MHz frequency bandwidth. A scale ratio of 1000 transforms this ultrasonic waveguide into a realistic shallow-water waveguide classically encountered in the ocean. Each transducer is 0.75 mm × 12 mm in size, which makes the transducer arrays relatively omnidirectional in the plane defined by the source-receiver arrays, and very collimated outside this plane, to prevent acoustic echoes from the tank sidewalls. The waveguide bottom is made of steel, for which the boundary conditions are nearly perfect at the water-bottom interface (full reflection of the raypaths). The acquisition sequence consists of recording the whole transfer matrix between each source and each receiver in the frequency domain. A fast way to perform this acquisition is a round-robin sequence, during which each source successively emits a broadband pulse at the central frequency of the source transducers [27]. The delay between the pulses emitted by successive sources is such that the full waveguide transfer matrix is recorded in less than 10 ms. Consequently, at the ultrasonic scale, the medium can be considered constant during the acquisition, and the data matrix X is formed by the acquisition sequence. Note that OAT has already been performed on these ultrasonic data using D-BF [28].

Figure 4

Raypaths of the seven first arrivals in the waveguide. Red horizontal lines represent the air-water interface and the waveguide bottom. Green vertical lines represent the source and receiver arrays. The raypaths are plotted between the centers of the source and receiver arrays.

Figure 4 shows the first seven raypaths, which correspond to the first seven arrivals we want to detect. To illustrate the results, the 2D methods are first applied to data recorded in the receiver-frequency domain (for a single source), leading to a 2D DOA-TOA estimation space. The source considered is the one located at the center of the source array. The source is unique, but the number of arrivals to detect remains equal to P. Figures 5 and 6 show the 2D BF, 2D Capon and MUSICAL results. Black crosses represent the theoretical arrivals calculated with a raypath model. The smoothing parameters are K_r = 5 and K_f = 9, with L = 10 and SNR_C = 25 dB. 2D BF (Figure 5) does not manage to separate the three first arrivals (around 7.71 × 10⁻⁴ s), nor the 4th and 5th ones (around 7.75 × 10⁻⁴ s). 2D Capon (Figure 6, left) and MUSICAL (Figure 6, right) are similar: although they do not manage to separate the three first arrivals, they do manage to separate the 4th and 5th.

Figure 5

2D BF on the experimental data. Black crosses represent the theoretical arrivals.

Figure 6

2D methods (2D Capon and MUSICAL) on the experimental data. Black crosses represent the theoretical arrivals.

Let us now consider the 3D estimators D-BF, D-Capon and D-MUSICAL. Figures 7, 8 and 9 represent the D-BF, D-Capon and D-MUSICAL results, respectively. Table 2 presents all of the values of the parameters used by the D-Capon and D-MUSICAL estimators. The smoothing parameters are K_e = K_r = 5 and K_f = 9. The estimation of L is realized as described in section “Diagonal loading and estimation of L” (selection of the most significant eigenvalues with a threshold corresponding to 0.5% of the first eigenvalue), leading to L = 16. The smoothed cross-spectral matrix is singular (K < N_e^s M_r^s F_s), so a diagonal loading is necessary; it is performed with SNR_C = 30 dB. Figure 10 shows the decreasing eigenvalue curve. D-BF (Figure 7) manages to separate the 4th and 5th arrivals, but not the first three. D-Capon (Figure 8) and D-MUSICAL (Figure 9) are similar, and they manage to separate all of the arrivals. Note that the 2D BF and D-BF performances in terms of arrival separation conform to the resolution performances shown in Iturbe et al. [13], which were calculated with theoretical values of the observables.
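The threshold rule used here to estimate L can be sketched as follows (the eigenvalue curve below is synthetic; only the 0.5% rule comes from the text):

```python
import numpy as np

# Estimate the signal-subspace size L by keeping the eigenvalues above a
# threshold set to 0.5% of the largest one.
def estimate_L(eigvals, ratio=0.005):
    eigvals = np.sort(eigvals)[::-1]
    return int(np.sum(eigvals > ratio * eigvals[0]))

# Hypothetical eigenvalue curve: a few strong "signal" eigenvalues followed
# by a much lower noise floor (illustrative, not the experimental curve).
eig = np.concatenate([np.logspace(0, -1.8, 16), 1e-4 * np.ones(30)])
assert estimate_L(eig) == 16
```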

Figure 7

Isosurface of D-BF. Green surfaces are located at max(D-BF)/2.8. Black crosses represent the theoretical arrivals.

Figure 8

Isosurface of D-Capon. Green surfaces are located at max(D-Capon)/3. Black crosses represent the theoretical arrivals.

Figure 9

Isosurface of D-MUSICAL. Green surfaces are located at max(D-MUSICAL)/3. Black crosses represent the theoretical arrivals.

Figure 10

Eigenvalue decreasing curve of the 3D experimental data.

Two conclusions can be drawn from these results:

  1. As expected, the 3D estimation methods (D-BF, D-Capon and D-MUSICAL) perform better than the 2D ones (2D BF, 2D Capon and MUSICAL); compare for instance Figure 5 with Figure 7, or Figure 6 with Figures 8 and 9.

  2. The adaptive and high-resolution methods (i.e., the Capon-like and MUSIC-like methods) perform better than the conventional BF ones; compare for instance Figure 5 with Figure 6, or Figure 7 with Figures 8 and 9.

A general conclusion is that D-MUSICAL and D-Capon have better detection performances than all of the existing 2D and 3D methods.

D-Capon and D-MUSICAL comparison

For a given signal, the D-Capon and D-MUSICAL results are determined by their two pre-processing steps: smoothing and diagonal loading for D-Capon; smoothing and choice of the signal and noise subspace sizes for D-MUSICAL. As we want to compare the performances, it is natural to take the same smoothing parameters K_e, K_r and K_f for both methods. We first compare the results for L = K (the maximum possible rank of R̂) and the largest possible value of SNR_C, 150 dB (beyond this value, the platform does not manage to invert R̂). Under these conditions, D-MUSICAL and D-Capon give very close results: considering D_M^norm and D_C^norm, the normalized versions of D-MUSICAL and D-Capon, we have:

‖D_M^norm − D_C^norm‖ / ‖D_M^norm‖ ≃ ‖D_M^norm − D_C^norm‖ / ‖D_C^norm‖ ≤ 0.3%

This observation can be explained as follows. Under these conditions, the noise projector πn is spanned by the eigenvectors associated with the B = N_e^s M_r^s F_s − K null eigenvalues, and it can be shown that R̂_C⁻¹ is dominated by the πn/σ² term [9]. D-Capon thus converges to D-MUSICAL up to a multiplicative constant.
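This convergence can be checked numerically on a synthetic 1D example (the sizes, loading level and steering grid below are illustrative assumptions, not the experimental configuration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Rank-deficient R (rank K < N) plus a small diagonal loading sigma^2 I:
# R_C^{-1} is then dominated by pi_n / sigma^2, so the normalized Capon and
# MUSIC criteria coincide.
N, K, sigma2 = 32, 6, 1e-8
A = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
R = A @ A.conj().T / K                    # singular cross-spectral matrix
R_C = R + sigma2 * np.eye(N)              # diagonal loading

# Noise projector built from the N - K null eigenvectors of R.
w, U = np.linalg.eigh(R)                  # ascending eigenvalues
pi_n = U[:, :N - K] @ U[:, :N - K].conj().T

# Scan a grid of steering vectors and compare the normalized criteria.
grid = np.exp(1j * np.outer(np.arange(N), np.linspace(0, np.pi, 50)))
capon = np.array([1 / (g.conj() @ np.linalg.solve(R_C, g)).real
                  for g in grid.T])
music = np.array([1 / (g.conj() @ pi_n @ g).real for g in grid.T])
assert np.allclose(capon / capon.max(), music / music.max(), rtol=1e-3)
```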

To compare the performances for other values of L and σ, we focus on the first arrivals, and particularly on the two first arrivals, which are the hardest to detect (Figure 7). We introduce the contrast C to compare the performances, defined as the ratio between the amplitude of the weakest peak and the amplitude of the saddle point between the two arrivals. In the previous configuration (L = K and SNR_C = 150 dB), all of the arrivals are detected, but the two first arrivals have weak contrast (C = 1.16 for both D-MUSICAL and D-Capon). For D-MUSICAL, the contrast between the two first arrivals increases from L = K to L = 16 (C = 2.5), then decreases from L = 16 to L = 7, and the two first arrivals are no longer detected for L < 7. The D-Capon results are similar: the contrast increases from SNR_C = 150 dB to SNR_C = 29 dB (C = 2), then decreases from SNR_C = 29 dB to SNR_C = 23 dB, and the two first arrivals are no longer detected for SNR_C < 23 dB. However, D-MUSICAL and D-Capon are no longer quasi-equal for L < K and SNR_C < 150 dB. This empirical experiment thus leads to the following conclusion: the detection performances of D-MUSICAL when L decreases are similar to those of D-Capon when σ increases.
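A possible implementation of this contrast measure, on a synthetic 1D slice of an estimator output (the helper and the two-lobe signal below are illustrative, not the paper's code):

```python
import numpy as np

# Contrast C between two neighboring peaks on a 1D slice of the estimator
# output: amplitude of the weakest peak divided by the amplitude of the
# saddle point between them. C > 1 means the two arrivals are separated.
def contrast(slice_, i_peak1, i_peak2):
    lo, hi = sorted((i_peak1, i_peak2))
    saddle = slice_[lo:hi + 1].min()
    return min(slice_[i_peak1], slice_[i_peak2]) / saddle

# Two Gaussian lobes standing in for two well-separated arrivals.
t = np.linspace(0, 1, 201)
s = np.exp(-((t - 0.35) / 0.05) ** 2) + 0.8 * np.exp(-((t - 0.65) / 0.05) ** 2)
p1 = np.argmax(s[:100])
p2 = 100 + np.argmax(s[100:])
assert contrast(s, p1, p2) > 1.5
```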

To decide on the choice of the method, three remarks must be made:

  1. The choice of L for D-MUSICAL is achieved by taking into account the eigenvalue decrease; in contrast, we have no indication for the choice of σ in D-Capon.

  2. The processing cost is lower for the inversion of R̂_C than for the EVD.

  3. The projection step is faster for D-MUSICAL than for D-Capon, and the cost difference is generally large because L ≪ N_e^s M_r^s F_s.

Points 1 and 3 give priority to D-MUSICAL, while point 2 gives priority to D-Capon. As the projection step (point 3) represents the most important part of the processing cost (depending on the projection domain), we generally choose D-MUSICAL to achieve the detection in real applications.


In this study, the D-MUSICAL and D-Capon 3D detection and estimation methods have been proposed for arrival separation in an OAT context. Starting from the 3D recording space receiver-source-frequency, they estimate observables in the 3D estimation space DOA-DOD-TOA. D-MUSICAL and D-Capon extend the high-resolution MUSIC method and adaptive Capon method to the 3D configuration, respectively.

Smoothing issues linked to the OAT context and implementation issues have been discussed here. The methods have been validated on real data recorded in an ultrasonic tank. These methods have better detection performances than the 2D methods (2D BF, 2D Capon and MUSICAL) and than D-BF. We have also shown that D-Capon and D-MUSICAL give similar performances overall.

Future work will concern the estimation performances of these new methods and their use in OAT experiments. As OAT has already been achieved using D-BF to estimate the TOA [28], it will be particularly interesting to apply the tomography process with D-Capon and D-MUSICAL to estimate the TOA.


References

  1. Munk W, Worcester P, Wunsch C: Ocean Acoustic Tomography. Cambridge, Cambridge University Press; 1995.

  2. Krim H, Viberg M: Two decades of array signal processing research. IEEE Signal Process. Mag. 1996, 67-94.

  3. Turin G: An introduction to matched filters. IEEE Trans. Inf. Theory 1960, 6(3):311-329.

  4. Spindel RC: An underwater acoustic pulse compression system. IEEE Trans. Acoust. Speech Signal Process. 1979, 27(6):723-728.

  5. Birdsall TG, Metzger K, Dzieciuch MA: Signals, signal processing, and general results. J. Acoust. Soc. Am. 1994, 96(4):2343-2352.

  6. Van Veen BD, Buckley KM: Beamforming: a versatile approach to spatial filtering. IEEE ASSP Mag. 1988, 4-24.

  7. Bienvenu G, Kopp L: Adaptivity to background noise spatial coherence for high resolution passive methods. In ICASSP. Denver, CO, USA; 1980:307-310.

  8. Roy R, Kailath T: ESPRIT - estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 1989, 37(7):984-995.

  9. Marcos S: Les méthodes à haute résolution. Paris, France, HERMES; 1998.

  10. Pallas M-A: Identification active d'un canal de propagation à trajets multiples. Thèse de doctorat, Institut National Polytechnique de Grenoble; 1988.

  11. Gounon P, Bozinoski S: High resolution spatio-temporal analysis by an active array. In ICASSP. Volume 5. Detroit, MI, USA; 1995:3575-3578.

  12. Roux P, Cornuelle BD, Kuperman WA, Hodgkiss WS: The structure of ray-like arrivals in a shallow-water waveguide. J. Acoust. Soc. Am. 2008, 124(6):3430-3439.

  13. Iturbe I, Roux P, Nicolas B, Virieux J, Mars J: Shallow-water acoustic tomography performed from a double-beamforming algorithm: simulation results. IEEE J. Ocean. Eng. 2009, 34(2):140-149.

  14. Schmidt RO: Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 1986, 34(3):276-280.

  15. Capon J: High-resolution frequency-wavenumber spectrum analysis. Proc. IEEE 1969, 57(8):1408-1418.

  16. He J, Swamy MNS, Omair Ahmad M: Joint DOD and DOA estimation for MIMO array with velocity receive sensors. IEEE Signal Process. Lett. 2011, 18(7):399-402.

  17. Siderius M, Song H, Gerstoft P, Hodgkiss WS, Hursky P, Harrison C: Adaptive passive fathometer processing. J. Acoust. Soc. Am. 2010, 127(4):2193-2200.

  18. Liu ZS, Li H, Li J: Efficient implementation of Capon and APES for spectral estimation. IEEE Trans. Aerosp. Electron. Syst. 1998, 34:1314-1319.

  19. Paulus C, Mars J: Vector-sensor array processing for polarization parameters and DOA estimation. EURASIP J. Adv. Signal Process. 2010, 1-13. Article ID 850265.

  20. Evans JE, Johnson JR, Sun DF: Application of advanced signal processing techniques to angle of arrival estimation in ATC navigation and surveillance systems. Technical report, MIT Lincoln Laboratory; 1982.

  21. Shan T-J, Wax M, Kailath T: On spatial smoothing for direction-of-arrival estimation of coherent signals. IEEE Trans. Acoust. Speech Signal Process. 1985, 33(4):806-811.

  22. Goncalves D, Gounon P: On sources covariance matrix singularities and high-resolution active wideband source localization. In ICASSP. Seattle, WA, USA; 1998.

  23. Li J, Stoica P, Wang Z: On robust Capon beamforming and diagonal loading. IEEE Trans. Signal Process. 2003, 51(7):1702-1715.

  24. Holm S, Synnevag JF, Austeng A: Capon beamforming for active ultrasound imaging systems. In Proc. IEEE 13th DSP Workshop; 2009.

  25. Ziskind I, Wax M: Maximum likelihood localization of multiple sources by alternating projection. IEEE Trans. Acoust. Speech Signal Process. 1988, 36:1553-1560.

  26. Le Touzé G, Nicolas B, Mars JI: Double MUSIC actif large bande pour la tomographie sous-marine. In GRETSI. Bordeaux, France; 2011.

  27. Roux P, Kuperman WA, Hodgkiss WS, Song HC, Akal T: A nonreciprocal implementation of time reversal in the ocean. J. Acoust. Soc. Am. 2004, 116:1009-1015.

  28. Roux P, Iturbe I, Nicolas B, Virieux J, Mars JI: Travel-time tomography in shallow water: experimental demonstration at an ultrasonic scale. J. Acoust. Soc. Am. 2011, 130(3):1232-1241.

Acknowledgements


This work was supported by the French ANR Agency (Grant ANR 2010 JCJC 030601).

Author information



Corresponding author

Correspondence to Grégoire Le Touzé.


Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



Touzé, G.L., Nicolas, B., Mars, J.I. et al. Double-Capon and double-MUSICAL for arrival separation and observable estimation in an acoustic waveguide. EURASIP J. Adv. Signal Process. 2012, 187 (2012).



Keywords

  • Estimation Space
  • Signal Subspace
  • Steering Vector
  • Observable Estimation
  • Receiver Array