Enhancement of acoustic tomography using spatial and frequency diversities
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 225 (2012)
Abstract
This article introduces several contributions that enhance acoustic tomography (AT), mainly by exploiting the spatial and spectral diversities of underwater acoustic signals. Because of their inherent properties (sparseness, nonstationarity or cyclostationarity, wide frequency band, wide power range, etc.), processing underwater acoustic signals is a real challenge for the scientists and engineers involved in ocean studies. For various applications, these studies require large amounts of data collected daily, and AT techniques remain a fast and cheap way to obtain such data. Nowadays, active acoustic tomography (AAT), which relies on powerful and repetitive acoustic sources, is commonly used. Recently, researchers have been attracted by an alternative, called passive acoustic tomography (PAT), which uses acoustic signals of opportunity present in the environment. PAT techniques are mainly used for ecological, economical, and other reasons such as military applications. With PAT, no signal is emitted; therefore, the problems become more challenging: the number and positions of the existing sources are unknown, and the sensors measure mixtures of the available sources. Algorithms based on the time or frequency domain are widely deployed to classify, identify, and study received signals in AAT applications. For PAT, researchers employ multiple sensors in order to add an extra dimension (space). This article focuses on approaches that use space along with time or frequency to extract information, improve performance, and simplify the overall architecture. It explains the use of signal processing and statistical approaches to solve the problems raised by PAT and discusses the experimental results. The literature offers a wide variety of algorithms for classic AAT problems; therefore, only problems related to PAT are considered herein.
Introduction
Oceans cover more than 70% of the Earth's surface, contain roughly 97% of our water supply, and play a major role in global climate regulation and economic systems.
Acoustic tomography (AT) is used in many civil and military applications, such as mapping underwater surfaces, oceanographic and meteorological measurements (temperature, salinity, motion, and depth of the water), and improving sonar technology. Many algorithms[1] have been developed to deal with active acoustic tomography (AAT).
Interest in passive acoustic tomography (PAT) has been increasing, mainly for reasons of acoustic discretion (in military applications) and for ecological, economical, or logistic reasons. On the other hand, PAT techniques increase the complexity of the algorithms. In real-world applications, PAT can mainly be achieved using the following methods:

1.
Emitting sources similar to natural sounds or noises: a set of artificial signals imitating natural sounds (whales, dolphins, etc.) or noises (waves, ships, etc.) is generated. The main advantage of this approach is the control of the sources and their positions (as in active methods). To achieve this, researchers could imitate the time-frequency signature of natural signals. However, this method is not totally discreet, as the generated signals may differ from the original signals in their high order statistics (HOS), instantaneous power, or frequency. Besides, artificial signals can generally be characterized by specific patterns (periodicity, time or statistical coherence, fixed positions, deterministic motions, etc.), which can be used to unmask hidden emitted signals.

2.
Using natural signals: by relying completely on existing natural signals, a PAT system with a high level of discretion can be achieved. However, the main drawback of such a system is the lack of information (number, positions, or nature of the sources, etc.).

3.
Applying hybrid systems: by mixing the previous two strategies, better performance and a good level of discretion could be achieved. However, this results in more complex emitter-receiver systems.
On the one hand, the second strategy seems to be the most attractive one (completely discreet systems and no emitters). On the other hand, the problems raised in this case are more challenging because of the total lack of information about the sources. To reduce the complexity of this problem, we investigate several advanced signal processing techniques and statistical approaches. Indeed, let us assume that we are able to estimate the number of sources, separate the sources from their observed mixtures, and evaluate their statistical properties. In this case, the identification of the channel could also be investigated, and the remaining PAT problems become very similar to those of AAT.
The primary purpose of this article is to discuss how to preprocess the observed mixed signals so as to extract maximum information about the sources; classic algorithms can then be applied to the residual problems. This article is organized as follows: Section “Acoustic oceanic tomography” briefly describes AAT and PAT; Section “Assumption and background” contains the assumptions and mathematical models; Section “Preprocessing systems” presents the preliminary studies; Section “Adaptive HOS estimators” proposes new HOS estimators in order to enhance the spatial diversity of the original sources; Section “Spatial diversity and independence discrimination criteria” discusses several criteria that exploit the spatial or spectral diversity of our signals; Section “Blind separation of observed acoustic signals” presents independent component analysis (ICA) algorithms to separate the mixed observed signals; Section “Experimental results” shows experimental results; and Section “Conclusion” presents the conclusions.
Acoustic oceanic tomography
The goal of acoustic tomography is fast and cheap monitoring of water-mass and subbottom characteristics. This monitoring requires a two-step inversion procedure[2]. First, estimate the acoustic properties (such as the water column sound speed profile, the 3D structure of internal tides in water masses, and the geoacoustic parameters of the seafloor) from the measurement of a known acoustic waveform propagated between fixed sources and receivers. Second, infer some physical parameters of the ocean from these estimated acoustic characteristics.
Active acoustic tomography
To perform oceanic tomography, an active acoustic emission is propagated between an emitter and a set of receivers along a horizontal track about 10 km long. The frequencies involved in tomography range from 30 Hz to a few kHz, whereas the power levels range from 180 to 220 dB.
Early work in tomography considered only deep-water channels (depth greater than 1 km). In this case, acoustic refraction is the main physical phenomenon to consider in order to estimate the parameters of the underwater acoustic transmission channel.
In the mid 1990s, scientists extended their interest to shallow water (i.e. depth less than 300 m)[3].
In shallow water, an acoustic propagation encounters numerous interactions with the sea surface and the sea floor. Therefore, new techniques had to be developed such as ‘matched field processing’ in[4] and the ‘matched impulse response processing’ in[5]. In their applications, a single input multiple output, (SIMO), configuration is used to extract channel information.
To get efficient results in a SIMO configuration, a large number of sensors should be used, which means a heavier experimental setup. To tackle this problem, researchers proposed “matched impulse response processing” methods, which use frequency diversity. In this case, a wideband signal should be emitted, but a single distant hydrophone can be enough as a receiver. The main idea of this technique consists of estimating the channel impulse response by applying maximum likelihood or matched filter estimation to the known emitted and received signals[6]. Once the channel impulse response has been estimated, other features, such as the time delays or magnitudes of the arrivals, can be extracted. These features can then be used to estimate water column and subbottom properties.
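The matched filter idea can be sketched on synthetic data. The example below is only an illustration under stated assumptions: the probe is white Gaussian noise, the channel has two arrivals at hypothetical delays of 40 and 95 samples, and the function name `matched_filter_delays` is invented here. It is not the processing used in[5, 6].

```python
import numpy as np

def matched_filter_delays(received, probe, n_arrivals):
    """Return the n_arrivals non-negative lags with the largest matched-filter output."""
    # Cross-correlate the received signal with the known emitted signal.
    corr = np.correlate(received, probe, mode="full")
    # In 'full' mode, index len(probe) - 1 corresponds to lag 0.
    nonneg = np.abs(corr[len(probe) - 1:])
    return np.sort(np.argsort(nonneg)[-n_arrivals:])

rng = np.random.default_rng(0)
probe = rng.standard_normal(1024)      # known wideband emitted signal (assumed)
h = np.zeros(128)                      # hypothetical two-path impulse response
h[40], h[95] = 1.0, 0.6
received = np.convolve(probe, h)
received = received + 0.5 * rng.standard_normal(received.size)  # additive noise

print(matched_filter_delays(received, probe, n_arrivals=2))
```

Picking the largest correlation values works here because a white probe has a very narrow autocorrelation; with a band-limited probe, neighbouring lags of the same arrival would have to be merged first.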
Passive acoustic tomography
Active acoustic tomography strongly relies on the possibility of emitting powerful acoustic signals in the ocean, which raises several major problems. First, powerful emissions need a heavy power supply, which can drastically limit the efficiency of autonomous monitoring systems. Second, such emissions can seriously harm marine mammals and disturb their behavior. Finally, in a warfare context, covertness constraints may apply to the acoustic process. To overcome these problems, the concept of PAT has recently emerged in the community.
Passive acoustic tomography consists in estimating acoustic properties by using natural opportunity sources present in the channel at the time of interest without using active emission. Surface noise created by breaking waves, ship noise, and marine mammal calls are three kinds of opportunistic sources which are under the scope of passive tomography[7].
The main drawbacks of PAT are the lack of information about the number, positions, and nature of the emitted signals. With more than two sources, many current tomography algorithms cannot give satisfactory results. Many others work poorly, or not at all, when the emitted signals are wideband[8]. Some algorithms take into consideration the position of the acoustic emitters[9]. Typically, in real-world PAT applications, underwater acoustic signals are generated by various moving sources whose number and positions are hard (or impossible) to identify, as in the case of shoals of fish or wave noises. PAT is clearly a difficult technique: it requires substantial signal processing effort to tackle the unknown source positions and emitted waveforms and to separate the sources simultaneously present in the channel before passing them to a dedicated blind inversion processor.
Assumption and background
In PAT applications, the sources are signals of opportunity with various properties, such as spatial diversity, different probability density functions (PDF), different temporal or spectral structures, different time-frequency signatures, etc. These properties can be used at different levels of the separation stage. However, PAT applications often rely on simple and cheap systems, which means that linear multisensor antennas are not recommended. Mainly for this reason, ICA algorithms are of great importance to reach our goal. ICA algorithms can successfully handle multi-input multi-output (MIMO) channels.
In a previous work[10], an extensive experimental study was conducted in order to classify and characterize many recorded anthropogenic signals (produced by human activities, such as boat, ship, or submarine noises) and natural signals (mainly animal sounds or natural noises, such as waves). According to that study, the following features can be added to those mentioned above:

Recorded signals are affected by a background ocean noise which can be considered as an additive white Gaussian noise (AWGN).

Some signals have a very weak kurtosis[11].

Almost all of the signals are nonstationary, with more or less cyclic behavior, as in boat noises.

Natural signals are very sparse, whereas artificial ones are very noisy.
The above mentioned properties have been considered to select appropriate ICA algorithms.
Underwater acoustic channel
Underwater sounds are produced by natural or artificial phenomena through forced mass injection, leading to inhomogeneous wave equations that can be converted to the frequency domain[12]. The frequency-domain wave equation, called the Helmholtz equation, gives an underwater sound propagation model. A general solution of the Helmholtz equation is very difficult to obtain. Therefore, researchers use simplified propagation models (ray theory, mode theory, the parabolic model, hybrid models, etc.) according to their applications[13]. The choice of a propagation model depends on many parameters, such as the wave frequency and the depth of the sea. In our case (shallow water, i.e. a channel depth of about a few hundred meters, and a frequency range from 300 Hz to 10 kHz), ray theory was the most appropriate propagation model.
The sound speed C, (in m/s), in oceans is an increasing function of temperature T, (°C), salinity S, (parts per thousand, ppt), and pressure which is a function of depth D, (in meters),[14]:
The above equation is an empirical relationship that holds when 0 ≤ T ≤ 30, 30 ≤ S ≤ 40, and D ≤ 8000. In shallow underwater channels[15] (depth less than 300 m), where emitters and receivers are close neither to the water surface nor to the bottom and the distances between emitters and receivers are less than 3 km, the sound speed can be approximated by a constant.
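As an illustration of such an empirical relationship, the sketch below evaluates Mackenzie's nine-term equation, a widely used fit of sound speed against temperature, salinity, and depth. Choosing this particular formula is an assumption made here for illustration; it is not necessarily the relationship taken from[14]. The function name `sound_speed_mackenzie` and the sample values are likewise hypothetical.

```python
def sound_speed_mackenzie(T, S, D):
    """Sound speed (m/s) for temperature T (deg C), salinity S (ppt), depth D (m).

    Mackenzie's nine-term empirical equation, used here as a stand-in.
    """
    return (1448.96 + 4.591 * T - 5.304e-2 * T**2 + 2.374e-4 * T**3
            + 1.340 * (S - 35.0) + 1.630e-2 * D + 1.675e-7 * D**2
            - 1.025e-2 * T * (S - 35.0) - 7.139e-13 * T * D**3)

# Depth dependence alone over a 300 m shallow channel, at fixed T and S.
c_surface = sound_speed_mackenzie(10.0, 35.0, 0.0)
c_bottom = sound_speed_mackenzie(10.0, 35.0, 300.0)
print(c_surface, c_bottom, c_bottom - c_surface)
```

At fixed temperature and salinity, the depth terms change the speed by only a few meters per second over 300 m, which is consistent with treating the speed as a constant in a shallow channel. A real profile also depends on how temperature varies with depth, which this example ignores.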
The reflected acoustic waves on the bottom of the propagation channel depend on many parameters such as the composition and the geometrical properties of the bottom[16].
The acoustic waves reflected at the top of the propagation channel, i.e. the water surface, also depend on many parameters, such as the wind, the wave frequency, and the swell properties[16]. For this reason, the water surface cannot be considered flat, and the direction of a reflected acoustic wave is dispersed in space. On average, however, reflected acoustic waves can be considered as produced by a flat surface with absorption coefficients[15]. In our model, a flat surface is considered and random coefficients are added to characterize the other unknown parameters.
Finally to consider acoustic propagation effects, an acoustic model proposed by Schulkin and Marsh[17] was considered. According to that model, a received signal should be multiplied by a corrective coefficient p given by:
Here r is the propagation distance and α stands for Rayleigh’s absorption coefficient, which can be approximated by:
where f_{T} = 21.9 ∗ 10^{6 − 1520/(T + 273)} (in kHz), T is the water temperature (°C), S = 3.5% is the water salinity (in the ocean S ≈ 35 g/l), P_{w} is the water pressure (in kg/m^{2}), A = 2.34 ∗ 10^{−6}, and B = 3.38 ∗ 10^{−6}.
From a physical point of view, an acoustic ray represents the propagation trajectory of an emitted signal between the source (emitter) and the receiver. In many cases, the channel depth is limited, which means that the propagation follows multiple rays. Each ray may be bent by refraction if the sound speed is a function of depth and range. Ray trajectories and the sound speed profile allow us to compute propagation times. In addition, ray trajectories, water attenuation, boundary roughness, and subbottom properties allow us to compute the signal magnitude.
From a computational viewpoint, the ray trajectory is computed by solving the ‘Eikonal equation’, whereas the signal magnitude is obtained by solving the ‘Transport equation’[12]. As general analytical solutions of the Eikonal and transport equations do not exist, researchers use approximate and simulated results[18].
Mathematical models
Under some mild assumptions (MIMO configuration and ray propagation model), the underwater acoustic channel can be considered as a set of multiple paths, each of which can be defined, in the frequency domain, by a complex constant gain[1]. Let S(n) denote a vector of p unknown sources that are statistically independent of each other, and let X(n) be the q × 1 observed vector (Figure 1).
The relationship between S(n) and X(n) is given by:
X(n) = H(z) S(n) (4)
where H(z) stands for the channel effect. In the case of a convolutive mixture, H(z) = (h_{ij}(z)) becomes a q × p complex polynomial matrix. In the following, we consider that the channel is linear and causal and that the coefficients h_{ij}(z) are FIR filters (whose coefficients are evaluated according to the previous section). Let M denote the degree of the channel, which is the highest degree of the h_{ij}(z). Equation (4) can be rewritten as:
X(n) = Σ_{i=0}^{M} H(i) S(n − i) (5)
Here H(i) denotes the q × p real constant matrix corresponding to the impulse response of the channel at time i and S(n−i) is the source vector at time (n−i).
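The convolutive model can be simulated directly, assuming the mixture is the sum over i = 0…M of H(i) S(n − i), as the definitions of H(i) and S(n − i) suggest. The sketch below is a toy example: the function name `convolutive_mix`, the random channel taps, and the dimensions are hypothetical and far smaller than a realistic underwater channel.

```python
import numpy as np

def convolutive_mix(H, S):
    """Mix p sources through a MIMO FIR channel.

    H: array of shape (M + 1, q, p), H[i] being the q x p matrix H(i).
    S: array of shape (p, N), one source per row.
    Returns X of shape (q, N), with sources taken as zero before n = 0.
    """
    taps, q, _ = H.shape
    n_samples = S.shape[1]
    X = np.zeros((q, n_samples))
    for i in range(taps):
        # Contribution of H(i) applied to the sources delayed by i samples.
        X[:, i:] += H[i] @ S[:, :n_samples - i]
    return X

rng = np.random.default_rng(0)
p, q, M, N = 2, 3, 4, 1000                 # toy sizes (assumed)
H = rng.standard_normal((M + 1, q, p))     # random real channel taps
S = rng.laplace(size=(p, N))               # independent non-Gaussian sources
X = convolutive_mix(H, S)
print(X.shape)
```

Setting M = 0 reduces this to the instantaneous mixture X(n) = H S(n).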
Preprocessing systems
As mentioned before, the processing of acoustic signals is a very challenging problem. To enhance our processing algorithms, pre- and postprocessing systems have been proposed.
Pre & postprocessing
Our sources are bounded in the frequency domain. Therefore, a lowpass filter was extremely helpful to reduce the impact of the AWGN and thus achieve better performance. It is worth mentioning that only three of the tested algorithms gave satisfactory results. These three algorithms (for further details see[19–23]) were designed to separate nonstationary sources (audio or music signals). The last two algorithms[22, 23], referred to in the following as SOS[22] and Parra and Alvino[23], are implemented in the frequency domain using discrete frequency adapted filters.
Experimental studies showed that the best results are obtained by applying an ICA algorithm to signals split into three frequency bands. Once the separation in each frequency band is achieved, a reconstruction module should be used to recover the original sources. Our reconstruction module is based on second order statistics and can be generalized to use other statistical features. In its current version, it uses the correlation of signal slices in the time or frequency domain. The best results have been obtained when two cascaded algorithms are used and the number of sensors is strictly greater than the number of sources (q > p), as shown in Figure 2.
Estimation of source number
The number of sources is an input parameter of ICA algorithms. These algorithms can cope with an overestimated number of sources (the extra separated signals should be residual noises). However, underestimating that number can seriously affect the overall performance[24]. For this reason, a rough estimate of that number is needed. To obtain it, a few approaches have been considered and are briefly discussed here. Hereinafter, the channel is assumed to be overdetermined (i.e. q > p).
To simplify our discussion, let us consider the simplest case, i.e. an instantaneous mixture (memoryless channel, i.e. H(i) = 0, ∀i ≠ 0, and H(0) = H is a real full column rank matrix). In this case, the number of sources can be estimated as the rank of the observation covariance matrix Σ_{X}:
Σ_{X} = E{X(n) X^{T}(n)} = H Σ_{S} H^{T} (6)
Here Σ_{S} stands for the unknown and invertible diagonal covariance matrix of the statistically independent sources. For a noise-free channel, the rank of Σ_{X} is equal to the rank of Σ_{S}, i.e. the number of sources[25].
With an AWGN channel, Σ_{X} becomes a full rank matrix. Without loss of generality, let us assume that the noise components have the same variance; then, the q singular values λ_{i} of Σ_{X} will have different values except for the last q − p ones. Normally, the first p singular values are linked to the signal space and the last q − p ones are related to the noise space. To apply this method, one should deal with two problems: how can we estimate the covariance matrix of nonstationary signals, and what is the optimal threshold between the two sets of singular values? The covariance matrices have been estimated over sliding estimation windows, see Section “Adaptive HOS estimators”. The threshold can be easily set when the signal to noise ratio (SNR) is relatively high. Unfortunately, the SNR in our case is not high enough (i.e. SNR > 2 dB). Therefore, different thresholds have been considered:

If q > p + 5, one can easily set a threshold as the limit between two sets of singular values. This approach requires a very good SNR and q >> p.

To improve the first approach, normalized singular values have been considered (i.e. the λ_{i} have been divided by the maximum λ_{i}). Experimental results showed that a threshold can be easily set using normalized singular values when the SNR is higher than 10 dB and the signatures of the sources are roughly the same (the signature of the i th source on the j th sensor is the power received by that sensor from that source; it therefore depends on the source power and the channel parameters). These two assumptions cannot always be satisfied in our application.

Another method was considered: first, the singular values λ_{i} are sorted in descending order; second, the sorted λ_{i} are divided by λ_{2}. Finally, the number of sources is taken as the number of normalized λ_{i} > ε, where ε depends on the SNR. Experimentally, we obtained satisfactory results for SNR higher than 4 dB and ε > 0.1.

By considering that the signals are close to Gaussian ones, one can use Akaike’s information criterion (AIC) to set the threshold. Even though the Gaussianity assumption is a strong one (underwater acoustic signals are strongly nonstationary and cannot be considered Gaussian), Karhunen et al.[26] show that the obtained results are still satisfactory.
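The third approach above (normalization by the second singular value) can be sketched for an instantaneous mixture as follows. This is a toy illustration, not the authors' implementation: the mixing matrix is given orthonormal columns so that the example is well conditioned, the noise variance is arbitrary, ε = 0.1 is simply the lower bound quoted in the text, and the name `estimate_source_number` is hypothetical.

```python
import numpy as np

def estimate_source_number(X, eps=0.1):
    """Count the singular values of the sample covariance that exceed eps * lambda_2.

    X: (q, N) zero-mean observations. Returns (count, normalized singular values).
    """
    cov = X @ X.T / X.shape[1]
    lam = np.linalg.svd(cov, compute_uv=False)   # sorted in descending order
    normalized = lam / lam[1]                    # divide by the second value
    return int(np.sum(normalized > eps)), normalized

rng = np.random.default_rng(0)
p, q, N = 3, 8, 20000
# Unit-variance, independent, non-Gaussian sources.
S = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(p, N))
# Toy mixing matrix with orthonormal columns (an assumption made for clarity).
H, _ = np.linalg.qr(rng.standard_normal((q, p)))
noise = np.sqrt(0.05) * rng.standard_normal((q, N))  # AWGN, average SNR about 9 dB
X = H @ S + noise

p_hat, lam_norm = estimate_source_number(X)
print(p_hat)
print(np.round(lam_norm, 3))
```

With a less favourable mixing matrix or a lower SNR, the gap between the signal and noise singular values shrinks and the choice of ε becomes delicate, which is the difficulty described above.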
The above methods can be easily extended to the case of a convolutive mixture (channel with memory) by considering the covariance matrix of the extended observation vector X_{N}(n) = (X^{T}(n), X^{T}(n − 1), …, X^{T}(n − N))^{T} instead of Σ_{X}[27]:
X_{N}(n) = T_{N}(H) S_{N+M}(n) (7)
where S_{N+M}(n) stands for the extended signal vector and T_{N}(H) is the Sylvester matrix, which is full rank under some mild assumptions[27].
In order to improve our estimation, we implemented and tested another algorithm, originally designed to estimate the number of transmitted telecommunication signals. Chen et al.[28] use two sets of receiver antennas X_{1}(n) and X_{2}(n) with N_{1} > p (respectively N_{2} > p) components. The main idea of Chen’s algorithm consists in using the rank of a covariance matrix Σ_{Z}:
where Z(n) = (X_{1}^{T}(n) X_{2}^{T}(n))^{T}, Σ_{i} is the covariance matrix of X_{i}(n), and Σ_{12} is the cross-covariance matrix of X_{1}(n) and X_{2}(n). Using Equation (8), a normalized covariance matrix Σ_{N} is defined as follows:
Here X^{H} represents the hermitian transpose of X. Chen et al. proved that the number of sources can be estimated using the singular values ρ_{ i }, canonical correlation coefficients, of Σ_{ N }. In fact, let us consider the following set of hypotheses:
where r = min(N_{1}, N_{2}) and s, with 0 ≤ s ≤ r, is the number of sources under test. The selected number is the one that satisfies the following equation:
where H_{s+} is the hypothesis that the number of sources is higher than s, and the threshold T_{s} should be set so that the allowable probability of false alarm is achieved.
A main advantage of this algorithm compared to the previous approaches is that it can be applied even when the noise is spatially correlated and that it gives a confidence level for the estimated number. The main drawback is the computational effort. In fact, with 2N + 1 receivers, one can only estimate a source number up to N. In the following, we consider that the number of sources has already been estimated.
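The role of the canonical correlation coefficients can be illustrated numerically. The sketch below computes sample canonical correlations between two sensor sets by whitening each set and taking the singular values of the whitened cross-covariance, which is the standard construction. It does not implement the hypothesis test of Chen et al.[28]: the fixed cut at 0.5 is an ad hoc stand-in for the threshold T_{s}, and the mixing matrices, noise level, and function name are hypothetical.

```python
import numpy as np

def canonical_correlations(X1, X2):
    """Sample canonical correlation coefficients between X1 (N1, N) and X2 (N2, N)."""
    n = X1.shape[1]
    S11 = X1 @ X1.conj().T / n
    S22 = X2 @ X2.conj().T / n
    S12 = X1 @ X2.conj().T / n
    # Whitening with Cholesky factors: S11 = L1 L1^H and S22 = L2 L2^H.
    L1 = np.linalg.cholesky(S11)
    L2 = np.linalg.cholesky(S22)
    C = np.linalg.solve(L1, S12) @ np.linalg.inv(L2).conj().T
    return np.linalg.svd(C, compute_uv=False)

rng = np.random.default_rng(0)
p, N1, N2, N = 2, 5, 5, 20000
S = rng.laplace(size=(p, N))
# Two sensor sets observing the same sources through different toy mixtures,
# each with its own independent noise.
X1 = rng.standard_normal((N1, p)) @ S + 0.1 * rng.standard_normal((N1, N))
X2 = rng.standard_normal((N2, p)) @ S + 0.1 * rng.standard_normal((N2, N))

rho = canonical_correlations(X1, X2)
print(np.round(rho, 3))
print(int(np.sum(rho > 0.5)))   # ad hoc count of "large" coefficients
```

Because the noise of the two sets is independent, only the directions carrying the common sources produce coefficients close to one; the remaining ones stay near zero.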
Adaptive HOS estimators
In order to exploit spatial diversity, many blind or semiblind separation or identification algorithms use HOS in the time or frequency domain. For this reason, the estimation of cross-cumulants and moments up to the fourth order is investigated in this section; further details are given in Appendix 1.
Using the definition of cumulants and moments[29], an estimator of the fourth order cumulant can be easily derived:
k_{4}(X) is a consistent biased estimator of Cum_{4}(X). In previous studies[30], we proposed and compared estimators for autocumulants of second and fourth orders. Here, we propose new adaptive HOS estimators for fourth order cross-cumulants that can be applied to underwater acoustic signals, which are nonstationary.
A nonbiased estimator of fourth order crosscumulants can be obtained from the definition of the crosscumulants[29]. In fact, let us consider K_{22} an estimator of Cum_{22}(X Y) defined as follows:
where a, b and c should be set in order to make K_{22} a nonbiased and consistent estimator. When the samples x_{i} and y_{i} are independent, one can use estimators similar to those proposed in[30]. In the following, we assume that the samples are independent and identically distributed (iid) over time but spatially correlated. In this case, one can prove that K_{22} becomes a nonbiased and consistent estimator:
when a = (N + 2)/(N − 1) and b = c = N/(N − 1). Similarly, one can develop two other estimators:
To obtain these estimators, the signals are assumed to be stationary. This assumption cannot be satisfied in our application. Therefore, some modifications should be considered. Using some algebraic operations, Equation (13) can be modified as follows, see Appendix 2:
Here C_{31}(N) is an adaptive online version of K_{31} and μ_{nm}(X,Y) = (1/(N − 1)) Σ_{i=1}^{N−1} x_{i}^{n} y_{i}^{m} is an unbiased consistent estimator of E(X^{n}Y^{m}). An adaptive version of μ_{nm}(X,Y) can also be obtained as follows:
where 0 < λ < 1 is a forgetting factor. To evaluate the performance of these estimators, some simulations have been conducted using nonstationary zero-mean signals. For example, let S(n) be a nonstationary signal that consists of four parts:

S_{1} is a uniform signal in [−1, 1] with 8,000 samples.

S_{2} is Gaussian with unit variance and 5,000 samples.

S_{3} is a uniform signal in [−2, 2] with 3,000 samples.

S_{4} is Gaussian with a standard deviation σ = √2 and 4,000 samples.
Using S, two other signals have been generated: X(n) = S(n) and Y(n) = S^{3}(n) (clearly, the x_{i} and y_{i} are i.i.d. and x_{i} depends on y_{i}). Using the definition of cumulants and the properties of S, we can prove that:

For the uniform parts, Cum_{31}(X,Y) = −(2/35)a^{6}, where a is the maximum amplitude.

For the Gaussian parts, Cum_{31}(X,Y) = 6σ^{6}.
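These two theoretical values can be checked numerically. For zero-mean signals, Cum(X, X, X, Y) = E(X^{3}Y) − 3E(X^{2})E(XY); with Y = X^{3} and X uniform on [−a, a] this gives a^{6}/7 − 3(a^{2}/3)(a^{4}/5) = −(2/35)a^{6}, and for a zero-mean Gaussian it gives 15σ^{6} − 9σ^{6} = 6σ^{6}. The sketch below uses a plain batch sample estimate rather than the adaptive estimators of this section, and many more samples than in the text so that the comparison is tight; the function name `cum31` is hypothetical.

```python
import numpy as np

def cum31(x, y):
    """Batch sample estimate of Cum(X, X, X, Y) for zero-mean x and y."""
    return np.mean(x**3 * y) - 3.0 * np.mean(x**2) * np.mean(x * y)

rng = np.random.default_rng(0)
n = 1_000_000

for a in (1.0, 2.0):
    x = rng.uniform(-a, a, n)
    # Estimated value, then theoretical value -(2/35) a^6.
    print("uniform ", a, cum31(x, x**3), -2.0 / 35.0 * a**6)

for sigma in (1.0, np.sqrt(2.0)):
    x = sigma * rng.standard_normal(n)
    # Estimated value, then theoretical value 6 sigma^6.
    print("gaussian", sigma, cum31(x, x**3), 6.0 * sigma**6)
```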
We conducted many simulations. According to our experimental study, the performance of estimator (14) can be improved by using another forgetting factor 0 < γ < 1, see Figure 3:
Finally, x_{N} and y_{N} in Equation (15) have been replaced by their average over a small estimation window (10 to 50 samples). The estimators proposed above can be improved by considering non-iid samples. In that case, however, a stochastic model with transition probabilities should be considered. This is beyond the scope of this manuscript and will be considered in a future study. Hereinafter, HOS are estimated at different stages using the estimators described in this section.
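The effect of a forgetting factor can be illustrated with a simple exponentially weighted recursion, μ(N) = λ μ(N − 1) + (1 − λ) x_{N}^{n} y_{N}^{m}. This recursion is a common textbook form and is an assumption here; it is not necessarily the exact update used in the estimators of this section. The four-part test signal follows the example above, reading the uniform intervals as [−1, 1] and [−2, 2] and the last standard deviation as √2; the name `track_moment` and the value λ = 0.999 are hypothetical.

```python
import numpy as np

def track_moment(x, y, n, m, lam=0.999):
    """Exponentially weighted running estimate of E(X^n Y^m)."""
    mu = np.empty(len(x))
    acc = x[0]**n * y[0]**m          # initialise with the first sample product
    for k in range(len(x)):
        acc = lam * acc + (1.0 - lam) * x[k]**n * y[k]**m
        mu[k] = acc
    return mu

rng = np.random.default_rng(0)
s = np.concatenate([
    rng.uniform(-1.0, 1.0, 8000),              # variance 1/3
    rng.standard_normal(5000),                 # variance 1
    rng.uniform(-2.0, 2.0, 3000),              # variance 4/3
    np.sqrt(2.0) * rng.standard_normal(4000),  # variance 2
])

mu20 = track_moment(s, s, 2, 0)                # running estimate of E(X^2)
for end in (7999, 12999, 15999, 19999):
    print(end, round(float(mu20[end]), 3))
```

A value of λ closer to 1 gives a smoother but slower estimate; a smaller λ follows the changes between the four parts more quickly at the cost of a larger variance.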
Spatial diversity and independence discrimination criteria
In the literature, one can find a huge number of ICA algorithms that solve the blind source separation (BSS) problem. Most of them are dedicated to the separation of instantaneous (i.e. echo-free) mixtures. In our application, the underwater acoustic propagation channel can be modeled by a convolutive mixture (i.e. a multipath MIMO finite impulse response (FIR) channel with a huge filter order, ≥ 6000). It is well known that a blind separation of statistically independent sources from a convolutive mixture can recover the original sources only up to a permutation and a scalar filter:
where s_{2}(n) represents a mixture of all sources except the first one s_{1}(n). The filter h_{i}(z) = h_{i}(0) + h_{i}(1)z^{−1} + ⋯ + h_{i}(m_{i})z^{−m_{i}} is a residual separation filter. The separation is considered achieved when the norm of the residual error h_{2}(z)∗s_{2}(n) becomes much smaller than that of the separated signal h_{1}(z)∗s_{1}(n). In addition, the identification or classification of underwater acoustic signals is an extraordinarily difficult step because these signals are nonstationary, nonintelligible, and sparse, with a low and variable kurtosis. In this context, the classification of ICA algorithms according to the separation quality becomes a difficult and important task.
The following discrimination criteria can be optimized to maximize the spatial diversity or the independence among estimated signals. At the same time, they can be very useful to quantify the separation achievement. In the last case, these criteria are called performance indices.
Modified crosstalk
In this section, a new and modified performance index is proposed. The crosstalk is the inverse of SNR and it is widely used as a performance index for BSS algorithms of instantaneous mixture. By definition the crosstalk index of the first estimated signal, is given by:
To apply the crosstalk, one should have original sources. Therefore, this performance index cannot be applied in real situation where sources are unknown. However it is very useful in simulations.
It is well known that sources can be separated from a convolutive mixture only up to a permutation and a scalar filter. Therefore, the last definition D_{r} is useless for BSS of convolutive mixtures, see Equation (16), since it does not take into consideration the power ratio between the filtered version of the signal ξ_{1} = h_{1}(z)∗s_{1}(n) and the residual error h_{2}(z)∗s_{2}(n).
We developed a modified definition of the crosstalk. First, one should apply (17) as D_{r}(ŝ_{1}, ξ_{1}). Second, an estimate of h_{1}(z) should be obtained using s_{1}(n) and the estimated signal ŝ_{1}. To estimate h_{1}(z), one can minimize the least mean square (LMS) error ζ:
Let H_{i} = (h_{i}(0)…h_{i}(m_{i}))^{T} and S_{i} = (s_{i}(n)…s_{i}(n−m_{i}))^{T}; the convolutive product in Equation (16) becomes a simple scalar product:
Using the independence properties of the sources, one can prove that:
where ε_{H} = H_{1}−H and Σ_{i} = E{S_{1}S_{1}^{T}} is an invertible positive definite matrix. The second term of (19) does not depend on H. Therefore, one can prove that the optimal value of H is given by:
Our experimental results show that for a low order channel filter, (<20), this performance index can be used efficiently. When the order of channel is larger than 20, computing time becomes a big issue.
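As an illustration of the procedure described above, the following minimal sketch estimates h_{1} by ordinary least squares and then computes the ratio between the power of the residual and the power of the filtered source. It is not the implementation used in our experiments: the function names, the test signals, the channel coefficients and the decibel scaling are assumptions introduced only for this example.

```python
import numpy as np

def delay_matrix(s, order):
    """Row n holds S(n) = (s(n), s(n-1), ..., s(n-order)), with zeros before n = 0."""
    padded = np.concatenate([np.zeros(order), s])
    cols = [padded[order - k : order - k + len(s)] for k in range(order + 1)]
    return np.stack(cols, axis=1)

def modified_crosstalk_db(s1, s1_hat, order):
    """Least-squares fit of h1, then residual-to-filtered-source power ratio in dB."""
    S = delay_matrix(s1, order)
    h1, *_ = np.linalg.lstsq(S, s1_hat, rcond=None)   # minimises the mean square error
    xi1 = S @ h1                                      # filtered source h1 * s1
    residual = s1_hat - xi1                           # contribution of the other sources
    return h1, 10.0 * np.log10(np.mean(residual**2) / np.mean(xi1**2))

# Hypothetical simulation: s1 filtered by a short FIR channel plus weak leakage from s2.
rng = np.random.default_rng(0)
s1, s2 = rng.uniform(-1.0, 1.0, size=(2, 20000))
h1_true = np.array([1.0, 0.5, -0.2])
s1_hat = np.convolve(s1, h1_true)[: len(s1)] + 0.1 * np.convolve(s2, [0.3, 0.1])[: len(s2)]

h1_est, crosstalk_db = modified_crosstalk_db(s1, s1_hat, order=2)
print(h1_est, crosstalk_db)
```

The least-squares problem grows with the filter order, which is consistent with the computing time issue reported for channel orders above 20.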
Mutual information
According to [31], mutual information (MI) I(p_{ U }) is one of the best independence indices:
where U = (u_{1},…,u_{ n })^{T} is a random vector and P_{ U }(V), (resp. p_{u_{i}}(v_{i})), is the joint, (resp. marginal), PDF. In the context of the BSS problem, the joint and the marginal PDF are unknown but they can be estimated.
To estimate the MI in our project, we used a method proposed by Pham [32]. In his method, the integral is replaced by a discrete sum and the PDF are estimated using kernel methods. In [32], spline functions of third order have been used as kernel functions. By definition, a spline function of order r is the PDF of the sum of r independent uniform random variables u_{ i }∈ [−0.5,0.5]. For example, the 3rd order spline function is defined as:
Finally, the MI estimator is given by:
Here Π̂_{U}(i) is the joint PDF estimator and Π̂_{u_{k}}(j) is the marginal PDF estimator. Good results have been obtained with stationary signals, but we could not obtain similar results for underwater acoustic signals.
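The following sketch illustrates the idea of a kernel-based MI estimate with a third order spline kernel. It is a simplified resubstitution estimator for two signals and not Pham's estimator [32] itself: the closed form of the kernel follows from the definition given above (PDF of the sum of three independent uniform variables on [−0.5,0.5]), while the bandwidth h, the standardisation step and the function names are assumptions made for this example.

```python
import numpy as np

def spline3(x):
    """PDF of the sum of three independent U[-0.5, 0.5] variables."""
    a = np.abs(x)
    return np.where(a <= 0.5, 0.75 - a**2,
                    np.where(a <= 1.5, 0.5 * (1.5 - a) ** 2, 0.0))

def mutual_information(x, y, h=0.5):
    """Resubstitution MI estimate (in nats) between two standardised signals."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    kx = spline3((x[:, None] - x[None, :]) / h) / h   # pairwise kernel values
    ky = spline3((y[:, None] - y[None, :]) / h) / h
    p_x, p_y = kx.mean(axis=1), ky.mean(axis=1)       # marginal PDF at the samples
    p_xy = (kx * ky).mean(axis=1)                     # joint PDF with a product kernel
    return float(np.mean(np.log(p_xy / (p_x * p_y))))

# Hypothetical test signals: two independent uniform sources and a linear mixture.
rng = np.random.default_rng(1)
sources = rng.uniform(-1.0, 1.0, size=(2, 1000))
mixed = np.array([[1.0, 0.8], [0.8, 1.0]]) @ sources

mi_sources = mutual_information(sources[0], sources[1])
mi_mixed = mutual_information(mixed[0], mixed[1])
print(mi_sources, mi_mixed)
```

The pairwise kernel matrices require O(N²) memory, so this form is only practical for short records.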
Quadratic dependence
To measure the independence among the components of a random vector X = (x_{1},⋯,x_{ n })^{T}, the authors of [33] compare the joint PDF of the vector X with the product of the marginal PDF of its components x_{ i }. Using a similar approach, Kankainen [34] proposed the quadratic dependence measure D(X), which compares the joint first characteristic function (FCF), i.e. Φ(Ω) = E{exp(j Ω^{T}X)}, with the product of the marginal FCF:
Here h is an integrable function from ℝ^{n} to ℝ which satisfies the following two conditions [34]:

h is a positive function that is nonzero almost everywhere.

For an analytic FCF Φ(Ω), h should be positive around zero and vanish elsewhere.
If the components of X are mutually independent, then the joint FCF is equal to the product of the marginal FCF, (i.e. Φ(Ω) = ∏_{i=1}^{n} Φ(Ω_{i})), and D(X) = 0. To deal with nonlinear BSS, Achard et al. [35] proposed the following h:
Here K is a square integrable kernel function whose Fourier transform should be nonzero almost everywhere, and σ_{X_{i}} is a scale factor, (i.e. a positive function which only depends on the PDF of X_{ i }).
Using the energy conservation theorem of Parseval, Equation (23) can be replaced by the following functions [35]:
Finally, Achard et al. [35] proved that Q(X) = 0 if and only if the components x_{ i } are mutually independent, and that Q can be estimated as follows:
Here f(x_{k}) = (1/Ns) ∑_{i=1}^{Ns} K((x_{k} − X_{k}(i))/σ_{k}), X_{ k }(i) is the ith sample of the kth component of X, Ê is the empirical mean and F(X) = (1/Ns) ∑_{i=1}^{Ns} ∏_{k=1}^{n} K((x_{k} − X_{k}(i))/σ_{k}), where the kernel function K can be:

(1) Gaussian Kernel: K_{1}(x) = exp(−x^{2})

(2) Square Gaussian Kernel: K_{2}(x) = 1/(1 + x^{2})^{2}

(3) Inverse of Square Gaussian Kernel second derivative function: K_{3}(x) = −(4 − 20x^{2})/(1 + x^{2})^{2}
In our experimental studies, the best results were obtained using the Gaussian Kernel. In fact, the Gaussian Kernel gives the largest difference between the quadratic dependence measure applied to a vector A with i.i.d. uniform components and the same measure applied to a vector B = MA, where M is a full rank mixing matrix. The main drawback of this performance index is its long computing time.
Nonlinear Kernel decorrelation
Bach and Jordan [36] propose an independence measure based on the concept of nonlinear decorrelation, or the ℱ-correlation function ρ_{ℱ}:
Here Cov(X,Y) and Var(X) denote, respectively, the covariance of X and Y and the variance of X. It is worth mentioning that ℱ is a vector space of functions from ℝ to ℝ which contains the whole Fourier basis, (i.e. the exponential functions exp(jwx), with w ∈ ℝ). In this case, ρ_{ℱ} = 0 means that X and Y are independent.
According to Bach and Jordan [36], the best choice of the two nonlinear functions f and g can be made using Mercer kernel functions. A bilinear function K(X,Y) from a vector space X, (for example ℝ^{m}), to ℝ is said to be a Mercer kernel if and only if its Gram matrix is a positive semidefinite matrix. By definition, the Gram matrix of the basis vectors, (X_{1},…,X_{ m }), of an m dimensional vector space X with respect to a bilinear function K(X,Y) is the matrix given by G_{ ij }= K(x_{ i },y_{ j }). K(X,Y) should also have the translation invariance, the convergence in L^{2}(ℝ^{m}) and the isotropy properties. A possible kernel is the Gaussian kernel:
Table 1 shows experimental results obtained by applying the NL-decorrelation to source and mixed signals using three different kernels: Gaussian, polynomial and Hermite functions. Note that, for acoustic signals, better results are obtained using the polynomial kernel, which gives the maximum difference between independent and correlated signals. Our experimental studies show that this performance index can be applied successfully in our project. However, computing time and memory requirements become very large when the number of samples exceeds 500,000. Finally, the difference between the NL-decorrelation of the sources and that of the mixed signals depends on the original signals, the chosen kernel, as well as the mixing model and parameters.
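To make the ℱ-correlation more concrete, the sketch below computes a regularised estimate of the first kernel canonical correlation between two signals from their centred Gram matrices. It is only an illustration under stated assumptions: the Gaussian kernel is taken as exp(−(x−y)²/(2σ²)), the regularisation constant κ and the width σ are arbitrary choices, and the function names are hypothetical. The N×N Gram matrices also show why memory becomes a limiting factor for long records.

```python
import numpy as np

def centred_gram(x, sigma=1.0):
    """Centred Gram matrix of the Gaussian kernel exp(-(x_i - x_j)^2 / (2 sigma^2))."""
    k = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * sigma**2))
    n = len(x)
    c = np.eye(n) - np.ones((n, n)) / n
    return c @ k @ c

def f_correlation(x, y, sigma=1.0, kappa=2e-2):
    """Largest regularised kernel canonical correlation between x and y."""
    n = len(x)
    reg = 0.5 * n * kappa * np.eye(n)
    kx, ky = centred_gram(x, sigma), centred_gram(y, sigma)
    rx = np.linalg.solve(kx + reg, kx)     # (Kx + reg)^-1 Kx
    ry = np.linalg.solve(ky + reg, ky)     # (Ky + reg)^-1 Ky
    return float(np.linalg.svd(rx @ ry, compute_uv=False)[0])

# Hypothetical signals: y_dep is uncorrelated with x but nonlinearly dependent on it.
rng = np.random.default_rng(2)
x = rng.standard_normal(300)
y_ind = rng.standard_normal(300)
y_dep = x**2 + 0.1 * rng.standard_normal(300)

rho_ind = f_correlation(x, y_ind)
rho_dep = f_correlation(x, y_dep)
print(rho_ind, rho_dep)
```

With a finite number of samples the estimate is not exactly zero for independent signals, so the index is best read as a comparison between two situations.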
Simplified nonlinear decorrelation
Using an approach similar to [36], we proposed a simplified performance index based on the concept of a nonlinear covariance matrix Υ = (ρ_{ ij }) defined by:
where X = (x_{ i }) is a random vector, f (x) and g(x) are two nonlinear functions, and 〈x〉_{ c }= x−E(x). If the components of X are independent from each other, then, Υ becomes a diagonal matrix. Using the last definition, we suggest the following performance index:
Here diag(M) is a diagonal matrix which has the same principal diagonal of matrix M and Off(M) = M−diag(M). Functions f and g are chosen from the following functions:

(1) ‘Gauss’: Gaussian kernel.

(2) ‘poly’: 6th order polynomial kernel whose coefficients are the components of a unit vector.

(3) ‘atan’: Saturation kernel using the arctangent function.

(4) ‘tanh’: Saturation kernel using the hyperbolic tangent function.
Our experimental studies, (see Table 2), show the effectiveness of this performance index in dealing with underwater acoustic signals and channels. The main drawback of this performance index is that the obtained values depend on the kind and the number of original independent signals. Therefore, this performance index can only be used in simulations where the original sources are known.
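The nonlinear covariance matrix can be sketched as follows. The entry ρ_{ ij } is taken here as the empirical mean of 〈f(x_{ i })〉_{ c }〈g(x_{ j })〉_{ c }, and the index is taken as the ratio between the Frobenius norms of Off(Υ) and diag(Υ). Both choices are assumptions made for this illustration and may differ from the exact expressions used in our experiments; the ‘tanh’ and ‘atan’ functions are the saturation kernels listed above.

```python
import numpy as np

def nonlinear_covariance(x, f=np.tanh, g=np.arctan):
    """Upsilon[i, j] = mean(<f(x_i)>_c <g(x_j)>_c) for signals stored row-wise in x."""
    fx = f(x) - f(x).mean(axis=1, keepdims=True)
    gx = g(x) - g(x).mean(axis=1, keepdims=True)
    return fx @ gx.T / x.shape[1]

def decorrelation_index(x, f=np.tanh, g=np.arctan):
    """Assumed index: ||Off(Upsilon)||_F / ||diag(Upsilon)||_F (small when independent)."""
    ups = nonlinear_covariance(x, f, g)
    diag = np.diag(np.diag(ups))
    off = ups - diag
    return float(np.linalg.norm(off) / np.linalg.norm(diag))

# Hypothetical signals: two independent uniform sources and a linear mixture.
rng = np.random.default_rng(5)
sources = rng.uniform(-1.0, 1.0, size=(2, 20000))
mixed = np.array([[1.0, 0.8], [0.8, 1.0]]) @ sources

idx_sources = decorrelation_index(sources)
idx_mixed = decorrelation_index(mixed)
print(idx_sources, idx_mixed)
```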
Independence measure based on the FCF
The joint FCF of a random vector X = (x_{1},…,x_{ n })^{T} is equal to the product of the marginal FCF of its components if and only if they are independent from each other. Using that property, Feuerverger [37] proposed the following independence measure:
where g is an adequately chosen function [37], X′ = Φ^{−1}((8X − 3)/(8n + 2)) is the approximation of the score function of X, and Φ(X) is the PDF of a zero mean and unit variance Gaussian signal. Our experimental studies show that the computing time is the main drawback of this performance index. We should mention that for stationary signals, this performance index is consistent. Unfortunately, this interesting property is useless in our application since the acoustic signals are nonstationary.
Recently, Murata [38] proposed a simplified test to measure the independence between two random signals. This independence measure is also based on the estimation of the cross FCF:
If X and Y are statistically independent from each other, then, Φ_{ XY }(t,s) = Φ_{ X }(t)Φ_{ Y }(s). Murata’s independence measure is defined by the following equation:
Here k(t,s) is a bounded estimation window. Our experimental studies show that:

The obtained values depend on the original sources. This drawback is shared with the previous performance indices.

For beta random variables, good results have been obtained. On the other hand, we noticed poor results for uniform random signals.

For acoustic signals, we noticed good results for instantaneous mixtures and poor ones for convolutive mixtures.

Computing time is crucial.
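The principle of an FCF-based test can be illustrated by the sketch below, which estimates the joint and marginal characteristic functions on a finite grid and averages the squared modulus of Φ_{ XY }(t,s) − Φ_{ X }(t)Φ_{ Y }(s). It is not Murata's measure [38] itself: the rectangular window (a uniform grid on [−t_max, t_max]), the grid size and the function name are assumptions chosen for this example.

```python
import numpy as np

def fcf_independence(x, y, t_max=2.0, n_grid=9):
    """Mean squared gap between the joint FCF and the product of marginal FCFs."""
    t = np.linspace(-t_max, t_max, n_grid)
    ex = np.exp(1j * np.outer(t, x))          # row a: exp(j t_a x_n)
    ey = np.exp(1j * np.outer(t, y))          # row b: exp(j t_b y_n)
    phi_xy = ex @ ey.T / len(x)               # empirical E exp(j (t_a x + t_b y))
    phi_x = ex.mean(axis=1)
    phi_y = ey.mean(axis=1)
    gap = phi_xy - np.outer(phi_x, phi_y)
    return float(np.mean(np.abs(gap) ** 2))

# Hypothetical signals: two independent uniform sources and a linear mixture.
rng = np.random.default_rng(6)
sources = rng.uniform(-1.0, 1.0, size=(2, 20000))
mixed = np.array([[1.0, 0.8], [0.8, 1.0]]) @ sources

d_sources = fcf_independence(sources[0], sources[1])
d_mixed = fcf_independence(mixed[0], mixed[1])
print(d_sources, d_mixed)
```

The cost grows with the product of the number of samples and the number of grid points, which is in line with the computing time issue noted above.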
Cross-cumulants
The previously described performance indices cannot be applied in real situations, where the original signals are unknown, because the performance values depend on the sources. Therefore, we developed a new performance index based on the cross-cumulant:
Here \overline{Cum_{(1,3)}(X,Y)^{2}} is the average of Cum_{(1,3)}(X,Y)^{2}, which is obtained using a sliding estimation window. The index of Equation (32) is limited to two signals. To generalize this index to the case of multiple signals, we proposed the following index:
where Γ = (Perfc(X_{ i },X_{ j })) and Off(Γ) = ∑_{i≠j} γ_{ij}^{2}. Good results have been obtained using this performance index on instantaneous or convolutive mixtures of acoustic signals. However, the computing time is relatively long.
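The following sketch shows how the averaged squared cross-cumulant and Off(Γ) can be computed. For zero mean signals the cross-cumulant is Cum_{(1,3)}(X,Y) = E(XY^{3}) − 3E(XY)E(Y^{2}). Two simplifications are assumed here and should not be read as the exact index: consecutive non-overlapping windows replace the sliding window, and Perfc is taken directly as the windowed average without any normalisation.

```python
import numpy as np

def cum13(x, y):
    """Cum(x, y, y, y) for zero-mean signals: E[x y^3] - 3 E[x y] E[y^2]."""
    return np.mean(x * y**3) - 3.0 * np.mean(x * y) * np.mean(y**2)

def mean_sq_cum13(x, y, win=2000):
    """Average of Cum13^2 over consecutive windows (assumed stand-in for Perfc)."""
    x = x - x.mean()
    y = y - y.mean()
    n_win = len(x) // win
    vals = [cum13(x[k * win:(k + 1) * win], y[k * win:(k + 1) * win]) ** 2
            for k in range(n_win)]
    return float(np.mean(vals))

def off_gamma(signals, win=2000):
    """Off(Gamma): sum over i != j of gamma_ij^2, with gamma_ij = Perfc(X_i, X_j)."""
    n = signals.shape[0]
    return sum(mean_sq_cum13(signals[i], signals[j], win) ** 2
               for i in range(n) for j in range(n) if i != j)

# Hypothetical signals: unit-variance uniform sources and a linear mixture.
rng = np.random.default_rng(7)
sources = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, 20000))
mixed = np.array([[1.0, 0.5], [0.5, 1.0]]) @ sources

off_sources = off_gamma(sources)
off_mixed = off_gamma(mixed)
print(off_sources, off_mixed)
```

Because the index only uses the observed signals, it can be evaluated without access to the original sources.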
Blind separation of observed acoustic signals
In a previous study [39], we implemented and tested some instantaneous ICA algorithms. According to that study, good results, at least for instantaneous mixtures of underwater acoustic signals, can be obtained using ICA algorithms based on HOS or dedicated to nonstationary signals. The algorithms discussed in this section have been selected according to our previous study.
In real applications of PAT, hydrophones could record mixed signals. In order to apply classic AAT algorithms, one should first separate the recorded mixed signals. It was mentioned that in PAT applications, a MIMO configuration is quite possible. In this case, the sources could be generated and recorded at different locations. This spatial diversity could be translated into statistical independence. Since the early 1990s, ICA has been considered as a set of important signal processing tools [40–42]. By assuming that the p unknown emitted signals, (i.e. sources), are statistically independent from each other, ICA consists in retrieving a set of independent signals, (output signals), from the observation of unknown mixtures of the p sources. It was proved that the output signals can be equal to the sources up to a scale factor, (or filter), and up to a permutation [43].
Due to the long and sparse impulse response of underwater acoustic channels and to the features of underwater acoustic signals, (i.e. nonstationarity, closeness to Gaussianity, sparseness, etc.), see Section “Assumption and background”, many ICA algorithms could not achieve the separation of sources in our application. Every selected and implemented algorithm has been evaluated using the following steps: first, we used the same, (or similar), signals as the ones originally proposed by the authors of that algorithm. Second, the algorithm was run over some simulated scenarios using a set of nonstationary signals, (normally speech signals), in memoryless or simple convolutive channels. Algorithms that gave good, (or at least satisfactory), results in these two stages have been selected in our project.
The best experimental results were obtained using two frequency domain ICA algorithms [22, 23] based on the minimization of second order statistics criteria in the frequency domain. These two algorithms exploit the spatial and the spectral diversity of the original signals. In the following, the major tested algorithms are briefly described.
Blind estimation of time delay
In order to retrieve source signals, one can estimate the transmission channel, then separate the sources using some invertible filters. In this scenario, an algorithm to estimate the different time delays can be of great help. Emile and Comon [44] proposed an elegant blind time delay estimation algorithm for a simplified convolutive mixture:
Here b_{ i }(t) stands for an AWGN. The proposed algorithm can estimate different mixing parameters, (τ_{ i } and α_{ i }), using HOS in frequency domain, see Equation (34).
where R_{ i }(w) is the Fourier transform of the observed signals. Using the independence assumption of the sources, algebraic operations, Vandermonde matrix properties, and an inverse Fourier transform, the authors successfully generate a signal of shifted Diracs with the required time delays, see Figure 4.
Our experimental studies show that the performance can be improved by increasing the number of samples. On the other hand, even though we used over 250,000 samples, we unfortunately could not achieve good results when the sources are underwater acoustic signals, see Figure 5.
It is worth mentioning that the authors proposed in [44] another version of their algorithm. However, we did not implement that version, for the simple reason that the first version did not give satisfactory results in our application. In fact, the underwater acoustic channel is more complex than the model considered by the authors.
Nguyen’s algorithms
In the early 1990s, Nguyen and Jutten [45–47] were the first to propose an ICA algorithm to separate a convolutive mixture of speech signals. The first version of the algorithm consists in the minimization of a cost function defined as the mathematical expectation of an odd nonlinear function evaluated over the estimated signals. Later on, they proposed another cost function defined as the sum of fourth order cross-cumulants. To avoid a matrix inversion problem, they proposed a recursive structure which can only deal with a mixture of two sources. The latter constraint can be easily avoided by using our recursive system proposed in [48]. In addition, the algorithms proposed by Nguyen et al. can be easily implemented and they have been used to separate speech signals. For these reasons, we decided to implement these algorithms.
In addition to the different versions originally proposed by the authors, we implemented hybrid structures, (i.e. a minimization of a cost function based on a weighted sum of their different cost functions). Unfortunately, our experimental studies show that the algorithm, in all implemented versions, does not help us to reach our goal. In fact, the performance of the separation was not satisfactory due to the particularities of our application. It is worth mentioning that the convergence of the algorithm was a critical point in many cases.
Natural gradient applied to entropy maximization
Acoustic emission analysis (AEA) is used to characterize and localize the development of material defects. To improve the performance of their AEA, Kosel et al. [49] processed the observed signals using an ICA algorithm proposed earlier by Amari and Cardoso [50], based on the natural gradient minimization algorithm proposed in [51] and introduced independently by Cardoso and Laheld [52] under the name of relative gradient.
Many variants of Amari’s algorithm can be found in the literature, based on the minimization of different contrast functions such as MI, Shannon entropy, etc. Douglas et al. [53, 54] addressed the stability problems of Amari’s algorithms and proposed the minimization of:
where W(z,k) = ∑_{i} W_{i}(k) z^{−i} is the separation filter, p(y_{i}; G(z)) is the marginal probability density of y_{ i }, and f_{ i }(y_{ i }) is a nonlinear function. The separation filter can be adapted using:
where L is the filter order, U(k) = ∑_{i=0}^{L} W_{L−i}^{T}(k) Y(k−i), and η(k) = μ/(β + ∑_{p=0}^{L} ‖Y(k−p)‖^{q}), with μ, β and q constant parameters, stands for an adaptive minimization step suggested by Amari. According to the same authors, the components f = (f_{ ik }(Y)) can be selected by:
where f_{ r }(X), r ∈ {P, N}, is used when the signals have a positive, (resp. negative), kurtosis. ρ(k), κ(k) and σ(k) are signal statistics that can be adapted iteratively. Finally, Douglas et al. [53] have suggested the following nonlinear function:
In the context of our project, many simulations have been conducted. According to our experimental studies, these algorithms can give good results for stationary signals and for relatively short channel filters, (i.e. low order filters). Unfortunately, divergence problems or unsatisfactory results were often observed when the signals were sparse and nonstationary and the channel filter was very long, as in our application.
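For readers unfamiliar with the natural gradient, the sketch below shows the well known batch update W ← W + η(I − f(Y)Y^{T})W in the instantaneous (memoryless) case only. It is not the convolutive algorithm of Douglas et al. discussed above: the fixed step η, the choice f = tanh (suited to signals with positive kurtosis), the Laplacian test sources and the mixing matrix are all assumptions made for this illustration.

```python
import numpy as np

def natural_gradient_ica(x, n_iter=1000, eta=0.1):
    """Batch natural-gradient ICA for an instantaneous mixture stored row-wise in x."""
    n, n_samples = x.shape
    w = np.eye(n)
    for _ in range(n_iter):
        y = w @ x
        # W <- W + eta * (I - E[f(y) y^T]) W, with f = tanh and an empirical mean
        w = w + eta * (np.eye(n) - np.tanh(y) @ y.T / n_samples) @ w
    return w

# Hypothetical stationary, super-Gaussian sources mixed without memory.
rng = np.random.default_rng(8)
sources = rng.laplace(size=(2, 5000))
a_mix = np.array([[1.0, 0.6], [0.5, 1.0]])
w_sep = natural_gradient_ica(a_mix @ sources)

# Global system: each row should be dominated by a single entry after convergence.
g = np.abs(w_sep @ a_mix)
leakage = float(np.max(np.min(g, axis=1) / np.max(g, axis=1)))
print(g, leakage)
```

This easy setting (stationary sources, no channel memory) corresponds to the conditions under which good results were observed; it says nothing about the long sparse channels of our application.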
Blind separation of non stationary signals
Kawamoto et al. [19] proposed an ICA algorithm to separate a convolutive mixture of speech signals. The proposed algorithm can be considered as an extension of Matsuoka’s algorithm, which can only deal with instantaneous mixtures, see [55, 56]. The main idea of Kawamoto’s algorithm is that the separation of nonstationary signals can be obtained by minimizing a cost function based on the Hadamard inequality [57] under the following assumptions:

(1) H(z) is a full rank stable filter matrix and it has no zero on the unit circle.

(2) The sources are zero-mean nonstationary signals.

(3) The sources have different autocovariances r_{ i }(n,m) = E(s_{ i }(n)s_{ i }(n−m)), which should be functions of time.
In this case, Kawamoto et al. proved that the separation can be obtained by adapting the following filter:
where L ∈ ℤ stands for a time delay and R_{ Y }(n) = E(Y(n)Y^{T}(n)).
In our simulations, good results have been obtained when the signals are speech signals and the channel filter is a FIR filter, see Figure 6. Unfortunately, we could not obtain good results when the signals and the channel are derived from underwater acoustic applications. By using the preprocessing stages described in Section “Preprocessing systems” and a huge number of samples, only average results were obtained.
The convergence needed a huge number of samples, and the obtained results were not always satisfactory. The performance of the algorithm depended on the source signals as well as on the transmission channel. The algorithm was also time and memory consuming.
A frequency domain method for BSS of convolutive audio mixture (SOS)
Rahbar and Reilly [22] proposed an algorithm which minimizes a criterion Γ based on the cross-spectral density matrix of the observed signals. For nonstationary signals, the latter matrix depends on the frequency w and on the time epoch m:
where ‖F(w,m)‖_{F}^{2} is the squared Frobenius norm of F(w,m), M is an estimate of the channel degree, Ĥ_{α} is an estimate of the channel response at time α, and D̂_{m}(w) are diagonal matrices which estimate the cross-spectral density matrices of the sources. To estimate the cross-spectral density matrix of the signals, the authors used L estimation windows with L_{ m } samples each:
where X_{ im }(w) is the Fourier transform of the observed signals, and J is the number of estimation windows such that L_{ J }< L_{ m } and J L_{ J }> L_{ m }.
Clearly, the minimization of (35) requires a continuous variable w, which is very difficult to implement. To solve that problem, the authors proposed the minimization of another criterion over K frequency points w_{k} = πk/K:
where F_{ R }(w,m) and F_{ I }(w,m) are the real and the imaginary parts of Equation (36). Finally, the minimization was done using a conjugate gradient algorithm.
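The estimation of the cross-spectral density matrices per time epoch can be sketched as follows. The code averages the outer products X(w)X(w)^{H} over overlapping windows inside each epoch. It is an illustration rather than the estimator of [22]: the Hann window, the window and hop lengths, the function names and the off-diagonal energy ratio used to compare sources and mixtures are assumptions of this example, and the ratio is not the criterion Γ.

```python
import numpy as np

def cross_spectral_matrices(x, epoch_len, win_len, hop):
    """P[m, k] = average over the windows of epoch m of X(w_k) X(w_k)^H."""
    n_samples = x.shape[1]
    window = np.hanning(win_len)
    epochs = []
    for start in range(0, n_samples - epoch_len + 1, epoch_len):
        segs = [x[:, s:s + win_len] * window
                for s in range(start, start + epoch_len - win_len + 1, hop)]
        spec = np.fft.rfft(np.stack(segs), axis=-1)        # shape (J, n_sig, K)
        # Outer product at every frequency, averaged over the J windows
        epochs.append(np.einsum('jak,jbk->kab', spec, spec.conj()) / len(segs))
    return np.stack(epochs)                                # shape (M, K, n_sig, n_sig)

def off_diagonal_ratio(p):
    """Share of the total energy of P carried by its off-diagonal entries."""
    diag = np.einsum('...ii->...i', p)[..., None] * np.eye(p.shape[-1])
    return float(np.sum(np.abs(p - diag) ** 2) / np.sum(np.abs(p) ** 2))

# Hypothetical nonstationary sources: white noise with slowly varying envelopes.
rng = np.random.default_rng(3)
n = 16384
envelope = 1.0 + 0.8 * np.sin(2 * np.pi * np.arange(n) / n * np.array([[1.0], [2.5]]))
sources = envelope * rng.standard_normal((2, n))
mixed = np.array([[1.0, 0.8], [0.8, 1.0]]) @ sources

p_src = cross_spectral_matrices(sources, epoch_len=4096, win_len=512, hop=256)
p_mix = cross_spectral_matrices(mixed, epoch_len=4096, win_len=512, hop=256)
print(p_mix.shape, off_diagonal_ratio(p_src), off_diagonal_ratio(p_mix))
```

For independent sources the matrices are close to diagonal at every epoch and frequency, which is the property exploited by second order frequency domain methods.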
Convolutive blind separation of nonstationary sources
The approach proposed by Parra and Alvino [23] is similar to the one proposed by Rahbar et al. Using the spectral density of different signals, the authors suggested the minimization of the following criterion by using a gradient algorithm:
where R̂_{X}(w,k) is the estimated cross-power spectra of X. To improve the performance of their algorithm, the authors performed the minimization using a joint diagonalization algorithm applied on the following criterion J(w) and subject to a constraint in time domain concerning the filter size, (this constraint aims to solve the permutation indeterminacy in frequency domain):
Experimental results
Using the structure proposed in Figure 2, many simulations have been conducted. Generally, between 500,000 and 1,000,000 samples were needed to achieve the separation. The original sources were sampled at 44 kHz. In almost all the simulations, the separation of artificial or natural signals has been successfully achieved. In these simulations, the channel depth was set between 100 and 500 m, the distances among the sources or among the sensors were between 30 and 100 m, the distances between the sources and the sensors ranged from 1,500 to 2,500 m, and the number of sensors was strictly higher than the number of sources.
Figure 7 shows experimental results obtained by applying only the SOS algorithm to separate a mixture of acoustic signals, (Ship and Whale).
Finally, good results have been obtained by applying only the SOS algorithm, except for some configurations, notably when the sources are close to the water surface. For the latter cases, we found that applying the Parra algorithm before the SOS algorithm can improve the overall results. Figure 8 shows experimental results obtained by the different algorithms, (Parra, SOS or Parra + SOS); each point corresponds to the result of a random simulation. In this figure, a normalized positive performance index based on a nonlinear decorrelation is used. The normalized performance index is forced to be zero for the mixture values and 1 for the sources.
Conclusion
In this article, several signal processing contributions applied to a real world application, namely PAT, have been presented. Many simulations have been conducted, and experimental studies showed the necessity of preprocessing and postprocessing the observed signals in order to properly achieve the separation of the sources.
Many algorithms have been implemented and tested. However, only a few algorithms, which are dedicated to the separation of nonstationary signals, gave us satisfactory results. In a real scenario of warfare applications, the use of any ICA algorithm becomes very challenging. In fact, many ICA algorithms cannot achieve satisfactory results when:

Most of the signals are close to Gaussian ones.

Sources have very inhomogeneous powers, (the power ratio can reach a dozen dB).

SNR can be very limited depending on operational situations.

Even though ICA algorithms can handle convolutive mixtures, the channel filter orders in our applications can be up to a few thousand. At the same time, such a filter is very sparse: only a few filter parameters do not vanish.
Our future work consists in developing an ICA algorithm which can use other features of acoustic signals, such as sparseness along with nonstationarity, etc.
Appendix
Appendix 1: HOS estimators
Since the beginning of the 1980s, HOS methods and theories have been widely used in signal processing. Most HOS algorithms are based on fourth order statistics. By definition [58], the qth order moment μ_{ q } of a stochastic signal X is:
where E stands for the mathematical expectation. The qth order cumulant of X can be evaluated from its moments by using the Leonov–Shiryayev formula [59]:
where the sum is taken over all the sets of v_{ i }(1 ≤ i ≤ p ≤ q) which compose a partition [60] of {1,…,q}. By using the above relationship, we can calculate the 4th order cumulant of X:
For a zero-mean stochastic signal, the second order cumulant, (i.e. the variance), is equal to its second order moment and its 4th order cumulant becomes:
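As a numerical illustration, for a zero mean signal the 4th order cumulant reduces to the standard expression μ_{4} − 3μ_{2}^{2}. The short sketch below evaluates it with sample moments; the function names and the test signals are chosen for this example only. The result is close to zero for a Gaussian signal and close to −1.2 for a unit variance uniform signal.

```python
import numpy as np

def moment(x, q):
    """Sample estimate of the q-th order moment E(X^q)."""
    return float(np.mean(x ** q))

def cum4_zero_mean(x):
    """4th order cumulant mu_4 - 3 mu_2^2 after removing the sample mean."""
    x = x - np.mean(x)
    return moment(x, 4) - 3.0 * moment(x, 2) ** 2

rng = np.random.default_rng(4)
c4_gauss = cum4_zero_mean(rng.standard_normal(200000))
c4_unif = cum4_zero_mean(rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), 200000))
print(c4_gauss, c4_unif)
```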
Arithmetic estimators
Let X be a zero mean ergodic stochastic signal where x_{ i } is an event, (or a signal sample), of X (1 ≤ i ≤ N). In this case, the arithmetic estimator of the qth order moment is given by:
This estimator assumes that the signal X is stationary over N samples. It is an unbiased estimator, (i.e. E(μ̂_{q}) = μ_{q}), and its variance is given by:
Clearly, the above mentioned estimator is a consistent estimator; hence, for stationary signals, its variance decreases as the number of samples increases. An arithmetic estimator of the qth order cumulant can be developed from Equation (38):
It is proved [61] that the estimator (41) is a biased estimator and that the estimation error decreases proportionally to 1/N:
However, it is a consistent estimator. An unbiased cumulant estimator can be deduced from the last equation:
where the parameters c_{ p } depend on the partitions of the indices v_{ i }. These parameters can be estimated as the solution of q linear equations. Let us consider the fourth order cumulant:
In order to make the last estimator unbiased, one should solve a linear system of equations obtained by comparing term-by-term the expectation of Equation (43) and the theoretical value given by (38):
For zero mean signals, we can easily prove that:
That means the following estimator is an unbiased estimator for the fourth order cumulant of a zero-mean stationary signal X:
For real time applications, the estimators should be adaptive ones. The estimator (40) is not an adaptive one, but it is easy to derive an adaptive version:
where μ̂_{r}{k} is the estimator of the rth order moment at the kth iteration.
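An adaptive form of the arithmetic estimator can be sketched with the usual running mean recursion, in which the estimate at iteration k is the previous estimate plus (x_{k}^{r} − previous estimate)/k. This recursion is the standard one and is assumed here to be equivalent to the adaptive version referred to above; the function name is hypothetical.

```python
def running_moment(samples, r):
    """Return the successive estimates of the r-th order moment, one per sample."""
    estimates, mu = [], 0.0
    for k, x in enumerate(samples, start=1):
        mu += (x ** r - mu) / k      # same value as the batch mean of the first k samples
        estimates.append(mu)
    return estimates

second_moments = running_moment([1.0, 2.0, 3.0, 4.0], 2)
print(second_moments)
```

After the last sample the recursion returns (1 + 4 + 9 + 16)/4 = 7.5, which is the batch arithmetic estimate.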
Exponential estimators
Exponential estimators are defined as follows:
where 0 < λ_{ q }< 1 stands for a forgetting factor. This estimator can be calculated easily in an adaptive way:
The latter estimator is biased, (E{μ̂_{q}} = (1 − λ_{q}^{N}) μ_{q}), but it is asymptotically unbiased. This estimator can achieve a better estimation of the moments of nonstationary signals. The closer λ is to 1, the more the past samples are taken into account. An unbiased exponential estimator can be used:
Estimator (53) can also be modified into an adaptive version:
An adaptive unbiased estimator of the cumulants can be derived using (39) and (54). To simplify our discussion, the unbiased estimator of the fourth order cumulant for zero mean signals is developed as follows:
where γ is another forgetting factor and
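The behaviour of the exponential estimator can be checked with the short sketch below. It assumes the common recursion in which the new estimate is λ times the previous estimate plus (1 − λ) times x_{k}^{q}, started from zero, which reproduces the bias factor (1 − λ^{N}) quoted above; dividing by (1 − λ^{k}) then removes the bias. The function name and the constant test signal are illustrative assumptions.

```python
def exponential_moment(samples, q, lam):
    """Return the biased and the bias-corrected exponential moment estimates."""
    mu, biased, unbiased = 0.0, [], []
    for k, x in enumerate(samples, start=1):
        mu = lam * mu + (1.0 - lam) * x ** q      # exponentially weighted average
        biased.append(mu)
        unbiased.append(mu / (1.0 - lam ** k))    # removes the (1 - lam^k) bias factor
    return biased, unbiased

# Constant signal x = 2, so the true second order moment is 4.
biased, unbiased = exponential_moment([2.0] * 50, q=2, lam=0.9)
print(biased[-1], unbiased[-1])
```

For nonstationary signals a smaller λ tracks changes faster, at the price of a larger variance.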
Appendix 2: adaptive unbiased estimator of 4th order cumulant
Let C_{13}(N) = K_{13}(X,Y) be the adaptive estimator of the cumulant 3×1 using N samples, A_{N} = ∑_{i}^{N} x_{i} y_{i}^{3} and B_{N} = ∑_{ij}^{N} x_{i} y_{i} y_{j}^{2}. In this case, Equation (13) can be written as follows:
Hence, we can prove that:
The last equation can be written as follows:
Finally, the last equation can be rewritten as:
where μ_{nm}(X,Y) = (1/(N−1)) ∑_{i=1}^{N−1} x_{i}^{n} y_{i}^{m} is the estimator of E(X^{n}Y^{m}) using N−1 samples. Using the last two equations, we derive the final form of our adaptive 4th order cumulant estimator:
Acknowledgements
Part of this work was supported by the French Military Center for Hydrographic & Oceanographic Studies (SHOM, i.e. Service Hydrographique et Océanographique de la Marine, Centre Militaire d’Océanographie).
Competing interests
The author declares that he has no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Mansour, A. Enhancement of acoustic tomography using spatial and frequency diversities. EURASIP J. Adv. Signal Process. 2012, 225 (2012). https://doi.org/10.1186/1687-6180-2012-225
DOI: https://doi.org/10.1186/1687-6180-2012-225
Keywords
 Underwater acoustic applications
 Passive acoustic tomography
 Multipath channel
 Sparseness or nonstationary signals
 Independent component analysis
 High order statistics