 Research
 Open access
 Published:
Reverberation suppression using nonnegative matrix factorization to detect lowDoppler target with continuous wave active sonar
EURASIP Journal on Advances in Signal Processing volume 2019, Article number: 11 (2019)
Abstract
In active sonar systems, the detection of echo from targets can deteriorate due to reverberation. Detection becomes more difficult if targets have lowDoppler frequency and are located near the reverberation band, especially in an environment with low signaltoreverberation ratio. In this paper, we propose an algorithm for the reverberation suppression of continuous wave signals using nonnegative matrix factorization. To extract the target echo signal mixed with reverberations, the bases for the target echo and the reverberation are independently defined, and different constraints are applied for their corresponding estimation. We also derive constraints on temporal continuity and temporal length to estimate bases for the target echo. Experiments using simulated reverberations are performed to evaluate the proposed algorithm, and the results show an enhancement in the signaltonoise ratio by 6–15 dB, as well as in the detection probability at several signaltoreverberation ratios. Moreover, an experiment is conducted using reverberation measured from an ocean, and the results show that the proposed algorithm can effectively suppress reverberation and enhance detection performance in practical settings.
1 Introduction
Active sonar is used to detect a target under water using a reflected ping. However, in addition to reflection from the target, the received signal contains reverberations caused by reflection from scatterers. Conventional target detection algorithms, including the matched filter, are sensitive to reverberations, and extensive research has been conducted to improve detection in this condition [1–11].
Although the reverberation can be addressed by the beamformer, adequate suppression of the reverberation is not achieved. In case of the lowDoppler target, the target echo is located in the mainlobe and sidelobe of the beamformer, so additional algorithm is required to suppress the reverberation [1, 2]. Methods of reverberation mitigation are twofold: transmission pulse design at the transmitter and signal processing for reverberation suppression at the receiver. Continuous wave (CW) pulses are commonly used active sonar signals used to detect targets because this type of pulse allows the easy location of targets in the rangeDoppler resolution but is vulnerable to reverberations [3]. On the contrary, linear frequencymodulated pulses are robust against reverberation but deliver a lowDoppler resolution [4]. Some research has focused on the generation of specific types of pulses, such as the geometric comb [5], tripletpair comb [4], and other Dopplersensitive pulses [2, 3], to handle reverberations. Although such pulses can mitigate the effects of reverberations, they cannot suppress them. Moreover, CW pulses and some variants [2] are still widely used in active sonar systems. Consequently, signal processing for suppressing reverberation at the receiver is needed, and some algorithms have been proposed in this vein [6–11].
For instance, algorithms based on whitening with autoregressive (AR) models have been developed [6, 7] for reverberation suppression. However, the effectiveness of this approach deteriorates when the Doppler shifts of the reverberation and the target echo are similar. To overcome this limitation, algorithms based on principal component inversion (PCI) have been developed [8–10]. The signal subspace extraction (SSE) algorithm has recently been proposed as an improvement to the PCIbased algorithm by expressing the reverberation model as a sum of higher and lower reverberation echoes [11].
In this paper, we focus on reverberation suppression for the detection of lowDoppler targets. Such targets can be easily confused with clutter in reverberation. The lowDoppler target detection problem occurs more frequently when the receiver moves because the energy of the reverberation spreads over a frequency range proportional to the speed of the receiver [12]. Therefore, we also consider a moving receiver to develop a reverberationsuppression algorithm. Specifically, we use nonnegative matrix factorization (NMF), a segmented signal representation [13], to suppress reverberations. While PCIbased algorithms detect the desired signal using only low ranks, we propose using signal characteristics in the time–frequency domain.
NMF decomposes a nonnegative matrix into a multiplication of two nonnegative matrices [13]. The spectrogram of frequency and the time basis matrices can then be analyzed, as has been shown for sound signals [14]. Iterative algorithms for the NMF based on the multiplicative update rule were first developed for the cost functions of the Euclidean distance and the Kullback–Leibler divergence [15] and were then expanded to other distance measures, such as Itakura–Saito divergence [16] and beta divergence [17]. Several update rules have also been researched, such as the alternating least squares [18] and expectationmaximization [19] algorithms. The application of NMF to music and speech signal processing have been researched actively to analyze music signals [20, 21], separate the desired source signals from the received signals [22, 23], and to denoise speech signals [24, 25]. NMFbased algorithms are not limited to the processing of monochannel signals and can also deal with multichannel acoustic signals [19, 26]. Although NMFbased algorithms for music and speech signals have been researched, it is challenging to find NMFbased sonar signal processing algorithms, excluding those for sonar image recognition [27]. Both the PCA and NMF algorithms are matrix decomposition techniques, and the NMF algorithm has advantage for analyzing the sparse components of a matrix, which is called “the partbased representation,” compared to the PCA algorithm [13]. It is expected that the NMF algorithm is also suitable for finding the target echo because the spectrogram of the target echo is a sparse component of the spectrogram of the entire received signal.
The main contribution of this research article is the design of NMFbased reverberation suppression in continuous wave active sonar, as no research has applied the NMF method to this problem, to the best of the authors’ knowledge. To develop the reverberation suppression algorithm, we divide NMF bases into two parts—echo bases and reverberation bases—and design a constraint to discriminate between them. We apply two constraints, on the temporal length and temporal continuity. The former constraint and its estimation algorithm are novel whereas the latter constraint is developed by Virtanen [22].
The remainder of this paper is organized as follows: Section 2 presents the problem statement for this study and describes the time–frequency characteristics of the target echo and reverberation considering a moving receiver. Section 3 details the proposed NMFbased algorithm, and Section 4 describes the results of simulation and measurements using the proposed algorithm. Finally, we draw the conclusions of this article in Section 5.
Notation: We denote vectors and matrices by boldface lowercase and boldface uppercase letters, respectively. Table 1 specifies the symbols and their meanings.
2 Problem statement
We assume that a sonar (receiver) is moving with constant velocity through a field with stationary scattering elements, as shown in Fig. 1. Echo signal s_{e}(t) from the target is a replica of transmitted ping signal s_{t}(t) with Doppler shift f_{d}:
where a is an attenuation factor, t_{d} is time delay, and j is the unit imaginary number. The frequency of s_{e}(t) is given by
where S_{t}(f) is the Fourier transform of s_{t}(t).
If the transmitted ping signal is a CW signal with a center frequency f_{0}, the received ping signal is a narrowband signal with frequency (f_{0}+f_{d}). The reverberation signal s_{r}(t) consists of a large number of replicas from scatterers:
with spectrum
where a_{i} and \(t_{d_{i}}\) are an attenuation factor and a time delay of ith scatterer, respectively, and the Doppler shift \(f_{d_{i}}\) is given by
where V is the speed of the receiver, c is the speed of sound, and ψ_{i} is the angle between the direction of motion of the sonar and each scatterer. Equation (5) shows that the spectrum of reverberation has a range of frequency \(  \left (\frac {2V}{c}\right) f_{0} < f < \left (\frac {2V}{c} \right) f_{0}\) and, hence, has a wider band than the echo signal when the sonar receiver moves.
Figures 2 and 3 show the spectrograms of the simulated target echo and the reverberation, respectively, for a transmitted CW ping signal. Clearly, the target echo is a narrowband signal with frequency f_{0}+f_{d}, whereas the reverberation is a broadband signal in the range \(  \left (\frac {2V}{c} \right) f_{0} < f < \left (\frac {2V}{c} \right) f_{0}\), as verified in (2) and (4), respectively. Further, unlike reverberations, the spectrum of the target echo signal does not randomly fluctuate over time in response to a received signal. Moreover, the target echo signal last for a short period, whereas reverberation is more persistent.
In this study, we extract the target echo signal from a received signal that contains reverberation. To this end, we analyze the spectrogram of the mixed signal by using NMF with distinctive characteristics of the target echo signal, as detailed below.
3 Proposed method
3.1 Nonnegative matrix factorization
NMF allows for the decomposition of a nonnegative matrix into a multiplication of two nonnegative matrices by using the following model:
where \(\mathbf {V} \in \mathbb {R}_{K \times N}^{+}\), \(\mathbf {W} \in \mathbb {R}_{K \times R}^{+}\), \(\mathbf {H} \in \mathbb {R}_{R \times N}^{+}\), and \(\mathbf {E} \in \mathbb {R}_{K \times N}\). When NMF is applied to acoustic signal processing, V is the spectrogram of the input signal with K frequency bins and N time frames, W and H are the frequency and the time basis matrices, respectively, and R is the number of basis vectors. Then, NMF determines matrices W and H from V by alternating estimations
where C(AB) is a divergence function between A and B. Lee and Seung [15] introduced the multiplicative update rule for two cost functions—namely, the Euclidean distance and the Kullback–Leibler divergence—and Virtanen [22] subsequently generalized the update rule as
where ⊗ and the fractions represent Hadamard multiplication and elementwise divisions, respectively, \(\nabla _{\mathbf {W}}^{+} C \left (\mathbf {W,H} \right)\), \(\nabla _{\mathbf {W}}^{} C \left (\mathbf {W,H} \right)\), \(\nabla _{\mathbf {H}}^{+} C \left (\mathbf {W,H} \right)\), and \(\nabla _{\mathbf {H}}^{} C \left (\mathbf {W,H} \right)\) are elementwise nonnegative terms that are part of gradients
and C(W,H) is an arbitrary cost function containing a divergence function and additional constraints.
NMF decomposes the spectrogram of the input signal into several basis components in acoustic signal processing, with the number of components usually larger than that of the sources. Recently proposed algorithms apply different constraints to each basis group to handle the basis components, depending on the source signal, as illustrated in Fig. 4.
3.2 Basis estimation
As shown in Figs. 2 and 3, the target echo signal has distinctive characteristics due to reverberation. The target echo signal is as follows:

Is a frequencyshifted replica of the transmitted ping

Has continuous values along the time axis (unlike those of reverberations, which randomly fluctuate)

Is short and limited in the time domain (unlike reverberations, which are persistent)
To estimate each basis group separately, consider that the frequency and time basis matrices consist of the echo and the reverberation basis groups, respectively, as
where \(\mathbf {W_{P}} \in \mathbb {R}_{K \times R_{P}}^{+}\) and \(\mathbf {H_{P}} \in \mathbb {R}_{R_{P} \times N}^{+}\) are the frequency and time bases of the echo basis group, \(\mathbf {W_{R}} \in \mathbb {R}_{K \times R_{R}}^{+}\) and \(\mathbf {H_{R}} \in \mathbb {R}_{R_{R} \times N}^{+}\) are those of the reverberation basis group, respectively, and R_{P} and R_{R} are numbers of echo and reverberation bases, respectively. Then, each basis group should be estimated separately as described below.
Applying Eqs. (13) and (14) into Eq. (6), matrix V can be expressed as
If we define the target echo portion as V_{P}=W_{P}H_{P} and the reverberation portion as V_{R}=W_{R}H_{R} in the magnitude spectrogram, Eq. (15) becomes
Therefore, the strategy of dividing into basis groups can help identify the energy contribution of each group to the spectrogram, if we apply appropriate constraints. This strategy enables easy application of prior knowledge of frequency and temporal structures, so it is widely used in current research for source separation and speech signal denoising problems [28].
3.2.1 Echo bases
As the target echo signal has a frequency similar in structure to the Dopplershifted transmitted ping, the frequency basis matrix consists of Dopplershifted replicas of frequency basis w_{T} of the transmitted ping.
Frequency basis \(\mathbf {w_{T}} \in \mathbb {R}_{K \times 1}^{+} \) and time basis \(\mathbf {h_{T}} \in \mathbb {R}_{1 \times N}^{+} \) are modeled by rank–one NMF as V_{T}≈w_{T}h_{T}, where \(\mathbf {V_{T}} \in \mathbb {R}_{K \times N}^{+} \) is the spectrogram of transmitted ping s_{t}(t). The cost function is defined by the Kullback–Leibler divergence between V_{T} and w_{T}h_{T}, and w_{T} and h_{T} can be iteratively estimated as [15]
where 1_{K×N} denotes a K×N matrix whose elements are all one. Following convergence, the echo frequencybasis matrix W_{P} consists of frequencyshifted replicas of frequency basis w_{T} as follows:
where w_{T,↑d} is a dbinshifted version of w_{T}, and D is the maximum number of bins of the Doppler shift to be observed. As shown in Eq. (19), the number of echo bases R_{P} is 2D+1. Equations (17)–(19) depend only on the transmitted ping and not on the received signal. Hence, the iterative process can be carried out initially and W_{P} is fixed during runtime.
The time basis matrix of the echo signal is estimated using constraints on temporal continuity and temporal length limitation (TLL) to utilize the second and third characteristics of the echo signal, i.e., time continuity and limited duration, respectively. The cost function consists of error terms
where C_{E}(W_{P},H_{P}), C_{T}(H_{P}), and C_{L}(H_{P}) are the costs of the reconstruction error, temporal continuity, and TLL, respectively. α and β are the weights of the costs of temporal continuity and TLL, respectively. The gradient of total cost is the weighted sum of the gradients of each cost:
where we omit the parameters of the cost functions for convenience. Assuming that the derivative of each cost function can be expressed as a sum of positive and negative terms, the total cost gradient is given by
where
and H_{P} is iteratively estimated from (10) as
Reconstruction error C_{E} is defined by the Kullback–Leibler divergence, which is often used in NMF for source separation [15]:
where v_{(k,n)} and [WH]_{(k,n)} are elements of V and WH, respectively, in the kth row (1≤k≤K) and the nth column (1≤n≤N). In the proposed algorithm, V is the spectrogram of received signal s(t)=s_{e}(t)+s_{r}(t). Therefore, the positive and negative terms of the cost function derivative with respect to H_{P} are given by [22]
The temporal continuity constraints are defined as [22]
where h_{p,(r,n)} is the element of H_{P} in the rth row and nth column. The gradient of C_{T} with respect to H_{P} is derived as [22]
where H_{P}_{←1} and H_{P}_{→1} are left and right shifts of H_{P} by one, respectively, 1_{N×N} denotes a N×N matrix whose elements are all one, and H_{P}^{2} means an elementwise square of H_{P}. By grouping the positive and negative terms, \(\nabla _{\mathbf {H_{P}}}^{+}C_{T}\) and \(\nabla _{\mathbf {H_{P}}}^{}C_{T}\) are expressed as [22]
The TLL constraint is defined as
where l_{n} is the expected length of a ping. The maximum value indicator function \(\hat {h}_{p,(r,n)}\) is given as
where \(\bar {h}_{p,(r,n)}\) is the moving sum of each time basis signal calculated as
The individual elements from the positive part \(\nabla _{\mathbf {H_{P}}}^{+} C_{L}\) and negative part \(\nabla _{\mathbf {H_{P}}}^{} C_{L}\) are given by (see Appendix 1)
where [A]_{(r,n)} is the (r,n) element of matrix A.
3.2.2 Reverberation bases
We consider the basis matrices of reverberation as components of the signal that are longer and more fluctuating than the target echo signal. Therefore, the frequency and time basis matrices can be estimated by the multiplicative update rule without additional constraints as
and
3.2.3 Reconstruction of the target echo signal
Following the iterative estimation of the basis matrices for the echo and reverberation, (25), (38), and (39) converge, and the estimated signal of the target echo can be reconstructed from its basis matrices. The spectrogram of the reconstructed echo signal is calculated from these matrices as
To construct a complex spectrogram, phase information is needed. In several NMFbased algorithms, either the phase from the original spectrogram is used [22] or an approach based on the Wiener filter is applied to the original spectrogram [19]. We observed a similar performance by both methods, but prefer the phase of the original spectrogram because it allows the easy construction of the complex spectrogram. The estimated target echo signal \(\hat {s}_{e} (t)\) is then obtained by the inverse shorttime Fourier transform.
The proposed algorithm is summarized in Table 2, where H_{P}, W_{R}, and H_{R} are initialized using nonnegative random values and iteratively estimated, whereas W_{P} is initialized and fixed by the frequencyshifted replicas of the transmitted ping spectrum w_{T}.
3.2.4 Considerations on the convergence of the algorithm
Lee and Seung [15] proved that the cost function is nonincreasing when the multiplicative update processes (Eqs. (9) and (10)) minimize the Kullback–Leibler divergence. Convergence is proved by designing an auxiliary function for the objective functions. The detailed proof can be found in [15, 17] and is briefly reviewed in Appendix 2.
It is difficult to show that convergence is guaranteed when the NMF contains additional constraints because finding an appropriate auxiliary function is difficult. However, we expect that the iterative algorithm converges well if α and β are small, owing to the nonincreasing property of the multiplicative update rules for the Kullback–Leibler divergence. Instead of a theoretical analysis of the convergence condition, we check the convergence of the objective function with varying α and β in the simulation experiments presented in Section 4. We observe that the objective function converges well in most cases except for those with very large values of α and β, such as 10^{4}.
4 Results and discussion
4.1 Simulation with synthesized reverberation
To evaluate the proposed algorithm, we perform simulations with synthesized reverberations. Specifically, a sonar system transmits CW ping signals of 50 ms and moves forward with speed V such that \(\frac {2V}{c}=0.07\). The reverberation is synthesized using the nonRayleigh underwater reverberation model [29]. Moreover, the target echo signal is received at 600 ms and the normalized Doppler (f_{d}/f_{0}) of the target echo is 0.036 for this simulation.
To apply the proposed algorithm, we use the shorttime Fourier transform for the received signal with a 16ms Hamming window, a 75% overlap, and 128 Fourier transform points. The number of basis vectors for the echo R_{P} and reverberation R_{R} of the proposed algorithm are set to 19 and 60, respectively, and the weight factors for temporal continuity and TLL constraints, α and β, are set to 1.0 and 10.0, respectively. The expected ping length l_{n} is set to 50 ms. The proposed algorithm is compared with the PCI [9], autoregressive (AR) prewhitening [10], and SSE [11] algorithms. The length of the time frame and the order of the AR model are set to 64 ms and 20, respectively, and these values are selected as the optimal ones through trial and error. The thresholds for eigenvalues of the PCI and SSE algorithms are set to optimal values, assuming that the power of the reverberation is known.
Figures 5a–f and 6a–f show examples of the basis matrices of the echo estimated by the proposed algorithm with an input signaltoreverberation ratio (SRR) of −12 dB, from r=7 to r=12, respectively. As shown in Fig. 5, the ninth frequency basis vector correspond to the frequency structure of the transmitted CW ping signal, whereas the other basis vectors are shifted from the CW ping in d bins. Figure 6 shows that the 11th time basis vector has large values between 600 and 650 ms, whereas the other basis vectors have relatively small values. As the difference in frequency between adjacent bins is 0.018f_{0} in this experiment, the 11th frequency and time basis vectors corresponded to the target echo. Thus, the graphs show that the proposed constraints are suitable to find the target echo signal.
Figure 7 shows the waveforms of the ideal target echo (ground truth, Fig. 7a), the received signal comprising the ideal target echo and reverberation (Fig. 7b), the output signal obtained using the proposed algorithm (Fig. 7c), the SSE algorithm (Fig. 7d), the AR prewhitening algorithm (Fig. 7e), and the PCI algorithm (Fig. 7f). For the algorithms, we consider the input signal shown in Fig. 7b. Reverberations with amplitudes similar to that of the target echo signal can be seen before applying the algorithm, but the reverberations are removed following the application of the proposed algorithm. Although the SSE and PCI algorithms can also suppress reverberation, parts of the reverberation signals persist upon application of these algorithms.
To evaluate the proposed algorithm for various input SRRs, we measure the output signaltonoise ratio (SNR) using Monte Carlo simulations with 100 iterations per SRR ranging from −20 to −6 dB, with increments of 2 dB. The SNR is calculated as
where s_{e}(n) is the echo signal of the ideal target and \(\hat {s_{e}}(n)\) is the output signal obtained after applying the algorithm. Figure 8 shows the SNR of the proposed, the SSE, and the PCI algorithms. The proposed algorithm achieves an SNR gain of 6 to 15 dB compared to the input SRR and approximately 5 to 7 dB compared to the SSE and PCI algorithms. The SNR results of the AR prewhitening algorithm could not be obtained because the output amplitude of the algorithm is too small. Thus, it is not appropriate to evaluate AR prewhitening algorithm using the SNR values.
Given that reverberation suppression aims to improve target detection, we calculate the rangeDoppler by applying a matched filter to the outputs of the algorithms. To calculate the map, we use the blocknormalized matched filter [9, 11], with the Dopplershifted transmitted ping signal being
where N is the length of a block, \(s_{f_{d}}(n)\) is a Dopplershifted transmitted ping signal with Doppler frequency f_{d}, and x_{i}(n) is the ith block of the received signal. Figure 9a–e show the results of applying the matched filter only and applying the matched filter with the proposed algorithm, the SSE algorithm, the AR prewhitening algorithm, and the PCI algorithm, respectively. Figure 9a shows that the rangeDoppler map has a large value at the location of the target echo, but has many false peaks. Figure 9b shows that the SSE algorithm enhances the target echo, but several false peaks still remain, and Fig. 9c shows that the PCI algorithm retrieves some false peaks over several Doppler frequencies. Figure 9d shows that the AR prewhitening algorithm yields relatively good performance for enhancing the target echo, but several large false peaks persist near the zerometer range, for certain Doppler frequencies. Figure 9e shows that the proposed algorithm determines the true peak corresponding to the ideal target echo with better intensity than for the other peaks. Therefore, it can reduce false positives when applying a threshold to the matched filter during target detection.
To evaluate detection performance, we calculate the probabilities of detection and false alarm after 1000 Monte Carlo simulations per SRR input. A positive is defined as the rangeDoppler bin containing the target, whereas a negative is defined as a bin with no target. When the value of any bin in the rangeDoppler map is greater than a predefined threshold, we define those bins as true positives (TPs) if they are positives, and as false positives (FPs) otherwise. The probabilities of detection and false alarm are respectively defined as
These probabilities are dependent on the threshold, and we thus calculate them for various thresholds to determine the receiver operating characteristic (ROC) curves as shown in Fig. 10.
Figure 10 shows the ROC results at –12 dB, –14 dB, –16 dB, and –18 dB because the detection performance of the algorithms rapidly changed in this region. The graphs show that the proposed algorithm with matched filter enhances detection performance compared with the AR prewhitening, PCI, and SSE algorithms. Under certain false alarm conditions (greater than a probability of 30%), the AR prewhitening algorithm shows slightly higher detection probability than the proposed algorithm, but the difference in the high false alarm conditions are much smaller compared to the difference in the low false alarm conditions. Therefore, the proposed algorithm is considered to be more advantageous than the AR prewhitening algorithm. Figure 11 shows the detection probability with respect to input SRR conditions at a false alarm probability of 1%, and the results show that the proposed algorithm can significantly enhance detection performance at SRR inputs between –18 dB and –8 dB with a low probability of false alarms.
To verify the convergence of the proposed algorithm, we calculate the value of the objective functions during iterative estimation. Figure 12a and b show the Kullback–Leibler divergence between V and WH used to estimate W_{R} and H_{R} (Eqs. (38) and (39)), respectively, and Fig. 12c and d show the objective function, including the costs of temporal continuity and TLL (Eq. (20)), used to estimate H_{P}. The curves are calculated by an ensemble averaged over 1000 Monte Carlo simulations. We check the value of the objective function by varying the SRR input from –20 dB to 0 dB, but the results are barely affected by the input SRR. We thus show the graph corresponding to –10 dB as input SRR only. As shown in Fig. 12a and c, the objective functions do not diverge when the value of α is less than 10^{4}. Figure 12b and d show that the objective functions converge with β less than 10^{4}. We test the convergence of the objective functions under several combinations of α and β, and the results show similar behaviors regardless of the value of each parameter.
Because the computational complexities of the SSE and PCI algorithms depend on the singularvalue decomposition used, it is challenging to compare the complexities of the algorithms. We thus measure the computation times on a laptop with a 3.5GHz CPU and 16 GB of memory. All the tested algorithms are implemented in MATLAB, and the SSE and PCI algorithms use the builtin functions for singularvalue decomposition. For a 1s long signal, the proposed algorithm with 200 iterations requires 0.7 s, whereas the AR and SSE (PCI) algorithms require 0.025 and 0.6 s, respectively. The calculation times of the SSE and PCI algorithms are nearly identical. The proposed algorithm thus requires more calculation time than the conventional algorithms.
4.2 Simulation with measured reverberation
Having verified the performance of the proposed algorithm through a simulation, we test it on reverberation measurements acquired at the Eastern Sea of Pohang in the Republic of Korea from a 1s transmitted CW ping. The reverberation of the CW signal is measured by a nested towed array consisting of 48 sensors moving at four knots.
During this experiment, we assume an input SRR of −12 dB, and the target echo is received after 2 s with a normalized Doppler of 1.0025. Its signal is thus interfering with the reverberation. We apply the shorttime Fourier transform to the received signal with a 133ms Hamming window, 75% overlap, and 2048 transform points. As for the simulations with synthesized reverberation, temporal continuity, and TLL weights, α and β are set to 1.0 and 10.0, respectively, and the expected ping length l_{n} is set to 1 s. The length of the time frame and the order of the AR model are set to 530 ms and 50, respectively, and these values are selected as the optimal ones through trial and error. The thresholds for eigenvalues of the PCI and SSE algorithms are set to optimal values, assuming that the power of the reverberation is known.
Figure 13 shows the waveforms of the ideal target echo (ground truth, Fig. 13a), the received signal comprising the ideal target echo and reverberation (Fig. 13b), the output signal obtained using the proposed algorithm (Fig. 13c), the SSE algorithm (Fig. 13d), the AR prewhitening algorithm (Fig. 13e), and the PCI algorithm (Fig. 13f). It is challenging to identify the target echo from the received signal, whereas it is clearly distinguishable when the proposed algorithm is applied. Although the SSE and PCI algorithms can also suppress reverberation, parts of the reverberation signals persist following its application.
To analyze the proposed algorithm in the frequency domain, we calculate the spectrograms of the received signal and the output signals as shown in Fig. 14. Clearly, the received signal has many undistinguishable components as shown in Fig. 14a, and parts of the reverberation signals persist following the application of SSE and PCI algorithms, as shown in Fig. 14b and c, respectively. The AR prewhitening algorithm suppresses the target echo as well as the reverberation as shown in Fig. 14d, whereas the target echo signal is clearly visible in the spectrogram of the algorithm’s output as shown in Fig. 14e. Therefore, the proposed algorithm can estimate both the Doppler frequency and the temporal location of the target echo.
Figure 15 shows the rangeDoppler maps of the received signal without reverberation suppression and the processed signal using the SSE algorithm, PCI algorithm, AR prewhitening, and the proposed algorithm, respectively. Each rangeDoppler map is determined from the blocknormalized matched filter, as for the simulation of synthesized reverberation. Figure 15a shows significant values in the two Doppler frequency bins, where values at the negative Doppler bin correspond to reverberations and those at the positive Doppler bin to the target echo. The values related to reverberations are large and can lead to target misidentification. Figure 15b shows that the SSE algorithm reduces the peak value corresponding to the reverberation, but several false peaks persist. Figure 15c shows that the PCI algorithm cannot effectively suppress the reverberation peak, and Fig. 15d shows that the result of AR prewhitening has a large value near zero meters. Figure 15e shows that the proposed algorithm suppresses the peak of the reverberation and enhances that of the target echo, thus confirming that it performs well in practice.
5 Conclusion
In this work, the detection performance of the target echo from CW ping signals is enhanced in the presence of reverberations. To this end, a reverberation suppression algorithm is proposed based on NMF and distinguishing characteristics between the target echo and reverberations. Specifically, three characteristics of the target are considered: the frequency structure of the target echo is similar to that of the Dopplershifted transmitted signal, the target echo has timecontinuous values, and the target echo has relatively short and finite duration. To use these characteristics, the frequency bases of the target echo are fixed by frequencyshifted transmitted pings, and constraints on the temporal continuity and TLL are developed.
Experiments are conducted with both simulated and measured reverberations to evaluate the proposed algorithm. The results of the simulations show the ability of the proposed algorithm to find the target echo signal by enhancing SNR from 6 to 15 dB compared to that of the received signal. ROC curves calculated by a Monte Carlo simulation confirm that the proposed algorithm enhances detection probability with a low false alarm rate. The performance of the proposed algorithm is also verified in an ocean by measuring the reverberations. Overall, the results show that the proposed algorithm suppresses reverberations and enhances detection performance in simulations and practice. Expanding our algorithm to multichannel sonar signals and analyzing the convergence of the constrained NMF are expected to be considered in our future work.
6 Appendix 1: Gradients of the TLL constraints
For the TLL constraint, the backward moving sum of each time basis signal is calculated as
where l_{n} is the expected length of a ping, and \(\bar {h}_{p,(r,n)}\) represents the sum of the nth moving block that has a frame index ranging from (n−l_{n}+1) to n. The maximum value indicator function \(\hat {h}_{p,(r,n)}\) is defined as
To penalize all data from each time basis except for the moving block that has maximum energy, the TLL constraint is defined as
The TLL penalty takes values between zero and one per basis r. As shown in (46), the maximum value of \({\sum \nolimits }_{m=n}^{n+l_{n}1} \hat {h}_{p,(r,m)}\) is one, thus limiting the value of the constraint.
Figure 16 shows an example of a simulated TLL constraint on the basis of limited length corrupted by estimation errors. The time basis has a lengthlimited signal from the 21st to the 30th index, as shown in Fig. 16a. The TLL constraint should ideally have large values outside this region because the TLL is designed to penalize the longer signal. The graphs in Fig. 16b–d describe (45)–(47), respectively. The TLL constraint (Fig. 16e) has large values for every time index except for the region of the signal between the 21st and 30th indices and is thus consistent with the purpose of the design.
Gradient \(\nabla _{\mathbf {H_{P}}}C_{L}\) can be determined by the chain rule, but the derivative of (46) is challenging to find. Therefore, we use function softmax instead of max:
The TLL penalty with softmax is also between zero and one for each basis r because
and \(\forall \hat {h}_{p,(r,m)} \ge 0\).
Then, the derivative of C_{L} with respect to h_{p,(r,n)} can be easily determined by the chain rule as
Consequently, each element from \(\nabla _{\mathbf {H_{P}}}^{+} C_{L}\) and \(\nabla _{\mathbf {H_{P}}}^{} C_{L}\) is given by
Figure 16f shows the TLL constraint using softmax, which has a similar structure to that obtained using max, as shown in Fig. 16e. Thus, softmax can be used instead of max to design the constraint for the lengthlimited basis.
7 Appendix 2: Review of convergence analysis of the NMF algorithm
e briefly review the auxiliary function approach to optimizing H with the Kullback–Leibler divergence only (α=β=0). Let C_{E}(h_{(r,n)}) denote the Kullback–Leibler divergence that is a function of h_{(r,n)} with a fixed w_{(k,r)}. An auxiliary function \(C^{+} (h_{(r,n)}, h^{(t)}_{(r,n)})\) for C_{E}(h_{(r,n)}) is defined by a function that satisfies
where \(h^{(t)}_{(r,n)}\) is the updated value of h_{(r,n)} after t iterations (definition 1 in [15]). If we update \(h^{(t+1)}_{(r,n)}\) as
C_{E}(h_{(r,n)}) is nonincreasing during the update because
Lee and Seung [15] and Nakano et al. [17] showed the nonincreasing property of multiplicative update algorithms by designing an auxiliary function that satisfies Eqs. (53) and (54) based on Jensen’s inequality [17] and by deriving the multiplicative update algorithm from Eq. (55).
Abbreviations
 AR:

Autoregressive
 CW:

Continuous wave
 FP:

False positive
 NMF:

Nonnegative matrix factorization
 PCI:

Principal component inverse
 ROC:

Receiver operating characteristic
 SNR:

Signaltonoise ratio
 SRR:

Signaltoreverberation ratio
 SSE:

Signal subspace extraction
 TLL:

Temporal length limitation
 TP:

True positive
References
K. Mio, Y. Chocheyras, Y. Doisy, in Oceans 2000 MTS/IEEE Conference and Exhibition, vol. 2. Space time adaptive processing for low frequency sonar (IEEEPiscataway, 2000), pp. 1315–1319.
Y. Doisy, L. Deruaz, S. van IJsselmuide, S. Beerens, R. Been, Reverberation suppression using wideband Dopplersensitive pulses. IEEE J. Oceanic Eng. 33(4), 419–433 (2008).
T. Collins, P. Atkins, Dopplersensitive active sonar pulse designs for reverberation processing. IEE Proc.Radar, Sonar Navig. 145(6), 347–353 (1998).
J. Alsup, in 33rd Asilomar Conference on Signals, Systems, and Computers, vol. 2. Comb Waveforms for Sonar (IEEEPiscataway, 1999), pp. 864–869.
H. Cox, H. Lai, in Proceedings of 28th Asilomar Conference on Signals, Systems and Computers, vol. 2. Geometric comb waveforms for reverberation suppression (IEEEPiscataway, 1994), pp. 1185–1189.
S. Kay, J. Salisbury, Improved active sonar detection using autoregressive prewhiteners. J. Acoust. Soc. Am. 87(4), 1603–1611 (1990).
V. Carmillet, G. Jourdain, in OCEANS’96 MTS/IEEE Prospects for the 21st Century, vol. 3. Wideband sonar detection in reverberation using autoregressive models (IEEEPiscataway, 1996), pp. 1435–1440.
G. Ginolhac, G. Jourdain, in OCEANS 2000 MTS/IEEE Conference and Exhibition, vol. 2. Detection in presence of reverberation (IEEEPiscataway, 2000), pp. 1043–1046.
G. Ginolhac, G. Jourdain, “Principal component inverse” algorithm for detection in the presence of reverberation. IEEE J. Oceanic Eng. 27(2), 310–321 (2002).
W. Li, X. Ma, Y. Zhu, J. Yang, C. Hou, Detection in reverberation using space time adaptive prewhiteners. J. Acoust. Soc. Am. 124(4), 236–242 (2008).
W. Li, Q. Zhang, X. Ma, C. Hou, Active sonar detection in reverberation via signal subspace extraction algorithm. EURASIP J. Wirel. Commun. Netw. 2010(1), 11 (2010).
W. Burdic, Underwater Acoustic System Analysis (Prentice Hall, NJ, 1991).
D. Lee, H. Seung, Learning the parts of objects by nonnegative matrix factorization. Nature. 401(6755), 788–791 (1999).
P. Smaragdis, J. C. Brown, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, vol. 3. Nonnegative matrix factorization for polyphonic music transcription (IEEEPiscataway, 2003), pp. 177–180.
D. D. Lee, H. S. Seung, in Advances in Neural Information Processing Systems. Algorithms for nonnegative matrix factorization (MIT PressCambridge, 2001), pp. 556–562.
C. Févotte, N. Bertin, J. L. Durrieu, Nonnegative matrix factorization with the ItakuraSaito divergence: with application to music analysis. Neural Comput. 21(3), 793–830 (2009).
M. Nakano, H. Kameoka, J. Le Roux, Y. Kitano, N. Ono, S. Sagayama, in Machine Learning for Signal Processing (MLSP), 2010 IEEE International Workshop On. Convergenceguaranteed multiplicative algorithms for nonnegative matrix factorization with βdivergence (IEEEPiscataway, 2010), pp. 283–288.
H. Kim, H. Park, Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J. Matrix Anal. Appl. 30(2), 713–730 (2008).
A. Ozerov, C. Févotte, Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. 18(3), 550–563 (2010).
E. Vincent, N. Bertin, R. Badeau, in Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference On. Harmonic and inharmonic nonnegative matrix factorization for polyphonic pitch transcription (IEEEPiscataway, 2008), pp. 109–112.
E. Benetos, S. Cherla, T. Weyde, in 6th International Workshop on Machine Learning and Music. An efficient shiftinvariant model for polyphonic music transcription (ACMNew York, 2013).
T. Virtanen, Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria. IEEE Trans. Audio Speech Lang. Process. 15(3), 1066–1074 (2007).
S. Ewert, M. Müller, in Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference On. Using scoreinformed constraints for NMFbased source separation (IEEEPiscataway, 2012), pp. 129–132.
K. W. Wilson, B. Raj, P. Smaragdis, A. Divakaran, in Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference On. Speech denoising using nonnegative matrix factorization with priors (IEEEPiscataway, 2008), pp. 4029–4032.
N. Mohammadiha, P. Smaragdis, A. Leijon, Supervised and unsupervised speech enhancement using nonnegative matrix factorization. IEEE Trans. Audio Speech Lang. Process. 21(10), 2140–2151 (2013).
S. Lee, S. H. Park, K. M. Sung, Beamspacedomain multichannel nonnegative matrix factorization for audio source separation. IEEE Sign. Process. Lett. 19(1), 43–46 (2012).
Y. L. Hao, L. Wang, in Advanced Materials Research, vol. 383. Compressed sensing recognition algorithm for sonar image based on nonnegative matrix factorization and adjacency spectra feature (Trans Tech PublSingapore, 2012), pp. 5818–5823.
C. Joder, F. Weninger, F. Eyben, D. Virette, B. Schuller, in International Conference on Latent Variable Analysis and Signal Separation. Realtime speech separation by semisupervised nonnegative matrix factorization (SpringerBerlin, 2012), pp. 322–329.
D. Abraham, A. Lyons, Simulation of nonrayleigh reverberation and clutter. IEEE J. Oceanic Eng. 29(2), 347–362 (2004).
Acknowledgements
This paper was supported by the Agency for Defense Development (ADD) in Korea (UD160015DD).
Author information
Authors and Affiliations
Contributions
SL and JSL conceived and designed the study. JSL defined the problem and directed the study. SL developed the detailed algorithm and performed the experiments. Both authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Lee, S., Lim, Js. Reverberation suppression using nonnegative matrix factorization to detect lowDoppler target with continuous wave active sonar. EURASIP J. Adv. Signal Process. 2019, 11 (2019). https://doi.org/10.1186/s1363401906086
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1363401906086