Low Complexity Iterative Receiver Design for Shallow Water Acoustic Channels

An adaptive iterative receiver structure for the shallow underwater acoustic channel (UAC) is proposed using a decision feedback equalizer (DFE) and employing bit-interleaved coded modulation with iterative decoding (BICM-ID) in conjunction with adaptive Doppler compensation. Experimental results obtained from a sea trial demonstrate that the proposed receiver not only reduces the inherent problem of error propagation in the DFE but also improves its convergence, carrier phase tracking, and Doppler estimation. Furthermore, simulations are carried out on a UAC obtained by geometrical modelling of the water column; the modelled channel exhibits Rician statistics and a long multipath spread, resulting in severe frequency-selective fading and intersymbol interference (ISI). It is demonstrated that there is a practical limit on the number of feedback taps that can be employed in the DFE and that data recovery is possible even in cases where the channel impulse response (CIR) is longer than the span of the DFE. The performance of the proposed receiver is within approximately 1 dB of that of a similar system employing a DFE and a turbo code, but at significantly reduced computational complexity and memory requirements, making the system attractive for real-time implementation.


Introduction
The UAC is considered to be one of the most difficult and challenging physical communications media in use today. Unlike in radio frequency (RF) based communications systems, electromagnetic waves do not propagate over long distances through water, and thus acoustic (pressure) waves are employed to carry the information signal through a UAC instead. Acoustic waves propagate at a very low speed of approximately 1500 m/s, and the propagation occurs over multiple paths due to reflections from the surface and bottom of the sea. Hence, the UAC is modelled as a highly time-varying, frequency-selective channel. In practice, the multipath profile of the channel depends on the channel geometry and the density of the propagation medium. In vertical channels the multipath spread is very short; horizontal channels, however, exhibit a multipath spread of hundreds of symbols. Owing to this long multipath spread, the transmitted signal suffers from ISI, which degrades the quality of the received signal and needs to be compensated for before detection. The time-varying nature of the multipath also requires continuous tracking of the receiver parameters used for demodulation. Furthermore, the Doppler effect caused by the relative motion between transmitter and receiver plays an important role due to the wideband nature of the transmitted signal: it expands or compresses the symbol duration in time, depending on the direction of motion, and must be compensated for in order to establish carrier phase and timing synchronization. The combination of these effects poses many challenges to the realization of robust, high data rate communications. Rapidly moving platforms such as autonomous underwater vehicles (AUVs) present a more serious problem.
Compensating for Doppler shifts resulting from relative velocities up to 10 m/s is far beyond the capability of conventional adaptive equalization structures, even with explicit phase tracking loops [1]. These velocities can cause an excessive rate of equalizer tap rotation, and hence, the required convergence rate may lead to instability of the adaptive receiver algorithms.
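To give a sense of scale for the velocities quoted above, one case can be worked through by hand. The relative Doppler factor is written here as $\Delta = v_r/v_0$, the sound speed is the nominal 1500 m/s, and the 12 kHz carrier is borrowed from the lake experiment reported later; pairing that carrier with a 10 m/s platform is an assumption made purely for illustration.

```latex
\Delta = \frac{v_r}{v_0} = \frac{10}{1500} \approx 6.7 \times 10^{-3},
\qquad
f_d = \Delta\, f_c \approx 6.7 \times 10^{-3} \times 12\,\mathrm{kHz} = 80\,\mathrm{Hz}.
```

Under the same assumption, the time scaling accumulates to a timing drift of roughly 6.7 symbol periods over a block of 1000 symbols, which illustrates why phase tracking alone is not sufficient.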
The introduction of the turbo codes [2] has opened a new research area, where researchers are aiming to design iterative or turbo receivers. Each processing block in the traditional receiver outputs binary integer values resulting in the reliability information about the output symbols being lost. The performance of the receiver can be greatly improved if each block of the receiver outputs a posteriori probabilities (APP) or log likelihood ratios (LLR) of the symbols, that is, soft outputs. Much work in the design of soft output algorithms was encouraged by the need to provide soft inputs to the next processing stage. For example, a channel equalizer should generate soft outputs so as to increase the efficiency of the soft input channel decoder. The channel decoder then not only provides APPs of the information bits but also provides APPs of the encoded bits. These APPs, known as extrinsic information, can be used after interleaving by the equalizer as prior probabilities, also known as intrinsic information, for the next iteration. This is the fundamental idea behind the turbo or iterative receiver, that is, the exchange of soft information. The performance of the receiver improves as the number of iterations increases between the blocks of the receiver. Interested readers can refer to [3][4][5][6][7][8] for detailed information on this subject. The first turbo equalizer of its kind was presented by Douillard et al. [9] to combat multipath using the soft output Viterbi Algorithm (SOVA), where soft information is exchanged between the equalizer and decoder. A complete maximum a posteriori-(MAP-) based turbo equalizer was proposed by Bauch et al. 
[10], where it was shown that for a 5-tap channel exhibiting a deep spectral null, the performance of the receiver after 8 iterations between the MAP equalizer and MAP channel decoder is very close to that of the code on a non-ISI channel; however, this is not achievable when the channel is unknown to the receiver and possibly time varying. A low-complexity iterative equalizer structure using the minimum mean square error (MMSE) criterion was proposed by Tuchler et al. [7,11]. The receiver architectures discussed above assume that perfect channel state information (CSI) is available at the receiver, which in most cases is not practical. Moreover, due to the long delay spreads, MAP-based turbo equalization is simply impractical, and the MMSE-based methods likewise have a computational complexity that is beyond the available resources. Recent sea trial experiments [12,13] put emphasis on the application of iterative receiver structures for the UAC. In [12], long-term experimental results were presented in order to look for the correlation between environmental parameters. It was also shown that receiver performance can be improved if the actual noise statistics are taken into account. An application of the message passing (MP) algorithm is demonstrated in [13] in order to perform iterative decoding and estimation of channel model parameters. Another active area of research is bit-interleaved coded modulation with iterative detection (BICM-ID). In fading channels, the performance of an error correcting code depends on the code diversity defined by its minimum Hamming distance. The code diversity in BICM is equal to the smallest number of distinct bits along any error event, and this is achieved by bitwise interleaving at the encoder output prior to symbol mapping.
The application of turbo-coded BICM (turbo BICM) was proposed in [14] in conjunction with an adaptive decision feedback equalizer (DFE), where the structure takes advantage of the extrinsic information provided by the turbo decoder. Since the DFE is a nonlinear device that utilizes previous symbol decisions to eliminate ISI from the current symbol, an erroneous hard decision will propagate through the DFE and degrade the performance when it is used in conjunction with error correction coding (ECC). Most ECC techniques are designed to correct random errors; the DFE, on the other hand, produces errors that are bursty in nature because it relies on delay-free hard decisions (made before decoding) to cancel the ISI. The use of interleavers can convert the burst errors into random errors; thus, a BICM-based receiver not only reduces the error propagation in the DFE but also reduces the error floor introduced by turbo decoding.
The focus of this paper is to provide a robust and low complexity receiver solution for underwater communications. The paper is organized as follows. Section 2 presents the communication system and channel model based on the geometry of the channel. The proposed receiver is explained and compared with an iterative DFE using turbo BICM in Section 3. Section 4 summarizes simulation and experimental results, along with the complexity analysis of both receivers. Finally, conclusions are drawn in Section 5.

System Definition
The communication systems to be investigated contain the generic transmitter depicted in Figure 1. The information bits $b_i \in \{0, 1\}$, in blocks of length $K_d$, are encoded using a recursive systematic convolutional (RSC) encoder to produce $K_c = (K_d + K_0)/R_c$ encoded bits, $c_p$, where $R_c \in (0, 1]$ is the coding rate and $K_0 \geq 0$ is the overhead introduced by the encoder, that is, a termination sequence that sets the final state of the encoder to zero. The random interleaver permutes the encoded bits, and the output bits, $c'_p$, are mapped to a quadrature phase shift keying (QPSK) constellation. The data symbols are then multiplexed with a pseudorandom noise (PN) sequence, whose binary phase shift keying (BPSK) modulated symbols are known to the receiver. The resulting packet is pulse shaped using a transmit filter $g_T(t)$ and then up-converted using carrier modulation. A square-root raised-cosine filter is employed, having approximately 98% of its energy within a bandwidth equal to the symbol rate [15]. The carrier modulated signal is transmitted through the UAC and is corrupted by noise.
[Figure 1: Transmitter block diagram (encoder, interleaver Π, S/P, mapping, training sequence, transmit filter $g_T$, carrier modulation).]

Let us now consider the transmission of the baseband signal $u(t)$ that is modulated onto a carrier of frequency $f_c$. The transmitted signal $s(t)$ can be expressed as

$$s(t) = \mathrm{Re}\left\{u(t)\, e^{j 2\pi f_c t}\right\}, \tag{1}$$

where

$$u(t) = \sum_{k} x_k\, g_T(t - kT), \tag{2}$$

$T$ is the symbol duration, and $x_k$ represents either training symbols or QPSK modulated symbols. The received noiseless signal $r(t)$ in the presence of multipath is then given as

$$r(t) = \mathrm{Re}\left\{\sum_{l} \alpha_l(t)\, u\big((1+\Delta)t - \tau_l\big)\, e^{j 2\pi f_c \left((1+\Delta)t - \tau_l\right)}\right\}, \tag{3}$$

where $\mathrm{Re}\{\cdot\}$ denotes the real part, $\alpha_l(t)$ is the attenuation factor for the $l$th path, $\tau_l$ is the delay associated with the $l$th path, and $\Delta = v_r/v_0$ is the Doppler shift, where $v_r$ denotes the relative velocity between transmitter and receiver and $v_0$ denotes the speed of sound. The received noiseless baseband signal can be written as

$$y(t) = \sum_{l} \alpha_l(t)\, e^{j 2\pi f_c (\Delta t - \tau_l)}\, u\big((1+\Delta)t - \tau_l\big). \tag{4}$$

If we let $\Delta = 0$ and approximate the channel by its equivalent discrete-time baseband model, the transmit filter, channel, and receiver filter are represented by a linear filter with impulse response

$$\mathbf{h}_k = \left[h_{0,k},\, h_{1,k},\, \ldots,\, h_{L-1,k}\right]^T, \tag{5}$$

where $L$ is the number of paths and the complex coefficients $h_{l,k}$ are time varying and unknown to the receiver. The equivalent received baseband signal at time $k$ can be written as

$$y_k = \sum_{l=0}^{L-1} h_{l,k}\, x_{k-l} + w_k, \tag{6}$$

where $w_k$ is complex additive white Gaussian noise (AWGN) with zero mean and variance $\sigma_w^2$ in each dimension, that is, the noise samples are independent and identically distributed (i.i.d.) with the normal probability density function (PDF)

$$p(w_k) = \frac{1}{2\pi\sigma_w^2} \exp\left(-\frac{|w_k|^2}{2\sigma_w^2}\right). \tag{7}$$

It was mentioned earlier that the multipath in the UAC depends on the geometry, and we assume the channel geometry shown in Figure 2. The illustrated channel has uniform depth $D$ and constant sound speed $v_0$. The transmitted signal arrives at the receiver via a direct path, $D_{LOS}$, and via multipath. The multipath signals are grouped into four types according to the form and order of reflection. The notation SS denotes multipath signals whose first and last boundary reflections are from the sea surface before arriving at the receiver [16]. The other paths are defined similarly as SB, BS, and BB. This notation is extended to $SS_n$, $SB_n$, $BS_n$, and $BB_n$, where $n$ is the order of the multipath.
The length of each signal path shown in Figure 2 is given by (8), and the angle of arrival of the acoustic ray by (9). The time delay, $\tau_l$, associated with each path can be calculated by dividing the path length by the speed of sound $v_0$.

Underwater channels are commonly classified as doubly spread channels, implying that the received signal is dispersed both in time and in frequency. A considerable amount of work has been carried out in the past few years in order to characterize the UAC [16][17][18][19][20][21]. The models developed in [18][19][20][21] are derived from data measured in sea trial experiments and provide deeper insight into the channel dynamics. There are two sources of channel variability: inherent changes in the propagation medium, and transmitter and/or receiver motion. Inherent changes range from those that occur on very long time scales to those that occur on short time scales. The former do not affect the instantaneous power level of the communication signal. The latter are induced by surface waves, which displace the reflection point and result in both scattering of the signal and Doppler spreading due to the changing path length.
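The geometric calculation described above can be illustrated with the method of images. The following is a minimal sketch, not a reproduction of (8) and (9): it assumes a flat surface and bottom, measures both depths from the surface, and assumes that $SS_1$ and $BB_1$ are single bounces while $SB_1$ and $BS_1$ are double bounces. The function name `first_order_paths` and the parameter names `R`, `a`, and `b` are hypothetical; the example geometry borrows the range and depths quoted for the lake experiment.

```python
import math

def first_order_paths(R, D, a, b, v0=1500.0):
    """First-order path lengths (m) and delays relative to the direct path (s).

    R: horizontal range, D: water depth, a: transmitter depth below the
    surface, b: receiver depth below the surface (all in metres).
    """
    # Vertical offset between the receiver and the image source of each path.
    vertical = {
        "LOS": abs(a - b),
        "SS1": a + b,            # single surface bounce
        "SB1": 2 * D + a - b,    # surface first, bottom last
        "BS1": 2 * D - a + b,    # bottom first, surface last
        "BB1": 2 * D - a - b,    # single bottom bounce
    }
    lengths = {name: math.hypot(R, dz) for name, dz in vertical.items()}
    t_los = lengths["LOS"] / v0
    delays = {name: length / v0 - t_los for name, length in lengths.items()}
    return lengths, delays

# Example: 500 m range, 50 m depth, transmitter 10 m above the bottom,
# receiver 5 m below the surface (values assumed for illustration).
lengths, delays = first_order_paths(R=500.0, D=50.0, a=40.0, b=5.0)
for name in sorted(delays, key=delays.get):
    print(f"{name}: {lengths[name]:.2f} m, +{1e3 * delays[name]:.3f} ms")
```

Higher-order paths follow by adding further multiples of $2D$ to the vertical offsets, which is why the delay spread grows quickly with the number of reflections considered.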

Proposed Receiver
The system model of the proposed receiver structure is given in Figure 3. In the preprocessing stage, the received signal is first passed through a bandpass filter centered on the operating carrier frequency to remove unwanted low-frequency disturbances such as ship engine noise. After bandpass filtering, the received signal is down-converted to a complex baseband signal by in-phase and quadrature oscillator mixers. In order to re-establish the symbol and phase synchronization lost due to Doppler, sampling rate conversion is performed on the received signal using a low-complexity method such as linear interpolation. The complex baseband signal is matched filtered and, to minimize the distortion introduced by the linear interpolation operation, sampled at 4 times the symbol rate; the result is denoted by $y_{m'}$. Subsequently, the output of the linear interpolator, $r_{n'}$, shown in Figure 3, is downsampled to 2 samples per symbol as required by the equalizer [22,23]. The interpolator output $r_{n'}$ is given by (11), where $m' \in \{1, 3, 5, \ldots\}$, $n' \in \{1, 2, 3, \ldots\}$, and $I_k = 1$ for $k = 1$. Let $\hat{x}_k$ denote the soft output of the DFE at the $k$th symbol, given by (12), where $(\cdot)^H$ is the Hermitian transpose, $\mathbf{w}_{f,k}$ and $\mathbf{w}_{b,k}$ are the feedforward and feedback filters, respectively, and $\tilde{\mathbf{x}}_k$ is the vector containing the previous hard symbol decisions. The interpolation filter $I_k$ of the first-order linear interpolator is updated recursively according to (13), where $K_p$ is a phase tracking constant and $\varphi_k$ is the data-aided phase error measurement given by (14), in which $(\cdot)^*$ denotes complex conjugation.

In [14], a DFE-based receiver was presented that takes advantage of the extrinsic information provided by a turbo decoder: after a fixed number of turbo-decoding iterations, the new extrinsic information is hard limited and fed back to the DFE.
The key idea exploited is that, as the reliability of the extrinsic information increases with the number of iterations, the quality of the symbols fed back into the DFE improves, which in turn reduces error propagation, a key source of performance degradation in a DFE. Another problem associated with the DFE is that there is a practical limit on the number of taps that can be used: as the number of taps increases, a longer training sequence is required for the DFE to converge to its optimum solution. The DFE taps are optimized and updated iteratively using the least mean square (LMS) algorithm in order to keep the implementation complexity low.
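The sampling rate conversion used for Doppler compensation can be sketched as follows. This is an open-loop illustration that assumes the Doppler factor $\Delta$ is already known, whereas the receiver described above adapts its interpolation filter $I_k$ in closed loop; the function name `resample_linear` and the parameter `step` are hypothetical. Under the received-signal model in which the waveform appears as $u((1+\Delta)t)$, the original time base is recovered by reading the received samples at positions $n/(1+\Delta)$.

```python
import numpy as np

def resample_linear(y, step):
    """Read y at fractional positions n*step using first-order interpolation."""
    y = np.asarray(y)
    n_out = int(np.floor((len(y) - 1) / step)) + 1
    t = np.arange(n_out) * step            # fractional sample positions
    i = np.floor(t).astype(int)
    i = np.minimum(i, len(y) - 2)          # keep i + 1 inside the array
    frac = t - i
    return (1.0 - frac) * y[i] + frac * y[i + 1]

# Example: undo a time compression by (1 + delta) on a smooth test signal.
delta = 0.004                              # assumed known Doppler factor
m = np.arange(400)
received = np.cos(2 * np.pi * 0.01 * (1 + delta) * m)
restored = resample_linear(received, 1.0 / (1.0 + delta))
reference = np.cos(2 * np.pi * 0.01 * np.arange(len(restored)))
print(np.max(np.abs(restored - reference)))
```

Linear interpolation is exact only for signals that are locally linear between samples, which is consistent with the text's choice to oversample by a factor of 4 before interpolating.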
The soft symbols, $\hat{x}_k$, are converted into soft bit estimates and deinterleaved before they are passed to the channel decoder. In the turbo BICM transmitter, the encoder in Figure 1 is a parallel concatenation of two or more convolutional codes followed by a bit-by-bit interleaver and a mapper. Unlike turbo BICM, convolutional BICM requires only one encoder and one decoder; therefore, the receiver complexity is greatly reduced. The interleaver permutes the encoder output, and consequently burst errors created by error propagation in the DFE are converted into random errors. Due to the bit interleaver in BICM-ID, the bit-based minimum Hamming distance is maximized; in other words, the code diversity equals the smallest number of distinct bits along any error event, and hence BICM-ID achieves a lower bit error probability in fading channels.
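The effect of the interleaver on burst errors can be demonstrated with a short experiment. This is a sketch under stated assumptions: a pseudorandom permutation of length 4008 (the coded block length used later in the simulations), a single 20-bit error burst at an arbitrary position, and a hypothetical helper `longest_run`.

```python
import numpy as np

def longest_run(flags):
    """Length of the longest run of True values."""
    best = run = 0
    for f in flags:
        run = run + 1 if f else 0
        best = max(best, run)
    return best

rng = np.random.default_rng(0)
K_c = 4008
perm = rng.permutation(K_c)        # interleaver: tx[j] = coded[perm[j]]

errors_tx = np.zeros(K_c, dtype=bool)
errors_tx[1000:1020] = True        # 20 consecutive errors on the channel side

errors_dec = np.empty_like(errors_tx)
errors_dec[perm] = errors_tx       # deinterleave: coded[perm[j]] = tx[j]

print("longest burst before deinterleaving:", longest_run(errors_tx))
print("longest burst after deinterleaving: ", longest_run(errors_dec))
```

The number of errors is unchanged by deinterleaving; only their positions are scattered, which is what allows a decoder designed for random errors to cope with them.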
At the receiver, we assume that the equalizer has removed most of the ISI, so that the soft equalized symbols have a Gaussian distribution. The soft demapper processes the equalized complex symbols $\hat{x}_k$ and the corresponding a priori LLRs $L_a[C_k(i)]$ of the coded bits and outputs the extrinsic LLRs $L_e[C_k(i)]$ [24] defined in (15), where $C_k(i)$ denotes the binary random variable with realizations $c_k(i) \in \{0, 1\}$. Using Bayes' rule and taking the expectation of $p(\hat{x}_k \mid x_k)$ over the symbols $x_k \in \Omega$ that carry the bit value $b \in \{0, 1\}$ in position $i \in \{1, 2, \ldots, m\}$, where $\Omega$ is the set of QPSK symbols and $m = \log_2 M$, yields (16). The first term, $p(\hat{x}_k \mid x_k)$, is computed according to the channel model assuming a Gaussian distribution, as in (17). The second term, $P[x_k \mid C_k(i) = b]$, is computed from the a priori information of the individual bits, as in (18). The extrinsic estimates $L_e[C_k(i)]$ are deinterleaved and applied to the a posteriori probability (APP) channel decoder.
EURASIP Journal on Advances in Signal Processing  By performing iterative decoding, the extrinsic information about the coded bits from the decoder is fed back and regarded as a priori information, L a [C k (i)], at the demapper. During the initial demapping step, the a priori LLRs are set to zero.
After the Doppler correction and equalization, the soft estimates $\hat{x}_k$ are demapped into bit likelihoods using (15)-(18). These bit likelihoods are then deinterleaved and fed to the MAP decoder. The MAP decoder provides not only estimates of the information bits, $b_i$, but also extrinsic LLRs for the coded bits. These extrinsic LLRs are then interleaved and treated as a priori information at the demapper for the next iteration. The proposed BICM-ID-based receiver is different in the sense that the extrinsic information is exchanged directly between the channel decoder and the demapper. In contrast, turbo-based BICM utilizes two channel decoders, which increases both performance and complexity.
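The demapping step described above can be sketched for QPSK as follows. Several choices here are assumptions rather than details taken from (15)-(18): the LLR sign convention $\ln(P[c=1]/P[c=0])$, the Gray labelling in `GRAY_LABELS`, the unit-energy constellation, and the names `demap_extrinsic`, `SYMBOLS`, and `labels`. The Gaussian likelihood and the product of per-bit a priori probabilities follow the description in the text.

```python
import numpy as np

# Unit-energy QPSK points and an assumed Gray labelling (c1, c2) for each point.
SYMBOLS = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
GRAY_LABELS = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])

def demap_extrinsic(x_hat, sigma2, L_a, labels=GRAY_LABELS):
    """Extrinsic LLRs ln(P[c=1]/P[c=0]) for both bits of each equalized symbol.

    x_hat: (N,) complex equalizer outputs.
    sigma2: noise variance per dimension after equalization.
    L_a: (N, 2) a priori LLRs from the decoder (all zeros on the first pass).
    """
    x_hat = np.asarray(x_hat)
    L_a = np.asarray(L_a, dtype=float)
    # log p(x_hat | x) up to a constant, one column per constellation point.
    log_lik = -np.abs(x_hat[:, None] - SYMBOLS[None, :]) ** 2 / (2.0 * sigma2)
    # log P[x] up to a constant: sum of L_a over the bits that equal 1 in the label.
    log_prior = L_a @ labels.T
    metric = log_lik + log_prior
    L_e = np.empty_like(L_a)
    for i in range(labels.shape[1]):
        one = labels[:, i] == 1
        num = np.logaddexp.reduce(metric[:, one], axis=1)
        den = np.logaddexp.reduce(metric[:, ~one], axis=1)
        # Subtract the bit's own a priori LLR so that only extrinsic information remains.
        L_e[:, i] = num - den - L_a[:, i]
    return L_e

# First pass: a priori LLRs are zero.
x_hat = np.array([0.6 + 0.8j, -0.7 + 0.1j])
print(demap_extrinsic(x_hat, sigma2=0.25, L_a=np.zeros((2, 2))))
```

With the Gray labelling assumed here, the two bits separate into the real and imaginary parts, so the extrinsic output does not change when a priori information is supplied. Passing a non-Gray `labels` array removes that separability, which matches the text's remark that non-Gray mapping is what makes demapper iterations worthwhile.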
Here, we take advantage of this extrinsic information: after a fixed number of BICM-ID iterations between the demapper and the channel decoder, the updated extrinsic information is interleaved and hard limited to form the new decision statistic in (19), where $q\{\cdot\}$ is the quantization operation applied to the a priori information. These newly formed hard symbols are treated as a priori information for the next iteration and fed back to the DFE, as shown by the dashed line in Figure 3. The reliability of these new symbols increases as the number of iterations increases, which helps to reduce error propagation in the DFE. In practice, the quantized output of the DFE is used to calculate the error signal in order to update the equalizer taps. In this iterative receiver, however, we use the newly formed hard symbols to calculate the error signal and hence update the equalizer taps, the interpolating factor, and the phase for the 2nd and subsequent iterations between the DFE and the channel decoder.
At the $k$th received symbol, the feedforward $\mathbf{w}_{f,k}$ and feedback $\mathbf{w}_{b,k}$ equalizer coefficients are adaptively updated using the recursive equation (20), where $\mathbf{v}_k$ contains $K_1$ input symbols for the feedforward filter and $K_2$ input symbols for the feedback filter, $K_1$ and $K_2$ are the numbers of feedforward and feedback taps, respectively, and $E(\varepsilon_k^* \mathbf{v}_k)$ represents the cross-correlation function. Since the exact correlation function is not available in practice, we use the LMS estimate $\varepsilon_k^* \mathbf{v}_k$ and average out the noise in the estimate through the recursion (21). In the case of the DFE, if an error is made in the hard decision, then the estimate $\varepsilon_k^* \mathbf{v}_k$ will contain erroneous decisions, which will propagate through the DFE and cause burst errors. If an interleaver is not used, the Log-MAP decoding algorithm will not be able to correct these long burst errors. However, when correct symbol decisions are fed back in the iterative mode, the estimate $\varepsilon_k^* \mathbf{v}_k$ is formed from improved decisions, which in turn reduces error propagation.
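The tap adaptation described above can be sketched as a simple LMS-driven DFE. This is a simplified illustration rather than the receiver of Figure 3: it is symbol-spaced, has no interpolator or phase tracking, and the step size `mu`, the pass-through initialisation, and the names `lms_dfe`, `qpsk_slicer`, and `feedback` are assumptions. The optional `feedback` argument stands in for the hard symbols returned by the decoder in the iterative mode.

```python
import numpy as np

def qpsk_slicer(z):
    """Nearest unit-energy QPSK point (hard decision)."""
    return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)

def lms_dfe(r, K1, K2, mu, training, feedback=None):
    """Symbol-spaced LMS DFE; returns soft outputs and the final tap vector.

    r: received symbol-rate samples; training: known leading symbols;
    feedback: optional decisions from a previous decoding pass, used in
    place of the slicer output once the training symbols are exhausted.
    """
    N = len(r)
    w = np.zeros(K1 + K2, dtype=complex)
    w[0] = 1.0                                   # assumed initialisation
    x_hat = np.zeros(N, dtype=complex)
    past = np.zeros(K2, dtype=complex)           # most recent decision first
    r_pad = np.concatenate([np.zeros(K1 - 1, dtype=complex), np.asarray(r)])
    for k in range(N):
        ff = r_pad[k:k + K1][::-1]               # r[k], r[k-1], ..., r[k-K1+1]
        v = np.concatenate([ff, past])
        x_hat[k] = np.vdot(w, v)                 # w^H v
        if k < len(training):
            d = training[k]
        elif feedback is not None:
            d = feedback[k]
        else:
            d = qpsk_slicer(x_hat[k])
        eps = d - x_hat[k]
        w = w + mu * np.conj(eps) * v            # LMS estimate of E(eps* v)
        past = np.concatenate([[d], past])[:K2]
    return x_hat, w

# Example: two-tap channel, one feedforward and one feedback tap.
rng = np.random.default_rng(1)
x = qpsk_slicer(rng.standard_normal(3000) + 1j * rng.standard_normal(3000))
r = np.convolve(x, [1.0, 0.5])[:len(x)]
x_hat, w = lms_dfe(r, K1=1, K2=1, mu=0.05, training=x[:1000])
print(w)
```

On a second pass, calling `lms_dfe(..., feedback=decoded_symbols)` replaces the slicer decisions with the decoder's symbols in both the feedback filter and the error signal, which is the mechanism the text relies on to limit error propagation.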

Simulation Results.
In this section, we present extensive simulation results for a given geometry and different scenarios, such as static and dynamic frequency selective channel conditions. For the simulation results, it is assumed that the Doppler shift due to relative motion between Tx and Rx is estimated correctly and the resampling operation does not introduce any significant distortion, which leads to the simplified received signal model of (6).
With reference to Figure 2, the parameters for the selected channel geometry are given in Table 1. The rationale behind placing the transmitter near the sea bottom is to reduce the interaction of the transmitted signal with the dynamic surface of the sea.
We consider only first-order multipath reflections, so by substituting $n = 1$ in (10), we can calculate all the path lengths and obtain the delay $\tau_l$ associated with each path by dividing the path length by the speed of sound $v_0$. In order to simulate the multipath channel, we have considered the relative delays of the multipath arrivals with respect to the direct path. The resulting total delay spread of this channel is of the order of 43.4 ms. The delay of each path in terms of symbols can be calculated by multiplying the delay of the path by the symbol rate $r_S$.
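As a worked instance of this conversion, assume the rate of 4 ksps that is quoted for the fading simulations; pairing that rate with the 43.4 ms spread is an assumption made here for illustration.

```latex
N_{\text{spread}} = \tau_{\text{spread}}\, r_S
\approx 43.4 \times 10^{-3}\,\text{s} \times 4000\,\text{symbols/s}
\approx 174\ \text{symbols}.
```

A spread of this size exceeds the combined span of the 31 feedforward and 50 feedback symbol-spaced taps used in the simulations, which is the regime in which the receiver is deliberately tested.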
A block of $K_d = 2000$ data bits $b_i$ is encoded with a rate $R_c = 1/2$ RSC code of constraint length 5 and generator polynomials $[23\ 35]_8$. The encoder appends a terminating sequence of $K_0 = 4$ bits to the information sequence and outputs the encoded sequence $c_p$ of length $K_c = 4008$ bits. The encoded bits $c_p$ are interleaved using a pseudorandom permutation $\Pi(\cdot)$ of length $K_c$. The interleaved bits $c'_p$ are then mapped to QPSK constellation symbols $x_k$. When soft information is exchanged between the channel decoder and the demapper, as in BICM-ID, non-Gray mapping yields better bit error rate (BER) results than Gray mapping, because with Gray mapping the number of constellation points at minimum Euclidean distance is not reduced through a priori knowledge; thus, only a very small performance improvement is expected over the iterations. In contrast, for turbo BICM, Gray mapping gives better performance because the a priori information is exchanged between the two decoders. Besides the data symbols $x_k$, a pseudorandom BPSK training sequence of length 511, known to the receiver, is multiplexed to form the transmitted frame. In order to make a fair comparison between DFE-turbo-BICM and DFE-BICM-ID, appropriate puncturing is used for the turbo codes to match the corresponding rates. Accordingly, we have utilized Gray mapping of QPSK for DFE-turbo-BICM and non-Gray mapping for DFE-BICM-ID. The signal-to-noise ratio (SNR) is defined as

$$\mathrm{SNR} = \frac{E_b}{N_0} = \frac{E_b}{2\sigma_w^2}, \tag{22}$$

where $E_b$ is the energy per bit and $\sigma_w^2$ is the variance of the noise $w_k$ in each dimension. The simulated channel follows Rician characteristics, where the time-varying process is modelled as either uncorrelated or correlated for each channel tap. The correlated fading processes are generated by passing white Gaussian noise through a 3rd-order autoregressive (AR) filter with a cut-off frequency equal to the Doppler spread.
Figure 5 shows the detected impulse response of the simulated time-invariant (TIV) channel sampled at the symbol rate, and its corresponding power delay profile is given in Table 2. In the simulation, the numbers of feedforward (FF) and feedback (FB) taps are 31 and 50, respectively. Both the FF and FB taps are symbol ($T$) spaced. In order to challenge the proposed receiver, the DFE length is selected so that it does not cover the entire span of the CIR. Figure 6 shows the BER of both receivers for the impulse response shown in Figure 5. In all figures showing BER results, the first iteration refers to the conventional receiver, that is, equalization followed by error correction decoding. For subsequent iterations, we have used two turbo iterations per DFE iteration and two BICM-ID iterations per DFE iteration, respectively. We can observe that for the 1st iteration, turbo-coded BICM outperforms BICM by approximately 3 dB at a BER of $10^{-5}$. The reason for this performance difference in the 1st iteration is that the turbo code performs better than BICM due to the Gray mapping employed. For the same reason, DFE-BICM-ID performs worse compared to its first iteration at low SNRs. However, after the 3rd iteration, the BICM-ID-based receiver performs within approximately 1 dB of turbo BICM at a BER of $10^{-5}$.

Figure 7 shows performance results for both receivers when the channel is modelled as a Rician channel and the time-varying process at each tap is modelled as uncorrelated. This scenario can be visualized as all the paths varying independently of each other in an unpredictable fashion. From previous sea trial experiments [19], it has been established that the UAC exhibits Rician characteristics, and the value of the Rice factor, $K_R$, is selected to be 6 for the simulations.
We can see that DFE-turbo-BICM outperforms the proposed receiver during the 1st iteration; however, after the 3rd iteration the BICM-ID receiver is within 1 dB at a BER of $10^{-5}$. Figure 8 shows the performance of both receivers for correlated fading on each tap with a Doppler spread of 40 Hz, corresponding to a normalized Doppler spread of 0.01 at a data rate of 4 ksps. In this scenario, the paths vary independently of each other; however, the gain of each tap at the $(k+1)$th symbol is correlated with its value at time $k$.
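A Rician tap gain of the kind used in these fading scenarios can be generated as sketched below. The sketch makes several assumptions: the Rice factor $K_R = 6$ is taken as a linear ratio of specular to scattered power, the tap is normalised to unit mean power, and the correlated case uses a first-order AR recursion with a hypothetical coefficient `rho` in place of the 3rd-order AR filter described in the text. The function name `rician_tap` is also hypothetical.

```python
import numpy as np

def rician_tap(n, K_R, rho=0.0, rng=None):
    """Unit-mean-power Rician tap gain sequence of length n.

    K_R: Rice factor (linear), specular power over scattered power.
    rho: AR(1) coefficient of the scattered part; 0 gives uncorrelated samples.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Unit-variance complex Gaussian innovations.
    g = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    s = np.empty(n, dtype=complex)
    s[0] = g[0]
    for k in range(1, n):
        # The sqrt(1 - rho^2) factor keeps the scattered part at unit variance.
        s[k] = rho * s[k - 1] + np.sqrt(1.0 - rho ** 2) * g[k]
    specular = np.sqrt(K_R / (K_R + 1.0))
    return specular + np.sqrt(1.0 / (K_R + 1.0)) * s

uncorrelated = rician_tap(10000, K_R=6.0, rng=np.random.default_rng(0))
correlated = rician_tap(10000, K_R=6.0, rho=0.99, rng=np.random.default_rng(0))
print(np.mean(np.abs(uncorrelated) ** 2), np.mean(np.abs(correlated) ** 2))
```

Setting `rho` close to 1 gives slowly varying tap gains, while `rho = 0` reproduces the uncorrelated scenario in which each tap changes independently from symbol to symbol.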
As one of the key contributions of the paper is to shorten the DFE relative to the CIR, it is necessary to investigate the performance of the proposed receiver for different numbers of feedback taps. Figure 9 shows the BER performance of a conventional noniterative receiver, that is, one-time equalization and channel decoding, for three different values of SNR and different lengths of the feedback filter. In Figure 9, 0 taps corresponds to a system with only a linear equalizer, that is, the feedback filter is deactivated. It can be observed that there is a threshold on the number of feedback taps that can be employed: with fewer feedback taps the output of the equalizer contains more residual ISI, whereas with a large number of feedback taps the convergence of the equalizer is slow. Furthermore, a longer feedback filter can lead to severe error propagation, and the performance then depends entirely on how well the interleaver translates the burst errors into random errors. Figure 10 shows the performance of the proposed receiver for different lengths of the feedback filter for the 2nd and 3rd iterations. It is therefore established that, as long as the feedforward filter covers a significant portion of the CIR, a short feedback filter can be employed to cancel the ISI from past symbols.

Complexity Analysis.
We now consider the computational complexity of both receivers. We follow the conventional approach, where the complexity of the receiver is measured in terms of the total number of computational operations such as additions and multiplications. From the simulation results we can clearly see that DFE-turbo-BICM gives slightly better performance than DFE-BICM-ID. However, it is in receiver complexity that DFE-BICM-ID outperforms DFE-turbo-BICM. The parameters were kept identical in order to make a fair comparison of both systems. Here we have used the Log-MAP algorithm for channel decoding instead of the MAP algorithm due to its reduced complexity. Table 3 shows the complexity of the Log-MAP decoding algorithm for a single decoder of the $(n_c, k_c)$ convolutional code with memory $m_c$. In our case, $n_c = 2$, $k_c = 1$, and $m_c = 4$. We can see that the turbo code has to perform 780 additions and 128 multiplications, as it has two Log-MAP decoders, whereas BICM-ID has to perform only 390 additions and 64 multiplications. Apart from these computational requirements, other costs, such as the delay caused by interleaving and deinterleaving, the memory required to store the extrinsic information provided by both decoders, and the puncturing operation required at the receiver to combine the extrinsic information of uncoded and coded bits, are in principle higher for DFE-turbo-BICM than for DFE-BICM-ID.

Experimental Results.
The performance of the proposed receiver structure was evaluated by processing offline signal recordings acquired during a practical experiment conducted in Lake Windermere. The range between Tx and Rx was 500 m, and the water column depth was approximately 50 m. The transmitter was positioned 10 m above the bottom of the lake, pointing horizontally towards the receiver. The signal design parameters were kept identical to those of the simulated system, using a bandwidth of 4 kHz centered around a carrier frequency of 12 kHz. The transmitted source level was approximately 176 dB re 1 µPa @ 1 m. The transmitter was connected to a standard digital-to-analog (D/A) card in a PC containing files with encoded and modulated payload data packets. The receiver was placed 5 m below the surface of the water. It was connected to an analog-to-digital (A/D) card in a PC, and the signal was continuously sampled at 48 kHz for offline processing. The channel impulse response was obtained by correlating the known PN sequence with the down-converted baseband signal. Figure 11(a) illustrates the peaks of the channel impulse responses for all the received packets over a duration of one minute. It can be clearly seen that the channel was highly time varying. Figures 11(b)-11(d) illustrate the channel impulse, magnitude, and phase response, respectively, for one of the packets. The average multipath spread ($\tau_{av}$) of the channel can be given as [15]

$$\tau_{av} = \frac{1}{P_t} \sum_{l=0}^{L-1} P_l\, \tau_l, \tag{23}$$

where $P_l$ is the power associated with path $l$ and the total power is $P_t = \sum_{l=0}^{L-1} P_l$. The rms delay spread ($\tau_{rms}$) of the channel can be given as

$$\tau_{rms} = \sqrt{\frac{1}{P_t} \sum_{l=0}^{L-1} P_l\, \tau_l^2 - \tau_{av}^2}. \tag{24}$$

Using (23) and (24), the observed multipath exhibits a $\tau_{av}$ of 1.4 ms and a $\tau_{rms}$ of 1.1 ms. The receiver used to demodulate the data consisted of a DFE with 32 feedforward taps ($T/2$-spaced) and 10 feedback taps. The first few packets were decoded error free in the 1st iteration itself, without employing adaptive Doppler correction. However, many packets resulted in a BER of 0.5 because synchronization was lost due to relative motion between transmitter and receiver.

Figures 12 and 13 show the output of the DFE-BICM-ID receiver after the 1st and 4th iterations, respectively. In each plot, the equalized I-Q constellation, the mean square error (MSE) convergence curve, the phase tracking, and the velocity estimate are illustrated. The MSE performance index, $J$, is defined based on the minimum mean square error (MMSE) criterion as $J = E\{|\varepsilon_k|^2\} = E\{|x_k - \hat{x}_k|^2\}$. In the absence of ISI, the minimum of the performance index $J$ is related to the noise energy $N_0$ as [15]

$$J_{\min} = \frac{N_0}{1 + N_0}. \tag{25}$$
The steady-state signal to interference plus noise ratio (SINR) $\gamma_\infty$ and $J_{\min}$ are related by

$$\gamma_\infty = \frac{1 - J_{\min}}{J_{\min}}. \tag{26}$$

Experimentally, $\gamma_\infty$ may be obtained from $\varepsilon_k$ in decision-directed mode using (27), where $E\{|x|^2\}$ is the average symbol energy and $E\{|x_k - \hat{x}_k|^2\}$ has been replaced by its sample average estimate. The SINR after the 1st iteration obtained using (27) is approximately 7.3 dB, and the mean velocity estimate is approximately 0.2 m/s. Comparing Figure 12 with Figure 13 clearly demonstrates that the iterative process not only improves the SINR by approximately 2.7 dB but also improves the MSE, phase tracking, and velocity estimate. Figure 14 illustrates the probability density function (pdf) of the equalized output, $p(\hat{x} \mid x)$, obtained from the experimental data after the 1st and 4th iterations, in Figures 14(a)-14(b) and Figures 14(c)-14(d), respectively. We can observe that the experimental pdf obtained after both the 1st and the 4th iterations is evidently Gaussian in shape. However, after the 4th iteration, the mean estimates of the pdf that correspond to the transmitted constellation improve, while the variance of the noise scatter around the constellation points is reduced, which supports our claim that iterative equalization removes residual ISI to a great extent and improves the SINR. Finally, Figure 15 shows the average output SINR over a period of 50 consecutive packets. Despite the considerable variations in channel conditions and the nonnegligible Doppler, successful error-free detection was achieved throughout the entire transmission period.
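The channel and receiver metrics used in this section can be computed from recorded data along the following lines. This is a sketch under stated assumptions: the delay statistics use the power-weighted mean and the power-weighted standard deviation of the path delays, and the SINR estimate is taken as the ratio of average symbol energy to the sample-average squared error, which is one plausible reading of the estimator described above. The function names `delay_spread` and `sinr_db` are hypothetical.

```python
import numpy as np

def delay_spread(P, tau):
    """Power-weighted mean delay and rms delay spread of a multipath profile."""
    P = np.asarray(P, dtype=float)
    tau = np.asarray(tau, dtype=float)
    P_t = P.sum()                                   # total power
    tau_av = (P * tau).sum() / P_t
    tau_rms = np.sqrt((P * (tau - tau_av) ** 2).sum() / P_t)
    return tau_av, tau_rms

def sinr_db(x_ref, x_hat):
    """Output SINR estimate in dB from reference symbols and equalizer outputs."""
    x_ref = np.asarray(x_ref)
    x_hat = np.asarray(x_hat)
    mse = np.mean(np.abs(x_ref - x_hat) ** 2)       # sample-average squared error
    energy = np.mean(np.abs(x_ref) ** 2)            # average symbol energy
    return 10.0 * np.log10(energy / mse)

# Example with made-up values: three paths and a noisy QPSK block.
print(delay_spread(P=[1.0, 0.5, 0.25], tau=[0.0, 1.0e-3, 3.0e-3]))
rng = np.random.default_rng(0)
x = (rng.choice([-1, 1], 1000) + 1j * rng.choice([-1, 1], 1000)) / np.sqrt(2)
noise = 0.2 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
print(sinr_db(x, x + noise))
```

In decision-directed operation the reference symbols `x_ref` would be the hard decisions (or the decoder's fed-back symbols) rather than the transmitted data, so the estimate is only as reliable as those decisions.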

Conclusion
In this paper, we have proposed a receiver structure employing an adaptive DFE and BICM-ID in conjunction with an adaptive Doppler compensation technique. The objective was to investigate the performance of the system when the DFE does not cover the entire span of the CIR. The shallow water channel was simulated based on a given geometry for short-range communication, which produces a large delay spread, and was modelled as a Rician multipath fading channel. Simulations were carried out for static and dynamic channels, and the proposed DFE-BICM-ID receiver was compared with a more complex system employing a DFE receiver and turbo BICM. It was shown that the proposed receiver performs within approximately 1 dB of the DFE-turbo-BICM system, and that the DFE-BICM-ID receiver is more feasible to implement in real time due to its lower memory and complexity requirements. Furthermore, it has been established that there are upper and lower limits on the number of feedback taps that can be employed. Experimental results demonstrated that, in a highly dynamic channel, the proposed receiver not only reduces intersymbol interference and error propagation in the DFE but also improves the SINR by approximately 2 dB. It was also shown that the iterative receiver gives better Doppler estimates, thus improving the interpolation and phase tracking. The encouraging results and the reduced implementation complexity make the proposed iterative receiver an attractive solution for a robust, high data-rate underwater acoustic modem.