Deep adaptive temporal network (DAT-Net): an effective deep learning model for parameter estimation of radar multipath interference signals

Accurate parameter estimation in radar systems is critically hindered by multipath interference, a challenge that is amplified in complex and dynamic environments. Traditional methods for parameter estimation, which concentrate on single parameters and rely on statistical assumptions, often struggle in such scenarios. To address this, the deep adaptive temporal network (DAT-Net), a deep learning model designed to handle the inherent complexities and non-stationarity of time series data, is proposed. In more detail, DAT-Net integrates both the pruned exact linear time method for effective time series segmentation and the exponential scaling-based importance evaluation algorithm for dynamic learning of importance weights. These methods enable the model to adapt to shifts in data distribution and provide a robust solution for parameter estimation. In addition, DAT-Net demonstrates the capability to capture inherent nonlinearities in radar multipath interference signals, thereby facilitating the modeling of intricate patterns within the data. Extensive validation experiments conducted across parameter estimation tasks demonstrate the robust applicability and efficiency of the proposed DAT-Net model. The architecture yields root mean squared error scores as low as 0.0051 for single-parameter estimation and 0.0152 for multiple-parameter estimation.

such as the ground, sea, or other structures, before reaching the receiver [2]. This event can result in distorted or misleading information regarding the target's position, velocity, and other characteristics, thereby complicating the interpretation of the signals. As illustrated in Fig. 1, the direct and reflected paths, denoted by A → D and A → B → D, respectively, give rise to such multipath interference, where $h_1$ and $h_2$ represent the heights of the radar antenna and the receiver antenna, respectively. Moreover, the path lengths of AD, AB, and BD are designated by $R_d$, $R_1$, and $R_2$, respectively. The challenge of multipath interference becomes increasingly evident in complex and dynamic environments, highlighting the need for advanced and robust techniques for parameter estimation.
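The geometry of the direct and reflected paths can be illustrated with a minimal sketch. It assumes a flat, perfectly specular reflecting surface and introduces a horizontal antenna separation `d`, which is a hypothetical symbol not used in the text; the function name `two_ray_paths` and the numeric values are likewise illustrative only.

```python
import math

def two_ray_paths(h1, h2, d):
    """Return (R_d, R_1, R_2) for a flat reflecting surface (assumption).

    h1, h2: antenna heights above the surface; d: horizontal separation.
    """
    # Direct path A -> D.
    r_d = math.hypot(d, h2 - h1)
    # Image method: mirroring the transmitter below the surface turns the
    # reflected path A -> B -> D into a straight line of the same length.
    r_reflected = math.hypot(d, h1 + h2)
    # The specular point B splits the reflected path in the ratio h1 : h2.
    r_1 = r_reflected * h1 / (h1 + h2)
    r_2 = r_reflected * h2 / (h1 + h2)
    return r_d, r_1, r_2

# Illustrative values only (arbitrary units).
r_d, r_1, r_2 = two_ray_paths(h1=2.0, h2=7.0, d=12.0)
path_difference = (r_1 + r_2) - r_d
```

The path difference $R_1 + R_2 - R_d$ determines the relative delay and phase between the two arrivals, which is what distorts the received signal.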
Although there exists considerable research on radar multipath interference signals with a focus on aspects such as simulation modeling [3,4] and interference elimination [5,6], studies that specifically address the estimation of parameters for these signals are notably limited. The few existing studies have predominantly used traditional methods to estimate a single parameter of the signal [7,8]. Although these methods provide partial solutions, they struggle to handle complex and dynamic scenarios due to their dependence on statistical assumptions and linear models. Furthermore, their ability to simultaneously estimate multiple parameters is limited, highlighting a notable gap in this crucial area of study.
To address this challenge, the deep adaptive temporal network (DAT-Net) is proposed. It is a deep learning (DL) model designed for the parameter estimation of radar multipath interference signals. Moreover, DAT-Net incorporates advanced techniques for time series segmentation and importance evaluation through the integration of a pruned exact linear time (PELT) approach and an exponential scaling-based importance evaluation (ESBIE) algorithm.
The main innovations and contributions of this paper are summarized as follows:
1. The introduction of DAT-Net. DAT-Net is specifically crafted to manage non-stationary time series data and to accommodate the distribution shifts that can arise from the inherent temporal variability of these data. By strategically tackling these distribution shifts, the model provides a robust solution for parameter estimation, ensuring high adaptability and precision even when the statistical properties of the data change over time.
2. The integration of the PELT method for effective time series segmentation in DAT-Net. This approach identifies the periods of significant divergence in data distribution and thereby ensures the model's adaptability to data from different periods and domains.
3. The development of the ESBIE algorithm for dynamic learning of importance weights in DAT-Net. This mechanism provides a more responsive adjustment of the importance weights according to the changes in the distribution distances, ensuring a controlled and stable learning process.
4. DAT-Net shows the ability to capture inherent nonlinearities in radar multipath interference signals, and thus enables the modeling of intricate patterns within the data. The efficacy of the method is validated, with model performance appraised in both single- and multi-parameter estimation contexts, affirming its robust applicability.
The remainder of the paper is organized as follows: In Sect. 2, the related work is presented, whereas the methodology is proposed in Sect. 3. In Sect. 4, the experimental configuration is introduced. Finally, the conclusion and some future works are given in Sect. 5.

Existing methods for radar signal parameter estimation and their limitations in handling multipath interference
Despite the scarcity of research regarding the parameter estimation of radar multipath interference signals, several studies have contributed significantly to the understanding and implementation of parameter estimation for conventional radar signals. These established methods, fundamental to radar signal processing, serve as an important reference point for our work.
For instance, Liu et al. [9] used the cumulative Wigner-Hough transform (CWHT) for estimating parameters of the linear frequency modulation continuous wave (LFMCW) signal, taking advantage of the signal's periodicity. However, this method was designed for individual LFMCW signals, not for multiple ones. Moreover, Geroleo et al. [10] introduced the periodic Wigner-Ville Hough transform (PWVHT) to detect the LFMCW radar signal and estimate its parameters. This method accommodated multiple pulses within an observation interval at the intercept receiver, extending the accumulation of signal energy and thereby enhancing detection and parameter estimation. In addition, Wen et al. [11] leveraged the focusing capability of the fractional Fourier transform (FRFT) to estimate the pulse width and frequency modulation rate of linear frequency modulation (LFM) signals. This method yields accurate results even under low signal-to-noise ratio (SNR) conditions. Furthermore, Tang et al. [12] presented a method to estimate the direction of arrival (DOA) using a modified spatial time-frequency distribution (STFD) matrix. This approach overcame the limitations of traditional narrowband methods when dealing with wideband signals. Finally, Deng et al. [13] proposed a technique for carrier frequency and code period estimation of polyphase-coded radar signals. Their approach, utilizing the Fourier transform and a modified Choi-Williams distribution, was designed to facilitate parameter estimation in environments with low SNRs. As reported, it achieved high estimation accuracy under these challenging conditions.
Based on the above discussion, it becomes evident that the existing methods, despite their good performance for conventional radar signal parameter estimation, reveal limitations when it comes to more complex and dynamic scenarios. These methods often rely on statistical assumptions and linear models, primarily tailored for specific signal types. Therefore, the requirement to build a distinct mathematical model for each signal type presents a significant limitation, particularly when confronting diverse and dynamically varying signal conditions. For instance, Djemal et al. [14] proposed an adaptive threshold detection approach based on CFAR techniques for radar systems in both homogeneous and non-homogeneous environments. As we shift our attention to signals under multipath interference conditions, the scenario becomes even more challenging. These distorted signals display distinct characteristics and behavior compared to conventional signals, thus requiring the development of new parameter estimation methods.
The limitations of existing methods underscore the need for universal models, such as those built using DL techniques. Their benefits include the capability for nonlinear modeling, flexibility to handle a wide variety of signals, adaptability to changing scenarios, and the ability to learn and adjust their models based on recorded data. In fact, DL methods have shown their effectiveness in several signal processing areas and are well-positioned to address the challenges posed by multipath interference. The exploration of DL techniques, particularly their potential in achieving robust radar signal parameter estimation under multipath interference conditions, offers a promising pathway for future research.

Deep learning for radar signal processing
The progress in radar signal classification has been significantly influenced by the adoption of DL architectures, notably convolutional neural networks (CNNs), owing to their success in image classification tasks. For instance, Sun et al. [15] proposed a technique that leverages a unidimensional convolutional neural network (U-CNN) for radar emitter classification, demonstrating competitive accuracy levels. Similarly, Liu et al. [16] utilized a triplet convolutional neural network (T-CNN) to enhance the identification of different modulations of low probability of intercept (LPI) radar signals, which is particularly effective in harsh electromagnetic environments with low SNRs. To further enhance the performance in radar signal classification, fusion strategies based on CNN architectures have been explored. An approach proposed by Akyon et al. [17] employed two independent CNNs to separately process frequency- and phase-related aspects of radar signals.
Moreover, efforts have also been made to accelerate CNN feature learning. A principal component analysis (PCA)-based CNN architecture, proposed by Ye et al. [18], was developed to reduce the dimensionality of time-frequency distribution (TFD) images. Additionally, techniques such as the convolutional denoising auto-encoder (CDAE) and inception-based deep CNN have been employed to facilitate signal recognition and noise reduction in TFD images [19].
Furthermore, deep time series networks have been employed in radar signal processing. For instance, Zhang et al. [20] utilized recurrent neural networks (RNNs) to classify, denoise, and deinterleave pulse streams, aiming to exploit long-term temporal patterns to enhance processing outcomes. Apfeld et al. [21] used long short-term memory (LSTM) RNNs to identify multifunction radar emitters. Their approach utilized the frequency and agility of emissions to encode radar pulses into symbolic representations. This enabled the discrimination between similar emitter types by analyzing emission parameters and employing resource management techniques. In another study, Notaro et al. [22] applied RNNs to classify radar emitters, capitalizing on temporal dependencies within pulse streams. For this purpose, they introduced two techniques: per-sequence normalization to improve the temporal pattern extraction, and attribute-specific RNN processing to enhance information efficiency. Their approach outperformed previous methods in terms of accuracy and robustness, especially in noisy environments.
These DL-based techniques have shown promising results in radar signal recognition. However, the application of DL methods for radar signal parameter estimation remains limited. Given that radar signals are typical non-stationary time series data, leveraging deep time series networks for radar signal parameter estimation presents an intuitive research direction.

Problem definition
In the context of multi-step time series prediction, we consider the scenario known as temporal covariate shift (TCS) [23]. This concept refers to situations in which the distribution of input data, or features, varies across different time intervals, whereas the conditional distribution of labels, based on these features, remains consistent.
Consider a time series dataset consisting of $n$ segments with corresponding labels. This dataset is denoted as $D_{trn} = \{(x_i, y_i)\}_{i=1}^{n}$ and is used for training. It encompasses an unknown number $K$ of underlying time periods. Within a given period $i$, segments adhere to a specific data distribution, $P_{D_i}(x, y)$. However, for two distinct periods $i$ and $j$ (where $1 \le i \ne j \le K$), the input data distributions diverge, i.e., $P_{D_i}(x) \ne P_{D_j}(x)$. Despite this shift, the conditional distributions of labels given the inputs remain consistent, i.e., $P_{D_i}(y|x) = P_{D_j}(y|x)$, which constitutes a TCS. Figure 2 provides a visual depiction of this scenario. It displays four distinct periods of the time series data. The first three periods belong to the training data, each with its own data distribution, whereas the fourth period represents the test data.
The objectives of this work are twofold: (1) automatically identify the $K$ periods within the training time series data, visually represented by the first three periods in Fig. 2, and (2) construct a predictive model $M$ capable of harnessing the shared characteristics across these periods. This model should provide accurate forecasts for the forthcoming $r$ segments, represented as the fourth time interval in Fig. 2 and formally denoted as $D_{tst} = \{x_j\}_{j=n+1}^{n+r}$. Assuming that the test segments belong to a single time period, these segments exhibit a distinct input data distribution, separate from that of any of the training periods, i.e., $P_{D_{tst}}(x) \ne P_{D_i}(x)$. However, even with the variation in input data distribution, the conditional distribution of labels given the inputs remains consistent across all periods. This is expressed as $P_{D_{tst}}(y|x) = P_{D_i}(y|x)$ for any $1 \le i \le K$.
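To make the TCS assumption concrete, the following toy sketch generates two periods whose input distributions differ while the label rule stays the same. The linear rule, the Gaussian inputs, and the helper name `make_period` are all illustrative assumptions and are unrelated to the radar data used later.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_period(mean, n=2000):
    # Inputs are drawn from a period-specific distribution P(x) ...
    x = rng.normal(loc=mean, scale=1.0, size=n)
    # ... while labels follow the same conditional P(y|x) in every period.
    y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=n)
    return x, y

x_a, y_a = make_period(mean=0.0)
x_b, y_b = make_period(mean=3.0)

# np.polyfit returns coefficients from highest degree to lowest.
slope_a, intercept_a = np.polyfit(x_a, y_a, deg=1)
slope_b, intercept_b = np.polyfit(x_b, y_b, deg=1)
```

The input means of the two periods differ markedly, yet a model fitted on either period recovers nearly the same input-to-label relation, which is the shared knowledge that the method aims to exploit.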

Time series segmentation using the PELT approach
To maximize shared knowledge extraction from a time series in the presence of temporal covariate shift, this study presents an efficient method to identify the periods that exhibit the highest degree of divergence from each other. These periods represent the extreme cases of temporal covariate shift, where the cross-period distributions, denoted as $D_i$ and $D_j$, are the most diverse. As a result, they hold considerable importance for training a predictive model that aims to be robust against distribution shifts, as illustrated in Fig. 2.
A simple approach to time series segmentation involves an even division of the data into a fixed number of parts, $N$, where each part is considered as a minimal-unit period that cannot be further subdivided. Given a set of predefined $K$ values, the optimal $K$ is determined according to a method that maximizes the distribution distance [23]. Nevertheless, this approach has significant limitations. Although it is simple, it does not allow for fine-grained segmentation of time series data. In addition, it may not always result in the optimal solution, especially when dealing with large-scale data sets.
In response to these limitations, this paper recommends the use of the PELT method. This technique, a state-of-the-art changepoint detection algorithm, is recognized for its computational efficiency, precision, and ability to segment time series data. It operates on the primary principle of identifying significant shifts in data distribution, represented as "changepoints." Moreover, it employs dynamic programming to achieve computational efficiency while preserving the accuracy of the changepoint detection. The computational efficiency is achieved through a pruning rule in the dynamic programming process, which discards unnecessary computations, hence the name pruned exact linear time.
More formally, given a time series with $n$ time points, the PELT approach minimizes the cost function described in Eq. (1). In this function, $\tau_1, \tau_2, \ldots, \tau_m$ denote the locations of changepoints, with $\tau_0 = 1$ and $\tau_{m+1} = n + 1$. Moreover, $F(i)$ symbolizes the cumulative sum of model-fitting costs up to time point $i$, and $f(m)$ represents a penalty term proportional to the number of changepoints $m$.
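For orientation, the penalized segmentation objective commonly associated with PELT in the changepoint literature can be written as below. This is a sketch in the boundary convention stated above, not necessarily the exact form of Eq. (1); the per-segment cost symbol $\mathcal{C}$ and the penalty weight $\beta$ are assumed notation.

```latex
\min_{m,\;\tau_1 < \cdots < \tau_m}\;
\sum_{k=1}^{m+1} \mathcal{C}\bigl(x_{\tau_{k-1}}, \ldots, x_{\tau_k - 1}\bigr) + f(m),
\qquad f(m) = \beta\, m .
```

Each term of the sum measures how well a single model fits one segment, so adding a changepoint lowers the fitting cost but raises the penalty by $\beta$.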
By adopting the PELT technique, efficient computation is ensured and the need to pre-specify a range for $K$ is removed. This approach can detect periods of varying lengths, which enhances its adaptability to the data. The output of PELT, in terms of the identified changepoints, enables segmentation of the time series into $K$ periods. Furthermore, the distributions $D_i$ and $D_j$ of each period can subsequently be investigated for further analysis.
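The pruned dynamic program can be sketched in a few lines. The version below is a simplified illustration that swaps in a squared-error (mean-shift) segment cost instead of the RBF cost used in the experiments; the function name `pelt_mean_shift` and the synthetic signal are hypothetical.

```python
import numpy as np

def pelt_mean_shift(signal, penalty):
    """Return sorted changepoint indices using PELT with a squared-error cost."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Prefix sums give the squared-error cost of any segment in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def cost(a, b):
        # Sum of squared deviations from the mean over x[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = np.full(n + 1, np.inf)      # best[t]: optimal penalized cost of x[:t]
    best[0] = -penalty
    last = np.zeros(n + 1, dtype=int)  # last[t]: start of the final segment of x[:t]
    candidates = [0]
    for t in range(1, n + 1):
        values = [best[s] + cost(s, t) + penalty for s in candidates]
        k = int(np.argmin(values))
        best[t] = values[k]
        last[t] = candidates[k]
        # Pruning: discard any s that can no longer start the optimal last segment.
        candidates = [s for s, v in zip(candidates, values) if v - penalty <= best[t]]
        candidates.append(t)

    # Backtrack through the stored segment starts.
    changepoints = []
    t = n
    while last[t] > 0:
        t = int(last[t])
        changepoints.append(t)
    return sorted(changepoints)

# Synthetic signal with two changes in mean plus a small deterministic ripple.
signal = np.concatenate([np.zeros(50), np.full(50, 5.0), np.zeros(50)])
signal = signal + 0.1 * np.sin(np.arange(150))
changepoints = pelt_mean_shift(signal, penalty=10.0)
periods = np.split(signal, changepoints)  # K = len(periods)
```

With the `ruptures` library, a comparable segmentation would typically be obtained with `rpt.Pelt(model="rbf").fit(signal).predict(pen=10)`, which also returns the end index of the final segment.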

Temporal distribution matching enhanced with exponential scaling-based importance evaluation
Once the distinct time periods have been established, a temporal distribution matching (TDM) module is constructed. Its main objective is to extract the shared knowledge across these periods by aligning their distributions. Through this process, the model, symbolized as $M$, is expected to perform better on unseen test data than models that rely solely on local or statistical information.
The prediction loss of TDM, denoted as $\mathcal{L}_{pred}$, is given in Eq. (2), where $(x_i^j, y_i^j)$ represents the $i$th labeled segment from period $D_j$, $\ell(\cdot, \cdot)$ denotes the mean squared error (MSE) loss function, and $\theta$ signifies the learnable model parameters. In the proposed model, $M$ adopts the LSTM architecture, a form of deep time series model. The LSTM, through its specialized architecture, processes time series data by capturing long-range temporal dependencies. Its strength lies in learning from and retaining information over extended periods, which renders it particularly suited for our scenario involving time-dependent features across different periods.
However, minimizing solely the prediction loss, as given in Eq. (2), only captures the predictive knowledge inherent in each individual period and fails to reduce the distributional divergence across the different periods, so the common knowledge shared by these periods is not exploited.
To address this issue, TDM introduces an importance evaluation vector, denoted $\alpha \in \mathbb{R}^V$. This vector is responsible for assessing the relative significance of the $V$ hidden states within the LSTM, each state being weighted by a normalized $\alpha$. This approach dynamically reduces the distribution divergence across periods.
For a pair of periods $(D_i, D_j)$, the loss associated with temporal distribution matching is given in Eq. (3), where $\alpha_{i,j}^{t}$ represents the distributional importance between periods $D_i$ and $D_j$ at state $t$. Furthermore, calculating all the hidden states in an LSTM model is a straightforward process. Let $\delta(\cdot)$ represent the computation of a subsequent hidden state based on a previous state; the state computation is expressed in Eq. (4). Finally, by merging Eqs. (2) and (3), the comprehensive objective of temporal distribution matching is defined in Eq. (5), where $\lambda$ is a trade-off hyper-parameter. The second term calculates the average distribution distance over all pairwise periods. For computational efficiency, we take a mini-batch of $D_i$ and $D_j$ to conduct the forward operation in the LSTM layers and subsequently concatenate all hidden features. The TDM can then be performed using Eq. (5).
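A form of Eqs. (2) to (5) that is consistent with the definitions above can be sketched as follows, assuming that TDM follows the AdaRNN formulation [23] from which these definitions are drawn. The distance symbol $d$, the averaging factors, and the index ranges are assumptions of this sketch.

```latex
\begin{align}
\mathcal{L}_{pred}(\theta) &= \frac{1}{K}\sum_{j=1}^{K}\frac{1}{|D_j|}
  \sum_{i=1}^{|D_j|} \ell\bigl(y_i^j,\, M(x_i^j;\theta)\bigr) \tag{2}\\
\mathcal{L}_{tdm}(D_i, D_j;\theta) &= \sum_{t=1}^{V} \alpha_{i,j}^{t}\,
  d\bigl(h_i^t, h_j^t;\theta\bigr) \tag{3}\\
h_i^t &= \delta\bigl(x_i^t,\, h_i^{t-1}\bigr) \tag{4}\\
\mathcal{L}(\theta,\alpha) &= \mathcal{L}_{pred}(\theta)
  + \lambda\,\frac{2}{K(K-1)}\sum_{i \ne j} \mathcal{L}_{tdm}(D_i, D_j;\theta,\alpha) \tag{5}
\end{align}
```

Under this reading, the factor $2/(K(K-1))$ averages the matching loss over the unordered pairs of periods, which matches the description of the second term as an average over all pairwise periods.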
We therefore propose the ESBIE algorithm to learn the importance weights $\alpha_{i,j}^{t,(n)}$. The schematic representation of the ESBIE algorithm is displayed in Fig. 3. The initial step involves pre-training the network parameter $\theta$ on fully labeled data from all periods, which yields better hidden state representations that help in the learning process of $\alpha_{i,j}^{t,(n)}$. We assign the symbol $\theta_0$ to this pre-trained parameter. Once $\theta_0$ is defined, the ESBIE algorithm is applied to discern the importance of the hidden states. All weights within each LSTM layer are uniformly initialized and denoted $\alpha_{i,j}^{t,(0)} = \{1/V\}^{V}$. To guide the weight updates, we utilize the cross-domain distribution distance. This distance is calculated using the Maximum Mean Discrepancy (MMD) metric [24], recognized for its ability to quantify the disparity between two probability distributions, making it an effective choice for high-dimensional spaces. The ESBIE algorithm follows a nonlinear approach in adjusting weights according to the changes in distribution distance. When the distribution distance increases, indicating larger divergence, the ESBIE algorithm adjusts the importance weights upward to work toward reducing this divergence. Conversely, when the distribution distance decreases, indicating a reduced divergence, the importance weights are adjusted downward.
Fig. 3 ESBIE algorithm in temporal distribution matching
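The importance-weighted matching loss can be illustrated with a small NumPy sketch. It assumes a biased RBF-kernel estimate of the squared MMD with bandwidth parameter `gamma`, and hidden states stored as arrays of shape `(V, batch, hidden)`; these choices and the function names are assumptions for illustration rather than the implementation used in the experiments.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(x, y, gamma=None):
    """Biased estimate of the squared MMD between samples x (n, d) and y (m, d)."""
    if gamma is None:
        gamma = 1.0 / x.shape[1]  # simple bandwidth heuristic (assumption)
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

def tdm_loss(h_i, h_j, alpha, gamma=None):
    """Importance-weighted sum of per-state distances between two periods."""
    distances = np.array([mmd2(h_i[t], h_j[t], gamma) for t in range(len(alpha))])
    return float(alpha @ distances), distances

rng = np.random.default_rng(0)
V, batch, hidden = 4, 64, 8
h_i = rng.normal(size=(V, batch, hidden))
# Hypothetical second period: the same states shifted by an offset that grows with t.
h_j = h_i + 0.5 * np.arange(V)[:, None, None]
alpha = np.full(V, 1.0 / V)  # uniform initialization, 1/V per hidden state
loss, distances = tdm_loss(h_i, h_j, alpha)
```

States whose distributions differ more between the two periods contribute larger distances, and the vector `alpha` decides how strongly each state's distance enters the loss.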
The adopted update rule is given in Eq. (6), in which $G$ denotes the scaling function. Here, $\eta_n = \eta_0 \times e^{-p \times n}$ represents the learning rate at the $n$th epoch, which is exponentially decayed from the initial value $\eta_0$. Moreover, the decay rate is controlled by the decay constant $p$, a hyperparameter that is empirically set. In Eq. (6), $d_{i,j}^{t,(n)} = D(h_i^t, h_j^t; \alpha_{i,j}^{t,(n)})$ represents the distribution distance at time step $t$ in epoch $n$. Finally, the weights are normalized after each update. Therefore, by applying Eqs. (3) and (6), the importance evaluation can be learned.
Furthermore, the ESBIE mechanism provides a more responsive adjustment of the importance weights to the changes in distribution distances. By integrating a learning rate with exponential decay, this approach allows a more controlled and stable learning process. The learning rate's decay throughout epochs ensures a gradual fine-tuning phase in which the weights are adjusted more subtly, enhancing the chances of converging to an optimal solution. By leveraging temporal distribution matching across domains, the model's overall performance is improved when managing data from diverse periods and domains, resulting in a more robust generalization on unseen test data.
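The behavior described for the update can be sketched as follows. The scaling function used here, $G = \exp(\eta_n (d^{(n)} - d^{(n-1)}))$, and the sum-to-one normalization are assumptions chosen to match the verbal description (weights rise when the distance grows and fall when it shrinks); they are not claimed to be the exact form of Eq. (6). The function name `esbie_update` is hypothetical.

```python
import numpy as np

def esbie_update(alpha, d_curr, d_prev, epoch, eta0=0.2, p=1.0):
    """One importance-weight update with an exponentially decayed learning rate."""
    eta_n = eta0 * np.exp(-p * epoch)            # eta_n = eta_0 * exp(-p * n)
    scale = np.exp(eta_n * (d_curr - d_prev))    # assumed scaling function G
    updated = alpha * scale
    return updated / updated.sum()               # assumed normalization to sum 1

V = 4
alpha0 = np.full(V, 1.0 / V)
d_prev = np.array([0.5, 0.5, 0.5, 0.5])
d_curr = np.array([0.9, 0.5, 0.3, 0.5])  # state 0 diverges more, state 2 less

alpha_early = esbie_update(alpha0, d_curr, d_prev, epoch=0)
alpha_late = esbie_update(alpha0, d_curr, d_prev, epoch=5)
```

In this sketch the same change in distance moves the weights less at epoch 5 than at epoch 0, which reflects the gradual fine-tuning effect attributed to the decaying learning rate.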

Dataset: details and specifications
For our experiments, we utilized a dataset derived from the "Radar Signal Simulation Platform under Complex Electromagnetic Environment," located at the Southwest China Research Institute of Electronic Equipment. The platform incorporates a parabolic equation method to generate synthetic radar signal data. This method serves as the basis of radar signal processing, aligning and filtering received echo signals with a reference signal, essentially a duplicate of the transmitted signal. This procedure yields a complex signal exhibiting a parabolic trajectory within the time domain. Simulated echo signals are designed to encompass noise and the Doppler effect to mimic real-world radar signals.
A representation of the simulated scenario is displayed in Fig. 4. The setup takes into consideration the multipath effects induced by specular reflections from the sea surface. These effects distort the received signal as it traverses different routes before reaching the receiver. The receiver, which simulates a flight at an elevation of 8,000 m, follows a circular trajectory, while the emitter maintains a linear path from the origin, at an elevation of 3,000 m, directed toward the center of the circular path. Both the transmission and reception elements are simulated as dipole antennas with vertical polarization. The power of the transmitted signal is set at 100 W, with a transmission gain of 20 dB and a receiver gain of 13 dB. The simulation generates four signal types: LFM, nonlinear frequency modulation (NLFM), binary phase shift keying (BPSK), and quadrature phase shift keying (QPSK). To meet the requirements of the model training, 9000 instances of each signal type were generated and divided into sets of 7000 instances for training, 1000 for validation, and 1000 for testing.

Model parameter settings
The model was implemented using the PyTorch deep learning framework, and the experiments were executed on a machine equipped with an RTX 3080TI graphics card, using CUDA version 11.7 and the Windows 10 operating system.
The PELT method, available in the Python ruptures library, was used for the changepoint detection, with the radial basis function (RBF) model as the cost model. The penalty term was set to 10. In the LSTM model, a two-layer network structure was utilized with a hidden state dimension of 32. The Adam optimizer was used with a learning rate of 0.002. Finally, in the ESBIE algorithm, the initial learning rate $\eta_0$ was set to 0.2 and the decay constant $p$ was set to 1.
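For convenience, the reported settings can be collected in one place, as in the sketch below. The class and field names are hypothetical; only the numeric values and the cost model come from the description above. The helper evaluates the ESBIE learning-rate schedule $\eta_n = \eta_0 e^{-pn}$ for these values.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class DATNetSettings:
    # Field names are illustrative; values are those reported in the text.
    pelt_cost_model: str = "rbf"
    pelt_penalty: float = 10.0
    lstm_layers: int = 2
    lstm_hidden_size: int = 32
    adam_learning_rate: float = 0.002
    esbie_eta0: float = 0.2
    esbie_decay_p: float = 1.0

    def esbie_learning_rate(self, epoch: int) -> float:
        # eta_n = eta_0 * exp(-p * n)
        return self.esbie_eta0 * math.exp(-self.esbie_decay_p * epoch)

settings = DATNetSettings()
schedule = [settings.esbie_learning_rate(n) for n in range(4)]
```

With $p = 1$, the ESBIE learning rate shrinks by a factor of $e$ per epoch, so most of the weight adjustment happens in the first few epochs.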

Application and performance comparison: parameter estimation in radar multipath interference signals
In this study, our model is applied to time-domain data from three distinct sets of simulated radar multipath interference signals, where each set features distinctive patterns of parameter variation.
• In Set 1, all signals (LFM, NLFM, BPSK, and QPSK) have variable bandwidths (BW) while the other parameters remain constant. This requires an estimation of the BW for these signals.
• In Set 2, the pulse width (PW) for LFM and NLFM and the number of sub-pulses (NSP) for BPSK and QPSK are varied while the other parameters are kept constant, thus requiring an estimation of PW for LFM and NLFM, and NSP for BPSK and QPSK.
• In Set 3, a more complex scenario is proposed, in which two parameters vary simultaneously. For LFM and NLFM, we aim to estimate both PW and BW, whereas for BPSK and QPSK, both BW and NSP must be estimated.
The performance of the model in estimating parameters is evaluated using two error metrics: the root mean squared error (RMSE) and the mean absolute error (MAE). The RMSE is a quadratic scoring rule that squares the differences between predicted and observed values, averages them, and takes the square root of the result. It therefore puts more weight on large errors. Conversely, the MAE computes the average absolute difference between predicted and observed values. It provides a linear score that is easy to interpret, as it directly averages the absolute error magnitudes.
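The difference between the two metrics can be seen in a short example. The helper names and the toy values below are illustrative and unrelated to the reported results.

```python
import numpy as np

def rmse(y_true, y_pred):
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(y_true, y_pred):
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(err)))

# Three exact predictions and one large miss of 4 units.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]

rmse_value = rmse(y_true, y_pred)  # sqrt(16 / 4) = 2.0
mae_value = mae(y_true, y_pred)    # 4 / 4 = 1.0
```

A single large error doubles the RMSE relative to the MAE here, which is why the two metrics are reported side by side.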
The evaluation compares the DAT-Net model against XGBoost [25], LSTM, LSTNet [26], and AdaRNN [23]. However, in the case of Set 3, no RMSE and MAE values are reported for the XGBoost model, as it is unable to generate two parameter estimates simultaneously. As a result, these values are represented as "-" in Tables 1 and 2. In more detail, Table 1 (for RMSE) and Table 2 (for MAE) present the results of this comparison. When considering the RMSE and MAE metrics, DAT-Net consistently emerges as the preferred model as it achieves lower error values. This underscores its capability in accurately handling complex temporal dependencies in radar multipath interference signals.
Concerning the single-parameter estimation scenarios (Sets 1 and 2), DAT-Net demonstrates superior performance. It estimates BW accurately across the signal types in Set 1, with one exception: for BPSK, although DAT-Net performs well, AdaRNN achieves slightly lower RMSE and MAE values. Similarly, in Set 2, DAT-Net effectively estimates the PW for LFM and NLFM signals as well as the NSP for BPSK and QPSK signals, achieving lower errors than its counterparts.
Turning to the scenario that involves the simultaneous estimation of two parameters (Set 3), DAT-Net maintains its strong performance, as it consistently achieves lower RMSE and MAE values across all signal types, whereas XGBoost cannot provide any estimates in this set due to its limitations.
As the estimation task shifts from single-parameter scenarios (Sets 1 and 2) to the multi-parameter scenario (Set 3), a notable escalation in error metrics is observed across all models. This increase reflects the complexity inherent in multi-parameter estimation. Nonetheless, this increase is notably less severe for DAT-Net compared to the other models. This result shows, once again, the ability of DAT-Net to tackle the intricacies of multi-parameter estimation with a remarkable level of error control. Despite the increased complexity, DAT-Net manages to keep error growth in check, highlighting its resilience and adaptability across several estimation scenarios.
Upon a detailed examination of the error values, DAT-Net's performance with LFM and NLFM signals consistently results in lower RMSE and MAE compared to the results obtained with BPSK and QPSK signals across all sets. This finding could potentially reflect the differences in signal structures as well as DAT-Net's specific proficiency in handling LFM and NLFM signals. Furthermore, Fig. 5 illustrates the comparative proficiency of DAT-Net and the other models in approximating real values, with a specific emphasis on the BW estimation of QPSK signals from Set 1. This visual evaluation is in line with the conclusions drawn from the error metric analyses, showing clearly that DAT-Net provides a better approximation of the actual values compared to the other models. The close alignment between the fluctuations in the actual data and the outputs produced by DAT-Net reflects its effective learning and adaptation to model the radar multipath interference signals. Conversely, the plots for the other models (XGBoost, LSTM, LSTNet, and AdaRNN) reflect greater deviations from the actual values. These deviations highlight the efficiency of DAT-Net in modeling complex temporal dependencies in radar multipath interference signals.

Convergence speed analysis
As an advancement over AdaRNN, DAT-Net not only enhances estimation accuracy but also improves the efficiency of model training. Evaluating model performance goes beyond the estimation precision to encompass the speed at which these accurate estimations are achieved.
By integrating the ESBIE algorithm, DAT-Net gains a significant advantage in this respect. ESBIE enables a swift decrease in loss during the initial training stages, thereby accelerating the convergence speed. Furthermore, in the later stages of model training, ESBIE contributes to smaller fluctuations in the loss function, promoting stability in the learning process and helping to prevent overfitting.
The improvements in the convergence speed are depicted in Fig. 6a, where the graph shows that DAT-Net reduces the loss faster than AdaRNN during the early phases of training and exhibits lower variation in loss during the later learning stages.
This enhanced convergence speed and stability, especially notable in the context of large datasets and time-critical tasks, further supports DAT-Net's position as an improved model over AdaRNN.
Finally, as shown in Fig. 6b, the proposed DAT-Net architecture demonstrates computational efficiency improvements over AdaRNN. However, it requires longer training times than the traditional LSTM, XGBoost, and LSTNet. The additional time demands of DAT-Net's time series segmentation and TDM components are justified by its superior performance over existing methods across the assessed benchmarks. Once the model training is completed, DAT-Net does not significantly increase computational cost in the inference phase compared to LSTM.

Conclusion
This paper introduced and examined the performance of DAT-Net, a deep learning model specifically designed to manage complex and non-stationary time series data, with a focus on radar multipath interference signals. Notably, DAT-Net's ability to manage various signal types contributes toward addressing the complexities frequently encountered in this research problem. Moreover, DAT-Net incorporates advanced techniques, such as time series segmentation via the PELT method and importance evaluation using the ESBIE algorithm. These methods enhance the model's capability to adapt to shifts in data distribution and offer a robust solution for parameter estimation. The comparative analysis carried out in this work underscored the advantages of DAT-Net, demonstrating its superior precision over other commonly used models in various scenarios. Specifically, DAT-Net's ability to model inherent nonlinearities and non-stationarities in radar multipath interference signals distinguishes it from other approaches.
Future work could aim to further enhance DAT-Net and broaden its applicability to a wider array of scenarios and datasets. Such improvements might include integrating more advanced machine learning techniques or applying the model to other time-dependent signal processing tasks.

Fig. 1 Schematic diagram illustrating multipath interference effects on radar signal propagation over sea surface

Fig. 2 Multi-period time series data distribution under temporal covariate shift

Fig. 4 The schematic diagram of the simulation scenario

Fig. 5 Comparative visualizations of model fits to real values

Fig. 6 Convergence speed and training time

Table 1 Comparison of RMSE for parameter estimation in radar multipath interference signals across various models

Table 2 Comparison of MAE for parameter estimation in radar multipath interference signals across various models