In this section, the workflow of the proposed range alignment method is discussed in detail. As mentioned in Section 1, the algorithm contains three steps:

1. Split the full aperture and estimate the target motion parameters in each subaperture based on the CDA and the Levenberg-Marquardt (LM) optimization method.

2. Align the envelopes of the average range profiles (ARPs) of every subaperture using the ACM.

3. Smooth the estimated deviations by means of locally weighted regression (LOESS).
3.1 Estimating the motion parameters in the subapertures
Traditional range alignment methods often focus on the similarity among range profiles without fully exploiting the target motion information. As a result, these algorithms are vulnerable in low signal-to-noise ratio (SNR) scenarios.
In a relatively short observation time, the target translational motion can be regarded as stable movement, i.e., uniformly accelerated motion; thus, it is rational to model the envelope shift as a second-order polynomial with respect to the slow time. With the adoption of the motion information, the algorithm robustness under a low SNR can be considerably improved. However, in real ISAR imaging, hundreds or even thousands of echoes are accumulated, which leads to a long observation time. To use the motion information, we split the full aperture into a certain number of subapertures to obtain a short slow-time span. In each subaperture, optimization based on the minimum entropy principle is performed to estimate the velocity and acceleration of the target, which is discussed at length in the following.
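As a concrete illustration, the aperture splitting described above can be sketched in a few lines of NumPy; the array shapes and the helper name are illustrative choices, not taken from the paper:

```python
import numpy as np

def split_aperture(echoes: np.ndarray, sub_len: int) -> list:
    """Split a full aperture of echoes (slow time x fast time) into
    consecutive subapertures of `sub_len` echoes each.

    Trailing echoes that do not fill a whole subaperture are dropped in
    this sketch; in practice they could be merged into the last one.
    """
    n_sub = echoes.shape[0] // sub_len
    return [echoes[k * sub_len:(k + 1) * sub_len] for k in range(n_sub)]

# e.g. 256 echoes with 128 range samples, split into subapertures of 32 echoes
full = np.zeros((256, 128), dtype=complex)
subs = split_aperture(full, 32)
```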
Assume that every M consecutive echoes of the full aperture are viewed as a subaperture and that the echo in the kth subaperture is \({x_{k}}\left ({{t_{m}},t_{n}} \right)\), where t_{m} is the slow time and t_{n} is the (discrete) fast time. As mentioned above, the envelope shift within one subaperture can be modeled as a second-order polynomial in t_{m} with two unknown parameters, i.e., acceleration a and velocity v:
$$ \begin{aligned} \Phi \left({{t_{m}}} \right) = v{t_{m}} + a{t_{m}}^{2} \end{aligned} $$
(8)
where Φ(t_{m}) represents the envelope shift error.
As (4) indicates, once the optimal values of the unknown parameters are obtained, modulation in the time domain can be carried out to compensate for the range shift:
$$ {}\begin{aligned} {\tilde{x}_{k}}\left({{t_{m}},t_{n}} \right) = {x_{k}}\left({{t_{m}},t_{n}} \right) \cdot \exp \left[ { - j2\pi t_{n}\left({v{t_{m}} + a{t_{m}}^{2}} \right)} \right] \end{aligned} $$
(9)
where \({\tilde {x}_{k}}\) denotes the compensated echo.
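A minimal sketch of the compensation in (9), assuming the subaperture is stored as an M×N complex array with slow time along the rows; the function name and axis conventions are our own:

```python
import numpy as np

def compensate_shift(x_k, t_m, t_n, v, a):
    """Apply the second-order phase modulation of Eq. (9):
    x~_k(t_m, t_n) = x_k(t_m, t_n) * exp(-j 2*pi * t_n * (v*t_m + a*t_m^2)).

    x_k : (M, N) complex subaperture echoes (slow time x fast time)
    t_m : (M,) slow-time axis; t_n : (N,) fast-time axis
    """
    phi = v * t_m + a * t_m ** 2                       # envelope shift, Eq. (8)
    phase = np.exp(-1j * 2.0 * np.pi * np.outer(phi, t_n))
    return x_k * phase
```

With v = a = 0 the modulation is the identity, which gives a quick sanity check.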
To estimate a and v, the minimum entropy method is introduced. As shown in [24], compared with contrast-based methods, this method can attain a good compromise among all kinds of scatterers contained in the echo and yield a globally high-quality image. The entropy of the ARP of the current subaperture is chosen as the metric, and the unknown parameters are estimated by minimizing the entropy.
The HRRP of one subaperture is denoted by
$$ \begin{aligned} HRRP = fft\left\{ {{{\tilde{x}}_{k}}\left({{t_{m}},t_{n}} \right)} \right\} \end{aligned} $$
(10)
where fft{·} represents the FFT operation on t_{n}.
Based on (9) and (10), the ARP of the subaperture can be expressed as:
$$ {}\begin{aligned} ARP &= \frac{1}{M}\sum\limits_{m = 0}^{M - 1} {\left| {fft\left\{ {{{\tilde{x}}_{k}}\left({{t_{m}},t_{n}} \right)} \right\}} \right|}\\ & = \frac{1}{M}\sum\limits_{m = 0}^{M - 1} {\left| {fft\left\{ {{x_{k}}\left({{t_{m}},t_{n}} \right) \cdot \exp \left[ { - j2\pi t_{n}\left({v{t_{m}} + a{t_{m}}^{2}} \right)} \right]} \right\}} \right|} \\&\buildrel \Delta \over = \frac{1}{M}\sum\limits_{m = 0}^{M - 1} {{f_{k}}\left({v,a} \right)} \end{aligned} $$
(11)
where \({f_{k}}\left({v,a} \right) = \left| {fft\left\{ {{x_{k}}\left({{t_{m}},t_{n}} \right) \cdot \exp \left[ { - j2\pi t_{n}\left({v{t_{m}} + a{t_{m}}^{2}} \right)} \right]} \right\}} \right|\) and m is the index of each echo in the subaperture. Equation (11) is thus a one-dimensional real function whose length equals the number of fast-time sampling points N.
According to [18], the entropy of the ARP can be written as
$$ \begin{aligned} E\left(ARP \right) = - \frac{1}{{{S_{arp}}}}\sum\limits_{n = 0}^{N - 1} {{ARP}^{2}\ln {ARP}^{2}} + \ln {S_{arp}} \end{aligned} $$
(12)
where n is the index of the fast-time sampling points and S_{arp} is the intensity of the ARP, namely,
$$ \begin{aligned} {S_{arp}} = \sum\limits_{n = 0}^{N - 1} {{ARP}^{2}} \end{aligned} $$
(13)
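Combining (11) through (13), the entropy metric can be sketched as follows; a small epsilon is added before the logarithm to guard against empty range cells, which is an implementation detail not in the original derivation:

```python
import numpy as np

def arp_entropy(x_k, t_m, t_n, v, a):
    """Entropy of the average range profile, Eqs. (11)-(13)."""
    phi = v * t_m + a * t_m ** 2
    xc = x_k * np.exp(-1j * 2.0 * np.pi * np.outer(phi, t_n))   # Eq. (9)
    arp = np.abs(np.fft.fft(xc, axis=1)).mean(axis=0)           # Eq. (11)
    p = arp ** 2 + 1e-12          # epsilon avoids log(0) in empty cells
    s_arp = p.sum()                                             # Eq. (13)
    return -np.sum(p * np.log(p)) / s_arp + np.log(s_arp)       # Eq. (12)
```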
Combining (11) and (12), it can be seen that the entropy E is a function of the unknown parameters a and v. Therefore, the problem of estimating these parameters can be abstracted to the following form:
$$ \begin{aligned} \left\langle {\hat{a},\hat{v}} \right\rangle = \underset{a,v}{\arg \min } E(a,v) \end{aligned} $$
(14)
where \(\hat {a}\) and \(\hat {v}\) are the estimated values that minimize the entropy.
Equation (14) is a two-dimensional optimization problem. In the proposed algorithm, the CDA is implemented as the optimization solver; it is an iterative method with outer and inner iterations. Each inner iteration (one per unknown parameter) minimizes the objective along a single dimension while fixing the remaining components of the parameter vector at their current values. The outer iteration is not terminated until the criterion on the tolerance of the change in the cost function or the preset maximum number of loops is met [21].
According to [25], by using a proximal point update technique, the CDA can achieve better robustness in solving onedimensional subproblems. Suppose the vector of the unknown parameters is θ and that the CDA procedure is in the pth outer loop and implemented to update the i_{p}th parameter. The CDA updating scheme with the proximal point update can be written as
$$ {}\begin{aligned} \theta_{{i_{p}}}^{p} = \underset{{\theta_{{i_{p}}}}}{\arg \min } \left[ {E\left({{\theta_{{i_{p}}}},\theta_{\ne {i_{p}}}^{p - 1}} \right) + \frac{1}{{2a_{{i_{p}}}^{p - 1}}}\left\| {{\theta_{{i_{p}}}} - \theta_{{i_{p}}}^{p - 1}} \right\|_{2}^{2}} \right] \end{aligned} $$
(15)
where \(\frac{1}{{2a_{{i_{p}}}^{p - 1}}}\left\| {{\theta_{{i_{p}}}} - \theta_{{i_{p}}}^{p - 1}} \right\|_{2}^{2}\) is the so-called quadratic proximal term and \(a_{{i_{p}}}^{p - 1}\) serves as a step size, which can be any bounded positive number. The addition of the quadratic proximal term makes the subproblem objective dominate the original objective around the current iterate and therefore produces increased stability and better convergence properties, especially in the case of non-smooth optimization [25].
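A schematic CDA loop with the proximal term of (15) might look as follows; the crude grid-based line search merely stands in for the LM solver described below, and the proximal weight is an arbitrary bounded positive number, as the text allows:

```python
import numpy as np

def coordinate_descent(cost, theta0, step=1.0, n_outer=20, tol=1e-6):
    """Minimal coordinate-descent sketch for Eqs. (14)-(15).

    `cost` maps a parameter vector to a scalar (e.g. the ARP entropy).
    Each inner step minimizes along one coordinate with the others held
    fixed; the quadratic proximal term of Eq. (15) is added with an
    arbitrary small positive weight `prox`.
    """
    theta = np.asarray(theta0, float).copy()
    prox = 1e-3
    prev = cost(theta)
    for _ in range(n_outer):
        for i in range(theta.size):
            # crude 1-D search over a shrinking bracket around theta[i]
            grid = theta[i] + np.linspace(-step, step, 41)
            vals = [cost(np.where(np.arange(theta.size) == i, g, theta))
                    + prox * (g - theta[i]) ** 2 for g in grid]
            theta[i] = grid[int(np.argmin(vals))]
        step *= 0.5
        cur = cost(theta)
        if abs(prev - cur) < tol:      # outer termination criterion
            break
        prev = cur
    return theta
```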
For the one-dimensional search in the CDA, the LM algorithm, a de facto standard for many nonlinear optimization problems [26], is utilized. In the LM method, the cost function in the neighborhood of the current iterate θ_{i} can be approximated as
$$ {}\begin{aligned} E\left({{\theta_{i}} + \Delta} \right) \approx L\left(\Delta \right) = E\left({{\theta_{i}}} \right) + \frac{{\partial E\left({{\theta_{i}}} \right)}}{{\partial {\theta_{i}}}} \Delta + \frac{1}{2}\frac{{{\partial^{2}}E\left({{\theta_{i}}} \right)}}{{\partial \theta_{i}^{2}}}{\Delta^{2}} \end{aligned} $$
(16)
where Δ is the update value and L(Δ) represents the approximation of E(θ_{i}+Δ).
With two parameters, i.e., the damping parameter λ and the division factor γ, the LM procedure can be summarized as in Algorithm 1, where the initial values of λ and γ are empirically obtained.
It can be seen from Algorithm 1 that the first and second derivatives of the cost functions are needed to complete the LM method, as illustrated in the Appendix.
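A one-dimensional LM update following (16) can be sketched as below; the derivative callables `d1` and `d2` stand in for the closed-form expressions from the Appendix, and the initial λ and γ values are placeholders for the empirically chosen ones:

```python
def lm_1d(cost, d1, d2, theta, lam=1.0, gamma=2.0, n_iter=50, tol=1e-8):
    """One-dimensional Levenberg-Marquardt sketch based on Eq. (16).

    cost : scalar cost function E
    d1, d2 : first and second derivatives of E (supplied by the caller;
             closed forms are derived in the Appendix)
    lam : damping parameter; gamma : division factor
    """
    e = cost(theta)
    for _ in range(n_iter):
        delta = -d1(theta) / (d2(theta) + lam)   # damped Newton step
        if abs(delta) < tol:
            break
        e_new = cost(theta + delta)
        if e_new < e:                # accept the step, relax the damping
            theta, e = theta + delta, e_new
            lam /= gamma
        else:                        # reject the step, increase the damping
            lam *= gamma
    return theta
```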
Assume that the number of subapertures is SN; therefore, there are M·SN echoes in total. The envelope deviations obtained by parameter estimation can be written as
$$ {}\begin{aligned} {\Delta_{\text{sub}}} = {\left[{v_{1}}{\mathbf{t_{s1}}} + {a_{1}}{\mathbf{t_{s1}}}^{2},\ldots,{v_{SN}}{{\mathbf{t_{sSN}}}} + {a_{SN}}{\mathbf{t_{sSN}}}^{2}\right]^{T}} \end{aligned} $$
(17)
where v_{i} and a_{i} (i=1,2,…,SN) denote the estimated parameters of the ith subaperture and \(\mathbf{t_{si}}\) (i=1,2,…,SN) represents the slow-time vector, of length M, of the ith subaperture. Therefore, the total length of the vector Δ_{sub} is M·SN.
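Assembling Δ_sub from the per-subaperture estimates as in (17) is then a simple concatenation; this is an illustrative sketch with made-up parameter values:

```python
import numpy as np

def subaperture_deviations(vs, accs, t_sub):
    """Stack the per-subaperture envelope deviations of Eq. (17).

    vs, accs : estimated velocity/acceleration per subaperture (length SN)
    t_sub    : slow-time vector of one subaperture (length M)
    Returns a vector of length M * SN.
    """
    return np.concatenate([v * t_sub + a * t_sub ** 2
                           for v, a in zip(vs, accs)])

delta_sub = subaperture_deviations([1.0, 2.0], [0.0, 0.0], np.arange(3.0))
```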
3.2 Aligning the ARPs of the subapertures
After the parameter estimation based on the method described in the previous subsection and compensation according to the estimated results, the envelopes in each subaperture have been aligned. However, for the following two reasons, some fine-tuning techniques are required to achieve better alignment.
On the one hand, the parameter estimation method focuses mainly on alignment within each subaperture; as a result, envelope fluctuations remain between different subapertures. On the other hand, after the misalignment is compensated within each subaperture, the processing gain of noncoherent integration can be obtained by averaging all envelopes in a subaperture, which provides useful information for performance enhancement in low-SNR scenarios.
To make full use of the noncoherent integration gain and improve the alignment between subapertures, we apply the ACM (5) to the ARPs of all subapertures. The framework of this fine-tuning technique is described below.
Suppose the estimated motion parameters of the kth subaperture are \(\hat {v}\) and \(\hat {a}\); therefore, the compensated echoes of this subaperture can be written as
$$ \begin{aligned} {x_{ck}}\left({{t_{m}},t_{n}} \right) = {x_{k}}\left({{t_{m}},t_{n}} \right) \cdot \exp \left[ { - j2\pi t_{n}\left({\hat{v}{t_{m}} + \hat{a}t_{m}^{2}} \right)} \right] \end{aligned} $$
(18)
where x_{ck}(t_{m},t_{n}) denotes the aligned echoes of the kth subaperture. The ARP of the kth subaperture can be expressed as
$$ \begin{aligned} h = \frac{1}{M}\sum\limits_{m = 0}^{M - 1} {\left| {fft\left\{ {{x_{ck}}\left({{t_{m}},t_{n}} \right)} \right\}} \right|} \end{aligned} $$
(19)
Again, M is the number of echoes in one subaperture, and fft{·} represents application of the FFT along the fasttime direction.
Assume that the number of subapertures is SN. After the ACM, the SN values of the envelope deviations are obtained, i.e.,
$$ \begin{aligned} {\Delta_{\text{ave}}} = {[{\Delta_{1}},{\Delta_{2}},\ldots,{\Delta_{SN}}]^{T}} \end{aligned} $$
(20)
where Δ_{i}(i=1,2,…,SN) represents the envelope deviation of each subaperture’s ARP and Δ_{ave} denotes the vector of all deviations.
The length of the vector Δ_{ave} should be extended to M·SN when carrying out compensation for each echo. The extended version of Δ_{ave} can be expressed as
$$ \begin{aligned} {\Delta_{\text{ave}}} = {[\overbrace {\underbrace{\Delta_{1},\ldots,\Delta_{1}}_{M},\ldots,\underbrace{\Delta_{SN},\ldots,\Delta_{SN}}_{M}}^{M \cdot SN}]^{T}} \end{aligned} $$
(21)
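The extension in (21) amounts to repeating each subaperture's deviation M times, e.g. with NumPy; the numeric values here are arbitrary:

```python
import numpy as np

# Eq. (21): repeat each subaperture deviation M times so the vector
# matches the M * SN echoes of the full aperture.
delta_ave = np.array([0.5, -1.0, 2.0])   # one deviation per subaperture (SN = 3)
M = 4                                    # echoes per subaperture
delta_ext = np.repeat(delta_ave, M)      # length M * SN = 12
```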
3.3 Total error fitting using locally weighted regression
In the previous two subsections, the envelope deviations in each subaperture and between every two subapertures were obtained through optimization and the ACM, respectively. With the combination of (17) and (21), the total envelope deviations can be expressed as
$$ \begin{aligned} {\Delta_{\text{total}}} = {\Delta_{\text{sub}}} + {\Delta_{\text{ave}}} \end{aligned} $$
(22)
where Δ_{total} denotes the total deviations of the envelopes.
After aligning the ARPs of all subapertures, the misaligned envelopes can in general be calibrated well. To achieve higher performance, some fine-tuning is still required. Figure 1 shows an estimation result for a full aperture's envelope misalignment error, where relatively accurate error estimation is achieved; however, as the enlarged error estimation curve shows, step changes exist between adjacent subapertures, which can undermine the imaging quality. Because each subaperture is aligned as a whole by the method proposed in the previous subsection, these step changes are inevitable.
These step changes can easily be smoothed out by curve-fitting techniques. Following [27] and [11], we use locally weighted regression (LOESS) to smooth the step changes between every two adjacent subapertures. The LOESS procedure is briefly introduced in the following:

1. Suppose that there are N points to be fitted, i.e., [x_{1},…,x_{N}]^{T}. For the ith point x_{i}, take the N·f_{r} (0<f_{r}≤1) nearest points as its neighborhood Ω_{i}.

2. Determine the weights w_{k}(x_{i}), k=1,2,…,N·f_{r}, for the weighted least squares (WLS) fit in the neighborhood of x_{i} using the tricube function.

3. Because the neighborhood is short, it is reasonable to model the points within it as quadratic. After conducting WLS, the ith fitted value can be expressed as \({y_{i}} = {\beta_{0}} + {\beta_{1}}{x_{i}} + {\beta_{2}}x_{i}^{2}\), where β_{i}, i=0,1,2, are the estimated coefficients of the quadratic polynomial.

4. Repeat steps 1–3 for all N points to obtain N fitted values.
After LOESS, accurate envelope deviations can be obtained, and good alignment can be achieved. The whole framework is shown in Fig. 2.
3.4 Optimal selection and computational complexity
This subsection discusses how to choose the number of subapertures and the computational complexity of the proposed algorithm.
By referring to [28] and [29], we develop the following adaptive selection method:
1. Initialize SN. The principle for initializing SN is that the envelope deviation within one subaperture should not exceed half a range unit, i.e., c/4F_{s}, where c is the speed of light and F_{s} is the sampling rate.

2. Implement the minimum entropy optimization. The estimated error of the pth subaperture is ΔR_{p}(t_{m}), where t_{m} represents the slow time of the current subaperture.

3. Double SN and implement the minimum entropy optimization again. The pth subaperture from step 2 is split into two equal subapertures, and the estimated envelope errors in each can be expressed as ΔR_{p1}(t_{m}) and ΔR_{p2}(t_{m}). Concatenated, they are denoted by ΔR_{pNew}(t_{m})=[ΔR_{p1}(t_{m}),ΔR_{p2}(t_{m})], so the length of t_{m} is equal to that in step 2.

4. If the following condition is satisfied, the current SN can be used; otherwise, return to step 3 and repeat.

$$ \begin{aligned} &\max \left(\Delta {R_{p}}({t_{m}}) - \Delta {R_{pNew}}({t_{m}})\right) \\& - \min \left(\Delta {R_{p}}({t_{m}}) - \Delta {R_{pNew}}({t_{m}})\right) \le c/4{F_{s}} \end{aligned} $$
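The selection loop can be sketched as follows, where `estimate_errors` is a hypothetical callable wrapping the minimum entropy optimization of steps 2 and 3:

```python
import numpy as np

def select_subaperture_number(estimate_errors, sn_init, c, fs, max_iter=6):
    """Sketch of the adaptive SN selection above.

    estimate_errors(sn) : hypothetical callable returning the envelope-error
        estimate over the full aperture when split into `sn` subapertures.
    SN is doubled until the spread of the difference between successive
    estimates falls below a quarter range unit, c / (4 * fs).
    """
    sn = sn_init
    err = estimate_errors(sn)
    for _ in range(max_iter):
        err_new = estimate_errors(2 * sn)    # each subaperture split in two
        diff = err - err_new
        if diff.max() - diff.min() <= c / (4.0 * fs):
            return sn
        sn, err = 2 * sn, err_new
    return sn
```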
In addition to the selection scheme above, in real cases, it is also important to jointly consider prior information, such as the motion parameters, the expected gain through accumulated echoes, and the computational burden (a larger number of apertures means a higher computational complexity). Moreover, when the accumulated echoes are insufficient, a certain number of echoes can be reused by two different subapertures. In general, the selection of the number of subapertures requires thorough consideration.
In the remainder of this subsection, the computational complexity is briefly analyzed, with the detailed derivation given in the Appendix. As mentioned above, the proposed algorithm contains three parts. In the first part, we use the CDA to solve the optimization in each subaperture. In each loop, the computational burden is devoted mainly to obtaining the entropy and its first and second derivatives. The computational complexity of these operations is
$$ \begin{aligned} N_{mul}^{OP} \sim \Theta \left({M \cdot N \cdot {{\log }_{2}}N} \right) \end{aligned} $$
(23)
$$ \begin{aligned} N_{add}^{OP} \sim \Theta \left({M \cdot N \cdot {{\log }_{2}}N} \right) \end{aligned} $$
(24)
where \(N_{mul}^{OP}\) and \(N_{add}^{OP}\) represent the numbers of multiplications and additions of the proposed algorithm's optimization procedure, respectively, and M and N denote the numbers of Doppler and range cells of each subaperture, respectively. It can also be seen that the subapertures are independent of each other; thus, parallel programming can be used to execute all optimizations concurrently.
The second step is the ACM in each loop, of which the computational burden arises mainly from obtaining the correlation function between the current profile and the reference. The computational complexity of this step is
$$ \begin{aligned} N_{mul}^{ACM} \sim \Theta \left({SN \cdot N \cdot {{\log }_{2}}N} \right) \end{aligned} $$
(25)
$$ \begin{aligned} N_{add}^{ACM} \sim \Theta \left({SN \cdot N \cdot {{\log }_{2}}N} \right) \end{aligned} $$
(26)
where \(N_{mul}^{ACM}\) and \(N_{add}^{ACM}\) represent the numbers of multiplications and additions of the proposed algorithm’s ACM procedure, respectively, N denotes the number of range cells of each subaperture, and SN is the number of subapertures.
With reference to [30], LOESS also decomposes the problem into independent pieces, with all operations being completely parallel. Therefore, with welldesigned concurrent programming, the computational complexity of the LOESS step is equal to that of a weighted least squares operation with few points.