EURASIP Journal on Applied Signal Processing 2005:20, 3206–3219 © 2005 F. Lombardini and F. Gini

Model Order Selection in Multi-baseline Interferometric Radar Systems

Synthetic aperture radar interferometry (InSAR) is a powerful technique to derive three-dimensional terrain images. Interest is growing in exploiting the advanced multi-baseline mode of InSAR to solve layover effects from complex orography, which cause the reception of unexpected multicomponent signals that degrade imagery of both terrain radar reflectivity and height. This work addresses problems related to incorporating nonlinear algorithms for estimating the number of signal components into interferometric processing, including a system trade-off analysis. The performance of various eigenvalue-based information-theoretic criteria (ITC) algorithms is numerically investigated under realistic conditions. In particular, speckle effects from surface and volume scattering are taken into account as multiplicative noise in the signal model. Robustness to leakage of signal power into the noise eigenvalues and operation with a small number of looks are investigated. The issue of baseline optimization for detection is also addressed. The use of diagonally loaded ITC methods is then proposed as a tool for robust operation in the presence of speckle decorrelation. Finally, case studies of a nonuniform array are presented, and recommendations for a proper combination of ITC methods and system configuration are given.


1. INTRODUCTION
Synthetic aperture radar interferometry (InSAR) is a powerful and rapidly expanding technique to derive digital height maps of the land surface from radar images, with high spatial resolution and accuracy [1,2]. The surface height is estimated from the phase difference between two complex SAR images, obtained by two sensors slightly separated by a cross-track baseline. The InSAR technique is finding many applications in radar remote sensing, for example, topographic and urban mapping, geophysics, forestry, hydrology, glaciology, siting for cell phones, and flight simulators [1,2]. Accurate measurement of radar reflectivity is useful for vegetation and snow mapping, forestry, land-use monitoring, agriculture, soil moisture determination, mineral exploration, and again for hydrology and geophysics [3].
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
However, conventional single-baseline InSAR suffers from possible layover phenomena, which show up when the imaged scene contains highly sloping areas, for example, mountainous terrain, or discontinuous surfaces, such as cliffs and buildings [1,2]. The received signal is the superposition of the echoes backscattered from several terrain patches, which are mapped into the same slant-range/azimuth resolution cell but have different elevations (see Figure 1). In these conditions, the height map produced by conventional InSAR is affected by severe bias and inflated variance, and the height and reflectivity of the multiple layover terrain patches cannot be separately retrieved. Recently, it was suggested that baseline diversity, originally proposed to reduce the problems of interferometric phase ambiguity and data noise (see [4,5] and references therein), can also be exploited to solve layover (see, e.g., [6]). In fact, a multi-baseline (MB) interferometer has resolving capability along the elevation angle. Conventional beamforming has been tried for this application in [7], but it is not the ultimate solution: resolution limitations apply to advanced single-pass airborne MB systems [8], to planned single-pass MB distributed interferometers based on satellite formations [9], and to repeat-pass MB systems [4] alike. A step in the direction of an effective layover solution with multi-baseline InSAR (MB-InSAR) is the use of modern spectral estimation techniques, such as adaptive [10] or model-based methods, to obtain better resolution than the Rayleigh limit and reduced masking effects [6,11,12]. However, the problem of model order selection (MOS) in MB-InSAR imaging is still somewhat overlooked in the literature [13], despite the fact that the correct determination of the number of signal components is a critical prerequisite for good operation of model-based signal-subspace methods [14].
This work constitutes a first step toward addressing some problems related to the implementation of MOS eigenvalue-based information-theoretic criteria (ITC) methods in a practical MB-InSAR system for radar imaging of layover areas. The ITC methods considered here are the Akaike information criterion (AIC), the minimum description length (MDL), and the efficient detection criterion (EDC) [15,16,17]. All these parametric detection methods have been conceived for line spatial spectra, which is the case with point-like targets. Therefore, in the presence of speckle from extended natural targets, modeled as complex correlated multiplicative noise, they are mismatched to the actual data model, and leakage of signal power into the noise eigenvalues (EVs) is expected [18]. In this framework, the novelties of this work are (i) to analyze the impact of speckle noise due to surface scattering from locally flat terrains or to volume scattering from rough terrains; (ii) to investigate the classical baseline optimization problem in the new context of estimating the number of terrain patches, as a trade-off between resolution and speckle decorrelation; (iii) to analyze performance in the realistic scenarios of a small number of available looks and possible strong scattering from the layover patches, which can cause increased leakage of signal power into the noise EVs; (iv) to investigate the use of diagonally loaded ITC methods for robust operation in the presence of speckle decorrelation, leakage of signal power, and the small number of looks regime; (v) to link the case of MOS with realistic nonuniform array structures to the area of multisource identifiability problems [19], and to analyze a noncritical case of a nonuniform dual-baseline array, typical of advanced airborne or formation-based spaceborne systems.

2. STATISTICAL DATA MODEL AND PROBLEM FORMULATION
The MB system is modeled as a cross-track array of K two-way phase centres, which for ease of analysis can be assumed to be linear and orthogonal to the nominal radar line of sight after local phase aligning (deramping) [7], see Figure 1. As usual in SAR interferometry, in each radar image we consider N independent and identically distributed looks [1,2]. For each look, the complex amplitudes of the pixels corresponding to the same imaged area on ground, observed in the K SAR images, are arranged in the K × 1 vector y(n). The observed vectors can be modeled as [20,21]

y(n) = Σ_{m=1}^{N_s} √τ_m [a_m ⊙ x_m(n)] + v(n),  n = 1, . . . , N,  (1)

where ⊙ is the Hadamard (element-by-element) product and N_s is the number of terrain patches in layover; we assume that N_s ≤ K − 1 (N_s = 2 in Figure 1). Parameter τ_m denotes the mean pixel intensity contribution from the mth patch (texture in the radar jargon). Vector a_m = a(ϕ_m) is the steering vector pertinent to the mth source; it encodes the various interferometric phases at the MB array due to the imaging geometry. Parameter ϕ_m is the interferometric phase at the overall baseline; it is related to the elevation angle ϑ_m by

ϕ_m ≅ 4πλ^{−1} B (ϑ_m − ϑ),  (2)

where λ is the radar wavelength, B is the overall orthogonal baseline length, and ϑ is the nominal (line-of-sight) off-nadir incidence angle [1]. Note that for fixed λ, ϑ, and ϑ_m, ϕ_m is proportional to the baseline B. The steering vector is given by a_m = [1 e^{jϕ_m B_12/B} e^{jϕ_m B_13/B} · · · e^{jϕ_m B_1K/B}]^T, which in general can be nonuniformly spatially sampled; B_1l is the orthogonal baseline between phase centres 1 and l, and B_1K = B is the overall baseline. The multiplicative noise term x_m(n) is the speckle vector pertinent to the mth terrain patch in isolation. Considering homogeneous terrain patches, speckle is fully developed and can be modeled as a stationary, circular, complex Gaussian-distributed random process; its spatial autocorrelation function is triangular shaped when only the baseline decorrelation effect from locally smooth terrain is considered [1,2].
Figure 2: Geometry of the layover problem: surface illumination from the SAR impulse response, projected onto the cross-range elevation axis (example with two adjacent layover sources; z: height axis, y: ground-range axis, r: slant range, R: slant-range resolution, h_i: source height).

The autocorrelation linearly decreases for increasing baseline between the two spatial samples, reaching zero for the critical baseline value, which is given by

B_Cm = λ r tan(ϑ − δ_ym)/(2R),  (3)
where r is the slant range, R the slant-range resolution, and δ_ym the local slope of the mth patch [1]. To model additional volumetric decorrelation for locally Gaussian-distributed topography (rough patch), a Gaussian-times-triangular autocorrelation function is assumed in this paper, as formalized in the sequel. Vectors x_m1(n) and x_m2(n) are assumed to be independent for m_1 ≠ m_2, since they collect scattering from different terrain patches. Vector v(n) models the additive white Gaussian thermal noise (AWGN). The data spatial power spectral density (PSD) is the Fourier transform of the spatial autocorrelation. It corresponds to the profile of the backscattered power as a function of the elevation, for the N_s terrain patches "illuminated" through the Sinc SAR slant-range impulse response [21,22]. For a homogeneous and smooth patch (triangular autocorrelation), the corresponding spatial PSD component is squared-Sinc shaped [21]. Data model (1) neglects the truncation of the tails of the illumination function, thus it is slightly approximated for neighboring flat terrain patches (see Figure 2); for rough terrain patches, the approximation is good only for nonneighboring patches.¹

The problem of multi-baseline layover solution is therefore equivalent to the problem of jointly estimating the number N_s of signal components and the N_s interferometric phases {ϕ_m} and radar reflectivities {τ_m} [6,7,11,12] in the presence of multiplicative correlated noise with unknown PSD and AWGN [20]. The problem of layover solution can be divided into two subproblems: (i) estimating the number of sources, which is the so-called detection problem or model order selection problem; (ii) retrieving the parameters of each single component, which is the estimation problem. The final appeal for the user of an MB layover solution processing chain strongly depends on the automatic estimation of N_s, and the accuracy of the overall layover solution depends on the successful determination of the number of signal components. In particular, most of the reported good properties of model-based signal-subspace methods are valid only if the assumed model order is the correct one. Also when non-model-based (possibly adaptive) spectral estimation methods are employed, the model order has to be selected in the height/reflectivity map reconstruction stage of the layover area from the continuous elevation profiling [7,10].

¹ In more complex layover scenarios, terrain patches may be nonhomogeneous, possibly including one or more predominant point-like scatterers. For a nonhomogeneous terrain patch without predominant scatterers, speckle is still fully developed but reflectivity is not constant within the patch. The corresponding signal component would exhibit a spatial PSD that is a weighted version of that described above, the elevation-varying reflectivity in the patch constituting the weighting factor. For a nonhomogeneous terrain patch with predominant point-like scatterers, speckle would not be fully developed because of the additive deterministic contributions, the N looks would not be independent and identically distributed, and the spatial PSD would exhibit line components in correspondence to the elevations of the predominant scatterers.
The focus in this paper is on system and estimation problems for the retrieval of the number of overlaid terrain patches N_s from the observation of the MB data {y(n)}_{n=1}^{N}, with N the number of looks. In this framework, the intensities and interferometric phases of the patches, the autocorrelation matrices of the corresponding speckle vectors, and possibly the thermal noise power are modeled as unknown deterministic parameters. This formulation of the detection problem shows that it is equivalent to the problem of estimating the order of a multicomponent signal composed of multiple complex exponentials corrupted by correlated complex Gaussian multiplicative noise with unknown power spectral shape, embedded in AWGN.
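To make the data model concrete, the following minimal sketch generates one look according to model (1). The function names (`steering_vector`, `mb_insar_look`) and the input conventions are ours, not the paper's; the speckle vectors x_m(n) are supplied by the caller.

```python
import numpy as np

def steering_vector(phi_m, baselines, B):
    """a_m = [1, exp(j*phi_m*B_12/B), ..., exp(j*phi_m*B_1K/B)]^T.
    `baselines` holds the orthogonal baselines B_1l from phase centre 1,
    so baselines[0] = 0 and baselines[-1] = B (the overall baseline)."""
    return np.exp(1j * phi_m * np.asarray(baselines, dtype=float) / B)

def mb_insar_look(phis, taus, speckle, sigma_v2, baselines, B, rng):
    """One look of model (1): y = sum_m sqrt(tau_m) * (a_m . x_m) + v,
    with . the Hadamard product and v circular AWGN of power sigma_v2."""
    K = len(baselines)
    y = np.zeros(K, dtype=complex)
    for phi_m, tau_m, x_m in zip(phis, taus, speckle):
        y += np.sqrt(tau_m) * steering_vector(phi_m, baselines, B) * x_m
    # additive white Gaussian thermal noise, power sigma_v2
    v = np.sqrt(sigma_v2 / 2) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))
    return y + v
```

For a ULA with overall baseline B and K phase centres, `baselines` is simply `[l * B / (K - 1) for l in range(K)]`.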

3. MODEL ORDER SELECTION METHODS
Estimation of the number of components in MB-InSAR data in the presence of layover is an atypical detection problem, because of the presence of multiplicative noise. In fact, the most extensively used methods, based on information-theoretic criteria, have been conceived to estimate the number of signal components in the presence of additive white noise only (see [15] and references therein, and the theory summarized in the sequel). In this case, the dimension of the signal subspace is N_s, provided that the N_s steering vectors {a_m} are linearly independent; this is always the case for uniform linear arrays (ULA) [19] and sources within the unambiguous phase range [21,22]. The K − N_s smallest eigenvalues of the data covariance matrix are all equal [15], so the MOS problem is equivalent to estimating the multiplicity of the smallest eigenvalue of the data covariance matrix. In the presence of multiplicative noise, the EV spectrum broadens [18]. Consequently, the ITC methods operate under model mismatch. We want to investigate the effects of this model mismatch and possible cures for it.
We consider here four ITC methods: two of them are based on the AIC and MDL criteria [23], the other two on the EDC criterion [17]. All these algorithms consist of minimizing a criterion over the hypothesized number m of detectable signals, for m = 0, 1, . . . , K − 1. To construct these criteria, a family of probability densities f(y | χ(m)) is needed, where χ is the vector of parameters which describe the model that generated the data y; it is a function of the hypothesized number m of sources. The criteria are composed of the negative log-likelihood function of the density f(y | χ̂(m)), where χ̂(m) is the maximum likelihood estimate of χ under the assumption that m components are present, plus an adjusting term, the penalty function p(η(m)), which is related to the number η(m) of degrees of freedom (DOF):²

ITC(m) = −ln f(y | χ̂(m)) + p(η(m)).  (4)

The number of components N_s is estimated as

N̂_s = arg min_m ITC(m),  m = 0, 1, . . . , K − 1.  (5)

The introduction of a penalty function is necessary because the negative log-likelihood function always achieves a minimum for the highest possible model dimension. Therefore, the adjusting term must be a monotonically increasing function of m, and it should be chosen so that the algorithm is able to determine the correct model order. The choice of the penalty function is the only difference among AIC, MDL, and EDC. Akaike [24] introduced the penalty function

p(η(m)) = η(m),  (6)

so that the AIC is an unbiased estimate of the Kullback-Leibler distance between the true density and f(y | χ̂(m)). Two different approaches were taken by Schwarz [25] and Rissanen [26]. Schwarz utilized a Bayesian approach, assigning a prior probability to each model, and selected the model with the largest a posteriori probability. Rissanen used an information-theoretic argument: one can think of the model as an encoding of the observation; he proposed choosing the model that gives the minimum code length.
In the large-sample limit, both approaches lead to the same criterion:

p(η(m)) = (1/2) η(m) ln N,  (7)

where N is the number of independent observations of the data vector y (in our InSAR problem it is the number of looks). MDL is a particular case of the EDC procedure. EDC is a family of criteria developed by statisticians at the University of Pittsburgh [16,17], chosen such that they are all consistent:

p(η(m)) = η(m) C_N,  (8)

² DOF is the number of real independent parameters in χ.
where C_N can be any function of N such that

lim_{N→∞} C_N/N = 0,  lim_{N→∞} C_N/(ln ln N) = ∞.  (9)

The EDC is implemented here by choosing C_N = ln N (EDC_1) and C_N = (N ln N)^{1/2} (EDC_2). For the statistical data model of Gaussian data with a line spectrum in AWGN, typical in sensor array processing applications, the handy expression for the negative log-likelihood function is [23]

−ln f(y | χ̂(m)) = N(K − m) ln{ [(1/(K − m)) Σ_{i=m+1}^{K} λ̂_i] / [Π_{i=m+1}^{K} λ̂_i]^{1/(K−m)} },  (10)

where λ̂_1 ≥ λ̂_2 ≥ · · · ≥ λ̂_K are the eigenvalues in descending order of the estimated data covariance matrix. Thus, as hinted at the beginning of this section, the solution of the MOS problem by ITC methods relies on a particular uniformity test on the eigenvalues of the covariance matrix of the array data, to detect the number of the smallest constant ones. Their uniformity is measured by the ratio of the geometric and arithmetic means of the K − m smallest eigenvalues, as from (10).
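The eigenvalue-based order selection above can be sketched compactly as follows. Normalization conventions for the penalties differ across references, so the constants below (AIC: η, MDL: η ln N / 2, EDC: η C_N) follow our reading of the text, and the function names are ours.

```python
import numpy as np

def neg_log_likelihood(eigvals, m, N):
    """N(K-m) * ln(arithmetic/geometric mean) of the K-m smallest eigenvalues."""
    tail = np.sort(np.asarray(eigvals))[::-1][m:]   # K-m smallest eigenvalues
    arith = tail.mean()
    geom = np.exp(np.log(tail).mean())
    return N * len(tail) * np.log(arith / geom)

def itc_order(eigvals, N, penalty):
    """Minimize -ln f + p(eta(m)) over m = 0..K-1, with eta(m) = m(2K-m)+1 DOF."""
    K = len(eigvals)
    costs = [neg_log_likelihood(eigvals, m, N) + penalty(m * (2 * K - m) + 1, N)
             for m in range(K)]
    return int(np.argmin(costs))

# penalty functions p(eta, N) for the four criteria
aic  = lambda eta, N: float(eta)
mdl  = lambda eta, N: 0.5 * eta * np.log(N)
edc1 = lambda eta, N: eta * np.log(N)                 # C_N = ln N
edc2 = lambda eta, N: eta * np.sqrt(N * np.log(N))    # C_N = (N ln N)^(1/2)
```

With a clear gap between signal and noise eigenvalues, all four criteria recover the number of dominant components; they differ only in how strongly large model orders are penalized.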
When the multi-baseline array is nonuniform (non-ULA), we derive the {λ̂_i}_{i=1}^{K} from the unstructured sample covariance matrix estimate (forward-only averaging, F-only)

R̂_Y = (1/N) Σ_{n=1}^{N} y(n) y^H(n).  (11)

When the array is ULA, it can be convenient to use the structured forward-backward (FB) averaging covariance estimate, accounting for the Toeplitz form of the true covariance matrix [15]:

R̂_FB = (1/2)(R̂_Y + J R̂_Y^* J),  (12)

where J is the so-called exchange matrix, which has ones on the main antidiagonal [21,22]. FB averaging is essentially a way of preprocessing the data which preserves the desired information and removes to some extent unwanted perturbations (noise) by effectively doubling the number of observations of the data vector. However, FB averaging significantly changes the statistical properties of the noise, introducing noise correlation [15, Section 7.8], [27]. Consequently, when FB averaging is employed, the ITC methods must be changed to correctly account for this preprocessing of the data. Concerning the DOF expression, this has been derived by Wax and Kailath [23] for F-only covariance matrix estimation and, again, the standard model of Gaussian data with a line spectrum in AWGN:

η(m) = m(2K − m) + 1.  (13)

Xu et al. [27] solved the problem of how the ITC detection tests should be modified to account for the use of the FB covariance matrix. AIC, MDL, and EDC are applicable modifying the number of DOF as [15]

η_FB(m) = (1/2) m(2K − m + 1) + 1.  (14)

As regards the performance of these criteria using the F-only covariance matrix, Zhao et al. showed that, under the data assumptions of the standard model [23], MDL is consistent and generally performs better than AIC [17]. They also showed that AIC is not consistent and tends to overestimate the number of sources as N goes to infinity. The EDC criteria are all consistent [28]. As concerns the performance using the FB covariance matrix, Xu et al. [27] showed that MDL-FB is consistent (as MDL is), whereas AIC-FB is not (as AIC is not). The assumption of whiteness of the additive noise is critical for the ITC methods.
If the noise covariance matrix is not proportional to the identity matrix, the noise eigenvalues are no longer all equal. The effect of noise eigenvalue dispersion on AIC and MDL performance has been studied by Liavas and Regalia [29]. They showed that when the noise eigenvalues are not clustered sufficiently closely, the AIC and the MDL may lead to overestimating N_s. For fixed dispersion, overmodeling becomes more likely as the number of data samples increases [30]. Undermodeling may happen in cases where the signal and the noise eigenvalues are not sufficiently separated into two clusters [29]. In the InSAR application the additive noise is white, yet model mismatch problems are expected from the presence of multiplicative noise.
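The two covariance estimates discussed above can be sketched as follows (the FB estimate applies to ULA geometries); the function names are ours.

```python
import numpy as np

def sample_covariance(Y):
    """F-only estimate: R = (1/N) * sum_n y(n) y(n)^H, with Y of shape K x N."""
    return Y @ Y.conj().T / Y.shape[1]

def fb_covariance(Y):
    """Forward-backward estimate R_FB = (R + J R* J) / 2; J is the exchange
    matrix (ones on the main antidiagonal), suited to ULA (Toeplitz) arrays."""
    R = sample_covariance(Y)
    J = np.fliplr(np.eye(R.shape[0]))   # exchange matrix
    return 0.5 * (R + J @ R.conj() @ J)
```

By construction R_FB is persymmetric (J R_FB^* J = R_FB) as well as Hermitian; this extra structure is what the modified DOF count of Xu et al. accounts for.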

4. PERFORMANCE AND TRADE-OFF ANALYSIS
Numerical analysis of the estimation accuracy of the various ITC methods in the InSAR application has been carried out by Monte Carlo simulation, generating 10 000 multilook pixel vector realizations according to model (1). The speckle vectors {x_m(n)} have been generated assuming a triangular-times-Gaussian shaped spatial autocorrelation function:

r_xm(u, v) = (1 − B_uv/B_Cm) exp[−(2πB_uv/B_Cm)²/s²]  (15)

for B_uv/B_Cm ≤ 1, and r_xm(u, v) = 0 otherwise; B_uv is the orthogonal baseline between phase centres u and v, and B_Cm is the critical baseline for the mth speckle term. For a ULA system, B_uv = |l|B/(K − 1), with l = u − v the array lag for phase centres u and v. The term s² = 2R² cos²(ϑ)/σ_s² is a smoothness parameter; σ_s²/R² is the vertical variance of the scatterers in the sensed scene in slant-range resolution units [22]. The true spatial PSD of the mth speckle term can be expressed by assuming a ULA configuration and Fourier transforming the nontruncated autocorrelation sequence r_xm(l), that is, allowing l > K − 1. When s → ∞ the autocorrelation sequence is triangular shaped and the corresponding spatial PSD is a discrete squared Sinc [20,21], as mentioned in Section 2.
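The Monte Carlo generation of the correlated speckle vectors can be sketched as below. The exact placement of the smoothness parameter s inside the Gaussian factor is our reading of the triangular-times-Gaussian autocorrelation, and the function names are ours.

```python
import numpy as np

def speckle_autocorr(l, K, B_over_Bc, s):
    """Triangular-times-Gaussian autocorrelation at ULA lag l = u - v:
    B_uv/B_Cm = |l| (B/B_Cm)/(K-1); zero beyond the critical baseline.
    The Gaussian factor models volumetric decorrelation (s -> inf: flat patch)."""
    rho = abs(l) * B_over_Bc / (K - 1)
    if rho > 1:
        return 0.0
    return (1.0 - rho) * np.exp(-(2.0 * np.pi * rho) ** 2 / s ** 2)

def speckle_looks(K, N, B_over_Bc, s, rng):
    """N i.i.d. looks of a zero-mean circular Gaussian speckle vector with the
    above autocorrelation, colored through a square root of its covariance."""
    C = np.array([[speckle_autocorr(u - v, K, B_over_Bc, s) for v in range(K)]
                  for u in range(K)])
    w, V = np.linalg.eigh(C)
    L = V @ np.diag(np.sqrt(np.clip(w, 0.0, None)))   # covariance square root
    z = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    return L @ z
```

Each column of the returned K × N array is one look of x_m(n), with unit variance per element and the prescribed spatial correlation.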

Eigenvalue leakage from multiplicative speckle noise
It is known that the performance of ITC methods degrades when errors in real systems affect the separation between signal and noise EVs [14]. Several phenomena in array processing can produce leakage of the signal power into the noise subspace. The random modulation induced by the speckle results in a covariance matrix tapering (CMT) of the unmodulated (absence of the speckle phenomenon, or fully correlated speckle) signal covariance matrix. CMT also models the effects on radar data of other well-known phenomena, such as internal clutter modulation (ICM), scintillation, bandwidth dispersion, uncompensated antenna jitter/motion, and transmitter/receiver instabilities [31]. It produces subspace leakage (or eigenspectrum modulation), that is, an increase of the effective rank of the data covariance matrix, which in turn heavily impacts the performance of many adaptive sensor array processing algorithms.
In the InSAR application, an important source of leakage is the presence of the multiplicative noise [13]. In fact, speckle decorrelation results in the noise EVs of the true data covariance matrix being no longer all equal, and in the matrix being full rank, even in the limit of thermal noise power σ_v² → 0 [13,18]. This phenomenon, and the other effects and trends of the MOS problem in InSAR, are first analyzed with reference to ULA systems in this and in the following section. This scenario is representative of advanced MB single-pass platforms such as PAMIR from FGAN [8], and is also useful to capture the basics of the problem in more complex configurations. An analysis for a non-ULA system is presented in Section 6. The mentioned EV leakage effect is illustrated in Figure 3, where the actual EV spectrum is plotted for a ULA system with K = 8, N_s = 2, signal-to-noise ratios SNR_m = τ_m/σ_v² = 12 dB for m = 1, 2, σ_v² = 1, the same critical baseline B_C1 = B_C2 = B_C, flat terrain patches (s → ∞), and ∆ϕ = ϕ_1 − ϕ_2 ≅ 4πλ⁻¹B(ϑ_1 − ϑ_2) = 540°. For increasing slant-range resolution, or patch slope producing a local grazing angle, the critical baseline tends to infinity and the backscattering sources behave like point targets as far as speckle spatial correlation is concerned, that is, completely correlated multiplicative disturbance. In this condition B/B_C → 0 and there is a large gap between the signal EVs and the noise EVs, which are all equal to σ_v². Conversely, in the presence of extended backscattering sources, the multiplicative disturbance is only partially correlated, and as a result there is not a large separation between the signal and the noise EVs, despite the good SNR (see the curve in Figure 3 relative to B/B_C = 0.2). EV leakage may affect the behavior of the different ITC methods differently.
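The leakage effect of Figure 3 can be reproduced qualitatively from the true covariance matrix of model (1): the Hadamard product with the speckle covariance tapers the rank-one source terms and raises the noise eigenvalues. This is a sketch under our reading of the triangular-times-Gaussian speckle autocorrelation; exact values depend on the figure's parameters.

```python
import numpy as np

def true_covariance(K, phis, taus, B_over_Bc, s, sigma_v2):
    """R = sum_m tau_m (a_m a_m^H) . C_m + sigma_v^2 I for a ULA (model (1));
    C_m is the triangular-times-Gaussian speckle autocorrelation matrix."""
    n = np.arange(K)
    rho = np.abs(n[:, None] - n[None, :]) * B_over_Bc / (K - 1)
    C = np.where(rho <= 1, (1 - rho) * np.exp(-(2 * np.pi * rho) ** 2 / s ** 2), 0.0)
    R = sigma_v2 * np.eye(K, dtype=complex)
    for phi, tau in zip(phis, taus):
        a = np.exp(1j * phi * n / (K - 1))      # ULA steering vector
        R += tau * np.outer(a, a.conj()) * C    # Hadamard-tapered source term
    return R

# K = 8, N_s = 2, SNR = 12 dB, delta_phi = 540 deg, sigma_v^2 = 1, flat patches
params = dict(K=8, phis=[0.0, 3 * np.pi], taus=[10 ** 1.2] * 2, s=1e9, sigma_v2=1.0)
ev_point = np.linalg.eigvalsh(true_covariance(B_over_Bc=1e-9, **params))  # B/B_C -> 0
ev_ext = np.linalg.eigvalsh(true_covariance(B_over_Bc=0.2, **params))     # B/B_C = 0.2
```

With B/B_C → 0 the six noise eigenvalues all equal σ_v² = 1; with B/B_C = 0.2 they spread upward, shrinking the signal/noise gap despite the unchanged SNR.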

Baseline optimization for detection
The classical baseline optimization problem of InSAR is set in the context of estimating the height of a single terrain patch, trading off interferometer sensitivity against speckle decorrelation [2]. Here, the issue of baseline optimization for detection is investigated for the first time, extending the classical analysis in [2] to the layover scenario. The trade-off to be analyzed is now between speckle decorrelation effects and source resolution problems. To this aim, we consider a given critical baseline common to all the layover sources, that is, the same local incidence angles, encoding a given slant-range resolution and given patch slopes. We then analyze the behavior of the estimated EV spectrum and the performance of the ITC methods as a function of the baseline B. It is important to note that both the ratio B/B_C and ∆ϕ = ϕ_1 − ϕ_2 ≅ 4πλ⁻¹B(ϑ_1 − ϑ_2) are proportional to the system baseline B. Where not otherwise stated, performance is evaluated assuming a ULA system with K = 8, N_s = 2, SNR_1 = SNR_2 = 12 dB, s → ∞, N = 32, and FB averaging. Two scenarios are analyzed in this trade-off analysis: the close sources scenario and the spaced sources scenario. In the former, the two squared-Sinc main lobes of the source spatial spectra are adjacent [20,21,22], as in Figure 2. Consequently, the difference ∆ϕ between the two interferometric phases is equal to the spatial bandwidth of the two spectral contributions (expressed in terms of interferometric phase), that is, ∆ϕ = 4πB/B_C [20,21]. This condition encodes adjacent layover patches. In the spaced sources scenario, the source separation is larger than their spatial extent; in particular, we consider the case where ∆ϕ = 15πB/B_C. The spaced sources scenario model is also valid for rough patches. Where not otherwise stated, the close sources scenario is considered.
Detection performance is evaluated in terms of the probability of correct model order estimation (P_CE), the probability of overestimation (P_OE), and the probability of underestimation (P_UE), which are related by P_CE + P_OE + P_UE = 1.
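In a Monte Carlo run these probabilities are simply empirical frequencies of the order estimates; a minimal helper (name ours) is:

```python
import numpy as np

def detection_probs(estimates, true_order):
    """Empirical P_CE, P_OE, P_UE from a batch of model order estimates."""
    est = np.asarray(estimates)
    p_ce = float(np.mean(est == true_order))   # correct order estimation
    p_oe = float(np.mean(est > true_order))    # overestimation
    p_ue = float(np.mean(est < true_order))    # underestimation
    return p_ce, p_oe, p_ue
```

Since every estimate falls in exactly one of the three categories, the returned probabilities sum to one.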
The baseline influence on the detection performance of the four ITC methods is investigated in Figure 4. Performance curves are plotted as a function of the ratio B/B_C for normalization purposes; however, one should bear in mind that B_C is fixed and ∆ϕ varies with B/B_C according to the selected scenario. For the given close sources scenario, speckle decorrelation increases with increasing B/B_C, while at the same time the source separation in terms of ∆ϕ increases. A similar trend holds for the spaced sources scenario, with ∆ϕ increasing more rapidly. Figure 4 shows that AIC and MDL generally fail to correctly determine the number of signal components in the presence of partially correlated multiplicative noise, whatever baseline is selected. The EDC methods show better robustness to model mismatch. Specifically, EDC_2 can be considered the best performing, having generally the highest P_CE. Note however that none of the ITC methods is uniformly most efficient; this condition will show up also in the subsequent analyses, and some subjective judgment in selecting the globally best method may be required again. The results in Figure 4 can be used to derive indications for baseline optimization. In fact, the trade-off between speckle decorrelation effects and resolution problems for varying baseline results in an optimal range for B/B_C; this is, say, 0.1-0.4 for EDC_2. Of course, one should also consider that for increasing baseline the equivocation height corresponding to the unambiguous phase range decreases [1].³

The trade-off problem is clear from Figure 5, where the average values of the eight estimated EVs are plotted versus B/B_C (each EV order is marked). For B/B_C ≅ 0.1-0.4, two dominant (signal) EVs can be identified, a number that corresponds to what is expected for N_s = 2 and a negligible multiplicative noise effect. However, for B/B_C → 0, ∆ϕ → 0, and one signal EV migrates towards the noise EVs, leaving one dominant EV only: because of resolution problems, all the ITC methods produce E{N̂_s} ≅ 1, whence the loss of P_CE in the leftmost part of the plots in Figure 4. Conversely, for large B/B_C the corresponding ∆ϕ is large and resolution is no longer a problem; for B/B_C > 0.5, ∆ϕ > 2π and the intersource distance is larger than the classical Rayleigh resolution limit [21]. However, speckle decorrelation causes the noise EVs to diffuse, blurring the gap between noise and signal EVs. In this condition, none of the ITC methods can estimate the correct value N_s = 2, as shown in the rightmost part of Figure 4, but their estimation errors can differ. EDC_2 can underestimate N_s, as shown in Figure 6, where P_UE is plotted, whereas the other methods overestimate N_s. In particular, for B/B_C → 1, for EDC_2 we find E{N̂_s} ≅ 0, which can be termed a "blind baseline" effect: the diffused EV spectrum is interpreted by EDC_2 as originated by noise only. Conversely, the other ITC methods tend to interpret the EV spectrum from two extended sources as originated by a greater number of point sources. So far, we have considered the highest P_CE as the best index of quality of MOS methods, which is undoubtedly a reasonable judgment criterion from a purely statistical point of view.

³ It is worth noting another possible use of this analysis, in which B/B_C and ∆ϕ are coupled. One can consider a given system baseline B and adjacent layover sources of varying extent, because of varying (common) local incidence angles or varying system slant-range resolution. In this condition, both B_C and ϑ_1 − ϑ_2 vary such that ∆ϕ = 4πB/B_C. In this light, Figure 3 shows that both largely extended (B/B_C → 1) and compact (B/B_C → 0) adjacent sources are difficult to detect correctly by ITC methods.
However, in an engineering framework, a low probability of underestimation P_UE is also important in judging MOS algorithms for the system application at hand. In fact, when P_CE is not high, in terms of impact on the subsequent InSAR processing the overestimation condition can be better than underestimation. Thus, from a practical point of view, by jointly inspecting Figures 4 and 6, one might consider EDC_1 as producing overall performance comparable to or better than that of EDC_2 for the examined scenario. A definitive judgment would require simulation of the complete layover solution processing chain, including estimation of the heights and reflectivities of the N_s layover terrain patches, post-processing, and height/reflectivity map derivation, which is beyond the scope of this paper.
In Figure 7, performance is plotted as a function of the number of sources. The sources are still characterized by B_Cm = B_C and SNR_m = 12 dB for all m. The interferometric phase separations between neighboring sources are all the same and equal to 4πB/B_C. Simulations, not shown here for lack of space, reveal that the optimal baseline range for detection tends to vary with N_s. Thus, in practice it may be difficult to achieve a good baseline optimization. However, for B/B_C = 0.3, shown in Figure 7, EDC_2 performance is good up to four layover sources (a realistic upper-bound value for N_s is about three to four), both in terms of high P_CE and low P_UE.

Effect of volumetric speckle decorrelation
In Figure 8, both the probability of correct order estimation and that of overestimation for the EDC_2 method are reported for the case of spaced sources, for both flat (s → ∞) and very rough patches (s = 1). The curves stop when the two sources reach the maximum distance possible within the unambiguous phase range 2π(K − 1) [21,22], which corresponds to the equivocation height [2]. Compared to Figure 4, the range of baselines for optimum operation of EDC_2 is wider. Note that the Rayleigh resolution limit now corresponds to B/B_C = 0.13. Conversely, other numerical results not reported here showed that EDC_1 in this scenario tends to perform worse than for close sources, exhibiting a P_CE similar to that of MDL in Figure 4. Thus, for spaced sources the trade-off between speckle decorrelation effects and resolution problems for varying baseline is not critical, and EDC_2 has the best performance over the whole range of operating B/B_C values. Also, in Figure 8 it can be seen how the additional decorrelation from volumetric scattering can increase the P_OE of EDC_2 around B/B_C = 0.15, while P_UE is slightly increased around B/B_C = 0.45. Interestingly, the increased EV leakage effect from volumetric decorrelation is not very pronounced and does not significantly impair P_CE, which remains high. Thus, EDC_2 is a good choice when the source separation can vary from the close to the spaced sources condition and volumetric decorrelation can be present.

DIAGONALLY LOADED ITC METHODS
So far, we have analyzed the impact of surface and volume speckle decorrelation on the performance of classical ITC methods. To increase the robustness of the ITC methods to speckle effects, we propose here to resort to diagonal loading (DL). In fact, it is well known that DL can be quite effective in stabilizing the variations of the small eigenvalues, to which ITC methods are highly sensitive [14]. This stabilization effect is independent of the particular source of the leakage phenomenon, and thus should also have some efficacy in reducing leakage problems from multiplicative noise. The diagonally loaded covariance matrix estimate is obtained as

R_Y(DL) = R_Y + δ σ_v^2 I_K,

where R_Y is the sample (or the FB) covariance matrix, δ is the DL factor, σ_v^2 is the AWGN power, which in practice can be obtained from noise calibration data, and I_K is the K × K identity matrix. The corresponding modified ITC methods are denoted by DL-AIC, DL-MDL, DL-EDC_1, and DL-EDC_2. DL is a simple yet effective technique. However, a definite recipe for setting the DL factor δ is not available; thus one has to resort to simulations to evaluate the best choice of δ [14] in the application and typical scenarios at hand.
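As an illustration, eigenvalue-based ITC order selection with diagonal loading can be sketched as follows in NumPy. This is a minimal hypothetical implementation: the function name and simulated scenario are ours, and the textbook AIC/MDL penalties are used in place of the exact EDC variants considered in the paper.

```python
import numpy as np

def loaded_itc_order(Y, delta=1.0, sigma2_v=1.0, penalty="mdl"):
    """Estimate the number of signal components from MB-InSAR looks
    Y (K x N) via an eigenvalue-based ITC criterion with diagonal loading.
    Sketch only: the EDC penalties used in the paper may differ."""
    K, N = Y.shape
    R = (Y @ Y.conj().T) / N                  # sample covariance matrix
    R_dl = R + delta * sigma2_v * np.eye(K)   # diagonal loading
    ev = np.linalg.eigvalsh(R_dl)[::-1]       # eigenvalues, descending
    crit = []
    for k in range(K):                        # candidate model orders
        tail = ev[k:]                         # presumed noise eigenvalues
        # log-likelihood term: geometric vs. arithmetic mean of tail EVs
        llh = N * (K - k) * (np.mean(np.log(tail)) - np.log(np.mean(tail)))
        free = k * (2 * K - k)                # number of free parameters
        pen = free if penalty == "aic" else 0.5 * free * np.log(N)
        crit.append(-llh + pen)
    return int(np.argmin(crit))

# Example (hypothetical data): two strong components among K = 8 channels
rng = np.random.default_rng(0)
K, N = 8, 64
A = np.exp(2j * np.pi * np.outer(np.arange(K), [0.1, 0.3]))  # steering matrix
S = (rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))) * np.sqrt(100 / 2)
V = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
print(loaded_itc_order(A @ S + V))           # detects the 2 sources
```

The candidate order minimizing the criterion is returned; the loading term enters only through the stabilized eigenvalues, so the same routine covers both the classical and the DL variants (δ = 0 recovers the unloaded case).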

Robustness to multiplicative speckle noise
As a reference for the effect of DL on EV leakage, Figure 9 shows the mean values of the estimated EVs for two adjacent flat patches with B/B_C = 0.3, with and without DL. The ±3σ interval of the estimated EVs is also reported. DL produces an increase of the mean value of the small EVs and a reduction of the estimation variance. The effect of this stabilization on MOS performance in the presence of multiplicative noise is analyzed in Figure 10, which plots the performance of EDC_2 and DL-EDC_2. The loaded EDC_2 provides significantly reduced P_OE, as expected, and enhanced P_CE at medium values of B/B_C, at the cost of a slight reduction of P_CE for low B/B_C. The loading factor δ = 1 has been chosen among others by simulation, to obtain the above-mentioned benefits on P_OE and P_CE with little loss for B/B_C ≈ 0. The range of optimal baselines for detection is slightly enlarged compared to the classical EDC_2. Thus, increased robustness to multiplicative noise is generally achieved by DL-EDC_2.
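The stabilization mechanism can be illustrated with a schematic Monte Carlo run (a hypothetical single-component scenario, not the exact setup of Figure 9): loading shifts every estimated eigenvalue by exactly δσ_v^2, so the mean of the small EVs rises while their relative spread (std/mean), which is what ratio-based ITC criteria respond to, shrinks.

```python
import numpy as np

# Hypothetical scenario: one strong component in AWGN, K = 8 channels.
# Loading adds delta * sigma_v^2 to each eigenvalue exactly, raising the
# mean of the small EVs and reducing their relative fluctuation.
rng = np.random.default_rng(2)
K, N, trials, delta, sigma2_v = 8, 32, 200, 1.0, 1.0
a = np.exp(2j * np.pi * 0.2 * np.arange(K))      # assumed steering vector
smallest = []
for _ in range(trials):
    s = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) * np.sqrt(10 / 2)
    V = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    Y = np.outer(a, s) + V
    R = (Y @ Y.conj().T) / N                     # sample covariance
    smallest.append(np.linalg.eigvalsh(R)[0])    # smallest noise eigenvalue
ev = np.array(smallest)
ev_dl = ev + delta * sigma2_v                    # exact effect of loading
print(ev_dl.mean() > ev.mean())                  # mean of small EVs raised
print(ev_dl.std() / ev_dl.mean() < ev.std() / ev.mean())  # relative spread reduced
```

Both checks print True: the absolute spread is unchanged by the exact shift, but the coefficient of variation of the noise EVs decreases, which is the stabilization the ITC criteria benefit from.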

Strong SNR regime
Strong signals can arise in the layover geometry because of the possibly high local slopes facing the radar, or in the case of layover from man-made structures. EV leakage from multiplicative noise increases when signals are strong. This can produce the counter-intuitive performance degradation shown in Figure 11: the P_OE of EDC_2 increases when the SNRs of both sources change from 12 dB to 18 dB. Again, the stabilization of the noise EVs operated by loading produces some benefit, limiting the increase of P_OE. However, in the case shown in Figure 11, the beneficial effect of DL results from an almost rigid shift of the P_OE and P_CE curves towards higher SNR. Actually, a similar robustness to EV leakage from strong scattering could be obtained by lowering the SNR through a reduction of the radar pulse energy, which would also result in a cheaper system.
A much more favorable effect of DL against the strong-signal regime is exhibited for the AIC method, as reported in Figure 12 for B/B_C = 0.2. The P_CE is plotted as a function of SNR. The DL-AIC curve is not simply a shifted copy of the AIC curve; there is also an improvement of the maximum value. This robustness effect cannot be obtained by a mere reduction of the radar pulse energy, and makes DL-AIC a possible candidate for robust operation, taking into account also its low P_UE for large B/B_C (no blind baseline effect).

Small-sample regime
Diagonal loading can also produce benefits for operation with a small number of looks. Operation with N < K can often be necessary in MB layover solution, where it may be difficult to get many identically distributed looks because of the possibly high local slopes. In this condition the covariance matrix estimate is no longer positive definite [15] and ITC methods degrade significantly. Figure 13 refers to N = 4 looks and shows how both the poor P_CE and P_OE of AIC in this InSAR scenario are largely improved by DL with δ = 3. This is due to the restoration of the positive definiteness of the covariance matrix estimate operated by the DL. As a drawback, DL-AIC tends to be partially affected by the blind baseline effect for large B/B_C.
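The positive-definiteness argument can be checked numerically; the sketch below (hypothetical noise-only data) shows that with N = 4 looks and K = 8 phase centres the sample covariance matrix is rank deficient, and that loading with δ = 3 lifts all of its eigenvalues above zero.

```python
import numpy as np

# Hypothetical noise-only check: with N < K the K x K sample covariance
# has rank at most N, hence K - N (near-)zero eigenvalues; adding
# delta * sigma_v^2 * I restores positive definiteness.
rng = np.random.default_rng(1)
K, N, delta, sigma2_v = 8, 4, 3.0, 1.0
Y = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
R = (Y @ Y.conj().T) / N                  # rank <= 4, hence singular
ev = np.linalg.eigvalsh(R)
print(np.sum(ev > 1e-10))                 # number of nonzero EVs: 4
ev_dl = np.linalg.eigvalsh(R + delta * sigma2_v * np.eye(K))
print(ev_dl.min() > 0)                    # loaded estimate is positive definite
```

With the singular estimate, the ITC log-likelihood terms involve logarithms of (numerically) zero eigenvalues and become meaningless, which is why the unloaded criteria break down in this regime.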

DUAL-BASELINE NON-ULA SYSTEM
When the array is nonuniform, the change of structure of the array steering vector with respect to the classical ULA structure impacts the achievable performance. In addition, the structured FB covariance matrix estimate cannot be adopted. Moreover, in the airborne case, non-ULA systems generally have a lower number K of phase centres than ULA systems; formation-based spaceborne systems also have low K. To gain some insight into the behavior of ITC methods in non-ULA InSAR systems, we first simulated performance for a system with K = 4 phase centres and ULA structure. The P_CE of the four ITC methods for F-only processing is shown in Figure 14. The curves stop when the two adjacent sources reach the maximum distance possible within the unambiguous range. Notably, in this case study the rankings of AIC, MDL, EDC_1, and EDC_2 differ from the K = 8 ULA (FB) case. The ranking derived from Figure 4 is no longer valid: EDC_1 is now the best performing, followed by MDL and AIC, whereas EDC_2 is now the worst. This new ranking is partly due to the lowered K and partly due to abandoning FB averaging, as revealed by the results for K = 4 FB, not shown here. In particular, lowering K results in a significant improvement of EDC_1, MDL, and AIC; subsequently turning to F-only processing produces a strong degradation of EDC_2. Thus, a ranking similar to this new one is expected to hold for non-ULA systems with low K. Before quantifying this expected trend by simulation, it is worth noting that non-ULA arrays can lead to identifiability problems for multiple sources with specific spatial frequency separations [19], which in our InSAR application mean specific separations among the multiple interferometric phases [20, 22] or patch heights. This is due to possible linear dependence among the multiple steering vectors, which can arise even when all the sources are located within the same unambiguous interval (nontrivial nonidentifiability).
The reason is that the nonuniform spatial sampling makes the matrix collecting the N_S steering vectors lose the Vandermonde structure that it exhibits in the ULA case. On the other hand, non-ULA arrays theoretically allow estimation of a greater number of sources than K − 1, which is the limit for ULA arrays, provided that proper sophisticated processing and a large number of looks are used [19].
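The identifiability condition can be probed numerically. The following sketch assumes normalized phase-centre positions [0, 1, 3] (anticipating the dual-baseline array studied next, whose smallest baseline is 1/3 of the overall one); it scans pairs of distinct spatial frequencies within one unambiguous interval and confirms that the two steering vectors never become linearly dependent.

```python
import numpy as np

# Numerical identifiability check (sketch) for a dual-baseline array with
# assumed normalized phase-centre positions [0, 1, 3]: over a grid of
# distinct spatial frequency pairs in one unambiguous interval, the 3 x 2
# steering matrix never becomes rank deficient, consistent with the claim
# that no identifiability problem arises for N_S <= 2.
pos = np.array([0.0, 1.0, 3.0])
freqs = np.linspace(0.0, 1.0, 60, endpoint=False)
worst = np.inf
for i, f1 in enumerate(freqs):
    for f2 in freqs[i + 1:]:
        A = np.exp(2j * np.pi * np.outer(pos, [f1, f2]))  # steering matrix
        worst = min(worst, np.linalg.svd(A, compute_uv=False)[-1])
print(worst > 0.0)   # smallest singular value stays bounded away from zero
```

The same scan applied to a less regular position set would reveal any nontrivial nonidentifiability as a vanishing smallest singular value at specific frequency separations.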
A case of non-ULA array is investigated in Figure 15. Here K = 3 (dual-baseline system), and the smallest baseline is 1/3 of the overall baseline; this is a minimum redundant array [19] that may be obtained by thinning the array employed for Figure 14. This case is a good representative of advanced three-antenna airborne systems such as AER-II from FGAN [12], and can give a flavor of the performance of formation-based spaceborne systems [9]. It can easily be proved that the K = 3 phase centre non-ULA array has no identifiability problem when N_S ≤ 2. As expected, the ranking of AIC, MDL, EDC_1, and EDC_2 is quite similar to that for the K = 4 ULA with F-only processing. AIC is now the best-performing algorithm, closely followed by MDL and EDC_1, whereas EDC_2 is again the worst (it is strongly affected by the blind baseline effect). Note that now P_OE = 0, since N_S = 2 coincides with the maximum number of signals detectable by the ITC methods. The optimum baseline range for this non-ULA array and N_S = 2 is satisfactory. Other simulations not reported here show that it is not enlarged by DL, contrary to what happens when K = 8 (see Figure 10). This is reasonable given that for N_S = 2 we find P_OE = 0, so there is no room for improvement through stabilization of EV leakage. The rankings found for this non-ULA configuration and N_S = 2 also hold when volumetric decorrelation is present; see the results in Figure 16 for spaced sources and very rough patches (s = 1).
However, as seen in Figure 7 for the ULA with K = 8, ITC rankings tend to vary with the number of sources; this should be considered for the selection of a globally optimal method. In Figure 17, P_CE is reported for the four ITC methods with a single source present. Notably, AIC is now the worst-performing method, while EDC_1 and EDC_2 can be considered the best ones. It is now apparent that the reason for the good performance of AIC with N_S = 2 in Figure 15 was merely its tendency to overestimation: the estimated order equals N_S with high probability simply because the typical output of AIC is the maximum number of detectable signals, which equals 2. Thus, when N_S = 1 AIC typically fails: it tends to declare a layover situation that does not exist. In conclusion, considering both the case studies N_S = 2 and N_S = 1, one can select EDC_1 as the globally optimum method for the given non-ULA configuration: it performs reasonably well for N_S = 2, also when volumetric decorrelation is present, and is among the best for N_S = 1. Notably, for N_S = 1 there is some room for improvement in the non-ULA case through stabilization of EV leakage by DL. Figure 18 shows the increase of P_CE achievable by applying DL with δ = 1 to the selected EDC_1.

FINAL DISCUSSION AND FUTURE DEVELOPMENTS
In this work, some problems have been investigated that are related to the implementation of nonlinear eigenvalues-based ITC methods into practical MB-InSAR systems for the imaging of layover areas. Multiplicative speckle noise for both surface and volume scattering has been included in the data model for the simulations, showing that while the basic model mismatch effect from the extended nature of flat layover terrain patches is significant, the impact of additional volumetric decorrelation is more limited. For a compact analysis, the same critical baseline has been assumed for the multiple signal components, corresponding to the same local incidence angles. Some results for different critical baselines can be found in [13]. The issue of baseline optimization for detection has been discussed. The analyzed trade-off between EV leakage and source resolution problems results in some guidelines for optimal baseline selection. The use of diagonally loaded ITC methods has also been proposed for more robust operation in the presence of speckle. The benefits of diagonal loading have been demonstrated in terms of an extended range of optimal baselines, robustness to leakage of signal power from strong sources into the noise EVs, and improved operation with a small number of looks. The benefits of loading proved to be particularly effective for small-sample operation. Case studies of a nonuniform dual-baseline array have finally been discussed. Performance rankings of the ITC methods in terms of probability of correct model order estimation are summarized in Table 1, for the ULA (K = 8) and the non-ULA (K = 3) configurations, for the conditions of varying baseline-to-critical-baseline ratio (adjacent sources), varying number of layover patches, and volumetric decorrelation (spaced sources); results for ITC methods not included in Figures 7 and 8 have also been considered.
An important result is the change of the globally best-performing method from EDC_2 to EDC_1 for a decreasing number of phase centres and forward-only processing (see the highlighted fields in Table 1). Despite model mismatch, the performance of model order selection can generally be satisfactory after proper choice of the best ITC method, especially if diagonal loading is adopted with a loading factor in the range 1-3 and the number of phase centres and looks is large enough.
The model order selection techniques investigated in this paper can be useful in conjunction with spectral estimators in MB-InSAR applications of layover solution [6, 7, 11, 12, 21, 22, 32], to fully exploit both existing repeat-pass SAR data archives [4, 7, 11, 32] and advanced experimental or planned single-pass MB systems [8, 9, 12]. In particular, the implications of MB layover solution for InSAR topographic and reflectivity mapping are the following. The new functionality of operation in the presence of discontinuous surfaces (e.g., cliffs) may be provided, which is not possible with classical InSAR. Conventional change-of-perspective methods based on mosaicing of ascending-descending passes [33] or possible incidence angle flexibility [34] cannot solve the layover problem in this scenario. Also, operation in steep areas may be possible through MB layover solution for systems with fixed incidence angle [4, 11, 32], allowing efficient use of the available data set. Moreover, fusion of ascending-descending passes, if available, would be possible in place of a mere mosaicing, thus obtaining improved topographic accuracy (both pass geometries would produce estimates, since operation of the one affected by layover would be restored by the MB processing). Even when the system has angle flexibility that can be useful to solve the layover problem in steep areas, MB layover solution may furnish the improved functionality of reflectivity measurement under homogeneous conditions, avoiding the need for changing the incidence angle. Again, improved topographic accuracy may also be possible through fusion of estimates obtained with different incidence angles (operation with a layover-affected angle would be restored by MB processing). As regards the complexity of the MB array needed, a few baselines are enough to detect the typically low number of layover patches.
Current single-pass airborne MB systems [12, 35, 36] and some planned single-pass MB distributed interferometers based on satellite formations [9] have two baselines (K = 3); thus they would be limited to the detection of no more than two layover sources [12]. To detect three layover sources as well, at least three baselines (K ≥ 4) are required; these are already available from airborne and spaceborne repeat-pass SAR data sets [4, 7, 11, 32], are going to be available with the upgrade of the experimental single-pass airborne MB system [8], and possibly in the future with formations of many satellites [37].
Concerning the application to real data, it is worth recalling that baseline errors [4], in addition to phase artefacts from changes in atmospheric propagation in repeat-pass spaceborne implementations of MB arrays [32], produce deviations of the steering vector from the nominal one. However, the impact of imperfect knowledge of the steering vector structure on model order selection has not been numerically investigated here, since it is known from the array processing literature to be negligible [14]. ITC methods do not directly exploit this knowledge; thus it is expected that in the presence of steering vector errors, ITC performance is not significantly affected.
Concerning future developments, the fact that performance is still far from optimal when the system baseline is very large or the spatial correlation length of the speckle is very small reveals room for investigating new ad hoc MOS methods that take the presence of multiplicative noise into account at the design stage. Other developments may include performance analysis in the presence of nonhomogeneous terrain patches with dominant scatterers and long-term temporal decorrelation in spaceborne repeat-pass systems [2], and validation with laboratory [38] or live data [7, 32].