Identification of MIMO systems with sparse transfer function coefficients

We study the problem of estimating transfer functions of multivariable (multiple-input multiple-output–MIMO) systems with sparse coefficients. We note that subspace identification methods are powerful and convenient tools in dealing with MIMO systems since they neither require nonlinear optimization nor impose any canonical form on the systems. However, subspace-based methods are inefficient for systems with sparse transfer function coefficients since they work on state space models. We propose a two-step algorithm where the first step identifies the system order using the subspace principle in a state space format, while the second step estimates coefficients of the transfer functions via L1-norm convex optimization. The proposed algorithm retains good features of subspace methods with improved noise-robustness for sparse systems.


Introduction
The problem of identifying multiple-input multiple-output (MIMO) systems arises naturally in spatial division multiple access architectures for wireless communications. Subspace system identification methods refer to the category of methods which obtain state space models from subspaces of certain matrices constructed from the input-output data [1]. Being based on reliable numerical algorithms such as the singular value decomposition (SVD), subspace methods do not require nonlinear optimization and, thereby, are computationally efficient and stable without suffering from convergence problems. They are particularly suitable for identifying MIMO systems since there is no need to impose on the system a canonical form and, therefore, they are free from the various inconveniences encountered in classical parametric methods.
Another good feature of subspace methods is they incorporate a reliable order estimation process. Although this is largely ignored by other identification methods, system order estimation should be an integral part of a system identification algorithm. In fact, correctly identifying the system order is essential to ensure that subsequent parameter estimation will yield a well-defined set of model coefficient estimates. While it is obvious that an underestimated order will result in large modeling errors, it is equally dangerous to have an overparameterized model as a result of selecting an improperly large order. Over-parameterization not only creates a larger set of parameters to be estimated, but also leads to poorly defined (high variance) coefficient estimates and surplus unvalidated content in the resulting model [2]. For parameterized linear models, a usual way of estimating the order is to conduct a series of tests on different orders, and select the best one based on the goodness-of-fit using certain criterion such as the Akaike's information criterion [3]. Subspace-based order identification determines the order as the number of principle eigenvalues (or singular values) of a certain input-output data matrix. This mechanism has been proven to be a simple and reliable way of order estimation and the same principle has been applied to detect the number of emitter sources in array signal processing [4], to determine the number of principal components in signal and image analysis [5,6] and to estimate the system order in blind system identification [7,8].
It has become well known that many systems in real applications actually have sparse representations, i.e., with a large portion of their transfer function coefficients equal to zero [9]. For example, communication channels exhibit great sparseness. In particular, in high-definition television, there are few echoes but the channel response spans many hundreds of data symbols [10][11][12]. In broadband wireless communications, a hilly terrain delay profile consists of a sparsely distributed multipath [13]. Underwater acoustic channels also exhibit sparseness [14].
Despite of their good features, subspace methods are not suited for systems with sparse transfer function coefficients since they deal with state space models. One fact is that for a given transfer function matrix, there exist infinite number of state space representations related by a similarity transform. As a result, when the system to be identified has sparse transfer function coefficients, the state space model produced by a subspace method is almost certain to be non-sparse due to the inherent arbitrary similarity transform associated with the model. Work on sparse system identification has been reported in [15][16][17], however, they consider only singleinput single-out (SISO) systems. In addition, these methods assume the system order is known a priori, which is not necessarily true in practice. This article studies the problem of the identification of MIMO systems with sparse transfer functions coefficients. In particular, we leverage on reliable system order identification of subspace methods to build input-output relationship in terms of transfer function coefficients, and exploit L1norm optimization proven to be able to produce robust sparse solutions [9] to estimate these coefficients. The resulting method consists of a systematic way of order identification and efficient coefficient estimation for sparse systems.
The rest of this article is organized as follows. Section 2 gives an overview of subspace identification methods. Section 3 proposes the LRL1 algorithm. Section 4 presents simulation results and Section 5 draws conclusions.

Subspace identification
For an M-input {u m (k), m = 1,...,M} L-output {y l (k), l = 1,...,L} system described in the state space form: T are the input, output, and state vectors, respectively, with T denoting matrix transpose. A R N×N , B R N×M , C R L×N , and D R L×M are the system, input, output, and direct feed-through matrices, respectively. Noise in (1) has been omitted for brevity purposes and will be added in the simulation in Section 4. Given S measurements (i.e., k = 0,..., S -1) of the input and output, subspace identification methods estimate the system order N and matrices (A, B, C, D), and derive system transfer functions via H(z) = D + C(zI NN -A) -1 B, with I NN denoting the identity matrix of size N × N. The procedure is as follows.
Choose a positive integer i and define the input and output block Hankel matrices as: columns with the first one formed by u k (k = 0,..., i -1). Similarly, we can define the j-column block Hankel matrix U i/2i-1 using u k (k = i,..., 2i -1) to form the first column. For convenience, the following short-hand notations are adopted: For each of the four matrices in (3) and (4), by deleting or adding one block row at the bottom and keeping the number of columns unchanged, we have two additional notations. For example, Note that i is a user-specified parameter and is required to be larger than the system order, i.e., i >N. In general, a subspace identification algorithm consists of three steps. The first step performs projections of the row spaces of the data matrices and estimates the order of the system. In particular, the following oblique projections are first calculated: the oblique projection of the row space of Y f along the row space of U f on the row space of W p and can be calculated according to The SVD of the weighted oblique projection is calculated and the order of the system (N) is determined as the number of principal singular vales in Σ, leading to the identification of the principal subspaces U 1 and V 1 , where W 1 and W 2 are weighting matrices and a specific choice of them leads to different algorithms [1]. For example, and result in the popular N4SID and MOESP algorithms, respectively, with † denoting Moore-Penrose pseudoinverse.
Define the extended observability matrix and state sequence as The second step of a subspace algorithm is to find estimates of the extended observability matrix and state sequences via The final step is to determine the system matrices A, B, C, and D (up to within a similarity transformation) by Note that the sizes of matrices involved in the subspace methods are dependant on i. It will be shown in Section 4 that while the order can be estimated reliably across a broad range of i's as long as the condition i >N is satisfied, the quality of subsequent coefficient estimates depends on i in a non-monotonous fashion.

The proposed LRL1
As mentioned previously, subspace methods are inefficient in dealing with sparse transfer function coefficients. That is, for a system with sparse transfer function coefficients, the outcome of subspace methods is one of many (non-sparse) state space realizations (A, B, C, D). In order to exploit the sparseness in the system, system (1) is represented using transfer function coefficients. In particular, for the lth output the following linear regression (LR) relationship holds: Where{a n , b mn, l } are the system transfer function coefficients. Equation (10) can be re-written into a vector form: By stacking up the output samples in (11), one can have where y l = [y l (N + 1)y l (N + 2) . . . y l (S)] T , Note that it is a common practice to identify a multiinput single-output model for each output separately and then combine the individual models into a final MIMO model. However, if there are common or correlated parameters among models for different output variables and/or correlated noise, then performing identification on all outputs simultaneously can lead to better and more robust models [18]. By combining all outputs of (14), we have We now propose a two-step algorithm which first identifies the system order in state space format. This allows the formation of a MIMO model in terms of transfer function coefficients. L1-norm optimization is then exploited to estimate the coefficients of the model. One of the key factors leading to recent advances in compressed sensing (CS) is that L1-norm optimization promotes sparsity. That is, as compared with its popular L2-norm counterpart, L1-norm optimization produces better results when the parameters to be estimated are sparse [9]. In our case, the data model (15) allows b to be obtained by L1-norm minimization with guaranteed convergence since, unlike the cases in CS, this is strict convex optimization. The minimization generally takes the following form [19]: Where l balances the sparsity of the solution with the fidelity to the data and should be inversely proportional to the noise level. Efficient algorithms for solving (16) have been developed [9,19].
The proposed algorithm makes further use of the SVD in (6), where matrix Σ 1 = diag(l 1 ,..., l N ) contains the principal singular-values and Σ 2 = diag(l N+1 ,..., l iL ) contains noise singular-values. When noise is absent, Σ 2 is an allzero matrix. Otherwise, its diagonal entries will be nonzero positive numbers. Although it is impossible here to accurately estimate the noise variance, these diagonal entries do reflect the noise level and can be represented bȳ In our proposed method, we will set the parameter in (16) to λ = δ/σ where δ is a positive constant. The proposed LRL1 algorithm can now be summarized as follows: Step 1: Subspace identification of system order using (5) and (6), and estimation of noise level using (17). This step is a systematic order selection procedure with reliable performance.
Step 2: Estimation of system transfer function coefficients using (16). This step is simultaneous optimization over all outputs with robust sparse solutions.

Simulation
We evaluate the performances of the proposed LRL1 method against that of the subspace method N4SID [1]. The first MIMO system under consideration is generated by modifying the SISO system in [ [1], p. 155]. In particular, the system order and eigenvalues (i.e., the vector [1 a T ] T ) are kept unchanged and one more input and one more output are added to the original system. Then, some elements in each of the transfer function vector b m,l (see (13)) are set to zero to create the desired sparsity. The resulting 2-input 2-output system of order N = 4 has the following transfer function coefficients: In the simulation, the data size is chosen to be S = 200 and the inputs are sequences of Gaussian variables with zero mean and unit variance. A white noise vector v k is added to the output vector y k (see (1b)) and the SNR as defined below is kept constant for each k, 100 runs are conducted for each simulated scenario, where for each run the input and noise vectors are independently generated. The root mean square error (RMSE) for the estimates of the T = N + M(N + 1)L parameters in b is measured as where b(r) denotes the estimates in the rth run. In the simulation, δ = 0.5 is fixed for all the scenarios and the convex programming package from [20] is used in solving (16).
As mentioned in Section 2, the subspace approach identifies the system order (N) by examining the iL diagonal entries of Σ in (6), i.e., the singular values (l 1 ,..., l iL ) of the system, where i is the number of block rows of the Hankel data matrices (see (2)) and L = 2 is the number of outputs. For SNR = 30 and i changing from 4 to 10, we obtain iL singular values for each particular choice of i. Figure 1 shows (l 1 ,..., l iL , 0,..., 0) where zero padding is made for cases with i < 10.
Checking for the number of principal (dominant) singular values clearly indicates that the system order is N = 4. This result verifies that the identification of system order is reliable regardless of the value of i as long as it is chosen to be larger than N. This feature is very attractive in practice since it is relatively easy to select an i which is larger than the largest possible value the system order might have.
Having identified the system order, we now follow the subsequent steps of the subspace and proposed LRL1 methods to estimate model coefficients. Figure 2 shows the RMSE of the subspace method when i changes from 5 to 10 at SNR = 30. It can be seen here that the performance of the subspace method depends on i in a non-monotonous fashion. That is, a larger i does not necessarily leads to better coefficient estimates. This is due to the fact that only finite datasets are available; increasing the number of rows of the data matrices will lead to a reduction in the number of columns. In the subsequent simulations, i = 9 is used. Figure 3 compares the estimation errors of the subspace and LRL1 methods for different SNR values. The results show that LRL1 is more robust to noise in dealing with this sparse system. Further test is conducted by modifying System 1 to create an even sparser system. In particular, some of the transfer function coefficients of System 1 are set to zero, which results in   Figure 4 shows the performances of the two methods under the same conditions as in Figure 3. The results demonstrate that the improvement of LRL1 over the original subspace method increases when the system to be identified becomes sparser.

Conclusion
A noise-robust algorithm for the identification of MIMO systems has been presented. The proposed method leverages reliable system order identification of subspace principle and exploits L1-norm optimization to achieve high effectiveness for identifying systems with sparse transfer function coefficients. While retaining good features of original subspace methods such as the convenience for multivariable systems, the proposed method is shown to be able to significantly improve estimation accuracy for sparse systems.