Skip to main content

Identification of MIMO systems with sparse transfer function coefficients


We study the problem of estimating transfer functions of multivariable (multiple-input multiple-output--MIMO) systems with sparse coefficients. We note that subspace identification methods are powerful and convenient tools in dealing with MIMO systems since they neither require nonlinear optimization nor impose any canonical form on the systems. However, subspace-based methods are inefficient for systems with sparse transfer function coefficients since they work on state space models. We propose a two-step algorithm where the first step identifies the system order using the subspace principle in a state space format, while the second step estimates coefficients of the transfer functions via L1-norm convex optimization. The proposed algorithm retains good features of subspace methods with improved noise-robustness for sparse systems.

1. Introduction

The problem of identifying multiple-input multiple-output (MIMO) systems arises naturally in spatial division multiple access architectures for wireless communications. Subspace system identification methods refer to the category of methods which obtain state space models from subspaces of certain matrices constructed from the input-output data [1]. Being based on reliable numerical algorithms such as the singular value decomposition (SVD), subspace methods do not require nonlinear optimization and, thereby, are computationally efficient and stable without suffering from convergence problems. They are particularly suitable for identifying MIMO systems since there is no need to impose on the system a canonical form and, therefore, they are free from the various inconveniences encountered in classical parametric methods.

Another good feature of subspace methods is they incorporate a reliable order estimation process. Although this is largely ignored by other identification methods, system order estimation should be an integral part of a system identification algorithm. In fact, correctly identifying the system order is essential to ensure that subsequent parameter estimation will yield a well-defined set of model coefficient estimates. While it is obvious that an underestimated order will result in large modeling errors, it is equally dangerous to have an over-parameterized model as a result of selecting an improperly large order. Over-parameterization not only creates a larger set of parameters to be estimated, but also leads to poorly defined (high variance) coefficient estimates and surplus unvalidated content in the resulting model [2]. For parameterized linear models, a usual way of estimating the order is to conduct a series of tests on different orders, and select the best one based on the goodness-of-fit using certain criterion such as the Akaike's information criterion [3]. Subspace-based order identification determines the order as the number of principle eigenvalues (or singular values) of a certain input-output data matrix. This mechanism has been proven to be a simple and reliable way of order estimation and the same principle has been applied to detect the number of emitter sources in array signal processing [4], to determine the number of principal components in signal and image analysis [5, 6] and to estimate the system order in blind system identification [7, 8].

It has become well known that many systems in real applications actually have sparse representations, i.e., with a large portion of their transfer function coefficients equal to zero [9]. For example, communication channels exhibit great sparseness. In particular, in high-definition television, there are few echoes but the channel response spans many hundreds of data symbols [1012]. In broadband wireless communications, a hilly terrain delay profile consists of a sparsely distributed multipath [13]. Underwater acoustic channels also exhibit sparseness [14].

Despite of their good features, subspace methods are not suited for systems with sparse transfer function coefficients since they deal with state space models. One fact is that for a given transfer function matrix, there exist infinite number of state space representations related by a similarity transform. As a result, when the system to be identified has sparse transfer function coefficients, the state space model produced by a subspace method is almost certain to be non-sparse due to the inherent arbitrary similarity transform associated with the model.

Work on sparse system identification has been reported in [1517], however, they consider only single-input single-out (SISO) systems. In addition, these methods assume the system order is known a priori, which is not necessarily true in practice. This article studies the problem of the identification of MIMO systems with sparse transfer functions coefficients. In particular, we leverage on reliable system order identification of subspace methods to build input-output relationship in terms of transfer function coefficients, and exploit L1-norm optimization proven to be able to produce robust sparse solutions [9] to estimate these coefficients. The resulting method consists of a systematic way of order identification and efficient coefficient estimation for sparse systems.

The rest of this article is organized as follows. Section 2 gives an overview of subspace identification methods. Section 3 proposes the LRL1 algorithm. Section 4 presents simulation results and Section 5 draws conclusions.

2. Subspace identification

For an M-input {u m (k), m = 1,...,M} L-output {y l (k), l = 1,...,L} system described in the state space form:

x k + 1 = A x k + B u k
y k = C x k + D u k

where u k = def [ u 1 ( k ) , , u M ( k ) ] T , y k = def [ y 1 ( k ) , , y L ( k ) ] T , and x k = def [ x 1 ( k ) , , x N ( k ) ] T are the input, output, and state vectors, respectively, with T denoting matrix transpose. A RN×N, B RN×M, C RL×N, and D RL×Mare the system, input, output, and direct feed-through matrices, respectively. Noise in (1) has been omitted for brevity purposes and will be added in the simulation in Section 4. Given S measurements (i.e., k = 0,..., S - 1) of the input and output, subspace identification methods estimate the system order N and matrices (A, B, C, D), and derive system transfer functions via H(z) = D + C(z I NN - A)-1B, with I NN denoting the identity matrix of size N × N. The procedure is as follows.

Choose a positive integer i and define the input and output block Hankel matrices as:

U 0 / i - 1 = u 0 u 1 u j - 1 u 1 u 2 u j u i - 1 u i u i + j - 2
Y 0 / i - 1 = y 0 y 1 y j - 1 y 1 y 2 y j y i - 1 y i y i + j - 2

Note that U0/i-1has j (= S - i + 2) columns with the first one formed by u k (k = 0,..., i - 1). Similarly, we can define the j-column block Hankel matrix Ui/2i-1using u k (k = i,..., 2i - 1) to form the first column. For convenience, the following short-hand notations are adopted:

U p = def U 0 / i - 1 , Y p = def Y 0 / i - 1
U f = def U i / 2 i - 1 , Y f = def Y i / 2 i - 1

For each of the four matrices in (3) and (4), by deleting or adding one block row at the bottom and keeping the number of columns unchanged, we have two additional notations. For example,

Y f - = def Y i / 2 i - 2 , Y f + = def Y i / 2 i

Note that i is a user-specified parameter and is required to be larger than the system order, i.e., i > N. In general, a subspace identification algorithm consists of three steps. The first step performs projections of the row spaces of the data matrices and estimates the order of the system. In particular, the following oblique projections are first calculated:

O i = Y f / U f W p
O i + 1 = Y f - / U f - W p +

where W p = U p Y p , W p + = U p + Y p + , and Y f / U f W p are the oblique projection of the row space of Y f along the row space of U f on the row space of W p and can be calculated according to

Y f / U f W p = [ Y f / U f T ] [ W p / U f T ] W p

with [ Y f / U f T ] denoting the projection of the row space of Y f on the row space of the orthogonal compliment of the row space of U f [1].

The SVD of the weighted oblique projection is calculated and the order of the system (N) is determined as the number of principal singular vales in Σ, leading to the identification of the principal subspaces U1 and V1,

W 1 O i W 2 = U Σ V = ( U 1 U 2 ) Σ 1 0 0 Σ 2 V 1 T V 2 T = U 1 Σ 1 V 1 T

where W1 and W2 are weighting matrices and a specific choice of them leads to different algorithms [1]. For example,

W 1 = I i L , W 2 = I j


W 1 = I i L , W 2 = ( U f ) T ( U f ( U f ) T ) U f ,

result in the popular N4SID and MOESP algorithms, respectively, with † denoting Moore-Penrose pseudo-inverse.

Define the extended observability matrix and state sequence as

Γ i = def C C A C A i - 1
X i = def ( x i x i + 1 , , x i + j - 1 ) R N × j .

The second step of a subspace algorithm is to find estimates of the extended observability matrix and state sequences via

Γ i = W 1 - 1 U 1 Σ 1 1 / 2
X i = Γ i O i , X i + 1 = Γ i - 1 O i + 1

The final step is to determine the system matrices A, B, C, and D (up to within a similarity transformation) by

min A , B , C , D X i + 1 Y i / i - A B C D X i U i / i F 2 .

Note that the sizes of matrices involved in the subspace methods are dependant on i. It will be shown in Section 4 that while the order can be estimated reliably across a broad range of i's as long as the condition i > N is satisfied, the quality of subsequent coefficient estimates depends on i in a non-monotonous fashion.

3. The proposed LRL1

As mentioned previously, subspace methods are inefficient in dealing with sparse transfer function coefficients. That is, for a system with sparse transfer function coefficients, the outcome of subspace methods is one of many (non-sparse) state space realizations (A, B, C, D). In order to exploit the sparseness in the system, system (1) is represented using transfer function coefficients. In particular, for the l th output the following linear regression (LR) relationship holds:

y l ( k + N ) = n = 1 N a n y l ( k + N - n ) + m = 1 M n = 0 N b m n , l u m ( k + N - n ) , k = 1 , , S - N , l = 1 , , L ,

Where{a n , b mn, l } are the system transfer function coefficients. Equation (10) can be re-written into a vector form:

y l ( k + N ) = y k , l T a + m = 1 M b m , l T u m k , k = 1 , , S - N ,


a = ( a 1 a N ) T ,
y k , l T = y l ( k + N - 1 ) , y l ( k + N - 2 ) , , y l ( k ) ,
u m k T = u m ( k + N ) , u m ( k + N - 1 ) , , u m ( k ) ,

and the contribution of the m th input to the l th output is presented by

b m , l = ( b m 0 , l , , b m N , l ) T

By stacking up the output samples in (11), one can have

y l = [ Y l U ] γ l ,


y l = [ y l ( N + 1 ) y l ( N + 2 ) y l ( S ) ] T ,
U = U 1 U M ,
Y l = y 1 , l T y 2 , l T y S - N , l T , U m = u m 1 T u m 2 T u m ( S - N ) T ,
γ l = ( a T β l T ) T ,
β l = ( b 1 , l T b M , l T ) T ,

Note that it is a common practice to identify a multi-input single-output model for each output separately and then combine the individual models into a final MIMO model. However, if there are common or correlated parameters among models for different output variables and/or correlated noise, then performing identification on all outputs simultaneously can lead to better and more robust models [18]. By combining all outputs of (14), we have

y = H β


y = y 1 y 2 y L , β = a β 1 β 2 β L ,
H = Y 1 U 0 0 Y 2 0 U 0 0 Y L 0 0 U .

We now propose a two-step algorithm which first identifies the system order in state space format. This allows the formation of a MIMO model in terms of transfer function coefficients. L1-norm optimization is then exploited to estimate the coefficients of the model. One of the key factors leading to recent advances in compressed sensing (CS) is that L1-norm optimization promotes sparsity. That is, as compared with its popular L2-norm counterpart, L1-norm optimization produces better results when the parameters to be estimated are sparse [9]. In our case, the data model (15) allows β to be obtained by L1-norm minimization with guaranteed convergence since, unlike the cases in CS, this is strict convex optimization. The minimization generally takes the following form [19]:

β = arg min x ( | | x | | 1 + λ | | y - H x | | 2 2 )

Where λ balances the sparsity of the solution with the fidelity to the data and should be inversely proportional to the noise level. Efficient algorithms for solving (16) have been developed [9, 19].

The proposed algorithm makes further use of the SVD in (6), where matrix Σ1 = diag(λ1,..., λ N ) contains the principal singular-values and Σ2 = diag(λN+1,..., λ iL ) contains noise singular-values. When noise is absent, Σ2 is an all-zero matrix. Otherwise, its diagonal entries will be non-zero positive numbers. Although it is impossible here to accurately estimate the noise variance, these diagonal entries do reflect the noise level and can be represented by

σ ̄ = k = N + 1 i L λ k 2 / i L - N

In our proposed method, we will set the parameter in (16) to λ=δ/ σ ̄ where δ is a positive constant. The proposed LRL1 algorithm can now be summarized as follows:

Step 1: Subspace identification of system order using (5) and (6), and estimation of noise level using (17). This step is a systematic order selection procedure with reliable performance.

Step 2: Estimation of system transfer function coefficients using (16). This step is simultaneous optimization over all outputs with robust sparse solutions.

4. Simulation

We evaluate the performances of the proposed LRL1 method against that of the subspace method N4SID [1]. The first MIMO system under consideration is generated by modifying the SISO system in [[1], p. 155]. In particular, the system order and eigenvalues (i.e., the vector [1 aT ] T) are kept unchanged and one more input and one more output are added to the original system. Then, some elements in each of the transfer function vector b m,l (see (13)) are set to zero to create the desired sparsity. The resulting 2-input 2-output system of order N = 4 has the following transfer function coefficients:

System 1:

a = ( 0 0 0  0 .5288 ) T ,
b 1 , 1 T b 2 , 1 T b 1 , 2 T b 2 , 2 T =  - 0 .7139 0 1 .2080 0 - 4 .0361 0 2 .9505 - 2 .7061 0 0 .5238 0 0 .0526 - 0 .0239 - 0 .6024 0 0 - 0 .2686 - 0 .2374 0 .1585 0 .

In the simulation, the data size is chosen to be S = 200 and the inputs are sequences of Gaussian variables with zero mean and unit variance. A white noise vector v k is added to the output vector y k (see (1b)) and the SNR as defined below is kept constant for each k,

SNR = | | y k | | 2 | | v k | | 2 .

100 runs are conducted for each simulated scenario, where for each run the input and noise vectors are independently generated. The root mean square error (RMSE) for the estimates of the T = N + M(N + 1)L parameters in β is measured as

RMSE = 1 100 T r = 1 100 | | β ( r ) - β | | 2 2

where β(r) denotes the estimates in the r th run. In the simulation, δ = 0.5 is fixed for all the scenarios and the convex programming package from [20] is used in solving (16).

As mentioned in Section 2, the subspace approach identifies the system order (N) by examining the iL diagonal entries of Σ in (6), i.e., the singular values (λ1,..., λ iL ) of the system, where i is the number of block rows of the Hankel data matrices (see (2)) and L = 2 is the number of outputs. For SNR = 30 and i changing from 4 to 10, we obtain iL singular values for each particular choice of i. Figure 1 shows (λ1,..., λ iL , 0,..., 0) where zero padding is made for cases with i < 10.

Figure 1
figure 1

System singular values (System 1).

Checking for the number of principal (dominant) singular values clearly indicates that the system order is N = 4. This result verifies that the identification of system order is reliable regardless of the value of i as long as it is chosen to be larger than N. This feature is very attractive in practice since it is relatively easy to select an i which is larger than the largest possible value the system order might have.

Having identified the system order, we now follow the subsequent steps of the subspace and proposed LRL1 methods to estimate model coefficients. Figure 2 shows the RMSE of the subspace method when i changes from 5 to 10 at SNR = 30. It can be seen here that the performance of the subspace method depends on i in a non-monotonous fashion. That is, a larger i does not necessarily leads to better coefficient estimates. This is due to the fact that only finite datasets are available; increasing the number of rows of the data matrices will lead to a reduction in the number of columns. In the subsequent simulations, i = 9 is used.

Figure 2
figure 2

Estimation error of the subspace method (System 1).

Figure 3 compares the estimation errors of the subspace and LRL1 methods for different SNR values. The results show that LRL1 is more robust to noise in dealing with this sparse system.

Figure 3
figure 3

Performance comparison (System 1).

Further test is conducted by modifying System 1 to create an even sparser system. In particular, some of the transfer function coefficients of System 1 are set to zero, which results in

System 2:

a = ( 0 0 0  0 .5288 ) T ,
b 1 , 1 T b 2 , 1 T b 1 , 2 T b 2 , 2 T = 0 0 1 . 2080 0 - 4 . 0361 0 0 - 2 . 7061 0 0 0 0 . 0526 - 0 . 0239 0 0 0 0 0 0 . 1585 0 .

Figure 4 shows the performances of the two methods under the same conditions as in Figure 3. The results demonstrate that the improvement of LRL1 over the original subspace method increases when the system to be identified becomes sparser.

Figure 4
figure 4

Performance comparison (System 2).

5. Conclusion

A noise-robust algorithm for the identification of MIMO systems has been presented. The proposed method leverages reliable system order identification of subspace principle and exploits L1-norm optimization to achieve high effectiveness for identifying systems with sparse transfer function coefficients. While retaining good features of original subspace methods such as the convenience for multivariable systems, the proposed method is shown to be able to significantly improve estimation accuracy for sparse systems.


  1. Vanoverschee P, DeMoor B: Subspace Identification for Linear Systems: Theory-Implementation-Applications. 1996. Springer

    Book  Google Scholar 

  2. Young P: Recursive Estimation and Time-series Analysis: An Introduction. 1984. Springer

    Book  Google Scholar 

  3. Akaike H: A new look at the statistical model identification. IEEE Trans Autom Control 1974, 19(6):716-723.

    Article  MathSciNet  Google Scholar 

  4. Ouyang S, Hua Y: Bi-iterative least square method for subspace tracking. IEEE Trans Signal Process 2005, 53(8):2984-2996.

    Article  MathSciNet  Google Scholar 

  5. Qiu W, Skafidas E: Robust estimation of GCD with sparse coefficients. Signal Process 2010, 90(3):972-976.

    Article  Google Scholar 

  6. Andrews H, Patterson C: Singular value decompositions and digital image processing. IEEE Trans Acoust Speech Signal Process 2003, 24(1):26-53.

    Article  Google Scholar 

  7. Meraim KA, Qiu W, Hua Y: Blind system identification. Proc IEEE 1997, 85(8):1310-1322.

    Article  Google Scholar 

  8. Qiu W, Saleem SK, Pham M: Blind identification of multichannel systems driven by impulsive signals. Digital Signal Process 2010, 20(3):736-742.

    Article  Google Scholar 

  9. Candès E, Wakin M: An introduction to compressive sampling. IEEE Signal Process Mag 2008, 25(2):21-30.

    Article  Google Scholar 

  10. Cotter SF, Rao BD: Sparse channel estimation via matching pursuit with application to equalization. IEEE Trans Commun 2002, 50(3):374-377.

    Article  Google Scholar 

  11. Schreiber WF: Advanced television systems for terrestrial broadcasting: some problems and some proposed solutions. Proc IEEE 1995, 83: 958-981.

    Article  Google Scholar 

  12. Fevrier IJ, Gelfand SB, Fitz MP: Reduced complexity decision feedback equalization for multipath channels with large delay spreads. IEEE Trans Commun 1999, 47: 927-937.

    Article  Google Scholar 

  13. Ariyavisitakul S, Sollenberger NR, Greenstein LJ: Tap selectable decision-feedback equalization. IEEE Trans Commun 1997, 45: 1497-1500.

    Article  Google Scholar 

  14. Kocic M, Brady D, Stojanovic M: Sparse equalization for real-time digital underwater acoustic communications. In Proc OCEANS95. San Diego, CA; 1995:1417-1422.

    Google Scholar 

  15. Chen Y, Gu Y, Hero AO III: Sparse LMS for system identification. In Proceedings of ICASSP. Taipei, Taiwan; 2009:3125-3128.

    Google Scholar 

  16. Sharp M, Scaglione A: Application of sparse signal recovery to pilot-assisted channel estimation. In Proc of Intl Conf on Acoustics, Speech and Signal Proc. Las Vegas, NV; 2008.

    Google Scholar 

  17. Sharp M, Scaglione A: Estimation of sparse multipath channels. In Proceedings of IEEE Military Communications Conference. San Diego, CA; 2008:1-7.

    Google Scholar 

  18. Dayal BS, MacGregor JF: Multi-output process identification. J Process Control Elsevier Sci 1997, 7(4):269-282.

    Article  Google Scholar 

  19. Candès E, Tao T: The Dantzig selector: statistical estimation when p is much larger than n. Ann Stat 2005, 35: 2313-2351.

    Article  Google Scholar 

  20. Grant M, Boyd S: CVX: Matlab software for disciplined convex programming (web page and software).[]

Download references

Author information

Authors and Affiliations


Corresponding author

Correspondence to Wanzhi Qiu.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ original submitted files for images

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Qiu, W., Saleem, S.K. & Skafidas, E. Identification of MIMO systems with sparse transfer function coefficients. EURASIP J. Adv. Signal Process. 2012, 104 (2012).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: