Singular spectrum-based matrix completion for time series recovery and prediction
 Grigorios Tsagkatakis^{1},
 Baltasar Beferull-Lozano^{2} and
 Panagiotis Tsakalides^{1, 3}
https://doi.org/10.1186/s1363401603600
© Tsagkatakis et al. 2016
Received: 2 October 2015
Accepted: 4 May 2016
Published: 27 May 2016
Abstract
Big data, characterized by huge volumes of continuously varying streams of information, present formidable challenges in terms of acquisition, processing, and transmission, especially when one considers novel technology platforms such as the Internet-of-Things and Wireless Sensor Networks. Either by design or by physical limitations, a large number of measurements never reach the central processing stations, making the task of data analytics even more problematic. In this work, we propose Singular Spectrum Matrix Completion (SSMC), a novel approach for the simultaneous recovery of missing data and the prediction of future behavior in the absence of complete measurement sets. The goal is achieved via the solution of an efficient minimization problem which exploits the low-rank representation of the associated trajectory matrices when expressed in terms of appropriately designed dictionaries obtained by leveraging the theory of Singular Spectrum Analysis. Experimental results on real datasets demonstrate that the proposed scheme is well suited for the recovery and prediction of multiple time series, achieving lower estimation error compared to state-of-the-art schemes.
Keywords
1 Introduction
The dynamic nature of Big Data, a feature termed velocity, is a critical aspect of massive data streams from a signal processing viewpoint [1]. Due to the high velocity of the input streams, measurements may be missing with a high probability. This phenomenon can be attributed to three factors, namely: (a) intentionally collecting a subset of the measurements for efficiency purposes; (b) unintentional subsampling due to desynchronization; and (c) missing measurements due to communication errors including packet drops, outages, and congestion. To elaborate on these factors, we consider data streams associated with the Internet-of-Things (IoT) paradigm and we focus on Wireless Sensor Networks (WSNs) since WSNs can serve as an enabling platform for IoT applications [2, 3]. In the context of IoT/WSNs, one source of missing measurements is attributed to intentional subsampling, a scenario where the designer/operator reduces the sampling rate of the sensing infrastructure in order to increase the lifetime of the network. The relationship between sampling rate and lifetime is governed by the limited energy availability that typically characterizes WSNs. While efficient compression and aggregation schemes can be employed to reduce power consumption, reducing the number of measurements is the most efficient approach to achieve this goal [4].
Even when a specific sampling rate is selected, desynchronization between nodes inevitably leads to a reduction of the network-wide sampling rate, since nodes that were supposed to sample at the same time instance end up acquiring measurements at different instances [5]. This issue is also closely related to the quantization of the sampling time, as measurements that were collected in succession can be mapped to different sampling instances, introducing missing measurements for particular time slots. In addition to energy consumption and desynchronization, missing measurements can also be attributed to network outages and packet losses, which are frequent in WSNs deployed in harsh and cluttered environments, causing a large number of packets to fail in reaching their destination.

A novel efficient paradigm for estimating missing measurements which extends the recently developed framework of low-rank matrix recovery by exploiting inherent correlations without the need for explicit models.

The proposed SSMC scheme is an integrated approach for accurately predicting future values even when only a limited number of past measurements is available. This is a radical departure from traditional time series forecasting schemes, which assume the full availability of historical data.

The proposed scheme can naturally handle a single or multiple time series sources, extending traditional estimation approaches that operate strictly on either single or multi-source data.

The performance of the proposed method against state-of-the-art techniques is evaluated on real data acquired by a distributed sensor network, which serves as an illustrative example of a Big Data application.
The rest of the paper is organized as follows: Section 2 presents an overview of state-of-the-art methods for energy-efficient data collection. Sections 3 and 4 provide the description of the two theoretical models we consider in this work, namely time series modeling via Singular Spectrum Analysis and missing measurement estimation via the Matrix Completion framework. Section 5 introduces SSMC, our proposed recovery and prediction method, including the mathematical formulation as well as an efficient optimization approach based on Augmented Lagrange Multipliers. The performance of the proposed scheme is experimentally validated against state-of-the-art methods in Section 6 and the paper concludes in Section 7.
2 Related work
Designing efficient techniques for minimizing the cost of continuous data collection by exploiting data correlations has been extensively studied from multiple aspects and different perspectives in the context of WSNs [8]. Jindal and Psounis [8] presented a method for inferring the spatial correlation of WSN data and for generating synthetic data using a statistical tool called the variogram. Estimating the sampling field at a given location, based on the available sensor data at other additional locations, is a common approach for energy-efficient sampling. Data imputation and interpolation techniques, such as Nearest Neighbors Imputation and Kriging, are two very efficient schemes for estimating unavailable data [9]. While in interpolation one seeks the value of the field in a location where no sensors are present, imputation approaches try to estimate the value at the sensor location at a time instance where sampling did not take place. Kriging relies on the semivariogram, a statistical tool developed by geostatisticians [10], in order to estimate the value of a field at a specific location, given prior knowledge about the inherent correlations of data from neighboring nodes. In k-Nearest Neighbors, this objective is reached by using a weighted nearest neighbor interpolation, where the weight corresponding to each sample is based on statistical information indicating the degree of spatial dependence in the field [11].
Another line of work for data imputation exploits probabilistic models for estimating the missing entries. In [12], an Expectation Maximization (EM) algorithm is presented which estimates the parameters of the probability distribution of the data by iteratively maximizing the likelihood of the available data as a function of these parameters. The authors also proposed the regularized EM (RegEM), where a regularization term is added during the inversion of the correlation matrix in order to increase the robustness of the algorithm when more variables are present than data records. RegEM is currently one of the state-of-the-art data imputation techniques, and its performance is compared against the proposed and other schemes in the experimental section.
Data compression has also been extensively explored in the context of energy-efficient data collection in WSNs, based on the premise that data processing is less demanding in terms of energy consumption compared to transmission; hence, energy reduction can be achieved. For example, the recently proposed framework of Compressed Sensing (CS), a state-of-the-art signal sampling and compression scheme, was investigated for WSN data acquisition and aggregation [13, 14], exploiting the sparsity of the sampled data when expressed in an appropriate basis [15]. Distributed compression schemes such as Distributed Source Coding [16] have also been proposed for compressing WSN measurements in densely deployed networks, since utilizing side information from neighboring nodes can dramatically reduce communication cost. The sparse characteristics of correlated datasets have also been recently considered for transmission of EEG signals [17, 18]. Although sparsity and CS-based methods can dramatically reduce transmission power, in these scenarios the signals are typically first fully sampled and then compressed.
While the CS framework requires a particular form of sampling (incoherent sampling), the related paradigm of low-rank matrix recovery, Matrix Completion (MC), assumes a random sampling of the matrix entries. Due to the intuitive sampling, the MC framework has been considered for a variety of signal recovery problems including collaborative spectrum sensing [19], sensor localization [20, 21], and image reconstruction problems [22, 23] among others. MC has been recently explored as a sampling scheme for WSNs [24–27]. In [24], the authors investigated the scenario where sensors lie on a uniform rectangular grid and random subsampling is taking place by each sensor. Our work bears some similarities with this line of work; however, we do not pose specific deployment constraints and we allow the sensors to occupy any location in the sensed region. Furthermore, our work differs significantly in the exploitation of prior knowledge in the form of a dictionary, which is utilized during the reconstruction stage. The utilization of the singular spectrum dictionary allows for the incorporation of prior knowledge regarding the data generation process, which can significantly improve the reconstruction performance [25]. Furthermore, the proposed scheme is able to predict future measurements in addition to estimating missing past ones.
Low-rank recovery was also recently considered in [28], where the authors employ MC for the recovery of undersampled correlated EEG signals. Our work in this paper investigates different extensions of MC-based recovery by considering trajectory matrices and singular spectrum dictionaries. We develop a generative model where the sampled data can be jointly represented as a low-rank linear combination of dictionary elements, spanning the subspace in which the data lie. A similar situation was recently explored, leading to the low-rank representations (LRR) framework [29], where the objective is to identify a low-rank matrix which can accurately represent the source data. LRR has been considered for subspace clustering problems [30]; however, only fully populated matrices were considered.
In the context of Big Data, matrix and tensor data recovery via an online rank minimization process [31] was recently proposed for scalable imputation of missing data. This was achieved by low-dimensional subspace tracking through the minimization of a weighted least squares regression, regularized with a nuclear norm. While this work bears resemblance to our work, our generative model does not require a fixed bilinear factorization due to a prespecified rank, while it exploits the subspace identified by the SSA for simultaneous missing past measurement imputation and future predictions.
3 Analysis of time series data
Singular Spectrum Analysis (SSA) is a model-free method for time series analysis and forecasting which has been widely exploited in the analysis of environmental, economic, and computer network data [32, 33]. The basic assumption underlying SSA is that one can approximate a time series \(\mathbb {M}_{i}\) of length K from L lagged samples, by considering the spectral analysis of specialized matrices, called trajectory matrices. Embedding at sampling instance T, the first step of SSA, involves the process of generating a trajectory matrix \({\mathbf M}_{i}=\{\mathbf {m}_{i,t} \vert t=T-L:T\} \in \mathbb {R}^{K \times L}\) of lag L measurement vectors, where each vector \(\mathbf {m}_{i,t}=\{m_{i,t'} \vert t'=t-K:t \}\) encodes the measurements corresponding to a sampling window of length K for sensor i. The length K of the time window and the lag L are two critical parameters encoding important aspects of the underlying data.
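The embedding step can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function name `trajectory_matrix` is hypothetical, and it assumes that the L columns are consecutive windows of K samples, each shifted by one sample, ending at the most recent measurement.

```python
import numpy as np

def trajectory_matrix(series, K, L):
    """Build a K x L trajectory (Hankel) matrix from the latest samples.

    Assumption: column j is a window of K consecutive samples and each
    column is shifted by one sample with respect to the previous one.
    """
    series = np.asarray(series, dtype=float)
    if series.size < K + L - 1:
        raise ValueError("need at least K + L - 1 samples")
    # Keep only the most recent K + L - 1 samples, so the last window
    # ends at the current sampling instance.
    recent = series[-(K + L - 1):]
    return np.column_stack([recent[j:j + K] for j in range(L)])

# Toy series 0, 1, ..., 9 embedded with window length 4 and lag 3.
M = trajectory_matrix(np.arange(10), K=4, L=3)
```

Because every anti-diagonal of the result repeats the same sample, the matrix has the Hankel structure that the later diagonal averaging step relies on.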
In SSA, once the trajectory matrix of the time series has been generated, the subsequent step involves the spectral analysis of the lag-covariance matrix. Formally, given the matrix \({\mathbf M}_{i}\), the lag-covariance matrix defined as \({\mathbf C}_{i}={\mathbf M}_{i}{\mathbf M}_{i}^{T}\) can be used for extracting the eigenvectors of \({\mathbf C}_{i}\), which define an L-dimensional subspace where the time series \(\mathbb {M}_{i}\) resides, while the associated eigenvalues encode the variance along the direction of the associated eigenvector. Alternatively, one can apply the SVD to the original trajectory matrix \({\mathbf M}_{i}\), in which case the outputs are two matrices U and V containing the left and right singular vectors, respectively, and a diagonal matrix Σ containing the singular values. Given the SVD, the trajectory matrix \({\mathbf M}_{i}\) can be expressed as the sum of rank-1 matrices given by \({\mathbf M}_{i}=\sum _{j} \sqrt {\lambda _{j}}\mathbf {u}_{j} \mathbf {v}_{j}^{T}\), where each collection \((\lambda _{j},\mathbf {u}_{j},\mathbf {v}_{j})\) is called an eigentriple.
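The link between the two routes (eigen-decomposition of the lag-covariance matrix versus SVD of the trajectory matrix) can be checked numerically. The snippet below is a minimal sketch on a random stand-in matrix, not on real trajectory data; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 8))            # stand-in for a K x L trajectory matrix

# Route 1: SVD of the trajectory matrix (singular values in decreasing order).
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Route 2: eigenvalues of the lag-covariance matrix C = M M^T.
C = M @ M.T
eigvals = np.linalg.eigvalsh(C)[::-1]      # eigvalsh is ascending, so reverse

# The eigenvalues are the squared singular values, and the sum of the
# rank-1 terms sqrt(lambda_j) u_j v_j^T rebuilds the trajectory matrix.
M_rebuilt = sum(np.sqrt(lam) * np.outer(U[:, j], Vt[j])
                for j, lam in enumerate(eigvals))
```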
Given the eigenvectors extracted via the SSA, one can project and reconstruct the time series or perform prediction by employing two steps, eigentriple grouping and diagonal averaging. Eigentriple grouping aims at arranging the eigentriples in sets in order to separate additive components that are exactly or approximately separable, facilitating the analysis of the eigenvectors. Diagonal averaging aims at translating the recovered trajectory matrix into a time series by averaging the entries m^{∗}[i,j] that lie on the same anti-diagonal (constant i+j),
where m^{∗}[i,j]=m[i,j] for L<K and m^{∗}[i,j]=m[j,i] otherwise.
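Diagonal averaging can be sketched in a few lines of NumPy. This follows the standard SSA definition (the mean of all entries sharing the same i+j); the function name `diagonal_average` is hypothetical, and the code works directly on the matrix, so the transposition convention m^{∗} is not needed.

```python
import numpy as np

def diagonal_average(X):
    """Average each anti-diagonal of a K x L matrix into one sample.

    Returns a series of length K + L - 1 whose n-th value is the mean of
    all entries X[i, j] with i + j == n.
    """
    X = np.asarray(X, dtype=float)
    K, L = X.shape
    totals = np.zeros(K + L - 1)
    counts = np.zeros(K + L - 1)
    for i in range(K):
        # Row i contributes its L entries to positions i, ..., i + L - 1.
        totals[i:i + L] += X[i]
        counts[i:i + L] += 1
    return totals / counts

series = diagonal_average([[0, 1, 2], [1, 2, 3]])   # exact Hankel input
smoothed = diagonal_average([[1, 2], [3, 4]])       # non-Hankel input
```

For an exact Hankel matrix the operation simply reads the series back; for a recovered matrix that is only approximately Hankel, it returns the closest series in the least-squares sense.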
It is worth noting that SSA has also been considered in situations where a number of measurements are missing [35, 36]. A straightforward approach, also employed here, is to estimate the eigenvectors and eigenvalues using only the available measurements during the lag-covariance matrix generation [34]. The methods in [35, 36] differ from our work in that we exploit prior knowledge in the form of a dictionary. Furthermore, the proposed scheme is able to perform missing value estimation, either past or future, while there is no constraint associated with the structure of the missing measurements.
In addition to the analysis of time series, SSA can also be used as a forecasting mechanism. In recurrent forecasting SSA, the time series of known measurements and unknown components is transformed to its Hankel form and the linear recurrent relation coefficients are utilized for forecasting the future values. While typical SSA considers the trajectory matrices associated with a single time series, the Multivariate Singular Spectrum Analysis (MSSA) method has been proposed for handling multiple time series [37–39]. In this work, we consider a simple extension of SSA where instead of analyzing a single trajectory matrix, we consider a compound trajectory matrix generated by the concatenation of S individual matrices, i.e., \(\mathbf {M}=\;[\mathbf {M}_{1}, \mathbf {M}_{2}, \ldots, \mathbf {M}_{S}] \in \mathbb {R}^{S(K \times L)}\). Introducing multiple sources of data can have a dramatic impact on performance, as will be shown in the experimental results, with at most a linear increase in computational complexity.
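As a small illustration of the compound trajectory matrix, the sketch below stacks S per-sensor matrices side by side. The direction of concatenation (along columns, giving a K × SL matrix) is an assumption made here for illustration, and `compound_trajectory` is a hypothetical helper name.

```python
import numpy as np

def compound_trajectory(matrices):
    """Concatenate S trajectory matrices of identical shape column-wise.

    Assumption: sources are placed next to each other, so S matrices of
    size K x L yield one K x (S * L) matrix.
    """
    matrices = [np.asarray(m, dtype=float) for m in matrices]
    if len({m.shape for m in matrices}) != 1:
        raise ValueError("all trajectory matrices must share the same shape")
    return np.hstack(matrices)

rng = np.random.default_rng(0)
per_sensor = [rng.standard_normal((4, 5)) for _ in range(3)]   # S = 3 sources
M_compound = compound_trajectory(per_sensor)
```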
4 Low-rank matrix completion
where ε is the approximation error, related to the noise power. By utilizing the SVD \(\mathbf {M}=\mathbf {U} \mathbf {S} \mathbf {V}^{T}\), a low-rank approximation matrix X can be found by \({\mathbf {X}}=\mathbf {U} \mathcal {D}_{\tau }(\mathbf {S}) \mathbf {V}^{T}\), where \(\mathcal {D}_{\tau }(\mathbf {S})=\mathrm {diag}([\sigma _{i}(\mathbf {S})-\tau ]_{+})\) is a thresholding operator that keeps only the elements of the diagonal matrix S with values greater than τ, reduced by τ, and sets the rest to zero. The effect of this process is that only a small number of singular values are kept for the low-rank approximation X of M.
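The thresholding operator \(\mathcal {D}_{\tau }\) translates almost directly into code. The following is a minimal sketch; the function name `svt` and the toy input are illustrative choices.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: X = U diag([sigma_i - tau]_+) V^T."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)    # [sigma_i - tau]_+
    return (U * s_shrunk) @ Vt             # scales column j of U by s_shrunk[j]

# A diagonal toy matrix with singular values 3 and 1, thresholded at tau = 2:
# the larger value shrinks to 1 and the smaller one is set to zero.
X = svt(np.diag([3.0, 1.0]), tau=2.0)
```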
where Ω is the sampling set. In the context of WSN for example, the set Ω specifies the collection of sensors that are active at each specific sampling instance. In general, to solve the MC problem, the sampling operator \(\mathcal {P}\) must satisfy the modified restricted isometry property, which is the case when uniform random sparse sampling is employed in both rows and columns of matrix M [42]. The incoherence of sampling introduced by \(\mathcal {P}\) with respect to M guarantees that recovery is possible from a limited number of measurements.
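One common way to represent the sampling set Ω in code is a Boolean mask of the same shape as M. The sketch below assumes entries are kept independently with a fixed probability, which is one instance of uniform random sampling; the helper name `sample` is hypothetical.

```python
import numpy as np

def sample(M, mask):
    """Apply the sampling operator: keep entries in Omega, zero elsewhere."""
    return np.where(mask, M, 0.0)

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 4))
mask = rng.random(M.shape) < 0.5      # True marks an entry that belongs to Omega
observed = sample(M, mask)
```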
where the nuclear norm is defined as \(\|{\mathbf X}\|_{*}=\sum _{i} |\sigma _{i}|\), i.e., the sum of absolute values of the singular values. Candès and Tao showed that under certain conditions the nuclear norm minimization in Eq. (5) can estimate the same matrix as the rank minimization in Eq. (3) with high probability provided \(q \geq C K^{6/5} r \log (K)\) randomly selected entries of the rank r matrix are acquired [7] (assuming K≥L).
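For reference, the nuclear norm is straightforward to evaluate numerically. The sketch below computes it from the singular values and compares it with NumPy's built-in `ord="nuc"` matrix norm; the helper name `nuclear_norm` is illustrative.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of the singular values of X."""
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

X = np.diag([3.0, 4.0])                    # singular values are 4 and 3
value = nuclear_norm(X)
builtin = float(np.linalg.norm(X, ord="nuc"))
```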
To solve the nuclear norm minimization problem, various approaches have been proposed including Singular Value Thresholding [43] and the Augmented Lagrange Multipliers [44], among others. We review the technique based on the ALM due to its exceptional performance in terms of both processing complexity and reconstruction accuracy and since it is used as a basis for the extended scheme we discuss next.
where Y is the Lagrange multiplier matrix associated with the first equality constraint and μ is the penalty parameter. Minimization of the problem in Eq. (7) involves an iterative process, where a sequential minimization over all variables, i.e., X, E, and Y, takes place at each iteration. This method of iteratively minimizing over each variable is referred to as the Alternating Directions Method of Multipliers (ADMM) [45, 46].
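To make the alternating updates concrete, here is a minimal sketch of a generic inexact ALM scheme for matrix completion. It is not necessarily the exact formulation of Eq. (7): it assumes the commonly used splitting X + E = P_Ω(M) with E forced to zero on the observed entries, and the function names, the initial value of μ, the growth factor `rho`, and the stopping rule are illustrative assumptions.

```python
import numpy as np

def svt(A, tau):
    """Singular value soft-thresholding."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def alm_complete(M_obs, mask, rho=1.2, iters=500, tol=1e-7):
    """Generic inexact-ALM matrix completion (illustrative sketch).

    Assumed splitting: minimize ||X||_* subject to X + E = D, where D holds
    the observed entries (zero elsewhere) and E is zero on observed entries.
    """
    D = np.where(mask, M_obs, 0.0)
    X = np.zeros_like(D)
    E = np.zeros_like(D)
    Y = np.zeros_like(D)
    mu = 1.0 / np.linalg.norm(D, 2)        # assumed initial penalty
    for _ in range(iters):
        X = svt(D - E + Y / mu, 1.0 / mu)              # update low-rank estimate
        E = np.where(mask, 0.0, D - X + Y / mu)        # free only off the sampling set
        R = D - X - E                                  # constraint residual
        Y = Y + mu * R                                 # multiplier update
        mu = min(mu * rho, 1e7)                        # gradually increase the penalty
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return X

# Synthetic check: rank-2 matrix with roughly 70 % of its entries observed.
rng = np.random.default_rng(0)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(M_true.shape) < 0.7
X_hat = alm_complete(M_true, mask)
rel_err = np.linalg.norm(X_hat - M_true) / np.linalg.norm(M_true)
```

The X update has a closed form through singular value thresholding, which is why each iteration costs essentially one SVD.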
One of the key characteristics of MC is the minimal conditions that are imposed for successful recovery, namely the incoherence of sampling and the low rank of the recovered matrix. While a minimal set of requirements is beneficial in situations where limited prior information is available, when such information exists, introducing additional constraints can lead to a significantly better recovery. In this section, we exploit the temporal dynamics that time series exhibit in order to enhance typical MC with an additional dictionary which encodes past behavior in a proposed SSMC framework.
5 The SSMC algorithm
We consider the truncated trajectory matrices M formed by concatenating the individual trajectory matrices according to the MSSA approach. The objective of this work is to consider a generative model that produces the time series Hankel matrices M according to the factorization M = D L where M may correspond to a single or multiple sources. In both cases, our key assumption is that given a full rank dictionary matrix D obtained through training data, the coefficient matrix L is approximately low rank, i.e., the number of significant singular vectors is much smaller than the ambient dimensions of the matrix.
5.1 Efficient optimization
A key novelty of our work is that in addition to the low rank of the matrix, during the recovery, we employ a dictionary for modeling the generative process that produces the sensed data, as can be seen in Eq. (9).
where the notation \(\mathcal {P}_{\not {\Omega }}\) is used to restrict the error estimation only on the measurements that do not belong to the sampling set. Last, we perform updates on the two Lagrange multipliers Y _{1} and Y _{2}. The steps at each iteration of the optimization are shown in Algorithm 1.
Due to its numerous applications, the ADMM method has been extensively studied in the literature for the case of two variables [45, 46], where it has been shown that under mild conditions regarding the convexity of the cost functions, the two-variable ADMM converges at a rate \(\mathcal {O}(1/r)\) [49]. Although convergence has not been established in general for a larger number of variables, the convergence properties of ADMM for a sum of two or more nonsmooth convex separable functions subject to linear constraints were recently examined [50].
The proposed minimization scheme in Eq. (11) satisfies a large number of the constraints suggested in [50] such as the convexity of each subproblem, the strict convexity and continuous differentiability of the nuclear norm, the full rank of the dictionary, and the size of the step for the dual update α, while empirical evidence suggests that the closed form solution of each subproblem allows the SSMC algorithm to converge to an accurate solution in a small number of iterations.
5.2 Singular spectrum dictionary
In this work, we investigate the utilization of prior knowledge for the efficient reconstruction of severely undersampled time series data. To model the data, we follow a generative scheme where the full collection of acquired measurements is encoded in the trajectory matrix \(\mathbf {M} \in \mathbb {R}^{K \times L}\). M is assumed to be generated from a combination of a dictionary \(\mathbf {D} \in \mathbb {R}^{K \times K}\) and a coefficient matrix \(\mathbf {L} \in \mathbb {R}^{K \times L}\) according to M = D L, where we assume that K≤L. This particular factorization is related to SVD by \(\mathbf {M}=\mathbf {D}\mathbf {L}=\mathbf {U}(\mathbf {S}\mathbf {V}^{T})\), where the orthonormal matrix D = U is a basis for the subspace associated with the column space of M, while \(\mathbf {L}=\mathbf {S}\mathbf {V}^{T}\) is a low-rank representation matrix encoding the projection of the trajectory matrix onto this subspace.
This particular choice of dictionary D implies a specific relationship between the spectral characteristics of the trajectory matrix M and the low-rank representation matrix L. To understand this relationship, we consider the spectral decomposition of each individual matrix in the form \(\mathbf {D}=\mathbf {U}\mathbf {G}_{1}\mathbf {R}^{-1}\) and \(\mathbf {L}=\mathbf {R}\mathbf {G}_{2}\mathbf {V}^{*}\). The matrices U, R, and V are unitary, while \(\mathbf {G}_{1}\) and \(\mathbf {G}_{2}\) are diagonal matrices containing the singular values of D and L, respectively. This particular factorization permits us to utilize the product SVD [51, 52] and express the singular value decomposition of the product according to the expression \(\mathbf {D}\mathbf {L}=\mathbf {U}(\mathbf {G}_{1}\mathbf {G}_{2})\mathbf {V}^{*}\), where the singular values of the matrix product are given by the product of the singular values of the corresponding matrices.
In this work, we consider orthogonal dictionaries, as opposed to overcomplete ones. Orthogonality of the dictionary guarantees that the vectors encoded in the dictionary span the low-dimensional subspace and therefore the representation of the measurements is possible. Furthermore, an orthonormal dictionary, such as the one considered in this work, is characterized by \(\mathbf {G}_{1}=\mathbf {I}\), leaving \(\mathbf {G}_{2}\) responsible for the representation. We target exactly \(\mathbf {G}_{2}\) in our problem formulation by seeking a low-rank representation matrix L.
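The factorization M = D L with an orthonormal dictionary can be verified on synthetic data. The sketch below uses a random rank-2 matrix as a stand-in for fully sampled training data; the variable names (`M_train`, `L_coef`) and the sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, L, r = 6, 10, 2
# Stand-in for a fully sampled training trajectory matrix of rank r.
M_train = rng.standard_normal((K, r)) @ rng.standard_normal((r, L))

# Dictionary: the K x K matrix of left singular vectors (orthonormal).
U, s, Vt = np.linalg.svd(M_train, full_matrices=True)
D = U

# Coefficient matrix obtained by projecting onto the dictionary.
L_coef = D.T @ M_train

# With an orthonormal D (G_1 = I), L_coef carries the singular values of M.
s_coef = np.linalg.svd(L_coef, compute_uv=False)
```

Since D is orthonormal, the rank and the singular values of the coefficient matrix coincide with those of the trajectory matrix, which is the property that the low-rank constraint on L exploits.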
In our experimental results, we consider sets of training data associated with fully sampled time series from the first days of each experiment for generating the dictionaries. The subspace identified by the fully sampled data is used for the subsequent recovery of past measurements and prediction of future ones. Alternatively, the dictionary could be updated during the course of the SSMC application via an incremental subspace learning method [53, 54]. We opted against incremental subspace learning since, although it can potentially lead to better estimation, it is also associated with an increased computational load and a higher probability of estimation drift, which can lower performance.
5.3 Networking aspects of SSMC
In the context of IoT applications utilizing WSN infrastructures, communication can take place among nodes, but most typically between the nodes and the base station where data analytics are extracted. This communication can be supported (a) by a direct wireless link between the nodes and the sink/base station; (b) via appropriate paths that allow multi-hop communications; or (c) via more powerful cluster heads that forward the measurements to the base station.
For the multi-hop scheme, equal weight of each sample (democratic sampling) implies that no complicated processing needs to take place by the resource-limited forwarding nodes. Furthermore, for high-performance WSNs, where point-to-point communication between nodes is available and processing capabilities are sufficient, nodes could perform reconstruction of a local neighborhood, thus offering advantages similar to other distributed estimation schemes [55].
From a practical point of view, we argue that recovery and prediction of measurements from low sampling rates offer numerous advantages. First, it saves energy by reducing the number of samples that have to be acquired, processed, and communicated, thus increasing the lifetime of the network. The proposed sampling scheme also reduces the frequency of sensor recalibrations for sensors that perform complex signal acquisition, including chemical and biological sampling. As a result, higher quality measurements and therefore more reliable estimation of the field samples can be achieved. Furthermore, the method increases robustness to communication errors by estimating measurements included in lost or dropped packets, without the need for retransmission. Last, our scheme does not require explicit knowledge of node locations for the estimation of the missing measurements, since the incomplete measurement matrices and the corresponding trajectory matrices are indexed by the sensor id, thus allowing greater flexibility during deployment.
6 Experimental results
To evaluate the performance of the proposed low-rank reconstruction and prediction scheme, we consider real data from the Intel Berkeley Research Lab dataset^{1} [56] and the SensorScope Grand St-Bernard dataset^{2} [57]. The former dataset contains the recordings of 54 multimodal sensors located in an indoor environment over a 1-month period, while the latter contains multimodal measurements from 23 stations deployed at the Grand-St-Bernard pass between Switzerland and Italy.
In both cases, we analyze temperature measurements as an exemplary modality, while we exclude failed sensors from the recovery process. Unless stated otherwise, in all cases, we fix the SSA parameters, K=50 and L=100, and we train using a single day’s worth of data while testing on the five consecutive ones. The threshold τ for the singular value thresholding operator is set to preserve 90 % of the signals’ energy, while the parameter μ was set to 0.01 through a validation process, although the specific value had a minimal impact on performance.
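One plausible way to turn the 90 % energy criterion into a concrete choice is to count how many leading singular values are needed to reach that fraction. The sketch below assumes that energy means the sum of squared singular values, which the text does not state explicitly; `rank_for_energy` is a hypothetical helper.

```python
import numpy as np

def rank_for_energy(singular_values, fraction=0.9):
    """Smallest number of leading singular values reaching the energy fraction.

    Assumption: energy is the sum of squared singular values, given in
    decreasing order.
    """
    s = np.asarray(singular_values, dtype=float)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, s.size)                  # guard against rounding at fraction = 1

# Squared values 9, 1, 0.25, 0.01: the first value holds about 88 % of the
# energy and the first two about 97 %, so two values are kept.
k = rank_for_energy([3.0, 1.0, 0.5, 0.1])
```

Any threshold τ lying between the last kept and the first discarded singular value would then retain exactly these k components.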
To evaluate the performance, we consider three state-of-the-art methods and we compare them to the proposed SSMC. More specifically, we evaluate the performance of the ADMM version of MC [44], the Knn-imputation [58], and the RegEM [12]. The reconstruction error is measured by the normalized mean squared error between the true M and the estimated X trajectory matrices given by \(\frac {\sum \|\mathbf {M}-\mathbf {X}\|^{2}}{\sum \|\mathbf {M}\|^{2}}\).
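The error measure can be computed with a one-line helper. This is a minimal sketch in which the sums run over all matrix entries; the function name `nmse` and the toy matrices are illustrative.

```python
import numpy as np

def nmse(M, X):
    """Normalized mean squared error between true M and estimate X."""
    M = np.asarray(M, dtype=float)
    X = np.asarray(X, dtype=float)
    return float(np.sum((M - X) ** 2) / np.sum(M ** 2))

M_true = np.array([[1.0, 2.0], [3.0, 4.0]])
perfect = nmse(M_true, M_true)                   # exact recovery
trivial = nmse(M_true, np.zeros_like(M_true))    # all-zero estimate
```

An exact recovery gives an error of 0, while the trivial all-zero estimate gives an error of 1, which provides a natural scale for reading the reported values.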
6.1 Recovery with respect to measurement availability
The objective of this subsection is to present the recovery capabilities of the proposed SSMC and state-of-the-art methods with respect to the availability of measurements, i.e., the sampling rate.
Comparing the four methods, we observe that under all measurement availability scenarios, the proposed SSMC scheme typically achieves the lowest reconstruction error and exhibits the most stable performance. The performance of SSMC is closely followed, especially at low sampling rates, by RegEM, which also exhibits a very stable performance, while on the other hand, MC and Knn-impute are more sensitive to the sampling instance, exhibiting a more erratic behavior.
Regarding the performance on the SensorScope data, one can observe that in this case RegEM achieves a significantly better performance compared to the other methods, followed by MC at low sampling rates and SSMC at high ones. Similar to the behavior observed for the Intel-Berkeley data, MC again reaches a performance plateau while the other methods achieve a monotonically decreasing reconstruction error. Note that although RegEM achieves the lowest reconstruction error, it is also the most computationally demanding of the four methods.
6.2 Recovery from multiple sources
State-of-the-art methods, like Knn-impute and RegEM, not only appear to be unable to exploit the additional sources of data, but introducing the additional sources leads to an increase in reconstruction error for a given sampling rate. On the other hand, typical MC is unaffected by the different scenarios, exhibiting the same plateau in behavior regardless of the number of sources under consideration. Unlike the other methods, the proposed SSMC is able to better handle the additional data. Although applying SSMC with multiple sources of data does not lead to better performance, the proposed method is better at handling such complex data streams, offering the lowest reconstruction error among all methods considered.
In general, for the state-of-the-art methods we consider, experimental results suggest that introducing multiple correlated sources does not necessarily aid in the recovery performance, while under different scenarios, the aggregation of multiple sources may also introduce prohibitively large communication overheads. On the other hand, the proposed SSMC can smoothly transition from the single sensor/source case to multiple sensors/sources, achieving compelling gains in certain scenarios.
6.3 Joint recovery and prediction
6.4 Performance with respect to computational resources
The results reported in the previous subsections assume that a single day’s worth of data is utilized during the training phase where the dictionary D is obtained. Here, we investigate the recovery capability of the proposed SSMC method as a function of the amount of training data, i.e., the number of days used for training.
This aspect is critical since we assume that the training data is fully populated without any missing measurements. To achieve the acquisition of such training data requires extra care in terms of communication robustness as well as a larger energy consumption due to full sampling.
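Table 1 Computational time for different numbers of sensors and measurement availability
Measurement availability | 25 % | 25 % | 50 % | 50 % | 75 % | 75 %
Number of sensors | 1 | 5 | 1 | 5 | 1 | 5
SSMC | 0.188 | 0.950 | 0.137 | 0.719 | 0.087 | 0.358
MC | 0.101 | 0.140 | 0.101 | 0.146 | 0.103 | 0.152
RegEM | 0.092 | 0.137 | 0.098 | 0.407 | 0.154 | 1.194
Knn | 0.153 | 0.866 | 0.102 | 0.632 | 0.051 | 0.275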
Table 1 clearly demonstrates the relationship between the processing time of each method and the sampling rate. For the proposed SSMC method, increasing the sampling rate leads to lower processing time for both the single and the multiple source cases. On the other hand, MC requires a fixed processing time independently of the number of available measurements, while the effect of the number of sources is minimal. RegEM’s processing time increases as the number of available measurements increases, due to the inner mechanics of the algorithm, which require multiple regressions to take place. Last, the Knn-impute method exhibits a decrease in processing time with respect to the measurement availability and an increase associated with multiple sources. Overall, the proposed SSMC exhibits a stable and predictable performance, achieving a very good trade-off between processing requirements and reconstruction quality.
7 Conclusions
Acquiring, transmitting, and processing Big Data presents numerous challenges due to the complexity and volume issues among others. The situation becomes even more complicated when one considers data sources associated with the Internet-of-Things paradigm, where component and architecture limitations, including processing capabilities, energy availability, and communication failures, must also be considered. In this work, we proposed a distributed sampling-centralized recovery scheme where due to various design choices and physical constraints, only a small subset of the entire set of measurements is collected during each sampling instance. The proposed SSMC approach exploits the low-rank representation of appropriately generated trajectory matrices, when expressed in the subspace associated with dictionaries learned using training data, in order to recover missing measurements as well as predict future values. The recovery and prediction procedures are implemented via an efficient optimization based on the augmented Lagrange multipliers method. Experimental results on real data from the Intel-Berkeley and the SensorScope datasets validate the merits of the proposed scheme compared to state-of-the-art methods like typical matrix completion, RegEM, and Knn-imputation, both in terms of pure reconstruction as well as in the demanding case of simultaneous recovery and prediction.
Declarations
Acknowledgements
This work was funded by the DEDALE project (contract no. 665044) within the H2020 Framework Program of the EC. This work was also supported by the PETROMAKS SmartRig (grant 244205/E30) and SFI Offshore Mechatronics (grant 237896/O30), both from the Research Council of Norway, and by the RFF Agder UiA CIEMCoE grant.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
K Slavakis, G Giannakis, G Mateos, IEEE Signal Proc. Mag. 31(5), 18 (2014).
J Gubbi, R Buyya, S Marusic, M Palaniswami, Futur. Gener. Comput. Syst. 29(7), 1645 (2013).
DB Rawat, JJ Rodrigues, I Stojmenovic, Cyber-Physical Systems: From Theory to Practice (CRC Press, Boca Raton, 2015).
G Tzagkarakis, G Tsagkatakis, D Alonso, E Celada, C Asensio, A Panousopoulou, P Tsakalides, B Beferull-Lozano, in Cyber-Physical Systems: From Theory to Practice, ed. by DB Rawat, J Rodrigues, I Stojmenovic (CRC Press, USA, 2015).
F Sivrikaya, B Yener, IEEE Netw. 18(4), 45 (2004).
EJ Candès, B Recht, Found. Comput. Math. 9(6), 717 (2009).
EJ Candès, T Tao, IEEE Trans. Inf. Theory. 56(5), 2053 (2010).
A Jindal, K Psounis, ACM Trans. Sens. Netw. (TOSN). 2(4), 466 (2006).
GE Batista, MC Monard, Appl. Artif. Intell. 17(5–6), 519 (2003).
N Cressie, Terra Nova. 4(5), 613 (1992).
J Li, A Heap, Ecol. Informa. 6(3), 228 (2011).
T Schneider, J. Clim. 14(5), 853 (2001).
C Luo, F Wu, J Sun, C Chen, in International conference on Mobile computing and networking (ACM, Beijing, China, 2009).
C Luo, F Wu, J Sun, CW Chen, IEEE Trans. Wirel. Commun. 9(12), 3728 (2010).
A Fragkiadakis, I Askoxylakis, E Tragos, in International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (IEEE, 2013).
Z Xiong, A Liveris, S Cheng, IEEE Signal Proc. Mag. 21(5), 80 (2004).
A Majumdar, RK Ward, Biomed. Signal Process. Control. 13, 142 (2014).
A Shukla, A Majumdar, Biomed. Signal Process. Control. 18, 174 (2015).
JJ Meng, W Yin, H Li, E Houssain, Z Han, in Acoustics Speech and Signal Processing (ICASSP) 2010 IEEE International Conference on (IEEE, Dallas, 2010).
S Nikitaki, G Tsagkatakis, P Tsakalides, IEEE Trans. Mob. Comput. 14(11), 2244 (2015).
S Nikitaki, G Tsagkatakis, P Tsakalides, in Signal Processing Conference (EUSIPCO) 2012 Proceedings of the 20th European (IEEE, Bucharest, 2012).
PJ Shin, PE Larson, MA Ohliger, M Elad, JM Pauly, DB Vigneron, M Lustig, Magn. Reson. Med. 72(4), 959 (2014).
G Tsagkatakis, P Tsakalides, in Machine Learning for Signal Processing (MLSP) 2012 IEEE International Workshop on (IEEE, Santander, 2012).
A Majumdar, R Ward, in Data Compression Conference, 2010 (IEEE, Snowbird, 2010).
G Tsagkatakis, P Tsakalides, in Sensor Array and Multichannel Signal Processing Workshop (SAM) (IEEE, Hoboken, 2012).
F Fazel, M Fazel, M Stojanovic, in Information Theory and Applications Workshop (ITA) (IEEE, San Diego, 2012).
S Savvaki, G Tsagkatakis, P Tsakalides, in ACM International Workshop on Cyber-Physical Systems for Smart Water Networks (ACM, New York, 2015).
A Majumdar, A Gogna, Sensors. 14(9), 15729 (2014).
G Liu, Z Lin, S Yan, J Sun, Y Yu, Y Ma, IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 171 (2013).
E Elhamifar, R Vidal, IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2765 (2013).
M Mardani, G Mateos, GB Giannakis, IEEE Trans. Signal Process. 63(10), 2663 (2015).
N Golyandina, Singular Spectrum Analysis for time series (Springer Science & Business Media, New York, 2013).
G Tzagkarakis, M Papadopouli, P Tsakalides, in ACM Symposium on Modeling, analysis, and simulation of wireless and mobile systems (ACM, Chania, 2007).
DH Schoellhamer, Geophys. Res. Lett. 28(16), 3187 (2001).
D Kondrashov, M Ghil, Nonlinear Process. Geophys. 13(2), 151 (2006).
N Golyandina, E Osipov, J. Stat. Plan. Infer. 137(8), 2642 (2007).
K Patterson, H Hassani, S Heravi, A Zhigljavsky, J. App. Stat. 38(10), 2183 (2011).
N Golyandina, D Stepanov, in 5th St. Petersburg workshop on simulation, vol. 293 (St. Petersburg State University, St. Petersburg, 2005).
N Golyandina, A Korobeynikov, A Shlemov, K Usevich, J. Stat. Softw. 67(1), 1 (2015).
I Markovsky, Low rank approximation: algorithms, implementation, applications (Springer Science & Business Media, New York, 2011).
EJ Candès, Y Plan, Proc. IEEE. 98(6), 925 (2010).
B Recht, M Fazel, P Parrilo, SIAM Rev. 52(3), 471 (2010).
JF Cai, EJ Candès, Z Shen, SIAM J. Optim. 20(4), 1956 (2010).
Z Lin, M Chen, Y Ma, arXiv preprint 1009.5055 (2010). http://arxiv.org/abs/1009.5055.
DP Bertsekas, Constrained Optimization and Lagrange Multiplier Methods (Optimization and Neural Computation Series), 1st edn. (Athena Scientific, Nashua, 1996).
S Boyd, N Parikh, E Chu, B Peleato, J Eckstein, Found. Trends Mach. Learn. 3(1), 1 (2011).
Z Liu, L Vandenberghe, SIAM J. Matrix Anal. Appl. 31(3), 1235 (2009).
M Grant, S Boyd, Y Ye (2008). Available online: http://stanford.edu/~boyd/cvx. Accessed 1 Jan 2014.
S Boyd, N Parikh, E Chu, B Peleato, J Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011). Now Publishers Inc.
ZQ Luo, arXiv preprint arXiv:1208.3922 (2012). http://arxiv.org/abs/1208.3922.
KV Fernando, S Hammarling, Linear algebra in signals, systems, and control (Boston, MA, 1986), pp. 128–140 (1988).
B De Moor, Signal Process. 25(2), 135 (1991).
DA Ross, J Lim, RS Lin, MH Yang, Int. J. Comput. Vis. 77(1–3), 125 (2008).
Y Li, Pattern Recogn. 37(7), 1509 (2004).
ID Schizas, GB Giannakis, ZQ Luo, IEEE Trans. Signal Process. 55(8), 4284 (2007).
S Madden, Intel lab data, 2004 (2012). http://db.csail.mit.edu/labdata/labdata.html.
F Ingelrest, G Barrenetxea, G Schaefer, M Vetterli, O Couach, M Parlange, ACM Trans. Sens. Netw. (TOSN). 6(2), 1 (2010).
O Troyanskaya, M Cantor, G Sherlock, P Brown, T Hastie, R Tibshirani, D Botstein, R Altman, Bioinformatics. 17(6), 520 (2001).