Sequential Monte Carlo for inference of latent ARMA time-series with innovations correlated in time
EURASIP Journal on Advances in Signal Processing volume 2017, Article number: 84 (2017)
Abstract
We consider the problem of sequential inference of latent time-series with innovations correlated in time and observed via nonlinear functions. We accommodate time-varying phenomena with diverse properties by means of a flexible mathematical representation of the data. We characterize such time-series statistically by a Bayesian analysis of their densities. The density that describes the transition of the state from time t to the next time instant t+1 is used for implementation of novel sequential Monte Carlo (SMC) methods. We present a set of SMC methods for inference of latent ARMA time-series with innovations correlated in time for different assumptions in knowledge of parameters. The methods operate in a unified and consistent manner for data with diverse memory properties. We show the validity of the proposed approach by comprehensive simulations of the challenging stochastic volatility model.
Introduction
This paper addresses inference of a broad class of latent time-series observed via nonlinear functions. We aim at modeling time-series with diverse memory properties in a unified manner so that a method for inference of heterogeneous time-varying data can be proposed. To that end, we elaborate on classical autoregressive moving average (ARMA) models and consider innovations^{Footnote 1} that are correlated in time. With these flexible modeling assumptions, a diverse set of scenarios and data properties can be accommodated. The studied latent time-series framework not only covers classical ARMA-type models and their fractionally integrated generalizations, i.e., autoregressive fractionally integrated moving average (ARFIMA) processes, but also allows for inference of time-series with heterogeneous memory properties.
The analysis of time-series is relevant in a plethora of disciplines in science, engineering and economics [1–3]. In all these areas, stochastic processes are used to model the behavior of time-varying data. Often, the modeling is carried out by two processes, one of which is latent and the other observed and informative about the hidden process.
Among the relevant features of time-series data and the stochastic models used for their description, memory is one of the most important characteristics. On the one hand, there are short-memory processes, where only a few past data values affect the present of the time-series. On the other hand, there are long-memory processes, where the present value depends on samples far into the past.
ARMA models have been widely studied for characterizing short-term processes, as they accurately describe quickly forgetting data. The pioneering work on short-memory processes and ARMA(p,q) time-series was presented in the early 1950s by [4], continued by [5], and later expanded by [2]. ARMA(p,q) processes are defined by their autoregressive (AR) parameters \(a_{1},a_{2},\cdots,a_{p}\), of order p; moving average (MA) parameters \(b_{1},b_{2},\cdots,b_{q}\), of order q; and driving innovations \(u_{t}\), which are assumed to be independent and identically distributed (i.i.d.).
The work on long-memory processes also began in the middle of the 20th century, with the groundwork laid by [6]. He studied Nile river data and realized that it manifested long-range dependence. In the following decades, plenty of other geophysical, climatological, and financial records have been described by similar long-term characteristics [7–9].
For modeling time-series with long memory, two types of formulations have attracted the interest of practitioners [8]. They arise naturally from limit theorems and classic models. With the first formulation, long-memory processes are described as stationary increments of self-similar models, of which the fractional Gaussian process (fGp) is a prime example. The second formulation appears in the form of autoregressive fractionally integrated moving average processes. These models are built upon ARMA models by introducing non-integer values of the differencing parameter d, which accounts for the “integrated” part I of the model. The acronyms ARFIMA or FARIMA are used to refer to these processes (where the F refers to the “fractional” component), even if the ARIMA(p,d,q) notation suffices if fractional values of d are considered.
Both short- and long-memory processes (modeled by ARMA, FARIMA, or other models) are commonly used in practice to describe all kinds of time-varying data, including eolic phenomena [10, 11], biomedical signals [12], or financial markets [13, 14]. Our goal in this paper is to consider a generic formulation of time-series so that we accommodate diverse memory properties in a unified framework. We do so by considering ARMA models with innovations that are correlated in time.
We have pointed out that it is common to use two dependent processes to model observed phenomena. The reason is that often, we are not able to directly observe the process of interest. This may be due to the intricacies of the acquisition procedure or simply because the latent process cannot be observed. Inference of hidden processes is a very challenging task. The time-series estimation and prediction problems are certainly much more difficult when the process of interest is not directly observed, as is the case in this paper.
The latent process has dynamics that are not directly observed, but the observed process depends on the latent process in various forms. One represents the two by using a state-space formulation, where the evolution of the system is modeled over time by a series of hidden variables associated with another series of measurements. That is, the state-space model comprises a set of ordered observations \(y_{t}\) that depend on some latent time-evolving unknowns \(x_{t}\). Oftentimes, one is interested in online or sequential estimation of the state, as data are observed over time.
The task of estimating the hidden processes (i.e., \(x_{t}\)) based on observations \(y_{t}\) has been widely studied; see [15] for example. To that end, two classes of problems are contemplated: one where the processes and the observations are linear functions of the states with additive Gaussian perturbations, and another where the functions are nonlinear and/or the noises are not Gaussian. The former class allows for estimating the latent process by optimal methods (e.g., Kalman filtering [16]), while the latter requires suboptimal methods based on Bayesian theory [17] or other approximating techniques [18]. Specifically, popular approaches are based on (1) model transformations (e.g., extended Kalman filtering [19]), (2) quasi-maximum likelihood (QML) solutions [15], and (3) Monte Carlo sampling principles.
Our goal here is to provide a method that achieves sequential estimation of hidden time-series with diverse memory properties within a unified framework. We avoid Gaussianity and linearity assumptions in the model and thus consider techniques that can overcome such difficulties. In particular, among the advanced techniques available for online estimation of data, we choose to work with sequential Monte Carlo (SMC) or particle filtering (PF) methods [20–22].
Ever since the publication of [23], SMC methods have been shown to overcome the difficulties posed by non-Gaussian and nonlinear models. They have an extensive record of being successfully applied to many disciplines, including engineering [24], geophysical sciences [25], biology [26, 27], and economics [28]. Furthermore, SMC is an active area of research, where new paradigms and extensions to the classical SMC method for estimation of latent states and fixed parameters are being explored [29–31]. In this work, we focus on the use of standard SMC methods for inference of latent states with different memory properties in a unified and consistent manner. We present our contributions in detail after providing a brief synopsis of the paper in the following section.
Outline
We consider the general problem of sequential inference of latent time-series, observed via nonlinear functions. We describe the considered mathematical framework in Section 2, where we adopt a state-space representation of the processes. The latent time-series is modeled as an ARMA(p,q) process driven by innovations correlated in time. We make minimal assumptions about the observation equation so that any computable nonlinear function can be accommodated.
In Section 3, we provide a Bayesian analysis of the time-series model, for different assumptions about the parameters of the model. We first derive the joint probability density of the data, for which the computation of the covariance matrix is critical. We subsequently obtain the transition density of the time-series of interest, which plays a critical role in the proposed method.
We leverage the statistical description of the model and propose an SMC solution in Section 4. After an outline of the key concepts of the SMC technique in subsection 4.1, in the following subsection, we present a novel SMC method for inference of latent time-series with non-i.i.d. innovations.
We conclude the paper with Section 5, where we evaluate the proposed set of SMC methods on an illustrative application: inference of the stochastic volatility of observed time-varying data. This model is not only of interest in practice (e.g., finance) but also serves as a challenging benchmark for alternative methods already available in the literature. The results validate our proposed method.
Contribution
The main contribution of this paper is a set of SMC methods for inference of latent states with different memory properties in a unified and consistent manner.
In previous work, SMC methods have been introduced for specific modeling assumptions: ARMA processes in [32–34] and latent fractional Gaussian processes in [35]. Alternatively, an SMC-based approach for filtering time-series of different properties could pose the problem from a model selection perspective [36–38]. However, we study a different and more ambitious approach in this paper, as we target inference of latent states with different characteristics in a unified and consistent manner.
We consider time-varying phenomena with diverse characteristics, provide a flexible and compact mathematical representation of the data, and propose generic SMC methods for sequential inference of latent time-series with different memory properties.
More precisely:

We provide a mathematical framework for the characterization of heterogeneous time-series as ARMA(p,q) models driven by innovations correlated in time.

We derive a compact mathematical formulation that describes time-series of diverse memory properties.

We derive both the joint and the transition density of such time-series under different model parameter assumptions.

We present a set of SMC methods for inference of latent time-series with non-i.i.d. innovations, depending on which parameters of the model are known.

We demonstrate the performance of the proposed SMC methods on the stochastic volatility model driven by a fractional Gaussian process filtered by an ARMA(p,q) process.
Mathematical model
We are interested in the study of latent time-series with diverse memory properties observed via nonlinear functions. To systematically accommodate observed and hidden variables, we adopt the state-space methodology. For the state, we use a flexible formulation that allows for short- and long-memory processes. As for the observation equation, we make minimal assumptions.
Specifically, the state dynamics follow an ARMA(p,q) model with non-i.i.d. innovations. That is, the driving noise for the ARMA process is correlated in time. The only restriction is that the innovation process must be stationary. For the observation equation, we consider nonlinear functions of the state.
Mathematically, the model of interest is
where \(x_{t}\in \mathbb{R}\) (\(x_{t}=0\), \(\forall t<0\)) is the state of the latent process at t; \(a_{1},a_{2},\cdots,a_{p}\) are the autoregressive (AR) parameters; and \(b_{1},b_{2},\cdots,b_{q}\) are the moving average (MA) parameters of the ARMA model.
The symbol \(u_{t}\) represents a zero-mean Gaussian innovation process that is correlated in time. That is, it is fully characterized by its first (i.e., mean) and second moments (i.e., an arbitrary autocovariance function \(\gamma_{u}(\tau)\)). Note that stationarity is enforced (the mean is constant and the autocovariance depends only on the time lag \(\tau\)), but no restriction on the form of the autocovariance function is imposed.
We denote with \(y_{t}\in \mathbb{R}\) the observation at time t. We consider that the observation can be expressed by any generic function \(h(x_{t},v_{t})\) and that the observation noise \(v_{t}\) can have any distribution as long as the likelihood function \(f(y_{t}|x_{t})\) is computable up to a proportionality constant.
Our goal is to sequentially estimate the posterior distribution of \(x_{t}\), given the observations \(y_{1:t}\equiv\{y_{1},y_{2},\cdots,y_{t}\}\), that is, \(f(x_{t}|y_{1:t})\). We want to update \(f(x_{t}|y_{1:t})\) to \(f(x_{t+1}|y_{1:t+1})\) for each newly acquired observation \(y_{t+1}\).
We resort to Bayesian theory for the derivation of such density. The key equation in the derivation is the one that connects the filtering density at t, \(f(x_{t}|y_{1:t})\), with that at the next time instant t+1, \(f(x_{t+1}|y_{1:t+1})\). It is given by
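As a rough illustration of the model just described, the following sketch simulates a latent ARMA(1,1) state driven by correlated Gaussian innovations and observes it through a nonlinear function. Several choices here are assumptions made for the example only: the recursion is taken in the common form x_t = Σ a_i x_{t-i} + u_t + Σ b_j u_{t-j}, the autocorrelation `rho_geometric` is hypothetical, and the observation function is the stochastic volatility form y_t = exp(x_t/2) v_t with standard Gaussian v_t.

```python
import numpy as np

def simulate_latent_arma(a, b, rho_u, sigma_u, T, rng):
    """Draw x_1..x_T from the assumed recursion
    x_t = sum_i a[i] x_{t-1-i} + u_t + sum_j b[j] u_{t-1-j},
    with all terms before the first sample set to zero."""
    # Stationary innovations: Toeplitz covariance C[i, j] = sigma_u^2 * rho_u(|i - j|).
    idx = np.arange(T)
    C = sigma_u**2 * rho_u(np.abs(idx[:, None] - idx[None, :]))
    # Correlated Gaussian draw through a Cholesky factor (small jitter for stability).
    u = np.linalg.cholesky(C + 1e-12 * np.eye(T)) @ rng.standard_normal(T)
    x = np.zeros(T)
    for t in range(T):
        ar = sum(a[i] * x[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
        ma = sum(b[j] * u[t - 1 - j] for j in range(len(b)) if t - 1 - j >= 0)
        x[t] = ar + u[t] + ma
    return x, u

def observe_sv(x, rng):
    # Assumed stochastic volatility observation: y_t = exp(x_t / 2) * v_t, v_t ~ N(0, 1).
    return np.exp(x / 2.0) * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
rho_geometric = lambda tau: 0.8 ** tau   # hypothetical autocorrelation, for illustration
x, u = simulate_latent_arma([0.5], [0.3], rho_geometric, 1.0, 200, rng)
y = observe_sv(x, rng)
```

Any other stationary autocorrelation function can be passed as `rho_u`, which is the point of the formulation: the recursion itself does not change with the memory properties of the innovations.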
The analytical solution to this integral is only possible for the case of Gaussian densities and linear functions, i.e., the celebrated Kalman Filter [16]. Since we do not restrict ourselves to such assumptions in this paper, we resort to SMC methods, widely popular since the seminal publication of [23]. These methods provide suboptimal solutions, as they compute probability random measure approximations of the densities of interest, while guaranteeing convergence under suitable conditions.
We detail in Section 3 the Bayesian analysis of the studied time-series (i.e., ARMA(p,q) models with correlated innovations), before delving into the intricacies of sequential Monte Carlo methods in Section 4.
Bayesian analysis of the time-series
We provide a Bayesian analysis of an ARMA(p,q) model driven by an innovation process that is not i.i.d. We emphasize the importance of considering the full ARMA(p,q) model with correlated innovations, as it allows for an accurate description of time-series with a wide range of memory properties within the same mathematical formulation.
Typical ARMA(p,q) models (i.e., with i.i.d. innovations) possess exponentially decaying memory properties. Regardless of the parameterization of the autoregressive or the moving average components, the time-series forgets its past at a rate proportional to \(c^{\tau}\) for a constant \(c\in(0,1)\). On the contrary, by considering correlated innovations \(u_{t}\), we allow for the time-series to exhibit a wide range of decaying memory properties. That is, the modeled time-series can forget their past values subexponentially.
For example, if the innovations are fractional Gaussian processes, then the memory dependency is proportional to \(\tau^{-c}\) for a constant \(c\in(0,1)\) (e.g., see Fig. 1). Note that such diverse memory characteristics cannot be modeled with ARMA models and uncorrelated innovations. Furthermore, any of the particular subcases (AR(p), MA(q), or ARMA(p,q), with or without correlated noise) are covered within the proposed generic formulation.
All in all, we study a generic ARMA(p,q) model with non-i.i.d. innovations. We are interested in the statistical properties of this stochastic process and, thus, in the joint distribution of the time-series up to time instant t, i.e., \(f(x_{1:t})\). This distribution contains all the relevant information of the time-series, from which any marginal and conditional of interest can be derived.
We start by reformulating the state time-series in Eq. (1), where we fix \(b_{0}=1\),
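The contrast between exponential and power-law memory can be checked numerically. The sketch below compares the standard autocovariance of fractional Gaussian noise, \(\gamma(\tau)=\tfrac{\sigma^{2}}{2}\left(|\tau+1|^{2H}-2|\tau|^{2H}+|\tau-1|^{2H}\right)\), with the geometric autocorrelation of an AR(1) process. The Hurst parameter H = 0.9 and the AR coefficient 0.9 are illustrative values, not taken from the paper.

```python
def fgn_autocov(tau, hurst, sigma2=1.0):
    """Autocovariance of fractional Gaussian noise at integer lag tau."""
    k = abs(tau)
    h2 = 2.0 * hurst
    return 0.5 * sigma2 * (abs(k + 1) ** h2 - 2.0 * k ** h2 + abs(k - 1) ** h2)

def ar1_autocorr(tau, a):
    """Autocorrelation of a stationary AR(1) process with coefficient a."""
    return a ** abs(tau)

# Illustrative comparison: power-law decay (fGn) versus exponential decay (AR(1)).
for tau in (1, 10, 100):
    print(tau, fgn_autocov(tau, 0.9), ar1_autocorr(tau, 0.9))
```

For H = 0.5 the fGn autocovariance vanishes at every nonzero lag, which recovers the i.i.d. case; for H in (0.5, 1) it decays roughly as \(\tau^{2H-2}\), i.e., with an exponent in (0, 1).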
and rewrite the above recursion in matrix form
By defining the matrices
and
we can immediately write \(A_{t}x_{1:t}=B_{t}u_{1:t}\), and thus,
$$ x_{1:t}=A_{t}^{-1}B_{t}u_{1:t}, \tag{7} $$
where the time subscript t also indicates matrix dimensionality.
In our problem of interest, the innovation process \(u_{t}\) is zero-mean Gaussian with an arbitrary autocovariance function \(\gamma_{u}(\tau)=\sigma_{u}^{2}\rho_{u}(\tau)\). The innovation process is correlated unless \(\rho_{u}(\tau)=\delta(\tau)\), in which case we have i.i.d. noise. The vector of innovations up to time instant t, i.e., \(u_{1:t}\), follows a zero-mean Gaussian multivariate density with covariance matrix \(C_{u_{t}}=\sigma_{u}^{2}R_{u_{t}}\), where
$$ R_{u_{t}}=\left(\begin{array}{cccc} \rho_{u}(0) & \rho_{u}(1) & \cdots & \rho_{u}(t-1) \\ \rho_{u}(1) & \rho_{u}(0) & \cdots & \rho_{u}(t-2) \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{u}(t-1) & \rho_{u}(t-2) & \cdots & \rho_{u}(0) \end{array}\right). \tag{8} $$
Note the symmetric Toeplitz structure of the matrix, where the only required elements are the normalized autocovariance values \(\rho_{u}(\tau)\), for lags \(\tau=0,1,\cdots,t-1\).
Based on these sufficient statistics of the correlated innovation process and the formulation of the time-series as in Eq. (7), we derive the densities of interest for our model in Eq. (1).
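The matrix formulation can be made concrete with a short numerical sketch. It assumes (these are conventions chosen for the example, since they are not fixed by the surrounding text) that the stacked vector is ordered newest-first, \(x_{1:t}=(x_{t},\cdots,x_{1})^{\top}\), that \(A_{t}\) is upper triangular Toeplitz with first row \((1,-a_{1},\cdots,-a_{p},0,\cdots)\), and that \(B_{t}\) has first row \((1,b_{1},\cdots,b_{q},0,\cdots)\). The function names are hypothetical.

```python
import numpy as np

def upper_toeplitz(first_row):
    """Upper triangular Toeplitz matrix whose rows are right-shifts of first_row."""
    t = len(first_row)
    return sum(v * np.eye(t, k=k) for k, v in enumerate(first_row))

def normalized_covariance(a, b, rho, t):
    """Sigma_t = A_t^{-1} B_t R_{u_t} B_t^T A_t^{-T} under the assumed ordering."""
    row_a = np.zeros(t); row_a[0] = 1.0
    row_b = np.zeros(t); row_b[0] = 1.0
    na, nb = min(len(a), t - 1), min(len(b), t - 1)
    row_a[1:1 + na] = -np.asarray(a, dtype=float)[:na]
    row_b[1:1 + nb] = np.asarray(b, dtype=float)[:nb]
    # G = A_t^{-1} B_t, obtained with a linear solve rather than an explicit inverse.
    G = np.linalg.solve(upper_toeplitz(row_a), upper_toeplitz(row_b))
    # Symmetric Toeplitz matrix of normalized autocovariances rho(|i - j|).
    idx = np.arange(t)
    R = np.asarray(rho, dtype=float)[np.abs(idx[:, None] - idx[None, :])]
    return G @ R @ G.T

# AR(1) with coefficient 0.5 and white innovations (rho = delta), three time instants.
Sigma3 = normalized_covariance([0.5], [], [1.0, 0.0, 0.0], 3)
```

In this small case the entries can be verified by hand: with \(x_{1}=u_{1}\), \(x_{2}=0.5x_{1}+u_{2}\), and \(x_{3}=0.5x_{2}+u_{3}\), the normalized variances are 1, 1.25, and 1.3125, which appear on the diagonal in newest-first order.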
Bayesian analysis: joint density of the time-series
Let us consider an ARMA(p,q) time-series with correlated innovations at time instant t, i.e.,
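One readily concludes that the joint density of the time-series \(x_{1:t}\) is a multivariate zero-mean Gaussian with a covariance matrix dependent on the matrices \(A_{t}\), \(B_{t}\), and \(R_{u_{t}}\); that is,
$$ f\left(x_{1:t}|\sigma_{u}^{2}\right)=\mathcal{N}\left(x_{1:t}|0,\sigma_{u}^{2}\Sigma_{t}\right), \quad \text{with}\ \Sigma_{t}=A_{t}^{-1}B_{t}R_{u_{t}}B_{t}^{\top}\left(A_{t}^{-1}\right)^{\top}. \tag{10} $$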
It is also of practical interest to consider the case of unknown driving noise variance. Instead of estimating the unknown \(\sigma_{u}^{2}\), we proceed by marginalizing it, i.e., we Rao-Blackwellize \(\sigma_{u}^{2}\). To that end, we use conjugate priors due to their convenient analytical properties for deriving the marginalized density.
We start with a scaled inverse chi-squared prior for the unknown \(\sigma_{u}^{2}\),
The derivation of the marginalized joint density is then given by
We conclude that the joint density of the time-series at time instant t after marginalization of the unknown variance is a multivariate Student-t density
where \(\nu=\nu_{0}\) represents the degrees of freedom, \(\mu_{t}=0\) is the location parameter, and \(\Phi_{t}=\sigma_{0}^{2}\Sigma_{t}\) is the scale matrix.
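The effect of marginalizing the variance can be checked by simulation in the scalar case (t = 1, \(\Sigma_{t}=1\)). The sketch below draws \(\sigma_{u}^{2}\) from a scaled inverse chi-squared prior, then draws a Gaussian sample given that variance; the resulting marginal should be Student-t with \(\nu_{0}\) degrees of freedom and scale \(\sigma_{0}^{2}\), whose variance is \(\sigma_{0}^{2}\nu_{0}/(\nu_{0}-2)\). The values \(\nu_{0}=10\) and \(\sigma_{0}^{2}=2\) are arbitrary illustrative choices.

```python
import random

def sample_marginal(nu0, sigma0_2, n, seed=0):
    """Sample x by first drawing sigma^2 ~ Scale-inv-chi2(nu0, sigma0_2), then x ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        chi2 = rng.gammavariate(nu0 / 2.0, 2.0)   # chi-squared with nu0 degrees of freedom
        sigma2 = nu0 * sigma0_2 / chi2            # scaled inverse chi-squared draw
        draws.append(rng.gauss(0.0, sigma2 ** 0.5))
    return draws

nu0, sigma0_2 = 10.0, 2.0
draws = sample_marginal(nu0, sigma0_2, 50000)
empirical_var = sum(d * d for d in draws) / len(draws)
student_var = sigma0_2 * nu0 / (nu0 - 2.0)   # variance of the Student-t marginal
```

The empirical variance should be close to `student_var` and visibly larger than \(\sigma_{0}^{2}\), reflecting the heavier tails induced by the unknown variance.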
Bayesian analysis: transition density of the time-series
In time-series analysis, the transition density of the process, which describes the dynamics of the state from time t to the next time instant t+1, is of practical interest. To that end, we leverage the joint densities derived in Subsection 3.1 for both the known and unknown innovation variance cases.
Let us consider the statistical description of the time-series at time instant t+1, i.e.,
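and rewrite the covariance matrix \(\Sigma_{t+1}\) in block form
$$ \Sigma_{t+1}=\left(\begin{array}{ll} h_{t+1} & \lambda_{t} \\ \lambda_{t}^{\top} & \Sigma_{t} \end{array}\right). \tag{15} $$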
We can readily derive the transition density of the state by using the expressions for the conditionals of the multivariate Gaussian and Student t distributions [39]. Namely,

if \(\sigma_{u}^{2}\) is known
$$ \begin{array}{ll} f\left(x_{t+1}|x_{1:t}, \sigma_{u}^{2}\right) &= \mathcal{N}\left(x_{t+1}|\mu_{t+1}, \sigma_{t+1}^{2}\right), \\ &\text{with} \left\{\begin{array}{l} \mu_{t+1}=\lambda_{t}\Sigma_{t}^{-1}x_{1:t} \;, \\ \sigma_{t+1}^{2}=\sigma_{u}^{2}\left(h_{t+1}-\lambda_{t}\Sigma_{t}^{-1}\lambda_{t}^{\top} \right) \;. \end{array}\right. \end{array} \tag{16} $$
if \(\sigma_{u}^{2}\) is unknown
$$ \begin{array}{ll} f\left(x_{t+1}|x_{1:t}\right) &= \mathcal{T}_{\nu_{t+1}}\left(x_{t+1}|\mu_{t+1},\phi_{t+1}^{2}\right), \\ &\text{with} \left\{\begin{array}{l} \nu_{t+1}= \nu_{0}+t \;, \\ \mu_{t+1}=\lambda_{t}\Sigma_{t}^{-1}x_{1:t} \;, \\ \phi_{t+1}^{2}=\frac{\nu_{0}\sigma_{0}^{2}+x_{1:t}^{\top} \Sigma_{t}^{-1}x_{1:t}}{\nu_{0}+t}\left(h_{t+1}-\lambda_{t}\Sigma_{t}^{-1}\lambda_{t}^{\top} \right) \;. \end{array}\right. \end{array} \tag{17} $$
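The two conditionals above translate directly into a few lines of linear algebra. The following sketch computes the transition parameters from the block partition of \(\Sigma_{t+1}\); the function names are hypothetical, and the past states are assumed to be stacked newest-first so that they match the ordering of \(\Sigma_{t}\).

```python
import numpy as np

def gaussian_transition(Sigma_next, x_past, sigma_u2):
    """Mean and variance of x_{t+1} given x_{1:t} when sigma_u^2 is known."""
    h, lam, Sigma_t = Sigma_next[0, 0], Sigma_next[0, 1:], Sigma_next[1:, 1:]
    w = np.linalg.solve(Sigma_t, lam)      # Sigma_t^{-1} lambda_t^T (Sigma_t is symmetric)
    return w @ x_past, sigma_u2 * (h - lam @ w)

def student_transition(Sigma_next, x_past, nu0, sigma0_2):
    """Degrees of freedom, location, and squared scale when sigma_u^2 is marginalized."""
    t = len(x_past)
    h, lam, Sigma_t = Sigma_next[0, 0], Sigma_next[0, 1:], Sigma_next[1:, 1:]
    w = np.linalg.solve(Sigma_t, lam)
    quad = x_past @ np.linalg.solve(Sigma_t, x_past)   # x^T Sigma_t^{-1} x
    phi2 = (nu0 * sigma0_2 + quad) / (nu0 + t) * (h - lam @ w)
    return nu0 + t, w @ x_past, phi2

# AR(1) with coefficient 0.5 and white innovations; rows ordered (x_3, x_2, x_1).
Sigma3 = np.array([[1.3125, 0.625, 0.25],
                   [0.625,  1.25,  0.5],
                   [0.25,   0.5,   1.0]])
x_past = np.array([2.0, 1.0])              # (x_2, x_1)
mu, var = gaussian_transition(Sigma3, x_past, sigma_u2=1.0)
nu, mu_t, phi2 = student_transition(Sigma3, x_past, nu0=10.0, sigma0_2=1.0)
```

For this AR(1) example the Gaussian conditional reduces to the familiar one-step prediction: the mean is 0.5 times the most recent state and the variance equals the innovation variance.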
Proposed SMC method for inference of latent time-series
Our objective is to sequentially infer the evolution of a latent time-series with correlated innovations, as we acquire new observations. In Bayesian terminology, one is interested in updating the filtering density from \(f(x_{t}|y_{1:t})\) to \(f(x_{t+1}|y_{1:t+1})\) as new data are observed. We do so by using Eq. (2).
As previously pointed out, the analytical solution to this equation is intractable for the most interesting cases (models with nonlinearities and non-Gaussianities) and thus, we resort to sequential Monte Carlo methods. We briefly provide an overview of SMC methods in subsection 4.1, before explaining in detail our proposed method in subsection 4.2.
Sequential Monte Carlo
Monte Carlo methods are a class of computational algorithms that numerically approximate functions of interest by random sampling. In particular, sequential Monte Carlo methods recursively compute approximations to relevant probability densities, by replacing the true densities with discrete random probability measures
$$ f^{M}(x)=\sum_{m=1}^{M} w^{(m)}\delta\left(x-x^{(m)}\right), \tag{18} $$
where δ(·) is the Dirac delta function.
The points \(x^{(m)}\) represent the support of the random measure and are called particles. These particles are assigned weights \(w^{(m)}\), which are interpreted as probability masses. The random measure is thus a weighted sum of M particles and their weights.
The key to SMC methods is the sequential computation of Eq. (2), which is done by updating the approximating random measures at time instant t to the next time instant t+1. Let \(f^{M}(x_{t})\) be the approximation of \(f(x_{t}|y_{1:t})\) at time instant t. The update of \(f^{M}(x_{t})\) to \(f^{M}(x_{t+1})\) is done in two steps.
First, one propagates the particles \(x_{t}^{(m)}\) to \(x_{t+1}^{(m)}\) via a so-called proposal density \(\pi(\cdot)\),
where one may use all or part of the available information (that is, the history of observations and previous states). Then, one computes the weights of each candidate sample \(x_{t+1}^{(m)}\) according to
where \(f\left(y_{t+1}|x_{t+1}^{(m)}\right)\) is the likelihood of the new observation given sample \(x^{(m)}_{t+1}\), and \(f\left(x_{t+1}^{(m)}|x_{1:t}^{(m)}\right)\) is the transition density of the latent state. The computation of the weights is followed by their normalization so that they sum up to one and form a proper probability random measure.
SMC methods require an additional third step called resampling [40]. If one proceeds with propagation and weight computation only, the approximation \(f^{M}(x_{t})\) degenerates quickly, as only a few of the particles are assigned non-negligible weights. Resampling consists of deciding which particles to propagate by selecting those with higher probability, i.e., bigger weights \(w_{t}^{(m)}\). One prevents the quick deterioration of the SMC method by resorting to resampling methods (see [40] for an overview of the most common techniques). These are often triggered based on the effective sample size of the SMC approximation at every time instant [41].
The choice of the proposal density is critical for any SMC method. It has been shown that the optimal importance function is \(f(x_{t+1}|x_{t},y_{1:t+1})\), which minimizes the variance of the resulting random measure. However, this density is analytically intractable in our problem of interest. We adopt the simpler yet effective alternative known as Sequential Importance Resampling (SIR) [23].
In summary, we sample new particle candidates from the transition density of the latent states \(f(x_{t+1}|x_{1:t})\). Such a proposal function entails that the weighting of the particles is proportional to their likelihood function, i.e., \(w_{t+1}^{(m)} \propto w_{t}^{(m)} f\left(y_{t+1}|x_{t+1}^{(m)}\right)\). Details of the proposed SMC method for inference of latent ARMA models with correlated innovations follow.
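The degeneracy diagnostic and the resampling step can be sketched as follows. The effective sample size is computed with the common estimate \(1/\sum_{m}(w^{(m)})^{2}\), and resampling is done with the plain multinomial scheme; both are standard choices assumed here for illustration, not necessarily the variants used in [40, 41].

```python
import numpy as np

def effective_sample_size(weights):
    """Common ESS estimate: 1 / sum of squared normalized weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def multinomial_resample(particles, weights, rng):
    """Draw M particle indices with probabilities given by the weights, then reset weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    idx = rng.choice(len(w), size=len(w), p=w)
    return particles[idx], np.full(len(w), 1.0 / len(w))

rng = np.random.default_rng(0)
particles = np.array([-1.0, 0.0, 1.0, 2.0])
weights = np.array([0.0, 0.0, 1.0, 0.0])       # fully degenerate measure
ess = effective_sample_size(weights)           # one effective particle
resampled, uniform_w = multinomial_resample(particles, weights, rng)
```

With uniform weights the estimate equals M, and it drops toward one as the mass concentrates on a single particle, which is why it is a convenient trigger for resampling.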
Proposed SMC method
We first present an SMC method for inference of an ARMA process with correlated innovations, when the ARMA parameters, i.e., \(\theta=(a_{1}\ a_{2}\ \cdots\ a_{p}\ b_{1}\ b_{2}\ \cdots\ b_{q})^{\top}\), are known. We later relax the assumptions for the case when these parameters are unknown. In all cases, the normalized autocovariance values \(\rho_{u}(\tau)\) for lags \(\tau=0,1,\cdots,t-1\) of the correlated innovation process must be known.
Proposed SMC method: known ARMA parameters
Let us consider at time instant t the following probability random measure approximation of the filtering density \(f(x_{t}|y_{1:t})\):
$$ f^{M}(x_{t}) = \sum_{m=1}^{M} w_{t}^{(m)} \delta\left(x_{t}-x_{t}^{(m)}\right). \tag{21} $$
Upon reception of a new observation y _{ t+1}, the algorithm proceeds as follows:

1. Compute the joint normalized covariance matrix \(\Sigma_{t+1}\) at time instant t+1.
$$ \Sigma_{t+1}=A_{t+1}^{-1}B_{t+1} R_{u_{t+1}} B_{t+1}^{\top} \left(A_{t+1}^{-1}\right)^{\top} = \left(\begin{array}{ll} h_{t+1} & \lambda_{t} \\ \lambda_{t}^{\top} & \Sigma_{t} \end{array}\right). \tag{22} $$
2. Perform resampling of the state’s genealogical line by drawing from a categorical distribution defined by the random measure \(f^{M}(x_{t})\).
$$ \overline{x}_{1:t}^{(m)} \sim \left\{x_{t}^{(m)}, w_{t}^{(m)}\right\}, \text{where}\ m=1,\cdots, M. \tag{23} $$
3. Propagate the state particles by sampling from the transition density, conditioned on the available resampled streams \(\overline{x}_{1:t}^{(m)}\).

If \(\sigma_{u}^{2}\) is known
$$ \begin{array}{ll} \quad x_{t+1}^{(m)}&\sim \mathcal{N}\left(x_{t+1}|\mu_{t+1}^{(m)}, \sigma_{t+1}^{2}\right) \;,\\ \text{where }&\left\{\begin{array}{l} \mu_{t+1}^{(m)}=\lambda_{t}\Sigma_{t}^{-1}\overline{x}_{1:t}^{(m)} \;, \\ \sigma_{t+1}^{2}=\sigma_{u}^{2}\left(h_{t+1}-\lambda_{t}\Sigma_{t}^{-1}\lambda_{t}^{\top} \right) \;. \end{array}\right. \end{array} \tag{24} $$
If \(\sigma_{u}^{2}\) is unknown
$$ \begin{array}{ll} x_{t+1}^{(m)}&\sim \mathcal{T}_{\nu_{t+1}}\left(x_{t+1}|\mu_{t+1}^{(m)},\phi_{t+1}^{2(m)}\right)\;,\\ \text{where }&\left\{\begin{array}{l} \nu_{t+1}=\nu_{0}+t \;,\\ \mu_{t+1}^{(m)}=\lambda_{t}\Sigma_{t}^{-1}\overline{x}_{1:t}^{(m)} \;,\\ \phi_{t+1}^{2(m)}=\frac{\nu_{0}\sigma_{0}^{2}+\overline{x}_{1:t}^{(m)^{\top}} \Sigma_{t}^{-1}\overline{x}_{1:t}^{(m)}}{\nu_{0}+t} \left(h_{t+1}-\lambda_{t}\Sigma_{t}^{-1}\lambda_{t}^{\top} \right) \;. \end{array}\right. \end{array} \tag{25} $$

4. Compute the non-normalized weights for the drawn particles according to
$$ \widetilde{w}_{t+1}^{(m)} \propto f\left(y_{t+1}|x_{t+1}^{(m)}\right), \tag{26} $$
and normalize them to obtain a new random measure
$$ f^{M}(x_{t+1}) = \sum_{m=1}^{M} w_{t+1}^{(m)} \delta\left(x_{t+1}-x_{t+1}^{(m)}\right). \tag{27} $$
For the above method to be applicable, one needs to have full knowledge of the parameters in the transition density. That is, the matrices \(A_{t+1}\), \(B_{t+1}\), and \(R_{u_{t+1}}\) must be known for the covariance matrix \(\Sigma_{t+1}\) to be computed for propagation of the state particles.
One can efficiently compute \(\Sigma_{t+1}\) by leveraging algebraic tricks prompted by the structural properties of the involved matrices (Toeplitz and upper triangular). On the one hand, the upper triangular nature of \(A_{t}\) and \(B_{t}\) reduces the number of computations considerably (the inverse of an upper triangular matrix is also upper triangular). The product \(A_{t}^{-1} B_{t}\) is a matrix with a structure similar to \(A_{t}\) and \(B_{t}\): an upper triangular matrix with elements of its first row shifted to the right. On the other hand, due to the Toeplitz structure of the \(R_{u_{t}}\) matrix, one can resort to a Levinson-Durbin type technique [42] to recursively compute the necessary matrix product operations.
The assumption that the parameters within \(A_{t+1}\) and \(B_{t+1}\) are known, however, is often not substantiated. Therefore, we resort to a parameter sampling scheme when the ARMA parameters are not known. We augment the state vector with the unknown parameters, \(\rho_{t}=(x_{t}\ \theta_{t})^{\top}\), similar to the work in [43, 44]. Note that the subscript t in \(\theta_{t}\) does not imply that the parameter evolves over time. It is there only to signify that we obtain samples of the unknowns at time t.
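The four steps above can be put together in a compact sketch for the known-variance case. This is an illustrative implementation under stated assumptions, not the authors' code: the state history is stored newest-first, the matrices \(A_{t}\) and \(B_{t}\) are assumed upper triangular Toeplitz with first rows \((1,-a_{1},\cdots)\) and \((1,b_{1},\cdots)\), the covariance is recomputed from scratch at every step for clarity rather than efficiency, and the likelihood is the stochastic volatility form \(y_{t}|x_{t}\sim\mathcal{N}(0,e^{x_{t}})\). All function names are hypothetical.

```python
import numpy as np

def upper_toeplitz(first_row):
    t = len(first_row)
    return sum(v * np.eye(t, k=k) for k, v in enumerate(first_row))

def sigma_matrix(a, b, rho, t):
    """Step 1: Sigma_t = A^{-1} B R B^T A^{-T}, newest-first ordering (assumed)."""
    ra = np.zeros(t); ra[0] = 1.0
    rb = np.zeros(t); rb[0] = 1.0
    na, nb = min(len(a), t - 1), min(len(b), t - 1)
    ra[1:1 + na] = -np.asarray(a, dtype=float)[:na]
    rb[1:1 + nb] = np.asarray(b, dtype=float)[:nb]
    G = np.linalg.solve(upper_toeplitz(ra), upper_toeplitz(rb))
    idx = np.arange(t)
    R = np.asarray(rho, dtype=float)[np.abs(idx[:, None] - idx[None, :])]
    return G @ R @ G.T

def sir_filter(y, a, b, rho, sigma_u2, M, rng):
    """Return the filtered posterior mean of x_t for each observation."""
    T = len(y)
    paths = np.zeros((M, 0))                 # particle histories, newest state first
    weights = np.full(M, 1.0 / M)
    means = np.zeros(T)
    for t in range(T):
        Sigma = sigma_matrix(a, b, rho, t + 1)
        if t == 0:
            mu, var = np.zeros(M), sigma_u2 * Sigma[0, 0]
        else:
            # Step 2: resample whole genealogical lines according to the weights.
            paths = paths[rng.choice(M, size=M, p=weights)]
            # Step 3: Gaussian transition density from the block partition of Sigma.
            lam = Sigma[0, 1:]
            w = np.linalg.solve(Sigma[1:, 1:], lam)
            mu, var = paths @ w, sigma_u2 * (Sigma[0, 0] - lam @ w)
        x_new = mu + np.sqrt(var) * rng.standard_normal(M)
        paths = np.column_stack([x_new, paths])
        # Step 4: weights proportional to the (assumed) SV likelihood, computed in logs.
        logw = -0.5 * x_new - 0.5 * y[t] ** 2 * np.exp(-x_new)
        weights = np.exp(logw - logw.max())
        weights /= weights.sum()
        means[t] = weights @ x_new
    return means

rng = np.random.default_rng(1)
y = rng.standard_normal(30)                  # placeholder observations for the example
estimates = sir_filter(y, [0.5], [0.3], 0.8 ** np.arange(30), 1.0, 200, rng)
```

Working with log-weights and subtracting the maximum before exponentiating avoids numerical underflow when the likelihood is sharply peaked.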
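To give a flavor of how the Toeplitz structure helps, the sketch below implements the classical Levinson-Durbin recursion for the special case in which the state is the correlated innovation itself (i.e., \(A_{t}=B_{t}=I\)). In that case the recursion yields the prediction coefficients \(\lambda_{t}\Sigma_{t}^{-1}\) and the normalized prediction variance \(h_{t+1}-\lambda_{t}\Sigma_{t}^{-1}\lambda_{t}^{\top}\) without forming or inverting any matrix. It is an illustration of the technique, not the specific recursion used in [42].

```python
def levinson_durbin(r, order):
    """Solve the Toeplitz prediction equations recursively.

    r     : autocovariances r[0], ..., r[order]
    return: (phi, err), where the predictor of x_{t+1} is sum_k phi[k] * x_{t-k}
            and err is the one-step prediction error variance.
    """
    phi = []
    err = r[0]
    for k in range(1, order + 1):
        # Reflection (partial autocorrelation) coefficient of order k.
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / err
        # Update the lower-order coefficients and append the new one.
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= (1.0 - kappa * kappa)
    return phi, err

# Geometric autocorrelation 0.5^tau: the optimal predictor uses only the latest sample.
phi, err = levinson_durbin([1.0, 0.5, 0.25, 0.125], 3)
```

Each order adds O(k) work, so the full set of coefficients up to order t costs O(t²) operations instead of the O(t³) of a generic linear solve.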
The full parameter posterior for the model in Eq. (1) is analytically intractable, and thus, we cannot draw samples from the true parameter posterior. Furthermore, as the parameters do not change over time, their particle propagation becomes troublesome and various methodologies have been suggested to overcome these challenges. Some include the use of artificial parameter evolution [23], while others resort to kernel smoothing [43] or density-assisted (DA) particle filtering techniques [45].
In this paper, we explore and compare two sampling alternatives, one based on the principles of DA-SMC methods and another where importance sampling (IS) of the parameters is carried out. In the former, one approximates the posterior of the unknown parameters with a density of choice; in the latter, one draws from a proposal density for the parameters and later adjusts by computing the appropriate weights.
These proposed methods are the first approximation to dealing with unknown ARMA parameters. We acknowledge that any of the advanced SMC techniques that mitigate the challenges of estimating constant parameters (e.g., parameter smoothing [29, 46, 47] or nested SMC methods [30, 31]) can only improve the accuracy of the proposed SMC methods.
Proposed SMC method: DA-SMC for unknown ARMA parameters
We now propose an SMC method for the case where the parameters of the latent ARMA model, i.e., \(\theta=(a_{1}\ a_{2}\ \cdots\ a_{p}\ b_{1}\ b_{2}\ \cdots\ b_{q})^{\top}\), are unknown. This first alternative follows the principles of density-assisted SMC methods. Because the true posterior of the unknown parameters is analytically intractable, it approximates such posterior with a density of choice.
In particular, we propose to approximate the posterior of the unknown parameter \(\theta\), given the current time-series \(x_{1:t}\), with a Gaussian distribution, i.e.,
where the sufficient statistics are computed based on samples and weights available at this time instant
One uses this approximation to propagate parameter samples from this time instant to the next. As a result, the overall weight computation of the SMC method simplifies to
In summary, the proposed DA-SMC for the unknown parameter case considers a joint state and parameter random measure at time instant t of the following form
and, upon reception of a new observation y _{ t+1}, proceeds as follows:

1.
Estimate the sample mean and covariance of the parameter vector θ _{ t }.
$$ \left\{\begin{array}{l} \mu_{\theta_{t}} = \sum_{i=1}^{M} \theta_{t}^{(m)} w_{t}^{(m)}, \\ \Sigma_{\theta_{t}} = \sum_{i=1}^{M} \left(\theta_{t}^{(m)}  \mu_{\theta_{t}}\right)\left(\theta_{t}^{(m)}  \mu_{\theta_{t}}\right)^{\top} w_{t}^{(m)} \;. \end{array}\right. $$(32) 
2.
Draw new parameter samples from the Gaussian approximation to the posterior density with the newly computed sufficient statistics.
$$ \theta_{t+1}^{(m)} \sim f\left(\theta_{t}x_{1:t}^{(m)}\right) \approx \mathcal{N}\left(\theta_{t+1}\mu_{\theta_{t}}, \Sigma_{\theta_{t}}\right) \;. $$(33) 
3.
Compute the joint covariance matrix for each parameter sample \(\theta _{t+1}^{(m)}\).
$$ {}\Sigma_{t+1}^{(m)}=A_{t+1}^{(m)^{1}}B_{t+1}^{(m)} R_{u_{t}} B_{t+1}^{(m)^{\top}} A_{t_{1}}^{(m)^{1^{\top}}} = \left(\begin{array}{ll} h_{t+1}^{(m)} & \lambda_{t}^{(m)} \\ \lambda_{t}^{(m)^{\top}} & \Sigma_{t}^{(m)} \end{array}\right). $$(34) 
4.
Perform resampling of the state’s genealogical line by drawing from a categorical distribution defined by the random measure f ^{M}(x _{ t }).
$$ \overline{x}_{1:t}^{(m)} \sim \left\{x_{t}^{(m)}, w_{t}^{(m)}\right\}, \text{where}\ m=1,\cdots, M. $$(35) 
5.
Propagate the state particles by sampling from the transition density, conditioned on available resampled streams \(\overline {x}_{1:t}^{(m)}\).

If \(\sigma _{u}^{2}\) is known
$$\begin{array}{ll} \quad x_{t+1}^{(m)}&\sim \mathcal{N}\left(x_{t+1}\mu_{t+1}^{(m)}, \sigma_{t+1}^{2^{(m)}}\right) \;,\\ \text{where }&\left\{\begin{array}{l} \mu_{t+1}^{(m)}=\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{1}}\overline{x}_{1:t}^{(m)} \;, \\ \sigma_{t+1}^{2^{(m)}}=\sigma_{u}^{2}\left(h_{t+1}^{(m)}\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{1}}\lambda_{t}^{(m)^{\top}} \right) \;. \end{array}\right. \end{array} $$(36) 
If \(\sigma _{u}^{2}\) is unknown
$$\begin{array}{ll} \quad x_{t+1}^{(m)}&\sim \mathcal{T}_{\nu_{t+1}}\left(x_{t+1}|\mu_{t+1}^{(m)},\phi_{t+1}^{2^{(m)}}\right)\;,\\ \text{where }&\left\{\begin{array}{l} \nu_{t+1}=\nu_{0}+t \;,\\ \mu_{t+1}^{(m)}=\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{-1}}\overline{x}_{1:t}^{(m)} \;,\\ \phi_{t+1}^{2^{(m)}}=\frac{\nu_{0}\sigma_{0}^{2}+\overline{x}_{1:t}^{(m)^{\top}} \Sigma_{t}^{(m)^{-1}}\overline{x}_{1:t}^{(m)}}{\nu_{0}+t} \\ \qquad \qquad \times \left(h_{t+1}^{(m)}-\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{-1}}\lambda_{t}^{(m)^{\top}} \right) \;. \end{array}\right. \end{array} $$(37)


6.
Compute the non-normalized weights for the drawn particles according to
$$ \widetilde{w}_{t+1}^{(m)} \propto f\left(y_{t+1}|x_{t+1}^{(m)}\right), $$(38)
and normalize them to obtain a new probability random measure
$$ f^{M}(\rho_{t+1}) = \sum_{m=1}^{M}w_{t+1}^{(m)} \delta\left(\rho_{t+1}-\rho_{t+1}^{(m)}\right). $$(39)
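Steps 3 and 5 above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used for the reported results: the function names are hypothetical, the matrices A and B are assumed to be lower-triangular Toeplitz matrices built from the ARMA coefficients (so that A x = B u with zero initial conditions), and the samples are kept in forward time order, so that \(h_{t+1}\) is the last diagonal entry of the joint covariance rather than the first.

```python
import numpy as np

def arma_joint_covariance(a, b, R_u):
    """Normalized covariance of x_{1:n}, computed as A^{-1} B R_u B^T A^{-T}.

    Assumption (not from the text): A and B are lower-triangular Toeplitz
    matrices with unit diagonal, -a_i on the i-th subdiagonal of A and b_j on
    the j-th subdiagonal of B, i.e., A x = B u with zero initial conditions.
    """
    n = R_u.shape[0]
    A, B = np.eye(n), np.eye(n)
    for i, a_i in enumerate(a, start=1):
        A -= a_i * np.eye(n, k=-i)
    for j, b_j in enumerate(b, start=1):
        B += b_j * np.eye(n, k=-j)
    A_inv_B = np.linalg.solve(A, B)
    return A_inv_B @ R_u @ A_inv_B.T

def split_joint(Sigma_joint):
    """Return h_{t+1}, lambda_t, Sigma_t (forward order, newest sample last)."""
    return Sigma_joint[-1, -1], Sigma_joint[-1, :-1], Sigma_joint[:-1, :-1]

def gaussian_transition(Sigma_joint, x_past, sigma2_u):
    """Mean and variance of x_{t+1} given x_{1:t}, known variance (Eq. (36))."""
    h, lam, Sigma_t = split_joint(Sigma_joint)
    gain = np.linalg.solve(Sigma_t, lam)  # Sigma_t^{-1} lambda_t^T
    return gain @ x_past, sigma2_u * (h - lam @ gain)

def student_transition(Sigma_joint, x_past, nu0, sigma2_0):
    """Degrees of freedom, location and squared scale, unknown variance (Eq. (37))."""
    h, lam, Sigma_t = split_joint(Sigma_joint)
    gain = np.linalg.solve(Sigma_t, lam)
    t = x_past.size
    quad = x_past @ np.linalg.solve(Sigma_t, x_past)
    phi2 = (nu0 * sigma2_0 + quad) / (nu0 + t) * (h - lam @ gain)
    return nu0 + t, gain @ x_past, phi2

# Example: AR(1) with a_1 = 0.5 and uncorrelated innovations (R_u = I).
rng = np.random.default_rng(0)
Sigma = arma_joint_covariance([0.5], [], np.eye(6))
x_past = np.array([0.3, -1.0, 0.7, 0.2, 1.5])
mean, var = gaussian_transition(Sigma, x_past, sigma2_u=1.0)
x_next = mean + np.sqrt(var) * rng.standard_normal()
nu, loc, phi2 = student_transition(Sigma, x_past, nu0=2.0, sigma2_0=1.0)
x_next_t = loc + np.sqrt(phi2) * rng.standard_t(nu)
```

For this AR(1) example with uncorrelated innovations, the conditional mean reduces to \(a_{1} x_{t}\) and the conditional variance to \(\sigma_{u}^{2}\), which gives a quick sanity check of the partitioned-covariance formulas.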
Proposed SMC method: IS-SMC for unknown ARMA parameters
We now propose an alternative SMC method for the unknown ARMA parameter case, based on importance sampling (IS) principles. Instead of approximating the analytically intractable parameter posterior, one can choose a proposal density and apply IS to jointly adjust the state and parameter samples.
Specifically, we use a Gaussian proposal density to draw samples for the unknown ARMA parameters θ. At every time instant, one propagates parameter particles by sampling from the proposal
$$ \theta_{t+1}^{(m)} \sim \pi(\theta_{t+1}) = \mathcal{N}\left(\theta_{t+1}|\mu_{\theta_{t}}, \Sigma_{\theta_{t}}\right), $$
with sufficient statistics as in Eq. (29). The corresponding weight computation results in
Since the posterior of the parameters is analytically intractable, we have
which results in
With \(\Sigma _{t}^{\left (\mu _{\theta _{t}}\right)}\), we describe the covariance matrix computed using the parameter estimates \(\mu _{\theta _{t}}\) as in Eq. (29), while with \(\Sigma _{t}^{(m)}\), we refer to the covariance matrix evaluated per drawn parameter sample \(\theta _{t+1}^{(m)}\).
Therefore, the proposed IS-SMC for the unknown parameter case at time instant t starts with a joint state and parameter random measure
$$ f^{M}(\rho_{t}) = \sum_{m=1}^{M}w_{t}^{(m)} \delta\left(\rho_{t}-\rho_{t}^{(m)}\right), $$
and, upon reception of a new observation \(y_{t+1}\), proceeds as follows:

1.
Estimate the sample mean and covariance of the parameter vector \(\theta_{t}\).
$$ \begin{array}{l} \mu_{\theta_{t}} = \sum_{m=1}^{M} \theta_{t}^{(m)} w_{t}^{(m)}, \\ \Sigma_{\theta_{t}} = \sum_{m=1}^{M} \left(\theta_{t}^{(m)} - \mu_{\theta_{t}}\right)\left(\theta_{t}^{(m)} - \mu_{\theta_{t}}\right)^{\top} w_{t}^{(m)} \;. \end{array} $$(45)
2.
Draw new parameter samples from the Gaussian proposal with the newly computed sufficient statistics.
$$ \theta_{t+1}^{(m)} \sim \pi(\theta_{t+1}) = \mathcal{N}\left(\theta_{t+1}|\mu_{\theta_{t}}, \Sigma_{\theta_{t}}\right) \;. $$(46)
3.
Compute the joint normalized covariance matrix for each parameter sample \(\theta _{t+1}^{(m)}\).
$$ \Sigma_{t+1}^{(m)}=A_{t+1}^{(m)^{-1}}B_{t+1}^{(m)} R_{u_{t}} B_{t+1}^{(m)^{\top}} A_{t+1}^{(m)^{-1^{\top}}} = \left(\begin{array}{ll} h_{t+1}^{(m)} & \lambda_{t}^{(m)} \\ \lambda_{t}^{(m)^{\top}} & \Sigma_{t}^{(m)} \end{array}\right). $$(47)
4.
Perform resampling of the state’s genealogical line by drawing from a categorical distribution defined by the random measure \(f^{M}(x_{t})\).
$$ \overline{x}_{1:t}^{(m)} \sim \left\{x_{t}^{(m)}, w_{t}^{(m)}\right\}, \text{where}\ m=1,\cdots, M. $$(48) 
5.
Propagate the state particles by sampling from the transition density, conditioned on available resampled streams \(\overline {x}_{1:t}^{(m)}\).

If \(\sigma _{u}^{2}\) is known
$$\begin{array}{ll} \quad x_{t+1}^{(m)}&\sim \mathcal{N}\left(x_{t+1}|\mu_{t+1}^{(m)}, \sigma_{t+1}^{2^{(m)}}\right) \;,\\ \text{where }&\left\{\begin{array}{l} \mu_{t+1}^{(m)}=\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{-1}}\overline{x}_{1:t}^{(m)} \;, \\ \sigma_{t+1}^{2^{(m)}}=\sigma_{u}^{2}\left(h_{t+1}^{(m)}-\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{-1}}\lambda_{t}^{(m)^{\top}} \right) \;. \end{array}\right. \end{array} $$(49)
If \(\sigma _{u}^{2}\) is unknown
$$\begin{array}{ll} \quad x_{t+1}^{(m)}&\sim \mathcal{T}_{\nu_{t+1}}\left(x_{t+1}|\mu_{t+1}^{(m)},\phi_{t+1}^{2^{(m)}}\right)\;,\\ \text{where }&\left\{\begin{array}{l} \nu_{t+1}=\nu_{0}+t \;,\\ \mu_{t+1}^{(m)}=\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{-1}}\overline{x}_{1:t}^{(m)} \;,\\ \phi_{t+1}^{2^{(m)}}=\frac{\nu_{0}\sigma_{0}^{2}+\overline{x}_{1:t}^{(m)^{\top}} \Sigma_{t}^{(m)^{-1}}\overline{x}_{1:t}^{(m)}}{\nu_{0}+t} \\ \qquad \qquad \times \left(h_{t+1}^{(m)}-\lambda_{t}^{(m)}\Sigma_{t}^{(m)^{-1}}\lambda_{t}^{(m)^{\top}} \right) \;. \end{array}\right. \end{array} $$(50)


6.
Compute the non-normalized weights for the drawn particles.

If \(\sigma _{u}^{2}\) is known
$$ \widetilde{w}_{t+1}^{(m)} \propto \frac{f\left(y_{t+1}|x_{t+1}^{(m)}\right) \cdot \mathcal{N}\left(x_{1:t}^{(m)}\left|0, \sigma_{u}^{2}\Sigma_{t}^{\left(\theta_{t+1}^{(m)}\right)}\right.\right)}{\mathcal{N}\left(\theta_{t+1}^{(m)}\left|\mu_{\theta_{t}}, \Sigma_{\theta_{t}}\right.\right) \mathcal{N}\left(x_{1:t}^{(m)}\left|0, \sigma_{u}^{2}\Sigma_{t}^{\left(\mu_{\theta_{t}}\right)}\right.\right)} \;. $$(51)
If \(\sigma _{u}^{2}\) is unknown
$$ \widetilde{w}_{t+1}^{(m)} \propto \frac{f\left(y_{t+1}|x_{t+1}^{(m)}\right) \cdot \mathcal{T}_{\nu_{0}}\left(x_{1:t}^{(m)}\left|0, \sigma_{0}^{2}\Sigma_{t}^{\left(\theta_{t+1}^{(m)}\right)}\right.\right)}{\mathcal{N}\left(\theta_{t+1}^{(m)}\left|\mu_{\theta_{t}}, \Sigma_{\theta_{t}}\right.\right) \mathcal{T}_{\nu_{0}}\left(x_{1:t}^{(m)}\left|0,\sigma_{0}^{2}\Sigma_{t}^{\left(\mu_{\theta_{t}}\right)}\right.\right)} \;, $$(52)
and normalize them to obtain a new probability random measure
$$ f^{M}(\rho_{t+1}) = \sum_{m=1}^{M}w_{t+1}^{(m)} \delta\left(\rho_{t+1}-\rho_{t+1}^{(m)}\right). $$(53)
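Steps 1, 2, and 6 of the IS-SMC can be illustrated with the following minimal Python sketch for the known-variance weights of Eq. (51). It is not the authors' code: all function names are hypothetical, the observation log-likelihood and the two normalized covariance matrices (one evaluated at the sampled parameters, one at the parameter mean) are assumed to be supplied by the caller, and the weights are handled in the log domain, which is a numerical choice made here rather than something stated in the text.

```python
import numpy as np

def weighted_moments(theta, w):
    """Weighted mean and covariance of parameter particles (Eq. (45)).

    theta has shape (M, d); w has shape (M,) and sums to one.
    """
    mu = w @ theta
    centered = theta - mu
    Sigma = (centered * w[:, None]).T @ centered
    return mu, Sigma

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian, using numpy only."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)

def is_log_weight(log_lik, theta_new, mu_theta, Sigma_theta,
                  x_past, Sigma_new, Sigma_ref, sigma2_u):
    """Logarithm of the non-normalized weight in Eq. (51) for one particle.

    log_lik   : log f(y_{t+1} | x_{t+1}) for this particle (caller-supplied)
    Sigma_new : normalized covariance of x_{1:t} at the sampled parameters
    Sigma_ref : normalized covariance of x_{1:t} at the parameter mean
    """
    zero = np.zeros_like(x_past)
    return (log_lik
            + gaussian_logpdf(x_past, zero, sigma2_u * Sigma_new)
            - gaussian_logpdf(theta_new, mu_theta, Sigma_theta)
            - gaussian_logpdf(x_past, zero, sigma2_u * Sigma_ref))

def normalize_log_weights(log_w):
    """Normalize weights given in the log domain (subtract the max for stability)."""
    w = np.exp(log_w - np.max(log_w))
    return w / w.sum()

# Example of steps 1 and 2 with M = 500 particles of theta = (a_1, b_1).
rng = np.random.default_rng(0)
theta_t = rng.normal([0.7, 0.3], 0.1, size=(500, 2))
w_t = np.full(500, 1.0 / 500)
mu_theta, Sigma_theta = weighted_moments(theta_t, w_t)
theta_next = rng.multivariate_normal(mu_theta, Sigma_theta, size=500)
```

When the sampled parameters coincide with the parameter mean, the two state-trajectory terms cancel and the weight reduces to the likelihood divided by the proposal density.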
Practical application
We now illustrate the applicability of the proposed SMC methods and evaluate their performance. We do so by considering the stochastic log-volatility (SV) state-space framework. That is, the observations are a zero-mean process with time-varying log-variance that one wants to estimate.
The SV model is popular in the study of nonlinear state-space models (due to the estimation challenges that it presents [15, 48–50]) and is of interest in finance (due to its applicability in the study of stock returns [17, 51–53]).
It has been established that Kalman filter (KF)-based methods fail to accurately estimate the latent state for SV models. In principle, given the nonlinearities in the SV model observation equation, extensions of the popular KF, such as the extended Kalman filter (EKF) [19], the unscented Kalman filter (UKF) [54], and other Sigma-Point Kalman filters [55], should be applicable. However, as reported in [56], these methods fail when addressing the SV model since they are unable to update their prior beliefs for such a model (the Kalman gain is always null). Alternatives based on transformations of the model have been suggested [15, 50] but, as reported in [33], they fall short when compared to SMC methods.
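The null Kalman gain can be checked numerically. The sketch below is an illustration only: it assumes the usual SV observation equation, in which the observation is the standard Gaussian noise scaled by the exponential of half the log-volatility, and a standard Gaussian prior on the state; the function name is hypothetical. It estimates the state-observation cross-covariance, which drives the gain of any KF-type update, and compares it with the cross-covariance obtained after the log-squared transformation of the observations.

```python
import numpy as np

def sv_cross_covariances(n=200_000, seed=0):
    """Monte Carlo estimates of Cov(x, y) and Cov(x, log y^2).

    Assumptions (for illustration): x ~ N(0, 1) plays the role of the prior on
    the log-volatility and y = exp(x / 2) * v with v ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    v = rng.standard_normal(n)
    y = np.exp(x / 2.0) * v
    cov_xy = np.cov(x, y)[0, 1]                    # drives the KF-type gain
    cov_x_logy2 = np.cov(x, np.log(y ** 2))[0, 1]  # after the transformation
    return cov_xy, cov_x_logy2

cov_xy, cov_x_logy2 = sv_cross_covariances()
print(f"Cov(x, y)       = {cov_xy:.3f}")
print(f"Cov(x, log y^2) = {cov_x_logy2:.3f}")
```

Since the observation noise is zero-mean and independent of the state, the first cross-covariance vanishes and a linear update leaves the prior unchanged, whereas the transformed observation is linear in the state plus a non-Gaussian noise term and therefore has a cross-covariance equal to the prior variance.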
Furthermore, the SV model has been in use in econometrics for a long time [53], as it is of interest in estimating the risk involved in financial transactions. There, the observations describe the price evolution of an asset, for which estimating its volatility is critical. This is not an easy task, and many efforts have been reported, where the risk is described with diverse memory properties [7, 14, 57].
Motivated by its practical application and the challenges it poses to the inference problem, we focus on the SV model, where the log-volatility is described by a latent ARMA(p,q) model with correlated innovations.
Without loss of generality, we focus on ARMA models with fractional Gaussian noise. With this modeling, we accommodate a wide range of memory properties: from uncorrelated to long-memory processes. This is a natural extension of the classical ARMA model, where instead of i.i.d. Gaussian innovations, the ARMA(p,q) filters a fractional Gaussian process with Hurst parameter H. The properties of such a model are equivalent to those of the FARIMA(p,d,q), when \(d=H-\frac {1}{2}\).
Mathematically, the SV model, where the latent time-series is an ARMA(p,q) with fractional Gaussian noise, is written as
$$ \left\{\begin{array}{l} x_{t} = \sum_{i=1}^{p} a_{i} x_{t-i} + \sum_{j=1}^{q} b_{j} u_{t-j} + u_{t} \;, \\ y_{t} = e^{x_{t}/2} v_{t} \;, \end{array}\right. $$
where \(v_{t}\) is a standard Gaussian variable and the state innovation \(u_{t}\) is a zero-mean Gaussian process with known autocovariance function \(\gamma_{u}(\tau)\). In particular, for the fractional Gaussian process,
$$ \gamma_{u}(\tau) = \frac{\sigma_{u}^{2}}{2}\left(|\tau-1|^{2H} - 2|\tau|^{2H} + |\tau+1|^{2H}\right), $$
which is parameterized by the Hurst parameter H and variance \(\sigma _{u}^{2}\). When H=0.5, the process is uncorrelated, while the memory of the innovations increases as H→1. We illustrate in Fig. 1 the dependency of the normalized autocovariance function \(\left (\text {i.e.,}\ \sigma _{u}^{2}=1\right)\) with respect to the parameter H.
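As an illustration of the model, the following Python sketch evaluates the standard autocovariance of fractional Gaussian noise and simulates an SV series driven by a latent ARMA(1,1) process. The function names are hypothetical, the observation equation is assumed to scale standard Gaussian noise by the exponential of half the log-volatility, the initial conditions are set to zero, and the innovations are drawn through a Cholesky factor of their covariance matrix, which is one convenient choice for short series rather than the procedure used in the reported simulations.

```python
import numpy as np

def fgn_autocov(tau, H, sigma2_u=1.0):
    """Autocovariance of fractional Gaussian noise at lag tau (standard formula)."""
    tau = np.abs(np.asarray(tau, dtype=float))
    return 0.5 * sigma2_u * (np.abs(tau - 1.0) ** (2 * H)
                             - 2.0 * tau ** (2 * H)
                             + (tau + 1.0) ** (2 * H))

def simulate_sv_arma11(T, a1, b1, H, sigma2_u=1.0, seed=0):
    """Simulate the latent log-volatility x and the observations y.

    Assumptions (for illustration): zero initial conditions and the
    observation equation y_t = exp(x_t / 2) * v_t with v_t ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    lags = np.arange(T)
    R_u = fgn_autocov(lags[:, None] - lags[None, :], H, sigma2_u)
    u = np.linalg.cholesky(R_u) @ rng.standard_normal(T)  # correlated innovations
    x = np.zeros(T)
    for t in range(T):
        x[t] = u[t]
        if t > 0:
            x[t] += a1 * x[t - 1] + b1 * u[t - 1]
    y = np.exp(x / 2.0) * rng.standard_normal(T)
    return x, y

# Lag-one autocovariance grows with H; it vanishes for H = 0.5.
for H in (0.5, 0.7, 0.9):
    print(H, fgn_autocov(1, H))

x, y = simulate_sv_arma11(T=200, a1=0.8, b1=0.3, H=0.9)
```

For H=0.5 the autocovariance is zero at every nonzero lag, so the simulation falls back to a classical ARMA(1,1) with i.i.d. Gaussian innovations.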
We evaluated the proposed method in this nonlinear model, first under the known ARMA parameter case, and then, under unknown parameters. We show in Fig. 2 how the proposed method is able to accurately track different latent processes with diverse memory properties, even when the innovation variance \(\sigma _{u}^{2}\) is unknown. The SMC methods were run with M=1000 particles for different values of the Hurst parameter.
We further studied the filtering performance of the methods described in Subsection 4.2.1 and conclude, based on results summarized in Table 1, that the proposed SMC method successfully estimates the latent state, both for the known and unknown innovation variance cases, for any given memory.
Note that the unknown \(\sigma _{u}^{2}\) case induces a slight loss in accuracy. However, the estimation performance is comparable. The justification relies on the form of the derived marginalized density. As more data are observed, the transition density for the unknown variance case (i.e., Eq. (17)) becomes very similar to the one in the known case (i.e., Eq. (16)). This occurs because a Student t-distribution with high degrees of freedom is very similar to a Gaussian distribution. Thus, the proposal densities in both SMC methods become almost identical with time.
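The convergence of the Student t-density to the Gaussian can be verified with a short, self-contained Python sketch. The helper names are hypothetical, and the comparison is made for the standardized densities (zero location, unit scale) on a fixed grid, which is an illustrative choice.

```python
import math

def student_t_pdf(x, nu, mu=0.0, scale2=1.0):
    """Density of a Student t-distribution with nu degrees of freedom."""
    z2 = (x - mu) ** 2 / scale2
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi * scale2))
    return math.exp(log_c - 0.5 * (nu + 1) * math.log1p(z2 / nu))

def gaussian_pdf(x, mu=0.0, var=1.0):
    """Density of a Gaussian distribution."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def max_gap(nu):
    """Largest absolute difference between both densities on a grid over [-5, 5]."""
    grid = [i / 10 for i in range(-50, 51)]
    return max(abs(student_t_pdf(x, nu) - gaussian_pdf(x)) for x in grid)

# The degrees of freedom grow linearly with the number of observed samples.
for nu in (5, 50, 500):
    print(nu, max_gap(nu))
```

Because the degrees of freedom in Eq. (37) grow as \(\nu_{0}+t\), the gap between both transition densities shrinks as more data are processed.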
To get further insight into the impact of not knowing the innovation variance \(\sigma _{u}^{2}\), we study the evolution of the scale factor in Eq. (17) over time. We plot (see Fig. 3) the scale factor
$$ \frac{\nu_{0}\sigma_{0}^{2}+x_{1:t}^{\top}\Sigma_{t}^{-1}x_{1:t}}{\nu_{0}+t} $$
as we get more data and observe that the estimate approaches the true \(\sigma _{u}^{2}\) value. The estimation accuracy improves with time for all the evaluated ARMA parameterizations and memory properties of the innovation process.
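The behavior of the scale factor can be reproduced with the sketch below, which evaluates the ratio of the prior term plus the quadratic form of the observed trajectory (with the inverse normalized covariance) over the prior degrees of freedom plus t, for every t in a single pass. The function name, the exponentially decaying test covariance, and the chosen hyperparameters are assumptions made for illustration. The single Cholesky factorization relies on the fact that the leading block of a lower-triangular Cholesky factor is the Cholesky factor of the corresponding leading block of the covariance.

```python
import numpy as np

def scale_factor_path(x, Sigma, nu0, sigma2_0):
    """Scale factor for t = 1, ..., T, using one Cholesky factorization.

    x        : trajectory x_{1:T}
    Sigma    : normalized covariance of x_{1:T} (variance factored out)
    nu0, sigma2_0 : prior hyperparameters of the unknown variance
    """
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, x)      # whitened samples; first t entries serve time t
    quad = np.cumsum(z ** 2)       # x_{1:t}^T Sigma_t^{-1} x_{1:t} for every t
    t = np.arange(1, x.size + 1)
    return (nu0 * sigma2_0 + quad) / (nu0 + t)

# Example with a true variance of 2 and an illustrative normalized covariance.
rng = np.random.default_rng(0)
T, true_sigma2 = 1000, 2.0
idx = np.arange(T)
Sigma = 0.6 ** np.abs(idx[:, None] - idx[None, :])
x = np.sqrt(true_sigma2) * np.linalg.cholesky(Sigma) @ rng.standard_normal(T)
path = scale_factor_path(x, Sigma, nu0=2.0, sigma2_0=1.0)
print(path[[9, 99, 999]])
```

The printed values can be compared with the true variance to see how the influence of the prior hyperparameters fades as t grows.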
We now turn our attention to the more challenging scenario where the ARMA parameters are unknown. We evaluate both proposed approaches, i.e., the DA- and IS-based SMC methods from Subsections 4.2.2 and 4.2.3, respectively.
Once again, we study the performance of the method for latent processes with different memory properties. In Table 2, we provide averaged state mean squared error (MSE) results for AR(1), MA(1), and ARMA(1,1) models with uncorrelated (H=0.5), short (H=0.7), and long-memory (H=0.9) fractional Gaussian innovations.
The state filtering results allow us to conclude that both proposed approaches are suitable solutions for inference of latent processes with unknown ARMA parameters. Besides, it is noticeable that the impact on state estimation accuracy of not knowing the parameters a and b is more pronounced than not knowing the innovation variance \(\sigma _{u}^{2}\).
We also observe a slightly better filtering performance of the DA-SMC when compared to the IS-SMC. However, this improved state tracking accuracy comes at a cost, as the estimation of the unknown parameters is worse for the DA-based SMC. This effect is illustrated with some parameter estimation realizations of an ARMA(1,1) process with unknown parameters \(a_{1}\) and \(b_{1}\), with known innovation variance \(\sigma _{u}^{2}\) in Fig. 4, and unknown variance \(\sigma _{u}^{2}\) in Fig. 5.
We note that, for both proposed SMC methods, estimation of the AR parameters is more accurate than that of the MA parameters. Recall that, due to how these parameters are used in our computations (inversion and multiplication of matrices involved when evaluating the sufficient statistics), identifiability and numerical issues may arise. In the proposed method, particles that result from numerical instabilities are automatically discarded through their corresponding weights, as they become negligible. In such cases, we observe a reduced effective particle size, but the method is still able to track the state.
Furthermore, the DA-based SMC overestimates the unknown variance (see Fig. 5). Although this might seem irrelevant for the filtering problem, variance overestimation is critical when predicting future instances of the time-series, as the density becomes too wide to be informative.
The poor parameter estimation accuracy for the DA-based SMC can be understood by inspecting the weight computation for each of the proposed alternatives. For the DA-based approach (i.e., Eq. (30)), only state samples are accounted for, while for the IS-based approach as in Eq. (41), both state and parameter samples are taken into account. We further explain this effect with results in Table 3 and interpret them as follows.
When applying IS, one explicitly computes weights based on both the state and parameter samples, while with the DA approach, one hopes that the best state particles are associated with good parameters too (although this is not explicitly accounted for). As a result of the parameter-explicit weight computation in IS-based methods, the number of particles with non-negligible weights is much reduced at every time instant (as one looks for both good state and parameter samples). Consequently, the effective particle size of the IS-based SMC method is quite low, and thus, the obtained results are much more volatile. Averaged effective particle sizes for the proposed SMC methods are provided in Table 3.
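A common estimate of the effective particle size is the inverse of the sum of squared normalized weights. The sketch below uses this estimate, which is assumed here and not necessarily the exact quantity reported in Table 3, together with a plain multinomial (categorical) resampling step. Both function names are hypothetical.

```python
import numpy as np

def effective_sample_size(w):
    """Effective particle size estimated as 1 / sum(w^2) for normalized weights."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def multinomial_resample(particles, w, rng):
    """Draw len(w) particles with replacement from the categorical distribution w."""
    idx = rng.choice(len(w), size=len(w), p=w)
    return particles[idx]

# Sharper weights (as produced when both states and parameters are scored)
# lead to a smaller effective particle size than flatter ones.
rng = np.random.default_rng(0)
M = 1000
log_w_flat = rng.normal(0.0, 0.5, size=M)
log_w_sharp = rng.normal(0.0, 3.0, size=M)
for name, log_w in (("flat", log_w_flat), ("sharp", log_w_sharp)):
    w = np.exp(log_w - log_w.max())
    print(name, effective_sample_size(w))
```

The estimate equals M for uniform weights and one when a single particle carries all the weight, which is why it is a convenient summary of weight degeneracy.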
Conclusions
In this paper, we proposed a set of SMC methods for inference of latent time-series with innovations correlated in time. This is achieved in a unified and consistent manner for different types of models and for scenarios that include known and unknown parameters. We mathematically formulated the problem using the state-space methodology, where the latent time-series was modeled as an ARMA(p,q) driven by innovations correlated in time. The provided compact formulation allows for a Bayesian analysis of the model, which results in the derivation of the key transition density for the proposed SMC methods. Different parameter assumptions were considered and, as shown by the presented extensive results, the SMC methods are able to accurately infer the hidden states. The proposed method is generic in that it addresses diverse memory properties in a coherent manner and is accurate in estimating latent time-series.
Notes
 1.
We use innovations to refer to the stochastic process driving the time-series. Although a reader from the signal-processing community might be more familiar with the term noise, we prefer to use innovations as it is most common in the statistical literature of stochastic processes and time-series analysis.
References
 1
PJ Brockwell, RA Davis, Time Series: Theory and Methods, 2nd edn. Springer Series in Statistics (Springer, 1991).
 2
J Durbin, SJ Koopman, Time Series Analysis by State Space Methods. Oxford Statistical Science Series (Oxford University Press, 2001).
 3
RH Shumway, DS Stoffer, Time Series Analysis and Its Applications: With R Examples (Springer Texts in Statistics), 3rd edn. (Springer, 2010).
 4
P Whittle, Hypothesis Testing in Time Series Analysis (Almquist and Wicksell, 1951).
 5
GEP Box, GM Jenkins, Time Series Analysis: Forecasting and Control. Holden-Day series in time series analysis and digital processing (Holden-Day, 1976).
 6
HE Hurst, Long-term storage capacity of reservoirs. Trans. Am. Soc. Civil Eng.116, 770–808 (1951).
 7
RT Baillie, Long memory processes and fractional integration in econometrics. J. Econ.73(1), 5–59 (1996).
 8
J Beran, Statistics for Long-Memory Processes. Chapman & Hall/CRC Monographs on Statistics & Applied Probability (Taylor & Francis, 1994).
 9
W Palma, Long-Memory Time Series: Theory and Methods. Wiley Series in Probability and Statistics (Wiley, 2007).
 10
H Liu, E Erdem, J Shi, Comprehensive evaluation of ARMA–GARCH(M) approaches for modeling the mean and volatility of wind speed. Appl. Energy. 88(3), 724–732 (2011).
 11
DJ Swider, C Weber, Extended ARMA models for estimating price developments on day-ahead electricity markets. Electr. Power Syst. Res.77(5–6), 583–593 (2007).
 12
R Prado, HF Lopes, Sequential parameter learning and filtering in structured autoregressive state-space models. Stat. Comput.23(1), 43–57 (2013).
 13
S Degiannakis, C Floros, Methods of Volatility Estimation and Forecasting (Palgrave Macmillan UK, London, 2015).
 14
ST Rachev, JSJ Hsu, BS Bagasheva, FJ Fabozzi, Bayesian Methods in Finance. Frank J. Fabozzi Series (Wiley, 2008).
 15
J Durbin, SJ Koopman, Time Series Analysis by State Space Methods: Second Edition, 2nd edn. Oxford Statistical Science Series (Oxford University Press, 2012).
 16
RE Kalman, A new approach to linear filtering and prediction problems. Trans. ASME–J. Basic Eng.82(Series D), 35–45 (1960).
 17
E Jacquier, NG Polson, PE Rossi, Bayesian analysis of stochastic volatility models. J. Business Econ. Stat.12(4), 371–89 (1994).
 18
C Broto, E Ruiz, Estimation methods for stochastic volatility models: a survey. J. Econ. Surv., 613–649 (2004).
 19
BDO Anderson, JB Moore, Optimal Filtering. Dover Books on Electrical Engineering (Dover Publications, 2012).
 20
MS Arulampalam, S Maskell, N Gordon, T Clapp, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. Signal Process. IEEE Trans.50(2), 174–188 (2002).
 21
PM Djurić, JH Kotecha, J Zhang, Y Huang, T Ghirmai, MF Bugallo, J Míguez, Particle filtering. IEEE Signal Process. Mag.20(5), 19–38 (2003).
 22
A Doucet, N De Freitas, N Gordon, Sequential Monte Carlo Methods in Practice (Springer, 2001).
 23
NJ Gordon, DJ Salmond, AFM Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation. Radar Signal Process. IEEE Proc.140(2), 107–113 (1993).
 24
B Ristic, S Arulampalam, N Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications (Artech House, 2004).
 25
PJ van Leeuwen, Particle filtering in geophysical systems. Monthly Weather Rev.137(12), 4089–4114 (2009).
 26
EL Ionides, C Bretó, AA King, Inference for nonlinear dynamical systems. Proc. Natl. Acad. Sci.103(49), 18438–18443 (2006).
 27
EL Ionides, KS Fang, RR Isseroff, GF Oster, Stochastic models for cell motion and taxis. J. Math. Biol.48(1), 23–37 (2004).
 28
D Creal, A survey of Sequential Monte Carlo methods for economics and finance. Econ. Rev.31(3), 245–296 (2012).
 29
CM Carvalho, MS Johannes, HF Lopes, NG Polson, Particle learning and smoothing. Stat. Sci.25(1), 88–106 (2010).
 30
N Chopin, PE Jacob, O Papaspiliopoulos, SMC^2: an efficient algorithm for sequential analysis of state-space models. ArXiv e-prints (2011). http://arxiv.org/abs/1101.1528.
 31
D Crisan, J Míguez, Nested particle filters for online parameter estimation in discrete-time state-space Markov models. ArXiv e-prints (2013). http://arxiv.org/abs/1308.1883.
 32
I Urteaga, PM Djurić, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. Estimation of ARMA state processes by particle filtering, (2014), pp. 8033–8037.
 33
I Urteaga, PM Djurić, Sequential estimation of hidden ARMA processes by particle filtering—part I. IEEE Trans. Signal Process.65:, 482–493 (2016).
 34
I Urteaga, PM Djurić, Sequential estimation of hidden ARMA processes by particle filtering—part II. IEEE Trans. Signal Process.65:, 494–504 (2016).
 35
I Urteaga, MF Bugallo, PM Djurić, in Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2015 IEEE 6th International Workshop On. Filtering of nonlinear time-series coupled by fractional Gaussian processes, (2015).
 36
CC Drovandi, JM McGree, AN Pettitt, A sequential monte carlo algorithm to incorporate model uncertainty in bayesian sequential design. J. Comput. Graph. Stat.23(1), 3–24 (2014).
 37
I Urteaga, MF Bugallo, PM Djurić, in 2016 IEEE Statistical Signal Processing Workshop (SSP). Sequential Monte Carlo methods under model uncertainty, (2016), pp. 1–5.
 38
L Martino, J Read, V Elvira, F Louzada, Cooperative parallel particle filters for online model selection and applications to urban mobility. Digital Signal Process.60(Supplement C), 172–185 (2017).
 39
JM Bernardo, AFM Smith, Bayesian Theory. Wiley Series in Probability and Statistics (Wiley, 2009).
 40
T Li, M Bolić, PM Djurić, Resampling methods for particle filtering: classification, implementation, and strategies. Signal Process. Mag. IEEE. 32(3), 70–86 (2015).
 41
L Martino, V Elvira, F Louzada, Effective sample size for importance sampling based on discrepancy measures. Signal Process.131, 386–401 (2017).
 42
MH Hayes, Statistical Digital Signal Processing and Modeling (John Wiley & Sons, 1996).
 43
J Liu, M West, Combined Parameter and State Estimation in Simulation Based Filtering (Springer, 2001). Chapter 10 in “Sequential Monte Carlo Methods in Practice”.
 44
C Nemeth, P Fearnhead, L Mihaylova, Sequential Monte Carlo methods for state and parameter estimation in abruptly changing environments. Signal Process. IEEE Trans.62(5), 1245–1255 (2014).
 45
PM Djurić, MF Bugallo, J Míguez, in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), 2. Density assisted particle filters for state and parameter estimation, (2004), pp. ii701–704.
 46
J Olsson, O Cappé, R Douc, E Moulines, Sequential Monte Carlo smoothing with application to parameter estimation in nonlinear state space models. ArXiv Mathematics e-prints (2006). http://arxiv.org/abs/math/0609514.
 47
J Olsson, J Westerborn, Efficient particle-based online smoothing in general hidden Markov models: the PaRIS algorithm. ArXiv e-prints (2014). http://arxiv.org/abs/1412.7550.
 48
G Agamennoni, EM Nebot, Robust estimation in nonlinear state-space models with state-dependent noise. Signal Process. IEEE Trans.62(8), 2165–2175 (2014).
 49
PM Djurić, M Khan, DE Johnston, Particle filtering of stochastic volatility modeled with leverage. Selected Topics Signal Process. IEEE J.6(4), 327–336 (2012).
 50
AC Harvey, E Ruiz, N Shephard, Multivariate stochastic variance models. Rev. Econ. Stud.61(2), 247–264 (1994).
 51
R Bhar, D Lee, Comparing estimation procedures for stochastic volatility models of short-term interest rates (2009).
 52
SS Ozturk, JF Richard, Stochastic volatility and leverage: application to a panel of S&P500 stocks. Finance Res. Lett.12, 67–76 (2015).
 53
N Shephard, Stochastic Volatility: Selected Readings. Advanced texts in econometrics (Oxford University Press, 2005).
 54
SJ Julier, JK Uhlmann, Unscented filtering and nonlinear estimation. Proc. IEEE. 92(3), 401–422 (2004).
 55
RVD Merwe, E Wan, in Proceedings of the Workshop on Advances in Machine Learning. Sigma-point Kalman filters for probabilistic inference in dynamic state-space models, (2003).
 56
O Zoeter, A Ypma, T Heskes, in Machine Learning for Signal Processing, 2004. Proceedings of the 2004 14th IEEE Signal Processing Society Workshop. Improved unscented Kalman smoothing for stock volatility estimation, (2004), pp. 143–152.
 57
R Cont, in Fractals in Engineering, ed. by J Lévy-Véhel, E Lutton. Long range dependence in financial markets (Springer, 2005), pp. 159–179.
Acknowledgements
We thank the anonymous reviewers for their feedback and comments.
Funding
This work has been supported by the National Science Foundation under award CCF-1617986 (MFB) and award CCF-1618999 (PMD).
Availability of data and materials
Not applicable; all data were simulated via the algorithms described in the manuscript.
Author information
Contributions
IU, MFB, and PMD conceived the main ideas behind the proposed approach and designed the experiments. IU conducted the experiments and wrote the first draft of the manuscript. MFB and PMD reviewed and edited it. All authors read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Authors’ information
Iñigo Urteaga received the M.S. degree in telecommunications engineering from the ETSI Bilbao, UPV/EHU, Bilbao, Spain, in 2008, and the Ph.D. degree in Electrical Engineering from Stony Brook University, Stony Brook, NY, USA, in 2016. He is currently a Postdoctoral Research Scientist affiliated with both the Data Science Institute and the Department of Biomedical Informatics, at Columbia University, NY, USA. From 2007 to 2008, he was a Research Assistant in the Department of Computer Science, Colorado School of Mines (USA) and a Research Associate with Tecnalia Research and Innovation (Spain) from 2009 to 2011. He received the 2016 Armstrong Memorial Award for the best graduate student in the Department of Electrical and Computer Engineering at Stony Brook University. His main research interests include statistical signal processing and more specifically, Bayesian statistics and simulation-based Monte Carlo techniques. He focuses on the science of data inference, modeling, and prediction with applications to a wide range of disciplines. Urteaga is a member of the IEEE and serves as a reviewer of several journals and conferences.
Mónica F. Bugallo received her B.S., M.S., and Ph.D. degrees in computer science and engineering from University of A Coruña, Spain. She joined Stony Brook University in 2002 where she is currently a Professor of the Department of Electrical and Computer Engineering, as well as the Faculty Director of the Women In Science and Engineering (WISE) program of the College of Engineering and Applied Sciences. Her research interests are in the field of statistical signal processing, with emphasis on the theory of Monte Carlo methods for complex systems and its application to different disciplines including cancer research, sensor networks, and finance. In addition, she has focused on STEM education and has initiated several successful programs with the purpose of engaging students at all academic stages in the excitement of engineering and research, with focus on underrepresented groups. She has authored and coauthored two book chapters and more than 150 journal papers and refereed conference articles. Bugallo is a senior member of the IEEE, serves as an elected member on the Sensor Array and Multichannel and the Signal Processing Theory and Methods technical committees, and is the current chair of the IEEE Signal Processing Society Education Committee. She has been part of the technical committee and has organized various professional conferences and workshops and has received several prestigious research and education awards including the award for Best Paper in the IEEE Signal Processing Magazine (2007), the IEEE Athanasios Papoulis Award (2011), and the Chair of Excellence by the Universidad Carlos III de Madrid-Banco de Santander (Spain) (2012).
Petar M. Djurić received the B.S. and M.S. degrees in electrical engineering from the University of Belgrade, Belgrade, in 1981 and 1986, respectively, and the Ph.D. degree in electrical engineering from the University of Rhode Island, Kingston, RI, in 1990. Since 1990, he has been with Stony Brook University in the Department of Electrical and Computer Engineering, where he is a SUNY Distinguished Professor. From 1981 to 1986, he was a Research Associate with the Vinča Institute of Nuclear Sciences, Belgrade. His research interests include the area of signal and information processing with primary interests in the theory of signal modeling, detection, and estimation; Monte Carlo-based methods; signal and information processing over networks; and applications in a wide range of disciplines. He has been invited to lecture at many universities in the USA and overseas. He received the IEEE Signal Processing Magazine Best Paper Award in 2007 and the EURASIP Technical Achievement Award in 2012. In 2008, he was the Chair of Excellence of Universidad Carlos III de Madrid-Banco de Santander. From 2008 to 2009, he was a Distinguished Lecturer of the IEEE Signal Processing Society. He has been on numerous committees of the IEEE Signal Processing Society and of many professional conferences and workshops. He is a Fellow of IEEE and EURASIP and the Editor-in-Chief of the IEEE Transactions on Signal and Information Processing Over Networks.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Urteaga, I., Bugallo, M.F. & Djurić, P.M. Sequential Monte Carlo for inference of latent ARMA time-series with innovations correlated in time. EURASIP J. Adv. Signal Process. 2017, 84 (2017). https://doi.org/10.1186/s13634-017-0518-4
Keywords
 Sequential Monte Carlo
 Correlated innovations
 Latent time-series
 State-space models
 ARMA
 FARIMA
 Fractional Gaussian process