This section details the sequential Monte Carlo algorithms which can be used to approximate the conditional distribution of the states \((a_{1},\ldots,a_{n})\) or the marginal distributions of \((a_{i},z_{i})\) given the observations \((Y_{1},\ldots,Y_{n})\). For any m×m matrix A, let |A| denote the determinant of A. If A is a positive-definite matrix, for all \(z\in \mathbb {R}^{m}\) define
$$\left\|z\right\|_{A}^{2} {:=} z'A^{-1}z\,, $$
where for any vector or matrix z, z′ denotes the transpose matrix of z. Let \(m(a_{i},z_{i-1};z_{i})\) be the probability density of the conditional distribution of \(Z_{i}\) given \((a_{i},Z_{i-1})\) and \(g(a_{i},z_{i};y_{i})\) be the probability density of the conditional distribution of \(Y_{i}\) given \((a_{i},Z_{i})\):
$$ {}\begin{aligned} m_{}(a_{i},z_{i-1};z_{i}) & {:=} \left|2\pi\overline{H}_{a_{i}}\right|^{-1/2}\exp\!\left\{-\frac{1}{2}\left\|z_{i} -d_{a_{i}} \,-\,T_{a_{i}}z_{i-1}\right\|_{\overline{H}_{a_{i}}}^{2}\right\}\,, \end{aligned} $$
(3)
$$ {}\begin{aligned} g_{}(a_{i},z_{i};y_{i}) & {:=} \left|2\pi\overline{G}_{a_{i}}\right|^{-1/2}\exp\!\left\{-\frac{1}{2}\left\|y_{i} - c_{a_{i}} \,-\, B_{a_{i}}z_{i}\right\|_{\overline{G}_{a_{i}}}^{2}\right\}\,, \end{aligned} $$
(4)
where
$$\overline{G}_{j} {:=} G_{j}G'_{j}\;,\; \overline{H}_{j} {:=} H_{j}H'_{j}\,. $$
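As an illustration of (3) and (4), the following is a minimal sketch assuming scalar states and observations, so that every matrix reduces to a float and \(\overline{H}_{j}=H_{j}^{2}\), \(\overline{G}_{j}=G_{j}^{2}\). The function names and the 0-based regime indexing are assumptions made for this example only.

```python
import math


def gaussian_density(x, mean, var):
    """Density of the scalar Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def m_density(a, z_prev, z, d, T, H):
    """Scalar version of (3): density of Z_i = z given regime a and Z_{i-1} = z_prev."""
    return gaussian_density(z, d[a] + T[a] * z_prev, H[a] ** 2)


def g_density(a, z, y, c, B, G):
    """Scalar version of (4): density of Y_i = y given regime a and Z_i = z."""
    return gaussian_density(y, c[a] + B[a] * z, G[a] ** 2)
```

Each parameter list (`d`, `T`, `H`, `c`, `B`, `G`) holds one value per regime, mirroring the regime-dependent coefficients of the model.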
All the algorithms considered in this paper are based on forward-backward or two-filter decompositions of the smoothing distributions and share the same forward filter presented in Section 2.1.
2.1 Forward filter
The SMC approximation \(p^{N}_{}(a_{1:i},z_{i}|y_{1:i})\) of \(p(a_{1:i},z_{i}|y_{1:i})\) may be obtained using a standard Rao-Blackwellized algorithm. The procedure produces a sequence of trajectories \(\left (a^{k}_{1:i}\right)_{1\le k \le N}\) associated with normalized importance weights \(\left (\omega ^{k}_{i}\right)_{1\le k \le N}\), with \(\sum _{k=1}^{N} \omega ^{k}_{i} = 1\), used to define the following approximation of \(p(a_{1:i},z_{i}|y_{1:i})\):
$$ p^{N}_{}(a_{1:i},z_{i}|y_{1:i}) = \sum_{k=1}^{N}\omega^{k}_{i}\,p_{}\left(z_{i}|a^{k}_{1:i},y_{1:i}\right)\,\delta_{a^{k}_{1:i}}(a_{1:i})\,, $$
(5)
where δ is the Dirac delta function. In this equation, the conditional distribution of the hidden state \(z_{i}\) given the observations \(y_{1:i}\) and a trajectory \(a^{k}_{1:i}\) is a Gaussian distribution whose mean \(\mu ^{k}_{i}\) and variance \(P^{k}_{i}\) may be obtained by using the Kalman filter update.
2.1.1 Initialization
At time i=1, write, for all 1≤j≤J,
$$\mu^{j}_{1|0} = c_{j}+B_{j}\mu_{1}\;\;\text{and}\;\;P_{1|0}^{j} =B_{j}\Sigma_{1}B'_{j} + \overline{G}_{j}\,. $$
The regimes \(\left (a^{k}_{1}\right)_{1\le k \le N}\) are sampled independently in {1,…,J} with probabilities
$$\begin{array}{*{20}l} &p(a_{1}=j|y_{1}) \propto \pi_{j} \left|P_{1|0}^{j}\right|^{-1/2}\exp\left\{-\left(y_{1}-\mu^{j}_{1|0}\right)'\right.\\ &\quad \times \left.\left(P_{1|0}^{j}\right)^{-1}\left(y_{1}-\mu^{j}_{1|0}\right)/2\right\}\,. \end{array} $$
Then, \(\mu _{1}^{k}\) and \(P_{1}^{k}\) are computed using a Kalman filter:
$$\begin{array}{*{20}l} K^{k}_{1} &=\Sigma_{1}B'_{a_{1}^{k}}\left(B_{a_{1}^{k}}\Sigma_{1}B'_{a_{1}^{k}} + \overline{G}_{a_{1}^{k}}\right)^{-1}\,,\\ \mu^{k}_{1} &= \mu_{1} + K^{k}_{1}\left(Y_{1} - c_{a_{1}^{k}} - B_{a_{1}^{k}}\mu_{1}\right)\,,\\ P^{k}_{1} &=\left(I_{\mathsf{m}}-K^{k}_{1}B_{a_{1}^{k}}\right)\Sigma_{1}\,, \end{array} $$
where, for any positive integer p, \(I_{p}\) is the p×p identity matrix. Each particle \(a_{1}^{k}\) is associated with the importance weight \(\omega ^{k}_{1} = 1/N\).
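The initialization can be sketched as follows, under the assumption of scalar states and observations (matrices become floats) and with regimes indexed from 0. The function name `init_forward_filter`, its argument names, and the interpretation of `pi` as the prior probabilities of the first regime are choices made for this illustration.

```python
import math
import random


def init_forward_filter(y1, pi, c, B, G_bar, mu1, sigma1, n_particles, rng):
    """Sample the first regimes and run one scalar Kalman update per particle."""
    n_regimes = len(pi)
    # Predictive mean and variance of Y_1 under each regime j.
    pred_mean = [c[j] + B[j] * mu1 for j in range(n_regimes)]
    pred_var = [B[j] ** 2 * sigma1 + G_bar[j] for j in range(n_regimes)]
    # Posterior probabilities of a_1 = j given y_1, up to normalization.
    unnorm = [
        pi[j] * pred_var[j] ** -0.5
        * math.exp(-0.5 * (y1 - pred_mean[j]) ** 2 / pred_var[j])
        for j in range(n_regimes)
    ]
    total = sum(unnorm)
    probs = [u / total for u in unnorm]
    regimes = rng.choices(range(n_regimes), weights=probs, k=n_particles)
    means, variances = [], []
    for a in regimes:
        gain = sigma1 * B[a] / pred_var[a]            # Kalman gain K_1^k
        means.append(mu1 + gain * (y1 - pred_mean[a]))
        variances.append((1.0 - gain * B[a]) * sigma1)
    weights = [1.0 / n_particles] * n_particles       # uniform initial weights
    return regimes, means, variances, weights, probs
```

Because the regimes are drawn from their exact posterior given the first observation, all particles start with the same weight 1/N.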
2.1.2 Iterations
Several procedures may be used to extend the trajectories \(\left (a^{k}_{1:i-1}\right)_{1\le k \le N}\) at time i. For all sampled trajectories \(\left (a_{1:i-1}^{k}\right)_{1\le k \le N}\) and all 1≤j≤J, [6] used the incremental weights:
$$\gamma_{i}^{j,k} = p\left(y_{i} | a_{i} = j, a_{1:i-1}^{k}, y_{1:i-1}\right) Q\left(a_{i-1}^{k},j\right)\,. $$
The conditional distribution of \(Y_{i}\) given \(a^{k}_{1:i-1}\), \(a_{i}\), and \(Y_{1:i-1}\) is a Gaussian distribution with mean \(c_{a_{i}}+B_{a_{i}}\mu ^{k}_{i|i-1}(a_{i})\) and variance \(B_{a_{i}}P^{k}_{i|i-1}(a_{i})B'_{a_{i}} + \overline {G}_{a_{i}}\) where
$$\begin{array}{*{20}l} \mu^{k}_{i|i-1}(a_{i}) &= d_{a_{i}} + T_{a_{i}}\mu^{k}_{i-1}\,,\\ P^{k}_{i|i-1}(a_{i}) &= T_{a_{i}}P^{k}_{i-1}T'_{a_{i}} + \overline{H}_{a_{i}}\,. \end{array} $$
Therefore,
$$\begin{array}{*{20}l} \gamma_{i}^{j,k} \propto &Q\left(a_{i-1}^{k},j\right)\left|B_{j}P^{j,k}_{i|i-1}B'_{j} + \overline{G}_{j}\right|^{-1/2}\exp\left\{-\frac{1}{2}\left\|y_{i}{\vphantom{\mu^{j,k}_{i|i-1}}}\right.\right.\\ &\left.\left. \quad-c_{j}-B_{j}\mu^{j,k}_{i|i-1}\right\|_{B_{j}P^{j,k}_{i|i-1}B'_{j} + \overline{G}_{j}}^{2}\right\} \,, \end{array} $$
where
$$\begin{array}{*{20}l} \mu^{j,k}_{i|i-1} &= \mu^{k}_{i|i-1}(j) = d_{j} + T_{j}\mu^{k}_{i-1}\,, \end{array} $$
(6)
$$\begin{array}{*{20}l} P^{j,k}_{i|i-1} &= P^{k}_{i|i-1}(j) = T_{j}P^{k}_{i-1}T'_{j} + \overline{H}_{j}\,. \end{array} $$
(7)
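A minimal sketch of the incremental weights, using the predictive moments (6) and (7), is given below. It assumes scalar states and observations and 0-based regime indices; the function name and argument layout are hypothetical.

```python
import math


def incremental_weights(y, a_prev, mu_prev, p_prev, Q, c, B, G_bar, d, T, H_bar):
    """Return the list of gamma_i^{j,k}, j = 0..J-1, for one particle k (scalar case)."""
    gammas = []
    for j in range(len(c)):
        mu_pred = d[j] + T[j] * mu_prev            # predictive mean, as in (6)
        p_pred = T[j] ** 2 * p_prev + H_bar[j]     # predictive variance, as in (7)
        y_mean = c[j] + B[j] * mu_pred             # mean of Y_i given the extended path
        y_var = B[j] ** 2 * p_pred + G_bar[j]      # variance of Y_i given the extended path
        likelihood = (math.exp(-0.5 * (y - y_mean) ** 2 / y_var)
                      / math.sqrt(2.0 * math.pi * y_var))
        gammas.append(Q[a_prev][j] * likelihood)
    return gammas
```

Here `Q[a_prev][j]` plays the role of \(Q(a_{i-1}^{k},j)\), and `mu_prev`, `p_prev` are the filtered mean and variance carried by particle k at time i−1.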
In [6], for all 1≤k≤N, an ancestral path is chosen with probabilities proportional to \(\left (\omega ^{k}_{i-1}\right)_{1\le k \le N}\). Then, the new regime \(a_{i}^{k}\) is sampled in {1,…,J} with probabilities proportional to \((\gamma _{i}^{j,k})_{1\le j\le J}\). A drawback of this method is that only ancestral paths that have been selected using the importance weights \(\left (\omega ^{k}_{i-1}\right)_{1\le k \le N}\) are extended at time i. Following [5], this may be improved by considering all the offspring of all ancestral trajectories \(\left (a_{1:i-1}^{k}\right)_{1\le k \le N}\). Each ancestral path has J offspring at time i; it is thus necessary to select a given number of trajectories at time i (for instance N) among the NJ possible paths. To obtain the weight associated with each offspring, write the following approximation of \(p(a_{1:i}|y_{1:i})\) based on the weighted samples at time i−1:
$$\begin{aligned} p^{N}(a_{1:i}|y_{1:i})\propto& \sum_{k=1}^{N}\omega^{k}_{i-1}Q\left(a^{k}_{i-1},a_{i}\right)\\ & p\left(y_{i}|a^{k}_{1:i-1},a_{i},y_{1:i-1}\right)\delta_{a^{k}_{1:i-1}}(a_{1:i-1})\,,\\ \propto& \sum_{k=1}^{N}\sum_{j=1}^{J}\omega^{k}_{i-1}\gamma_{i}^{j,k}\delta_{\left(a^{k}_{1:i-1},j\right)}(a_{1:i})\,. \end{aligned} $$
Therefore, each extended trajectory of the form \(\left (a^{k}_{1:i-1},j\right)\), 1≤k≤N, 1≤j≤J, is associated with the normalized weight \(\tilde {\omega }^{j,k}_{i} \propto \omega ^{k}_{i-1}\gamma _{i}^{j,k}\). Several random selection schemes have been proposed to discard some of the possible offspring while maintaining an average number of N particles at each time step. Following [5], we may choose between the Kullback-Leibler optimal selection (KL-OS) or the chi-squared optimal selection (CS-OS) to associate a new weight with each of the NJ trajectories. If the new weight is 0, then the corresponding particle is removed.
2.1.2.1 KL-OS:
λ is chosen as the solution of
$$\sum_{k=1}^{N}\sum_{j=1}^{J}\text{min}\left(\tilde{\omega}^{j,k}_{i}/\lambda,1\right) = N\,. $$
For all 1≤j≤J and 1≤k≤N, if \(\tilde {\omega }^{j,k}_{i}\ge \lambda \) then the new weight \(\tilde {\Omega }^{j,k}_{i}\) is \(\tilde {\Omega }^{j,k}_{i}=\tilde {\omega }^{j,k}_{i}\) and if \(\tilde {\omega }^{j,k}_{i}< \lambda \):
$$\tilde{\Omega}^{j,k}_{i}= \left\{ \begin{array}{rl} &\lambda \;\text{with probability~} \tilde{\omega}^{j,k}_{i}/\lambda\,,\\ &0 \;\text{with probability~} 1-\tilde{\omega}^{j,k}_{i}/\lambda\,. \end{array} \right. $$
2.1.2.2 CS-OS:
λ is chosen as the solution of
$$\sum_{k=1}^{N}\sum_{j=1}^{J}\text{min}\left(\sqrt{\tilde{\omega}^{j,k}_{i}/\lambda},1\right) = N\,. $$
For all 1≤j≤J and 1≤k≤N, if \(\tilde {\omega }^{j,k}_{i}\ge \lambda \) then the new weight \(\tilde {\Omega }^{j,k}_{i}\) is \(\tilde {\Omega }^{j,k}_{i}=\tilde {\omega }^{j,k}_{i}\) and if \(\tilde {\omega }^{j,k}_{i}< \lambda \):
$$\tilde{\Omega}^{j,k}_{i}= \left\{ \begin{array}{rl} &\sqrt{\tilde{\omega}^{j,k}_{i}\lambda} \;\text{with probability~} \sqrt{\tilde{\omega}^{j,k}_{i}/\lambda}\,,\\ &0 \;\text{with probability~} 1-\sqrt{\tilde{\omega}^{j,k}_{i}/\lambda}\,. \end{array} \right. $$
Then, in both cases, all particles such that \(\tilde {\Omega }^{j,k}_{i} = 0\) are discarded and for all the other trajectories defined as an ancestral path \(\left (a^{k}_{1:i-1}\right)\) extended by \(a^{k}_{i} = j\), the new corresponding weight ω in (5) is given by the normalized weight \(\tilde {\Omega }^{j,k}_{i}\). In the numerical sections of this paper, the Kullback-Leibler optimal selection (KL-OS) scheme is used.
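The two selection schemes can be sketched with a single routine, since both solve an equation of the form \(\sum \min ((\tilde {\omega }/\lambda)^{p},1)=N\) with p=1 for KL-OS and p=1/2 for CS-OS. The code below is only an illustration: the function names, the `power` parameter, and the sorting-based solver for λ are assumptions, not the implementation used in [5].

```python
import random


def selection_threshold(weights, n_keep, power=1.0):
    """Solve sum_k min((w_k / lam) ** power, 1) = n_keep for lam.

    power=1.0 corresponds to KL-OS and power=0.5 to CS-OS.
    """
    if not 1 <= n_keep <= len(weights):
        raise ValueError("n_keep must lie between 1 and the number of weights")
    s = sorted((w ** power for w in weights), reverse=True)
    tail = sum(s)
    for q in range(n_keep):
        # Candidate lam ** power when exactly q weights are kept without randomness.
        level = tail / (n_keep - q)
        if s[q] <= level or q == n_keep - 1:
            return level ** (1.0 / power)
        tail -= s[q]


def optimal_selection(weights, n_keep, rng, power=1.0):
    """Return the normalized new weights (0.0 marks a discarded particle) and lam."""
    lam = selection_threshold(weights, n_keep, power)
    new_weights = []
    for w in weights:
        if w >= lam:
            new_weights.append(w)                      # kept unchanged
        elif rng.random() < (w / lam) ** power:
            new_weights.append(w ** (1.0 - power) * lam ** power)
        else:
            new_weights.append(0.0)                    # discarded
    total = sum(new_weights)
    if total > 0.0:
        new_weights = [w / total for w in new_weights]
    return new_weights, lam
```

With `power=1.0` a surviving small weight is set to λ, and with `power=0.5` it is set to \(\sqrt {\tilde {\omega }\lambda }\), matching the two displays above. In both cases the expected value of the new unnormalized weight equals the original weight.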
2.2 FFBS-based algorithms
2.2.1 FFBS algorithms of [20, 21, 24]
Lindsten et al. [20, 21, 24] proposed a Rao-Blackwellized procedure to sample the regime backward in time following the same steps as in the forward filtering backward smoothing algorithm [7, 15]. The algorithm relies on the decomposition given, for all 1≤i≤n−1, by:
$$p(a_{1:n}|y_{1:n}) = p(a_{1:i}|a_{i+1:n},y_{1:n})p(a_{i+1:n}|y_{1:n})\,. $$
This decomposition is similar to the Rauch-Tung-Striebel decomposition of the smoothing distribution. The first factor on the right-hand side of the previous equation is nevertheless more difficult to handle because it depends on all the observations. As noted by [24], this term can be computed recursively by considering the following decomposition:
$$ {}p(a_{1:i}|a_{i+1:n},y_{1:n}) \propto p(y_{i+1:n},a_{i+1:n}|a_{1:i},y_{1:i})p(a_{1:i}|y_{1:i})\,. $$
(8)
The second factor in the last equation may be approximated using the ancestral trajectories \(\left (a^{k}_{1:i}\right)_{1\le k \le N}\) and the associated importance weights \(\left (\omega ^{k}_{i}\right)_{1\le k \le N}\) produced by the forward filter. Therefore, \(p(a_{1:i}|a_{i+1:n},y_{1:n})\) may be approximated by:
$$\begin{array}{*{20}l} p^{N}&(a_{1:i}|a_{i+1:n},y_{1:n}) = \sum_{k=1}^{N} \tilde{\omega}^{k}_{i|n}\delta_{a^{k}_{1:i}}(a_{1:i}) \\ &\text{with} \quad \tilde{\omega}^{k}_{i|n} \propto \omega_{i}^{k} p\left(y_{i+1:n},a_{i+1:n}|a^{k}_{1:i},y_{1:i}\right) \,. \end{array} $$
Then, a trajectory \(\tilde {a}_{1:n}\) approximately distributed according to \(p(a_{1:n}|y_{1:n})\) may be sampled following these steps:
- Set \(\tilde {a}_{n}= a_{n}^{k}\) with probabilities proportional to \(\left (\omega _{n}^{k}\right)_{1\le k \le N}\).
- For i=n−1,…,1, set \(\tilde {a}_{i} = a_{i}^{k}\) with probabilities proportional to \(\left (\tilde {\omega }_{i|n}^{k}\right)_{1\le k \le N}\), computed with \(a_{i+1:n}=\tilde {a}_{i+1:n}\).
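The two steps above can be sketched as follows. This is a hedged illustration: `backward_likelihood` is a hypothetical user-supplied callable standing for \(p\left (y_{i+1:n},a_{i+1:n}|a^{k}_{1:i},y_{1:i}\right)\) up to a constant not depending on k, and time and regime indices are 0-based.

```python
import random


def sample_backward_trajectory(regimes, weights, backward_likelihood, rng):
    """Draw one regime trajectory backward in time.

    regimes[i][k] is the regime at time i of forward trajectory k and
    weights[i][k] is its filtering weight. backward_likelihood(i, k, future)
    receives the already sampled future regimes and returns a nonnegative number.
    """
    n = len(regimes)
    # Terminal step: pick a particle according to the final filtering weights.
    k = rng.choices(range(len(regimes[-1])), weights=weights[-1])[0]
    path = [regimes[-1][k]]
    # Backward steps: reweight the time-i particles by the backward likelihood.
    for i in range(n - 2, -1, -1):
        smoothing_weights = [
            weights[i][idx] * backward_likelihood(i, idx, path)
            for idx in range(len(regimes[i]))
        ]
        k = rng.choices(range(len(regimes[i])), weights=smoothing_weights)[0]
        path.insert(0, regimes[i][k])
    return path
```

Calling this function repeatedly with the same forward particles yields independent copies of the backward trajectory. Note that `random.Random.choices` raises an error if all smoothing weights are zero, which a production implementation would need to handle.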
This algorithm requires computing the quantity \(p\left (y_{i+1:n},a_{i+1:n}|a^{k}_{1:i},y_{1:i}\right)\). This predictive quantity is available analytically using Kalman filtering techniques. However, this has to be done for each trajectory \(\left (a^{k}_{1:i}\right)_{1\le k \le N}\), which leads to an algorithm with a prohibitive computational complexity. Lindsten et al. [21] proposed a computationally less intensive procedure by conditioning on \(z_{i}\) and then marginalizing with respect to this variable:
$$ \begin{aligned} {}p\left(y_{i+1:n},a_{i+1:n}|a^{k}_{1:i},y_{1:i}\right) =& \int p\left(y_{i+1:n},a_{i+1:n}|z_{i},a^{k}_{i}\right)\\ &\times p\left(z_{i}|a^{k}_{1:i},y_{1:i}\right)\mathrm{d} z_{i}\,. \end{aligned} $$
(9)
This is similar to the two-filter decomposition of the smoothing distribution, see Section 2.3. By [21],
$$\begin{array}{*{20}l} {}p(y_{i+1:n},a_{i+1:n}|z_{i},a_{i}) \propto Q(a_{i},a_{i+1}) \exp\left\{-\left(z_{i}'\Omega_{i}(a_{i+1:n})z_{i}\right.\right.\\ -\left.\left.2\lambda_{i}'(a_{i+1:n})z_{i}\right)/2\right\}\,, \end{array} $$
where the proportionality is with respect to \((a_{i},z_{i})\) and
$${}p(y_{i:n},a_{i+1:n}|z_{i},a_{i}) \!\propto\! \exp\left\{-\left(z_{i}'\widehat{\Omega}_{i}(a_{i:n})z_{i}-\! 2\widehat{\lambda}'_{i}(a_{i:n})z_{i}\right)/2\right\}\!, $$
where the proportionality is with respect to \(z_{i}\). These quantities may be computed recursively backward in time with:
$$\begin{array}{*{20}l} \widehat{\Omega}_{n}(a_{n}) &= B'_{a_{n}}\overline{G}^{-1}_{a_{n}}B_{a_{n}}\,,\\ \widehat{\lambda}_{n}(a_{n}) &=B'_{a_{n}}\overline{G}^{-1}_{a_{n}}(y_{n}-c_{a_{n}})\,. \end{array} $$
Then, for 1≤i≤n−1, define \(m_{i+1} = \widehat {\lambda }_{i+1} - \widehat {\Omega }_{i+1}d_{a_{i+1}}\) and \(M_{i+1} = H_{a_{i+1}}'\widehat {\Omega }_{i+1}H_{a_{i+1}} + I_{\mathsf {m}}\) and write
$$\begin{array}{*{20}l} &\Omega_{i}(a_{i+1:n})\\&\quad= T'_{a_{i+1}}\left(I_{\mathsf{m}}-\widehat{\Omega}_{i+1}(a_{i+1:n})H_{a_{i+1}}M^{-1}_{i+1}H'_{a_{i+1}}\right)\\ &\qquad\times\widehat{\Omega}_{i+1}(a_{i+1:n})T_{a_{i+1}}\,,\\ &\lambda_{i}(a_{i+1:n})\\&\quad=T'_{a_{i+1}}\left(I_{\mathsf{m}}-\widehat{\Omega}_{i+1}(a_{i+1:n})H_{a_{i+1}}M^{-1}_{i+1}H'_{a_{i+1}}\right)m_{i+1}\,. \end{array} $$
As \(p(y_{i:n},a_{i+1:n}|z_{i},a_{i})=p(y_{i}|z_{i},a_{i})p(y_{i+1:n},a_{i+1:n}|z_{i},a_{i})\),
$$\begin{array}{*{20}l} \widehat{\Omega}_{i}(a_{i:n}) &= \Omega_{i}(a_{i+1:n})+ B'_{a_{i}}\overline{G}^{-1}_{a_{i}}B_{a_{i}}\,,\\ \widehat{\lambda}_{i}(a_{i:n}) & = \lambda_{i}(a_{i+1:n}) + B'_{a_{i}}\overline{G}^{-1}_{a_{i}}(y_{i}-c_{a_{i}})\,. \end{array} $$
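The backward recursion for \((\Omega _{i},\lambda _{i})\) and \((\widehat {\Omega }_{i},\widehat {\lambda }_{i})\) along a fixed regime sequence can be sketched as below. The sketch assumes scalar states and observations, so that \(H'\widehat {\Omega }H=\overline {H}\,\widehat {\Omega }\), and uses 0-based indices; the function and argument names are hypothetical.

```python
def backward_information(ys, regimes, c, B, G_bar, d, T, H_bar):
    """Scalar backward information recursion along a fixed regime sequence.

    Returns (omega, lam, omega_hat, lam_hat); omega[i] and lam[i] describe the
    dependence on z_i of the future observations y_{i+1:n}, while omega_hat[i]
    and lam_hat[i] also include y_i. The last entries of omega and lam stay 0.
    """
    n = len(ys)
    omega, lam = [0.0] * n, [0.0] * n
    omega_hat, lam_hat = [0.0] * n, [0.0] * n
    last = regimes[-1]
    omega_hat[-1] = B[last] ** 2 / G_bar[last]
    lam_hat[-1] = B[last] * (ys[-1] - c[last]) / G_bar[last]
    for i in range(n - 2, -1, -1):
        nxt = regimes[i + 1]
        m_next = lam_hat[i + 1] - omega_hat[i + 1] * d[nxt]
        big_m = H_bar[nxt] * omega_hat[i + 1] + 1.0
        shrink = 1.0 - omega_hat[i + 1] * H_bar[nxt] / big_m
        # Propagate through the state transition of regime a_{i+1}.
        omega[i] = T[nxt] * shrink * omega_hat[i + 1] * T[nxt]
        lam[i] = T[nxt] * shrink * m_next
        # Add the contribution of the observation y_i.
        cur = regimes[i]
        omega_hat[i] = omega[i] + B[cur] ** 2 / G_bar[cur]
        lam_hat[i] = lam[i] + B[cur] * (ys[i] - c[cur]) / G_bar[cur]
    return omega, lam, omega_hat, lam_hat
```

For two time steps the result can be checked by hand: the density of \(y_{2}\) given \(z_{1}\) is Gaussian with mean \(c+B(d+Tz_{1})\) and variance \(B^{2}\overline {H}+\overline {G}\), which gives \(\Omega _{1}=B^{2}T^{2}/(B^{2}\overline {H}+\overline {G})\) with the coefficients of the second regime.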
Then, by (9),
$$ {}\begin{aligned} p\left(y_{i+1:n},a_{i+1:n}|a^{k}_{1:i},y_{1:i}\right)\propto &Q\left(a_{i}^{k},a_{i+1}\right)\left|\Lambda^{k}_{i}(a_{i+1:n})\right|^{-1/2}\\ &\times\exp\left\{-\eta^{k}_{i}(a_{i+1:n})/2\right\}\,, \end{aligned} $$
(10)
where the proportionality is with respect to \(a^{k}_{1:i}\) and
$$\begin{array}{*{20}l} \Lambda^{k}_{i}(a_{i+1:n})&= \left(\Gamma_{i}^{k}\right)'\Omega_{i}(a_{i+1:n})\Gamma_{i}^{k} + I_{\mathsf{m}}\,,\\ \eta^{k}_{i}(a_{i+1:n}) &= \|\mu_{i}^{k}\|^{2}_{\Omega^{-1}_{i}(a_{i+1:n})} - 2\lambda'_{i}(a_{i+1:n})\mu_{i}^{k}\\ &\quad-\|\left(\Gamma_{i}^{k}\right)'(\lambda_{i}(a_{i+1:n})\\ &\quad-\Omega_{i}(a_{i+1:n})\mu_{i}^{k})\|^{2}_{\Lambda^{k}_{i}(a_{i+1:n})}\,, \end{array} $$
where \(P_{i}^{k} = \Gamma _{i}^{k}(\Gamma _{i}^{k})'\). Therefore,
$${}\tilde{\omega}^{k}_{i|n} \propto \omega_{i}^{k}Q\!\left(\!a_{i}^{k},a_{i+1}\!\right)\!\left|\Lambda^{k}_{i}(a_{i+1:n})\right|^{-1/2}\exp\!\left\{\!-\eta^{k}_{i}(a_{i+1:n})/2\!\right\}. $$
If \(\left (\tilde {a}^{k}_{1:n}\right)_{1\le k \le \tilde {N}}\) are independent copies of \(\tilde {a}_{1:n}\), the SMC approximation of [21] of the joint smoothing distribution of the regime is:
$$p^{\mathsf{Lbscg}}_{\tilde{N}}(a_{1:n}|Y_{1:n}) = \frac{1}{\tilde{N}}\sum_{k=1}^{\tilde N} \delta_{\tilde{a}^{k}_{1:n}}(a_{1:n})\,. $$
2.2.2 Particle rejuvenation of FFBS algorithms
The crucial step of the FFBS algorithm is the decomposition (8), which makes it possible to extend a backward trajectory \(\tilde {a}_{i+1:n}\) by choosing a particle in the set \(\left (a_{i}^{k}\right)_{1\le k \le N}\) produced by the forward filter (and discarding the states \(a^{k}_{1:i-1}\)). An improved version of this FFBS algorithm, which is not constrained to sample states in the support \(\left (a_{i}^{k}\right)_{1\le k \le N}\), may be defined for all 2≤i≤n−1 by writing:
$$\begin{array}{*{20}l} {}p(a_{1:i}|a_{i+1:n},y_{1:n}) &\propto p(y_{i+1:n},a_{i+1:n}|a_{1:i},y_{1:i})p(a_{1:i}|y_{1:i})\,,\\ &\propto p(y_{i+1:n},a_{i+1:n}|a_{1:i},y_{1:i})\\ &\quad\times\int p(a_{1:i-1},z_{i-1}|y_{1:i-1})Q(a_{i-1},a_{i})\\ &\quad\times m(a_{i},z_{i-1};z_{i})g(a_{i},z_{i};y_{i})\mathrm{d} z_{i-1:i}\,. \end{array} $$
Replacing \(p(a_{1:i-1},z_{i-1}|y_{1:i-1})\) in the integral by the particle approximation obtained during the forward pass and using Kalman filtering techniques for each trajectory \(\left (a^{k}_{1:i-1}\right)_{1\le k\le N}\) and each \(a_{i}\in \{1,\ldots,J\}\) yields:
$$\begin{array}{*{20}l} \int& p^{N}(a_{1:i-1},z_{i-1}|y_{1:i-1})Q(a_{i-1},a_{i})m(a_{i},z_{i-1};z_{i})\\ &\times g(a_{i},z_{i};y_{i})\mathrm{d} z_{i-1:i} \propto \sum_{k=1}^{N} \omega_{i|i-1}^{k}(a_{i})\delta_{a^{k}_{1:i-1}}(a_{1:i-1})\,, \end{array} $$
where
$$\begin{array}{*{20}l} \omega_{i|i-1}^{k}(a_{i}) =&\ \omega_{i-1}^{k} Q\left(a_{i-1}^{k},a_{i}\right)|\Sigma^{k}_{i|i-1}(a_{i})|^{-1/2}\\ &\times\text{exp}\left\{-\frac{1}{2}\left\|y_{i} - y^{k}_{i|i-1}(a_{i})\right\|^{2}_{\Sigma^{k}_{i|i-1}(a_{i})}\right\}\,, \end{array} $$
$$\begin{array}{*{20}l} {}&y^{k}_{i|i-1}(a_{i}) = c_{a_{i}} + B_{a_{i}}\left(d_{a_{i}}+T_{a_{i}}\mu^{k}_{i-1}\right)\\ &\text{and}\; \Sigma^{k}_{i|i-1}(a_{i}) = B_{a_{i}}\left(T_{a_{i}}P^{k}_{i-1}T'_{a_{i}}+\overline{H}_{a_{i}}\right)B'_{a_{i}} + \overline{G}_{a_{i}}\,. \end{array} $$
On the other hand, for all 1≤k≤N, \(p\left (y_{i+1:n},a_{i+1:n}|a^{k}_{1:i-1},a_{i},y_{1:i}\right)\) is computed as in (10) with all possible values \(a_{i}\in \{1,\ldots,J\}\) and not only the regime of the filtering pass \(\left (a_{i}^{k}\right)_{1\le k \le N}\). This means that a Kalman filter must be used for each trajectory \(a^{k}_{1:i-1}\), which may be extended by any \(a_{i}\in \{1,\ldots,J\}\). Denote by \(\mu _{i|i-1}^{k}(a_{i})\) and \(P_{i|i-1}^{k}(a_{i})\) the mean and covariance matrix of the law of \(z_{i}\) given \(\left (a^{k}_{1:i-1},a_{i}\right)\) obtained as in (6) and (7). Then,
$$ {}\begin{aligned} &p\left(y_{i+1:n},a_{i+1:n}|a^{k}_{1:i-1},a_{i},y_{1:i}\right)\\ &\qquad\propto Q(a_{i},a_{i+1})\!\left|\Lambda^{k}_{i|i-1}(a_{i:n})\right|^{-1/2}\exp\!\left\{\!-\eta^{k}_{i|i-1}(a_{i:n})/2\right\}, \end{aligned} $$
(11)
where the proportionality is with respect to \(\left (a^{k}_{1:i-1},a_{i}\right)\) and
$$\begin{array}{*{20}l} \Lambda^{k}_{i|i-1}(a_{i:n})&= \left(\Gamma_{i|i-1}^{k}(a_{i})\right)'\Omega_{i}(a_{i+1:n})\Gamma_{i|i-1}^{k}(a_{i}) + I_{\mathsf{m}}\,,\\ \eta^{k}_{i|i-1}(a_{i:n}) &= \left\|\mu_{i|i-1}^{k}(a_{i})\right\|^{2}_{\Omega^{-1}_{i}(a_{i+1:n})}\\ &\quad - 2\lambda'_{i}(a_{i+1:n})\mu_{i|i-1}^{k}(a_{i})\\ &\quad-\|\left(\Gamma_{i|i-1}^{k}(a_{i})\right)'(\lambda_{i}(a_{i+1:n})\\ &\quad-\Omega_{i}(a_{i+1:n})\mu_{i|i-1}^{k}(a_{i}))\|^{2}_{\Lambda^{k}_{i|i-1}(a_{i:n})}\,, \end{array} $$
where \(\Gamma _{i|i-1}^{k}(a_{i})\) is defined by \(P_{i|i-1}^{k}(a_{i}) = \Gamma _{i|i-1}^{k}(a_{i}) \left (\Gamma _{i|i-1}^{k}(a_{i})\right)'\). The distribution \(p(a_{1:i}|a_{i+1:n},y_{1:n})\) is then approximated by:
$$\begin{array}{@{}rcl@{}} &&{}p^{N}(a_{1:i}|a_{i+1:n},y_{1:n})\\ &&{}\propto\sum_{k=1}^{N} \omega_{i|i-1}^{k}(a_{i})Q(a_{i},a_{i+1})\left|\Lambda^{k}_{i|i-1}(a_{i:n})\right|^{-1/2}\\ &&{}\quad\times\exp\left\{-\eta^{k}_{i|i-1}(a_{i:n})/2\right\}\delta_{a^{k}_{1:i-1}}(a_{1:i-1})\,. \end{array} $$
(12)
By summing over all possible paths \(a_{1:i-1}\), \(\tilde {a}_{i}\) is sampled in {1,…,J}. This particle rejuvenation step makes it possible to explore states which are not in the support of the particles produced by the forward filter, and it improves the accuracy and reduces the variance of the original FFBS algorithm; see Section 3 for numerical illustrations.
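The marginalization over forward paths used to draw \(\tilde {a}_{i}\) can be sketched as follows. The table `path_weights` is assumed to hold, for each forward path k and each candidate regime j, the unnormalized summand of (12); how these entries are computed is left outside the sketch, and the function name is hypothetical.

```python
import random


def sample_rejuvenated_regime(path_weights, rng):
    """Sample a regime index after summing the weights of (12) over forward paths.

    path_weights[k][j] is the unnormalized weight of extending forward path k
    with regime j. Returns the sampled regime (0-based) and its distribution.
    """
    n_regimes = len(path_weights[0])
    # Marginal weight of each regime: sum over all forward paths k.
    marginal = [sum(row[j] for row in path_weights) for j in range(n_regimes)]
    total = sum(marginal)
    probs = [w / total for w in marginal]
    regime = rng.choices(range(n_regimes), weights=probs)[0]
    return regime, probs
```

Every regime with a positive marginal weight can be selected, including regimes that no forward particle occupies at time i.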
Another modification of the FFBS algorithm based on a Markov chain Monte Carlo (MCMC) sampling step was introduced in ([21], Section 5.2). Instead of sampling from (12), ([21], Section 5.2) proposed to draw a forward path \(a_{1:i-1}\) in \((a^{k}_{1:i-1})_{1\le k \le N}\) and a state \(a_{i}\) in {1,…,J} according to:
$$\begin{array}{*{20}l} &\widetilde{q}(a_{1:i}|a_{i+1:n},y_{1:n}) \\ &\quad= \sum_{k=1}^{N} \widetilde{\vartheta}^{k}_{i-1}\widetilde{q}\left(a_{i}|a^{k}_{1:i-1},a_{i+1:n},y_{1:n}\right)\delta_{a^{k}_{1:i-1}}(a_{1:i-1})\,, \end{array} $$
where \(\left (\widetilde {\vartheta }^{k}_{i-1}\right)_{1\le k \le N}\) are adjustment multipliers and \(\widetilde {q}\left (a_{i}|a^{k}_{1:i-1},a_{i+1:n},y_{1:n}\right)\) is a proposal kernel chosen by the user. This means that an ancestral path \(a^{\star }_{1:i-1}\) is sampled in \(\left (a^{k}_{1:i-1}\right)_{1\le k \le N}\) with weights \(\left (\widetilde {\vartheta }^{k}_{i-1}\right)_{1\le k \le N}\) and \(a^{\star }_{i}\) is sampled from \(\widetilde {q}(\cdot |a^{\star }_{1:i-1},a_{i+1:n},y_{1:n})\). Then, the proposed sequence \(a^{\star }_{1:i}\) is accepted or rejected using the usual Metropolis-Hastings acceptance ratio. The choice of MCMC rejuvenation has interesting practical consequences, as the acceptance ratio only requires computing the posterior probability (11) for the proposed sequence \(a^{\star }_{1:i}\), while our technique requires computing (11) for all combinations of sequences \(\left (a^{k}_{1:i-1}\right)_{1\le k \le N}\) and states \(a_{i}\in \{1,\ldots,J\}\). Sampling from (12) is computationally more intensive, especially when N is large, but our method relies on a direct approximation of \(p(a_{1:i}|a_{i+1:n},y_{1:n})\) based on \(\left (a^{k}_{1:i-1}\right)_{1\le k \le N}\) and \(a_{i+1:n}\) instead of approximate MCMC draws.
2.3 Rao-Blackwellized two-filter smoother
2.3.1 Rao-Blackwellized two-filter smoother of [3]
Contrary to the previous methods, two-filter-based smoothers are designed to compute approximations of marginal smoothing distributions (usually the posterior distribution of one or two consecutive regimes given all the observations). Briers et al. [3] introduced the following decomposition of the smoothing distributions for all 2≤i≤n:
$$ p(a_{i},z_{i}|y_{1:n}) \propto p_{}(a_{i},z_{i}|y_{1:i-1})p(y_{i:n}|a_{i},z_{i})\,. $$
(13)
The first term on the right hand side may be approximated using the forward filter by noting that
$$\begin{array}{*{20}l} &p_{}(a_{i},z_{i}|y_{1:i-1})\\ &\quad=\sum_{a_{i-1}}\int_{z_{i-1}}p_{}(a_{i-1},z_{i-1}|y_{1:i-1})m_{}(a_{i},z_{i-1};z_{i})\\ &\qquad\times Q(a_{i-1},a_{i}) \mathrm{d} z_{i-1}\,. \end{array} $$
In the forward pass described in Section 2.1, a set of possible sequences of regimes \(a_{1:i-1}^{k}\) associated with importance weights \(\omega _{i-1}^{k}\), 1≤k≤N, is sampled to approximate \(p(a_{i-1},z_{i-1}|y_{1:i-1})\). This provides a normalized approximation \(p^{N}(a_{i},z_{i}|y_{1:i-1})\) of \(p(a_{i},z_{i}|y_{1:i-1})\). Define
$$\begin{array}{*{20}l} \Omega^{k}_{i-1}(a_{i}) &= T_{a_{i}}P^{k}_{i-1}T'_{a_{i}} + \overline{H}_{a_{i}}\,,\\ \mu^{k}_{i-1}(a_{i}) &= d_{a_{i}} + T_{a_{i}}\mu^{k}_{i-1}\,,\\ r_{i-1}^{k}(a_{i}) &= \left(\Omega^{k}_{i-1}(a_{i})\right)^{-1}\mu^{k}_{i-1}(a_{i})\,,\\ \omega_{\mathsf{f},i}^{k}(a_{i}) &= \omega^{k}_{i-1}Q\left(a^{k}_{i-1},a_{i}\right) \left|2\pi \Omega^{k}_{i-1}(a_{i})\right|^{-1/2}\\ &\quad\times\exp\left\{-\frac{1}{2}\left\|\mu^{k}_{i-1}(a_{i})\right\|_{\Omega^{k}_{i-1}(a_{i})}^{2}\right\}\,. \end{array} $$
Then,
$$\begin{array}{*{20}l} &p^{N}(a_{i},z_{i}|y_{1:i-1}) \\ &\quad= \sum_{k=1}^{N} \omega_{\mathsf{f},i}^{k}(a_{i})\exp\left\{-\frac{1}{2}\left\|z_{i}\right\|_{\Omega^{k}_{i-1}(a_{i})}^{2}+z'_{i}r_{i-1}^{k}(a_{i})\right\}\,. \end{array} $$
(14)
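The mixture (14) can be evaluated as in the following sketch, which assumes scalar states, 0-based regime indices, and hypothetical function and argument names. It uses the information-form parametrization \((\Omega ^{k}_{i-1}(a_{i}), r^{k}_{i-1}(a_{i}), \omega _{\mathsf {f},i}^{k}(a_{i}))\) defined above.

```python
import math


def predictive_mixture(a, z, regimes_prev, weights_prev, means_prev, vars_prev,
                       Q, d, T, H_bar):
    """Evaluate the scalar version of (14) at regime a and state z."""
    total = 0.0
    for a_prev, w, mu, p in zip(regimes_prev, weights_prev, means_prev, vars_prev):
        omega = T[a] ** 2 * p + H_bar[a]      # predictive variance
        mu_pred = d[a] + T[a] * mu            # predictive mean
        r = mu_pred / omega                   # linear coefficient r_{i-1}^k(a)
        w_f = (w * Q[a_prev][a] / math.sqrt(2.0 * math.pi * omega)
               * math.exp(-0.5 * mu_pred ** 2 / omega))
        total += w_f * math.exp(-0.5 * z ** 2 / omega + z * r)
    return total
```

Expanding the square shows that each summand equals the transition probability times the weight times the Gaussian density with mean `mu_pred` and variance `omega`, which is why the approximation is normalized when the forward weights sum to one.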
As the function \((a_{i},z_{i})\mapsto p(y_{i:n}|a_{i},z_{i})\) is not a probability density function, approximating the second term of (13) using SMC samples is not straightforward. The backward filter uses artificial densities to introduce a surrogate target density function which may be approximated recursively using SMC methods. Then, the forward and backward weighted samples are combined using (13) to approximate \(p(a_{i},z_{i}|y_{1:n})\). Following [3], for any probability densities \((\gamma ^{}_{i})_{1\le i \le n}\), define the following joint probability densities:
$$\begin{array}{*{20}l} \tilde{p}_{n}(a_{n},z_{n},y_{n})& {:=} \gamma^{}_{n}(a_{n},z_{n})g_{}(a_{n},z_{n};y_{n}),\\ \tilde{p}_{n}(y_{n})&{:=} \sum_{a_{n}=1}^{J}\int \gamma^{}_{n}(a_{n},z_{n})g_{}(a_{n},z_{n};y_{n})\mathrm{d} z_{n}\,, \end{array} $$
and, for all 1≤i≤n−1,
$$\begin{array}{*{20}l} &\tilde{p}_{i}(a_{i:n},z_{i:n},y_{i:n}) \\ &\quad{:=} \gamma^{}_{i}(a_{i},z_{i})p_{}(y_{i:n}|a_{i:n},z_{i:n})p_{}(a_{i+1:n},z_{i+1:n}|a_{i},z_{i})\,,\\ &\tilde{p}_{i}(y_{i:n})\\ &\quad{:=} \sum_{a_{i:n}=1}^{J}\int \gamma^{}_{i}(a_{i},z_{i})p_{}(y_{i:n}|a_{i:n},z_{i:n})\\ &\qquad \times p_{}(a_{i+1:n},z_{i+1:n}|a_{i},z_{i})\mathrm{d} z_{i:n}\,. \end{array} $$
Note that this choice differs slightly from [3] where it is advocated to set \(\gamma ^{}_{i}\) as the product of two independent densities \(\gamma _{i}^{a}(a_{i})\) and \(\gamma _{i}^{z}(z_{i})\). As the accuracy of the algorithm relies heavily on a proper tuning of this artificial density, a more general choice of \(\gamma ^{}_{i}\) is considered in this paper. By Lemma 1, these probability densities may be used to approximate the quantities \(p(y_{i:n}|a_{i},z_{i})\), 1≤i≤n, in (13).
Lemma 1
For all 1≤i≤n−1,
$$\begin{array}{*{20}l}{} \tilde{p}_{i}(a_{i},z_{i}|y_{i:n}) &= p_{}(y_{i:n}|a_{i},z_{i})\gamma^{}_{i}(a_{i},z_{i})/\tilde{p}_{i}(y_{i:n})\,, \end{array} $$
(15)
$$\begin{array}{*{20}l} {}\tilde{p}_{i}(a_{i},z_{i}|y_{i:n}) &\!= \!\gamma^{}_{i}(a_{i},z_{i})\sum_{a_{i+1:n}=1}^{J}\frac{\tilde{p}_{i}(a_{i:n}|y_{i:n})p_{}(y_{i:n}|a_{i:n},z_{i})}{\int \gamma^{}_{i}(a_{i},z')p_{}(y_{i:n}|a_{i:n},z')\mathrm{d} z'}\,. \end{array} $$
(16)
Proof
The proof is postponed to “Appendix: Technical lemmas”. □
By definition of \(\tilde {p}_{i}\) for all 1≤i≤n,
$$\begin{array}{*{20}l} {}\tilde{p}_{i}(a_{i:n},z_{i}|y_{i:n}) &\propto \gamma^{}_{i}(a_{i},z_{i}) \int p_{}(y_{i:n}|a_{i:n},z_{i:n})\\ &\quad\; p(a_{i+1:n},z_{i+1:n}|a_{i},z_{i})\mathrm{d} z_{i+1:n}\,,\\ &\propto\gamma^{}_{i}(a_{i},z_{i})\!\left\{\prod_{u=i}^{n-1}Q(a_{u},a_{u+1})\right\}p(y_{i:n}|z_{i},a_{i:n})\,. \end{array} $$
This yields
$${}\tilde{p}_{i}(a_{i:n}|y_{i:n}) \!\propto\!\! \left\{\prod_{u=i}^{n-1}Q(a_{u},a_{u+1})\!\right\}\! \int\! \!\gamma^{}_{i}(a_{i},z_{i}) p(y_{i:n}|z_{i},a_{i:n})\mathrm{d} z_{i}\,. $$
A set of weighted trajectories \(\left (\tilde {a}^{\ell }_{i:n}\right)_{1\le \ell \le N}\) with importance weights \(\left (\tilde {\omega }^{\ell }_{i}\right)_{1\le \ell \le N}\), 1≤i≤n, may then be sampled recursively backward in time to produce an SMC approximation of \(\tilde {p}_{i}(a_{i:n}|y_{i:n})\) as follows.
- For 1≤ℓ≤N, sample \(\tilde {a}^{\ell }_{n}\sim \tilde {q}_{n}(\cdot)\) and set:
$$\tilde{\omega}^{\ell}_{n} \propto \frac{\int\gamma^{}_{n}\left(\tilde{a}^{\ell}_{n},z'\right)g_{}\left(\tilde{a}^{\ell}_{n},z';y_{n}\right)\mathrm{d} z'}{\tilde{q}_{n}\left(\tilde{a}^{\ell}_{n}\right)}\,. $$
- For i=n−1,…,1, resample the set \(\left (\tilde {a}^{\ell }_{i+1:n}\right)_{1\le \ell \le N}\) using the normalized weights \(\left (\tilde {\omega }^{\ell }_{i+1}\right)_{1\le \ell \le N}\). Then, for 1≤ℓ≤N, sample \(\tilde {a}^{\ell }_{i}\sim \tilde {q}_{i}\left (\tilde {a}^{\ell }_{i+1:n},\cdot \right)\) and set:
$${}\tilde{\omega}^{\ell}_{i} \propto \frac{Q\left(\tilde{a}^{\ell}_{i},\tilde{a}^{\ell}_{i+1}\right)\int \gamma^{}_{i}\left(\tilde{a}^{\ell}_{i},z'\right)p_{}\left(y_{i:n}|\tilde{a}^{\ell}_{i:n},z'\right)\mathrm{d} z'}{\tilde{q}_{i}\left(\tilde{a}^{\ell}_{i+1:n},\tilde{a}^{\ell}_{i}\right)\int \gamma^{}_{i+1}\left(\tilde{a}^{\ell}_{i+1},z'\right)p_{}\left(y_{i+1:n}|\tilde{a}^{\ell}_{i+1:n},z'\right)\mathrm{d} z'}\,. $$
To obtain uniformly weighted samples at each time step, in the numerical experiments we use
$${} \begin{aligned} &\tilde{q}_{n}(\cdot) = \int\gamma^{}_{n}(\cdot,z')g_{}(\cdot,z';y_{n})\mathrm{d} z'\\ &\text{and}\quad \tilde{q}_{i}\left(\tilde{a}^{\ell}_{i+1:n},\cdot\right) = \frac{Q\left(\cdot,\tilde{a}^{\ell}_{i+1}\right)\int \gamma^{}_{i}(\cdot,z')p_{}(y_{i:n}|\left(\cdot,\tilde{a}^{\ell}_{i+1:n}\right),z')\mathrm{d} z'}{\int \gamma^{}_{i+1}\left(\tilde{a}^{\ell}_{i+1},z'\right)p_{}(y_{i+1:n}|\tilde{a}^{\ell}_{i+1:n},z')\mathrm{d} z'}\,. \end{aligned} $$
By (15) and (16),
$$\begin{array}{*{20}l} {}p_{}(y_{i:n}|a_{i},z_{i}) &= \frac{\tilde{p}_{i}(y_{i:n}) \tilde{p}_{i}(a_{i},z_{i}|y_{i:n})}{\gamma^{}_{i}(a_{i},z_{i})}\\ &= \tilde{p}_{i}(y_{i:n}) \sum_{a_{i+1:n}=1}^{J}\frac{\tilde{p}_{i}(a_{i:n}|y_{i:n})p_{}(y_{i:n}|a_{i:n},z_{i})}{\int \gamma^{}_{i}(a_{i},z')p_{}(y_{i:n}|a_{i:n},z')\mathrm{d} z'}\,, \end{array} $$
which suggests the following particle approximation \(p^{N}_{}(y_{i:n}|a_{i},z_{i})\) of \(p(y_{i:n}|a_{i},z_{i})\):
$$ \begin{aligned} &p^{N}_{}(y_{i:n}|a_{i},z_{i}) \\ &\quad= \tilde{p}_{i}(y_{i:n}) \sum_{\ell=1}^{N}\frac{\tilde{\omega}^{\ell}_{i} p_{}(y_{i:n}|\tilde{a}^{\ell}_{i:n},z_{i})}{\int \gamma^{}_{i}(\tilde{a}^{\ell}_{i},z')p_{}(y_{i:n}|\tilde{a}^{\ell}_{i:n},z')\mathrm{d} z'}\delta_{\tilde{a}^{\ell}_{i}}(a_{i})\,. \end{aligned} $$
(17)
The conditional likelihood of the observations given the sequence of states \(p(y_{i:n}|a_{i:n},z_{i})\) can be computed explicitly using a Gaussian backward smoother; these computations are summarized in Lemma 2. In the numerical experiments, \(\gamma ^{}_{i}(a_{i},z_{i})\) is set as a mixture of Gaussian distributions. Note that for such a choice, the integral \(\int \gamma ^{}_{i}(a_{i},z')p_{}(y_{i:n}|a_{i:n},z')\mathrm {d} z'\) may be computed explicitly, see Lemma 3. Then, combining (17) and (14) with (13) provides an approximation of \(p(a_{i},z_{i}|y_{1:n})\) by merging the forward particles \(\left (a^{k}_{i-1}\right)_{1\le k \le N}\) with the backward particles \(\left (\tilde {a}^{k}_{i+1}\right)_{1\le k \le N}\), the support of this SMC approximation of \(p(a_{i},z_{i}|y_{1:n})\) being \(\left (\tilde {a}^{k}_{i+1}\right)_{1\le k \le N}\).
As noted in ([13], Section 2.6), two-filter smoothers are prone to degeneracy issues when the algorithm associates forward particles at time i−1 with backward particles at time i. The authors illustrate this issue in the case where the hidden state is an AR(2) process. To overcome the weakness of such standard two-filter approaches, the particle rejuvenation proposed in Section 2.3.2 follows the idea introduced in [13], where new particles at time i are sampled conditional on \((a^{k}_{1:i-1})_{1\le k \le N}\) and on \(\left (\tilde {a}^{k}_{i+1:n}\right)_{1\le k \le N}\) and appropriately weighted. This makes it possible to produce new particles at time i and to obtain an SMC approximation of \(p(a_{i},z_{i}|y_{1:n})\) whose support is not restricted to \(\left (\tilde {a}^{k}_{i+1}\right)_{1\le k \le N}\). Section 2.3.2 exploits this idea in the specific case of linear and Gaussian models, where explicit computations make it possible to produce an approximation using \(\left (a^{k}_{1:i-1}\right)_{1\le k \le N}\) and \(\left (\tilde {a}^{k}_{i+1:n}\right)_{1\le k \le N}\) with support {1,…,J} and without any additional sampling steps.
2.3.2 Particle rejuvenation of two-filter based algorithms
For 2≤i≤n−1, particle rejuvenation relies on the explicit marginalization:
$$ {}\begin{aligned} &p(a_{i},z_{i}|y_{1:n}) \\ &\quad= \sum_{a_{i-1}}\sum_{a_{i+1}} \int_{z_{i-1}} \int_{z_{i+1}} \psi^{n}_{i}(a_{i-1:i+1},z_{i-1:i+1})\mathrm{d} z_{i-1}\mathrm{d} z_{i+1}\,, \end{aligned} $$
(18)
where \(\psi ^{n}_{i}(a_{i-1:i+1},z_{i-1:i+1})\) is the smoothing distribution of the hidden regimes and states between time indices i−1 and i+1. Note that the EM algorithm requires the approximation of \(p(a_{i-1},z_{i-1},a_{i},z_{i}|y_{1:n})\) in the E-step; this may be obtained following the same steps by explicitly marginalizing the linear states at times i−2 and i+1. Intermediate computations follow the same steps as for the approximation of \(p(a_{i},z_{i}|y_{1:n})\). First, \(\psi ^{n}_{i}\) may be decomposed as follows:
$${}\begin{aligned} \psi^{n}_{i}(a_{i-1:i+1},z_{i-1:i+1}) &\propto p_{}(y_{i+1:n}|a_{i+1},z_{i+1})\\ &\quad\times p_{}(a_{i-1},z_{i-1}|y_{1:i-1})Q(a_{i-1},a_{i})\\ &\quad\times m_{}(a_{i},z_{i-1};z_{i})g_{}(a_{i},z_{i};y_{i})\\ &\quad\times Q(a_{i},a_{i+1})m_{}(a_{i+1},z_{i};z_{i+1})\,, \end{aligned} $$
where the proportionality is with respect to \((a_{i-1:i+1},z_{i-1:i+1})\). Then, by (18), the smoothing distribution \(p(a_{i},z_{i}|y_{1:n})\) may be written as
$$ {}p(a_{i},z_{i}|y_{1:n}) \propto p(a_{i},z_{i}|y_{1:i-1})g_{}(a_{i},z_{i};y_{i})t_{i}(a_{i},z_{i},y_{i+1:n})\,, $$
(19)
where m and g are defined in (3) and (4) and
$$\begin{array}{*{20}l} t_{i}(a_{i},z_{i},y_{i+1:n}) &= \sum_{a_{i+1}}\int_{z_{i+1}}m_{}(a_{i+1},z_{i};z_{i+1})Q(a_{i},a_{i+1})\\ &\quad\times p_{}(y_{i+1:n}|a_{i+1},z_{i+1}) \mathrm{d} z_{i+1}\,. \end{array} $$
(20)
The backward pass described in Section 2.3.1 produces a sequence of states \(\tilde {a}_{i+1:n}^{\ell }\) associated with importance weights \(\tilde {\omega }_{i+1}^{\ell }\), 1≤ℓ≤N, which are used to approximate \(p(y_{i+1:n}|a_{i+1},z_{i+1})\). Plugging this approximation into (20) provides an approximation \(t^{N}_{i}(a_{i},z_{i},y_{i+1:n})\) of \(t_{i}(a_{i},z_{i},y_{i+1:n})\), integrating over all possible choices of \((a_{i+1},z_{i+1})\). These steps are then combined to form an unnormalized SMC approximation of \(p(a_{i},z_{i}|y_{1:n})\) using (19). The normalization of the SMC approximation of \(p(a_{i},z_{i}|y_{1:n})\) is obtained by integrating over the states \((a_{i},z_{i})\) when \(p(a_{i},z_{i}|y_{1:i-1})\) and \(t_{i}(a_{i},z_{i},y_{i+1:n})\) are replaced by \(p^{N}(a_{i},z_{i}|y_{1:i-1})\) and \(t^{N}_{i}(a_{i},z_{i},y_{i+1:n})\) in (19). Our procedure makes it possible to construct sequences of regimes with non-degenerate importance weights in the combination step. This procedure significantly improves on [3], where no marginalization of \(p(a_{i},z_{i}|y_{1:n})\) over the states at times i−1 and i+1 is performed and where the proposed forward and backward paths are directly merged; the latter method often leads to importance weights which are close to being numerically degenerate. By Lemma 2, the SMC approximation \(p^{N}_{}(y_{i:n}|a_{i},z_{i})\) of \(p(y_{i:n}|a_{i},z_{i})\) is then given by:
$$ {}\begin{aligned} p^{N}_{}(y_{i:n}|a_{i},z_{i}) &= \tilde{p}_{i}(y_{i:n}) \sum_{\ell=1}^{N} \frac{\delta_{\tilde{a}^{\ell}_{i}}(a_{i})\tilde{\omega}_{i}^{\ell}}{\int \gamma^{}_{i}\left(\tilde{a}^{\ell}_{i},z'\right)p_{}(y_{i:n}|\tilde{a}^{\ell}_{i:n},z')\mathrm{d} z'} \\ &\quad\times\exp\left\{- \frac{1}{2}\left\|z_{i}\right\|_{\tilde{P}^{\ell}_{i}}^{2} + z'_{i}\tilde{\nu}^{\ell}_{i} - \frac{1}{2} \tilde{c}^{\ell}_{i} \right\}\,, \end{aligned} $$
(21)
where \(\left (\tilde {P}^{\ell }_{i}\right)^{-1} {:=} \tilde {P}_{i}^{-1}\left (\tilde {a}^{\ell }_{i:n}\right)\), \(\tilde {\nu }^{\ell }_{i}{:=} \tilde {\nu }_{i}\left (\tilde {a}^{\ell }_{i:n}\right)\) and \(\tilde {c}^{\ell }_{i} {:=} \tilde {c}_{i}\left (\tilde {a}^{\ell }_{i:n}\right)\) are defined in Lemma 2. Define
$$\begin{array}{*{20}l} \Delta^{\ell}_{i+1} &{:=} \left(I_{\mathsf{m}} + H'_{\tilde{a}^{\ell}_{i+1}}\left(\tilde{P}^{\ell}_{i+1}\right)^{-1}H_{\tilde{a}^{\ell}_{i+1}}\right)^{-1}\,,\\ \delta^{\ell}_{i+1}&{:=}\tilde{\nu}^{\ell}_{i+1} + \overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}\left(d_{\tilde{a}^{\ell}_{i+1}}+T_{\tilde{a}^{\ell}_{i+1}}z_{i}\right)\,. \end{array} $$
Then, by (20), the SMC approximation \(t^{N}_{i}(a_{i},z_{i},y_{i+1:n})\) of \(t_{i}(a_{i},z_{i},y_{i+1:n})\) is given by:
$$ {} \begin{aligned} t^{N}_{i}(a_{i},z_{i},y_{i+1:n}) &=\sum_{a_{i+1}=1}^{J}\int_{z_{i+1}}m_{}(a_{i+1},z_{i};z_{i+1})\\ &\quad\times Q(a_{i},a_{i+1})p^{N}_{}(y_{i+1:n}|a_{i+1},z_{i+1}) \mathrm{d} z_{i+1}\,,\\ &= \tilde{p}_{i+1}(y_{i+1:n}) \sum_{\ell=1}^{N} C_{i}^{-1}\left(\tilde{a}^{\ell}_{i+1:n}\right)\\ &\quad\times Q\left(a_{i}, \tilde{a}^{\ell}_{i+1}\right) \tilde{\omega}^{\ell}_{i+1} \left|\overline{H}_{\tilde{a}^{\ell}_{i+1}}\right|^{-1/2} \left|H_{\tilde{a}^{\ell}_{i+1}} \Delta^{\ell}_{i+1} H'_{\tilde{a}^{\ell}_{i+1}}\right|^{1/2} \, \\ &\quad \times \exp\left\{{\vphantom{-\frac{1}{2}\left\|d_{\tilde{a}^{\ell}_{i+1}}+T_{\tilde{a}^{\ell}_{i+1}}z_{i}\right\|_{\overline{H}_{\tilde{a}^{\ell}_{i+1}}}^{2}}}\frac{1}{2} \left(\delta^{\ell}_{i+1}\right)'H_{\tilde{a}^{\ell}_{i+1}}\Delta^{\ell}_{i+1}H'_{\tilde{a}^{\ell}_{i+1}} \delta^{\ell}_{i+1}\right. \\ &\qquad\quad\left.-\frac{1}{2}\left\|d_{\tilde{a}^{\ell}_{i+1}}+T_{\tilde{a}^{\ell}_{i+1}}z_{i}\right\|_{\overline{H}_{\tilde{a}^{\ell}_{i+1}}}^{2}\right\} \,,\\ &= \sum_{\ell=1}^{N}\tilde{\omega}_{\mathsf{b},i}^{\ell}(a_{i})\exp\left\{-\frac{1}{2}\left\|z_{i}\right\|_{\tilde{S}_{i+1}^{\ell}}^{2}+z_{i}'\tilde{s}_{i+1}^{\ell}\right\}\,, \end{aligned} $$
(22)
where
$${}\begin{aligned} C_{i}(\tilde{a}^{\ell}_{i+1:n}) &{:=} \exp \left\{\tilde{c}^{\ell}_{i+1}/2\right\} \int_{z_{i+1}} \gamma^{}_{i+1}\left(\tilde{a}^{\ell}_{i+1}, z\right)\tilde{p}\left(y_{i+1:n}|\tilde{a}^{\ell}_{i+1:n},z\right) \mathrm{d} z\,,\\ \tilde{\omega}_{\mathsf{b},i}^{\ell}(a_{i}) &= \tilde{p}_{i+1}(y_{i+1:n})C_{i}\left(\tilde{a}^{\ell}_{i+1:n}\right)^{-1}\\ &\quad\times Q\left(a_{i}, \tilde{a}^{\ell}_{i+1}\right) \tilde{\omega}^{\ell}_{i+1} \left|\overline{H}_{\tilde{a}^{\ell}_{i+1}}\right|^{-1/2} \left|H_{\tilde{a}^{\ell}_{i+1}} \Delta^{\ell}_{i+1} H'_{\tilde{a}^{\ell}_{i+1}}\right|^{1/2}\\ &\quad\times\exp\left\{-d'_{\tilde{a}^{\ell}_{i+1}}\overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}d_{\tilde{a}^{\ell}_{i+1}}/2\right\}\\ &\quad\times \exp\left\{\left(\tilde{\nu}^{\ell}_{i+1} + \overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}d_{\tilde{a}^{\ell}_{i+1}}\right)'H_{\tilde{a}^{\ell}_{i+1}}\Delta^{\ell}_{i+1}H'_{\tilde{a}^{\ell}_{i+1}}\right.\\ &\quad\left.\times\left(\tilde{\nu}^{\ell}_{i+1} + \overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}d_{\tilde{a}^{\ell}_{i+1}}\right)/2\right\} \,,\\ \left(\tilde{S}_{i+1}^{\ell}\right)^{-1} &= T'_{\tilde{a}^{\ell}_{i+1}}\overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}\left(T_{\tilde{a}^{\ell}_{i+1}}-H_{\tilde{a}^{\ell}_{i+1}}\Delta^{\ell}_{i+1}H'_{\tilde{a}^{\ell}_{i+1}}\overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}T_{\tilde{a}^{\ell}_{i+1}}\right)\,,\\ \tilde{s}_{i+1}^{\ell} &=T'_{\tilde{a}^{\ell}_{i+1}}\overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}\!\left(\!H_{\tilde{a}^{\ell}_{i+1}}\Delta^{\ell}_{i+1}H'_{\tilde{a}^{\ell}_{i+1}}\left(\tilde{\nu}^{\ell}_{i+1} + \overline{H}_{\tilde{a}^{\ell}_{i+1}}^{-1}d_{\tilde{a}^{\ell}_{i+1}}\!\right)-d_{\tilde{a}^{\ell}_{i+1}}\right)\,. \end{aligned} $$
In the numerical experiments, \(\gamma ^{}_{i}(a_{i},z_{i})\) is set as a mixture of Gaussian distributions. As explained in Section 2.3.1, the integral \(\int _{z_{i+1}} \gamma ^{}_{i+1}\left (\tilde {a}^{\ell }_{i+1}, z\right)\tilde {p}\left (y_{i+1:n}|\tilde {a}^{\ell }_{i+1:n},z\right) \mathrm {d} z\) may be computed explicitly, see Lemma 3.
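The second and third equalities in (22) rest on a Gaussian integral over \(z_{i+1}\), which can be checked numerically. The sketch below does this in the scalar case (m = 1) for a single backward particle, leaving out the factors \(\tilde {p}_{i+1}\), \(C_{i}^{-1}\), \(Q\) and \(\tilde {\omega }^{\ell }_{i+1}\), which do not depend on \(z_{i+1}\). The function names, the scalar restriction, the quadrature settings and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math


def closed_form(z_i, h, t, d, p_inv, nu):
    """Scalar version of the determinant and exponential factors in (22)."""
    delta_mat = 1.0 / (1.0 + h * p_inv * h)   # Delta_{i+1} with m = 1
    mu = d + t * z_i                          # d + T z_i
    delta_vec = nu + mu / h**2                # delta_{i+1}
    log_val = 0.5 * delta_vec**2 * h**2 * delta_mat - 0.5 * mu**2 / h**2
    return math.sqrt(delta_mat) * math.exp(log_val)


def mixture_form(z_i, h, t, d, p_inv, nu):
    """Same quantity written as weight * exp(-z_i^2 / (2 S) + z_i * s)."""
    delta_mat = 1.0 / (1.0 + h * p_inv * h)
    shifted = nu + d / h**2                   # nu + Hbar^{-1} d
    s_inv = t * (t - h * delta_mat * h * t / h**2) / h**2
    s = t / h**2 * (h * delta_mat * h * shifted - d)
    weight = math.sqrt(delta_mat) * math.exp(
        -0.5 * d**2 / h**2 + 0.5 * shifted**2 * h**2 * delta_mat
    )
    return weight * math.exp(-0.5 * s_inv * z_i**2 + s * z_i)


def quadrature(z_i, h, t, d, p_inv, nu, half_width=20.0, steps=20000):
    """Midpoint rule for the integral of m(.) * exp(-z^2 / (2 P) + z * nu)."""
    mu = d + t * z_i
    dz = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        z = -half_width + (k + 0.5) * dz
        # Transition density m of (3), with Hbar = h^2 in the scalar case.
        trans = math.exp(-0.5 * (z - mu) ** 2 / h**2) / math.sqrt(
            2.0 * math.pi * h**2
        )
        # Unnormalized backward term in information form, as in (21).
        back = math.exp(-0.5 * p_inv * z**2 + nu * z)
        total += trans * back * dz
    return total


if __name__ == "__main__":
    # Arbitrary illustrative values: z_i, H, T, d, P^{-1}, nu.
    args = (1.2, 0.7, 0.9, 0.3, 0.5, 0.4)
    print(closed_form(*args), mixture_form(*args), quadrature(*args))
```

With these settings the three functions should return the same value up to quadrature and rounding error. The multivariate case replaces the scalar divisions by matrix inverses and the square root by the determinant factors shown in (22).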