A general solution to the continuous-time estimation problem under widely linear processing

A general problem of continuous-time linear mean-square estimation of a signal under widely linear processing is studied. The main characteristic of the proposed estimator is the generality of its formulation, which applies to a broad variety of situations: finite or infinite intervals, different types of noise (additive and/or multiplicative, white or colored, noiseless observation data, etc.), and three estimation problems (smoothing, filtering, and prediction).


Introduction
In most engineering systems, the state variables represent some physical quantity that is inherently continuous in time (ground-motion parameters, atmospheric or oceanographic flow, and turbulence, etc.). Thus, the formulation of realistic models to represent a signal processing problem is one of the major challenges facing engineers and mathematicians today. Given that in many problems the incoming information is constituted by continuous-time series, the use of a continuous-time model will be a more realistic description of the underlying phenomena we are trying to model. For example, [1] gives techniques of continuous-time linear system identification, and [2] illustrates the use of stochastic differential equations for modeling dynamical phenomena (see also the references therein). Continuous-time processing is especially suitable when data are recorded continuously, as an approximation for discrete-time sampled systems when the sampling rate is high [3] and when data are sampled irregularly [4]. It is also necessary with applications that require high-frequency signal processing and/or very fast initial convergence rates. Analog realizations also result in a smaller integrated circuit, lower power dissipation, and freedom from clocking and aliasing effects [5,6]. In such cases, the continuous-time solution becomes an adequate alternative to the discrete one since it allows real-time processing and alleviates the overload problem assuring more reliable overall operation of the system [7]. Moreover, the analytical tools developed in the continuous-time case might bring new insights to the analysis which are not possible in their discrete-time counterparts. In particular, [8] illustrates this fact in the problem of sorting continuous-time signals, [9] in the problem of nonfragile H ∞ filtering for a class of continuous-time fuzzy systems, and [10] in the study of the behavior of the continuous-time spectrogram.
The estimation problem is a topic of great interest in the statistical signal processing community. This problem has traditionally been solved by using a conventional or strictly linear (SL) processing. For instance, [11,12] deal with classical estimation problems (e.g., the Kalman-Bucy filter) under a real formalism, [13] tackles similar problems in the complex field, and [14] uses factorizable kernels for solving such problems. The main characteristic of the SL treatment is that it takes into account only the autocorrelation of the complex-valued observation process, ignoring its complementary function. That is, the only information considered for the building of the estimator is that supplied by the observation process, while the information provided by its conjugate is ignored. Cambanis [15] provided the more general solution to the problem of continuous-time linear mean-square (MS) estimation of a complex-valued signal on the basis of noisy complex-valued observations under a SL processing. In fact, Cambanis's approach is valid for any type of second-order signals and observation intervals, and it is not necessary to impose conditions such as stationarity, Gaussianity or continuity on the involved processes, nor restrictions of finite intervals.
Recently, it has been proved that the treatment of the linear MS estimation problem through widely linear (WL) processing, which takes into account both the observation process and its conjugate, leads to estimators with better performance than the SL ones in the sense that they show lower error variance. Specifically, and from a discrete-time perspective, the WL regression problem was tackled in [16], the prediction problem in a complex autoregressive modeling setting was addressed in [17,18] and later extended to autoregressive moving average prediction in [19]. Also, an augmented¹ affine projection algorithm based on the full second-order statistical information has been newly devised in [20]. Among the wide range of applications of WL processing are the analysis of communication systems [21], ICA models [22], the quaternion domain [23], adaptive filters [24-26], etc.
The study of continuous-time estimation problems is also interesting because it provides precise information on some structural properties of the system under study [8,9]. For instance, an explicit expression of the MS error associated with the optimal estimator can be derived in this approach (e.g., see [12,13]). Notice that this well-known result is independent of the number of available observations. In addition, the continuous-time solution becomes an excellent alternative to the discrete one when the number of available data is large. Discrete-time solutions involve the explicit calculation of matrix inverses whose dimensions depend on the number of observations (see, e.g., [16]). In practice, the process would be cumbersome or even prohibitive if this number were large (as occurs, e.g., in a major earthquake where the workload of the system increases suddenly).
The WL estimation problem under a continuous-time formulation was initially dealt with in [27,28] and [29]. More precisely, the particular problem of estimating a complex signal in additive complex white noise is solved in [27] or [28] through an improper version of the Karhunen-Loève expansion. A general result comparing the performance of WL and SL processing is also presented in which it is shown that the performance gain, measured by MS error, can be as large as 2. Finally, [29] provides an extension of the previous problem to the case in which the additive noise is made up of the sum of a colored component plus a white one. The handicaps of both solutions are: i) they are limited to MS continuous signals, ii) the signals must be defined on finite intervals, iii) the model for the observation process involves additive noise (white noise in the case of [27] and [28]), and iv) they are only devoted to solving a smoothing problem.
In this paper, we address a more general estimation problem than those solved in [27][28][29]. For that, we consider the general formulation of the estimation problem given in [15], and we solve it by using WL processing. The generality of this formulation allows the solution of a wide range of problems, including general second-order processes, infinite observation intervals, additive and/or multiplicative noise, noiseless observations, estimation of functionals of the signal, etc. It also brings under a single framework three different kinds of estimation problems: prediction, filtering, and smoothing. Hence, all the above handicaps are avoided with the proposed solution. Specifically, we present two forms of the WL estimator depending on the nature, either proper or improper, of the observation process. Then, we state conditions to express such an estimator in closed form. Closed form expressions for the estimator are convenient from a computational point of view [11,12,15]. Three numerical examples show that the proposed solution is feasible and demonstrate the aforementioned generality. The first one compares the performance of the WL estimator in relation to the SL one by considering an observation process defined on an infinite interval and with multiplicative noise. The second concerns the problem of estimating a signal in nonwhite noise and illustrates its application with discrete data. Lastly, the third example considers the earthquake ground-motion representation problem and illustrates a possible real application.
The rest of this paper is organized as follows. In Section 2, we review the SL solution proposed in [15]. Section 3 presents the main results. We derive the new estimator and its associated MS error. Moreover, we prove the better performance of this in relation to the SL estimator, and we give conditions to obtain a closed form of the WL estimator. The results obtained in this section are first stated and then proved rigorously in an Appendix. This section also includes a brief description of how the technique can be implemented in practice. Finally, Section 4 contains three numerical examples illustrating the application of the suggested estimator, and a performance comparison between WL and SL estimation is carried out.
Throughout this paper, all the processes involved are complex, measurable, and of second order. Next, we introduce the basic notation. The real part of a complex number will be denoted by $\Re\{\cdot\}$, the complex conjugate by $(\cdot)^*$, the conjugate transpose by $(\cdot)^H$, and the orthogonality of two complex-valued random variables, say $a$ and $b$, by $a \perp b$. Also, a.s. stands for almost surely and a.e. for almost everywhere.

Strictly Linear Estimation
A core problem in signal processing theory is the estimation of a signal from the information supplied by another signal. A very general formulation of this problem was provided by Cambanis in [15]. Specifically, let $F$ and $G$ be two functionals and $\{s(t), t \in S\}$ be a random signal, where $S$ is any interval of the real line. Suppose that $s(t)$ is not observed directly and that we observe instead a process $\{x(t), t \in T\}$ defined from the signal through $F$, where $T$ is any interval of the real line. Based on the observations $\{x(t), t \in T\}$, the aim is to estimate a functional of $s(t)$, $\{\xi(t), t \in S'\}$, defined through $G$, $S'$ being any interval of the real line. As noted above, this formulation is very general and contains as particular cases a great number of classical estimation problems, such as estimation of signals in additive and/or multiplicative noise, estimation of signals observed through random channels, random channel identification, etc. [15]. It can also be adapted to treat filtering, prediction, and smoothing problems.
In order to proceed with the building of the Cambanis estimator, the second-order statistics of the processes involved are needed. Let $r_x(t,\tau)$ and $r_\xi(t,\tau)$ be the respective autocorrelation functions of $x(t)$ and $\xi(t)$. The weakness of the hypotheses imposed on the processes and the possibility of considering infinite intervals force us to construct measures other than Lebesgue measure. To avoid an excess of mathematical formalism, we do not follow the Cambanis exposition literally. Changing the measure is equivalent to searching for a function $F(t)$ such that
$\int_T r_x(t,t)F(t)\,dt < \infty. \quad (1)$
This function $F(t)$ can be selected by a trial-and-error method or by using the procedure given in [30], and in addition, it does not have to be unique. This freedom of choice is to be exploited appropriately in every particular case. Some practical examples can be consulted in [31].
Condition (1) guarantees the existence of the eigenvalues and eigenfunctions, $\{\lambda_k\}$ and $\{\phi_k(t)\}$, respectively, of $r_x(t,\tau)$. Next, we need an orthogonal basis of random variables built from the observation process and the Hilbert space spanned by it. The elements of such a basis take the form $\varepsilon_k = \int_T x(t)\phi_k^*(t)F(t)\,dt$ a.s., and we denote by $H(\varepsilon_k)$ the Hilbert space spanned by the random variables $\{\varepsilon_k\}$. By using SL processing, the estimator $\hat{\xi}_{SL}(t)$ proposed in [15] is calculated by projecting the process $\xi(t)$ onto $H(\varepsilon_k)$. As a consequence, $\hat{\xi}_{SL}(t)$ is given by a series expansion in the variables $\{\varepsilon_k\}$ whose coefficients follow from the projection theorem; its associated MS error is denoted by $P_{SL}(t)$.
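The eigenproblem behind the basis $\{\varepsilon_k\}$ can be explored numerically. The following is a minimal sketch, not the procedure of [15]: it discretizes the integral equation $\int_T r_x(t,\tau)\phi(\tau)F(\tau)\,d\tau = \lambda\phi(t)$ with a midpoint quadrature rule (a Nyström-type approximation). The function names, the grid, and the test kernel $\min\{t,\tau\}$ on [0, 1] with $F(t) = 1$ are illustrative assumptions.

```python
import numpy as np

def weighted_kernel_eigs(kernel, t, w, n_terms):
    """Approximate leading eigenpairs of  int r(t,tau) phi(tau) F(tau) dtau = lam * phi(t).

    t : quadrature nodes; w : quadrature weights already multiplied by F(t).
    (Hypothetical helper: a Nystrom-type discretisation, assumed for illustration.)
    """
    K = kernel(t[:, None], t[None, :])        # kernel matrix r(t_i, t_j)
    sw = np.sqrt(w)
    A = sw[:, None] * K * sw[None, :]         # symmetrised form W^(1/2) K W^(1/2)
    lam, V = np.linalg.eigh(A)                # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n_terms]   # keep the largest n_terms
    lam, V = lam[order], V[:, order]
    phi = V / sw[:, None]                     # phi_k(t_i), orthonormal w.r.t. the weights w
    return lam, phi

def projection_coefficients(x, phi, w):
    """Discretised eps_k = int x(t) conj(phi_k(t)) F(t) dt for one sampled path x."""
    return (phi.conj() * (w * x)[:, None]).sum(axis=0)

# Illustrative setting: kernel min(t, tau) on [0, 1] with F(t) = 1.
N = 400
t = (np.arange(N) + 0.5) / N                  # midpoint nodes
w = np.full(N, 1.0 / N)                       # midpoint weights times F(t) = 1
lam, phi = weighted_kernel_eigs(np.minimum, t, w, n_terms=5)
```

For this particular kernel the exact eigenvalues are $1/((k - 1/2)^2\pi^2)$, which gives a simple check on the discretization before moving to kernels without a known eigenexpansion.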

Widely Linear Estimation
In general, complex-valued random processes are improper [24], and then the appropriate processing is WL processing. In this section, we provide a new estimator, $\hat{\xi}_{WL}(t)$, by using WL processing and calculate its corresponding MS error, $P_{WL}(t)$. To this end, we consider, together with the information supplied by the observation process, $x(t)$, the information provided by its conjugate, $x^*(t)$. Both processes are stacked in a vector giving rise to the augmented observation process, $[x(t), x^*(t)]^T$. Notice that $\hat{\xi}_{WL}(t)$ receives the name of WL estimator because it depends linearly not only on $x(t)$ but also on $x^*(t)$, in contrast with the conventional estimator.
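In order to find an explicit form of the estimator and its error, we have to distinguish two possibilities in relation to the nature of $x(t)$: proper or improper. If $x(t)$ is proper, i.e., its complementary function vanishes, $c_x(t,\tau) = 0$, then the estimator takes the form $\hat{\xi}_{WL}(t) = \sum_{k=1}^{\infty} b_k(t)\varepsilon_k + \sum_{k=1}^{\infty} \tilde{b}_k(t)\varepsilon_k^*$, expression (2), with coefficients determined by the projection theorem, and its MS error $P_{WL}(t)$ is given by the companion expression (3). Expressions (2) and (3) are derived in Theorem 1 in the "Appendix". These expressions extend the SL ones since if $r_2(t,\tau) = 0$, then $\hat{\xi}_{WL}(t) = \hat{\xi}_{SL}(t)$ and $P_{WL}(t) = P_{SL}(t)$.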
On the other hand, in the improper case ($c_x(t,\tau) \neq 0$), and unlike the proper case, an explicit and easily implemented expression of $\hat{\xi}_{WL}(t)$ is not as straightforward to obtain.
The main difference between both cases is that now the members of the set $\{\varepsilon_k\} \cup \{\varepsilon_k^*\}$ are not orthogonal. In fact, the complementary correlations $E[\varepsilon_n\varepsilon_m]$ do not vanish in general. Thus, the goal will be to calculate an orthogonal basis in the Hilbert space generated by $\{\varepsilon_k\}$ and $\{\varepsilon_k^*\}$, $H(\varepsilon_k, \varepsilon_k^*)$, which avoids this serious problem. This objective is attained in Lemma 1 in the "Appendix" by means of the eigenvalues, $\{\alpha_k\}$, and the corresponding eigenfunctions, $\{\boldsymbol{\varphi}_k(t)\}$, of the augmented correlation function $\mathbf{r}_x(t,\tau)$. Following a similar reasoning to [28], it can be shown that the eigenfunctions have the particular structure $\boldsymbol{\varphi}_k(t) = [f_k(t), f_k^*(t)]$ and are orthonormal in the sense of (10). The elements of this new set are real random variables $w_k$ of the form (4), verifying that $E[w_n w_m] = \alpha_n\delta_{nm}$. By using this new set of variables, we can obtain the WL estimator explicitly as a series expansion in the variables $w_k$ with coefficient functions $\psi_k(t)$, expression (5), and its corresponding MS error is given by expression (6). Theorem 2 in the "Appendix" proves these assertions.
From a practical standpoint, it would be interesting to get a closed form for $\hat{\xi}_{WL}(t)$. For that, it is necessary to restrict the kind of processes considered so far. Theorem 3 in the "Appendix" gives conditions under which the estimator can be expressed in integral form, expression (7), as a sum of two integrals over $T$ whose integrands involve $x(\tau)$ and $x^*(\tau)$ weighted by some square integrable functions $h_1(t,\cdot)$ and $h_2(t,\cdot)$, respectively. Expression (7) is computationally more amenable than (2) or (5). The key question is whether the conditions of Theorem 3 are fulfilled.
An example in which they are fulfilled is the classical problem of estimating an improper complex-valued random signal in colored noise with an additive white part addressed in [29]. Specifically, the observation process considered is $x(t) = s(t) + n_c(t) + v(t)$, where $s(t)$ is an improper complex-valued MS continuous random signal, the colored noise component, $n_c(t)$, is a complex-valued MS continuous stochastic process uncorrelated with $v(t)$, and $v(t)$ is a complex white noise uncorrelated with the signal $s(t)$. Note that the formulation of the estimation problem treated in [29] is much more restrictive than that studied in the present paper.
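The structure $[f_k, f_k^*]$ of the augmented eigenfunctions and the real nature of the variables $w_k$ have a direct finite-dimensional counterpart that can be checked numerically. The sketch below is only a discrete analogue under stated assumptions: it works with sample correlation matrices of a synthetic improper vector, uses the standard unit-norm normalization (which may differ from the normalization in (10)), and the helper name `augmented_eigs` is hypothetical. It obtains the eigenvectors through the real representation $[\Re x; \Im x]$, which makes the conjugate-pair structure explicit.

```python
import numpy as np

def augmented_eigs(R, C):
    """Eigenpairs of Ra = [[R, C], [C*, R*]] with eigenvectors of the form [f; f*].

    R = E[x x^H] (correlation), C = E[x x^T] (complementary correlation).
    """
    N = R.shape[0]
    Ra = np.block([[R, C], [C.conj(), R.conj()]])
    I = np.eye(N)
    T = np.block([[I, 1j * I], [I, -1j * I]])     # maps [Re x; Im x] to [x; x*]
    Rr = np.real(T.conj().T @ Ra @ T) / 4.0       # real correlation of [Re x; Im x]
    Rr = 0.5 * (Rr + Rr.T)                        # remove rounding asymmetry
    lam_r, U = np.linalg.eigh(Rr)                 # real orthonormal eigenvectors
    alpha = 2.0 * lam_r[::-1]                     # eigenvalues of Ra, descending
    Phi = (T @ U[:, ::-1]) / np.sqrt(2.0)         # columns [f_k; f_k*], unit norm
    return alpha, Phi, Ra

# Synthetic improper data (illustrative): rotated real signal plus proper noise.
rng = np.random.default_rng(0)
N, M = 4, 2000
theta = rng.normal(0.0, 0.3, size=M)
s = rng.normal(size=(N, M))
noise = 0.3 * (rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M)))
X = np.exp(1j * theta) * s + noise
R = X @ X.conj().T / M
C = X @ X.T / M

alpha, Phi, Ra = augmented_eigs(R, C)
Xa = np.vstack([X, X.conj()])                     # augmented samples [x; x*]
W = Phi.conj().T @ Xa                             # rows are the variables w_k
```

In this discrete setting the rows of `W` come out real up to rounding and their sample correlation matrix is diagonal with entries `alpha`, mirroring $E[w_n w_m] = \alpha_n\delta_{nm}$.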
Finally, a remarkable advantage of the proposed estimator appears when $\xi(t)$ is a real process and $x(t)$ is still complex. In this case, $\hat{\xi}_{WL}(t)$ is real too. However, there is no reason for the SL estimator to be real, which is not convenient when we estimate a real functional. Moreover, if $x(t)$ is proper, then $\hat{\xi}_{WL}(t) = 2\Re\{\hat{\xi}_{SL}(t)\}$ and its associated MS error is $P_{WL}(t) = 2P_{SL}(t) - r_\xi(t,t)$, which provides a decrease in the error, relative to $r_\xi(t,t)$, that is twice as great as that of the SL estimator.
Notice also that the Hilbert space approach we have followed to derive the WL estimators allows us to give an alternative proof of the well-known fact that WL estimation outperforms SL estimation. The estimator $\hat{\xi}_{WL}(t)$ is obtained by projecting the functional $\xi(t)$ onto the Hilbert space $H(\varepsilon_k, \varepsilon_k^*)$. Observe that $H(\varepsilon_k) \subseteq H(\varepsilon_k, \varepsilon_k^*)$, and then trivially, by the projection theorem of Hilbert spaces² [12, Proposition VII.C.1], we have $P_{WL}(t) \leq P_{SL}(t)$ for $t \in S'$. Hence, the WL estimator outperforms the SL one as regards its MS error.
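The inequality $P_{WL} \leq P_{SL}$ can be illustrated with a finite-dimensional analogue of the continuous-time problem. The following sketch is an assumption-laden toy case, not one of the paper's examples: a single observation $x = e^{j\theta}s + n$ with $s$ real of unit variance, $\theta \sim N(0, v)$, and $n$ proper complex noise. The function names and parameter values are hypothetical.

```python
import numpy as np

def sl_wl_mse(var_xi, R, C, p, q):
    """MS errors of the SL and WL linear estimators of xi from a finite vector x.

    R = E[x x^H], C = E[x x^T], p = E[x xi*], q = E[x* xi*], var_xi = E|xi|^2.
    """
    P_sl = var_xi - np.real(p.conj() @ np.linalg.solve(R, p))
    Ra = np.block([[R, C], [C.conj(), R.conj()]])   # augmented correlation matrix
    pa = np.concatenate([p, q])                     # augmented cross-correlation
    P_wl = var_xi - np.real(pa.conj() @ np.linalg.solve(Ra, pa))
    return P_sl, P_wl

def rotated_signal_moments(v, noise_var):
    """Second-order moments of x = exp(j*theta)*s + n (assumed toy model)."""
    rho = np.exp(-v / 2.0)                          # E[exp(j*theta)], theta ~ N(0, v)
    R = np.array([[1.0 + noise_var]], dtype=complex)
    C = np.array([[np.exp(-2.0 * v)]], dtype=complex)   # E[exp(2j*theta)] * E[s^2]
    p = np.array([rho], dtype=complex)              # E[x s]
    q = np.array([rho], dtype=complex)              # E[x* s]
    return R, C, p, q

R, C, p, q = rotated_signal_moments(v=0.5, noise_var=0.2)
P_sl, P_wl = sl_wl_mse(1.0, R, C, p, q)
# Proper case (C = 0) with a real target: P_wl equals 2 * P_sl - var_xi.
P_sl0, P_wl0 = sl_wl_mse(1.0, R, np.zeros_like(C), p, q)
```

With these values the WL error is markedly smaller than the SL error, and setting the complementary correlation to zero recovers the relation between the two errors stated for proper observations and a real target.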

Practical Implementation of the Estimator
We enumerate the necessary steps in implementing the estimation technique proposed for the estimator (5). Nevertheless, some comments are made on how the algorithm can be adapted to obtain (2). Moreover, the role played by (7) becomes clear at the end of the procedure. The steps are the following:
1) Determine the augmented statistics of the processes involved. In some practical applications, the second-order structure is initially known. In fact, it may be derived from experimental measurements or mathematical models. For instance, the information-bearing signal in the communications problem is purposely designed to have desired statistical properties [32]. Other examples can be consulted in [33,34].
2) Select a function $F(t)$ such that condition (1) holds. As noted above, this function $F(t)$ can be selected by a trial-and-error method or by using the procedure given in [30]. Notice that this function is not unique and, in general, many specifications are possible.
3) Obtain the eigenvalues $\{\alpha_k\}$ and eigenfunctions $\{\boldsymbol{\varphi}_k(t)\}$ associated with the augmented correlation function $\mathbf{r}_x(t,\tau)$. In general, the determination of eigenvalues and eigenfunctions, except for a few cases, is a very involved problem, if not an impossible one. However, we can avoid the calculation of true eigenvalues and eigenfunctions by means of the Rayleigh-Ritz (RR) method, which is a procedure for numerically solving operator equations involving only elementary calculus and simple linear algebra (see [31,35] for a detailed study of the practical application of the RR method).
4) Truncate expressions (5) and (6) at $n$ terms and substitute, if necessary, the true eigenvalues and eigenfunctions by the RR ones. This truncated version of the estimator, which is in fact a suboptimum estimator, can be calculated via the expression (7) with functions $h_1(t,\tau)$ and $h_2(t,\tau)$ built from the $n$ retained terms, where both functions satisfy the conditions of Theorem 3. Thus, we have replaced the computation of $2n$ integrals in the truncated version of (5) (or $n$ integrals in the finite series obtained from (2)) by the computation of two integrals in (7), and hence, it entails a reduction in the error of approximation for a given precision. Note that both the precision and the amount of computation required in applying this method depend heavily on the number $n$. An easy criterion³ for determining an adequate level of truncation $n$ without an unnecessary excess of computation can be the following: select $n$ in such a way that $\sum_{k=1}^{n}\alpha_k$ represents at least 95% of the total variance of the process, $\sum_{k=1}^{\infty}\alpha_k = 2\int_T r_x(t,t)F(t)\,dt$ (see the proof of Lemma 1 in the "Appendix").
5) Finally, from a discrete set of observations, $x_1, \ldots, x_N$, we can compute the integrals in (7) by means of finite weighted sums of the observations $x_k$ and their conjugates $x_k^*$, where the weights $g_1(t,k)$ and $g_2(t,k)$ are obtained via a suitable numerical integration method for integrands known only at discrete points. Examples are the Gill-Miller quadrature method [36], implemented by subroutine d01gaf from the NAG Toolbox for MATLAB, and the trapezoidal rule (trapz function in MATLAB).
The only changes for implementing the estimator (2) are in steps 1 and 3, where we have to use the correlation function $r_x(t,\tau)$ and its associated eigenvalues and eigenfunctions, $\{\lambda_k\}$ and $\{\phi_k(t)\}$, instead.
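Steps 3 and 4 above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of [31,35]: the basis is assumed orthonormal with respect to the quadrature weights (true for the complex exponentials on a uniform midpoint grid used here), the kernel $\min\{t,\tau\}$ on [0, 1] with $F(t) = 1$ is an illustrative choice, and the helper names are hypothetical. For simplicity the sketch treats a scalar kernel; the augmented case follows the same pattern with the augmented kernel.

```python
import numpy as np

def rayleigh_ritz(kernel, basis, t, w):
    """RR eigenpairs of the integral operator with kernel r(t,tau) and measure F(t)dt.

    basis : (N, m) array, basis functions sampled at the nodes t and assumed
            orthonormal with respect to the weights w (w includes F(t)).
    """
    K = kernel(t[:, None], t[None, :])
    Bw = basis * w[:, None]
    A = Bw.conj().T @ K @ Bw                  # A_kl = <b_k, R b_l>
    A = 0.5 * (A + A.conj().T)                # enforce Hermitian symmetry
    lam, c = np.linalg.eigh(A)
    lam, c = lam[::-1], c[:, ::-1]            # descending order
    return lam, basis @ c                     # RR eigenvalues and sampled eigenfunctions

def truncation_level(eigvals, total_variance, fraction=0.95):
    """Smallest n whose leading eigenvalues explain `fraction` of the variance, else None."""
    share = np.cumsum(eigvals) / total_variance
    hits = np.nonzero(share >= fraction)[0]
    return int(hits[0]) + 1 if hits.size else None

N, m = 400, 10
t = (np.arange(N) + 0.5) / N
w = np.full(N, 1.0 / N)                       # F(t) = 1
k = np.arange(-m, m + 1)
basis = np.exp(2j * np.pi * t[:, None] * k[None, :])   # Fourier basis on [0, 1]
lam_rr, phi_rr = rayleigh_ritz(np.minimum, basis, t, w)
total = np.sum(np.minimum(t, t) * w)          # int r(t,t) F(t) dt
n = truncation_level(lam_rr, total)           # may be None if the basis is too small
```

The RR eigenvalues approximate the true ones from below, so a small basis can leave the 95% level unreached; in that case `truncation_level` returns `None`, which signals that more basis functions are needed.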

Numerical Examples
Three examples illustrate the implementation of the proposed solution and show its capability to solve very general estimation problems. Example 1 shows a situation where true eigenvalues and eigenfunctions are available and aims at comparing the performance of WL processing in relation to SL processing. Example 2 applies the RR method to approximate the eigenexpansion and also illustrates its implementation with discrete data. Finally, Example 3 considers an application in seismic signal processing in which the ground-motion velocity is estimated from seismic ground acceleration data.

Example 1
Assume that a real waveform $s(t)$ is transmitted over a channel that rotates it by some random phase $\theta$ and adds a noise $n(t)$. Unlike [28] and [29], we consider infinite observation intervals and a multiplicative quadratic noise in the observations. More precisely, $s(t)$ is defined on the real line, $S = \mathbb{R}$, with zero-mean and $r_s(t,\tau) = e^{-(t-\tau)^2}$. Thus, the observation process $x(t)$, expression (8), is built from the rotated waveform $e^{j\theta}s(t)$ and the noise $n(t)$, where $j = \sqrt{-1}$ and the noise $n(t)$ is a zero-mean Gaussian process with $r_n(t,\tau) = 3^{-1/2}p^{1/4}(t)p^{1/4}(\tau)$, where $p(t) = \sqrt{2/\pi}\,e^{-2t^2}$ (this type of process is studied in [34]). Three different probabilistic distributions for $\theta$ are taken: a uniform distribution on $(-\sigma, \sigma)$, a zero-mean normal with variance $\sigma$, and a Laplace distribution with zero-mean and variance $\sigma$. Several choices of $\sigma$ will be used to show how the advantages of WL processing vary with the level of improperness of the observations. Finally, mutual independence of $\theta$, $s(t)$, and $n(t)$ is assumed. The objective is to estimate $\dot{s}(t)$, $t \in [0, 1]$, where $\dot{s}(t)$ denotes the MS derivative of $s(t)$.
We first notice that $\int_{-\infty}^{\infty} r_x(t,t)\,dt < \infty$, where $F(t) = 1$ has been selected by a trial-and-error method and thus, condition (1) is verified. This example is one of the particular cases where calculation of true eigenvalues and eigenfunctions is possible. In fact, $\mathbf{r}_x(t,\tau)$ has eigenvalues $(1 + E[e^{2j\theta}])\lambda_k$ and $(1 - E[e^{2j\theta}])\lambda_k$, with respective associated eigenfunctions expressed through functions $\phi_k(t)$ that involve the Hermite polynomials $H_k(t) = (-1)^k e^{t^2}\frac{\partial^k}{\partial t^k}e^{-t^2}$. Moreover, we can check that the associated MS errors $P_{SL}(t)$ and $P_{WL}(t)$ admit series expressions in terms of the functions $l_k(t) = 3^{-1/2}\int_T \frac{\partial}{\partial t}r_s(t,\tau)p^{1/2}(\tau)\phi_k(\tau)\,d\tau$.
To compare the performance of WL processing in relation to SL processing, we use a measure $I$ that is closely related to the performance measure considered in [29]. For that, we have truncated the series in $P_{SL}(t)$ and $P_{WL}(t)$ at $n = 10$ terms (this approximate expansion explains 99.86% of the total variance of the process). The performance of both the SL and the WL estimators for $n = 10$ does not vary substantially from the case of $n > 10$. Figure 1a depicts the measure $I$ as a function of $\sigma$ for the three probabilistic distributions considered for $\theta$. It turns out that the advantages of WL processing decrease both as $\sigma$ tends toward zero and as $\sigma$ tends toward infinity. However, this occurs for different reasons. Another performance measure which helps in the interpretation is the index $L$, which, for this example, takes the value $L = |E[e^{2j\theta}]|$. Figure 1b shows the index $L$ as a function of $\sigma$ for the three probabilistic distributions considered for $\theta$. On the one hand, as $\sigma$ tends toward zero, the index $L$ tends to one since in that limit the observation process becomes a real signal⁴. On the other hand, when $\sigma$ increases, $L$ tends toward zero since $x(t)$ becomes a proper signal. The faster convergence to zero in the normal case and the slower one for the Laplace distribution are also observed.
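The behavior of the index $L = |E[e^{2j\theta}]|$ follows from the characteristic function of $\theta$ evaluated at 2. The sketch below is an illustration under assumptions about the garbled parameter: it reads the parameter as $\sigma$, taken as the half-width for the uniform case and literally as the variance for the normal and Laplace cases; if the original parameterization differs (e.g., a standard deviation), the formulas change accordingly. The function name is hypothetical.

```python
import numpy as np

def improperness_index(sigma, dist):
    """L = |E[exp(2j*theta)]| from the characteristic function of theta at 2.

    sigma is the half-width for 'uniform' and the variance for 'normal'
    and 'laplace' (assumed reading of the example's parameter).
    """
    sigma = np.asarray(sigma, dtype=float)
    if dist == "uniform":                     # theta ~ U(-sigma, sigma)
        # np.sinc(x) = sin(pi x)/(pi x), so this is |sin(2 sigma) / (2 sigma)|
        return np.abs(np.sinc(2.0 * sigma / np.pi))
    if dist == "normal":                      # theta ~ N(0, sigma)
        return np.exp(-2.0 * sigma)
    if dist == "laplace":                     # scale b with variance 2 b^2 = sigma
        return 1.0 / (1.0 + 2.0 * sigma)
    raise ValueError("dist must be 'uniform', 'normal' or 'laplace'")

sigmas = np.array([0.01, 1.0, 5.0])
L_uniform = improperness_index(sigmas, "uniform")
L_normal = improperness_index(sigmas, "normal")
L_laplace = improperness_index(sigmas, "laplace")
```

Under this reading, all three indices start near one for small $\sigma$ and decay to zero, exponentially for the normal phase and only algebraically for the Laplace phase, which is consistent with the ordering described for Figure 1b.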

Example 2
We study a generalization of the classical communication example addressed in [28] and [29]. Assume that a real waveform $s_1(t)$ is transmitted over a channel that rotates it by a standard normal phase $\theta_1$ and adds a nonwhite noise $n(t)$. More precisely, $s_1(t)$ is defined on the interval [0, 1], with zero-mean and $r_{s_1}(t,\tau) = \min\{t,\tau\}$. Thus, the observation process is $x(t) = e^{j\theta_1}s_1(t) + n(t)$, $t \in [0, 1]$, where the nonwhite noise $n(t)$ is obtained from a linear time-invariant system of the form $n(t) = e^{j\theta_2}\int_0^1 r_{s_1}(t,\tau)s_2(\tau)\,d\tau$, with $\theta_2$ being a zero-mean normal random variable with variance 2 and $s_2(t)$ a standard Wiener process (these types of noises appear in [37, p. 357]). Moreover, we assume that $\theta_1$, $\theta_2$, $s_1(t)$, and $s_2(t)$ are independent of each other. This example extends the cases studied in [28] and [29] since the noise considered here does not have a white component and thus, the previous solutions cannot be applied. The observations have been taken at the following time instants: $i/1000$, $i = 1, \ldots, 1000$. The objective is to estimate $s(t) = e^{j\theta_1}s_1(t)$, $t \in [0, 1]$.
We first notice that $\int_0^1 r_x(t,t)\,dt < \infty$, where $F(t) = 1$ has been selected since the processes involved are continuous and thus, condition (1) is verified. Now, to apply the RR method, we choose the Fourier basis of complex exponentials on [0, 1], $\{\exp\{2\pi jkt\}\}_{k=-\infty}^{\infty}$. Following the recommendations in step 5 of Section 3.1, we compute the integrals in (7) via the subroutines d01gaf and trapz (there were no significant differences between both methods). Figure 2 depicts the MS error $P_{WL}(t)$ together with the MS errors of the WL estimator obtained from the RR method with $n = 25$ and $n = 50$ terms in step 5 of the algorithm, which have been generated by Monte Carlo simulation (a total of 10,000 simulations were performed). We can see that the method may yield a sufficiently accurate solution with a small number $n$ of terms while reducing the complexity of the problem significantly. Note that a truncated expansion at $n = 25$ terms explains 88.77% of the total variance of the process and the expansion with $n = 50$ terms 95.81%.
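The discrete-data step used in this example can be sketched with the trapezoidal rule written out explicitly. This is a minimal illustration rather than the authors' code: the kernel values `h1` and `h2` are hypothetical placeholders for the functions $h_1(t,\cdot)$ and $h_2(t,\cdot)$ evaluated at the sampling instants for one fixed $t$, and the optional factor `F` reflects an assumption about whether the weighting function enters the integrals in (7).

```python
import numpy as np

def trapezoid_weights(tau):
    """Trapezoidal-rule weights for (possibly irregular) sampling instants tau."""
    tau = np.asarray(tau, dtype=float)
    d = np.diff(tau)
    w = np.zeros_like(tau)
    w[:-1] += 0.5 * d                          # left half of every interval
    w[1:] += 0.5 * d                           # right half of every interval
    return w

def wl_estimate_discrete(h1, h2, x, tau, F=None):
    """Approximate int h1*x + int h2*conj(x) by sums with weights g1(t,k), g2(t,k).

    h1, h2 : kernel values at the instants tau for one fixed t (hypothetical inputs).
    F      : optional callable weighting function F(tau) (assumption).
    """
    w = trapezoid_weights(tau)
    if F is not None:
        w = w * F(tau)
    g1, g2 = h1 * w, h2 * w                    # quadrature weights applied to x_k and x_k*
    return np.sum(g1 * x) + np.sum(g2 * np.conj(x))

tau = np.arange(1, 1001) / 1000.0              # sampling instants i/1000
quad_check = np.sum(trapezoid_weights(tau) * tau**2)   # approximates int tau^2 dtau

rng = np.random.default_rng(1)
x = rng.normal(size=tau.size) + 1j * rng.normal(size=tau.size)   # placeholder data
h1 = np.exp(1j * tau)                          # placeholder kernel values
estimate = wl_estimate_discrete(h1, np.conj(h1), x, tau)
```

Choosing `h2` as the conjugate of `h1`, as in the usage lines, yields a real-valued output, which matches the remark that the WL estimate of a real functional is itself real.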

Example 3
The seismic ground acceleration can be represented by a uniformly modulated nonstationary process [33]. The modulated nonstationary process is obtained in the following way: $s(t) = a(t)z(t)$, where $a(t)$ is a time modulating function that could be a complex function, and $z(t)$ is a stationary process with zero-mean and known second-order moments. In general, the so-called exponential modulating function is adopted [38,39]. A common choice for $z(t)$ is the standard Ornstein-Uhlenbeck process with a particular version of the exponential modulating function given by $a(t) = e^{-t}$ [33, p. 38]. Thus, the seismic ground acceleration can be modeled as a stochastic signal $\{s(t), t \in S = \mathbb{R}^+\}$ with $r_s(t,\tau) = e^{-(t+\tau)}e^{-|t-\tau|}$. Consider the observation process $\{x(t), t \in T\}$ obtained from $s(t)$ and a random phase $\theta$, where $\theta$ is a standard normal phase independent of $s(t)$. Now, the objective is to estimate the seismic ground velocity at instant $t \geq 2$, i.e., $\xi(t) = \int_0^t s(\tau)\,d\tau$, with $t \in S' = [2, \infty)$. A justification for considering infinite intervals on the basis of the stationarity property of $z(t)$ can be found in [40].
By using a trial-and-error method, we select $F(t) = e^{-t}$ and then, (1) holds. For the case of infinite intervals, $T = \mathbb{R}^+$, the true eigenvalues and eigenfunctions of $\mathbf{r}_x(t,\tau)$ are not known. We approximate them by means of the RR method. The RR eigenvalues and eigenfunctions of $\mathbf{r}_x(t,\tau)$ are expressed in terms of $\tilde{\lambda}_k$ and $\tilde{\phi}_k(t)$, the RR eigenvalues and eigenfunctions, respectively, of $r_x(t,\tau)$ obtained from a trigonometric basis. In Figure 3, we compare the MS error of the SL estimator calculated with $n = 10$ terms with the MS errors of the WL estimator with $n = 2$, 4, and 10 terms (which account for 57.60, 82.30, and 93.88% of the total variance of $x(t)$, respectively). We have limited the estimation interval to [2, 6] because of the observed stabilization of the MS errors for $t \geq 4$. Apart from the better performance of the WL estimator with respect to the SL estimator (as was to be expected), the rapid convergence of the RR estimators is also confirmed.

Concluding Remarks
A new WL estimator has been given for solving general continuous-time estimation problems. The formulation considered can be adapted in order to include as particular cases a great number of estimation problems of interest. The proposed estimator avoids the explicit calculation of matrix inverses altogether and can be applied provided that the second-order characteristics of the processes involved are known. Such knowledge is usual in some practical problems in fields as diverse as seismic signal processing, signal detection, finite element analysis, etc. An alternative procedure is the stochastic gradient-based iterative solution called the augmented complex least mean-square algorithm (see, e.g., [24]), in which the second-order statistics are estimated from data. However, if we wish to take advantage of the knowledge of the second-order characteristics and the number of observation data is very large, then the continuous-time solution is a recommended option.

Appendix
This "Appendix" is written following a rigorous mathematical formalism parallel to [15] or [30]. Condition (1) is indeed more restrictive than the one imposed in the works of Cambanis. Specifically, suppose $\mu$ is a measure on $(T, \mathcal{B}(T))$ ($\mathcal{B}(T)$ is the $\sigma$-algebra of Lebesgue measurable subsets of $T$) which is equivalent to the Lebesgue measure and verifies
$\int_T r_x(t,t)\,d\mu(t) < \infty. \quad (9)$
The existence of $\mu$ satisfying (9) is proved in [30]. Cambanis also shows that (9) allows us to select a function $F(t)$ such that $d\mu(t)/dt = F(t)$ and (1) holds.
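Theorem 1 If $x(t)$ is proper, then the WL estimator $\hat{\xi}_{WL}(t)$ is given by (2). Moreover, its associated MS error is given by (3).
Proof: Firstly, notice that if $x(t)$ is proper, then the members of the set of random variables $\{\varepsilon_k\} \cup \{\varepsilon_k^*\}$ are orthogonal. Thus, the estimator $\hat{\xi}_{WL}(t)$ is obtained by projecting the functional $\xi(t)$ onto the Hilbert space generated by $\{\varepsilon_k\}$ and $\{\varepsilon_k^*\}$, $H(\varepsilon_k, \varepsilon_k^*)$. Hence, the estimator can be expressed in the form $\hat{\xi}_{WL}(t) = \sum_{k=1}^{\infty} b_k(t)\varepsilon_k + \sum_{k=1}^{\infty} \tilde{b}_k(t)\varepsilon_k^*$, where the coefficients $b_k(t)$ and $\tilde{b}_k(t)$ are determined via the projection theorem of Hilbert spaces. This result assures that $E[\xi(t)\varepsilon_k^*] = E[\hat{\xi}_{WL}(t)\varepsilon_k^*]$ and $E[\xi(t)\varepsilon_k] = E[\hat{\xi}_{WL}(t)\varepsilon_k]$. Since $E[\hat{\xi}_{WL}(t)\varepsilon_k^*] = \lambda_k b_k(t)$ and $E[\hat{\xi}_{WL}(t)\varepsilon_k] = \lambda_k\tilde{b}_k(t)$, the first part of the result follows.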
On the other hand, the corresponding MS error follows by evaluating $E[|\xi(t) - \hat{\xi}_{WL}(t)|^2]$ with the coefficients obtained above.
We need the following Lemma before proving Theorem 2.
Lemma 1
Proof: From (9), we get that $\mathbf{r}_x(t,\tau)$ is the kernel of an integral operator of $L^2(\mu \times \mu)$ into $L^2(\mu \times \mu)$, which is linear, self-adjoint, nonnegative-definite, and compact. Let $\{\alpha_k\}$ be its eigenvalues and $\{\boldsymbol{\varphi}_k(t)\}$ the corresponding eigenfunctions. The eigenfunctions $\boldsymbol{\varphi}_k(t) = [f_k(t), f_k^*(t)]$ are orthonormal in the sense of (10). Thus, the real random variables given by (4) are trivially orthogonal, i.e., $E[w_n w_m] = \alpha_n\delta_{nm}$.

Competing interests
The authors declare that they have no competing interests.
Notes
1 Using augmented statistics means incorporating in the analysis the information supplied by the complex conjugate of the signal and examining properties of both the correlation and complementary correlation functions.
2 This result is an extension of the more familiar orthogonality principle for finite-dimensional vector spaces (see, e.g., [12,13]).
3 It should be remarked that this criterion only takes into account the information provided by $x(t)$, and the removed coefficients could be very informative about $\xi(t)$.
4 Notice that the complex nature of $x(t)$ in (8) stems from the term $e^{j\theta}$. Hence, as $\sigma \to 0$, the variance of $\theta$ vanishes and it becomes a degenerate random variable that only takes the value 0 with probability 1.