Nonlinear Transformation of Differential Equations into Phase Space

Time-frequency representations transform a one-dimensional function into a two-dimensional function in the phase space of time and frequency. The transformation that accomplishes this is nonlinear, and there are an infinite number of such transformations. We obtain the governing differential equation for any two-dimensional bilinear phase-space function for the case where the governing equation for the time function is an ordinary differential equation with constant coefficients. This connects the dynamical features of the problem directly to the phase-space function, which has a number of advantages.


INTRODUCTION
Ordinary linear differential equations with constant coefficients are the most venerable and studied of differential equations, and many ideas and methods have been developed to obtain exact, approximate, and numerical solutions, and to study qualitatively the nature of the solutions [1]. The subject is over 300 years old, but nonetheless we argue that a totally new perspective is achieved when the differential equation, even a simple ordinary differential equation, is transformed into phase space by a nonlinear transformation. Moreover, we further argue that this transformation not only results in greater insight into the nature of the solution, but also leads to new approximation methods [2].

To illustrate and motivate our method we start with a simple example. Consider the following harmonic oscillator differential equation (it is the equation of the RLC circuit, or of the damped spring-mass system):

d^2 x(t)/dt^2 + 2µ dx(t)/dt + ω_0^2 x(t) = f(t), (1)

where f(t) is a given driving force and x(t) is the output signal of the system, that is, the solution to the differential equation (µ and ω_0 are real constants). Perhaps no equation is more studied than this one. In principle, it can be solved symbolically by many methods, for example by obtaining the Green's function. However, doing so does not add any particular insight into the nature of the solution. For practical reasons and to gain insight, one often transforms this equation into the Fourier domain. Defining

X(ω) = (1/√(2π)) ∫ x(t) e^{−iωt} dt, (2)

the differential equation transforms into [3]

(−ω^2 + 2iµω + ω_0^2) X(ω) = F(ω), (3)

whose exact solution is

X(ω) = F(ω) / (ω_0^2 − ω^2 + 2iµω). (4)

The reasons for going into the Fourier domain are many. First, we have a practical method of solution, since one can now recover the time solution by way of

x(t) = (1/√(2π)) ∫ X(ω) e^{iωt} dω. (5)

Perhaps more important is that one can gain insight into the nature of the solution, and for both reasons the Fourier transform has become part of standard analysis in all fields of science. We emphasize that the spectrum, among other things, tells us what frequencies exist in the function.
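The Fourier-domain route can be checked numerically. The following sketch (ours, not part of the original development; the grid and pulse parameters are arbitrary choices) solves the oscillator by dividing the FFT of a compactly supported driving force by the transfer function ω_0^2 − ω^2 + 2iµω, and compares the result with a direct convolution against the causal Green's function of the underdamped oscillator:

```python
import numpy as np

# Our numerical sketch: solve d2x/dt2 + 2*mu*dx/dt + w0^2*x = f(t) two ways.
mu, w0 = 1.0, 6 * np.pi            # underdamped: mu < w0
N, dt = 16384, 0.0025              # about 41 s of signal
t = np.arange(N) * dt

# compactly supported driving force: a Gaussian pulse well inside the window
f = np.exp(-((t - 5.0) ** 2) / 0.5)

# (a) Fourier domain: divide F(w) by the transfer function w0^2 - w^2 + 2i*mu*w
w = 2 * np.pi * np.fft.fftfreq(N, dt)
x_fft = np.fft.ifft(np.fft.fft(f) / (w0**2 - w**2 + 2j * mu * w)).real

# (b) time domain: convolve f with the causal Green's function
wd = np.sqrt(w0**2 - mu**2)                  # damped frequency
h = np.exp(-mu * t) * np.sin(wd * t) / wd    # impulse response for t >= 0
x_conv = np.convolve(f, h)[:N] * dt

print(np.max(np.abs(x_fft - x_conv)) / np.max(np.abs(x_conv)))  # small
```

The response decays within the window, so the periodic wraparound implicit in the FFT method is negligible and the two solutions agree closely.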
To be more concrete, as an example we take an important case of the driving force,

f(t) = e^{−αt^2/2} e^{iβt^2/2 + iω_1 t}. (6)

This driving force is a linear chirp with a Gaussian amplitude modulation; its instantaneous frequency, ω_1 + βt, increases linearly with time. The Fourier transform of the driving force is [4]

F(ω) = (1/√(α − iβ)) e^{−(ω − ω_1)^2/(2(α − iβ))}, (7)

which gives

X(ω) = (1/√(α − iβ)) e^{−(ω − ω_1)^2/(2(α − iβ))} / (ω_0^2 − ω^2 + 2iµω). (8)

In Figures 1 and 2 we plot the signal and spectrum for the values indicated in the caption.

Figure 1: Solution of (6), real part of x(t). The parameters are µ = 1, ω_0 = 6π rad/s, α = 0.001, β = 6/5π, and ω_1 = −8π rad/s.

As mentioned, much can be learned from a study of x(t) and X(ω). However, even more can be learned than is commonly discussed in textbooks, as we now show. We take the solution x(t) and make the following nonlinear transformation [4]:

C(t, ω) = (1/4π^2) ∭ x*(u − τ/2) x(u + τ/2) φ(θ, τ) e^{−iθt − iτω + iθu} du dτ dθ, (9)

where φ(θ, τ) is a two-dimensional function called the kernel. If the kernel is taken to be independent of the signal x(t), then the resulting distributions are said to be bilinear in x(t). By choosing different kernels, particular distributions are obtained [5,6,7,8]. Equivalent to (9) is the form

C(t, ω) = ∬ K(t, ω; t′, t″) x*(t′) x(t″) dt′ dt″, (10)

and there is a one-to-one relation between K(t, ω; t′, t″) and φ(θ, τ) [4]. The form given by (9) is more convenient than (10), because the properties of the distributions are more easily studied from it. The resulting transformation, C(t, ω), is a two-dimensional function of both time and frequency; the transformation takes us from a function of one variable to a function of two variables. Such functions are called distributions, representations, or quasiprobability distributions. In Figure 3 we plot a possible C(t, ω) for the signal x(t).
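As a concrete illustration (our sketch, with a hypothetical helper `wigner` and an arbitrarily chosen signal), the following computes one member of the bilinear class on a discrete grid, the pseudo-Wigner distribution, which corresponds to taking the kernel φ(θ, τ) = 1; the signal is a Gaussian-windowed linear chirp like the driving force above. On the discrete grid the time marginal recovers |x(t)|^2 exactly:

```python
import numpy as np

def wigner(x):
    """Discrete pseudo-Wigner distribution; rows index time, columns frequency."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    W = np.zeros((N, N), dtype=complex)
    for n in range(N):
        L = min(n, N - 1 - n)                  # largest usable half-lag here
        m = np.arange(-L, L + 1)
        r = np.zeros(N, dtype=complex)         # local autocorrelation, lag 0 at index 0
        r[m % N] = np.conj(x[n - m]) * x[n + m]
        W[n] = np.fft.fft(r)                   # transform lag -> frequency
    return W

# Gaussian-windowed linear chirp, qualitatively like the driving force above
t = np.linspace(-5, 5, 256)
x = np.exp(-0.1 * t**2) * np.exp(0.5j * t**2)

W = wigner(x)
# the time marginal of the discrete Wigner distribution recovers |x(t)|^2
print(np.allclose(W.sum(axis=1).real, len(x) * np.abs(x)**2))  # True
```

A plot of W (time on one axis, frequency on the other) shows the concentration along the chirp's instantaneous frequency, which is what makes the joint picture informative.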
We see that something remarkable happens: one gets a simple, clear picture of what is going on and of the regions of phase space that are important. In particular, we see in a simple way what the response of the system to the input chirp is. We can immediately see that we get a larger response when the input chirp hits the resonant frequency of the harmonic oscillator, whose parameters µ and ω_0 have been chosen to give the so-called underdamped behavior. Hence, by making a nonlinear transformation we get more insight than by looking at x(t) or X(ω) separately. We get considerably more insight because the joint distribution tells us how time and frequency are related.

Figure 7: Solution of (6), real part of x(t), for a sinusoidal frequency-modulated forcing term.

Distributions of this kind have been studied for some seventy years in the field of time-frequency analysis in engineering, and also as quasidistributions in quantum mechanics [4,9,10]. Major developments have taken place in this area, and the ideas that have emerged have become standard and powerful methods of analysis [11,12,13]. In engineering, where the distributions are called time-frequency distributions, the main aim has been to understand time-varying spectra [14,15,16,17,18,19]. Among the many areas to which they have been applied are heart sounds, heart rate, the electroencephalogram (EEG), and the electromyogram (EMG) [20,21,22,23,24], machine fault monitoring [11,17,18,19,25,26], radar and sonar signals and acoustic scattering [14,16,27], speech processing [28,29], the analysis of marine mammal sounds [30,31], musical instruments [32], and linear and nonlinear dynamical systems [33,34,35], among many others. Our aim is the following. Suppose x(t) is governed by an ordinary differential equation with constant coefficients,

a_n d^n x(t)/dt^n + a_{n−1} d^{n−1} x(t)/dt^{n−1} + · · · + a_1 dx(t)/dt + a_0 x(t) = f(t), (11)

where f(t) is the driving force. Instead of solving for x(t) and substituting it into (9), we obtain a governing differential equation for C(t, ω).
In the next section we discuss some general properties of these bilinear transformations, and after that we derive the differential equation for C(t, ω) that corresponds to the solution of an ordinary differential equation with constant coefficients, (11).

BILINEAR TRANSFORMATIONS
We list just a few of the main properties of these distributions that are useful for our considerations. If we have two distributions, C_1 and C_2, with corresponding kernels φ_1 and φ_2, then the two distributions are related by

C_2(t, ω) = ∬ g(t′ − t, ω′ − ω) C_1(t′, ω′) dt′ dω′, (12)

with

g(t, ω) = (1/4π^2) ∬ (φ_2(θ, τ)/φ_1(θ, τ)) e^{iθt + iτω} dθ dτ. (13)

In operator form,

C_2(t, ω) = (φ_2/φ_1)((1/i) ∂/∂t, (1/i) ∂/∂ω) C_1(t, ω). (14)

The reason for writing (9) is that it is easier to handle, because the properties of C(t, ω) are more easily understood from φ(θ, τ) than from K(t, ω; t′, t″); but we emphasize that (10) and (9) are equivalent. The relation between φ(θ, τ) and K(t, ω; t′, t″) is given in reference [4].
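The relation between two members of the class can be checked numerically on a sampled grid. In the sketch below (ours; the grid size, the Gaussian kernel, and the test function are arbitrary choices), φ_1 = 1 and φ_2 is a Gaussian; C_2 computed by multiplying the two-dimensional Fourier transform of C_1 by the kernel ratio agrees with C_2 computed by an explicit circular convolution with g, the inverse transform of the ratio:

```python
import numpy as np

# Our sketch on a small sampled (theta, tau) grid: phi1 = 1 and a Gaussian phi2.
N = 16
k = np.fft.fftfreq(N) * N                      # integer frequency grid
theta, tau = np.meshgrid(k, k, indexing="ij")
phi1 = np.ones((N, N))
phi2 = np.exp(-(theta**2 + tau**2) / 32.0)
ratio = phi2 / phi1

rng = np.random.default_rng(1)
C1 = rng.standard_normal((N, N))               # stand-in for a first distribution

# operator (Fourier-multiplier) form of the relation
C2_op = np.fft.ifft2(np.fft.fft2(C1) * ratio).real

# convolution form: g is the inverse transform of the kernel ratio
g = np.fft.ifft2(ratio)
C2_conv = np.zeros((N, N), dtype=complex)
for a in range(N):
    for b in range(N):
        for u in range(N):
            for v in range(N):
                C2_conv[a, b] += C1[u, v] * g[(a - u) % N, (b - v) % N]

print(np.allclose(C2_op, C2_conv.real))        # True
```

The agreement is just the convolution theorem, which is the content of the operator form of the relation.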

DIFFERENTIAL EQUATIONS
The above harmonic oscillator example shows that by making a nonlinear transformation one obtains a two-dimensional function which shows clearly the physical nature of the solution and its relation to the driving force. Historically, the way these distributions have been used is to solve for x(t) from its governing equation (or to obtain x(t) experimentally) and to substitute it into the time-frequency function, (9). Our aim has been to relate the phase-space distribution to the dynamical system, that is, to obtain a differential equation for C(t, ω), so that we may study the phase-space function directly. We have been successful in doing so for the Wigner distribution and for a few other distributions (the smoothed pseudo-Wigner distribution and the Rihaczek distribution). In this paper we obtain the governing equation for any distribution C(t, ω), that is, for all bilinear time-frequency representations. We first give the result and then the derivation. Differential equation (11) is first written in polynomial form,

P(D) x(t) = f(t), D = d/dt, (15)

where

P(D) = a_n D^n + a_{n−1} D^{n−1} + · · · + a_1 D + a_0. (16)

Then the governing differential equation for any distribution C_x(t, ω) is

P(A_c) P*(B_c) C_x(t, ω) = C_f(t, ω), (17)

where the operators A_c and B_c are given by (18) and (19), and where in the definition of P*(B_c) only the coefficients a_0, . . . , a_n are complex conjugated, and not the operators; that is,

P*(B_c) = a_n* B_c^n + a_{n−1}* B_c^{n−1} + · · · + a_1* B_c + a_0*. (20)

We now explain the meaning of a quantity such as φ_c((1/i)(∂/∂t), (1/i)(∂/∂ω)). This operator is obtained by making the following substitution in the scalar function φ_c(θ, τ):

θ → (1/i) ∂/∂t, τ → (1/i) ∂/∂ω.

Similarly, the differentiations appearing in (18) and (19) are carried out on the scalar function φ_c(θ, τ) first, and the substitution is made afterwards.
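The substitution rule can be made concrete with a small symbolic computation (our illustration, using a hypothetical polynomial kernel φ(θ, τ) = 1 + θτ): substituting θ → (1/i) ∂/∂t and τ → (1/i) ∂/∂ω turns the scalar function into the operator 1 − ∂^2/∂t∂ω, which we apply to a test phase-space function.

```python
import sympy as sp

# Our illustration with a hypothetical kernel phi(theta, tau) = 1 + theta*tau.
t, w = sp.symbols("t omega", real=True)
C = sp.exp(-t**2 - w**2)                       # test phase-space function C(t, w)

# substitute theta -> (1/i) d/dt and tau -> (1/i) d/dw in 1 + theta*tau
applied = C + (1 / sp.I) * sp.diff((1 / sp.I) * sp.diff(C, w), t)
direct = C - sp.diff(C, t, w)                  # the operator 1 - d^2/(dt dw)

print(sp.simplify(applied - direct))           # 0
```

Since (1/i)^2 = −1, the composed substitution is exactly the operator 1 − ∂^2/∂t∂ω, as the check confirms.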

Derivation
We now give the derivation of (17). First consider the class of bilinear cross-distributions C_{x,y}(t, ω) of two signals x(t) and y(t):

C_{x,y}(t, ω) = (1/4π^2) ∭ x*(u − τ/2) y(u + τ/2) φ(θ, τ) e^{−iθt − iτω + iθu} du dτ dθ.

In general one has that

C_{ax_1+bx_2, y}(t, ω) = a* C_{x_1,y}(t, ω) + b* C_{x_2,y}(t, ω),
C_{x, ay_1+by_2}(t, ω) = a C_{x,y_1}(t, ω) + b C_{x,y_2}(t, ω),

where x_1(t), x_2(t), y_1(t), y_2(t), x(t), and y(t) are arbitrary signals, and a and b are complex constants; that is, the cross-distribution is conjugate-linear in its first argument and linear in its second. Also, we prove in the appendix that differentiation of the signals corresponds to the action of operators on the cross-distribution,

C_{Dx, y}(t, ω) = B_c C_{x,y}(t, ω),
C_{x, Dy}(t, ω) = A_c C_{x,y}(t, ω),

where D = d/dt. The operators A_c and B_c will be simplified in Section 3.2 to obtain the compact forms (18) and (19). The combined use of these relations allows one to obtain (17). We take the bilinear distribution of the left- and right-hand sides of (15) to obtain

C_{P(D)x, P(D)x}(t, ω) = C_{f,f}(t, ω) = C_f(t, ω),

and we use the conjugate-linearity together with the first operator relation to simplify the first argument, which gives

P*(B_c) C_{x, P(D)x}(t, ω) = C_f(t, ω).

Similarly, applying the linearity and the second operator relation to the second argument, we obtain (17).
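The bilinearity properties used above are easy to verify numerically. The sketch below (ours, with a hypothetical helper `cross_wigner`) implements a discrete cross-distribution with kernel φ = 1 and checks conjugate-linearity in the first argument and linearity in the second on random signals:

```python
import numpy as np

def cross_wigner(x, y):
    """Discrete cross-Wigner distribution; the first argument is conjugated."""
    x, y = np.asarray(x, dtype=complex), np.asarray(y, dtype=complex)
    N = len(x)
    C = np.zeros((N, N), dtype=complex)
    for n in range(N):
        L = min(n, N - 1 - n)
        m = np.arange(-L, L + 1)
        r = np.zeros(N, dtype=complex)
        r[m % N] = np.conj(x[n - m]) * y[n + m]   # local cross-correlation
        C[n] = np.fft.fft(r)                      # lag -> frequency
    return C

rng = np.random.default_rng(0)
x = rng.standard_normal(32) + 1j * rng.standard_normal(32)
y = rng.standard_normal(32) + 1j * rng.standard_normal(32)
a = 2.0 - 1.0j

print(np.allclose(cross_wigner(a * x, y), np.conj(a) * cross_wigner(x, y)))  # True
print(np.allclose(cross_wigner(x, a * y), a * cross_wigner(x, y)))           # True
```

Both checks hold to machine precision, since the scalars factor straight out of the local cross-correlation.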

Simplification of the operators
We now simplify the operators A_c and B_c. Starting from the forms obtained in the appendix, a direct calculation using the substitution rule described above reduces A_c to the compact form (18), and the same steps give (19) for B_c. Furthermore, it is often the case that the kernel is a product kernel, φ(θ, τ) = φ(θτ), in which case the operators A_c and B_c simplify further.

SPECIAL CASES
We now consider special cases, that is, distributions that are well known and have been used extensively in the literature.
Wigner distribution
The Wigner distribution is

W_x(t, ω) = (1/2π) ∫ x*(t − τ/2) x(t + τ/2) e^{−iτω} dτ.

Its kernel is given by

φ(θ, τ) = 1,

and therefore the derivative with respect to τ is zero:

∂φ(θ, τ)/∂τ = 0,

and therefore we get

A_c = iω + (1/2) ∂/∂t, B_c = −iω + (1/2) ∂/∂t.
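These operators can be verified symbolically for a concrete signal. In the sketch below (our check, using real Gaussian signals so that the integrals are elementary and no conjugation is needed in the first slot), the cross-Wigner distribution of (x, Dx) is compared with the operator iω + (1/2) ∂/∂t applied to the cross-Wigner distribution of (x, x):

```python
import sympy as sp

# Our symbolic check of the Wigner A operator on Gaussian signals.
t, w, tau = sp.symbols("t omega tau", real=True)

def cross_wigner(xf, yf):
    # cross-Wigner: integrate x(t - tau/2) y(t + tau/2) e^{-i tau w} over tau
    integrand = xf.subs(t, t - tau / 2) * yf.subs(t, t + tau / 2) * sp.exp(-sp.I * tau * w)
    return sp.integrate(integrand, (tau, -sp.oo, sp.oo))

x = sp.exp(-t**2)
W = cross_wigner(x, x)
Wd = cross_wigner(x, sp.diff(x, t))            # derivative in the second slot
A_W = sp.diff(W, t) / 2 + sp.I * w * W         # (1/2 d/dt + i w) applied to W

print(sp.simplify(Wd - A_W))                   # 0
```

The same computation with the derivative in the first slot reproduces B_c = −iω + (1/2) ∂/∂t.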

Rihaczek distribution
The Rihaczek distribution is

R_x(t, ω) = (1/√(2π)) x(t) X*(ω) e^{−iωt},

and the kernel is given by

φ(θ, τ) = e^{iθτ/2}.

Hence

∂φ(θ, τ)/∂τ = (iθ/2) φ(θ, τ),

and therefore the kernel term adds (1/2) ∂/∂t to the A operator, giving

A_c = iω + ∂/∂t.

For the B operator the same kernel term instead cancels the (1/2) ∂/∂t contribution, and therefore the operators are

A_c = iω + ∂/∂t, B_c = −iω.
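A discrete version of the Rihaczek distribution is immediate to implement, and on a sampled grid both marginals come out exactly. The following sketch (ours; the helper name, signal, and length are arbitrary) checks them:

```python
import numpy as np

# Our discrete sketch of the Rihaczek distribution
# R[n, k] = x[n] * conj(X[k]) * exp(-2*pi*i*n*k/N); both marginals are exact.
def rihaczek(x):
    x = np.asarray(x, dtype=complex)
    N = len(x)
    X = np.fft.fft(x)
    n = np.arange(N)
    phase = np.exp(-2j * np.pi * np.outer(n, n) / N)   # e^{-i w t} on the grid
    return np.outer(x, np.conj(X)) * phase

x = np.exp(2j * np.pi * 0.1 * np.arange(64)) * np.hanning(64)
R = rihaczek(x)

print(np.allclose(R.sum(axis=1), 64 * np.abs(x) ** 2))          # True: time marginal
print(np.allclose(R.sum(axis=0), np.abs(np.fft.fft(x)) ** 2))   # True: frequency marginal
```

The distribution is complex (it is sometimes called the complex energy density), but its marginals are the real quantities |x(t)|^2 and |X(ω)|^2.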

Smoothed pseudo-Wigner
The smoothed pseudo-Wigner distribution S_x(t, ω) is obtained by convolving the Wigner distribution with a smoothing function h(t, ω):

S_x(t, ω) = ∬ h(t − t′, ω − ω′) W_x(t′, ω′) dt′ dω′.

Here we consider a Gaussian smoothing function, for which the corresponding kernel φ(θ, τ) is also a Gaussian in θ and τ. We apply (18) to obtain the A_c operator; in the same way we obtain the B_c operator, and hence the governing equation for S_x(t, ω).
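The construction can be sketched numerically (our illustration; the chirp, the smoothing width, and the circular treatment of the boundaries are arbitrary choices): the discrete Wigner distribution is convolved with a normalized two-dimensional Gaussian by multiplication in the two-dimensional Fourier domain, which is where the kernel φ(θ, τ) lives. The smoothing preserves the total sum and lowers the peaks:

```python
import numpy as np

def wigner(x):
    """Discrete pseudo-Wigner distribution (real for any signal)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    W = np.zeros((N, N), dtype=complex)
    for n in range(N):
        L = min(n, N - 1 - n)
        m = np.arange(-L, L + 1)
        r = np.zeros(N, dtype=complex)
        r[m % N] = np.conj(x[n - m]) * x[n + m]
        W[n] = np.fft.fft(r)
    return W.real

N = 128
t = np.arange(N)
x = np.exp(2j * np.pi * (0.1 + 0.001 * t) * t)   # linear chirp
W = wigner(x)

# normalized 2D Gaussian smoothing function, centered at index (0, 0)
c = np.minimum(t, N - t).astype(float)           # circular distance from 0
G = np.exp(-np.add.outer(c**2, c**2) / (2 * 4.0**2))
G /= G.sum()

# circular 2D convolution as a product in the 2D Fourier domain
S = np.fft.ifft2(np.fft.fft2(W) * np.fft.fft2(G)).real

print(np.isclose(S.sum(), W.sum()))   # True: smoothing preserves the total sum
print(S.max() <= W.max() + 1e-9)      # True: smoothing lowers the peaks
```

Since G is non-negative and sums to one, each value of S is a convex combination of values of W, which is why the peak can only decrease.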

CONCLUSION
Time-frequency distributions transform a one-dimensional signal of time, x(t), into a two-dimensional function of time and frequency, C_x(t, ω). There are an infinite number of phase-space distributions C_x(t, ω), and they are characterized by the kernel function. The advantage of transforming a function of time into a phase-space distribution is that we can see clearly how time and frequency are related, or correlated, for the signal x(t). Also, we can see both mathematically and physically the regions of phase space which are of importance. In this paper we have derived the governing equation for any bilinear phase-space distribution C_x(t, ω) when the governing equation for the corresponding time signal x(t) is an ordinary linear differential equation with constant coefficients. A fundamental question is whether there is any particular advantage in choosing one such distribution over another. The considerations are manifold. First, all bilinear distributions are transformable into each other, and hence all the resulting differential equations for C_x(t, ω) are in some sense equivalent. However, one can have an advantage over another in a variety of ways. For example, the equation for a particular distribution may be easier to solve than that for another. Also, one differential equation may be more transparent as to the nature of the solution than another; moreover, one equation may be more amenable than another to devising approximation methods [2]. These issues are currently being studied.

ACKNOWLEDGMENT
This work was supported by the Air Force Information Institute Research Program (Rome, New York).