Bias correction for direct spectral estimation from irregularly sampled data including sampling schemes with correlation

The prediction and correction of systematic errors in direct spectral estimation from irregularly sampled data taken from a stochastic process is investigated. Different sampling schemes, which lead to such an irregular sampling of the observed process, are investigated. Two kinds of sampling schemes are considered: on the one hand, stochastic sampling with non-equidistant sampling intervals from a continuous distribution and, on the other hand, nominally equidistant sampling with missing individual samples, yielding a discrete distribution of sampling intervals. For both distributions of sampling intervals, continuous and discrete, different sampling rules are investigated. On the one hand, purely random and independent sampling times are considered. This is given only in those cases where the occurrence of one sample at a certain time has no influence on other samples in the sequence. This excludes any preferred delay intervals or external selection processes, which introduce correlations between the sampling instances. On the other hand, sampling schemes with interdependency and thus correlation between the individual sampling instances are investigated. This is given whenever the occurrence of one sample in any way influences further sampling instances, e.g., through recovery times after one instance, through preferences of certain sampling intervals including, e.g., sampling jitter, or through an external source with correlation influencing the validity of samples. A bias-free estimation of the spectral content of the observed random process from such irregularly sampled data is the goal of this investigation.


Introduction
Digital signal processing normally implies a time-limited, non-interrupted sequence of equidistant samples taken from a signal-generating process under investigation. Various reasons may lead to a different sampling: (1) the measurement process may depend on a non-regular, typically stochastic, sampling process, e.g., laser Doppler systems [1][2][3] in burst mode measure arrival times and velocities of randomly arriving particles carried by a time-variant flow field. (2) The equidistant measurement may be disturbed, leading to either individual missing samples or longer data gaps. Irregular sampling directly influences the spectral content of the observation. Spectral estimators may correctly estimate, e.g., the power spectral density of the data sequence (the observed signal) from the measurement. However, the spectrum of the observed signal will deviate from the spectrum of the process under investigation. Particularly, the observed signal is the product of the process under investigation multiplied with the sampling function, which is understood as the train of sampling instances of the observed signal, where all values of the sampling function are of unit amplitude (see illustration in Fig. 1). Hence, the spectrum of the observed signal is the convolution of the spectrum of the process with the spectrum of the sampling function. The spectral properties of the sampling function thus have a direct influence on the spectrum of the observed signal. To obtain meaningful information about the underlying process, appropriate estimators must consider the irregular sampling process and its spectral composition.

*Correspondence: holger.nobach@nambis.de, 3 Max Planck Institute for Dynamics and Self-Organization, Am Faßberg 17, 37077 Göttingen, Germany. Full list of author information is available at the end of the article.
Spectral estimation from randomly sampled signals in continuous time has been investigated in the past, mainly in the context of controlled sampling with induced variation of the sampling times, known as digital alias-free signal processing or sampling jitter [4][5][6][7], in the context of astrophysical observations [8][9][10][11] or in the context of laser Doppler data processing [12][13][14][15][16][17][18][19][20][21][22][23][24] including the specific role of processor dead times [25][26][27]. More general in their application are investigations in [28][29][30][31][32]. All these estimators are potentially able to handle also equidistant data with missing samples, and some of the given references include this case. However, adaptations are necessary, since random sampling on a continuous domain has a different spectral composition than equidistant sampling with missing samples. Independent and randomly distributed missing samples with otherwise equidistant sampling has also been investigated [31,[33][34][35][36][37][38]. Correlated data gaps have been investigated only for very specific cases [31,[39][40][41][42], without options for generalization or without satisfactory bias correction.
In contrast to theoretical investigations or computer simulations, where purely random sampling can be assumed or realized, strictly independent sampling is not possible in technical systems. For stochastic sampling in continuous time, e.g., laser Doppler systems cannot deliver successive samples with lag times below a certain minimum value. Particles separated by too small a distance will lead to overlapping scatter signals from the measurement volume. The laser Doppler system will remove these overlapping signals to avoid faulty measurements due to possible interference between the signals with an unpredictable phase shift. This way, the size of the measurement volume defines a minimum distance between successive particles and finally a certain minimum lag time, below which the probability of appropriate pairs of measurement events drops rapidly. The result is a distinct measurement error of spectral estimates based on algorithms that rely on the assumption of purely random and independent sampling instances, as shown in [25][26][27].
Missing data from equidistantly sampled processes may occur as individual outliers, which often are independent from each other. However, if external processes disturb the process under observation, these impinging processes may have a certain correlation and generate missing data samples or gaps, which are not independent. Depending on the individual reasons for missing data samples or data gaps, the correlation and the spectral content of the sampling scheme will differ significantly between application cases. Also here, solutions of bias correction for purely random and independent occurrence of missing samples like in [43] will fail with correlated data gaps.
Universal solutions of bias correction for correlated data gaps with unspecific characteristics are not available so far, neither for continuous time nor for equidistant time. The present article investigates systematic errors of spectral estimators processing directly the sequence of data for different sampling cases with different distributions and with different spectral characteristics of sampling intervals. In continuous time, random sampling is investigated, where in one case purely random and independent sampling instances are taken from a linear stochastic process. For the same process, a minimum time interval between successive samples leads to correlated sampling intervals in another simulation. For equidistant time, also a linear stochastic process is observed. Again, a random occurrence of missing data samples is investigated with independent individual outliers first. In the last sampling scheme, also longer sequences of data points are removed from the original data set, introducing correlation into the pattern of missing samples.
A common procedure to correct systematic errors of direct spectral estimation is introduced and proven to be able to deliver bias-free estimates of the spectra, independent of the spectral characteristics of the sampling scheme or that of the missing data. The required parameters can be obtained by theoretical investigation for sampling processes with a priori known sampling mechanisms. For unknown relations, an alternative procedure is given to obtain the correction parameters numerically, straight from the measured data sequence. The procedures are available as Python source codes as supplementary material to this article together with all data sets under [44].
Note that the correction procedure is not able to correct aliasing errors. If any aliasing occurs due to significant spectral content of the observed process above the temporal resolution of the sampling scheme, then systematic errors of the estimation will remain. Aliasing has its origin in an insufficient extraction of information from the process under observation, which principally cannot be repaired a posteriori, since the required information is no longer available from the observed data set. Note further that the bias correction may lead to corresponding correlation matrices which violate non-negative definiteness. As a consequence, negative values can occur in the corresponding estimated power spectral densities. Since the introduced procedures yield bias-free estimates, averages over multiple estimates of spectra will converge towards the true spectrum of the underlying process. As a common means to reduce the estimation variance, averaging over spectral estimates from data segments (block averaging) is often applied anyway. The ultimate solution, of course, would be regularization. Since this inevitably introduces a bias to the spectrum, regularization is not considered in the present article, where bias-free estimation has priority. If block averaging is applied, bias-free estimates are essential to obtain a consistent mean spectrum with vanishing systematic and random errors for an increasing sample size.

Primary estimates
Let S_p(f) be the power spectral density of an observed, time-dependent process x(t). The process is sampled at N ascending time instances t_i, with i = 1...N. The entire duration of the signal is T, where 0 ≤ t_1 < t_N ≤ T. The sampling is either at random time instances t_i with real values from a continuous domain or at quantized time instances with the fundamental time step Δt, but with missing samples. Note that T is not finally determined by a data set with irregular sampling instances. Only T ≥ t_N is fixed. The exact duration of the signal can be defined later, according to post-processing requirements like the spectral resolution desired for the discrete numerical spectral estimation.
Let S_s(f) be a bias-free estimate of the power spectral density of the observed signal. For the random sampling in continuous time, the observed signal is understood as the series of measurements x_i = x(t_i). For nominally regular sampling with missing samples, the observed signal is understood as the series of valid values only. Neglecting any possible errors due to a periodic continuation of the signal, the particular estimate of S_s(f) can be obtained, e.g., as the Fourier spectrum with the imaginary unit i, or with other methods like Lomb-Scargle's method [8,9] or the generalized Lomb-Scargle method [45] for data having a non-zero mean value. Note that S_s(f) deviates from S_p(f) because irregular sampling influences the spectral content of the observation as outlined below. Dissolving the square in Eq. (1) leads to a double sum over all pairs of samples. The last line with the double sum can be rewritten (Eq. (5)) with separate summations over self-products x_i² on one side and cross-products x_i x_j with i ≠ j on the other side.
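As a minimal sketch, a direct periodogram-type estimate of this kind can be written as below. The normalization (by N² and T) is an assumption for illustration, not the article's exact Eq. (1), so only relative spectral shapes and peak locations should be read from it.

```python
import numpy as np

def direct_psd(t, x, freqs, T):
    """Direct (periodogram-type) spectral estimate from irregularly sampled
    data: squared magnitude of the Fourier sum over the sampling instances
    t_i.  The 1/N^2 * T normalization is an assumed convention."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    # Fourier sum evaluated at each requested frequency
    F = np.array([np.sum(x * np.exp(-2j * np.pi * f * t)) for f in freqs])
    return np.abs(F) ** 2 / len(x) ** 2 * T
```

Dissolving the square |Σ_i x_i e^(−2πi f t_i)|² in this estimator yields exactly the double sum over self-products (i = j) and cross-products (i ≠ j) discussed above.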
For uninterrupted equidistant sampling, the probability P(t_i) of getting a valid sample at time t_i is unity at any sampling instance t_i = iΔt with the sampling interval Δt and zero between the samples. Since the self-products x_i² in Eq. (5) can be built directly from the respective samples x_i, the respective probability of such self-products occurring at time t_i is identical to the probability for the occurrence of the sample itself, namely P(t_i). Therefore, the probability of self-products also is unity at any sampling instance t_i = iΔt and zero between the samples.
Cross-products x_i x_j with i ≠ j instead require two samples at two different times t_i and t_j. The respective joint probability P(t_i, t_j) of getting such cross-products then is the product of the probability P(t_i) of having the first sample at t_i and the conditional probability P(t_j|t_i) of having another sample at t_j under the condition of having had the first sample at t_i (Eq. (6)).
However, for uninterrupted equidistant sampling, the probability of having another sample with a delay of t_j − t_i that equals an integer multiple of the sampling interval Δt is also unity. Hence, with t_i = iΔt and t_j = jΔt, finally cross-products between different samples and self-products of single samples have the same probability, provided the sampling is equidistant and uninterrupted.
With any kind of irregular sampling, the probability P(t_i) of having a valid sample at a sampling time t_i = iΔt is either less than one for nominally equidistant sampling with missing samples, or it changes its physical dimension to a probability density for irregular sampling in a continuous time domain. However, the probability of self-products at a certain time t_i is still identical to the probability of the sample itself, P(t_i), and Eq. (6) still holds for cross-products. But with irregular sampling, unfortunately, the respective conditional probabilities or conditional probability densities P(t_j|t_i) of getting another sample at time t_j after the sample at time t_i also differ from unity and may additionally vary for different delays t_j − t_i. Under these conditions, Eq. (5) averages over self- and cross-products of varying delays, which all have different probabilities or probability densities and contribute differently to the estimate of the spectrum if applied to irregularly sampled data. This finally leads to the observed bias.
Let further R_s(τ) be the correlation function of the observed signal. It corresponds to S_s(f) via the Fourier transform incorporating Wiener-Khinchin's theorem [46]. For a numerical realization via the discrete Fourier transform (DFT) resp. the inverse discrete Fourier transform (IDFT), both R_s and S_s must be sampled regularly. For that, a fundamental time increment Δτ and a frequency increment Δf are defined. While Δτ can be chosen arbitrarily for a randomly sampled signal in continuous time, Δτ equals the primary time step Δt of the signal for equidistant sampling. For the purpose of better clarity, in the following, the time step will be denoted as Δτ commonly, dropping the discrimination between time series with steps of Δt and correlation functions with steps of Δτ. The frequency increment in all cases is Δf = 1/T, defined by the assumed duration of the signal T, where T can be chosen either slightly larger than t_N or significantly larger, corresponding to an optional zero padding of the signal. In the present study, T = MΔτ is chosen with an integer M = ⌊t_N/Δτ⌋ + 1, such that T is close to t_N, e.g., the difference is within one time step Δτ. That leads to the correspondences

S_s(f_j) = Δτ Σ_{k=0}^{M−1} R_s(τ_k) e^(−2πi jk/M)   resp.   R_s(τ_k) = Δf Σ_{j=0}^{M−1} S_s(f_j) e^(2πi jk/M),

where ⌊x⌋ is the largest integer smaller than or equal to x.
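This DFT correspondence between R_s and S_s can be sketched numerically; the exponential correlation function below is an arbitrary assumption for demonstration, only the Δτ/Δf bookkeeping follows the text.

```python
import numpy as np

# Discrete Wiener-Khinchin correspondence between R_s and S_s, sampled with
# time increment dtau and frequency increment df = 1/T, where T = M * dtau.
M, dtau = 64, 0.5
tau = np.arange(M) * dtau
R = np.exp(-tau / 3.0)               # assumed example correlation function
S = np.fft.fft(R) * dtau             # R_s(tau_k) -> S_s(f_j)
R_back = np.fft.ifft(S).real / dtau  # S_s(f_j) -> R_s(tau_k), since df = 1/(M*dtau)
```

The inverse transform divided by Δτ equals the Δf-weighted sum, because Δf = 1/(MΔτ); the round trip recovers R exactly.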

Bias correction
The sampling schemes investigated here yield either a continuous distribution of sampling intervals (exponential distribution for purely random sampling times, or with deviations from the exponential distribution for correlated random sampling) or a discrete distribution (unity at the sampling interval of Δt and zero otherwise for uninterrupted equidistant sampling, or with further integer multiples of Δt occurring with samples missing). All investigated sampling schemes have in common that the sampling function, which defines the sampling times t_i, has a certain randomness, namely the sampling instances occur at random times. For continuously distributed sampling times, the sampling instances themselves are randomly distributed. Regarding equidistant data sets with missing or invalid samples, the randomness applies to the availability or validity of the individual samples. In the following, continuous and discrete distributions of the respective sampling intervals are distinguished to ensure a unique discrimination between these two cases of either sampling in a continuous time domain or nominally equidistant sampling with missing instances. This is apart from possible correlations between the sampling instances resp. between the sampling intervals, which additionally cause deviations from a purely random sampling in each of these two cases.
Let the mean sampling rate be α; the mean number of samples per time unit then is αΔτ. For a discrete distribution of sampling intervals, αΔτ is also the probability of a sample x_i being valid, which complies with 0 ≤ αΔτ ≤ 1. Self-products x_i² of samples then also occur with the probability of αΔτ as explained above. Cross-products x_i x_j can be made only from two different samples occurring at different time units t_i = iΔτ and t_j = jΔτ with i ≠ j. Assuming independence between sampling times, the probability of having sampling time t_j after sampling time t_i becomes P(t_j|t_i) = P(t_j), namely αΔτ. The probability of getting cross-products P(t_j, t_i) in Eq. (6) then finally becomes (αΔτ)². For a continuous distribution of independent sampling intervals with a mean sampling rate of α, the number of samples per time unit is Poisson distributed with the mean value of αΔτ, where αΔτ can also be larger than one. In this case, the mean number of self-products x_i² per time unit again is αΔτ, and the mean number of cross-products x_i x_j within two different time units again is (αΔτ)² following Eq. (6), provided all sampling instances within the two time units are independent.
Both of these values, the probability of self-products αΔτ and the probability of cross-products from two different time units (αΔτ)², are identical for the two cases, that of a discrete distribution of sampling intervals and that of a continuous distribution of sampling intervals, for purely random occurrence of samples without correlations between the sampling instances. In contrast to the case of a discrete distribution of sampling intervals, for a continuous distribution of sampling intervals, the occurrence of more than one sample within one time unit is possible. Therefore, there also exists a probability of cross-products occurring from different samples within the same time unit, which receives further attention at a later point.
Let β_k be the mean number of self- and cross-products per time unit counted in R_s(τ_k). Let further β'_k be β_k normalized by (αΔτ)². Then, for independent sampling and if only cross-products from different time units are counted, β'_k becomes unity. Any other influence, self-products, correlation between sampling instances, or cross-products within one time unit, causes β'_k to deviate from unity. In any case, β'_k is the ratio of the mean number of pairs of samples expected in R_s(τ_k), including all effects causing β'_k to deviate from unity, and the expected number of cross-products only from different time units and from independent sampling instances. A prediction of all coefficients β'_k for a given sampling scheme then can be used to balance the different probabilities of self- and cross-products for each lag time τ_k of the primary estimate of the correlation function R_s(τ_k) by normalization with β'_k, yielding an improved estimate R̂_s(τ_k) applying

R̂_s(τ_k) = R_s(τ_k) / β'_k .    (9)

The general procedure for bias correction then requires to transform the primary estimate of the spectrum into the appropriate correlation function. The correlation function then can be corrected for its bias based on the appropriate correction factors β'_k, followed by another Fourier transform of the improved correlation function back into a spectrum.
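The three-step procedure (spectrum → correlation function, per-lag normalization, back-transform) can be sketched as follows. The DFT conventions with Δτ are the ones used in this section; the coefficients β'_k are assumed to be given.

```python
import numpy as np

def correct_psd(S_s, beta_prime, dtau):
    """Bias correction of a primary PSD estimate S_s on the M DFT frequencies.
    beta_prime: length-M array of correction coefficients beta'_k
    (unity where no correction is needed)."""
    R_s = np.fft.ifft(S_s / dtau).real   # spectrum -> correlation function
    R_hat = R_s / beta_prime             # balance self-/cross-products per lag
    return np.fft.fft(R_hat).real * dtau # corrected correlation -> spectrum
```

With β'_k = 1 for all k, the estimate passes through unchanged; for purely random, independent sampling only β'_0 differs from unity, so the correction reduces to a constant offset of the spectrum.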
The determination of the values β'_k depends on the particular sampling scheme. An analytical derivation requires a priori knowledge about the specific rules of the particular sampling process. Purely random sampling without any correlations between the sampling instances is a dedicated case, where only β'_0 deviates from unity. This allows performing the intended bias correction directly on the spectrum, as shown in the following subsection. With correlations between the sampling instances, the correction of the spectrum is more complicated and all values β'_k are required. Analytical derivations for the following test cases with correlations of the sampling instances are given later, namely below the introduction of the particular simulation procedure. These derivations are limited to the shown test cases and they are no longer valid with other sampling schemes. As a universal alternative, the correction coefficients can also be estimated numerically, directly from the data taken, as shown in the next but one subsection.

Purely random sampling with independent sampling instances
For a continuous distribution of sampling intervals and with independent sampling instances, all cross-products are independent, which leads to β'_k = 1 for all k ≠ 0. For k = 0, a number of n samples in a time unit delivers n² cross- and self-products, where the number n of samples in that time unit is Poisson distributed with the probability

P(n) = (αΔτ)^n / n! · e^(−αΔτ)    (10)

and with the mean value of αΔτ. The mean number of cross- and self-products for k = 0 then becomes

β_0 = E[n²] = αΔτ + (αΔτ)² ,    (11)

which, after normalization with (αΔτ)², finally leads to

β'_0 = 1 + 1/(αΔτ) .    (12)

For a discrete distribution of sampling intervals and with independent randomly occurring missed samples, again all cross-products are independent, which leads to β'_k = 1 for all k ≠ 0 also here. In contrast to the previous case, where multiple sampling instances were possible within one time unit, in nominally equidistant sampling, only one or zero samples can occur per time interval. Therefore, R_s(0) can include only self-products, which occur with the mean rate of αΔt. After normalization with (αΔt)², this finally leads to

β'_0 = 1/(αΔt) .    (13)

In both cases, continuous and discrete distribution of sampling intervals without correlations, only β'_0 deviates from unity. The corresponding correlation function can be corrected at lag time zero by

R̂_s(0) = R_s(0) / β'_0 .    (14)

Due to the correspondence between the power spectral density and the correlation function given in Eq. (7), a correction of R_s(0) as in Eq. (14) leads to an offset correction of the power spectral density

Ŝ_s(f_j) = S_s(f_j) − (1 − 1/β'_0) R_s(0) Δτ    (15)

for all frequencies f_j. Fortunately, the procedure directly corrects the spectral estimates, which are the focus of this investigation. On the other hand, the procedure involves the corresponding correlation function. However, only the value R_s(0) is needed. Since the spectral estimates have been obtained directly from the data, a procedure without values of the correlation function would be preferable. In that case, the transformation of the spectra into corresponding correlation functions could be dropped.
For this purpose, the mean signal power

P̄_s = (1/N) Σ_{i=1}^{N} x_i²    (16)

is used instead, where a deviation of R_s(0) from β'_0 P̄_s results from the fact that in R_s(0) or R̂_s(0), in addition to self-products, also cross-products of samples within single time units of Δτ may occur, while only self-products are counted in P̄_s. However, this deviation occurs only for continuous distributions of sampling intervals, and it becomes significant only for large values of αΔτ above one. Fortunately, for continuous distributions of sampling intervals, the interval Δτ can be chosen arbitrarily small, ensuring αΔτ to be sufficiently smaller than one. In the limit of infinitesimally small Δτ, R_s(0) and β'_0 P̄_s become the same and the correction of the spectrum becomes

Ŝ_p(f_j) = S_s(f_j) − (β'_0 − 1) P̄_s Δτ .    (17)

Using the derivation of β'_0 from above, for random sampling with a continuous distribution of sampling intervals this reduces to

Ŝ_p(f_j) = S_s(f_j) − P̄_s / α    (18)

and for randomly and independently missing samples with a discrete distribution of sampling intervals to

Ŝ_p(f_j) = S_s(f_j) − (1 − αΔt) P̄_s / α .    (19)
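The two offset corrections of Eqs. (18) and (19) amount to subtracting a white noise floor from the primary spectrum. A hedged sketch follows; note the assumed mapping of symbols to code: for the continuous case `alpha` is the mean rate α, while for the discrete case `alpha` is the validity probability αΔt of a sample.

```python
import numpy as np

def offset_correct(S_s, P_s, alpha, dt=None):
    """Offset correction for independent sampling (sketch of Eqs. (18)/(19)).
    Continuous distribution of sampling intervals (dt is None):
        alpha = mean sampling rate,   offset = P_s / alpha.
    Discrete distribution (equidistant grid with step dt):
        alpha = probability of a valid sample (alpha*dt in the text),
        offset = (1 - alpha) * dt * P_s / alpha."""
    S_s = np.asarray(S_s, dtype=float)
    if dt is None:
        return S_s - P_s / alpha
    return S_s - (1.0 - alpha) * dt * P_s / alpha
```

For uninterrupted equidistant sampling (validity probability one) the discrete offset vanishes, consistent with the bias-free case discussed above.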

Empirical estimates of the correction coefficients
For an unknown spectral composition of the sampling function or that of the data gaps, the correction procedure with theoretical values of α and β'_k is not practicable because the values β_k resp. β'_k are not known a priori. In that case, the number of self- and cross-products and finally the coefficients β_k resp. β'_k can be estimated empirically from the measured sampling instances themselves, by applying the spectral estimator to the sampling function, whose values are of unit amplitude at all sampling instances. Since Lomb-Scargle's method has problems with signals of constant value, instead the Fourier spectrum is generally used for the empirical estimation of β'_k.
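For the nominally equidistant case, this empirical estimation can be sketched by taking the Fourier spectrum of the 0/1 validity indicator and transforming back to pair counts per lag; the circular (periodic) treatment of lags is a simplifying assumption of this sketch.

```python
import numpy as np

def empirical_beta(valid, alpha=None):
    """Empirical correction coefficients beta'_k from the sampling function.
    valid: 0/1 array marking available samples on the regular time grid.
    Pair counts per lag are obtained from the Fourier spectrum of the
    indicator (circular lags assumed), then normalized by alpha^2."""
    b = np.asarray(valid, dtype=float)
    n = len(b)
    if alpha is None:
        alpha = b.mean()                 # estimate the rate from the data
    F = np.fft.fft(b)
    counts = np.fft.ifft(np.abs(F) ** 2).real / n  # pairs per time unit, per lag
    return counts / alpha ** 2
```

For uninterrupted sampling all coefficients are unity, while, e.g., a strictly alternating validity pattern concentrates all pairs on even lags.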

Mean value
The procedures given here can be applied with no changes to data with or without a mean value. However, in contrast to equidistant sampling, a non-zero mean value in combination with irregular sampling increases the estimation variance of the derived spectra. Therefore, a possible mean value in real data should be estimated and removed from the data before spectral analysis. Unfortunately, the estimation and removal of the mean value is another bias source for the estimated correlation function and finally for the spectrum, as has been analyzed in [47,48] including appropriate corrections. To avoid interference with additional bias sources, the following test simulations are done with mean-free processes only. Accordingly, only Lomb-Scargle's method is used for direct comparison with the Fourier spectrum, while generalized Lomb-Scargle's method has not been considered further.

Simulation
To demonstrate the ability of the estimation routines to derive bias-free estimates of the power spectral density, a moving-average stochastic process of order 200 is generated. The last procedure also yields 50 % invalid samples on average, where the length of valid data or that of invalid data has an exponential distribution with a mean of five samples and the sequence of weights gets correlated. However, in all four cases a mean sampling rate of α = 0.5 tu⁻¹ is set. Figure 2 illustrates the sampling schemes for the four test cases and the appropriate classification according to the definition in the abstract. The arrangement of the four test cases will remain the same for all following figures in Section 3.
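A compact sketch of such a simulation for the independent-outlier case (c) is given below. The MA order, weights, and random seed are arbitrary assumptions of this sketch; only the structure (moving-average process, then thinning with validity probability 0.5) follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, q = 4096, 20                 # the article uses order 200; a smaller q keeps this fast
w = rng.standard_normal(N + q)  # driving white Gaussian noise
h = np.hanning(q)               # assumed MA weights (not the article's)
x = np.convolve(w, h, mode='valid')[:N]   # moving-average process realization

valid = rng.random(N) < 0.5     # case (c): independent missing samples, 50 % on average
t = np.nonzero(valid)[0]        # sampling instances of the observed signal
samples = x[t]                  # observed signal: series of valid values only
```

The pair `(t, samples)` then feeds the primary spectral estimators, and `valid` is the sampling function used for the empirical correction coefficients.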

Appropriate correction coefficients for the test cases
For the two cases (a) and (c) with purely random sampling and independent sampling instances, the offset of the spectrum can be determined from the mean data rate, and the spectrum can be corrected directly by removing the predicted offset as given in Eqs. (18) resp. (19). More effort is needed to derive the correction coefficients β_k and β'_k for the two cases (b) and (d) with correlation between the sampling intervals. Since all properties of the signal generation including the rules of the sampling process are known, particular bias correction coefficients can be derived analytically. For case (b), β_0 is derived by summation over all possible numbers n of samples, with the normalized minimum time between samples d̄ = d/Δτ. To derive the probabilities P_0(n), also samples before the investigated time unit must be considered, because their delay times influence the probabilities of following samples within the actually investigated time unit. The probability to have no preceding sample affecting the actually investigated time unit is P_e = 1 − αΔτ d̄ (e - empty). The probability to have the actually investigated time unit fully covered by the delay time of a preceding event is P_f = αΔτ max{0, d̄ − 1} (f - full). In this case, no events can occur within the actually investigated time unit. This case occurs only if d̄ > 1. The probability to have the actually investigated time unit partially covered by the delay time from a preceding sample event is P_p = αΔτ min{1, d̄} (p - partial). The probability to have n samples in a time unit consists of the sum of these three cases, yielding

P_0(n) = P_e(n) + P_f(n) + P_p(n),

where P_e(n), P_f(n), and P_p(n) are the probabilities to obtain n samples within the investigated time unit with either no preceding sample to be considered, the time unit fully covered by the delay time of a preceding sample, or partially covered.
The probabilities of these three cases finally are given via the probability to have at least n samples within the (normalized) fraction x of a time unit, with the normalized rate of further samples in the remaining time between the dead time intervals after assumed samples. For the mean number β_k of pairs of samples between two different time units (k ≠ 0), only cross-products between any sample within one time unit and any sample in the other time unit can contribute. Therefore, one sample is assumed at time t_a in one time unit and another sample is assumed at time t_b in a different time unit, which is k time units away from the initial one. Because the autocorrelation is symmetric, only k > 0 is discussed. For k < 0, |k| can be used instead of k to derive all required parameters. The sample at t_a can occur at any time within its time unit with a mean rate of α. Normalization with the time unit Δτ yields t̄_a = t_a/Δτ. Then, the sample at t̄_a occurs with the mean rate of αΔτ. The occurrence of the sample at t̄_b = t_b/Δτ depends on the number n of further samples between t_a and t_b. However, reducing the time to the remaining fraction between all dead times of other samples, the mean rate of the sample at t̄_b is the normalized rate of further samples, as in the case for k = 0 above. The number of further samples n between t_a and t_b and their dead times n·d̄, plus the dead time of the initial sample at t_a, finally leads to the dependence between the sampling instances. Without loss of generality, t_a shall occur in time unit number zero.
The mean number of pairs of samples between the two time units then becomes a summation over P(n, x), the probability to have n further samples within the (normalized) time interval x. From the mean number of pairs of samples β_k, the correction coefficients β'_k (for all k, including k = 0) can be obtained by normalization with (αΔτ)². For case (d) with correlated data gaps in discrete time and with a discrete distribution of sampling intervals, the probability is derived that a valid sample occurs at a given time unit and another one occurs |k| time units away. The probability of having a valid sample at the first instance is αΔt = 0.5 in this simulation. For a given valid sample at the first instance, the probability of having another valid sample at the second instance depends on the number of possible changes n from valid data to invalid data and vice versa. Between the two instances, up to |k| changes can occur with a mean number of changes per time unit of c = 0.2 in this simulation. A change occurs at a given time instance with a probability of cΔt, no change occurs with a probability of 1 − cΔt. An odd number of changes yields an invalid sample at the second instance, while an even number of changes yields a valid sample. The changes can occur at any time instance between 1 and |k|, where the result depends on their number but not on their order. The mean number β_k of pairs of samples then becomes a binomial sum over all even numbers of changes n = 2n′, which finally leads to the correction coefficients β'_k.
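For case (d), the binomial sum over even numbers of changes collapses to a closed form, since Σ_{n even} C(|k|, n) p^n (1 − p)^{|k|−n} = (1 + (1 − 2p)^{|k|}) / 2 with p = cΔt. A sketch under the assumption Δt = 1 tu, so that αΔt = 0.5 and cΔt = 0.2 as in the simulation:

```python
import numpy as np

def beta_prime_gaps(k, alpha=0.5, c=0.2):
    """Correction coefficients beta'_k for correlated data gaps on a regular
    grid (hedged reconstruction for case (d), with dt = 1 assumed).
    A pair of valid samples |k| time units apart requires an even number of
    valid/invalid changes, each occurring independently with probability c
    per time unit; alpha is the validity probability of a sample."""
    p_even = 0.5 * (1.0 + (1.0 - 2.0 * c) ** np.abs(k))
    return p_even / alpha
```

At lag zero this reproduces the independent-outlier value β'_0 = 1/(αΔt) = 2, and β'_k approaches unity for large |k| as the gap correlation decays.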

Results and discussion
In Fig. 3, individual realizations of the signals are shown, taken from the same original simulation of the discrete time series from the moving-average process. Random sampling at instances from continuous time therefore involves continuous interpolation of the signal before sampling. The plots show the valid samples as impulses to better illustrate the different sampling schemes. In Fig. 3a and b, the sampling instances are chosen from continuous time with a continuous distribution of sampling intervals, where in Fig. 3a the sampling is purely random, while in Fig. 3b, a minimum distance between consecutive samples is enforced. In Fig. 3c and d, the sampling is nominally equidistant (with missing samples), yielding a discrete distribution of sampling intervals. While in Fig. 3c, only individual samples (outliers) are taken out independently, longer sequences of missing samples (data gaps) can be identified in Fig. 3d. However, in all four cases, a mean sampling rate of α = 0.5 tu⁻¹ is obtained on average. From the simulated signals, direct spectral estimates are derived, based on the Fourier spectrum and based on Lomb-Scargle's method. In Fig. 4, the mean spectra of the estimates, averaged over 1000 realizations of the simulation, are shown for the four sampling schemes. A significant bias can be identified in all four sampling cases between the estimate of the power spectral density and the spectrum of the simulated process. Note that this estimate, however, is a bias-free estimate of the power spectral density of the observed signal. The deviation between the estimated spectrum and that of the simulated process is a direct result of the irregular sampling. The sampling scheme changes the spectral content of the observed signal in comparison to the spectrum of the process under observation. Therefore, the spectral composition of the primary estimates directly depends on the characteristics of the sampling scheme.
Accordingly, the four averages of the biased primary spectra in Fig. 4a-d also have different spectral characteristics, while no significant differences can be observed between the estimates based on the Fourier spectrum and those based on Lomb-Scargle's method.
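The direct Fourier estimate used as one of the two primary estimators is simply the squared magnitude of the Fourier sum evaluated at the actual sampling times. A minimal sketch (Python; the cosine test signal and all parameter values are illustrative assumptions, not the moving-average simulation of the paper):

```python
import numpy as np

rng = np.random.default_rng(4)

# a cosine at f0, sampled at purely random (Poisson) instances, rate alpha
f0, T, alpha = 0.1, 200.0, 0.5
t = np.cumsum(rng.exponential(1 / alpha, 200))
t = t[t < T]
y = np.cos(2 * np.pi * f0 * t)

# direct Fourier spectrum on an arbitrary frequency grid: no equidistant
# time grid is required, only the actual sampling times t
freqs = np.linspace(0.01, 0.25, 241)
F = np.array([np.sum(y * np.exp(-2j * np.pi * f * t)) for f in freqs])
P = np.abs(F) ** 2 / len(t)

f_peak = freqs[np.argmax(P)]    # should recover f0
```

Lomb-Scargle's method (available, e.g., as scipy.signal.lombscargle) could be evaluated on the same frequency grid; consistent with Fig. 4, the text reports no significant difference between the two estimators.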
From the primary estimates of the power spectral density, the procedures from above are used for bias correction. For random sampling (no correlation) with a continuous distribution of sampling intervals (case a) and for independent individual outliers from equidistant data (case c), the offset of the spectrum is corrected directly as in Eq. (18) and Eq. (19), respectively, using α = 0.5 tu⁻¹. For correlated random sampling with a continuous distribution of sampling intervals and for correlated data gaps with a discrete distribution of sampling intervals, the spectrum is first transformed into the corresponding correlation function. The latter is corrected as in Eq. (9) using the appropriate coefficients β_k, which have been derived according to the models of the different sampling schemes in Eqs. (28)-(30) and (32). Finally, the improved estimates of the correlation function are transformed back into the corresponding spectra for comparison. The results in Fig. 5 show that the bias correction succeeds for all four sampling schemes, yielding bias-free estimates of the power spectral density for the two test cases (c) and (d) with a discrete distribution of sampling intervals. In cases (a) and (b) with a continuous distribution of sampling intervals, the obtained spectra are only approximately bias-free: for the highest frequencies resolved, a small aliasing error remains, leading to slightly increased values. Due to the random sampling in continuous time, this alias has no sharp boundary frequency [4] and is smeared over a certain range of frequencies. This error, however, results from the insufficient extraction of information from the observed process by the sampling scheme, and the bias correction introduced here cannot add this missing information.
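For case (c), the correction principle can be illustrated by elementary counting: a sample always pairs with itself, so lag 0 carries one factor α, while two distinct instances are both valid with probability α². The sketch below (Python; the moving-average surrogate and all helper names are assumptions for illustration, and the division by β_k stands in for the correction of Eq. (9), which is not reproduced here) corrects the autocovariance of the zero-filled observed signal:

```python
import numpy as np

rng = np.random.default_rng(2)
T, alpha = 1 << 15, 0.5

# surrogate for the simulated process: short moving average of white noise
x = np.convolve(rng.standard_normal(T + 4), np.ones(5) / 5, mode='valid')

# case (c): independent outliers; invalid samples are replaced by zero
valid = rng.random(T) < alpha
y = np.where(valid, x, 0.0)

def acov(z, k):
    """Biased sample autocovariance at lag k (inputs are zero-mean here)."""
    return np.dot(z[:T - k], z[k:]) / T

def beta(k):
    # a sample always pairs with itself (beta_0 = alpha); two distinct
    # instances are both valid with probability alpha**2
    return alpha if k == 0 else alpha ** 2

# dividing the biased lag products by beta_k recovers the correlation
# function of the underlying process
R_corrected = [acov(y, k) / beta(k) for k in range(5)]
R_reference = [acov(x, k) for k in range(5)]
```

For the correlated schemes (b) and (d), the same division would instead use the model-based coefficients β_k of Eqs. (28)-(30) and (32).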
In Fig. 6, the correction coefficients β_k are derived directly from the data sets as in Eq. (20), and the corresponding correlation functions are corrected as in Eq. (9) and transformed back into the corresponding spectra for all four sampling schemes. Also in this case, except for the small remaining aliasing error in cases (a) and (b), the correction succeeds for all investigated sampling schemes.
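Estimating the β_k directly from the data amounts to forming lag products of the sampling (indicator) function itself. A sketch for case (d) (Python; variable names are illustrative, and the closed-form comparison follows from the even-changes argument given for case (d) above):

```python
import numpy as np

rng = np.random.default_rng(3)
T, alpha, c = 20_000, 0.5, 0.2

# correlated gaps as in case (d): the valid/invalid state flips with
# probability c per time unit
flips = rng.random(T) < c
valid = (rng.random() < alpha) ^ (np.cumsum(flips) % 2).astype(bool)
s = valid.astype(float)              # sampling (indicator) function

# empirical correction coefficients: mean number of pairs of samples per
# time unit at each lag, taken directly from the data
kmax = 20
beta_hat = np.array([np.dot(s[:T - k], s[k:]) / (T - k) for k in range(kmax)])

# model prediction: rate of valid samples times the probability of an even
# number of state changes over k time units
beta_model = alpha * (1 + (1 - 2 * c) ** np.arange(kmax)) / 2
```

The empirical coefficients converge to the model prediction as the record length grows, which is consistent with the data-driven and model-based corrections in Figs. 5 and 6 succeeding equally well.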

Conclusion
Random sampling of time series causes a systematic error of the spectral estimate compared to that of the observed process. Lomb-Scargle's method for the spectral estimation of irregularly sampled time series, often used as a reference, shows no advantage in this respect over a direct spectral estimation using the Fourier transform. The systematic errors caused by the irregular sampling can be analyzed, predicted, and finally corrected using the methods presented in this paper. The presented methods are not limited to certain sampling models. Beyond the present simulation, this opens the possibility to obtain bias-free direct spectral estimates from irregularly sampled data, independent of the spectral composition of the sampling scheme. The only requirements for the presented correction method are that the irregular sampling is independent of the values of the observed process and that enough pairs of samples can be built for any lag time. If this is given, even static patterns of the irregular sampling function, like periodic drop-outs, can be corrected. Also, data sequences with significant parts missing can still be processed into appropriate spectra or corresponding correlation functions. Therefore, this is the first universal solution for bias-free spectral and correlation estimation covering a broad range of irregular sampling processes. This work is part of a broader effort towards bias-free estimation of correlation functions and spectra from irregularly sampled data. The authors' experience with data processing in laser Doppler applications inspired the enhancement of the methods developed there towards more universality and an extended range of applications. Next steps are measures to reduce the estimation variance, since the conservation of information about the spectral content appears to be less efficient with irregular sampling than with equidistant sampling.
Other methods of spectral analysis, such as quantization of arrival times or slot correlation, known from laser Doppler applications, are also potential candidates for broader use and are the subject of further investigation.