Maximum length binary sequences and spectral power distribution of periodic signals

basic

Since the estimation of spectra and spectral densities is a well-known signal processing technique [13], their application to MLBSs should be straightforward.Yet, there are many discrepancies and inconsistencies in the literature concerning the exact properties of MLBSs like in particular their spectrum and their power spectral density (which are unfortunately often used synonymously), and how these properties are affected by the basic time steps of the sequence and the sampling time, to name only a few aspects.We address these issues by giving clear and consistent definitions with well-defined ranges of validity.
The importance of choosing appropriate parameters for the MLBS excitation signal (according to the expected dynamics of the system that is to be identified) is highlighted, for example, by the occurrence of a system identification error when a PI controller is used [14] that can be mitigated to some extent if the MLBS's order is sufficiently high [14].
In practice, one typically doesn't use the "pure" MLBS (which is a discrete series of numbers) but rather a continuous-time realization (typically generated by means of a zero-order hold (ZOH) process).However, when measurements are performed (e.g., with a digital sampling oscilloscope), these time-continuous signals are converted back to discrete-time data by means of sampling.In our contribution, we will therefore distinguish as precisely as possible between the base MLBS, its time-continuous representation, and a sampled MLBS signal and explicitly elaborate the links between them.
For the characterization of these MLBS signals, Fourier analysis is of course an important technique as for other signals as well, e.g., via a fast Fourier transform (FFT) [15].However, since not only time-discrete and time-continuous signals have to be distinguished but also periodic and non-periodic signals (and even deterministic vs. stochastic signals) [16,17], statements concerning the spectrum of an MLBS signal have to be handled with care.Last but not least, many references work with formulas that are normalized so that a physically meaningful interpretation is difficult.Therefore, we will provide definitions of the different signal processing steps (i.e., transformation into the frequency domain, autocorrelation function, and power spectral density) with special focus on the relevant scaling factors.Afterward, an application of these definitions to the different MLBS representations leads us to transparent interpretations of the results.Specifically, we will discuss the influence of different parameters such as the number of bits of the base MLBS, the base time or the observation time.
The paper is organized as follows: Section 2 gives detailed definitions of terms and formulas to create a solid mathematical basis and lists the main properties of MLBSs that are of major importance for the scope of this paper.
In Sect.3, the discrete Fourier transform (DFT) and the power spectral density (PSD) of MLBSs are calculated, with clear reference to scaling factors that ensure the physical interpretability of these results.
Finally, in Sect.4, the results obtained for the pure discrete-time sequence are extended for a continuous-time realization of an MLBS that is sampled afterward.An example highlights the pitfalls one might encounter designing an MLBS for system identification purposes and how this work helps to circumvent them.The procedures outlined in this last section also allow for a description of the impacts of hardware limitations (finite rise times of the signal-generating devices or time-step restrictions of their internal clocks) on the PSD of the signal.

Definitions and properties of an MLBS
Figure 1 shows an example for a 4-bit linear feedback shift register based on [2].
Each bit a i k can hold the values 0 or 1: a i k ∈ {0, 1} ⊂ R .The upper index i ∈ {0, 1, ..., n − 1} , where n is the total number of bits in the register, denotes the position, and the lower index k ∈ N 0 denotes the time step for the time t = kT base with the so-called base time The state of the whole shift register at time step k is given by the n-tuple After each time step, the values of all bits are shifted one tab to the right, a i k+1 = a i+1 k , for i ∈ {0, 1, ..., n − 2} .Due to the feedback loop, the value of the highest bit a n−1 k+1 at time step k + 1 is calculated as a sum (modulo 2) of the values of some bits at the previous time step k.Which bits are fed back is called the taps and is a crucial property of the shift register that will be discussed later.In the example shown in Fig. 1 with n = 4 , the bits at position 0 and 1 are fed back via an XOR gate, performing the modulo 2 addition.Thus, The output of the shift register at time step k is the value of a 0 k , and the total output is represented by the sequence a 0 k = a 0 0 , a 0 1 , a 0 2 , ..., a 0 p−1 , a 0 0 , a 0 1 , ... .A shift register with a length of n bits represents one of 2 n different states a i k at each time step k.As proven by Golomb [1], the succession of these states is periodic with period 0 < p ≤ 2 n − 1 , since every state is determined by the previous state and there are only a finite number of different states: a i k+p = a i k .The state all zeroes produces a trivial output sequence.(This state will be maintained infinitely.)The term "maximum length" denotes a feedback shift register whose output sequence periodically produces all other 2 n − 1 states.The output has the maximum period p = 2 n − 1 , so whatever the initial state is (except all zeroes), the register will eventually hold every possible initial state.It is therefore sufficient to only consider the initial state all ones and-except explicitly stated otherwise-this will be assumed to be the standard initial state throughout this work: a i 0 = (1, 1, ..., 1).Since the history of every bit a i with i ∈ {0, 1, ..., n − 2} is identical to the history of the previous bit a i+1 with a delay of one time step, for k ≥ n − 1 every bit's state can be traced back to a previous state of a n−1 .This allows for a generalization of Eq. (1) in the form of a linear recurrence [1]: (1) (2) Fig. 1 Example for a 4-bit linear feedback shift register Figure 2 shows the general structure of the LFSR according to Eq. ( 2).
In the following, we assume that the taps have been chosen in a way such that an MLBS is obtained.
An MLBS in the sense of this work is then created by mapping the output of the shift register to −1 and +1 , respectively, via b k = −e jπ a 0 k with the imaginary unit j or simply b k = 2a 0 k − 1 .Note that the superscript 0 of the bit is omitted for the MLBS.The MLBS is represented by the sequence In the literature, the mapping to −1 and +1 is sometimes reversed [20,21], or even both versions of the mapping are used [1].Being deterministic, an MLBS cannot be truly random but it has some "randomness characteristics" as postulated by Golomb [1], accounting for the name pseudorandom noise sequence: R-1 In every period, the disparity between the number of +1 s and the number of −1 s does not exceed 1. R-2 In every period, half the runs (i.e., a sequence of identical values) have length one, one-fourth have length two, and so on.Notably, there are one run of length n of +1 s and one run of length n − 1 of −1s.

R-3
The autocorrelation function R bb [l] (definition: Eq. ( 14)) has only two values: The property R-3 is responsible for the broad spectrum of an MLBS, as will be discussed later.Further properties, especially considering cross-correlations, are given in [20].
The autocorrelation function R bb [l] calculated with Eq. ( 3) is shown in Fig. 4 for a 4-bit and a 7-bit MLBS, respectively.
Since the signal is periodic, so is its autocorrelation function, and the distinct peaks occur at 0 and at multiples of the period p.The scaling factor 1/p in Eq. ( 3) ensures that the peaks have unit height and the negative "floor" decreases to zero as the number of bits increases.Thus, even for a lag of only one single time step, the shifted signal looks almost completely uncorrelated to the original one.That means, although an MLBS is deterministic its autocorrelation function approximates a pure random process.
It is important to note that the MLBS as a sequence {b k } exists independently of any time scale and here we strictly distinguish between sequences and time series.If the MLBS is to be played as a time series, the time steps are given by the above-mentioned base time T base with f base = 1 T base .

Discrete spectrum
The discrete Fourier transform (DFT) [22] can be calculated via where N is the length of the dataset (see Appendix 1.2 for a discussion of the scaling factor 1/N and the notation).For the MLBS, N = p is used.
The total length pT base of the time series determines the frequency resolution via while the maximum frequency that contains information (as per the Nyquist-Shannon sampling theorem) is given by Note that Eq. ( 6) is valid here since p is always odd for an MLBS.
When the length of the dataset matches one full period, N = p , the absolute values |B n | of the DFT of an MLBS sequence {b k } are given by (see Appendix 3 for the derivation).Examples are shown in Fig. 5 for a 4-bit and a 7-bit MLBS, respectively.The scaling of the frequency axis is done by assuming T base = 1 s for both sequences.
First of all, the DFT produces distinct peaks that can be attributed to certain frequencies, and Fig. 5 illustrates the effects of Eqs. ( 5), and ( 6).The higher the number of bits, the better becomes the approximation of Eq. ( 6).Simultaneously, increasing the number of bits results in smaller frequency bins so that the amplitudes also become smaller.
Secondly, the spectrum of an MLBS is flat, every frequency component is equally present (except a near-zero DC value).This property underlines the white noise characteristic of an MLBS.In the literature, however, the reader often finds the statement that the spectrum of an MLBS has a sinc (fT base )-shape and the flatness thus would only apply to the lowest frequencies.This is not correct for an MLBS as a sequence.The error occurs by using the term "MLBS" for the sequence itself and the time series that additionally (4) for n ∈ {1, 2, ..., p − 1} gets filtered by a ZOH process synonymously.The latter will be discussed in the next section.
The property R-1 results in a remaining nonzero mean value of the MLBS and is responsible for the nonzero DC value that occurs in its DFT.The higher the number of bits, the lower this value becomes.However, because of the increased frequency resolution that comes with a longer period, higher bit numbers enable low-frequency system excitation and, hence, more reliable system identification results for low frequencies, as was documented in [14].

Sampled MLBS
Often, the MLBS is used as an (discrete-time) input sequence for an arbitrary waveform generator (AWG) whose (continuous-time) voltage output can approximately be described as the discrete-time MLBS processed by a ZOH element.(Note that this is only a good approximation if the base time is sufficiently larger than the rise time of the AWG and if ringing phenomena can be neglected-both will be assumed in the following.)Any measurement of this continuous-time signal will be sampled, and we denote the time difference between the samples as T sample and the sample frequency as In the following, we assume T base to be constant to evaluate the effects of the other parameters.
A mathematical description of a ZOH-processed continuous-time MLBS is obtained by the convolution of the MLBS with a rectangular pulse of width T base .Since the sampled result is again a discrete sequence, a simpler description of this process, avoiding the continuous-time domain, can be made by assuming that f sample is an integer multiple of f base , f sample = k 0 f base , with k 0 ∈ N , so that the sampled  Comparing this result with Fig. 4, one notices that the peaks occur at k 0 -times the lag for the base MLBS as expected due to the higher number of samples.But there are also smaller side-peaks broadening the distinct peaks of the base MLBS.This again is to be expected: For the application of Eq. ( 14) with N = k 0 p , the sampled MLBS can be regarded as k 0 instances of the base MLBS, with one sampling step between each instance.For lag l = 0 mod k 0 p all of these k 0 base MLBSs overlap.Increasing the lag l results in l less overlapping base MLBSs up to no overlap for k 0 ≤ l ≤ N − k 0 .For N − k 0 + 1 ≤ l ≤ N , the overlap increases again, resulting in Thus, the sampling results in an approximation of triangles instead of the distinct peaks of the base MLBS.While the resolution of this approximation can be increased with k 0 , the negative "floor" remains unaffected (and still depends on the number of bits).
In order to obtain the spectrum of a sampled MLBS, another degree of freedom can be introduced that has been left out for the pure MLBS in the previous section: The sampling/observation time does not need to be restricted to only one full period.However, to circumvent the problems arising from sampling only a fraction of a period [21] we shall restrict ourselves to sampling that matches full periods.Thus, the total length of the dataset is NT sample with N = k 0 n 0 p and the sampling matches a total of n 0 periods ( n 0 ∈ N).
The frequency resolution is then given by which contains Eq. ( 5) as a special case for k 0 = 1 , n 0 = 1 .Likewise, Equation ( 9) can be regarded as a generalized version of Eq. ( 6).
The two degrees of freedom, the sampling frequency (determined by k 0 ) and the number n 0 of periods, thus alter the spectrum of the base MLBS in independent ways: • The frequency resolution f , Eq. ( 8), can be increased by including more periods in the dataset, while it is unaffected by the sampling time.• The maximum frequency f max , Eq. ( 9), on the other hand is unaffected by the number of periods and only increases with the sampling frequency.
To calculate the DFT of a sampled MLBS, the base MLBS needs to be convoluted with a scaled rectangular function to obtain the representation with (defined in [16] with the same definition of the DFT as Eq. ( 20)) of the sampled MLBS.Due to the convolution theorem [16], the absolute values |B n | of the sampled MLBS's DFT are now given by the product of the absolute values of the DFT of the base MLBS (Eq.( 7)) and those of the DFT of that rectangular function.Applying Eq. ( 4) yields and with Eq. ( 7) one obtains for n = 0. Thus, with f = n f = n NT sample = nk 0 NT base , they approximate the | sinc (fT base )|-shape, Figure 8 shows that this is indeed the case, even for a very low sampling rate with k 0 = 2 .The visible deviation for k 0 = 2 can be attributed to the bad approximation sin(πn/N ) ≈ πn/N of the denominator in Eq. ( 10) for low sampling rates k 0 .(Note that N = k 0 n 0 p and that the numerator is independent of k 0 .) Sampling with k 0 = 10 leads to a nearly undisturbed spectrum up to half f max , see Fig. 9.The maximum frequency increases with k 0 according to Eq. ( 9).
The expected | sinc (fT base )|-shape of the amplitudes does not cover the near-zero DC value of the sampled MLBS.
Increasing the number n 0 of periods included in the DFT calculation results in an increased frequency resolution but as shown in Fig. 10 this does not provide additional information.Particularly the very low-frequency region is not "enhanced" (for system identification purposes) by adding more periods because only the number of zeroes is increased.This is in contrast to what the expected | sinc (fT base )|-shape might suggest if one thinks increasing the frequency resolution should lead to a better "fit" to this expected shape.
Until now, the discussion only included amplitude spectra.Unfortunately, in the literature, the amplitude spectra depicted herein are sometimes said to follow a sinc 2 (fT base )-shape (what they don't) and called power spectra (what they aren't).
The raw discrete power spectral density S ββ [n] (not scaled with the frequency reso- lution 1  f nor with an ohmic resistor 1 R 0 -see Eq. ( 28) for a version of the PSD that accounts for this scaling and physical units) of a sampled MLBS is obtained according to Eq. ( 26   However, the discrete PSD will suffer from the same deviation from this expected shape due to the approximation made for the amplitude spectra (Eq.( 10)) if the sampling rate k 0 isn't high enough.
One particular frequency of interest is the one where the PSD drops by −3 dB [7, 10] given by S ββ (f −3 dB ) expected = 1 2 as which corresponds to n −3 dB ≈ 0.443 N k 0 .

Guidelines
For system identification purposes, the user can now follow these guidelines (in accordance with [9]): A Estimate the bandwidth of interest, i.e., estimate the minimum and maximum frequency f BW,min and f BW,max , respectively, that are to be excited.B Equation ( 11) with f −3 dB = f BW,max leads to the necessary T base .C Given T base , Eq. ( 5) with f = f BW,min gives the necessary period p and thus the required number n of bits for the MLBS.D A common advice is to sample 10 times faster than f BW,max [23], f sample = 10f BW,max ≈ 4.43f base , leading to k 0 = 5.E For the number n 0 of periods in the observation, there are different aspects to con- sider, like time constraints for the measurement or the possibility to average over multiple periods for a better signal-to-noise ratio [9][10][11].F Lastly, the amplitude of the MLBS signal has to be chosen.Since high amplitudes offer a better signal-to-noise ratio but can also excite unwanted nonlinear dynamics [9,10], this decision strongly depends on the system itself and might not be set a priori, requiring some iterations.The precise definition of the PSD (28), including the correct scaling factors, allows for a better estimation of the suitable signal amplitude.
Deviations from these guidelines may be necessary due to constraints on the available memory [9].

Application example
Consider, for example, a radio-frequency system whose transfer function shall be identified in the frequency range from f BW,min = 600 kHz to f BW,max = 20 MHz using an arbitrary waveform generator (AWG) to provide an MLBS as broadband excitation signal.In order to reach sufficient signal power at 20 MHz , a base time less than 22.15 ns is necessary (see guideline B), so, e.g., T base = 20 ns is chosen.If the AWG's rise time isn't sufficiently faster, the excitation signal will suffer from distortions.As mentioned above, in the literature, the raw discrete PSD S ββ (f ) sometimes gets confused with the amplitude spectrum B(f), leading to T * base = 0.603/f BW,max ≈ 30 ns and thus to S * ββ (20 MHz) ≈ 0.25 , i.e., only half the intended value.Following the above guideline C, a period p ≥ 84 is required to excite the system down to 600 kHz , leading to a shift register length of n = 7 .In this example, the wrong T * base would lead to p * ≥ 56 and thus to only n * = 6 .So the wrong T * base does not only affect the power at f BW,max but also leads to not being able to identify the system at f BW,min due to the MLBS's period being too short.As discussed above, the necessary number of bits to reach f BW,min cannot be reduced by simply lengthening the excitation signal via recording n 0 > 1 periods.Lastly, when the AWG's output is amplified with a solid-state amplifier to an amplitude of û = 150 V , the average power delivered to a pure resistive load of R 0 = 50 is P = û2 R 0 = 450 W , which is in accordance with Eq. (32).If one would neglect the scaling factor f while calculat- ing the PSD (Eq.( 28)), Eq. (32) would result in P * ≈ 160 W • MHz , which does not only have the wrong unit but also, if one wouldn't pay attention to the unit at all, quickly leads to thinking of P * * ≈ 160 W , underestimating the average power by a factor of almost 3.If the system can handle 500 W, this wrong estimation can lead to the fatal conclusion that increasing the amplitude by a factor of √ 3 ≈ 1.7 for a better signal-to-noise ratio would still be tolerable, ultimately damaging the system.
If hardware limitations like the AWG's rise time or other filtering effects shall be taken into account for the modeling of the excitation signal, the formulas presented herein can be applied to the filtered MLBS in a consistent way by using the convolution theorem.

Summary and conclusion
Clear definitions for MLBS signals were given which distinguish between a base MLBS on the one hand and a sampled MLBS (which is the result of a ZOH process applied to the base MLBS and re-sampling it) on the other hand.Furthermore, signal processing definitions such as the DFT, the autocorrelation function and the power spectral density were introduced with suitable scaling factors in order to allow a meaningful physical interpretation.One conclusion is that the spectrum of any base MLBS is white, whereas the spectrum of any sampled MLBS approximates a sinc function.This resolves contradictory or at least inconsistent statements that can be found in the literature.

Autocorrelation functions
The autocorrelation function of a real-valued time-continuous function x(t) of finite energy is defined as For a real-valued time-continuous function x(t) of finite power (like a bounded periodic function), however, this integral does not exist and one defines x(t + τ )x(t)dt instead.Please note that the division by T leads to a different physical unit and there exists no combined definition for both cases.However, the definition Eq. ( 12) can in principle be applied to a signal of finite energy as well.The resulting autocorrelation function will then be zero, ultimately leading to the average power being zero as wellwhich is indeed the correct value for a finite-energy signal.
For a periodic function with period T 0 , the integral can be evaluated by setting T = n 0 T 0 with n 0 ∈ N: Due to the periodicity, every integral over one full period gives the same result.Thus, the formula simplifies to just one integral over one full period, whose bounds can also be shifted: It follows that the same R xx (τ ) is obtained for an arbitrary observation time T that equals a finite number n 0 of full periods: Equation ( 13) can now be used for the motivation of a corresponding definition for a periodic and time-discrete function.For that purpose, let the sampling be done in N time steps t with T = N t and t k = k t , k ∈ {0, 1, ..., N − 1} .Further define x k = x(t k ) = x(k�t) .The integral can then be approximated by a Riemann sum: Assuming τ = l�t with l ∈ Z finally yields which is regarded as a definition for the discrete case, no longer an approximation.We use brackets instead of parentheses to distinguish the discrete versions of our definitions from the time-continuous ones.
The scaling factor 1 N in Eq. ( 14) ensures that for a large number of samples, i.e., a short sampling time t , the discrete formula properly approximates the continuous-time result from Eq. (13), Note that there are also definitions in the literature which omit this scaling factor [24,25].Motivated by these derivations, we consistently use the following definitions: • T 0 : shortest period of the signal • T = n 0 T 0 : observation time, equals an integer number of periods • f = 1 T : frequency resolution • t = T 0 N 0 = T N : time resolution, equals sampling time • N = n 0 N 0 : total number of samples • N 0 : number of samples per shortest period

Fourier transforms
The Fourier transform of a real-valued time-continuous signal is defined as which is in general meant in the sense of generalized functions/distributions [26].Please note that parameters like the frequency f and the time t are of course not specified for distributions in the mathematical literature.In this work, however, we add them to clarify the physical dependencies.Equation ( 16) can be evaluated as if the integral exists.
For a periodic signal, we define where T represents the observation time and shall match a number n 0 ∈ N of full peri- ods (T = n 0 T 0 ) so that the Fourier coefficients can be obtained via with the frequency resolution f = 1 T .In Eq. ( 18) as well as in further definitions, we write the function x including its parameter t as an argument x(t) for the operator FC f .Although of course only the function itself is the argument, we add the parameter to distinguish time-continuous and timediscrete versions and to visualize the physical meaning.
For a periodic and time-discrete signal, we define the discrete Fourier transform [22] as ( 16) x k e −j2π nk/N .
The scaling factor 1 N in Eq. ( 20) ensures that for a large number of samples, i.e., a short sampling time t , the discrete formula properly approximates the Fourier coeffi- cients (19) of a continuous-time periodic signal.A detailed derivation of Eq. ( 20) can be found in [22].
The zeroth component of the DFT is the DC component of the signal, while the others can be interpreted as one half of the amplitudes of the corresponding spectral component [22].
A special shaping of the spectrum can also be done by means of windowing [13].In this work, however, no windowing will be applied since we are dealing with periodic functions only and the sampling shall match full periods.(For computing purposes, this is equivalent to a rectangular or boxcar window of full period length.) If the sampling does not match full periods and only certain frequencies are of interest, a generalized Goertzel algorithm [27] can be used.

Power spectral densities
In the following, the signals x(t) shall be electrical voltages.In order to obtain correct physical units, our definitions for the power spectral density (PSD) will therefore always include an ohmic resistor R 0 .For other uses of the PSD, where only its shape is relevant, such scaling factors might be omitted or a different normalization might be applied [28].
The Wiener-Khinchin theorem [29,30] states that for the continuous-time case the PSD of a signal x(t) can be obtained by the Fourier transform of its autocorrelation function, According to Eqs. ( 16) and (17), Eq. ( 21) can be evaluated via if the integral exists.We therefore define For a periodic signal, however, the autocorrelation function is periodic as well.Thus, the integral in Eq. ( 22) does not converge and Eq. ( 21) results in distributions [26].Therefore, we use which can be evaluated according to Eq. ( 18) as to define the PSD of a time-continuous periodic signal as R xx (τ ) e −j2π f τ dτ For a discrete-time periodic signal, the PSD can be defined in an analog way: Beginning with which-according to Eq. ( 20)-can be calculated as we define Definition (28) also includes the scaling factor 1 f = T = N t to ensure for the frequencies f n = n f with the frequency resolution f = 1 T = 1 N t if the sampling time t is short enough.This can be proven by approximating the integral in Eq. ( 24) (with shifted bounds) by a Riemann sum using τ = l�t and Eq. ( 15): The discrete version of the PSD is periodic with period N (inherited from the DFT [22]), as can be seen from Eq. ( 27):

Average powers
The average power P of a real-valued time-continuous voltage signal x(t) can be calculated directly as With the definitions of the PSD, the average power can also be calculated as for a time-continuous signal, as (25

Fig. 5
Fig. 5 Absolute values |B n | of the DFT of a 4-bit and a 7-bit MLBS with T base = 1 s for exactly one full period each

Fig. 6 Fig. 7
Fig. 6 Example of a sampled 4-bit MLBS compared to the base MLBS and the ZOH-processed MLBS ) as the DFT of the autocorrelation function R ββ [l] .Since the

Fig. 8 Fig. 9
Fig. 8 Absolute values |B n | of the DFT of a 4-bit MLBS with T base = 1 s for exactly one full period and sampled with different k 0 in comparison with the expected | sinc (fT base )|-shape