Goertzel algorithm generalized to non-integer multiples of fundamental frequency
 Petr Sysel^{1} and
 Pavel Rajmic^{1}
https://doi.org/10.1186/16876180201256
© Sysel and Rajmic; licensee Springer. 2012
Received: 10 May 2011
Accepted: 6 March 2012
Published: 6 March 2012
Abstract
The article deals with the Goertzel algorithm, used to establish the modulus and phase of harmonic components of a signal. The advantages of the Goertzel approach over the DFT and the FFT in cases of a few harmonics of interest are highlighted, with the article providing deeper and more accurate analysis than can be found in the literature, including the memory complexity. But the main emphasis is placed on the generalization of the Goertzel algorithm, which allows us to use it also for frequencies which are not integer multiples of the fundamental frequency. Such an algorithm is derived at the cost of negligibly increasing the computational and memory complexity.
1 Introduction
In the case of discrete-time signals, the discrete Fourier transform (DFT) is widely used for spectral analysis. The frequencies of the harmonics in the DFT always depend on the length of the transform, N, and they are integer multiples of the fundamental frequency $\Delta f=\frac{f_{\mathrm{s}}}{N}$, where f_{s} represents the sampling frequency. Thus, Δf gives the frequency resolution of the DFT. When the transform length N is not a multiple of the signal period, the signal is a sum of harmonic components whose frequencies are not integer multiples of the fundamental frequency. Such components cannot be expressed in the N-point DFT spectrum by a single spectral line; this effect is called "leakage" into the neighboring DFT spectral coefficients, placed at the integer multiples of Δf [1].
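The leakage effect can be illustrated numerically. The following is a minimal sketch, assuming an illustrative sampling frequency of 8000 Hz and a transform length N = 64 (neither value comes from the article); the helper names `dft`, `tone`, and `significant_bins` are hypothetical. It compares a tone placed exactly on a DFT bin with a tone placed halfway between two bins.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

fs = 8000.0         # assumed sampling frequency in Hz (illustrative only)
N = 64              # assumed transform length (illustrative only)
delta_f = fs / N    # frequency resolution of the DFT, here 125 Hz

def tone(freq):
    """N samples of a real cosine at the given frequency."""
    return [math.cos(2 * math.pi * freq * n / fs) for n in range(N)]

def significant_bins(mags):
    """Count bins in the first half of the spectrum above 1% of the peak."""
    half = mags[:N // 2]
    peak = max(half)
    return sum(1 for m in half if m > 0.01 * peak)

# Tone exactly at bin 8: a single spectral line in the first half.
on_bin = [abs(X) for X in dft(tone(8 * delta_f))]
# Tone halfway between bins 8 and 9: energy leaks into many bins.
off_bin = [abs(X) for X in dft(tone(8.5 * delta_f))]

print(significant_bins(on_bin), significant_bins(off_bin))
```

The first count is one spectral line, whereas the off-bin tone spreads over many coefficients, which is the situation the generalized algorithm is meant to address.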
Therefore, the transform length N always needs to be chosen with respect to the desired accuracy of the frequency resolution. The computational complexity of the DFT increases quadratically with the number of samples/frequencies, and thus in practice we use almost exclusively the fast Fourier transform algorithm (FFT), whose computational complexity is linearithmic (linear-logarithmic). When the task is to identify the modulus and/or phase of a single or of just a few of the frequency components, even the FFT is of no advantage, because it always computes all the frequency components, most of which are discarded as being of no interest. In such situations, methods specialized in computing a subset of output frequencies can be exploited with great benefit. Besides the Goertzel algorithm, which deals with single frequencies separately, it is worth mentioning the so-called pruned-FFT [2, 3], which is also connected to the zoom-FFT algorithm [1], and the transform decomposition of Sorensen and Burrus [3], which efficiently combines the ideas of the split-radix FFT and the Goertzel approach. However, the essential disadvantage of rounding frequencies to the nearest integer multiples (and the inaccuracy thus introduced) remains with any of these methods, including the classical version of the Goertzel algorithm.
In Section 2, we first show the derivation of the common Goertzel algorithm in detail. While this may seem superfluous, it will be necessary to refer back to a number of its particular steps in the later sections. Then the computational and memory complexities of the Goertzel algorithm and the FFT are compared. In Section 3, we provide the announced generalization of the Goertzel algorithm, so that it can also be used for non-integer multiples of the fundamental frequency. Such a generalization has been mentioned before, e.g., in [4, 1], but just for the computation of the modulus, not the phase of a harmonic component.
1.1 Example of utilization of Goertzel algorithm—DTMF
The value N = 205 is often used in practice [6], because one of the local minima of the sum of squared relative deviations of the signaling frequencies occurs precisely at this length. In this situation, the deviation is approximately equal to 1.4%, while the transmitter frequency tolerance is 1.8%. Nevertheless, in some applications of the Goertzel algorithm the deviation from the exact frequency can exceed a prescribed tolerance, and thus both the DFT and the Goertzel algorithm would be of little use.
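The size of the rounding error for N = 205 can be checked with a few lines of code. This is a sketch under stated assumptions: the eight signaling frequencies and the 8000 Hz sampling rate are the usual DTMF and telephony values, which are not listed in this section, and the helper names are hypothetical.

```python
# Assumed values (not given in the text above): standard DTMF tone
# frequencies in Hz and the usual telephony sampling rate.
fs = 8000.0
N = 205
dtmf_frequencies = [697, 770, 852, 941, 1209, 1336, 1477, 1633]

def rounded_bin(f):
    """Nearest integer DFT index k for frequency f (in Hz)."""
    return round(f * N / fs)

def relative_deviation(f):
    """Relative error between f and the frequency of its rounded bin."""
    k = rounded_bin(f)
    return abs(k * fs / N - f) / f

deviations = {f: relative_deviation(f) for f in dtmf_frequencies}
worst = max(deviations.values())

for f in dtmf_frequencies:
    print(f, rounded_bin(f), f"{100 * deviations[f]:.2f} %")
```

Under these assumptions the largest relative deviation is slightly below 1.4% and occurs for the 770 Hz tone, which is consistent with the figure quoted above.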
Using the approach presented in this article it is not necessary to round the frequencies at which detection is desired; it is possible to determine the modulus and phase of a component at an arbitrary (even non-integer) frequency. The number of operations and the memory requirements increase only negligibly with this approach.
1.2 Notation
In the following text, we assume a discrete signal x of length N, whose samples can be complex, {x[n]} = {x[0], x[1],..., x[N − 1]}. Symbol k represents the number (index) of the harmonic component in the DFT, thus k ∈ ℕ. However, in the later parts of the text, we will also work with k ∈ ℝ. The unit step signal is denoted by {u[n]}, where u[n] = 1 for n ≥ 0 and u[n] = 0 for n < 0.
2 Standard Goertzel algorithm
2.1 Derivation of standard Goertzel algorithm
where the compact support of the signal {x[n]} is taken into consideration.
for an arbitrary but fixed k = 0,..., N − 1. This means that the required value can be obtained as the output sample at time N of an IIR linear system with the impulse response {h_{ k } [n]}.
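For orientation, the reasoning behind this statement can be sketched as follows. This is a reconstruction of the standard textbook argument in the notation of Section 1.2, not a verbatim copy of the article's numbered equations; in particular, the explicit form of h_{k}[n] is an assumption consistent with the surrounding text.

```latex
\begin{align*}
X[k] &= \sum_{n=0}^{N-1} x[n]\,\mathrm{e}^{-\mathrm{j}2\pi k\frac{n}{N}}
      = \sum_{n=0}^{N-1} x[n]\,\mathrm{e}^{\mathrm{j}2\pi k\frac{N-n}{N}},
      && \text{since } \mathrm{e}^{\mathrm{j}2\pi k}=1 \text{ for integer } k,\\
h_k[n] &= \mathrm{e}^{\mathrm{j}2\pi k\frac{n}{N}}\,u[n],\\
y_k[m] &= \sum_{n=-\infty}^{\infty} x[n]\,h_k[m-n]
\quad\Longrightarrow\quad
y_k[N] = \sum_{n=0}^{N-1} x[n]\,\mathrm{e}^{\mathrm{j}2\pi k\frac{N-n}{N}} = X[k].
\end{align*}
```

The last sum runs only over n = 0,..., N − 1 because x[n] vanishes outside this range.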
The state-space description is advantageous because only the output sample y[N] is of interest. The algorithm iterates the real-number-only system (16) (N + 1) times (beginning with the sample with the time index 0; in the last iteration the input sample x[N] is set to zero). Only in the last step is the output y_{ k } [N] calculated according to (17), using only a single complex multiplication. As mentioned earlier, the value y_{ k } [N] is the desired spectral coefficient X[k].
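The procedure just described can be sketched in Python. This is an illustrative implementation of the well-known second-order Goertzel recursion, not the authors' pseudocode from Figure 3; the function name `goertzel_standard` and the variable names are assumptions.

```python
import cmath
import math

def goertzel_standard(x, k):
    """Return the DFT coefficient X[k] of the sequence x for integer k.

    Sketch of the standard Goertzel recursion:
        s[n] = x[n] + B*s[n-1] - s[n-2],   B = 2*cos(2*pi*k/N)
        y[N] = s[N] - C*s[N-1],            C = exp(-j*2*pi*k/N)
    """
    N = len(x)
    A = 2 * math.pi * k / N
    B = 2 * math.cos(A)          # real feedback coefficient
    C = cmath.exp(-1j * A)       # complex constant used once at the end
    s1 = 0.0                     # state s[n-1]
    s2 = 0.0                     # state s[n-2]
    # N + 1 iterations: the samples x[0..N-1] followed by x[N] = 0.
    for sample in list(x) + [0.0]:
        s0 = sample + B * s1 - s2
        s2, s1 = s1, s0
    # Now s1 = s[N] and s2 = s[N-1]; one complex multiplication remains.
    return s1 - C * s2

# Usage: compare with the direct DFT sum for one bin.
x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
direct = sum(x[n] * cmath.exp(-2j * math.pi * 3 * n / 16) for n in range(16))
print(abs(goertzel_standard(x, 3) - direct))
```

For a real input the loop uses only real arithmetic; the complex constant C enters only in the final line.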
2.2 Comparison of Goertzel algorithm and FFT
2.2.1 Properties

First of all, the Goertzel algorithm is advantageous in situations when only values of a few spectral components are required (as in the DTMF example in Section 1.1), not the whole spectrum. In such a case the algorithm can be significantly faster.

The efficiency of using the FFT algorithm for the computation of DFT components is strongly determined by the signal length N. The most effective case is when N is a power of two. On the contrary, N can be arbitrary in the case of the Goertzel algorithm, and the computational complexity does not vary.

The computation can be initiated at an arbitrary moment, even at the very time of the arrival of the very first input sample; it is not necessary to wait for the whole data block as in the case of the FFT. Thus, the Goertzel algorithm can be less demanding from the viewpoint of the memory capacity and it can perform at a very low latency. Also, the Goertzel algorithm does not need any reordering of input or output data in the bit-reversed order [1].

Finally, as will be shown later in the article, the modulus and phase can also be established for non-integer spectral indexes k, raising the computational effort only negligibly. Therefore, the Goertzel algorithm is convenient in cases when, for some reason, it is required to detect harmonic signals of non-integer frequencies, or signals with a limited number of samples, which causes a decrease of the DFT frequency resolution.
2.2.2 Computational and memory complexities
In the following analysis, operations which can be performed before the first data sample has been received are not considered. Specifically, the constants A, B, C in Figure 3 can be precomputed. The memory performance is handled in a minimalist scenario, i.e., such that it would not be possible to implement the algorithm with fewer storage locations.
The FFT algorithm used with N being a power of two has computational demands proportional to N log_{2}N; the absolute number depends on the particular implementation. Usually the number of real-number operations found in the literature is approximately 6N log_{2}N (taking one complex multiplication as a combination of four multiplications and two additions). When working with real signals, a number of operations can be avoided; however, this comes at the cost of increased complexity of the algorithm, and it is not true that the demands can be reduced by half, as can be read, for example, in [9]. For this reason, we consider the standard "complex" FFT even for real signals.
If we analyze the number of operations of the standard Goertzel algorithm, we realize that for a real input signal, N real multiplications and 2N real additions are performed in the main loop. So the total number of operations is approximately 3N for a single frequency; we omit the small number of operations needed for precomputing $B=2\cos\left(\frac{2\pi k}{N}\right)$, $C=\mathrm{e}^{-\mathrm{j}\frac{2\pi k}{N}}$ and the concluding complex multiplication (one for each frequency k). Thus, if N frequencies were of interest, the Goertzel algorithm would be of quadratic complexity, just like the direct DFT.
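For K frequencies of interest, the Goertzel algorithm thus requires approximately 3NK operations, so it is computationally cheaper than the FFT if 3NK < 6N log_{2}N, i.e., if
$$K<2\log_{2}N, \qquad (18)$$
which represents a more accurate result than, for example, [[8], p. 635], where the sharper inequality K < log_{2}N, based solely on a comparison of the order of magnitude, is presented. Such a result, however, holds only for N being a power of two; otherwise the inequality (18) can be even more favorable for the Goertzel algorithm.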
The formula (18) says that the computation should be faster than the FFT as long as the number of frequencies does not exceed 2 log_{2}N. For example, with a signal of length N = 32 the Goertzel algorithm is preferable if K ≤ 9. In the case of N = 128 Goertzel dominates over the FFT if K ≤ 13.
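The two numerical examples above follow directly from the operation counts 3NK and 6N log_{2}N. A small sketch (the function names are hypothetical, and the counts are the approximate ones used in this section):

```python
import math

def goertzel_ops(N, K):
    """Approximate real operations of the Goertzel algorithm for K frequencies."""
    return 3 * N * K

def fft_ops(N):
    """Approximate real operations of a complex radix-2 FFT of length N."""
    return 6 * N * math.log2(N)

def max_goertzel_frequencies(N):
    """Largest integer K satisfying the strict inequality K < 2*log2(N)."""
    return math.ceil(2 * math.log2(N)) - 1

for N in (32, 128):
    K = max_goertzel_frequencies(N)
    print(N, K, goertzel_ops(N, K) < fft_ops(N))
```

For N = 32 and N = 128 this reproduces the limits K ≤ 9 and K ≤ 13 stated above.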
In fact, the algorithm introduced in [3] can be even more efficient than this. It combines the good properties of both the FFT and the Goertzel algorithm, producing a DFT decomposition similar to the one used in the split-radix FFT. The dominance of the algorithm of Sorensen and Burrus over the FFT is guaranteed even for K < N/2. An experimental comparison of this approach with the Goertzel algorithm showed that the Goertzel algorithm actually performs better than their algorithm when K ≤ 4 or K ≤ 5 for a wide range of N. It should be noted, however, that the algorithm from [3] has to work with a whole data block, and also that the complexity being compared does not include rearrangement of the input data sequence.
Using the FFT algorithm requires a memory space of at least 2N, which contains the real and imaginary parts of the signal samples. Also, the N values of the transformation kernel, sin and cos (so-called twiddle factors), are often precomputed and stored. The FFT calculation itself can be performed with no values being moved in memory (i.e., in-place); however, since the computation cannot start until the last sample of a block of data is received, a buffer of at least 2N in size must be used. In the case of real signals, N memory locations are enough for this buffer. Thus, the overall FFT memory demand is 4N for real signals.
For each considered frequency, the Goertzel algorithm requires: locations for saving two state variables, the real constant B, the real and imaginary parts of the precomputed C, and the real and imaginary parts of the final result. There is no need to implement input buffering, because the computation can be run as the new signal samples arrive. Similarly, the output signal can be overwritten after the last sample has arrived. In many cases it will therefore not be necessary to use buffering at the output side either. The total memory complexity of the Goertzel algorithm is thus 7K positions.
A comparison of (19) and (18) leads to the conclusion that, if we look for a number K for which the Goertzel algorithm dominates over the FFT from both the memory and the computational viewpoints, then: for N ≥ 13 formula (18) is decisive, because for these N it holds that $\frac{4}{7}N>2\log_{2}N$; on the other hand, for N ∈ {2,..., 12} (which is unusual in practice), the decisive formula is (19), because for these N it holds that $\frac{4}{7}N<2\log_{2}N$. Nevertheless, as the difference between the right and the left sides does not exceed 2 in this case, we can conclude, with a small loss of generality, that the comparison of the effectiveness of the two algorithms can be based just on relation (18).
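The threshold N = 13 can be verified numerically. The sketch below assumes that relation (19) expresses the memory comparison 7K < 4N, i.e., K < 4N/7, which is how the memory counts above read; the function names are hypothetical.

```python
import math

def compute_bound(N):
    """Right-hand side of (18): K must stay below 2*log2(N)."""
    return 2 * math.log2(N)

def memory_bound(N):
    """Assumed right-hand side of (19): 7K < 4N, i.e. K < 4N/7."""
    return 4 * N / 7

def decisive_relation(N):
    """Name the stricter of the two bounds for a given N."""
    return "(19)" if memory_bound(N) < compute_bound(N) else "(18)"

for N in (8, 12, 13, 64):
    print(N, round(memory_bound(N), 3), round(compute_bound(N), 3),
          decisive_relation(N))
```

For N = 12 the memory bound (about 6.86) is still below the computational bound (about 7.17), while for N = 13 the order is reversed (about 7.43 versus 7.40).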
3 Generalized Goertzel algorithm
Formula (2) holds for integer-valued k only. In such a case, an integer number of periods of the transformation kernel, $\mathrm{e}^{-\mathrm{j}2\pi k\frac{n}{N}}$, fits into the signal length N. In the case of k ∈ ℝ, formulas (1) and (2) are generally no longer in agreement. (The period of the transformation kernel no longer corresponds to N, hence the standard approach cannot be used.)
In Sections 3.1 and 3.2, we will generalize the algorithm so that it also covers non-integer multiples of the fundamental frequency. The complexity of the novel approach is analyzed in Section 3.3. As shown in Section 3.4, the non-integer case can also be treated by the standard algorithm using a small trick; however, this comes at the cost of increased computational effort.
3.1 Generalizing to non-integer k
where we exploited the compactness of the support of the signal {x[n]}.
Indeed this is so, since the "correction constant", e^{−j2πk}, depends only on the index of the frequency component, which remains constant throughout the computation. The complex constant is equal to one for k ∈ ℤ, which shows that this is indeed a generalization. In fact, the only variation compared to the standard Goertzel algorithm is the multiplication by this constant at the very end of the algorithm.
The constant e^{−j2πk} affects only the phase of the result, not the modulus. Among other things, this means that if only the moduli of the components with non-integer k are of interest, the standard algorithm can be used. Indeed, for example, [4] uses it in this way. In cases when the phase plays a role (the delay of a signal is detected, for example), however, the use of this "correction constant" is necessary. A short remark can be found in [[1], p. 531], describing the possibility of computing the Goertzel results also for non-integer-valued k; however, it misleads the reader in that the phase case is not distinguished at all.
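The role of the correction constant can be demonstrated with a short sketch. It extends the standard recursion by one final multiplication by e^{−j2πk}; the function name `goertzel_general` is hypothetical and the code is an illustration of the idea rather than the authors' implementation.

```python
import cmath
import math

def goertzel_general(x, k):
    """Sketch: sum_n x[n]*exp(-j*2*pi*k*n/N) for real-valued (non-integer) k."""
    N = len(x)
    A = 2 * math.pi * k / N
    B = 2 * math.cos(A)
    C = cmath.exp(-1j * A)
    D = cmath.exp(-2j * math.pi * k)   # correction constant, equals 1 for integer k
    s1 = 0.0
    s2 = 0.0
    for sample in list(x) + [0.0]:     # N + 1 iterations, x[N] = 0
        s0 = sample + B * s1 - s2
        s2, s1 = s1, s0
    y = s1 - C * s2                    # result of the standard algorithm
    return y * D                       # phase correction for non-integer k

# Usage: a non-integer index, compared with the direct sum.
x = [math.cos(0.7 * n + 0.2) for n in range(20)]
k = 3.37
direct = sum(x[n] * cmath.exp(-2j * math.pi * k * n / 20) for n in range(20))
result = goertzel_general(x, k)
print(abs(result - direct))
```

Omitting the multiplication by D leaves the modulus unchanged but shifts the phase, in line with the remark above.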
3.2 Reducing number of iterations
It will be shown in this section that the last iteration of the Goertzel algorithm can be substituted by merely a single complex multiplication, instead of performing it in the usual manner.
This means that the very last iteration of the traditional Goertzel algorithm can be replaced by a simple multiplication by $\mathrm{e}^{\mathrm{j}2\pi \frac{k}{N}}$. Relation (30) holds for y_{ k } [N] and y_{ k } [N − 1] due to the limited support of x[n]. Nothing similar, however, holds for samples y_{ k } [N − 1] and y_{ k } [N − 2], due to the term u[·].
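A possible way to exploit this is sketched below: the loop runs over the N available samples only, and the missing iteration is folded, together with the correction constant from Section 3.1, into a single constant e^{−j2πk(N−1)/N}. The function name `goertzel_shortened` and the way the constants are merged are assumptions made for illustration.

```python
import cmath
import math

def goertzel_shortened(x, k):
    """Sketch of the shortened generalized algorithm (N loop iterations)."""
    N = len(x)
    A = 2 * math.pi * k / N
    B = 2 * math.cos(A)
    C = cmath.exp(-1j * A)
    # exp(j*2*pi*k/N) replaces the last iteration, exp(-j*2*pi*k) is the
    # correction constant; their product is exp(-j*2*pi*k*(N-1)/N).
    D = cmath.exp(-1j * A * (N - 1))
    s1 = 0.0
    s2 = 0.0
    for sample in x:                   # only N iterations, no zero sample
        s0 = sample + B * s1 - s2
        s2, s1 = s1, s0
    y_prev = s1 - C * s2               # y[N-1]
    return y_prev * D

# Usage: compare with the direct sum at a non-integer index.
x = [0.3 * n - math.sin(0.9 * n) for n in range(25)]
k = 4.6
direct = sum(x[n] * cmath.exp(-2j * math.pi * k * n / 25) for n in range(25))
print(abs(goertzel_shortened(x, k) - direct))
```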
3.3 Computational and memory complexities
The computational complexity of the generalized Goertzel algorithm described in Section 3.1 (without the shortening in Section 3.2) grows by one complex multiplication (i.e., four real multiplications and two real additions) compared to the traditional approach. The memory requirements increase by two positions, which contain the real and imaginary parts of the correction constant e^{−j2πk}.
Although saving one iteration in the main loop according to Section 3.2 results in lowering the computational effort by two additions and one multiplication, the need for the final complex multiplication cancels such a benefit. This means that there is no advantage in shortening the main loop in the case of integer-valued k; in such a case the traditional algorithm as defined in Section 2 is the most efficient one.
However, in the case of non-integer-valued k the iteration reduction does make sense, since joining the correction constants into a single one (31) leads to the overall growth of computational complexity by three real multiplications (it would be four real multiplications and two real additions if the reduction were not exploited). Considering the memory, such a case requires two more positions for the real and imaginary parts of (31), compared to the standard algorithm.
It is evident that the computational and memory complexities of the generalized case are only negligibly greater. The main advantage of shortening the loop according to Section 3.2 can be seen in that, for example, in continuous operation, it is not necessary to perform the last iteration and it is possible to start processing the input sample x[N] in the time spared.
3.4 Yet another approach utilizing standard Goertzel algorithm
It will be shown that, by a trick, the computation required for k ∈ ℝ can be transformed into an integer-valued problem, where the standard Goertzel algorithm can be utilized, so no modifications are needed. However, this comes at the cost of a computational complexity which is even greater than that of the generalized Goertzel algorithm (Figure 4).
whose right side is a usual DFT of signal $\widehat{x}$ (which is complex!) and thus can be computed by the standard Goertzel algorithm.
Regarding the memory complexity, in the case of real-time processing it is of advantage to precompute and store the signal $\left\{\mathrm{e}^{-\mathrm{j}2\pi \widehat{k}\frac{n}{N}}\right\}$. It is complex and therefore requires 2N memory locations. Furthermore, in contrast to the traditional algorithm, we also need 2N positions to store $\widehat{x}$, which is complex (instead of N in the traditional, real case).
To compute $\widehat{x}$ we need 2N real multiplications. In the standard Goertzel algorithm (Figure 3) the N iterations of the main loop work with real numbers only (supposing the input signal to be real). In the approach presented above, the number of real operations is doubled.
It is therefore clear that the generalized algorithm according to Figure 4 beats the above alternative approach from both the computational and the memory viewpoints.
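The alternative approach can be sketched as follows. Since the defining equations are not reproduced above, the code rests on an assumption: k is split as k = k_int + k̂ with k_int = round(k), the signal is modulated by e^{−j2πk̂n/N} to obtain the complex signal x̂, and the standard Goertzel recursion is then run at the integer index k_int. The function name and the particular split are hypothetical.

```python
import cmath
import math

def goertzel_via_modulation(x, k):
    """Sketch: non-integer k handled by modulation plus the standard algorithm."""
    N = len(x)
    k_int = round(k)                   # assumed split: nearest integer index
    k_hat = k - k_int                  # remaining fractional part
    # Complex modulated signal x_hat[n] = x[n] * exp(-j*2*pi*k_hat*n/N).
    x_hat = [x[n] * cmath.exp(-2j * math.pi * k_hat * n / N) for n in range(N)]
    # Standard Goertzel recursion at the integer index k_int; the state
    # variables are now complex, which doubles the real operations.
    A = 2 * math.pi * k_int / N
    B = 2 * math.cos(A)
    C = cmath.exp(-1j * A)
    s1 = 0j
    s2 = 0j
    for sample in x_hat + [0j]:        # N + 1 iterations, x_hat[N] = 0
        s0 = sample + B * s1 - s2
        s2, s1 = s1, s0
    return s1 - C * s2

# Usage: compare with the direct sum at a non-integer index.
x = [math.cos(0.4 * n) + 0.1 * n for n in range(18)]
k = 5.3
direct = sum(x[n] * cmath.exp(-2j * math.pi * k * n / 18) for n in range(18))
print(abs(goertzel_via_modulation(x, k) - direct))
```

The extra list `x_hat` and the complex state variables make the additional memory and arithmetic of this variant visible in the code.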
4 Software
Two Matlab functions are available for download at URL [10].
The function named goertzel_classic.m realizes the standard Goertzel algorithm for k ∈ ℤ; the generalized (and shortened) algorithm for k ∈ ℝ is implemented in the function goertzel_general_shortened.m. The structure of the functions corresponds to the pseudocode in Figures 3 and 4. Indexing of vector elements, however, starts with "1" in Matlab, which differs from our theoretical description, where it starts with "0".
5 Conclusion
The article presented a generalization of the Goertzel algorithm. The novel approach allows us to employ non-integer multiples of the fundamental frequency as well, making it possible to compute the discrete-time Fourier transform (DTFT) in this way. The main advantage is that in various applications where the Goertzel algorithm is utilized, it is no longer necessary to round the desired frequencies, and more accurate results are thus obtained. The article shows that this is reached at the cost of only a negligible rise in computational and memory complexities. Furthermore, it has been shown that the very last iteration of the algorithm can be substituted with a multiplication, which is slightly more efficient.
Declarations
Acknowledgements
This work was supported by projects of the Czech Ministry of Education, Youth and Sports MSM0021630513, the Czech Ministry of Industry and Trade FRTI2/220, and the Czech Science Foundation 102/09/1846.
References
 1. Lyons RG: Understanding Digital Signal Processing. 2nd edition. Prentice Hall PTR, NJ; 2004.
 2. Duhamel P, Vetterli M: Fast Fourier transforms: A tutorial review and a state of the art. Signal Process 1990, 19: 259. 10.1016/01651684(90)90158U
 3. Sorensen H, Burrus C: Efficient computation of the DFT with only a subset of input or output points. IEEE Trans Signal Process 1993, 41(3):1184. 10.1109/78.205723
 4. Gay SL, Hartung J, Smith GL: Algorithms for Multi-Channel DTMF Detection for the WE DSP32 Family. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Glasgow; 1989:1134-1137.
 5. Q.23, Technical Features of Push-Button Telephone Sets. ITU-T, Geneva; 1988.
 6. Mock P: Add DTMF generation and decoding to DSP-μP designs. Digital Signal Processing Applications with the TMS320 Family. Prentice-Hall, NJ; 1987, 1: 543-557. [http://www.ti.com/lit/an/spra168/spra168.pdf]
 7. Goertzel G: An algorithm for the evaluation of finite trigonometric series. Am Math Monthly 1958, 65(1):34. 10.2307/2310304
 8. Oppenheim AV, Schafer RW, Buck JR: Discrete-Time Signal Processing. 2nd edition. Prentice-Hall, NJ; 1998.
 9. Wikipedia contributors: Wikipedia: the Free Encyclopedia. Wikipedia Foundation, St. Petersburg, Florida; 2010. 29. 6. 2005, 19. 1. 2010 [cit. 6. 4. 2010] [http://en.wikipedia.org/wiki/Goertzel_algorithm]
 10. Rajmic P: Matlab codes for the generalized Goertzel algorithm (2012). [http://www.mathworks.com/matlabcentral/fileexchange/35103]
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.