2.1 Base-2 DIT-FFT algorithm
The Fast Fourier Transform (FFT) is an efficient algorithm for computing the Discrete Fourier Transform (DFT). The DFT is a standard tool for analyzing and processing signals and systems: it moves the analysis to the frequency domain, where complex signals and systems can be examined more conveniently and intuitively.
According to the definition, the discrete Fourier transform (DFT) of a sequence \(x\left( n \right)\) of length N can be expressed as:
$$X\left( k \right) = {\text{DFT}}\left[ {x\left( n \right)} \right] = \mathop \sum \limits_{n = 0}^{N - 1} x\left( n \right)W_{N}^{kn} ,\quad 0 \le k \le N - 1$$
(1)
where \(W_{N}^{kn} = e^{{ - \frac{j2\pi nk}{N}}}\) is the rotation (twiddle) factor.
From Eq. (1), it can be seen that the direct calculation of a single value \(X\left( k \right)\) involves \(N\) complex multiplications and \(N - 1\) complex additions. The \(N\) complex multiplications decompose into \(4N\) real multiplications and \(2N\) real additions, and the \(N - 1\) complex additions decompose into \(2\left( {N - 1} \right)\) real additions. Therefore, each value \(X\left( k \right)\) requires \(4N\) real multiplications and \(4N - 2\) real additions. Computing a complete DFT of length \(N\) thus requires \(N^{2}\) complex multiplications and \(N^{2} - N\) complex additions, corresponding to \(4N^{2}\) real multiplications and \(4N^{2} - 2N\) real additions. When \(N\) is large, this quadratic cost makes a direct hardware implementation difficult, so the DFT calculation must be restructured to reduce the number of operations.
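The operation counts above can be made concrete with a minimal Python sketch of the direct evaluation of Eq. (1). The function names `dft_direct` and `real_op_counts` are illustrative and do not come from the paper; the second function simply evaluates the closed-form counts quoted in the text.

```python
import cmath

def dft_direct(x):
    """Direct evaluation of Eq. (1): X(k) = sum_n x(n) * W_N^{kn}."""
    N = len(x)
    X = []
    for k in range(N):
        acc = 0j
        for n in range(N):
            # W_N^{kn} = exp(-j*2*pi*n*k/N); one complex multiply per term
            acc += x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
        X.append(acc)
    return X

def real_op_counts(N):
    """Real multiplications and additions of an N-point direct DFT."""
    real_mults = 4 * N * N          # N^2 complex multiplications, 4 real each
    real_adds = 4 * N * N - 2 * N   # 2 per complex multiply + 2 per complex add
    return real_mults, real_adds
```

For example, `real_op_counts(1024)` shows that a 1024-point direct DFT already needs more than four million real multiplications, which illustrates why the quadratic growth is a problem in hardware.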
By exploiting the inherent properties of the rotation (twiddle) factors, namely symmetry, periodicity, and reducibility, the computation of the DFT can be greatly reduced. These properties allow a long DFT to be decomposed into shorter DFTs: the input sequence is split repeatedly into shorter sub-sequences according to its even-indexed and odd-indexed samples. This approach is known as the decimation-in-time FFT algorithm (base-2 DIT-FFT, also called radix-2 DIT-FFT), or the Cooley-Tukey algorithm.
The base-2 DIT-FFT formula is given here without derivation, as shown in Eq. (2):
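The text does not spell out the three properties, so the following Python sketch checks them numerically in the form in which they are usually stated in DSP textbooks (an assumption on our part): periodicity \(W_{N}^{k + N} = W_{N}^{k}\), symmetry \(W_{N}^{k + N/2} = - W_{N}^{k}\), and reducibility \(W_{N}^{2k} = W_{N/2}^{k}\). The helper names are illustrative.

```python
import cmath

def W(N, k):
    """Rotation (twiddle) factor W_N^k = exp(-j*2*pi*k/N)."""
    return cmath.exp(-2j * cmath.pi * k / N)

def check_twiddle_properties(N, k, tol=1e-12):
    """Report which of the three properties hold numerically (N even)."""
    return {
        # W_N^{k+N} = W_N^k
        "periodicity": abs(W(N, k + N) - W(N, k)) < tol,
        # W_N^{k+N/2} = -W_N^k
        "symmetry": abs(W(N, k + N // 2) + W(N, k)) < tol,
        # W_N^{2k} = W_{N/2}^k
        "reducibility": abs(W(N, 2 * k) - W(N // 2, k)) < tol,
    }
```

Symmetry is what lets one product \(W_{N}^{k} X_{2}\) serve two outputs, and reducibility is what turns the even-indexed and odd-indexed sums into DFTs of half the length.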
$$\left\{ {\begin{array}{*{20}l} {X\left( k \right) = X_{1} \left( k \right) + W_{N}^{k} X_{2} \left( k \right)} \hfill \\ {X\left( {k + \frac{N}{2}} \right) = X_{1} \left( k \right) - W_{N}^{k} X_{2} \left( k \right)} \hfill \\ \end{array} } \right.\quad \quad k = 0,1, \ldots ,\frac{N}{2} - 1$$
(2)
where
$$\left\{ {\begin{array}{*{20}l} {X_{1} \left( k \right) = \mathop \sum \limits_{n = 0}^{{\frac{N}{2} - 1}} x\left( {2n} \right)W_{N/2}^{nk} } \hfill \\ {X_{2} \left( k \right) = \mathop \sum \limits_{n = 0}^{{\frac{N}{2} - 1}} x\left( {2n + 1} \right)W_{N/2}^{nk} } \hfill \\ \end{array} } \right.\quad \quad k = 0,1, \ldots ,\frac{N}{2} - 1$$
(3)
Here, \(X_{1} \left( k \right)\) and \(X_{2} \left( k \right)\) are the N/2-point DFTs of the even-indexed and odd-indexed subsequences of \(x\left( n \right)\), respectively. Eq. (2) describes the base-2 butterfly operation of the DIT algorithm, as shown in Fig. 1. The FFT algorithm is implemented by iterating the butterfly operation, and its corresponding implementation circuit is shown in Fig. 2. Each butterfly requires one multiplication by a complex rotation factor, one addition, and one subtraction. Base-2 DIT-FFT decomposes the N-point sequence \(x\left( n \right)\) into an even-indexed subsequence and an odd-indexed subsequence, applies the same odd-even decomposition to each of them, and repeats the operation until no further decomposition is possible, which requires N to be an integer power of 2. The corresponding 8-point base-2 DIT-FFT iteration process is shown in Fig. 1. The 8-point base-2 DIT-FFT data flow diagram is shown in Fig. 2; it has 3 stages of 4 butterflies each, for a total of \(\frac{N}{2}\log_{2} N = 12\) butterfly units. Note that the input is in bit-reversed order and the output is in natural order.
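The staged butterfly structure can be sketched in Python as follows. This is an illustrative software model, not the paper's hardware architecture: the names `bit_reverse` and `fft_dit_radix2` are our own, and the butterfly counter is added only so that the stage structure can be inspected.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_dit_radix2(x):
    """Iterative DIT-FFT of Eq. (2); returns (spectrum, butterfly count)."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("N must be an integer power of 2")
    stages = N.bit_length() - 1
    # Load the input in bit-reversed order; the output comes out in natural order.
    a = [complex(x[bit_reverse(i, stages)]) for i in range(N)]
    butterflies = 0
    half = 1
    for _ in range(stages):
        span = 2 * half                      # DFT length produced by this stage
        for start in range(0, N, span):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / span)   # W_span^k
                t = w * a[start + k + half]  # rotation-factor multiplication
                u = a[start + k]
                a[start + k] = u + t         # X(k)
                a[start + k + half] = u - t  # X(k + span/2)
                butterflies += 1
        half = span
    return a, butterflies
```

Calling `fft_dit_radix2(list(range(8)))` returns the 8-point spectrum together with the number of butterflies executed, which can be compared against the data flow diagram.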
2.2 Complex multiplication based on CORDIC algorithm
The CORDIC algorithm is widely used, and in [27] it is extended to compute a set of arithmetic functions, including multiplication, division, sine, cosine, arctangent, and hyperbolic functions. In this paper, complex multiplication is implemented with the CORDIC algorithm using only shift, addition, and subtraction operations, so no dedicated multipliers or embedded functional blocks are required. Using CORDIC in the proposed architecture eliminates both the complex multiplier and the memory blocks needed to store the rotation factor values. This CORDIC-based complex multiplication scheme enables the proposed FFT processor to improve its processing speed and save considerable hardware resources.
The key operation of the FFT algorithm is the multiplication of an operand by the rotation factor, \(W_{N}^{k} x\left( n \right)\), which is essentially a rotation of \(x\left( n \right)\) in the complex plane through an angle of magnitude \(\theta = \frac{2\pi k}{N}\). As shown in Fig. 3, when the vector \(\left( {x_{i} ,y_{i} } \right)\) is rotated by an angle \(\theta\), its new coordinates can be expressed as:
$$\begin{aligned} & x_{i + 1} = x_{i} \cos \theta - y_{i} \sin \theta \\ & y_{i + 1} = y_{i} \cos \theta + x_{i} \sin \theta \\ \end{aligned}$$
(4)
Factoring out \(\cos \theta\), Eq. (4) can be expressed as:
$$\begin{aligned} & x_{i + 1} = \cos \theta (x_{i} - y_{i} \tan \theta ) \\ & y_{i + 1} = \cos \theta (y_{i} + x_{i} \tan \theta ) \\ \end{aligned}$$
(5)
From Eq. (5), it can be seen that \(\cos \theta\) only scales the magnitude of the vector. If the \(\cos \theta\) factor is dropped, the operation is called a pseudo-rotation, as shown in Fig. 3b. As shown in Fig. 3c, the rotation angle \(\theta\) of a pseudo-rotation can be decomposed into a sum of a series of small angles:
$$\theta = \mathop \sum \limits_{i = 0}^{\infty } \theta_{i}$$
(6)
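Each small angle is restricted so that \(\tan \theta_{i} = d_{i} 2^{ - i}\), where \(d_{i} \in \left\{ { - 1,1} \right\}\) gives the direction of the i-th rotation; the multiplication by \(\tan \theta_{i}\) then reduces to a binary shift.
Introducing an angle accumulator \(z\) with \(z_{0} = \theta\), we define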
$$z_{i + 1} = z_{i} - d_{i} \arctan \left( {2^{ - i} } \right)$$
(7)
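The angle update of Eq. (7) can be illustrated with a short Python sketch that selects the directions \(d_{i}\) greedily from the sign of the remaining angle. Taking \(d_{i} = + 1\) when \(z_{i} \ge 0\) and \(- 1\) otherwise is the usual rotation-mode rule; the function name and the default iteration count are assumptions made for illustration.

```python
import math

def decompose_angle(theta, iterations=12):
    """Apply Eq. (7) repeatedly; return the directions d_i and the residual z_n."""
    z = theta
    signs = []
    for i in range(iterations):
        d = 1 if z >= 0 else -1          # rotate towards zero residual
        signs.append(d)
        z -= d * math.atan(2.0 ** -i)    # z_{i+1} = z_i - d_i * arctan(2^-i)
    return signs, z

# Example: decompose 30 degrees and rebuild it from the signed micro-angles.
signs, residual = decompose_angle(math.pi / 6)
approx = sum(d * math.atan(2.0 ** -i) for i, d in enumerate(signs))
```

Within the convergence range of the method (roughly \(\left| \theta \right| \le 1.74\) rad), the residual after n iterations is bounded by \(\arctan \left( {2^{ - (n - 1)} } \right)\), so each extra iteration adds about one bit of angular accuracy.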
Combining Eqs. (5) to (7), the rotation-mode iteration of the CORDIC algorithm can be expressed as
$$\begin{aligned} & x_{i + 1} = x_{i} - d_{i} y_{i} 2^{ - i} \\ & y_{i + 1} = y_{i} + d_{i} x_{i} 2^{ - i} \\ & z_{i + 1} = z_{i} - d_{i} \arctan \left( {2^{ - i} } \right) \\ & d_{i} = \left\{ {\begin{array}{*{20}l} { + 1,} \hfill & {z_{i} \ge 0} \hfill \\ { - 1,} \hfill & {z_{i} < 0} \hfill \\ \end{array} } \right. \\ \end{aligned}$$
(8)
After \(n\) iterations, the result \(\left( {x_{n} ,y_{n} } \right)\) is scaled once by \(K = \mathop \prod \limits_{i = 0}^{n - 1} \frac{1}{{\sqrt {1 + 2^{ - 2i} } }}\), the modulus (gain) compensation factor, which restores the \(\cos \theta_{i}\) factors dropped by the pseudo-rotations. When the number of iterations is large enough, \(K \approx 0.60725\). From Eq. (8), it can be seen that complex multiplication can be achieved by shifting, adding, and subtracting.
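As a rough software illustration of rotation-mode CORDIC, the following Python sketch uses floating-point arithmetic, so the factor `2.0 ** -i` stands in for a hardware right shift and `math.atan` for a small lookup table. It applies the compensation factor once after the last iteration, which is the usual convention. The function names are our own, `twiddle_multiply` is a hypothetical helper, and the sketch assumes the angle lies inside the convergence range of about ±1.74 rad; larger angles would need an initial quadrant correction that is not shown.

```python
import math

def cordic_gain(iterations):
    """Compensation factor K = prod_{i<n} 1 / sqrt(1 + 2^(-2i))."""
    K = 1.0
    for i in range(iterations):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return K

def cordic_rotate(x, y, theta, iterations=16):
    """Rotate (x, y) by theta radians; assumes |theta| is below about 1.74."""
    z = theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i              # right shift by i bits in fixed point
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)      # table entry arctan(2^-i) in hardware
    K = cordic_gain(iterations)        # applied once, after all iterations
    return K * x, K * y

def twiddle_multiply(re, im, k, N, iterations=16):
    """Hypothetical helper: (re + j*im) * W_N^k, a rotation by -2*pi*k/N."""
    return cordic_rotate(re, im, -2.0 * math.pi * k / N, iterations)
```

For instance, `twiddle_multiply(1.0, 0.0, 1, 8)` approximates \(W_{8}^{1}\), and `cordic_gain(16)` can be compared with the limiting value of \(K\) quoted above.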