# 4k-point FFT algorithms based on optimized twiddle factor multiplication for FPGAs

TL;DR: It is shown that there is a trade-off between twiddle factor memory complexity and switching activity in the introduced algorithms.

Abstract: In this paper, we propose higher-point FFT (fast Fourier transform) algorithms for a single delay feedback pipelined FFT architecture, considering the 4096-point FFT. These algorithms differ from each other in terms of twiddle factor multiplication. A comparison of twiddle factor multiplication complexity is presented for all proposed algorithms when implemented on Field-Programmable Gate Arrays (FPGAs). We also discuss the design criteria of the twiddle factor multiplication. Finally, it is shown that there is a trade-off between twiddle factor memory complexity and switching activity in the introduced algorithms.

## Summary (2 min read)

### Introduction

- Computation of the discrete Fourier transform (DFT) and inverse DFT is used in, e.g., orthogonal frequency-division multiplexing (OFDM) communication systems, Digital Video Broadcasting (DVB), and spectrometers.
- Also, many different architectures to efficiently map the FFT algorithm to hardware have been proposed [1].
- Low power can be achieved by either reducing the switching activity or resource utilization.
- Also discussed are the design criteria for the proposed algorithms on the basis of implementation of twiddle factor multiplication.
- In Section III the authors discuss the design criteria of the algorithms.

### II. BINARY TREE REPRESENTATION OF COOLEY-TUKEY ALGORITHM

- Typically, the P-point and Q-point DFTs (obtained by decomposing an N-point DFT with N = P·Q) are again divided into smaller DFTs.
- An efficient representation of algorithms of this type is the binary tree representation [7].
- FFT algorithms are categorized by the way the Cooley-Tukey recursive decomposition is applied.
- The radix-2^i algorithms have simple radix-2 butterfly operations, and their twiddle factor multiplications depend on the value of i.
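The decomposition described above can be sketched in a few lines. The following is a minimal illustration of one Cooley-Tukey step for N = P·Q, not the paper's hardware architecture; the function names `dft` and `cooley_tukey_step` and the index mapping (n = Q·n1 + n2, k = k1 + P·k2) are assumptions chosen for this example.

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT, used here both as building block and as reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def cooley_tukey_step(x, p, q):
    """One Cooley-Tukey step: an N-point DFT (N = p*q) built from
    p-point DFTs, twiddle factor multiplications, and q-point DFTs."""
    n = p * q
    assert len(x) == n
    # Inner p-point DFTs over n1, one per residue n2: cols[n2][k1].
    cols = [dft([x[q * n1 + n2] for n1 in range(p)]) for n2 in range(q)]
    out = [0j] * n
    for k1 in range(p):
        # Twiddle factor multiplication by W_N^(n2*k1) between the two stages.
        row = [cols[n2][k1] * cmath.exp(-2j * cmath.pi * n2 * k1 / n)
               for n2 in range(q)]
        # Outer q-point DFT over n2 yields outputs X[k1 + p*k2].
        for k2, value in enumerate(dft(row)):
            out[k1 + p * k2] = value
    return out
```

Applying the same step recursively to the P-point and Q-point DFTs gives the binary tree of decompositions; different trees lead to different twiddle factor multiplications between stages.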

### III. CRITERIA FOR ALGORITHM SELECTION

- Choosing the algorithm selection criteria is the most important step in designing a low-power FFT algorithm.
- Twiddle factor multiplication is one of the major power contributors in the single delay feedback pipelined FFT architecture.
- Twiddle factor multiplication requires both memory and a complex multiplier, which consume more power and more area.

### A. Complexity of W_N Multiplier

- The simplest approach is to use a large look-up table to store the twiddle factors.
- Note that this scheme may store the same twiddle factor in several positions, because the mapping is from row to twiddle factor, and for radix-2^i algorithms some twiddle factors appear more than once for i ≥.
- This can easily be realized using a multiplexer that selects between the input and the output of a constant multiplier with coefficient sin(π/4).
- The constant multiplier can be realized using a minimum number of adders using the method in [14].
- This twiddle factor multiplication can be implemented with dedicated constant multipliers for sin(π/8), cos(π/8), and sin(π/4), together with some control logic. [5] proposed a W_16 multiplier based on trigonometric identities, implemented with the constant coefficients sin(π/8) and cos(π/8).
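To make the multiplexer-plus-constant-multiplier idea concrete, here is a small behavioural sketch of a W_8 multiplication that uses only the constant sin(π/4), additions, and trivial rotations by -j. It is an illustration under assumptions, not the paper's datapath: the function name `mul_w8` and the ordering of the two steps are hypothetical, and floating point stands in for the fixed-point arithmetic a hardware design would use.

```python
import math

SIN_PI_4 = math.sin(math.pi / 4)  # the only non-trivial constant needed for W_8

def mul_w8(z, k):
    """Multiply z by W_8^k = exp(-2j*pi*k/8) without a general complex multiplier."""
    a, b = z.real, z.imag
    if k % 2 == 1:
        # "Multiplexer" selects the constant-multiplier path for odd k:
        # (a + jb) * (1 - j) * sin(pi/4) = sin(pi/4)*(a + b) + j*sin(pi/4)*(b - a)
        a, b = SIN_PI_4 * (a + b), SIN_PI_4 * (b - a)
    for _ in range((k % 8) // 2):
        # Trivial rotation by -j: swap real/imaginary parts and negate.
        a, b = b, -a
    return complex(a, b)
```

For even k the input bypasses the constant multiplier entirely, which is why a W_8 stage is much cheaper than a stage that needs a full look-up table and complex multiplier.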

### B. Switching activity

- Switching activity between two successive coefficients fed to the complex multiplier affects the power consumption.
- In [17], the equivalent radix-2^2 algorithm with low switching activity was proposed.
- The different decompositions of the 64-point FFT block are shown in Fig. 4, and the switching activity is tabulated in Table II.
- Cases II and IV have the same twiddle factor complexity, but case II has lower switching activity.
- Proposed architectures can be formulated with eq.

### V. RESULTS

- The authors have analyzed the complexity and switching activity of twiddle factor multiplications.
- The architectures of the twiddle factor multiplication have been coded in VHDL.
- The resulting complexity for each stage is illustrated in Table V.
- The switching activity between successive coefficients fed to the complex multiplier is defined in terms of the Hamming distance for each coefficient transition.
- Low-power design is a trade-off between these parameters.
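The Hamming-distance measure of switching activity can be sketched as follows. This is a minimal model under stated assumptions: the 16-bit two's-complement quantization, the packing of real and imaginary parts into one word, and the names `quantize`, `twiddle_word`, and `switching_activity` are all hypothetical choices for illustration and are not taken from the paper.

```python
import cmath

def quantize(value, bits):
    """Map a value in [-1, 1] to a two's-complement word of the given width."""
    scale = (1 << (bits - 1)) - 1
    return round(value * scale) & ((1 << bits) - 1)

def twiddle_word(n, k, bits=16):
    """Bit pattern of W_n^k with real and imaginary parts packed side by side."""
    w = cmath.exp(-2j * cmath.pi * k / n)
    return (quantize(w.real, bits) << bits) | quantize(w.imag, bits)

def switching_activity(exponents, n, bits=16):
    """Sum of Hamming distances between successive coefficient words."""
    words = [twiddle_word(n, k, bits) for k in exponents]
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))
```

With this model, the exponent sequence `[0, 0, 4, 4]` for W_16 toggles fewer bits than `[0, 4, 0, 4]`, even though both use the same two coefficients, which illustrates why the order of coefficients matters as well as the size of the coefficient memory.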

### VI. CONCLUSIONS

- The authors proposed different higher-radix algorithms for the single delay feedback architecture, considering the 4096-point FFT.
- The twiddle factor multiplications at each stage are different for each proposed algorithm.
- The low-power design of each algorithm depends on a few twiddle factor multiplication design parameters.
- The design criteria for twiddle factor multiplication involve a trade-off between these parameters.
- It is shown that the proposed algorithms give designers better choices when selecting a low-power architecture for the 4096-point FFT.

##### Citations

19 citations

### Cites background from "4k-point FFT algorithms based on op..."

...The similar FFT design approach even more extends to a general radix-2^k [31], [32] basis....

12 citations

### Cites background from "4k-point FFT algorithms based on op..."

...Besides, in order to achieve lower computation complexity, radix-2^2 [11]–[13], radix-2^3 [14]–[20], radix-2^4 [21], [22], and radix-2^k [23], [24] FFT circuits are developed in sequence....

5 citations

### Additional excerpts

...There are other studies to reduce the circuit complexity with algorithms to minimize the size and power of the FFT processor using coefficient memory reduction [8] [9] and switching activity analysis schemes [10] [11]....

4 citations

4 citations

### Cites background from "4k-point FFT algorithms based on op..."

...The twiddle factor multiplication is one of the major contributors to the area of the FFT processor, which requires both memories and complex multipliers [15]....

##### References

401 citations

### Additional excerpts

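| Radix \ Stage number | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 2 | W_256 | W_128 | W_64 | W_32 | W_16 | W_8 | W_4 |
| 2^2 [3] | W_4 | W_256 | W_4 | W_64 | W_4 | W_16 | W_4 |
| 2^3 [4] | W_4 | W_8 | W_256 | W_4 | W_8 | W_32 | W_4 |
| 2^4 [5] | W_4 | W_8 | W_16 | W_256 | W_4 | W_8 | W_16 |
| 2^5 [6] | W_4 | W_8 | W_16 | W_32 | W_256 | W_4 | W_8 |
| 2^6 [6] | W_4 | W_8 | W_16 | W_32 | W_64 | W_256 | W_4 |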

316 citations

### "4k-point FFT algorithms based on op..." refers methods in this paper

...A commonly used architecture for transforms of length N = b is the pipelined FFT [2]....

298 citations

81 citations

### "4k-point FFT algorithms based on op..." refers methods in this paper

...The constant multiplier can be realized using a minimum number of adders using the method in [14]....
