Split-radix FFT algorithm

About: Split-radix FFT algorithm is a research topic. Over its lifetime, 1845 publications have been published on this topic, receiving 41398 citations.


Papers
Proceedings Article
05 Jun 2013
TL;DR: This paper considers the modifications required to transform a highly-efficient, specialized linear algebra core into an efficient engine for computing Fast Fourier Transforms (FFTs) and proposes a flexible architecture that can perform both classes of applications.
Abstract: This paper considers the modifications required to transform a highly-efficient, specialized linear algebra core into an efficient engine for computing Fast Fourier Transforms (FFTs). We review the minimal changes required to support Radix-4 FFT computations and propose extensions to the micro-architecture of the baseline linear algebra core. Along the way, we study the critical differences between the two classes of algorithms. Special attention is paid to the configuration of the on-chip memory system to support high utilization. We examine design trade-offs between efficiency, specialization and flexibility, and their effects both on the core and memory hierarchy for a unified design as compared to dedicated accelerators for each application. The final design is a flexible architecture that can perform both classes of applications. Results show that the proposed hybrid FFT/Linear Algebra core can achieve 266 GFLOPS/S with a power efficiency of 40 GFLOPS/W, which is up to 100× and 40× more energy efficient than cutting-edge CPUs and GPUs, respectively.

18 citations

Proceedings Article
11 Apr 1988
TL;DR: Reasons are suggested why the split-radix algorithm is better than any single-radix algorithm on length-2^m DFTs (discrete Fourier transforms), and it is shown that whenever a radix-p^2 algorithm outperforms a radix-p algorithm, a radix-p/p^2 algorithm will outperform both of them.
Abstract: Reasons are suggested why the split-radix algorithm is better than any single-radix algorithm on length-2^m DFTs (discrete Fourier transforms). The split-radix approach is generalized to length-p^m DFTs. It is shown that whenever a radix-p^2 algorithm outperforms a radix-p algorithm, a radix-p/p^2 algorithm will outperform both of them. As an example, a radix-3/9 algorithm is developed for length-3^m DFTs.

18 citations
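
The length-2^m split-radix idea referred to in the entry above can be illustrated with a minimal recursive sketch. It is the textbook decimation-in-time formulation, not the generalized radix-p/p^2 algorithm of that paper; the function names `split_radix_fft` and `naive_dft` are assumptions chosen for this illustration. The even-indexed samples go through one half-length transform (the radix-2 part), and the samples at indices 1 mod 4 and 3 mod 4 go through two quarter-length transforms (the radix-4 part).

```python
import cmath

def naive_dft(x):
    """Direct O(N^2) DFT, used here only as a reference for checking."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def split_radix_fft(x):
    """Recursive split-radix FFT for a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    u = split_radix_fft(x[0::2])   # even samples, length n/2
    z = split_radix_fft(x[1::4])   # samples at indices 1 mod 4, length n/4
    zp = split_radix_fft(x[3::4])  # samples at indices 3 mod 4, length n/4
    q = n // 4
    out = [0j] * n
    for k in range(q):
        a = cmath.exp(-2j * cmath.pi * k / n) * z[k]       # twiddle w^k
        b = cmath.exp(-2j * cmath.pi * 3 * k / n) * zp[k]  # twiddle w^(3k)
        s = a + b
        d = -1j * (a - b)
        out[k] = u[k] + s
        out[k + 2 * q] = u[k] - s
        out[k + q] = u[k + q] + d
        out[k + 3 * q] = u[k + q] - d
    return out

if __name__ == "__main__":
    data = [complex(i % 3, -i) for i in range(8)]
    err = max(abs(a - b) for a, b in zip(split_radix_fft(data), naive_dft(data)))
    print("max deviation from the direct DFT:", err)
```

The sketch recomputes twiddle factors on every call and allocates new lists at each level, which keeps it short; practical implementations precompute twiddles and work in place.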

Journal Article
Weihua Zheng1, Kenli Li1
TL;DR: A novel order permutation of sub-DFTs and a reduction in the number of arithmetic operations enhance the practicability of the proposed algorithm, which inherently provides a wider choice of accessible FFT lengths.
Abstract: The discrete Fourier transform (DFT) is widely used in many fields of science and engineering. The DFT is implemented with efficient algorithms categorized as fast Fourier transforms. A fast algorithm is proposed for computing a length-N=6^m DFT. The proposed algorithm is a blend of radix-3 and radix-6 FFT. It is a variant of split radix and can be flexibly adapted to a length-2^r×3^m DFT. A novel order permutation of sub-DFTs and a reduction in the number of arithmetic operations enhance the practicability of the proposed algorithm. It inherently provides a wider choice of accessible FFT lengths.

18 citations

Journal Article
TL;DR: A configurable floating-point FFT accelerator based on CORDIC rotation is proposed, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory.
Abstract: The fast Fourier transform (FFT) accelerator and the coordinate rotation digital computer (CORDIC) algorithm play important roles in signal processing. We propose a configurable floating-point FFT accelerator based on CORDIC rotation, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory. To perform CORDIC rotation efficiently, we present a novel approach that combines segmented-parallel iteration with CSA-based compressed iteration and uses redundant CORDIC to reduce the latency of each iteration. To demonstrate the efficiency of our FFT accelerator, four FFT accelerators are prototyped on an FPGA chip to perform a batch FFT. Experimental results show that our structure, which is composed of four butterfly units and performs FFTs with sizes ranging from 64 to 8192 points, occupies 33230 (3%) REGs and 143006 (30%) LUTs. The clock frequency can reach 122 MHz. The resource usage of the double-precision FFT is only about 2.5 times that of the single-precision FFT, while the theoretical value is 4. Moreover, only 13331 cycles are required to compute an 8192-point double-precision FFT with four butterfly units in parallel.

18 citations
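
As background for the CORDIC-based twiddle rotation mentioned in the entry above, here is a minimal floating-point sketch of textbook rotation-mode CORDIC. It does not reproduce the paper's segmented-parallel, CSA-based, or redundant variants, and the function name `cordic_rotate` and its `iterations` parameter are assumptions made for this illustration. Each step rotates by plus or minus atan(2^-i), which in hardware needs only shifts and adds; the accumulated gain is divided out at the end.

```python
import math

# Sum of atan(2^-i) over all i: the largest angle basic CORDIC can reach.
MAX_ANGLE = 1.74

def cordic_rotate(x, y, angle, iterations=32):
    """Rotate (x, y) counterclockwise by `angle` radians with rotation-mode CORDIC."""
    if abs(angle) > MAX_ANGLE:
        raise ValueError("angle outside the basic CORDIC convergence range")
    # Every micro-rotation stretches the vector by sqrt(1 + 2^(-2i)).
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0   # rotation direction for this step
        shift = 2.0 ** -i             # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)
    return x / gain, y / gain

if __name__ == "__main__":
    # Multiplying 1 + 0j by the twiddle factor exp(-2*pi*j*k/N), k = 1, N = 8,
    # is a rotation of the point (1, 0) by -pi/4.
    re, im = cordic_rotate(1.0, 0.0, -2 * math.pi * 1 / 8)
    print(re, im)
```

Angles outside roughly ±1.74 rad would need an initial quadrant correction, which this sketch leaves out.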

Journal Article
01 Feb 1993
TL;DR: The proposed algorithm is more efficient compared to the radix-2 FHT in terms of the computational requirements, as well as the execution time for transform lengths higher than 30 and is faster than the prime-factor FFT algorithm for real-valued series.
Abstract: Fast algorithms for computing the DHT of short transform lengths (N = 2, 3, 4, 5, 7, 8, 9 and 16) are derived. A new prime-factor algorithm is also proposed to compute the long-length DHTs from the short-length DHT algorithms. The short-length algorithms (except for N = 8 and N = 16) are such that the even and the odd parts of the DHT components are obtained directly, without any additional computation. This feature of the short-length algorithms makes the proposed prime-factor DHT algorithm more attractive and efficient. It is found that the proposed algorithm is more efficient compared to the radix-2 FHT in terms of the computational requirements, as well as the execution time for transform lengths higher than 30. It is also observed that the number of operations required for the computation of DHT by the prime-factor FFT algorithm for real-valued data is the same as those of the proposed algorithm for certain transform lengths, e.g. N = 30, 60, 252 etc., which do not contain 8 or 16 as a cofactor. However, for all other transform lengths the proposed algorithm has a lower computational complexity. It is further observed that the proposed algorithm is faster than the prime-factor FFT algorithm for real-valued series.

18 citations
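
To make the terms in the entry above concrete, the following sketch computes the discrete Hartley transform (DHT) directly from its definition and then splits the result into its even and odd parts. It is an O(N^2) illustration of the definitions only, not the short-length or prime-factor algorithms of that paper, and the names `dht` and `even_odd_parts` are assumptions made for this example.

```python
import math

def dht(x):
    """Direct DHT: H[k] = sum over j of x[j] * cas(2*pi*j*k/N), cas(t) = cos(t) + sin(t)."""
    n = len(x)
    return [sum(x[j] * (math.cos(2 * math.pi * j * k / n)
                        + math.sin(2 * math.pi * j * k / n))
                for j in range(n))
            for k in range(n)]

def even_odd_parts(h):
    """Split DHT output into even part (cosine sums) and odd part (sine sums)."""
    n = len(h)
    even = [(h[k] + h[(n - k) % n]) / 2 for k in range(n)]
    odd = [(h[k] - h[(n - k) % n]) / 2 for k in range(n)]
    return even, odd

if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0, 3.0]
    h = dht(x)
    even, odd = even_odd_parts(h)
    # The DHT is its own inverse up to a factor of 1/N.
    recovered = [v / len(x) for v in dht(h)]
    print(h, even, odd, recovered)
```

For real input, the even part equals the real part of the DFT and the odd part equals the negated imaginary part, which is why having both parts available directly is convenient.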


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
80% related
Filter (signal processing)
81.4K papers, 1M citations
78% related
Robustness (computer science)
94.7K papers, 1.6M citations
78% related
Iterative method
48.8K papers, 1.2M citations
77% related
Optimization problem
96.4K papers, 2.1M citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    9
2022    34
2019    2
2018    8
2017    48
2016    89