Topic

Prime-factor FFT algorithm

About: Prime-factor FFT algorithm is a research topic. Over its lifetime, 2,346 publications have appeared within this topic, receiving 65,147 citations. The topic is also known as: Prime Factor Algorithm.


Papers
01 Jan 2009
TL;DR: This paper presents auto-tuning, optimization, and performance modeling of 3-dimensional Fast Fourier Transforms on the Cray XT4 (Franklin) system and achieves a substantial improvement in performance over conventional approaches to tuning 3D FFT performance.
Abstract: We present auto-tuning, optimization, and performance modeling of 3-dimensional Fast Fourier Transforms on the Cray XT4 (Franklin) system. Spectral methods involving FFTs are a commonly used numerical technique with applications in engineering, chemistry, geosciences, and other areas of scientific computing. In materials science, the wavefunctions of the electrons are expanded in spatial frequency components, which is a natural basis since the wavefunction of a free electron is a plane wave. In this paper we study the performance of a 3D FFT specifically written for materials science applications. The problem with a parallel 3D FFT is that for a grid of N points the computational work involved is O(N log N) while the amount of communication is O(N). This means that for small values of N (for example, 64 x 64 x 64 3D FFTs), the communication costs rapidly overwhelm the parallel computation savings. A distributed 3D FFT represents a considerable challenge for the communications infrastructure of a parallel machine because of the all-to-all nature of the distributed transposes required, and it stresses aspects of the machine that complement those addressed by other benchmark kernels, such as Linpack, which solves systems of linear equations, Ax = b. Auto-tuning can play a vital role in optimizing 3D FFT kernels for diverse platforms, as it can exploit system-specific configuration characteristics to improve performance. We also rely on analytic performance models as a tool for predicting the idealized performance of the 3-dimensional Fast Fourier Transforms, to set performance expectations for the target computational system. Overall, our methodology achieves a substantial improvement in performance over conventional approaches to tuning 3D FFT performance.
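
The cost argument in this abstract can be illustrated with a minimal sketch, assuming NumPy. It computes a 3D FFT as three passes of 1D FFTs, which is the structure that forces transposes when the grid is distributed, and evaluates a toy work-to-communication ratio. The names `fft3d_by_axes` and `work_to_comm_ratio` and the unit-cost model are illustrative assumptions, not the paper's code or its performance model.

```python
import math
import numpy as np

def fft3d_by_axes(grid):
    """3D DFT computed as 1D FFTs along each axis in turn."""
    out = np.asarray(grid, dtype=complex)
    for axis in range(3):
        # On a distributed machine, changing the transform axis would
        # typically require an all-to-all transpose so that the axis
        # being transformed is local to each process.
        out = np.fft.fft(out, axis=axis)
    return out

def work_to_comm_ratio(n_side):
    """Toy model (assumption): work ~ N*log2(N), communication ~ N."""
    n = n_side ** 3
    work = n * math.log2(n)
    comm = n
    return work / comm

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = rng.standard_normal((8, 8, 8))
    print(np.allclose(fft3d_by_axes(grid), np.fft.fftn(grid)))
    print(work_to_comm_ratio(64))
```

In this toy model the ratio equals log2(N), so the work available per communicated point grows only logarithmically with grid size, which is consistent with the abstract's remark about small grids.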

2 citations

Proceedings Article
02 Apr 2015
TL;DR: This work reduces the additive complexity of cyclotomic fast Fourier transforms (CFFTs) by combining a greedy algorithm with exhaustive search to select the best set of common subexpressions.
Abstract: Common subexpression elimination (CSE) is a critical procedure in many multiplierless implementations of DSP algorithms. The aim of CSE is twofold: to reduce the number of logic operators used and to minimize the logic depth (critical path) of the DSP algorithm implemented in VLSI. A CSE algorithm that combines a greedy algorithm and exhaustive search to select the best set of common subexpressions is proposed. Using this CSE algorithm, we reduce the additive complexity of cyclotomic fast Fourier transforms (CFFTs).
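
To make the idea of common subexpression elimination concrete, here is a minimal sketch of a generic greedy pairwise CSE over sums of terms. It is not the algorithm proposed in the paper (which also uses exhaustive search); the function name `greedy_cse`, the representation of each output as a set of term names, and the temporary names `t0`, `t1`, ... are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def count_additions(rows):
    """Additions needed to form each sum independently."""
    return sum(len(r) - 1 for r in rows if r)

def greedy_cse(rows):
    """Repeatedly factor out the pair of terms shared by the most sums.

    rows: iterable of sets of term names; each set is one output sum.
    Assumes no input term is already named like 't0', 't1', ...
    Returns (temporaries, reduced_rows, total_additions).
    """
    rows = [set(r) for r in rows]
    temps = []
    while True:
        pairs = Counter(p for r in rows for p in combinations(sorted(r), 2))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break  # no pair is shared, nothing left to save
        name = f"t{len(temps)}"
        temps.append((name, pair))
        for r in rows:
            if pair[0] in r and pair[1] in r:
                r.difference_update(pair)
                r.add(name)
    # Each temporary costs one addition, plus the remaining sums.
    total = len(temps) + count_additions(rows)
    return temps, rows, total

if __name__ == "__main__":
    sums = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "b", "c", "d"}]
    temps, reduced, total = greedy_cse(sums)
    print(count_additions(sums), total)
```

A purely greedy choice like this can miss better combinations, which is presumably why the abstract pairs it with exhaustive search.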

2 citations

Journal Article
TL;DR: Experiments show that, because the algorithm's results are expressed in periods, it can give the displacement of a moving target with a resolution finer than 1 pixel; the algorithm is also shown to be simple and fast, and usable in image tracking and image post-processing.
Abstract: This paper proposes an algorithm that estimates the translation of a moving object in the spatial domain from its Fourier transform spectrum. Based on the shift theorem of the Fourier transform and auto-registration, the algorithm directly estimates the translation from the phase spectrum difference of consecutive images of a moving object in a polar coordinate system. Compared with the traditional algorithm, which has to search for the Dirac peak, this method need not transform the spectrum back to the spatial domain after calculation, which reduces processing time. Experiments show that, because the results of the algorithm are expressed in periods, it can give the displacement of a moving target with a resolution finer than 1 pixel. The algorithm is also shown to be simple and fast, and usable in image tracking and image post-processing.
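
The shift theorem that this abstract relies on can be shown with a minimal 1D sketch, assuming NumPy. It reads the translation directly from the phase difference of one frequency bin, without returning to the spatial domain. The paper works on 2D images in polar coordinates; the 1D setting and the helper names `fourier_shift` and `estimate_shift` are simplifying assumptions for illustration only.

```python
import numpy as np

def fourier_shift(signal, shift):
    """Circularly shift a real signal by a (possibly fractional) amount."""
    n = len(signal)
    k = np.fft.fftfreq(n) * n  # signed integer frequency indices
    ramp = np.exp(-2j * np.pi * k * shift / n)
    return np.fft.ifft(np.fft.fft(signal) * ramp).real

def estimate_shift(f, g):
    """Estimate d such that g[n] = f[n - d], from the phase of bin 1.

    Shift theorem: G[k] = F[k] * exp(-2j*pi*k*d/N), so the phase of
    G[1] * conj(F[1]) equals -2*pi*d/N. Valid for |d| < N/2 and
    requires F[1] to be non-zero.
    """
    n = len(f)
    F1 = np.fft.fft(f)[1]
    G1 = np.fft.fft(g)[1]
    phase = np.angle(G1 * np.conj(F1))
    return -phase * n / (2 * np.pi)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = rng.standard_normal(64)
    g = fourier_shift(f, 3.25)
    print(estimate_shift(f, g))
```

Because the estimate comes from a continuous phase value rather than the location of a peak on the sample grid, it is not restricted to whole-sample shifts.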

2 citations

Journal Article
Shou Ming Liu, Hong Wei Shi, Yan Jiang, Xu Min, Gen Yong Chen
TL;DR: In this article, a new interpolation FFT algorithm based on the Rife-Vincent (I) window is provided, which has an amplitude error less than 1×10^-6 %, a frequency error less than 1×10^-7 Hz, and a phase error less than 0.0001%.
Abstract: To further improve the precision of harmonics measurement, a new interpolation FFT algorithm based on the Rife-Vincent (I) window is provided in this paper. First, the spectral leakage of the FFT and the frequency response of the Class I Rife-Vincent window are briefly discussed, and then the paper analyzes the interpolation algorithm on the Rife-Vincent (I) window in detail. Finally, a cubic spline function is adopted to calculate the frequency and the harmonic amplitude correction coefficient. A simulation example is given, and the simulation results show that the Rife-Vincent (I) window interpolation algorithm using a cubic spline function has an amplitude error less than 1×10^-6 %, a frequency error less than 1×10^-7 Hz, and a phase error less than 0.0001%. Compared with other cosine-window interpolation FFT algorithms, the new interpolation FFT algorithm based on the Rife-Vincent (I) window has the highest accuracy.
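
The general idea of a windowed interpolated FFT can be sketched as follows, assuming NumPy. This example uses the Hann window with its well-known two-bin closed-form interpolation as a simple stand-in; it does not reproduce the paper's Rife-Vincent (I) window or its cubic spline correction, and it will not reach the error levels quoted above. The function name `interpolated_peak` is hypothetical.

```python
import numpy as np

def interpolated_peak(x, fs):
    """Estimate frequency and amplitude of the dominant sinusoid in x.

    Uses a periodic Hann window and the ratio of the two largest
    adjacent bins (large-N approximation).
    """
    n = len(x)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)
    mag = np.abs(np.fft.rfft(x * window))
    k = int(np.argmax(mag[1:-1])) + 1  # skip DC and the last bin
    if mag[k + 1] >= mag[k - 1]:
        alpha = mag[k + 1] / mag[k]
        delta = (2 * alpha - 1) / (alpha + 1)
    else:
        alpha = mag[k - 1] / mag[k]
        delta = -(2 * alpha - 1) / (alpha + 1)
    freq = (k + delta) * fs / n
    # Undo the Hann main-lobe attenuation at fractional offset delta.
    amp = 4 * mag[k] * (1 - delta ** 2) / (n * np.sinc(delta))
    return freq, amp

if __name__ == "__main__":
    n, fs = 1024, 1024.0
    t = np.arange(n) / fs
    x = 2.0 * np.cos(2 * np.pi * 50.3 * t + 0.4)
    print(interpolated_peak(x, fs))
```

The test tone sits between bins on purpose (50.3 Hz with 1 Hz bin spacing), since that is the case where leakage makes the raw peak bin inaccurate and interpolation helps.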

2 citations

01 Dec 1991
TL;DR: This architecture is an addressless, routed, bit-serial scheme that directly maps an N-point algorithm onto silicon that appears to be far less costly than systolic schemes for implementing the WFTA, and faster than current FFT devices for similar transform sizes.
Abstract: A VLSI architecture for computing the discrete Fourier transform (DFT) using the Winograd Fourier transform algorithm (WFTA) is presented. This architecture is an addressless, routed, bit-serial scheme that directly maps an N-point algorithm onto silicon. The architecture appears to be far less costly than systolic schemes for implementing the WFTA, and faster than current FFT devices for similar transform sizes. The nesting method of Winograd is used for partitioning larger transformations into several circuits. The advantage of this partitioning technique is that it allows using circuits that are all of the same type. However, the number of input/output pins of each circuit is higher than with some other approaches such as the prime factor algorithm. The design of a 20-point DFT circuit with logic diagrams of its major cells is presented. The gate array circuit has been sent for fabrication in a 0.7 micron CMOS technology. Five circuits interconnected together will compute 60-point complex transforms at a rate of one transformation every 0.53 microseconds.
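
Since this abstract contrasts Winograd nesting with the prime factor algorithm, a minimal sketch of the two-factor prime-factor (Good-Thomas) mapping may help, assuming NumPy and Python 3.8+ (for the modular inverse via `pow`). It re-indexes a length-N transform with N = n1 * n2 and coprime factors into an n1 x n2 two-dimensional DFT with no twiddle factors between the stages. The function name `pfa_fft` is hypothetical, and the small DFTs are delegated to `np.fft.fft` rather than to dedicated Winograd modules.

```python
import math
import numpy as np

def pfa_fft(x, n1, n2):
    """Length n1*n2 DFT via the Good-Thomas index mapping (gcd(n1, n2) = 1)."""
    x = np.asarray(x, dtype=complex)
    n = n1 * n2
    if x.size != n or math.gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1*n2 with coprime n1, n2")
    i1, i2 = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    # Input map: n = (i1*n2 + i2*n1) mod N
    in_idx = (i1 * n2 + i2 * n1) % n
    # Output map (Chinese remainder theorem), using modular inverses
    q2 = pow(n2, -1, n1)  # n2^-1 mod n1
    q1 = pow(n1, -1, n2)  # n1^-1 mod n2
    out_idx = (i1 * n2 * q2 + i2 * n1 * q1) % n
    grid = x[in_idx]                  # shape (n1, n2)
    grid = np.fft.fft(grid, axis=0)   # n2 transforms of length n1
    grid = np.fft.fft(grid, axis=1)   # n1 transforms of length n2
    out = np.empty(n, dtype=complex)
    out[out_idx] = grid
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(60) + 1j * rng.standard_normal(60)
    print(np.allclose(pfa_fft(x, 4, 15), np.fft.fft(x)))
```

The 60-point size matches the transform length mentioned in the abstract; a three-factor split such as 3 x 4 x 5 would apply the same mapping recursively.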

2 citations


Network Information
Related Topics (5)
Wavelet: 78K papers, 1.3M citations, 81% related
Robustness (computer science): 94.7K papers, 1.6M citations, 78% related
Feature extraction: 111.8K papers, 2.1M citations, 77% related
Support vector machine: 73.6K papers, 1.7M citations, 76% related
Optimization problem: 96.4K papers, 2.1M citations, 76% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  5
2022  24
2021  1
2018  8
2017  57
2016  92