Showing papers on "Split-radix FFT algorithm" published in 2004

Open Access

Journal Article•DOI•

Accelerating the Nonuniform Fast Fourier Transform

Leslie Greengard¹, June-Yub Lee•Institutions (1)

01 Jan 2004-SIAM Review

TL;DR: This paper observes that one of the standard interpolation or "gridding" schemes, based on Gaussians, can be accelerated by a significant factor without precomputation and storage of the interpolation weights, which is of particular value in two- and three-dimensional settings.

Abstract: The nonequispaced Fourier transform arises in a variety of application areas, from medical imaging to radio astronomy to the numerical solution of partial differential equations. In a typical problem, one is given an irregular sampling of N data in the frequency domain and one is interested in reconstructing the corresponding function in the physical domain. When the sampling is uniform, the fast Fourier transform (FFT) allows this calculation to be computed in O(N log N) operations rather than O(N^2) operations. Unfortunately, when the sampling is nonuniform, the FFT does not apply. Over the last few years, a number of algorithms have been developed to overcome this limitation and are often referred to as nonuniform FFTs (NUFFTs). These rely on a mixture of interpolation and the judicious use of the FFT on an oversampled grid (A. Dutt and V. Rokhlin, SIAM J. Sci. Comput., 14 (1993), pp. 1368-1383). In this paper, we observe that one of the standard interpolation or "gridding" schemes, based on Gaussians, can be accelerated by a significant factor without precomputation and storage of the interpolation weights. This is of particular value in two- and three-dimensional settings, saving either 10^d N in storage in d dimensions or a factor of about 5-10 in CPU time (independent of dimension).
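A minimal Python sketch of how Gaussian gridding weights can be produced on the fly instead of being read from a stored table. It assumes a one-dimensional uniform grid with spacing h and a kernel exp(-(x - xi)^2 / (4 tau)); the function names, the parameter tau, and the stencil `offsets` are illustrative choices, not taken from the paper. The sketch relies only on the algebraic identity exp(-(d - m h)^2 / (4 tau)) = exp(-d^2 / (4 tau)) * exp(d h / (2 tau))^m * exp(-(m h)^2 / (4 tau)), in which only the first two factors depend on the source point.

```python
import numpy as np

def gaussian_weights_direct(x, grid_points, tau):
    """Reference version: one exponential per grid point."""
    return np.exp(-(x - grid_points) ** 2 / (4.0 * tau))

def gaussian_weights_factored(x, xi0, h, offsets, tau):
    """Same weights for grid points xi0 + m*h, m in offsets (hypothetical helper).

    Only two exponentials depend on x; the third factor depends on the
    offsets alone and could be computed once and reused for every source point.
    """
    d = x - xi0
    e1 = np.exp(-d * d / (4.0 * tau))             # depends on x only
    e2 = np.exp(d * h / (2.0 * tau))              # depends on x only
    e3 = np.exp(-(offsets * h) ** 2 / (4.0 * tau))  # independent of x
    return e1 * e2 ** offsets * e3
```

In d dimensions the Gaussian is a product of one-dimensional Gaussians, so the same factorization can be applied per axis.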

714 citations

Journal Article•DOI•

Simulation of non-Gaussian surfaces with FFT

Jiunn-Jong Wu¹•Institutions (1)

Chinese Culture University¹

01 Apr 2004-Tribology International

TL;DR: In this paper, a numerical procedure for the simulation of non-Gaussian surfaces has been developed, which can simulate surfaces with given skewness and kurtosis and with spectral density or auto-correlation function.

123 citations

Proceedings Article•DOI•

Design of an efficient variable-length FFT processor

Chung-Ping Hung¹, Sau-Gee Chen¹, Kun-Lung Chen¹•Institutions (1)

National Chiao Tung University¹

23 May 2004

TL;DR: In this paper, an efficient variable-length FFT processor architecture suitable for multi-mode and multi-standard OFDM communication systems is proposed, based on the radix-2^2 DIF FFT algorithm, which also supports non-power-of-4 FFT computation.

Abstract: In this paper, we propose an efficient variable-length FFT processor architecture suitable for multi-mode and multi-standard OFDM communication systems. The FFT processor is based on the radix-2^2 DIF FFT algorithm and also supports non-power-of-4 FFT computation. The design contains an efficient processing element (PE), which can execute radix-2^2 butterfly (BF) operations, as well as radix-2 BF operations. Moreover, in order to achieve high-performance variable-length FFT operations and data accesses, an efficient variable-length address generator and twiddle factor generator are designed. The design has the merits of low complexity and high speed performance. The designs consider seven different FFT lengths including 64, 256, 512, 1024, 2048, 4096, and 8192 points, which cover all the FFT lengths required by 802.11a, 802.16a, DAB, DVB-T, VDSL and ADSL.

60 citations

Journal Article•DOI•

A new radix-2/8 FFT algorithm for length-q×2^m DFTs

S. Bouguezel¹, M.O. Ahmad¹, Mallappa Kumara Swamy¹•Institutions (1)

Concordia University¹

13 Sep 2004-IEEE Transactions on Circuits and Systems I-regular Papers

TL;DR: A new radix-2/8 fast Fourier transform (FFT) algorithm is proposed for computing the discrete Fourier transform of an arbitrary length N = q × 2^m, where q is an odd integer.

Abstract: In this paper, a new radix-2/8 fast Fourier transform (FFT) algorithm is proposed for computing the discrete Fourier transform of an arbitrary length N = q × 2^m, where q is an odd integer. It substantially reduces operations such as data transfer, address generation, and twiddle factor evaluation or access to the lookup table, which contribute significantly to the execution time of FFT algorithms. It is shown that the arithmetic complexity (multiplications + additions) of the proposed algorithm is, in most cases, the same as that of the existing split-radix FFT algorithm. The basic idea behind the proposed algorithm is the use of a mixture of radix-2 and radix-8 index maps. The algorithm is expressed in a simple matrix form, thereby facilitating an easy implementation of the algorithm, and allowing for an extension to the multidimensional case. For the structural complexity, the important properties of the Cooley-Tukey approach such as the use of the butterfly scheme and in-place computation are preserved by the proposed algorithm.
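For orientation, here is a minimal recursive sketch of the conventional split-radix (radix-2/4) decimation-in-time FFT that the abstract uses as its baseline. It is not the proposed radix-2/8 algorithm, it handles only power-of-two lengths (the case q = 1), and it makes no attempt at the in-place or reduced-twiddle optimizations the paper discusses. The function name is illustrative.

```python
import numpy as np

def split_radix_fft(x):
    """Recursive split-radix DIT FFT for power-of-two lengths (illustrative)."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return x.copy()
    if n == 2:
        return np.array([x[0] + x[1], x[0] - x[1]])

    # One half-length transform (even samples, radix-2 split) and
    # two quarter-length transforms (odd samples, radix-4 split).
    u = split_radix_fft(x[0::2])
    z = split_radix_fft(x[1::4])
    zp = split_radix_fft(x[3::4])

    q = n // 4
    k = np.arange(q)
    w1 = np.exp(-2j * np.pi * k / n)   # W_N^k
    w3 = np.exp(-6j * np.pi * k / n)   # W_N^{3k}
    s = w1 * z + w3 * zp
    d = w1 * z - w3 * zp

    out = np.empty(n, dtype=complex)
    out[:q] = u[:q] + s
    out[2 * q:3 * q] = u[:q] - s
    out[q:2 * q] = u[q:] - 1j * d
    out[3 * q:] = u[q:] + 1j * d
    return out
```

The result can be checked against `numpy.fft.fft` on random input of length 64 or any other power of two.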

50 citations

Proceedings Article•DOI•

Improved radix-4 and radix-8 FFT algorithms

S. Bouguezel¹, M.O. Ahmad¹, Mallappa Kumara Swamy¹•Institutions (1)

Concordia University¹

23 May 2004

TL;DR: These modified radix-4 and radix-8 algorithms provide savings of more than 33% and 42%, respectively, in the number of twiddle factor evaluations or accesses to the lookup table compared to the corresponding conventional FFT algorithms, without imposing any additional complexity.

Abstract: In this paper, improved algorithms for radix-4 and radix-8 FFT are presented. This is achieved by re-indexing a subset of the output samples resulting from the conventional decompositions in the radix-4 and radix-8 FFT algorithms. These modified radix-4 and radix-8 algorithms provide savings of more than 33% and 42% respectively in the number of twiddle factor evaluations or accesses to the lookup table compared to the corresponding conventional FFT algorithms without imposing any additional complexity.

28 citations

Journal Article•DOI•

Fast Fourier transform for discontinuous functions

Guo-Xin Fan¹, Qing Huo Liu¹•Institutions (1)

Duke University¹

05 Apr 2004-IEEE Transactions on Antennas and Propagation

TL;DR: A fast algorithm for the evaluation of the Fourier transform of piecewise smooth functions with uniformly or nonuniformly sampled data by using a double interpolation procedure combined with the fast Fourier transform (FFT) algorithm is presented.

Abstract: In computational electromagnetics and other areas of computational science and engineering, Fourier transforms of discontinuous functions are often required. We present a fast algorithm for the evaluation of the Fourier transform of piecewise smooth functions with uniformly or nonuniformly sampled data by using a double interpolation procedure combined with the fast Fourier transform (FFT) algorithm. We call this the discontinuous FFT algorithm. For N sample points, the complexity of the algorithm is O(νNp + νN log(N)), where p is the interpolation order and ν is the oversampling factor. The method also provides a new nonuniform FFT algorithm for continuous functions. Numerical experiments demonstrate the high efficiency and accuracy of this discontinuous FFT algorithm.

26 citations

Patent•

FFT architecture and method

Raghuraman Krishnamoorthi¹, Chinnappa K. Ganapathy¹•Institutions (1)

Qualcomm¹

03 Dec 2004

TL;DR: A Fast Fourier Transform (FFT) hardware implementation and method provides efficient FFT processing while minimizing the die area needed in an integrated circuit (IC). The FFT hardware can implement an N-point FFT, where N = r^n is a function of a radix (r), and the hardware implementation includes a sample memory having N/r rows, each storing r samples.

Abstract: A Fast Fourier Transform (FFT) hardware implementation and method provides efficient FFT processing while minimizing the die area needed in an Integrated Circuit (IC). The FFT hardware can implement an N-point FFT, where N = r^n is a function of a radix (r). The hardware implementation includes a sample memory having N/r rows, each storing r samples. A twiddle factor memory can store k twiddle factors per row, where 0 < k

25 citations

Patent•

Digital signal processor structure for performing length-scalable fast Fourier transformation

Cheng-Han Sung¹, Chein-Wei Jen¹, Chih-Wei Liu¹, Lai Horng-Chi¹, Gin-Kou Ma¹•Institutions (1)

Industrial Technology Research Institute¹

07 Jan 2004

TL;DR: A digital signal processor structure for performing length-scalable fast Fourier transformation (FFT) is disclosed, in which a single processor element (single PE) and a simple, effective address generator are used to achieve length scalability, high performance, and low power consumption in a split-radix-2/4 FFT or IFFT module.

Abstract: A digital signal processor structure for performing length-scalable fast Fourier transformation (FFT) is disclosed. A single processor element (single PE) and a simple, effective address generator are used to achieve length scalability, high performance, and low power consumption in a split-radix-2/4 FFT or IFFT module. In order to meet different communication standards, the digital signal processor structure is run-time configurable for different length requirements. Moreover, its execution time can meet the standards' requirements for fast Fourier transformation (FFT) or inverse fast Fourier transformation (IFFT).

25 citations

Patent•

Recoded radix-2 pipelined FFT processor

Sean G. Gibb, Peter J. W. Graumann

21 Jun 2004

TL;DR: A single-path delay feedback pipelined fast Fourier transform processor comprising at least one set of triplet FFT stage means: a first stage comprising a radix-2 butterfly, a feedback memory, and a multiplication by unity; a second stage comprising a trivial coefficient pre-multiplication, a radix-2 butterfly, a feedback memory, and a multiplication by selectable unity or W_N^(N/8); and a third stage comprising a trivial coefficient pre-multiplication, a butterfly, a feedback memory, and a complex twiddle coefficient multiplication with coefficients determined using a twiddle factor decomposition technique.

Abstract: A single-path delay feedback pipelined fast Fourier transform processor comprising at least one set of triplet FFT stage means: a first FFT stage means comprising a radix-2 butterfly, a feedback memory, and a multiplication by unity; a second FFT stage means comprising a trivial coefficient pre-multiplication, a radix-2 butterfly, a feedback memory, and a multiplication by selectable unity or W_N^(N/8); and a third FFT stage means comprising a trivial coefficient pre-multiplication, a butterfly, a feedback memory, and a complex twiddle coefficient multiplication with coefficients determined using a twiddle factor decomposition technique.

20 citations

Patent•

Modular pipeline fast Fourier transform

Earl E. Swartzlander¹, Ayman Moustafa El-Khashab¹•Institutions (1)

University of Texas System¹

02 Nov 2004

TL;DR: A modular pipeline algorithm and architecture for computing discrete Fourier transforms is described, in which two pipeline √N-point fast Fourier transform modules are combined with a center element.

Abstract: A modular pipeline algorithm and architecture for computing discrete Fourier transforms is described. For an N-point transform, two pipeline √N-point fast Fourier transform modules are combined with a center element. The center element contains memories, multipliers and control logic. Compared with a standard N-point pipeline FFT, the modular pipeline FFT maintains the bandwidth of existing pipeline FFTs with reduced dynamic power consumption and reduced complexity of the overall hardware pipeline.

19 citations

Journal Article•

The fast Fourier transform algorithm for the production of the permutation factor circulant matrices

Lai Yi-xin

01 Jan 2004-Journal of Baoji College of Arts and Science

TL;DR: A fast Fourier transform algorithm for the production of the permutation factor circulant matrices of order n, based on the fast Fourier transform (FFT), was presented; its arithmetic complexity is O(n log_2 n).

Abstract: A fast Fourier transform algorithm for the production of the permutation factor circulant matrices of order n, based on the fast Fourier transform (FFT), was presented; its arithmetic complexity is O(n log_2 n).

Proceedings Article•DOI•

Fast computation of the ambiguity function

Wei-Qiang Zhang¹, Ran Tao¹, Yong-feng Ma¹•Institutions (1)

Beijing Institute of Technology¹

01 Jan 2004

TL;DR: Both the theory and the simulations show that the pre-weighted zoom FFT method has lower computational complexity, lower memory requirements, and negligible error, and can meet the needs of real-time processing.

Abstract: This paper addresses the problem of fast computation of the ambiguity function using a new method based on the pre-weighted zoom FFT algorithm, which employs the zoom FFT technique and performs the weighting process beforehand, and thus gets rid of the extra computation. The computational complexity of the presented algorithm is compared with other methods and the simulation results are given. Both the theory and the simulations show that the pre-weighted zoom FFT method has lower computational complexity, lower memory requirements, and negligible error, and can meet the needs of real-time processing.

Journal Article•DOI•

A memory-reduction scheme for the FFT T-matrix method [EM wave scattering applications]

Kim¹•Institutions (1)

Air Force Research Laboratory¹

01 Dec 2004-IEEE Antennas and Wireless Propagation Letters

TL;DR: The technique is capable of reducing the memory requirement by a factor of 6 to 16, depending on the number of modes used and the spatial distribution of scatterers, and is simple to implement in an existing FFT T-matrix code.

Abstract: We present a memory-reduction technique for the fast Fourier transformation (FFT) T-matrix method. The technique exploits the configuration- and Fourier-space symmetry relations of the transverse spherical multipole translation coefficients whose storage drives the memory requirement. The technique is capable of reducing the memory requirement by a factor of 6 to 16, depending on the number of modes used and the spatial distribution of scatterers, and is simple to implement in an existing FFT T-matrix code. We establish its accuracy and effectiveness by applying the technique to compute the RCS of aggregates of dielectric spheres.

Proceedings Article•DOI•

Fast Fourier transform processor based on low-power and area-efficient algorithm

Jung-Yeol Oh¹, Myoung-Seob Lim¹•Institutions (1)

Chonbuk National University¹

01 Nov 2004

TL;DR: This paper proposes a new efficient FFT architecture with a structured pipeline for OFDM systems, based on the radix-2^4 algorithm; its newly proposed complex constant multiplier achieved more than 60% area reduction compared with the conventional programmable multiplier.

Abstract: This paper proposes a new efficient FFT architecture with a structured pipeline for OFDM systems, based on the radix-2^4 algorithm. The pipeline architecture with the new algorithm has the same number of multipliers as that of the radix-2^2 algorithm. However, the multiplier complexity could be reduced by more than 30% by replacing half of the programmable multipliers with the newly proposed constant multipliers. A newly proposed complex constant multiplier can enhance the area/power efficiency of the design. In synthesis simulations with a standard 0.35 μm CMOS process, it achieved more than 60% area reduction compared with the conventional programmable multiplier.

Proceedings Article•DOI•

A novel pipelined fast Fourier transform architecture for double rate OFDM systems

Hsin-Lei Lin¹, Hongchin Lin¹, Yu-Chuan Chen¹, Robert Chen-Hao Chang¹•Institutions (1)

National Chung Hsing University¹

06 Dec 2004

TL;DR: An efficiently pipelined radix-2 FFT architecture is proposed that doubles the throughput with significant hardware reduction; the utilization rates of the multipliers and the processing elements reach 100%.

Abstract: A high throughput fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) processor for double-rate wireless LAN, based on double-rate OFDM communication systems, is proposed. It is an efficiently pipelined radix-2 FFT architecture, which doubles the throughput with significant hardware reduction. The utilization rates of the multipliers and the processing elements reach 100%. The core size is 10 mm² with a power consumption of 208 mW at 20 MHz for data inputs with 15-bit word length, using 0.35 μm 1P4M CMOS technology.

Proceedings Article•DOI•

Cooley-Tukey FFT like algorithm for the discrete triangle transform

Markus Püschel¹, Martin Rötteler•Institutions (1)

Carnegie Mellon University¹

01 Aug 2004

TL;DR: It is shown that the discrete triangle transform has, like the type III DCT, a Cooley-Tukey FFT type fast algorithm; the algorithm is derived together with an upper bound for the number of complex operations it requires.

Abstract: The discrete triangle transform (DTT) was recently introduced (Püschel, M. and Rötteler, M., Proc. ICASSP, 2004) as an example of a non-separable transform for signal processing on a two-dimensional triangular grid. The DTT is built from Chebyshev polynomials in two variables in the same way as the DCT, type III, is built from Chebyshev polynomials in one variable. We show that, as a consequence, the DTT has, like the type III DCT, a Cooley-Tukey FFT type fast algorithm. We derive this algorithm and an upper bound for the number of complex operations it requires. Similar to most separable two-dimensional transforms, the operations count of this algorithm is O(n^2 log(n)) for an input of size n × n.

Journal Article•DOI•

A fast algorithm for three‐dimensional electrostatics analysis: fast Fourier transform on multipoles (FFTM)

E. T. Ong¹, K.H. Lee², Kian Meng Lim²•Institutions (2)

Institute of High Performance Computing Singapore¹, National University of Singapore²

07 Oct 2004-International Journal for Numerical Methods in Engineering

TL;DR: It is demonstrated that FFTM is an accurate method, generally more accurate than FMM for a given order of multipole expansion (up to the second order), and that its computational complexity grows approximately linearly, implying that FFTM is as efficient as FMM.

Abstract: In this paper, we propose a new fast algorithm for solving large problems using the boundary element method (BEM). Like the fast multipole method (FMM), the speed-up in the solution of the BEM arises from the rapid evaluations of the dense matrix–vector products required in iterative solution methods. This fast algorithm, which we refer to as fast Fourier transform on multipoles (FFTM), uses the fast Fourier transform (FFT) to rapidly evaluate the discrete convolutions in potential calculations via multipole expansions. It is demonstrated that FFTM is an accurate method, and is generally more accurate than FMM for a given order of multipole expansion (up to the second order). It is also shown that the algorithm has approximately linear growth in the computational complexity, implying that FFTM is as efficient as FMM. Copyright © 2004 John Wiley & Sons, Ltd.

Proceedings Article•DOI•

An improved radix-16 FFT algorithm

S. Bouguezel¹, M.O. Ahmad¹, Mallappa Kumara Swamy¹•Institutions (1)

Concordia University¹

02 May 2004

TL;DR: An improved radix-16 decimation-in-frequency (DIF) FFT algorithm is proposed by introducing new indices for some of the output sub-sequences resulting from the conventional radix-16 DIF decomposition of the DFT.

Abstract: An improved radix-16 decimation-in-frequency (DIF) FFT algorithm is proposed by introducing new indices for some of the output sub-sequences resulting from the conventional radix-16 DIF decomposition of the DFT. This improved radix-16 DIF FFT algorithm achieves savings of more than 46% in the number of twiddle factor evaluations or accesses to the lookup table and address generations compared to the conventional radix-16 DIF FFT algorithm. These savings are achieved without imposing any additional computational or structural complexity in the algorithm.

Patent•

Fast Fourier transform processor

Jun-Xian Teng¹, Hsien-Yuan Hsu¹•Institutions (1)

Industrial Technology Research Institute¹

28 Dec 2004

TL;DR: In this article, a Fast Fourier Transform (FFT) processor is provided, which analyzes the input/output order of the fast Fourier transformation, separates the portions requiring complex computations, simplifies the hardware thereof and adjusts the output order.

Abstract: A Fast Fourier Transform (FFT) processor is provided. It comprises a multiplexer, a first angle rotator, a second angle rotation and multiplexing unit, an adder, a twiddle factor storage, a multiplier, and a data storage. The FFT processor analyzes the input/output order of the Fast Fourier Transformation, separates the portions requiring complex computations, simplifies the hardware thereof, and adjusts the output order. It not only effectively saves the hardware area, but also reduces the computations and memory access count. Thereby, the power consumption is reduced.

Patent•

Inverse fast Fourier transform (IFFT) with overlap and add

Robert J. Van Wechel, Ivan L. Johnston

01 Dec 2004

TL;DR: In this paper, a system for efficiently filtering interfering signals in a front end of a GPS receiver is disclosed, where at least a portion of the interfering signals are removed by applying weights to the inputs.

Abstract: A system for efficiently filtering interfering signals in a front end of a GPS receiver is disclosed. Such interfering signals can emanate from friendly, as well as unfriendly, sources. One embodiment includes a GPS receiver with a space-time adaptive processing (STAP) filter. At least a portion of the interfering signals are removed by applying weights to the inputs. One embodiment adaptively calculates and applies the weights by Fourier Transform convolution and Fourier Transform correlation. The Fourier Transform can be computed via a Fast Fourier Transform (FFT). This approach advantageously reduces computational complexity to practical levels. Another embodiment utilizes redundancy in the covariance matrix to further reduce computational complexity. In another embodiment, an improved FFT and an improved Inverse FFT further reduce computational complexity and improve speed. Advantageously, embodiments can efficiently null a relatively large number of jammers at a relatively low cost and with relatively low operating power.

Proceedings Article•DOI•

On pruning the discrete cosine and sine transforms

Ryszard Stasinski¹•Institutions (1)

Poznań University of Technology¹

12 May 2004

TL;DR: It is shown that a limited set of output discrete cosine transform (DCT) samples can be computed by a modified real-valued output-pruned FFT algorithm for appropriately permuted data samples.

Abstract: In the paper it is shown that a limited set of output discrete cosine transform (DCT) samples can be computed by a modified real-valued output-pruned FFT algorithm for appropriately permuted data samples. The same is true for the discrete sine transform (DST). Analogously, when computing the data contribution from a few DCT or DST samples, the input-pruned FFT algorithm for the inverse FFT can be applied, and input-pruned algorithms for the inverse DCT or DST are obtained. The algorithms are very efficient: their complexities are O(N log K), where N is the transform size and K is a divisor of N equal to or greater than the number of computed transform samples, which is less than the O(N log N) of the full DCT or DST algorithm. The algorithms are also easy to implement.
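To make the O(N log K) figure concrete, here is a minimal sketch of generic output pruning for the complex DFT. It is not the paper's real-valued DCT/DST algorithm; it is a simple decomposition, assumed here for illustration, in which N/K sub-transforms of length K are combined with twiddle factors to give the first K outputs. The function name is hypothetical, and `numpy.fft.fft` stands in for the length-K FFTs.

```python
import numpy as np

def first_k_dft_outputs(x, k):
    """Return X[0..k-1] of the length-N DFT of x, where k divides N (illustrative).

    Cost: N/k FFTs of length k, i.e. O(N log k), plus O(N) twiddle work.
    """
    x = np.asarray(x, dtype=complex)
    n = x.size
    if k <= 0 or n % k:
        raise ValueError("k must be a positive divisor of len(x)")
    l = n // k
    # Row a, column b of the reshaped array holds x[l*a + b].
    # FFT along axis 0 gives sub[j, b] = sum_a x[l*a + b] * W_k^(a*j).
    sub = np.fft.fft(x.reshape(k, l), axis=0)
    j = np.arange(k)[:, None]
    b = np.arange(l)[None, :]
    twiddle = np.exp(-2j * np.pi * j * b / n)   # W_N^(b*j)
    return (sub * twiddle).sum(axis=1)
```

For K much smaller than N this avoids most of the work of a full transform, which is the effect the abstract describes for the pruned DCT and DST.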

Proceedings Article•DOI•

A genetic algorithm for the optimisation of a reconfigurable pipelined FFT processor

Nasri Sulaiman¹, Tughrul Arslan¹•Institutions (1)

University of Edinburgh¹

24 Jun 2004

TL;DR: Two forms of optimisation, input data optimisation and FFT coefficient optimisation, are investigated in this paper, and the word length is optimised down to 10 bits for the input data and 8 bits for the FFT coefficients.

Abstract: This paper describes the optimisation of the word length in a 16-point radix-4 reconfigurable pipelined fast Fourier transform (FFT) based receiver device. Two forms of optimisation, input data optimisation and FFT coefficient optimisation, are investigated in this paper. The word lengths for the input data and FFT coefficients are initially set to 16 bits. A genetic algorithm (GA) is then used to find the optimal word length for the input data and FFT coefficients while satisfying functionality constraints. The GA is able to determine an optimised word length down to 10 bits for the input data and 8 bits for the FFT coefficients.

Proceedings Article•DOI•

High-speed assembly FFT implementation with memory reference reduction on DSP processors

Yiyan Tang, Y. Wang, Jin-Gyun Chung, Sangseob Song, M. Lim

13 Dec 2004

TL;DR: This paper proposes a hand-coded assembly implementation for the radix-2 DIF FFT algorithm with the twiddle-factor-based butterfly grouping method on a TI TMS320C64x DSP that is 8 times faster than the C implementation and slightly slower than the TI assembly benchmark while requiring only 50% of memory references due to twiddle factors.

Abstract: The memory reference in digital signal processors (DSP) is among the most costly of operations due to its long latency and substantial power consumption. Previously proposed twiddle-factor-based butterfly grouping methods can effectively minimize memory references due to twiddle factors for implementing any existing fast Fourier transform (FFT) algorithms on DSP. However, the performance of its C implementation on DSP is far behind the corresponding TI assembly benchmark for radix-2 DIF FFT due to limitations of the compiler. In this paper, we propose a hand-coded assembly implementation for the radix-2 DIF FFT algorithm with the twiddle-factor-based butterfly grouping method on a TI TMS320C64x DSP. Experimental results show that for 1024-pt radix-2 DIF FFT, our hand-coded assembly implementation is 8 times faster than the C implementation and slightly faster than the TI assembly benchmark while requiring only 50% of memory references due to twiddle factors compared to the TI assembly benchmark.

Proceedings Article•DOI•

An efficient FFT algorithm based on the radix-2/4 DIF approach for computing 3D DFT

S. Bouguezel¹, M.O. Ahmad¹, Mallappa Kumara Swamy¹•Institutions (1)

Concordia University¹

02 May 2004

TL;DR: It is shown that the proposed algorithm reduces the computational complexity significantly in comparison to the existing 3D vector radix FFT algorithms as well as algorithms that are based on row-column decomposition.

Abstract: We propose a 3D split vector-radix decimation-in-frequency (DIF) FFT algorithm for computing the 3D DFT, based on a mixture of radix-(2×2×2) and radix-(4×4×4) index maps. It is shown that the proposed algorithm reduces the computational complexity significantly in comparison to the existing 3D vector radix FFT algorithms as well as algorithms that are based on row-column decomposition. In addition, since the proposed algorithm is expressed in a simple matrix form using the Kronecker product, it facilitates easy software or hardware implementation of the algorithm.

Journal Article•

Design and Implementation of High Throughput FFT Processor

Xie Ying

01 Jan 2004-Journal of Computer Research and Development

TL;DR: A parallel architecture for the implementation of the radix-4 and mixed-radix FFT algorithm is presented, and the dedicated parallel memory mapping algorithm, which features minimal memory size, relies on the in-place calculation property of the FFT algorithm.

Abstract: A parallel architecture for the implementation of the radix-4 and mixed-radix FFT algorithm is presented. The dedicated parallel memory mapping algorithm, which features minimal memory size, relies on the in-place calculation property of the FFT algorithm and can simultaneously access all the data needed for the calculation of each butterfly. The address generation of twiddle factors needs only simple operations in this algorithm. The hardware complexity of the butterfly processor is reduced by using a 3-real-multiplier algorithm for the complex multiplier. The processor can be configured for transforms of length N, where N is a power of two. The implementation is on an Altera chip EP200K400E using Altera Quartus II 2.0. Operating at an 89 MHz clock frequency, the processor computes a complex 1024-point FFT within 14.1 μs and a 4096-point FFT within 67 μs.
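The abstract does not say which three-multiplier scheme the processor uses. As an illustration only, here is one common way to form a complex product with three real multiplications and five real additions instead of four multiplications and two additions; the function name is hypothetical.

```python
def complex_mul_3(a, b, c, d):
    """Compute (a + ib) * (c + id) with three real multiplications.

    Returns the pair (real part, imaginary part).
    """
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    # k1 - k3 = ac - bd, k1 + k2 = ad + bc
    return k1 - k3, k1 + k2
```

When (c + id) is a fixed twiddle factor, the sums d - c and c + d can be precomputed, so the cost per butterfly is three multiplications and three additions.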

Journal Article•DOI•

Automatic Performance Tuning for Fast Fourier Transforms

Dragan Mirkovic¹, Lennart Johnsson¹•Institutions (1)

University of Houston¹

01 Feb 2004

TL;DR: This paper discusses architecture-specific performance tuning for fast Fourier transforms (FFTs) implemented in the UHFFT library, an adaptive and portable software library for FFTs developed by the authors.

Abstract: In this paper we discuss architecture-specific performance tuning for fast Fourier transforms (FFTs) implemented in the UHFFT library. The UHFFT library is an adaptive and portable software library for FFTs developed by the authors. We present the optimization methods used at different levels, starting with the algorithm selection used for the library code generation and ending with the actual implementation and specification of the appropriate compiler optimization options. We report on the performance results for several modern microprocessor architectures.

Book Chapter•DOI•

Direct Solver Based on FFT and SEL for Diffraction Problems with Distribution

Hideyuki Koshigoe¹•Institutions (1)

Chiba University¹

06 Jun 2004

TL;DR: The numerical algorithm by use of SEL is improved with FFT and the calculation speed is faster than the previous one and the limit function of approximate solutions satisfied the diffraction problem in the sense of distribution.

Abstract: A direct solver for diffraction problems is presented in this paper. The solver is based on the fast Fourier transform (FFT) and the successive elimination of lines which we call SEL. In the previous paper, we showed the numerical algorithm by use of SEL and proved that the limit function of approximate solutions satisfied the diffraction problem in the sense of distribution. In this paper, the above numerical algorithm is improved with FFT and we show that the calculation speed is faster than the previous one.

Journal Article•DOI•

Split vector-radix-2/8 2-D fast Fourier transform

Soo-Chang Pei¹, Wei-Yu Chen¹•Institutions (1)

National Taiwan University¹

19 Apr 2004-IEEE Signal Processing Letters

TL;DR: This letter presents an efficient split vector-radix-2/8 fast Fourier transform (FFT) algorithm that saves 14% of real multiplications and has much lower arithmetic complexity than the split vector-radix-2/4 FFT algorithm.

Abstract: This letter presents an efficient split vector-radix-2/8 fast Fourier transform (FFT) algorithm. The split vector-radix-2/8 FFT algorithm saves 14% real multiplications and has much lower arithmetic complexity than the split vector-radix-2/4 FFT algorithm. Moreover, this algorithm reduces 25% data loads and stores compared with the split vector-radix-2/4 FFT algorithm.

Proceedings Article•DOI•

Error analysis and complexity optimization for the multiplier-less FFT-like transformation (ML-FFT)

K. M. Tsui¹, S.C. Chan¹, K.W. Tse¹•Institutions (1)

University of Hong Kong¹

23 May 2004

TL;DR: The effect of the signal round-off errors on the accuracies of the multiplier-less fast Fourier transform-like transformation (ML-FFT) is studied.

Abstract: This paper studies the effect of the signal round-off errors on the accuracies of the multiplier-less fast Fourier transform-like transformation (ML-FFT). The idea of the ML-FFT is to parameterize the twiddle factors in the conventional FFT algorithm as certain rotation-like matrices and approximate the associated parameters inside these matrices by the sum-of-power-of-two (SOPOT) or canonical signed digits (CSD) representations. The error due to the SOPOT approximations is called the coefficient round-off error. Apart from this error, signal round-off error also occurs because of insufficient wordlengths. Using a recursive noise model of these errors, the minimum hardware to realize the ML-FFT subject to the prescribed output bit accuracy can be obtained using a random search algorithm. A design example is given to demonstrate the effectiveness of the proposed approach.
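A minimal sketch of what a sum-of-power-of-two (SOPOT) approximation of a single coefficient can look like. It uses a simple greedy choice of signed powers of two, which is an assumption made for illustration; the paper's parameterization of twiddle factors as rotation-like matrices and its random search over word lengths are not reproduced. The function name and the `min_exp` limit are hypothetical.

```python
import math

def sopot_approx(value, num_terms, min_exp=-16):
    """Greedily approximate value by at most num_terms signed powers of two.

    Returns (approximation, [(sign, exponent), ...]).
    """
    approx = 0.0
    terms = []
    for _ in range(num_terms):
        residual = value - approx
        if residual == 0.0:
            break
        # Nearest power of two to the residual on a log scale, limited below.
        e = max(round(math.log2(abs(residual))), min_exp)
        sign = 1 if residual > 0 else -1
        approx += sign * 2.0 ** e
        terms.append((sign, e))
    return approx, terms
```

Each term corresponds to a shift and an add or subtract in hardware, so a multiplication by the approximated coefficient needs no multiplier. The gap between `value` and the returned approximation is the coefficient round-off error the abstract refers to.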

A New Radix-2/8 FFT Algorithm for Length-q×2^m DFTs

Saad Bouguezel, M. Omair Ahmad, Mallappa Kumara Swamy

01 Jan 2004