Journal Article

An algorithm for computing the mixed radix fast Fourier transform

R. Singleton
01 Jun 1969 - IEEE Transactions on Audio and Electroacoustics (IEEE) - Vol. 17, Iss: 2, pp 93-103
TL;DR: This paper presents an algorithm for computing the fast Fourier transform, based on a method proposed by Cooley and Tukey, and includes an efficient method for permuting the results in place.
Abstract: This paper presents an algorithm for computing the fast Fourier transform, based on a method proposed by Cooley and Tukey. As in their algorithm, the dimension n of the transform is factored (if possible), and n/p elementary transforms of dimension p are computed for each factor p of n. An improved method of computing a transform step corresponding to an odd factor of n is given; with this method, the number of complex multiplications for an elementary transform of dimension p is reduced from (p-1)^{2} to (p-1)^{2}/4 for odd p. The fast Fourier transform, when computed in place, requires a final permutation step to arrange the results in normal order. This algorithm includes an efficient method for permuting the results in place. The algorithm is described mathematically and illustrated by a FORTRAN subroutine.
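The factoring step described in the abstract can be illustrated with a minimal Python sketch. It is not Singleton's FORTRAN subroutine: it is a plain recursive, out-of-place decimation-in-time transform, it uses the straightforward O(p^2) elementary transform rather than the improved odd-factor method, and it needs no final permutation because it does not work in place. The function names and the forward sign convention exp(-2*pi*i*j*k/n) are assumptions made for illustration.

```python
import cmath


def smallest_factor(n):
    """Return the smallest prime factor of n (n itself when n is prime)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


def mixed_radix_fft(x):
    """Recursive mixed-radix DFT of a list of length n >= 1 (illustrative only)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    p = smallest_factor(n)
    m = n // p
    # p sub-transforms of length m over the interleaved subsequences x[r], x[r+p], ...
    subs = [mixed_radix_fft(x[r::p]) for r in range(p)]
    out = [0j] * n
    for k in range(m):
        # Twiddle factors: rotate the r-th sub-transform by exp(-2*pi*i*r*k/n).
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(p)]
        # Elementary transform of dimension p; there are m = n/p of these per factor.
        for q in range(p):
            out[k + q * m] = sum(
                t[r] * cmath.exp(-2j * cmath.pi * r * q / p) for r in range(p)
            )
    return out
```

For a prime length the sketch degenerates to the direct O(n^2) sum, which is the case the abstract's "if possible" alludes to.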
Citations
Journal Article
24 Jan 2005
TL;DR: It is shown that such an approach can yield an implementation of the discrete Fourier transform that is competitive with hand-optimized libraries, and the software structure that makes the current FFTW3 version flexible and adaptive is described.
Abstract: FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific single-instruction, multiple-data (SIMD) instructions, and how a special-purpose compiler can derive optimized implementations of the discrete cosine and sine transforms automatically from a DFT algorithm.

5,172 citations


Cites background from "An algorithm for computing the mixe..."

  • ...In addition to FFTW v. 3.0.1, the other codes benchmarked are as follows (some for only one precision or machine): arprec, “four-step” FFT implementation [18] (from the C++ ARPREC library, 2002); cxml, the vendor-tuned Compaq Extended Math Library on Alpha; fftpack, the Fortran library from [22]; green, free code by J. Green (C, 1998); mkl, the Intel Math Kernel Library v. 6.1 (DFTI interface) on the Pentium IV; ipps, the Intel Integrated Performance Primitives, Signal Processing, v. 3.0 on the Pentium IV; numerical recipes, the C routine from [31]; ooura, a free code by T. Ooura (C and Fortran, 2001); singleton, a Fortran FFT [32]; sorensen, a split-radix FFT [33]; takahashi, the FFTE Fig....


Proceedings Article
12 May 1998
TL;DR: FFTW is an adaptive FFT program that tunes the computation automatically for any particular hardware; tests show that its self-optimizing approach usually yields significantly better performance than all other publicly available software.
Abstract: FFT literature has been mostly concerned with minimizing the number of floating-point operations performed by an algorithm. Unfortunately, on present-day microprocessors this measure is far less important than it used to be, and interactions with the processor pipeline and the memory hierarchy have a larger impact on performance. Consequently, one must know the details of a computer architecture in order to design a fast algorithm. In this paper, we propose an adaptive FFT program that tunes the computation automatically for any particular hardware. We compared our program, called FFTW, with over 40 implementations of the FFT on 7 machines. Our tests show that FFTW's self-optimizing approach usually yields significantly better performance than all other publicly available software. FFTW also compares favorably with machine-specific, vendor-optimized libraries.

1,824 citations


Cites methods from "An algorithm for computing the mixe..."

  • ...They include the Sun Performance Library version 1.2 (SUNPERF); public-domain code by T. Ooura (Fortran, 1996), J. Green (C, 1996), and R. H. Krukar (C, 1990); the Fortran FFTPACK library [11]; a Fortran split-radix FFT by Sorensen [12]; a Fortran FFT by Singleton [13]; Temperton’s Fortran GPFA code [14]; Bailey’s “4-step” FFT implementation [15]; Sitton’s QFT code [16]; and the four1 routine from [17] (NRF)....

  • ...[Figure 4: Comparison of double precision 1D complex FFTs on a Sun HPC 5000 (167MHz UltraSPARC-I). Axes: speed in “MFLOPS” versus array size (2 to 262144). Series: FFTW, SUNPERF, Ooura, Green, FFTPACK, Sorensen, Krukar, Singleton, Temperton, Bailey, QFT, NRF.]...

  • ...[13] R. C. Singleton, “An algorithm for computing the mixed radix fast Fourier transform,” IEEE Transactions on Audio and Electroacoustics, vol. AU-17, pp. 93–103, June 1969....


Journal Article
TL;DR: IMAGIC's novel angular reconstitution approach allows for the rapid determination of three-dimensional structures of uncrystallized molecules to high resolution.

1,281 citations


Cites methods from "An algorithm for computing the mixe..."

  • ...We use the Singleton mixed-radix FFT algorithm (Singleton 1969) for efficiency....


Journal Article
TL;DR: A concise user guide to the Alloy Theoretic Automated Toolkit (ATAT) is presented, outlining the steps required to obtain thermodynamic information from ab initio calculations.
Abstract: Although the formalism that allows the calculation of alloy thermodynamic properties from first-principles has been known for decades, its practical implementation has so far remained a tedious process. The Alloy Theoretic Automated Toolkit (ATAT) drastically simplifies this procedure by implementing decision rules based on formal statistical analysis that free researchers from constant monitoring during the calculation process and automatically “glue” together the input and the output of various codes, in order to provide a high-level interface to the calculation of alloy thermodynamic properties from first-principles. ATAT implements the Structure Inversion Method (SIM), also known as the Connolly-Williams method, in combination with semi-grand-canonical Monte Carlo simulations. In order to make this powerful toolkit available to the wide community of researchers who could benefit from it, this article presents a concise user guide outlining the steps required to obtain thermodynamic information from ab initio calculations.

1,001 citations


Cites background from "An algorithm for computing the mixe..."

  • ...[19] R....


  • ...Singleton in 1968 [19], later converted to C and subsequently improved by Mark Olesen and John Beale in 1995....


Book
01 Feb 2010
TL;DR: This book covers observation techniques, the description and statistics of ocean waves, linear wave theory for oceanic and coastal waters, and the SWAN wave model.
Abstract: 1. Introduction 2. Observation techniques 3. Description of ocean waves 4. Statistics 5. Linear wave theory (oceanic waters) 6. Waves in oceanic waters 7. Linear wave theory (coastal waters) 8. Waves in coastal waters 9. The SWAN wave model Appendices References Index.

874 citations

References
Journal Article
TL;DR: Good generalized these methods and gave elegant algorithms for which one class of applications is the calculation of Fourier series, applicable to certain problems in which one must multiply an N-vector by an N × N matrix which can be factored into m sparse matrices.
Abstract: An efficient method for the calculation of the interactions of a 2^{m} factorial experiment was introduced by Yates and is widely known by his name. The generalization to 3^{m} was given by Box et al. (1). Good (2) generalized these methods and gave elegant algorithms for which one class of applications is the calculation of Fourier series. In their full generality, Good's methods are applicable to certain problems in which one must multiply an N-vector by an N × N matrix which can be factored into m sparse matrices, where m is proportional to log N. This results in a procedure requiring a number of operations proportional to N log N rather than N^{2}. These methods are applied here to the calculation of complex Fourier series. They are useful in situations where the number of data points is, or can be chosen to be, a highly composite number. The algorithm is here derived and presented in a rather different form. Attention is given to the choice of N. It is also shown how special advantage can be obtained in the use of a binary computer with N = 2^{m} and how the entire calculation can be performed within the array of N data storage locations used for the given Fourier coefficients. Consider the problem of calculating the complex Fourier series X(j) = \sum_{k=0}^{N-1} A(k) \cdot W^{jk}, j = 0, 1, ..., N-1. (1)
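The operation count quoted in this abstract can be made concrete with a short worked derivation for a two-factor case. It is a sketch under stated assumptions, not a quotation from the paper: N = r_1 r_2, W is an N-th root of unity so that W^N = 1, and the index names j_0, j_1, k_0, k_1 and the intermediate array A_1 are introduced here for illustration.

```latex
% Assumptions: N = r_1 r_2 and W^N = 1; j_0, j_1, k_0, k_1 and A_1 are illustrative names.
\begin{align*}
j &= j_1 r_1 + j_0, \qquad j_0 = 0,\dots,r_1-1, \quad j_1 = 0,\dots,r_2-1,\\
k &= k_1 r_2 + k_0, \qquad k_0 = 0,\dots,r_2-1, \quad k_1 = 0,\dots,r_1-1,\\
W^{jk} &= W^{j_1 k_1 r_1 r_2}\, W^{j_0 k_1 r_2}\, W^{j k_0}
        = W^{j_0 k_1 r_2}\, W^{j k_0},\\
A_1(j_0, k_0) &= \sum_{k_1=0}^{r_1-1} A(k_1 r_2 + k_0)\, W^{j_0 k_1 r_2},\\
X(j_1 r_1 + j_0) &= \sum_{k_0=0}^{r_2-1} A_1(j_0, k_0)\, W^{(j_1 r_1 + j_0) k_0}.
\end{align*}
```

The array A_1 has N entries, each a sum of r_1 terms, and X has N entries, each a sum of r_2 terms, so the two steps cost about N(r_1 + r_2) operations instead of N^2. Repeating the split over m factors gives N(r_1 + ... + r_m), which for r_i = 2 is 2N log_2 N.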

11,795 citations

Proceedings Article
07 Nov 1966
TL;DR: The "Fast Fourier Transform" has had a major effect on several areas of computing, the most striking example being techniques of numerical convolution, which have been completely revolutionized.
Abstract: The "Fast Fourier Transform" has now been widely known for about a year. During that time it has had a major effect on several areas of computing, the most striking example being techniques of numerical convolution, which have been completely revolutionized. What exactly is the "Fast Fourier Transform"?

493 citations

Journal Article
TL;DR: The fast Fourier transform algorithm is briefly reviewed, and fast difference equation methods for accurately computing the needed trigonometric function values are given. The problem of computing a large Fourier transform on a system with virtual memory is considered, and a solution is proposed.
Abstract: ...and have shown major time savings in using it to compute large transforms on a digital computer. With n a power of two, computing time for this algorithm is proportional to n log_{2} n, a major improvement over other methods with computing time proportional to n^{2}. In this paper, the fast Fourier transform algorithm is briefly reviewed and fast difference equation methods for accurately computing the needed trigonometric function values are given. The problem of computing a large Fourier transform on a system with virtual memory is considered, and a solution is proposed. This method has been used to compute complex Fourier transforms of size n = 2^{16} on a computer with 2^{15} words of core storage; this exceeds by a factor of eight the maximum radix two transform size with fixed allocation of this amount of core storage. The method has also been used to compute large mixed radix transforms. A scaling plan for computing the fast Fourier transform with fixed-point arithmetic is also given.
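A difference-equation scheme of the kind this abstract mentions can be sketched in Python. The recurrence below is one commonly used form, based on the angle-addition identities with 1 - cos(theta) written as 2 sin^2(theta/2); whether it matches the paper's exact formulation is an assumption, and the function name trig_table is hypothetical.

```python
import math


def trig_table(n):
    """Return [(cos(k*theta), sin(k*theta)) for k in 0..n-1], theta = 2*pi/n.

    Only two library sine calls are made; every further value comes from
    the recurrence, which adds a small correction to the previous value.
    """
    theta = 2.0 * math.pi / n
    cd = 2.0 * math.sin(0.5 * theta) ** 2  # equals 1 - cos(theta)
    sd = math.sin(theta)
    c, s = 1.0, 0.0
    table = []
    for _ in range(n):
        table.append((c, s))
        # Simultaneous update: both right-hand sides use the old c and s.
        c, s = c - (cd * c + sd * s), s + (sd * c - cd * s)
    return table
```

Writing the update as "old value plus a small correction" avoids subtracting nearly equal numbers when theta is small, which is the usual motivation for this form.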

142 citations

Journal Article
TL;DR: The base 8 algorithms described in this paper allow one to perform as many base 8 iterations as possible and then finish the computation by performing a base 4 or a base 2 iteration if one is required, which preserves the versatility of the base 2 algorithm while attaining the computational advantage of the base 8 algorithm.
Abstract: 1. Introduction. Cooley and Tukey stated in their original paper [1] that the Fast Fourier Transform algorithm is formally most efficient when the number of samples in a record can be expressed as a power of 3 (i.e., N = 3^{m}), and further that there is little efficiency lost by using N = 2^{m} or N = 4^{m}. Later, however, it was recognized that the symmetries of the sine and cosine weighting functions made the base 4 algorithms more efficient than either the base 2 or the base 3 algorithms [2], [3]. Making use of this observation, Gentleman and Sande have constructed an algorithm which performs as many iterations of the transform as possible in a base 4 mode, and then, if required, performs the last iteration in a base 2 mode. Although this "4 + 2" algorithm is more efficient than base 2 algorithms, it is now apparent that the techniques used by Gentleman and Sande can be profitably carried one step further to an even more efficient, base 8 algorithm. The base 8 algorithms described in this paper allow one to perform as many base 8 iterations as possible and then finish the computation by performing a base 4 or a base 2 iteration if one is required. This combination preserves the versatility of the base 2 algorithm while attaining the computational advantage of the base 8 algorithm.
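The iteration plan described in this abstract (as many base 8 passes as possible, then one base 4 or base 2 pass if needed) amounts to simple arithmetic on the exponent of N. The following Python sketch only computes the sequence of radices for a power-of-two length; the function name radix_plan is hypothetical, and the sketch does not perform the transform itself.

```python
def radix_plan(n):
    """Return the list of radices for a power-of-two transform length n.

    Uses base 8 for as many passes as possible, then a single base 4 or
    base 2 pass to cover whatever power of two is left over.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    m = n.bit_length() - 1   # n == 2**m
    plan = [8] * (m // 3)    # each base 8 pass consumes three bits of the index
    leftover = m % 3         # 0, 1 or 2 bits remain
    if leftover:
        plan.append(1 << leftover)  # 2 for one bit, 4 for two bits
    return plan
```

For instance, a length of 2^{10} needs three base 8 passes and one base 2 pass, whereas a pure base 2 algorithm would need ten passes.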

88 citations

Journal Article
TL;DR: The following procedures are based on the Cooley-Tukey algorithm for computing the finite Fourier transform of a complex data vector; the dimension of the data vector is assumed here to be a power of two.
Abstract: The following procedures are based on the Cooley-Tukey algorithm [1] for computing the finite Fourier transform of a complex data vector; the dimension of the data vector is assumed here to be a power of two. Procedure COMPLEXTRANSFORM computes either the complex Fourier transform or its inverse. Procedure REALTRANSFORM computes either the Fourier coefficients of a sequence of real data points or evaluates a Fourier series with given cosine and sine coefficients. The number of arithmetic operations for either procedure is proportional to n log_{2} n, where n is the number of data points.

32 citations