
Showing papers on "Cyclotomic fast Fourier transform" published in 1987


Journal ArticleDOI
TL;DR: A new implementation of the real-valued split-radix FFT is presented, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT.
Abstract: This tutorial paper describes the methods for constructing fast algorithms for the computation of the discrete Fourier transform (DFT) of a real-valued series. The application of these ideas to all the major fast Fourier transform (FFT) algorithms is discussed, and the various algorithms are compared. We present a new implementation of the real-valued split-radix FFT, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT. We also compare the performance of inherently real-valued transform algorithms such as the fast Hartley transform (FHT) and the fast cosine transform (FCT) to real-valued FFT algorithms for the computation of power spectra and cyclic convolutions. Comparisons of these techniques reveal that the alternative techniques always require more additions than a method based on a real-valued FFT algorithm and result in computer code of equal or greater length and complexity.

489 citations
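
The abstract above concerns FFTs that exploit real-valued input. As a rough illustration of that general idea (not the split-radix implementation the paper presents), the sketch below computes the DFT of a real sequence of even length N from a single N/2-point complex DFT. The function names `dft` and `real_dft_via_half_length` are hypothetical, and the naive O(N^2) `dft` merely stands in for whatever complex FFT routine would be used in practice.

```python
import cmath


def dft(x):
    """Naive O(N^2) DFT; a stand-in for any complex FFT routine."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def real_dft_via_half_length(x):
    """DFT of a real sequence of even length N using one N/2-point complex DFT."""
    n = len(x)
    if n % 2 != 0:
        raise ValueError("length must be even")
    half = n // 2
    # Pack even-indexed samples into the real part, odd-indexed into the imaginary part.
    z = [complex(x[2 * m], x[2 * m + 1]) for m in range(half)]
    Z = dft(z)
    X = [0j] * n
    for k in range(half):
        zc = Z[(half - k) % half].conjugate()
        even = (Z[k] + zc) / 2      # DFT of the even-indexed samples
        odd = (Z[k] - zc) / 2j      # DFT of the odd-indexed samples
        w = cmath.exp(-2j * cmath.pi * k / n)
        X[k] = even + w * odd       # one radix-2 butterfly recombines the halves
        X[k + half] = even - w * odd
    return X
```

The separation step relies on the conjugate symmetry of the transforms of real sequences, which is the same property that real-valued FFT algorithms use to avoid redundant arithmetic.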


Book
01 Jan 1987
TL;DR: This book discusses the Discrete Fourier Transform (DFT) and a few applications of the DFT, as well as some of the techniques used in real sequences and the Real DFT.
Abstract: Preface 1. Introduction. A Bit of History An Application Problems 2. The Discrete Fourier Transform (DFT). Introduction DFT Approximation to the Fourier Transform The DFT-IDFT pair DFT Approximations to Fourier Series Coefficients The DFT from Trigonometric Approximation Transforming a Spike Train Limiting Forms of the DFT-IDFT Pair Problems 3. Properties of the DFT. Alternate Forms for the DFT Basic Properties of the DFT Other Properties of the DFT A Few Practical Considerations Analytical DFTs Problems 4. Symmetric DFTs. Introduction Real sequences and the Real DFT (RDFT) Even Sequences and the Discrete Cosine Transform (DCT) Odd Sequences and the Discrete Sine Transform (DST) Computing Symmetric DFTs Notes Problems 5. Multi-dimensional DFTs. Introduction Two-dimensional DFTs Geometry of Two-Dimensional Modes Computing Multi-Dimensional DFTs Symmetric DFTs in Two Dimensions Problems 6. Errors in the DFT. Introduction Periodic, Band-limited Input Periodic, Non-band-limited Input Replication and the Poisson Summation Formula Input with Compact Support General Band-Limited Functions General Input Errors in the Inverse DFT DFT Interpolation - Mean Square Error Notes and References Problems 7. A Few Applications of the DFT. Difference Equations - Boundary Value Problems Digital Filtering of Signals FK Migration of Seismic Data Image Reconstruction from Projections Problems 8. Related Transforms. Introduction The Laplace Transform The z- Transform The Chebyshev Transform Orthogonal Polynomial Transforms The Discrete Hartley Transform (DHT) Problems 9. Quadrature and the DFT. Introduction The DFT and the Trapezoid Rule Higher Order Quadrature Rules Problems 10. The Fast Fourier Transform (FFT). Introduction Splitting Methods Index Expansions (One ---> Multi-dimensional) Matrix Factorizations Prime Factor and Convolution Methods FFT Performance Notes Problems Glossary of (Frequently and Consistently Used) Notations References.

354 citations


Journal ArticleDOI
TL;DR: The complexity gains of a new algorithm derived from Clifford's theorem are discussed and new upper bounds, also for the complexity of the underlying group algebras, are derived.

47 citations


Journal ArticleDOI
01 Feb 1987
TL;DR: A three-dimensional (3-D) Discrete Fourier Transform (DFT) algorithm for real data using the one-dimensional Fast Hartley Transform (FHT) is introduced that is simpler and retains the speed advantage that is characteristic of the Hartley approach.
Abstract: A three-dimensional (3-D) Discrete Fourier Transform (DFT) algorithm for real data using the one-dimensional Fast Hartley Transform (FHT) is introduced. It requires the same number of one-dimensional transforms as a direct FFT approach but is simpler and retains the speed advantage that is characteristic of the Hartley approach. The method utilizes a decomposition of the cas function kernel of the Hartley transform to obtain a temporary transform, which is then corrected by some additions to yield the 3-D DFT. A Fortran subroutine is available on request.

45 citations
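
The abstract above obtains a DFT from Hartley transforms followed by a few corrective additions. The sketch below shows only the one-dimensional version of that relationship, not the paper's 3-D decomposition of the cas kernel: given the discrete Hartley transform H of a real sequence, the DFT follows from sums and differences of H[k] and H[N-k]. The names `dht` and `dft_from_dht` are hypothetical, and the direct O(N^2) `dht` is a placeholder for a fast Hartley transform.

```python
import math


def dht(x):
    """Direct O(N^2) discrete Hartley transform using the cas kernel cos + sin."""
    n = len(x)
    return [
        sum(
            x[m] * (math.cos(2 * math.pi * k * m / n) + math.sin(2 * math.pi * k * m / n))
            for m in range(n)
        )
        for k in range(n)
    ]


def dft_from_dht(h):
    """Recover the DFT of a real sequence from its Hartley transform h."""
    n = len(h)
    out = []
    for k in range(n):
        hk = h[k]
        hnk = h[(n - k) % n]
        # Even part of H gives Re F[k]; odd part of H gives -Im F[k].
        out.append(complex((hk + hnk) / 2, -(hk - hnk) / 2))
    return out
```

Only additions and halvings are needed in the conversion step, which is why a Hartley-based route can stay entirely in real arithmetic until the final recombination.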


Journal ArticleDOI
R.C. Agarwal, J.W. Cooley
01 Sep 1987
TL;DR: The algorithm formulation and implementation described here not only achieves full vector utilization but successfully copes with the problems of hierarchical storage.
Abstract: A number of previous attempts at the vectorization of the fast Fourier transform (FFT) algorithm have fallen somewhat short of achieving the full potential speed of vector processors. The algorithm formulation and implementation described here not only achieves full vector utilization but successfully copes with the problems of hierarchical storage. In the present paper, these techniques are described and extended to the general mixed radix algorithms, prime factor algorithm (PFA), the multidimensional discrete Fourier transform (DFT), the rectangular transform convolution algorithms, and the Winograd fast Fourier transform algorithm. Some of the methods were used in the Engineering Scientific Subroutine Library for the IBM 3090 Vector Facility. Using this approach, very good and consistent performance was obtained over a very wide range of transform lengths.

42 citations


Patent
14 Sep 1987
TL;DR: A fast Fourier transform circuit is described, including an illustrative radix-eight DFT kernel that operates on an n-bit-serial data format for efficient serial-like, pipelined operation within the DFT.
Abstract: A fast Fourier transform circuit, including an illustrative radix-eight discrete Fourier transform (DFT) kernel that operates on an n-bit-serial data format, for an efficient serial-like, pipelined operation within the DFT. The circuit performs a four-point DFT on half of the input data words at a time, stores intermediate results from the four-point DFT in a commutation stage, then combines the intermediate results in two two-point DFTs. Internal multiplication in the eight-point DFT is effected in delay registers that also serve to store the intermediate results, thereby providing an economy of timing and circuit routing. Interleaving and deinterleaving operations convert the data format between three-bit-serial and conventional bit-parallel used outside the eight-point DFT kernel, which may therefore be easily cascaded for more complex FFT operations. The DFT kernel also includes means for selectively bypassing butterfly computation modules to perform shorter-length DFTs.

30 citations


Journal ArticleDOI
01 Apr 1987
TL;DR: The computational method uses the split-radix algorithm, which requires the fewest operations among Hartley algorithms, and its results are compared with those obtained using the fast Fourier transform.
Abstract: The use of the fast Hartley transform for fast discrete interpolation is considered. The computational method uses the split-radix algorithm, which requires the fewest operations among Hartley algorithms. Results from this method are compared with those obtained using the fast Fourier transform.

27 citations


Journal ArticleDOI
TL;DR: A range and error analysis is developed for a discrete Fourier transform computed using the ring of cyclotomic integers, and derivations of both deterministic and statistical upper bounds for the range are presented.
Abstract: A range and error analysis is developed for a discrete Fourier transform (fast Fourier transform) computed using the ring of cyclotomic integers. Included are derivations of both deterministic and statistical upper bounds for the range of the resulting processor and formulas for the ratio of the mean square error to mean square signal, in terms of the pertinent parameters. Comparisons of theoretical predictions with empirical results are also presented.

22 citations


Proceedings ArticleDOI
06 Apr 1987
TL;DR: An algorithm for evaluating the Discrete Fourier Transform at a particular output frequency is derived using a technique called summation by parts (SBP), which is shown to reduce the number of multiplications and the number of bits per multiplicative coefficient needed to implement the DFT.
Abstract: An algorithm for evaluating the Discrete Fourier Transform (DFT) at a particular output frequency is derived using a technique called summation by parts (SBP). This technique is shown to reduce the number of multiplications and the number of bits per multiplicative coefficient needed to implement the DFT. For many transform lengths, only two one-bit multiplications or simple memory shifts are needed to implement the DFT. When the DFT length is prime, an SBP algorithm designed for a fixed output frequency index can be used to evaluate the DFT at any other non-zero output frequency index simply by appropriately changing the order of the input sequence.

22 citations



01 Nov 1987
TL;DR: This thesis investigates several aspects of parallel Fast Fourier Transform implementation techniques for distributed-memory message-passing systems such as hypercube multiprocessors, and reports excellent speedup for new symmetric-transform algorithms implemented on the Intel iPSC hypercube.
Abstract: The Fast Fourier Transform appears frequently in scientific computing. Therefore it is desirable to implement it efficiently on parallel computers. In this thesis, we investigate several different aspects of parallel Fast Fourier Transform implementation techniques for distributed-memory message-passing systems such as hypercube multiprocessors. We describe various Fast Fourier Transform algorithms using a matrix notation. An error analysis is presented that considers the effect of different methods used in the computation of the Fourier Transform coefficients as well as accumulated roundoff. New implementations of one and two-dimensional Fast Fourier Transforms are presented along with comparisons with existing methods. New algorithms for symmetric transforms are also developed and the results show excellent speedup when implemented on the Intel iPSC hypercube.

DOI
01 Sep 1987
TL;DR: The new shuffling algorithm is established and proved for the general case and it is evident that this algorithm can be applied directly to many different implementations of the prime-factor DFT, including a pipeline VLSI implementation.
Abstract: The fast prime-factor discrete Fourier transform (DFT) algorithm [2-4] can be used to reduce the number of basic cells when the transform length of a DFT is not a highly composite number. This pipeline prime-factor DFT can be realised by carrying out a one- to r-dimensional index mapping as (r - 1) one- to two-dimensional mappings. Hence, only the one- to two-dimensional shuffling algorithm is needed to realise a practical prime-factor DFT. Recently, it was shown that a new algorithm can be developed to perform the shuffling operations needed to implement the above-mentioned one- to two-dimensional mapping for a prime-factor DFT. In the paper, the new shuffling algorithm is established and proved for the general case. It is evident that this algorithm can be applied directly to many different implementations of the prime-factor DFT, including a pipeline VLSI implementation.
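
The one- to two-dimensional mapping mentioned in the abstract above can be illustrated with the classical Good index mapping for coprime factors. This is a minimal sketch under that assumption and is not the paper's new shuffling algorithm; the names `dft` and `pfa_dft` are hypothetical, and the naive `dft` stands in for dedicated small-N modules.

```python
import cmath
from math import gcd


def dft(x):
    """Naive O(N^2) DFT used for the short row and column transforms."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def pfa_dft(x, n1, n2):
    """Prime-factor DFT of length n1 * n2 for coprime n1 and n2 (no twiddle factors)."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with n1 and n2 coprime")
    # Input shuffle: sample index (n2*i + n1*j) mod n goes to grid position (i, j).
    grid = [[x[(n2 * i + n1 * j) % n] for j in range(n2)] for i in range(n1)]
    # Length-n2 DFT along each row.
    rows = [dft(row) for row in grid]
    # Length-n1 DFT along each column; cols[k2][k1] holds the 2-D transform value.
    cols = [dft([rows[i][j] for i in range(n1)]) for j in range(n2)]
    # Output unshuffle via the Chinese remainder theorem: k maps to (k mod n1, k mod n2).
    return [cols[k % n2][k % n1] for k in range(n)]
```

Because the two factors are coprime, the exponent n*k/N splits exactly into a row term and a column term, so the result is a true two-dimensional DFT with no intermediate twiddle multiplications. For example, `pfa_dft(x, 3, 5)` handles a 15-point input.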

Journal ArticleDOI
01 Aug 1987
TL;DR: The DFT algorithm is rewritten as a matrix based algorithm and mapped onto a 2-dimensional systolic array processor, which features nearest neighbour interconnections and results of DFT computations are pipelined out directly in the correct order.
Abstract: A scheme for computing the discrete Fourier transform (DFT) on a 2-dimensional systolic array processor is presented. The DFT algorithm is rewritten as a matrix based algorithm and mapped onto a 2-dimensional systolic array processor. The significance of this approach is that the total time required to complete an N-point DFT is 3√(N) + N time units (assuming that it takes one time unit to operate on data in a processor element (PE)); the architecture features nearest neighbour interconnections (as opposed to spatially global interconnections); all PEs in the 2-dimensional systolic array processor are busy (as opposed to some processor arrays in which half of the PEs are idle while the other half are busy); results of DFT computations are pipelined out directly in the correct order (as opposed to some processors which require bit-reversal operation or data commutation during the computational process); and the matrix-matrix multiplication and the diagonal elements of the matrix-matrix-matrix product can be computed systolically on the 2-dimensional processor.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: Split vector radix is used to develop a 2D fast Fourier transform algorithm that is performed "in-place" and requires no matrix transpose operation; a saving of about 30% in complex multiplications could be obtained for a typical 1024 × 1024 array.
Abstract: Split vector radix is used to develop a 2D fast Fourier transform algorithm that is performed "in-place" and requires no matrix transpose operation. This method greatly improves on the conventional vector radix 2D FFT: an overall saving of about 30% in complex multiplications could be obtained for a typical 1024 × 1024 array.

Journal ArticleDOI
01 Jun 1987
TL;DR: In the paper the construction of nesting and row-column prime factor discrete Fourier transform (DFT) algorithms using recently developed small-N DFT modules is investigated and how to construct algorithms requiring theoretically less than 2N real-valued multipliers is shown.
Abstract: In the paper the construction of nesting and row-column prime factor discrete Fourier transform (DFT) algorithms using recently developed small-N DFT modules is investigated. For nesting algorithms it is shown how to construct algorithms requiring theoretically less than 2N real-valued multipliers, hence attaining probably the theoretical lower limit on the number of multiplications. If practical polynomial product algorithms are used, the DFT algorithms require less than O(N log N) multiplications, but more than O(N log N) additions. It is shown, however, that for N - 216 the total numbers of arithmetical operations for these algorithms are similar to those for the best common factor fast Fourier transforms. The structures of nesting algorithms are composed of a greater number of stages than row-column ones, hence their realisations often require a greater number of nonarithmetical operations. The paper also contains some remarks on the construction of row-column prime factor DFT algorithms requiring fewer arithmetical operations than the best FFTs.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: All members of this family of Discrete Fourier transforms are shown to be equivalent asymptotically to the Karhunen-Loève transform of an arbitrary wide sense stationary process.
Abstract: An extended family of Discrete Fourier transforms is introduced. These transforms, which may be implemented by using FFTs, allow the computation of pseudo-cyclic convolutions by multiplication in the transform domain. The choice of a suitable transform (DFT 1/4) or the combined use of two complementary transforms allows a fast and efficient computation of aperiodic convolutions of waveforms of duration N by using N-point transforms that require no zero padding. Finally, all members of this family are shown to be equivalent asymptotically to the Karhunen-Loève transform of an arbitrary wide sense stationary process.

Journal ArticleDOI
A. Gerheim, J. Stoughton
TL;DR: This correspondence investigates the symmetry and sparseness of the Walsh gain matrices and an efficient sparse matrix algorithm is used to calculate the Walsh gain matrix.
Abstract: Zarowski and Yunik [1] demonstrated that an FIR filter can be realized with fewer multiplications in the fast Walsh transform (FWT) domain than in the fast Fourier transform (FFT) domain for some transform lengths. This correspondence investigates the symmetry and sparseness of the Walsh gain matrices. An efficient sparse matrix algorithm is used to calculate the Walsh gain matrix.

01 Jun 1987
TL;DR: In this article, the authors investigated the use of another real transform, the Discrete Hartley transform (DHT), for adaptive system estimation and adaptive echo cancelling, and showed that the DHT performs better than other real transforms under most circumstances.
Abstract: The least mean square (LMS) algorithm is the most often used real-time adaptive filtering algorithm due to its computational simplicity and remarkably good fit to the optimal Wiener solution. There have been many transform domain algorithms proposed for improving the convergence rate of the LMS algorithm, the most popular of which has been the Discrete Fourier Transform (DFT). However, the DFT requires complex arithmetic and thus has proven computationally undesirable for applications involving only real signals. A number of unitary, real transforms have been proposed as less costly replacements for the DFT. These include the Discrete Cosine Transform (DCT), the Discrete Walsh Hadamard Transform (WHT), and the Power of Two Transform (PO2). Each of these in some way exhibits a property necessary to speed the convergence rate, at a lower computational cost than the DFT. The work investigates the use of another real transform, the Discrete Hartley transform (DHT), for adaptive system estimation and adaptive echo cancelling. It is shown that the DHT performs better than these other real transforms under most circumstances. Its relationship to the DFT is such that it can be transformed into the DFT with simple algebraic manipulation. Keywords: Adaptive filters; Orthogonal transforms.

Journal ArticleDOI
TL;DR: The present paper proposes three new algorithms for the reduction of the input-output (I-O) operations and it is proven that the reduction achieved is of a logarithmic order of magnitude.