
Showing papers on "Prime-factor FFT algorithm" published in 1988


Journal ArticleDOI
TL;DR: It is shown that the number of distinct N-point DFTs needed to calculate N*N-point two-dimensional DFTs is equal to the number of linear congruences spanning the N*N grid.
Abstract: An algorithm is presented for computation of the two-dimensional discrete Fourier transform (DFT). The algorithm is based on geometric properties of the integers and exhibits symmetry and simplicity of realization. Only one-dimensional transformation of the input data is required. The transformations are independent; hence, parallel processing is feasible. It is shown that the number of distinct N-point DFTs needed to calculate N*N-point two-dimensional DFTs is equal to the number of linear congruences spanning the N*N grid. Examples for N=3, N=4, and N=10 are presented. A short APL code illustrating the algorithm is given.

76 citations


Journal ArticleDOI
TL;DR: An algorithm for computing the Fourier transform over any finite field GF(p^m) is introduced that requires only O(n (log n)^2 / 4) additions and the same number of multiplications for an n-point transform, and in some fields allows a further reduction of the number of multiplications to O(n log n).
Abstract: The Fourier transform over finite fields is mainly required in the encoding and decoding of Reed-Solomon and BCH codes. An algorithm for computing the Fourier transform over any finite field GF(p^m) is introduced. It requires only O(n (log n)^2 / 4) additions and the same number of multiplications for an n-point transform and allows in some fields a further reduction of the number of multiplications to O(n log n). Because of its highly regular structure, this algorithm can be easily implemented in VLSI technology.

56 citations


Journal ArticleDOI
Z.-J. Mou1, Pierre Duhamel1
TL;DR: Methodologies for constructing fast algorithms to compute the discrete Fourier transform of a 2-D real sequence are introduced, and the resulting algorithms are shown to be in-place and butterfly-style, like the usual 1-D FFT algorithms.
Abstract: Methodologies for constructing fast algorithms to compute the discrete Fourier transform (DFT) of a 2-D real sequence are introduced. The resulting algorithms are shown to be in-place and butterfly-style, like the usual 1-D FFT algorithms. Above all, the computational load of these algorithms is reduced to less than one-half of that of their complex counterparts. Due to the in-place property, the storage requirement is exactly halved. A comparison is made on the basis of arithmetic complexity, storage, and input/output requirements.

40 citations


Proceedings ArticleDOI
07 Jun 1988
TL;DR: An efficient method for computing the discrete Fourier transform when only a few output points are needed is described, based on a novel factorization of the DFT, where one part is computed using standard power-of-two FFTs and the other uses a technique similar to the Goertzel algorithm.
Abstract: The authors describe an efficient method for computing the discrete Fourier transform (DFT) when only a few output points are needed. The method is shown to be more efficient than either Goertzel's method or pruning, and it allows any band in the output to be computed. It is based on a novel factorization of the DFT, where one part is computed using standard power-of-two FFTs (fast Fourier transforms) and the other uses a technique similar to the Goertzel algorithm.

37 citations
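
For orientation, the Goertzel recurrence that the abstract above uses as a baseline can be sketched as follows. This is a minimal illustration of the classical single-bin method only, not the factorization proposed in the paper; the function name `goertzel` and its arguments are chosen here for illustration.

```python
import cmath
import math


def goertzel(x, k):
    """Return the k-th DFT bin of the sequence x via Goertzel's recurrence."""
    n = len(x)
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s1 = 0.0  # s[m-1]
    s2 = 0.0  # s[m-2]
    for sample in x:
        # Second-order recurrence: s[m] = x[m] + 2 cos(w) s[m-1] - s[m-2]
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # After the last sample, X[k] = e^{jw} s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s1 - s2
```

Each bin costs O(N) operations, so the recurrence is attractive only when a handful of output points are needed; for many bins a full FFT is cheaper.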


Journal ArticleDOI
C.S. Burrus1
TL;DR: This result has practical importance because it allows a single software or hardware bit-reverse counter to unscramble the more efficient radix-4, radix-8, and mixed-radix FFTs.
Abstract: Several methods are reviewed for removing the unscrambler in the prime factor algorithms (PFA), and the types of unscrambler necessary for the Cooley-Tukey FFT (fast Fourier transform) are discussed. It is shown that a radix-4, radix-8, radix-16, or any radix-2^m FFT can be written to give the output in the same bit-reversed order as the radix-2 FFT. This applies to programs which mix radix-8, radix-4, and radix-2 stages to have the high efficiency of radix-8 and radix-4 and the variety of lengths of radix-2. In a more general form, the method will allow a radix-16 FFT to give its output in the same order as a radix-4 FFT. The method can be used with radix-2^m Hartley, cosine, sine, number-theoretic, and special real-data transforms. This result has practical importance because it allows a single software or hardware bit-reverse counter to unscramble the more efficient radix-4, radix-8, and mixed-radix FFTs.

36 citations


Journal ArticleDOI
TL;DR: An analytic three-dimensional image-reconstruction algorithm that can utilize the cross-plane gamma rays detected by a wide solid-angle PET (positron-emission tomography) system is presented; it does not use Fourier transform methods, although mathematical equivalence to Fourier transform methods is proved.
Abstract: An analytic three-dimensional image-reconstruction algorithm that can utilize the cross-plane gamma rays detected by a wide solid-angle PET (positron-emission tomography) system is presented. Unlike current analytic algorithms, it does not use Fourier transform methods, although mathematical equivalence to Fourier transform methods is proved. The results of implementing the algorithm are briefly discussed. An extension of the algorithm to utilize all measured cross-plane gamma rays is discussed.

35 citations


Journal ArticleDOI
TL;DR: This paper discusses the two‐dimensional implementation of a number of modified fast Fourier transform algorithms that efficiently interpolate (zoom) magnetic resonance (MR) images, and a significant reduction in computation time is achieved when modeling is combined with 2D‐BSDF and SIFFT as fewer points require modeling.
Abstract: This paper discusses the two-dimensional implementation of a number of modified fast Fourier transform (FFT) algorithms that efficiently interpolate (zoom) magnetic resonance (MR) images. If the original image was sampled at a rate satisfying the Nyquist criterion, these algorithms would effectively increase the sampling rate, permitting image details to be more easily discerned. The Skinner interpolating fast Fourier transform (SIFFT) avoids many of the computationally unnecessary complex multiplications that occur when interpolating using the normal fast Fourier transform algorithm. The novel interpolating fast Fourier transform (NIFFT) offers further savings when a subimage is required. Theoretical and experimental timings that compare the use of the normal FFT, SIFFT, and NIFFT algorithms for interpolation are given using magnetic resonance image reconstruction examples. Time savings of a factor of 2 to 4 are possible in typical experimental situations. Time savings of factors of 5 to 20 are possible when zooming images using two-dimensional band selectable digital filtering (2D-BSDF) in combination with decimation and the SIFFT algorithm. In 2D-BSDF, the original MRI data set is reduced in size to retain only those frequency components corresponding to a desired subimage, thereby decreasing the computational load associated with further processing. A significant reduction in computation time is achieved when modeling is combined with 2D-BSDF and SIFFT, as fewer points require modeling. © 1988 Academic Press, Inc.

32 citations


Proceedings ArticleDOI
07 Jun 1988
TL;DR: The authors present a systolic circuit for computing a fast algorithm performing the discrete Hartley transform (DHT), which appears to be regular and, therefore, very attractive for VLSI realizations.
Abstract: The authors present a systolic circuit for computing a fast algorithm performing the discrete Hartley transform (DHT). The proposed architecture employs a systolic elevator concept and CORDIC processors. The elevator assures local communications in the proposed algorithm, and the CORDIC processor makes it possible to enhance processing speed and exploit parallelism. The architecture appears to be regular and, therefore, very attractive for VLSI realizations. The computational cost necessary for computing the DFT (discrete Fourier transform) is also discussed with respect to other architectures.

26 citations
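
As a reference point for the transform named in the abstract above, the DHT can be written directly from its definition, H[k] = sum over m of x[m] (cos(2*pi*m*k/N) + sin(2*pi*m*k/N)). The sketch below is a plain O(N^2) evaluation for real input, not the systolic or CORDIC architecture the authors describe; the name `dht` is chosen here for illustration.

```python
import math


def dht(x):
    """Discrete Hartley transform of a real sequence, straight from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for m in range(n):
            angle = 2.0 * math.pi * m * k / n
            # cas(angle) = cos(angle) + sin(angle) is the Hartley kernel
            acc += x[m] * (math.cos(angle) + math.sin(angle))
        out.append(acc)
    return out
```

The kernel is real and the transform is its own inverse up to a factor of N, so applying `dht` twice and dividing by `len(x)` recovers the input.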


Journal ArticleDOI
TL;DR: The split-radix algorithm has fewer multiplies than the radix-8 Cooley-Tukey algorithm, and many fewer additions than the minimum-multiply algorithms, and is advantageous for hardware in which a multiplier/accumulator is the basic processor.
Abstract: An algorithm for computing length-2^M discrete Fourier transforms (DFTs), called the split-radix FFT, has recently been developed. The split-radix algorithm has fewer multiplies than the radix-8 Cooley-Tukey algorithm, and many fewer additions than the minimum-multiply algorithms. It is shown that it involves significantly more butterfly computations than the radix-4 Cooley-Tukey algorithms which have butterflies of similar complexity. Consequently, the split-radix algorithm is advantageous for hardware in which a multiplier/accumulator is the basic processor, as might be the case with some VLSI implementations. In addition, the split-radix algorithm has varying numbers of butterflies in successive stages, complicating the design of efficient multiprocessor implementations. A few simple strategies for balancing the computational load among the stages are considered, and their average efficiencies are computed.

25 citations


Journal ArticleDOI
01 Feb 1988
TL;DR: A new algorithm designed for large, single transforms is presented, which employs a pair of multiple transforms to perform the single transform.
Abstract: The Fast Fourier Transform algorithm does not readily lend itself to efficient implementation on vector computers, especially on machines where sequential access is important. Several authors have commented that the efficiency of computation is much improved if many transforms are performed simultaneously. We present a new algorithm designed for large, single transforms, which employs a pair of multiple transforms to perform the single transform. The merits of the algorithm are discussed with reference to its implementation on a CDC CYBER 205.

25 citations


Journal ArticleDOI
01 Jan 1988
TL;DR: The implementation on the CRAY-1 of a prime factor FFT algorithm, which adapts some recent algorithmic developments to a vector-processing scientific computer, is described, and it is shown that worthwhile gains are obtained in both speed and storage requirements.
Abstract: Recent developments in algorithm design have made the Fast Fourier Transform even faster. We describe the implementation on the CRAY-1 of a prime factor FFT algorithm which adapts some of these developments to a vector-processing scientific computer. Comparative times are given for the new and old versions of the FFT algorithm, applied to the problem of performing multiple simultaneous complex transforms. It is shown that worthwhile gains are obtained in both speed and storage requirements. The new algorithm is also vectorizable in the more difficult case of a single transform. Finally, we use timing measurements of the new routine to estimate the value on the CRAY-1 of Hockney's parameter n_{1/2}, which characterizes a computer in terms of its apparent degree of parallelism.
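
Since the prime factor algorithm is the topic of this listing, a minimal sketch of the underlying Good-Thomas index mapping may help. It assumes a length N = n1 * n2 with coprime factors and uses a naive DFT for the short transforms; it illustrates the mapping only and says nothing about the vectorized CRAY-1 implementation described above. The names `dft` and `pfa_dft` are chosen here for illustration, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
import cmath
from math import gcd


def dft(x):
    """Naive O(N^2) DFT, used here for the short transforms and as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def pfa_dft(x, n1, n2):
    """Length n1*n2 DFT via the prime factor (Good-Thomas) index mapping."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1 and n2")
    # Input map: n = (n2 * i + n1 * j) mod N fills an n1-by-n2 grid.
    grid = [[x[(n2 * i + n1 * j) % n] for j in range(n2)] for i in range(n1)]
    # Length-n2 DFT along every row, then length-n1 DFT along every column.
    # No twiddle factors are needed between the two stages.
    grid = [dft(row) for row in grid]
    cols = [dft([grid[i][j] for i in range(n1)]) for j in range(n2)]
    # Output map from the Chinese remainder theorem:
    # k = (n2 * inv(n2 mod n1) * k1 + n1 * inv(n1 mod n2) * k2) mod N.
    a = pow(n2, -1, n1)
    b = pow(n1, -1, n2)
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[(n2 * a * k1 + n1 * b * k2) % n] = cols[k2][k1]
    return out
```

For example, `pfa_dft(list(range(15)), 3, 5)` should agree with `dft(list(range(15)))` up to rounding. The absence of inter-stage twiddle multiplications is what distinguishes this mapping from the Cooley-Tukey decomposition.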

Proceedings ArticleDOI
11 Apr 1988
TL;DR: A method for computing the STFFT is described which utilizes the redundancy between FFTs at consecutive times to reduce the number of operations required, and which is more efficient than the regular FFT method for up to very large time differences.
Abstract: The short-time fast Fourier transform (STFFT) is an important tool in speech and other time-varying signal processing areas. The STFFT is often required to be performed in real time, which makes it important that computationally efficient methods exist. Traditionally the STFFT has been computed by repeated application of the fast Fourier transform (FFT) at consecutive time instances. This method does not utilize the redundancies that exist between the FFTs at different times, which, especially for small time differences, are quite large. A method for computing the STFFT is described which utilizes the redundancy to reduce the number of operations required to compute the STFFT. The reduction depends on the length of the STFFT and the time difference between FFTs. This method is more efficient than the regular FFT method for up to very large time differences.
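
The redundancy between transforms of overlapping windows is easiest to see for a time difference of one sample, where the textbook sliding-DFT recurrence X_{t+1}[k] = (X_t[k] - x[t] + x[t+N]) * e^{j*2*pi*k/N} applies. The sketch below shows that recurrence only; it is not claimed to be the method of the paper above, whose details the abstract does not give. The names `dft` and `sliding_dft` are chosen here for illustration.

```python
import cmath


def dft(x):
    """Naive O(N^2) DFT used to initialise the first window."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def sliding_dft(signal, n):
    """Return the DFT of every length-n window of signal, hop size 1."""
    twiddle = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    spec = dft(signal[:n])
    frames = [list(spec)]
    for t in range(len(signal) - n):
        # Drop the oldest sample, add the newest, then rotate each bin.
        delta = signal[t + n] - signal[t]
        spec = [(spec[k] + delta) * twiddle[k] for k in range(n)]
        frames.append(list(spec))
    return frames
```

Each update costs O(N) instead of the O(N log N) of a fresh FFT, but rounding errors accumulate over long runs, so practical implementations usually re-initialise periodically.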

Book ChapterDOI
01 Jan 1988
TL;DR: The multiplicative complexity of the discrete Fourier transform (DFT) is analyzed for the cases where the number of inputs is prime, a power of an odd prime, a power of two, and finally any positive integer.
Abstract: In this chapter the multiplicative complexity of the discrete Fourier transform (DFT) is analyzed. The next several sections define the DFT and then show how the complexity of the DFT is determined when the number of inputs is prime, a power of an odd prime, a power of two, and finally for any positive integer.

Journal ArticleDOI
TL;DR: In this article, the authors conducted a floating-point error analysis of three algorithms commonly used for the calculation of two-dimensional fast Fourier transforms (FFTs), namely, the conventional row-column FFT, the vector-radix FFT, and the polynomial-transform FFT.
Abstract: A floating-point error analysis is conducted for three algorithms commonly used for the calculation of two-dimensional fast Fourier transforms (FFTs), namely, the conventional row-column FFT, the vector-radix FFT, and the polynomial-transform FFT. The respective errors are determined both analytically and on the basis of computer simulation. Comparison shows that the vector-radix FFT and the polynomial-transform FFT, even though computationally more efficient than the row-column FFT, show approximately the same (and sometimes reduced) susceptibility to errors in floating-point arithmetic.

Journal ArticleDOI
TL;DR: The design procedure for the redesign of modules for performing small discrete Fourier transforms with an optional “rotation” of the results is described and the new operation counts are compared with those for Winograd's DFT modules.

Proceedings ArticleDOI
O.K. Ersoy1, N.C. Hu1
11 Apr 1988
TL;DR: The fast real Fourier transform (FRFT) algorithms discussed are the radix-2 decimation-in-time (DIT), the radix-4 DIT, the split-radix DIT, the split-radix DIF, the prime factor, and the Winograd FRFT algorithm.
Abstract: Fast algorithms for the computation of the real discrete Fourier transform (RDFT) are discussed. Implementations based on the RDFT are always efficient, whereas the implementations based on the DFT are efficient only when signals to be processed are complex. The fast real Fourier transform (FRFT) algorithms discussed are the radix-2 decimation-in-time (DIT), the radix-4 DIT, the split-radix DIT, the split-radix DIF, the prime factor, and the Winograd FRFT algorithm.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: An algorithm designed to improve results for separating two voices simultaneously recorded on a single channel is presented and a prime factor fast Fourier transform has been developed.
Abstract: An algorithm designed to improve results for separating two voices simultaneously recorded on a single channel is presented. A variable frame size orthogonal transform and a spectral matching technique are used. A multistep pitch detection scheme is proposed which includes a traditional autocorrelation function, a modified autocorrelation, the average magnitude difference function, and a look-forward and look-backward double checking scheme. The orthogonal transforms utilized include the fast Fourier transform and the fast triangular transform. For a variable frame size transform, a prime factor fast Fourier transform has been developed. The execution of the process is automated and implemented on the IBM-PC, VAX 8650, and HP 9000. Intelligibility tests using simple quantitative measures have been performed on the separated signals. An extension of the problem to the three-speaker case is reported.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: The reordering effect of the fast Fourier transform, which requires that the elements of the data array be permuted by bit-reversing the array index, is considered, and a closed-form expression is derived for the largest index that must be bit-reversed.
Abstract: The reordering effect of the fast Fourier transform is considered, which requires that the elements of the data array be permuted by bit-reversing the array index. The bit-reversal algorithm given by B. Gold and C.M. Rader (1969) is referred to. Several improvements are made to this algorithm that result in improved efficiency. A closed-form expression is derived for the largest index that must be bit-reversed. A computational analysis is given, comparing the original and modified algorithms.
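
The permutation discussed in the abstract above can be illustrated with a straightforward in-place bit-reversal for a power-of-two length. This is a plain sketch of the permutation itself, not the Gold-Rader algorithm or the improved version the paper proposes; the function names are chosen here for illustration.

```python
def reverse_bits(i, bits):
    """Reverse the lowest `bits` bits of the non-negative integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def bit_reverse_permute(data):
    """Permute data in place into bit-reversed index order and return it."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    for i in range(n):
        j = reverse_bits(i, bits)
        # Swap each pair once; indices that are their own reversal stay put.
        if i < j:
            data[i], data[j] = data[j], data[i]
    return data
```

For a length-8 array, `bit_reverse_permute(list(range(8)))` gives `[0, 4, 2, 6, 1, 5, 3, 7]`. Because only indices with `i < j` trigger a swap, many indices never need their reversal acted upon, which is the kind of saving a bound on the largest index to reverse can exploit.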

Journal ArticleDOI
TL;DR: A new fast Fourier transform algorithm for real or half-complex input data is described, based on the decomposition of N into mutually prime factors, which performs transforms in-place and without pre- or post-reordering of the data.

Proceedings ArticleDOI
20 Mar 1988
TL;DR: In this paper, the authors examined the assumptions underlying the DFT and FFT and the effect of violating them when the FFT algorithm is applied to power system load variation, and recommended the use of a DFT algorithm to evaluate the frequency spectrum of power system loads.
Abstract: The discrete Fourier transform (DFT) and the fast Fourier transform (FFT) are based on certain assumptions that must be understood and satisfied, or misleading results will be obtained. The authors examine these assumptions and qualitatively analyze the effect of these dangers when the FFT algorithm is applied to power system load variation. They recommend the use of a DFT algorithm to evaluate the frequency spectrum of power system load variation.

Journal ArticleDOI
TL;DR: The Hartley Transform not only decreases the computer time of the WDF but also simplifies the convolution of two WDFs, which is used here to simulate a blurred image and its restoration.

Journal ArticleDOI
K. Nakayama1
TL;DR: An improved FFT (fast Fourier transform) algorithm combining both decimations in frequency and in time is presented, and stress is placed on the derivation of general formulas for submatrices and multiplicands.
Abstract: An improved FFT (fast Fourier transform) algorithm combining both decimations in frequency and in time is presented. Stress is placed on a derivation of general formulas for submatrices and multiplicands. Computational efficiency is briefly discussed.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: A concept for a fixed point FFT (fast Fourier transform) error analysis is explained which allows a rather fast, simple and comprehensive numerical evaluation.
Abstract: A concept for a fixed-point FFT (fast Fourier transform) error analysis is explained which allows a rather fast, simple and comprehensive numerical evaluation. This avoids time-consuming simulations or cumbersome theoretical derivations of the specific FFT length and structure for any change of the input signal or for any changes in the method of scaling and wordlength reduction. From the results obtained, conclusions are drawn for reducing fixed-point errors with little or no additional effort.

Journal ArticleDOI
TL;DR: This paper shows that the DFT of a sequence with prime length, P, can be computed efficiently, for selected P's, using number theoretic transforms; the proposed technique is denoted FFT/NTT.
Abstract: Many fast algorithms have been proposed for computing the discrete Fourier transformation. Most of them are based on factorization with the goal of reducing the number of multiplications. They use floating point arithmetic to avoid repetitious scaling and a sizeable wordlength to minimize quantization errors. This paper shows that the DFT of a sequence with prime length, P, can be computed efficiently, for selected P's, using number theoretic transforms. The proposed technique, denoted as FFT/NTT, is illustrated for p = 2M + 1. Advantages include availability of fast algorithms for a set of prime lengths, residue arithmetic with benefit in speed and hardware costs, parallel implementation with short wordlengths through the use of the Chinese remainder theorem, and exact computation except for scaling and round off for the input array and the trigonometric sequences.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: A systolic-network architecture for the computation of the FFT is presented, and one-chip VLSI design for an AT^2-optimal fast Fourier transform (FFT) shuffle-exchange architecture is considered.
Abstract: One-chip VLSI design for an AT^2-optimal fast Fourier transform (FFT) shuffle-exchange architecture is considered, and a systolic-network architecture for the computation of the FFT is presented. This architecture has the same asymptotically optimal theoretical O(N^2 log^2 N) AT^2 complexity as the FFT shuffle-exchange architecture, but is more suitable for one-chip VLSI design. Architectures which are feasible for a one-chip FFT design, as well as for shuffle-exchange-type fast discrete orthogonal transforms such as the generalized transform, cosine transform, and slant transform, are also discussed.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: The implementation of several FFT (fast Fourier transform) algorithms on the TMS320C30, the third-generation device in the Texas Instruments family of digital signal processors, permits flexible and compact coding of the algorithms in assembly language while preserving close correspondence to a high-level language implementation.
Abstract: The implementation of several FFT (fast Fourier transform) algorithms on the TMS320C30, the third-generation device in the Texas Instruments family of digital signal processors, is reported. The algorithms considered are the complex radix-2 and radix-4, and real-valued radix-2 FFT. The architecture and the instruction set of the TMS320C30 permit flexible and compact coding of the algorithms in assembly language while preserving close correspondence to a high-level language implementation. The efficiency of the architecture and the speed of the device (60 ns) make possible the realization of a 1024-point FFT in 3.75 ms (complex, radix-2), 3.04 ms (complex, radix-4), or 1.67 ms (real, radix-2).

Proceedings ArticleDOI
21 Jan 1988
TL;DR: Mapping of Fast Fourier Transform, Haar Transform and Hadamard Transform algorithms onto a small, two-dimensional, mesh-connected array of processors makes it possible to reduce significantly the number of memory locations needed for the constants.
Abstract: The paper discusses mapping of Fast Fourier Transform (FFT), Haar Transform and Hadamard Transform algorithms onto a small, two-dimensional, mesh-connected array of processors. The FFT algorithm is an in-place, decimation-in-frequency, Cooley-Tukey algorithm in radix-2 and radix-4 versions applied to multidimensional, complex inputs. The data flow of the algorithms has been implemented on the array using an efficient, regular data transfer pattern, uniform for all the algorithms. The inputs and constants used in the algorithms are prestored in the local memories of the processors. The mapping makes it possible to reduce significantly the number of memory locations needed for the constants. A partitioning scheme has been developed for the algorithms which allows us to execute them with inputs of arbitrary size on a small processor array. Also, an algorithm has been proposed for the processor array which efficiently unscrambles the bit-reversed output of the FFT algorithm. The processors of the array have East, West, North, South interconnections with their nearest neighbors. The local memory of the processors is small, on the order of hundreds of locations. The processors are controlled in Single Instruction Multiple Data Stream (SIMD) mode and can be selectively disabled using simple masks, consisting of combinations of rows or columns.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: Simulations indicate that when a sufficient number of bits is used to quantize the coefficients, the probability of detection does not significantly degrade, and an empirical formula for the error is derived.
Abstract: Detection of a sinusoid of unknown frequency in wideband noise is performed efficiently by the FFT (fast Fourier transform). The detector performs a hypothesis test on the magnitude of the FFT output. When the FFT is implemented, error due to arithmetic roundoff and coefficient quantization limits the accuracy of the transform and degrades the detection performance. When the FFT is used as a detector of an unknown sinusoidal signal, the coefficient quantization error is significant and increases with the FFT length. The decimation-in-time radix-2 FFT is analyzed. The FFT output error is defined to be the maximum magnitude of the difference between the true FFT and the FFT computed with the quantized coefficients. An upper bound on the error is derived by a deterministic analysis and is verified to be close to the actually measured error. Using the functional form of the bound and scaling it to fit the measured error, an empirical formula for the error is derived. The probability of detection of the quantized-coefficient FFT is computed using the empirical error formula. Probability-of-detection curves are presented as a function of the FFT length. Simulations indicate that when a sufficient number of bits is used to quantize the coefficients, the probability of detection does not significantly degrade.

Journal ArticleDOI
TL;DR: A comparison of the characteristics of the DFT and the FFT shows that, whereas the latter is clearly advantageous in terms of calculation time, it does not allow for precise localization of the spectral lines, which can be overcome by interpolating with Spline functions.
Abstract: Biological rhythms are often studied by complementary explorations in the temporal and the frequency domains. This provides a means of investigating purely frequential features such as periods and phases, as well as those related to the shape of the curve. A comparison of the characteristics of the DFT (Discrete Fourier Transform) and the FFT (Fast Fourier Transform) shows that, whereas the latter is clearly advantageous in terms of calculation time, it does not allow for precise localization of the spectral lines. This drawback can be overcome by interpolating with Spline functions. The determination of Splines associated with an FFT is detailed for microcomputer application. The algorithm proposed here is then tested on a concrete example of measurement of biological rhythms of activity on an inbred mouse.

Journal ArticleDOI
TL;DR: This paper models a computation by the well-known pebble game on directed acyclic graphs and shows that the way in which connections are made between subgraphs of a graph exerts a nontrivial effect on the time-space tradeoff behavior.
Abstract: Computations with embedded Fast Fourier Transforms (FFT) have many important applications which include high-speed polynomial multiplication. In this paper, we discuss the time-space tradeoff behavior of such computations. In particular, we examine the way compositions of FFT graphs with one another and with other graphs affect this behavior. We model a computation by the well-known pebble game on directed acyclic graphs. We are able to show that the way in which connections are made between subgraphs of a graph exerts a nontrivial effect on the time-space tradeoff behavior. The graphs studied correspond to the problem of finding the maximum of a vector produced by the FFT algorithm, and to problems containing back-to-back FFTs such as polynomial multiplication. For the first problem, we show that different ways of expressing the FFT algorithm lead to very different time-space tradeoff behaviors. Our results for a polynomial multiplication algorithm based on back-to-back FFTs allow us to conclude that it ...