
Showing papers on "Prime-factor FFT algorithm" published in 1993


Journal ArticleDOI
TL;DR: In this paper, a transform decomposition algorithm was proposed to reduce the number of operations required to compute the discrete Fourier transform (DFT) when the number of input and output data points differ.
Abstract: Ways of efficiently computing the discrete Fourier transform (DFT) when the number of input and output data points differ are discussed. The two problems, in which either the length of the input sequence or the length of the output sequence is reduced, are found to be duals of each other, and the same methods can, to a large extent, be used to solve both. The algorithms utilize the redundancy in the input or output to reduce the number of operations below those of the fast Fourier transform (FFT) algorithms. The usual pruning method is discussed, and an efficient algorithm, called transform decomposition, is introduced. It is based on a mixture of a standard FFT algorithm and the Horner polynomial evaluation scheme equivalent to the one in Goertzel's algorithm. It requires fewer operations and is more flexible than pruning. The algorithm works for power-of-two and prime-factor algorithms, as well as for real-input data.

256 citations
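
The abstract above refers to the Horner-style evaluation used in Goertzel's algorithm. As a rough illustration only, the sketch below computes a single DFT bin with the standard Goertzel recurrence; it is not the paper's transform decomposition method, and the function name `goertzel_bin` is a hypothetical choice made for this example.

```python
import cmath
import math


def goertzel_bin(x, k):
    """Return DFT bin X[k] of the sequence x via Goertzel's recurrence.

    Illustrative sketch: costs one real multiplication per sample for
    real input, which is why it suits cases where few outputs are needed.
    """
    n = len(x)
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in x:
        # Second-order recurrence: s[t] = x[t] + 2cos(w) s[t-1] - s[t-2]
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Closing step: X[k] = e^{jw} s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s_prev - s_prev2
```

When only a handful of output bins are required, calling such a routine per bin can be cheaper than a full FFT; the crossover point depends on the transform length and is not quantified here.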


Journal ArticleDOI
02 May 1993
TL;DR: This paper presents a new method for computing the configuration-space map of obstacles that is used in motion-planning algorithms, and is particularly promising for workspaces with many and/or complicated obstacles, or when the shape of the robot is not simple.
Abstract: This paper presents a new method for computing the configuration-space map of obstacles that is used in motion-planning algorithms. The method derives from the observation that, when the robot is a rigid object that can only translate, the configuration space is a convolution of the workspace and the robot. This convolution is computed with the use of the fast Fourier transform (FFT) algorithm. The method is particularly promising for workspaces with many and/or complicated obstacles, or when the shape of the robot is not simple. It is an inherently parallel method that can significantly benefit from existing experience and hardware on the FFT.

129 citations
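
The observation that the configuration space of a translating robot is a convolution of workspace and robot can be sketched in one dimension. The example below is a simplified illustration under stated assumptions (a 1-D occupancy grid, a robot referenced at its leftmost cell, and a small recursive radix-2 FFT written for this example); the paper itself works with higher-dimensional maps, and the names `fft` and `cspace_obstacles_1d` are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    The inverse transform is left unscaled (caller divides by n).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def cspace_obstacles_1d(workspace, robot):
    """Mark each robot position p where the robot overlaps an obstacle.

    workspace and robot are 0/1 occupancy lists. Position p is blocked
    when sum_i workspace[p + i] * robot[i] > 0.
    """
    n = 1
    while n < len(workspace) + len(robot):
        n *= 2
    w = fft([complex(v) for v in workspace] + [0j] * (n - len(workspace)))
    # Reversing the robot turns the convolution into the needed correlation.
    flipped = [complex(v) for v in reversed(robot)]
    r = fft(flipped + [0j] * (n - len(robot)))
    conv = [v.real / n for v in fft([p * q for p, q in zip(w, r)], invert=True)]
    m = len(robot)
    # conv[p + m - 1] equals the overlap count at position p.
    return [conv[p + m - 1] > 0.5 for p in range(len(workspace) - m + 1)]
```

For a workspace `[0, 0, 0, 1, 0, 0, 0, 0]` and a two-cell robot `[1, 1]`, positions 2 and 3 are the ones flagged as colliding.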


Journal ArticleDOI
TL;DR: An efficient recursive algorithm for computing the time-varying Fourier transform (TVFT) or short-time Fourier transform (STFT) of a time sequence is presented; instead of excluding the old samples, their importance is diminished by using all-pole moving windows.
Abstract: An efficient recursive algorithm for computing the time-varying Fourier transform (TVFT) or short-time Fourier transform (STFT) of a time sequence is presented. In this approach, instead of excluding the old samples, their importance is diminished by using all-pole moving windows. This recursive algorithm requires about one half of the computation and storage of Amin's algorithm. The resulting TVFT does not possess any sidelobes. The performance of the algorithm is illustrated by two numerical examples.

41 citations
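
To make the idea of an all-pole moving window concrete, the sketch below uses the simplest such window, a single-pole exponential window with weights `decay**m` on the sample `m` steps in the past. This is an assumption chosen for illustration and is not the specific window or recursion of the paper; the function name and parameters are hypothetical.

```python
import cmath


def recursive_stft(x, num_bins, decay):
    """Recursive STFT with a one-pole exponential window.

    Frame n, bin k holds sum_m decay**m * x[n-m] * exp(-2j*pi*k*(n-m)/num_bins),
    updated with one multiply-add per bin per sample:
        X_n[k] = decay * X_{n-1}[k] + x[n] * exp(-2j*pi*k*n/num_bins)
    """
    states = [0j] * num_bins
    frames = []
    for n, sample in enumerate(x):
        for k in range(num_bins):
            phase = cmath.exp(-2j * cmath.pi * k * n / num_bins)
            # Old samples are never dropped; each step scales them by decay.
            states[k] = decay * states[k] + sample * phase
        frames.append(list(states))
    return frames
```

Because old samples fade gradually rather than being cut off by a finite window edge, no subtraction of outgoing samples is needed.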


Journal ArticleDOI
TL;DR: Two new fast discrete cosine transform computation algorithms are presented that are superior to the conventional radix-3 algorithm: they require fewer multiplications per point and provide a wider choice of the sequence length for which the DCT can be realized.
Abstract: Two new fast discrete cosine transform computation algorithms are presented: a radix-3 and a radix-6 algorithm. These two new algorithms are superior to the conventional radix-3 algorithm as they (i) have lower computational complexity in terms of the number of multiplications per point, (ii) provide a wider choice of the sequence length for which the DCT can be realized, and (iii) support the prime factor-decomposed computation algorithm to realize the (2^m 3^n)-point DCT. Furthermore, a mixed-radix algorithm is also proposed such that an optimal performance can be achieved by applying the proposed radix-3 and radix-6 and the well-developed radix-2 decomposition techniques in a proper sequence.

38 citations


Patent
05 Aug 1993
TL;DR: The Fast Fourier Transform (FFT) processor includes a plurality of pipelined, functionally identical stages, each stage adapted to perform a portion of an FFT operation on a block of data.
Abstract: The Fast Fourier Transform (FFT) processor includes a plurality of pipelined, functionally identical stages, each stage adapted to perform a portion of an FFT operation on a block of data. The output of the last stage of the processor is the high-precision Fast Fourier Transform of the data block. Support functions are included at each stage. Thus, each stage includes a computational element and a buffer memory interface. Each stage also includes apparatus for coefficient generation.

38 citations


Journal ArticleDOI
TL;DR: It is shown that for the fine-frequency estimator a good method is to fit a Gaussian function to the fast-Fourier-transform (FFT) peak and its two neighbors, which achieves a frequency standard deviation and a bias on the order of only a few percent of a bin.
Abstract: In the case of a single sinusoid or multiple well-separated sinusoids, a coarse estimator consisting of a windowed Fourier transform followed by a fine estimator which is an interpolator is a good approximation to an optimal frequency acquisition and measurement algorithm. The design tradeoffs are described. It is shown that for the fine-frequency estimator a good method is to fit a Gaussian function to the fast-Fourier-transform (FFT) peak and its two neighbors. This method achieves a frequency standard deviation and a bias on the order of only a few percent of a bin. In the case of short-time stationarity, for a moderate number of averages and for an adaptive threshold detector, only between 0.5 and 1 dB is lost when averaging is traded off for FFT length, in contrast to the asymptotic result of 1.5 dB. The COSPAS-SARSAT satellite system for emergency detection and localization is used to illustrate the concepts. The algorithm is analyzed theoretically, and good agreement is found with test results.

29 citations
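
Fitting a Gaussian to the FFT peak and its two neighbors amounts to fitting a parabola to the logarithms of the three magnitudes. The sketch below shows that generic three-point interpolation; it assumes strictly positive magnitudes and a peak that is not at either end of the spectrum, and the function names are hypothetical rather than taken from the paper.

```python
import math


def gaussian_peak_offset(left, peak, right):
    """Fractional-bin offset of a Gaussian through three magnitudes.

    A Gaussian is a parabola in the log domain, so the vertex of the
    parabola through (−1, ln left), (0, ln peak), (1, ln right) is used.
    """
    la, lb, lc = math.log(left), math.log(peak), math.log(right)
    return 0.5 * (la - lc) / (la - 2.0 * lb + lc)


def refine_peak_bin(magnitudes):
    """Return the interpolated (fractional) index of the largest interior bin."""
    k = max(range(1, len(magnitudes) - 1), key=lambda i: magnitudes[i])
    return k + gaussian_peak_offset(
        magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]
    )
```

Multiplying the returned fractional bin by the bin spacing (sample rate divided by FFT length) gives a frequency estimate. How close the windowed spectral peak is to a true Gaussian depends on the window used, which this sketch does not address.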


Journal ArticleDOI
TL;DR: This work describes and gives timing results for a radix-4 version that is implemented on the RS/6000 workstation, and presents a set of experiments that suggest that numerical behavior of the new algorithms is slightly better than the numerical behavior of Cooley-Tukey FFTs.
Abstract: The decimation-in-time radix-2, radix-4, split-radix, and radix-8 algorithms, presented in a paper by Linzer and Feig (5), are described in detail. These algorithms compute discrete Fourier transforms (DFTs) on input sequences with lengths that are powers of 2 with fewer multiply-adds than traditional Cooley-Tukey algorithms. The descriptions given provide the needed details to implement these algorithms efficiently in a computer program that could compute DFTs on a length-2^m sequence for general m. We describe and give timing results for a radix-4 version that we have implemented on the RS/6000 workstation. The timing results show that a substantial saving in execution time is obtained when the new radix-4 FFT is used instead of a standard Cooley-Tukey radix-4 FFT. Finally, we present a set of experiments that suggest that numerical behavior of the new algorithms is slightly better than the numerical behavior of Cooley-Tukey FFTs.

28 citations


Book ChapterDOI
09 Dec 1993
TL;DR: Two families of scalable hash functions for collision-resistant hashing that are highly parallel and based on the generalized fast Fourier transform (FFT) are proposed.
Abstract: We propose two families of scalable hash functions for collision-resistant hashing that are highly parallel and based on the generalized fast Fourier transform (FFT). FFT-hashing is based on multipermutations. This is a basic cryptographic primitive for perfect generation of diffusion and confusion which generalizes the boxes of the classic FFT. The slower FFT-hash functions iterate a compression function. For the faster FFT-hash functions all rounds are alike with the same number of message words entering each round.

27 citations



Journal ArticleDOI
01 Feb 1993
TL;DR: The proposed algorithm is more efficient compared to the radix-2 FHT in terms of the computational requirements, as well as the execution time for transform lengths higher than 30 and is faster than the prime-factor FFT algorithm for real-valued series.
Abstract: Fast algorithms for computing the DHT of short transform lengths (N = 2, 3, 4, 5, 7, 8, 9 and 16) are derived. A new prime-factor algorithm is also proposed to compute the long-length DHTs from the short-length DHT algorithms. The short-length algorithms (except for N = 8 and N = 16) are such that the even and the odd parts of the DHT components are obtained directly, without any additional computation. This feature of the short-length algorithms makes the proposed prime-factor DHT algorithm more attractive and efficient. It is found that the proposed algorithm is more efficient compared to the radix-2 FHT in terms of the computational requirements, as well as the execution time for transform lengths higher than 30. It is also observed that the number of operations required for the computation of DHT by the prime-factor FFT algorithm for real-valued data is the same as those of the proposed algorithm for certain transform lengths, e.g. N = 30, 60, 252 etc., which do not contain 8 or 16 as a cofactor. However, for all other transform lengths the proposed algorithm has a lower computational complexity. It is further observed that the proposed algorithm is faster than the prime-factor FFT algorithm for real-valued series.

18 citations
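
The even and odd parts of the DHT mentioned in the abstract are what link the DHT of a real sequence to its DFT. The sketch below uses the textbook definitions (a direct O(N^2) DHT and the even/odd split) to show that relationship; it is not the paper's fast short-length or prime-factor algorithm, and the function names are hypothetical.

```python
import math


def dht(x):
    """Direct discrete Hartley transform: H[k] = sum_t x[t] * cas(2*pi*t*k/N),
    where cas(a) = cos(a) + sin(a)."""
    n = len(x)
    return [
        sum(
            x[t] * (math.cos(2 * math.pi * t * k / n)
                    + math.sin(2 * math.pi * t * k / n))
            for t in range(n)
        )
        for k in range(n)
    ]


def dft_from_dht(h):
    """Recover the DFT of a real sequence from its DHT.

    even[k] = (H[k] + H[N-k]) / 2 is the real part of X[k];
    odd[k]  = (H[k] - H[N-k]) / 2 is minus the imaginary part.
    """
    n = len(h)
    out = []
    for k in range(n):
        mirror = h[(n - k) % n]
        even = 0.5 * (h[k] + mirror)
        odd = 0.5 * (h[k] - mirror)
        out.append(complex(even, -odd))
    return out
```

An algorithm that delivers the even and odd parts directly saves the additions in `dft_from_dht`, which is the property the abstract highlights for most of its short-length modules.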


Journal ArticleDOI
TL;DR: A recursively pruned radix-(2*2) two-dimensional (2D) fast Fourier transform (FFT) algorithm is proposed to reduce the number of operations involved in the calculation of the 2D discrete Fourier transform (DFT).
Abstract: A recursively pruned radix-(2*2) two-dimensional (2D) fast Fourier transform (FFT) algorithm is proposed to reduce the number of operations involved in the calculation of the 2D discrete Fourier transform (DFT). It is able to compute input and output data points having multiple and possibly irregularly shaped (nonsquare) regions of support. The computational performance of the recursively pruned radix-(2*2) 2D FFT algorithm is compared with that of pruning algorithms based on the one-dimensional (1D) FFT. The former is shown to yield significant computational savings when employed in the combined 2D DFT/1D linear difference equation filter method to enhance three-dimensional spatially planar image sequences, and when employed in the MixeD moving object detection and trajectory estimation algorithm.

Proceedings ArticleDOI
09 May 1993
TL;DR: Experimental results are presented which show that using the FFT-based approach is more than an order of magnitude faster than computing the iterates explicitly, even on problems with as few as a thousand volume-filaments.
Abstract: It is noted that including non-ideal ground planes in 3-D inductance extraction programs is computationally expensive, as the ground plane must be finely discretized to ensure that the current distribution throughout the plane is accurately computed. This makes standard volume-element algorithms unsuitable because they require n^2 computation time and storage, where n is the number of filaments into which the ground plane is discretized. In the present work it is noted that, by using a preconditioned iterative method combined with an FFT (fast Fourier transform)-based algorithm to compute the iterates, one can reduce the computation time to effectively n log n, and substantially reduce required storage. Experimental results are presented which show that using the FFT-based approach is more than an order of magnitude faster than computing the iterates explicitly, even on problems with as few as a thousand volume-filaments. The FFT-based algorithm is compared with a GMRES (generalized minimal residual)-style algorithm.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: An algorithm for fast adaptive filtering is proposed that applies an FFT (fast Fourier transform)-based iterative method, uses sliding data windows involving block updating and downdating computations, and computes the tap weight filter vector in O(L log N) operations.
Abstract: An algorithm for fast adaptive filtering is proposed. The algorithm applies an FFT (fast Fourier transform)-based iterative method and uses sliding data windows involving block updating and downdating computations. The method is stable and robust, and computes the tap weight filter vector in O(L log N) operations, where the sliding window Toeplitz data matrix X is L-by-N. The complexity thus generally lies between those of the family of unstable but fast, O(N), methods and the stable but slow O(N^2) Cholesky factor updating methods.

Journal ArticleDOI
TL;DR: A decimation-in-frequency vector split-radix algorithm is proposed that decomposes an N*N 2D discrete Hartley transform (DHT) into one (N/2)*(N/2) DHT and twelve (N/4) DHTs, possesses the in-place property, and needs no matrix transpose.
Abstract: A decimation-in-frequency vector split-radix algorithm is proposed to decompose an N*N 2D discrete Hartley transform (DHT) into one (N/2)*(N/2) DHT and twelve (N/4) DHTs. The proposed algorithm possesses the in-place property and needs no matrix transpose. Its computational structure is very regular and is simpler than those of all existing nonseparable 2D DHTs.

Journal ArticleDOI
TL;DR: A simple modification of the FFT algorithm that results in an efficient method for calculating the transform only at evenly spaced frequencies on a logarithmic scale is proposed.
Abstract: A standard fast Fourier transform (FFT) computes the transform at evenly spaced points on a linear scale. A simple modification of the FFT algorithm that results in an efficient method for calculating the transform only at evenly spaced frequencies on a logarithmic scale is proposed. The saving in the number of operations, compared with a standard FFT, is approximately 60% for typical values.

Journal ArticleDOI
TL;DR: In this paper, variants of the Winograd fast Fourier transform (FFT) algorithm for prime transform size that offer options as to operational counts and arithmetic balance are derived, and their implementations on VAX, IBM 3090 VF, and IBM RS/6000 are discussed.
Abstract: Variants of the Winograd fast Fourier transform (FFT) algorithm for prime transform size that offer options as to operational counts and arithmetic balance are derived. Their implementations on VAX, IBM 3090 VF, and IBM RS/6000 are discussed. For processors that perform floating-point addition, floating-point multiplication, and floating-point multiply-add with the same time delay, variants of the FFT algorithm have been designed such that all floating-point multiplications can be overlapped by using multiply-add. The use of a tensor product formulation throughout gives a means for producing variants of algorithms matching computer architectures.

Journal ArticleDOI
TL;DR: An array architecture for computing a fast Fourier transform (FFT) with a flexible number of identical processing elements is presented, and it is shown that a high-radix, lengthy FFT can be efficiently implemented with simple hardware.
Abstract: An array architecture for computing a fast Fourier transform (FFT) with a flexible number of identical processing elements is presented. The architecture is based on the symmetry of a constant geometry FFT. It allows an easy tradeoff between the hardware complexity and the computation time. A method for constructing a high-radix FFT with simple lower-radix hardware based on successive decompositions and premultiplications has been developed. It shows that a high-radix, lengthy FFT can be efficiently implemented with simple hardware. To verify the architecture, an experimental radix-2 processing element chip has been designed and the results are discussed.

Journal ArticleDOI
TL;DR: The proposed algorithm has a structure similar to the split-radix algorithm used for the computation of the FFT, the same computational complexity as the direct fast cosine algorithms and a high regularity which facilitates its implementation in VLSI technology.
Abstract: An extension of an existing fast algorithm for the computation of the discrete cosine transform is presented. The proposed algorithm has a structure similar to the split-radix algorithm used for the computation of the FFT, the same computational complexity as the direct fast cosine algorithms and a high regularity which facilitates its implementation in VLSI technology. A comparison on different computer architectures shows that the proposed algorithm is superior to existing algorithms in terms of execution time.

Journal ArticleDOI
TL;DR: A fast algorithm for computing the optimal linear interpolation filter is developed based on the Sherman-Morrison inversion formula for symmetric matrices that requires fewer multiplications and hence is of lower complexity.
Abstract: A fast algorithm for computing the optimal linear interpolation filter is developed. The algorithm is based on the Sherman-Morrison inversion formula for symmetric matrices. The relationship between the derived algorithm and the Levinson algorithm is illustrated. It is shown that the new algorithm, in comparison with the well-known algorithms, requires fewer multiplications and hence is of lower complexity.

Journal ArticleDOI
TL;DR: This paper discusses the parallel computation of the 2-D discrete Fourier transform by using three different fast Fourier transform algorithms: row-column FFT, vector radix FFT and polynomial transform FFT.

Proceedings ArticleDOI
A. Saidi
23 May 1993
TL;DR: An efficient algorithm for computing the large discrete Fourier transform (DFT) coefficients of a correlated data sequence which reduces the computations associated with the small coefficients is presented.
Abstract: A novel formulation of the decimation method fast Fourier transform (FFT) algorithm is introduced. This formulation generalizes both the decimation-in-time (DIT) and the decimation-in-frequency (DIF) FFT algorithms for various radices in multidimensions. This alternative derivation of the decimation method FFT algorithm has the advantage of clearly showing what is exactly being computed in the intermediate stages of the algorithm. This information is used to present an efficient algorithm for computing the large discrete Fourier transform (DFT) coefficients of a correlated data sequence which reduces the computations associated with the small coefficients.

Journal ArticleDOI
TL;DR: An efficient algorithm for computing radix-3/9 discrete Hartley transforms (DHTs) is presented and it is shown that the radix-3/9 fast Hartley transform (FHT) algorithm reduces the number of multiplications required by a radix-3 FHT algorithm by nearly 50%.
Abstract: An efficient algorithm for computing radix-3/9 discrete Hartley transforms (DHTs) is presented. It is shown that the radix-3/9 fast Hartley transform (FHT) algorithm reduces the number of multiplications required by a radix-3 FHT algorithm by nearly 50%. For the computation of real-valued discrete Fourier transforms (DFTs) with sequence lengths that are powers of 3, it is shown that the radix-3/9 FHT algorithm reduces the number of multiplications by 16.2% over the fastest real-valued radix-3/9 fast Fourier transform (FFT) algorithm.

Journal ArticleDOI
TL;DR: It is shown formally that in many cases only one equation is enough for both operations of the prime factor algorithm, and a truly in-place and in-order computation is obtained.
Abstract: It is shown that the prime factor algorithm (PFA) has an intrinsic property that allows it to be easily realized in an in-place and in-order form. In contrast to other approaches that use two equations for loading data from and returning the results to the memory, respectively, it is shown formally that in many cases only one equation is enough for both operations. Thus a truly in-place and in-order computation is obtained. Nevertheless, the sequence length of the PFA computation must be carefully selected. The conditions under which a particular sequence length is possible for in-place and in-order PFA computation are analyzed.
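
For context, the sketch below shows the conventional two-map form of the prime factor (Good-Thomas) algorithm that this abstract contrasts against: one index equation loads the input and a different one (from the Chinese remainder theorem) stores the output. It is a textbook illustration for a length N = n1 * n2 with coprime factors, using direct short DFTs in place of optimized modules; it is not the single-equation in-place, in-order scheme of the paper, and the function names are hypothetical. It assumes Python 3.8 or later for modular inverses via `pow`.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT, standing in for an optimized short-length module."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def prime_factor_dft(x, n1, n2):
    """Length n1*n2 DFT via Good-Thomas index maps (no twiddle factors)."""
    n = n1 * n2
    if len(x) != n or math.gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1, n2")
    # Input map: t = (n2 * i1 + n1 * i2) mod n
    grid = [[x[(n2 * i1 + n1 * i2) % n] for i2 in range(n2)] for i1 in range(n1)]
    # Length-n2 DFTs along each row, then length-n1 DFTs along each column.
    rows = [dft(row) for row in grid]
    cols = [dft([rows[i1][k2] for i1 in range(n1)]) for k2 in range(n2)]
    # Output map (CRT): k = (n2 * inv(n2 mod n1) * k1 + n1 * inv(n1 mod n2) * k2) mod n
    a = pow(n2, -1, n1)
    b = pow(n1, -1, n2)
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[(n2 * a * k1 + n1 * b * k2) % n] = cols[k2][k1]
    return out
```

Because the input and output maps differ, this version needs a separate output buffer or a reordering pass, which is the overhead an in-place, in-order formulation aims to avoid.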

Proceedings ArticleDOI
03 May 1993
TL;DR: It is shown that the multidimensional (M-D) FFT can be represented by the same vector-matrix form as the 1-D FFT.
Abstract: The twiddle factor matrix of the discrete Fourier transform can be recursively factorized into a cascade of the basic butterfly stage matrices. It is shown that the matrix can be further partitioned into three matrices practically specifying the input data, twiddle factor, and output data sequences of the fast Fourier transform (FFT). The equivalent relationship of these matrices is introduced. Thus, the equivalent relationship for a variety of the FFT algorithms can be obtained by equivalent matrix transformation. It is shown that the multidimensional (M-D) FFT can be represented by the same vector-matrix form as the 1-D FFT. The addressing sequences of the M-D FFT are a subset of those of the 1-D FFT. Therefore, the signal flow graph of the 1-D FFT can be used to describe that of the M-D FFT and the 1-D FFT addressing sequences can be employed to implement the M-D FFT. The simulation results of the proposed FFT approach are given.

Journal ArticleDOI
TL;DR: Every multidimensional sequence is completely equivalent to a one-dimensional function in both "time" and "frequency" domains, which explains why an MRI slice can be reconstructed by either one-dimensional or two-dimensional methods, as is already done in echo planar methods.
Abstract: Whenever DFT (discrete Fourier transform) processing of a multidimensional discrete signal is required, one can apply either a multidimensional FFT (fast Fourier transform) algorithm, or a single-dimension FFT algorithm, both using the same number of points. That is, the dimensions of a "multidimensional" signal, and of its spectrum, are a matter of choice. Every multidimensional sequence is completely equivalent to a one-dimensional function in both "time" and "frequency" domains. This statement applied to MRI (magnetic resonance imaging) explains why one can reconstruct the slice by using either one-dimensional or two-dimensional methods, as is already done in echo planar methods. In the commonly used spin warp methods, the image can also be reconstructed by either one- or two-dimensional processing. However, some artifacts in the images reconstructed from the original "zig-zag" echo planar trajectory are shown to be due to the wrong dimensionality of the FFT applied.

Journal ArticleDOI
TL;DR: It is shown how the computational burden of source localization by matched field processing (MFP) can be significantly reduced by expressing the correlation in terms of a discrete Fourier transform and using the fast Fourier transform (FFT) algorithm.
Abstract: It is shown how the computational burden of source localization by matched field processing (MFP) can be significantly reduced (20 to 30 times) by expressing the correlation in terms of a discrete Fourier transform and using the fast Fourier transform (FFT) algorithm. The price paid to achieve increased speed is in the form of quantization phase errors. It is shown through analysis and computer simulation that the quantization errors reduce the source peak height, depending upon the size of the DFT. The proposed fast MFP works for range localization only. However, the depth estimation is possible by repeated application of the above algorithm for different depths.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: The idea of RTA is extended to the M-D Cooley-Tukey (C-T) FFT algorithm and M-D Good-Thomas (G-T) prime factor algorithm, and a new implementation strategy for these algorithms that requires no interprocessor communication is discussed.
Abstract: To reduce or eliminate the interprocessor communications, I. Gertner, R. Tolimieri, and their colleagues proposed an M-D fast Fourier transform (FFT) algorithm, called the reduced transform algorithm (RTA). In the present work, the idea of RTA is extended to the M-D Cooley-Tukey (C-T) FFT algorithm and M-D Good-Thomas (G-T) prime factor algorithm. A new implementation strategy for these algorithms that requires no interprocessor communication is discussed. A hybrid algorithm which combines the C-T or G-T algorithm with RTA is also described.

Journal ArticleDOI
TL;DR: In this article, a fast algorithm for iterative extrapolation based on the fast Hartley transform (FHT) is presented, where low-pass filtering in the iterative procedure can be implemented by the FHT directly instead of by the fast Fourier transform (FFT) and the inverse FFT.
Abstract: A fast algorithm for A. Papoulis's (1975) and R.W. Gerchberg's (1974) iterative extrapolation based on the fast Hartley transform (FHT) is presented. The low-pass filtering in the iterative procedure can be implemented by the FHT directly instead of by the fast Fourier transform (FFT) and the inverse FFT. M.S. Sabri and W. Steenart's (1978) example demonstrates that the FHT approach is simple to use.

Journal ArticleDOI
TL;DR: It can be proved that the computational complexity of the proposed DFT algorithm is identical to that of the most popular split-radix FFT, yet it requires real-number arithmetic only.
Abstract: Discrete Fourier transform (DFT)/discrete Hartley transform (DHT) algorithms based on the basis-vector decomposition of the corresponding transform matrices are derived. The computations of DFT are divided into two stages: an add/subtract preprocessing and a block-diagonal postprocessing. Both stages can be computed effectively. It can be proved that the computational complexity of the proposed DFT algorithm is identical to that of the most popular split-radix FFT, yet it requires real-number arithmetic only. Generation and storage of the real multiplicative coefficients are easier than in conventional FFTs. Connections of the proposed approach with several well-known DFT algorithms are included. Furthermore, many variations of the proposed algorithm are also pointed out.

Proceedings ArticleDOI
01 Nov 1993
TL;DR: An on-chip VLSI architecture for computation of Fourier transforms is presented in which the arithmetic operators are based on on-line arithmetic and the transforms are performed by a parallel-pipeline version of the Cooley-Tukey fast Fourier transform (FFT) algorithm.
Abstract: An on-chip VLSI architecture for computation of Fourier transforms is presented. It performs the arithmetic operations in a digit-level pipeline fashion. For this purpose, the implementation of arithmetic operators is based on on-line (i.e. digit-serial and most significant digit first) arithmetic, and the transforms are performed by a parallel-pipeline version of the Cooley-Tukey fast Fourier transform (FFT) algorithm.
1. INTRODUCTION
In general, an important characteristic of digital signal processing problems is that the amount of data is huge and must be processed quickly. But fortunately, since many of these problems have some inherent regularities, it is possible to perform the whole process as a repetition of smaller similar processes. The computation of Fourier transforms is such a problem. For example, many efficient FFT algorithms have been studied for vector computers [1, 2, 3], for parallel SIMD [4] and MIMD [5] computers, and for shared-memory machines [6, 7]. In this paper, we deal with the computation of the FFT in a system having many special-purpose VLSI processors, and concentrate on the hardware implementation of such processors to get high performance by pipelining the arithmetic operations at the digit level.