
Showing papers on "Split-radix FFT algorithm" published in 1993


Journal ArticleDOI
TL;DR: In this paper, a transform decomposition algorithm was proposed to reduce the number of operations required to compute the discrete Fourier transform (DFT) when the number of input data points differs from the number of output data points.
Abstract: Ways of efficiently computing the discrete Fourier transform (DFT) when the number of input and output data points differ are discussed. The two problems, a reduced input sequence length and a reduced output sequence length, can be found to be duals of each other, and the same methods can, to a large extent, be used to solve both. The algorithms utilize the redundancy in the input or output to reduce the number of operations below those of the fast Fourier transform (FFT) algorithms. The usual pruning method is discussed, and an efficient algorithm, called transform decomposition, is introduced. It is based on a mixture of a standard FFT algorithm and the Horner polynomial evaluation scheme equivalent to the one in Goertzel's algorithm. It requires fewer operations and is more flexible than pruning. The algorithm works for power-of-two and prime-factor algorithms, as well as for real-input data.
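
The abstract above mentions a Horner-style polynomial evaluation equivalent to Goertzel's algorithm. As a minimal sketch of that one ingredient only (not the transform decomposition algorithm itself), the following Python computes a few selected DFT bins with the standard Goertzel recurrence. The function names `goertzel_bin` and `dft_subset` are illustrative assumptions, not names from the paper.

```python
import cmath
import math


def goertzel_bin(x, k):
    """Return DFT bin X[k] of the sequence x using Goertzel's recurrence."""
    n = len(x)
    omega = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(omega)
    s1 = 0.0  # holds s[m-1]
    s2 = 0.0  # holds s[m-2]
    for sample in x:
        # Second-order recurrence: s[m] = x[m] + 2*cos(omega)*s[m-1] - s[m-2]
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # Closing step after N samples: X[k] = e^{j*omega} * s[N-1] - s[N-2]
    return cmath.exp(1j * omega) * s1 - s2


def dft_subset(x, bins):
    """Compute only the requested output bins (hypothetical helper)."""
    return {k: goertzel_bin(x, k) for k in bins}
```

Each bin costs roughly N real multiplications, so this approach tends to pay off only when few output points are needed, which is the situation the abstract addresses.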

256 citations


Patent
05 Aug 1993
TL;DR: The Fast Fourier Transform (FFT) processor includes a plurality of pipelined, functionally identical stages, each stage adapted to perform a portion of an FFT operation on a block of data.
Abstract: The Fast Fourier Transform (FFT) processor includes a plurality of pipelined, functionally identical stages, each stage adapted to perform a portion of an FFT operation on a block of data. The output of the last stage of the processor is the high-precision Fast Fourier Transform of the data block. Support functions are included at each stage. Thus, each stage includes a computational element and a buffer memory interface. Each stage also includes apparatus for coefficient generation.

38 citations


Journal ArticleDOI
TL;DR: In this article, a three-dimensional array of dimensions n1, n2, n3 can be rotated in such a way that all the innermost loops have lengths which are products of two dimensions.

30 citations


Journal ArticleDOI
TL;DR: It is shown that for the fine-frequency estimator a good method is to fit a Gaussian function to the fast-Fourier-transform (FFT) peak and its two neighbors, which achieves a frequency standard deviation and a bias on the order of only a few percent of a bin.
Abstract: In the case of a single sinusoid or multiple well-separated sinusoids, a coarse estimator consisting of a windowed Fourier transform followed by a fine estimator which is an interpolator is a good approximation to an optimal frequency acquisition and measurement algorithm. The design tradeoffs are described. It is shown that for the fine-frequency estimator a good method is to fit a Gaussian function to the fast-Fourier-transform (FFT) peak and its two neighbors. This method achieves a frequency standard deviation and a bias on the order of only a few percent of a bin. In the case of short-time stationarity, for a moderate number of averages and for an adaptive threshold detector, only between 0.5 and 1 dB is lost when averaging is traded off for FFT length, in contrast to the asymptotic result of 1.5 dB. The COSPAS-SARSAT satellite system for emergency detection and localization is used to illustrate the concepts. The algorithm is analyzed theoretically, and good agreement is found with test results.
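
The fine estimator described above fits a Gaussian to the FFT peak and its two neighbors. Because the logarithm of a Gaussian is a parabola, one common way to do this is parabolic interpolation on the log-magnitudes. The sketch below follows that approach. It is an assumption that this matches the paper's exact estimator, and the names `gaussian_peak_offset` and `refine_frequency` are hypothetical.

```python
import math


def gaussian_peak_offset(left, peak, right):
    """Fractional-bin offset of the fitted Gaussian's maximum from the peak bin.

    Inputs are the (strictly positive) magnitudes at bins k-1, k, k+1.
    """
    a, b, c = math.log(left), math.log(peak), math.log(right)
    # Vertex of the parabola through the three log-magnitudes
    return 0.5 * (a - c) / (a - 2.0 * b + c)


def refine_frequency(magnitudes, sample_rate, fft_length):
    """Coarse peak search followed by the Gaussian fine estimate, in Hz."""
    # Coarse step: largest interior bin (edges lack two neighbors)
    k = max(range(1, len(magnitudes) - 1), key=lambda i: magnitudes[i])
    delta = gaussian_peak_offset(magnitudes[k - 1], magnitudes[k], magnitudes[k + 1])
    return (k + delta) * sample_rate / fft_length
```

The fit is exact when the spectral peak really has a Gaussian shape. With other windows it is an approximation, and zero magnitudes must be avoided because of the logarithm.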

29 citations


Journal ArticleDOI
TL;DR: This work describes and gives timing results for a radix-4 version that is implemented on the RS/6000 workstation, and presents a set of experiments that suggest that numerical behavior of the new algorithms is slightly better than the numerical behavior of Cooley-Tukey FFT's.
Abstract: The decimation-in-time radix-2, radix-4, split-radix, and radix-8 algorithms, presented in a paper by Linzer and Feig (5), are described in detail. These algorithms compute discrete Fourier transforms (DFT's) on input sequences with lengths that are powers of 2 with fewer multiply-adds than traditional Cooley-Tukey algorithms. The descriptions given provide the needed details to implement these algorithms efficiently in a computer program that could compute DFT's on a length 2^m sequence for general m. We describe and give timing results for a radix-4 version that we have implemented on the RS/6000 workstation. The timing results show that a substantial saving in execution time is obtained when the new radix-4 FFT is used instead of a standard Cooley-Tukey radix-4 FFT. Finally, we present a set of experiments that suggest that numerical behavior of the new algorithms is slightly better than the numerical behavior of Cooley-Tukey FFT's.
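
For orientation, here is a minimal sketch of the textbook decimation-in-time split-radix recursion for power-of-2 lengths. It is not the multiply-add-optimized variant of Linzer and Feig described above, and it favors clarity over speed. The function name `split_radix_fft` is an assumption.

```python
import cmath


def split_radix_fft(x):
    """Recursive split-radix FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n == 2:
        return [complex(x[0] + x[1]), complex(x[0] - x[1])]
    even = split_radix_fft(x[0::2])  # radix-2 part: length n/2
    odd1 = split_radix_fft(x[1::4])  # radix-4 part: indices 4m+1
    odd3 = split_radix_fft(x[3::4])  # radix-4 part: indices 4m+3
    quarter = n // 4
    out = [0j] * n
    for k in range(quarter):
        a = cmath.exp(-2j * cmath.pi * k / n) * odd1[k]  # twiddle W^k
        b = cmath.exp(-6j * cmath.pi * k / n) * odd3[k]  # twiddle W^(3k)
        total = a + b
        diff = 1j * (a - b)
        out[k] = even[k] + total
        out[k + 2 * quarter] = even[k] - total
        out[k + quarter] = even[k + quarter] - diff
        out[k + 3 * quarter] = even[k + quarter] + diff
    return out
```

The L-shaped butterfly combines one half-length and two quarter-length transforms, which is what gives split-radix its lower multiplication count relative to plain radix-2 or radix-4.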

28 citations


Book ChapterDOI
09 Dec 1993
TL;DR: Two families of scalable hash functions for collision-resistant hashing that are highly parallel and based on the generalized fast Fourier transform (FFT) are proposed.
Abstract: We propose two families of scalable hash functions for collision-resistant hashing that are highly parallel and based on the generalized fast Fourier transform (FFT). FFT-hashing is based on multipermutations. This is a basic cryptographic primitive for perfect generation of diffusion and confusion which generalizes the boxes of the classic FFT. The slower FFT-hash functions iterate a compression function. For the faster FFT-hash functions all rounds are alike with the same number of message words entering each round.

27 citations


Patent
08 Sep 1993
TL;DR: In this article, a digit reversing system for mixed radix FFT operations with arbitrary arrangements of radices is described, which is used for appropriately arranging input terms applied to a mixed-radix multi-stage Fast Fourier Transform (FFT) process.
Abstract: A digit reversing system is disclosed for handling mixed radix FFT operations with arbitrary arrangements of radices. In a first step, all bits in an integer field of size log2 N are position reversed. In a second step, subfields of the output produced in the first step are individually unreversed at the local level to produce unreversed digits. The output is used for appropriately arranging input terms applied to a mixed-radix multi-stage Fast Fourier Transform (FFT) process.
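
The two steps in the abstract above can be sketched in software as follows, assuming every radix is a power of 2 so that each digit occupies a whole number of bits. The patent describes hardware, so this is only an illustration of the idea, and the names `bit_reverse` and `digit_reverse` are hypothetical.

```python
def bit_reverse(value, width):
    """Reverse the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def digit_reverse(index, radices):
    """Reverse the mixed-radix digits of `index`.

    `radices` lists the radix of each digit, most significant first, and
    each radix is assumed to be a power of 2.
    """
    widths = [r.bit_length() - 1 for r in radices]  # bits per digit
    total = sum(widths)
    # Step 1: reverse every bit of the log2(N)-bit field
    reversed_field = bit_reverse(index, total)
    # Step 2: un-reverse each subfield; digit order is now reversed,
    # so the subfield widths are walked in reverse as well
    out = 0
    shift = total
    for width in reversed(widths):
        shift -= width
        field = (reversed_field >> shift) & ((1 << width) - 1)
        out |= bit_reverse(field, width) << shift
    return out
```

For example, with radices 4, 2, 8 (N = 64), index 27 has digits (1, 1, 3), and the digit-reversed index has digits (3, 1, 1) under radices 8, 2, 4.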

22 citations


Journal ArticleDOI
01 Feb 1993
TL;DR: The proposed algorithm is more efficient compared to the radix-2 FHT in terms of the computational requirements, as well as the execution time for transform lengths higher than 30 and is faster than the prime-factor FFT algorithm for real-valued series.
Abstract: Fast algorithms for computing the DHT of short transform lengths (N = 2, 3, 4, 5, 7, 8, 9 and 16) are derived. A new prime-factor algorithm is also proposed to compute the long-length DHTs from the short-length DHT algorithms. The short-length algorithms (except for N = 8 and N = 16) are such that the even and the odd parts of the DHT components are obtained directly, without any additional computation. This feature of the short-length algorithms makes the proposed prime-factor DHT algorithm more attractive and efficient. It is found that the proposed algorithm is more efficient compared to the radix-2 FHT in terms of the computational requirements, as well as the execution time for transform lengths higher than 30. It is also observed that the number of operations required for the computation of DHT by the prime-factor FFT algorithm for real-valued data is the same as those of the proposed algorithm for certain transform lengths, e.g. N = 30, 60, 252 etc., which do not contain 8 or 16 as a cofactor. However, for all other transform lengths the proposed algorithm has a lower computational complexity. It is further observed that the proposed algorithm is faster than the prime-factor FFT algorithm for real-valued series.

18 citations


Journal ArticleDOI
TL;DR: A recursively pruned radix-(2*2) two-dimensional (2D) fast Fourier transform (FFT) algorithm is proposed to reduce the number of operations involved in the calculation of the 2D discrete Fourier transform (DFT).
Abstract: A recursively pruned radix-(2*2) two-dimensional (2D) fast Fourier transform (FFT) algorithm is proposed to reduce the number of operations involved in the calculation of the 2D discrete Fourier transform (DFT). It is able to compute input and output data points having multiple and possibly irregularly shaped (nonsquare) regions of support. The computational performance of the recursively pruned radix-(2*2) 2D FFT algorithm is compared with that of pruning algorithms based on the one-dimensional (1D) FFT. The former is shown to yield significant computational savings when employed in the combined 2D DFT/1D linear difference equation filter method to enhance three-dimensional spatially planar image sequences, and when employed in the MixeD moving object detection and trajectory estimation algorithm.

17 citations


Proceedings ArticleDOI
09 May 1993
TL;DR: Experimental results are presented which show that using the FFT-based approach is more than an order of magnitude faster than computing the iterates explicitly, even on problems with as few as a thousand volume-filaments.
Abstract: It is noted that including non-ideal ground planes in 3-D inductance extraction programs is computationally expensive, as the ground plane must be finely discretized to ensure that the current distribution throughout the plane is accurately computed. This makes standard volume-element algorithms unsuitable because they require n^2 computation time and storage, where n is the number of filaments into which the ground plane is discretized. In the present work it is noted that, by using a preconditioned iterative method combined with an FFT (fast Fourier transform)-based algorithm to compute the iterates, one can reduce the computation time to effectively n log n, and substantially reduce required storage. Experimental results are presented which show that using the FFT-based approach is more than an order of magnitude faster than computing the iterates explicitly, even on problems with as few as a thousand volume-filaments. The FFT-based algorithm is compared with a GMRES (generalized minimal residual)-style algorithm.

16 citations


Proceedings ArticleDOI
27 Apr 1993
TL;DR: An algorithm for fast adaptive filtering is proposed that applies an FFT (fast Fourier transform)-based iterative method, uses sliding data windows involving block updating and downdating computations, and computes the tap weight filter vector in O(L log N) operations.
Abstract: An algorithm for fast adaptive filtering is proposed. The algorithm applies an FFT (fast Fourier transform)-based iterative method and uses sliding data windows involving block updating and downdating computations. The method is stable and robust, and computes the tap weight filter vector in O(L log N) operations, where the sliding window Toeplitz data matrix X is L-by-N. The complexity thus generally lies between those of the family of unstable but fast, O(N), methods and the stable but slow O(N^2) Cholesky factor updating methods.

Journal ArticleDOI
TL;DR: A simple modification of the FFT algorithm that results in an efficient method for calculating the transform only at evenly spaced frequencies on a logarithmic scale is proposed.
Abstract: A standard fast Fourier transform (FFT) computes the transform at evenly spaced points on a linear scale. A simple modification of the FFT algorithm that results in an efficient method for calculating the transform only at evenly spaced frequencies on a logarithmic scale is proposed. The saving in the number of operations, compared with a standard FFT, is approximately 60% for typical values.

Journal ArticleDOI
TL;DR: In this paper, variants of the Winograd fast Fourier transform (FFT) algorithm for prime transform size that offer options as to operational counts and arithmetic balance are derived, and their implementations on VAX, IBM 3090 VF, and IBM RS/6000 are discussed.
Abstract: Variants of the Winograd fast Fourier transform (FFT) algorithm for prime transform size that offer options as to operational counts and arithmetic balance are derived. Their implementations on VAX, IBM 3090 VF, and IBM RS/6000 are discussed. For processors that perform floating-point addition, floating-point multiplication, and floating-point multiply-add with the same time delay, variants of the FFT algorithm have been designed such that all floating-point multiplications can be overlapped by using multiply-add. The use of a tensor product formulation, throughout, gives a means for producing variants of algorithms matching computer architectures.

Journal ArticleDOI
TL;DR: An array architecture for computing a fast Fourier transform (FFT) with a flexible number of identical processing elements is presented and shows that a high-radix, lengthy FFT can be efficiently implemented with simple hardware.
Abstract: An array architecture for computing a fast Fourier transform (FFT) with a flexible number of identical processing elements is presented. The architecture is based on the symmetry of a constant geometry FFT. It allows an easy tradeoff between the hardware complexity and the computation time. A method for constructing a high-radix FFT with simple lower-radix hardware based on successive decompositions and premultiplications has been developed. It shows that a high-radix, lengthy FFT can be efficiently implemented with simple hardware. To verify the architecture, an experimental radix-2 processing element chip has been designed and the results are discussed.

Journal ArticleDOI
TL;DR: The proposed algorithm has a structure similar to the split-radix algorithm used for the computation of the FFT, the same computational complexity as the direct fast cosine algorithms and a high regularity which facilitates its implementation in VLSI technology.
Abstract: An extension of an existing fast algorithm for the computation of the discrete cosine transform is presented. The proposed algorithm has a structure similar to the split-radix algorithm used for the computation of the FFT, the same computational complexity as the direct fast cosine algorithms and a high regularity which facilitates its implementation in VLSI technology. A comparison on different computer architectures shows that the proposed algorithm is superior to existing algorithms in terms of execution time.

Journal ArticleDOI
TL;DR: This paper discusses the parallel computation of the 2-D discrete Fourier transform by using three different fast Fourier transform algorithms: row-column FFT, vector radix FFT and polynomial transform FFT.
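
Of the three approaches named above, the row-column method is the simplest to illustrate: transform every row, then every column of the result. The sketch below is sequential and, to stay self-contained, uses a naive O(n^2) 1-D DFT as a stand-in for a real FFT. It says nothing about the paper's parallel implementation, and the function names `dft` and `dft2_row_column` are assumptions.

```python
import cmath


def dft(x):
    """Naive 1-D DFT, used here only as a stand-in for an FFT routine."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def dft2_row_column(image):
    """2-D DFT of a list of equal-length rows via the row-column method."""
    rows = [dft(row) for row in image]              # 1-D transform of each row
    cols = [dft(list(col)) for col in zip(*rows)]   # 1-D transform of each column
    return [list(r) for r in zip(*cols)]            # transpose back to row-major
```

The two passes are independent across rows and across columns, which is why this decomposition is a natural candidate for parallel execution.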

Proceedings ArticleDOI
A. Saidi
23 May 1993
TL;DR: An efficient algorithm for computing the large discrete Fourier transform (DFT) coefficients of a correlated data sequence which reduces the computations associated with the small coefficients is presented.
Abstract: A novel formulation of the decimation method fast Fourier transform (FFT) algorithm is introduced. This formulation generalizes both the decimation-in-time (DIT) and the decimation-in-frequency (DIF) FFT algorithms for various radices in multidimensions. This alternative derivation of the decimation method FFT algorithm has the advantage of clearly showing what is exactly being computed in the intermediate stages of the algorithm. This information is used to present an efficient algorithm for computing the large discrete Fourier transform (DFT) coefficients of a correlated data sequence which reduces the computations associated with the small coefficients.

Journal ArticleDOI
TL;DR: An efficient algorithm for computing radix-3/9 discrete Hartley transforms (DHTs) is presented and it is shown that the radix-3/9 fast Hartley transform (FHT) algorithm reduces the number of multiplications required by a radix-3 FHT algorithm by nearly 50%.
Abstract: An efficient algorithm for computing radix-3/9 discrete Hartley transforms (DHTs) is presented. It is shown that the radix-3/9 fast Hartley transform (FHT) algorithm reduces the number of multiplications required by a radix-3 FHT algorithm by nearly 50%. For the computation of real-valued discrete Fourier transforms (DFTs) with sequence lengths that are powers of 3, it is shown that the radix-3/9 FHT algorithm reduces the number of multiplications by 16.2% over the fastest real-valued radix-3/9 fast Fourier transform (FFT) algorithm.
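
The abstract above uses an FHT to obtain the DFT of real-valued data. The relation that makes this possible is standard: for real input, the DHT is H[k] = Re X[k] - Im X[k], and the DFT is recovered from H[k] and H[N-k]. The sketch below shows that relation with a naive O(N^2) DHT. It is not the radix-3/9 algorithm, and the function names `dht` and `dft_from_dht` are assumptions.

```python
import math


def dht(x):
    """Naive discrete Hartley transform of a real sequence (cas kernel)."""
    n = len(x)
    return [
        sum(
            x[m] * (math.cos(2.0 * math.pi * m * k / n) + math.sin(2.0 * math.pi * m * k / n))
            for m in range(n)
        )
        for k in range(n)
    ]


def dft_from_dht(h):
    """Recover the DFT of the original real sequence from its DHT."""
    n = len(h)
    out = []
    for k in range(n):
        hk = h[k]
        hnk = h[(n - k) % n]
        # Re X[k] is the even part of H, Im X[k] is minus the odd part
        out.append(complex((hk + hnk) / 2.0, -(hk - hnk) / 2.0))
    return out
```

The conversion costs only additions and halvings, so the multiplication count of a real-valued DFT computed this way is that of the underlying FHT.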

Journal ArticleDOI
TL;DR: In this paper, an FFT method is described for the solution of Poisson's equation over a rectangular region with Robbins boundary conditions on either one or two sides of the region, together with suitable conditions on the rest of the boundary.
Abstract: An FFT method is described for the solution of Poisson's equation over a rectangular region with Robbins boundary conditions on either one or two sides of the region, together with suitable conditions on the rest of the boundary. In contrast to earlier applications of the FFT technique, the equations for the Fourier harmonic amplitudes do not decouple into simpler independent systems and an effective iterative scheme is developed for the solution of these equations. A theoretical convergence analysis is shown generally to support the results obtained from practical computation. For the test problems considered the method is found to take between 3 and 4 times the execution time for problems soluble directly by the FFT technique.

Journal ArticleDOI
TL;DR: From the results, it is seen that the radix-3 and -6 FHT algorithms presented are comparable to the split-radix FHT algorithm in terms of their operation count and will be more efficient when the sequence length is closer to an integer power of the corresponding radix.
Abstract: Fast algorithms of a transform, like fast Fourier transform (FFT) algorithms, are based on different decomposition techniques. It is shown that these decomposition techniques can also be applied to the computation of the discrete Hartley transform (DHT) for a real-valued sequence. Recently, an efficient decomposition technique for radix-3 decimation-in-time (DIT) FFT and fast Hartley transform (FHT) algorithms has been demonstrated. Such a decomposition technique is implemented for radix-3 and -6 decimation-in-frequency (DIF) FHT algorithms and found to improve the operation count. Efficiency in these algorithms is derived by pairing the rotating factors with an appropriate reordering of the input sequence. From the results, it is seen that the radix-3 and -6 FHT algorithms presented are comparable to the split-radix FHT algorithm in terms of their operation count and will be more efficient when the sequence length is closer to an integer power of the corresponding radix.

Proceedings ArticleDOI
03 May 1993
TL;DR: It is shown that the multidimensional (M-D) FFT can be represented by the same vector-matrix form as the 1-D FFT.
Abstract: The twiddle factor matrix of the discrete Fourier transform can be recursively factorized into a cascade of the basic butterfly stage matrices. It is shown that the matrix can be further partitioned into three matrices practically specifying the input data, twiddle factor, and output data sequences of the fast Fourier transform (FFT). The equivalent relationship of these matrices is introduced. Thus, the equivalent relationship for a variety of the FFT algorithms can be obtained by equivalent matrix transformation. It is shown that the multidimensional (M-D) FFT can be represented by the same vector-matrix form as the 1-D FFT. The addressing sequences of the M-D FFT are a subset of those of the 1-D FFT. Therefore, the signal flow graph of the 1-D FFT can be used to describe that of the M-D FFT and the 1-D FFT addressing sequences can be employed to implement the M-D FFT. The simulation results of the proposed FFT approach are given.

Journal ArticleDOI
TL;DR: Every multidimensional sequence is completely equivalent to a one-dimensional function in both "time" and "frequency" domains, which explains why an MRI slice can be reconstructed by either one-dimensional or two-dimensional methods, as is already done in echo planar methods.
Abstract: Whenever DFT (discrete Fourier transform) processing of a multidimensional discrete signal is required, one can apply either a multidimensional FFT (fast Fourier transform) algorithm, or a single-dimension FFT algorithm, both using the same number of points. That is, the dimensions of a "multidimensional" signal, and of its spectrum, are a matter of choice. Every multidimensional sequence is completely equivalent to a one-dimensional function in both "time" and "frequency" domains. This statement applied to MRI (magnetic resonance imaging) explains why one can reconstruct the slice by using either one-dimensional or two-dimensional methods, as is already done in echo planar methods. In the commonly used spin warp methods, the image can also be reconstructed by either one- or two-dimensional processing. However, some artifacts in the images reconstructed from the original "zig-zag" echo planar trajectory are shown to be due to the wrong dimensionality of the FFT applied.

Journal ArticleDOI
TL;DR: It is shown how the computational burden of source localization by matched field processing (MFP) can be significantly reduced by expressing the correlation in terms of a discrete Fourier transform and using the fast Fourier transform (FFT) algorithm.
Abstract: It is shown how the computational burden of source localization by matched field processing (MFP) can be significantly reduced (20 to 30 times) by expressing the correlation in terms of a discrete Fourier transform and using the fast Fourier transform (FFT) algorithm. The price paid to achieve increased speed is in the form of quantization phase errors. It is shown through analysis and computer simulation that the quantization errors reduce the source peak height, depending upon the size of DFT. The proposed fast MFP works for range localization only. However, the depth estimation is possible by repeated application of the above algorithm for different depths.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: The idea of RTA is extended to the M-D Cooley-Tukey (C-T) FFT algorithm and M-D Good-Thomas (G-T) prime factor algorithm and a new implementation strategy for these algorithms that requires no interprocessor communication is discussed.
Abstract: To reduce or eliminate the interprocessor communications, I. Gertner, R. Tolimieri, and their colleagues proposed an M-D fast Fourier transform (FFT) algorithm, called the reduced transform algorithm (RTA). In the present work, the idea of RTA is extended to the M-D Cooley-Tukey (C-T) FFT algorithm and M-D Good-Thomas (G-T) prime factor algorithm. A new implementation strategy for these algorithms that requires no interprocessor communication is discussed. A hybrid algorithm which combines the C-T or G-T algorithm with RTA is also described.

Journal ArticleDOI
TL;DR: In this article, a fast algorithm for iterative extrapolation based on the fast Hartley transform (FHT) is presented, where low-pass filtering in the iterative procedure can be implemented by the FHT directly instead of by the fast Fourier transform (FFT) and the inverse FFT.
Abstract: A fast algorithm for A. Papoulis's (1975) and R.W. Gerchberg's (1974) iterative extrapolation based on the fast Hartley transform (FHT) is presented. The low-pass filtering in the iterative procedure can be implemented by the FHT directly instead of by the fast Fourier transform (FFT) and the inverse FFT. M.S. Sabri and W. Steenart's (1978) example demonstrates that the FHT approach is simple to use.

Journal ArticleDOI
TL;DR: It can be proved that the computational complexity of the proposed DFT algorithm is identical to that of the most popular split-radix FFT, yet it requires real-number arithmetic only.
Abstract: Discrete Fourier transform (DFT)/discrete Hartley transform (DHT) algorithms based on the basis-vector decomposition of the corresponding transform matrices are derived. The computations of DFT are divided into two stages: an add/subtract preprocessing and a block-diagonal postprocessing. Both stages can be computed effectively. It can be proved that the computational complexity of the proposed DFT algorithm is identical to that of the most popular split-radix FFT, yet it requires real-number arithmetic only. Generation and storage of the real multiplicative coefficients are easier than in conventional FFTs. Connections of the proposed approach with several well-known DFT algorithms are included. Furthermore, many variations of the proposed algorithm are also pointed out.

Proceedings ArticleDOI
01 Nov 1993
TL;DR: An on-chip VLSI architecture for computation of Fourier transforms is presented and the implementation of arithmetic operators is based on on-line arithmetic, and the transforms are performed by a parallel-pipeline version of the Cooley-Tukey fast Fourier transform (FFT) algorithm.
Abstract: An on-chip VLSI architecture for computation of Fourier transforms is presented. It performs the arithmetic operations in a digit-level pipeline fashion. For this purpose, the implementation of arithmetic operators is based on on-line (i.e. digit-serial and most significant digit first) arithmetic, and the transforms are performed by a parallel-pipeline version of the Cooley-Tukey fast Fourier transform (FFT) algorithm. 1. INTRODUCTION In general, an important characteristic of digital signal processing problems is that the amount of data is huge and must be processed quickly. But fortunately, since many of these problems have some inherent regularities, it is possible to perform the whole process as a repetition of smaller similar processes. The computation of Fourier transforms is such a problem. For example, many efficient FFT algorithms have been studied for vector computers [1, 2, 3], for parallel SIMD [4] and MIMD [5] computers, and for shared-memory machines [6, 7]. In this paper, we deal with the computation of the FFT in a system having many special-purpose VLSI processors, and concentrate on the hardware implementation of such processors to get high performance by pipelining the arithmetic operations at the digit level.

Proceedings ArticleDOI
19 May 1993
TL;DR: The M-D FFT can be efficiently implemented by the unified 1-D indexing, and the address generator design can be simplified, because the matrix transposition is no longer necessary.
Abstract: A novel M-D (multidimensional) to 1-D FFT (fast Fourier transform) signal flow graph mapping is proposed. Thus, the M-D FFT can be efficiently implemented by the unified 1-D indexing, and the address generator design can be simplified. In addition, the matrix transposition is no longer necessary. The addressing sequences can be derived from the factorization of the twiddle factor matrix. The unified indexing concept of the M-D FFT implementation automatically solves the scaling problem of the block floating-point arithmetic. Practical chip design considerations in implementing the algorithm are presented.

Proceedings ArticleDOI
24 May 1993
TL;DR: The relationship among four different versions of the DFT and their inherent properties are explored and the new real-multiplier FFT-j algorithms are proposed for all four versions of the N=2^m DFT.
Abstract: In this paper four different versions of the DFT (DFT-j, j=I,II,III,IV) are first introduced. Then, the relationship among four different versions of the DFT and their inherent properties are explored. The new real-multiplier FFT-j algorithms are proposed for all four versions of the N=2^m DFT. The algorithm formulae are derived, represented by Kronecker product and direct sum. Finally, the signal flowgraph for the length-2^3 FFT-j is given to illustrate the proposed algorithms. The computational complexity is analysed and a comparison is made with other existing real-multiplier FFT algorithms. The proposed algorithms require the minimum number of arithmetic operations and use real multipliers and allow in-place computation. Besides, the proposed algorithms have very simple and regular structure. They have been implemented by software and have been finding practical applications.

Journal ArticleDOI
N. Morishima
TL;DR: Finite Fourier integrals based on a cubic-splines fit to equidistant data are shown to be evaluated fast and accurately, providing high accuracy with much shorter CPU time than a trapezoidal FFT.