Showing papers on "Split-radix FFT algorithm" published in 1984


Journal ArticleDOI
01 Aug 1984
TL;DR: The Fast Hartley Transform (FHT) is as fast as or faster than the Fast Fourier Transform (FFT) and serves for all the uses such as spectral analysis, digital processing, and convolution to which the FFT is at present applied.
Abstract: A fast algorithm has been worked out for performing the Discrete Hartley Transform (DHT) of a data sequence of N elements in a time proportional to N log₂ N. The Fast Hartley Transform (FHT) is as fast as or faster than the Fast Fourier Transform (FFT) and serves for all the uses such as spectral analysis, digital processing, and convolution to which the FFT is at present applied. A new timing diagram (stripe diagram) is presented to illustrate the overall dependence of running time on the subroutines composing one implementation; this mode of presentation supplements the simple counting of multiplies and adds. One may view the Fast Hartley procedure as a sequence of matrix operations on the data and thus as constituting a new factorization of the DFT matrix operator; this factorization is presented. The FHT computes convolutions and power spectra distinctly faster than the FFT.

455 citations
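
To make the Hartley transform in the entry above concrete, here is a minimal Python sketch of the discrete Hartley transform and of circular convolution carried out entirely in the Hartley domain. It is a direct O(N²) evaluation of the definition, not the fast algorithm from the paper, and the function names (dht, idht, circular_convolve_dht) are illustrative assumptions rather than anything taken from the source.

```python
import math


def dht(x):
    """Direct O(N^2) discrete Hartley transform (illustrative, not the fast FHT).

    H[k] = sum_j x[j] * cas(2*pi*j*k/N), where cas(t) = cos(t) + sin(t).
    """
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for j in range(n):
            t = 2.0 * math.pi * j * k / n
            acc += x[j] * (math.cos(t) + math.sin(t))
        out.append(acc)
    return out


def idht(h):
    """Inverse DHT: the transform is its own inverse up to a factor 1/N."""
    n = len(h)
    return [v / n for v in dht(h)]


def circular_convolve_dht(a, b):
    """Circular convolution of two equal-length real sequences via the DHT.

    Uses the Hartley convolution theorem:
    C[k] = 0.5 * (A[k]*(B[k] + B[-k]) + A[-k]*(B[k] - B[-k])).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    n = len(a)
    A, B = dht(a), dht(b)
    C = []
    for k in range(n):
        km = (-k) % n  # index of the mirrored coefficient
        C.append(0.5 * (A[k] * (B[k] + B[km]) + A[km] * (B[k] - B[km])))
    return idht(C)
```

For real input the Hartley coefficients equal the real part minus the imaginary part of the corresponding DFT coefficients, so the transform stays in real arithmetic throughout.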


Journal ArticleDOI
TL;DR: A new N = 2ⁿ fast Fourier transform algorithm is presented, which has fewer multiplications and additions than radix-2ⁿ (n = 1, 2, 3) algorithms and the same number of multiplications as the Rader-Brenner algorithm, but far fewer additions.
Abstract: A new N = 2ⁿ fast Fourier transform algorithm is presented. It has fewer multiplications and additions than radix-2ⁿ (n = 1, 2, 3) algorithms, and the same number of multiplications as the Rader-Brenner algorithm but far fewer additions. It is also numerically better conditioned and is performed ‘in place’ by repetitive use of a ‘butterfly’-type structure.

412 citations
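
Since this listing is filed under the split-radix FFT, a short sketch of the textbook split-radix decomposition may help. It splits a length-N transform into one half-length transform of the even samples and two quarter-length transforms of the samples at indices 4m+1 and 4m+3. This is a recursive, out-of-place Python illustration written for this listing; it is not the in-place butterfly program from the paper, and the name split_radix_fft is an assumption.

```python
import cmath


def split_radix_fft(x):
    """Recursive split-radix DFT for power-of-two lengths (illustrative sketch)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    if n == 2:
        return [complex(x[0] + x[1]), complex(x[0] - x[1])]

    u = split_radix_fft(x[0::2])   # even samples, length n/2
    z = split_radix_fft(x[1::4])   # samples 4m+1, length n/4
    zp = split_radix_fft(x[3::4])  # samples 4m+3, length n/4

    q = n // 4
    out = [0j] * n
    for k in range(q):
        a = cmath.exp(-2j * cmath.pi * k / n) * z[k]       # w^k  * Z[k]
        b = cmath.exp(-2j * cmath.pi * 3 * k / n) * zp[k]  # w^3k * Z'[k]
        s = a + b
        d = -1j * (a - b)
        out[k] = u[k] + s
        out[k + 2 * q] = u[k] - s
        out[k + q] = u[k + q] + d
        out[k + 3 * q] = u[k + q] - d
    return out
```

Each pass through the loop produces four outputs from two twiddle multiplications, which is the source of the reduced operation count relative to a plain radix-2 recursion.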


Journal ArticleDOI
TL;DR: A simple algorithm for the evaluation of discrete Fourier transforms (DFT) and discrete cosine transforms (DCT) is presented, which achieves a substantial decrease in the number of additions when compared to currently used FFT algorithms.

366 citations


Journal ArticleDOI
01 Aug 1984
TL;DR: Several methods for lengthening vectors are discussed, including the case of multiple and multi-dimensional transforms where M sequences of length N can be transformed as a single sequence of length MN using a 'truncated' FFT.
Abstract: The adaptation of the Cooley-Tukey, the Pease and the Stockham FFTs to vector computers is discussed. Each of these algorithms computes the same result, namely the discrete Fourier transform. They differ only in the way that intermediate computations are stored. Yet it is this difference that makes one or the other more appropriate depending on the application. This difference also influences the computational efficiency on a vector computer and motivates the development of methods to improve efficiency. Each of the FFTs is defined rigorously by a short expository FORTRAN program which provides the basis for discussions about vectorization. Several methods for lengthening vectors are discussed, including the case of multiple and multi-dimensional transforms where M sequences of length N can be transformed as a single sequence of length MN using a 'truncated' FFT. The implementation of an in-place FFT on a computer with memory-to-memory architecture is made possible by in-place matrix-vector multiplication.

164 citations
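
The entry above contrasts FFT variants by how they store intermediate results. As one illustration, the sketch below is a textbook radix-2 Stockham autosort FFT in Python, which ping-pongs between two buffers and delivers its output in natural order without a separate reordering pass. It is not the paper's FORTRAN program, and the name stockham_fft is an assumption.

```python
import cmath


def stockham_fft(x):
    """Radix-2 Stockham autosort FFT for power-of-two lengths (illustrative sketch)."""
    total = len(x)
    if total == 0 or total & (total - 1):
        raise ValueError("length must be a power of two")
    src = [complex(v) for v in x]
    dst = [0j] * total
    n, s = total, 1  # n: current sub-transform length, s: stride
    while n > 1:
        m = n // 2
        for p in range(m):
            wp = cmath.exp(-2j * cmath.pi * p / n)
            # Inner loop over q: s contiguous elements sharing one twiddle factor.
            for q in range(s):
                a = src[q + s * p]
                b = src[q + s * (p + m)]
                dst[q + s * (2 * p)] = a + b
                dst[q + s * (2 * p + 1)] = (a - b) * wp
        src, dst = dst, src  # swap buffers instead of working in place
        n //= 2
        s *= 2
    return src
```

In this formulation the inner loop has s iterations, so it is short in the early stages and long in the late ones; that varying loop length is the kind of property that matters when mapping the computation onto vector hardware.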


Journal ArticleDOI
TL;DR: This paper describes an analog speech scrambler using the FFT technique (FFT scrambler), which provides a highly secure scrambled signal by permuting a large number of FFT coefficients.
Abstract: This paper describes an analog speech scrambler using the FFT technique (FFT scrambler). The FFT scrambler provides a highly secure scrambled signal by permuting a large number of FFT coefficients. Important items to be considered in designing the FFT scrambler are discussed, such as FFT frame length, permutation of FFT coefficients, and frame synchronization, in addition to the configuration of the experimental FFT scrambler. Possible causes for the degradation of the descrambled signal, such as transmission channel group delay and intersymbol interference, are also discussed, together with experimental results.

31 citations


Proceedings ArticleDOI
19 Mar 1984
TL;DR: This paper presents an in-place, radix-2 FFT that does the unscrambling while the FFT is being calculated rather than as a separate process.
Abstract: Most Cooley-Tukey, in-place Fast Fourier Transform (FFT) algorithms result in the output being permuted or scrambled in order. For a radix-2 FFT, this order can be easily found by reversing the order of the bits of the address, and the unscrambler is called a bit-reversed counter. In some machines, this unscrambling takes from 10% to 50% of the total execution time. This paper presents an in-place, radix-2 FFT that does the unscrambling while the FFT is being calculated rather than as a separate process. The theoretical framework is based on index maps [1] and ideas used on the in-place, in-order prime factor FFT (PFA) [2]. The non-scrambled algorithm is implemented in FORTRAN. The size of the program is essentially the same as the regular radix-2 FFT with its bit-reversed counter.

29 citations
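
The bit-reversed ordering mentioned in the entry above can be sketched in a few lines of Python. This shows the conventional separate unscrambling pass that the paper sets out to avoid, not the paper's in-order algorithm; the function names are illustrative assumptions.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    rev = 0
    for _ in range(bits):
        rev = (rev << 1) | (index & 1)
        index >>= 1
    return rev


def bit_reverse_permute(x):
    """Return a copy of x (power-of-two length) in bit-reversed index order."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    out = list(x)
    for i in range(n):
        j = bit_reverse(i, bits)
        if i < j:  # swap each pair only once
            out[i], out[j] = out[j], out[i]
    return out
```

For a length-8 sequence the indices 0..7 map to 0, 4, 2, 6, 1, 5, 3, 7, and applying the permutation twice restores the original order.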


Proceedings ArticleDOI
01 Mar 1984
TL;DR: The primary goals of these techniques are to eliminate unnecessary computations required when implementing a complex transform on a real-valued vector, to compute the transform in-place in the original length-N real vector, and to obtain the transform coefficients in-order.
Abstract: This paper presents two techniques for computing a discrete transform of a vector of real-valued data using the Prime Factor Algorithm (PFA) with high-speed convolution. These techniques are applied to the Discrete Fourier Transform (DFT) and the Discrete Hartley Transform (DHT). The primary goals of these techniques are to eliminate unnecessary computations required when implementing a complex transform on a real-valued vector, to compute the transform in-place in the original length-N real vector, and to obtain the transform coefficients in-order. The two algorithms described require modification of the Winograd short-length transform modules to accommodate a real input. One technique replaces the modules in the Burrus-Eschenbacher PFA program with the modified real-input modules and constructs the complete transform in a final step of additions and subtractions after modules for each factor have been executed. The other technique uses these real-input DFT modules for part of the computation associated with each factor and requires complex input DFT modules for another part of the computation. These algorithms require exactly one half of the number of multiplications and slightly less than one half of the number of additions required by a complex-input PFA.

18 citations


Journal ArticleDOI
TL;DR: In this paper, a new technique that significantly minimizes the aliasing error encountered in the conventional use of the fast Fourier transform (FFT) algorithms for the efficient evaluation of Fourier transforms of spatially limited functions (such as those that occur in the radiation pattern analysis of reflector antennas and planar near field to far field (NF-FF) transformation) is presented.
Abstract: A new technique that significantly minimizes the aliasing error encountered in the conventional use of the fast Fourier transform (FFT) algorithms for the efficient evaluation of Fourier transforms of spatially limited functions (such as those that occur in the radiation pattern analysis of reflector antennas and planar near field to far field (NF-FF) transformation) is presented and illustrated through a typical example. Employing this technique and a discrete Fourier series (DFS) expansion for the integrand, a method for computing the radiation integrals of reflector antennas and planar NF-FF transformation integrals at arbitrary observation angles with optimum use of computer memory and time is also described.

17 citations


Journal ArticleDOI
TL;DR: It is shown that the 2ⁿ FFT can be computed with fewer than 2ⁿ⁺¹ nontrivial complex multiplications, and a variation of this algorithm is shown to give the same multiplication count as the 'split-radix' FFT.
Abstract: First we give a decomposition of an FFT of length 2ⁿ into a number of one-dimensional polynomial products. If these products are computed with minimum multiplication algorithms, we show that the 2ⁿ FFT can be computed with fewer than 2ⁿ⁺¹ nontrivial complex multiplications. A variation of this algorithm is also shown to give the same multiplication count as the 'split-radix' FFT.

17 citations


Proceedings ArticleDOI
01 Dec 1984
TL;DR: The pipeline FFT implementation is explained and attention is focused on the current activity which involves developing a fixed point arithmetic version using CMOS multipliers and adders to reduce the power consumption.
Abstract: This paper describes recent progress in the implementation of a high speed Fast Fourier Transform (FFT) processor with state-of-the-art VLSI circuits. Initial efforts have produced FFT and inverse FFT processors that operate at data rates of up to 40 MHz (complex). The current implementation computes transforms of up to 16,384 points in length by means of the McClellan and Purdy radix-4 pipeline FFT algorithm. The arithmetic is performed by single-chip 22-bit floating point adders and multipliers, while the interstage reordering is performed by delay commutators implemented with semi-custom VLSI. This paper explains the pipeline FFT implementation and focuses attention on our current activity, which involves developing a fixed point arithmetic version using CMOS multipliers and adders to reduce the power consumption.

11 citations


Proceedings ArticleDOI
01 Jan 1984
TL;DR: By using state of the art arithmetic components and judicious semi-custom circuit development, an FFT processor has been implemented that computes a 4096 point (complex) transform in 102 microseconds.
Abstract: This paper describes recent progress in implementation of a 40 MHz (complex) data rate frequency domain adaptive digital filter. The filter uses multiple time overlapped channels each consisting of an FFT, a frequency domain multiplier, and an inverse FFT. The 4096 point FFT and inverse FFT processors realize the McClellan and Purdy radix 4 pipeline FFT algorithm with 22 bit floating point arithmetic. The arithmetic is performed with single chip floating point adders and multipliers. The interstage reordering is performed with a delay commutator implemented with semi-custom VLSI. By using state of the art arithmetic components and judicious semi-custom circuit development, an FFT processor has been implemented that computes a 4096 point (complex) transform in 102 microseconds.
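
The timing quoted in the entry above can be checked with simple arithmetic. The Python sketch below assumes that a pipeline FFT accepts one complex sample per clock at the stated 40 MHz data rate, so that a transform takes as long as it takes to stream its input through; that streaming model and the function name are assumptions made for illustration, not details from the abstract.

```python
def pipeline_transform_time_us(n_points, data_rate_hz):
    """Time in microseconds to stream n_points samples at data_rate_hz.

    Assumes one complex sample enters the pipeline per clock period.
    """
    return n_points / data_rate_hz * 1e6


# 4096 points at a 40 MHz complex data rate: about 102.4 microseconds.
time_4096_us = pipeline_transform_time_us(4096, 40e6)
```

Under that assumption the result is about 102.4 microseconds, consistent with the 102 microseconds reported for the 4096-point transform.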

Journal ArticleDOI
TL;DR: An extended fast Fourier transform algorithm which entirely eliminates or greatly reduces such operations is introduced and the derived algorithm has been applied to ARMA spectral estimation and its effectiveness compared to other methods.
Abstract: The conventional FFT algorithm can be used for the computation of ARMA spectral estimates, but a large number of operations would involve zeros. An extended fast Fourier transform algorithm which entirely eliminates or greatly reduces such operations is introduced in this paper. Subsequently, the derived algorithm has been applied to ARMA spectral estimation and its effectiveness compared to other methods.

Journal ArticleDOI
TL;DR: A novel way of organizing a twiddle factor table and indexing butterfly terms for efficiently computing the radix-4 fast Fourier transform is presented.

Proceedings ArticleDOI
04 Dec 1984
TL;DR: A simple, easily implemented method for an FFT-based implementation of image cross-correlation which provides for the proper normalization required to obtain useful, easily interpreted output is presented.
Abstract: Presented here is a simple, easily implemented method for an FFT-based implementation of image cross-correlation which provides for the proper normalization required to obtain useful, easily interpreted output. The results are applied to character recognition. The algorithm is structured such that the data is processed in a highly parallel fashion in order to facilitate a fast, efficient hardware implementation. The relative merits of using this method are discussed.

Journal ArticleDOI
TL;DR: The letter demonstrates an FFT algorithm implemented on the 68000 microprocessor that can calculate a 256-point transform in less than 48 ms.
Abstract: The letter demonstrates an FFT algorithm implemented on the 68000 microprocessor that can calculate a 256-point transform in less than 48 ms. The algorithm employs an interesting method of scaling data to overcome overflow.

Journal ArticleDOI
TL;DR: It seems, however, that the main problem with [1] concerns the fact that the proposed programs are not optimal from the point of view of the number of data transfers.
Abstract: Reference [1] contains corrections to the arithmetical errors of the paper and to the part of the resulting concluding remarks (the concluding remarks concerning 1024-point FFT and 1008-point WFTA remain uncorrected). It seems, however, that the main problem with [1] concerns the fact that the proposed programs are not optimal from the point of view of the number of data transfers. The main limitations of the programs given in [1] are related to the structures of small transforms, as well as to the structures of complete fast Fourier transform algorithms. Some of these limitations are pointed out in the correspondence.

Proceedings ArticleDOI
09 Jul 1984
TL;DR: The results of performance analysis show that the combination of adaptive architecture capability and VLSI technology can provide a practical solution for meeting the goal of advanced real-time FFT processing.
Abstract: A versatile special-purpose VLSI fast Fourier transform (FFT) processor is presented. It can process FFTs of various data sizes and cooperate with other identical FFT processors to accomplish cascade and parallel FFT processing schemes. The operations of the single processor FFT processing scheme, the multiprocessor cascade FFT processing scheme, and the multiprocessor parallel FFT processing scheme are described. The results of performance analysis show that the combination of adaptive architecture capability and VLSI technology can provide a practical solution for meeting the goal of advanced real-time FFT processing.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: The novelty of the proposed algorithm is that it reduces the number of multiplications and simplifies the hardware implementation; the results are very promising, and the technique will be a viable alternative to the FFT and other DFT algorithms.
Abstract: A new method for the evaluation of the Discrete Fourier Transform (DFT) is presented. This method evaluates the DFT of samples of a continuous time signal by multiplying the DFT of the difference signal at the output of a Linear Delta Modulator (LDM) by a rotation factor. The novelty of the proposed algorithm is that it reduces the number of multiplications and simplifies the hardware implementation. A further modification of this algorithm does not require any multiplication at all during the DFT computation. Implementing convolutions, the chirp-z transform and the discrete Hilbert transform with the proposed technique offers good opportunities for additional research with respect to simple hardware implementation, high speed, and computational simplicity. The proposed technique is, in fact, the combination of an encoding technique and an FFT algorithm. The results are very promising, and the technique will be a viable alternative to the FFT and other DFT algorithms.