scispace - formally typeset

Showing papers on "Prime-factor FFT algorithm" published in 1991


Proceedings ArticleDOI
26 Jun 1991
TL;DR: A linear and a nonlinear algorithm are presented for the problem of system "identification in H∞", posed by Helmicki, Jacobson and Nett; the nonlinear algorithm has the robust convergence property.
Abstract: In this paper, a linear and a nonlinear algorithm are presented for the problem of system "identification in H∞", posed by Helmicki, Jacobson and Nett. We derive some error bounds for the linear algorithm which indicate that if the model error is not too high, then this algorithm has good guaranteed error properties. The linear algorithm requires only FFT (fast Fourier transform) computations. A nonlinear algorithm, which requires an additional step of solving a Nehari best approximation problem, is also presented that has the robust convergence property.

130 citations


Journal ArticleDOI
TL;DR: In this paper, the fast Fourier transform (FFT) technique is utilized to simulate a multivariate nonstationary Gaussian random process with prescribed evolutionary spectral description, and a stochastic decomposition technique facilitates utilization of the FFT algorithm.
Abstract: The fast Fourier transform (FFT) technique is utilized to simulate a multivariate nonstationary Gaussian random process with prescribed evolutionary spectral description. A stochastic decomposition technique facilitates utilization of the FFT algorithm. The decomposed spectral matrix is expanded into a weighted summation of basic functions and time‐dependent weights that are simulated by the FFT algorithm. The desired evolutionary spectral characteristics of the multivariate unidimensional process may be prescribed in a closed form or a set of numerical values at discrete frequencies. The effectiveness of the proposed technique is demonstrated by means of three examples with different evolutionary spectral characteristics derived from past earthquake events. The closeness between the target and the corresponding estimated correlation structure suggests that the simulated time series reflect the prescribed probabilistic characteristics extremely well. The simulation approach is computationally efficient, p...

104 citations


Proceedings Article
01 Jan 1991
TL;DR: In this paper, a parallel algorithm for the Fourier transform on the star graph is presented, which requires O(n^2) multiply-add steps for an input sequence of n! elements, and is hence cost-optimal with respect to the sequential algorithm on which it is based.
Abstract: The n-star graph, denoted by S_n, is one of the graph networks that have been recently proposed as attractive alternatives to the n-cube topology for interconnecting processors in parallel computers. We present a parallel algorithm for the computation of the Fourier transform on the star graph. The algorithm requires O(n^2) multiply-add steps for an input sequence of n! elements, and is hence cost-optimal with respect to the sequential algorithm on which it is based. This is believed to be the first algorithm, and the only one to date, for the computation of the Fourier transform on the star graph.

74 citations


Journal ArticleDOI
TL;DR: A structured approach is used to generate a fast algorithm to compute two-dimensional discrete cosine transforms (DCT) based on Hou's method and this algorithm is described by logic diagrams which reveal the relationship between the 2-D algorithm and its 1-D counterpart.
Abstract: A structured approach is used to generate a fast algorithm to compute two-dimensional discrete cosine transforms (DCT) based on Hou's method. Hou's algorithm is extended to the 2-D case using an approach presented in both matrix and diagrammatical forms. The matrix approach is discussed, and this forms a basis on which a 2-D fast DCT algorithm is derived. It is shown that this matrix method has a structure similar to that of the 1-D Cooley-Tukey fast Fourier transform (FFT) algorithm. Then the decimation-in-frequency (DIF) 2-D fast DCT algorithm is presented using matrix forms which use the tensor (or Kronecker) product as a construction tool. Finally, the 2-D algorithm is described by logic diagrams which reveal the relationship between the 2-D algorithm and its 1-D counterpart. As an example, the logic diagram of an 8-point × 8-point 2-D DCT using the new 2-D DCT algorithm is generated through a simple procedure.

69 citations


Journal ArticleDOI
TL;DR: It is shown how the familiar radix-2 Fast Fourier Transform algorithm can be extended to radix-3, radix-4, radix-5, and finally to mixed-radix FFTs, and how these new versions of the FFT require neither an unscrambling step nor work space.
Abstract: It has recently been shown that the familiar radix-2 Fast Fourier Transform (FFT) algorithm can be made both self-sorting and in-place, two useful properties which were previously thought to be mutually exclusive. In this paper the procedure is demonstrated and it is shown how it can be extended to radix-3, radix-4, radix-5, and finally to mixed-radix FFTs. These new versions of the FFT algorithm require neither an unscrambling step nor work space. Implementation on vector computers (for the case of multiple transforms) is discussed. Timing experiments on the Cray X-MP demonstrate that these new variants of the FFT run just as fast as older self-sorting routines which required work space.

30 citations


Journal ArticleDOI
TL;DR: The bit-reversal counter algorithm of B. Gold and C.M. Rader (1969) bit reverses a continuous sequence of N numbers by running a loop N - 1 times, and the heuristic approach presented repeats a similar loop only N/4 times.
Abstract: The bit-reversal counter algorithm of B. Gold and C.M. Rader (1969) bit reverses a continuous sequence of N numbers by running a loop N - 1 times. The heuristic approach presented repeats a similar loop only N/4 times.
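
For reference, the baseline loop count quoted above can be illustrated with the classic bit-reversed counter. The sketch below is the standard textbook form, whose loop body runs N - 1 times; it is not the N/4 heuristic of the paper, whose details the abstract does not give. The function name `bit_reverse_permute` is an assumption made for illustration, and N is assumed to be a power of two.

```python
def bit_reverse_permute(x):
    """Reorder x in place into bit-reversed index order (len(x) a power of two)."""
    n = len(x)
    j = 0  # j always holds the bit-reversed value of i
    for i in range(n - 1):  # the loop body runs N - 1 times
        if i < j:
            # Swap each pair once; indices with i == j stay where they are.
            x[i], x[j] = x[j], x[i]
        # Add 1 to j in bit-reversed arithmetic: the carry moves from the
        # most significant bit towards the least significant bit.
        k = n >> 1
        while k <= j:
            j -= k
            k >>= 1
        j += k
    return x


if __name__ == "__main__":
    print(bit_reverse_permute(list(range(8))))
```

Applying the permutation twice restores the original order, which gives a quick sanity check for any faster variant.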

28 citations


Patent
03 Sep 1991
TL;DR: In this paper, a pipelined Fast Fourier Transform (FFT) architecture includes a memory for storing complex number data and a data path coupled to the memory for accessing R complex numbers therefrom, for computing an FFT butterfly, and storing R results from the FFT computation in the memory during one pipeline cycle.
Abstract: A pipelined Fast Fourier Transform (FFT) architecture includes a memory for storing complex number data. A pipelined data path is coupled to the memory for accessing R complex number data therefrom, for computing an FFT butterfly, and storing R results from the FFT butterfly computation in the memory during one pipeline cycle.

27 citations


Journal ArticleDOI
TL;DR: A parallel architecture especially designed for a synthetic-aperture-radar (SAR) processing algorithm based on an appropriate two-dimensional fast Fourier transform (FFT) code is presented, allowing drastic reduction of the processing time, preserving elaboration accuracy and flexibility.
Abstract: A parallel architecture especially designed for a synthetic-aperture-radar (SAR) processing algorithm based on an appropriate two-dimensional fast Fourier transform (FFT) code is presented. The algorithm is briefly summarized, and the FFT code is given for the one-dimensional case, although all results can be immediately generalized to the double FFT. The computer architecture, which consists of a toroidal net with transputers on each node, is described. Parametric expressions for the computational time of the net versus the number of nodes are derived. The architecture allows drastic reduction of the processing time, preserving elaboration accuracy and flexibility.

27 citations


Journal ArticleDOI
TL;DR: In this paper, the rho-algorithm, a nonlinear transformation, is shown to be applicable to monotonic series and the results of applying the algorithm to a series involving the zeroth-order Hankel function of the second kind and its associated Fourier transform are given.
Abstract: The rho-algorithm, a nonlinear transformation, is shown to be applicable to monotonic series. The results of applying the algorithm to a series involving the zeroth-order Hankel function of the second kind and its associated Fourier transform are given. It is shown that the algorithm performs better than the epsilon-algorithm derived from Shanks' transform (1955). Numerical results include a relative error measure versus number of terms taken in the series.

25 citations


Proceedings Article
01 Jan 1991

23 citations


Patent
21 Oct 1991
TL;DR: In this paper, a radix-12 FFT is presented, where complex data are represented in a 1, W_3 coordinate system rather than in a classic 1, j coordinate system, and the only multiplicative scaler in the complex twiddle factors is the reciprocal of the square root of 3, which appears six times and which, by conversion to canonical signed digit code, can be accurately expressed by 5 adds.
Abstract: Using classic Fast Fourier Transform (FFT) rules, a radix-12 FFT is composed of a first tier of 2 multiplierless radix-6 transformers followed by multiplierless radix-2 transformers, or by its transpose configuration. Complex data are represented in a 1, W_3 coordinate system rather than in a classic 1, j coordinate system. The only multiplicative scaler in the complex twiddle factors is the reciprocal of the square root of 3, which appears six times and which, by conversion to canonical signed digit code, can be accurately expressed by 5 adds. As a consequence the complex twiddle factor multipliers and ancillary address reduce to a total of 144 real adds required to perform the entire complex 12-point FFT.

Journal ArticleDOI
01 Sep 1991
TL;DR: The Bluestein FFT may be the algorithm of choice on multiprocessors, particularly those with the hypercube architecture because of its minimal communication requirements and for most values of N it is also shown to be superior to another alternative, namely parallel multiplication.
Abstract: The original Cooley-Tukey FFT was published in 1965 and presented for sequences with length N equal to a power of two. However, in the same paper they noted that their algorithm could be generalized to composite N in which the length of the sequence was a product of small primes. In 1967, Bergland presented an algorithm for composite N and variants of his mixed radix FFT are currently in wide use. In 1968, Bluestein presented an FFT for arbitrary N including large primes. However, for composite N, Bluestein's FFT was not competitive with Bergland's FFT. Since it is usually possible to select a composite N, Bluestein's FFT did not receive much attention. Nevertheless, because of its minimal communication requirements, the Bluestein FFT may be the algorithm of choice on multiprocessors, particularly those with the hypercube architecture. In contrast to the mixed radix FFT, the communication pattern of the Bluestein FFT maps quite well onto the hypercube. With P = 2^d processors, an ordered Bluestein FFT requires 2d communication cycles with packet length N/2P, which is comparable to the requirements of a power of two FFT. For fine-grain computations, the Bluestein FFT requires 20 log_2 N computational cycles. Although this is double that required for a mixed radix FFT, the Bluestein FFT may nevertheless be preferred because of its lower communication costs. For most values of N it is also shown to be superior to another alternative, namely parallel multiplication.
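
To make the Bluestein idea concrete, here is a minimal serial sketch of how a DFT of arbitrary length N can be rewritten as a convolution with a chirp and evaluated with power-of-two FFTs. It illustrates only the general technique, not the hypercube implementation analysed in the paper. The helper names `fft_pow2` and `bluestein_dft` are assumptions for illustration, and the recursive radix-2 FFT is chosen for brevity rather than speed.

```python
import cmath


def fft_pow2(a, inverse=False):
    """Unnormalised recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1 if inverse else -1
    even = fft_pow2(a[0::2], inverse)
    odd = fft_pow2(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def bluestein_dft(x):
    """DFT of arbitrary length n >= 1 via chirp convolution (Bluestein's identity)."""
    n = len(x)
    # Chirp w[k] = exp(-i*pi*k^2/n); k^2 is reduced mod 2n to keep the phase small.
    w = [cmath.exp(-1j * cmath.pi * ((k * k) % (2 * n)) / n) for k in range(n)]
    # Power-of-two length strictly greater than 2n - 1, so the circular
    # convolution below does not wrap around.
    m = 1 << (2 * n - 1).bit_length()
    a = [x[k] * w[k] for k in range(n)] + [0j] * (m - n)
    b = [0j] * m
    for k in range(n):
        b[k] = w[k].conjugate()
        if k:
            b[m - k] = w[k].conjugate()  # negative lags stored at the end
    fa, fb = fft_pow2(a), fft_pow2(b)
    conv = fft_pow2([p * q for p, q in zip(fa, fb)], inverse=True)
    # Divide by m to normalise the inverse FFT, then apply the output chirp.
    return [w[k] * conv[k] / m for k in range(n)]


if __name__ == "__main__":
    print(bluestein_dft([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]))
```

The sketch uses three power-of-two FFTs of length at least 2N - 1, so its arithmetic cost is higher than that of a mixed radix FFT of comparable length.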

Proceedings ArticleDOI
14 Apr 1991
TL;DR: The authors introduce the pruned short-time FFT, a novel computational structure for efficiently computing the STFT with dense temporal sampling that achieves the same computational savings as the Goertzel algorithm, but is unconditionally stable.
Abstract: Although most applications which use the short-time Fourier transform (STFT) temporally downsample the output, some applications exploit a dense temporal sampling of the STFT. One example, coded-division multiple-beam sonar, is discussed. Given a need for the densely sampled STFT, the complexity of the computation can be reduced from O(N log N) for the general short-time FFT structure to O(N) using the Goertzel algorithm. The authors introduce the pruned short-time FFT, a novel computational structure for efficiently computing the STFT with dense temporal sampling. The pruned FFT achieves the same computational savings as the Goertzel algorithm, but is unconditionally stable.
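
As background for the comparison with the Goertzel algorithm, the sketch below shows the basic Goertzel recurrence for a single DFT bin of one block of samples. It is the standard second-order recursion, not the pruned short-time FFT introduced in the paper and not a sliding STFT; the function name `goertzel_bin` is an assumption for illustration.

```python
import cmath
import math


def goertzel_bin(x, k):
    """Return DFT bin k of the sequence x using the Goertzel recurrence."""
    n = len(x)
    omega = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(omega)
    s1 = 0.0  # s[m-1]
    s2 = 0.0  # s[m-2]
    for sample in x:
        # s[m] = x[m] + 2*cos(omega)*s[m-1] - s[m-2]
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # After the last sample, X[k] = exp(i*omega)*s[N-1] - s[N-2].
    return cmath.exp(1j * omega) * s1 - s2


if __name__ == "__main__":
    print(goertzel_bin([1.0, 2.0, -0.5, 3.0, 0.25, -1.0], 2))
```

The recursion has its poles on the unit circle, which is the usual source of the numerical stability concerns for long runs that the abstract alludes to.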

Journal ArticleDOI
TL;DR: An Nth-order Hankel transform (also called Fourier-Bessel transform) algorithm designed for many analytically defined functions is presented, which provides greater generality than many others.
Abstract: An Nth-order Hankel transform (also called Fourier-Bessel transform) algorithm designed for many analytically defined functions is presented. This algorithm is not restricted to order zero. As such, it provides greater generality than many others. The traditional difficulty in the evaluation of Hankel transforms, the presence of Bessel functions in the kernel of the integral transform, is eliminated in this new Hankel transform algorithm. The algorithm presented is composed of a fast (linear time) Nth-order Chebyshev transform followed by a fast Fourier transform.

Journal ArticleDOI
TL;DR: It is shown that the equalization of FFTs leads to results which are different from the widely used intuitive ones and the formulae of the method can be easily adapted for deriving algorithms for the cosine/sine DFT.
Abstract: A general method of deriving DFT (discrete Fourier transform) algorithms, generalised fast Fourier transform algorithms, is presented. It is shown that a special case of the method is equivalent to nesting of FFTs. The application of the method to the case where N has mutually prime factors results in a new interpretation of the permutations characteristic of this class of algorithms. It is shown that the equalization of FFTs leads to results which are different from the widely used intuitive ones. The high efficiency of split-radix FFTs is explained. It is shown that the formulae of the method can be easily adapted for deriving algorithms for the cosine/sine DFT. A set of FFTs that has smaller arithmetical and/or memory complexities than any algorithm known is presented. In particular, a method of deriving split-radix-2^s FFTs requiring N log_2 N - 3N + 4 real multiplications and 3N log_2 N - 3N + 4 additions for any s > 1 is presented.

Journal ArticleDOI
01 Jul 1991
TL;DR: In this paper, a conceptually simple algorithm is presented for the determination of the transfer function for two-dimensional generalised or singular systems using the discrete Fourier transform (DFT).
Abstract: A systematic and conceptually simple algorithm is presented for the determination of the transfer function for two-dimensional generalised or singular systems. The method uses the discrete Fourier transform (DFT) and can easily be applied. The simplicity and efficiency of the algorithm are illustrated by two examples.

Journal ArticleDOI
TL;DR: A modified fast cosine transform (FCT) algorithm is presented featuring the following three properties: the entire calculation is performed using arrays half the size of what would be required using a common fast Fourier transform (FFT).

Journal ArticleDOI
TL;DR: The Temperton (1985) indexing scheme for the prime factor fast Fourier transform algorithm (PFA-FFT) is generalized to include other prime factor maps and it is found that the updating scheme for the Ruritanian map can completely avoid the modulo operations required in indexing the data.
Abstract: The Temperton (1985) indexing scheme for the prime factor fast Fourier transform algorithm (PFA-FFT) is generalized to include other prime factor maps. In particular, it is found that the updating scheme for the Ruritanian map can completely avoid the modulo operations required in indexing the data.

Journal ArticleDOI
TL;DR: An efficient algorithm (involving real arithmetic only) for N-point DFT is developed and used as the basic building block for developing the real valued fast Fourier transform (FFT).
Abstract: The authors earlier developed a fast recursive algorithm for the discrete sine transform (see IEEE Trans. Acoust. Speech Signal Process., vol.38, no.3, p.553-7, 1990). This algorithm is used as the basic building block for developing the real valued fast Fourier transform (FFT). It is assumed that the input sequence is real and of length N, an integer power of 2. The N-point discrete Fourier transform (DFT) of a real sequence can be implemented via the real (cos DFT) and imaginary (sin DFT) components. The N-point cos DFT in turn can be developed from N/2-point cos DFT and N/4-point discrete sine transform (DST). Similarly, the N-point sin DFT can be developed from N/2-point sin DFT and N/4-point DST. Using this approach, an efficient algorithm (involving real arithmetic only) for N-point DFT is developed.

Journal ArticleDOI
TL;DR: The prime factor algorithm was implemented on a hypercube using CrOS III communication routines, taking 120 ms to compute the DFT of 5040 complex points using 32 nodes of the Caltech-JPL MARK III Hypercube, compared with 105 ms for a DFT of 4096 complex points using the Cooley-Tukey algorithm with the same hardware configuration.
Abstract: The prime factor algorithm (PFA) is an efficient discrete Fourier transform (DFT) computation algorithm in which a one-dimensional DFT is turned into a multidimensional DFT, consisting of a few short DFTs whose lengths are mutually prime, and then an efficient algorithm is used for the short DFTs. The PFA was implemented on a hypercube using CrOS III communication routines, taking 120 ms to compute the DFT of 5040 complex points using 32 nodes of the Caltech-JPL MARK III Hypercube. It took 105 ms to do a DFT of 4096 complex points using the Cooley-Tukey algorithm with the same hardware configuration. The performance of hypercubes MARK III, NCUBE, and iPSC and the relative importance of communication and calculation are analyzed. With the current communication speed the Cooley-Tukey algorithm performs fast on a massively concurrent processor and the PFA is advantageous when the number of processors is less than 64 or so. The experience with using the PFA also serves as a useful guide to a multidimensional fast Fourier transform implementation using any algorithm.
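
The mapping from a one-dimensional DFT to a multidimensional one can be sketched for two mutually prime factors as follows. This is a minimal serial illustration of the general prime factor idea, not the hypercube implementation described in the abstract: the input index uses the map n = (N2*i1 + N1*i2) mod N, the output index uses the Chinese remainder map (k mod N1, k mod N2), and no twiddle factors are needed between the two stages. The function names `dft` and `pfa_dft` are assumptions for illustration, and the short DFTs are computed naively for clarity.

```python
import cmath
from math import gcd


def dft(x):
    """Naive O(n^2) DFT, standing in for an efficient short-length DFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def pfa_dft(x, n1, n2):
    """DFT of length n1*n2 (n1, n2 coprime) via a two-dimensional prime factor map."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1*n2 with gcd(n1, n2) == 1")
    # Input map: element (i1, i2) of the grid is x[(n2*i1 + n1*i2) mod n].
    grid = [[x[(n2 * i1 + n1 * i2) % n] for i2 in range(n2)] for i1 in range(n1)]
    # Length-n2 DFT along each row.
    grid = [dft(row) for row in grid]
    # Length-n1 DFT along each column; cols[k2][k1] holds the 2-D transform.
    cols = [dft([grid[i1][k2] for i1 in range(n1)]) for k2 in range(n2)]
    # Output map (Chinese remainder theorem): k1 = k mod n1, k2 = k mod n2.
    return [cols[k % n2][k % n1] for k in range(n)]


if __name__ == "__main__":
    result = pfa_dft([float(i) for i in range(15)], 3, 5)
    print(result[:3])
```

Longer lengths such as 5040 = 5 x 7 x 9 x 16 follow by applying the same two-factor split repeatedly to mutually prime factors.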

Proceedings ArticleDOI
14 Apr 1991
TL;DR: A tool to aid in the automated VLSI implementation of the discrete Fourier transform (DFT) is described and a transformation technique between a symbolic computation environment and a behavioral synthesis environment for the transferring of functional primitives is discussed.
Abstract: A tool to aid in the automated VLSI implementation of the discrete Fourier transform (DFT) is described. This tool is tensor product algebra, a branch of finite-dimensional multilinear algebra. Tensor product formulations of fast Fourier transform (FFT) algorithms to compute the DFT are presented. These mathematical formulations are manipulated, using properties of tensor product algebra, to obtain variants that adapt to performance constraints in a VLSI implementation process. The possibility of automating this procedure by processing these mathematical formulations or expressions in a behavioral synthesis environment of a silicon compilation system is discussed. A transformation technique between a symbolic computation environment and a behavioral synthesis environment for the transferring of functional primitives is discussed.

Journal ArticleDOI
TL;DR: The hardware design of a circuit capable of producing digit reversed sequences for radix-2, radix-4, and mixed radix-2/4 fast Fourier transform algorithms is presented in detail.
Abstract: The hardware design of a circuit capable of producing digit reversed sequences for radix-2, radix-4, and mixed radix-2/4 fast Fourier transform (FFT) algorithms is presented in detail. The design requires selectively routing the output of a binary counter to the output address pointer used during the execution of the FFT. The digit reversed counter is capable of generating address sequences for fast Fourier transforms varying in size from 4 to 64 K data points.

Journal ArticleDOI
TL;DR: This application note from Motorola provides both an excellent tutorial on the FFT itself and describes how it can be implemented using a general-purpose digital signal processor.

Journal ArticleDOI
TL;DR: A fast algorithm is presented which computes the two-dimensional Hartley transform using the decimation in frequency decomposition and, due to its in-place property, it does not require midmemory devices or matrix transposition.
Abstract: A fast algorithm is presented which computes the two-dimensional Hartley transform. This algorithm is referred to as the split vector radix algorithm. It uses the decimation in frequency decomposition and, due to its in-place property, it does not require midmemory devices or matrix transposition. Its computational structure is simpler than that of the algorithm of L.Z. Chen (1983), and it is easy to program. Compared with the vector radix algorithm of R. Kumaresan and P.K. Gupta (1986), the proposed algorithm saves about 35% of the multiplications and 10% of the additions for the discrete Fourier transform (DFT) of a 4096 × 4096 real valued input sequence.

Journal ArticleDOI
TL;DR: In this article, the disadvantages of numerical inversion of the Laplace transform via the conventional fast Fourier transform (FFT) are identified and an improved method is presented to remedy them.
Abstract: The disadvantages of numerical inversion of the Laplace transform via the conventional fast Fourier transform (FFT) are identified and an improved method is presented to remedy them. The improved method is based on introducing a new integration step length Δω = π/mT for trapezoidal-rule approximation of the Bromwich integral, in which a new parameter, m, is introduced for controlling the accuracy of the numerical integration. Naturally, this method leads to multiple sets of complex FFT computations. A new inversion formula is derived such that N equally spaced samples of the inverse Laplace transform function can be obtained by (m/2) + 1 sets of N-point complex FFT computations or by m sets of real fast Hartley transform (FHT) computations.

Proceedings ArticleDOI
04 Nov 1991
TL;DR: A novel filter-bank interpretation of the procedure is presented, allowing understanding of the errors occurring in the method's use for an appropriate partial-band transform, and the novel algorithm is compared to existing methods especially for the computation of a limited number of frequency points.
Abstract: A detailed analysis of a newly proposed fast Fourier transform (FFT) type algorithm is presented. Several variants are introduced in the form of signal-flow graph (SFG) descriptions. The main characteristic of the approach is the frequency-separation property of the subsequences involved in the decomposition process. A novel filter-bank interpretation of the procedure is presented, allowing understanding of the errors occurring in the method's use for an appropriate partial-band transform. These errors are studied in depth to obtain general formulas describing their nature, whatever the number and type of decomposition stages might be. The computational complexity of the algorithm is analyzed both theoretically and in terms of running-time measurements. With these insights, the novel algorithm is compared to existing methods especially for the computation of a limited number of frequency points. Previously reported complexity estimates are refined and extended.

Journal ArticleDOI
TL;DR: A bus-oriented multiprocessor architecture specialized for computation of the discrete Fourier transform (DFT) of a length N = 2^M sequential data stream is developed and allows flexibility in the number of processors and in the choice of a fast Fourier transform (FFT) algorithm.
Abstract: A bus-oriented multiprocessor architecture specialized for computation of the discrete Fourier transform (DFT) of a length N = 2^M sequential data stream is developed. The architecture distributes computation and memory requirements evenly among the processors and allows flexibility in the number of processors and in the choice of a fast Fourier transform (FFT) algorithm. With three buses, the bus bandwidth equals the input data rate. A single time-multiplexed bus with a bandwidth of three times the input data rate can alternatively be used. The architecture requires processors that have identical hardware, which makes it more attractive than the cascade (pipeline) FFT for multiprocessor implementation.

Proceedings ArticleDOI
14 Apr 1991
TL;DR: A comparative assessment of the computational complexity of several Gabor transform algorithms is given, with results in the range O(P^2) to O(P log_2 P), where P is the number of data points being transformed.
Abstract: A comparative assessment of the computational complexity of several Gabor transform algorithms is given, with results in the range O(P^2) to O(P log_2 P), where P is the number of data points being transformed. Among the results is a novel algorithm of lower complexity than previously known FFT (fast Fourier transform) based methods. The most efficient of the methods, which uses the Zak transform as an operational calculus, performs the Gabor analysis and synthesis transforms with a complexity comparable to that of the FFT.

Journal ArticleDOI
TL;DR: In this paper, a new formulation of the discrete Wigner-Ville distribution is presented, which can be implemented directly using standard fast Fourier transform techniques; for a non-negative frequency resolution of N points, only an N point FFT is needed.
Abstract: A new formulation of the discrete Wigner-Ville distribution is presented which can be implemented directly using standard fast Fourier transform techniques. For a non-negative frequency resolution of N points, only an N point FFT is needed.

Journal ArticleDOI
V. Nagesha
TL;DR: Efficient fast Fourier transform algorithms to compute the forward and inverse discrete Fourier transforms of a sequence with linear-phase characteristic are examined and can be easily written by simple restructuring of a complex FFT algorithm.
Abstract: Efficient fast Fourier transform (FFT) algorithms to compute the forward and inverse discrete Fourier transforms (DFT) of a sequence with linear-phase characteristic are examined. These reduce the computational requirements as regards a complex FFT by large factors and should be used whenever applicable. The case when the DFT coefficients are real-valued leads to further reductions in computational requirements. Though the redundancy in the linear-phase situation is exactly 50%, the computational requirements and implementation are quite different from the real-valued FFT which uses a similar symmetry relation. The code for such implementations can be easily written by simple restructuring of a complex FFT algorithm.