
Showing papers on "Split-radix FFT algorithm" published in 1999


Journal ArticleDOI
TL;DR: In this paper, an effective numerical algorithm based on inverting a specialized Laplace transform is derived for computing the two-dimensional power-series expansion coefficients of a two-variable function.
Abstract: An effective numerical algorithm based on inverting a specialized Laplace transform is derived for computing the two-dimensional power-series expansion coefficients of a two-variable function. Due to the special structure of the constructed 2D Laplace transform, the accuracy of the inverted function values can be assured effectively by the generalized Riemann zeta function evaluation and the multiple sets of 2D FFT computation. Therefore, the algorithm is particularly amenable to modern computers having multiprocessors and/or vector processors.

72 citations


Posted Content
TL;DR: A new algorithm for reducing an arbitrary unitary matrix U into a sequence of elementary operations that can be used to manipulate an array of quantum bits and shows that the Fast Fourier Transform (FFT) algorithm is a special case of this algorithm.
Abstract: We present a new algorithm for reducing an arbitrary unitary matrix U into a sequence of elementary operations (operations such as controlled-nots and qubit rotations). Such a sequence of operations can be used to manipulate an array of quantum bits (i.e., a quantum computer). Our algorithm applies recursively a mathematical technique called the CS Decomposition to build a binary tree of matrices whose product, in some order, equals the original matrix U. We show that the Fast Fourier Transform (FFT) algorithm is a special case of our algorithm. We report on a C++ program called “Qubiter” that implements the ideas of this paper. Qubiter (patent pending) source code is publicly available.

44 citations


Journal ArticleDOI
TL;DR: An efficient algorithm for computing the real-valued FFT using radix-2 decimation-in-frequency (DIF) approach has been introduced and a C++ program that implements this algorithm has been included.
Abstract: An efficient algorithm for computing the real-valued FFT (of length N) using the radix-2 decimation-in-frequency (DIF) approach is introduced. The fact that the odd coefficients are the DFT values of an N/2-length linear phase sequence introduces a redundancy in the form of the symmetry X(2k+1) = X*(N-2k-1), which can be exploited to reduce the arithmetic complexity and memory requirements. The arithmetic complexity and memory requirements of the algorithm presented are exactly the same as those of the most efficient decimation-in-time (DIT) algorithm for the real-valued FFT that exists to date. A C++ program that implements this algorithm is included.

31 citations
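
The symmetry quoted in the abstract above can be checked numerically. The following is a minimal sketch in Python with NumPy, not the paper's C++ program: the helper name `odd_bins_dif` is hypothetical, and it only shows the radix-2 DIF step that produces the odd-indexed bins together with the conjugate symmetry among them.

```python
import numpy as np

def odd_bins_dif(x):
    """Odd-indexed DFT bins X[2k+1] of an even-length real sequence via one N/2-point FFT."""
    x = np.asarray(x, dtype=float)
    n_total = x.size
    half = n_total // 2
    n = np.arange(half)
    # DIF step: subtract the two halves, then rotate by the twiddle W_N^n.
    g = (x[:half] - x[half:]) * np.exp(-2j * np.pi * n / n_total)
    return np.fft.fft(g)

x = np.random.default_rng(0).standard_normal(16)
odd = odd_bins_dif(x)                    # odd[k] holds X[2k+1]
reference = np.fft.fft(x)[1::2]          # odd bins taken from a full FFT
# For real input, odd[N/2-1-k] == conj(odd[k]), i.e. X(N-2k-1) = X*(2k+1).
symmetric = np.allclose(odd[::-1], np.conj(odd))
```

Because of this symmetry, only the first N/4 odd bins would need to be computed and stored, which is the redundancy the abstract refers to.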


Journal ArticleDOI
TL;DR: This paper presents an optimized column fast Fourier transform (FFT) architecture, which utilizes bit-serial arithmetic and dynamic reconfiguration to achieve a complete overlap between computation and communication.
Abstract: This paper presents an optimized column fast Fourier transform (FFT) architecture, which utilizes bit-serial arithmetic and dynamic reconfiguration to achieve a complete overlap between computation and communication. As a result, for a clock rate of 40 MHz, the system can compute a 24-bit precision 1K-point complex FFT in 9.2 µs, far surpassing the performance of any existing FFT system.

28 citations


Proceedings ArticleDOI
15 Mar 1999
TL;DR: A novel VLSI architecture for computing the N-point discrete Fourier transform (DFT) based on a radix-2 fast algorithm, where N is a power of two, which is attractive for use in long-length DFT applications, such as ADSL and OFDM systems.
Abstract: This paper presents a novel VLSI architecture for computing the N-point discrete Fourier transform (DFT) based on a radix-2 fast algorithm, where N is a power of two. The architecture consists of one complex multiplier, two complex adders, and some special memory units. It can compute one transform sample every log2(N) + 1 clock cycles on average. For the case of N = 512, the chip area required is about 5742 × 5222 µm² and the throughput is up to 4 M transform samples per second under 0.6 µm CMOS technology. Such area-time performance makes the proposed design rather attractive for use in long-length DFT applications, such as ADSL and OFDM systems.

27 citations


Proceedings ArticleDOI
01 Aug 1999
TL;DR: The new algorithm presented in the paper has been implemented in the POLYNOMIAL Toolbox for MATLAB™, Version 2.0, and its performance far exceeds that of the older procedures used in Version 1.6.
Abstract: The fast Fourier transform (FFT) algorithm, an efficient technique for computing the discrete Fourier transform, is used here to compute the determinant of a polynomial matrix. The new algorithm presented in the paper has been implemented in the POLYNOMIAL Toolbox for MATLAB™, Version 2.0, and its performance far exceeds that of the older procedures used in Version 1.6. The new method is both much less costly and much more reliable, and it also naturally handles polynomial matrices with complex coefficients. Experimental testing results are also reported in the paper.

27 citations
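
One common way to use the FFT for a polynomial-matrix determinant is evaluation and interpolation: evaluate the matrix at roots of unity, take ordinary determinants, and transform back. The sketch below assumes that approach and is not the POLYNOMIAL Toolbox implementation; the function name `poly_matrix_det` and the coefficient layout are illustrative choices.

```python
import numpy as np

def poly_matrix_det(coeffs):
    """coeffs[i] is the n x n coefficient matrix of s**i.
    Returns the determinant's coefficients, lowest power first."""
    coeffs = np.asarray(coeffs, dtype=complex)
    degree = coeffs.shape[0] - 1
    n = coeffs.shape[1]
    m = n * degree + 1                          # det has degree at most n * degree
    values = np.fft.fft(coeffs, n=m, axis=0)    # matrix evaluated at m roots of unity
    dets = np.linalg.det(values)                # one constant determinant per point
    return np.fft.ifft(dets)                    # interpolate back to coefficients

# P(s) = [[s, 1], [2, s + 3]], so det P(s) = s**2 + 3*s - 2.
p = [[[0, 1], [2, 3]],      # coefficient of s**0
     [[1, 0], [0, 1]]]      # coefficient of s**1
det_coeffs = poly_matrix_det(p)
```

Complex coefficients pass through unchanged, which is consistent with the abstract's remark that the method handles them naturally.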


Proceedings ArticleDOI
01 Dec 1999
TL;DR: This paper presents an efficient implementation of the pipeline FFT processor based on the radix-4 decimation-in-time algorithm with the use of digit-serial arithmetic units that can not only achieve nearly 100% hardware utilization, but also require much less memory compared with the previous digit-serial FFT processors.
Abstract: This paper presents an efficient implementation of the pipeline FFT processor based on the radix-4 decimation-in-time algorithm with the use of digit-serial arithmetic units. By splitting the sequential input sample into parallel digit-serial data streams, the proposed architecture can not only achieve nearly 100% hardware utilization, but also require much less memory compared with the previous digit-serial FFT processors. Furthermore, in FFT processors, several modules of ROM are required for the storage of twiddle factors. By exploiting the redundancy of the factors, the overall ROM size can be effectively reduced by a factor of 2.

17 citations


Journal ArticleDOI
TL;DR: This letter shows that a fast Fourier transform (FFT) based method is well suited to rapidly generate Gaussian noise samples and quantifies this error for arbitrary sampling rates and correlation functions.
Abstract: Rapid generation of time series samples of stationary, zero-mean, correlated Gaussian noise will accelerate digital communication system simulations. In this letter, we show that a fast Fourier transform (FFT) based method is well suited to rapidly generate such noise samples. The FFT method requires O(N) memory elements and O(N log2 N) floating-point operations to generate each sequence of N variates. Sequences that are bandlimited incur an aliasing error in the correlation function of the sequence, but for practical simulations we show this error is negligible. We quantify this error for arbitrary sampling rates and correlation functions.

17 citations
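
A generic form of FFT-based noise coloring can be sketched as follows. This is an assumption-laden illustration rather than the letter's exact procedure: white Gaussian samples are shaped in the frequency domain by the square root of the power spectrum of a target circular autocorrelation. The name `color_noise` and the example correlation `0.9 ** lag` are hypothetical.

```python
import numpy as np

def color_noise(white, autocorr):
    """Filter white samples so their circular autocorrelation matches `autocorr`
    (assumed real and symmetric in the lag)."""
    psd = np.fft.fft(autocorr).real           # power spectrum of the target correlation
    gain = np.sqrt(np.maximum(psd, 0.0))      # clip tiny negative values from rounding
    return np.fft.ifft(gain * np.fft.fft(white)).real

n = 1024
lag = np.minimum(np.arange(n), n - np.arange(n))
target = 0.9 ** lag                           # example correlation function
rng = np.random.default_rng(0)
noise = color_noise(rng.standard_normal(n), target)
```

Each sequence costs two length-N FFTs plus O(N) scaling, in line with the O(N) memory and O(N log2 N) operation counts stated in the abstract.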


Journal ArticleDOI
TL;DR: The algorithms developed in this paper update the DFT to reflect the modified window contents, using less computation than directly evaluating the modified transform via the FFT algorithm, which reduces the computational order by a factor of log2 N for both the 1-D and 2-D cases.

15 citations
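
One simple instance of updating a DFT instead of recomputing it, assumed here purely for illustration, is a window that slides forward by one sample. The update below costs O(N) per step rather than O(N log2 N); the function name `slide_dft` is hypothetical, and the paper's own algorithms may cover more general window modifications.

```python
import numpy as np

def slide_dft(spectrum, oldest, newest):
    """Update an N-point DFT after dropping `oldest` and appending `newest`."""
    n = spectrum.size
    rotate = np.exp(2j * np.pi * np.arange(n) / n)   # undo the one-sample shift
    return (spectrum - oldest + newest) * rotate

x = np.random.default_rng(1).standard_normal(9)
window, shifted = x[:8], x[1:9]
updated = slide_dft(np.fft.fft(window), x[0], x[8])
```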


Journal ArticleDOI
TL;DR: The number of multiplications necessary to compute the proposed algorithm is significantly reduced while the number of additions remains almost identical to that of conventional multidimensional FFTs (MFFT).
Abstract: In this paper, we propose a new approach for computing multidimensional Cooley-Tukey FFTs that is suitable for implementation on a variety of multiprocessor architectures. Our algorithm is derived from a Cooley decimation-in-time algorithm by using an appropriate indexing process and the tensor product properties. It is proved that the number of multiplications necessary to compute our proposed algorithm is significantly reduced while the number of additions remains almost identical to that of conventional multidimensional FFTs (MFFT). Comparison results show the strong performance of the proposed MFFT algorithm against the row-column FFT when the data dimension M is large. Furthermore, this algorithm, presented in a simple matrix form, will be much easier to implement in practice. Connections of the proposed approach with well-known DFT algorithms are included in this paper and many variations of the proposed algorithm are also pointed out.

14 citations


Journal ArticleDOI
A. Fertner
TL;DR: The index-reversed complex conjugate sequence and the mirror symmetric complex conjugate sequence were defined and a significant reduction in the number of complex computations is achieved if a sequence in either domain exhibits such symmetry.
Abstract: The discrete Fourier transform (DFT) and the inverse discrete Fourier transform (IDFT) are used in a wide variety of signal processing applications. Even with the increased speed of modern processors, there is an ongoing need to further develop more efficient methods for computing DFT and IDFT, with a particular effort to reduce the number of complex multiplications. The properties of certain complex sequences are extraordinarily useful in the sense that they lead to data manipulation schemes that result in the sequences to which traditional but much shorter fast Fourier transform (FFT) algorithms may be applied. This is achieved by exploiting a certain regularity in the complex data. The index-reversed complex conjugate sequence and the mirror symmetric complex conjugate sequence were defined. A significant reduction in the number of complex computations is achieved if a sequence in either domain exhibits such symmetry.

Patent
02 Aug 1999
TL;DR: A circuit for performing the FFT with a minimum number of clock cycles and minimum complexity is presented; the complexity of the circuit is reduced by elimination of the butterfly computation structure and adaptation of a transposeless 2-D transform architecture.
Abstract: A circuit for performing Fast Fourier Transform (FFT) with minimum number of clock cycles and minimum complexity. One-dimensional FFT of size N = N_0 × N_1 × ... × N_{M−1}, with N_m (m = 0, 1, ..., M−1) positive numbers, is computed recursively, through a sequence of two-dimensional row-column transform computations of sizes N_0 × N_1, (N_0 × N_1) × N_2, (N_0 × N_1 × N_2) × N_3, ..., (N_0 × N_1 × ... × N_{M−2}) × N_{M−1} with twiddle factors. The complexity of the circuit is reduced by elimination of butterfly computation structure and adaptation of transposeless 2-D transform architecture.
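
The index mapping behind computing a one-dimensional FFT as a two-dimensional row-column transform with twiddle factors can be sketched in software. The code below is a minimal two-factor illustration using the standard Cooley-Tukey mapping, not the patented circuit: the name `fft_two_factor` is hypothetical, and the final transpose is used only for clarity, whereas the patent describes a transposeless architecture.

```python
import numpy as np

def fft_two_factor(x, n0, n1):
    """Length n0*n1 DFT computed as a row-column 2-D transform with twiddles."""
    x = np.asarray(x, dtype=complex)
    n = n0 * n1
    grid = x.reshape(n0, n1)                  # grid[a, b] = x[n1*a + b]
    cols = np.fft.fft(grid, axis=0)           # n0-point DFTs down each column
    k0 = np.arange(n0).reshape(n0, 1)
    b = np.arange(n1).reshape(1, n1)
    cols = cols * np.exp(-2j * np.pi * k0 * b / n)   # twiddle factors W_N^(k0*b)
    rows = np.fft.fft(cols, axis=1)           # n1-point DFTs along each row
    return rows.T.reshape(n)                  # output index k = k0 + n0*k1

x = np.random.default_rng(2).standard_normal(12)
result = fft_two_factor(x, 3, 4)
```

Applying the same step again with the product n0*n1 as one factor and a further N_2 as the other would give the recursive sequence of sizes listed in the abstract.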

Patent
21 Jun 1999
TL;DR: A parallel FFT generating system for generating a Fast Fourier Transform (FFT) of an input vector is described, which includes a plurality of processes configured to receive the input vector and process it in parallel in relation to a set of twiddle factors to generate an output vector.
Abstract: A parallel FFT generating system is disclosed for generating a Fast Fourier Transform (FFT) of an input vector. The parallel FFT generating system includes a plurality of processes configured to receive the input vector and process the input vector in parallel in relation to a set of twiddle factors to generate an output vector, the output vector comprising a Fourier transform representation of the input vector.

Journal ArticleDOI
TL;DR: This article presents a method to compute the discrete Fourier transform (DFT) of an N-point real vector and the inverse DFT (IDFT) of the DFT of another real N-point vector by carrying out a single complex N-point DFT.
Abstract: This article presents a method to compute the discrete Fourier transform (DFT) of an N-point real vector and the inverse DFT (IDFT) of the DFT of another real N-point vector by carrying out a single complex N-point DFT. Possible applications are for transceivers where the transmitter has to carry out the DFT of an N-point real sequence and the receiver has to carry out the inverse DFT of the DFT of another N-point real sequence.

Proceedings ArticleDOI
23 Aug 1999
TL;DR: A DRAM-like pipelined commutator architecture is used in order to reduce the required chip area for the sequential processing of 8 K complex data, and the proposed structure achieves a 55% chip size reduction compared with the conventional approach.
Abstract: In this paper, we propose an implementation method for a single-chip 8192 complex point FFT in terms of sequential data processing. In order to reduce the required chip area for the sequential processing of 8 K complex data, a DRAM-like pipelined commutator architecture is used. The 16-point FFT is a basic building block of the entire FFT chip, and the 8192-point FFT consists of the cascaded blocks with six stages of radix-4 and one stage of radix-2. Since each stage requires rounding of the resulting bits while maintaining the proper S/N ratio, the convergent block floating point (CBFP) algorithm is used for the effective internal bit rounding. As a result, the proposed structure achieves a 55% chip size reduction compared with the conventional approach.

Proceedings ArticleDOI
26 Oct 1999
TL;DR: A DRAM-like pipelined commutator architecture is used to reduce the required chip area for the sequential processing of 2 K complex data, and the convergent block floating point (CBFP) algorithm is used for the effective internal bit rounding.
Abstract: In this paper, we propose an implementation method for a single-chip 2048 complex point FFT in terms of sequential data processing. In order to reduce the required chip area for the sequential processing of 2 K complex data, a DRAM-like pipelined commutator architecture is used. The 16-point FFT is a basic building block of the entire FFT chip, and the 2048-point FFT consists of the cascaded blocks with five stages of radix-4 and one stage of radix-2. Since each stage requires rounding of the resulting bits while maintaining the proper S/N ratio, the convergent block floating point (CBFP) algorithm is used for the effective internal bit rounding. As a result, the proposed structure achieves a 55% chip size reduction compared with the conventional approach.

Journal ArticleDOI
TL;DR: The algorithm in this paper is derived from a Cooley decimation-in-time algorithm by using an appropriate indexing process, and it is proved that the number of multiplications necessary to compute the proposed algorithm is significantly reduced while the number of additions remains almost identical to that of conventional 2D FFTs.
Abstract: In this paper, we propose a new approach for computing 2D FFTs that is suitable for implementation on a systolic array architecture. Our algorithm is derived from a Cooley decimation-in-time algorithm by using an appropriate indexing process. It is proved that the number of multiplications necessary to compute our proposed algorithm is significantly reduced while the number of additions remains almost identical to that of conventional 2D FFTs. Comparison results show the good performance of the proposed 2D FFT algorithm against the row-column FFT. Copyright © 1999 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: A multidimensional fast Fourier transform (FFT) algorithm is presented for signals with arbitrary symmetries and periodic on arbitrary lattices that makes the frequency domain computation of Volterra filtering more convenient than the time domain approach.
Abstract: A multidimensional fast Fourier transform (FFT) algorithm is presented for signals with arbitrary symmetries and periodic on arbitrary lattices. Applications that can benefit from such an algorithm include Volterra filtering and analysis of x-ray diffraction data. The presented algorithm exploits signal redundancy to achieve a computational complexity of N log N, where N is the number of independent samples. To the authors’ knowledge, this is the only FFT that makes the frequency domain computation of Volterra filtering more convenient than the time domain approach.

Journal ArticleDOI
TL;DR: A new implementation of the two-dimensional FFT (2-D FFT) has reduced arithmetic complexity and computational savings are achieved because the 2-D case enables, after some modifications of the basic separable algorithm, scaling and inverse scaling of butterfly operators.
Abstract: A new implementation of the two-dimensional FFT (2-D FFT) is proposed. Compared with the usual separable solution, the new realization of the 2-D FFT has reduced arithmetic complexity. Computational savings are achieved because the 2-D case enables, after some modifications of the basic separable algorithm, scaling and inverse scaling of butterfly operators. The new improvement is also applied to other 2-D transforms: DCT-IV, DCT, and lapped transforms.

Patent
Ronald D. Wagstaff
09 Jul 1999
TL;DR: A system tests analog and mixed-signal IC devices using an FFT algorithm, supported by a non-iterative Fast Fourier Transform (FFT) coherency analysis algorithm.
Abstract: A system tests analog and mixed signal IC devices using an FFT algorithm, supported by a non-iterative Fast Fourier Transform (FFT) coherency analysis algorithm to establish FFT sample-set coherency. A test signal is input into the IC device, and an output signal from the IC device is analyzed using the FFT algorithm. The non-iterative FFT coherency analysis algorithm uses only one “given” value and two approximated values related to a test signal. Based on these given and approximated values, the correct set of all four values required for proper testing of the IC device is determined in a single pass, without the need for multiple iterations.

Patent
30 Oct 1999
TL;DR: A technique for computationally efficient evaluation of the Modified Discrete Cosine Transform (MDCT) using the Fast Fourier Transform (FFT) method is presented.
Abstract: A technique for computationally efficient evaluation of the Modified Discrete Cosine Transform (MDCT) using the Fast Fourier Transform (FFT) method is presented. The full MDCT computation process consists of a pre-processing block, an N-point FFT block and finally a post-processing block. It is well known that an N/2-point FFT can be used for computing the N-point FFT of a sequence of N real data. The input to the FFT block described above consists of a sequence of N complex numbers. This patent discusses a method by which the regularity in these complex data can be exploited to compute their N-point FFT using an N/2-point FFT only, thereby reducing the computational burden by almost a factor of two.
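
The "well known" fact cited in the abstract, that an N/2-point complex FFT suffices for N real samples, can be sketched as follows. This covers only that real-data case, not the patent's method for structured complex data; the function name `real_fft_via_half` is hypothetical and N is assumed even.

```python
import numpy as np

def real_fft_via_half(x):
    """Bins 0..N/2 of the DFT of N real samples (N even) from one N/2-point complex FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    half = n // 2
    z = np.fft.fft(x[0::2] + 1j * x[1::2])    # even samples in Re, odd samples in Im
    z_rev = np.conj(np.roll(z[::-1], 1))      # conj(Z[(N/2 - k) mod N/2])
    even = 0.5 * (z + z_rev)                  # DFT of the even-indexed samples
    odd = -0.5j * (z - z_rev)                 # DFT of the odd-indexed samples
    k = np.arange(half)
    low = even + np.exp(-2j * np.pi * k / n) * odd   # bins 0 .. N/2-1
    nyquist = even[0] - odd[0]                # bin N/2
    return np.append(low, nyquist)

x = np.random.default_rng(3).standard_normal(16)
bins = real_fft_via_half(x)
```

The remaining bins follow from conjugate symmetry, so the output matches the layout of a one-sided real FFT.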

Proceedings ArticleDOI
27 Aug 1999
TL;DR: In this paper, the PSR far from its maximum was calculated in the case of an F/3 system working at 4000 nm, in focus and in the presence of small defocusing.
Abstract: We are interested in calculating precisely the PSR far from its maximum, where the maxima of the irradiance are falling to 1E-06, or less. The first and most used method consists of calculating the Fourier transform of the wavefront using the Fast Fourier Transform algorithm (FFT). Another method uses the beam superposition technique (BST) to decompose the wavefront into Gaussian beams, propagate those beams, and recompose them to obtain the result. The third method is to apply the exact equations derived at the end of last century and described in reference books like Born and Wolf or Maréchal and Françon. We shall compare the results obtained with the three methods, FFT, BST, and exact calculation, in the case of an F/3 system working at 4000 nm, in focus and in the presence of small defocusing.

Journal ArticleDOI
TL;DR: In this article, a fast root finding algorithm based on an FFT implementation is proposed, thus avoiding the need for a computationally heavy polynomial rooting technique that estimates the eigenvalues of a companion matrix.
Abstract: A fast root finding algorithm based on an FFT implementation is proposed, thus avoiding the need for a computationally heavy polynomial rooting technique that estimates the eigenvalues of a companion matrix. The minimum-phase polynomial factorisation proposed by Oppenheim and Schafer (1989) is first extended to an arbitrary radius factorisation, then used to extract the roots in an iterative manner.

Journal ArticleDOI
TL;DR: This paper presents a new method of implementing the fast Fourier transform (FFT) algorithm that efficiently utilizes computer time to perform the FFT computation while data acquisition proceeds so that local butterfly modules are built using the data points that are already available.
Abstract: On-line running spectral analysis is of considerable interest in many electrophysiological signals, such as the EEG (electroencephalograph). This paper presents a new method of implementing the fast Fourier transform (FFT) algorithm. Our "real-time FFT algorithm" efficiently utilizes computer time to perform the FFT computation while data acquisition proceeds so that local butterfly modules are built using the data points that are already available. The real-time FFT algorithm is developed using the decimation-in-time split-radix FFT (DIT sr-FFT) butterfly structure. In order to demonstrate the synchronization ability of the proposed algorithm, the authors develop a method of evaluating the number of arithmetic operations that it requires. Both the derivation and the experimental result show that the real-time FFT algorithm is superior to the conventional whole-block FFT algorithm in synchronizing with the data acquisition process. Given that the FFT size N = 2^r, real-time implementation of the FFT algorithm requires only 2/r of the computational time required by the whole-block FFT algorithm.
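
For reference, the decimation-in-time split-radix butterfly structure named in the abstract can be sketched as a plain recursive routine. This is a whole-block textbook formulation under the assumption that the length is a power of two; it does not reproduce the paper's real-time scheduling of butterflies during acquisition, and the name `split_radix_fft` is illustrative.

```python
import numpy as np

def split_radix_fft(x):
    """Recursive decimation-in-time split-radix FFT; len(x) must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    if n == 1:
        return x.copy()
    if n == 2:
        return np.array([x[0] + x[1], x[0] - x[1]])
    u = split_radix_fft(x[0::2])              # N/2-point DFT of the even samples
    z1 = split_radix_fft(x[1::4])             # N/4-point DFT of samples 4m+1
    z3 = split_radix_fft(x[3::4])             # N/4-point DFT of samples 4m+3
    q = n // 4
    k = np.arange(q)
    w1 = np.exp(-2j * np.pi * k / n) * z1     # twiddle W_N^k
    w3 = np.exp(-2j * np.pi * 3 * k / n) * z3 # twiddle W_N^(3k)
    total, diff = w1 + w3, w1 - w3
    out = np.empty(n, dtype=complex)
    out[:q] = u[:q] + total
    out[q:2 * q] = u[q:] - 1j * diff
    out[2 * q:3 * q] = u[:q] - total
    out[3 * q:] = u[q:] + 1j * diff
    return out

rng = np.random.default_rng(4)
signal = rng.standard_normal(32) + 1j * rng.standard_normal(32)
spectrum = split_radix_fft(signal)
```

The mix of one half-length and two quarter-length sub-transforms is what distinguishes split-radix from a pure radix-2 or radix-4 decomposition.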

Journal ArticleDOI
TL;DR: The fast Fourier transform (FFT) algorithm is explained in a simple and novel way and it is shown how the two canonic forms of the FFT can be respectively arranged as “combine-and-conquer” or “divide-and-conquer” algorithms.

Book ChapterDOI
01 Jan 1999
TL;DR: An efficient method for computing the discrete Fourier transform (DFT) is available which has revolutionized many fields of applied science and engineering where, hitherto, problems of computing had posed a serious obstacle to progress.
Abstract: An efficient method for computing the discrete Fourier transform (DFT) is available which has revolutionized many fields of applied science and engineering where, hitherto, problems of computing had posed a serious obstacle to progress. The algorithm is known as the fast Fourier transform (FFT). The fast Fourier transform achieves its greatest efficiency when the sample size is a highly composite number with many factors. The easiest way of expounding the algebra of the multifactor Fourier transform is to concentrate on the details of the three-factor case; for the algebra remains simple, while the generalization to the multifactor case follows immediately.

01 Jan 1999
TL;DR: In this article, a new implementation of the two-dimensional FFT (2-D FFT) is proposed, which enables, after some modifications of the basic separable algorithm, scaling and inverse scaling of butterfly operators.
Abstract: In this correspondence, a new implementation of the two-dimensional FFT (2-D FFT) is proposed. Compared with the usual separable solution, the new realization of the 2-D FFT has reduced arithmetic complexity. Computational savings are achieved because the 2-D case enables, after some modifications of the basic separable algorithm, scaling and inverse scaling of butterfly operators. The new improvement is also applied to other 2-D transforms: DCT-IV, DCT, and lapped transforms.

Proceedings ArticleDOI
22 Sep 1999
TL;DR: In this paper, the authors point out a method for reducing the computing effort required by fast Fourier transformation (FFT) techniques used for collecting spectral data, where the number of equidistant samples from which the discrete Fourier transform (DFT) is formed is reduced.
Abstract: In applications of time-domain dielectric spectroscopy, in which frequency-domain information is obtained by means of Fourier transformation of step-response data, results are often required at a number M of evenly spaced frequencies, where M is considered smaller than N, the number of equidistant samples from which the discrete Fourier transform (DFT) is formed. The object of this paper is to point out a method for reducing the computing effort required by fast Fourier transformation (FFT) techniques used for collecting spectral data.