scispace - formally typeset

Showing papers on "Twiddle factor" published in 2002


Journal Article
TL;DR: In this paper, a new algorithm for the numerical calculation of the KKT has been developed. It contains a fast Fourier transform (FFT) subroutine, is based on the convolution theorem, and uses only two FFT computations.

36 citations


Journal Article
TL;DR: A new multiplierless approximation of the discrete Fourier transform (DFT), called the multiplierless fast Fourier transform-like (ML-FFT) transformation, uses a novel factorization to parameterize the twiddle factors in the conventional radix-$2^n$ or split-radix FFT algorithms as certain rotation-like matrices.
Abstract: This letter proposes a new multiplierless approximation of the discrete Fourier transform (DFT) called the multiplierless fast Fourier transform-like (ML-FFT) transformation. It makes use of a novel factorization to parameterize the twiddle factors in the conventional radix-$2^n$ or split-radix FFT algorithms as certain rotation-like matrices and approximates the associated parameters using the sum-of-powers-of-two (SOPOT) or canonical signed digits (CSD) representations. The ML-FFT converges to the DFT as the number of SOPOT terms increases and has an arithmetic complexity of $O(N \log_2 N)$ additions, where $N = 2^m$ is the transform length. Design results show that the ML-FFT offers a flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT.
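The idea of trading twiddle-factor multiplications for shift-and-add operations can be sketched in a few lines. This is an illustration under stated assumptions, not the letter's own factorization: it assumes the textbook three-step lifting factorization of a plane rotation and a simple greedy SOPOT quantizer, and the names `sopot_terms` and `lifting_rotate` are hypothetical. It is only meaningful for angles away from ±π, where tan(θ/2) stays bounded.

```python
import math

def sopot_terms(value, max_terms):
    """Greedily approximate `value` by at most `max_terms` signed powers of two."""
    terms = []
    residual = value
    for _ in range(max_terms):
        if residual == 0.0:
            break
        exponent = round(math.log2(abs(residual)))
        term = math.copysign(2.0 ** exponent, residual)
        terms.append(term)
        residual -= term
    return terms

def lifting_rotate(x, y, theta, max_terms=None):
    """Rotate (x, y) by theta using three lifting (shear) steps.

    With max_terms set, the two lifting coefficients are replaced by SOPOT
    approximations, so in hardware each product below would reduce to
    shifts and additions. Floating-point products are used here only to
    keep the sketch short.
    """
    t = math.tan(theta / 2.0)
    s = math.sin(theta)
    if max_terms is not None:
        t = sum(sopot_terms(t, max_terms))
        s = sum(sopot_terms(s, max_terms))
    x = x - t * y   # first lifting step
    y = y + s * x   # second lifting step
    x = x - t * y   # third lifting step
    return x, y
```

With `max_terms=None` the three steps reproduce an exact rotation, i.e. multiplication of `x + iy` by `exp(i*theta)`; increasing `max_terms` moves the quantized version toward that result, which mirrors the convergence property stated in the abstract.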

25 citations


Patent
05 Jun 2002
TL;DR: In this article, the authors present a method of computing a fast Fourier transform (FFT) and an associated circuit that controls the addressing of a data memory of the FFT processing circuit.
Abstract: The present invention is generally directed to a novel method of computing a fast Fourier transform (FFT), and an associated circuit that controls the addressing of a data memory of the FFT processing circuit. The novel method operates by computing all complex butterfly operations in a given stage of computations before computing any of the complex butterfly operations in a subsequent stage. Further, within any given computation stage, the method computes all other complex butterfly operations whose twiddle factor equals the first twiddle value of that stage before computing any other complex butterfly operations in that stage. Thereafter, subsequent computations are performed in the same way. More particularly, after computing a first set of complex butterfly operations (each having the same twiddle value) in a given computation stage, a first complex butterfly operation (having a different twiddle value) of a second set of complex butterfly operations is computed in that stage. Thereafter, all remaining complex butterfly operations having that same twiddle value in that stage are computed. This methodology is repeated until all butterfly operations are calculated in each stage. An addressing circuit is also provided for addressing a data memory in a system for computing an FFT, the system having a data memory for storing data values and a coefficient memory for storing coefficient values.
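The ordering described in this abstract (finish one stage before the next, and within a stage finish all butterflies that share a twiddle value before moving to the next value) can be sketched in software. The following is a hedged illustration only: it assumes a radix-2 decimation-in-time FFT with bit-reversed input, it does not model the patent's addressing circuit, and the function names and the `twiddle_loads` counter are hypothetical.

```python
import cmath

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def fft_twiddle_grouped(x):
    """Radix-2 DIT FFT that visits butterflies stage by stage and,
    inside a stage, twiddle value by twiddle value."""
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    bits = n.bit_length() - 1
    a = [0j] * n
    for i, value in enumerate(x):
        a[bit_reverse(i, bits)] = complex(value)
    twiddle_loads = 0
    half = 1
    while half < n:                       # one pass per stage
        span = 2 * half
        for j in range(half):             # one twiddle value at a time
            w = cmath.exp(-2j * cmath.pi * j / span)
            twiddle_loads += 1            # fetched once, then reused
            for start in range(0, n, span):
                top = a[start + j]
                bottom = a[start + j + half] * w
                a[start + j] = top + bottom
                a[start + j + half] = top - bottom
        half = span
    return a, twiddle_loads
```

In this loop order each twiddle value is fetched once per stage, so an N-point transform performs N − 1 coefficient fetches in total rather than one per butterfly.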

24 citations


Proceedings Article
15 Apr 2002
TL;DR: A novel twiddle-factor-based FFT algorithm that reduces the frequency of memory access as well as the number of multiplication operations is presented. For a 32-point FFT, the new algorithm yields as much as a 20% reduction in clock cycles and an average 30% reduction in memory access compared with the conventional DIF FFT.
Abstract: In microprocessor-based systems, memory access is expensive due to longer latency and higher power consumption. In this paper, we present a novel FFT algorithm to reduce the frequency of memory access as well as multiplication operations. For an N-point FFT, we design the FFT with two distinct sections: (1) The first section of the FFT structure computes the butterflies involving twiddle factors $W_N^j$ ($j \neq 0$) through a computation/partitioning scheme similar to Huffman coding. In this section, all the butterflies sharing the same twiddle factor are clustered and computed together. In this way, redundant memory access to load twiddle factors is avoided. (2) In the second section, the remaining (N − 1) butterflies involving the twiddle factor $W_N^0$ are computed with a register-based breadth-first tree traversal algorithm. This novel twiddle-factor-based FFT is tested on the TI TMS320C62x digital signal processor. The results show that, for a 32-point FFT, the new algorithm leads to as much as a 20% reduction in clock cycles and an average 30% reduction in memory access compared with the conventional DIF FFT.
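The count of (N − 1) butterflies with the trivial twiddle factor $W_N^0$ can be checked with a short tally. This sketch assumes a standard radix-2 butterfly structure with $\log_2 N$ stages; it only counts how many butterflies use each twiddle exponent and does not implement the paper's partitioning scheme. The function name `twiddle_histogram` is hypothetical.

```python
from collections import Counter

def twiddle_histogram(n):
    """Count radix-2 butterflies per twiddle exponent k, where the
    twiddle factor is W_N^k = exp(-2*pi*i*k/N)."""
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    counts = Counter()
    span = 2
    while span <= n:                  # one iteration per stage
        groups = n // span            # butterfly groups in this stage
        for j in range(span // 2):    # position inside a group
            # W_span^j equals W_N^(j * N / span)
            counts[j * (n // span)] += groups
        span *= 2
    return counts
```

For `n = 32` the tally gives 31 butterflies at exponent 0 out of 80 in total, consistent with the (N − 1) figure in the abstract. The remaining 49 butterflies share only 15 distinct twiddle values, which is the redundancy that clustering by twiddle factor exploits.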

23 citations


Journal Article
TL;DR: Both worst case and average case analysis of roundoff errors occurring in eight precomputation methods of twiddle factors are presented.
Abstract: The accurate precomputation of the twiddle factors is necessary in order to perform discrete trigonometric transforms. This paper presents both worst case and average case analysis of roundoff errors occurring in eight precomputation methods of twiddle factors. We are interested in methods that have small roundoff errors and low complexity and that use little computer memory. Numerical tests confirm the theoretical results.
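To make the accuracy versus cost tradeoff concrete, two textbook precomputation strategies can be compared. This is a sketch under assumptions: whether these two strategies coincide with any of the eight methods analysed in the paper is not stated in the abstract, and the function names are hypothetical. Direct evaluation costs one exponential call per entry, while repeated multiplication costs one complex product per entry but accumulates roundoff.

```python
import cmath

def twiddles_direct(n):
    """Evaluate each twiddle factor exp(-2*pi*i*k/n) with its own library call."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def twiddles_repeated_multiplication(n):
    """Build the table by multiplying repeatedly by the first twiddle factor.
    Cheap, but rounding errors accumulate along the table."""
    w = cmath.exp(-2j * cmath.pi / n)
    table = [1 + 0j]
    for _ in range(1, n // 2):
        table.append(table[-1] * w)
    return table

def max_deviation(a, b):
    """Largest absolute difference between two equally long tables."""
    return max(abs(u - v) for u, v in zip(a, b))
```

Calling `max_deviation(twiddles_direct(n), twiddles_repeated_multiplication(n))` for increasing `n` gives a rough view of how the error of the cheaper method grows with table length, using the direct table as an approximate reference.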

14 citations


Proceedings Article
07 Aug 2002
TL;DR: Design results show that the ML-FFT offers a flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT, and the polynomial transformation is used to obtain a similar multiplierless approximation of the 2D FFT.
Abstract: This paper proposes a new multiplierless approximation of the 1D discrete Fourier transform (DFT) called the multiplierless fast Fourier transform-like (ML-FFT) transformation. It parameterizes the twiddle factors in conventional radix-$2^n$ or split-radix FFT algorithms as certain rotation-like matrices and approximates the associated parameters using the sum-of-powers-of-two (SOPOT) or canonical signed digits (CSD) representations. The ML-FFT converges to the DFT as the number of SOPOT terms increases and has an arithmetic complexity of $O(N \log_2 N)$ additions, where $N = 2^m$ is the transform length. Design results show that the ML-FFT offers a flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT. Using the polynomial transformation, a similar multiplierless approximation of the 2D FFT is also obtained.

10 citations


Book Chapter
27 Aug 2002
TL;DR: A blocking algorithm for a parallel one-dimensional fast Fourier transform (FFT) on clusters of PCs based on the six-step FFT algorithm, which achieves performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
Abstract: In this paper, we propose a blocking algorithm for a parallel one-dimensional fast Fourier transform (FFT) on clusters of PCs. Our proposed parallel FFT algorithm is based on the six-step FFT algorithm. The six-step FFT algorithm can be altered into a block nine-step FFT algorithm to reduce the number of cache misses. The block nine-step FFT algorithm improves performance by utilizing the cache memory effectively. We use the block nine-step FFT algorithm to design the parallel one-dimensional FFT algorithm. In our proposed parallel FFT algorithm, since we use cyclic distribution, all-to-all communication is required only once. Moreover, both the input data and the output data can be given in natural order. We successfully achieved performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
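For orientation, the plain six-step FFT that the paper builds on can be sketched as follows. This is a serial, unblocked illustration only: it does not include the block nine-step variant, the cyclic distribution, or any communication, and a naive `dft` helper stands in for the short FFTs. The function names and the `rows`/`cols` parameters are hypothetical.

```python
import cmath

def dft(v):
    """Naive DFT, used here as a stand-in for the short row FFTs."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def six_step_fft(x, rows, cols):
    """Length rows*cols DFT of x via transpose, row FFTs, twiddle
    multiplication, transpose, row FFTs, transpose."""
    n = rows * cols
    assert len(x) == n
    # View x as a rows-by-cols matrix stored row by row.
    m = [list(x[r * cols:(r + 1) * cols]) for r in range(rows)]
    m = transpose(m)                                   # step 1: cols x rows
    m = [dft(row) for row in m]                        # step 2: cols FFTs of length rows
    m = [[m[j2][k1] * cmath.exp(-2j * cmath.pi * j2 * k1 / n)
          for k1 in range(rows)] for j2 in range(cols)]  # step 3: twiddle factors
    m = transpose(m)                                   # step 4: rows x cols
    m = [dft(row) for row in m]                        # step 5: rows FFTs of length cols
    m = transpose(m)                                   # step 6: cols x rows
    return [value for row in m for value in row]       # natural-order output
```

The twiddle multiplication in step 3 is the only place where length-N twiddle factors appear; the transposes exist so that every short transform runs over contiguous data, which is the property a blocked variant would refine for cache reuse.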

10 citations


Journal Article
TL;DR: An extendible look-up table of the twiddle factors for implementation of fast Fourier transform (FFT) is introduced and the results indicate that the FFT scheme is effective.

7 citations


Proceedings Article
03 Sep 2002
TL;DR: The SIMD-FFT algorithm is extended to handle multi-dimensional input data; this new approach does not make use of matrix transposition, and the results are compared against FFTW for the 2D and 3D cases.
Abstract: A general radix-2 FFT algorithm was recently developed and implemented for modern Single Instruction Multiple Data (SIMD) architectures. This algorithm (SIMD-FFT) was found to be faster than any scalar FFT implementation, and also faster than other FFT implementations that use the SIMD architecture, for complex 1D and 2D input data [1]. In this paper, the SIMD-FFT algorithm is extended to handle multi-dimensional input data; this new approach does not make use of matrix transposition. The results are compared against FFTW for the 2D and 3D cases. Overall, the SIMD-FFT was found to be faster for complex 2D input data (ranging from 82% up to 343%) and for complex 3D input data (ranging from 59.5% up to 198%).

6 citations


Journal Article
TL;DR: This note describes a simple remedy whereby the fast Fourier transform (FFT) can be used to compute the transform on as coarse a grid as one desires without loss of precision, in contrast to most crystallographic FFT programs, which test the ranges of the Miller indices of the input data to ensure that the number of grid divisions in the x, y and z directions of the cell is large enough to perform the FFT.
Abstract: The fast Fourier transform (FFT) algorithm as normally formulated allows one to compute the Fourier transform of up to N complex structure factors, F(h), N/2 ≥ h > −N/2, if the transform ρ(r) is computed on an N-point grid. Most crystallographic FFT programs test the ranges of the Miller indices of the input data to ensure that the total number of grid divisions in the x, y and z directions of the cell is sufficiently large to perform the FFT. This note calls attention to a simple remedy whereby an FFT can be used to compute the transform on as coarse a grid as one desires without loss of precision.

5 citations


Proceedings Article
13 May 2002
TL;DR: The two proposed matrix transformation techniques lead to a novel twiddle-factor-based FFT algorithm that exhibits as much as a 20% reduction in clock cycles and an average 30% reduction in memory access compared with the conventional DIF FFT.
Abstract: Memory reference is one of the major sources of power consumption incurred in a microprocessor. In this paper, we propose two matrix transformation techniques to reduce the number of memory references in the FFT computation. With the first transformation, all the butterflies sharing the same twiddle factor are clustered and computed together to eliminate redundant memory access to load twiddle factors. With the second transformation, all remaining (N − 1) butterflies involving the twiddle factor $W_N^0$ are computed using a register-based breadth-first tree traversal algorithm so that load/store operations of intermediate data arrays are minimized. The two proposed transformations together lead to a novel twiddle-factor-based FFT algorithm. The test results on the TI TMS320C62x digital signal processor show that, for a 32-point FFT, the new algorithm exhibits as much as a 20% reduction in clock cycles and an average 30% reduction in memory access compared with the conventional DIF FFT.

Proceedings Article
01 Nov 2002
TL;DR: In this article, the orthogonal frequency Fourier transform (OFFT) is proposed. It exploits orthogonal frequency relationships to replace the multiplications of a Fourier transform with simpler sampling and adding operations, replacing twiddle factors with step functions.
Abstract: Improved Fourier Transforms for Multi-carrier Processing. Steve Shattil (Idris Communications, steve@idriscomm.com) and Carl R. Nassar (Colorado State University, carln@engr.colostate.edu).
We present a simplified Fourier-transform process, called the orthogonal frequency Fourier transform (OFFT). Conventional divide-and-conquer techniques, such as the fast Fourier transform (FFT), reduce the number of operations in a Fourier transform and simplify at least some of the complex-valued terms (i.e. twiddle factors). The FFT reduces the number of multipliers, which account for much of the chip area and power consumption in digital VLSI design. The OFFT and inverse OFFT exploit orthogonal frequency relationships to replace multiplications with simpler sampling and adding operations. Specifically, the OFFT replaces twiddle factors with step functions, which are superpositions of harmonic sinusoids. The resulting transform is adapted to add samples that are selected relative to at least one periodic step function, thus eliminating all complex multiplications. In phase and quadrature phase OFFT processing may be performed. OFFTs can be combined with pass-band sampling to simultaneously perform filtering, down conversion, and demodulation. Inverse OFFTs combined with pass-band filters can be used to provide up conversion of multi-carrier signals. Since OFFTs are substantially less complex than FFTs, OFFT processing is applicable to digital radio systems where there are considerable constraints on power consumption and chip size. The OFFT is particularly useful for processing multi-carrier transmission protocols in wireless communications, such as Carrier Interferometry, Orthogonal Frequency Division Multiplexing, and Multi-carrier Code Division Multiple Access, which are quickly gaining favor over single-carrier protocols. OFFT algorithms can process a greater number of carriers and provide lower complexity compared to FFTs.
I. Introduction
Fast Fourier Transform (FFT) processing applications include digital mobile cellular radio systems where there are considerable constraints on power consumption and chip size. The primary constraining factor is chip complexity, which is typically expressed in terms of the number of adders, the number of multipliers, data storage requirements, and control complexity, rather than speed of operation.
Divide-and-conquer techniques reduce the number of operations in a Fourier transform and simplify at least some of the transform’s twiddle factors (i.e., the complex-valued terms in the Fourier transform that represent intervening phase shifts, or rotational factors). In a divide-and-conquer technique, the computation of a DFT is decomposed into nested DFTs of shorter length. Divide-and-conquer techniques are well known in the derivation of fast algorithms in which an N-point DFT is decomposed into successively smaller DFTs that are computed separately and combined to give the final result.
A pipeline DFT processor is characterized by real-time continuous processing of an input data sequence. However, it is very difficult to initiate the FFT operation until all of the N sampled data are taken. This contributes to the complexity of the processor, and thus, power consumption. In [1], the author suggests that a new metric should be introduced for real-time processing, since it is sub-optimal given the added complexity. Although almost all the feasible approaches approach the lower bound of complexity, as described in [2]. One particular class of pipeline processors with the application of recursive Common Factor Algorithms [3] has the lowest complexity among the conventional approaches that meet the real-time processing requirements.

Book Chapter
16 May 2002

01 Jan 2002
TL;DR: Results show that the ML-FFT offers flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT.
Abstract: This paper proposes a new multiplier-less approximation of the 1-D Discrete Fourier Transform (DFT) called the multiplier-less Fast Fourier Transform-like (ML-FFT) transformation. It parameterizes the twiddle factors in conventional radix-$2^n$ or split-radix FFT algorithms as certain rotation-like matrices and approximates the associated parameters using the sum-of-powers-of-two (SOPOT) or canonical signed digits (CSD) representations. The ML-FFT converges to the DFT when the number of SOPOT terms used increases and has an arithmetic complexity of $O(N \log_2 N)$ additions, where $N = 2^m$ is the transform length. Design results show that the ML-FFT offers flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT. Using the polynomial transformation, similar multiplier-less approximation of 2-D FFT is also obtained.