
Showing papers on "Prime-factor FFT algorithm" published in 2002


Journal ArticleDOI
TL;DR: In this article, the DC-FFT algorithm was used to analyze the contact stresses in an elastic body under pressure and shear tractions for high efficiency and accuracy, and a set of general formulas of the frequency response function for the elastic field was derived and verified.
Abstract: The knowledge of contact stresses is critical to the design of a tribological element. It is necessary to keep improving contact models and develop efficient numerical methods for contact studies, particularly for the analysis involving coated bodies with rough surfaces. The fast Fourier Transform technique is likely to play an important role in contact analyses. It has been shown that the accuracy in an algorithm with the fast Fourier Transform is closely related to the convolution theorem employed. The algorithm of the discrete convolution and fast Fourier Transform, named the DC-FFT algorithm, includes two routes of problem solving: DC-FFT/Influence coefficients/Green's function, for the cases with known Green's functions, and DC-FFT/Influence coefficient/conversion, if frequency response functions are known. This paper explores the method for the accurate conversion for influence coefficients from frequency response functions, further improves the DC-FFT algorithm, and applies this algorithm to analyze the contact stresses in an elastic body under pressure and shear tractions for high efficiency and accuracy. A set of general formulas of the frequency response function for the elastic field is derived and verified. Application examples are presented and discussed.

265 citations
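
The zero-padding idea behind combining discrete convolution with the FFT can be sketched as follows. This is a minimal illustration, assuming a 1D problem and hypothetical names (`fft`, `linear_convolve_fft`, `pressure`, `influence`); it is not the paper's DC-FFT procedure, which also prescribes how the influence coefficients are obtained and arranged.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def linear_convolve_fft(pressure, influence):
    """Linear (non-circular) convolution computed with zero-padded FFTs.

    Padding both inputs to at least len(pressure) + len(influence) - 1
    keeps the circular wrap-around from contaminating the result.
    """
    out_len = len(pressure) + len(influence) - 1
    size = 1
    while size < out_len:
        size *= 2
    a = list(pressure) + [0.0] * (size - len(pressure))
    b = list(influence) + [0.0] * (size - len(influence))
    product = [u * v for u, v in zip(fft(a), fft(b))]
    result = fft(product, inverse=True)
    # The inverse above is unnormalized, so divide by the transform size.
    return [v.real / size for v in result[:out_len]]


def linear_convolve_direct(p, k):
    """Reference O(N*M) convolution used only to check the FFT route."""
    out = [0.0] * (len(p) + len(k) - 1)
    for i, pv in enumerate(p):
        for j, kv in enumerate(k):
            out[i + j] += pv * kv
    return out
```

Without the padding, the product of the two transforms would yield a circular convolution, and values from one end of the domain would leak into the other.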


Journal ArticleDOI
TL;DR: In this article, a concept of integer fast Fourier transform (IntFFT) for approximating the discrete Fourier Transform (DFT) is introduced, where the lifting scheme is used to approximate complex multiplications appearing in the FFT lattice structures.
Abstract: A concept of integer fast Fourier transform (IntFFT) for approximating the discrete Fourier transform is introduced. Unlike the fixed-point fast Fourier transform (FxpFFT), the new transform has the properties that it is an integer-to-integer mapping, is power adaptable and is reversible. The lifting scheme is used to approximate complex multiplications appearing in the FFT lattice structures where the dynamic range of the lifting coefficients can be controlled by proper choices of lifting factorizations. Split-radix FFT is used to illustrate the approach for the case of 2^N-point FFT, in which case, an upper bound of the minimal dynamic range of the internal nodes, which is required by the reversibility of the transform, is presented and confirmed by a simulation. The transform can be implemented by using only bit shifts and additions but no multiplication. A method for minimizing the number of additions required is presented. While preserving the reversibility, the IntFFT is shown experimentally to yield the same accuracy as the FxpFFT when their coefficients are quantized to a certain number of bits. Complexity of the IntFFT is shown to be much lower than that of the FxpFFT in terms of the numbers of additions and shifts. Finally, they are applied to noise reduction applications, where the IntFFT provides significant improvement over the FxpFFT at low power and maintains similar results at high power.

165 citations
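
The reversibility that lifting gives to a complex multiplication can be illustrated with a small sketch. It assumes the standard three-step lifting factorization of a plane rotation and uses floating-point lifting coefficients with rounding; the function names are hypothetical, and the paper's quantized coefficients, shift-and-add implementation, and split-radix structure are not reproduced here.

```python
import math


def lift_rotate(x, y, theta):
    """Integer-to-integer approximation of (x + iy) * exp(i*theta).

    Uses the factorization R(theta) = U(a) L(s) U(a) with
    a = (cos(theta) - 1) / sin(theta) and s = sin(theta).
    Requires sin(theta) != 0.
    """
    c, s = math.cos(theta), math.sin(theta)
    a = (c - 1.0) / s
    x += round(a * y)  # first lifting step
    y += round(s * x)  # second lifting step
    x += round(a * y)  # third lifting step
    return x, y


def lift_rotate_inverse(x, y, theta):
    """Exact inverse: undo the lifting steps in reverse order."""
    c, s = math.cos(theta), math.sin(theta)
    a = (c - 1.0) / s
    x -= round(a * y)
    y -= round(s * x)
    x -= round(a * y)
    return x, y
```

Each step changes one coordinate by a rounded function of the other, so subtracting the same rounded value restores the input exactly, regardless of how coarse the rounding is.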


Journal ArticleDOI
TL;DR: In this article, a new algorithm to eliminate the error caused by this decaying component in the Fourier algorithm has been proposed, and three simplified methods are proposed to alleviate the computation burden.
Abstract: The impact of exponentially decaying direct component on the Fourier algorithm is theoretically investigated first in this paper. A new algorithm to eliminate the error caused by this decaying component in the Fourier algorithm has been proposed. Furthermore, three simplified methods are proposed to alleviate the computation burden. The performance of the Fourier algorithm improved with these methods along with the least error squares algorithm is evaluated using a simple network and a real power system modeled by EMTP. The evaluation results are presented and discussed.

156 citations


Journal ArticleDOI
TL;DR: This paper proposes a fast approximate algorithm for the associated Legendre transform by means of polynomial interpolation accelerated by the Fast Multipole Method (FMM), and shows that the algorithm is stable and is faster than the direct computation for N ≥ 511.
Abstract: The spectral method with discrete spherical harmonics transform plays an important role in many applications. In spite of its advantages, the spherical harmonics transform has a drawback of high computational complexity, which is determined by that of the associated Legendre transform, and the direct computation requires time of O(N^3) for cut-off frequency N. In this paper, we propose a fast approximate algorithm for the associated Legendre transform. Our algorithm evaluates the transform by means of polynomial interpolation accelerated by the Fast Multipole Method (FMM). The divide-and-conquer approach with split Legendre functions gives computational complexity O(N^2 log N). Experimental results show that our algorithm is stable and is faster than the direct computation for N ≥ 511.

101 citations


Journal ArticleDOI
TL;DR: In this paper, a new algorithm containing a Fast Fourier Transform (FFT) subroutine has been developed for the numerical calculation of the KKT; it is based on the convolution theorem and uses only two FFT computations.

36 citations


Journal ArticleDOI
TL;DR: An FFT algorithm that can compute the match score of a sequence against a position-specific scoring matrix (PSSM) that finds the PSSM score simultaneously over all offsets of the PSSM with the sequence, although like all previous FFT algorithms, it still disallows gaps.
Abstract: Historically, in computational biology the fast Fourier transform (FFT) has been used almost exclusively to count the number of exact letter matches between two biosequences. This paper presents an FFT algorithm that can compute the match score of a sequence against a position-specific scoring matrix (PSSM). Our algorithm finds the PSSM score simultaneously over all offsets of the PSSM with the sequence, although like all previous FFT algorithms, it still disallows gaps. Although our algorithm is presented in the context of global matching, it can be adapted to local matching without gaps. As a benchmark, our PSSM-modified FFT algorithm computed pairwise match scores. In timing experiments, our most efficient FFT implementation for pairwise scoring appeared to be 10 to 26 times faster than a traditional FFT implementation, with only a factor of 2 in the acceleration attributable to a previously known compression scheme. Many important algorithms for detecting biosequence similarities, e.g., gapped BLAST o...

34 citations
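
The classical use of the FFT mentioned in the abstract, counting exact letter matches between two sequences at every offset, can be sketched as below. This is an illustrative sketch with hypothetical function names; it covers only the exact-match baseline, not the PSSM scoring or the compression scheme of the paper.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def match_counts(a, b):
    """Count letter matches between a and b at every gap-free offset.

    Entry m of the result is the number of j with a[j + d] == b[j],
    where d = m - (len(b) - 1) is the offset of b relative to a.
    """
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:
        size *= 2
    total = [0.0] * out_len
    for letter in set(a) & set(b):
        # One indicator vector per letter; reversing b turns the
        # convolution computed by the FFT into a cross-correlation.
        ia = [1.0 if ch == letter else 0.0 for ch in a]
        ib = [1.0 if ch == letter else 0.0 for ch in reversed(b)]
        ia += [0.0] * (size - len(ia))
        ib += [0.0] * (size - len(ib))
        product = [u * v for u, v in zip(fft(ia), fft(ib))]
        corr = fft(product, inverse=True)
        for m in range(out_len):
            total[m] += corr[m].real / size
    return [int(round(t)) for t in total]
```

The cost is one forward and inverse transform pair per letter of the alphabet, independent of how many offsets are examined.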


Book ChapterDOI
03 Nov 2002
TL;DR: The Cooley-Tukey FFT can be interpreted as an algorithm for the efficient computation of the Fourier transform for the finite cyclic groups, a compact group, or the non-compact group of the real line as discussed by the authors.
Abstract: The Cooley-Tukey FFT can be interpreted as an algorithm for the efficient computation of the Fourier transform for the finite cyclic groups, a compact group, or the non-compact group of the real line, all of which are commutative instances of a "Group FFT". A brief survey of some recent progress made in the direction of noncommutative generalizations and applications is given.

31 citations


01 Jan 2002
TL;DR: The amending algorithm based on the analysis of the FFT algorithm has the characteristics of easy implementation and high precision, and will be a practical method for harmonic analysis in power systems.
Abstract: The wide use of non-linear components in power systems gives rise to not only integer harmonics but also non-integer harmonics. The conventional harmonic measurement algorithm, the Fast Fourier Transform (FFT), is suitable for integer harmonic analysis but is not fit to analyze non-integer harmonics because of its leakage and picket fence effects, which bring about large errors in practical applications. The leakage effect is caused by the difference between the theoretical Fourier Transform, which deals with infinite signals, and the practical implementation of the Fourier Transform, which deals with finite signals. These differences give rise to measurement errors when the FFT algorithm is applied to non-integer harmonics. In order to reduce the leakage errors and improve the measurement precision, this paper presents an amending algorithm based on the analysis of the FFT algorithm. Through simple transforms of the FFT algorithm, the amending algorithm can satisfactorily reduce the leakage error and obtain accurate analysis results. Simulations validate the high precision of this novel algorithm. The amending algorithm has the characteristics of easy implementation and high precision, and will be a practical method for harmonic analysis in power systems.

25 citations


Proceedings ArticleDOI
15 Apr 2002
TL;DR: A novel twiddle factor-based FFT algorithm to reduce the frequency of memory access as well as multiplication operations is presented; for a 32-point FFT, the new algorithm leads to as much as a 20% reduction in clock cycles and an average 30% reduction in memory access compared with the conventional DIF FFT.
Abstract: In microprocessor-based systems, memory access is expensive due to longer latency and higher power consumption. In this paper, we present a novel FFT algorithm to reduce the frequency of memory access as well as multiplication operations. For an N-point FFT, we design the FFT with two distinct sections: (1) The first section of the FFT structure computes the butterflies involving twiddle factors W_N^j (j ≠ 0) through a computation/partitioning scheme similar to Huffman coding. In this section, all the butterflies sharing the same twiddle factor will be clustered and computed together. In this way, redundant memory access to load twiddle factors is avoided. (2) In the second section, the remaining (N - 1) butterflies involving the twiddle factor W_N^0 are computed with a register-based breadth-first tree traversal algorithm. This novel twiddle factor-based FFT is tested on the TI TMS320C62x digital signal processor. The results show that, for a 32-point FFT, the new algorithm leads to as much as a 20% reduction in clock cycles and an average 30% reduction in memory access compared with the conventional DIF FFT.

23 citations


Journal ArticleDOI
TL;DR: An efficient algorithm that decomposes a monomial representation of a solvable group G into its irreducible components is presented, together with well-known theorems in a constructively refined form and new results on decomposition matrices of representations.

22 citations


Journal ArticleDOI
TL;DR: By using this Fourier-based method, the use of large filters or infinite impulse response filters in multiresolution analysis becomes manageable in terms of computation costs.
Abstract: Wavelet transforms are often calculated by using the Mallat algorithm. In this algorithm, a signal is decomposed by a cascade of filtering and downsampling operations. Computing time can be important but the filtering operations can be sped up by using fast Fourier transform (FFT)-based convolutions. Since it is necessary to work in the Fourier domain when large filters are used, we present some results of Fourier-based optimization of the sampling operations. Acceleration can be obtained by expressing the samplings in the Fourier domain. The general equations of the down- and upsampling of digital multidimensional signals are given. It is shown that for special cases such as the separable scheme and Feauveau’s quincunx scheme, the samplings can be implemented in the Fourier domain. The performance of the implementations is determined by the number of multiplications involved in both FFT-convolution-based and Fourier-based algorithms. This comparison shows that the computational costs are reduced when the proposed implementation is used. The complexity of the algorithm is O(N log N). By using this Fourier-based method, the use of large filters or infinite impulse response filters in multiresolution analysis becomes manageable in terms of computation costs. Mesh simplification based on multiresolution “detail relevance” images illustrates an application of the implementation. © 2002 SPIE and IS&T.
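
Expressing a sampling operation in the Fourier domain can be illustrated in the simplest setting: 1D downsampling by a factor of 2, where the spectrum of the decimated signal is the average of the two halves of the original spectrum. The sketch below assumes this 1D case only and uses hypothetical names; a plain DFT stands in for the FFT to keep the check short. The multidimensional and quincunx cases treated in the paper are not covered.

```python
import cmath


def dft(x):
    """Plain O(N^2) DFT, used here only as a reference transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def downsample2_in_fourier(spectrum):
    """Spectrum of x[0::2] obtained directly from the spectrum of x.

    For even N, Y[k] = (X[k] + X[k + N/2]) / 2 for k = 0 .. N/2 - 1.
    """
    half = len(spectrum) // 2
    return [(spectrum[k] + spectrum[k + half]) / 2 for k in range(half)]
```

Because the fold costs only N/2 additions, a filtered signal that already lives in the Fourier domain can be decimated without first returning to the sample domain.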

Patent
Michael Walker
08 Mar 2002
TL;DR: In this paper, a Continuous Fourier Transformation (CFT) was proposed to allow a sliding determination of the Fourier transformation instead of former block processing according to the FFT, and the number of frequency samples and frequency distribution can be chosen freely and independently from the time sample rate.
Abstract: In the domain of telecommunications, the Fourier Transformation, frequently in the variant Fast Fourier Transformation, or FFT in short, is used, for example, in methods for echo suppression, for noise reduction, for improving speech recognition and for coding audio and video signals. In the case of the FFT, the number of frequencies N and the number of sampling values K are equal, the frequency spacing is constant, the bandwidth is constant, and the delay between the time signal and the frequency spectrum is fixed. These characteristics do not permit adaptation to, for example, psychoacoustic features, the frequency resolution of the human ear being nonlinear. The invention discloses a Continuous Fourier Transformation (CFT) that allows a sliding determination of the Fourier transformation instead of former block processing according to the FFT. Further, the number of frequency samples and the frequency distribution can be chosen freely and independently from the time sample rate.

Journal ArticleDOI
TL;DR: A new bit reversal algorithm which outperforms the existing ones and is based upon a pseudo‐semi‐group homomorphism property, which is almost trivial to prove and leads to a very efficient algorithm which is believed to be the best.
Abstract: In this paper we present a new bit reversal algorithm which outperforms the existing ones. The bit reversal technique is involved in the fast Fourier transform technique (FFT), which is widely used in computer-based numerical techniques for solving numerous problems. The new approach for computing the bit reversal is based upon a pseudo-semi-group homomorphism property. The surprise is that this property is almost trivial to prove but at the same time it also leads to a very efficient algorithm which we believe to be the best with only O(N) operations and optimal constant, i.e. unity. Copyright © 2002 John Wiley & Sons, Ltd.
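
For orientation, a linear-time bit reversal table can be built with a common textbook recurrence, sketched below. This is not the homomorphism-based method of the paper, whose details are not given in the abstract; the function name is hypothetical.

```python
def bit_reversal_table(bits):
    """Return rev, where rev[i] is i with its `bits`-bit pattern reversed.

    Uses the recurrence rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (bits - 1)),
    so each entry costs a constant number of operations.
    """
    n = 1 << bits
    rev = [0] * n
    for i in range(1, n):
        rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (bits - 1))
    return rev


print(bit_reversal_table(3))  # prints [0, 4, 2, 6, 1, 5, 3, 7]
```

The table is its own inverse permutation, so the same table reorders either the input of a decimation-in-time FFT or the output of a decimation-in-frequency FFT.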

Journal ArticleDOI
TL;DR: It is shown that a transform based on the RJM offers a simple structure of N-point FFT in terms of the decomposition of the corresponding matrix and that it computes the center weighted Hadamard transform very fast.
Abstract: The Reverse Jacket matrix (RJM) is a generalized form of the Hadamard matrix. Thus RJM is closely related to the matrix for fast Fourier transform (FFT). It also has a very interesting structure, i.e. its inverse can be easily obtained and has the reversal form of the original matrix. In this paper, we have shown that a transform based on the RJM offers a simple structure of N-point FFT in terms of the decomposition of the corresponding matrix and that it computes the center weighted Hadamard transform very fast.

Patent
30 Aug 2002
TL;DR: In this article, a wireless communication technique enables fast Fourier transforms (FFTs) and inverse fast Fourier transforms (IFFTs) to be performed with reduced latency and reduced memory requirements.
Abstract: A wireless communication technique enables fast Fourier transforms (FFTs) and inverse fast Fourier transforms (IFFTs) to be performed with reduced latency and reduced memory requirements. In particular, an FFT/IFFT unit receives input data representative of a communication symbol. The FFT/IFFT unit applies an FFT operation to the input data to generate intermediate data. The FFT/IFFT unit stores the intermediate data in a random access memory (RAM). The intermediate data stored in the RAM may overwrite data used as input to the FFT operation. The FFT/IFFT unit selectively addresses the RAM to retrieve the intermediate data in a desired output order. For example, the FFT/IFFT unit may output the intermediate data in the same sequential order as the FFT/IFFT unit received the input data.

Journal ArticleDOI
TL;DR: An algorithm for computing the discrete Fourier transform of data with threefold symmetry axes is presented, which reduces the computational complexity of such a Fourier transform by a factor of 3.
Abstract: An algorithm for computing the discrete Fourier transform of data with threefold symmetry axes is presented. This algorithm is straightforward and easily implemented. It reduces the computational complexity of such a Fourier transform by a factor of 3. There are no restrictive requirements imposed on the initial data. Explicit formulae and a scheme of computing the Fourier transform are given. The algorithm has been tested and benchmarked against FFT on the unit cell, revealing the expected increase in speed. This is a non-trivial example of a more general approach developed recently by the authors.

Journal ArticleDOI
TL;DR: A new relationship between the IDCT and the discrete Fourier transform (DFT) is established that allows the evaluation of two simultaneous N-point IDCTs by computing a single FFT of the same dimension.
Abstract: This paper reconsiders the discrete cosine transform (DCT) algorithm of Narasimha and Peterson (1978) in order to reduce the computational cost of the evaluation of N-point inverse discrete cosine transform (IDCT) through an N-point FFT. A new relationship between the IDCT and the discrete Fourier transform (DFT) is established. It allows the evaluation of two simultaneous N-point IDCTs by computing a single FFT of the same dimension. This IDCT implementation technique reduces by half the number of operations.

Proceedings ArticleDOI
03 Nov 2002
TL;DR: A modular pipeline architecture for computing long discrete Fourier transforms (DFT), where two conventional pipeline √N-point fast Fourier transform (FFT) modules are joined by a specialized center element, which reduces the number of delay elements required.
Abstract: This paper presents a modular pipeline architecture for computing long discrete Fourier transforms (DFT). For an N-point DFT, two conventional pipeline √N-point fast Fourier transform (FFT) modules are joined by a specialized center element. The center element contains memories, multipliers and control logic. Compared with a standard N-point pipeline FFT, the modular FFT reduces the number of delay elements required. Further, the coefficient storage is concentrated within the center element, reducing the storage requirement in the pipeline FFT modules. The centralized memory and address generator provide the storage and data reordering. The throughput of a standard radix-2 pipeline FFT is maintained with slightly higher end-to-end latency. A simulator has been built to analyze the proposed architecture. Results for DFTs of lengths up to 4M points are presented and compared with alternate algorithms.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: Design results show that the ML-FFT offers a flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT, and uses the polynomial transformation to obtain a similar multiplier-less approximation of the 2D FFT.
Abstract: This paper proposes a new multiplier-less approximation of the 1D discrete Fourier transform (DFT) called the multiplierless fast Fourier transform-like (ML-FFT) transformation. It parameterizes the twiddle factors in conventional radix-2^n or split-radix FFT algorithms as certain rotation-like matrices and approximates the associated parameters using the sum-of-powers-of-two (SOPOT) or canonical signed digits (CSD) representations. The ML-FFT converges to the DFT when the number of SOPOT terms used increases and has an arithmetic complexity of O(N log_2 N) additions, where N = 2^m is the transform length. Design results show that the ML-FFT offers a flexible tradeoff between arithmetic complexity and numerical accuracy in approximating the DFT. Using the polynomial transformation, a similar multiplier-less approximation of the 2D FFT is also obtained.

Book ChapterDOI
27 Aug 2002
TL;DR: A blocking algorithm for a parallel one-dimensional fast Fourier transform (FFT) on clusters of PCs based on the six-step FFT algorithm, which achieves performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
Abstract: In this paper, we propose a blocking algorithm for a parallel one-dimensional fast Fourier transform (FFT) on clusters of PCs. Our proposed parallel FFT algorithm is based on the six-step FFT algorithm. The six-step FFT algorithm can be altered into a block nine-step FFT algorithm to reduce the number of cache misses. The block nine-step FFT algorithm improves performance by utilizing the cache memory effectively. We use the block nine-step FFT algorithm to design the parallel one-dimensional FFT algorithm. In our proposed parallel FFT algorithm, since we use cyclic distribution, all-to-all communication is required only once. Moreover, the input data and output data can both be given in natural order. We successfully achieved performance of over 1.3 GFLOPS on an 8-node dual Pentium III 1 GHz PC SMP cluster.
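
The six-step algorithm that the blocked version starts from can be sketched for a length N = n1 * n2 as three transposes, two batches of short transforms, and one twiddle multiplication. The sketch below is a serial illustration with hypothetical names; a plain DFT stands in for the short FFTs, and the cache blocking, nine-step variant, and cyclic distribution of the paper are not shown.

```python
import cmath


def dft(x):
    """Plain O(N^2) DFT standing in for the short row FFTs."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def transpose(rows):
    return [list(col) for col in zip(*rows)]


def six_step_fft(x, n1, n2):
    """DFT of x (length n1 * n2) via the six-step decomposition.

    Input index j = j1 + j2 * n1, output index k = k2 + k1 * n2.
    """
    n = n1 * n2
    assert len(x) == n
    # View x as an n2-by-n1 matrix stored row by row (row j2, column j1).
    m = [list(x[j2 * n1:(j2 + 1) * n1]) for j2 in range(n2)]
    m = transpose(m)             # step 1: n1 rows, indexed [j1][j2]
    m = [dft(row) for row in m]  # step 2: n1 transforms of length n2
    m = [                        # step 3: twiddle factors W_N^(j1 * k2)
        [m[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n) for k2 in range(n2)]
        for j1 in range(n1)
    ]
    m = transpose(m)             # step 4: n2 rows, indexed [k2][j1]
    m = [dft(row) for row in m]  # step 5: n2 transforms of length n1
    m = transpose(m)             # step 6: n1 rows, indexed [k1][k2]
    return [v for row in m for v in row]
```

The transposes exist so that every short transform runs over contiguous data, which is also what makes the scheme attractive for distributed memory: each transpose maps onto an all-to-all exchange.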

Proceedings ArticleDOI
07 Nov 2002
TL;DR: A novel approach for scalable length Fast Fourier Transform (FFT) in single Processing Element (single PE) architecture has been developed, along with an efficient mechanism, named Interleaved Rotated Data Allocation (IRDA), to replace the multiple-port memory with a single-port memory.
Abstract: A novel approach for scalable length Fast Fourier Transform (FFT) in single Processing Element (single PE) architecture has been developed. The scalable length FFT design meets the different length requirements of FFT operation in an OFDM system. An efficient mechanism, named Interleaved Rotated Data Allocation (IRDA), to replace the multiple-port memory with a single-port memory has also been proposed. Using a single-port memory instead of multiple-port memory makes the design more area efficient.

Journal ArticleDOI
TL;DR: A new fast algorithm for spectral transformations for two-dimensional digital filters is presented, based on the use of the fast Fourier transform, which is illustrated by a numerical example.
Abstract: In this paper, a new fast algorithm for spectral transformations for two-dimensional digital filters is presented. The algorithm is based on the use of the fast Fourier transform. The computational complexity of this algorithm is evaluated. The simplicity and efficiency of the algorithm are illustrated by a numerical example.

Journal ArticleDOI
TL;DR: An extendible look-up table of the twiddle factors for implementation of fast Fourier transform (FFT) is introduced and the results indicate that the FFT scheme is effective.

Journal ArticleDOI
TL;DR: By combining the polynomial transform and radix-q decomposition, the paper presents a new algorithm for the type-III r-dimensional discrete cosine transform (rD-DCT-III) with size q^{l_1} × q^{l_2} × ... × q^{l_r}, where q is an odd prime number.
Abstract: By combining the polynomial transform and radix-q decomposition, the paper presents a new algorithm for the type-III r-dimensional discrete cosine transform (rD-DCT-III) with size q^{l_1} × q^{l_2} × ... × q^{l_r}, where q is an odd prime number. The number of multiplications for computing an rD-DCT-III is approximately 1/r times that needed by the row-column method while the number of additions increases slightly. The total number of operations (additions plus multiplications) is also reduced. The proposed algorithm has a simple computational structure because it needs only 1D-DCT-III and the polynomial transform.

Proceedings ArticleDOI
03 Sep 2002
TL;DR: The SIMD-FFT algorithm is extended to handle Multi-Dimensional input data; this new approach does not make use of matrix transposition, and the results are compared against the FFTW for the 2D and 3D case.
Abstract: A general radix-2 FFT algorithm was recently developed and implemented for modern Single Instruction Multiple Data (SIMD) architectures. This algorithm (SIMD-FFT) was found to be faster than any scalar FFT implementation, and also faster than other FFT implementations that use the SIMD architecture, for complex 1D and 2D input data [1]. In this paper, the SIMD-FFT algorithm is extended to handle multi-dimensional input data; this new approach does not make use of matrix transposition. The results are compared against the FFTW for the 2D and 3D case. Overall, the SIMD-FFT was found to be faster for complex 2D input data (ranging from 82% up to 343%) and for complex 3D input data (ranging from 59.5% up to 198%).

Book ChapterDOI
21 Apr 2002
TL;DR: A new method for computing the discrete Fourier transform of data endowed with linear symmetries is presented, which minimizes operations and memory space requirements by eliminating redundant data and computations induced by the symmetry on the DFT equations.
Abstract: A new method for computing the discrete Fourier transform (DFT) of data endowed with linear symmetries is presented. The method minimizes operations and memory space requirements by eliminating redundant data and computations induced by the symmetry on the DFT equations. The arithmetic complexity of the new method is always lower, and in some cases significantly lower, than that of its predecessor. A parallel version of the new method is also discussed. Symmetry-aware DFTs are crucial in the computer determination of the atomic structure of crystals from x-ray diffraction intensity patterns.

Journal ArticleDOI
TL;DR: In this article, a simple remedy is described whereby the fast Fourier transform (FFT) can be used to compute the transform on as coarse a grid as one desires without loss of precision, whereas most crystallographic FFT programs test the ranges of the Miller indices of the input data to ensure that the total number of grid divisions in the x, y and z directions of the cell is sufficiently large to perform the FFT.
Abstract: The fast Fourier transform (FFT) algorithm as normally formulated allows one to compute the Fourier transform of up to N complex structure factors, F(h), N/2 ≥ h > −N/2, if the transform ρ(r) is computed on an N-point grid. Most crystallographic FFT programs test the ranges of the Miller indices of the input data to ensure that the total number of grid divisions in the x, y and z directions of the cell is sufficiently large to perform the FFT. This note calls attention to a simple remedy whereby an FFT can be used to compute the transform on as coarse a grid as one desires without loss of precision.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: This approach shows that depending on the number and distribution of components, and desired accuracy, the combined algorithm can reduce the computational burden by as much as a factor of ten.
Abstract: The FFT is a classical approach to fast spectral analysis and measurement. However, it is not the best choice when high accuracy is desired for signals with a very sparse, unpredictable, wide spectral distribution. This paper describes a multistage algorithm designed for more efficient and very accurate calculation of the spectral components in such cases. The approach taken is to combine the advantages of three algorithms: the FFT, the CZT, and the DFT. The FFT is used for a coarse resolution scan of the entire frequency range. The chirp z transform (CZT) is used with an interpolation technique to find a more precise location of the frequency components. The DFT is used along with a windowing technique to ensure a very accurate computation of magnitude and phase. Accurate phase is very difficult to obtain with traditional approaches. This approach shows that depending on the number and distribution of components, and desired accuracy, the combined algorithm can reduce the computational burden by as much as a factor of ten.
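
The coarse-to-fine structure described in the abstract can be sketched in three stages: scan integer bins, zoom into a narrow band around the strongest bin, then evaluate one spectral sample at the refined frequency. The sketch below makes several simplifying assumptions: a single dominant component, direct evaluation of the spectrum where the paper uses an FFT and a chirp z transform, no windowing or interpolation, and hypothetical function names.

```python
import cmath


def spectrum_at(x, bin_index):
    """Spectrum of x at a possibly fractional bin index (direct sum)."""
    n = len(x)
    return sum(
        x[t] * cmath.exp(-2j * cmath.pi * bin_index * t / n) for t in range(n)
    )


def coarse_to_fine_peak(x, zoom=100):
    """Locate the dominant component and return (bin_frequency, amplitude)."""
    n = len(x)
    # Stage 1: coarse scan on integer bins (an FFT in practice).
    coarse = [abs(spectrum_at(x, k)) for k in range(n)]
    k0 = max(range(n), key=coarse.__getitem__)
    # Stage 2: dense scan within one bin either side of the coarse peak
    # (the kind of narrow-band zoom a chirp z transform computes).
    best_f, best_mag = float(k0), coarse[k0]
    for step in range(-zoom, zoom + 1):
        f = k0 + step / zoom
        mag = abs(spectrum_at(x, f))
        if mag > best_mag:
            best_f, best_mag = f, mag
    # Stage 3: one DFT sample at the refined frequency gives the
    # complex amplitude (magnitude and phase) of the component.
    amplitude = spectrum_at(x, best_f) / n
    return best_f, amplitude
```

The saving comes from spending fine resolution only where the coarse scan found energy; for a sparse spectrum, that is a small fraction of the full frequency range.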

Proceedings ArticleDOI
07 Aug 2002
TL;DR: It is shown that the two receiver schemes are equivalent in terms of the characteristics of the FFT output, per-bin SNR after maximum-ratio combining, computational complexity and the residual out-of-band noise rejection property.
Abstract: A fractionally-spaced frequency domain equalizer (FS-FDE) is needed to transform a received signal into the frequency domain. We consider two fast Fourier transform (FFT) schemes, one with a single 2N-point FFT and one with two N-point FFTs, for the transform of oversampled N-symbol data in the FS-FDE receiver. It is shown that the two receiver schemes are equivalent in terms of the characteristics of the FFT output, per-bin SNR after maximum-ratio combining, computational complexity and the residual out-of-band noise rejection property. These properties are verified by computer simulation in additive white Gaussian noise (AWGN) and typical multipath channels.
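
The algebraic link between one 2N-point transform and two N-point transforms can be shown with the standard even/odd split, sketched below. It assumes that the two N-point FFTs act on the even-indexed and odd-indexed samples of the twice-oversampled block, which the abstract does not spell out; the names are hypothetical and a plain DFT stands in for the FFT.

```python
import cmath


def dft(x):
    """Plain O(N^2) DFT, used here in place of an FFT for brevity."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectrum_from_two_half_transforms(r):
    """2N-point spectrum of r built from two N-point transforms.

    R[k] = E[k mod N] + W_{2N}^k * O[k mod N], where E and O are the
    transforms of the even-indexed and odd-indexed samples of r.
    """
    n = len(r) // 2
    even = dft(r[0::2])
    odd = dft(r[1::2])
    return [
        even[k % n] + cmath.exp(-2j * cmath.pi * k / (2 * n)) * odd[k % n]
        for k in range(2 * n)
    ]
```

Under that assumption both routes carry the same spectral information, and the recombination costs only one complex multiplication per output bin.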

Journal ArticleDOI
TL;DR: The least-squares estimate for the joint estimation of M directions-of-arrival in an array-processing scenario is rederived in a way that makes explicit use of the discrete Fourier transform.
Abstract: The least-squares estimate for the joint estimation of M directions-of-arrival in an array-processing scenario is rederived in a way that makes explicit use of the discrete Fourier transform. It is shown that the M-dimensional search algorithm can be made orders of magnitude faster than the traditional search algorithm. This new approach is compared via simulation with conventional narrowband beamforming.