
Showing papers on "Prime-factor FFT algorithm published in 2005"


Journal Article
TL;DR: This paper develops the discrete linear canonical transform (DLCT) and derives from it the fast linear canonical transform (FLCT), an N log N algorithm that can be used for FFT, FRT, and FST calculations.
Abstract: The linear canonical transform (LCT) describes the effect of any quadratic phase system (QPS) on an input optical wave field. Special cases of the LCT include the fractional Fourier transform (FRT), the Fourier transform (FT), and the Fresnel transform (FST) describing free-space propagation. Currently there are numerous efficient algorithms used (for purposes of numerical simulation in the area of optical signal processing) to calculate the discrete FT, FRT, and FST. All of these algorithms are based on the use of the fast Fourier transform (FFT). In this paper we develop theory for the discrete linear canonical transform (DLCT), which is to the LCT what the discrete Fourier transform (DFT) is to the FT. We then derive the fast linear canonical transform (FLCT), an NlogN algorithm for its numerical implementation by an approach similar to that used in deriving the FFT from the DFT. Our algorithm is significantly different from the FFT, is based purely on the properties of the LCT, and can be used for FFT, FRT, and FST calculations and, in the most general case, for the rapid calculation of the effect of any QPS.

167 citations


Posted Content
TL;DR: This paper shows how the fractional FFT (FRFT) algorithm can be used to retrieve option prices from the corresponding characteristic functions; in the reported experiments, prices are delivered up to 45 times faster without substantial loss of accuracy.
Abstract: This paper shows how the recently developed fractional FFT algorithm (FRFT) can be used to retrieve option prices from the corresponding characteristic functions. The FRFT algorithm has the advantage of using the characteristic function information in a more efficient way than the straight FFT. Typically, therefore, fewer function evaluations are needed and substantial savings in computational time can be made. Two experiments, based on the stochastic volatility and the variance-gamma models, illustrate the benefits of using the fractional version of the FFT and show that option prices can be delivered up to 45 times faster without substantial loss of accuracy in the results.
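The fractional FFT mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the common definition G_k = sum_j x_j exp(-2*pi*i*j*k*alpha) evaluated with chirp multiplications and one circular convolution of length 2N; the function name `frft` and the parameter name `alpha` are illustrative and are not taken from the paper, and no option-pricing model is included.

```python
import numpy as np

def frft(x, alpha):
    """Fractional FFT sketch: G[k] = sum_j x[j] * exp(-2j*pi*j*k*alpha).

    Uses the identity 2*j*k = j**2 + k**2 - (k - j)**2 to turn the sum
    into a convolution, which is evaluated with ordinary FFTs of length 2N.
    """
    x = np.asarray(x, dtype=complex)
    n = x.size
    j = np.arange(n)
    chirp = np.exp(-1j * np.pi * alpha * j**2)
    # Chirp-premultiplied input, zero-padded to length 2N.
    y = np.concatenate([x * chirp, np.zeros(n, dtype=complex)])
    # Convolution kernel exp(+i*pi*alpha*m^2) laid out circularly:
    # indices 0..N-1 hold m = 0..N-1, indices N..2N-1 hold m = -N..-1.
    m = np.concatenate([j, j - n])
    z = np.exp(1j * np.pi * alpha * m**2)
    conv = np.fft.ifft(np.fft.fft(y) * np.fft.fft(z))
    return chirp * conv[:n]
```

With `alpha = 1/N` the result coincides with the ordinary DFT; other values of `alpha` let the output grid spacing be chosen independently of the input grid spacing, which is the flexibility the abstract refers to.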

144 citations


Journal Article
TL;DR: A new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy that can reduce hardware complexity and computation cycles compared with existing FFT processors is proposed.
Abstract: The paper proposes a new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy. The existing in-place strategy supports only a fixed-radix FFT algorithm. In contrast, the proposed in-place strategy can support the MR algorithm, which allows CF FFT computations regardless of the length of FFT. The novel in-place strategy is made by interchanging storage locations of butterfly outputs. The CFMR FFT processor provides the MR algorithm, the in-place strategy, and the CF FFT computations at the same time. The CFMR FFT processor requires only two N-word memories due to the proposed in-place strategy. In addition, it uses one butterfly unit that can perform either one radix-4 butterfly or two radix-2 butterflies. The CFMR FFT processor using the 0.18 μm SEC cell library consists of 37,000 gates excluding memories, requires only 640 clock cycles for a 512-point FFT and runs at 100 MHz. Therefore, the CFMR FFT processor can reduce hardware complexity and computation cycles compared with existing FFT processors.
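The 640-cycle figure for a 512-point FFT can be reproduced with a simple count. The sketch below is an assumption-laden back-of-the-envelope model, not the paper's own derivation: it assumes one butterfly operation per clock cycle (one radix-4 butterfly or two radix-2 butterflies), as many radix-4 stages as possible, and no pipeline fill or memory overhead. The function name `cfmr_cycles` is hypothetical.

```python
def cfmr_cycles(n):
    """Estimated clock cycles for an n-point radix-4/2 FFT on one butterfly unit."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    log2n = n.bit_length() - 1
    # Use radix-4 stages where possible; one radix-2 stage remains if log2(n) is odd.
    radix4_stages, radix2_stages = divmod(log2n, 2)
    radix4_cycles = radix4_stages * (n // 4)        # n/4 radix-4 butterflies per stage
    radix2_cycles = radix2_stages * (n // 2) // 2   # n/2 radix-2 butterflies, two per cycle
    return radix4_cycles + radix2_cycles

print(cfmr_cycles(512))  # 4 radix-4 stages of 128 cycles plus 128 cycles for the radix-2 stage
```

Under these assumptions, 512 = 4^4 x 2 gives 4 x 128 + 128 = 640 cycles, which agrees with the number quoted in the abstract.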

128 citations


Journal Article
TL;DR: In this article, the authors proposed a new method for power signal harmonic analysis, which consists of a frequency and phasor estimating algorithm, a finite-impulse-response comb filter, and a correction factor.
Abstract: With the increasing use of nonlinear loads in power systems, harmonic pollution is becoming more and more serious. It is well known that the fast Fourier transform (FFT) is a powerful tool for power signal harmonic analysis, but the leakage effect, picket-fence effect, and aliasing effect impose specific restrictions on the FFT. In this paper, we propose a new method for power signal harmonic analysis. The major components of this method are a frequency and phasor estimating algorithm, a finite-impulse-response comb filter, and a correction factor. We also combine other methods, such as the discrete Fourier transform and the least square error (LSE) method, to enhance performance. To verify this method, we compare it with the FFT.

75 citations


Journal Article
TL;DR: This letter defines a DFRFT based on a centered version of the DFT (CDFRFT) using eigenvectors derived from the Grünbaum tridiagonal commutor that serve as excellent discrete approximations to the Hermite-Gauss functions.
Abstract: Existing versions of the discrete fractional Fourier transform (DFRFT) are based on the discrete Fourier transform (DFT). These approaches need a full basis of DFT eigenvectors that serve as discrete versions of Hermite-Gauss functions. In this letter, we define a DFRFT based on a centered version of the DFT (CDFRFT) using eigenvectors derived from the Grünbaum tridiagonal commutor that serve as excellent discrete approximations to the Hermite-Gauss functions. We develop a fast and efficient way to compute the multiangle version of the CDFRFT for a discrete set of angles using the FFT algorithm. We then show that the associated chirp-frequency representation is a useful analysis tool for multicomponent chirp signals.

72 citations


Journal Article
TL;DR: A non-linear sampling scheme is proposed for the indirect dimensions of multidimensional NMR experiments, with the data processed by the Dutt and Rokhlin algorithm for fast Fourier transforms of non-equispaced data; the algorithm is simple and highly robust since no parameters need to be adjusted by the user.
Abstract: Rapid acquisition of high-resolution 2D and 3D NMR spectra is essential for studying biological macromolecules. In order to minimize the experimental time, a non-linear sampling scheme is proposed for the indirect dimensions of multidimensional experiments. These data can be processed using the algorithm proposed by Dutt and Rokhlin (Appl. Comp. Harm. Anal. 1995, 2, 85–100) for fast Fourier transforms of non equispaced data. Examples of 1H−15N HSQC spectra are shown, where crowded correlation peaks can be resolved using non-linear acquisition. Simulated data have been used to analyze the artefacts produced by the Lagrange interpolation. As compared to non-linear processing methods, this algorithm is simple and highly robust since no parameters need to be adjusted by the user.

71 citations


Journal Article
Maria Eleftheriou, Blake G. Fitch, Aleksandr Rayshubskiy, T. J. C. Ward, R. S. Germain
TL;DR: Two implementations of the volumetric 3D FFT are characterized on the Blue Gene/L prototype; the volumetric FFT outperforms a port of the FFTW Version 2.1.5 library on large-node-count partitions.
Abstract: This paper presents results on a communications-intensive kernel, the three-dimensional fast Fourier transform (3D FFT), running on the 2,048-node Blue Gene®/L (BG/L) prototype. Two implementations of the volumetric FFT algorithm were characterized, one built on the Message Passing Interface library and another built on an active packet Application Program Interface supported by the hardware bring-up environment, the BG/L advanced diagnostics environment. Preliminary performance experiments on the BG/L prototype indicate that both of our implementations scale well up to 1,024 nodes for 3D FFTs of size 128 × 128 × 128. The performance of the volumetric FFT is also compared with that of the Fastest Fourier Transform in the West (FFTW) library. In general, the volumetric FFT outperforms a port of the FFTW Version 2.1.5 library on large-node-count partitions.

61 citations


Book
07 Jul 2005
TL;DR: This book covers Fourier analysis on finite non-abelian groups, including fast Fourier transforms, decision-diagram representations, Gibbs derivatives, linear shift-invariant systems, and the Hilbert transform on such groups.
Abstract: Preface. Acknowledgments. Acronyms. 1 Signals and Their Mathematical Models. 1.1 Systems. 1.2 Signals. 1.3 Mathematical Models of Signals. References. 2 Fourier Analysis. 2.1 Representations of Groups. 2.1.1 Complete Reducibility. 2.2 Fourier Transform on Finite Groups. 2.3 Properties of the Fourier Transform. 2.4 Matrix Interpretation of the Fourier Transform on Finite Non-Abelian Groups. 2.5 Fast Fourier Transform on Finite Non-Abelian Groups. References. 3 Matrix Interpretation of the FFT. 3.1 Matrix Interpretation of FFT on Finite Non-Abelian Groups. 3.2 Illustrative Examples. 3.3 Complexity of the FFT. 3.3.1 Complexity of Calculations of the FFT. 3.3.2 Remarks on Programming Implementation of FFT. 3.4 FFT Through Decision Diagrams. 3.4.1 Decision Diagrams. 3.4.2 FFT on Finite Non-Abelian Groups Through DDs. 3.4.3 MMTDs for the Fourier Spectrum. 3.4.4 Complexity of DDs Calculation Methods. References. 4 Optimization of Decision Diagrams. 4.1 Reduction Possibilities in Decision Diagrams. 4.2 Group-Theoretic Interpretation of DD. 4.3 Fourier Decision Diagrams. 4.3.1 Fourier Decision Trees. 4.3.2 Fourier Decision Diagrams. 4.4 Discussion of Different Decompositions. 4.4.1 Algorithm for Optimization of DDs. 4.5 Representation of Two-Variable Function Generator. 4.6 Representation of Adders by Fourier DD. 4.7 Representation of Multipliers by Fourier DD. 4.8 Complexity of NADD. 4.9 Fourier DDs with Preprocessing. 4.9.1 Matrix-valued Functions. 4.9.2 Fourier Transform for Matrix-Valued Functions. 4.10 Fourier Decision Trees with Preprocessing. 4.11 Fourier Decision Diagrams with Preprocessing. 4.12 Construction of FNAPDD. 4.13 Algorithm for Construction of FNAPDD. 4.13.1 Algorithm for Representation. 4.14 Optimization of FNAPDD. References. 5 Functional Expressions on Quaternion Groups. 5.1 Fourier Expressions on Finite Dyadic Groups. 5.1.1 Finite Dyadic Groups. 5.2 Fourier Expressions on Q2. 5.3 Arithmetic Expressions. 5.4 Arithmetic Expressions from Walsh Expansions. 5.5 Arithmetic Expressions on Q2. 5.5.1 Arithmetic Expressions and Arithmetic-Haar Expressions. 5.5.2 Arithmetic-Haar Expressions and Kronecker Expressions. 5.6 Different Polarity Polynomials Expressions. 5.6.1 Fixed-Polarity Fourier Expressions in C(Q2). 5.6.2 Fixed-Polarity Arithmetic-Haar Expressions. 5.7 Calculation of the Arithmetic-Haar Coefficients. 5.7.1 FFT-like Algorithm. 5.7.2 Calculation of Arithmetic-Haar Coefficients Through Decision Diagrams. References. 6 Gibbs Derivatives on Finite Groups. 6.1 Definition and Properties of Gibbs Derivatives on Finite Non-Abelian Groups. 6.2 Gibbs Anti-Derivative. 6.3 Partial Gibbs Derivatives. 6.4 Gibbs Differential Equations. 6.5 Matrix Interpretation of Gibbs Derivatives. 6.6 Fast Algorithms for Calculation of Gibbs Derivatives on Finite Groups. 6.6.1 Complexity of Calculation of Gibbs Derivatives. 6.7 Calculation of Gibbs Derivatives Through DDs. 6.7.1 Calculation of Partial Gibbs Derivatives. References. 7 Linear Systems on Finite Non-Abelian Groups. 7.1 Linear Shift-Invariant Systems on Groups. 7.2 Linear Shift-Invariant Systems on Finite Non-Abelian Groups. 7.3 Gibbs Derivatives and Linear Systems. 7.3.1 Discussion. References. 8 Hilbert Transform on Finite Groups. 8.1 Some Results of Fourier Analysis on Finite Non-Abelian Groups. 8.2 Hilbert Transform on Finite Non-Abelian Groups. 8.3 Hilbert Transform in Finite Fields. References. Index.

47 citations


Journal Article
TL;DR: A traced FFT Pruning method (TFFTP) is developed, which is a novel technique and does not require that the outputs be in continuous windows of the fast Fourier transform.
Abstract: The fast Fourier transform (FFT) is an essential tool in digital signal processing and communications. In the applications of the FFT where the required outputs are very sparse, for example, in digital filtering, one may only require the spectrum corresponding to certain bins of the FFT or in narrow frequency windows. In these cases, most of the FFT outputs are not required. Some pruning algorithms have been proposed to deal with such cases. However, most of the pruning algorithms require that the outputs be in continuous windows. This paper develops a traced FFT Pruning method (TFFTP), which is a novel technique and does not require this condition. Under some circumstances, considerable savings in computational complexity and power consumption can be realized using the TFFTP compared to the FFT. This paper derives the average number of butterflies that need to be executed when only $k_{\text{in}}$ input and/or $k_{\text{out}}$ output bins of an N-point FFT are required, where $k_{\text{in}} \le N$ and/or $k_{\text{out}} \le N$. This method is then extended to arbitrary-radix FFT pruning and to the simultaneous input and output pruning case.

42 citations


Book Chapter
30 Aug 2005
TL;DR: This study has improved a recently proposed mathematical model of Fourier transform technique for pricing financial derivatives to help design and develop an effective parallel algorithm using a swapping technique that exploits data locality.
Abstract: The Fast Fourier Transform (FFT) has been used in many scientific and engineering applications. In the current study, we have applied the FFT to a novel application in finance. We have improved a recently proposed mathematical model of the Fourier transform technique for pricing financial derivatives to help design and develop an effective parallel algorithm using a swapping technique that exploits data locality. We have implemented our algorithm on a 20-node SunFire 6800 high-performance computing system and compared the new algorithm with the traditional Cooley-Tukey algorithm. We have presented the computed option values for various strike prices with a proper selection of strike-price spacing to ensure fine-grid integration for FFT computation as well as to maximize the number of strikes lying in the desired region of the asset price.

26 citations


Proceedings Article
05 Dec 2005
TL;DR: A new general method to deduce FFT algorithms is introduced, and the deduced second radix-2 decimation-in-time FFT algorithm is transformed into another parallelizable sequential form, reducing the time complexity of the DFT to $O((n \log n)/p)$ (where p is the number of processors).
Abstract: The discrete Fourier transform (DFT) has many applications in digital signal and image processing and other scientific and technological domains, but the time complexity of its direct computation is $O(n^2)$, which greatly limits its application range. Thus many people have developed fast Fourier transform (FFT) algorithms, reducing the complexity from $O(n^2)$ to $O(n \log n)$ (in this paper, $\log n$ denotes $\log_2 n$). But for large n, $O(n \log n)$ is still very high, so multiprocessor systems have been used to speed up the computation of the DFT. This paper first introduces a new general method to deduce FFT algorithms, then transforms the deduced second radix-2 decimation-in-time FFT algorithm into another parallelizable sequential form, and finally transforms the latter algorithm into a new parallel FFT algorithm, reducing the time complexity of the DFT to $O((n \log n)/p)$ (where p is the number of processors). Using similar methods, the authors can also design other new parallel 1-D and 2-D FFT algorithms.
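For orientation, the textbook recursive radix-2 decimation-in-time FFT that brings the cost from O(n^2) to O(n log n) can be sketched as below. This is the standard sequential algorithm, not the paper's derived parallel form, which the abstract does not spell out; the function name `fft_radix2` is illustrative.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT for power-of-two lengths."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed halves and transform each recursively.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half before the butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

The two recursive calls are independent of each other, which is one reason the radix-2 structure lends itself to distribution over several processors.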

Journal Article
TL;DR: An analytical fan-beam reconstruction algorithm that compensates for uniform attenuation in SPECT is developed in the form of backprojection first then filtering, and is mathematically exact.
Abstract: In this paper, we developed an analytical fan-beam reconstruction algorithm that compensates for uniform attenuation in SPECT. The new fan-beam algorithm is in the form of backprojection first, then filtering, and is mathematically exact. The algorithm is based on three components. The first one is the established generalized central-slice theorem, which relates the 1D Fourier transform of a set of arbitrary data and the 2D Fourier transform of the backprojected image. The second one is the fact that the backprojection of the fan-beam measurements is identical to the backprojection of the parallel measurements of the same object with the same attenuator. The third one is the stable analytical reconstruction algorithm for uniformly attenuated Radon data, developed by Metz and Pan. The fan-beam algorithm is then extended into a cone-beam reconstruction algorithm, where the orbit of the focal point of the cone-beam imaging geometry is a circle. This orbit geometry does not satisfy Tuy's condition and the obtained cone-beam algorithm is an approximation. In the cone-beam algorithm, the cone-beam data are first backprojected into the 3D image volume; then a slice-by-slice filtering is performed. This slice-by-slice filtering procedure is identical to that of the fan-beam algorithm. Both the fan-beam and cone-beam algorithms are efficient, and computer simulations are presented. The new cone-beam algorithm is compared with Bronnikov's cone-beam algorithm, and it is shown to have better performance with noisy projections.

Proceedings Article
19 Jun 2005
TL;DR: The design and hardware implementation of an FFT-based algorithm using modular arithmetic to efficiently compute very large number multiplications is presented; results show that it starts to be useful for 4096-bit operands and beyond.
Abstract: Modular multiplication (MM) for large integers is the foundation of most public-key cryptosystems, specifically RSA, El-Gamal and the elliptic curve cryptosystems. Thus MM algorithms have been studied widely and extensively. Most works are based on the well-known Montgomery multiplication method (MMM) and its variants, which require multiplication in N. Authors have always avoided the fast Fourier transform (FFT) method, believing that it is impractical for present system sizes despite its smaller complexity order. In this paper, the authors present the design and hardware implementation of an FFT-based algorithm using modular arithmetic to efficiently compute very large number multiplications. The algorithm has been implemented in CASM, an intermediate-level HDL developed in the laboratory. The target architecture is an FPGA. The algorithm is scalable and can easily be mapped to any operand size. Results show that such an implementation starts to be useful for 4096-bit operands and beyond.
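One common way to realize an FFT over modular arithmetic is the number-theoretic transform (NTT). The sketch below is a software illustration only and is not the paper's hardware design: the prime 998244353, its primitive root 3, and the 8-bit digit size are assumptions chosen for convenience, and the helper names are hypothetical. With these choices the operands must stay below roughly 15,000 bytes each so that convolution sums remain smaller than the modulus.

```python
MOD = 998244353   # assumed NTT-friendly prime: 119 * 2**23 + 1
ROOT = 3          # a primitive root modulo MOD

def ntt(values, invert=False):
    """Iterative radix-2 number-theoretic transform (length must be a power of two)."""
    a = list(values)
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % MOD
                a[start + k] = (u + v) % MOD
                a[start + k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        a = [value * n_inv % MOD for value in a]
    return a

def to_digits(value):
    """Split a non-negative integer into base-256 digits, least significant first."""
    digits = []
    while value:
        digits.append(value & 0xFF)
        value >>= 8
    return digits or [0]

def multiply(x, y):
    """Multiply two non-negative integers via NTT-based digit convolution."""
    a, b = to_digits(x), to_digits(y)
    size = 1
    while size < len(a) + len(b):
        size <<= 1
    fa = ntt(a + [0] * (size - len(a)))
    fb = ntt(b + [0] * (size - len(b)))
    product = ntt([p * q % MOD for p, q in zip(fa, fb)], invert=True)
    # Summing shifted coefficients propagates the carries.
    return sum(c << (8 * i) for i, c in enumerate(product))
```

Because every operation is exact integer arithmetic modulo a prime, there is no rounding error to manage, which is one reason modular transforms are attractive for hardware multipliers.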

Proceedings Article
Jing Wu, Wei Zhao
01 Jan 2005
TL;DR: In this article, a multi-spectrum-line interpolation correction algorithm (MICA) is presented to correct spectral leakage under asynchronous sampling; based on the FFT, it can be applied to measure electrical harmonics precisely and to estimate the frequency offset in orthogonal frequency division multiplexing (OFDM) communication systems.
Abstract: Taking the rectangular window as an example, the summation in the expression of the discrete Fourier transform is solved by the integral operation. Under the condition of asynchronous sampling, the long-range leakage and short-range leakage effects of the spectrum are explained. A new method, named the multi-spectrum-line interpolation correction algorithm (MICA), is presented to correct the leakage effects. The difference between MICA and the previous windowed interpolation algorithm is analyzed. The simulation results show that the measurement accuracy of MICA is very high and that the long-range leakage effect can be corrected by applying not only a window with better spectral performance, but also the interpolation algorithm. MICA based on the FFT can be applied to measure the electrical harmonics precisely and to estimate the frequency offset in the orthogonal frequency division multiplexing (OFDM) communication system.
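The leakage that arises under asynchronous sampling with a rectangular window can be demonstrated numerically. The sketch below only illustrates the effect and does not implement MICA, whose formulas are not given in the abstract; the function name `leakage_fraction` and the record length of 64 samples are arbitrary choices.

```python
import numpy as np

def leakage_fraction(cycles, n=64):
    """Fraction of one-sided spectral power that falls outside the strongest bin.

    `cycles` is the number of signal periods inside the n-sample record;
    a non-integer value models asynchronous (non-coherent) sampling.
    """
    t = np.arange(n)
    x = np.cos(2 * np.pi * cycles * t / n)
    power = np.abs(np.fft.rfft(x)) ** 2
    return 1.0 - power.max() / power.sum()

print(leakage_fraction(8.0))   # synchronous sampling: essentially all power in one bin
print(leakage_fraction(8.5))   # asynchronous sampling: power spreads over many bins
```

When the record holds a whole number of periods, the tone sits exactly on a bin; a half-bin offset is the worst case for the rectangular window, and interpolation between neighbouring spectral lines is the kind of correction the abstract describes.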

Proceedings Article
07 Nov 2005
TL;DR: In this article, a three-step phase-shifting algorithm was proposed, which is 3.4 times faster than the traditional 3-step algorithm, using a simple intensity ratio function to replace the arctangent function in the traditional algorithm.
Abstract: We propose a new three-step phase-shifting algorithm, which is much faster than the traditional three-step algorithm. We achieve the speed advantage by using a simple intensity ratio function to replace the arctangent function in the traditional algorithm. The phase error caused by this new algorithm is compensated for by use of a look-up-table (LUT). Our experimental results show that both the new algorithm and the traditional algorithm generate similar results, but the new algorithm is 3.4 times faster. By implementing this new algorithm in a high-resolution, real-time 3D shape measurement system, we were able to achieve a measurement speed of 40 frames per second (fps) at a resolution of 532 × 500 pixels, all with an ordinary personal computer.
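As a reference point, the traditional arctangent-based three-step computation can be sketched as below. It assumes fringe images of the form I_k = A + B*cos(phi + delta_k) with phase shifts of -2*pi/3, 0, and +2*pi/3, which is a common convention but is not stated in the abstract; the faster intensity-ratio method and its look-up table are not reproduced here because the abstract does not give their formulas.

```python
import numpy as np

def three_step_phase(i1, i2, i3):
    """Wrapped phase from three fringe images shifted by -2*pi/3, 0, +2*pi/3.

    With I_k = A + B*cos(phi + delta_k):
      I1 - I3       = sqrt(3) * B * sin(phi)
      2*I2 - I1 - I3 = 3 * B * cos(phi)
    so phi = atan2(sqrt(3)*(I1 - I3), 2*I2 - I1 - I3).
    """
    i1, i2, i3 = (np.asarray(v, dtype=float) for v in (i1, i2, i3))
    return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)
```

The arctangent is evaluated once per pixel, which is the cost the paper's intensity-ratio replacement is meant to avoid.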

Journal Article
01 Sep 2005
TL;DR: This installment of a continuing series on the fast Fourier transform (FFT) deals with some aspects of the spectrum estimation problem.
Abstract: Each article in this continuing series on the fast Fourier transform (FFT) is designed to illuminate new features of the wide-ranging applicability of this transform. This segment deals with some aspects of the spectrum estimation problem.

Patent
08 Aug 2005
TL;DR: In this patent, a system and method for performing a Fast Fourier Transform (FFT) in a multi-mode wireless processing system are presented. The method can include loading an input vector into an input buffer, initializing a second counter and a variable N, where N = log2(input vector size) and s is the value of the second counter, performing an FFT stage, and comparing s to N and performing additional FFT stages until s = N.
Abstract: A system and method for performing a Fast Fourier Transform (FFT) in a multi-mode wireless processing system. The method can include loading an input vector into an input buffer, initializing a second counter and a variable N, where N = log2(input vector size) and s is the value of the second counter, performing an FFT stage, and comparing s to N and performing additional FFT stages until s = N. Performing the FFT stage can include performing vector operations on data in the input buffer and sending results to an output buffer, the data in the input buffer comprising a plurality of segments; advancing the value of the second counter; and switching roles of the input and output buffers. The vector operations can include performing Radix-4 FFT vector operations on four input data at a time and multiplying the resulting output vectors with a twiddle factor.
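The stage loop with two buffers that exchange roles resembles a Stockham-style autosort FFT. The sketch below illustrates that control flow under a simplifying assumption: it uses radix-2 butterflies rather than the Radix-4 vector operations named in the abstract, and the names `fft_pingpong`, `num_stages`, and `stage` are illustrative rather than taken from the patent.

```python
import numpy as np

def fft_pingpong(vec):
    """Radix-2 Stockham FFT using two buffers that swap roles after each stage."""
    src = np.array(vec, dtype=complex)       # plays the role of the input buffer
    size = src.size
    if size == 0 or size & (size - 1):
        raise ValueError("length must be a power of two")
    num_stages = size.bit_length() - 1       # log2(input vector size)
    dst = np.empty_like(src)                 # plays the role of the output buffer
    n, stride = size, 1
    stage = 0
    while stage < num_stages:
        half = n // 2
        for p in range(half):
            w = np.exp(-2j * np.pi * p / n)  # twiddle factor for this butterfly group
            for q in range(stride):
                a = src[q + stride * p]
                b = src[q + stride * (p + half)]
                dst[q + stride * (2 * p)] = a + b
                dst[q + stride * (2 * p + 1)] = (a - b) * w
        src, dst = dst, src                  # switch roles of the two buffers
        n, stride = half, 2 * stride
        stage += 1                           # advance the stage counter
    return src
```

Because each stage writes its results in already-sorted positions of the other buffer, no separate bit-reversal pass is needed, which suits vector hardware that streams data between two memories.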

Proceedings Article
18 Mar 2005
TL;DR: It can be shown that all the possible split-radix FFT algorithms of the type radix-$2^r/2^{rs}$ for computing a $2^m$-point DFT require exactly the same number of arithmetic operations.
Abstract: A radix-2/16 decimation-in-frequency (DIF) fast Fourier transform (FFT) algorithm and its higher radix version, namely the radix-4/16 DIF FFT algorithm, are proposed by suitably mixing the radix-2, radix-4 and radix-16 index maps and combining some of the twiddle factors. It is shown that the proposed algorithms and the existing radix-2/4 and radix-2/8 FFT algorithms require exactly the same number of arithmetic operations (multiplications + additions). Moreover, by using techniques similar to these, it can be shown that all the possible split-radix FFT algorithms of the type radix-$2^r/2^{rs}$ for computing a $2^m$-point DFT require exactly the same number of arithmetic operations.

Proceedings Article
01 May 2005
TL;DR: This paper proposes a novel FFT-based finite field multiplier; the fast Fourier transform performs polynomial multiplication in $O(n \log n)$ time, compared to $O(n^2)$ for the classical method.
Abstract: Finite field multiplication is one of the most useful arithmetic operations and has applications in many areas such as signal processing, coding theory and cryptography. However, it is also one of the most time-consuming operations in both software and hardware, which makes it pertinent to develop a fast and efficient implementation. In this paper, we propose a novel FFT-based finite field multiplier to address this problem. The fast Fourier transform (FFT) is the collection of computationally efficient algorithms that perform the discrete Fourier transform (DFT). For our purposes, we will use its efficient computation for polynomial multiplication. The FFT performs polynomial multiplication in $O(n \log n)$ time compared to the classical method time of $O(n^2)$. The idea of using the FFT for finite field multiplication has been researched extensively, but to our knowledge, this is the first implementation in hardware.
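The polynomial multiplication step can be sketched in software with a floating-point FFT. This is a minimal illustration assuming small integer coefficients (so that rounding recovers exact values); it is not the paper's hardware multiplier, and the reduction of the product modulo the field's characteristic and irreducible polynomial is deliberately left out. The function name `poly_multiply` is hypothetical.

```python
import numpy as np

def poly_multiply(a, b):
    """Multiply integer-coefficient polynomials (lowest degree first) via the FFT."""
    out_len = len(a) + len(b) - 1
    # Pad to a power of two at least as long as the linear convolution.
    size = 1 << (out_len - 1).bit_length()
    fa = np.fft.rfft(a, size)
    fb = np.fft.rfft(b, size)
    # Pointwise product in the frequency domain equals convolution of coefficients.
    product = np.fft.irfft(fa * fb, size)[:out_len]
    return np.rint(product).astype(np.int64)

# (1 + 2x + 3x^2) * (4 + 5x)
print(poly_multiply([1, 2, 3], [4, 5]))
```

The three transforms cost O(n log n) each, while the pointwise product is O(n), which is where the advantage over the O(n^2) schoolbook method comes from.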

Proceedings Article
19 Sep 2005
TL;DR: A new fast algorithm using multilevel Taylor interpolation and the FFT (TI-FFT) has been developed to solve the near-field (NF) propagation problem for the planar scenario.
Abstract: A new fast algorithm using multilevel Taylor interpolation and the FFT (TI-FFT) has been developed to solve the near-field (NF) propagation problem for the planar scenario. The algorithm speeds the computation by grouping neighborhood regions in the spatial domain or the spectral domain through the Taylor interpolation (TI) method using the FFT technique. The CPU time increases as $O(N^2 \log_2 N^2)$ instead of the polynomial time $O(N^4)$ required for the Stratton-Chu formula for N × N observation points. The multilevel TI-FFT uses a sampling rate above the Nyquist rate as required by the FFT, while the Stratton-Chu formula requires a higher sampling rate because of the fast variation of the phase term. An accuracy of -50 dB for the multilevel TI-FFT algorithm is easily obtained and an accuracy of -70 dB is possible when the algorithm is optimized. The algorithm works particularly well for band-limited beam-like fields and "quasi-planar" surfaces.

Proceedings Article
18 Mar 2005
TL;DR: A fast iterative algorithm, with computation based on the fast Fourier transform (FFT), is presented, which achieves better performance than traditional FFT-based deconvolution methods with an equal number of coefficients in the inverse filters.
Abstract: A fast iterative algorithm, with computation based on the fast Fourier transform (FFT), is presented. It can be used to control a sound field at several control points with a loudspeaker array from multiple reference signals. It designs an equalizer able to invert long FIR filters and which achieves better performance than traditional FFT-based deconvolution methods with an equal number of coefficients in the inverse filters.

Proceedings Article
19 Dec 2005
TL;DR: This paper proposes an alternative way of padding zeros to the data sequence that reduces the computational cost to $O(pN \log_2 N)$ and can be used to achieve non-uniform upsampling that would zoom in or zoom out on a particular frequency band.
Abstract: The classical Cooley-Tukey fast Fourier transform (FFT) algorithm has a computational cost of $O(N \log_2 N)$, where N is the length of the discrete signal. Spectrum resolution is improved by padding zeros at the tail of the discrete signal. If (p − 1)N zeros are padded (where p is an integer) at the tail of the data sequence, the computational cost through the FFT becomes $O(pN \log_2 pN)$. This paper proposes an alternative way of padding zeros to the data sequence that reduces the computational cost to $O(pN \log_2 N)$. It has been noted that this modification can be used to achieve non-uniform upsampling that would zoom in or zoom out on a particular frequency band. In addition, it may be used for pruning the spectrum, which would reduce the resolution of an unimportant frequency band.
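One standard identity that attains the stated $O(pN \log_2 N)$ cost splits the zero-padded spectrum into p interleaved length-N FFTs of modulated copies of the signal. Whether this is the paper's exact construction is not stated in the abstract, so the sketch below should be read as an illustration of the cost reduction only; the function name `zero_padded_spectrum` is hypothetical.

```python
import numpy as np

def zero_padded_spectrum(x, p):
    """Spectrum of x padded with (p - 1) * N zeros, computed with p length-N FFTs.

    Writing the output index as m = p*k + r gives
      X[p*k + r] = FFT_N( x[n] * exp(-2j*pi*r*n / (p*N)) )[k].
    """
    x = np.asarray(x, dtype=complex)
    n = x.size
    idx = np.arange(n)
    out = np.empty(p * n, dtype=complex)
    for r in range(p):
        modulated = x * np.exp(-2j * np.pi * r * idx / (p * n))
        out[r::p] = np.fft.fft(modulated)
    return out
```

Each value of `r` fills one interleaved subset of output bins, so computing only some values of `r` is a simple way to skip parts of the dense grid; confining the refinement to one frequency band, as the abstract describes, would need a construction beyond this sketch.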

Journal Article
TL;DR: It is demonstrated that the two-dimensional fast Fourier transform (FFT) is a useful algorithm due to its hierarchical structure and ability to determine the relative magnitudes of different spatial wavelengths in a material.
Abstract: This work is part of an effort to structurally integrate self-sensing functionality into smart composite materials using embedded microsensors and local network communication nodes. Here we address the issue of data management through the development of localized processing algorithms. We demonstrate that the two-dimensional fast Fourier transform (FFT) is a useful algorithm due to its hierarchical structure and ability to determine the relative magnitudes of different spatial wavelengths in a material. This may be applied, for example, to determine the global components of a strain field or temperature distribution. We develop two methods for implementing the distributed 2D FFT based on the radix-2 (row-column) and radix-2 × 2 (vector-radix) structures, and compare them in terms of computational requirements within a low power, low bandwidth network of microprocessors. Our results show that the vector-radix algorithm requires 50% fewer multiplications than the row-column algorithm when performed in a distributed manner. Since the most important information of the 2D FFT can often be found in the lowest frequency components, we develop pruning methods for the distributed row-column and vector-radix algorithms that reduce internode communication requirements by 50% in both cases. We conclude that the pruned version of the distributed vector-radix 2D FFT is the most efficient of the methods investigated for rapid signal identification in smart composite materials.
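The row-column structure mentioned in the abstract can be sketched in a few lines: transform every row with a 1D FFT, then every column. This centralized NumPy version only shows the decomposition and says nothing about the distributed, pruned, or vector-radix variants studied in the paper; the function name `fft2_row_column` is illustrative.

```python
import numpy as np

def fft2_row_column(field):
    """2D FFT computed as 1D FFTs over rows followed by 1D FFTs over columns."""
    field = np.asarray(field, dtype=complex)
    after_rows = np.fft.fft(field, axis=1)   # one 1D FFT per row
    return np.fft.fft(after_rows, axis=0)    # one 1D FFT per column
```

In a sensor network, the row pass and the column pass would each require data exchange along one grid direction, which is why communication cost is the quantity the paper compares.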

Proceedings Article
27 May 2005
TL;DR: The proposed algorithm reduces two-dimensional searches, widely used in the time-frequency based method, FrFT and chirp Fourier transform, into two one- dimensional searches and is a computationally fast alternative for LFM signal detection and parameter estimation.
Abstract: A fast method for parameter estimation of the multi-component linear frequency modulated (multi-LFM) signal is proposed. The signal detector and chirp rate estimator are based on the modulus square of the fractional autocorrelation. By using the estimated chirp rate, an approach based on the fractional Fourier transform (FrFT) is employed for estimation of the amplitude and centre frequency. The proposed algorithm reduces two-dimensional searches, widely used in the time-frequency based method, FrFT and chirp Fourier transform, into two one-dimensional searches. By utilizing the discrete FrFT, along with the fast Fourier transform (FFT) algorithm, the proposed method is a computationally fast alternative for LFM signal detection and parameter estimation. Analysis of the multi-LFM signal is performed using the CLEAN technique as well. Finally, computer simulations are provided to illustrate the performance of the proposed algorithm.

Proceedings Article
20 Mar 2005
TL;DR: The adaptive matrix-transpose algorithm is efficient since it minimizes the overhead associated with transposing matrices by adaptively choosing the suitable radix based on data size, number of processors, start-up time, and the effective bandwidth.
Abstract: Computing the fast Fourier transform (FFT) on parallel computers carries a common communication requirement: matrices must be transposed one or more times. In this paper, we propose an efficient algorithm (the adaptive matrix-transpose algorithm) for transposing matrices, which is based on all-to-all communication. The adaptive matrix-transpose algorithm is efficient since it minimizes the overhead associated with transposing matrices by adaptively choosing the suitable radix based on data size, number of processors, start-up time, and the effective bandwidth. We study the effect of the adaptive matrix-transpose algorithm on the 6-step 1-D FFT using symmetric multiprocessors (SMP).
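To show where the transposes arise, the usual six-step formulation of a 1-D FFT of length N = n1 * n2 can be sketched as follows. This is a serial NumPy illustration of the textbook decomposition, not the paper's adaptive transpose algorithm; the factor names `n1` and `n2` and the function name `fft_six_step` are assumptions made for the example.

```python
import numpy as np

def fft_six_step(x, n1, n2):
    """1-D FFT of length n1*n2 via transpose, FFTs, twiddles, transpose, FFTs, transpose."""
    x = np.asarray(x, dtype=complex)
    if x.size != n1 * n2:
        raise ValueError("len(x) must equal n1 * n2")
    m = x.reshape(n2, n1).T                      # step 1: transpose to shape (n1, n2)
    m = np.fft.fft(m, axis=1)                    # step 2: n1 FFTs of length n2
    a = np.arange(n1)[:, None]
    d = np.arange(n2)[None, :]
    m = m * np.exp(-2j * np.pi * a * d / (n1 * n2))  # step 3: twiddle factors
    m = m.T                                      # step 4: transpose to shape (n2, n1)
    m = np.fft.fft(m, axis=1)                    # step 5: n2 FFTs of length n1
    return m.T.reshape(-1)                       # step 6: transpose and flatten
```

On a parallel machine each of the three transposes becomes an all-to-all exchange, so their cost can dominate the short local FFTs, which is the overhead the abstract targets.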

Journal Article
TL;DR: Developed is a new generic pruning algorithm that uses only single bit combinational operations for delineating paths in the signal flow graph, as against butterflies, which are relevant for computation of the desired frequency coefficients.
Abstract: Several fast Fourier transform (FFT) pruning algorithms have been proposed for applications where the entire spectrum of frequencies is not of relevance and even the inputs are sparse. However, these are architecturally inefficient because of the complexity of the overhead operations involved. Developed is a new generic pruning algorithm that uses only single bit combinational operations for delineating paths in the signal flow graph, as against butterflies, which are relevant for computation of the desired frequency coefficients. The entire delineation process has been divided into stages making the proposed algorithm also amenable for pipelined hardware implementation.

Journal ArticleDOI
TL;DR: Based on this mapping algorithm, several 18-bit word-length 1024-point FFT processors implemented with TSMC 0.18 μm CMOS technology are given to demonstrate its scalability and high performance.
Abstract: Many parallel Fast Fourier Transform (FFT) algorithms adopt a multiple-stage architecture to increase performance. However, data permutation between stages consumes a large volume of memory and processing time. An FFT array processing mapping algorithm is proposed in this paper to overcome this drawback. In this algorithm, an arbitrary number $2^k$ of butterfly units (BUs) can be scheduled to work in parallel on $n = 2^s$ data points ($k = 0, 1, \ldots, s - 1$). Because no inter-stage data transfer is required, memory consumption and system latency are both greatly reduced. Moreover, as the number of BUs increases, not only does throughput increase linearly, system latency also decreases linearly. This array-processing-oriented architecture provides a flexible tradeoff between hardware cost and system performance. In theory, the system latency is $(s \times 2^{s-k}) \times t_{clk}$ and the throughput is $n / (s \times 2^{s-k} \times t_{clk})$, where $t_{clk}$ is the system clock period. Based on this mapping algorithm, several 18-bit word-length 1024-point FFT processors implemented with TSMC 0.18 μm CMOS technology are given to demonstrate its scalability and high performance. The core area of the 4-BU design is 2.991 × 1.121 mm² and the clock frequency is 326 MHz in typical conditions (1.8 V, 25°C). This processor completes a 1024-point FFT calculation in 7.839 μs.
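
As a quick check of the latency and throughput expressions in this abstract, the sketch below evaluates them for the 4-BU, 1024-point design. It assumes the expressions read as s · 2^(s−k) · t_clk for latency and n divided by that latency for throughput; the function names are illustrative and do not come from the paper.

```python
def fft_array_cycles(s, k):
    """Clock cycles for an n = 2**s point FFT on 2**k butterfly units: s * 2**(s - k)."""
    if not 0 <= k <= s - 1:
        raise ValueError("k must satisfy 0 <= k <= s - 1")
    return s * 2 ** (s - k)

def fft_array_latency(s, k, t_clk):
    """System latency in seconds for clock period t_clk (seconds)."""
    return fft_array_cycles(s, k) * t_clk

def fft_array_throughput(s, k, t_clk):
    """Throughput in data points per second: n / latency."""
    return 2 ** s / fft_array_latency(s, k, t_clk)

# 1024-point FFT (s = 10) on 4 BUs (k = 2) at the quoted 326 MHz clock.
s, k = 10, 2
t_clk = 1.0 / 326e6
cycles = fft_array_cycles(s, k)                    # 10 * 2**8 = 2560 cycles
latency_us = fft_array_latency(s, k, t_clk) * 1e6  # about 7.85 microseconds
```

With the nominal 326 MHz figure the formula gives roughly 7.85 μs, close to the 7.839 μs reported; the small gap is consistent with the quoted clock frequency being rounded. Doubling the number of BUs (k = 3) halves the cycle count to 1280, matching the linear scaling claimed in the abstract.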

Patent
09 Feb 2005
TL;DR: In this article, a matrix prefetch buffer-based fast Fourier transform processor is proposed to reduce quantization errors generated from the operation by using a matrix pre-fetch buffer.
Abstract: The present invention provides a fast Fourier transform processor, a dynamic scaling method, and a fast Fourier transform with a radix-8 algorithm. It reduces the quantization errors generated by the operation by using a fast Fourier transform processor based on a matrix prefetch buffer. Taking the operation size of the matrix prefetch buffer as the block size, the invention adjusts the signals against overflow according to the status of the signals in each block. It can systematically shunt the time of complex multiplication operations and reduce operation complexity in the butterfly units by utilizing a 3-step radix-8 fast Fourier transform algorithm and re-scheduling. Moreover, the present invention provides a fast Fourier transform processor for realizing the methods and algorithms mentioned above.

Proceedings ArticleDOI
28 Sep 2005
TL;DR: The methods analysed in this paper differ in how overflow of the results of mathematical operations (complex addition and complex multiplication) is prevented.
Abstract: This paper presents some methods for maintaining accuracy when implementing the fast Fourier transform on fixed-point DSPs, together with an analysis of their performance. The methods analysed in this paper differ in how overflow of the results of mathematical operations (complex addition and complex multiplication) is prevented. Depending on the capabilities of the specific fixed-point DSP, a suitable method can be chosen.

Patent
Kim Rounioja, Sien Ong
05 Apr 2005
TL;DR: In this article, the authors describe a method of computing a fast Fourier transform (FFT) using enhanced processor computational capabilities for more efficient and flexible implementation of an electronic device (e.g., a linear equalizer) based on that FFT computing.
Abstract: This invention describes a method of computing a fast Fourier transform (FFT) using enhanced processor computational capabilities for more efficient and flexible implementation of an electronic device (e.g., a linear equalizer) based on that FFT computing. A simple non-parallel instruction set processor (or just a non-parallel processor) containing complex multiplication and addition/subtraction capabilities is extended by adding additional registers and interconnects and a dedicated parallel instruction for calculating the FFT butterfly. The parallel instruction consists of orthogonal sub-instructions each controlling a section of the data path related to a corresponding section of the FFT butterfly.