Showing papers on "Split-radix FFT algorithm" published in 2007


Journal ArticleDOI
TL;DR: In this brief, multi-path delay commutator structures are utilized to improve the throughput rate of radix-2 and radix-4 FFT computation by a factor of 2 to 4.
Abstract: In this brief, multi-path delay commutator structures are utilized to improve the throughput rate of radix-2 and radix-4 FFT computation by a factor of 2 to 4. Latency can also be reduced by a factor of 2 to 3. Compared with previous radix-2 and radix-4 FFT structures, the proposed high-throughput FFT with doubled throughput rate requires similar or even less hardware cost. Although split-radix FFT design is more hardware efficient, the regular structure of the proposed FFT structures is attractive for high-throughput FFT design.

81 citations


Journal ArticleDOI
Hyun-Yong Lee1, In-Cheol Park1
TL;DR: The proposed algorithm is to decompose a discrete Fourier transform into two balanced sub-DFTs in order to minimize the total number of twiddle factors to be stored into tables.
Abstract: This paper presents an area-efficient algorithm for the pipelined processing of fast Fourier transform (FFT). The proposed algorithm is to decompose a discrete Fourier transform (DFT) into two balanced sub-DFTs in order to minimize the total number of twiddle factors to be stored into tables. The radix in the proposed decomposition is adaptively changed according to the remaining transform length to make the transform lengths of sub-DFTs resulting from the decomposition as close as possible. An 8192-point pipelined FFT processor designed for digital video broadcasting-terrestrial (DVB-T) systems saves 33% of general multipliers and 23% of the total size of twiddle factor tables compared to a conventional pipelined FFT processor based on the radix-$2^2$ algorithm. In addition to the decomposition, several implementation techniques are proposed to reduce area, such as a simple index generator of twiddle factor and add/subtract units combined with the two's complement operation.

79 citations
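
The "balanced sub-DFT" idea in the abstract above can be illustrated with a short sketch. It only shows how a power-of-two length might be split into two factors that are as close as possible; the function name `balanced_split` is hypothetical, and this is not the authors' pipelined architecture or their twiddle-table layout.

```python
def balanced_split(n):
    """Split a power-of-two length n into n1 * n2 with n1 and n2 as close as possible.

    Hypothetical helper for illustration only.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    log_n = n.bit_length() - 1          # n == 2 ** log_n
    log_n1 = (log_n + 1) // 2           # give the extra bit to the first factor
    return 1 << log_n1, 1 << (log_n - log_n1)


# The 8192-point transform mentioned in the abstract has log2(8192) = 13,
# so the closest two-factor split is 2**7 by 2**6.
n1, n2 = balanced_split(8192)
```

Applying the same split again to each factor gives one way to let the radix adapt to the remaining transform length, which is the behavior the abstract describes in words.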


Journal ArticleDOI
TL;DR: This tutorial simply reviews the DFT and FFT, with a few characteristic examples.
Abstract: Frequency analysis is an important issue in the IEEE. Using a computer in a calculation means moving into a non-physical, synthetic environment. Numerically, discrete or fast Fourier transformations (DFTs or FFTs) are used to obtain the frequency content of a time signal, and these are totally different than the mathematical definition of the Fourier transform. This tutorial simply reviews the DFT and FFT, with a few characteristic examples.

44 citations
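
As a small companion to the tutorial summarized above, the following sketch evaluates the DFT definition directly and through a recursive radix-2 FFT. The test signal and all function names are assumptions chosen for illustration; they are not taken from the tutorial.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points) for n in range(n_points))
        for k in range(n_points)
    ]


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n_points = len(x)
    if n_points == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    half = n_points // 2
    out = [0j] * n_points
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n_points) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


# Hypothetical test signal: exactly 3 cycles of a cosine across 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]
magnitudes = [abs(v) for v in fft_radix2(signal)]
peak_bin = max(range(N // 2 + 1), key=lambda k: magnitudes[k])
```

Because the cosine completes a whole number of cycles in the record, its energy lands in a single bin (`peak_bin`) with no leakage; a non-integer number of cycles would spread energy over neighboring bins.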


Journal ArticleDOI
TL;DR: In this paper, the polar and pseudo-polar FFT can be computed very accurately and efficiently by the well-known nonequispaced FFT, and the reconstruction of a 2D signal from its Fourier transform samples on a (pseudo)polar grid by means of the inverse nonequispaced FFT is discussed.

43 citations


Journal ArticleDOI
TL;DR: The proposed design of a new hardware efficient fast cyclic convolution algorithm for small-length DFT can save large amount of hardware cost with the same processing speed when the transform length is long and the processing speed can be flexible and balanced with the hardware cost.
Abstract: A prime N-length discrete Fourier transform (DFT) can be reformulated into a (N-1)-length complex cyclic convolution and then implemented by systolic array or distributed arithmetic. In this paper, a recently proposed hardware efficient fast cyclic convolution algorithm is combined with the symmetry properties of DFT to get a new hardware efficient fast algorithm for small-length DFT, and then WFTA is used to control the increase of the hardware cost when the transform length N is large. Compared with previously proposed low-cost DFT and FFT algorithms with computation complexity of O(logN), the new algorithm can save 30% to 50% multipliers on average and improve the average processing speed by a factor of 2, when DFT length N varies from 20 to 2040. Compared with previous prime-length DFT design, the proposed design can save large amount of hardware cost with the same processing speed when the transform length is long. Furthermore, the proposed design has much more choices for different applicable DFT transform lengths and the processing speed can be flexible and balanced with the hardware cost.

34 citations
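
The first sentence of the abstract above, that a prime-length DFT can be rewritten as an (N-1)-length cyclic convolution, is the classical Rader reformulation. The sketch below is a plain software illustration of that step under stated assumptions: the helper names are hypothetical, the convolution is evaluated directly rather than with the paper's fast hardware algorithm, and N must be an odd prime.

```python
import cmath


def primitive_root(p):
    """Smallest primitive root of an odd prime p, found by brute force."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")


def cyclic_convolution(a, b):
    """Direct cyclic convolution; a fast algorithm would replace this step."""
    m = len(a)
    return [sum(a[q] * b[(p - q) % m] for q in range(m)) for p in range(m)]


def rader_dft(x):
    """DFT of odd prime length N computed through one (N-1)-point cyclic convolution."""
    n = len(x)
    g = primitive_root(n)
    g_inv = pow(g, n - 2, n)                      # inverse of g modulo the prime n
    w = cmath.exp(-2j * cmath.pi / n)
    a = [x[pow(g, q, n)] for q in range(n - 1)]   # inputs permuted by powers of g
    b = [w ** pow(g_inv, q, n) for q in range(n - 1)]
    conv = cyclic_convolution(a, b)
    out = [0j] * n
    out[0] = sum(x)                               # the DC term is a plain sum
    for p in range(n - 1):
        out[pow(g_inv, p, n)] = x[0] + conv[p]    # outputs indexed by powers of g^-1
    return out
```

The index permutations by powers of `g` and its inverse are what turn the products `k * n` modulo N into additions of exponents, which is why the remaining work has convolution form.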


Journal ArticleDOI
TL;DR: A general class of split-radix fast Fourier transform (FFT) algorithms for computing the length-$2^m$ DFT is proposed by introducing a new recursive approach coupled with an efficient method for combining the twiddle factors and it is shown that the number of arithmetic operations required is independent of s and is $(2m-3)2^{m+1}+8$.
Abstract: In this paper, a general class of split-radix fast Fourier transform (FFT) algorithms for computing the length-$2^m$ DFT is proposed by introducing a new recursive approach coupled with an efficient method for combining the twiddle factors. This enables the development of higher split-radix FFT algorithms from lower split-radix FFT algorithms without any increase in the arithmetic complexity. Specifically, an arbitrary radix-$2/2^s$ FFT algorithm for any value of s, $4 \le s \le m$, is proposed and its arithmetic complexity analyzed. It is shown that the number of arithmetic operations (multiplications plus additions) required by the proposed radix-$2/2^s$ FFT algorithm is independent of s and is $(2m-3)2^{m+1}+8$ regardless of whether a complex multiplication is carried out using four multiplications and two additions or three multiplications and three additions. This paper thus provides a variety of choices and ways for computing the length-$2^m$ DFT with the same arithmetic complexity.

33 citations
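
As a quick check on the operation count quoted in this entry, substituting $N = 2^m$ shows that it coincides with the familiar split-radix count $4N\log_2 N - 6N + 8$. This is a worked identity added for orientation, not a derivation from the paper.

```latex
\begin{aligned}
4N\log_2 N - 6N + 8
  &= 4m\,2^{m} - 6\cdot 2^{m} + 8 && (N = 2^{m}) \\
  &= 2m\,2^{m+1} - 3\cdot 2^{m+1} + 8 \\
  &= (2m-3)\,2^{m+1} + 8 .
\end{aligned}
```

For example, $m = 4$ ($N = 16$) gives $5 \cdot 32 + 8 = 168$ on both sides.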


Patent
04 Apr 2007
TL;DR: Techniques are described for performing a Fast Fourier Transform (FFT) with an FFT engine that receives a multi-point input from main memory, stores it in registers, and computes an FFT, an Inverse FFT (IFFT), or both using a delayless pipeline.
Abstract: Techniques for performing Fast Fourier Transforms (FFT) are described. In some aspects, calculating the Fast Fourier Transform is achieved with an apparatus having a memory (610), a Fast Fourier Transform engine (FFTe) having one or more registers (650) and a delayless pipeline (630), the FFTe configured to receive a multi-point input from the main memory (610), store the received input in at least one of the one or more registers (650), and compute either or both of a Fast Fourier Transform (FFT) and an Inverse Fast Fourier Transform (IFFT) on the input using the delayless pipeline.

30 citations


Proceedings ArticleDOI
01 Nov 2007
TL;DR: In this paper, a group-harmonic weighting distribution is proposed for system-wide interharmonic evaluation in power systems, which can restore the dispersing spectral leakage energy caused by the fast Fourier transform.
Abstract: The fast Fourier transform (FFT) is still a widely-used tool for analyzing and measuring both stationary and transient signals with power system harmonics in power systems. However, the misapplications of FFT can lead to incorrect results caused by some problems such as aliasing effect, spectral leakage and picket-fence effect. A strategy of group-harmonic weighting distribution is proposed for system-wide inter-harmonic evaluation in power systems. The proposed algorithm can restore the dispersing spectral leakage energy caused by the fast Fourier transform (FFT), and calculate the power distribution proportion around the adjacent frequencies at each harmonic to determine the inter-harmonic frequency. Therefore, not only high-precision in integer harmonic measurement by the FFT can be retained, but also the inter-harmonics can be identified accurately, particularly under system frequency drift. The numerical examples are presented to verify the performance of the proposed algorithm.

27 citations


Book ChapterDOI
TL;DR: The tangent FFT is presented, a straightforward in-place cache-friendly DFT algorithm having exactly the same operation counts as Van Buskirk's algorithm, and the paper pinpoints how the tangent FFT saves time compared to the split-radix FFT.
Abstract: The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just 4n lg n-6n+8 arithmetic operations on real numbers. This operation count was first announced in 1968, stood unchallenged for more than thirty years, and was widely believed to be best possible. Recently James Van Buskirk posted software demonstrating that the split-radix FFT is not optimal. Van Buskirk's software computes a size-n complex DFT using only (34/9 + o(1))n lg n arithmetic operations on real numbers. There are now three papers attempting to explain the improvement from 4 to 34/9: Johnson and Frigo, IEEE Transactions on Signal Processing, 2007; Lundy and Van Buskirk, Computing, 2007; and this paper. This paper presents the "tangent FFT," a straightforward in-place cache-friendly DFT algorithm having exactly the same operation counts as Van Buskirk's algorithm. This paper expresses the tangent FFT as a sequence of standard polynomial operations, and pinpoints how the tangent FFT saves time compared to the split-radix FFT. This description is helpful not only for understanding and analyzing Van Buskirk's improvement but also for minimizing the memory-access costs of the FFT.

23 citations
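
For readers who want to see the split-radix recursion that the abstract above takes as its baseline, here is a minimal recursive sketch. It splits the input into the even-indexed samples (one half-length DFT) and the samples at indices 4n+1 and 4n+3 (two quarter-length DFTs). It is written for clarity, does not skip trivial twiddle multiplications, and so does not itself attain the 4n lg n-6n+8 count; it is also not the tangent FFT. The function names are assumptions.

```python
import cmath


def split_radix_fft(x):
    """Recursive split-radix DIT FFT sketch; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    u = split_radix_fft(x[0::2])    # length n/2: even-indexed samples
    z = split_radix_fft(x[1::4])    # length n/4: samples 4n+1
    zp = split_radix_fft(x[3::4])   # length n/4: samples 4n+3
    quarter = n // 4
    out = [0j] * n
    for k in range(quarter):
        w1 = cmath.exp(-2j * cmath.pi * k / n)       # W^k
        w3 = cmath.exp(-2j * cmath.pi * 3 * k / n)   # W^(3k)
        s = w1 * z[k] + w3 * zp[k]
        d = w1 * z[k] - w3 * zp[k]
        out[k] = u[k] + s
        out[k + 2 * quarter] = u[k] - s
        out[k + quarter] = u[k + quarter] - 1j * d
        out[k + 3 * quarter] = u[k + quarter] + 1j * d
    return out


def split_radix_flop_count(n):
    """Real-operation count 4n lg n - 6n + 8 quoted in the abstract (n a power of two)."""
    lg_n = n.bit_length() - 1
    return 4 * n * lg_n - 6 * n + 8
```

The L-shaped butterfly in the loop combines one output pair from the half-length transform with two twiddled quarter-length outputs, which is where split radix saves multiplications relative to plain radix-2.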


Journal ArticleDOI
TL;DR: This work gives a relatively short survey of the FFT for arbitrary finite abelian groups, cyclic or not, with complete and partially novel proofs, the main distinction being explicit induction formulas for the FFT in all cases which generalize the original FFT-algorithm.
Abstract: Fast Fourier transforms (FFTs) are fast algorithms, i.e., of low complexity, for the computation of the discrete Fourier transform (DFT) on a finite abelian group. They are among the most important algorithms in applied and engineering mathematics and in computer science, in particular for one- and multidimensional systems theory and signal processing. We give a relatively short survey of the FFT for arbitrary finite abelian groups, cyclic or not, with complete and partially novel proofs, the main distinction being explicit induction formulas for the FFT in all cases which generalize the original FFT-algorithm due to Cooley and Tukey and, much earlier, to Gauss. We believe that our approach has didactic advantages over the usual ones. We also present the application of the FFT to fast convolution algorithms, and the so-called number theoretic transforms over finite coefficient rings. We do not treat those algorithms which decrease the multiplicative complexity at the expense of many more rational linear combinations, which in this context are considered costless, nor do we discuss the DFT for nonabelian finite groups.

21 citations


Journal ArticleDOI
TL;DR: A new approach to approximate the SAD metric by cosine series which can be expressed in correlation terms is used which is suitable for software implementations and has a deterministic execution time unlike the existing fast algorithms for SAD matching.
Abstract: Fast Fourier transforms (FFTs), which are O(N log N) algorithms to compute a discrete Fourier transform (DFT) of size N, have been called one of the ten most important algorithms of the twentieth century. However, even though many algorithms have been developed to speed up the computation of the sum of absolute differences (SAD) matching, they are exclusively designed in the spatial domain. In this paper, we propose a fast frequency algorithm to speed up the process of SAD matching. We use a new approach to approximate the SAD metric by cosine series which can be expressed in correlation terms. These latter can be computed using FFT algorithms. Experimental results demonstrate the effectiveness of our method when using only the first correlation terms for block and template matching in terms of accuracy and speed. The proposed algorithm is suitable for software implementations and has a deterministic execution time unlike the existing fast algorithms for SAD matching.

Proceedings ArticleDOI
01 May 2007
TL;DR: A look-up table (LUT) methodology is developed and demonstrated on variable length (128-1024 point), variable bit-precision (6-12 b) FFT with uniform bit truncation and optimum bit truncations for wideband digital receiver in radar applications.
Abstract: A practical fast Fourier transform (FFT) processor can contain several millions of gates, so effective design techniques usually are required in order to guarantee high-speed products. A look-up table (LUT) methodology is developed and demonstrated on variable length (128-1024 point), variable bit-precision (6-12 b) FFT with uniform bit truncation and optimum bit truncation for wideband digital receiver in radar applications. The FFT processors are designed using a standard 130 nanometer CMOS process and operate down to 120 mV. The required processing time for the non-configurable 12-b 1024-point LUT FFT is 15.78 ns at a clock frequency of 470 MHz. The required time for configurable LUT 12-b 1024-point FFT processing is 61 ns. The configurable LUT FFT processors with short transform lengths are expandable, so they can easily be used to form new FFT processors with longer transform lengths. The performance comparison of conventional FFT, LUT FFT, and configurable LUT FFT for digital wideband receiver application is discussed.

Patent
25 Apr 2007
TL;DR: A method of reducing noise in a speech signal is described in which the signal is converted to the frequency domain using a fast Fourier transform (FFT), gains are determined for selected spectral subbands and interpolated to the number of FFT points, and the filtered signal is returned to the time domain by an inverse FFT.
Abstract: A method of reducing noise in a speech signal involves converting the speech signal to the frequency domain using a fast Fourier transform (FFT), creating a subset of selected spectral subbands, determining the appropriate gain for each subband, and interpolating the gains to match the number of FFT points. The converted speech signal is then filtered using the interpolated gains as filter coefficients, and an inverse FFT performed on the processed signal to recover the time domain output signal.

Journal ArticleDOI
TL;DR: An alternative way of refining phases with the origin-free modulus sum function S is shown: instead of applying the tangent formula in sequential mode, it is applied in parallel mode with the help of the fast Fourier transform (FFT) algorithm.
Abstract: An alternative way of refining phases with the origin-free modulus sum function S is shown that, instead of applying the tangent formula in sequential mode [Rius (1993). Acta Cryst. A49, 406-409], applies it in parallel mode with the help of the fast Fourier transform (FFT) algorithm. The test calculations performed on intensity data of small crystal structures at atomic resolution prove the convergence and hence the viability of the procedure. This new procedure called S-FFT is valid for all space groups and especially competitive for low-symmetry ones. It works well when the charge-density peaks in the crystal structure have the same sign, i.e. either positive or negative.

Proceedings ArticleDOI
01 Dec 2007
TL;DR: An efficient algorithm using parallel and pipelining methods is proposed to implement a high-speed, high-resolution FFT on FPGA.
Abstract: Using fast Fourier transform (FFT) is indispensable in most signal processing applications. Designing an appropriate algorithm for the implementation of FFT can be efficacious in digital signal processing. Sophisticated techniques such as pipelining and parallel calculations have potential impacts on VLSI implementation of FFT algorithm. Furthermore, a mathematic approach such as floating point calculation achieves higher precision. In this paper, an efficient algorithm using parallel and pipelining methods is proposed to implement high speed and high resolution FFT algorithm. Latency reduction is an important issue to implement the high speed FFT on FPGA. The proposed FFT algorithm shows a latency of 5131 clock pulses when N refers to 1024 points. The design has the mean squared error (MSE) of 0.0001 which is preferable to Radix 2 FFT.

Journal ArticleDOI
TL;DR: In this paper, a low complexity pipeline FFT processor for MIMO-OFDM systems with four transmitting and four receiving (4 × 4) antennas is proposed; it is based on a multi-channel structure that supports multiple data streams efficiently.
Abstract: In this paper, we propose a low complexity pipeline FFT processor for MIMO-OFDM systems with four transmitting and four receiving (4 × 4) antennas. The proposed FFT processor is based on a multi-channel structure that supports multiple data streams efficiently. With a mixed-radix algorithm, the number of non-trivial multiplications of the proposed FFT processor is decreased. Implementation results show that the proposed FFT processor reduces the required number of logic gates by 25% over the conventional 4-channel R4MDC FFT processor which has been considered to be the most area-efficient FFT processor for 4 × 4 MIMO-OFDM systems.

Journal ArticleDOI
TL;DR: This paper proposes to combine Kazhdan's FFT-based approach to surface reconstruction from oriented points with adaptive subdivision and partition of unity blending techniques to achieve a higher reconstruction accuracy and include a more robust surface restoration in regions where the surface folds back to itself.
Abstract: In this paper, we propose to combine Kazhdan's FFT-based approach to surface reconstruction from oriented points with adaptive subdivision and partition of unity blending techniques. This removes the main drawback of the FFT-based approach which is a high memory consumption for geometrically complex datasets. This allows us to achieve a higher reconstruction accuracy compared with the original global approach. Furthermore, our reconstruction process is guided by a global error control accomplished by computing the Hausdorff distance of selected input samples to intermediate reconstructions. The advantages of our surface reconstruction method also include a more robust surface restoration in regions where the surface folds back to itself.

Proceedings ArticleDOI
15 Apr 2007
TL;DR: This work rigorously derives a novel variant of the general-radix Cooley-Tukey FFT that is structured to map efficiently for any vector length v and radix and includes the new FFT into the program generator Spiral to generate actual C implementations.
Abstract: SIMD (single instruction multiple data) vector instructions, such as Intel's SSE family, are available on most architectures, but are difficult to exploit for speed-up. In many cases, such as the fast Fourier transform (FFT), signal processing algorithms have to undergo major transformations to map efficiently. Using the Kronecker product formalism, we rigorously derive a novel variant of the general-radix Cooley-Tukey FFT that is structured to map efficiently for any vector length v and radix. Then, we include the new FFT into the program generator Spiral to generate actual C implementations. Benchmarks on Intel's SSE show that the new algorithms perform better on practically all sizes than the best available libraries, Intel's MKL and FFTW.

Journal ArticleDOI
TL;DR: The grouped scheme, which applies specifically to computing the pruning fast Fourier transform (pruning FFT) with power-of-two partial transformation length and uses the radix-2 FFT scheme, can be implemented with shared hardware and regular structures.

Patent
11 Sep 2007
TL;DR: In this paper, a variable length fast Fourier transform (FFT) system and a method for performing the FFT system in a global navigation satellite system (GNSS) signal acquisition and tracking, which includes a memory and a number of processing elements are disclosed.
Abstract: A variable length fast Fourier transform (FFT) system and a method for performing the FFT system in a global navigation satellite system (GNSS) signal acquisition and tracking, which includes a memory and a number of processing elements are disclosed. Based on the GNSS signal tracking, the variable length FFT system performs a first FFT operation together with a first data length. Based on the GNSS signal acquisition, the variable length FFT system is divided into several FFT subsystems to simultaneously perform different operations with various data lengths different from the first data length. Thus, the variable length FFT system can enhance the hardware utility and increase throughputs.

Journal ArticleDOI
TL;DR: The novel aspects of the specific FFT method described include: a bit-wise reversal re-grouping operation of the conventional FFT is replaced by the use of lossless image rotation and scaling and the usual arithmetic operations of complex multiplication are replaced with integer addition.
Abstract: The Fourier transform is one of the most important transformations in image processing. A major component of this influence comes from the ability to implement it efficiently on a digital computer. This paper describes a new methodology to perform a fast Fourier transform (FFT). This methodology emerges from considerations of the natural physical constraints imposed by image capture devices (camera/eye). The novel aspects of the specific FFT method described include: 1) a bit-wise reversal re-grouping operation of the conventional FFT is replaced by the use of lossless image rotation and scaling and 2) the usual arithmetic operations of complex multiplication are replaced with integer addition. The significance of the FFT presented in this paper is introduced by extending a discrete and finite image algebra, named Spiral Honeycomb Image Algebra (SHIA), to a continuous version, named SHIAC

Journal Article
Sun Jing1
TL;DR: In this paper, the authors proposed a power quality analysis method based on Mallat algorithm and fast Fourier transform (FFT) to distinguish steady state disturbance from non-steady state disturbance.
Abstract: Based on Mallat algorithm and fast Fourier transform (FFT), the authors propose a power quality analysis method. In this method, the wavelet denoising is applied to sampled signals; according to the detection results of catastrophe point of signals, the high frequency coefficients of the first level and the second level obtained by Mallat decomposition algorithm are taken as the criteria to distinguish steady state disturbance from non-steady state disturbance, and then the duration of disturbance can be solved. In the light of frequency band division principle of multi-resolution analysis, by use of Mallat reconstruction algorithm the transient disturbance waveform is extracted, moreover an identification subroutine that can accurately distinguish short-term variation disturbances such as voltage sag, voltage swell and interruption is programmed. For steady state disturbance, the authors point out that FFT can be used as a tool to distinguish harmonics from flicker. The effectiveness and accuracy of the proposed method is validated by Matlab-based simulation results.

Journal ArticleDOI
TL;DR: The present method can be implemented in the microprocessor and VLSI environment using a commercial FFT chip and yields energy preserving and shift invariant decimated analytic wavelet coefficients, which are free of aliasing effects.
Abstract: This letter introduces an analytic wavelet transform based on linear phase quadrature mirror filters (QMFs). The computation of the analytic signal and the reconstruction of the signal are carried out by the fast Fourier transform (FFT)-based algorithm. The transform yields energy preserving and shift invariant decimated analytic wavelet coefficients, which are free of aliasing effects. The present method can be implemented in the microprocessor and VLSI environment using a commercial FFT chip.

Proceedings ArticleDOI
04 Dec 2007
TL;DR: A hardware interpretation is introduced to design a highly parallel and parameterized architecture of the cyclotomic FFT; based on four stages and a modular last stage, it reaches a very high throughput rate which, for a 256-point FFT, can be as high as 8.5 fc.
Abstract: The hardware design and implementation of cyclotomic Fast Fourier Transform (FFT) over finite fields GF($2^m$) is described. By reformulating the algorithm presented in [8], we introduce a hardware interpretation to design a highly parallel and parameterized architecture of the cyclotomic FFT. Based on four stages and modular structure of last stage, this architecture can operate at different throughput rates. Compared to another implemented algorithm [9] which operates at fc (the system clock frequency), the proposed architecture reaches a very high throughput rate which, for a 256-point FFT, can be as high as 8.5 fc. An FPGA implementation of the proposed architecture is given where the critical path delay and the hardware complexity are evaluated.

Patent
22 Feb 2007
TL;DR: In one embodiment, the present invention relates to an interleaved method for computing a Fast Fourier Transform (FFT), and in another embodiment, it relates to a method for parallel filter via Fast Fourier Transform (FFT).
Abstract: The present invention generally relates to a method for computing a Fast Fourier Transform (FFT). In one embodiment, the present invention relates to an interleaved method for computing a Fast Fourier Transform (FFT). In another embodiment, the present invention relates to a method for parallel filter via Fast Fourier Transform (FFT).

Proceedings ArticleDOI
05 Nov 2007
TL;DR: This paper proposes a novel signal processing front end for fast Fourier transform (FFT)-based spectrum analyzers and shows how employing polynomial-based filtering structures leads to more efficient solutions than were previously available.
Abstract: In this paper we propose a novel signal processing front end for fast Fourier transform (FFT)-based spectrum analyzers. Recent developments in polynomial-based filtering have yielded efficient structures for sample rate reduction by arbitrary factors. This is a key operation in FFT-based spectral analysis when arbitrary span capability is required. It is shown how employing these structures leads to more efficient solutions than were previously available.

Journal ArticleDOI
TL;DR: A fast Fourier transform algorithm, which removes two steps of twiddle factor multiplications from the conventional five-step FFT algorithm, and reduces its memory requirement by O(n) operations.

Journal ArticleDOI
TL;DR: The principle of zoom FFT technique based on complex modulation, its application to development of SLF/ELF receiver and how to obtain high resolution spectrum using the new technique are introduced in detail and also the theoretical and test results are presented.
Abstract: The discrete fast Fourier transform (FFT) has been widely applied to signal spectral analysis and can compute the spectrum over the entire bandwidth of a signal. However, some applications require the fine structure of a high-resolution spectrum within a narrow bandwidth. If a regular FFT is used to obtain this high-resolution spectrum, the amount of data grows and computation and storage increase sharply. The regular FFT is therefore inefficient for this purpose, and a new method is needed. In this paper, the principle of the zoom FFT technique based on complex modulation, its application to the development of an SLF/ELF receiver, and the way to obtain a high-resolution spectrum with the new technique are introduced in detail, and theoretical and test results are presented.
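
The complex-modulation zoom technique named in this abstract is commonly described as three steps: shift the band of interest to DC by multiplying with a complex exponential, low-pass filter and decimate, then transform the much shorter sequence. The sketch below follows those steps under simplifying assumptions: the low-pass filter is a crude block average, the short transform is a direct DFT, and the sample rate, tone, and function names are hypothetical. It is not the receiver design from the paper.

```python
import cmath
import math


def dft(x):
    """Direct DFT, adequate for the short decimated sequence."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def zoom_spectrum(samples, fs, f_center, decimation):
    """Return (frequencies, magnitudes) for a narrow band around f_center."""
    # 1) Complex modulation: move f_center down to 0 Hz.
    mixed = [s * cmath.exp(-2j * cmath.pi * f_center * n / fs)
             for n, s in enumerate(samples)]
    # 2) Crude low-pass filter plus decimation: average each block of samples.
    blocks = len(mixed) // decimation
    reduced = [sum(mixed[b * decimation:(b + 1) * decimation]) / decimation
               for b in range(blocks)]
    # 3) Short transform of the decimated sequence.
    spectrum = dft(reduced)
    bin_width = fs / decimation / blocks
    freqs = [f_center + (k if k < blocks / 2 else k - blocks) * bin_width
             for k in range(blocks)]
    return freqs, [abs(v) for v in spectrum]


# Hypothetical example: a 201 Hz tone sampled at 1024 Hz for 4 seconds,
# examined in a band centered on 200 Hz with decimation by 16.
fs = 1024.0
tone = [math.cos(2 * math.pi * 201.0 * n / fs) for n in range(4096)]
freqs, mags = zoom_spectrum(tone, fs, 200.0, 16)
peak_freq = freqs[max(range(len(mags)), key=mags.__getitem__)]
```

In this example the transform is 256 points instead of 4096, while the bin spacing stays at 0.25 Hz because it is set by the 4-second record length. That is the saving in computation and storage the abstract refers to; a practical design would replace the block average with a proper decimation filter.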

Journal ArticleDOI
TL;DR: A decimation-in-time fast algorithm is presented to significantly reduce the computational complexity of the polynomial time frequency transform (PTFT) compared with that by only using 1D FFT.

ReportDOI
29 Jun 2007
TL;DR: In this article, a general formulation and an associated design discipline for the two lattices are developed, which involves jointly determining element spacing, steering range and beam-layout geometry, grating-lobe behavior, and FFT factorability and therefore computational efficiency.
Abstract: It is well known that when the identical elements of a planar receive array are laid out in horizontal rows and vertical columns, a fast Fourier transform or FFT can be used to efficiently realize simultaneous beams laid out in rows and columns in the direction cosines associated with the azimuth and elevation directions. Here a more general formulation and an associated design discipline are developed. Identical elements are laid out on an arbitrary planar lattice -- it could be square, rectangular, diamond, or triangular and might display tremendous symmetry or vary little -- and the beams in direction-cosine space are laid out on an arbitrary superlattice of the dual of the element-layout lattice. The generality of these two arbitrary lattices can yield significant cost reductions for large, many-beam arrays and arises from, first, formulating the desired beam outputs using a discrete Fourier transform or DFT generalized to use an integer-matrix size parameter, and second, efficiently realizing the required real-time computations with the generalized FFT based on a matrix factorization of that size parameter that is developed here. This generalized FFT includes as special cases the usual 1D and 2D FFTs in radix-2 and mixed-radix forms but offers many more possibilities as well. The approach cannot outperform but does match, when the matrix size parameter factors well, the N log N computational efficiency of the usual FFT. Examples illustrate a design discipline for the two lattices that involves jointly determining element spacing, steering range and beam-layout geometry, grating-lobe behavior, and FFT factorability and therefore computational efficiency.