Showing papers on "Prime-factor FFT algorithm" published in 2013

Proceedings Article•DOI•

Nathanaël Perraudin¹, Peter Balazs, Peter Lempel Søndergaard•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

01 Oct 2013

TL;DR: A new algorithm to estimate a signal from its short-time Fourier transform modulus (STFTM) that not only converges significantly faster than the traditional Griffin-Lim algorithm (GLA) but also recovers signals with a smaller error.

Abstract: In this paper, we present a new algorithm to estimate a signal from its short-time Fourier transform modulus (STFTM). This algorithm is computationally simple and is obtained by an acceleration of the well-known Griffin-Lim algorithm (GLA). Before deriving the algorithm, we give a new interpretation of the GLA and formulate the phase recovery problem in an optimization form. We then present some experimental results where the new algorithm is tested on various signals. It not only converges significantly faster than the traditional GLA but also recovers the signals with a smaller error.
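
The relationship between the GLA and an accelerated variant can be sketched as below. This is a hedged reading of the abstract rather than the paper's exact notation: s is the given STFT modulus, P_{C_1} is assumed to be the projection onto the set of coefficient arrays that are the STFT of some signal, P_{C_2} is assumed to be the projection that imposes the modulus s, and α is a hypothetical momentum parameter.

```latex
\begin{align*}
P_{C_2}(c) &= s \odot \frac{c}{|c|}, \\
\text{GLA:}\quad c_n &= P_{C_1}\bigl(P_{C_2}(c_{n-1})\bigr), \\
\text{accelerated:}\quad c_n &= P_{C_1}\bigl(P_{C_2}(t_{n-1})\bigr),
  \qquad t_n = c_n + \alpha\,(c_n - c_{n-1}).
\end{align*}
```

Under these assumptions, setting α = 0 reduces the accelerated iteration to the plain GLA iteration.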

128 citations

Journal Article•DOI•

Adaptive sub-linear time fourier algorithms

David Lawlor¹, David Lawlor², Yang Wang¹, Andrew Christlieb¹•Institutions (2)

Michigan State University¹, Duke University²

23 Apr 2013-Advances in Adaptive Data Analysis

TL;DR: A new deterministic algorithm for the sparse Fourier transform problem, which seeks to identify k ≪ N significant Fourier coefficients from a signal of bandwidth N; empirically, it is orders of magnitude faster than competing algorithms.

Abstract: We present a new deterministic algorithm for the sparse Fourier transform problem, in which we seek to identify k ≪ N significant Fourier coefficients from a signal of bandwidth N. Previous deterministic algorithms exhibit quadratic runtime scaling, while our algorithm scales linearly with k in the average case. Underlying our algorithm are a few simple observations relating the Fourier coefficients of time-shifted samples to unshifted samples of the input function. This allows us to detect when aliasing between two or more frequencies has occurred, as well as to determine the value of unaliased frequencies. We show that empirically our algorithm is orders of magnitude faster than competing algorithms.
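
The observation about time-shifted samples can be illustrated with a minimal sketch. It assumes a signal made of a few pure tones, a subsampling stride `p` that divides `N`, and a simple magnitude test for aliasing; the function names and the test are illustrative assumptions, not the authors' algorithm. The DFT of the samples shifted by one differs from the unshifted DFT by a phase factor exp(2πi f/N) in every bin that holds exactly one frequency f.

```python
import cmath
import math

def dft(x):
    """Direct O(M^2) DFT, adequate for a small demonstration."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M))
            for k in range(M)]

def recover_frequencies(signal, N, p, tol=1e-6):
    """Recover {frequency: amplitude} from N/p samples and N/p shifted samples."""
    M = N // p
    base = dft([signal(j * p) for j in range(M)])         # unshifted samples
    shifted = dft([signal(j * p + 1) for j in range(M)])  # samples shifted by one
    found = {}
    for k in range(M):
        if abs(base[k]) < tol:
            continue                      # empty bin
        ratio = shifted[k] / base[k]
        if abs(abs(ratio) - 1.0) > tol:
            continue                      # ratio is not a pure phase: aliased bin
        f = round(cmath.phase(ratio) * N / (2 * math.pi)) % N
        found[f] = base[k] / M
    return found

def two_tones(n, N=64):
    """Hypothetical test signal: frequencies 5 and 42 out of N = 64."""
    return (2.0 * cmath.exp(2j * math.pi * 5 * n / N)
            + (1 - 1j) * cmath.exp(2j * math.pi * 42 * n / N))

result = recover_frequencies(two_tones, N=64, p=4)
```

Here only 32 of the 64 samples are read. A bin where two tones collide generally yields a ratio whose magnitude differs from one, which is how this sketch flags aliasing.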

69 citations

Journal Article•DOI•

Flexible multiple-image encryption algorithm based on log-polar transform and double random phase encoding technique

Li-Hua Gong¹, Xingbin Liu¹, Fen Zheng¹, Nanrun Zhou²•Institutions (2)

Nanchang University¹, Beijing University of Posts and Telecommunications²

27 Sep 2013-Journal of Modern Optics

TL;DR: A novel multiple-image encryption algorithm that combines the log-polar transform with double random phase encoding in the fractional Fourier domain, obtaining high encryption efficiency while avoiding cross-talk.

Abstract: We present a novel multiple-image encryption algorithm by combining log-polar transform with double random phase encoding in the fractional Fourier domain. In this algorithm, the original images are transformed to annular domains by inverse log-polar transform and then the annular domains are merged into one image. The composite image is encrypted by the classical double random phase encoding method. The proposed multiple-image encryption algorithm takes advantage of the data compression characteristic of log-polar transform to obtain high encryption efficiency while avoiding cross-talk. Optical implementation of the proposed algorithm is demonstrated and numerical simulation results verify the feasibility and the validity of the proposed algorithm.

66 citations

Journal Article•DOI•

A High-Speed Low-Complexity Modified Radix-$2^{5}$ FFT Processor for High Rate WPAN Applications

Taesang Cho¹, Hanho Lee¹•Institutions (1)

Inha University¹

01 Jan 2013-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: A novel modified radix-$2^{5}$ FFT algorithm is proposed that reduces hardware complexity by lowering the number of complex multiplications and the size of the twiddle factor memory.

Abstract: This paper presents a high-speed low-complexity modified radix-$2^{5}$ 512-point fast Fourier transform (FFT) processor using an eight data-path pipelined approach for high rate wireless personal area network applications. A novel modified radix-$2^{5}$ FFT algorithm that reduces the hardware complexity is proposed. This method can reduce the number of complex multiplications and the size of the twiddle factor memory. It also uses a complex constant multiplier instead of a complex Booth multiplier. The proposed FFT processor achieves a signal-to-quantization noise ratio of 35 dB at 12-bit internal word length. The proposed processor has been designed and implemented using 90-nm CMOS technology with a supply voltage of 1.2 V. The results demonstrate that the total gate count of the proposed FFT processor is 290 K. Furthermore, the highest throughput rate is up to 2.5 GS/s at 310 MHz while requiring much less hardware complexity.

63 citations

Proceedings Article•DOI•

Computing a k-sparse n-length Discrete Fourier Transform using at most 4k samples and O(k log k) complexity

Sameer Pawar¹, Kannan Ramchandran¹•Institutions (1)

University of California, Berkeley¹

07 Jul 2013

TL;DR: The FFAST algorithm is based on filterless subsampling of the input signal x using a small set of carefully chosen uniform subsampling patterns guided by the Chinese Remainder Theorem.

Abstract: Given an n-length input signal x, it is well known that its Discrete Fourier Transform (DFT), X, can be computed in O(n log n) complexity using a Fast Fourier Transform. If the spectrum X is exactly k-sparse (where k ≪ n), can we do better? We show that asymptotically in k and n, when k is sub-linear in n (i.e., k ∝ n^δ where 0 < δ < 1), and the support of the non-zero DFT coefficients is uniformly random, we can exploit this sparsity in two fundamental ways: (i) sample complexity: we need only M = rk deterministically chosen samples of the input signal x (where r < 4 when 0 < δ < 0.99); and (ii) computational complexity: we can reliably compute the DFT X using O(k log k) operations, where the constants in the big-O are small. Our algorithm succeeds with high probability, with the probability of failure vanishing to zero asymptotically in the number of samples acquired, M. Our approach is based on filterless subsampling of the input signal x using a small set of carefully chosen uniform subsampling patterns guided by the Chinese Remainder Theorem (CRT). Specifically, our subsampling operation on x is designed to create aliasing patterns on the spectrum X that "look like" parity-check constraints of good erasure-correcting sparse-graph codes. We show how computing the sparse DFT X is equivalent to decoding of these sparse-graph codes and is low in both sample complexity and decoding complexity. We accordingly dub our algorithm the FFAST (Fast Fourier Aliasing-based Sparse Transform) algorithm. In our analysis, we rigorously connect our CRT based graph constructions to random sparse-graph codes based on a balls-and-bins model and analyze the convergence behavior of the latter using well-studied density evolution techniques from coding theory. We provide simulation results in Section IV that corroborate our theoretical findings, and validate the empirical performance of the FFAST algorithm.
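
The role of the Chinese Remainder Theorem in the subsampling patterns can be sketched as follows. The sketch assumes a length N = 20 = 4 × 5 and shows only how a frequency index folds onto bins f mod 4 and f mod 5 and is recovered from that pair; it does not implement the sparse-graph decoding described in the abstract, and the helper names are hypothetical.

```python
from math import gcd

def crt(residues, moduli):
    """Combine residues modulo pairwise co-prime moduli into one index."""
    x, m = 0, 1
    for r, n in zip(residues, moduli):
        assert gcd(m, n) == 1, "moduli must be pairwise co-prime"
        # Choose t so that x + m*t is congruent to r modulo n.
        t = ((r - x) * pow(m, -1, n)) % n
        x += m * t
        m *= n
    return x

def aliased_bins(f, moduli):
    """Bin that frequency f folds onto in each subsampled DFT of length n."""
    return [f % n for n in moduli]

N = 20
moduli = (4, 5)  # subsampling by 5 gives a 4-point DFT, by 4 a 5-point DFT
recovered = [crt(aliased_bins(f, moduli), moduli) for f in range(N)]
```

Because 4 and 5 are co-prime, every frequency in 0..19 has a distinct pair of aliased bins, so the pair identifies the frequency. The modular inverse via `pow(m, -1, n)` needs Python 3.8 or later.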

58 citations

Journal Article•DOI•

An In-Place FFT Architecture for Real-Valued Signals

Manohar Ayinala¹, Yingjie Lao¹, Keshab K. Parhi¹•Institutions (1)

University of Minnesota¹

08 Aug 2013-IEEE Transactions on Circuits and Systems II: Express Briefs

TL;DR: This brief presents a novel scalable architecture for in-place fast Fourier transform (IFFT) computation for real-valued signals based on a modified radix-2 algorithm, which removes the redundant operations from the flow graph.

Abstract: This brief presents a novel scalable architecture for in-place fast Fourier transform (IFFT) computation for real-valued signals. The proposed computation is based on a modified radix-2 algorithm, which removes the redundant operations from the flow graph. A new processing element (PE) is proposed using two radix-2 butterflies that can process four inputs in parallel. A novel conflict-free memory-addressing scheme is proposed to ensure the continuous operation of the FFT processor. Furthermore, the addressing scheme is extended to support multiple parallel PEs. The proposed real-FFT processor simultaneously requires fewer computation cycles and lower hardware cost compared to prior work. For example, the proposed design with two PEs reduces the computation cycles by a factor of 2 for a 256-point real fast Fourier transform (RFFT) compared to a prior work while maintaining a lower hardware complexity. The number of computation cycles is reduced proportionately with the increase in the number of PEs.
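
The redundancy that real-valued FFT designs exploit can be shown with a small sketch. It assumes only the standard Hermitian symmetry X[N-k] = conj(X[k]) of the DFT of a real sequence; it uses a direct DFT for clarity and is not the modified radix-2 flow graph or the memory-addressing scheme of the brief.

```python
import cmath
import math

def dft_bins(x, bins):
    """Direct DFT evaluated only at the requested output bins."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in bins]

def real_dft_half(x):
    """Compute only bins 0..N/2 of a real sequence."""
    return dft_bins(x, range(len(x) // 2 + 1))

def expand_hermitian(half, N):
    """Rebuild the full spectrum using X[N-k] = conj(X[k])."""
    return [half[k] if k <= N // 2 else half[N - k].conjugate()
            for k in range(N)]

x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
full = dft_bins(x, range(len(x)))
rebuilt = expand_hermitian(real_dft_half(x), len(x))
```

Only N/2 + 1 of the N output bins carry independent information, which is why roughly half of the operations in a complex FFT are redundant for real inputs.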

56 citations

Journal Article•DOI•

Pipelined Architectures for Real-Valued FFT and Hermitian-Symmetric IFFT With Real Datapaths

Sayed Ahmad Salehi¹, Rasoul Amirfattahi¹, Keshab K. Parhi²•Institutions (2)

Isfahan University of Technology¹, University of Minnesota²

11 Jul 2013-IEEE Transactions on Circuits and Systems II: Express Briefs

TL;DR: Novel parallel pipelined architectures for the computation of the fast Fourier transform (FFT) of real signals and inverse FFT of Hermitian-symmetric signals using only real datapaths are presented.

Abstract: This brief presents novel parallel pipelined architectures for the computation of the fast Fourier transform (FFT) of real signals and inverse FFT of Hermitian-symmetric signals using only real datapaths. The real FFT structure is transformed by transferring twiddle factors to subsequent stages, such that each stage in the proposed flow graph contains one column of butterfly units and one column of twiddle factor blocks, and each column of the flow graph contains only N samples. This is a key requirement for the design of architectures that are based on only real datapaths. This structure is then mapped to pipelined architectures. The proposed architectures can be used with any FFT size or level of parallelism, which is a power of two. A systematic method to design architectures for FFTs with different levels of parallelism and radix values is presented. By modifying the FFT flow graph for real-valued samples, this methodology leads to architectures with fewer adders, delays, and interconnections.

50 citations

Proceedings Article•DOI•

Evaluating the Hardware Performance of a Million-Bit Multiplier

Yarkin Doröz, Erdinc Ozturk¹, Berk Sunar•Institutions (1)

Istanbul Commerce University¹

04 Sep 2013

TL;DR: Estimates show that the performance of the novel architecture designed to realize a million-bit multiplication architecture matches that of previously reported software implementations on a high-end 3 GHz Intel Xeon processor, while requiring only a tiny fraction of the area.

Abstract: In this work we present the first full and complete evaluation of a very large multiplication scheme in custom hardware. We designed a novel architecture to realize a million-bit multiplication architecture based on the Schönhage-Strassen Algorithm and the Number Theoretical Transform (NTT). The construction makes use of an innovative cache architecture along with processing elements customized to match the computation and access patterns of the FFT-based recursive multiplication algorithm. When synthesized using a 90 nm TSMC library operating at a frequency of 666 MHz, our architecture is able to compute the product of integers in excess of a million bits in 7.74 milliseconds. Estimates show that the performance of our design matches that of previously reported software implementations on a high-end 3 GHz Intel Xeon processor, while requiring only a tiny fraction of the area.
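
The idea of multiplying large integers through a number-theoretic transform can be sketched in software. This is a simplified illustration, not the Schönhage-Strassen construction or the hardware design of the paper: it assumes the commonly used prime 998244353 with primitive root 3, splits the operands into decimal digits, and convolves them with a radix-2 NTT.

```python
MOD = 998244353  # prime of the form c * 2^23 + 1 (an assumption of this sketch)
ROOT = 3         # primitive root modulo MOD

def ntt(a, invert=False):
    """Iterative radix-2 NTT; len(a) must be a power of two."""
    a = list(a)
    n = len(a)
    j = 0
    for i in range(1, n):            # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, -1, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u, v = a[k], a[k + half] * w % MOD
                a[k] = (u + v) % MOD          # butterfly: sum
                a[k + half] = (u - v) % MOD   # butterfly: difference
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, -1, MOD)
        a = [v * n_inv % MOD for v in a]
    return a

def multiply(x, y):
    """Multiply non-negative integers via NTT convolution of decimal digits."""
    dx = [int(d) for d in str(x)[::-1]]
    dy = [int(d) for d in str(y)[::-1]]
    n = 1
    while n < len(dx) + len(dy):
        n <<= 1
    fa = ntt(dx + [0] * (n - len(dx)))
    fb = ntt(dy + [0] * (n - len(dy)))
    conv = ntt([p * q % MOD for p, q in zip(fa, fb)], invert=True)
    # Python integers absorb the carries when the digit sums are recombined.
    return sum(c * 10 ** i for i, c in enumerate(conv))

product = multiply(123456789, 987654321)
```

The sketch is exact as long as every convolution coefficient stays below the modulus, which holds for decimal digits up to operand lengths of several million digits.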

41 citations

Journal Article•DOI•

A fast butterfly algorithm for generalized Radon transforms

Jingwei Hu¹, Sergey Fomel¹, Laurent Demanet², Lexing Ying³•Institutions (3)

University of Texas at Austin¹, Massachusetts Institute of Technology², Stanford University³

21 Jun 2013-Geophysics

TL;DR: In this article, a fast butterfly algorithm for the hyperbolic Radon transform is proposed, which reformulates the transform as an oscillatory integral operator and constructs a blockwise low-rank approximation of the kernel function.

Abstract: Generalized Radon transforms, such as the hyperbolic Radon transform, cannot be implemented as efficiently in the frequency domain as convolutions, thus limiting their use in seismic data processing. We have devised a fast butterfly algorithm for the hyperbolic Radon transform. The basic idea is to reformulate the transform as an oscillatory integral operator and to construct a blockwise low-rank approximation of the kernel function. The overall structure follows the Fourier integral operator butterfly algorithm. For 2D data, the algorithm runs in complexity O(N^2 log N), where N depends on the maximum frequency and offset in the data set and the range of parameters (intercept time and slowness) in the model space. From a series of studies, we found that this algorithm can be significantly more efficient than the conventional time-domain integration.

37 citations

Journal Article•DOI•

Unified architecture for 2, 3, 4, 5, and 7-point DFTs based on Winograd Fourier transform algorithm

Fahad Qureshi, Mario Garrido, Oscar Gustafsson

28 Feb 2013-Electronics Letters

TL;DR: A unified hardware architecture that can be reconfigured to calculate 2, 3, 4, 5, or 7-point DFTs is presented and the processing element finds potential use in memory-based FFTs, where non-power-of-two sizes are required such as in DMB-T.

Abstract: A unified hardware architecture that can be reconfigured to calculate 2, 3, 4, 5, or 7-point DFTs is presented. The architecture is based on the Winograd Fourier transform algorithm and the complexity is equal to a 7-point DFT in terms of adders/subtractors and multipliers plus only seven multiplexers introduced to enable reconfigurability. The processing element finds potential use in memory-based FFTs, where non-power-of-two sizes are required such as in DMB-T.
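
To give a flavor of the Winograd small-DFT structure, here is a minimal sketch of a 3-point DFT that uses two constant multiplications (one of them by a purely imaginary constant). It is a textbook-style factorization written for illustration, with hypothetical names, and is not the reconfigurable architecture of the letter.

```python
import cmath
import math

C = math.cos(2 * math.pi / 3) - 1.0  # equals -1.5
S = math.sin(2 * math.pi / 3)        # equals sqrt(3) / 2

def winograd_dft3(x0, x1, x2):
    """3-point DFT with pre-additions, two multiplications, post-additions."""
    t = x1 + x2
    X0 = x0 + t                 # DC term needs no multiplication
    m1 = C * t                  # multiplication 1 (real constant)
    m2 = -1j * S * (x1 - x2)    # multiplication 2 (imaginary constant)
    s = X0 + m1                 # equals x0 + cos(2*pi/3) * (x1 + x2)
    return X0, s + m2, s - m2

def dft(x):
    """Direct DFT used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

sample = (1 + 2j, -0.5j, 3.0)
fast = winograd_dft3(*sample)
reference = dft(list(sample))
```

The pattern of additions before and after a small set of constant multiplications is what makes it plausible to share hardware across several small DFT sizes.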

27 citations

Proceedings Article•DOI•

Sparse Fast Fourier Transform by downsampling

Sung-Hsien Hsieh, Chun-Shien Lu, Soo-Chang Pei¹•Institutions (1)

National Taiwan University¹

26 May 2013

TL;DR: Complexity analysis and experimental results show that this method outperforms FFT and sFFT and a top-down iterative strategy combined with different downsampling factors further saves computational costs.

Abstract: Sparse Fast Fourier Transform (sFFT) [1][2] has recently been proposed to outperform FFT in reducing computational complexity. Assume that an input signal of length N is K-sparse in the frequency domain, where K ≤ N. sFFT costs O(K log N) instead of the O(N log N) of FFT. In this paper, a new fast sFFT algorithm is proposed that costs O(K log K) on average, without any operations being related to N. The idea is to downsample the original input signal at the beginning. Subsequent processing operates on downsampled signals, whose length is O(K). However, downsampling possibly leads to "aliasing." By the shift theorem of the DFT, the aliasing problem can be formulated as the "moment-preserving problem." In addition, a top-down iterative strategy combined with different downsampling factors further saves computational costs. Complexity analysis and experimental results show that our method outperforms FFT and sFFT.

Journal Article•DOI•

A Highly Accurate FGG-FG-FFT for the Combined Field Integral Equation

Jia-Ye Xie¹, Hou-Xing Zhou¹, Wei Hong¹, Wei-Dong Li¹, Guang Hua¹•Institutions (1)

Southeast University¹

25 Jun 2013-IEEE Transactions on Antennas and Propagation

TL;DR: In this paper, a novel realization of the Integral Equation in combination with the fast Fourier transform for the CFIE is established by fitting both the Green's function and its gradient onto the nodes of a uniform Cartesian grid.

Abstract: In this paper, a novel realization of the Integral Equation in combination with the fast Fourier transform for the CFIE is established by Fitting both the Green's function and its Gradient onto the nodes of a uniform Cartesian grid. The new method has been compared with several existing popular FFT-based methods, including the AIM, the IE-FFT, and the p-FFT. The accuracy of the proposed method is significantly higher than that of other FFT-based methods, and the method is sensitive to neither the grid spacing nor the expansion order. The outstanding merit of the proposed method is that the fitting procedure is independent of the basis functions. Therefore, when higher order basis functions are adopted in the method of moments, only one fitting procedure for the Green's function and its gradient on a basis function support is needed to serve all of the basis functions defined on this support. Some numerical examples are provided in this paper to demonstrate the accuracy and efficiency of the proposed method.

Proceedings Article•DOI•

Energy efficient parameterized FFT architecture

Ren Chen¹, Hoang Le¹, Viktor K. Prasanna¹•Institutions (1)

University of Southern California¹

24 Oct 2013

TL;DR: A parameterized FFT architecture is proposed to identify the design trade-offs in achieving energy efficiency, and designs achieve up to 28% and 38% improvement in the energy efficiency and EAT, respectively, compared with a state-of-the-art design.

Abstract: In this paper, we revisit the classic Fast Fourier Transform (FFT) for energy efficient designs on FPGAs. A parameterized FFT architecture is proposed to identify the design trade-offs in achieving energy efficiency. We first perform design space exploration by varying the algorithm mapping parameters, such as the degree of vertical and horizontal parallelism, that characterize decomposition based FFT algorithms. Then we explore an energy efficient design by empirical selection on the values of the chosen architecture parameters, including the type of memory elements, the type of interconnection network and the number of pipeline stages. The trade offs between energy, area, and time are analyzed using two performance metrics: the energy efficiency (defined as the number of operations per Joule) and the Energy×Area×Time (EAT) composite metric. From the experimental results, a design space is generated to demonstrate the effect of these parameters on the various performance metrics. For N-point FFT (16 ≤ N ≤ 1024), our designs achieve up to 28% and 38% improvement in the energy efficiency and EAT, respectively, compared with a state-of-the-art design.

Journal Article•DOI•

Split Radix Algorithm for Length $6^{m}$ DFT

Weihua Zheng¹, Kenli Li¹•Institutions (1)

Hunan University¹

28 Jan 2013-IEEE Signal Processing Letters

TL;DR: Novel order permutation of sub-DFTs and a reduced number of arithmetic operations enhance the practicability of the proposed algorithm, which inherently provides a wider choice of accessible FFT lengths.

Abstract: The discrete Fourier transform (DFT) is widely used in many fields of science and engineering. The DFT is implemented with efficient algorithms categorized as fast Fourier transforms. A fast algorithm is proposed for computing a length-N = 6^m DFT. The proposed algorithm is a blend of radix-3 and radix-6 FFT. It is a variant of split radix and can flexibly implement a length-2^r × 3^m DFT. Novel order permutation of sub-DFTs and reduction of the number of arithmetic operations enhance the practicability of the proposed algorithm. It inherently provides a wider choice of accessible FFT lengths.

Book Chapter•DOI•

Discrete Fourier Transform and Signal Spectrum

Li Tan¹, Jean Jiang•Institutions (1)

Purdue University North Central¹

01 Jan 2013

TL;DR: In this paper, the authors investigated discrete Fourier transform (DFT) and Fast Fourier Transform (FFT) algorithms to compute signal amplitude spectrum and power spectrum, and used the window function to reduce spectral leakage.

Abstract: This chapter investigates the discrete Fourier transform (DFT) and fast Fourier transform (FFT) and their properties; introduces the DFT/FFT algorithms to compute signal amplitude spectrum and power spectrum; and uses the window function to reduce spectral leakage. Finally, the chapter describes the FFT algorithm and shows how to apply it to estimate a speech spectrum.

Proceedings Article•DOI•

Pipelined FFT for wireless communications supporting 128–2048/1536-point transforms

Inkeun Cho¹, Tomasz Patyk², David Guevorkian³, Jarmo Takala⁴, Shuvra S. Bhattacharyya¹•Institutions (4)

University of Maryland, College Park¹, Dolby Laboratories², Nokia Networks³, Tampere University of Technology⁴

01 Dec 2013

TL;DR: A pipeline FFT architecture is proposed that supports FFT lengths that are a power-of-two multiple of three and is memory optimal, since an N-point transform needs only N - 1 memory locations.

Abstract: Modern wireless communication systems use orthogonal frequency division multiplexing (OFDM) and multiple input multiple output (MIMO) schemes, which call for fast Fourier transforms (FFT). Traditionally, power-of-two FFT lengths have been exploited, but recently non-power-of-two transform lengths have also been defined. For example, the 3GPP LTE specification defines a 1536-point FFT. In this paper, we propose a pipeline FFT architecture that supports FFT lengths that are a power-of-two multiple of three. The architecture is basically a single delay feedback structure followed by a radix-3 computation unit. The proposed architecture is memory optimal, as for an N-point transform only N - 1 memory locations are needed.
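
How a length such as 1536 = 3 × 512 decomposes into radix-2 and radix-3 stages can be sketched with a recursive mixed-radix FFT. This is a generic decimation-in-time Cooley-Tukey sketch under the assumption that the length has only 2 and 3 as prime factors; it says nothing about the single delay feedback pipeline or the stage ordering used in the paper.

```python
import cmath
import math

def mixed_radix_fft(x):
    """Recursive FFT for lengths of the form 2^a * 3^b."""
    N = len(x)
    if N == 1:
        return list(x)
    r = 2 if N % 2 == 0 else 3
    if N % r != 0:
        raise ValueError("length must be of the form 2^a * 3^b")
    M = N // r
    # r interleaved sub-sequences, each transformed recursively.
    subs = [mixed_radix_fft(x[q::r]) for q in range(r)]
    out = [0j] * N
    for k in range(M):
        # Apply twiddle factors W_N^(q*k) to the sub-transform outputs.
        tw = [cmath.exp(-2j * math.pi * q * k / N) * subs[q][k]
              for q in range(r)]
        # Combine them with an r-point DFT (a radix-2 or radix-3 butterfly).
        for m in range(r):
            out[k + m * M] = sum(
                tw[q] * cmath.exp(-2j * math.pi * q * m / r) for q in range(r))
    return out

signal = [complex(n % 5, -(n % 3)) for n in range(24)]  # 24 = 2^3 * 3
spectrum = mixed_radix_fft(signal)
```

For N = 1536 this recursion performs nine radix-2 splits and one radix-3 split, which mirrors the "power-of-two multiple of three" lengths mentioned in the abstract.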

Proceedings Article•DOI•

Image encryption using block based transformation with fractional Fourier transform

Delong Cui, Lei Shu, Yuanfang Chen¹, Xiaoling Wu•Institutions (1)

Pierre-and-Marie-Curie University¹

01 Aug 2013

TL;DR: Theoretical analysis and experimental results demonstrate that the algorithm is favorable, and the security of the proposed algorithm depends on the transformation algorithm, sensitivity to the randomness of phase mask and the orders of FRFT.

Abstract: In order to transmit image data over an open network, a novel image encryption algorithm based on fractional Fourier transform and block-based transformation is proposed in this paper. The image encryption process includes two steps: the original image is divided into blocks, which are rearranged into a transformed image using a transformation algorithm, and then the transformed image is encrypted using the fractional Fourier transform (FRFT) algorithm. The security of the proposed algorithm depends on the transformation algorithm, sensitivity to the randomness of phase mask and the orders of FRFT. Theoretical analysis and experimental results demonstrate that the algorithm is favorable.

Proceedings Article•DOI•

Modified parallel code-phase search for acquisition in presence of sign transition

Jérôme Leclère¹, Cyril Botteron¹, Pierre-André Farine¹•Institutions (1)

École Normale Supérieure¹

25 Jun 2013

TL;DR: The algorithm proposed in this article transforms the initial correlation into two smaller correlations, and it is shown that the theoretical number of operations can be reduced by about 21%, and that the memory resources for an FPGA implementation can be almost halved.

Abstract: One method for fast acquisition of GNSS signals is the parallel code-phase search, which uses the fast Fourier transform (FFT) to perform the correlation. A problem with this method is the potential sign transition that can happen between two code periods due to data or secondary code and lead to a loss of sensitivity or to the non-detection of the signal. A known straightforward solution consists in using two code periods instead of one for the correlation. However, in addition to increasing the complexity, this solution is not efficient since half of the points calculated are discarded. This led us to look for a more efficient algorithm. The algorithm proposed in this article transforms the initial correlation into two smaller correlations. When the radix-2 FFT is used, the proposed algorithm is more efficient for half of the possible sampling frequencies. It is shown for example that the theoretical number of operations can be reduced by about 21%, and that the memory resources for an FPGA implementation can be almost halved.
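
The baseline parallel code-phase search that the abstract starts from can be sketched as an FFT-based circular correlation. The sketch assumes a short made-up ±1 code and a noise-free received signal with no sign transition; it does not implement the two-smaller-correlations method proposed in the article.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * math.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

def ifft(X):
    """Inverse FFT via conjugation."""
    N = len(X)
    return [v.conjugate() / N for v in fft([u.conjugate() for u in X])]

def circular_correlation(received, code):
    """All code phases at once: IFFT(FFT(received) * conj(FFT(code)))."""
    R, C = fft(received), fft(code)
    return ifft([r * c.conjugate() for r, c in zip(R, C)])

# Hypothetical 16-chip code and a received copy delayed by 5 chips.
code = [1, -1, 1, 1, -1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1]
delay = 5
received = code[-delay:] + code[:-delay]
corr = circular_correlation(received, code)
estimated_delay = max(range(len(corr)), key=lambda n: abs(corr[n]))
```

A sign flip partway through the received period would split the correlation peak into two partial sums of opposite sign, which is the sensitivity loss the abstract describes.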

Patent•

Vectorization of fast Fourier transform for elastic wave propagation for use in seismic underwater exploration of geographical areas of interest

Sheng Xu¹, Feng Chen¹•Institutions (1)

CGG¹

18 Apr 2013

TL;DR: In this paper, a vectorization scheme for high dimensional FFTs is presented, which has the best performance on the slowest or higher dimensions of data compared to conventional numerical scheme implementations.

Abstract: Numerical simulations of elastic wave propagation algorithms are critical components for seismic imaging and inversion. Finite-difference schemes yield good efficiency but cannot ensure the accuracy of the high frequency component. Pseudo-spectral algorithms are accurate up to the Nyquist frequency, but their efficiency depends on the optimization of the fast Fourier transform (FFT) algorithm. The conventional FFT algorithms are optimized for signal processing, in which problems are generally one dimensional time series. For 3D wave propagation, FFT algorithms have the potential to be further optimized. Under current computer hardware architecture, a vectorization scheme for high dimensional FFTs is presented. Compared to conventional numerical scheme implementations, the systems and methods disclosed herein have the best performance on the slowest or higher dimensions of data. For elastic wave propagation, vectorization improves the efficiency by more than a factor of two when compared to standard FFT algorithms.

Proceedings Article•DOI•

The split-radix fast Fourier transforms with radix-4 butterfly units

Sian-Jheng Lin¹, Wei-Ho Chung¹•Institutions (1)

Center for Information Technology¹

01 Oct 2013

TL;DR: A split-radix fast Fourier transform (FFT) algorithm consisting of mixed-radix butterflies, whose structure is more regular than the conventional split-radix algorithm and which requires fewer operations than the radix-4 algorithms.

Abstract: We present a split radix fast Fourier transform (FFT) algorithm consisting of radix-4 butterflies. The major advantages of the proposed algorithm include: (i) The proposed algorithm consists of mixed radix butterflies, whose structure is more regular than the conventional split radix algorithm. (ii) The proposed algorithm is asymptotically equal in computation amount to the split radix algorithm, and requires fewer operations than the radix-4 algorithms. (iii) The proposed algorithm is in the conjugate-pair version, which requires less memory access than the conventional FFT algorithms.

Proceedings Article•DOI•

Novel algorithms for 2-D FFT and its inverse for image compression

T. G. Anitha¹, S. Ramachandran²•Institutions (2)

Vinayaka Missions University¹, SJB Institute of Technology²

15 Apr 2013

TL;DR: Novel algorithms for 2-D FFT and IFFT so that they may be realized in hardware to suit VLSI realization, where the processing speed is of paramount importance.

Abstract: High performance Fast Fourier Transform and Inverse Fast Fourier Transform are indispensable algorithms in the field of Digital Signal Processing. They are widely used in different application areas such as bio signal data compression, radars, image processing, voice processing etc. The FFT algorithm is computationally intensive and needs to be processed in real time for most applications. This paper presents novel algorithms for 2-D FFT and IFFT so that they may be realized in hardware. The algorithms have been developed to suit VLSI realization, where the processing speed is of paramount importance. The FFT and IFFT algorithms have been coded in MATLAB and successfully tested for 2D color images. The reconstructed images are indistinguishable from the original as can be seen from the results presented. The reconstructed quality of the images is better than 35 dB.

Proceedings Article•DOI•

Implementation of radix-2 and radix-$2^{2}$ FFT algorithms on Spartan6 FPGA

Lakshmi Santhosh¹, Anoop Thomas¹•Institutions (1)

Rajagiri¹

04 Jul 2013

TL;DR: The Fast Fourier Transform (FFT) and its inverse (IFFT) are very important algorithms in digital signal processing and communication systems and these algorithms have been developed using Verilog hardware description language and implemented on Spartan6 FPGA.

Abstract: The Fast Fourier Transform (FFT) and its inverse (IFFT) are very important algorithms in digital signal processing and communication systems. The radix-2 FFT algorithm is the simplest and most common form of the Cooley-Tukey algorithm. The radix-$2^{2}$ FFT algorithm is an attractive algorithm that has the same multiplicative complexity as the radix-4 algorithm but retains the simple butterfly structure of the radix-2 algorithm. These algorithms have been developed using Verilog hardware description language and implemented on Spartan6 FPGA.

Proceedings Article•DOI•

Structured FFT and TFT: symmetric and lattice polynomials

Joris van der Hoeven¹, Romain Lebreton², Éric Schost³•Institutions (3)

École Polytechnique¹, University of Montpellier², University of Western Ontario³

26 Jun 2013

TL;DR: In this article, the authors considered the problem of efficient computations with structured polynomials and provided complexity results for computing Fourier Transform and truncated Fourier transform of symmetric polynomial.

Abstract: In this paper, we consider the problem of efficient computations with structured polynomials. We provide complexity results for computing Fourier Transform and Truncated Fourier Transform of symmetric polynomials, and for multiplying polynomials supported on a lattice.

Journal Article•DOI•

Calculating the n-dimensional fast Fourier transform

V. S. Tutatchikov¹, O. I. Kiselev¹, M. V. Noskov¹•Institutions (1)

Siberian Federal University¹

01 Jul 2013-Pattern Recognition and Image Analysis

TL;DR: The focus is on studying the analog of the Cooley-Tukey algorithm because the number of operations applied to calculate the n-dimensional FFT is considerably less than in the conventional algorithm.

Abstract: The one-dimensional fast Fourier transform (FFT) is the most popular tool for calculating the multidimensional Fourier transform. As a rule, to estimate the n-dimensional FFT, a standard method of combining one-dimensional FFTs, the so-called "by rows and columns" algorithm, is used in the literature. For fast calculations, different researchers try to use parallel calculation tools, the most successful of which are searches for the algorithms related to the computing device architecture: cluster, video card, GPU, etc. [1, 2]. The possibility of paralleling another algorithm for FFT calculation, which is an n-dimensional analog of the Cooley-Tukey algorithm [3, 4], is studied in this paper. The focus is on studying the analog of the Cooley-Tukey algorithm because the number of operations applied to calculate the n-dimensional FFT is considerably less than in the conventional algorithm: $nN^{n}\log_2 N$ addition operations and $\frac{2^n - 1}{2^n}N^{n}\log_2 N$ multiplication operations, against $N^{n+1}\log_2 N$ addition operations and $\frac{1}{2}N^{n+1}\log_2 N$ multiplication operations in combining one-dimensional FFTs.
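
The conventional "by rows and columns" method mentioned in the abstract can be sketched for the two-dimensional case. The sketch uses a direct one-dimensional DFT as the building block and illustrative function names; it does not implement the n-dimensional Cooley-Tukey analog that the paper studies, and it makes no claim about operation counts.

```python
import cmath
import math

def dft(x):
    """Direct one-dimensional DFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft2_rows_columns(a):
    """2-D DFT: transform every row, then every column of the result."""
    rows = [dft(row) for row in a]
    cols = [dft(list(col)) for col in zip(*rows)]  # columns of the row pass
    return [list(r) for r in zip(*cols)]           # transpose back

def dft2_direct(a):
    """Reference 2-D DFT straight from the definition."""
    R, C = len(a), len(a[0])
    return [[sum(a[r][c] * cmath.exp(-2j * math.pi * (u * r / R + v * c / C))
                 for r in range(R) for c in range(C))
             for v in range(C)]
            for u in range(R)]

image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
by_rows_columns = dft2_rows_columns(image)
by_definition = dft2_direct(image)
```

Replacing `dft` with any one-dimensional FFT gives the usual fast version; the separability of the transform kernel is what makes the two passes equivalent to the definition.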

Proceedings Article•DOI•

Coherent optical implementations of the fast Fourier transform and their comparison to the optical implementation of the quantum Fourier transform

Rupert Young¹, Philip Birch¹, Chris Chatwin¹•Institutions (1)

University of Sussex¹

29 Apr 2013-Proceedings of SPIE

TL;DR: The decomposition of the FFT algorithm into the basic Butterfly operations is described, as this allows the algorithm to be fully implemented by the successive coherent addition and subtraction of two wavefronts, facilitating a simple and robust hardware implementation based on waveguided hybrid devices as employed in coherent optical detection modules.

Abstract: Optical structures to implement the discrete Fourier transform (DFT) and fast Fourier transform (FFT) algorithms for discretely sampled data sets are considered. In particular, the decomposition of the FFT algorithm into the basic Butterfly operations is described, as this allows the algorithm to be fully implemented by the successive coherent addition and subtraction of two wavefronts (the subtraction being performed after one has been appropriately phase shifted), so facilitating a simple and robust hardware implementation based on waveguided hybrid devices as employed in coherent optical detection modules. Further, a comparison is made to the optical structures proposed for the optical implementation of the quantum Fourier transform and they are shown to be very similar.
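
The decomposition into butterfly operations described in this abstract can be sketched numerically. The following is a minimal radix-2 decimation-in-time FFT in which each butterfly phase-shifts one input and then forms a sum and a difference; it is a software illustration only, not a model of the optical hardware, and the names `butterfly` and `fft_radix2` are assumptions.

```python
import numpy as np

def butterfly(a, b, twiddle):
    """One radix-2 butterfly: phase-shift b, then form the sum and the difference."""
    t = twiddle * b
    return a + t, a - t

def fft_radix2(x):
    """Recursive decimation-in-time FFT; len(x) must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return x
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    # Phase factors applied to the odd-indexed half before adding and subtracting.
    twiddles = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    top, bottom = butterfly(even, odd, twiddles)
    return np.concatenate([top, bottom])

rng = np.random.default_rng(1)
signal = rng.standard_normal(16)
print(np.allclose(fft_radix2(signal), np.fft.fft(signal)))
```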

Proceedings Article•DOI•

Parallel sparse FFT

Cheng Wang¹, Mauricio Araya-Polo², Sunita Chandrasekaran¹, Amik St-Cyr², Barbara Chapman¹, Detlef Hohl²•Institutions (2)

University of Houston¹, Royal Dutch Shell²

17 Nov 2013

TL;DR: The authors' parallel sFFT (PsFFT) implementation achieves approximately 60% parallel efficiency on a single 8-core Intel Sandy Bridge socket for relevant test cases and applies several techniques such as index coalescing, data affiliated loops and multi-level blocking techniques to alleviate memory access congestion and increase performance.

Abstract: The Fast Fourier Transform (FFT) is a widely used numerical algorithm. When N input data points lead to only k

Journal Article•DOI•

Case Study of Grigoryan FFT onto FPGAs and DSPs

Narayanam Ranganadh, Parimal A. Patel, Artyom M. Grigoryan

01 Jan 2013-International Journal of Future Computer and Communication

TL;DR: Implemented on the TMS320C62x DSP and on Virtex-II Pro FPGAs, the paired-transform-based FFT algorithm is faster than the radix-2 FFT and is consequently useful for higher sampling rates.

Abstract: Frequency analysis plays a vital role in applications such as cryptanalysis, steganalysis, system identification, controller tuning, speech recognition, and noise filtering. The discrete Fourier transform (DFT) is a principal mathematical method for frequency analysis, and different ways of splitting the DFT give rise to various fast algorithms. In this paper, we present the implementation of two fast algorithms for the DFT in order to evaluate their performance. One of them is the popular radix-2 Cooley-Tukey fast Fourier transform (FFT) algorithm (1) and the other is the Grigoryan FFT, based on splitting by the paired transform (2). We evaluate the performance of these algorithms by implementing them on the TMS320C62x DSP and on Virtex-II Pro FPGAs. Finally, we show that the paired-transform-based FFT algorithm is faster than the radix-2 FFT and is consequently useful for higher sampling rates.

Proceedings Article•DOI•

A parallel implementation method of FFT-based full-search block matching algorithms

Toshiyuki Dobashi¹, Hitoshi Kiya¹•Institutions (1)

Tokyo Metropolitan University¹

26 May 2013

TL;DR: A parallel implementation method of FFT-based full-search BMAs that can not only process in parallel, but also select the efficient FFT size and calculate two cross-correlations at the same time is proposed.

Abstract: One category of fast full-search block matching algorithms (BMAs) is based on the fast Fourier transform (FFT). This paper proposes a parallel implementation method of FFT-based full-search BMAs. FFT-based full-search BMAs are much faster than the direct full-search BMA, and their accuracy is the same as that of the direct full-search BMA. However, they are not designed for parallel processing. The proposed method divides the search window into multiple sub search windows using the overlap-save method, and the FFT-based full-search BMA is applied to each sub search window. These sub search windows are processed in parallel. By dividing the search window, the method can not only process in parallel, but also select an efficient FFT size. Furthermore, the method can calculate two cross-correlations at the same time. These properties also contribute to speeding up the block matching. The experimental results show that the method on a 6-core CPU is about 11 times faster than the conventional FFT-based full-search BMA.
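
As a rough illustration of why an FFT-based full search gives the same answer as a direct one, the sketch below matches a block inside a single search window using a sum-of-squared-differences (SSD) criterion whose cross-correlation term is computed with FFTs. The choice of SSD, the function names, and the test data are assumptions; the sub-window splitting with overlap-save, the parallel processing, and the paired cross-correlation trick from the paper are not reproduced.

```python
import numpy as np

def cross_correlate_fft(window, block):
    """Cross-correlation of block against window, restricted to full-overlap shifts."""
    H, W = window.shape
    h, w = block.shape
    F_win = np.fft.rfft2(window, s=(H, W))
    F_blk = np.fft.rfft2(block, s=(H, W))   # block is zero-padded to the window size
    # Correlation theorem: corr = IFFT(FFT(window) * conj(FFT(block))).
    corr = np.fft.irfft2(F_win * np.conj(F_blk), s=(H, W))
    # Circular wrap-around only affects shifts outside this range.
    return corr[: H - h + 1, : W - w + 1]

def best_match(window, block):
    """Position (row, col) in window where the block has the smallest SSD."""
    window = np.asarray(window, dtype=float)
    block = np.asarray(block, dtype=float)
    # SSD = sum(window_patch^2) - 2 * corr(window, block) + sum(block^2)
    energy = cross_correlate_fft(window**2, np.ones_like(block))
    ssd = energy - 2.0 * cross_correlate_fft(window, block) + np.sum(block**2)
    return np.unravel_index(np.argmin(ssd), ssd.shape)

rng = np.random.default_rng(2)
search_window = rng.standard_normal((32, 32))
target = search_window[5:13, 9:17].copy()   # 8x8 block cut out at (5, 9)
print(best_match(search_window, target))
```

Every candidate position is still evaluated, so the search remains exhaustive; only the way the correlation term is computed changes.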

Proceedings Article•DOI•

Reducing the Hamming distance of encoded FFT twiddle factors using improved heuristic algorithms

A. G. da Luz, E. A. C. da Costa, Sidinei Ghissoni

23 May 2013

TL;DR: Appropriate ordering of the coefficients, guided by the improved Anedma algorithm, can contribute to reducing the Hamming distance of the encoded twiddle factors.

Abstract: This paper explores different heuristic algorithms for better manipulation of the twiddle factors of the Fast Fourier Transform (FFT). The FFT algorithm involves multiplications of input data by appropriate coefficients, so a good ordering of those operations can help reduce the switching activity, which in turn minimizes power consumption in FFTs. The heuristic algorithm named Bellmore and Nemhauser, and a proposed one named Anedma in both original and improved versions, are used to get as near as possible to the optimal solution for the ordering and partitioning of coefficients in FFTs. Data encoding methods are used to decrease the switching activity when transmitting information over buses, so we have applied some encoding techniques to the coefficients. As will be shown, appropriate ordering of the coefficients, guided by the improved Anedma algorithm, can contribute to reducing the Hamming distance of the encoded twiddle factors.
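
The quantity being minimized here, the Hamming distance between consecutively used coefficient words, can be made concrete with a small sketch. The 8-bit two's-complement quantization of the twiddle factors' real parts and the plain greedy nearest-neighbour ordering below are illustrative assumptions; they are not claimed to be the encodings, the Bellmore and Nemhauser heuristic, or the Anedma algorithm used in the paper.

```python
import numpy as np

def twiddle_words(n, bits=8):
    """Real parts of the n/2 twiddle factors as two's-complement bit patterns."""
    k = np.arange(n // 2)
    real = np.cos(2 * np.pi * k / n)
    scale = 2 ** (bits - 1) - 1
    quantized = np.round(real * scale).astype(int)
    return [int(v) & (2**bits - 1) for v in quantized]

def hamming(a, b):
    """Number of bit positions in which two words differ."""
    return bin(a ^ b).count("1")

def total_switching(words):
    """Sum of Hamming distances between consecutive words in the given order."""
    return sum(hamming(a, b) for a, b in zip(words, words[1:]))

def greedy_order(words):
    """Greedy nearest-neighbour ordering: always pick the closest unused word next."""
    remaining = list(words)
    order = [remaining.pop(0)]
    while remaining:
        nxt = min(range(len(remaining)),
                  key=lambda i: hamming(order[-1], remaining[i]))
        order.append(remaining.pop(nxt))
    return order

words = twiddle_words(64)
print("natural order:", total_switching(words))
print("greedy order: ", total_switching(greedy_order(words)))
```

A greedy ordering is not guaranteed to be optimal, or even better than the natural order in every case, which is why the comparison is printed rather than assumed.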

Journal Article•DOI•

A novel conflict-free parallel memory access scheme for FFT constant geometry architectures

CuiMei Ma¹, He Chen¹, JiYang Yu², Teng Long¹•Institutions (2)

Beijing Institute of Technology¹, China Academy of Space Technology²

02 Apr 2013-Science in China Series F: Information Sciences

TL;DR: A parallel conflict-free memory access scheme for constant geometry FFT architectures is proposed that, unlike previous schemes, uses only one modular addition operation and avoids complicated operations, thus reducing the hardware complexity of address generation.

Abstract: In this paper, we propose a parallel conflict-free access scheme for a constant geometry architecture that differs from previous schemes. The proposed method uses only one modular addition operation and does not involve complicated operations, thus reducing the hardware complexity of address generation. Because less combinational logic is needed to generate the access address, the scheme also reduces the time delay and accordingly improves the operating frequency of fast Fourier transform (FFT) processors. The scheme is formulated for an arbitrary radix, i.e., radix-r. It is applicable not only to radix-r FFT processors with one butterfly unit, but also to FFT processors with multiple butterfly units. Because the same architecture is used for every stage of the constant geometry, it can enhance the flexibility of the FFT implementation. Finally, we analyze the resource costs and time delay of the proposed method, and the results verify the advantages of the proposed scheme.
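
To make "conflict-free parallel access" concrete, the sketch below checks a classical digit-sum bank mapping, in which the bank index is the sum of an address's base-r digits modulo r. This is a well-known generic mapping used here purely for illustration, not the single-modular-addition scheme proposed in the paper. The constant geometry addressing (butterfly k reads addresses k + j·N/r and writes addresses r·k + j) and all function names are assumptions as well.

```python
def digit_sum_bank(address, radix, num_digits):
    """Bank index = sum of the base-`radix` digits of the address, modulo radix."""
    total = 0
    for _ in range(num_digits):
        total += address % radix
        address //= radix
    return total % radix

def constant_geometry_is_conflict_free(radix, num_digits):
    """True if every butterfly reads and writes its operands in distinct banks."""
    n = radix ** num_digits
    for k in range(n // radix):
        reads = [k + j * (n // radix) for j in range(radix)]   # assumed read pattern
        writes = [radix * k + j for j in range(radix)]         # assumed write pattern
        for group in (reads, writes):
            banks = {digit_sum_bank(a, radix, num_digits) for a in group}
            if len(banks) != radix:
                return False
    return True

for radix, digits in [(2, 4), (4, 3), (8, 2)]:
    print(radix, digits, constant_geometry_is_conflict_free(radix, digits))
```

Under these assumptions the operands of one butterfly differ in exactly one digit, so their digit sums differ modulo r and each operand lands in its own memory bank.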
