Showing papers on "Split-radix FFT algorithm" published in 2016

Open Access

Journal Article•DOI•

Fast convolution with free-space Green's functions

Felipe Vico¹, Leslie Greengard, Miguel Ferrando¹•Institutions (1)

15 Oct 2016-Journal of Computational Physics

TL;DR: Introduces a fast algorithm for computing volume potentials, that is, the convolution of a translation-invariant free-space Green's function with a compactly supported source distribution defined on a uniform grid.

84 citations
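
As a point of reference for the entry above, the basic building block of FFT-based convolution on a uniform grid can be sketched in a few lines. This is a generic one-dimensional, zero-padded (aperiodic) convolution, not the algorithm of the paper; the names `fft`, `fft_convolve`, and `direct_convolve` are illustrative assumptions.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_convolve(source, kernel):
    """Linear convolution computed as a zero-padded circular convolution."""
    m = len(source) + len(kernel) - 1
    n = 1
    while n < m:          # pad to a power of two that avoids wrap-around
        n *= 2
    a = fft(list(source) + [0] * (n - len(source)))
    b = fft(list(kernel) + [0] * (n - len(kernel)))
    c = fft([p * q for p, q in zip(a, b)], inverse=True)
    return [(v / n).real for v in c[:m]]   # real inputs assumed

def direct_convolve(source, kernel):
    """O(len(source) * len(kernel)) reference implementation."""
    out = [0.0] * (len(source) + len(kernel) - 1)
    for i, s in enumerate(source):
        for j, g in enumerate(kernel):
            out[i + j] += s * g
    return out
```

The zero padding is what turns the circular convolution implied by the DFT into the aperiodic convolution needed for a compactly supported source.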

Journal Article•DOI•

The Serial Commutator FFT

Mario Garrido¹, Shen-Jui Huang², Sau-Gee Chen³, Oscar Gustafsson¹•Institutions (3)

Linköping University¹, Novatek², National Chiao Tung University³

03 Mar 2016-IEEE Transactions on Circuits and Systems II: Express Briefs

TL;DR: This brief presents a new type of fast Fourier transform (FFT) hardware architecture called the serial commutator (SC) FFT, based on the observation that, in the radix-2 FFT algorithm, only half of the samples at each stage must be rotated.

Abstract: This brief presents a new type of fast Fourier transform (FFT) hardware architecture called the serial commutator (SC) FFT. The SC FFT is characterized by the use of circuits for bit-dimension permutation of serial data. The proposed architectures are based on the observation that, in the radix-2 FFT algorithm, only half of the samples at each stage must be rotated. This fact, together with proper data management, makes it possible to allocate rotations only every other clock cycle. This allows for simplifying the rotator, halving the complexity with respect to conventional serial FFT architectures. Likewise, the proposed approach halves the number of adders in the butterflies with respect to previous architectures. As a result, the proposed architectures use the minimum number of adders, rotators, and memory that are necessary for a pipelined FFT of serial data, with a 100% utilization ratio.

33 citations
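
The observation that only half of the samples at each radix-2 stage are rotated can be checked with a small software model. The sketch below is an ordinary recursive decimation-in-frequency FFT with a counter added; it says nothing about the SC FFT hardware itself, and the `counts` bookkeeping is an illustrative assumption. Trivial rotations (by an angle of zero) are counted as well.

```python
import cmath

def dif_fft(x, counts=None, stage=0):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two.

    If counts is a dict, counts[stage] accumulates how many samples pass
    through a twiddle multiplication (rotation) at that stage.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    half = n // 2
    # Upper butterfly outputs: plain sums, never rotated.
    top = [x[i] + x[i + half] for i in range(half)]
    # Lower butterfly outputs: differences, each followed by a rotation.
    bottom = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
              for i in range(half)]
    if counts is not None:
        counts[stage] = counts.get(stage, 0) + half
    out = [0j] * n
    out[0::2] = dif_fft(top, counts, stage + 1)     # even-indexed bins
    out[1::2] = dif_fft(bottom, counts, stage + 1)  # odd-indexed bins
    return out
```

For a 16-point transform, every one of the four stages reports 8 rotated samples, that is, N/2 per stage.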

Journal Article•DOI•

A New Representation of FFT Algorithms Using Triangular Matrices

Mario Garrido¹•Institutions (1)

Linköping University¹

26 Aug 2016-IEEE Transactions on Circuits and Systems

TL;DR: The triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.

Abstract: In this paper we propose a new representation for FFT algorithms called the triangular matrix representation. This representation is more general than the binary tree representation and, therefore, it introduces new FFT algorithms that were not discovered before. Furthermore, the new representation has the advantage that it is simple and easy to understand, as each FFT algorithm only consists of a triangular matrix. Besides, the new representation allows for obtaining the exact twiddle factor values in the FFT flow graph easily. This facilitates the design of FFT hardware architectures. As a result, the triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.

32 citations

Journal Article•DOI•

Low-Power Split-Radix FFT Processors Using Radix-2 Butterfly Units

Zhuo Qian¹, Martin Margala¹•Institutions (1)

University of Massachusetts Lowell¹

12 Apr 2016-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: Simulation results show that compared with the conventional radix-2 shared-memory implementations, the proposed design achieves over 20% lower power consumption when computing a 1024-point complex-valued transform.

Abstract: Split-radix fast Fourier transform (SRFFT) is an ideal candidate for the implementation of a low-power FFT processor, because it has the lowest number of arithmetic operations among all the FFT algorithms. In the design of such processors, an efficient addressing scheme for FFT data as well as twiddle factors is required. The signal flow graph of SRFFT is the same as that of the radix-2 FFT, and therefore, the conventional address generation schemes for FFT data can also be applied to SRFFT. However, SRFFT has irregular twiddle factor locations, which forbids the application of radix-2 address generation methods to the twiddle factors. This brief presents a shared-memory low-power SRFFT processor architecture. We show that SRFFT can be computed by using a modified radix-2 butterfly unit. The butterfly unit exploits the multiplier-gating technique to save dynamic power at the expense of using more hardware resources. In addition, two novel address generation algorithms for both the trivial and nontrivial twiddle factors are developed. Simulation results show that compared with the conventional radix-2 shared-memory implementations, the proposed design achieves over 20% lower power consumption when computing a 1024-point complex-valued transform.

30 citations
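
For readers unfamiliar with the split-radix decomposition mentioned in this entry, a minimal software sketch follows. It is the textbook recursive form (one half-length transform of the even samples plus two quarter-length transforms of the samples at indices 4n+1 and 4n+3), not the processor architecture of the paper; `split_radix_fft` and `naive_dft` are illustrative names.

```python
import cmath

def naive_dft(x):
    """O(N^2) reference DFT used only for checking."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))
            for k in range(n)]

def split_radix_fft(x):
    """Recursive split-radix FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n == 2:
        return [complex(x[0] + x[1]), complex(x[0] - x[1])]
    u = split_radix_fft(x[0::2])    # even samples, length n/2
    z = split_radix_fft(x[1::4])    # samples 4m+1, length n/4
    zp = split_radix_fft(x[3::4])   # samples 4m+3, length n/4
    q = n // 4
    out = [0j] * n
    for k in range(q):
        a = cmath.exp(-2j * cmath.pi * k / n) * z[k]        # W^k   * Z[k]
        b = cmath.exp(-2j * cmath.pi * 3 * k / n) * zp[k]   # W^3k  * Z'[k]
        s, d = a + b, a - b
        out[k] = u[k] + s
        out[k + 2 * q] = u[k] - s
        out[k + q] = u[k + q] - 1j * d
        out[k + 3 * q] = u[k + q] + 1j * d
    return out
```

The multiplications by `1j` are sign and swap operations rather than true multiplications, which is one source of the low arithmetic count attributed to split-radix.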

Journal Article•DOI•

Measurement of power system harmonic based on adaptive Kaiser self-convolution window

Wenxuan Yao, Zhaosheng Teng, Qiu Tang, Yunpeng Gao

04 Feb 2016-IET Generation, Transmission & Distribution

TL;DR: In this paper, a fast Fourier transform (FFT) based method for time-varying power system harmonic measurement is proposed, where the harmonic signal is preprocessed by infinite impulse response filter bank and Teager-Kaiser energy operator for fast detection of instability onset time.

Abstract: With the rapid development of power electronic devices, harmonic pollution has become one of the principal power quality problems in power systems. The fast Fourier transform (FFT) is widely used for analysing and measuring power system harmonics. However, limitations of the FFT such as aliasing, spectral leakage, and the picket-fence effect lead to inaccurate results. Furthermore, a real power system harmonic is a non-stationary signal, whereas the FFT is a tool for stationary signals. This study focuses on a novel FFT-based method for time-varying power system harmonic measurement. The harmonic signal is preprocessed by an infinite impulse response filter bank and the Teager–Kaiser energy operator for fast detection of the instability onset time. Then an adaptive Kaiser self-convolution window-based interpolated FFT algorithm is used to estimate each harmonic component. The results of both simulation and practical implementation show that the proposed method is suitable for time-varying harmonics and achieves higher accuracy than traditional FFT-based techniques.

29 citations

Journal Article•DOI•

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO

Antony Xavier Glittas¹, Mathini Sellathurai¹, G. Lakshminarayanan²•Institutions (2)

Heriot-Watt University¹, National Institute of Technology, Tiruchirappalli²

06 Jan 2016-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: This brief presents a novel pipelined FFT processor for the FFT computation of two independent data streams based on the multipath delay commutator FFT architecture, which requires a lower number of registers and has high throughput.

Abstract: Nowadays, many applications require simultaneous computation of multiple independent fast Fourier transform (FFT) operations with their outputs in natural order. Therefore, this brief presents a novel pipelined FFT processor for the FFT computation of two independent data streams. The proposed architecture is based on the multipath delay commutator FFT architecture. It has an $N/2$-point decimation-in-time FFT and an $N/2$-point decimation-in-frequency FFT to process the odd and even samples of two data streams separately. The main feature of the architecture is that the bit reversal operation is performed by the architecture itself, so the outputs are generated in normal order without any dedicated bit reversal circuit. The bit reversal operation is performed by the shift registers in the FFT architecture by interleaving the data. Therefore, the proposed architecture requires a lower number of registers and has high throughput.

28 citations
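
The bit reversal operation that this entry refers to is the index permutation relating natural order to the output order of an in-place radix-2 FFT. A minimal software sketch is given below; the paper performs the reordering inside the pipeline's shift registers, which this example does not model, and the function names are illustrative.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def bit_reversal_permute(x):
    """Return x reordered so that element i comes from index bit_reverse(i)."""
    n = len(x)
    bits = n.bit_length() - 1
    if n == 0 or (1 << bits) != n:
        raise ValueError("length must be a power of two")
    return [x[bit_reverse(i, bits)] for i in range(n)]
```

Applying the permutation twice restores the original order, since reversing the bits of an index is its own inverse.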

Journal Article•DOI•

Multiplierless Unity-Gain SDF FFTs

Mario Garrido¹, Rikard Andersson¹, Fahad Qureshi², Oscar Gustafsson¹•Institutions (2)

Linköping University¹, Tampere University of Technology²

01 Apr 2016-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: A novel approach to implement multiplierless unity-gain single-delay feedback fast Fourier transforms (FFTs) without compensation circuits, even when using nonunity-gain rotators, by a joint design of rotators.

Abstract: In this brief, we propose a novel approach to implement multiplierless unity-gain single-delay feedback fast Fourier transforms (FFTs). Previous methods achieve unity-gain FFTs by using either complex multipliers or nonunity-gain rotators with additional scaling compensation. Conversely, this brief proposes unity-gain FFTs without compensation circuits, even when using nonunity-gain rotators. This is achieved by a joint design of rotators, so that the entire FFT is scaled by a power of two, which is then shifted to unity. This reduces the amount of hardware resources of the FFT architecture, while having high accuracy in the calculations. The proposed approach can be applied to any FFT size, and various designs for different FFT sizes are presented.

26 citations

Journal Article•DOI•

Area-Efficient Scaling-Free DFT/FFT Design Using Stochastic Computing

Bo Yuan¹, Yanzhi Wang², Zhongfeng Wang³•Institutions (3)

City University of New York¹, Syracuse University², Nanjing University³

26 Aug 2016-IEEE Transactions on Circuits and Systems II: Express Briefs

TL;DR: Among various discrete transforms, discrete Fourier transformation (DFT) is the most important technique that performs Fourier analysis in various practical applications, such as digital signal processing, wireless communications, to name a few.

Abstract: Discrete Fourier transform (DFT) is an important transformation technique in signal processing tasks. Due to its ultrahigh computing complexity of $O(N^2)$, the $N$-point DFT is usually implemented in the form of the fast Fourier transformation (FFT) with a complexity of $O(N\log N)$. Despite this significant reduction in complexity, the hardware cost of the multiplication-intensive $N$-point FFT is still prohibitive, particularly for many large-scale applications that require large $N$. This brief, for the first time, proposes high-accuracy low-complexity scaling-free stochastic DFT/FFT designs. With the use of the stochastic computing technique, the hardware complexity of the DFT/FFT designs is significantly reduced. More importantly, this brief presents the scaling-free stochastic adder and the random number generator sharing scheme, which enable a significant reduction in accuracy loss and hardware cost. Analysis results show that the proposed stochastic DFT/FFT designs achieve much better hardware performance and accuracy performance than state-of-the-art stochastic designs.

26 citations

Journal Article•DOI•

Configurable Floating-Point FFT Accelerator on FPGA Based Multiple-Rotation CORDIC

Jiyang Chen, Yuanwu Lei, Yuanxi Peng, Tingting He, Deng Ziye

01 Nov 2016-Chinese Journal of Electronics

TL;DR: A configurable floating-point FFT accelerator based on CORDIC rotation is proposed, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory.

Abstract: The Fast Fourier transform (FFT) accelerator and the Coordinate Rotation Digital Computer (CORDIC) algorithm play important roles in signal processing. We propose a configurable floating-point FFT accelerator based on CORDIC rotation, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory. To finish CORDIC rotation efficiently, a novel approach in which segmented-parallel iteration and compress iteration based on CSA are presented and redundant CORDIC is used to reduce the latency of each iteration. To prove the efficiency of our FFT accelerator, four FFT accelerators are prototyped on an FPGA chip to perform a batch FFT. Experimental results show that our structure, which is composed of four butterfly units and handles FFT sizes ranging from 64 to 8192 points, occupies 33230 (3%) REGs and 143006 (30%) LUTs. The clock frequency can reach 122 MHz. The resource usage of the double-precision FFT is only about 2.5 times that of the single-precision FFT, while the theoretical ratio is 4. Moreover, only 13331 cycles are required to compute an 8192-point double-precision FFT with four butterfly units in parallel.

18 citations
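
To make the idea of replacing a twiddle multiplication with a CORDIC rotation concrete, here is a minimal floating-point model of the basic rotation-mode CORDIC iteration. It is the classic shift-and-add scheme, not the segmented-parallel or redundant variant described in the abstract; `cordic_rotate` and its `iterations` parameter are illustrative assumptions, and the input angle is assumed to lie in [-pi/2, pi/2].

```python
import math

def cordic_rotate(x, y, angle, iterations=32):
    """Rotate the vector (x, y) counterclockwise by `angle` radians.

    Each step uses only a multiplication by 2**-i (a shift in hardware),
    additions, and a table lookup of atan(2**-i).
    """
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))   # constant CORDIC gain
    z = angle                                      # residual angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0                # rotation direction
        shift = 2.0 ** -i
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)
    return x / gain, y / gain                      # compensate the gain
```

Multiplying a sample by the twiddle factor exp(-2j*pi*k/N) corresponds to calling `cordic_rotate(re, im, -2 * math.pi * k / N)`; angles outside the convergence range would first need a quadrant correction, which is omitted here.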

Proceedings Article•DOI•

cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs

Cheng Wang¹, Sunita Chandrasekaran², Barbara Chapman¹•Institutions (2)

University of Houston¹, University of Delaware²

23 May 2016

TL;DR: The challenges and proposed effective solutions to efficiently port sFFT to massively parallel processors, such as GPUs, using CUDA are explored and some of the optimization strategies such as index coalescing, loop splitting, asynchronous data layout transformation, linear time selection algorithm are presented.

Abstract: The Fast Fourier Transform (FFT) is one of the most important numerical tools widely used in many scientific and engineering applications. The algorithm performs O(n log n) operations on n input data points in order to calculate only a small number k of large coefficients, while the remaining n - k values are zero or negligibly small. The algorithm is clearly inefficient, when n points input data lead to only k

18 citations

Proceedings Article•DOI•

The hexagonal fast Fourier transform

James B. Birdsong¹, Nicholas I. Rummelt²•Institutions (2)

University of Florida¹, Air Force Research Laboratory²

19 Aug 2016

TL;DR: This work develops a hexagonal FFT in ASA coordinates that uses only the standard Fourier transform, allowing the user to implement the hexagonally sampled FFT using standard FFT routines.

Abstract: The discrete Fourier transform is an important tool for processing digital images. Efficient algorithms for computing the Fourier transform are known as fast Fourier transforms (FFTs). One of the most common of these is the Cooley-Tukey radix-2 decimation algorithm that efficiently transforms one-dimensional data into its frequency domain representation. The orthogonality of rectangular sampling allows the separability of the Fourier kernel which enables the use of the Cooley-Tukey algorithm on two-dimensional digital images that have been sampled rectangularly. Hexagonal sampling provides many benefits over rectangular sampling, but it does not result in the orthogonal rows and columns that can be transformed independently as is done with rectangular samples. Use of the Array Set Addressing (ASA) coordinate system for hexagonally sampled images has been shown to provide a separable Fourier kernel, leading to an efficient FFT, however its implementation is composed of nonstandard transforms that require custom routines to evaluate. This work develops a hexagonal FFT in ASA coordinates that uses only the standard Fourier transform, allowing the user to implement the hexagonal FFT using standard FFT routines.
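
The separability that the abstract relies on can be illustrated for the ordinary rectangular case: a 2D DFT equals 1D DFTs over the rows followed by 1D DFTs over the columns. The sketch below covers only rectangular sampling and does not implement the ASA-based hexagonal transform; the function names are illustrative.

```python
import cmath

def dft(x):
    """Plain O(N^2) one-dimensional DFT."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))
            for k in range(n)]

def dft2_row_column(img):
    """2D DFT by transforming every row, then every column."""
    rows = [dft(r) for r in img]
    cols = [dft(list(c)) for c in zip(*rows)]   # cols[v][u]
    return [list(r) for r in zip(*cols)]        # back to out[u][v]

def dft2_direct(img):
    """2D DFT straight from the definition, for comparison."""
    nr, nc = len(img), len(img[0])
    return [[sum(img[r][c] * cmath.exp(-2j * cmath.pi * (u * r / nr + v * c / nc))
                 for r in range(nr) for c in range(nc))
             for v in range(nc)]
            for u in range(nr)]
```

Replacing `dft` with any standard 1D FFT routine leaves the row-column structure unchanged, which is why separable kernels map directly onto existing FFT libraries.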

Proceedings Article•DOI•

Area-efficient scaling-free DFT/FFT design using stochastic computing

Bo Yuan¹, Yanzhi Wang², Zhongfeng Wang³•Institutions (3)

City University of New York¹, Syracuse University², Broadcom³

22 May 2016

TL;DR: This brief presents the scaling-free stochastic adder and the random number generator sharing scheme, which enable a significant reduction in accuracy loss and hardware cost and achieve much better hardware performance and accuracy performance than state-of-the-art stochastic designs.

Abstract: Among various discrete transforms, the discrete Fourier transformation (DFT) is the most important technique for performing Fourier analysis in practical applications such as digital signal processing and wireless communications, to name a few. Due to its ultra-high computing complexity of $O(N^2)$, in practice the $N$-point DFT is usually performed in the form of the fast Fourier transformation (FFT) with a complexity of $O(N\log N)$. Despite this significant reduction in computing complexity, the hardware cost of the multiplication-intensive $N$-point FFT is still prohibitive, especially for many large-scale applications that require large $N$.

Proceedings Article•DOI•

Area-Efficient Error-Resilient Discrete Fourier Transformation Design using Stochastic Computing

Bo Yuan¹, Yanzhi Wang², Zhongfeng Wang³•Institutions (3)

City University of New York¹, Syracuse University², Nanjing University³

18 May 2016

TL;DR: Analysis results show that compared with the conventional design, the proposed two 256-point stochastic DFT designs achieve 76% and 62% reduction in area, respectively, and show much stronger error-resilience, which is very attractive in the nanoscale CMOS era.

Abstract: Discrete Fourier Transformation (DFT) and Fast Fourier Transformation (FFT) are widely used techniques in numerous modern signal processing applications. In general, because of their inherent multiplication-intensive characteristics, hardware implementations of DFT/FFT usually require a large amount of hardware resources, which limits their use in area-constrained scenarios. To overcome this challenge, this paper, for the first time, proposes area-efficient error-resilient DFT designs using stochastic computing. By leveraging low-complexity stochastic multipliers, two types of stochastic DFT design are presented with significant reduction in overall area. Analysis results show that compared with the conventional design, the proposed two 256-point stochastic DFT designs achieve 76% and 62% reduction in area, respectively. More importantly, these stochastic DFT designs also show much stronger error-resilience, which is very attractive in the nanoscale CMOS era.
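
The low-complexity stochastic multiplier mentioned in the abstract rests on a simple principle, sketched below in software: if two values in [0, 1] are encoded as independent random bitstreams whose fraction of ones equals the value, a bitwise AND of the streams yields a stream encoding their product. This is the standard unipolar scheme and only an assumption about the general principle; the paper's actual multiplier design, and the signed encoding a DFT needs, are not reproduced. The names, stream length, and seed are illustrative.

```python
import random

def to_stream(p, length, rng):
    """Encode a probability p in [0, 1] as a random bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def stochastic_multiply(p, q, length=4096, seed=0):
    """Estimate p * q by ANDing two independent unipolar bitstreams."""
    rng = random.Random(seed)          # seeded for repeatability
    a = to_stream(p, length, rng)
    b = to_stream(q, length, rng)
    ones = sum(u & v for u, v in zip(a, b))
    return ones / length               # fraction of ones decodes the result
```

The estimate is noisy, with an error that shrinks roughly as one over the square root of the stream length, which is the accuracy versus latency trade-off that stochastic designs have to manage.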

Journal Article•DOI•

Vector Radix 2 × 2 Sliding Fast Fourier Transform

Keun-Yung Byun¹, Chun Su Park², Jee Young Sun¹, Sung-Jea Ko•Institutions (2)

Korea University¹, Sejong University²

05 Jan 2016-Mathematical Problems in Engineering

TL;DR: A stable 2D sliding fast Fourier transform (FFT) algorithm based on the vector radix 2 × 2 FFT is presented and theoretical analysis shows that the proposed algorithm has the lowest computational requirements among the existing stable sliding DFT algorithms.

Abstract: The two-dimensional (2D) discrete Fourier transform (DFT) in the sliding window scenario has been successfully used for numerous applications requiring consecutive spectrum analysis of input signals. However, the results of conventional sliding DFT algorithms are potentially unstable because of the accumulated numerical errors caused by the recursive strategy. In this letter, a stable 2D sliding fast Fourier transform (FFT) algorithm based on the vector radix (VR) 2 × 2 FFT is presented. In the VR-2 × 2 FFT algorithm, each 2D DFT bin is hierarchically decomposed into four sub-DFT bins until the size of the sub-DFT bins is reduced to 2 × 2; the output DFT bins are calculated using the linear combination of the sub-DFT bins. Because the sub-DFT bins for the overlapped input signals between the previous and current window are the same, the proposed algorithm reduces the computational complexity of the VR-2 × 2 FFT algorithm by reusing previously calculated sub-DFT bins in the sliding window scenario. Moreover, because the resultant DFT bins are identical to those of the VR-2 × 2 FFT algorithm, numerical errors do not accumulate; therefore, unconditional stability is guaranteed. Theoretical analysis shows that the proposed algorithm has the lowest computational requirements among the existing stable sliding DFT algorithms.
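
The recursive strategy that the abstract contrasts itself with can be shown in one dimension. The sketch below is the conventional sliding DFT update, in which each new spectrum is derived from the previous one, so rounding errors carry forward from window to window. It is not the vector-radix algorithm proposed in the paper, and the function names are illustrative.

```python
import cmath

def dft(x):
    """Plain O(N^2) DFT of one window."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))
            for k in range(n)]

def sliding_dft(signal, n):
    """Return the n-point spectrum of every length-n window of signal.

    Only the first window is transformed directly; each later spectrum
    is obtained from the previous one with O(n) work.
    """
    bins = dft(signal[:n])
    rot = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    spectra = [bins]
    for t in range(n, len(signal)):
        delta = signal[t] - signal[t - n]   # newest sample in, oldest out
        bins = [(b + delta) * r for b, r in zip(bins, rot)]
        spectra.append(bins)
    return spectra
```

Because every spectrum is computed from its predecessor, any rounding error in `bins` is carried into all later windows, which is the stability concern the abstract describes.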

Journal Article•DOI•

Fast Acquisition of GPS Signal Using FFT Decomposition

Shaik Fayaz Ahamed¹, G. Sasibhushana Rao², L. Ganesh³•Institutions (3)

Velagapudi Ramakrishna Siddhartha Engineering College¹, Andhra University², Anil Neerukonda Institute of Technology and Sciences³

01 Jan 2016-Procedia Computer Science

TL;DR: A new GPS signal acquisition method based on decomposition of the FFT is proposed to improve acquisition performance; it is implemented, validated, and compared with conventional serial search and radix-2 FFT search algorithms using an Intermediate Frequency GPS signal.

Proceedings Article•DOI•

CORDIC-based FFT real-time processing design and FPGA implementation

Aimei Tang¹, Li Yu¹, Fangjian Han¹, Zhiqiang Zhang¹•Institutions (1)

National University of Defense Technology¹

04 Mar 2016

TL;DR: A design scheme for a high-speed, real-time, serial pipelined Fast Fourier Transform (FFT) processor on FPGA based on the Coordinate Rotation Digital Computer (CORDIC) algorithm, which reduces the hardware complexity compared to a direct implementation of the butterflies using complex multipliers.

Abstract: This paper presents a design scheme for a high-speed, real-time, serial pipelined Fast Fourier Transform (FFT) processor on FPGA based on the Coordinate Rotation Digital Computer (CORDIC) algorithm. The CORDIC algorithm reduces the hardware complexity compared to a direct implementation of the butterflies using complex multipliers. Moreover, the design uses the butterflies of the radix-2 Decimation-In-Time (DIT) algorithm, dual-port RAM, and a pipelined structure, which increase the performance of the FFT processor. The simulation results show that, compared with real-time FFT processors of the same type, the scheme presented in this paper reduces the Adaptive Look-up Table (ALUT) resource requirements and increases the signal-to-noise ratio (SNR) by about 25 dB.

Exact Signal Measurements using FFT Analysis

Stefan Scholl

01 Jan 2016

TL;DR: This tutorial describes how to accurately measure signal power using the FFT, covering the different effects that introduce errors during FFT processing and how they can be avoided or compensated.

Abstract: This tutorial describes how to accurately measure signal power using the FFT. The different effects that introduce errors during FFT processing are described and it is explained how they can be avoided or compensated.

Proceedings Article•DOI•

Accelerating Discrete Fourier Transforms with dot-product engine

Miao Hu¹, John Paul Strachan¹•Institutions (1)

Hewlett-Packard¹

01 Oct 2016

TL;DR: This paper presents a solution for computing the DFT using the dot-product engine (DPE), a one-transistor one-memristor (1T1M) crossbar array with hybrid peripheral circuit support, with which the computing complexity is reduced to a constant $O(\lambda)$ independent of the input data size.

Abstract: Discrete Fourier Transforms (DFT) are extremely useful in signal processing. Usually they are computed with the Fast Fourier Transform (FFT) method, as it reduces the computing complexity from $O(N^2)$ to $O(N\log N)$. However, the FFT is still not powerful enough for many real-time tasks that have stringent requirements on throughput, energy efficiency and cost, such as the Internet of Things (IoT). In this paper, we present a solution for computing the DFT using the dot-product engine (DPE), a one-transistor one-memristor (1T1M) crossbar array with hybrid peripheral circuit support. With this solution, the computing complexity is further reduced to a constant $O(\lambda)$ independent of the input data size, where $\lambda$ is the timing ratio of one DPE operation compared to one real multiplication operation in digital systems.
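
The reason a dot-product engine is a natural fit for the DFT is that the transform is a single matrix-vector product: each output bin is the dot product of one row of the DFT matrix with the input. The sketch below shows that formulation in software only; it does not model the memristor crossbar, and `dft_matrix` and `dft_by_matvec` are illustrative names.

```python
import cmath

def dft_matrix(n):
    """The n-by-n DFT matrix F with F[k][m] = exp(-2*pi*i*k*m/n)."""
    return [[cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)]
            for k in range(n)]

def dft_by_matvec(x):
    """Compute the DFT as n independent dot products (one per output bin)."""
    f = dft_matrix(len(x))
    return [sum(w * v for w, v in zip(row, x)) for row in f]
```

In software this costs on the order of N^2 multiplications; the constant-time claim in the abstract comes from evaluating all of these dot products in parallel in analog hardware.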

Proceedings Article•DOI•

Reduced FFT algorithm for network voltage disturbances detection

Aleksandar M. Stanisavljevic¹, Vladimir Katic¹, Bane Popadic¹, Boris Dumnic¹, Rade J. Radisic¹, Ilija Kovačević¹•Institutions (1)

University of Novi Sad¹

01 Nov 2016

TL;DR: Describes the Reduced Fast Fourier Transformation (RFFT), a harmonic estimation algorithm based on the FFT, created by the authors and convenient for voltage dip detection, and tests it in the Matlab / SimPowerSystems environment.

Abstract: The paper describes the Reduced Fast Fourier Transformation (RFFT), a harmonic estimation algorithm based on the FFT, created by the authors and convenient for voltage dip detection. The algorithm is simple, fast, computationally inexpensive and sufficiently accurate. It is tested in the Matlab / SimPowerSystems environment. Results show that the algorithm is faster and better than the FFT, which is an advantage in applications for network voltage disturbance detection.

Proceedings Article•DOI•

Accurate performance analysis of a fixed point FFT

Pankaj Gupta¹•Institutions (1)

Texas Instruments¹

01 Mar 2016

TL;DR: A technique to accurately estimate the impact of fixed-point arithmetic on FFT performance by measuring the Signal-to-Quantization Noise Ratio (SQNR) of a $2^n$ (= $N$) point radix-2 FFT implementation, with simulation results that illustrate the accuracy of the theoretical analysis.

Abstract: The Fast Fourier Transform (FFT) algorithm is widely used in today's digital signal processing applications. In practice, fixed-point arithmetic is used for hardware implementations. The finite-bit representation of signals introduces quantization error and thereby limits accuracy. In this paper, we present a technique to accurately estimate the impact of fixed-point arithmetic on FFT performance. We evaluate the fixed-point accuracy by measuring the Signal-to-Quantization Noise Ratio (SQNR) of a $2^n$ (= $N$) point radix-2 FFT implementation. This SQNR analysis is used to determine the fixed-point precision of the FFT implementation that provides a good trade-off between the required hardware resources and final FFT output signal integrity. Finally, we present simulation results to illustrate the accuracy of the theoretical analysis.
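
An SQNR measurement of the kind described can be simulated in a few lines. The sketch below is a simplified model under stated assumptions, not the paper's analysis: data are rounded to `frac_bits` fractional bits after every butterfly, each stage scales by 1/2 to avoid overflow, and twiddle factors are kept in full precision. The names `quantize`, `fft_scaled`, and `sqnr_db` are illustrative.

```python
import cmath
import math

def quantize(v, frac_bits):
    """Round the real and imaginary parts to frac_bits fractional bits."""
    s = 1 << frac_bits
    return complex(round(v.real * s) / s, round(v.imag * s) / s)

def fft_scaled(x, frac_bits=None):
    """Radix-2 DIT FFT with 1/2 scaling per stage, so the output is DFT(x)/N.

    With frac_bits set, every butterfly output is quantized; with
    frac_bits=None the same computation runs in floating point.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_scaled(x[0::2], frac_bits)
    odd = fft_scaled(x[1::2], frac_bits)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        a, b = (even[k] + t) / 2, (even[k] - t) / 2
        if frac_bits is not None:
            a, b = quantize(a, frac_bits), quantize(b, frac_bits)
        out[k], out[k + n // 2] = a, b
    return out

def sqnr_db(x, frac_bits):
    """Signal-to-quantization-noise ratio of the fixed-point model, in dB."""
    ref = fft_scaled(x)
    fix = fft_scaled([quantize(complex(v), frac_bits) for v in x], frac_bits)
    signal = sum(abs(r) ** 2 for r in ref)
    noise = sum(abs(r - f) ** 2 for r, f in zip(ref, fix))
    return 10 * math.log10(signal / noise)
```

Sweeping `frac_bits` and plotting `sqnr_db` is one way to pick a word length; in this model each extra bit is expected to buy roughly 6 dB.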

Proceedings Article•DOI•

A generator of memory-based, runtime-reconfigurable $2^N 3^M 5^K$ FFT engines

Angie Wang¹, Jonathan Bachrach¹, Borivoje Nikolić¹•Institutions (1)

University of California, Berkeley¹

20 Mar 2016

TL;DR: The Chisel hardware construction language has been used in this work to create a generator of runtime-reconfigurable $2^n 3^m 5^k$ FFT engines targeting software-defined radios (SDR) for modern communications, but with flexibility to support a wide range of applications.

Abstract: Runtime-reconfigurable, mixed-radix FFT/IFFT engines are essential for modern wireless communication systems. To comply with varying standards requirements, these engines are customized for each modem. The Chisel hardware construction language has been used in this work to create a generator of runtime-reconfigurable $2^n 3^m 5^k$ FFT engines targeting software-defined radios (SDR) for modern communications, but with flexibility to support a wide range of applications. The generator uses a conflict-free, in-place, multi-bank SRAM design, and exploits the duality of decimation-in-frequency (DIF) and decimation-in-time (DIT) FFTs to support continuous data flow with only 2N memory blocks. DFT decomposition using the prime-factor algorithm (PFA) followed by the Cooley-Tukey algorithm (CTA) reduces twiddle ROM sizes. A programmable Winograd's Fourier Transform Algorithm (WFTA) butterfly supporting radix-2/3/4/5/7 operations reuses radix-7 hardware to support reconfigurability with minimal area penalty. The generated FFTs use 50% less memory than iterative FFTs from Spiral. The twiddle ROM size of the generated LTE/WiFi FFT engine is 16% smaller than that of a 2048-pt Spiral design.
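
To show what a transform length of the form 2^n 3^m 5^k implies algorithmically, the sketch below applies the Cooley-Tukey decomposition recursively with radices 2, 3, and 5. It is a plain software illustration using only Cooley-Tukey steps; the prime-factor mapping, the Winograd butterflies, and the memory architecture from the abstract are not modelled, and the function names are illustrative.

```python
import cmath

def naive_dft(x):
    """O(N^2) reference DFT used only for checking."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))
            for k in range(n)]

def mixed_radix_fft(x):
    """Recursive Cooley-Tukey FFT for lengths of the form 2^a * 3^b * 5^c."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    for p in (2, 3, 5):
        if n % p == 0:
            break
    else:
        raise ValueError("length must be of the form 2^a * 3^b * 5^c")
    m = n // p
    # p interleaved sub-sequences, each transformed with length m.
    subs = [mixed_radix_fft(x[r::p]) for r in range(p)]
    out = [0j] * n
    for k in range(m):
        # Apply the twiddle factors W_n^(r*k) to the sub-transform outputs.
        tw = [cmath.exp(-2j * cmath.pi * r * k / n) * subs[r][k] for r in range(p)]
        # Radix-p butterfly: a p-point DFT across the twiddled values.
        for q in range(p):
            out[k + m * q] = sum(tw[r] * cmath.exp(-2j * cmath.pi * r * q / p)
                                 for r in range(p))
    return out
```

A length such as 1200 (the largest LTE DFT size, 2^4 * 3 * 5^2) is handled by the same recursion that handles a power of two.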

Proceedings Article•DOI•

Accurate estimation method of sinusoidal frequency based on FFT

Lei Fan¹, Guoqing Qi¹, Wenbo He²•Institutions (2)

Dalian Maritime University¹, Dalian Polytechnic University²

27 Jul 2016

TL;DR: To further improve the estimation precision of sinusoidal frequency, a new estimation method based on the Fast Fourier Transform (FFT) is proposed, which has a low SNR threshold and outperforms the existing estimators.

Abstract: To further improve the estimation precision of sinusoidal frequency, a new estimation method based on the Fast Fourier Transform (FFT) is proposed. Zero-padding is applied before the coarse estimation, and three sample values of the Discrete-Time Fourier Transform (DTFT) of the original signal are used to perform the fine estimation. Computer simulations show that the proposed estimation method follows the Cramer-Rao Bound over the whole range of frequency offsets. The proposed estimator has a low SNR threshold and outperforms the existing estimators in estimation precision.
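
The coarse-then-fine structure described in the abstract can be illustrated with a generic two-step estimator. In the sketch below the coarse step picks the peak of a zero-padded spectrum and the fine step fits a parabola through the three magnitudes around that peak. The parabolic fit is a common textbook refinement substituted here for illustration; the abstract does not give the paper's own three-sample formula, so this should not be read as the proposed method. Function names and the `pad_factor` parameter are assumptions.

```python
import cmath

def dtft_samples(x, m):
    """Sample the DTFT of x at m uniformly spaced frequencies.

    This equals the DFT of x zero-padded to length m.
    """
    return [sum(v * cmath.exp(-2j * cmath.pi * k * n / m) for n, v in enumerate(x))
            for k in range(m)]

def estimate_frequency(x, pad_factor=4):
    """Estimate the frequency of a complex sinusoid in cycles per sample."""
    m = pad_factor * len(x)
    mags = [abs(v) for v in dtft_samples(x, m)]
    k = max(range(m), key=mags.__getitem__)          # coarse estimate
    a, b, c = mags[(k - 1) % m], mags[k], mags[(k + 1) % m]
    denom = a - 2 * b + c
    delta = 0.5 * (a - c) / denom if denom != 0 else 0.0   # fine correction
    return (k + delta) / m
```

With a noiseless 64-sample tone, the refinement moves the estimate from the nearest padded bin to within a small fraction of a bin of the true frequency.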

Proceedings Article•DOI•

Design of 32-point mixed radix FFT processor using CSD multiplier

Rajdeep Kaur¹, Tarandip Singh¹•Institutions (1)

Sri Guru Granth Sahib World University¹

01 Jan 2016

TL;DR: Canonic Signed Digit (CSD) constant multiplier is used which minimizes the count of complex multipliers and twiddle factor memory size to achieve an optimized FFT processor in terms of area and memory requirements.

Abstract: In this paper, a modified FFT (Fast Fourier Transform) processor using the mixed-radix DIT algorithm is presented. A Canonic Signed Digit (CSD) constant multiplier is used, which minimizes the number of complex multipliers and the twiddle factor memory size to achieve an FFT processor optimized in terms of area and memory requirements. Fixed-point number representation is used to minimize memory consumption and I/O bandwidth. The proposed FFT processor is written in VHDL and synthesized using the Xilinx ISE design tool, version 14.7. The target device is the XC6SLX45T from the Spartan-6 family. The ISim simulator is used for design verification. The results show a reduction in hardware utilization and time delay.
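
A CSD constant multiplier works by recoding a fixed coefficient into digits from {-1, 0, 1} with no two adjacent nonzero digits, so that multiplication reduces to a few shifts and additions or subtractions. The sketch below shows the standard recoding for non-negative integers and the resulting shift-and-add product; it is a software illustration, not the VHDL design of the paper, and the function names are illustrative.

```python
def to_csd(n):
    """Canonic signed digit form of a non-negative integer.

    Returns digits in {-1, 0, 1}, least significant digit first.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)   # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def csd_multiply(x, digits):
    """Multiply integer x by the encoded constant using shifts and adds only."""
    total = 0
    for i, d in enumerate(digits):
        if d == 1:
            total += x << i
        elif d == -1:
            total -= x << i
    return total
```

For example, 7 is recoded as 8 - 1, so multiplying by 7 takes one shift and one subtraction instead of two additions.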

Journal Article•DOI•

Acceleration of Perturbation-Based Electric Field Integral Equations Using Fast Fourier Transform

Miao Miao Jia¹, Sheng Sun¹, Yin Li², Zhiguo Qian³, Weng Cho Chew⁴•Institutions (4)

University of Electronic Science and Technology of China¹, University of Hong Kong², Intel³, University of Illinois at Urbana–Champaign⁴

22 Jul 2016-IEEE Transactions on Antennas and Propagation

TL;DR: In this paper, the perturbation-based electric field integral equation of the form $R^{n-1}$, $n = 0, 1, 2, \ldots$, is accelerated by using the fast Fourier transform (FFT) technique.

Abstract: In this communication, the computation of the perturbation-based electric field integral equation of the form $R^{n-1}$, $n = 0, 1, 2, \ldots$, is accelerated by using the fast Fourier transform (FFT) technique. As an effective solution of the low-frequency problem, the perturbation method employs the Taylor expansion of the scalar Green’s function in free space. However, multiple impedance matrices have to be solved at different frequency orders, and the computational cost becomes extremely high, especially for large-scale problems. Since the perturbed kernels still satisfy the Toeplitz property on the uniform Cartesian grid, the FFT based on Lagrange interpolation can be well incorporated to accelerate the multiple matrix-vector products. Because of the nonsingularity of the high-order kernels when $n \geq 1$, we do not need to do any near-field amendment. Finally, the efficiency of the proposed method is validated in an iterative solver with numerical examples.
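
The link between the Toeplitz property and FFT acceleration can be sketched in one dimension: a Toeplitz matrix can be embedded in a larger circulant matrix, and a circulant matrix-vector product is a circular convolution that the FFT evaluates quickly. The example below shows only this generic embedding; the Lagrange interpolation onto the grid and the 3D kernels from the abstract are not modelled, and the function names are illustrative.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def toeplitz_matvec(col, row, x):
    """Multiply the Toeplitz matrix with first column col and first row row by x.

    col[0] and row[0] are both the diagonal entry and should agree.
    """
    n = len(x)
    size = 1
    while size < 2 * n - 1:
        size *= 2
    # First column of a size-by-size circulant matrix whose top-left
    # n-by-n block equals the Toeplitz matrix.
    v = list(col) + [0] * (size - 2 * n + 1) + list(row[:0:-1])
    xp = list(x) + [0] * (size - n)
    prod = [a * b for a, b in zip(fft(v), fft(xp))]
    y = fft(prod, inverse=True)
    return [val / size for val in y[:n]]
```

Only the first row and column are stored, so memory drops from n^2 entries to about 2n, and the cost of each product drops from O(n^2) to O(n log n).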


Proceedings Article•DOI•

Combining FFT and Spectral-Pooling for Efficient Convolution Neural Network Model


Zelong Wang, Qiang Lan, Dafei Huang, Mei Wen

20 Nov 2016

TL;DR: Theoretical computing complexity of and some other similar operation is demonstrated, revealing an advantage on computation of CS-unit, which is equivalent to a combination of a convolutional layer and a pooling layer but more effective.


Abstract: The convolution operation is the most important and time-consuming step in a convolution neural network model. In this work, we analyze the computing complexity of direct convolution and fast-Fourier-transform-based (FFT-based) convolution. We propose the CS-unit, which is equivalent to a combination of a convolutional layer and a pooling layer but more effective. Theoretical computing complexity of and some other similar operation is demonstrated, revealing a computational advantage of the CS-unit. Practical experiments are also performed, and the results show that the CS-unit is faster in run time. Keywords: computing complexity; FFT-based convolution; CS-unit
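To make the direct versus FFT-based comparison concrete, the following sketch estimates multiplication counts for a 1-D convolution. It is a simplification: the paper analyzes 2-D CNN layers. The counts use textbook radix-2 figures, the 4-real-multiplies-per-complex-multiply factor is an assumption, and the function names are hypothetical.

```python
def direct_conv_multiplies(n: int, k: int) -> int:
    """Real multiplications for a direct full 1-D convolution of lengths n and k."""
    return n * k


def fft_conv_multiplies(n: int, k: int) -> int:
    """Rough real-multiplication estimate for radix-2 FFT-based convolution."""
    size = 1
    while size < n + k - 1:
        size *= 2  # zero-pad to a power of two that holds the full result
    log2_size = size.bit_length() - 1
    per_fft = (size // 2) * log2_size   # complex multiplies per radix-2 transform
    complex_mults = 3 * per_fft + size  # two forward FFTs, one inverse, pointwise product
    return 4 * complex_mults            # assumed: 4 real multiplies per complex multiply


small_kernel = (direct_conv_multiplies(1024, 3), fft_conv_multiplies(1024, 3))
large_kernel = (direct_conv_multiplies(1024, 512), fft_conv_multiplies(1024, 512))
```

Under these assumptions the direct method wins for very short kernels and the FFT method wins as the kernel grows, which is the usual trade-off behind FFT-based convolution layers.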


Journal Article•DOI•

An Ultra-long FFT Architecture Implemented in a Reconfigurable Application Specified Processor


Feng Han¹, Li Li¹, Kun Wang¹, Fan Feng¹, Hongbing Pan¹, Baoning Zhang¹, Guoqiang He, Jun Lin¹•Institutions (1)

Nanjing University¹

14 Jun 2016-IEICE Electronics Express

TL;DR: An efficient architecture for performing 128 points to 1M points Fast Fourier Transformation (FFT) based on mixed radix-2/4/8 butterfly unit is presented and an efficient 2-epoch FFT solution is realized.


Abstract: This paper presents an efficient architecture for performing 128-point to 1M-point Fast Fourier Transformation (FFT) based on a mixed radix-2/4/8 butterfly unit. The proposed FFT architecture reduces the computation cost by taking advantage of the radix-8 FFT algorithm while remaining compatible with sequences whose data length is an integral power of 2. Further optimizations for the reconfigurable application specified processor are developed. First, we propose a separated radix-2/4/8 butterfly unit, which is more flexible than an entire radix-2/4/8 butterfly unit; second, for sequences longer than 256K points, an efficient 2-epoch FFT solution is realized. This FFT architecture is implemented in a reconfigurable application specified processor. The computation time of our architecture is 676 µs and 14.8 ms for 128K-point and 1M-point FFTs, respectively.


Proceedings Article•DOI•

Design of high speed FFT algorithm For OFDM technique


C A Arun¹, P. Prakasam²•Institutions (2)

Pavai College of Technology¹, Coimbatore Institute of Technology²

04 Mar 2016

TL;DR: In this work two distinct complex multiplication approaches are discussed, and the corresponding results produced by these approaches are compared with MATLAB.


Abstract: This paper presents high-speed FFT algorithms for high-data-rate wireless personal area network applications. In a wireless personal area network, the FFT/IFFT block plays the major role. The computational requirements of FFT/IFFT processes are a heavy burden in most applications. The FFT/IFFT block can be implemented by various methods. From this work, we can recognize that most FFT structures follow divide-and-conquer algorithms, which improve the computational efficiency. In this work two distinct complex multiplication approaches are discussed, and the corresponding results produced by these approaches are compared with MATLAB. The pipelined architecture of Radix-2 SDF has been implemented with a 16-point module.
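The abstract does not say which two complex multiplication approaches are compared. As an assumption, the sketch below shows two common candidates for twiddle-factor multiplication in FFT hardware: the direct form with four real multiplies, and a rearranged form with three real multiplies and extra additions. The names `cmul_4m` and `cmul_3m` are hypothetical.

```python
def cmul_4m(a, b, c, d):
    """(a + jb)(c + jd) using 4 real multiplies and 2 real additions."""
    return a * c - b * d, a * d + b * c


def cmul_3m(a, b, c, d):
    """Same product using 3 real multiplies and 5 real additions."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    # k1 - k3 = ac - bd (real part), k1 + k2 = ad + bc (imaginary part)
    return k1 - k3, k1 + k2


reference = cmul_4m(3, 4, 5, -2)
reduced = cmul_3m(3, 4, 5, -2)
```

When (c, d) is a fixed twiddle factor, the terms d - c and c + d can be precomputed, so the three-multiplier form trades one multiplier for adders, which is often attractive in pipelined designs.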


Proceedings Article•DOI•

FPGA based area optimized parallel pipelined radix-2² feed forward FFT architecture


S A Ajmal¹, S L Gangadharaiah¹•Institutions (1)

M. S. Ramaiah Institute of Technology¹

20 May 2016

TL;DR: This paper presents area optimization of parallel pipelined radix-2² feed forward Fast Fourier transform (PPFFT) architecture and is compared with other PFFT (feed forward) architectures using the same synthesis tool and FPGA to show that the proposed architecture exhibits better area optimization.


Abstract: The design of pipelined Fast Fourier transform (PFFT) in modern communication systems provides an efficient way to compute the FFT with an area-efficient hardware architecture. Previously, radix-2² had been used only for single-path delay feedback architectures. Later research extended radix-2² to multi-path delay commutator (MDC) architectures. This paper presents area optimization of a parallel pipelined radix-2² feed forward Fast Fourier transform (PPFFT) architecture. The architecture is presented for a parallelism value of 4 and 16 sample points, and the area of the proposed PFFT is compared with other PFFT (feed forward) architectures using the same synthesis tool and FPGA. The comparison shows that the proposed architecture exhibits better area optimization.


Proceedings Article•DOI•

Two-dimensional fast Fourier transform: Batterfly in analog of Cooley-Tukey algorithm


V. S. Tutatchikov¹•Institutions (1)

Siberian Federal University¹

01 Jun 2016

TL;DR: The butterfly of an analog of the Cooley-Tukey algorithm is provided, which requires fewer complex addition and multiplication operations than the standard method, and runs 1.5 times faster than its analogue in Matlab.


Abstract: One- and two-dimensional (2D) fast Fourier transform (FFT) algorithms have been widely used in digital processing. Because complexity and the amount of computation grow with the dimension of the signal, the 2D discrete Fourier transform is reduced to a combination of one-dimensional FFTs along each coordinate. This article provides the butterfly of an analog of the Cooley-Tukey algorithm, which requires fewer complex addition and multiplication operations than the standard method, and runs 1.5 times faster than its analogue in Matlab.
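For reference, the standard reduction that this abstract compares against, a 2D transform computed as 1-D FFTs over rows and then over columns, can be sketched as follows. This is the conventional row-column baseline, not the article's 2D butterfly. The names `fft` and `fft2` are hypothetical, and each dimension is assumed to be a power of two.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2(grid):
    """Row-column 2D DFT: transform every row, then every column."""
    rows = [fft(row) for row in grid]
    cols = [fft(list(col)) for col in zip(*rows)]
    # cols[j][i] holds F[i][j], so transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


spectrum = fft2([[1, 2], [3, 4]])
```

The row-column method touches the data in two separate passes. A true 2D butterfly combines samples along both coordinates in one pass, which is where an operation-count saving like the one claimed above would come from.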


Journal Article•DOI•

Hybrid Architecture Design for Calculating Variable-Length Fourier Transform


Shin-Chi Lai, Wen-Ho Juang¹, Yueh-Shu Lee¹, Shin-Hao Chen², Ke-Horng Chen², Chia-Chun Tsai, Chiung-Hon Lee•Institutions (2)

National Cheng Kung University¹, National Chiao Tung University²

01 Mar 2016-IEEE Transactions on Circuits and Systems Ii-express Briefs

TL;DR: This brief presents a hybrid structure to effectively compute the variable-length Fourier transform by employing the recursive and radix-2² fast algorithm and the proposed hardware accelerator only costs four real multipliers and ten real adders with a greater reduction than Kim et al.'s design.


Abstract: This brief presents a hybrid structure to effectively compute the variable-length Fourier transform by employing the recursive and radix-2² fast algorithms. After applying a hardware-sharing scheme to both fast algorithms, the proposed method not only reduces the hardware cost of implementation but also retains the regular and flexible nature of the recursive discrete Fourier transform (RDFT). The proposed hardware accelerator costs only four real multipliers and ten real adders, a reduction of 86.7% and 66.7%, respectively, compared with Kim et al.'s design. In addition, the number of multiplications and additions for 256-point DFT computations can be reduced by 38.6% and 70%, respectively, compared to Lai et al.'s recent approach. For accuracy analysis, the SNR value of the proposed design is at least 4 dB higher than that of the other RDFT designs. For a complete evaluation, a very-large-scale integration chip design was further fabricated using the TSMC 0.18-$\mu\mathrm{m}$ 1P6M CMOS process. The core size was only 660 × 660 $\mu\mathrm{m}^2$, and the measured power consumption was 8.8 mW @ 25 MHz. The result shows that the proposed chip, including data memory, achieves 1.38 times the computational efficiency per unit area of Lai et al.'s work. Therefore, it will be the state-of-the-art RDFT processor for various variable-transform-length digital signal processing applications.
