
Showing papers on "Prime-factor FFT algorithm" published in 2016


Journal ArticleDOI
TL;DR: A fast algorithm is introduced for computing volume potentials, that is, the convolution of a translation-invariant, free-space Green's function with a compactly supported source distribution defined on a uniform grid.

84 citations


Journal ArticleDOI
TL;DR: This brief presents a new type of fast Fourier transform (FFT) hardware architectures called serial commutator (SC) FFT, based on the observation that, in the radix-2 FFT algorithm, only half of the samples at each stage must be rotated.
Abstract: This brief presents a new type of fast Fourier transform (FFT) hardware architectures called serial commutator (SC) FFT. The SC FFT is characterized by the use of circuits for bit-dimension permutation of serial data. The proposed architectures are based on the observation that, in the radix-2 FFT algorithm, only half of the samples at each stage must be rotated. This fact, together with a proper data management, makes it possible to allocate rotations only every other clock cycle. This allows for simplifying the rotator, halving the complexity with respect to conventional serial FFT architectures. Likewise, the proposed approach halves the number of adders in the butterflies with respect to previous architectures. As a result, the proposed architectures use the minimum number of adders, rotators, and memory that are necessary for a pipelined FFT of serial data, with 100% utilization ratio.
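The observation that only half of the samples at each radix-2 stage are rotated can be checked in software. The following is a minimal sketch, not the SC FFT hardware architecture itself: a recursive decimation-in-frequency FFT in which only the difference branch of each butterfly is multiplied by a twiddle factor. The names `fft_dif`, `dft`, and `counter` are illustrative assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_dif(x, counter):
    """Recursive radix-2 decimation-in-frequency FFT (len(x) must be a power of two).

    counter[0] accumulates the number of twiddle multiplications (rotations),
    including the trivial ones by 1.
    """
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Sum branch of each butterfly: passed on without any rotation.
    upper = [x[k] + x[k + half] for k in range(half)]
    # Difference branch: the only samples that are rotated in this stage.
    lower = []
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * k / n)
        lower.append((x[k] - x[k + half]) * w)
        counter[0] += 1
    out = [0j] * n
    out[0::2] = fft_dif(upper, counter)  # even-indexed outputs
    out[1::2] = fft_dif(lower, counter)  # odd-indexed outputs
    return out


x = [complex(i % 5, -i) for i in range(16)]
counter = [0]
spectrum = fft_dif(x, counter)
# 16 points and 4 stages give 64 sample slots; counter[0] holds how many were rotated.
```

For a 16-point transform the counter ends at (N/2)·log2(N), i.e. half of the N·log2(N) sample slots across all stages.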

33 citations


Journal ArticleDOI
TL;DR: The triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.
Abstract: In this paper we propose a new representation for FFT algorithms called the triangular matrix representation. This representation is more general than the binary tree representation and, therefore, it introduces new FFT algorithms that were not discovered before. Furthermore, the new representation has the advantage that it is simple and easy to understand, as each FFT algorithm only consists of a triangular matrix. Besides, the new representation allows for obtaining the exact twiddle factor values in the FFT flow graph easily. This facilitates the design of FFT hardware architectures. As a result, the triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.

32 citations


Journal ArticleDOI
TL;DR: A novel algorithm to perform cooperative wideband spectrum sensing (CWSS) for cognitive radios (CRs) based on a sub-Nyquist version of the sparse fast Fourier transform (sFFT) algorithm is presented, and it is executed cooperatively by using M identical nodes.
Abstract: This brief presents a novel algorithm to perform cooperative wideband spectrum sensing (CWSS) for cognitive radios (CRs). The proposed algorithm is based on a sub-Nyquist version of the sparse fast Fourier transform (sFFT) algorithm, and it is executed cooperatively by using $\boldsymbol{M}$ identical nodes. In this case, we designed a CWSS circuit based on the proposed algorithm that implements the main functional procedures of the sub-Nyquist sFFT algorithm by using multi-coset sampling and relatively prime sampling rates. According to the verification results, the proposed circuit based on the designed CWSS algorithm is suitable for implementing CWSS in CRs for sparse spectra composed of highly noisy multiband signals, and it improves the performance of previous sub-Nyquist sFFT algorithm and previous sFFT hardware implementation.

29 citations


Journal ArticleDOI
TL;DR: In this paper, a fast Fourier transform (FFT) based method for time-varying power system harmonic measurement is proposed, where the harmonic signal is preprocessed by infinite impulse response filter bank and Teager-Kaiser energy operator for fast detection of instability onset time.
Abstract: With the rapid development of power electronic devices, harmonic pollution has become one of the principal power quality problems in power systems. The fast Fourier transform (FFT) is widely used for analysing and measuring power system harmonics. However, limitations of the FFT, such as aliasing, spectral leakage, and the picket-fence effect, contribute to inaccurate results. Furthermore, a real power system harmonic is a non-stationary signal, whereas the FFT is a tool for stationary signals. This study focuses on a novel FFT-based method for time-varying power system harmonic measurement. The harmonic signal is preprocessed by an infinite impulse response filter bank and the Teager–Kaiser energy operator for fast detection of the instability onset time. Then an adaptive Kaiser self-convolution window-based interpolated FFT algorithm is used to estimate each harmonic component. The results of both simulation and practical implementation show that the proposed method is suitable for time-varying harmonics and achieves higher accuracy than traditional FFT-based techniques.

29 citations


Journal ArticleDOI
TL;DR: This brief presents a novel pipelined FFT processor for the FFT computation of two independent data streams based on the multipath delay commutator FFT architecture, which requires a lower number of registers and has high throughput.
Abstract: Nowadays, many applications require simultaneous computation of multiple independent fast Fourier transform (FFT) operations with their outputs in natural order. Therefore, this brief presents a novel pipelined FFT processor for the FFT computation of two independent data streams. The proposed architecture is based on the multipath delay commutator FFT architecture. It has an $N/2$-point decimation in time FFT and an $N/2$-point decimation in frequency FFT to process the odd and even samples of two data streams separately. The main feature of the architecture is that the bit reversal operation is performed by the architecture itself, so the outputs are generated in normal order without any dedicated bit reversal circuit. The bit reversal operation is performed by the shift registers in the FFT architecture by interleaving the data. Therefore, the proposed architecture requires a lower number of registers and has high throughput.

28 citations


Journal ArticleDOI
TL;DR: A novel approach to implement multiplierless unity-gain single-delay feedback fast Fourier transforms (FFTs) without compensation circuits, even when using nonunity-gain rotators, by a joint design of rotators.
Abstract: In this brief, we propose a novel approach to implement multiplierless unity-gain single-delay feedback fast Fourier transforms (FFTs). Previous methods achieve unity-gain FFTs by using either complex multipliers or nonunity-gain rotators with additional scaling compensation. Conversely, this brief proposes unity-gain FFTs without compensation circuits, even when using nonunity-gain rotators. This is achieved by a joint design of rotators, so that the entire FFT is scaled by a power of two, which is then shifted to unity. This reduces the amount of hardware resources of the FFT architecture, while having high accuracy in the calculations. The proposed approach can be applied to any FFT size, and various designs for different FFT sizes are presented.

26 citations


Journal ArticleDOI
TL;DR: Among various discrete transforms, the discrete Fourier transform (DFT) is the most important technique for Fourier analysis in practical applications such as digital signal processing and wireless communications.
Abstract: The discrete Fourier transform (DFT) is an important transformation technique in signal processing tasks. Because of its very high computational complexity of $O(N^2)$, the $N$-point DFT is usually implemented as a fast Fourier transform (FFT) with complexity $O(N\log N)$. Despite this significant reduction in complexity, the hardware cost of the multiplication-intensive $N$-point FFT is still prohibitive, particularly for large-scale applications that require large $N$. This brief, for the first time, proposes high-accuracy, low-complexity, scaling-free stochastic DFT/FFT designs. With the use of the stochastic computing technique, the hardware complexity of the DFT/FFT designs is significantly reduced. More importantly, this brief presents the scaling-free stochastic adder and the random number generator sharing scheme, which enable a significant reduction in accuracy loss and hardware cost. Analysis results show that the proposed stochastic DFT/FFT designs achieve much better hardware performance and accuracy than state-of-the-art stochastic designs.

26 citations


Journal ArticleDOI
TL;DR: This paper investigates the accuracy of the sine-wave parameter estimators provided by the Weighted Three-Parameter Sine-Fit algorithm when a generic cosine window is adopted and shows that the W3PSF algorithm can be well approximated by the classical weighted Discrete Time Fourier Transform (DTFT) when the number of analyzed waveform cycles is high enough.

18 citations


Journal ArticleDOI
TL;DR: A configurable floating-point FFT accelerator based on CORDIC rotation is proposed, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory.
Abstract: Fast Fourier transform (FFT) accelerators and the Coordinate Rotation Digital Computer (CORDIC) algorithm play important roles in signal processing. We propose a configurable floating-point FFT accelerator based on CORDIC rotation, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory. To perform the CORDIC rotation efficiently, we present a novel approach that uses segmented-parallel iteration and compressed iteration based on CSA, and we use redundant CORDIC to reduce the latency of each iteration. To demonstrate the efficiency of our FFT accelerator, four FFT accelerators are prototyped on an FPGA chip to perform a batch FFT. Experimental results show that our structure, which is composed of four butterfly units and computes FFTs with sizes ranging from 64 to 8192 points, occupies 33230 (3%) REGs and 143006 (30%) LUTs. The clock frequency can reach 122 MHz. The resources of the double-precision FFT are only about 2.5 times those of the single-precision FFT, while the theoretical value is 4. Moreover, only 13331 cycles are required to compute an 8192-point double-precision FFT with four butterfly units in parallel.

18 citations


Proceedings ArticleDOI
23 May 2016
TL;DR: The challenges and proposed effective solutions to efficiently port sFFT to massively parallel processors, such as GPUs, using CUDA are explored and some of the optimization strategies such as index coalescing, loop splitting, asynchronous data layout transformation, linear time selection algorithm are presented.
Abstract: The Fast Fourier Transform (FFT) is one of the most important numerical tools widely used in many scientific and engineering applications. The algorithm performs O(nlogn) operations on n input data points in order to calculate only small number of k large coefficients, while the rest of n - k numbers are zero or negligibly small. The algorithm is clearly inefficient, when n points input data lead to only k

Proceedings ArticleDOI
19 Aug 2016
TL;DR: This work develops a hexagonal FFT in ASA coordinates that uses only the standard Fourier transform, allowing the user to implement the hexagonally sampled FFT using standard FFT routines.
Abstract: The discrete Fourier transform is an important tool for processing digital images. Efficient algorithms for computing the Fourier transform are known as fast Fourier transforms (FFTs). One of the most common of these is the Cooley-Tukey radix-2 decimation algorithm that efficiently transforms one-dimensional data into its frequency domain representation. The orthogonality of rectangular sampling allows the separability of the Fourier kernel which enables the use of the Cooley-Tukey algorithm on two-dimensional digital images that have been sampled rectangularly. Hexagonal sampling provides many benefits over rectangular sampling, but it does not result in the orthogonal rows and columns that can be transformed independently as is done with rectangular samples. Use of the Array Set Addressing (ASA) coordinate system for hexagonally sampled images has been shown to provide a separable Fourier kernel, leading to an efficient FFT, however its implementation is composed of nonstandard transforms that require custom routines to evaluate. This work develops a hexagonal FFT in ASA coordinates that uses only the standard Fourier transform, allowing the user to implement the hexagonal FFT using standard FFT routines.

Proceedings ArticleDOI
22 May 2016
TL;DR: This brief presents the scaling-free stochastic adder and the random number generator sharing scheme, which enable a significant reduction in accuracy loss and hardware cost and achieve much better hardware performance and accuracy performance than state-of-the-art Stochastic design.
Abstract: Among various discrete transforms, the discrete Fourier transform (DFT) is the most important technique for Fourier analysis in practical applications such as digital signal processing and wireless communications. Because of its very high computational complexity of $O(N^2)$, in practice the $N$-point DFT is usually performed as a fast Fourier transform (FFT) with complexity $O(N\log N)$. Despite this significant reduction in computing complexity, the hardware cost of the multiplication-intensive $N$-point FFT is still prohibitive, especially for large-scale applications that require large $N$.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: This paper proposes to leverage the recently introduced Flexible Approximate MUltilayer Sparse Transforms (FAST) in order to compute approximate FFTs on graphs, showing good potential.
Abstract: Signal processing on graphs is a recent research domain that seeks to extend classical signal processing tools such as the Fourier transform to irregular domains given by a graph. In such a graph setting, a way to rapidly apply the Fourier transform, i.e. a Fast Fourier Transform (FFT), is lacking. In this paper, we propose to leverage the recently introduced Flexible Approximate MUlti-layer Sparse Transforms (FAST) in order to compute approximate FFTs on graphs. The approach is first described, then validated on several types of classical graphs and finally used for fast filtering, showing good potential.

Journal ArticleDOI
TL;DR: A stable 2D sliding fast Fourier transform (FFT) algorithm based on the vector radix 2 × 2 FFT is presented and theoretical analysis shows that the proposed algorithm has the lowest computational requirements among the existing stable sliding DFT algorithms.
Abstract: The two-dimensional (2D) discrete Fourier transform (DFT) in the sliding window scenario has been successfully used for numerous applications requiring consecutive spectrum analysis of input signals. However, the results of conventional sliding DFT algorithms are potentially unstable because of the accumulated numerical errors caused by recursive strategy. In this letter, a stable 2D sliding fast Fourier transform (FFT) algorithm based on the vector radix (VR) 2 × 2 FFT is presented. In the VR-2 × 2 FFT algorithm, each 2D DFT bin is hierarchically decomposed into four sub-DFT bins until the size of the sub-DFT bins is reduced to 2 × 2; the output DFT bins are calculated using the linear combination of the sub-DFT bins. Because the sub-DFT bins for the overlapped input signals between the previous and current window are the same, the proposed algorithm reduces the computational complexity of the VR-2 × 2 FFT algorithm by reusing previously calculated sub-DFT bins in the sliding window scenario. Moreover, because the resultant DFT bins are identical to those of the VR-2 × 2 FFT algorithm, numerical errors do not arise; therefore, unconditional stability is guaranteed. Theoretical analysis shows that the proposed algorithm has the lowest computational requirements among the existing stable sliding DFT algorithms.
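For context, the recursive strategy that the abstract contrasts itself with can be sketched in one dimension. This is the conventional 1-D sliding DFT update, not the proposed VR-2 × 2 algorithm: each bin is updated from the previous window by removing the oldest sample, adding the newest, and applying a phase rotation, so rounding errors are carried from window to window. The function names and the test signal are illustrative assumptions.

```python
import cmath


def dft(window):
    """Direct DFT of one window, used as the reference result."""
    n = len(window)
    return [sum(window[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def sliding_dft_step(bins, oldest, newest):
    """Conventional recursive update when the window advances by one sample.

    X_k(m+1) = (X_k(m) - x[m] + x[m+N]) * exp(+2j*pi*k/N)
    """
    n = len(bins)
    return [(bins[k] - oldest + newest) * cmath.exp(2j * cmath.pi * k / n)
            for k in range(n)]


N = 8
signal = [((7 * i) % 11) - 5.0 for i in range(40)]

bins = dft(signal[:N])                      # first window, computed directly
for m in range(len(signal) - N):            # slide one sample at a time
    bins = sliding_dft_step(bins, signal[m], signal[m + N])

reference = dft(signal[-N:])                # last window, recomputed from scratch
max_err = max(abs(a - b) for a, b in zip(bins, reference))
```

Here `max_err` measures the drift between the recursively updated bins and a fresh computation; it stays small for this short run, but nothing in the recursion prevents it from growing over long runs, which is the instability the abstract refers to.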

Journal ArticleDOI
TL;DR: A new GPS signal acquisition method based on decomposition of the FFT is proposed to improve acquisition performance; it is implemented, validated, and compared with conventional serial search and radix-2 FFT search algorithms using an intermediate-frequency GPS signal.

Journal ArticleDOI
TL;DR: A fast algorithm is described for the 2-D left-side QDFT which is based on the concept of the tensor representation, in which the color or four-component quaternion image is described by a set of 1-D quaternion signals and the 1-D left-side QDFTs over these signals determine the values of the 2-D left-side QDFT at the corresponding subsets of frequency-points.
Abstract: We describe a fast algorithm for the 2-D left-side QDFT which is based on the concept of the tensor representation, in which the color or four-component quaternion image is described by a set of 1-D quaternion signals and the 1-D left-side QDFTs over these signals determine the values of the 2-D left-side QDFT at the corresponding subsets of frequency-points. The efficiency of the tensor algorithm for calculating the fast left-side 2-D QDFT is described and compared with the existing methods.
• The proposed algorithm for the $2^r \times 2^r$-point 2-D QDFT uses $18N^2$ fewer multiplications than the well-known methods: the column-row method and the method of symplectic decomposition.
• The proposed algorithm is simple to apply and design, which makes it very practical in color image processing in the frequency domain.
• The method of quaternion image tensor representation is unique in the sense that it can be used for both left-side and right-side 2-D QDFTs.

Introduction: Quaternions in Imaging
• The quaternion can be considered a 4-dimensional generalization of a complex number, with one real part and three imaginary parts. Any quaternion may be represented in the hyper-complex form $Q = a + bi + cj + dk = a + (bi + cj + dk)$, where $a$, $b$, $c$, and $d$ are real numbers and $i$, $j$, and $k$ are three imaginary units with the following multiplication laws: $ij = -ji = k$, $jk = -kj = i$, $ki = -ik = j$, $i^2 = j^2 = k^2 = ijk = -1$.
• Commutativity does not hold in quaternion algebra, i.e., in general $Q_1 Q_2 \neq Q_2 Q_1$.
• A unit pure quaternion is $\mu = i\mu_i + j\mu_j + k\mu_k$ such that $|\mu| = 1$ and $\mu^2 = -1$; for instance, $\mu = (i+j+k)/\sqrt{3}$, $\mu = (i+j)/\sqrt{2}$, and $\mu = (i-k)/\sqrt{2}$.
• The exponential is defined as $\exp(\mu x) = \cos(x) + \mu\sin(x) = \cos(x) + i\mu_i\sin(x) + j\mu_j\sin(x) + k\mu_k\sin(x)$.

RGB Model for Color Images
• A discrete color image $f_{n,m}$ in the RGB color space can be transformed into quaternion form by encoding the red, green, and blue components of the RGB value as a pure quaternion (with zero real part): $f_{n,m} = 0 + (r_{n,m}i + g_{n,m}j + b_{n,m}k)$.
• The advantage of using quaternion-based operations to manipulate color information in an image is that we do not have to process each color channel independently, but rather treat each color triple as a whole unit.
Figure 1: RGB color cube in quaternion space.

Calculation of the left-side 1-D QDFT
• Let $f_n = (a_n, b_n, c_n, d_n) = a_n + ib_n + jc_n + kd_n$ be the quaternion signal of length $N$. The left-side 1-D quaternion DFT (LS QDFT) is defined as [equation missing]. If we denote the $N$-point LS 1-D DFTs of the parts $a_n$, $b_n$, $c_n$, and $d_n$ of the quaternion signal $f_n$ by $A_p$, $B_p$, $C_p$, and $D_p$, respectively, we can calculate the LS 1-D QDFT as [equation missing]. If the real part is zero, $a_n = 0$, and $f_n = (0, b_n, c_n, d_n) = ib_n + jc_n + kd_n$, the number of multiplications and additions can be estimated as [equation missing].

Multiplications and Additions for the left-side 1-D QDFT
• In the general case of the quaternion signal $f_n$, the number of multiplications and additions for the LS 1-D QDFT can be estimated as [equation (1) missing]. Here, we consider that for the fast $N$-point discrete paired transform-based FFT the estimates for multiplications and additions are [equation missing], and that two 1-D DFTs with real inputs can be calculated by one DFT with complex input.
• Special case: the number of multiplications and additions equals [equation missing], or $8N$ real multiplications fewer than in (1).

The direct and inverse left-side 2-D QDFTs
• Given the color-in-quaternion image $f_{n,m} = a_{n,m} + ib_{n,m} + jc_{n,m} + kd_{n,m}$, we consider the left-side 2-D QDFT in the following form: [equation missing]. The inverse left-side 2-D QDFT is: [equation missing].
1. Column-row algorithm: the calculation of the separable 2-D $N \times N$-point QDFT by the formula [equation missing].
2. The calculation of the LS 2-D QDFT by the symplectic decomposition of the color image requires $2N$ $N$-point LS 1-D QDFTs. Each of the 1-D QDFTs requires two $N$-point LS 1-D DFTs. Therefore, the column-row method uses $4N$ $N$-point LS 1-D DFTs and [equation missing] multiplications.

Example: $N \times N$-point left-side 2-D QDFT
Figure 2. (a) The color image of size 1223×1223 and (b) the 2-D left-side quaternion discrete Fourier transform of the color-in-quaternion image (in absolute scale and cyclically shifted to the middle).

Tensor Representation of the regular 2-D DFT
Let $f_{n,m}$ be the gray-scale image of size $N \times N$.
• The tensor representation of the image $f_{n,m}$ is the 2D-frequency-and-1D-time representation in which the image is described by a set of 1-D splitting-signals, each of length $N$: [equation missing]. The components of the signals are the ray-sums of the image along the parallel lines [equation missing]. Each splitting-signal defines the 2-D DFT at $N$ frequency-points of the set [equation missing] on the Cartesian lattice.

Example: Tensor Representation of the 2-D DFT. 1-D splitting-signal of the tensor representation of the 512×512 image.
Figure 3. (a) The Miki-Anoush-Mini image, (b) the splitting-signal for the frequency-point (4,1), (c) the magnitude of the 1-D DFT of the signal, shifted to the middle, and (d) the 2-D DFT of the image with the frequency-points of the set $T_{4,1}$.

Tensor Representation of the left-side 2-D QDFT
Let $f_{n,m} = a_{n,m} + ib_{n,m} + jc_{n,m} + kd_{n,m}$ be the quaternion image of size $N \times N$ ($a_{n,m} = 0$). In the tensor representation, the quaternion image is represented by a set of 1-D quaternion splitting-signals, each of length $N$ and generated by a set of frequencies $(p,s)$: [equation missing]. The components of the signals are defined as [equation missing]. Here, the subsets [equation missing]. Property of the TT: [equation missing].

Example: Tensor Representation of the 2-D LS QDFT. The splitting-signal of the tensor representation of the 1223×1223 color image:
Figure 4. Color image and (a,b,c) components of the splitting-signal generated by (1,4).
Figure 5. The 123-point left-side DFT of the (1,4) quaternion splitting-signal; (a) the real part and (b) the i-component of the signal.
Figure 6. (a) The 1-D left-side QDFT of the quaternion splitting-signal $f_{1,4,t}$ (in absolute scale), and (b) the location of the 1223 frequency-points of the set $T_{1,4}$ on the Cartesian grid, where this 1-D LS QDFT equals the 2-D LS QDFT of the quaternion image.
Figure 7. (a) The real part and (b) the imaginary part of the left-side 2-D QDFT of the 2-D color-in-quaternion "girl Anoush" image.
$\mu = (i + 2j + k)/\sqrt{6}$

Tensor Transform: Direction Quaternion Image Components
A color image can be reconstructed from its 1-D quaternion splitting-signals, or direction color image components, defined by [equation missing].
Statement 1: The discrete quaternion image of size $N \times N$, where $N$ is prime, can be composed from its $(N+1)$ quaternion direction images or splitting-signals as [equation missing].

Color-or-Quaternion Image is the Sum of Direction Image Components
Figure 8: (a) The color image and the direction images generated by $(p,s)$ equal to (b) (1,1), (c) (1,2), and (d) (1,4).

The Paired Image Representation: Splitting-Signals and Direction Quaternion Image Components
The tensor transform, or representation, is redundant for the case $N \times N$ where $N$ is a power of 2. Therefore the tensor transform is modified, and new 1-D quaternion splitting-signals or direction color image components are calculated by [equation missing].
Statement 2: The discrete quaternion image of size $N \times N$, where $N = 2^r$, $r > 1$, can be composed from its $(3N-2)$ quaternion direction images as [equation missing]. Here $J'_{N,N}$ is a set of generators $(p,s)$. Such a representation of the quaternion image is called the paired transform; it is unitary and therefore not redundant.
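The quaternion facts used above (the multiplication laws of the imaginary units, non-commutativity, unit pure quaternions, and the quaternion exponential) can be verified numerically. The following is a minimal sketch assuming quaternions stored as 4-tuples `(a, b, c, d)`; the helper names `qmul` and `qexp_pure` and the sample pixel values are illustrative and not taken from the source.

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions (a, b, c, d) = a + b*i + c*j + d*k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qexp_pure(mu, x):
    """exp(mu*x) = cos(x) + mu*sin(x) for a unit pure quaternion mu (zero real part)."""
    c, s = math.cos(x), math.sin(x)
    return (c, mu[1] * s, mu[2] * s, mu[3] * s)


I = (0.0, 1.0, 0.0, 0.0)
J = (0.0, 0.0, 1.0, 0.0)
K = (0.0, 0.0, 0.0, 1.0)

# Unit pure quaternion mu = (i + j + k) / sqrt(3); its square is -1 up to rounding.
r = 1.0 / math.sqrt(3.0)
MU = (0.0, r, r, r)
mu_squared = qmul(MU, MU)

# An RGB pixel encoded as a pure quaternion 0 + r*i + g*j + b*k (made-up values).
pixel = (0.0, 0.2, 0.5, 0.9)
kernel = qexp_pure(MU, -0.7)
left = qmul(kernel, pixel)    # kernel applied on the left, as in a left-side QDFT
right = qmul(pixel, kernel)   # kernel applied on the right gives a different result
```

Because `left` and `right` differ, the side on which the exponential kernel multiplies the signal has to be fixed by the transform's definition, which is why left-side and right-side QDFTs are distinguished.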

Proceedings ArticleDOI
04 Mar 2016
TL;DR: A design scheme for a high-speed, real-time, serial pipelined Fast Fourier Transform (FFT) processor on FPGA, based on the Coordinate Rotation Digital Computer (CORDIC) algorithm, which reduces hardware complexity compared with a direct implementation of the butterflies using complex multipliers.
Abstract: This paper presents a design scheme for a high-speed, real-time, serial pipelined Fast Fourier Transform (FFT) processor on FPGA, based on the Coordinate Rotation Digital Computer (CORDIC) algorithm. The CORDIC algorithm reduces the hardware complexity compared with a direct implementation of the butterflies using complex multipliers. Moreover, the design uses the butterflies of the radix-2 Decimation-In-Time (DIT) algorithm, dual-port RAM, and a pipelined structure, which increase the performance of the FFT processor. The simulation results show that, compared with real-time FFT processors of the same type, the scheme presented in this paper reduces the number of Adaptive Look-up Tables (ALUTs) required and increases the signal-to-noise ratio (SNR) by about 25 dB.
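To illustrate why CORDIC can replace a complex multiplier in a butterfly, the sketch below rotates a vector by a twiddle angle using only additions and multiplications by powers of two (wire shifts in hardware). It is a floating-point emulation of rotation-mode CORDIC under assumed parameters (32 iterations, a single gain compensation at the end), not the paper's processor design; `cordic_rotate` is a hypothetical name.

```python
import math


def cordic_rotate(x, y, angle, iterations=32):
    """Rotate (x, y) by `angle` radians using CORDIC micro-rotations.

    Converges for |angle| up to about 1.74 rad (the sum of all elementary angles).
    """
    # Elementary angles atan(2^-i) and the accumulated gain of the iterations.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    # Every micro-rotation stretches the vector; compensate once at the end.
    return x / gain, y / gain


# Twiddle factor W_16^3 = exp(-2j*pi*3/16) applied to the sample 1 + 0j.
theta = -2.0 * math.pi * 3 / 16
re, im = cordic_rotate(1.0, 0.0, theta)
```

The result can be compared against `math.cos(theta)` and `math.sin(theta)`; the residual error shrinks by roughly one bit per iteration.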

Journal ArticleDOI
TL;DR: In this paper, a super-resolution motion parameter estimation algorithm is proposed for multiple ground moving targets with close centers, a case in which conventional algorithms may suffer performance degradation because the spectral resolution of the FT is limited by the number of time samples.
Abstract: For synthetic aperture radar (SAR), most conventional algorithms provide good performance in ground moving target imaging based on the Fourier transform (FT). However, when multiple moving targets with close centres exist, the conventional algorithms may suffer performance degradation, since the spectral resolution of the FT may be limited by the number of time samples. To address this issue, a super-resolution motion parameter estimation algorithm is proposed in this study. First, the Keystone transform is applied to correct the linear range walk. Then the range curvature is compensated by the matched function with respect to the platform velocity. After compensation of the linear range walk and range curvature, the energy of a ground moving target is focused on one range cell, and a first-order discrete polynomial-phase transform is applied to transform the quadratic phase signal into a single tone. After applying the smoothing technique to construct a full-rank covariance matrix, the multiple signal classification algorithm is utilised to estimate the target cross-track and along-track velocities, which can significantly improve the motion parameter resolution compared with FFT-based algorithms. Real SAR data processing results are used to validate the effectiveness and feasibility of the proposed algorithm.

01 Jan 2016
TL;DR: This tutorial describes how to accurately measure signal power using the FFT, describes the different effects that introduce errors during FFT processing, and explains how they can be avoided or compensated.
Abstract: This tutorial describes how to accurately measure signal power using the FFT. The different effects that introduce errors during FFT processing are described and it is explained how they can be avoided or compensated.
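One commonly encountered effect of this kind is the power loss introduced by a window function. Whether the tutorial treats it in this way is an assumption; the sketch below only shows the general idea with Parseval's relation. The function name `power_from_fft`, the periodic Hann window, and the test tone are illustrative choices.

```python
import cmath
import math


def dft(x):
    """Direct DFT; adequate for the short illustrative signal used here."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def power_from_fft(x, window):
    """Mean power of x estimated from the spectrum of the windowed signal.

    Parseval: sum |X_w[k]|^2 = N * sum (w[n] * x[n])^2. Dividing by
    N * sum(w[n]^2) compensates the power removed by the window.
    """
    n = len(x)
    spectrum = dft([xi * wi for xi, wi in zip(x, window)])
    return sum(abs(c) ** 2 for c in spectrum) / (n * sum(w * w for w in window))


N = 64
# Coherently sampled tone: amplitude 2, exactly 5 cycles in the record, power A^2/2 = 2.
tone = [2.0 * math.cos(2.0 * math.pi * 5 * n / N + 0.3) for n in range(N)]

rect = [1.0] * N
hann = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]

p_rect = power_from_fft(tone, rect)
p_hann = power_from_fft(tone, hann)
```

Both estimates agree with the time-domain mean square for this coherently sampled tone. Without the division by the sum of squared window samples, the Hann-windowed estimate would come out low.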

Journal ArticleDOI
TL;DR: The proposed hybrid fast Fourier transform Adaptive LINear Element (FFT-ADALINE) algorithm for fast and accurate estimation of harmonics operates accurately, with a settling time within half a cycle.
Abstract: A hybrid fast Fourier transform Adaptive LINear Element (FFT-ADALINE) algorithm for fast and accurate estimation of harmonics is proposed in this study. The FFT method can perform fast conversion from the time domain to the frequency domain, but it cannot respond immediately to a change in the measured harmonics because it relies on a buffer. Meanwhile, ADALINE responds immediately because of its learning ability, but its settling time is about two cycles of the measurement signal. In the proposed method, the two algorithms are combined for harmonic estimation, so that the estimator responds immediately to any change in the measured harmonics and the settling time is reduced to half a cycle of the measurement signal. The proposed algorithm applies the FFT together with a weight-updating rule to reduce the error of ADALINE instantaneously. The robustness of the proposed method is evaluated in simulation using MATLAB Simulink. The validity of the simulation work is further confirmed by experimental work carried out with a Chroma programmable AC source model 6590 and non-linear load operations. The proposed algorithm operates accurately, with a settling time within half a cycle.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: This paper presents a solution of computing DFT using the dot-product engine (DPE) - a one transistor one memristor (1T1M) crossbar array with hybrid peripheral circuit support and the computing complexity is reduced to a constant O(λ) independent of the input data size.
Abstract: Discrete Fourier Transforms (DFT) are extremely useful in signal processing. Usually they are computed with the Fast Fourier Transform (FFT) method, as it reduces the computing complexity from $O(N^2)$ to $O(N\log N)$. However, the FFT is still not powerful enough for many real-time tasks which have stringent requirements on throughput, energy efficiency, and cost, such as the Internet of Things (IoT). In this paper, we present a solution for computing the DFT using the dot-product engine (DPE), a one transistor one memristor (1T1M) crossbar array with hybrid peripheral circuit support. With this solution, the computing complexity is further reduced to a constant $O(\lambda)$ independent of the input data size, where $\lambda$ is the timing ratio of one DPE operation compared with one real multiplication operation in digital systems.

Proceedings ArticleDOI
01 Nov 2016
TL;DR: The Reduced Fast Fourier Transformation (RFFT) is described, an algorithm for harmonic estimation based on the FFT, developed by the authors, suited to voltage dip detection, and tested in the Matlab / SimPowerSystems environment.
Abstract: The paper describes the Reduced Fast Fourier Transformation (RFFT), an algorithm for harmonic estimation based on the FFT, developed by the authors and suited to voltage dip detection. The algorithm is simple, fast, computationally inexpensive, and sufficiently accurate. It is tested in the Matlab / SimPowerSystems environment. Results show that the algorithm is faster and better than the FFT, which is an advantage in applications that detect network voltage disturbances.

Proceedings ArticleDOI
Pankaj Gupta
01 Mar 2016
TL;DR: A technique to accurately estimate the impact of fixed-point arithmetic on FFT performance by measuring the Signal-to-Quantization Noise Ratio (SQNR) of a $2^n$ (= $N$) point radix-2 FFT implementation, with simulation results that illustrate the accuracy of the theoretical analysis.
Abstract: The Fast Fourier Transform (FFT) algorithm is widely used in today's digital signal processing applications. In practice, fixed-point arithmetic is used for hardware implementations. The finite-bit representation of signals introduces quantization error and thereby limits accuracy. In this paper, we present a technique to accurately estimate the impact of fixed-point arithmetic on FFT performance. We evaluate the fixed-point accuracy by measuring the Signal-to-Quantization Noise Ratio (SQNR) of a $2^n$ (= $N$) point radix-2 FFT implementation. This SQNR analysis is used to determine fixed-point precisions of the FFT implementation that provide a good trade-off between the required hardware resources and final FFT output signal integrity. Finally, we present simulation results to illustrate the accuracy of the theoretical analysis.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: The Chisel hardware construction language has been used in this work to create a generator of runtime-reconfigurable 2^n 3^m 5^k FFT engines targeting software-defined radios (SDR) for modern communications, but with flexibility to support a wide range of applications.
Abstract: Runtime-reconfigurable, mixed-radix FFT/IFFT engines are essential for modern wireless communication systems. To comply with varying standards requirements, these engines are customized for each modem. The Chisel hardware construction language has been used in this work to create a generator of runtime-reconfigurable 2^n 3^m 5^k FFT engines targeting software-defined radios (SDR) for modern communications, but with flexibility to support a wide range of applications. The generator uses a conflict-free, in-place, multi-bank SRAM design, and exploits the duality of decimation-in-frequency (DIF) and decimation-in-time (DIT) FFTs to support continuous data flow with only 2N memory blocks. DFT decomposition using the prime-factor algorithm (PFA) followed by the Cooley-Tukey algorithm (CTA) reduces twiddle ROM sizes. A programmable Winograd Fourier transform algorithm (WFTA) butterfly supporting radix-2/3/4/5/7 operations reuses radix-7 hardware to support reconfigurability with minimal area penalty. The generated FFTs use 50% less memory than iterative FFTs from Spiral. The twiddle ROM size of the generated LTE/WiFi FFT engine is 16% smaller than that of a 2048-pt Spiral design.
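
The prime-factor decomposition mentioned above can be illustrated for a single pair of coprime factors. The sketch below is a generic Good-Thomas mapping in software, not the generator's hardware design; `dft` and `pfa_dft` are hypothetical names, and the small DFTs are computed naively for clarity.

```python
import cmath
from math import gcd


def dft(x):
    """Naive O(n^2) DFT, used here for the short sub-transforms."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def pfa_dft(x, n1, n2):
    """Length n1*n2 DFT via the prime-factor (Good-Thomas) algorithm.

    Requires gcd(n1, n2) == 1. No twiddle factors are applied between the
    row and column transforms; only the index maps differ from a plain
    two-dimensional DFT.
    """
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("len(x) must equal n1*n2 with n1 and n2 coprime")
    # Input index map: element (a, b) of the grid is x[(a*n2 + b*n1) mod n].
    grid = [[x[(a * n2 + b * n1) % n] for b in range(n2)] for a in range(n1)]
    # Length-n2 DFT along each row.
    grid = [dft(row) for row in grid]
    # Length-n1 DFT along each column; cols[k2][k1] is the 2-D result.
    cols = [dft([grid[a][b] for a in range(n1)]) for b in range(n2)]
    # Output index map (Chinese remainder theorem): k1 = k mod n1, k2 = k mod n2.
    return [cols[k % n2][k % n1] for k in range(n)]


signal = [complex(i % 7, -i) for i in range(15)]
via_pfa = pfa_dft(signal, 3, 5)
direct = dft(signal)
```

Because the two stages need no inter-stage twiddle multiplications, twiddle storage is only needed for the factors handled by Cooley-Tukey, which is consistent with the reduced ROM size the abstract reports.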

Proceedings ArticleDOI
27 Jul 2016
TL;DR: To further improve the precision of sinusoidal frequency estimation, a new estimation method based on the Fast Fourier Transform (FFT) is proposed; it has a low SNR threshold and outperforms the existing estimators.
Abstract: To further improve the precision of sinusoidal frequency estimation, a new estimation method based on the Fast Fourier Transform (FFT) is proposed. Zero-padding is applied before the coarse estimation, and three sample values of the Discrete-Time Fourier Transform (DTFT) of the original signal are used to perform the fine estimation. Computer simulations show that the proposed estimation method follows the Cramer-Rao Bound over the whole range of frequency offsets. Its estimation precision is higher than that of the existing estimators. The proposed estimator has a low SNR threshold and outperforms the existing estimators.
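
The coarse-then-fine structure described above can be sketched as follows. The abstract does not give the paper's fine-estimation formula, so this example swaps in ordinary quadratic (parabolic) peak interpolation over three magnitude samples; it should be read as an illustration of the two-step idea only. The names `dtft_sample` and `estimate_frequency` are hypothetical.

```python
import cmath


def dtft_sample(x, freq):
    """Evaluate the DTFT of x at a normalized frequency in cycles/sample."""
    return sum(v * cmath.exp(-2j * cmath.pi * freq * n) for n, v in enumerate(x))


def estimate_frequency(x, pad_factor=4):
    """Coarse peak search on a zero-padded grid, then parabolic refinement.

    Sampling the DTFT at m = pad_factor * len(x) points is equivalent to
    taking the DFT of x zero-padded to length m.
    """
    m = pad_factor * len(x)
    mags = [abs(dtft_sample(x, k / m)) for k in range(m)]
    # Coarse estimate: index of the largest magnitude sample.
    k = max(range(m), key=mags.__getitem__)
    # Fine estimate: fit a parabola through the peak and its two neighbors.
    y_prev, y_peak, y_next = mags[(k - 1) % m], mags[k], mags[(k + 1) % m]
    denom = y_prev - 2 * y_peak + y_next
    delta = 0.0 if denom == 0 else 0.5 * (y_prev - y_next) / denom
    return (k + delta) / m


true_freq = 0.1234  # cycles/sample, chosen to fall between grid points
tone = [cmath.exp(2j * cmath.pi * true_freq * n) for n in range(64)]
estimate = estimate_frequency(tone)
```

On this noiseless tone the refinement reduces the error well below the grid spacing of 1/256 cycles/sample; behavior under noise is not examined here.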

Proceedings Article
01 Oct 2016
TL;DR: A back-projection algorithm using the Fast Fourier Transform (FFT) is proposed to generate SAR images. It is optimal for SPOT mode and for wide-band scenarios; to expand the algorithm's flexibility, two additional implementations are presented.
Abstract: This paper describes a novel method based on the back-projection approach to generate SAR images. The back-projection operation has various “fast” implementations, all using multilevel algorithms. In this paper, a back-projection algorithm using the Fast Fourier Transform (FFT) is proposed. The basic method avoids interpolation by using nonuniform sampling, and it obtains the same results as the straightforward computation with O(N^2 log N) instead of O(N^4) complexity. This method is optimal for SPOT mode and for wide-band scenarios; however, to expand the algorithm's flexibility, two additional implementations are presented. Unlike the basic method, which does not use interpolation at all, these implementations require interpolation in the imaging process, but the error introduced is negligible.

Proceedings ArticleDOI
01 Jan 2016
TL;DR: A Canonic Signed Digit (CSD) constant multiplier is used, which minimizes the number of complex multipliers and the twiddle factor memory size to achieve an FFT processor optimized for area and memory requirements.
Abstract: In this paper, a modified FFT (Fast Fourier Transform) processor using a mixed-radix DIT algorithm is presented. A Canonic Signed Digit (CSD) constant multiplier is used, which minimizes the number of complex multipliers and the twiddle factor memory size to achieve an FFT processor optimized for area and memory requirements. Fixed-point number representation is used to minimize memory consumption and I/O bandwidth. The proposed FFT processor is written in VHDL and synthesized using the Xilinx ISE design tool, version 14.7. The target device is the XC6SLX45T from the Spartan-6 family. The ISim simulator is used for design verification. The results show a reduction in hardware utilization and time delay.

Journal ArticleDOI
TL;DR: A novel method for computing the discrete Fourier transform over a finite field based on the Goertzel-Blahut algorithm is described, which is currently the best one for computing the DFT over even extensions of the characteristic two finite field, in terms of multiplicative complexity.
Abstract: A novel method for computing the discrete Fourier transform (DFT) over a finite field based on the Goertzel-Blahut algorithm is described. The novel method is currently the best one for computing the DFT over even extensions of the characteristic two finite field, in terms of multiplicative complexity.

Journal ArticleDOI
TL;DR: In this paper, the computation of the perturbation-based electric field integral equation of the form $R^{n-1},~n = 0, 1, 2, \ldots$ is accelerated by using the fast Fourier transform (FFT) technique.
Abstract: In this communication, the computation of the perturbation-based electric field integral equation of the form $R^{n-1},~n = 0, 1, 2, \ldots$ is accelerated by using the fast Fourier transform (FFT) technique. As an effective solution to the low-frequency problem, the perturbation method employs the Taylor expansion of the scalar Green’s function in free space. However, multiple impedance matrices have to be solved at different frequency orders, and the computational cost becomes extremely high, especially for large-scale problems. Since the perturbed kernels still satisfy the Toeplitz property on the uniform Cartesian grid, the FFT based on Lagrange interpolation can be readily incorporated to accelerate the multiple matrix-vector products. Because the high-order kernels are nonsingular when $n \geq 1$, no near-field correction is needed. Finally, the efficiency of the proposed method is validated in an iterative solver with numerical examples.