Showing papers on "Rader's FFT algorithm" published in 2016


Journal ArticleDOI
TL;DR: A fast algorithm is introduced for computing volume potentials, that is, the convolution of a translation-invariant, free-space Green's function with a compactly supported source distribution defined on a uniform grid.

84 citations


Journal ArticleDOI
TL;DR: The triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.
Abstract: In this paper we propose a new representation for FFT algorithms called the triangular matrix representation. This representation is more general than the binary tree representation and, therefore, it introduces new FFT algorithms that were not discovered before. Furthermore, the new representation has the advantage that it is simple and easy to understand, as each FFT algorithm only consists of a triangular matrix. Besides, the new representation allows for obtaining the exact twiddle factor values in the FFT flow graph easily. This facilitates the design of FFT hardware architectures. As a result, the triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.

32 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present new convolution and correlation theorems in the OFRFT domain and discuss a design method for multiplicative filters for band-limited signals in the OFRFT, implemented both by convolution in the time domain based on the fast Fourier transform (FFT) and in the OFRFT domain.
Abstract: This paper presents new convolution and correlation theorems in the OFRFT domain. The authors also discuss a design method for multiplicative filters for band-limited signals in the OFRFT, implemented both by convolution in the time domain based on the fast Fourier transform (FFT) and in the OFRFT domain. Moreover, simulations show the effect of the time-shifting and frequency-modulation parameters in mapping one shape of an area to the same shape of another area.

14 citations


Proceedings ArticleDOI
22 May 2016
TL;DR: This brief presents a scaling-free stochastic adder and a random number generator sharing scheme, which enable a significant reduction in accuracy loss and hardware cost and achieve much better hardware and accuracy performance than state-of-the-art stochastic designs.
Abstract: Among the various discrete transforms, the discrete Fourier transform (DFT) is the most important technique for Fourier analysis in practical applications such as digital signal processing and wireless communications. Because of its high computational complexity of O(N^2), in practice the N-point DFT is usually computed as a fast Fourier transform (FFT) with complexity O(N log N). Despite this significant reduction in computational complexity, the hardware cost of the multiplication-intensive N-point FFT is still prohibitive, especially for large-scale applications that require large N.

13 citations
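
The O(N^2) versus O(N log N) contrast stated in the abstract above can be illustrated with a minimal sketch. It assumes a textbook recursive radix-2 decimation-in-time FFT, which is not the stochastic hardware design of the paper; the names `dft_naive`, `fft_radix2`, and `max_abs_diff` are hypothetical and chosen only for this illustration.

```python
import cmath


def dft_naive(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2*pi*i*n*k/N)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def fft_radix2(x):
    """Recursive radix-2 FFT, O(N log N); len(x) must be a power of two."""
    n_pts = len(x)
    if n_pts == 1:
        return list(x)
    if n_pts % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])  # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of the odd-indexed samples
    half = n_pts // 2
    out = [0j] * n_pts
    for k in range(half):
        # Butterfly: combine the two half-size DFTs with one twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n_pts) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def max_abs_diff(a, b):
    """Largest element-wise magnitude difference between two sequences."""
    return max(abs(p - q) for p, q in zip(a, b))
```

Both functions should agree to within floating-point roundoff on any power-of-two length, for example `max_abs_diff(dft_naive(x), fft_radix2(x))` for an 8-sample `x`.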


Proceedings ArticleDOI
20 Nov 2016
TL;DR: Theoretical computing complexity of and some other similar operation is demonstrated, revealing an advantage on computation of CS-unit, which is equivalent to a combination of a convolutional layer and a pooling layer but more effective.
Abstract: The convolution operation is the most important and time-consuming step in a convolutional neural network model. In this work, we analyze the computing complexity of direct convolution and fast-Fourier-transform-based (FFT-based) convolution. We propose the CS-unit, which is equivalent to a combination of a convolutional layer and a pooling layer but more effective. Theoretical computing complexity of and some other similar operation is demonstrated, revealing a computational advantage of the CS-unit. Practical experiments are also performed, and the results show that the CS-unit is faster in run time. Keywords: computing complexity; FFT-based convolution; CS-unit

7 citations
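
The two approaches compared in the abstract above, direct convolution and FFT-based convolution, can be sketched as follows. This is only an illustration of the general technique for one-dimensional sequences, not the CS-unit itself; the function names are hypothetical, and the inputs are zero-padded to a power of two so that a simple radix-2 transform can be used.

```python
import cmath


def conv_direct(a, b):
    """Direct linear convolution: len(a) * len(b) multiply-adds."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def _fft(x, sign):
    """Recursive radix-2 transform; sign=-1 is forward, sign=+1 is the unscaled inverse."""
    n = len(x)
    if n == 1:
        return list(x)
    even = _fft(x[0::2], sign)
    odd = _fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def conv_fft(a, b):
    """Linear convolution of real sequences via the convolution theorem."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:  # pad to a power of two that avoids circular wrap-around
        size *= 2
    fa = _fft(list(a) + [0] * (size - len(a)), -1)
    fb = _fft(list(b) + [0] * (size - len(b)), -1)
    prod = [p * q for p, q in zip(fa, fb)]  # pointwise product in the frequency domain
    y = _fft(prod, +1)
    return [v.real / size for v in y[:out_len]]  # scale the inverse and drop padding
```

For short kernels the direct loop does less work; the FFT route tends to pay off only as the kernel grows, which is the trade-off such complexity analyses examine.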


Proceedings ArticleDOI
04 Mar 2016
TL;DR: In this work, two distinct complex multiplication approaches are discussed, and the results they produce are compared with MATLAB.
Abstract: This paper presents high-speed FFT algorithms for high-data-rate wireless personal area network applications. In a wireless personal area network, the FFT/IFFT block plays the major role. The computational requirements of FFT/IFFT processing are a heavy burden in most applications. The FFT/IFFT block can be implemented in various ways. From this work, we can see that most FFT structures follow divide-and-conquer algorithms, which improve computational efficiency. Two distinct complex multiplication approaches are discussed, and the results they produce are compared with MATLAB. The pipelined architecture of the radix-2 SDF has been implemented as a 16-point module.

7 citations


Proceedings ArticleDOI
01 Jun 2016
TL;DR: The butterfly of an analogue of the Cooley-Tukey algorithm is provided, which requires fewer complex addition and multiplication operations than the standard method and runs 1.5 times faster than its analogue in MATLAB.
Abstract: One- and two-dimensional (2D) fast Fourier transform (FFT) algorithms have been widely used in digital processing. Because complexity and the amount of computation grow with the dimension of the signal, the 2D discrete Fourier transform is reduced to a combination of one-dimensional FFTs along all coordinates. This article provides the butterfly of an analogue of the Cooley-Tukey algorithm, which requires fewer complex addition and multiplication operations than the standard method and runs 1.5 times faster than its analogue in MATLAB.

6 citations


Journal ArticleDOI
TL;DR: A novel method, called aFFT-C (accurate FFT convolution), utilizes the Fast Fourier Transform (FFT) for the gain in speed, but relying on a rigorous analysis of the propagation of roundoff error, it can detect and circumvent the accumulation of this numerical error that is otherwise inherent to the Fourier transform.

6 citations


Proceedings ArticleDOI
19 May 2016
TL;DR: An Altera FPGA-based NIOS II custom instruction implementation of the Good-Thomas FFT algorithm is provided to improve system performance, together with a comparison against a pure software implementation of the same algorithm.
Abstract: Image processing can be considered as signal processing in two dimensions (2D). Filtering is one of the basic image processing operations. Filtering in the frequency domain is computationally faster than the corresponding spatial-domain operation because the convolution becomes a multiplication in the frequency domain. The popular 2D transforms used in image processing are the Fast Fourier Transform (FFT), Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT). The common values for the resolution of an image are 640x480, 800x600, 1024x768 and 1280x1024. As can be seen, image formats are generally not a power of 2. So power-of-2 FFT lengths are not required and these cannot be built using shorter Discrete Fourier Transform (DFT) blocks. Split-radix-based FFT algorithms like the Good-Thomas FFT algorithm simplify the implementation logic required for such applications and hence can be implemented with low area and power consumption while meeting the timing constraints, thereby operating at high frequency. The Good-Thomas FFT algorithm, which is a Prime Factor FFT algorithm (PFA), provides the means of computing the DFT with the least number of multiplication and addition operations. We provide an Altera FPGA-based NIOS II custom instruction implementation of the Good-Thomas FFT algorithm to improve system performance, and also a comparison with the same algorithm implemented completely in software.

5 citations
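
The prime-factor idea named in the abstract above can be sketched in software. The following is a minimal illustration of the standard Good-Thomas index mapping for N = N1 * N2 with coprime N1 and N2, under which no twiddle factors appear between the two stages; it is not the NIOS II implementation of the paper. The names `dft` and `good_thomas_dft` are hypothetical, and the short DFTs are evaluated directly for brevity.

```python
import cmath
from math import gcd


def dft(x):
    """Direct DFT, used here both as the short-length building block and as a reference."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def good_thomas_dft(x, n1, n2):
    """DFT of length n1*n2 (n1, n2 coprime) as an n1-by-n2 two-dimensional DFT."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1 and n2")
    # Input map: sample index (a*n2 + b*n1) mod N goes to grid position (a, b).
    grid = [[x[(a * n2 + b * n1) % n] for b in range(n2)] for a in range(n1)]
    # Length-n2 DFT along every row.
    rows = [dft(row) for row in grid]
    # Length-n1 DFT along every column; cols[k2][k1] is the 2D result.
    cols = [dft([rows[a][b] for a in range(n1)]) for b in range(n2)]
    # Output map (Chinese remainder theorem): bin k holds entry (k mod n1, k mod n2).
    return [cols[k % n2][k % n1] for k in range(n)]
```

For example, a length-15 transform can be computed with `good_thomas_dft(x, 3, 5)` and compared against `dft(x)`. The mapping only works when the two factors share no common divisor, which is the main restriction of this family of algorithms.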


Journal ArticleDOI
TL;DR: An overview of previous work on different FFT processors is given, and the different architectures are compared.
Abstract: A time-domain sequence is converted into an equivalent frequency-domain sequence using the discrete Fourier transform. The reverse operation converts a frequency-domain sequence into an equivalent time-domain sequence using the inverse discrete Fourier transform. The fast Fourier transform (FFT), based on the discrete Fourier transform, is an efficient algorithm requiring few computations. The FFT is used in everything from broadband to 3G and from digital TV to radio LANs. Different efficient algorithms have been developed to improve its architecture. This paper gives an overview of previous work on different FFT processors. A comparison of the different architectures is also discussed.

3 citations


Proceedings ArticleDOI
01 Jul 2016
TL;DR: Two levels of savings are proposed, the second of which applies a scaling operation to the twiddle factors (TF), similar to the tangent FFT and to the one proposed by Frigo for split radix, so that the net computational complexity is of the order of 4N log2 N operations, where N is the size of the FFT.
Abstract: Rader and Brenner's 'real-factor' FFT can be applied to the radix-4 FFT to save multiplications. However, the number of additions increases in turn, which increases the total flop count. To address this, two levels of savings are proposed in this paper. The first is a slight modification to Rader and Brenner's 'real-factor' FFT for radix-4, which not only reduces the multiplications but also makes the total flop count equal to that of the standard radix-4 FFT. The second is to apply a scaling operation to the twiddle factors (TF), similar to the tangent FFT and to the one proposed by Frigo for split radix, so that the net computational complexity is of the order of 4N log2 N operations, where N is the size of the FFT. The complexity order is thus the same as that of the standard split-radix FFT.

Proceedings ArticleDOI
01 Sep 2016
TL;DR: A pipelined fast Fourier transform processor consisting of radix-2, 3 and 5 stages for the prime-sized discrete Fourier transform (DFT) that reduces hardware complexity by 32% and achieves 737 Mbps throughput.
Abstract: This paper presents a pipelined fast Fourier transform (FFT) processor consisting of radix-2, 3 and 5 stages for the prime-sized discrete Fourier transform (DFT). The FFT processor does not require memory for storing the twiddle factors or complex multiplications. It adapts to 34 FFT lengths in the LTE uplink using trivial multiplications and multiplexing of data. The proposed architecture reduces hardware complexity by 32% and achieves 737 Mbps throughput.

18 Nov 2016
TL;DR: This article provides a general analogue of the Cooley-Tukey algorithm, which requires fewer complex addition and multiplication operations than the standard method and runs 1.5 times faster than its analogue in MATLAB.

Journal ArticleDOI
TL;DR: This paper provides a new FFT (classical) algorithm over symmetric groups and then transforms it to a quantum algorithm, which is faster than the existing O(n^4 log n) QFT algorithm.

Journal ArticleDOI
TL;DR: A simplified algorithm is proposed for the modified split radix FFT (MSRFFT) algorithm, reducing the number of real coefficients evaluated from (5/8)N − 2 to (15/32)N − 2 and the number of groups of decomposition from 4 to 3; the proposed implementation method can save execution time on CPUs and general processing units (GPUs).
Abstract: The discrete Fourier transform (DFT) finds various applications in signal processing, image processing, artificial intelligence, fuzzy logic, etc. The DFT is often computed efficiently with the fast Fourier transform (FFT). The modified split radix FFT (MSRFFT) algorithm implements a length-N = 2^m DFT, achieving a reduction of arithmetic complexity compared to the split-radix FFT (SRFFT). In this paper, a simplified algorithm is proposed for the MSRFFT algorithm, reducing the number of real coefficients evaluated from (5/8)N − 2 to (15/32)N − 2 and the number of groups of decomposition from 4 to 3. An implementation approach is also presented. The approach makes the data path of the MSRFFT regular, similar to that of the radix-2 FFT algorithm. The experimental results show that (1) MSRFFT consumes less time on central processing units (CPUs) with sufficient cache than existing algorithms; (2) the proposed implementation method can save execution time on CPUs and general processing units (GPUs).

Proceedings ArticleDOI
22 May 2016
TL;DR: An efficient memory-based fast Fourier transform processor including 35 different working sizes for LTE systems and exploiting prime factor algorithm to decrease the multiplications and twiddle factor storage is presented.
Abstract: This paper presents an efficient memory-based fast Fourier transform processor including 35 different working sizes for LTE systems. A factorization method named high-radix-small-butterfly, combined with a conflict-free address scheme for a 2^p 3^q 5^r-point memory-based FFT processor, is proposed. The processor provides not only conflict-free concurrent data access from different memory banks but also a continuous-flow working mode. Moreover, we exploit the prime factor algorithm to decrease the multiplications and twiddle factor storage. In addition, a unified Winograd Fourier transform algorithm butterfly core was designed for the small 2-, 3-, 4- and 5-point DFTs. The FFT processor was implemented in a SMIC 55 nm CMOS process with a core area of 1.063 mm^2. The chip consumes 40.8 mW at a 122.88 MHz operating frequency with a 1.08 V voltage supply.

Journal ArticleDOI
TL;DR: The structure of Bit Parallel Multiplication (BPM) is modified to improve the performance of the twiddle factor multiplier of the FFT architecture, and the modified BPM structures are incorporated into the R4MDC FFT structure to improve the performance of the frequency transformation process.
Abstract: This paper presents Fast Fourier Transform (FFT) techniques for converting a signal in the time domain to the frequency domain. The advent of the FFT method has greatly extended our ability to implement Fourier methods on digital computers. The FFT is an algorithm that speeds up the calculation of the Discrete Fourier Transform (DFT). Pipelined FFTs have features such as simplicity, modularity and high throughput. The Radix-2 Multi-Path Delay Commutator (R2MDC) FFT is the best frequency transformation architecture for FFT calculation. However, it requires more delay elements to provide the desired frequency transformation functions. To overcome this problem, a Radix-4 Multi-Path Delay Commutator (R4MDC) has been designed in this paper. In addition, the structure of Bit Parallel Multiplication (BPM) has been modified to improve the performance of the twiddle factor multiplier of the FFT architecture. The modified BPM structure uses only a little hardware to perform the twiddle factor multiplication. Finally, the modified BPM structures have been incorporated into the R4MDC FFT structure to improve the performance of the frequency transformation process. The design of the proposed methods has been validated using ModelSim 6.3C, and synthesis results have been evaluated using the Xilinx 12.4i design tool (Family: Spartan 3, Device: Xc3s200, Package: PQ208, Speed: -5).

Journal ArticleDOI
TL;DR: The paper shows the superiority of the FFT algorithm through a detailed study and makes further improvements to the algorithm to increase its efficiency in use.
Abstract: The quantity of digital image data is huge. Completing image processing in a limited time is therefore a huge task, and it is unrealistic to process a real-time image signal directly with the Discrete Fourier Transform (DFT) algorithm. There was no fundamental change until Cooley and Tukey proposed a Fast Fourier Transform (FFT) algorithm, which greatly promoted the application of the Discrete Fourier Transform in all aspects. The paper shows the superiority of the FFT algorithm through a detailed study and makes further improvements to the algorithm to increase its efficiency in use.

Proceedings ArticleDOI
01 Jul 2016
TL;DR: A modified frequency-domain discrete correlation algorithm needing only the FFT is presented and derived in new ways based on the correlation theorem, and simulation experiments show that the algorithm is correct and effective.
Abstract: Compared with the conventional frequency-domain correlation algorithm, which needs both the Fast Fourier Transform (FFT) and the Inverse Fast Fourier Transform (IFFT), a modified frequency-domain discrete correlation algorithm needing only the FFT is presented and derived in new ways based on the correlation theorem. Simulation experiments of the algorithm have also been completed for periodic signals and for mixed signals composed of periodic and random components. The results of the experiments show that the algorithm is correct and effective. The deviation caused by the algorithm is less than 10^-6, taking the time-domain correlation calculation as the reference.
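
One well-known way to avoid a separate inverse transform is the identity IDFT(S) = conj(DFT(conj(S))) / N. The sketch below applies it to circular cross-correlation via the correlation theorem. It is an assumption that this resembles the paper's derivation, which the abstract does not detail; the function names are hypothetical, and a direct DFT stands in for the FFT to keep the example short.

```python
import cmath


def dft(x):
    """Forward DFT (direct evaluation; an FFT would be substituted in practice)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def circular_xcorr_direct(x, y):
    """Time-domain reference: r[m] = sum_n x[n] * y[(n + m) mod N], real inputs."""
    n = len(x)
    return [sum(x[i] * y[(i + m) % n] for i in range(n)) for m in range(n)]


def circular_xcorr_forward_only(x, y):
    """Correlation theorem evaluated with forward transforms only."""
    n = len(x)
    fx, fy = dft(x), dft(y)
    # For real x, the spectrum of r is conj(X[k]) * Y[k].
    spectrum = [a.conjugate() * b for a, b in zip(fx, fy)]
    # Inverse transform expressed through the forward one:
    # IDFT(S) = conj(DFT(conj(S))) / N.
    back = dft([s.conjugate() for s in spectrum])
    return [v.conjugate().real / n for v in back]
```

Comparing `circular_xcorr_forward_only(x, y)` with `circular_xcorr_direct(x, y)` mirrors the abstract's use of the time-domain calculation as the reference.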

Proceedings ArticleDOI
01 Oct 2016
TL;DR: New identical radix-2^k FFT algorithms are presented, which have the same number of butterflies and twiddle factor multiplications; the only difference is the location of the twiddle factor stages in the signal flow graph (SFG).
Abstract: The radix-2^k fast Fourier transform (FFT) algorithm is used to achieve both a radix-2 butterfly and a reduced number of twiddle factor multiplications at the same time. In this paper we present new identical radix-2^k FFT algorithms, which have the same number of butterflies and twiddle factor multiplications. The only difference is the location of the twiddle factor stages in the signal flow graph (SFG). Further, these algorithms are analyzed, and it is shown that the round-off noise at the output of the identical radix-2^2, radix-2^3, and radix-2^4 FFT algorithms is reduced by 27%, 8%, and 3%, respectively.

Proceedings Article
01 Oct 2016
TL;DR: In this paper, the authors investigated the implementation and use of the sparse fast Fourier transform algorithm in the converter control of a variable speed drive, which is proposed due to the reduction in computational complexity compared to the conventional Fast Fourier Transform for the special case of sparse signals.
Abstract: This paper investigates the implementation and the use of the sparse fast Fourier transform algorithm in the converter control of a variable speed drive. The algorithm is proposed due to the reduction in computational complexity compared to the conventional fast Fourier transform for the special case of sparse signals. After discussing the theory and a simulation model, experimental results obtained using a field programmable gate array (FPGA) implementation are presented, showing the effectiveness of the proposed solution.

Posted Content
01 Feb 2016-viXra
TL;DR: In this article, the classical convolution of Clifford algebra-valued signals with the steerable two-sided Clifford Fourier transform (CFT) is compared with the (equally steerable) Mustard convolution.
Abstract: In this paper we use the general steerable two-sided Clifford Fourier transform (CFT), and relate the classical convolution of Clifford algebra-valued signals over $\mathbb{R}^{p,q}$ with the (equally steerable) Mustard convolution. A Mustard convolution can be expressed in the spectral domain as the pointwise product of the CFTs of the factor functions. In full generality we express the classical convolution of Clifford algebra signals in terms of finite linear combinations of Mustard convolutions, and vice versa the Mustard convolution of Clifford algebra signals in terms of finite linear combinations of classical convolutions.

Proceedings ArticleDOI
06 Jul 2016
TL;DR: The application of the proposed TD-FFT/TD-IFFT for real-time ILC implementation is presented, and demonstrated through an online implementation of the modeling free inversion-based iterative-learning control of a piezoelectric actuator in experiments.
Abstract: In this paper, a time-distributed fast Fourier transform and inverse fast Fourier transform (TD-FFT/TD-IFFT) algorithm is proposed. This work is motivated by the need to implement the FFT/IFFT in real time on general microprocessors (e.g., Intel's x86-based microprocessors) in signal processing and control applications, for example, in real-time implementation of frequency-domain iterative learning control techniques. The proposed TD-FFT technique exploits the butterfly structure of the FFT computation and distributes the computation needed into a sequence of stages, each operating on a much shorter sampled data sequence. The proposed approach is extended to real-time IFFT computation as well. For a sampled sequence of 2N length, the proposed TD-FFT/TD-IFFT algorithm maintains the total computational complexity of the FFT/IFFT while distributing the computation from one sampling period to multiple sampling periods. Then, the application of the proposed TD-FFT/TD-IFFT to real-time ILC implementation is presented and demonstrated through an online implementation of the modeling-free inversion-based iterative learning control (MIIC) of a piezoelectric actuator in experiments.

Book ChapterDOI
01 Jan 2016
TL;DR: This paper mainly focuses on prime-size 5-point and 7-point WFFT architectures, implemented in Verilog and simulated using Xilinx ISE 13.1.
Abstract: This paper presents an area- and latency-aware design of Discrete Fourier Transform (DFT) architectures using the Winograd Fast Fourier Transform algorithm (WFFT). The WFFT is one of the fast Fourier algorithms that compute prime-sized DFTs. The main components of DFT architectures are adders and multipliers. This paper presents DFT architectures using the Winograd Fast Fourier Algorithm with a carry look-ahead adder and an add/shift multiplier, and also with semi-complex multipliers. Different prime-size DFTs are calculated using the polynomial-based WFFT as well as the conventional algorithm. Area and latency are calculated in the Xilinx synthesizer. The polynomial WFFT includes the Chinese remainder theorem, which increases complexity for higher orders. This paper mainly focuses on prime-size 5-point and 7-point WFFT architectures, implemented in Verilog and simulated using Xilinx ISE 13.1. Each submodule is designed in data-flow style, and the top-level integration is done using structural modeling. The DFT architecture has a wide range of applications in various domains, including the Digital Terrestrial/Television Multimedia Broadcasting standard.