Journal ArticleDOI

Radix-2 decimation-in-frequency algorithm for the computation of the real-valued FFT

01 Apr 1999 - IEEE Transactions on Signal Processing (IEEE) - Vol. 47, Iss. 4, pp. 1181-1184
TL;DR: An efficient radix-2 decimation-in-frequency (DIF) algorithm for computing the real-valued FFT is introduced, and a C++ program that implements it is included.
Abstract: An efficient algorithm for computing the real-valued FFT (of length N) using the radix-2 decimation-in-frequency (DIF) approach is introduced. The odd-indexed coefficients are the DFT values of an N/2-length linear-phase sequence, which introduces a redundancy in the form of the symmetry $X(2k+1) = X^{*}(N-2k-1)$; this redundancy can be exploited to reduce the arithmetic complexity and memory requirements. The arithmetic complexity and memory requirements of the algorithm presented are exactly the same as those of the most efficient decimation-in-time (DIT) algorithm for the real-valued FFT that exists to date. A C++ program that implements this algorithm is included.
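
The symmetry quoted in the abstract can be checked numerically. The sketch below is not the paper's C++ program; it is a minimal pure-Python illustration, with hypothetical helper names (`dft`, `dif_split`), of one radix-2 DIF stage applied to a real input. It uses a naive O(N^2) DFT as the reference and assumes N is even.

```python
import cmath

def dft(seq):
    """Naive O(N^2) DFT, used here only as a reference."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dif_split(x):
    """One radix-2 DIF stage: return (even-indexed, odd-indexed) DFT coefficients."""
    n = len(x)
    half = n // 2
    # Even outputs: N/2-point DFT of the real sums x[t] + x[t + N/2].
    sums = [x[t] + x[t + half] for t in range(half)]
    # Odd outputs: N/2-point DFT of the real differences multiplied by the
    # linear-phase twiddle factor W_N^t = exp(-2*pi*j*t/N).
    diffs = [(x[t] - x[t + half]) * cmath.exp(-2j * cmath.pi * t / n)
             for t in range(half)]
    return dft(sums), dft(diffs)

x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]  # arbitrary real test input
N = len(x)
X = dft(x)
even, odd = dif_split(x)

# The split reproduces X(2k) and X(2k+1).
split_matches = all(abs(even[k] - X[2 * k]) < 1e-9 and
                    abs(odd[k] - X[2 * k + 1]) < 1e-9 for k in range(N // 2))

# X(2k+1) = conj(X(N-2k-1)); in the odd half this pairs index k with N/2-1-k.
symmetric = all(abs(odd[k] - odd[N // 2 - 1 - k].conjugate()) < 1e-9
                for k in range(N // 2))

print(split_matches, symmetric)
```

Because each odd coefficient is paired with the conjugate of another, only half of the odd outputs need to be computed and stored, which is the redundancy the abstract refers to.
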
Citations
Journal ArticleDOI
TL;DR: A formal procedure for designing FFT architectures using folding transformation and register minimization techniques is proposed and new parallel-pipelined architectures for the computation of real-valued fast Fourier transform (RFFT) are derived.
Abstract: This paper presents a novel approach to develop parallel pipelined architectures for the fast Fourier transform (FFT). A formal procedure for designing FFT architectures using folding transformation and register minimization techniques is proposed. Novel parallel-pipelined architectures for the computation of complex and real-valued fast Fourier transforms are derived. For the complex-valued Fourier transform (CFFT), the proposed architecture takes advantage of underutilized hardware in the serial architecture to derive L-parallel architectures without increasing the hardware complexity by a factor of L. The operating frequency of the proposed architecture can be decreased, which in turn reduces the power consumption. Further, this paper presents new parallel-pipelined architectures for the computation of the real-valued fast Fourier transform (RFFT). The proposed architectures exploit redundancy in the computation of FFT samples to reduce the hardware complexity. A comparison is drawn between the proposed designs and the previous architectures. The power consumption can be reduced by up to 37% and 50% in 2-parallel CFFT and RFFT architectures, respectively. The output samples are obtained in a scrambled order in the proposed architectures. Circuits to reorder these scrambled output sequences to a desired order are presented.

163 citations


Cites methods from "Radix-2 decimation-in-frequency alg..."

  • ...In [23]–[25] different algorithms have been proposed for computation of RFFT....


Journal ArticleDOI
TL;DR: The proposed architecture takes advantage of the reduced number of operations of the RFFT with respect to the complex fast Fourier transform (CFFT), and requires less area while achieving higher throughput and lower latency.
Abstract: This paper presents a new pipelined hardware architecture for the computation of the real-valued fast Fourier transform (RFFT). The proposed architecture takes advantage of the reduced number of operations of the RFFT with respect to the complex fast Fourier transform (CFFT), and requires less area while achieving higher throughput and lower latency. The architecture is based on a novel algorithm for the computation of the RFFT, which, contrary to previous approaches, presents a regular geometry suitable for the implementation of hardware structures. Moreover, the algorithm can be used for both the decimation in time (DIT) and decimation in frequency (DIF) decompositions of the RFFT and requires the lowest number of operations reported for radix 2. Finally, as in previous works, when calculating the RFFT the output samples are obtained in a scrambled order. The problem of reordering these samples is solved in this paper and a pipelined circuit that performs this reordering is proposed.

130 citations


Cites methods from "Radix-2 decimation-in-frequency alg..."

  • ...On the other hand, it has been also demonstrated that it is possible to obtain the same savings for the DIF decomposition using an alternative algorithm [29] that makes use of linear-phase sequences....


Journal ArticleDOI
TL;DR: A novel approach to develop pipelined fast Fourier transform (FFT) architectures for real-valued signals based on modifying the flow graph of the FFT algorithm such that it has both real and complex datapaths.
Abstract: This paper presents a novel approach to develop pipelined fast Fourier transform (FFT) architectures for real-valued signals. The proposed methodology is based on modifying the flow graph of the FFT algorithm such that it has both real and complex datapaths. The imaginary parts of the computations replace the redundant operations in the modified flow graph. New butterfly structures are designed to handle the hybrid datapaths. The proposed hybrid datapath leads to a general approach which can be extended to all radix-$2^n$ based FFT algorithms. Further, architectures with arbitrary level of parallelism can be derived using the folding methodology. Novel 2-parallel and 4-parallel architectures are presented for radix-$2^3$ and radix-$2^4$ algorithms. The proposed architectures maximize the utilization of hardware components with no redundant computations. The proposed radix-$2^3$ and radix-$2^4$ architectures lead to low hardware complexity with respect to adders and delays. The N-point 4-parallel radix-$2^4$ architecture requires $2(\log_{16} N - 1)$ complex multipliers, $2\log_2 N$ real adders, and N complex delay elements.
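
To make the closing resource counts concrete, the sketch below evaluates them for a few transform lengths. It reads the figures in the abstract as 2(log16 N - 1) complex multipliers, 2 log2 N real adders, and N complex delay elements; the function name is hypothetical, and it assumes N is a power of 16 so that log16 N is an integer. It only evaluates the quoted formulas and says nothing about the architecture itself.

```python
def radix16_4parallel_resources(n):
    """Return (complex multipliers, real adders, complex delays) for an n-point FFT.

    Hypothetical helper: evaluates the closed-form counts quoted in the
    abstract, assuming n is a power of 16.
    """
    log2_n = n.bit_length() - 1
    if n < 16 or (1 << log2_n) != n or log2_n % 4 != 0:
        raise ValueError("n must be a power of 16")
    log16_n = log2_n // 4
    return 2 * (log16_n - 1), 2 * log2_n, n

for n in (16, 256, 4096):
    print(n, radix16_4parallel_resources(n))
```
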

55 citations


Cites methods from "Radix-2 decimation-in-frequency alg..."

  • ...Even though specific algorithms for the computation of the RFFT [8]–[13] have been proposed in the past, these algorithms lack regular geometries to design pipelined architectures....


Journal ArticleDOI
TL;DR: This paper presents the design of ring learning with errors (LWE) cryptoprocessors using number theoretic transform (NTT) cores and Gaussian samplers based on the inverse transform method and concludes that they have the highest throughput, but they require more area resources than other previous ones.
Abstract: This paper presents the design of ring learning with errors (LWE) cryptoprocessors using number theoretic transform (NTT) cores and Gaussian samplers based on the inverse transform method. The NTT cores are designed using radix-2 and radix-8 decimation-in-frequency NTT algorithms and pipeline architectures. The designed Gaussian samplers are an optimized parallel implementation of the inverse transform method and they use a pipeline architecture to generate a sample every clock cycle after the latency period, that is, the output is obtained in a fixed time, achieving timing-attack-resistant ring-LWE cryptoprocessors. Also, taking into account the National Institute of Standards and Technology recommendation, a random number generator is designed to generate the input of the Gaussian sampler. The cryptoprocessors were synthesized on the field-programmable gate array EP4SGX230KF40C2 and verified in hardware using the DE4 board and the SignalTap tool. According to the obtained synthesis results, for dimension 512, the three cryptoprocessors perform the encryption in 933, 516, and 173 μs and the decryption in 459, 278, and 104 μs. We compared the designed cryptoprocessors with other ones presented in the literature, and from this comparison, we can conclude that they have the highest throughput, but they require more area resources than other previous ones.

42 citations


Cites methods from "Radix-2 decimation-in-frequency alg..."

  • ...The FFT is designed by implementing the butterfly diagram obtained from a radix-r FFT algorithm based on the decimation-in-time (DIT) or decimation-in-frequency (DIF) approaches, where r is a power of two [25], [26], and the above algorithm is adapted to perform the NTT....


Journal ArticleDOI
TL;DR: A novel architecture for memory-based fast Fourier transform (FFT) computation for real-valued signals based on radix-2 decimation-in-frequency algorithm to minimize the computation clock cycles and maximize the utilization of the processing element (PE).
Abstract: This brief presents a novel architecture for memory-based fast Fourier transform (FFT) computation for real-valued signals based on the radix-2 decimation-in-frequency algorithm. A superior strategy of stage partition for the real FFT (RFFT) is proposed to minimize the computation clock cycles and maximize the utilization of the processing element (PE). The PE employed in our RFFT architecture can process four inputs in parallel by using two radix-2 butterflies and only two multiplexers. The proposed memory-addressing scheme and control of the multiplexers can be expressed in terms of a counter according to the RFFT computation stage. Furthermore, the proposed RFFT architecture can support more PEs in two dimensions as well. Compared with prior works, the proposed RFFT processors have the advantages of fewer computation cycles and lower hardware usage. The experiment shows that the proposed processor reduces the computation cycles by 17.5% for a 32-point RFFT computation compared with a recently presented work while maintaining lower hardware usage and complexity in the PE design.

41 citations


Cites background from "Radix-2 decimation-in-frequency alg..."

  • ...Nowadays, the interest in the computation of FFT for real-valued signals is increasing since most of the physical signals are real [3]–[5]....


References
Journal ArticleDOI
TL;DR: A new implementation of the real-valued split-radix FFT is presented, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT.
Abstract: This tutorial paper describes the methods for constructing fast algorithms for the computation of the discrete Fourier transform (DFT) of a real-valued series. The application of these ideas to all the major fast Fourier transform (FFT) algorithms is discussed, and the various algorithms are compared. We present a new implementation of the real-valued split-radix FFT, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT. We also compare the performance of inherently real-valued transform algorithms such as the fast Hartley transform (FHT) and the fast cosine transform (FCT) to real-valued FFT algorithms for the computation of power spectra and cyclic convolutions. Comparisons of these techniques reveal that the alternative techniques always require more additions than a method based on a real-valued FFT algorithm and result in computer code of equal or greater length and complexity.

489 citations

Journal ArticleDOI
TL;DR: A complete set of fast algorithms for computing the discrete Hartley transform is developed, including decimation-in-frequency, radix-4, split radix, prime factor, and Winograd transform algorithms.
Abstract: The discrete Hartley transform (DHT) is a real-valued transform closely related to the DFT of a real-valued sequence. Bracewell has recently demonstrated a radix-2 decimation-in-time fast Hartley transform (FHT) algorithm. In this paper a complete set of fast algorithms for computing the DHT is developed, including decimation-in-frequency, radix-4, split radix, prime factor, and Winograd transform algorithms. The philosophies of all common FFT algorithms are shown to be equally applicable to the computation of the DHT, and the FHT algorithms closely resemble their FFT counterparts. The operation counts for the FHT algorithms are determined and compared to the counts for corresponding real-valued FFT algorithms. The FHT algorithms are shown to always require the same number of multiplications, the same storage, and a few more additions than the real-valued FFT algorithms. Even though computation of the FHT takes more operations, in some situations the inherently real-valued nature of the discrete Hartley transform may justify this extra cost.
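
The close relationship between the DHT and the DFT of a real-valued sequence mentioned above can be illustrated numerically. The sketch below is not taken from the cited paper; it uses naive O(N^2) transforms with hypothetical helper names and relies on the standard identity H(k) = Re X(k) - Im X(k), which holds for real input under the convention that the DFT kernel is exp(-2πjnk/N) and the DHT kernel is cos(2πnk/N) + sin(2πnk/N).

```python
import cmath
import math

def dft(x):
    """Naive DFT with kernel exp(-2*pi*j*n*k/N)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dht(x):
    """Naive discrete Hartley transform with kernel cas(a) = cos(a) + sin(a)."""
    n = len(x)
    return [sum(x[t] * (math.cos(2 * math.pi * k * t / n) +
                        math.sin(2 * math.pi * k * t / n)) for t in range(n))
            for k in range(n)]

x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]  # arbitrary real test input
X = dft(x)
H = dht(x)

# For real input, the DHT equals the real part minus the imaginary part of the DFT.
matches = all(abs(H[k] - (X[k].real - X[k].imag)) < 1e-9 for k in range(len(x)))
print(matches)
```
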

275 citations

Journal ArticleDOI
O.K. Ersoy
01 Mar 1994
TL;DR: Generalizations of the Fourier transform kernel lead to a number of novel transforms, in particular, special discrete cosine, discrete sine, and real discrete Fourier transforms, which have already found use in a number of applications.
Abstract: Major continuous-time, discrete-time, and discrete Fourier-related transforms as well as Fourier-related series are discussed both with real and complex kernels. The complex Fourier transforms, Fourier series, cosine, sine, Hartley, Mellin, Laplace transforms, and z-transforms are covered on a comparative basis. Generalizations of the Fourier transform kernel lead to a number of novel transforms, in particular, special discrete cosine, discrete sine, and real discrete Fourier transforms, which have already found use in a number of applications. The fast algorithms for the real discrete Fourier transform provide a unified approach for the optimal fast computation of all discrete Fourier-related transforms. The short-time Fourier-related transforms are discussed for applications involving nonstationary signals. The one-dimensional transforms discussed are also extended to the two-dimensional transforms.

57 citations

Journal ArticleDOI
TL;DR: Five programs for efficient computation of DFT of real-valued data are analyzed with respect to their operation counts vis-a-vis run times on weak and powerful floating-point processors and the Bruun (1978) algorithm turns out to be a "best" performer.
Abstract: Five programs for efficient computation of the DFT of real-valued data are analyzed with respect to their operation counts vis-a-vis run times on weak and powerful floating-point processors. The results help dispose of the claims of superiority of FHT over corresponding real-valued FFT. The Bruun (1978) algorithm turns out to be a "best" performer.

30 citations

Journal ArticleDOI
V. Nagesha
TL;DR: Efficient fast Fourier transform algorithms to compute the forward and inverse discrete Fourier transforms of a sequence with linear-phase characteristic are examined and can be easily written by simple restructuring of a complex FFT algorithm.
Abstract: Efficient fast Fourier transform (FFT) algorithms to compute the forward and inverse discrete Fourier transforms (DFT) of a sequence with linear-phase characteristic are examined. These reduce the computational requirements as regards a complex FFT by large factors and should be used whenever applicable. The case when the DFT coefficients are real-valued leads to further reductions in computational requirements. Though the redundancy in the linear-phase situation is exactly 50%, the computational requirements and implementation are quite different from the real-valued FFT which uses a similar symmetry relation. The code for such implementations can be easily written by simple restructuring of a complex FFT algorithm.

7 citations