
Showing papers on "Twiddle factor" published in 2022


Journal ArticleDOI
TL;DR: In this article, a coordinate rotation digital computer (CORDIC) algorithm is developed using integrated synthesis environment (ISE) Xilinx version 14.7 software, synthesized using very-high-speed integrated circuit hardware description language (VHDL), and tested on a Virtex-5 FPGA.
Abstract: The fourth-generation (4G) and fifth-generation (5G) wireless communication systems use orthogonal frequency division multiplexing (OFDM) modulation techniques and subcarrier allocations. The OFDM modulator and demodulator contain an inverse fast Fourier transform (IFFT) and a fast Fourier transform (FFT), respectively. The biggest challenge in an IFFT/FFT processor is the computation of imaginary and real values. CORDIC has proved to be one of the best rotation algorithms for logarithmic, trigonometric, and complex calculations. The proposed work focuses on the OFDM transceiver hardware chip implementation, in which 8-point to 1024-point IFFT and FFT are used to compute the operations in the transmitter and receiver, respectively. The coordinate rotation digital computer (CORDIC) algorithm has a read-only memory (ROM)-based architecture to store FFT twiddle factors and their angle generators. The address generation unit is required to fetch the data and write the results into the memory in the appropriate sequence. CORDIC provides low memory use, low delay, and optimized hardware on the field-programmable gate array (FPGA) in comparison to a normal FFT architecture for the OFDM system. The comparative performance of the FFT and CORDIC-FFT based OFDM transceiver chip is estimated using FPGA parameters: slices, flip-flops, lookup tables (LUTs), frequency, power, and delay. The design is developed using integrated synthesis environment (ISE) Xilinx version 14.7 software, synthesized using very-high-speed integrated circuit hardware description language (VHDL), and tested on a Virtex-5 FPGA.

5 citations
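
To make the CORDIC idea in the entry above concrete, here is a minimal Python sketch of rotation-mode CORDIC producing an FFT twiddle factor. It is not the authors' VHDL design: the function names `cordic_rotate` and `twiddle`, the iteration count, and the use of floating point in place of fixed-point shifts are all assumptions made for illustration.

```python
import math

def cordic_rotate(angle, iterations=16):
    """Approximate (cos(angle), sin(angle)) for |angle| <= pi/2 with CORDIC."""
    # Elementary angles atan(2^-i); in hardware these would sit in a small ROM.
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of the micro-rotations; pre-scaling x cancels it.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0        # rotate toward zero residual angle
        shift = 2.0 ** -i                  # a right shift in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * atan_table[i]
    return x, y

def twiddle(k, n, iterations=16):
    """Approximate W_n^k = exp(-2j*pi*k/n), valid here for 0 <= k <= n/4."""
    c, s = cordic_rotate(2.0 * math.pi * k / n, iterations)
    return complex(c, -s)
```

With 16 iterations the residual angle is bounded by atan(2^-15), so the result agrees with the exact twiddle factor to roughly four decimal places. Twiddle factors outside the first quarter of the circle would need a quadrant mapping that this sketch omits.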



Journal ArticleDOI
TL;DR: This paper proposes fast algorithms for computing the discrete Fourier transform of real-valued sequences of lengths from 3 to 9 that do not operate on complex numbers.
Abstract: This paper proposes fast algorithms for computing the discrete Fourier transform for real-valued sequences of lengths from 3 to 9. Since calculating the real-valued DFT using the complex-valued FFT is redundant regarding the number of needed operations, the developed algorithms do not operate on complex numbers. The algorithms are described in matrix–vector notation and their data flow diagrams are shown.

2 citations
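
As an illustration of a real-valued DFT that never touches complex numbers, the sketch below handles the smallest case, length 3. It follows directly from the DFT definition and is not taken from the paper; the function name `real_dft3` and the choice to return only the non-redundant outputs are assumptions.

```python
import math

def real_dft3(x0, x1, x2):
    """3-point DFT of real input using real arithmetic only.

    Returns (X0, Re X1, Im X1); X2 is the complex conjugate of X1.
    """
    s = x1 + x2                          # shared sum
    d = x1 - x2                          # shared difference
    re0 = x0 + s
    re1 = x0 - 0.5 * s                   # cos(2*pi/3) = -1/2
    im1 = -(math.sqrt(3.0) / 2.0) * d    # sin(2*pi/3) = sqrt(3)/2
    return re0, re1, im1
```

Only one non-trivial multiplication (by sqrt(3)/2) is needed, since the factor 1/2 is a shift in fixed-point hardware.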




Proceedings ArticleDOI
08 Dec 2022
TL;DR: In this article, the authors propose a fast Fourier transform architecture that uses lookup tables to reduce the number of multiplications performed, and pre-processing units with integer multipliers to reduce the number of multipliers needed for complex multiplication when computing the FFT.
Abstract: Numerous signal processing algorithms, including many used in biomedical research, employ the Fourier transform extensively. Due to the computationally intensive nature of the FFT tasks, optimization models are necessary to reduce the computations' complexity to a more manageable level. In this article, we suggest optimizations that can reduce the number of mathematical operations necessary to compute the FFT. These optimizations may occur at either the architecture or application level. The proposed architecture for the fast Fourier transform makes use of lookup tables to reduce the number of times multiplication is performed. In addition, pre-processing units use integer multipliers to reduce the number of multipliers needed to perform complex multiplication when computing the FFT.
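
One well-known way to combine a twiddle lookup table with fewer multipliers is the three-multiplication form of the complex product, sketched below with integer arithmetic. This is an assumed illustration, not the architecture proposed in the paper: the names `build_twiddle_table` and `cmul3` and the Q1.14 fixed-point format are hypothetical choices.

```python
import math

FRAC_BITS = 14  # assumed fixed-point fraction width

def build_twiddle_table(n):
    """Precompute W_n^k = c + j*d for k in [0, n/2) as scaled integers.

    Each entry stores (c, d - c, c + d), the constants the
    three-multiplier product needs, so no additions on the twiddle
    side are required at run time.
    """
    scale = 1 << FRAC_BITS
    table = []
    for k in range(n // 2):
        c = round(math.cos(2.0 * math.pi * k / n) * scale)
        d = round(-math.sin(2.0 * math.pi * k / n) * scale)
        table.append((c, d - c, c + d))
    return table

def cmul3(a, b, entry):
    """Compute (a + j*b) * (c + j*d) with three integer multiplications."""
    c, d_minus_c, c_plus_d = entry
    k1 = c * (a + b)
    k2 = a * d_minus_c
    k3 = b * c_plus_d
    real = (k1 - k3) >> FRAC_BITS   # equals a*c - b*d before scaling
    imag = (k1 + k2) >> FRAC_BITS   # equals a*d + b*c before scaling
    return real, imag
```

The direct form needs four multiplications and two additions; this form trades one multiplication for extra additions, most of which are absorbed into the precomputed table.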

Journal ArticleDOI
05 Apr 2022-Mausam
TL;DR: It is found that on average the FFT with the proposed modifications is more than twice as fast as the original FFT.
Abstract: The efficient Fourier transform (EFT) and FFT algorithms are described and their computational efficiencies with respect to the direct method are discussed. An efficient procedure is proposed for reordering the data set; using the EFT algorithm for the initial Fourier transforms and restricting the size of the final subsets to not less than 4 are also suggested for saving computation time in the FFT. It is found that on average the FFT with the proposed modifications is more than twice as fast as the original FFT. The amount of overhead operations involved in a computer routine based on the modified FFT is estimated.

Journal ArticleDOI
TL;DR: In this paper, the authors present a new data-flow graph with no non-trivial twiddle factors in the odd stages of the split-radix fast Fourier transform.
Abstract: Fully parallel pipelined fast Fourier transform offers the highest throughput but requires high area and power. In this work, an efficient fast Fourier transform processor based on the split-radix algorithm is presented to reduce the hardware complexity of fully parallel fast Fourier transforms. In fully parallel pipelined implementations, split-radix fast Fourier transform requires considerably more registers compared to other fast Fourier transform algorithms leading to more power consumption and high latency in spite of having the least number of multiplications among all fast Fourier transforms. To address this issue, the authors present a new data-flow graph with no non-trivial twiddle factors in the odd stages of the split-radix fast Fourier transform. This facilitates significant reduction in the number of pipelining registers and a lower latency is achieved. Further, an efficient shift-add twiddle factor multiplier is proposed based on architectural analysis and substructure sharing. A comparison of the proposed design with existing fully parallel fast Fourier transforms is carried out and it is found that the proposed method outperforms other designs. Synthesis results show that the proposed design offers 45–57% area savings and 67–74% energy savings compared to the best existing fully parallel fast Fourier transform design.

Proceedings ArticleDOI
21 Dec 2022
TL;DR: In this article, a single-bin FFT algorithm based on the formulas for the DFT of different frequency bins derived from the Radix-2 FFT is presented, and a comparative analysis of the developed algorithm with the Goertzel algorithm is performed.
Abstract: The Discrete Fourier Transform (DFT) plays an important role in digital signal processing. Fast Fourier Transform (FFT) is a set of algorithms for computing the DFT in short run-time. In many practical applications, DFT of only one frequency value is required rather than of all the frequency values. Algorithms for computing DFT of only one frequency are called Single-bin FFT algorithms. In this paper, we have developed a single-bin FFT algorithm based on the formulas for DFT of different frequency bins derived from the Radix-2 FFT algorithm. Finally, we have performed a comparative analysis of the developed algorithm with another available single-bin FFT algorithm, called the Goertzel algorithm.
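
For reference, the Goertzel algorithm that the entry above uses as its baseline can be sketched as follows. This is the standard second-order recurrence, not the single-bin FFT developed in the paper; the function name `goertzel` is an assumption.

```python
import cmath
import math

def goertzel(x, k):
    """Return bin X[k] of the len(x)-point DFT of the sequence x."""
    n = len(x)
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)       # the only multiplier used inside the loop
    s1 = 0.0                        # s[m-1]
    s2 = 0.0                        # s[m-2]
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # After all N samples: X[k] = exp(j*w) * s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s1 - s2
```

For example, `goertzel([1, 2, 3, 4], 1)` evaluates one bin of a 4-point DFT with one real multiplication per sample plus a single complex multiplication at the end.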

Proceedings ArticleDOI
16 Feb 2022
TL;DR: In this article, the authors propose an efficient pruning algorithm that reduces redundant computations in the FFT, improving speed and reducing power consumption; the algorithm is implemented in Verilog HDL.
Abstract: Fast Fourier Transform (FFT) is a Digital Signal Processing (DSP) technique to compute Discrete Fourier Transform (DFT) in a faster way by utilizing the properties of the twiddle factor. Conventional FFT has a problem of computational inefficiency when the number of zero-valued inputs outnumbers the number of non-zero-valued inputs. This is because of the redundant computations on the zero-valued inputs. This issue can be resolved by using a technique called pruning in FFT. In this paper we propose an efficient algorithm to reduce the redundant computations in FFT which improves the speed and reduces the power consumption. The proposed algorithm is implemented using Verilog HDL.
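
The idea of input pruning can be sketched in a few lines of Python: a recursive radix-2 decimation-in-time FFT that skips any sub-transform whose inputs are all zero. This is a simplified illustration under the assumption of a power-of-two length, not the authors' Verilog algorithm; the function name `pruned_fft` is hypothetical.

```python
import cmath

def pruned_fft(x):
    """Radix-2 DIT FFT that skips work on all-zero input branches.

    len(x) must be a power of two.
    """
    n = len(x)
    if not any(x):
        return [0j] * n                  # whole branch is zero: nothing to do
    if n == 1:
        return [complex(x[0])]
    even = pruned_fft(x[0::2])
    if not any(x[1::2]):
        # Odd half is zero: no twiddle multiplications, the output repeats.
        return even + even
    odd = pruned_fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

For an input such as `[1, 0, 0, 0]`, every odd branch is zero, so the result is produced without a single twiddle multiplication.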

Proceedings ArticleDOI
31 Oct 2022
TL;DR: In this paper, a message alignment technique is used to ensure similar meanings of cluster indexes at different positions in the IB-FFT butterfly network, and butterfly nodes are replaced by separately optimized LUTs for multiplication with twiddle factors and for addition.
Abstract: The trend towards larger bandwidths to accommodate higher data rates is unbroken in mobile communications. High sampling rates resulting from these large bandwidths can lead to critical energy consumption in the analog-to-digital converter (ADC), particularly in mobile devices. As the ADC energy demand depends exponentially on the bit resolution, an increase in energy consumption can be avoided by coarse quantization. The ADC can be designed to maximize the mutual information between its output and the channel input using the information bottleneck (IB) method. This approach can also be used to replace computationally expensive arithmetic operations with mutual information maximizing look-up tables (LUTs). We consider an orthogonal frequency division multiplexing communication link and substitute arithmetic computations inside the fast Fourier transform (FFT) with LUTs. Replacing every FFT butterfly operation with an individual LUT quickly becomes infeasible for large FFT sizes due to exploding memory requirements. Instead, a technique called message alignment is used, ensuring similar meanings of cluster indexes at different positions in the IB-FFT butterfly network. In the proposed message aligned IB-FFT, butterfly nodes are replaced by separately optimized LUTs for multiplication with twiddle factors and for addition. Message alignment allows these LUTs to be reused, saving valuable memory resources.

Book ChapterDOI
01 Jan 2022

Proceedings ArticleDOI
28 Dec 2022
TL;DR: In this paper, the authors propose a novel architecture for an N-point FFT with a run-time configurable Radix-2 architecture, FFT size, and data type, implemented so that only one memory is used for the entire computation process.
Abstract: A Fast Fourier Transform is an efficient algorithm to compute the discrete Fourier Transform (DFT). It is one of the key operations in the area of digital signal and image processing. Computing the DFT directly is expensive (N^2 complex multiplications and N*(N-1) additions), which makes computation and implementation very difficult. This work implements an N-point FFT/IFFT with a 32-bit data width (16-bit real and 16-bit imaginary), a run-time configurable Radix-2 architecture, FFT size, and data type (fixed point), and compile-time configurable data and twiddle factor precision. The design target is to minimize the latency and design constraints. The logic is implemented so that only one memory is used for the entire computation process. Hence, this gives a novel architecture design for N-point FFT.

Proceedings ArticleDOI
20 Jun 2022
TL;DR: In this article, the authors propose an FFT-specific approximate multiplier design that improves the energy efficiency of signal processing by using an approximated twiddle factor and a compressor to reduce the energy consumed during partial product accumulation.
Abstract: Fast Fourier Transform (FFT) plays an important role in signal processing. This paper proposes an FFT-specific approximate multiplier design to improve the energy efficiency. The approximate multiplier is based on the approximated twiddle factor and compressor to reduce the energy during partial product accumulation. Instead of complex arithmetic operations, the design can achieve reduced power consumption and hardware cost with limited error.

Journal ArticleDOI
TL;DR: In this paper, a modification of the method of implementing the multiplication of multi-digit integers based on the fast Fourier transform (FFT) that avoids the bit-reversed ordering is proposed.
Abstract: For the parallel model of computation, the paper proposes a modification of the method of implementing the multiplication of multi-digit integers based on the fast Fourier transform (FFT) that avoids the bit-reversed ordering. The paper studies the calculation of the FFT according to the “butterfly” scheme based on decimation-in-frequency and decimation-in-time methods, with an input signal whose elements are in direct and bit-reversed order, with an increase and decrease in the number of Fourier series coefficients at each step of the “butterfly”, and with a list of Fourier series coefficients in direct and bit-reversed order. The standard FFT-based multiplication algorithm uses the same “butterfly” operation to compute the forward and inverse Fourier transforms. The paper analyzes two combinations of the FFTFDN–FFTTBN and FFTFBN–FFTTDN “butterfly” calculation schemes for calculating forward and inverse discrete Fourier transforms (DFT) when implementing the multi-digit operation in a parallel computational model so as to exclude the bit-reversed permutation. A scheme for distributing calculations among four processors is proposed, in which forward and inverse Fourier transform calculations are localized within one parallel processor. The proposed modification does not reduce the computational complexity in terms of the number of complex operations, but due to the exclusion of the bit-reversed permutation, the number of synchronization commands between processors and data is reduced, which reduces the algorithm execution time. The scheme can be adapted to distribute the computations among a larger number of processors. Four algorithms for implementing the FFT based on decimation-in-frequency and decimation-in-time methods, with an input vector whose elements are in direct and bit-reversed order, are presented. To check the result of the calculation, the multiplication algorithm avoiding the steps of bit-reversed ordering was implemented in the APL programming language. An example of calculation is given in the form of a table.
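
The reason a decimation-in-frequency (DIF) forward transform pairs naturally with a decimation-in-time (DIT) inverse can be shown with a small sequential sketch. The DIF pass takes natural-order input and leaves its output bit-reversed; the pointwise product does not care about order; the DIT pass accepts bit-reversed input and returns natural order. This is an assumed Python illustration, not the paper's APL code or its four-processor scheme; the names `fft_dif`, `ifft_dit`, and `multiply` are hypothetical.

```python
import cmath

def fft_dif(a):
    """In-place forward FFT: natural-order input, bit-reversed output."""
    n = len(a)
    half = n // 2
    while half >= 1:
        for start in range(0, n, 2 * half):
            for j in range(half):
                w = cmath.exp(-1j * cmath.pi * j / half)
                u, v = a[start + j], a[start + j + half]
                a[start + j] = u + v
                a[start + j + half] = (u - v) * w
        half //= 2

def ifft_dit(a):
    """In-place inverse FFT: bit-reversed input, natural-order output."""
    n = len(a)
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for j in range(half):
                w = cmath.exp(1j * cmath.pi * j / half)
                u, v = a[start + j], a[start + j + half] * w
                a[start + j] = u + v
                a[start + j + half] = u - v
        half *= 2
    for i in range(n):
        a[i] /= n

def multiply(x, y):
    """Multiply two non-negative integers via FFT, with no bit reversal."""
    da = [int(c) for c in reversed(str(x))]   # little-endian decimal digits
    db = [int(c) for c in reversed(str(y))]
    n = 1
    while n < len(da) + len(db):
        n *= 2
    fa = [complex(d) for d in da] + [0j] * (n - len(da))
    fb = [complex(d) for d in db] + [0j] * (n - len(db))
    fft_dif(fa)
    fft_dif(fb)
    fc = [p * q for p, q in zip(fa, fb)]       # both spectra share one ordering
    ifft_dit(fc)
    # Python integers absorb the carries when the digit sums are recombined.
    return sum(round(c.real) * 10 ** i for i, c in enumerate(fc))
```

Each DIT stage undoes the matching DIF stage up to a factor of two, which is why the permutation can be dropped entirely. Floating-point rounding limits this sketch to moderately sized operands.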

Journal ArticleDOI
TL;DR: In this paper, a new low power and low complexity FFT architecture design is proposed, where an input grouping method is used to reduce the multiplications of the inputs and FFT twiddle factor coefficients.
Abstract: FFT is a commonly applied algorithm in digital signal processing and communications. In this brief, a new low power and low complexity FFT architecture design is proposed. An input grouping method is used to reduce the multiplications of the inputs and FFT twiddle factor coefficients. In addition, a new input partial sum sharing scheme is proposed to reuse the hardware resources to further reduce the adder cost. Logic synthesis results in ASIC show that the proposed 16-point FFT architecture can save area and power cost by at least 19.1% and 19.0% respectively compared with the recently published designs. Similarly, the proposed 32-point FFT architecture can reduce both power and delay by at least 6.91% and 5.35%.

Proceedings ArticleDOI
28 May 2022
TL;DR: In this paper, an efficient VLSI architecture of Bluestein's FFT for BGV-FHE is explored; a mixed-radix single-port merged-bank memory addressing algorithm increases the effective memory bandwidth and reduces the required memory area concurrently, and the design can be more area-efficient than its mixed-radix counterpart in general cases.
Abstract: Fully homomorphic encryption (FHE) is a powerful scheme that allows computations to be performed on encrypted data. To reduce the computational complexity, double-CRT representation has been adopted in the BGV-FHE cryptosystem, in which the 2nd-CRT, also known as the polynomial-CRT, can be viewed as performing Discrete Fourier Transform (DFT). Since the point size of the DFT is usually not a power of two, the traditional Cooley-Tukey FFT algorithm cannot be directly applied to reduce the complexity. This paper explores an efficient VLSI architecture of Bluestein’s FFT for BGV-FHE applications. A mixed-radix single-port merged-bank memory addressing algorithm is presented to increase the effective memory bandwidth and to reduce the required memory area concurrently. The evaluation was conducted by implementing a Bluestein’s FFT compiler that can be configured to generate different point sizes of DFT designs for BGV-FHE. Analytical results also show that the proposed Bluestein’s FFT design can lead to a more area-efficient solution as compared to the mixed-radix counterpart in general cases.
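
Bluestein's algorithm rewrites an arbitrary-length DFT as a convolution with a chirp, which can then be evaluated with power-of-two FFTs. The sketch below shows the textbook form over complex floating-point numbers; it is not the paper's VLSI architecture or memory addressing scheme, and the arithmetic used in a BGV-FHE setting may differ. The names `fft_pow2` and `bluestein_dft` are assumptions.

```python
import cmath

def fft_pow2(x, sign=-1):
    """Recursive radix-2 FFT for power-of-two lengths (sign=+1: unscaled inverse)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_pow2(x[0::2], sign)
    odd = fft_pow2(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def bluestein_dft(x):
    """DFT of arbitrary length N via chirp convolution.

    Uses n*k = (n^2 + k^2 - (k - n)^2) / 2 to turn the DFT sum into a
    convolution of x[n]*chirp[n] with the conjugate chirp.
    """
    n = len(x)
    # chirp[i] = exp(-j*pi*i^2/N); i^2 is reduced mod 2N to keep angles small.
    chirp = [cmath.exp(-1j * cmath.pi * ((i * i) % (2 * n)) / n) for i in range(n)]
    m = 1
    while m < 2 * n - 1:                 # power-of-two convolution length
        m *= 2
    a = [x[i] * chirp[i] for i in range(n)] + [0j] * (m - n)
    b = [0j] * m
    for i in range(n):
        b[i] = chirp[i].conjugate()
        if i:
            b[m - i] = chirp[i].conjugate()   # negative lags wrap around
    fa = fft_pow2(a)
    fb = fft_pow2(b)
    conv = fft_pow2([p * q for p, q in zip(fa, fb)], sign=+1)
    return [chirp[k] * conv[k] / m for k in range(n)]
```

A 5-point input, for instance, is handled with 8-point FFTs, at the cost of three power-of-two transforms instead of one.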