Showing papers on "Twiddle factor" published in 2010


Journal ArticleDOI
TL;DR: The twiddle factor from the feedback in a traditional SDFT resonator is removed and thus the finite precision of its representation is no longer a problem and the accumulated errors and potential instabilities are drastically reduced in the mSDFT.
Abstract: This article presented a novel method of computing the SDFT that we call the modulated SDFT (mSDFT). The sliding discrete Fourier transform (SDFT) is a recursive algorithm that computes a DFT on a sample-by-sample basis. The accumulated errors and potential instabilities inherent in traditional SDFT algorithms are drastically reduced in the mSDFT. We removed the twiddle factor from the feedback in a traditional SDFT resonator and thus the finite precision of its representation is no longer a problem.

103 citations
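
The idea in the abstract above can be illustrated with a minimal sketch. It assumes the usual formulation in which the input is modulated so that bin k is shifted to DC, which leaves a feedback coefficient of exactly 1 in the recursion. The function name `msdft_bin`, the twiddle lookup table and the demodulation step are illustrative choices, not details taken from the article.

```python
import cmath
from collections import deque

def msdft_bin(samples, N, k):
    """Return bin k of the DFT of the most recent N samples, after every sample."""
    # Twiddle table indexed by (k*n) mod N, so the modulator phase is never
    # accumulated and cannot drift.
    table = [cmath.exp(-2j * cmath.pi * i / N) for i in range(N)]
    window = deque([0.0] * N, maxlen=N)   # holds x(n-N) .. x(n-1)
    acc = 0j
    out = []
    for n, x in enumerate(samples):
        oldest = window[0]                # x(n-N), zero while the window fills
        window.append(x)
        # Feedback coefficient is exactly 1: no twiddle factor in the loop.
        acc += table[(k * n) % N] * (x - oldest)
        # Demodulate with e^{+j*2*pi*k*(n+1)/N} to get the usual DFT phase.
        out.append(acc * table[(k * (n + 1)) % N].conjugate())
    return out
```

For a window of N = 8 samples, `msdft_bin(xs, 8, 3)[-1]` should agree with bin 3 of a direct DFT of the last eight samples of `xs`, up to floating-point rounding.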


Journal ArticleDOI
TL;DR: This work proposes a nonequispaced hyperbolic cross FFT based on one hyperbolic cross FFT and a dedicated interpolation by splines on sparse grids and allows for the efficient evaluation of trigonometric polynomials with Fourier coefficients supported on the hyperbolic cross at arbitrary spatial sampling nodes.
Abstract: A straightforward discretization of problems in $d$ spatial dimensions often leads to an exponential growth in the number of degrees of freedom. Thus, even efficient algorithms like the fast Fourier transform (FFT) have high computational costs. Hyperbolic cross approximations allow for a severe decrease in the number of used Fourier coefficients to represent functions with bounded mixed derivatives. We propose a nonequispaced hyperbolic cross FFT based on one hyperbolic cross FFT and a dedicated interpolation by splines on sparse grids. Analogously to the nonequispaced FFT for trigonometric polynomials with Fourier coefficients supported on the full grid, this allows for the efficient evaluation of trigonometric polynomials with Fourier coefficients supported on the hyperbolic cross at arbitrary spatial sampling nodes.

61 citations


Book ChapterDOI
12 Sep 2010
TL;DR: This work studies two complex use-cases: (1) Fast Fourier Transformation where a local memory transpose is expressed as part of the datatype, and (2) a conjugate gradient solver with a checkerboard layout that requires multiple nested datatypes.
Abstract: Many parallel applications need to communicate noncontiguous data. Most applications manually copy (pack/unpack) data before communications even though MPI allows a zero-copy specification. In this work, we study two complex use-cases: (1) Fast Fourier Transformation where we express a local memory transpose as part of the datatype, and (2) a conjugate gradient solver with a checkerboard layout that requires multiple nested datatypes. We demonstrate significant speedups up to a factor of 3.8 and 18%, respectively, in both cases. Our work can be used as a template to utilize datatypes for application developers. For MPI implementers, we show two practically relevant access patterns that deserve special optimization.

50 citations


Proceedings ArticleDOI
25 Feb 2010
TL;DR: A radix-2 pipelined FFT processor based on Field Programmable Gate Array (FPGA) for Wireless Local Area Networks (WLAN) is proposed, which can be completely implemented within only 67 clock cycles.
Abstract: It is important to develop a high-performance FFT processor to meet the requirements of real time and low cost in many different systems. So a radix-2 pipelined FFT processor based on Field Programmable Gate Array (FPGA) for Wireless Local Area Networks (WLAN) is proposed. Unlike being stored in the traditional ROM, the twiddle factors in our pipelined FFT processor can be accessed directly. A novel simple address mapping scheme is also proposed. The FFT processor has two pipelines, one is in the execution of complex multiplication of the butterfly unit, and the other is between the RAM modules, which read input data, store temporary variables of butterfly unit and output the final results. Finally, the pipelined 64-point FFT processor can be completely implemented within only 67 clock cycles.

25 citations
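
As background for the entry above, the following sketch shows where twiddle factors enter a radix-2 FFT. It is a plain recursive decimation-in-time transform with a precomputed table. The paper's direct-access scheme, address mapping and pipelining are not described in the abstract and are not modeled here. The names `twiddles` and `fft_radix2` are hypothetical.

```python
import cmath

def twiddles(N):
    """Twiddle factors W_N^k = exp(-2*pi*j*k/N) for k = 0 .. N/2 - 1."""
    return [cmath.exp(-2j * cmath.pi * k / N) for k in range(N // 2)]

def fft_radix2(x, W=None):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if W is None:
        W = twiddles(N)
    # A half-size transform needs W_{N/2}^k = W_N^{2k}: every second table entry.
    even = fft_radix2(x[0::2], W[0::2])
    odd = fft_radix2(x[1::2], W[0::2])
    out = [0j] * N
    for k in range(N // 2):
        t = W[k] * odd[k]              # one twiddle multiplication per butterfly
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out
```

A 64-point transform needs only the 32 distinct factors returned by `twiddles(64)`; all smaller stages reuse strided views of the same table.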


Proceedings ArticleDOI
03 Aug 2010
TL;DR: The proposed memory-reduced CORDIC algorithm eliminates the need for storing twiddle factors and angles, resulting in significant area savings with no negative impact on performance.
Abstract: In this paper, a new pipelined, reduced memory CORDIC-based architecture is presented for any radix size FFT. A multi-bank memory structure and the corresponding addressing scheme are used to realize the parallel and in-place data accesses. The proposed memory-reduced CORDIC algorithm eliminates the need for storing twiddle factors and angles, resulting in significant area savings with no negative impact on performance. As a case study, the radix-2 and radix-4 FFT algorithms have been implemented on FPGA hardware. The synthesis results match the theoretical analysis and it can be observed that more than 20% reduction can be achieved in total memory logic.

25 citations
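
To make the CORDIC idea concrete, here is a sketch of a textbook rotation-mode CORDIC used in place of a twiddle factor multiplication. It is an assumption-laden illustration: unlike the memory-reduced algorithm in the abstract, this standard version still keeps a small arctangent table (`ATAN`), and it uses floating point rather than fixed-point shifts. All names are hypothetical.

```python
import math

ITER = 24
# Elementary rotation angles atan(2^-i); the paper's variant avoids storing these.
ATAN = [math.atan(2.0 ** -i) for i in range(ITER)]
# Constant gain accumulated by the micro-rotations.
GAIN = 1.0
for i in range(ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_rotate(x, y, theta):
    """Rotate the vector (x, y) by theta radians using CORDIC micro-rotations."""
    # Wrap the angle into [-pi, pi), then into [-pi/2, pi/2] with a 180-degree flip.
    theta = (theta + math.pi) % (2.0 * math.pi) - math.pi
    if theta > math.pi / 2:
        x, y, theta = -x, -y, theta - math.pi
    elif theta < -math.pi / 2:
        x, y, theta = -x, -y, theta + math.pi
    for i in range(ITER):
        d = 1.0 if theta >= 0 else -1.0
        # In hardware the multiplications by 2^-i are arithmetic shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        theta -= d * ATAN[i]
    return x / GAIN, y / GAIN

def twiddle_multiply(z, k, N):
    """Compute z * W_N^k, with W_N = exp(-2*pi*j/N), without a stored twiddle table."""
    re, im = cordic_rotate(z.real, z.imag, -2.0 * math.pi * k / N)
    return complex(re, im)
```

With 24 iterations the result matches a direct complex multiplication to roughly seven decimal digits; a hardware design would pick the iteration count to match its word length.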


Journal ArticleDOI
TL;DR: Low-cost, fast-computational, power-efficient, and reconfigurable design for recursive discrete Fourier transform (RDFT), the first integration that collated both the prime factor algorithm and the Chinese remainder theorem into a recursive algorithm is proposed.
Abstract: Low-cost, fast-computational, power-efficient, and reconfigurable design for recursive discrete Fourier transform (RDFT) is proposed in this brief. The proposed method is the first integration that collated both the prime factor algorithm (PFA) and the Chinese remainder theorem (CRT) into a recursive algorithm. Hence, a multicycle RDFT algorithm (PFA + CRT + RDFT) and its hardware implementation are produced and presented here in great detail. Compared with some well-known recursive algorithms, the significant improvements for the proposed algorithm can be summarized as follows: 1) The number of computational cycles of the proposed algorithm can be saved by up to 88.5%; 2) The number of multiplications and additions for the proposed algorithm is dramatically reduced by up to 85.2% and 85.2%, respectively; 3) The amount of coefficient read-only memory for storing the twiddle factors totally takes 694 words fewer than those of other existing RDFT algorithms; 4) The hardware cost of the proposed algorithm only takes four real multipliers and eight real adders. This design is more suitable for digital radio mondiale (DRM) systems, such as coded orthogonal frequency-division-multiplexing modulation. The proposed RDFT algorithm was designed and fabricated using a 0.18-μm 1P6M CMOS process. The core area is 521 × 508 μm^2, and this hardware accelerator only consumes 8.44 mW at 25 MHz. Furthermore, the performance index of power for this design is three times discrete Fourier transform (DFT) per energy of previous work. Additionally, it can calculate the 288/256/176/112-point DFTs for a portable DRM receiver.

19 citations
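
The PFA and CRT index mappings mentioned in the abstract above can be sketched as follows. This shows only the generic Good-Thomas mapping for a length N = N1*N2 with coprime factors, under which no twiddle factors are needed between the two stages. The paper's recursive (RDFT) kernels and multicycle hardware are not modeled, and `dft` and `pfa_dft` are hypothetical names. The modular inverse via `pow(a, -1, m)` requires Python 3.8 or later.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used here as the short-length kernel."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def pfa_dft(x, N1, N2):
    """DFT of length N1*N2 for coprime N1, N2 using prime-factor index maps."""
    N = N1 * N2
    assert len(x) == N
    # Input map: n = (N2*n1 + N1*n2) mod N.
    grid = [[x[(N2 * n1 + N1 * n2) % N] for n2 in range(N2)] for n1 in range(N1)]
    # Length-N2 DFTs along n2, then length-N1 DFTs along n1; no twiddles between.
    rows = [dft(r) for r in grid]                               # rows[n1][k2]
    cols = [dft([rows[n1][k2] for n1 in range(N1)]) for k2 in range(N2)]
    # Output map from the CRT: k = k1 (mod N1) and k = k2 (mod N2).
    u1 = N2 * pow(N2, -1, N1)   # equals 1 mod N1 and 0 mod N2
    u2 = N1 * pow(N1, -1, N2)   # equals 0 mod N1 and 1 mod N2
    X = [0j] * N
    for k1 in range(N1):
        for k2 in range(N2):
            X[(k1 * u1 + k2 * u2) % N] = cols[k2][k1]
    return X
```

For example, the 112-point size listed in the abstract factors as 7 × 16, so `pfa_dft(x, 7, 16)` should reproduce a direct 112-point DFT.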


Journal ArticleDOI
TL;DR: Two known FFT algorithms, G and GT, are investigated that give an asymptotic reduction in the number of twiddle factor loads required for depth-first recursions but that also allow for aggressive vectorization and easier optimization of trivial twiddle factor multiplies.
Abstract: Optimizing the number of arithmetic operations required in fast Fourier transform (FFT) algorithms has been the focus of extensive research, but memory management is of comparable importance on modern processors. In this article, we investigate two known FFT algorithms, G and GT, that are similar to Cooley-Tukey decimation-in-time and decimation-in-frequency FFT algorithms but that give an asymptotic reduction in the number of twiddle factor loads required for depth-first recursions. The algorithms also allow for aggressive vectorization (even for non-power-of-2 orders) and easier optimization of trivial twiddle factor multiplies. We benchmark G and GT implementations with comparable Cooley-Tukey implementations on commodity hardware. In a comparison designed to isolate the effect of twiddle factor access optimization, these benchmarks show typical speedups ranging from 10% to 65%, depending on transform order, precision, and vectorization. A more heavily optimized implementation of GT yields substantial performance improvements over the widely used code FFTW for many transform orders. The twiddle factor access optimization technique can be generalized to other common FFT algorithms, including real-data FFTs, split-radix FFTs, and multidimensional FFTs.

16 citations


Patent
Feng Liu, He Chen, Teng Long, Dazhi Zeng, Wei Liu 
01 Dec 2010
TL;DR: In this paper, the authors proposed a pipeline FFT processor variable in the number of points, which comprises a first 1024-point variable FFT processing module, a twiddle factor processing module and a selection control module.
Abstract: The invention provides a pipeline FFT processor variable in the number of points, which comprises a first 1024-point variable FFT processing module, a twiddle-factor processing module, a second 1024-point variable FFT processing module and a selection-control module, wherein the four modules and an intermediate data storage module outside the processor jointly complete FFT two-dimensional processing large in the number of points; each 1024-point variable FFT processing module comprises a first 32-point variable FFT processing sub-module, a second 32-point variable FFT processing sub-module, a twiddle-factor processing sub-module, an intermediate data storage sub-module and a selection-control sub-module; variable-point FFT operation is implemented through the 32-point variable FFT processing sub-modules; the twiddle-factor processing module generates and multiplies intermediate twiddle factors by the result of FFT operation; and the selection-control module realizes control over a whole chip. The processor is suitable to be realized in single-chip FPGA or ASIC, and can simultaneously obtain high speed, low power consumption, high precision and other characteristics.

14 citations


Proceedings ArticleDOI
18 Oct 2010
TL;DR: It is shown that there is a trade-off between twiddle factor memory complexity and switching activity in the introduced algorithms.
Abstract: In this paper, we propose higher point FFT (fast Fourier transform) algorithms for a single delay feedback pipelined FFT architecture considering the 4096-point FFT. These algorithms are different from each other in terms of twiddle factor multiplication. Twiddle factor multiplication complexity comparison is presented when implemented on Field-Programmable Gate Arrays (FPGAs) for all proposed algorithms. We also discuss the design criteria of the twiddle factor multiplication. Finally it is shown that there is a trade-off between twiddle factor memory complexity and switching activity in the introduced algorithms.

14 citations


Journal ArticleDOI
TL;DR: This paper presents a fast split-radix-(2×2)/(8×8) algorithm for computing the 2-D discrete Hartley transform (DHT) of length N × N with N = q · 2^m, where q is an odd integer.
Abstract: This paper presents a fast split-radix-(2×2)/(8×8) algorithm for computing the 2-D discrete Hartley transform (DHT) of length N × N with N = q · 2^m, where q is an odd integer. The proposed algorithm decomposes an N × N DHT into one N/2 × N/2 DHT and 48 N/8 × N/8 DHTs. It achieves an efficient reduction on the number of arithmetic operations, data transfers and twiddle factors compared to the split-radix-(2×2)/(4×4) algorithm. Moreover, the characteristic of expression in simple matrices leads to an easy implementation of the algorithm. If implementing the above two algorithms with fully parallel structure in hardware, it seems that the proposed algorithm can decrease the area complexity compared to the split-radix-(2×2)/(4×4) algorithm, but requires a little more time complexity. An application of the proposed algorithm to 2-D medical image compression is also provided.

13 citations


Proceedings ArticleDOI
03 Aug 2010
TL;DR: This paper proposes equivalent radix-2^2 algorithms and evaluates them based on twiddle factor switching activity for a single delay feedback pipelined FFT architecture and shows that the twiddle factor switching activity of the equivalent algorithms is reduced by up to 40%.
Abstract: In this paper, we propose equivalent radix-2^2 algorithms and evaluate them based on twiddle factor switching activity for a single delay feedback pipelined FFT architecture. These equivalent pipeline FFT algorithms have the same number of complex multipliers with the same resolution as the radix-2^2. It is shown that the twiddle factor switching activity of the equivalent algorithms is reduced by up to 40% for some of the equivalent algorithms derived for N = 256.

Proceedings ArticleDOI
22 Jun 2010
TL;DR: The simulation tests and an implementation of a 1024-point FFT targeted on an XC5VSX50T FPGA show that the proposed FFT processor has a high computational frequency and is hence suitable for usual OFDM wireless applications.
Abstract: Based on the complexity and hardware requirements of conventional FFT algorithms, this paper analyses the radix-4 Single-Path Delay Feedback (SDF) architecture in DIF form and presents an efficient FFT processor for real-time applications. A test bench is built to compare the Signal to Quantization Noise Ratio (SQNR) performance of the processors using a floating-point model and a fixed-point one. Several values of I/O word length and twiddle factor word length are implemented to investigate the quantization effect of fixed-point arithmetic with limited precision, derived from rounding or truncation errors, and to enhance the output performance. The simulation tests and an implementation of a 1024-point FFT targeted on an XC5VSX50T FPGA show that the proposed FFT processor has a high computational frequency and is hence suitable for usual OFDM wireless applications.
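
The effect of twiddle factor word length on SQNR, as studied in the entry above, can be explored with a small experiment. This sketch makes several simplifying assumptions: it uses a recursive radix-2 FFT rather than the paper's radix-4 SDF pipeline, it quantizes only the twiddle factors while the data path stays in floating point, and it ignores saturation of the value +1.0. The helper names are hypothetical.

```python
import cmath
import math

def quantize(c, bits):
    """Round real and imaginary parts to a grid with (bits - 1) fractional bits."""
    scale = 2 ** (bits - 1)
    return complex(round(c.real * scale) / scale, round(c.imag * scale) / scale)

def fft(x, bits=None):
    """Radix-2 DIT FFT; if bits is given, twiddle factors use that word length."""
    N = len(x)
    if N == 1:
        return list(x)
    even, odd = fft(x[0::2], bits), fft(x[1::2], bits)
    out = [0j] * N
    for k in range(N // 2):
        w = cmath.exp(-2j * cmath.pi * k / N)
        if bits is not None:
            w = quantize(w, bits)
        t = w * odd[k]
        out[k], out[k + N // 2] = even[k] + t, even[k] - t
    return out

def sqnr_db(x, bits):
    """SQNR in dB of the quantized-twiddle FFT against the floating-point FFT."""
    ref, test = fft(x), fft(x, bits)
    signal = sum(abs(r) ** 2 for r in ref)
    noise = sum(abs(r - t) ** 2 for r, t in zip(ref, test))
    return 10.0 * math.log10(signal / noise)
```

Sweeping `bits` over a range such as 8 to 16 for a fixed test signal gives a curve of SQNR against twiddle word length, which is the kind of comparison the abstract describes.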

Proceedings ArticleDOI
03 Aug 2010
TL;DR: An analysis and comparison of the computational complexity of the split-radix FFT pruning algorithm and other algorithms shows that the proposed method is more efficient than conventional algorithms.
Abstract: OFDM-based cognitive radio, where zero-valued inputs/outputs outnumber nonzero ones, calls for an efficient FFT algorithm that reduces the computational complexity of the twiddle factor multiplications in the butterfly structure. Transform decomposition is currently considered a more suitable candidate than FFT pruning for OFDM-based cognitive radio, because its hardware design copes with irregularly positioned zero inputs/outputs. However, by introducing an efficient control circuit for the pruning matrix, which selects the multiplication branches corresponding to nonzero outputs, a split-radix FFT pruning algorithm can be proposed that reduces computational complexity further. An analysis and comparison of the computational complexity of the split-radix FFT pruning algorithm and other algorithms shows that the proposed method is more efficient than conventional algorithms.

Patent
Andrew Whyte
25 Jan 2010
TL;DR: An efficient circuit and method for performing radix-3 Discrete Fourier transform (DFT) of a 3*2^M size data frame are provided in this article, where the data frame is split and fast Fourier transform (FFT) processed as three sub-frames.
Abstract: An efficient circuit and method for performing radix-3 Discrete Fourier transform (DFT) of a 3*2^M size data frame are provided. The data frame is split and fast Fourier transform (FFT) processed as three sub-frames. Radix-3 operations are performed on the FFT processed sub-frames over a number of stages with time shared hardware to compute the DFT of the data-frame. FFT operations are performed on the second and third sub-frames to produce respective sub-transforms. Concurrently with FFT processing of the first sub-frame, butterfly operations are performed on the sub-transforms of the second and third sub-frames. Through the use of time-shared hardware and arranging FFT operations to correspond with radix-3 operations at various stages of processing, the DFT is performed with existing FFT processors while reducing resource requirements and/or reducing DFT transform time over the full-parallel radix-3 implementation.
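
The arithmetic behind splitting a frame of length 3*2^M into three sub-frames can be sketched as below. This assumes a decimation-in-time split (every third sample per sub-frame) followed by a radix-3 combination; the patent's time-shared scheduling and staging are not modeled, and the function names are hypothetical.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 FFT for power-of-two lengths."""
    N = len(x)
    if N == 1:
        return list(x)
    even, odd = fft_radix2(x[0::2]), fft_radix2(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k], out[k + N // 2] = even[k] + t, even[k] - t
    return out

def dft_3x2m(x):
    """DFT of a frame of length N = 3 * 2**M via three length-2**M FFTs."""
    N = len(x)
    L = N // 3
    assert N == 3 * L and L > 0 and (L & (L - 1)) == 0
    # Sub-frame r holds samples x[r], x[r+3], x[r+6], ...
    F = [fft_radix2(x[r::3]) for r in range(3)]
    X = [0j] * N
    for k in range(L):
        for q in range(3):
            kk = k + q * L
            w1 = cmath.exp(-2j * cmath.pi * kk / N)   # twiddle factor W_N^kk
            # Radix-3 combination: X[kk] = F0[k] + W^kk * F1[k] + W^(2kk) * F2[k]
            X[kk] = F[0][k] + w1 * F[1][k] + w1 * w1 * F[2][k]
    return X
```

A 48-point frame (3 * 2^4) is a convenient size for checking the result against a direct DFT.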

Patent
Stig Halvarsson
14 Jan 2010
TL;DR: In this paper, the authors propose a method for storing N number of Fast Fourier Transform (FFT) data points into x-memories, N and x being integers greater than one.
Abstract: A method may include storing N number of Fast Fourier Transform (FFT) data points into x-memories, N and x being integers greater than one, and the x-memories having a total memory capacity equivalent to store the N number of FFT data points, and reading K FFT data points of the N number of FFT data points from each of the x-memories so that the N number of FFT data points are read, K being an integer greater than one. The method may further include performing parallel radix-m FFTs on the x*K number of FFT data points, multiplying the x*K number of FFT data points by twiddle factors to obtain resultants, shifting the resultants, and writing back the shifted resultants of the x*K number of FFT data points to the x-memories. The method may also include repeating the reading, the multiplying, the shifting and the writing back until the N number of FFT data points have been completely transformed into an FFT resultant, and where there is x*K number of FFT data points available for processing during every repetition, and outputting the FFT resultant.

Patent
30 Jun 2010
TL;DR: An N-point FFT/IFFT device is presented, which comprises an mi-point FFT/IFFT unit, a phase rotation factor unit and a complex multiplier.
Abstract: The invention discloses an N-point FFT/IFFT device which comprises an mi-point FFT/IFFT unit, a phase rotation factor unit and a complex multiplier, wherein the mi-point FFT/IFFT unit comprises an ri1-point FFT/IFFT unit and an ri2-point FFT/IFFT unit which are connected in series; the phase rotation factor unit is connected with the complex multiplier and generates and stores twiddle factors; and the complex multiplier is used for weighting the output of an mi-point FFT/IFFT arithmetic unit by using the twiddle factors. The invention also discloses an N-point FFT/IFFT method, which comprises the following steps of: (1) dividing N-point FFT/IFFT into the FFT/IFFT of points of m1, m2 to mK, wherein mi is equal to the product of ri1 and ri2 and is the base number of each stage of FFT/IFFT; (2) cascading ri1-point FFT/IFFT unit and ri2-point FFT/IFFT together to realize mi-point DFT/IDFT; and (3) cascading the mi-point DFT/IDFT together to obtain N-point DFT/IDFT. The invention can realize the operation of more than two times in N sampling periods, enhances the analyzing and processing speed of signals, adopts the multiplexing of memories and enhances the utilization rate of the memories in unit.

Proceedings ArticleDOI
06 Dec 2010
TL;DR: This paper studies how an efficient FFT algorithm is implemented on the basis of Field Programmable Gate Array (FPGA) and indicates that the calculation can reach equivalent precision and the system performs satisfactorily.
Abstract: Fast Fourier Transform (FFT) is an efficient algorithm to compute the Discrete Fourier Transform (DFT). In many applications the input data are purely real-time, and an efficient FFT can satisfy the situation. The FFT algorithm based on a complex sequence is an improvement on the basic FFT. This paper studies how an efficient FFT algorithm is implemented on the basis of Field Programmable Gate Array (FPGA). When processing data of the same sequence length, such as voltage or image data, the algorithm put forward in this paper can in theory halve the computation time. The simulation indicates that the calculation can reach equivalent precision and the system performs satisfactorily. The method can be applied well in many real-time systems and image processing.

Proceedings ArticleDOI
25 Jun 2010
TL;DR: A novel 3780-point FFT algorithm for the China national broadcasting standard which adopts the pure PFA (prime factor algorithm) and nested Winograd FFT which can realize the identity in the circuit structure is proposed.
Abstract: This paper proposes a novel 3780-point FFT algorithm for the China national broadcasting standard. This new algorithm adopts the pure PFA (prime factor algorithm) and nested Winograd FFT which can realize the identity in the circuit structure. The simulation demonstrates that the new algorithm achieves accuracy equal to that of other methods with the fewest multiplications.

Journal Article
Wang Yang
TL;DR: Simulation results show that the design can satisfy the need of the FFT/IFFT for different bandwidths in a high-speed LTE system.
Abstract: A design of the variable-point FFT/IFFT for non-2^n is proposed in consideration of the characteristics of the variable-length and non-2^n subcarriers in the 3GPP LTE system. A high speed and configurable FFT/IFFT is designed by using the mixed radix with radix-2, radix-3, and radix-4 based on the pipelined ping-pong architecture. It will save about 100 kbits storage space since all the twiddle factors are stored in the same space and RAM is shared by I/O data after optimizing the address generation unit. Simulation results show that the design can satisfy the need of the FFT/IFFT for different bandwidths in a high-speed LTE system.

01 Jan 2010
TL;DR: A design of 4096-point radix-8 FFT is implemented on Field-Programmable Gate Array (FPGA) and a novel method to generate twiddle factors is proposed and compared with the traditional method.
Abstract: A design of 4096-point radix-8 FFT is implemented on Field-Programmable Gate Array (FPGA). Traditional radix-2 and radix-4 FFT processors could not satisfy the requirements of modern high-speed digital signal processing, so the radix-8 shared-memory architecture is used at the top-level. The butterfly module, the twiddle factor generation module, the input-output interface module are analyzed and optimized. A novel method to generate twiddle factors is proposed and compared with the traditional method. The pipeline style design increases the computing speed and decreases the FPGA resource utilization. Simulation verification is done and the result is compared with that of Matlab fixed-point model. The design is finally programmed to an Altera EP2S60F672I4 device and is verified with the help of a digital signal processor. The computing results with various input patterns are retrieved to Matlab and compared with the fixed-point model bit by bit. Under the clock frequency of 100 MHz, the design takes 2.048μs to finish 4096-point radix-8 FFT, so it can meet the requirement of high speed digital signal processing.

Proceedings Article
01 Aug 2010
TL;DR: This paper shows that the fixed point implementation of ‘real-factor’ FFT can be modified so that its noise-to-signal ratio (NSR) is lower than the NSR of Cooley-Tukey radix-2 FFT.
Abstract: In this paper we show that Rader and Brenner's ‘real-factor’ FFT can be streamlined so that it requires lower computational complexity as compared to the Cooley-Tukey radix-2 FFT. We then show that the fixed point implementation of ‘real-factor’ FFT can be modified so that its noise-to-signal ratio (NSR) is lower than the NSR of Cooley-Tukey radix-2 FFT. Finally simulation results are presented which verify the suitability of ‘real-factor’ FFTs.

Proceedings ArticleDOI
16 Aug 2010
TL;DR: It is shown that FFT signal processing limitations can be overcome by performing the FFT in the optical domain and the power of the technique is demonstrated by performing the FFT on a single source 10 Tbit/s OFDM signal.
Abstract: The discrete Fourier Transform (DFT) or its more popular efficient algorithm, the Fast Fourier Transform (FFT) is one of the important mathematical functions in the toolbox of signal processing. To this date its processing speed is limited by the underlying electronic processors. We show that FFT signal processing limitations can be overcome by performing the FFT in the optical domain and demonstrate the power of the technique by performing the FFT on a single source 10 Tbit/s OFDM signal.

Proceedings ArticleDOI
01 Dec 2010
TL;DR: A comparison with the existing 2-D vector-radix FFT algorithms has shown that the presented algorithm can be considered as a good compromise between the structural and computational complexities.
Abstract: In this paper, a new decimation-in-time vector-radix-2^2×2^2 fast Fourier transform (VR-2^2×2^2-FFT) algorithm for computing the two dimensional discrete Fourier transform (2-D DFT) is presented. The algorithm is derived by applying a two-stage decomposition approach and by introducing an efficient technique for grouping the twiddle factors. The arithmetic complexity of the proposed algorithm is analyzed and the number of real multiplications and additions are computed for different transform sizes. Moreover, a comparison with the existing 2-D vector-radix FFT algorithms has shown that the presented algorithm can be considered as a good compromise between the structural and computational complexities.

Proceedings ArticleDOI
09 Feb 2010
TL;DR: A high level implementation of a high performance FFT for OFDM Modulator and Demodulator with Radix-2^2 Algorithm, which has been coded in Verilog and targeted into Xilinx Spartan3 FPGAs.
Abstract: In the widely used OFDM (Orthogonal Frequency Division Multiplexing) systems, the FFT and IFFT pair are used to modulate and demodulate the data constellation on the subcarriers. This paper presents a high level implementation of a high performance FFT for OFDM Modulator and Demodulator. The design has been coded in Verilog and targeted into Xilinx Spartan3 FPGAs. Radix-2^2 Algorithm is proposed and used for the OFDM communication system. This algorithm has the same multiplicative complexity as the radix-4 algorithm, but retains the butterfly structure of radix-2 algorithm. The design is parameterizable in terms of input word length, output word length, twiddle factor word length, and processing word length. Also, it is scalable in terms of number of stages.

Patent
28 Jul 2010
TL;DR: In this article, a demodulator based on orthogonal frequency division multiplexing (OFDM) is used in a DRM receiver, where the demodulators are provided with a controller which controls the work uniformly, and the controller is connected with an on-chip memory cell, a reordering unit, a calculator, a twiddle factor unit and a butterfly unit.
Abstract: The invention discloses a demodulator which is based on orthogonal frequency division multiplexing and is used in a DRM receiver. The demodulator is provided with a controller which controls the work uniformly, and the controller is connected with an on-chip memory cell, a reordering unit, a calculator, a twiddle factor unit and a butterfly unit; the controller reorders the source operands that are sequentially input by the reordering unit according to an FFT arithmetic type and a mixed basis decomposition, and stores the reordered operands in the on-chip memory cell; the controller reads the operands from the on-chip memory cell and obtains a twiddle factor by computation by the twiddle factor unit according to the mixed basis decomposition; then the butterfly unit performs butterfly computation, and finally the obtained operands are written back to the on-chip memory cell. The demodulator has the advantages of meeting the DRM standard and supporting four OFDM parameters in terms of design; compared with the implementation of the existing software, the demodulator has low cost and low power consumption, uses 32-bit single precision floating-point number to describe the internal data at high precision, and can implement high-precision OFDM demodulation.

Proceedings ArticleDOI
19 Nov 2010
TL;DR: Balanced Binary Tree Decomposition based FFT/IFFT processor is proposed for WI-MAX (IEEE 802.16e standard) and uses very few multipliers compared to conventional pipeline based design using Radix-2 Method which gives very good throughput yet highly area efficient design.
Abstract: Balanced Binary Tree Decomposition (BBTD) based FFT/IFFT processor is proposed for WI-MAX (IEEE 802.16e standard). BBTD algorithm is used for implementation purpose. It uses very few multipliers compared to conventional pipeline based design using Radix-2 Method. Hence this approach gives very good throughput yet highly area efficient design. BBTD algorithm reduces complex multiplier by 33% and twiddle factor by half. Architecture also provides concept of local ROM module, optimized complex multiplier and variable length support from 256-2048 point for FFT/IFFT. Its core size is 3.89 mm^2 with 51.25 μs execution time. Design has been implemented using 0.18 μm CMOS technology standard cell library. The chip size including memories is 3.89 mm^2 with 51.25 μs execution time which satisfies requirement of WI-MAX standard. Total computation is reduced because of less multipliers. This leads to low power architecture. Proposed architecture consumes only 46.57 mW power at 40 MHz which is good for battery operated devices.

Proceedings ArticleDOI
21 Jun 2010
TL;DR: A real-time OFDM transmitter architecture for 12.1 Gb/s bitrate has been optimized for optimum performance at lowest DSP complexity and achieved 22.3 dB in Q-factor for 7-bit IFFTs twiddle factors and 6-bit resolution of DAC.
Abstract: A real-time OFDM transmitter architecture for 12.1 Gb/s bitrate has been optimized for optimum performance at lowest DSP complexity. We achieved 22.3 dB in Q-factor for 7 bit IFFTs twiddle factors and 6 bit resolution of DAC.

Journal Article
TL;DR: Performance analysis indicates that the revised twiddle factor generation method needs far fewer storage resources than former methods, and that relocating seven times increases twiddle factor precision by about 16 dB compared with the method without relocation.

Proceedings ArticleDOI
14 Jul 2010
TL;DR: The development of a software tool for simulating and generating fully parallel generic VHDL representations of Fast Fourier Transforms and several fixed-point number optimizations are described with emphasis on maximizing speed and/or minimizing FPGA area.
Abstract: This paper describes the development of a software tool for simulating and generating fully parallel generic VHDL representations of Fast Fourier Transforms. Several fixed-point number optimizations are described with emphasis on maximizing speed and/or minimizing FPGA area. Twiddle factor bit precision and its effects on FPGA area usage are also explored.

Proceedings ArticleDOI
01 Dec 2010
TL;DR: A cost-efficient IFFT/FFT processor with its fixed-point analysis is presented for wireless orthogonal frequency division multiplexing (OFDM) system and the nonzero bits of twiddle factors (TW) are minimized to reduce the structure.
Abstract: In this paper, a cost-efficient IFFT/FFT processor with its fixed-point analysis is presented for wireless orthogonal frequency division multiplexing (OFDM) system. The IFFT/FFT processor is a multiplierless architecture, and the nonzero bits of twiddle factors (TW) are minimized to reduce the structure (or the number of the hardwired adder) by the proposed classification of TWs and hardware sharing. On the other hand, the minimum wordlength (WL) of the TWs and input signals can be determined by the fixed-point analysis. The proposed IFFT/FFT processor can achieve packet error rate (PER) < 0.1 for the test vehicle IEEE 802.11a single-input-single-output (SISO)-OFDM system and slight symbol error rate (SER) loss for the IEEE 802.11n multi-input-multi-output (MIMO)-OFDM system. The core area of the one processor chip is 0.57μm×0.565μm with 0.18μm CMOS process. Besides, the power consumption is 7.74 mW with 1.8 V supply voltage and 40 MHz system clock.