Showing papers on "Fast Fourier transform" published in 2005


Journal ArticleDOI
24 Jan 2005
TL;DR: It is shown that such an approach can yield an implementation of the discrete Fourier transform that is competitive with hand-optimized libraries, and the software structure that makes the current FFTW3 version flexible and adaptive is described.
Abstract: FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific single-instruction, multiple-data (SIMD) instructions, and how a special-purpose compiler can derive optimized implementations of the discrete cosine and sine transforms automatically from a DFT algorithm.

5,172 citations
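For orientation, the DFT that libraries such as FFTW compute can be written as X[k] = sum over n of x[n]·exp(-2πi·n·k/N). The sketch below is not FFTW's code or its API. It is a minimal pure-Python illustration, with the hypothetical helper names `naive_dft` and `fft_radix2`, that compares the direct O(N²) sum with a textbook recursive radix-2 Cooley-Tukey FFT, assuming the input length is a power of two.

```python
import cmath

def naive_dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2*pi*i*n*k/N)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd part
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 3.0]
slow = naive_dft(signal)
fast = fft_radix2(signal)
max_err = max(abs(a - b) for a, b in zip(slow, fast))
```

Both routines return the same spectrum up to rounding; an adaptive library chooses among many such decompositions at run time rather than fixing one.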


Journal ArticleDOI
27 Jun 2005
TL;DR: SPIRAL generates high-performance code for a broad set of DSP transforms, including the discrete Fourier transform, other trigonometric transforms, filter transforms, and discrete wavelet transforms.
Abstract: Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance? We present SPIRAL, which considers this problem for the performance-critical domain of linear digital signal processing (DSP) transforms. For a specified transform, SPIRAL automatically generates high-performance code that is tuned to the given platform. SPIRAL formulates the tuning as an optimization problem and exploits the domain-specific mathematical structure of transform algorithms to implement a feedback-driven optimizer. Similar to a human expert, for a specified transform, SPIRAL "intelligently" generates and explores algorithmic and implementation choices to find the best match to the computer's microarchitecture. The "intelligence" is provided by search and learning techniques that exploit the structure of the algorithm and implementation space to guide the exploration and optimization. SPIRAL generates high-performance code for a broad set of DSP transforms, including the discrete Fourier transform, other trigonometric transforms, filter transforms, and discrete wavelet transforms. Experimental results show that the code generated by SPIRAL competes with, and sometimes outperforms, the best available human tuned transform library code.

853 citations


Journal ArticleDOI
03 Jan 2005
TL;DR: New subthreshold logic and memory design methodologies are developed and demonstrated on a fast Fourier transform (FFT) processor that is designed to investigate the estimated minimum energy point.
Abstract: In emerging embedded applications such as wireless sensor networks, the key metric is minimizing energy dissipation rather than processor speed. Minimum energy analysis of CMOS circuits estimates the optimal operating point of clock frequencies, supply voltage, and threshold voltage according to A. Chandrakasan et al. (see ibid., vol.27, no.4, p.473-84, Apr. 1992). The minimum energy analysis shows that the optimal power supply typically occurs in subthreshold (e.g., supply voltages that are below device thresholds). New subthreshold logic and memory design methodologies are developed and demonstrated on a fast Fourier transform (FFT) processor. The FFT processor uses an energy-aware architecture that allows for variable FFT length (128-1024 point), variable bit-precision (8 b and 16 b) and is designed to investigate the estimated minimum energy point. The FFT processor is fabricated using a standard 0.18-μm CMOS logic process and operates down to 180 mV. The minimum energy point for the 16-b 1024-point FFT processor occurs at 350-mV supply voltage where it dissipates 155 nJ/FFT at a clock frequency of 10 kHz.

619 citations


Journal ArticleDOI
TL;DR: A simple, chip-based implementation of a double-beam interferometer that can separate biomolecules based on size and that can compensate for changes in matrix composition is introduced.
Abstract: A simple, chip-based implementation of a double-beam interferometer that can separate biomolecules based on size and that can compensate for changes in matrix composition is introduced. The interferometric biosensor uses a double-layer of porous Si comprised of a top layer with large pores and a bottom layer with smaller pores. The structure is shown to provide an on-chip reference channel analogous to a double-beam spectrometer, but where the reference and sample compartments are stacked one on top of the other. The reflectivity spectrum of this structure displays a complicated interference pattern whose individual components can be resolved by fitting of the reflectivity data to a simple interference model or by fast Fourier transform (FFT). Shifts of the FFT peaks indicate biomolecule penetration into the different layers. The small molecule, sucrose, penetrates into both porous Si layers, whereas the large protein, bovine serum albumin (BSA), only enters the large pores. BSA can be detected even in a large (100-fold by mass) excess of sucrose from the FFT spectrum. Detection can be accomplished either by computing the weighted difference in the frequencies of two peaks or by computing the ratio of the intensities of two peaks in the FFT spectrum.

348 citations


Journal ArticleDOI
TL;DR: This letter analyzes conventional clipping and filtering using a parabolic approximation of the clipping pulse and derives a new clipping and filtering technique that, with three FFT/IFFT operations, obtains the same PAR reduction as existing iterative techniques that need 2K+1 FFT/IFFT operations, where K is the number of iterations.
Abstract: The existing iterative clipping and filtering techniques require several iterations to mitigate the peak regrowth. In this letter, we analyze the conventional clipping and filtering using a parabolic approximation of the clipping pulse. We show that the clipping noise obtained after several clipping and filtering iterations is approximately proportional to that generated in the first iteration. Therefore, we scale the clipping noise generated in the first iteration to get a new clipping and filtering technique that, with three fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) operations, obtains the same PAR reduction as that of the existing iterative techniques with 2K+1 FFT/IFFT operations, where K represents the number of iterations.

240 citations


Journal ArticleDOI
TL;DR: A novel 128-point FFT/IFFT processor for ultrawideband (UWB) systems and the proposed pipelined FFT architecture, called mixed-radix multipath delay feedback (MRMDF), can provide a higher throughput rate by using the multidata-path scheme.
Abstract: In this paper, we present a novel 128-point FFT/IFFT processor for ultrawideband (UWB) systems. The proposed pipelined FFT architecture, called mixed-radix multipath delay feedback (MRMDF), can provide a higher throughput rate by using the multidata-path scheme. Furthermore, the hardware costs of memory and complex multipliers in MRMDF are only 38.9% and 44.8% of those in the known FFT processor by means of the delay feedback and the data scheduling approaches. The high-radix FFT algorithm is also realized in our processor to reduce the number of complex multiplications. A test chip for the UWB system has been designed and fabricated using 0.18-μm single-poly and six-metal CMOS process with a core area of 1.76 × 1.76 mm², including an FFT/IFFT processor and a test module. The throughput rate of this fabricated FFT processor is up to 1 Gsample/s while it consumes 175 mW. Power dissipation is 77.6 mW when its throughput rate meets UWB standard in which the FFT throughput rate is 409.6 Msample/s.

220 citations


Journal ArticleDOI
TL;DR: Digitally synthetic holograms of surface model objects are investigated for reconstructing three-dimensional objects with shade and texture, and another technique based on a theoretical model of the brightness of the reconstructed surfaces enables us to shade the surface of a reconstructed object as designed.
Abstract: Digitally synthetic holograms of surface model objects are investigated for reconstructing three-dimensional objects with shade and texture. The objects in the proposed techniques are composed of planar surfaces, and a property function defined for each surface provides shape and texture. The field emitted from each surface is independently calculated by a method based on rotational transformation of the property function by use of a fast Fourier transform (FFT) and totaled on the hologram. This technique has led to a reduction in computational cost: FFT operation is required only once for calculating a surface. In addition, another technique based on a theoretical model of the brightness of the reconstructed surfaces enables us to shade the surface of a reconstructed object as designed. Optical reconstructions of holograms synthesized by the proposed techniques are demonstrated.

210 citations


Journal ArticleDOI
TL;DR: An energy-balanced allocation of a real-time application onto a single-hop cluster of homogeneous sensor nodes connected with multiple wireless channels is proposed; incorporating techniques for exploring the energy-latency tradeoffs of communication activities (such as modulation scaling) leads to 10x lifetime improvement in simulations.
Abstract: We propose an energy-balanced allocation of a real-time application onto a single-hop cluster of homogeneous sensor nodes connected with multiple wireless channels. An epoch-based application consisting of a set of communicating tasks is considered. Each sensor node is equipped with discrete dynamic voltage scaling (DVS). The time and energy costs of both computation and communication activities are considered. We propose both an Integer Linear Programming (ILP) formulation and a polynomial time 3-phase heuristic. Our simulation results show that for small scale problems (with ≤ 10 tasks), up to 5x lifetime improvement is achieved by the ILP-based approach, compared with the baseline where no DVS is used. Also, the 3-phase heuristic achieves up to 63% of the system lifetime obtained by the ILP-based approach. For large scale problems (with 60-100 tasks), up to 3.5x lifetime improvement can be achieved by the 3-phase heuristic. We also incorporate techniques for exploring the energy-latency tradeoffs of communication activities (such as modulation scaling), which leads to 10x lifetime improvement in our simulations. Simulations were further conducted for two real world problems - LU factorization and Fast Fourier Transformation (FFT). Compared with the baseline where neither DVS nor modulation scaling is used, we observed up to 8x lifetime improvement for the LU factorization algorithm and up to 9x improvement for FFT.

194 citations


Proceedings ArticleDOI
TL;DR: A significantly improved algorithm is given for the problem of finding a Fourier representation R of m terms for a given discrete signal A of length N, along with a quadratic-in-m algorithm for the d-dimensional analog that works for any values of the $N_i$'s.
Abstract: We study the problem of finding a Fourier representation R of m terms for a given discrete signal A of length N. The Fast Fourier Transform (FFT) can find the optimal N-term representation in time $O(N \log N)$, but our goal is to get sublinear time algorithms when $m \ll N$. Suppose $\|A\|_2 \le M \|A - R_{opt}\|_2$, where $R_{opt}$ is the optimal output. The previously best known algorithms output R such that $\|A - R\|_2^2 \le (1+\epsilon)\|A - R_{opt}\|_2^2$ with probability at least $1-\delta$ in time $\mathrm{poly}(m, \log(1/\delta), \log N, \log M, 1/\epsilon)$. Although this is sublinear in the input size, the dominating expression is the polynomial factor in m which, for published algorithms, is greater than or equal to the bottleneck at $m^2$ that we identify below. Our experience with these algorithms shows that this is a serious limitation in theory and in practice. Our algorithm beats this $m^2$ bottleneck. Our main result is a significantly improved algorithm for this problem and the d-dimensional analog. Our algorithm outputs an R with the same approximation guarantees but it runs in time $m \cdot \mathrm{poly}(\log(1/\delta), \log N, \log M, 1/\epsilon)$. A version of the algorithm holds for all N, though the details differ slightly according to the factorization of N. For the d-dimensional problem of size $N_1 \times N_2 \times \cdots \times N_d$, the linear-in-m algorithm extends efficiently to higher dimensions for certain factorizations of the $N_i$'s; we give a quadratic-in-m algorithm that works for any values of the $N_i$'s. This article replaces several earlier, unpublished drafts.

187 citations
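To make the notion of an m-term Fourier representation concrete, here is a minimal sketch of the full-transform baseline mentioned in the abstract, not the sublinear algorithm of the paper. It computes every DFT coefficient and keeps the m largest in magnitude, which minimizes the 2-norm error among m-term representations because the Fourier basis is orthogonal. The helper names `dft`, `idft`, and `best_m_term` and the test signal are assumptions made for illustration; a direct O(N²) DFT stands in for the FFT to keep the block short.

```python
import cmath
import math

def dft(x):
    """Direct DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/N)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT with the 1/N normalization."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]

def best_m_term(signal, m):
    """Keep the m largest-magnitude DFT coefficients and zero out the rest."""
    coeffs = dft(signal)
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:m])
    sparse = [c if k in keep else 0j for k, c in enumerate(coeffs)]
    return sorted(keep), idft(sparse)

N = 32
# Hypothetical signal A: two strong tones (frequencies 3 and 7) plus a weak one (11).
A = [2.0 * cmath.exp(2j * cmath.pi * 3 * t / N)
     + 1.0 * cmath.exp(2j * cmath.pi * 7 * t / N)
     + 0.05 * cmath.exp(2j * cmath.pi * 11 * t / N)
     for t in range(N)]

freqs, R = best_m_term(A, 2)
# 2-norm of the residual A - R; only the weak tone is left over.
err = math.sqrt(sum(abs(a - r) ** 2 for a, r in zip(A, R)))
```

The cost here grows with N, whereas the algorithms discussed in the abstract aim for running time that depends on N only through log N.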


Journal ArticleDOI
29 Aug 2005
TL;DR: It is shown that FFT can be used for the joint processing of multiple (code-phase/frequency) search options in both dimensions at once and can significantly reduce computational complexity by jointly acquiring different satellites.
Abstract: A Global Positioning System (GPS) receiver uses satellite signals to determine position, velocity, and timing information. Measurements are obtained by synchronising the locally generated signal in the receiver with the signals received. A synchronisation procedure called acquisition adjusts the code phases of the incoming signal and the locally generated pseudo-random replica sequence of the corresponding satellite to a small timing offset and finds the residual frequency modulation after carrier wipe-off. New fast techniques for acquiring signals indoors in conditions that require a significant number of computations are presented. In this work many arithmetic operations are shared when exploring different search options by using fast Fourier transform (FFT) and a technique based on the frequency domain replica shifting. It is shown that FFT can be used for the joint processing of multiple (code-phase/frequency) search options in both dimensions at once. With a slight degradation in performance, the algorithm has a modified version that implements the technique using two-dimensional FFT. Several possible processing schemes are presented. Moreover, the presented shifting replica approach in the frequency domain can significantly reduce computational complexity by jointly acquiring different satellites.

186 citations


Journal ArticleDOI
TL;DR: In this article, the authors present the essential ideas underlying the nonuniform fast Fourier transform (NUFFT) algorithm in simple terms, and illustrate its utility with application to problems in magnetic resonance imaging and heat flow.

Journal ArticleDOI
TL;DR: Two related numerical models that calculate the time-dependent pressure field radiated by an arbitrary photoacoustic source in a fluid, such as that generated by the absorption of a short laser pulse, are presented.
Abstract: Two related numerical models that calculate the time-dependent pressure field radiated by an arbitrary photoacoustic source in a fluid, such as that generated by the absorption of a short laser pulse, are presented. Frequency-wavenumber (k-space) implementations have been used to produce fast and accurate predictions. Model I calculates the field everywhere at any instant of time, and is useful for visualizing the three-dimensional evolution of the wave field. Model II calculates pressure time series for points on a straight line or plane and is therefore useful for simulating array measurements. By mapping the vertical wavenumber spectrum directly to frequency, this model can calculate time series up to 50 times faster than current numerical models of photoacoustic propagation. As the propagating and evanescent parts of the field are calculated separately, model II can be used to calculate far- and near-field radiation patterns. Also, it can readily be adapted to calculate the velocity potential and thus particle velocity and acoustic intensity vectors. Both models exploit the efficiency of the fast Fourier transform, and can include the frequency-dependent directional response of an acoustic detector straightforwardly. The models were verified by comparison with a known analytic solution and a slower, but well-understood, numerical model.

Journal ArticleDOI
TL;DR: This work proposes two methods for the alignment of multiple spectral data sets that make use of fast Fourier transform for the rapid computation of a cross-correlation function that enables alignments between samples to be optimized.
Abstract: Preprocessing of chromatographic and spectral data is an important aspect of analytical sciences. In particular, recent advances in proteomics have resulted in the generation of large data sets that require analysis. To assist accurate comparison of chemical signals, we propose two methods for the alignment of multiple spectral data sets. Based on methods previously described, each chromatograph or spectrum to be aligned is divided and aligned as individual segments to a reference. However, our methods make use of fast Fourier transform for the rapid computation of a cross-correlation function that enables alignments between samples to be optimized. The proposed methods are demonstrated in comparison with an existing method on a chromatographic and a mass spectral data set. It is shown that our methods provide an advantage of speed and a reduction of the number of input parameters required. The software implementations for the proposed alignment methods are available under the downloads section at http://ptcl.chem.ox.ac.uk/~jwong/specalign.
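The FFT-based cross-correlation step described above can be sketched as follows. This is not the authors' software: it is a minimal pure-Python illustration that finds a single circular shift for a whole trace, whereas the abstract describes aligning individual segments. The helper names `fft`, `ifft`, and `best_shift` and the toy peak data are assumptions, and the radix-2 helper requires a power-of-two length.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(spectrum):
    """Inverse FFT via the conjugation identity."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in spectrum])]

def best_shift(reference, sample):
    """Return the circular lag s that maximizes sum_n reference[n] * sample[(n + s) % N]."""
    ref_f, smp_f = fft(reference), fft(sample)
    # Cross-correlation theorem: corr = IFFT(conj(FFT(reference)) * FFT(sample)).
    corr = ifft([r.conjugate() * s for r, s in zip(ref_f, smp_f)])
    return max(range(len(corr)), key=lambda s: corr[s].real)

# Toy data: one peak centred at index 5, and a copy delayed by 3 samples.
reference = [0.0] * 16
reference[4], reference[5], reference[6] = 0.5, 1.0, 0.5
sample = [reference[(n - 3) % 16] for n in range(16)]

shift = best_shift(reference, sample)
aligned = [sample[(n + shift) % 16] for n in range(16)]  # undo the estimated delay
```

Computing all N lags this way costs O(N log N) instead of the O(N²) of a direct lag-by-lag sum, which is the speed advantage the abstract refers to.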

Journal ArticleDOI
TL;DR: In this article, large-scale simulations of non-Brownian rigid fibers sedimenting under gravity at zero Reynolds number are performed using a fast algorithm in which the fiber motion is described by slender-body theory and the hydrodynamic interactions are accelerated with a smooth particle-mesh Ewald (SPME) algorithm that relies on the fast Fourier transform.
Abstract: Large-scale simulations of non-Brownian rigid fibers sedimenting under gravity at zero Reynolds number have been performed using a fast algorithm. The mathematical formulation follows the previous simulations by Butler and Shaqfeh [“Dynamic simulations of the inhomogeneous sedimentation of rigid fibres,” J. Fluid Mech. 468, 205 (2002)]. The motion of the fibers is described using slender-body theory, and the line distribution of point forces along their lengths is approximated by a Legendre polynomial in which only the total force, torque, and particle stresslet are retained. Periodic boundary conditions are used to simulate an infinite suspension, and both far-field hydrodynamic interactions and short-range lubrication forces are considered in all simulations. The calculation of the hydrodynamic interactions, which is typically the bottleneck for large systems with periodic boundary conditions, is accelerated using a smooth particle-mesh Ewald (SPME) algorithm previously used in molecular dynamics simulations. In SPME the slowly decaying Green’s function is split into two fast-converging sums: the first involves the distribution of point forces and accounts for the singular short-range part of the interactions, while the second is expressed in terms of the Fourier transform of the force distribution and accounts for the smooth and long-range part. Because of its smoothness, the second sum can be computed efficiently on an underlying grid using the fast Fourier transform algorithm, resulting in a significant speed-up of the calculations. Systems of up to 512 fibers were simulated on a single-processor workstation, providing a different insight into the formation, structure, and dynamics of the inhomogeneities that occur in sedimenting fiber suspensions.

Proceedings ArticleDOI
TL;DR: This paper presents the first 3D discrete curvelet transform, an extension to the 2D transform described in Candes et al. [1], and describes three different implementations: in-core, out-of-core and MPI-based parallel implementations.
Abstract: In this paper, we present the first 3D discrete curvelet transform. This transform is an extension to the 2D transform described in Candes et al. [1]. The resulting curvelet frame preserves the important properties, such as parabolic scaling, tightness and sparse representation for singularities of codimension one. We describe three different implementations: in-core, out-of-core and MPI-based parallel implementations. Numerical results verify the desired properties of the 3D curvelets and demonstrate the efficiency of our implementations.

Journal ArticleDOI
16 May 2005
TL;DR: This paper presents a novel fast integral equation method, termed IE-FFT, for solving large electromagnetic scattering problems, which utilizes the Toeplitz property of the coefficient matrix and is therefore applicable to both static and wave propagation problems.
Abstract: This paper presents a novel fast integral equation method, termed IE-FFT, for solving large electromagnetic scattering problems. Similar to other fast integral equation methods, the IE-FFT algorithm starts by partitioning the basis functions into multilevel clustering groups. Subsequently, the entire impedance matrix is decomposed into two parts: one for the self and/or near field couplings, and one for well-separated group couplings. The IE-FFT algorithm employs two discretizations: one is for the unknown current on an unstructured triangular mesh, and the other is a uniform Cartesian grid for interpolating the Green's function. By interpolating the Green's function on a regular Cartesian grid, the couplings between two well-separated groups can be computed using the fast Fourier transform (FFT). Consequently, the IE-FFT algorithm does not require knowledge of the addition theorem. It simply utilizes the Toeplitz property of the coefficient matrix and is therefore applicable to both static and wave propagation problems.
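The Toeplitz property mentioned in the abstract is what lets an FFT replace a dense matrix-vector product. The sketch below is a one-dimensional analogue only, not the IE-FFT algorithm itself: it embeds an n×n Toeplitz matrix in a 2n×2n circulant matrix and applies it with FFTs. The function names and the sample matrix are assumptions, and the radix-2 helper requires n to be a power of two.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(spectrum):
    """Inverse FFT via the conjugation identity."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in spectrum])]

def toeplitz_matvec(col, row, x):
    """Apply the Toeplitz matrix with first column `col` and first row `row` to x via FFT."""
    n = len(x)
    # First column of the 2n x 2n circulant that contains the Toeplitz block.
    circ = list(col) + [0.0] + list(row[:0:-1])
    padded = list(x) + [0.0] * n
    # A circulant matrix-vector product is a circular convolution.
    prod = ifft([a * b for a, b in zip(fft(circ), fft(padded))])
    return [v.real for v in prod[:n]]

def dense_matvec(col, row, x):
    """Reference O(n^2) product used only to check the FFT version."""
    n = len(x)
    def entry(i, j):
        return col[i - j] if i >= j else row[j - i]
    return [sum(entry(i, j) * x[j] for j in range(n)) for i in range(n)]

col = [4.0, 1.0, 0.5, 0.25]    # T[i][0]
row = [4.0, -1.0, 2.0, 3.0]    # T[0][j]; row[0] must equal col[0]
x = [1.0, 2.0, 3.0, 4.0]
fast = toeplitz_matvec(col, row, x)
slow = dense_matvec(col, row, x)
```

The FFT route costs O(n log n) per product and stores only the first row and column, which is why a translation-invariant kernel sampled on a regular grid is attractive for iterative solvers.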

Posted Content
TL;DR: This paper shows how the fractional FFT algorithm (FRFT) can be used to retrieve option prices from the corresponding characteristic functions, with option prices delivered up to 45 times faster without substantial loss of accuracy.
Abstract: This paper shows how the recently developed fractional FFT algorithm (FRFT) can be used to retrieve option prices from the corresponding characteristic functions. The FRFT algorithm has the advantage of using the characteristic function information in a more efficient way than the straight FFT. Typically, therefore, fewer function evaluations are needed and substantial savings in computational time can be made. Two experiments, based on the stochastic volatility and the variance-gamma models, illustrate the benefits of using the fractional version of the FFT and show that option prices can be delivered up to 45 times faster without substantial loss of accuracy in the results.

Journal ArticleDOI
TL;DR: Simulation results show that the proposed CG-Toeplitz approach to field-corrected MR image reconstruction produces image quality equivalent to that of the CG-NUFFT method with significantly reduced computation time.
Abstract: In some types of magnetic resonance (MR) imaging, particularly functional brain scans, the conventional Fourier model for the measurements is inaccurate. Magnetic field inhomogeneities, which are caused by imperfect main fields and by magnetic susceptibility variations, induce distortions in images that are reconstructed by conventional Fourier methods. These artifacts hamper the use of functional MR imaging (fMRI) in brain regions near air/tissue interfaces. Recently, iterative methods that combine the conjugate gradient (CG) algorithm with nonuniform FFT (NUFFT) operations have been shown to provide considerably improved image quality relative to the conjugate-phase method. However, for non-Cartesian k-space trajectories, each CG-NUFFT iteration requires numerous k-space interpolations; these are operations that are computationally expensive and poorly suited to fast hardware implementations. This paper proposes a faster iterative approach to field-corrected MR image reconstruction based on the CG algorithm and certain Toeplitz matrices. This CG-Toeplitz approach requires k-space interpolations only for the initial iteration; thereafter, only fast Fourier transforms (FFTs) are required. Simulation results show that the proposed CG-Toeplitz approach produces equivalent image quality as the CG-NUFFT method with significantly reduced computation time.

Journal ArticleDOI
TL;DR: Results demonstrate that MMSE turbo equalization is an attractive candidate for single-carrier broadband wireless transmissions in long delay-spread environments.
Abstract: This paper deals with a low complexity receiver scheme where equalization and channel decoding are jointly optimized in an iterative process. We derive the theoretical transfer function of the infinite length linear minimum mean square error (MMSE) equalizer with a priori information. A practical implementation is exposed which employs the fast Fourier transform (FFT) to compute the equalizer coefficients, resulting in a low-complexity receiver structure. The performance of the proposed scheme is investigated for the enhanced general packet radio service (EGPRS) radio link. Simulation results show that significant power gains may be achieved with only a few (3-4) iterations. These results demonstrate that MMSE turbo equalization is an attractive candidate for single-carrier broadband wireless transmissions in long delay-spread environments.

Journal ArticleDOI
03 Jun 2005
TL;DR: The design and realisation of a high level framework for the implementation of 1-D and 2-D FFTs for real-time applications is reported, and an FPGA-based parametrisable environment based on 2-D FFT is presented as a solution for frequency-domain image filtering applications.
Abstract: Applications based on the fast Fourier transform (FFT), such as signal and image processing, require high computational power, plus the ability to experiment with algorithms. Reconfigurable hardware devices in the form of field programmable gate arrays (FPGAs) have been proposed as a way of obtaining high performance at an economical price. However, users must program FPGAs at a very low level and have a detailed knowledge of the architecture of the device being used. They do not therefore facilitate easy development of, or experimentation with, signal/image processing algorithms. To try to reconcile the dual requirements of high performance and ease of development, this paper reports on the design and realisation of a high level framework for the implementation of 1-D and 2-D FFTs for real-time applications. A wide range of FFT algorithms, including radix-2, radix-4, split-radix and fast Hartley transform (FHT) have been implemented under a common framework in order to enable the system designers to meet different system requirements. Results show that the parallel implementation of 2-D FFT achieves linear speed-up and real-time performance for large matrix sizes. Finally, an FPGA-based parametrisable environment based on 2-D FFT is presented as a solution for frequency-domain image filtering application.

Journal ArticleDOI
TL;DR: This work presents a Fourier-based approach that estimates large translations, scalings, and rotations using the pseudopolar (PP) Fourier transform to achieve substantially improved approximations of the polar and log-polar Fourier transforms of an image.
Abstract: One of the major challenges related to image registration is the estimation of large motions without prior knowledge. This work presents a Fourier-based approach that estimates large translations, scalings, and rotations. The algorithm uses the pseudopolar (PP) Fourier transform to achieve substantially improved approximations of the polar and log-polar Fourier transforms of an image. Thus, rotations and scalings are reduced to translations which are estimated using phase correlation. By utilizing the PP grid, we increase the performance (accuracy, speed, and robustness) of the registration algorithms. Scales up to 4 and arbitrary rotation angles can be robustly recovered, compared to a maximum scaling of 2 recovered by state-of-the-art algorithms. The algorithm only utilizes one-dimensional fast Fourier transform computations whose overall complexity is significantly lower than prior works. Experimental results demonstrate the applicability of the proposed algorithms.
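Phase correlation, the translation estimator named in the abstract, can be illustrated in one dimension. The sketch below is not the pseudopolar registration method. It only shows the final step, in which a circular shift appears as a sharp peak after the cross-power spectrum is normalized to unit magnitude. The names `dft` and `phase_correlation_shift`, the `eps` guard, and the sample data are assumptions, and a direct O(N²) DFT is used for brevity.

```python
import cmath

def dft(x, sign=-1):
    """Direct DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def phase_correlation_shift(f1, f2, eps=1e-12):
    """Estimate s such that f2[n] is approximately f1[(n - s) % N]."""
    spec1, spec2 = dft(f1), dft(f2)
    cross = [a.conjugate() * b for a, b in zip(spec1, spec2)]
    # Keep only the phase of each bin; bins with negligible energy are dropped.
    normalized = [c / abs(c) if abs(c) > eps else 0j for c in cross]
    surface = [v.real / len(f1) for v in dft(normalized, sign=+1)]
    return max(range(len(surface)), key=lambda n: surface[n])

f1 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
f2 = [f1[(n - 5) % 16] for n in range(16)]   # f1 delayed circularly by 5 samples

shift = phase_correlation_shift(f1, f2)
```

Because only phase is retained, the peak stays sharp regardless of the signal's spectral shape, which is one reason the technique is commonly preferred over plain cross-correlation for registration.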

Journal ArticleDOI
TL;DR: A new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy that can reduce hardware complexity and computation cycles compared with existing FFT processors is proposed.
Abstract: The paper proposes a new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy. The existing in-place strategy supports only a fixed-radix FFT algorithm. In contrast, the proposed in-place strategy can support the MR algorithm, which allows CF FFT computations regardless of the length of FFT. The novel in-place strategy is made by interchanging storage locations of butterfly outputs. The CFMR FFT processor provides the MR algorithm, the in-place strategy, and the CF FFT computations at the same time. The CFMR FFT processor requires only two N-word memories due to the proposed in-place strategy. In addition, it uses one butterfly unit that can perform either one radix-4 butterfly or two radix-2 butterflies. The CFMR FFT processor using the 0.18 μm SEC cell library consists of 37,000 gates excluding memories, requires only 640 clock cycles for a 512-point FFT and runs at 100 MHz. Therefore, the CFMR FFT processor can reduce hardware complexity and computation cycles compared with existing FFT processors.

Journal ArticleDOI
TL;DR: A new method for numerically reconstructing digital holograms on tilted planes based on the angular spectrum of plane waves is presented, which is especially useful for tomographic image reconstruction.
Abstract: We present a new method for numerically reconstructing digital holograms on tilted planes. The method is based on the angular spectrum of plane waves. The fast Fourier transform algorithm is used twice, and coordinate rotation in the Fourier domain enables reconstruction of the object field on the tilted planes. Correction of the anamorphism resulting from the coordinate transformation is performed by suitable interpolation of the spectral data. Experimental results are presented to demonstrate the method for a single-axis rotation. The algorithm is especially useful for tomographic image reconstruction.

Journal ArticleDOI
08 Jul 2005
TL;DR: A variable-length FFT processor design that is based on a radix-2/4/8 algorithm and a single-path delay feedback architecture that can function correctly up to 45 MHz with a 3.3 V supply voltage and power consumption of 640 mW.
Abstract: Fast Fourier transform (FFT) processing is one of the key procedures in the popular orthogonal frequency division multiplexing (OFDM) communication systems. Structured pipeline architectures and low power consumption are the main concerns for its VLSI implementation. In the paper, the authors report a variable-length FFT processor design that is based on a radix-2/4/8 algorithm and a single-path delay feedback architecture. The processor can be used in various OFDM-based communication systems, such as digital audio broadcasting (DAB), digital video broadcasting-terrestrial (DVB-T), asymmetric digital subscriber loop (ADSL) and very-high-speed digital subscriber loop (VDSL). To reduce power consumption and chip area, special current-mode SRAMs are adopted to replace shift registers in the delay lines. In addition, techniques including complex multipliers containing three real multiplications, and reduced sine/cosine tables are adopted. The chip is fabricated using a 0.35 μm CMOS process and it measures 3900 μm × 5500 μm. According to the measured results, the 2048-point FFT operation can function correctly up to 45 MHz with a 3.3 V supply voltage and power consumption of 640 mW. In low-power operation, when the supply voltage is scaled down to 2.3 V, the processor consumes 176 mW when it runs at 17.8 MHz.
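A complex multiplier built from three real multiplications, as mentioned in the abstract, can be sketched as below. The abstract does not say which formulation the chip uses, so this is one well-known variant shown under that assumption; the function name `complex_mul_3` is hypothetical. When (c + di) is a fixed twiddle factor, the terms d - c and c + d can be precomputed, leaving three multiplications and three additions per product.

```python
def complex_mul_3(a, b, c, d):
    """Return (real, imag) of (a + bi) * (c + di) using three real multiplications."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    # k1 - k3 = ac - bd  and  k1 + k2 = ad + bc
    return k1 - k3, k1 + k2

real, imag = complex_mul_3(1, 2, 3, 4)      # (1 + 2i) * (3 + 4i)
reference = complex(1, 2) * complex(3, 4)   # the usual four-multiplication product
```

The trade is one multiplier for extra adders, which usually pays off in hardware because multipliers dominate area and power.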

Journal ArticleDOI
TL;DR: The proposed method explores the capability of DFT and directional filtering in dealing with low-quality images and the effectiveness of nonlinear feature extraction method in fingerprint classification.

Journal ArticleDOI
TL;DR: In this article, an adaptive control scheme is proposed to reduce force ripple effects impeding motion accuracy in Permanent Magnet Linear Motors (PMLMs) by using a Fast Fourier Transform (FFT) analysis.

Journal ArticleDOI
TL;DR: In this paper, a wavelet based spectral finite element is developed for studying elastic wave propagation in 1-D connected waveguides, which circumvents several disadvantages of the conventional spectral element formulation using Fast Fourier Transforms (FFT) particularly in the study of transient dynamics.

Journal ArticleDOI
TL;DR: This work decomposes the restoration into a sum of two independent restorations: one comes directly from a modified FFT-based approach, and the other involves a reduced set of unknowns that can be calculated very efficiently even though no circular convolution structure exists.
Abstract: Fast Fourier transform (FFT)-based restorations are fast, but at the expense of assuming that the blurring and deblurring are based on circular convolution. Unfortunately, when the opposite sides of the image do not match up well in intensity, this assumption can create significant artifacts across the image. If the pixels outside the measured image window are modeled as unknown values in the restored image, boundary artifacts are avoided. However, this approach destroys the structure that makes the use of the FFT directly applicable, since the unknown image is no longer the same size as the measured image. Thus, the restoration methods available for this problem no longer have the computational efficiency of the FFT. We propose a new restoration method for the unknown boundary approach that can be implemented in a fast and flexible manner. We decompose the restoration into a sum of two independent restorations. One restoration yields an image that comes directly from a modified FFT-based approach. The other restoration involves a set of unknowns whose number equals that of the unknown boundary values. By summing the two, the artifacts are canceled. Because the second restoration has a significantly reduced set of unknowns, it can be calculated very efficiently even though no circular convolution structure exists.
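The circular-convolution assumption mentioned in the abstract can be seen in a small 1-D sketch. It does not implement the paper's restoration method; it only shows that multiplying spectra yields circular convolution, so intensity from one edge wraps around to the other when the edges do not match. All function names are hypothetical, and a naive DFT replaces the FFT to avoid external dependencies.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * i * k / n) for k in range(n))
           for i in range(n)]
    return [v / n for v in out] if inverse else out

def fourier_blur(x, h):
    """Blur x with kernel h by multiplying spectra (implicitly circular)."""
    h_padded = list(h) + [0] * (len(x) - len(h))
    product = [a * b for a, b in zip(dft(x), dft(h_padded))]
    return [v.real for v in dft(product, inverse=True)]

def circular_blur(x, h):
    """Direct circular convolution: indices wrap around modulo len(x)."""
    n = len(x)
    return [sum(x[(i - k) % n] * h[k] for k in range(len(h))) for i in range(n)]

def linear_blur(x, h):
    """Direct linear convolution: no wrap-around, output is longer than x."""
    out = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            out[i + k] += xi * hk
    return out

signal = [0, 0, 0, 0, 0, 0, 10, 10]  # dark left edge, bright right edge
kernel = [1, 1, 1]
print(circular_blur(signal, kernel))  # [20, 10, 0, 0, 0, 0, 10, 20]
print(linear_blur(signal, kernel))    # [0, 0, 0, 0, 0, 0, 10, 20, 20, 10]
```

The first two samples of the circular result pick up brightness from the far right edge, which is the kind of boundary artifact the abstract describes.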

Journal ArticleDOI
TL;DR: In this article, a novel fast electromagnetic field-circuit simulator that permits the full-wave modeling of transients in nonlinear microwave circuits is proposed. It is composed of two components: 1) a full-wave solver that models interactions of electromagnetic fields with conducting surfaces and finite dielectric volumes by solving time-domain surface and volume electric field integral equations, respectively, and 2) a circuit solver that models field interactions with lumped circuits, which are potentially active and nonlinear, by solving Kirchhoff's equations through modified nodal analysis.
Abstract: A novel fast electromagnetic field-circuit simulator that permits the full-wave modeling of transients in nonlinear microwave circuits is proposed. This time-domain simulator is composed of two components: 1) a full-wave solver that models interactions of electromagnetic fields with conducting surfaces and finite dielectric volumes by solving time-domain surface and volume electric field integral equations, respectively, and 2) a circuit solver that models field interactions with lumped circuits, which are potentially active and nonlinear, by solving Kirchhoff's equations through modified nodal analysis. These field and circuit analysis components are consistently interfaced and the resulting coupled set of nonlinear equations is evolved in time by a multidimensional Newton-Raphson scheme. The solution procedure is accelerated by allocating field- and circuit-related computations across the processors of a distributed-memory cluster, which communicate using the message-passing interface standard. Furthermore, the electromagnetic field solver, whose demand for computational resources far outpaces that of the circuit solver, is accelerated by a fast Fourier transform (FFT)-based algorithm, viz. the time-domain adaptive integral method. The resulting parallel FFT accelerated transient field-circuit simulator is applied to the analysis of various active and nonlinear microwave circuits, including power-combining arrays.

Journal ArticleDOI
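TL;DR: An EM-based iterative receiver for MIMO-OFDM systems with carrier-frequency offset (CFO) is designed around a pilot-aided CFO estimation scheme that allows FFT-based fast implementation, and experimental results show its effectiveness in combating CFO.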
Abstract: In this letter, we study the design of expectation-maximization (EM)-based iterative receivers for multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing systems with the presence of carrier-frequency offset (CFO). Motivated by the spirit of maximum-likelihood estimation in the EM algorithm, we first present a pilot-aided CFO estimation scheme that allows fast Fourier transform-based fast implementation. Then this CFO estimation is incorporated into the initialization step of the iterative receiver. Experimental results show the effectiveness of our receiver design in combating CFO.
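Why a pilot-aided, likelihood-style CFO search lends itself to an FFT can be illustrated with a generic single-antenna sketch. This is not the letter's MIMO-OFDM scheme: the signal model (a known pilot rotated by a frequency offset, with no noise or channel), the function name, and the `oversample` parameter are all assumptions for illustration. The correlation evaluated on a uniform grid of candidate offsets is exactly a zero-padded DFT of the pilot-compensated samples, which an FFT would compute in one pass; the loop below evaluates the same values directly.

```python
import cmath

def estimate_cfo(received, pilot, oversample=8):
    """Grid search for the CFO (in units of subcarrier spacing) of a known pilot."""
    n = len(pilot)
    # Remove the known pilot modulation; what is left is a single complex tone.
    y = [r * p.conjugate() for r, p in zip(received, pilot)]
    size = n * oversample  # length of the equivalent zero-padded DFT
    best_q, best_mag = 0, -1.0
    for q in range(size):
        # Bin q of the zero-padded DFT of y corresponds to offset q / oversample.
        acc = sum(y[k] * cmath.exp(-2j * cmath.pi * q * k / size) for k in range(n))
        if abs(acc) > best_mag:
            best_q, best_mag = q, abs(acc)
    if best_q >= size // 2:
        best_q -= size  # upper half of the bins represents negative offsets
    return best_q / oversample

# Hypothetical noiseless test signal: 16 pilot symbols with an offset of 0.25.
n = 16
pilot = [complex(1 if k % 3 else -1, 0) for k in range(n)]
received = [p * cmath.exp(2j * cmath.pi * 0.25 * k / n) for k, p in enumerate(pilot)]
print(estimate_cfo(received, pilot))
```

The grid resolution is `1 / oversample` of the subcarrier spacing, so a finer estimate needs more zero padding or a separate refinement step.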