Showing papers on "Split-radix FFT algorithm" published in 1996


Proceedings ArticleDOI
15 Apr 1996
TL;DR: A new VLSI architecture for a real-time pipeline FFT processor is proposed, based on a hardware-oriented radix-2^2 algorithm that is derived by integrating a twiddle factor decomposition technique in the divide-and-conquer approach; the algorithm has the same multiplicative complexity as the radix-4 algorithm but retains the butterfly structure of the radix-2 algorithm.
Abstract: A new VLSI architecture for a real-time pipeline FFT processor is proposed. A hardware-oriented radix-2^2 algorithm is derived by integrating a twiddle factor decomposition technique in the divide-and-conquer approach. The radix-2^2 algorithm has the same multiplicative complexity as the radix-4 algorithm, but retains the butterfly structure of the radix-2 algorithm. The single-path delay-feedback architecture is used to exploit the spatial regularity in the signal flow graph of the algorithm. For length-N DFT computation, the hardware requirement of the proposed architecture is minimal on both dominant components: log_4(N) - 1 complex multipliers and N - 1 complex data memory. The validity and efficiency of the architecture have been verified by simulation in the hardware description language VHDL.

410 citations
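
As a sketch of the twiddle factor decomposition mentioned in the abstract above, the following is the derivation commonly given for radix-2^2. The index names n_1, n_2, n_3 and k_1, k_2, k_3 are assumptions introduced here for illustration, not notation taken from this listing.

```latex
% DFT: X(k) = \sum_{n=0}^{N-1} x(n) W_N^{nk}, with W_N = e^{-j 2\pi / N}
% Three-dimensional index map (assumed notation):
n = \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3, \qquad
k = k_1 + 2 k_2 + 4 k_3,
\qquad n_1, n_2, k_1, k_2 \in \{0, 1\}, \quad
n_3, k_3 \in \{0, \dots, \tfrac{N}{4} - 1\}

% Expanding nk and dropping all integer multiples of N in the exponent:
W_N^{nk} = (-1)^{n_1 k_1} \, (-j)^{n_2 (k_1 + 2 k_2)} \,
           W_N^{n_3 (k_1 + 2 k_2)} \, W_{N/4}^{n_3 k_3}
```

The factors (-1)^{n_1 k_1} and (-j)^{n_2 (k_1 + 2 k_2)} need only sign changes and real/imaginary swaps, so a full complex multiplication by W_N^{n_3 (k_1 + 2 k_2)} appears only after every second radix-2 butterfly stage, which is consistent with the log_4(N) - 1 multiplier count quoted in the abstract.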


01 Apr 1996
TL;DR: This paper surveys some recent work directed towards generalizing the fast Fourier transform (FFT) from the point of view of group representation theory, and discusses generalizations of the FFT to arbitrary finite groups and compact Lie groups.
Abstract: In this paper we survey some recent work directed towards generalizing the fast Fourier transform (FFT). We work primarily from the point of view of group representation theory. In this setting the classical FFT can be viewed as a family of efficient algorithms for computing the Fourier transform of either a function defined on a finite abelian group, or a bandlimited function on a compact abelian group. We discuss generalizations of the FFT to arbitrary finite groups and compact Lie groups.

142 citations


Journal ArticleDOI
TL;DR: A method for the calculation of the fractional Fourier transform (FRT) by means of the fast Fourier transform (FFT) algorithm is presented, and scaling factors for the FRT and Fresnel diffraction when calculated through the FFT are discussed.
Abstract: A method for the calculation of the fractional Fourier transform (FRT) by means of the fast Fourier transform (FFT) algorithm is presented. The process involves mainly two FFTs in cascade; thus the process has the same complexity as this algorithm. The method is valid for fractional orders varying from −1 to 1. Scaling factors for the FRT and Fresnel diffraction when calculated through the FFT are discussed.

118 citations


Patent
31 Oct 1996
TL;DR: In a DMT xDSL system, the forward and inverse FFTs are implemented in 19-bit precision on a fixed-point 16-bit processor; at each FFT stage the input data is downscaled by right shifting one or two bits only if overflow is possible, and the output is rescaled after the FFT completes.
Abstract: A discrete multitone (DMT) digital subscriber loop (xDSL) telecommunication system has a transmitter portion including a bit encoder, inverse fast Fourier transform (IFFT), parallel-to-serial converter, digital-to-analog converter and line driver for transmitting data signals to a twisted pair telephone line and a receiver portion including an analog-to-digital converter, serial-to-parallel converter, forward FFT and bit decoder for receiving data signals from the twisted pair telephone line. The FFTs are implemented in 19-bit precision using a fixed point 16-bit processor. At each FFT stage, the number of sign bits in the FFT input data is examined to determine whether overflow is possible during multiply and add operations. The input data is downscaled by right shifting one or two bits if overflow is possible. If downscaling occurred, the output data is rescaled after completion of the FFT operation. If overflow is not possible, no scaling is done. By using variable scaling to downscale only when necessary, better overall precision is maintained.

104 citations
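
The per-stage check described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the 16-bit word width default, and the assumption that one FFT stage can grow values by at most two bits are all introduced here.

```python
def redundant_sign_bits(value, width=16):
    """Count sign-extension bits beyond the one required sign bit.

    Assumes `value` fits in a two's-complement word of `width` bits.
    """
    # For negative values, ~value has the same number of significant bits.
    magnitude = value if value >= 0 else ~value
    return width - 1 - magnitude.bit_length()


def prescale_stage(data, width=16, growth_bits=2):
    """Right shift the whole block only if the next stage could overflow.

    `growth_bits` is an assumed worst-case bit growth per stage (hypothetical
    parameter); the shift is therefore 0, 1, or 2 bits with the default.
    Returns the (possibly shifted) data and the shift that was applied.
    """
    headroom = min(redundant_sign_bits(v, width) for v in data)
    shift = max(0, growth_bits - headroom)
    return [v >> shift for v in data], shift


# Small values leave enough headroom, so no scaling is done.
print(prescale_stage([100, -200, 50]))
# A value with no spare sign bits forces a two-bit right shift.
print(prescale_stage([16384, -3]))
```

A caller would accumulate the returned shifts over all stages and undo the total once the transform completes, which corresponds to the rescaling step the abstract mentions.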


Journal ArticleDOI
TL;DR: A series-expansion approach and an operator framework are used to derive a new, fast, and accurate Fourier algorithm for iterative tomographic reconstruction that is applicable for parallel-ray projections collected at a finite number of arbitrary view angles and radially sampled at a rate high enough that aliasing errors are small.
Abstract: We use a series-expansion approach and an operator framework to derive a new, fast, and accurate Fourier algorithm for iterative tomographic reconstruction. This algorithm is applicable for parallel-ray projections collected at a finite number of arbitrary view angles and radially sampled at a rate high enough that aliasing errors are small. The conjugate gradient (CG) algorithm is used to minimize a regularized, spectrally weighted least-squares criterion, and we prove that the main step in each iteration is equivalent to a 2-D discrete convolution, which can be cheaply and exactly implemented via the fast Fourier transform (FFT). The proposed algorithm requires O(N^2 log N) floating-point operations per iteration to reconstruct an N × N image from P view angles, as compared to O(N^2 P) floating-point operations per iteration for iterative convolution-backprojection algorithms or general algebraic algorithms that are based on a matrix formulation of the tomography problem. Numerical examples using simulated data demonstrate the effectiveness of the algorithm for sparse- and limited-angle tomography under realistic sampling scenarios. Although the proposed algorithm cannot explicitly account for noise with nonstationary statistics, additional simulations demonstrate that for low to moderate levels of nonstationary noise, the quality of reconstruction is almost unaffected by assuming that the noise is stationary.

97 citations


Journal ArticleDOI
TL;DR: Fast Fourier transform (FFT)-based computations can be far more accurate than the slow transforms suggest, but these results depend critically on the accuracy of the FFT software employed, which should generally be considered suspect.
Abstract: Fast Fourier transform (FFT)-based computations can be far more accurate than the slow transforms suggest. Discrete Fourier transforms computed through the FFT are far more accurate than slow transforms, and convolutions computed via FFT are far more accurate than the direct results. However, these results depend critically on the accuracy of the FFT software employed, which should generally be considered suspect. Popular recursions for fast computation of the sine/cosine table (or twiddle factors) are inaccurate due to inherent instability. Some analyses of these recursions that have appeared heretofore in print, suggesting stability, are incorrect. Even in higher dimensions, the FFT is remarkably stable.

96 citations
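
To illustrate the point about twiddle factor tables in the abstract above, here is a small sketch that compares a table built by repeated multiplication with one where every entry is evaluated independently. Repeated multiplication is only one example of a popular recursion and is an assumption here; the listing does not say which recursions the paper analyzes. All function names are hypothetical.

```python
import cmath
import math


def twiddles_direct(n):
    """Evaluate each twiddle factor exp(-2*pi*j*k/n) independently."""
    return [cmath.exp(-2j * math.pi * k / n) for k in range(n)]


def twiddles_by_recursion(n):
    """Build the table by repeated multiplication: w[k] = w[k-1] * w[1].

    Rounding errors from each step carry over into all later entries.
    """
    step = cmath.exp(-2j * math.pi / n)
    table = [1 + 0j]
    for _ in range(1, n):
        table.append(table[-1] * step)
    return table


def max_deviation(n):
    """Largest absolute difference between the two tables."""
    direct = twiddles_direct(n)
    recursive = twiddles_by_recursion(n)
    return max(abs(a - b) for a, b in zip(direct, recursive))


for size in (64, 1024, 16384):
    print(size, max_deviation(size))
```

Printing the deviation for growing table sizes gives a rough feel for how the accumulated error of the recursive table behaves relative to direct evaluation.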


Journal ArticleDOI
TL;DR: Two programs are presented to compute direct- and cross-variogram values, direct- and cross-covariograms, and pseudo-cross-variograms based on the fast Fourier transform algorithm, which is shown to be faster than the spatial approach for this type of data.

71 citations


Journal ArticleDOI
TL;DR: A fast and reliable convolution algorithm to calculate the mean line of a roughness profile using the Gaussian filter according to ISO 11562 has been derived based on a recurrence relation for the weighting function.
Abstract: A fast and reliable convolution algorithm to calculate the mean line of a roughness profile using the Gaussian filter according to ISO 11562 has been derived. The algorithm is based on a recurrence relation for the weighting function. This greatly speeds up the calculation, making the algorithm nearly comparable to algorithms using the fast Fourier transform (FFT) in a usual way. The algorithm has been implemented as a short C function to be used with any evaluation program. The application of this function to a measured profile is given for demonstration and compared with the results obtained by the ordinary FFT filter algorithm.

55 citations


Patent
01 Feb 1996
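TL;DR: In a cellular base-station receiver's channelizer, a storage-address generator (482) directs that corresponding FFT input elements of successive FFT operations be stored in the same locations in an input-data memory (451), while a fetch-address generator (484) applies a changing modulo-K offset on retrieval to achieve frequency translation of the FFT filter-bank outputs.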
Abstract: In a cellular-telephone-system base-station receiver's channelizer (111), frequency translation of the outputs of a filter bank (FIG. 5) implemented in fast-Fourier-transform circuitry (453,455,460) is achieved by rotating the correspondence between FFT input elements and the filter coefficients by which multipliers (437) multiply incoming samples to produce them. Specifically, a storage-address generator (482) directs that corresponding FFT input elements of successive FFT operations be stored in the same locations in an input-data memory (451). To retrieve those values for use in the DFT operation, however, a fetch-address generator (484) employs a modulo-K adder (488) to impose a changing offset so that the starting address for retrieval of each FFT operation's input record changes between FFT operations by the filter bank's decimation rate M. An FFT-implemented combiner (131) similarly rotates computation values to phase align successive wavelets that it adds together to generate modulated carriers in a multi-channel output signal.

53 citations


Journal ArticleDOI
TL;DR: It is demonstrated that, in geoid computations over large regions, the 1D spherical FFT and the 2D multiband spherical FFT in combination with discrete spectra for the kernel functions and 100% zero-padding give better results than those obtained by the other transform techniques.
Abstract: The Stokes formula is efficiently evaluated by the one- and two-dimensional (1D, 2D) fast Fourier transform (FFT) technique in the plane and on the sphere in order to obtain precise geoid determination over a large area such as Europe. Using a high-pass filtered spherical harmonic reference model (OSU91A truncated to different degrees), gridded gravity anomalies and geoid heights were produced and the anomalies were used as input in the FFT software. Various tests were performed with respect to the different kernel functions used, to the spherical computations in bands, as well as to windowing, edge effects and extent of the area. It is thus demonstrated that, in geoid computations over large regions, the 1D spherical FFT and the 2D multiband spherical FFT in combination with discrete spectra for the kernel functions and 100% zero-padding give better results than those obtained by the other transform techniques. Additionally, numerical tests were carried out at the same test area using the planar fast Hartley transform (FHT) instead of the FFT and the results obtained by the two attractive alternatives were compared regarding the requirements in both computer time and computer memory needed in geoid height computations.

43 citations


Journal ArticleDOI
TL;DR: Details of a new low power fast Fourier transform (FFT) processor for use in digital television applications are presented; the chip design is based on a novel VLSI architecture derived from a first-principles factorization of the discrete Fourier transform matrix and tailored to a direct silicon implementation.
Abstract: Details of a new low power fast Fourier transform (FFT) processor for use in digital television applications are presented. This has been fabricated using a 0.6-µm CMOS technology and can perform a 64 point complex forward or inverse FFT on real-time video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8 × 8 mm^2 and dissipates 1 W. The chip design is based on a novel VLSI architecture which has been derived from a first principles factorization of the discrete Fourier transform (DFT) matrix and tailored to a direct silicon implementation.

Journal ArticleDOI
TL;DR: A simple algorithm is described for computing general pseudo-differential operator actions based on the asymptotic expansion of the symbol together with the fast Fourier transform, and a complexity analysis shows that the algorithm is efficient.
Abstract: A simple algorithm is described for computing general pseudo-differential operator actions. Our approach is based on the asymptotic expansion of the symbol together with the fast Fourier transform (FFT). The idea is motivated by the characterization of the pseudo-differential operator algebra. We show that the algorithm is efficient through analyzing its complexity. Some numerical experiments are also presented.

Proceedings ArticleDOI
13 Mar 1996
TL;DR: The idea of the method is to localize the nonregularities into the nodes of the Cooley-Tukey FFT type computational graph so that a simple programmable processor element for executing the node function can be the basis for parallel constructs.
Abstract: We propose for a class of trigonometric transforms fast algorithms with a unified structure and a simple data exchange similar to constant geometry isomorphic to the Cooley-Tukey FFT algorithm. One can easily extend many of the parallel FFT approaches for these algorithms. The idea of the method is to localize the nonregularities into the nodes of the Cooley-Tukey FFT type computational graph. Only the basic operation in the nodes of the computational graph will be different for different transforms. Thus a simple programmable processor element for executing the node function can be the basis for parallel constructs.

Journal ArticleDOI
TL;DR: It is shown that while the technique presented herein is not expected to exhibit the same performance as that of comparable techniques based on the three-dimensional FFT, it is an attractive alternative that makes modest sacrifices in performance for gains in computational complexity.
Abstract: In this paper a description is given of a computationally efficient algorithm, based on the two-dimensional fast Fourier transform (FFT), for the estimation of multiple translational motions from a sequence of images. The proposed algorithm relies on properties of the projection (Radon) transform to reduce the problem from three to two dimensions and is effective in isolating and reliably estimating several superimposed motions in the presence of moderate levels of noise. Furthermore, the reliance of this algorithm on a novel array processing technique for line detection allows for the efficient estimation of the motion parameters. It is shown that while the technique presented herein is not expected to exhibit the same performance as that of comparable techniques based on the three-dimensional FFT, it is an attractive alternative that makes modest sacrifices in performance for gains in computational complexity.

Journal ArticleDOI
TL;DR: A new parallel FFT algorithm is proposed that removes the complex multiplier between the two pipeline stages and simplifies the address generation of twiddle factors and reduces the number of twiddles to a minimum.
Abstract: Usually, parallel pipelined FFT processors are used to compute long FFTs due to high processing rate and easy implementation. The efficient VLSI implementation of each FFT processor at the pipelines is a critical problem to be considered. We propose a new parallel FFT algorithm that removes the complex multiplier between the two pipeline stages. The new algorithm also simplifies the address generation of twiddle factors and reduces the number of twiddle factors to a minimum. With the new algorithm, each FFT processor at the pipelines can be integrated easily onto a single chip.

Journal ArticleDOI
TL;DR: The real and complex split-radix generalized fast Fourier transform algorithm has been developed and its applications for skew-circular convolution and partial FFT are described.

Patent
27 Aug 1996
TL;DR: FFT computing units, FFT computation devices, and pulse counters are provided that achieve the required computational precision with the smallest possible circuit size, by standardizing the FFT data to a specified bit width with a data shift circuit and truncating part of each computing unit's output.
Abstract: To provide FFT computing units, FFT computation devices, and pulse counters that can achieve computational precision using the smallest possible circuit size. FFT computing unit 602 comprises a data shift circuit for standardizing FFT computation target data to a specified bit width, adders/subtracters, multipliers, and data converters for standardizing the bit width to a certain bit width by truncating part of the output data of each computing unit, etc. FFT computation device comprises FFT computing unit 602, sensor 620, amplification circuit 621, gain control circuit 623, AD converter 622, first RAM 625 for sequentially storing the A/D conversion data, second RAM 626 for storing the FFT computation target data and the data being computed, coefficient ROM 101, and level determination circuit 624; and the level determination circuit determines the size of the data being transferred when the data is being transferred from RAM 1 to RAM 2, and the result is used for the data shift adjustment and gain control during FFT computation.

Journal ArticleDOI
TL;DR: The empirical results for the Pearson X^2, likelihood ratio, and Freeman-Halton statistics show that the network algorithm, or equivalently, the recursive polynomial multiplication algorithm is superior to the FFT algorithm with respect to computing speed and accuracy.

Proceedings ArticleDOI
18 Aug 1996
TL;DR: A new class of parallel architectures called unfolded swapped networks (USN) for fast Fourier transform (FFT) and related problems is proposed; USNs can be constructed using small butterfly modules, each built on a chip, and require fewer pins than a similar-sized butterfly network.
Abstract: We propose a new class of parallel architectures called unfolded swapped networks (USN) for fast Fourier transform (FFT) and related problems. The VLSI area of a suitably constructed N(log_2 N + o(log N))-node USN is no more than N^2 + o(N^2), which is smaller than the best known result for a log_2 N-dimensional butterfly network. USNs can be constructed using small butterfly modules, each built on a chip, and require fewer pins than a similar-sized butterfly network by a factor of Θ(log N). An N-point FFT can be executed on a USN at a speed comparable to a butterfly network, assuming constant link delay; it can be executed on a USN considerably faster than on a butterfly when link delay increases with length and/or when inter-chip data transfers are much slower than intra-chip ones.

Proceedings ArticleDOI
07 May 1996
TL;DR: It is shown that detection performance increases monotonically with the number of FFT stages completed, converging ultimately to that of the exact ML detector.
Abstract: In the context of FFT-based maximum-likelihood (ML) detection of a complex sinusoid in noise, we consider the result of terminating the FFT at an intermediate stage of computation and applying the ML detection strategy to its unfinished results. We show that detection performance increases monotonically with the number of FFT stages completed, converging ultimately to that of the exact ML detector. The receiver operating characteristic associated with the completion of each FFT stage is derived. This enables the calculation of the minimum number of FFT stages that must be completed in order for desired detection and false alarm probabilities to be obtained.

Proceedings ArticleDOI
14 Oct 1996
TL;DR: This work develops a real transform algorithm for calculating the discrete circular deconvolution by substituting the fast Fourier transform defined in the complex domain and it is shown that the computational cost is about half of the traditional FFT.
Abstract: Fast computation of the discrete deconvolution is very important in image/video signal processing. We develop a real transform algorithm for calculating the discrete circular deconvolution by substituting the fast Fourier transform (FFT) defined in the complex domain. It is shown that the computational cost of the algorithm is about half of the traditional FFT. Furthermore, the algorithm has a weak numerical stability.

Journal ArticleDOI
TL;DR: In this paper, general systems of polynomials, satisfying prescribed symmetries and orthonormal on the unit circle with respect to weight functions belonging to a suitable symmetry class, are used in order to generalize the Discrete Fourier Transform (DFT) and the FFT algorithm.
Abstract: General systems of polynomials, satisfying prescribed symmetries and orthonormal on the unit circle with respect to weight functions belonging to a suitable symmetry class, are used in order to generalize the Discrete Fourier Transform (DFT) and the FFT algorithm.

Journal ArticleDOI
TL;DR: Based on the differential property of the Fourier transform and the Taylor expansion of an n-variable function, the subsequence interpolating algorithm is extended to a general n-dimensional signal.
Abstract: Based on the differential property of the Fourier transform and the Taylor expansion of an n-variable function, the subsequence interpolating algorithm is extended to a general n-dimensional signal. As the interpolating process consists of a few parallel inverse FFTs of the same size as the forward FFT, it is very efficient and is suitable for parallel processing.

Journal ArticleDOI
TL;DR: In this article, an enhanced spectrometer is proposed using the modified forward-backward linear prediction method (MFBLP) with a search algorithm, and a computer simulation for multiple-wavenumber estimation is investigated.
Abstract: Interferometric instruments have the following serious weak points: (1) the necessity of doing a Fourier transform that involves a vast amount of calculation; (2) the lack of knowledge of suitable measuring conditions until the Fourier transform is finished; and (3) the spectral resolution of the conventional Fourier-based techniques is significantly affected by the sampling rate, data length, and noise in signal processing. In this paper, an enhanced spectrometer is proposed using the modified forward-backward linear prediction method (MFBLP) with a search algorithm. To document the advantage of the method presented, a computer simulation for multiple-wavenumber estimation is investigated. The MFBLP method is truly superior to the fast Fourier transform (FFT) method. In general, the spectral resolution using the FFT method is proportional to the data length. In this paper, however, it is shown that excellent results can also be obtained from only 60 sample points using the FFT method. Moreover, from experimental results, we also conclude that the sampling rate must be consistent with the condition 632

Journal ArticleDOI
TL;DR: It is demonstrated that this computation can be done in place by just employing butterfly swaps if the input reordering is combined with the bit-reverse scrambling required by the real-valued version of the decimation-in-time fast Fourier transform algorithm.
Abstract: The possibility of computing the discrete cosine transform (DCT) of length N = 2^v, v integer, via an N-point discrete Fourier transform (DFT) is widely known from the literature. It is demonstrated that this computation can be done in place by just employing butterfly swaps if the input reordering (necessary for the DCT computation via DFT) is combined with the bit-reverse scrambling required by the real-valued version of the decimation-in-time fast Fourier transform algorithm. As computer code for many real-valued FFT algorithms is publicly available, and FFT-based DCT computations exhibit a very regular structure that is helpful in signal processor implementations, this method of DCT-computation becomes even more attractive.
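
The well-known DCT-via-DFT route referred to in the abstract above can be sketched as follows. This is only an illustration of the input reordering and the final rotation, under several assumptions made here: the unnormalized DCT-II is used, a naive O(N^2) complex DFT stands in for a real-valued FFT, and the computation is not in place, so it does not reproduce the paper's combined reordering and bit-reversal. All function names are hypothetical.

```python
import cmath
import math


def dct2_direct(x):
    """Unnormalized DCT-II from its definition (reference for comparison)."""
    n = len(x)
    return [
        sum(x[m] * math.cos(math.pi * (2 * m + 1) * k / (2 * n)) for m in range(n))
        for k in range(n)
    ]


def dft(v):
    """Naive DFT, standing in for an FFT to keep the sketch self-contained."""
    n = len(v)
    return [
        sum(v[m] * cmath.exp(-2j * math.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def dct2_via_dft(x):
    """DCT-II through one N-point DFT of a reordered input (N assumed even)."""
    n = len(x)
    v = [0.0] * n
    for m in range(n // 2):
        v[m] = x[2 * m]              # even-indexed samples, in order
        v[n - 1 - m] = x[2 * m + 1]  # odd-indexed samples, reversed
    spectrum = dft(v)
    # Rotate each bin by exp(-j*pi*k/(2N)) and keep the real part.
    return [
        (cmath.exp(-1j * math.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


samples = [1.0, 2.0, -0.5, 3.0, 0.25, -1.0, 4.0, 0.0]
for a, b in zip(dct2_direct(samples), dct2_via_dft(samples)):
    print(round(a, 9), round(b, 9))
```

The two printed columns should agree up to rounding, which shows that the reordering plus one rotation per bin is enough to recover the DCT-II coefficients from a same-length DFT.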

Patent
27 Aug 1996
TL;DR: A 3D FFT device combines a two-dimensional (2D) parallel FFT circuit 11, which receives signals from two-dimensionally arrayed antenna elements A0 to A63 and performs a spatial-axis 2D FFT to analyze radio wave arrival directions, a shift register group 12 that arranges the direction-sorted results in parallel in time order, and a 64-element parallel FFT circuit 13 that performs a time-base FFT for frequency analysis.
Abstract: PROBLEM TO BE SOLVED: To attain the analysis of radio wave arriving directions and the execution of fast Fourier transformation (FFT) for the three-dimensional (3D) development of the analyzed results, i.e., the simultaneous analysis of directions and frequency, with small and inexpensive constitution. SOLUTION: The 3D FFT device is provided with a two-dimensional (2D) parallel FFT circuit 11 for receiving signals from two-dimensionally arrayed antenna elements A0 to A63 and executing spatial axis 2D FFT operation for the analysis of radio wave arriving directions, a 64-element parallel FFT circuit 13 for receiving outputs from the circuit 11 and executing time base FFT operation for frequency analysis and a shift register group 12 for paralleling time series constituted of arranging the direction-sorted decomposed results outputted from the circuit 11 as direction analyzed results in parallel in the time order so as to sort the signal time series of fixed time as a group and outputting the results transformed in the directional order together with time and outputs from the shift register group 12 are supplied to the parallel FFT circuit 13.



Proceedings ArticleDOI
14 Oct 1996
TL;DR: Analysis shows that the 2-D AFT can perform competitively with the classical 2-D FFT in terms of complexity and speed, and a computer simulation confirms the correctness of the algorithm.
Abstract: The arithmetic Fourier transform (AFT) is a number-theoretic approach to Fourier analysis which has been shown to perform competitively with the classical FFT. A 2-D AFT algorithm using the same method is developed on the basis of a 1-D AFT algorithm. The analysis of the complexity and the architecture of the 2-D AFT algorithm shows that 2-D AFT can also perform competitively with the classical 2-D FFT in terms of complexity and speed. Finally, a computer simulation shows the correctness of this algorithm.

Journal ArticleDOI
TL;DR: A space-group-general radix-2 crystallographic fast Fourier transform (FFT) has been written in only 130 lines of executable Fortran code, and computational times compare favorably with other FFT programs that require 1000 or more lines of code excluding peak-interpolation routines.
Abstract: A space-group-general radix-2 crystallographic fast Fourier transform (FFT) has been written in only 130 lines of executable Fortran code. Computational times compare favorably with other FFT programs that require 1000 or more lines of code excluding peak-interpolation routines. The complete program and description of control parameters are given. The program is dimensioned to a maximum grid size of 128^3 points and a Miller-index range of ±60. These arrays may be altered by simply changing the values of NPAR and MH in the PARAMETER statements of the program.