
Showing papers on "Split-radix FFT algorithm" published in 1992


Journal Article
TL;DR: A low cost real-time synthesizer design allowing processing of recorded and live sounds, synthesis of instruments and synthesis of speech and the singing voice is proposed.
Abstract: We present a new additive synthesis method based on spectral envelopes and inverse Fast Fourier Transform (FFT$^{-1}$). User control is facilitated by the use of spectral envelopes to describe the characteristics of the short term spectrum of the sound in terms of sinusoidal and noise components. Such characteristics can be given by users or obtained automatically from natural sounds. Use of the inverse FFT reduces the computation cost by a factor on the order of 15 compared to oscillators. We propose a low cost real-time synthesizer design allowing processing of recorded and live sounds, synthesis of instruments and synthesis of speech and the singing voice.

93 citations


Journal Article
TL;DR: This work pads the set of spectral coefficients with zeros and takes an FFT of length 3N to interpolate the Chebyshev series to a very fine grid, and applies either the Mth order Euler sum acceleration or (2M + 1)-point Lagrangian interpolation to approximate the sum of the series on the irregular grid.
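The zero-padding step in this summary can be illustrated with a minimal NumPy sketch. It assumes a Chebyshev series with N real coefficients and evaluates it on the fine grid x_j = cos(pi j / M), taking M = 3N to echo the "length 3N" mentioned above. The function name and the exact padding length are assumptions made for illustration; the Euler acceleration and Lagrangian interpolation steps for the irregular grid are not shown.

```python
import numpy as np
from numpy.polynomial import chebyshev


def cheb_values_on_fine_grid(coeffs, m):
    """Evaluate sum_n coeffs[n] * T_n(x) at x_j = cos(pi * j / m), j = 0..m.

    Since T_n(cos(theta)) = cos(n * theta), the values are a cosine sum,
    which is the real part of an FFT of the zero-padded coefficients.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    if coeffs.size > 2 * m:
        raise ValueError("too many coefficients for the requested grid")
    padded = np.zeros(2 * m)
    padded[: coeffs.size] = coeffs          # zero-pad the spectral coefficients
    return np.fft.fft(padded).real[: m + 1]


coeffs = np.array([1.0, -0.5, 0.25, 0.125])   # illustrative coefficients
m = 3 * coeffs.size                            # fine grid, assumed M = 3N
fine_values = cheb_values_on_fine_grid(coeffs, m)
grid = np.cos(np.pi * np.arange(m + 1) / m)
reference = chebyshev.chebval(grid, coeffs)    # direct evaluation for comparison
```

The FFT result and the direct evaluation agree to rounding error, so the padded transform is only a faster route to the same fine-grid values.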

71 citations


Journal Article
TL;DR: The sliding fast Fourier transform is reviewed and is shown to have the computational complexity of $N$ complex multiplications per sample, as opposed to the well-cited assumption of $(N/2) \log_2 N$ complex multiplications per sample.
Abstract: The sliding fast Fourier transform (FFT) is reviewed and is shown to have the computational complexity of $N$ complex multiplications per sample, as opposed to the well-cited assumption of $(N/2) \log_2 N$ complex multiplications per sample reported in a book by L.R. Rabiner and B. Gold (1975).
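The N-multiplications-per-sample figure can be seen in a minimal sketch of the standard sliding DFT recurrence, in which each bin is updated from its previous value when the window advances by one sample. This illustrates the general idea only and is not necessarily the formulation used in the paper; the function name is hypothetical.

```python
import numpy as np


def sliding_dft(signal, n):
    """Return the length-n DFT of every window signal[s:s+n], s = 0, 1, ...

    After the first window, each new spectrum costs n complex
    multiplications: one twiddle multiplication per bin.
    """
    signal = np.asarray(signal, dtype=complex)
    twiddle = np.exp(2j * np.pi * np.arange(n) / n)
    spectrum = np.fft.fft(signal[:n])           # initial window, ordinary FFT
    spectra = [spectrum]
    for start in range(len(signal) - n):
        # Drop the oldest sample, add the newest, then rotate every bin.
        spectrum = (spectrum - signal[start] + signal[start + n]) * twiddle
        spectra.append(spectrum)
    return np.array(spectra)


rng = np.random.default_rng(0)
samples = rng.standard_normal(40)
all_spectra = sliding_dft(samples, 8)
```

Because the recurrence accumulates rounding error over many updates, practical implementations often recompute a full FFT periodically.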

67 citations


Journal Article
01 Nov 1992
TL;DR: An implementation of the Cooley-Tukey complex-to-complex FFT on the Connection Machine is described, which is designed to make effective use of the communications bandwidth of the architecture, its memory bandwidth, and storage with precomputed twiddle factors.
Abstract: We describe an implementation of the Cooley-Tukey complex-to-complex FFT on the Connection Machine. The implementation is designed to make effective use of the communications bandwidth of the architecture, its memory bandwidth, and storage with precomputed twiddle factors. The peak data motion rate that is achieved for the interprocessor communication stages is in excess of 7 Gbytes/s for a Connection Machine system CM-200 with 2048 floating-point processors. The peak rate of FFT computations local to a processor is 12.9 Gflops/s in 32-bit precision, and 10.7 Gflops/s in 64-bit precision. The same FFT routine is used to perform both one- and multi-dimensional FFTs without any explicit data rearrangement. The peak performance for a one-dimensional FFT on data distributed over all processors is 5.4 Gflops/s in 32-bit precision and 3.2 Gflops/s in 64-bit precision. The peak performance for square, two-dimensional transforms is 3.1 Gflops/s in 32-bit precision, and for cubic, three-dimensional transforms the peak is 2.0 Gflops/s in 64-bit precision. Certain oblong shapes yield better performance. The number of twiddle factors stored in each processor is $P/2N + \log_2 N$ for an FFT on $P$ complex points uniformly distributed among $N$ processors. To achieve this level of storage efficiency we show that a decimation-in-time FFT is required for normal order input, and a decimation-in-frequency FFT is required for bit-reversed input order.

57 citations


Journal Article
TL;DR: In this article, efficient methods for mapping odd-length type-II, type-III, and type-IV DCTs to a real-valued DFT are presented; for types II and III the mapping requires permutations and sign changes only.
Abstract: Efficient methods for mapping odd-length type-II, type-III, and type-IV DCTs to a real-valued DFT are presented. It is found that odd-length type-II and type-III DCTs can be transformed, by means of an index mapping, to a real-valued DFT of the same length using permutations and sign changes only. The real-valued DFT can then be computed by efficient real-valued FFT algorithms such as the prime factor algorithm. Similar mapping is introduced to convert a type-IV DCT to a real-valued DFT up to a scaling factor and some additions. Methods for computing DCTs with even lengths are also discussed.

57 citations


Journal Article
TL;DR: By introducing a general approach for constructing the fast Hartley transform (FHT) from the corresponding FFT, new vector- and split-vector-radix FHT algorithms with the same desirable properties as their FFT counterparts are obtained.
Abstract: The split-radix approach for computing the discrete Fourier transform (DFT) is extended for the vector-radix fast Fourier transform (FFT) to two and higher dimensions. It is obtained by further splitting the $(N/2 \times N/2)$ transforms with twiddle factors in the radix-$(2 \times 2)$ FFT algorithm. The generalization of this split vector-radix FFT algorithm to higher radices and higher dimensions is also presented. By introducing a general approach for constructing the fast Hartley transform (FHT) from the corresponding FFT, new vector- and split-vector-radix FHT algorithms with the same desirable properties as their FFT counterparts are obtained.
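The link between the Hartley and Fourier transforms that such constructions rely on can be sketched in one dimension. For a real input, the discrete Hartley transform is the real part of the DFT minus its imaginary part, and the DFT can be recovered from the even and odd parts of the Hartley spectrum. This is the textbook relation, shown as a minimal sketch with hypothetical function names; it is not the paper's vector-radix construction.

```python
import numpy as np


def dht_from_fft(x):
    """Discrete Hartley transform of a real sequence, computed via an FFT."""
    spectrum = np.fft.fft(np.asarray(x, dtype=float))
    return spectrum.real - spectrum.imag


def dft_from_dht(h):
    """Recover the DFT of the real input from its Hartley spectrum."""
    h = np.asarray(h, dtype=float)
    h_rev = np.roll(h[::-1], 1)             # h_rev[k] = h[(-k) mod N]
    even = 0.5 * (h + h_rev)                # equals Re X[k]
    odd = 0.5 * (h - h_rev)                 # equals -Im X[k]
    return even - 1j * odd


rng = np.random.default_rng(1)
x = rng.standard_normal(16)
hartley = dht_from_fft(x)
```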

56 citations


Journal Article
TL;DR: In this paper, a generalized prime factor fast Fourier transform (FFT) algorithm is proposed, which can be simultaneously self-sorting and in-place and has a lower operation count than conventional FFT algorithms.
Abstract: Prime factor fast Fourier transform (FFT) algorithms have two important advantages: they can be simultaneously self-sorting and in-place, and they have a lower operation count than conventional FFT algorithms. The major disadvantage of the prime factor FFT has been that it was only applicable to a limited set of values of the transform length N. This paper presents a generalized prime factor FFT, which is applicable for any $N = 2^p 3^q 5^r$, while maintaining both the self-sorting in-place capability and the lower operation count. Timing experiments on the Cray Y-MP demonstrate the advantages of the new algorithm.
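For background, the classical (Good-Thomas) prime factor mapping that this abstract generalizes can be sketched as follows. The sketch assumes a length N = n1 * n2 with coprime factors and uses Chinese-remainder index maps so that no twiddle factors are needed between the two stages. It is out-of-place and does not reproduce the self-sorting in-place scheme of the paper; the function name is hypothetical and `pow(a, -1, m)` requires Python 3.8 or later.

```python
from math import gcd

import numpy as np


def prime_factor_fft(x, n1, n2):
    """DFT of length n1 * n2 (n1, n2 coprime) via Good-Thomas index maps."""
    x = np.asarray(x, dtype=complex)
    n = n1 * n2
    if x.size != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1, n2")
    i1 = np.arange(n1)[:, None]
    i2 = np.arange(n2)[None, :]
    # Input map: n = (n2 * i1 + n1 * i2) mod N.
    input_index = (n2 * i1 + n1 * i2) % n
    # Output map (CRT): k = i1 mod n1 and k = i2 mod n2.
    output_index = (i1 * n2 * pow(n2, -1, n1) + i2 * n1 * pow(n1, -1, n2)) % n
    # A plain 2-D DFT on the remapped grid; no twiddle factors in between.
    grid = np.fft.fft2(x[input_index])
    result = np.empty(n, dtype=complex)
    result[output_index] = grid
    return result


rng = np.random.default_rng(2)
data = rng.standard_normal(60) + 1j * rng.standard_normal(60)
spectrum = prime_factor_fft(data, 4, 15)     # 60 = 2^2 * 3 * 5
```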

54 citations


Journal Article
01 Aug 1992
TL;DR: The moving fast Fourier transform (MFFT) algorithms developed in the paper apply to the particular case where the window is moved one data point along the signal between successive transforms, using less computation than in directly evaluating the new transform with the FFT algorithm.
Abstract: A common approach to signal or image processing using the discrete Fourier transform (DFT) is to extract a portion of the signal by windowing, and then to form the DFT of the window contents. By moving the window appropriately, the entire signal may be covered. The moving fast Fourier transform (MFFT) algorithms developed in the paper apply to the particular case where the window is moved one data point along the signal between successive transforms. The MFFT ‘updates’ the DFT to reflect the new window contents, using less computation than in directly evaluating the new transform with the FFT algorithm. The MFFT has computational order $N$ in 1-D and $N^2$ in 2-D, a factor of $\log_2 N$ improvement over the FFT. MFFT algorithms are derived for use with the boxcar, split-triangular, Hanning, Hamming and Blackman windows. Generalisation to piecewise linear and piecewise polynomial windows is discussed.
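One reason cosine-sum windows such as the Hanning window fit an update scheme is that they can be applied after the transform. The minimal sketch below shows the standard identity for a periodic Hann window, w[n] = 0.5 - 0.5 cos(2 pi n / N): the windowed spectrum is a three-point combination of neighbouring boxcar-window bins. This is offered as an illustration under that window definition, not as the paper's derivation; the function name is hypothetical.

```python
import numpy as np


def hann_from_boxcar(boxcar_spectrum):
    """Spectrum of the Hann-windowed data from the unwindowed (boxcar) DFT."""
    x = np.asarray(boxcar_spectrum, dtype=complex)
    # np.roll(x, 1)[k] is X[k-1]; np.roll(x, -1)[k] is X[k+1] (circular).
    return 0.5 * x - 0.25 * (np.roll(x, 1) + np.roll(x, -1))


rng = np.random.default_rng(3)
frame = rng.standard_normal(32)
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(32) / 32)
windowed_spectrum = hann_from_boxcar(np.fft.fft(frame))
```

Combined with a one-sample boxcar update, this costs a few extra additions per bin instead of a new windowed FFT.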

51 citations


Journal Article
TL;DR: In this paper, an application-specific architecture for the parallel calculation of the decimation in time and radix 2 fast Hartley (FHT) and Fourier (FFT) transforms is presented.
Abstract: An application-specific architecture for the parallel calculation of the decimation-in-time, radix-2 fast Hartley (FHT) and Fourier (FFT) transforms is presented. A real sequence with $N = 2^n$ data items is considered as input. The system calculates the FHT and the FFT in $n$ and $n+1$ stages, respectively. The modular and regular parallel architecture is based on a constant geometry algorithm using butterflies of four data items and the perfect unshuffle permutation. With this permutation, the mapping of the algorithm in VLSI technology is simplified and the communications among processors are minimized. Organization of the processor memory based on first-in, first-out (FIFO) queues facilitates a systolic data flow and permits the implementation in a direct way of the complex data movements and address sequences of the transforms. This is accomplished by means of simple multiplexing operations, using hardwired control. The total calculation time is $(N \log_2 N)/4Q$ cycles for the FHT and $N(1 + \log_2 N)/4Q$ cycles for the FFT, where $Q$ is the number of processors ($Q = 2^q$, Q >

34 citations


Journal Article
TL;DR: A method for converting any nesting DFT algorithm to the type-I discrete W transform (DWT-I) is introduced and is more efficient than either WFTA or PFA for large N, and it is more flexible for the choice of transform length.
Abstract: A method for converting any nesting DFT algorithm to the type-I discrete W transform (DWT-I) is introduced. A nesting algorithm that differs from either the Winograd Fourier transform algorithm (WFTA) or the prime factor FFT algorithm (PFA) is presented. New small-N DFTs, which are suitable for this nesting algorithm, are developed based on using sparse matrix decomposition. The proposed algorithm is more efficient than either WFTA or PFA for large N, and it is more flexible for the choice of transform length, because 32 points are used. For 2D processing, the proposed algorithm is more efficient than the polynomial transform.

23 citations


Journal Article
TL;DR: A fast backprojection method through the use of interpolated fast Fourier transform (FFT) is presented, which allows the arbitrary control of the frequency characteristics.
Abstract: A fast backprojection method through the use of interpolated fast Fourier transform (FFT) is presented. The computerized tomography (CT) reconstruction by the convolution backprojection (CBP) method has produced precise images. However, the backprojection part of the conventional CBP method is not very efficient. The authors propose an alternative approach to interpolating and backprojecting the convolved projections onto the image frame. First, the upsampled Fourier series expansion of the convolved projection is calculated. Then, using a Gaussian function, it is projected by the aliasing-free interpolation of FFT bins onto a rectangular grid in the frequency domain. The total amount of computation in this procedure for a 512×512 image is 1/5 of the conventional backprojection method with linear interpolation. This technique also allows the arbitrary control of the frequency characteristics.

Journal Article
TL;DR: An efficient algorithm that places an optimized dependence graph for the $2^n$-point DFT computation is proposed; a one-dimensional DFT is turned into a multidimensional DFT consisting of a few short DFTs, based on a version of the Goertzel algorithm via Horner's rule, and the resulting data order is well suited for vector processors and parallel machines such as the hypercube.
Abstract: An efficient algorithm that places an optimized DG (dependence graph) for $2^n$ points of the discrete Fourier transform (DFT) computation is proposed. A one-dimensional DFT is turned into a multidimensional DFT, consisting of a few short DFTs, which is based on the version of the Goertzel algorithm via Horner's rule. The data sequences in the Cooley-Tukey FFT algorithm are in an order that is easily manageable and well suited for vector processors and any parallel machine such as hypercube.
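The Goertzel-via-Horner building block mentioned here can be sketched for a single DFT bin. The first function evaluates the bin as a polynomial in the twiddle factor using Horner's rule; the second uses the usual second-order real recurrence of the Goertzel algorithm. Both are textbook forms shown as a minimal sketch with hypothetical function names, not the dependence-graph construction of the paper.

```python
import cmath
import math


def dft_bin_horner(x, k):
    """X[k] = sum_n x[n] * w**n with w = exp(-2j*pi*k/N), by Horner's rule."""
    n = len(x)
    w = cmath.exp(-2j * math.pi * k / n)
    acc = 0j
    for sample in reversed(x):
        acc = acc * w + sample
    return acc


def dft_bin_goertzel(x, k):
    """Same bin via the second-order Goertzel recurrence."""
    n = len(x)
    omega = 2 * math.pi * k / n
    coeff = 2 * math.cos(omega)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in x:
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Final complex step turns the two state values into the bin value.
    return cmath.exp(1j * omega) * s_prev - s_prev2


signal = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.0, 1.5]
bin_values = [dft_bin_goertzel(signal, k) for k in range(len(signal))]
```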

Journal Article
TL;DR: A linear systolic array for fast Fourier transform (FFT) computation is proposed that is based on the Pease algorithm, which has the advantage of making the systolic array structure uniform from stage to stage.
Abstract: The authors propose a linear systolic array for fast Fourier transform (FFT) computation that is based on the Pease algorithm, which has the advantage of making the systolic array structure uniform from stage to stage. With slight modifications the algorithm can be directly implemented on a systolic array. The array needs only $\log_2 n$ processors, where $n$ is the number of input words (length of the FFT). It processes data generated at a speed twice the rate of the processor clock.

Journal Article
TL;DR: The influence of random instabilities in the sampling instants on spectral estimation by the fast Fourier transform (FFT) of harmonic, stochastic processes is considered.
Abstract: The influence of random instabilities in the sampling instants on spectral estimation by the fast Fourier transform (FFT) of harmonic, stochastic processes is considered. The degradation due to the deviation from a uniform sampling is presented by explicit formulas. This degradation, a decrease of the desired signal and an increase of the sidelobe noise, is expressed in terms of the characteristic function of the jitter's distribution.
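The role of the characteristic function can be illustrated with a small simulation. The sketch below makes several assumptions that are not taken from the paper: a single complex exponential, independent zero-mean Gaussian jitter with standard deviation sigma on every sampling instant, and a tone frequency that falls exactly on an FFT bin. Under these assumptions the expected peak amplitude is scaled by the magnitude of the Gaussian characteristic function, exp(-(2 pi f sigma)^2 / 2). All names and parameter values are hypothetical.

```python
import numpy as np


def jittered_peak_gain(freq_hz, sample_rate_hz, jitter_std_s, n=4096, seed=0):
    """Measured FFT peak amplitude of a unit tone sampled with timing jitter."""
    rng = np.random.default_rng(seed)
    t_ideal = np.arange(n) / sample_rate_hz
    t_actual = t_ideal + rng.normal(0.0, jitter_std_s, size=n)
    samples = np.exp(2j * np.pi * freq_hz * t_actual)
    spectrum = np.fft.fft(samples) / n
    bin_index = int(round(freq_hz * n / sample_rate_hz))
    return abs(spectrum[bin_index])


def predicted_gain(freq_hz, jitter_std_s):
    """|characteristic function| of Gaussian jitter at the tone frequency."""
    return np.exp(-0.5 * (2 * np.pi * freq_hz * jitter_std_s) ** 2)


measured = jittered_peak_gain(128.0, 1024.0, 1e-3)
expected = predicted_gain(128.0, 1e-3)
```

The energy removed from the peak reappears as a raised noise floor across the other bins, which corresponds to the sidelobe noise increase described in the abstract.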

Journal Article
TL;DR: A fast discrete Fourier transform (DFT) computing algorithm used in situations where part of the data is zero and only the first transform elements are to be calculated is proposed.
Abstract: A fast discrete Fourier transform (DFT) computing algorithm used in situations where part of the data is zero and only the first transform elements are to be calculated is proposed. The method is based on the pruning of a split-radix decimation-in-time (DIT) fast Fourier transform (FFT) diagram. It has the advantage of providing gains as a result of pruning computation and the use of a split radix.

Journal Article
TL;DR: A method for speedy computation of the autocorrelation coefficients used by linear predictive coding (LPC) that uses the Fermat number transform (FNT) is described; a fast computational algorithm for the FNT exists whose computational structure is similar to that of the fast Fourier transform (FFT).
Abstract: A method for speedy computation of the autocorrelation coefficients used by linear predictive coding (LPC) that uses the Fermat number transform (FNT) is described. It is found that there exists a fast computational algorithm for FNT which has a computational structure similar to the fast Fourier transform (FFT). Since the fast Fermat number transform (FFNT) and FFT have similar computational structures, readily available FFT VLSI hardware structures may be adopted for real-time implementation of the FFNT. A verification of the FFNT on an MC 68000 single-board computer has been performed with quite satisfactory results.
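How a Fermat number transform yields exact autocorrelation coefficients can be sketched with a toy example. The sketch assumes the Fermat prime 257 with root 2 (which has multiplicative order 16 modulo 257) and a transform length of 16; these parameters are chosen for illustration and are not taken from the paper. For brevity the transform is evaluated directly from its definition rather than with the fast FFT-like structure the abstract refers to, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
FERMAT_PRIME = 257   # F_3 = 2**(2**3) + 1
ROOT = 2             # 2 has multiplicative order 16 modulo 257
LENGTH = 16          # transform length equals the order of ROOT


def fnt(values, root=ROOT):
    """Direct (O(N^2)) number theoretic transform modulo the Fermat prime."""
    return [
        sum(v * pow(root, n * k, FERMAT_PRIME) for n, v in enumerate(values)) % FERMAT_PRIME
        for k in range(LENGTH)
    ]


def inverse_fnt(values):
    inv_root = pow(ROOT, -1, FERMAT_PRIME)
    inv_len = pow(LENGTH, -1, FERMAT_PRIME)
    return [(inv_len * c) % FERMAT_PRIME for c in fnt(values, inv_root)]


def autocorrelation(samples):
    """Exact integer autocorrelation r[0..L-1]; results must fit in [-128, 128]."""
    if 2 * len(samples) - 1 > LENGTH:
        raise ValueError("sequence too long for the chosen transform length")
    padded = list(samples) + [0] * (LENGTH - len(samples))   # avoid wrap-around
    spectrum = fnt(padded)
    # Correlation theorem: R[k] = X[k] * X[-k].
    power = [spectrum[k] * spectrum[-k % LENGTH] % FERMAT_PRIME for k in range(LENGTH)]
    circular = inverse_fnt(power)
    signed = [c - FERMAT_PRIME if c > FERMAT_PRIME // 2 else c for c in circular]
    return signed[: len(samples)]


lags = autocorrelation([1, 2, 3])
```

Because all arithmetic is modular integer arithmetic, there is no rounding error; the price is that inputs and results must stay within the range the modulus can represent.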

Journal Article
01 Feb 1992
TL;DR: The proposed approach is based on the conventional three-loop indexing structure, in which redundancies associated with the indexing scheme have been removed at the expense of memory, and an increase in speed of up to 10% is achieved depending on the FFT sequence length.
Abstract: The paper is concerned with efficient computation of the one-butterfly in-place complex split-radix fast Fourier transform algorithm. The proposed approach is based on the conventional three-loop indexing structure, in which redundancies associated with the indexing scheme have been removed at the expense of memory. As a result an increase in speed of up to 10% is achieved depending on the FFT sequence length.
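For readers unfamiliar with the split-radix decomposition itself, a minimal recursive sketch follows. It splits a length-N transform into one length-N/2 transform of the even-indexed samples and two length-N/4 transforms of the samples at indices 1 and 3 modulo 4. This is an out-of-place illustration with a hypothetical function name; it does not reproduce the one-butterfly in-place three-loop indexing structure that the paper optimizes.

```python
import numpy as np


def split_radix_fft(x):
    """Recursive decimation-in-time split-radix FFT (length a power of two)."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return x.copy()
    if n == 2:
        return np.array([x[0] + x[1], x[0] - x[1]])
    even = split_radix_fft(x[0::2])      # length n/2
    odd1 = split_radix_fft(x[1::4])      # length n/4, indices 4m + 1
    odd3 = split_radix_fft(x[3::4])      # length n/4, indices 4m + 3
    k = np.arange(n // 4)
    t1 = np.exp(-2j * np.pi * k / n) * odd1
    t3 = np.exp(-2j * np.pi * 3 * k / n) * odd3
    total = t1 + t3
    diff = -1j * (t1 - t3)
    out = np.empty(n, dtype=complex)
    out[k] = even[k] + total
    out[k + n // 2] = even[k] - total
    out[k + n // 4] = even[k + n // 4] + diff
    out[k + 3 * n // 4] = even[k + n // 4] - diff
    return out


rng = np.random.default_rng(4)
block = rng.standard_normal(64) + 1j * rng.standard_normal(64)
block_spectrum = split_radix_fft(block)
```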

Proceedings Article
23 Mar 1992
TL;DR: A very efficient algorithm for computing the discrete Fourier transform (DFT) of real-symmetric input is presented, based on Bruun's algorithm, which achieves the same low arithmetic as the split-radix FFT for real-symmetric data, but has a structure that is as simple as the radix-2.
Abstract: A very efficient algorithm for computing the discrete Fourier transform (DFT) of real-symmetric input is presented. The algorithm is based on Bruun's algorithm where, except for the last stage, all twiddle factors are purely real. It is well-known that about half of the arithmetic operations and memory requirements can be removed when the input is real-valued. It may be assumed that another half of the computational and memory requirements can be eliminated when the input is real and symmetric. This is, however, impossible with a standard radix-2 fast Fourier transform (FFT), but can be achieved by the Bruun algorithm. The symmetries within the algorithm for real-symmetric input are exploited to remove about three fourths of the butterflies and memory locations. The algorithm presented achieves the same low arithmetic as the split-radix FFT for real-symmetric data, but has a structure that is as simple as the radix-2. The implementation on the TMS320C30 shows that the new algorithm fits a DSP processor very well. The program requires 0.51-0.60 ms to compute a length 1024 FFT with real-symmetric data.
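Why real-symmetric input needs only about half of the samples (and produces a purely real spectrum) can be seen from a direct cosine-sum evaluation. The sketch below assumes an even length N and the symmetry x[n] = x[N-n]; it is a plain O(N^2) illustration of the symmetry with a hypothetical function name, and it is not Bruun's algorithm.

```python
import numpy as np


def dft_real_symmetric(half):
    """DFT of an even-length real sequence with x[n] == x[N-n].

    `half` holds x[0], ..., x[N/2]; the result is real and has length N.
    """
    half = np.asarray(half, dtype=float)
    m = half.size - 1                     # m = N / 2
    if m < 1:
        raise ValueError("need at least x[0] and x[N/2]")
    n_total = 2 * m
    k = np.arange(n_total)
    n = np.arange(1, m)
    # Mirrored samples pair up, so each interior term is a doubled cosine.
    interior = 2.0 * np.cos(2 * np.pi * np.outer(k, n) / n_total) @ half[1:m]
    return half[0] + ((-1.0) ** k) * half[m] + interior


rng = np.random.default_rng(5)
half_samples = rng.standard_normal(9)                      # x[0..8], N = 16
full_sequence = np.concatenate([half_samples, half_samples[-2:0:-1]])
symmetric_spectrum = dft_real_symmetric(half_samples)
```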

Proceedings Article
16 Nov 1992
TL;DR: An efficient state space method for implementing the fast Fourier transform over rectangular windows is proposed for the cases when there is a large overlap between the consecutive input signals, called the generalized sliding FFT (GSFFT).
Abstract: An efficient state space method for implementing the fast Fourier transform over rectangular windows is proposed for the cases when there is a large overlap between the consecutive input signals. This is called the generalized sliding FFT (GSFFT). To minimize the computational complexity of the GSFFT, the intermediate results of the FFT structure that can be used in the next iterations are preserved. The complexity of this method is compared with that of the standard FFT. The GSFFT is then used to propose an efficient implementation of the frequency domain block least mean square adaptive (FBLMS) filters, which are known to be efficient when the filter length is large.

Proceedings Article
23 Mar 1992
TL;DR: A method, called twiddle-factor-shift, which combines the simplicity of interconnections and processor elements (PEs) of radix-2 fast Fourier transform (FFT) algorithms with the lower arithmetic complexity of higher radix FFTs is presented.
Abstract: A method, called twiddle-factor-shift, which combines the simplicity of interconnections and processor elements (PEs) of radix-2 fast Fourier transform (FFT) algorithms with the lower arithmetic complexity of higher radix FFTs is presented. The method is based on the linearity of the basic radix-2 operation and the data dependencies of the FFT. Twiddle-factor-shift means the cumulation or rotations of the complex samples every second or third stage of the FFT. This method offers an additional flexibility in the design of pipelined FFT architectures and leads to efficient PE and interconnection structures. An example of a modified radix-8 FFT architecture for a transformation length of N=256 that processes four samples in parallel and uses temporal permutation networks, which are optimal in the sense of latency, is given.

Proceedings Article
25 May 1992
TL;DR: This paper shows a fast implementation method of a two-dimensional (2D) filter based on fast convolution related to the fast Fourier transform (FFT) lookup table that gives an accurate selection of the desired frequency.
Abstract: This paper shows a fast implementation method of a two-dimensional (2D) filter. The filter design is based on fast convolution related to the fast Fourier transform (FFT) lookup table. This is then extended to the 2D FFT. The design and implementation of the system gives an accurate selection of the desired frequency. The system has a step advantage of a reduction in the operation time.

Proceedings Article
01 Jan 1992
TL;DR: In this paper, a 2D fast Fourier transform (FFT) algorithm for analyzing interferograms and the fringe shift is described, and the phase errors caused by nonlinear response of a detector and by a random noise are analyzed theoretically.
Abstract: A 2-D fast Fourier transform (FFT) algorithm for analyzing interferograms and the fringe shift is described. The phase errors caused by nonlinear response of a detector and by a random noise are analyzed theoretically. From the analysis, it is concluded that (1) the phase error due to the nonlinear response of a detector can be canceled by the proper filter window in the transform plane, and (2) the 2-D transform permits better separation of the desired information components from unwanted components than a 1-D transform. The relationship of 2-D FFT algorithm accuracy with factors such as the quantization of grey levels, spatial carrier frequency, spatial scanning direction, pixel array, form of the wavefront to be tested, etc., are discussed by analyzing a simulated ideal interferogram. An example of analyzing an actual interferogram and measuring the displacement of a piezoelectric transducer (PZT) device is given. In principle, the 2-D FFT algorithm can attain an accuracy of $\lambda/100$ to $\lambda/200$ under optimum parameter conditions.

Proceedings Article
10 May 1992
TL;DR: The automatic design of prime-length fast Fourier transforms (FFTs) based on S. Winograd's (1980) theory is described and a program is described that generates code for FFTs so that longer prime length FFTs that were formerly impractical to design are now easily generated.
Abstract: The automatic design of prime-length fast Fourier transforms (FFTs) based on S. Winograd's (1980) theory is described. A program is described that generates code for FFTs so that longer prime length FFTs that were formerly impractical to design are now easily generated. For those prime lengths ( >

Proceedings Article
10 May 1992
TL;DR: A 2D systolic array for performing the 2D N*N-point discrete Fourier transform (DFT) in a row-column-wise or column-row-wise format that reaches the lower bound in terms of the area-time complexity.
Abstract: Proposes a 2D systolic array for performing the 2D N×N-point discrete Fourier transform (DFT). This array is constructed based on the use of the Goertzel algorithm to realize the 2-D DFT in a row-column-wise or column-row-wise format. Unlike the conventional row-column decomposition method, the proposed system involves no matrix transposition problems. In addition, the system possesses the features of regularity, modularity, and concurrency. As a consequence, it is well suited to VLSI implementation and has a very high throughput of one 2-D transform per N cycles. Moreover, the utilization efficiency of the proposed system is 100%, and the latency (processing time for a single 2-D transform) is 4N-1 cycles. In terms of the area-time complexity, the proposed approach is a fast design and reaches the lower bound.

Proceedings Article
C. Lu, R. Tolimieri
23 Mar 1992
TL;DR: An algorithm is presented which overcomes the stability problem caused by division by $\sin(2\pi k/N)$ in the FFT computation of real symmetric and antisymmetric data sequences, and a similar algorithm is given for the translational complex conjugate symmetric data sequence.
Abstract: A previously proposed algorithm for the FFT (fast Fourier transform) computation of real symmetric and antisymmetric sequences reduced the N-point symmetric FFT computation to an N/4-point complex FFT computation, but the postprocessing involved division by $\sin(2\pi k/N)$. For large size N, this may cause stability problems. An algorithm is presented which overcomes the problem for real symmetric and antisymmetric data sequences. A similar algorithm is given for the translational complex conjugate symmetric data sequence.

Proceedings Article
11 Nov 1992
TL;DR: Hardware algorithms for one-dimensional fast Fourier transform (FFT) computation on an 8-neighbor processor array are presented and two data mapping methods and algorithms are shown: the algorithm for similarity allocation and the algorithm for superposition allocation.
Abstract: Hardware algorithms for one-dimensional fast Fourier transform (FFT) computation on an 8-neighbor processor array are presented. These algorithms achieve high-speed FFT computation by combining the radix 4 butterfly computation with the communication capabilities of the 8-neighbor processor array. Three algorithms are considered. Two data mapping methods and algorithms are shown: the algorithm for similarity allocation and the algorithm for superposition allocation. The radix 4 and the radix 2 FFT algorithms are compared and evaluated.

Proceedings Article
23 Mar 1992
TL;DR: This approach provides a method of detecting and synchronizing to the sequence, regardless of whether the sequence is at baseband or modulates a carrier, the relationship of sampling rate to bit rate, and whether the modulated sequence is regarded as a carrier which itself modulates a small-bandwidth, constant envelope signal.
Abstract: Maximal-length linear recursive sequences (M-sequences) over GF(2) are used for a variety of purposes in communications, test, and other signals. An M-sequence contains a unique characteristic phase at which the sequence is invariant under a decimation by two. This point may be detected using an algorithm whose principal computation is the fast Fourier transform (FFT). This approach provides a method of detecting and synchronizing to the sequence, regardless of whether the sequence is at baseband or modulates a carrier, the relationship of sampling rate to bit rate, and whether the modulated sequence is regarded as a carrier which itself modulates a small-bandwidth, constant envelope signal. Moreover, the technique provides a method for obtaining the polynomial generator of the sequence if it is not already known.

Proceedings Article
22 Jan 1992
TL;DR: The authors propose a parallel FFT (fast Fourier transform) architecture based on an N-point FFT decomposition, whose execution time is faster than that of a single FFT chip, and a fault-tolerant interconnection for the FFT butterfly network that considers the complexity of the number of primitive cells.
Abstract: The authors propose a parallel FFT (fast Fourier transform) architecture based on an N-point FFT decomposition. Performance evaluation is performed with respect to the total area of the architecture. It is clear that the architecture is simple and the execution time is faster than that of a single FFT chip. The authors also propose a fault-tolerant interconnection for the FFT butterfly network by considering the complexity of the number of primitive cells.

Journal Article
TL;DR: Based on some theorems of Number Theory, a new algorithm for computing the FFT (with power of two length) is proposed, which is recursive in nature, and thus the computation structure is rather regular.
Abstract: Since the discovery of the fast Fourier transform (FFT), many new FFT algorithms have been developed. Conventionally, the convolution-based approach deals commonly with the prime length discrete Fourier transforms. In this paper, based on some theorems of Number Theory, a new algorithm for computing the FFT (with power of two length) is proposed. This novel recursive algorithm contains three stages: the first and the last stages contain only additions and subtractions, and the second stage is of block diagonal form, with each block being a circular correlation/convolution matrix. The newly proposed convolution-based FFT algorithm has the following advantages: 1. In terms of computational counts, this algorithm can achieve the multiplicative lower bound derived by Winograd. 2. The proposed algorithm can easily be implemented in a parallel computing environment. 3. The proposed algorithm is recursive in nature, and thus the computation structure is rather regular.

Proceedings Article
01 Apr 1992
TL;DR: In this paper, a transform based on the WMMR filters was proposed for frequency analysis and spatial localization with a window width of 1/4 period or less, which can be used with impulsive noise of up to 40% and with random baseline shifts.
Abstract: One domain in which the ordering filters have not appeared is frequency analysis. Simultaneously one must note that the impulse rejection properties of the ordering filters could be very beneficial due to the lack of robustness of the DFT/FFT. Another problem with the DFT/FFT is the ambiguity of the estimate of frequency at a point (frequency localization). This paper introduces a transform (WMMR/MED/COUNT) that simultaneously solves both of the problems in some cases. The Gabor transform and various wavelet techniques have recently been reviewed as a substitute to FFT frequency analysis for spatial localization. While the Gabor transform optimally infers frequency content and spatial localization simultaneously, it suffers from the fact that it requires a full period within the window. This paper presents a transform based on the WMMR filters that will yield frequency analysis and spatial localization with a window width of 1/4 period or less. Experimentally, it has been shown that this technique can be used with impulsive noise of up to 40% and with random baseline shifts. The short-time Fourier, Gabor transform and the WMMR/MED/COUNT transforms (WMCT) are compared for their localization properties in noisy and noiseless situations.