Showing papers on "Rader's FFT algorithm" published in 1983


Journal Article
TL;DR: The implementation of the FFT on vector computers is described, and in the final section it is demonstrated how savings can be achieved in the case of two-dimensional transforms.

183 citations


Journal Article
TL;DR: A decimation-in-time radix-2 fast Fourier transform (FFT) algorithm is considered here for implementation in multiprocessors with shared bus, multistage interconnection network (MIN), and in mesh connected computers.
Abstract: A decimation-in-time radix-2 fast Fourier transform (FFT) algorithm is considered here for implementation in multiprocessors with shared bus, multistage interconnection network (MIN), and in mesh connected computers. Results are derived for data allocation, interprocessor communication, approximate computation time, and speedup of an N-point FFT on any P available processing elements (PE's). Further generalization is obtained for a radix-r FFT algorithm. An N × N-point two-dimensional discrete Fourier transform (DFT) implementation is also considered when one or more rows of the input data matrix are allocated to each PE.
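The abstract names the algorithm without spelling it out. As a rough illustration only, a serial decimation-in-time radix-2 FFT might look like the sketch below. The function names `fft_dit_radix2` and `dft_naive` are hypothetical, and nothing here models the shared-bus, MIN, or mesh allocation that the paper analyzes; it only shows the even/odd split and the butterfly that such schemes distribute across PEs.

```python
import cmath


def fft_dit_radix2(x):
    """Recursive decimation-in-time radix-2 FFT.

    Assumes len(x) is a power of two (an assumption of this sketch).
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Decimation in time: split the input into even- and odd-indexed samples.
    even = fft_dit_radix2(x[0::2])
    odd = fft_dit_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-length transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft_naive(x):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]
```

Comparing `fft_dit_radix2(x)` against `dft_naive(x)` on a short power-of-two input is a quick sanity check; a unit impulse should transform to a flat spectrum.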

27 citations


Journal Article
01 Oct 1983
TL;DR: A description of the parallelism in the radix-2 pipeline FFT is presented, and it is shown that to obtain the required processing rate further parallel processing is necessary.
Abstract: The advantages of using digital convolution to implement a particular pulse compression radar filter are outlined. Using the bandwidth of the given filter, a simple calculation of the required computation rate indicates that considerable parallel computation would be necessary using existing integrated circuits. Some methods of computing the DFT are given and the FFT algorithm is chosen since its regular structure and in-place computation facilitate parallel computation. A description of the parallelism in the radix-2 pipeline FFT is presented, and it is shown that to obtain the required processing rate further parallel processing is necessary. By computing n butterflies in parallel at each stage of the FFT, a family of parallel pipeline FFT processors is developed. Using the new architectures allows an increase in the processing speed while retaining the simple structure of the pipeline FFT. It is shown that for each value of N and n there are four canonic forms of equivalent computational complexity, but with different structures. The four forms arise from the two types of DIT FFT algorithm and the two methods of selecting the order in which the butterflies are computed. The connection between the FFT algorithm and the binary m-cube array is given, and is used to show by an example how the architectures presented fit between the normal pipeline FFT and the array processor in the amount of parallel computation involved. The radix-4 pipeline FFT is described and it is shown that this structure can also be parallelized in a similar way to the radix-2 pipeline FFT. The amount of hardware required to implement digital convolution using these architectures is discussed and examples are given. The balance between logic speed and integration density and the problems of interconnecting the computational elements are also discussed.
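To make "digital convolution" via an in-place FFT concrete, here is a minimal software sketch, assuming real-valued inputs and zero-padding to a power-of-two length. The names `fft_inplace` and `fft_convolve` are hypothetical, and the code is a plain serial routine, not the parallel pipeline architecture the paper develops.

```python
import cmath


def fft_inplace(a, inverse=False):
    """Iterative radix-2 FFT that overwrites the list `a` (power-of-two length)."""
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    sign = 1 if inverse else -1
    size = 2
    while size <= n:  # one pass per FFT stage
        w_step = cmath.exp(sign * 2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(size // 2):
                # One butterfly: reads and writes the same two locations.
                u = a[start + k]
                v = a[start + k + size // 2] * w
                a[start + k] = u + v
                a[start + k + size // 2] = u - v
                w *= w_step
        size *= 2
    if inverse:
        for i in range(n):
            a[i] /= n
    return a


def fft_convolve(signal, kernel):
    """Linear convolution of two real sequences via zero-padded FFTs."""
    out_len = len(signal) + len(kernel) - 1
    n = 1
    while n < out_len:
        n *= 2
    a = [complex(v) for v in signal] + [0j] * (n - len(signal))
    b = [complex(v) for v in kernel] + [0j] * (n - len(kernel))
    fft_inplace(a)
    fft_inplace(b)
    prod = [p * q for p, q in zip(a, b)]  # pointwise product in frequency
    fft_inplace(prod, inverse=True)
    return [v.real for v in prod[:out_len]]
```

Within each stage of `fft_inplace` the butterflies touch disjoint pairs of elements, which is the regularity the abstract credits with making parallel computation straightforward.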

26 citations


Journal Article
TL;DR: It is shown that, under a VLSI model of computation, such a design requires the same asymptotic area and attains the same throughput as the corresponding network for the evaluation of a single N-element FFT.
Abstract: A network for the evaluation of the fast Fourier transform (FFT) is presented. Such a network is able to compute, in parallel, the FFTs of arbitrary partitions in powers of two of the N input elements. It is shown that, under a VLSI model of computation, such a design requires the same asymptotic area and attains the same throughput as the corresponding network for the evaluation of a single N-element FFT.

13 citations


Journal Article
TL;DR: A fast algorithm for an N-point discrete cosine transform (DCT) is derived from a 4N-point Winograd Fourier transform algorithm (WFTA), suitable for a high-speed implementation using one-bit systolic arrays.
Abstract: A fast algorithm for an N-point discrete cosine transform (DCT) is derived from a 4N-point Winograd Fourier transform algorithm (WFTA). This algorithm, which has the same form as Winograd's Fourier transform and convolution algorithms, is suitable for a high-speed implementation using one-bit systolic arrays.
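The abstract does not give the derivation. One standard way to see why a 4N-point DFT can yield an N-point DCT is the embedding sketched below, which may or may not match the paper's own construction: the input is placed at the odd indices of a length-4N sequence, mirrored symmetrically, and the first N real parts of the DFT are halved. The function names are hypothetical, and a naive DFT stands in for the WFTA, so this checks the identity only and says nothing about multiplication counts or systolic arrays.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT; a stand-in for any 4N-point Fourier transform."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def dct2_via_4n_dft(x):
    """Unnormalized DCT-II of x, obtained from a 4N-point DFT."""
    n = len(x)
    y = [0.0] * (4 * n)
    for i, v in enumerate(x):
        y[2 * i + 1] = v            # samples at odd indices
        y[4 * n - (2 * i + 1)] = v  # mirrored copy makes the spectrum real
    spectrum = dft(y)
    # Each mirrored pair contributes 2*x[i]*cos(pi*(2i+1)*k/(2N)), hence the /2.
    return [spectrum[k].real / 2 for k in range(n)]


def dct2_direct(x):
    """Unnormalized DCT-II computed straight from its definition."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        for k in range(n)
    ]
```

The two functions should agree to rounding error for any input length, since the naive DFT places no power-of-two restriction on N.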

9 citations


Dissertation
01 Jan 1983
TL;DR: It is demonstrated that Winograd's cyclic convolution and Fourier Transform Algorithms, together with Nussbaumer's two-dimensional cyclic convolution algorithms, have a common general form.
Abstract: Many of the techniques for the computation of a two-dimensional convolution of a small fixed window with a picture are reviewed. It is demonstrated that Winograd's cyclic convolution and Fourier Transform Algorithms, together with Nussbaumer's two-dimensional cyclic convolution algorithms, have a common general form. Many of these algorithms use the theoretical minimum number of general multiplications. A novel implementation of these algorithms is proposed which is based upon one-bit systolic arrays. These systolic arrays are networks of identical cells with each cell sharing a common control and timing function. Each cell is only connected to its nearest neighbours. These are all attractive features for implementation using Very Large Scale Integration (VLSI). The throughput rate is only limited by the time to perform a one-bit full addition. In order to assess the usefulness of these systolic arrays, a 'cost function' is developed to compare them with more conventional techniques, such as the Cooley-Tukey radix-2 Fast Fourier Transform (FFT). The cost function shows that these systolic arrays offer a good way of implementing the Discrete Fourier Transform for transforms up to about 30 points in length. The cost function is a general tool and allows comparisons to be made between different implementations of the same algorithm and between dissimilar algorithms. Finally, a technique is developed for the derivation of Discrete Cosine Transform (DCT) algorithms from the Winograd Fourier Transform Algorithm. These DCT algorithms may be implemented by modified versions of the systolic arrays proposed earlier, but requiring half the number of cells.

3 citations


Proceedings Article
01 Apr 1983
TL;DR: Several theoretical results concerning the discrete Fourier transform are derived and these are used to obtain an efficient algorithm for extending the range of lengths of a multi-dimensional convolver or correlator based on a transform processor or program.
Abstract: Several theoretical results concerning the discrete Fourier transform are derived. These are then used to obtain an efficient algorithm for extending the range of lengths of a multi-dimensional convolver or correlator based on a transform processor or program. Methods of implementing this algorithm in hardware and software are also considered.

2 citations


Proceedings Article
16 Jun 1983
TL;DR: The application of Winograd's algorithm to OTF calculation is reported on and it is compared with some other methods of computing the OTF.
Abstract: Since the advent of the Cooley-Tukey FFT algorithm, many optical designers who have used the method to compute the optical transfer function with desk-top computers would welcome the availability of even faster algorithms. There has, of course, been a steady improvement in FFT techniques in the past decade or so, but it seems that Winograd's algorithm is the most encouraging yet. In this paper we report on the application of this new algorithm to OTF calculation and compare it with some other methods of computing the OTF. © (1983) COPYRIGHT SPIE, The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.