
Showing papers on "Fast Fourier transform published in 1987"


Journal ArticleDOI
TL;DR: A new implementation of the real-valued split-radix FFT is presented, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT.
Abstract: This tutorial paper describes the methods for constructing fast algorithms for the computation of the discrete Fourier transform (DFT) of a real-valued series. The application of these ideas to all the major fast Fourier transform (FFT) algorithms is discussed, and the various algorithms are compared. We present a new implementation of the real-valued split-radix FFT, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT. We also compare the performance of inherently real-valued transform algorithms such as the fast Hartley transform (FHT) and the fast cosine transform (FCT) to real-valued FFT algorithms for the computation of power spectra and cyclic convolutions. Comparisons of these techniques reveal that the alternative techniques always require more additions than a method based on a real-valued FFT algorithm and result in computer code of equal or greater length and complexity.

489 citations
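
The real-valued FFT entry above rests on the Hermitian symmetry of the DFT of real data. As a minimal sketch of one well-known way to exploit that symmetry (not the split-radix implementation the paper presents), two real sequences can share a single complex transform. The helper names `dft` and `two_real_dfts` are illustrative assumptions, and the naive O(N^2) `dft` only stands in for a real FFT routine.

```python
import cmath


def dft(x):
    """Naive O(N^2) DFT; a stand-in for any complex FFT routine."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def two_real_dfts(a, b):
    """Return the DFTs of real sequences a and b using one complex DFT."""
    n = len(a)
    if len(b) != n:
        raise ValueError("sequences must have equal length")
    # Pack a into the real part and b into the imaginary part.
    z = dft([complex(p, q) for p, q in zip(a, b)])
    fa, fb = [], []
    for k in range(n):
        zc = z[(-k) % n].conjugate()
        fa.append((z[k] + zc) / 2)   # Hermitian part: DFT of a
        fb.append((z[k] - zc) / 2j)  # anti-Hermitian part over i: DFT of b
    return fa, fb
```

The separation works because the DFT of a real sequence satisfies A[N-k] = conj(A[k]), so the two spectra can be pulled apart with one addition and one subtraction per bin.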


Journal ArticleDOI
TL;DR: This paper presents a recursive algorithm for DCT with a structure that allows the generation of the next higher order DCT from two identical lower order DCT's.
Abstract: The discrete cosine transform (DCT) is widely applied in various fields, including image data compression, because it operates like the Karhunen-Loeve transform for stationary random data. This paper presents a recursive algorithm for DCT with a structure that allows the generation of the next higher order DCT from two identical lower order DCT's. As a result, the method for implementing this recursive DCT requires fewer multipliers and adders than other DCT algorithms.

483 citations


Book
01 Jan 1987
TL;DR: This book discusses the Discrete Fourier Transform (DFT) and a few applications of the DFT, as well as some of the techniques used in real sequences and the Real DFT.
Abstract: Preface 1. Introduction. A Bit of History An Application Problems 2. The Discrete Fourier Transform (DFT). Introduction DFT Approximation to the Fourier Transform The DFT-IDFT pair DFT Approximations to Fourier Series Coefficients The DFT from Trigonometric Approximation Transforming a Spike Train Limiting Forms of the DFT-IDFT Pair Problems 3. Properties of the DFT. Alternate Forms for the DFT Basic Properties of the DFT Other Properties of the DFT A Few Practical Considerations Analytical DFTs Problems 4. Symmetric DFTs. Introduction Real sequences and the Real DFT (RDFT) Even Sequences and the Discrete Cosine Transform (DCT) Odd Sequences and the Discrete Sine Transform (DST) Computing Symmetric DFTs Notes Problems 5. Multi-dimensional DFTs. Introduction Two-dimensional DFTs Geometry of Two-Dimensional Modes Computing Multi-Dimensional DFTs Symmetric DFTs in Two Dimensions Problems 6. Errors in the DFT. Introduction Periodic, Band-limited Input Periodic, Non-band-limited Input Replication and the Poisson Summation Formula Input with Compact Support General Band-Limited Functions General Input Errors in the Inverse DFT DFT Interpolation - Mean Square Error Notes and References Problems 7. A Few Applications of the DFT. Difference Equations - Boundary Value Problems Digital Filtering of Signals FK Migration of Seismic Data Image Reconstruction from Projections Problems 8. Related Transforms. Introduction The Laplace Transform The z-Transform The Chebyshev Transform Orthogonal Polynomial Transforms The Discrete Hartley Transform (DHT) Problems 9. Quadrature and the DFT. Introduction The DFT and the Trapezoid Rule Higher Order Quadrature Rules Problems 10. The Fast Fourier Transform (FFT). Introduction Splitting Methods Index Expansions (One ---> Multi-dimensional) Matrix Factorizations Prime Factor and Convolution Methods FFT Performance Notes Problems Glossary of (Frequently and Consistently Used) Notations References.

354 citations


Journal ArticleDOI
TL;DR: In this paper, the proposed filter is a truncated series expansion of the inverse of the operator that maps the object opacity function to hologram intensity; its first term is shown to be equivalent to conventional (optical) reconstruction, with successive terms increasingly suppressing the twin image.
Abstract: Digitally sampled in-line holograms may be linearly filtered to reconstruct a representation of the original object distribution, thereby decoding the information contained in the hologram. The decoding process is performed by digital computation rather than optically. Substitution of digital for optical decoding has several advantages, including selective suppression of the twin-image artifact, elimination of the far-field requirement, and automation of the data reduction and analysis process. The proposed filter is a truncated series expansion of the inverse of that operator that maps object opacity function to hologram intensity. The first term of the expansion is shown to be equivalent to conventional (optical) reconstruction, with successive terms increasingly suppressing the twin image. The algorithm is computationally efficient, requiring only a single fast Fourier transform pair.

223 citations


Journal ArticleDOI
Hou
TL;DR: Through use of the fast Hartley transform, discrete cosine transforms (DCT) and discrete Fourier transforms (DFT) can be obtained, and the recursive nature of the FHT algorithm derived in this paper enables us to generate the next higher order FHT from two identical lower order FHT's.
Abstract: The fast Hartley transform (FHT) is similar to the Cooley-Tukey fast Fourier transform (FFT) but performs much faster because it requires only real arithmetic computations compared to the complex arithmetic computations required by the FFT. Through use of the FHT, discrete cosine transforms (DCT) and discrete Fourier transforms (DFT) can be obtained. The recursive nature of the FHT algorithm derived in this paper enables us to generate the next higher order FHT from two identical lower order FHT's. In practice, this recursive relationship offers flexibility in programming different sizes of transforms, while the orderly structure of its signal flow-graphs indicates an ease of implementation in VLSI.

175 citations
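
The Hartley entry above states that the DFT can be obtained through the Hartley transform. A minimal sketch of that relationship follows, assuming the usual definitions H[k] = sum x[n] cas(2*pi*n*k/N) with cas(t) = cos(t) + sin(t), and a forward DFT with kernel exp(-2*pi*i*n*k/N). The function names are illustrative, and the naive O(N^2) `dht` is a stand-in for the paper's fast recursive algorithm, which is not reproduced here.

```python
import math


def dht(x):
    """Naive discrete Hartley transform of a real sequence (real arithmetic only)."""
    n = len(x)
    return [
        sum(
            x[t] * (math.cos(2 * math.pi * k * t / n) + math.sin(2 * math.pi * k * t / n))
            for t in range(n)
        )
        for k in range(n)
    ]


def dft_from_dht(h):
    """Recover the complex DFT of the original real sequence from its DHT."""
    n = len(h)
    out = []
    for k in range(n):
        hk, hm = h[k], h[(-k) % n]
        # Even part of H gives the cosine sum (real part of the DFT);
        # odd part gives the sine sum, which enters the DFT with a minus sign.
        out.append(complex((hk + hm) / 2, (hm - hk) / 2))
    return out
```

Only the final recombination step touches complex numbers, which is the source of the real-arithmetic advantage the abstract describes.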


Journal ArticleDOI
TL;DR: An algorithm for the in-place computation of the discrete Fourier transform on real data: a decimation-in-time split-radix algorithm, more compact than the previously published one and a new fast Hartley transform algorithm with a reduced number of operations.
Abstract: This paper highlights the possible tradeoffs between arithmetic and structural complexity when computing cyclic convolution of real data in the transform domain. Both Fourier and Hartley-based schemes are first explained in their usual form and then improved, either from the structural point of view or in the number of operations involved. Namely, we first present an algorithm for the in-place computation of the discrete Fourier transform on real data: a decimation-in-time split-radix algorithm, more compact than the previously published one. Second, we present a new fast Hartley transform algorithm with a reduced number of operations. A more regular convolution scheme based on FFT's is also proposed. Finally, we show that Hartley transforms belong to a larger class of algorithms characterized by their "generalized" convolution property.

131 citations
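
The entry above concerns cyclic convolution of real data computed in the transform domain. As a minimal sketch of the underlying idea only (transform, multiply bin by bin, inverse transform), the following uses a naive DFT in place of the paper's split-radix or Hartley algorithms; `dft` and `cyclic_convolve` are illustrative names.

```python
import cmath


def dft(x, sign=-1):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def cyclic_convolve(x, h):
    """Cyclic convolution of two equal-length real sequences via the DFT."""
    n = len(x)
    if len(h) != n:
        raise ValueError("sequences must have equal length")
    # Pointwise product in the transform domain equals cyclic convolution in time.
    product = [a * b for a, b in zip(dft(x), dft(h))]
    # Apply the 1/N normalization and discard the (numerically tiny) imaginary part.
    return [(v / n).real for v in dft(product, sign=+1)]
```

For example, `cyclic_convolve([1, 2, 3, 4], [1, 0, 0, 1])` yields approximately `[3, 5, 7, 5]`, because each output is x[n] plus the cyclically next sample x[n+1].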


Journal ArticleDOI
TL;DR: In this article, a fast Fourier transform (FFT) based iterative approach for computing the fields scattered by an infinite array of free-standing patches is presented, which is capable of handling patches that are lossy and have arbitrary shape; it is useful for analyzing configurations that may not have been analyzed previously.
Abstract: A fast Fourier transform (FFT)-based iterative approach for computing the fields scattered by an infinite array of free-standing patches is presented. The method is capable of handling patches that are lossy and have arbitrary shape; it is useful for analyzing configurations that may not have been analyzed previously. Though a rectangular FFT is used, the formulation allows the study of the common triangular array periodicities. Results for various geometries are presented and are compared with existing results.

112 citations


Journal ArticleDOI
TL;DR: The easily performed quantitative determination of the S, D, and O components allows the study of pharmacologically induced changes in the dynamic response characteristics of single visual cortical cells.
Abstract: The response characteristic of visual cortical cells to moving oriented stimuli consists mainly of directional (D) and orientational (O) components superimposed to a spontaneous activity (S). Commonly used polar plot diagrams reflect the maximal responses for different orientations and directions of stimulus movement with a periodicity of 360 degrees in the visual field. Fast Fourier analysis (FFT) is applied to polar plot data in order to determine the intermingled S, D, and O components. The zero order gain component of the spectrum corresponds to a (virtual) spontaneous activity. The first order component is interpreted as the strength of the direction selectivity and the second order component as the strength of the orientation specificity. The axes of the preferred direction and optimal orientation are represented by the respective phase values. Experimental data are well described with these parameters and relative changes of the shape of a polar plot can be detected with an accuracy better than 1%. The results are compatible with a model of converging excitatory and inhibitory inputs weighted according to the zero to second order components of the Fourier analysis. The easily performed quantitative determination of the S, D, and O components allows the study of pharmacologically induced changes in the dynamic response characteristics of single visual cortical cells.

107 citations
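
The polar-plot analysis in the entry above can be sketched as follows: the zero-order Fourier coefficient is read as S, the first-order magnitude as D, the second-order magnitude as O, and the phases give the preferred axes. The normalization (division by N, factor 2 on the harmonics), the sign convention of the phases, and the name `polar_plot_components` are assumptions made for illustration; the abstract does not specify them. The responses are assumed to be sampled at N >= 5 evenly spaced directions over 360 degrees.

```python
import cmath
import math


def polar_plot_components(responses):
    """Split evenly spaced directional responses into S, D and O components."""
    n = len(responses)

    def coeff(k):
        # k-th Fourier coefficient of the response versus stimulus direction.
        return sum(
            r * cmath.exp(-2j * math.pi * k * j / n) for j, r in enumerate(responses)
        ) / n

    c0, c1, c2 = coeff(0), coeff(1), coeff(2)
    return {
        "S": c0.real,                  # zero order: (virtual) spontaneous activity
        "D": 2 * abs(c1),              # first order: direction selectivity
        "O": 2 * abs(c2),              # second order: orientation specificity
        "preferred_direction": (-cmath.phase(c1)) % (2 * math.pi),
        "optimal_orientation": (-cmath.phase(c2) / 2) % math.pi,
    }
```

With these conventions, a synthetic response S + D*cos(theta - theta_D) + O*cos(2*(theta - theta_O)) is recovered exactly, which gives a quick sanity check before applying the function to measured data.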


Journal ArticleDOI
TL;DR: This work demonstrates that the quality of the correlation signal can also depend on the technique used in the synthesis of the BPOF, and that BPOFs made using the Hartley transform provide superior false correlation rejection and more uniformly sized correlation signals for heavily multiplexed BPOFs.
Abstract: Theoretical studies of the performance capabilities of binary phase-only filters (BPOFs), constructed using both Fourier and Hartley transforms, are presented. A thorough analysis of the Fourier BPOF is given. We show that, although BPOFs constructed using Fourier transforms perform well in optical correlator systems, they are also subject to additional noise sources and have the possibility of generating large false correlation signals. We then present an analysis of BPOFs constructed using the Hartley transform. We show that BPOFs made using the Hartley transform provide superior false correlation rejection and more uniformly sized correlation signals for heavily multiplexed BPOFs, compared with those made using the Fourier transform. We also present a technique for constructing Hartley BPOFs. Therefore, although it is well known that the quality of the correlation signal depends on the object, this work demonstrates that the quality of the correlation signal can also depend on the technique used in the synthesis of the BPOF.

95 citations


Journal ArticleDOI
15 Jun 1987
TL;DR: In this article, the four-point bivariate Lagrange interpolation algorithm was applied to near-field antenna data measured in a plane-polar facility, and the results were sufficiently accurate to permit the use of the FFT (fast Fourier transform) algorithm to calculate the far-field patterns of the antenna.
Abstract: The four-point bivariate Lagrange interpolation algorithm was applied to near-field antenna data measured in a plane-polar facility. The results were sufficiently accurate to permit the use of the FFT (fast Fourier transform) algorithm to calculate the far-field patterns of the antenna. Good agreement was obtained between the far-field patterns as calculated by the Jacobi-Bessel and the FFT algorithms. The significant advantage in using the FFT is in the calculation of the principal plane cuts, which may be made very quickly. Also, the application of the FFT algorithm directly to the near-field data was used to perform surface holographic diagnosis of a reflector antenna. The effects due to the focusing of the emergent beam from the reflector, as well as the effects of the information in the wide-angle regions, are shown. The use of the plane-polar near-field antenna test range has therefore been expanded to include these useful FFT applications.

86 citations


Journal ArticleDOI
TL;DR: The Galerkin method is an approximate method that finds wide application in solving differential and integral equations, but a large amount of computation is needed to obtain a high-order approximation with it.
Abstract: The Galerkin method is an approximate method which finds wide application in solving differential and integral equations. But a large amount of computation is needed in order to get a high order approximation by using the method. Applying the FFT technique to form a so-called fast Galerkin method, we can reduce the computation work greatly, when taking trigonometric functions as characteristic functions. Taking the periodic solution of non-linear oscillators as an example, we illustrate the procedure and the efficiency of the method. Moreover, with some modifications we extend the applicability of the method, so that not only periodic solutions with known periods, but also those with unknown periods, as well as subharmonics, combination tones, etc., can be treated with the method. Some techniques are described which can be used to simplify the computation.

Journal ArticleDOI
Tony F. Chan
TL;DR: It is shown that for a simple model problem—Poisson’s equation on a rectangle decomposed into two smaller rectangles—the capacitance system can be inverted exactly by Fast Fourier Transform.
Abstract: Domain decomposition is a class of techniques that are designed to solve elliptic problems on irregular domains and on multiprocessor systems. Typically, a domain is decomposed into many smaller regular subdomains and the capacitance system governing the interface unknowns is solved by some version of the preconditioned conjugate gradient method. In this paper, we show that for a simple model problem—Poisson’s equation on a rectangle decomposed into two smaller rectangles—the capacitance system can be inverted exactly by Fast Fourier Transform. An exact eigen-decomposition of the capacitance matrix also makes it possible to relate and compare the various preconditioners that have been proposed in the literature. For example, we show that in the limit as the aspect ratio of the two rectangles tend to infinity, the preconditioner proposed by Golub and Mayers becomes exact, but the one proposed by Dryja does not. Both preconditioners, however, are poor when the aspect ratio is small.

Proceedings ArticleDOI
Pierre Duhamel, H. H'Mida
06 Apr 1987
TL;DR: Two new implementations of DCT's are proposed which have several interesting features, as far as VLSI implementation is concerned, and are mainly based on a new formulation of a length-2^n DCT as a cyclic convolution.
Abstract: Small length Discrete Cosine Transforms (DCT's) are used for image data compression. In that case, length 8 or 16 DCT's need to be performed at video rate. We propose two new implementations of DCT's which have several interesting features, as far as VLSI implementation is concerned. A first one, using modulo-arithmetic, needs only one multiplication per input point, so that a single multiplier is needed on-chip. A second one, based on a decomposition of the DCT into polynomial products, and evaluation of these polynomial products by distributed arithmetic, results in a very small chip, with a great regularity and testability. Furthermore, the same structure can be used for FFT computation by changing only the ROM-part of the chip. Both new architectures are mainly based on a new formulation of a length-2^n DCT as a cyclic convolution, which is explained in the first section of the paper.


Journal ArticleDOI
TL;DR: A study of parallelization of the Cooley-Tukey radix two FFT algorithm for MIMD (nonvector) architectures is presented and the precise instructions to be executed by each processor in the parallel system are determined.
Abstract: We present here a study of parallelization of the Cooley-Tukey radix two FFT algorithm for MIMD (nonvector) architectures. Parallel algorithms are presented for one and multidimensional Fourier transforms. From instruction traces obtained by executing Fortran kernels derived from our algorithms, we determined the precise instructions to be executed by each processor in the parallel system. We used these instruction traces to predict the performance of the IBM Research Parallel Processing Prototype, RP3, as a computer of FFT's. Our performance results are depicted in graphs included in this paper.

Journal ArticleDOI
TL;DR: In this paper, an analysis is given of the FDAF where the window function is generalized and the convergence behavior of FDAF's with various window functions is compared, and the analysis describes the influence of β on the convergence behavior of the FDAF over the whole convergence range.
Abstract: One of the advantages of a Frequency-Domain Adaptive Filter (FDAF) is that one can achieve convergence at a constant rate over the whole frequency range by choosing the adaptation constant for each frequency bin l equal to the overall adaptation constant divided by an estimate of the input power at this frequency bin. A commonly used method, applied in this paper, to estimate the input power is exponential weighting, with smoothing constant β, of the magnitude squared of the input values at each frequency bin l. Furthermore, it is known that a correctly implemented FDAF, using the overlap-save method, contains five 2N-point Fast Fourier Transforms (FFTs). Two of these are used to force the last N points of the time-domain augmented impulse response to zero by applying a particular window function. In this paper, an analysis is given of the FDAF where the window function is generalized. Using these results, the convergence behavior of FDAF's with various window functions is compared. Furthermore, the analysis describes the influence of β on the convergence behavior of the FDAF over the whole convergence range.
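
The per-bin step-size rule described in the abstract above can be sketched in a few lines. The exact recursion is an assumption here: a first-order exponential average of the form P <- beta*P + (1 - beta)*|X|^2 is one common choice, but the abstract does not give the formula. The names `update_power` and `bin_step_sizes`, and the small guard `eps`, are likewise illustrative.

```python
def update_power(power, spectrum, beta):
    """Exponentially smoothed estimate of the input power in each frequency bin."""
    return [
        beta * p + (1.0 - beta) * abs(x) ** 2
        for p, x in zip(power, spectrum)
    ]


def bin_step_sizes(power, mu, eps=1e-12):
    """Per-bin adaptation constants: overall constant divided by the bin power."""
    # eps (an assumption, not from the source) avoids division by zero in empty bins.
    return [mu / (p + eps) for p in power]
```

A larger smoothing constant gives a steadier power estimate that tracks changes in the input more slowly, which is the kind of trade-off whose effect on convergence the paper analyzes.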

Journal ArticleDOI
D.M.W. Evans
TL;DR: An elegant algorithm has been found that performs this "perfect shuffle" more efficiently and, according to timing experiments, runs about eight times faster than the fastest other algorithm known to the author.
Abstract: All radix-B fast Fourier transforms (FFT) or fast Hartley transforms (FHT) performed "in-place" require at some point that the sequence elements be permuted such that, indexing the elements 0 to N - 1, the element with index i is swapped with the element whose index is j. The permutation is called digit-reversing, because if i is represented as a string of digits, base B, then j is that index whose representation is the same string of digits written in reverse order. N is a power of B and B ≥ 2. An elegant algorithm has been found that performs this "perfect shuffle" more efficiently and, according to timing experiments, runs about eight times faster than the fastest other algorithm known to the author. The algorithm is of order O(N) and led, for example, to a saving of 7 percent in the total (radix-2) FFT running time for N = 1024.
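
The digit-reversing permutation defined in the abstract above can be illustrated with a straightforward reference implementation. This is a plain sketch of the permutation itself, costing O(N log N) digit operations; it is not the faster algorithm the paper reports. The function names are illustrative.

```python
def reverse_digits(i, base, ndigits):
    """Return the index whose base-`base` digit string is that of i reversed."""
    j = 0
    for _ in range(ndigits):
        i, digit = divmod(i, base)  # peel off the least significant digit
        j = j * base + digit        # and push it onto the reversed index
    return j


def digit_reverse_permute(seq, base=2):
    """Permute seq in place into digit-reversed order; len(seq) must be a power of base."""
    n = len(seq)
    ndigits, size = 0, 1
    while size < n:
        size *= base
        ndigits += 1
    if size != n:
        raise ValueError("length must be a power of the base")
    for i in range(n):
        j = reverse_digits(i, base, ndigits)
        if i < j:  # swap each pair exactly once
            seq[i], seq[j] = seq[j], seq[i]
    return seq
```

For N = 8 and B = 2, `digit_reverse_permute(list(range(8)))` returns `[0, 4, 2, 6, 1, 5, 3, 7]`. Because digit reversal is its own inverse, pairwise swaps are sufficient and no scratch array is needed.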

Journal ArticleDOI
TL;DR: The author's own involvement and experience with the FFT algorithm is described, which led to an unfolding of its pre-electronic computer history going back to Gauss.
Abstract: The discovery of the fast Fourier transform (FFT) algorithm and the subsequent development of algorithmic and numerical methods based on it have had an enormous impact on the ability of computers to process digital representations of signals, or functions. At first, the FFT was regarded as entirely new. However, attention and wide publicity led to an unfolding of its pre-electronic computer history going back to Gauss. The present paper describes the author's own involvement and experience with the FFT algorithm.

Journal ArticleDOI
TL;DR: In this paper, a class of cylindrical multiconductor transmission lines is theoretically analyzed, and useful parameters, e.g., characteristic impedance and effective dielectric constant, are derived.
Abstract: A class of cylindrical multiconductor transmission lines is theoretically analyzed, and useful parameters, e.g., characteristic impedance and effective dielectric constant, are derived. Discretization of the continuous functions and exploitation of the periodicity of the cylindrical structure lead to a discrete convolution which can be carried out numerically rigorously and efficiently using the FFT algorithm. An iterative technique is employed in the spectral domain to derive the solution of integral equations for the charge distribution. Numerical results are presented and compared with available data.

Journal ArticleDOI
01 Feb 1987
TL;DR: A three-dimensional (3-D) Discrete Fourier Transform (DFT) algorithm for real data using the one-dimensional Fast Hartley Transform (FHT) is introduced that is simpler and retains the speed advantage that is characteristic of the Hartley approach.
Abstract: A three-dimensional (3-D) Discrete Fourier Transform (DFT) algorithm for real data using the one-dimensional Fast Hartley Transform (FHT) is introduced. It requires the same number of one-dimensional transforms as a direct FFT approach but is simpler and retains the speed advantage that is characteristic of the Hartley approach. The method utilizes a decomposition of the cas function kernel of the Hartley transform to obtain a temporary transform, which is then corrected by some additions to yield the 3-D DFT. A Fortran subroutine is available on request.

Patent
16 Jun 1987
TL;DR: In this paper, a frequency-domain block-adaptive digital filter (FDAF) having a finite impulse response of length N for filtering a time-domain input signal in accordance with the overlap-save method includes window means (11) for obtaining modifications (B(p;m)) of the 2N frequency-domain weighting factors (W(p;m)) from correlation products (A(p;m)).
Abstract: A frequency-domain block-adaptive digital filter (FDAF) having a finite impulse response of length N for filtering a time-domain input signal in accordance with the overlap-save method includes window means (11) for obtaining modifications (B(p;m)) of the 2N frequency-domain weighting factors (W(p;m)) from correlation products (A(p;m)). A known FDAF of this type contains five 2N-point FFT's, two of which are used in the window means (11). By utilizing a special time-domain window function which can be implemented very efficiently in the window means (11) with the aid of a frequency-domain convolution, a FDAF of this type containing only three 2N-point FFT's is obtained whose convergence properties are comparable to those of the known FDAF containing five 2N-point FFT's.

Journal ArticleDOI
TL;DR: In this paper, the numerical inversion of Laplace transforms by means of the finite Fourier cosine transform, as presented by Dubner and Abate, was analyzed, and it was found that the proper inversion formula should contain the Fourier sine series as well.

Journal ArticleDOI
TL;DR: A Bayesian image processing formalism which incorporates a priori amplitude and spatial probability density information was applied to two-dimensional source fields, and strikingly improved results for ideal and experimental radioisotope phantom imaging data were obtained.
Abstract: A Bayesian image processing (BIP) formalism which incorporates a priori amplitude and spatial probability density information was applied to two-dimensional source fields. For valid, moderately restrictive a priori information, strikingly improved results for ideal and experimental radioisotope phantom imaging data, compared to a standard non-Bayesian formalism (maximum likelihood, ML), were obtained. The applicability of a fast Fourier transform technique for "convolution" calculations, a reduced-region restriction for the initial "deconvolution" calculations, and a relaxation parameter for accelerating convergence are considered.

Journal ArticleDOI
TL;DR: For real functions, analytic signal concepts may be used to get the Hilbert transform as a byproduct, and applied to the cross correlation function this gives an efficient and accurate method for peak localization.
Abstract: The efficiency of the fast Fourier transform may be increased by removing operations on input values which are zero, and on output values which are not required. This is applied to interpolation of complex and real valued time domain functions. For real functions, analytic signal concepts may be used to get the Hilbert transform as a byproduct, and applied to the cross correlation function this gives an efficient and accurate method for peak localization.
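
The abstract above mentions obtaining the Hilbert transform of a real function as a byproduct of the analytic signal. A minimal sketch of the standard frequency-domain construction follows: keep the DC (and Nyquist) bins, double the positive frequencies, zero the negative ones, and invert. It uses a naive DFT and does not reproduce the paper's pruning of zero inputs or unneeded outputs; the function names are illustrative.

```python
import cmath


def dft(x, sign=-1):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def analytic_signal(x):
    """Return the complex analytic signal of a real sequence x."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue             # DC and Nyquist bins are kept unchanged
        elif k < (n + 1) // 2:
            spectrum[k] *= 2     # positive frequencies are doubled
        else:
            spectrum[k] = 0      # negative frequencies are removed
    return [v / n for v in dft(spectrum, sign=+1)]
```

The real part of the result reproduces the input and the imaginary part is its discrete Hilbert transform; for a sampled cosine with an integer number of cycles, the imaginary part is the matching sine.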

Journal ArticleDOI
TL;DR: In this paper, a new domain decomposed fast Poisson solver on a rectangle divided into parallel strips or boxes is presented, which first performs uncoupled fast solves on each subdomain and then the interface variables are computed exactly by Fast Fourier Transform, without computing or inverting the capacitance matrix explicitly.
Abstract: We present a new domain decomposed fast Poisson solver on a rectangle divided into parallel strips or boxes. The method first performs uncoupled fast solves on each subdomain, and then the interface variables are computed exactly by Fast Fourier Transform, without computing or inverting the capacitance matrix explicitly. Finally, the solution on the interior of the subdomains can be computed by one more fast solve on each subdomain. This method is especially suited for parallel implementation, since the independent problems in the subdomains can be solved in parallel, and the communication involves the interface variables only.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: A Gaussian probabilistic model was developed to screen and select from the large set of features and the significant harmonics of the signature were sorted according to the chi-square value, which is equivalent to the signal-to-noise ratio.
Abstract: Features such as shape, motion and pressure, minutiae details and timing, and transformation methods such as Hadamard and Walsh have been used in signature recognition with various degrees of success. One of the better studies was done by Sato and Kogure using a nonlinear warping function. However, it is time consuming in terms of computer time and programming time. In this research, the signatures were normalized for size, orientation, etc. After normalization, the X and Y coordinates of each sampled point of a signature over time (to capture the dynamics of signature writing) were represented as a complex number and the set of complex numbers transformed into the frequency domain via the fast Fourier transform. A Gaussian probabilistic model was developed to screen and select from the large set of features (e.g. amplitude of each harmonic). The significant harmonics of the signature were sorted according to the chi-square value, which is equivalent to the signal-to-noise ratio. Fifteen harmonics with the largest signal-to-noise ratios from the true signatures were used in a discriminant analysis. A total of eight true signatures from a single person and eight each from nineteen forgers were used. It results in an error rate of 2.5%, with the normally more conservative jackknife procedure yielding the same small error rate.

Journal ArticleDOI
TL;DR: In this paper, the Schmidt-Lichtenstein approach was used to solve the nonlinear inversion of the gravity from a single density interface through a power series expansion, where the convergence of the inverse series is restricted to a low-frequency domain, characterized by a cutoff frequency that is dependent upon the amplitude of gravity anomaly, the magnitude of the density contrast, and the mean depth of the interface.
Abstract: The nonlinear inversion of the gravity from a single density interface can be performed through a power series expansion. The method is based on the Schmidt‐Lichtenstein approach for solving nonlinear integral equations. After expanding the nonlinear integral operator for the gravity effect as an operator power series, the inverse operator series is found by applying a technique formally equivalent to the classical inversion scheme of a scalar power series. Unlike the forward power series expansion, however, the convergence of the inverse series is restricted to a low‐frequency domain, characterized by a cutoff frequency that is dependent upon the amplitude of the gravity anomaly, the magnitude of the density contrast, and the mean depth of the interface. To ensure the stability of the inversion scheme, a suitable low‐pass filtering has to be performed. By taking advantage of the noniterative nature of the inversion scheme and the fast Fourier transform, the method is efficiently applied to invert a profi...

Journal ArticleDOI
TL;DR: An algorithm for evaluating the fast Fourier transform that avoids serious difficulties with memory bank conflicts and thus could provide the basis for implementations that more fully utilize the power of the Cray-2.
Abstract: Most implementations of a radix-2 fast Fourier transform on large scientific computers use algorithms that involve memory accesses whose strides are powers of two. (The term stride means the memory increment between successive elements stored or fetched.) Such strides are unacceptable for recently developed supercomputers, particularly the Cray-2, because of serious difficulties with memory bank conflicts.
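
The stride problem described in the abstract above can be illustrated with a simplified model of interleaved memory, in which word address a lives in bank a mod B. This model, the bank count used in the example, and the name `distinct_banks` are assumptions for illustration only; they are not taken from the paper or from the Cray-2 hardware documentation.

```python
def distinct_banks(stride, num_banks, accesses):
    """Count how many different banks a strided access sequence touches."""
    # Simplified interleaving: consecutive words map to consecutive banks.
    return len({(k * stride) % num_banks for k in range(accesses)})
```

With a hypothetical 128 banks, `distinct_banks(64, 128, 128)` is 2 while `distinct_banks(65, 128, 128)` is 128: a power-of-two stride funnels every access into a couple of banks, whereas an odd stride spreads them over all banks.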

Journal ArticleDOI
TL;DR: In this article, the scattering problem of an axially uniform dielectric cylinder is formulated in terms of the electric field integral equation, where the cylinder is of general cross-sectional shape, inhomogeneity, and anisotropy, and the incident field is arbitrary.
Abstract: The scattering problem of an axially uniform dielectric cylinder is formulated in terms of the electric field integral equation, where the cylinder is of general cross-sectional shape, inhomogeneity, and anisotropy, and the incident field is arbitrary. Using the pulse-function expansion and the point-matching technique, the integral equation is reduced to a system of simultaneous equations. Then, a published procedure for solving the system using the conjugate gradient method and the fast Fourier transform (FFT) is generalized to the case of oblique-incidence scattering.

Journal ArticleDOI
TL;DR: The results indicate that the communication delay is significantly affected by the method applied to allocate data to memory modules, and the communication time complexity is increased to O(log N) since all N requests generated by processors are serialized at a single memory module.
Abstract: This paper presents a model for the performance prediction of FFT algorithms executed on a shared-memory parallel computer consisting of N processors and the same number of memory modules. The model applies a deterministic analysis to estimate the communication delay through the interconnection network by assuming that all requests arrive at the network in bursts. Our results indicate that the communication delay is significantly affected by the method applied to allocate data to memory modules. For the case in which all data items referenced by a processor during an iteration are allocated to a single memory module, the best-case communication time complexity grows as O[(log N)^2/N]. The worst-case communication time complexity for this case, obtained by a different allocation of data to memory modules, is increased to O[(log N)/√N] due to high network contention. For the case in which the data items referenced by different processors during an iteration are allocated to the same memory module, the communication time complexity is further increased to O(log N) since all N requests generated by processors are serialized at a single memory module. The methods developed in this paper can be applied for the performance prediction of other well-structured parallel iterative algorithms.