Showing papers on "Prime-factor FFT algorithm" published in 2015


Journal ArticleDOI
Jibin Zheng1, Tao Su1, Wentao Zhu1, Xuehui He1, Qing Huo Liu2 
TL;DR: This coherent detection algorithm can detect high-speed targets without the brute-force searching of unknown motion parameters and achieve a good balance between the computational cost and the antinoise performance.
Abstract: In this paper, by employing the symmetric autocorrelation function and the scaled inverse Fourier transform (SCIFT), a coherent detection algorithm is proposed for high-speed targets. This coherent detection algorithm is simple and can be easily implemented by using complex multiplications, the fast Fourier transform (FFT) and the inverse FFT (IFFT). Compared to the Hough transform and the keystone transform, this coherent detection algorithm can detect high-speed targets without the brute-force searching of unknown motion parameters and achieve a good balance between the computational cost and the antinoise performance. Through simulations and analyses for synthetic models and the real data, we verify the effectiveness of the proposed coherent detection algorithm.

102 citations


Journal ArticleDOI
TL;DR: The GHR combines 2-D and 1-D factorization techniques and improves the throughput by a factor of two to four at a hardware cost comparable to previous designs; its speed–area ratio is nearly two times better than that of previous FFT processors.
Abstract: In this paper, we propose a hardware-efficient mixed generalized high-radix (GHR) reconfigurable fast Fourier transform (FFT) processor for long-term evolution applications. The GHR processor based on radix-25/16/9 uses a 2-D factorization scheme as the high-radix unit and a 1-D factorization method as the system data routing technology. The 2-D factorization scheme is implemented by an enhanced delay element matrix structure, which supports 25-, 16-, 9-, 8-, 5-, 4-, 3-, and 2-point FFTs. Two different designs were implemented. One design (called discrete Fourier transform core) supports 34 different transform sizes from 12 to 1296 points, while the other design (called FFT core) supports five different power-of-two sizes from 128 to 2048 points. The 1-D factorization method is performed by a coprime accessing technology, which accesses the data in parallel without conflict using a RAM. The GHR combines 2-D and 1-D factorization techniques and improves the throughput by a factor of two to four with comparable hardware cost compared with the previous designs. The speed–area ratio of the proposed scheme is nearly two times better than that of previous FFT processors. Application-specific integrated circuit implementation results based on a 0.18-$\mu\mathrm{m}$ technology are also provided.

51 citations


Journal ArticleDOI
TL;DR: MR-FFT-TSOM applies multiplicative regularization (MR) to the objective function of the fast Fourier transform twofold subspace-based optimization method (FFT-TSOM), giving a new stable and effective regularization strategy for solving inverse scattering problems.
Abstract: In this paper, we combine two techniques, i.e., the fast Fourier transform-twofold subspace-based optimization method (FFT-TSOM) and multiplicative regularization (MR), to solve inverse scattering problems. When applying MR to the objective function in the FFT-TSOM, the new method is referred to as MR-FFT-TSOM. In MR-FFT-TSOM, a new stable and effective strategy of regularization has been proposed. MR-FFT-TSOM inherits not only the advantages of the FFT-TSOM, i.e., lower computational complexity than the TSOM, better stability of the inversion procedure, and better robustness against noise compared with the SOM, but also the edge-preserving ability from the MR. In addition, a more relaxed condition of choosing the number of current bases being used in the optimization can be obtained compared with the FFT-TSOM. Particularly, MR-FFT-TSOM has even better robustness against noise compared with the FFT-TSOM and multiplicative regularized contrast source inversion (MR-CSI). Numerical simulations including both inversion of synthetic data and experimental data from the Fresnel data set validate the efficacy of the proposed algorithm.

43 citations


Journal ArticleDOI
TL;DR: A novel architecture for memory-based fast Fourier transform (FFT) computation for real-valued signals, based on the radix-2 decimation-in-frequency algorithm, with a stage-partition strategy that minimizes the computation clock cycles and maximizes the utilization of the processing element (PE).
Abstract: This brief presents a novel architecture for memory-based fast Fourier transform (FFT) computation for real-valued signals based on the radix-2 decimation-in-frequency algorithm. A superior strategy of stage partition for the real FFT (RFFT) is proposed to minimize the computation clock cycles and maximize the utilization of the processing element (PE). The PE employed in our RFFT architecture can process four inputs in parallel by using two radix-2 butterflies and only two multiplexers. The proposed memory-addressing scheme and control of the multiplexers can be expressed in terms of a counter according to the RFFT computation stage. Furthermore, the proposed RFFT architecture can support more PEs in two dimensions as well. Compared with prior works, the proposed RFFT processors have the advantages of fewer computation cycles and lower hardware usage. The experiment shows that the proposed processor reduces the computation cycles by 17.5% for a 32-point RFFT computation compared with a recently presented work while maintaining lower hardware usage and complexity in the PE design.

41 citations


Journal ArticleDOI
TL;DR: A new, fast and computationally efficient lateral subpixel shift registration algorithm is presented that reduces computation time and memory requirements without sacrificing the accuracy of the usual FFT approach.
Abstract: A new, fast and computationally efficient lateral subpixel shift registration algorithm is presented. It is limited to registering images that differ by small subpixel shifts; otherwise its performance degrades. This algorithm significantly improves the performance of the single-step discrete Fourier transform approach proposed by Guizar-Sicairos and can be applied efficiently on large dimension images. It reduces the dimension of the Fourier transform of the cross-correlation matrix and reduces the discrete Fourier transform (DFT) matrix multiplications to speed up the registration process. Simulations show that our algorithm reduces computation time and memory requirements without sacrificing the accuracy of the usual FFT approach.

28 citations


Journal ArticleDOI
TL;DR: This paper presents the necessary mathematical formulation for removing the redundancies in the radix-2 DIT RFFT, and presents a formulation to regularize its flow graph to facilitate folded computation with a simple control unit.
Abstract: The decimation-in-time (DIT) fast Fourier transform (FFT) often has an advantage over the decimation-in-frequency (DIF) FFT for most real-valued applications, such as speech/image/video processing, biomedical signal processing, and time-series analysis, since it does not require any output reordering. Besides, the DIT FFT butterfly involves less computation time than its DIF counterpart. In this paper, we present an efficient architecture for the radix-2 DIT real-valued FFT (RFFT). We present here the necessary mathematical formulation for removing the redundancies in the radix-2 DIT RFFT, and present a formulation to regularize its flow graph to facilitate folded computation with a simple control unit. We propose here a register-based storage design which involves significantly less area at the cost of a slightly higher latency compared with the conventional RAM-based storage. The address generation for folded in-place DIT RFFT computation with register-based storage is challenging since both read and write operations are performed in the same clock cycle at different locations. Therefore, we present here a simple formulation of address generation for the proposed radix-2 DIT RFFT structure. The proposed structure involves $\sim$61% less area and $\sim$40% less power consumption than those of , on average, for FFT sizes 16, 32, 64, and 128. It involves $\sim$70% less area-delay product and $\sim$57% less energy per sample than those of the other, on average, for the same FFT sizes.

27 citations


Journal ArticleDOI
Jinqi Liu1, Qian-Jian Xing1, Xiao-Bo Yin1, Xiu-Bin Mao1, Feng Yu1 
TL;DR: An efficient radix-2 single-path delay commutator (SDC) pipelined architecture is proposed to implement the fast Walsh–Hadamard–Fourier transform (FWFT) algorithm; it can also be applied to the FFT, the WHT, and the sequence-ordered complex Hadamard transform (SCHT).
Abstract: This brief proposes an efficient radix-2 single-path delay commutator (SDC) pipelined architecture to implement the fast Walsh–Hadamard–Fourier transform (FWFT) algorithm. The proposed architecture includes $(\log_{2}N-1)$ SDC stages, which are implemented by merged half-butterflies. The merged half-butterfly is proposed to achieve 100% hardware utilization and minimum buffer usage by sharing common merged half-butterflies in a time-multiplexed approach. Compared with the conventional pipelined radix-2 FFT+Walsh–Hadamard Transform (WHT) designs, the proposed architecture reduces the number of buffers by 50% and of adders by 25%. The required number of complex multipliers is decreased to $0.5\log_{2}N-0.5$, which is roughly the minimum number. Moreover, the proposed architecture can be applied to FFT/WHT/sequence-ordered complex Hadamard transform (SCHT).

24 citations


Journal ArticleDOI
TL;DR: A new modulated hopping Discrete Fourier Transform (mHDFT) algorithm which is characterized by its merits of high accuracy and constant stability is presented and the numerical simulation results verify the effectiveness and superiority of the proposed algorithm.
Abstract: A new modulated hopping Discrete Fourier Transform (mHDFT) algorithm which is characterized by its merits of high accuracy and constant stability is presented. The proposed algorithm, which is based on the circular frequency shift property of DFT, directly moves the k-th DFT bin to the position of k = 0, and computes the DFT by incorporating the successive DFT outputs with arbitrary time hop L. Compared to previous works, since the pole of mHDFT precisely settles on the unit circle in the Z-plane, the accumulated errors and potential instabilities, which are caused by the quantization of the twiddle factor, are always eliminated without increasing much computational effort. The numerical simulation results verify the effectiveness and superiority of the proposed algorithm.
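
The circular frequency shift property that the abstract relies on can be illustrated with a small sketch. This is not the mHDFT recursion itself, whose update equations are not given in the abstract; it only checks that modulating the input by $e^{-j2\pi kn/N}$ moves the $k$-th DFT bin to position $k = 0$. The helper names `dft` and `shift_bin_to_dc` are made up for this illustration.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def shift_bin_to_dc(x, k):
    """Modulate x so that its k-th DFT bin lands at index 0 (hypothetical helper)."""
    n = len(x)
    return [x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)]

x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
k = 3
X = dft(x)

# The DC bin of a sequence is just the sum of its samples, so after the
# modulation the plain sum reproduces the original bin k.
dc_of_shifted = sum(shift_bin_to_dc(x, k))
shift_error = abs(dc_of_shifted - X[k])
```

Because the shifted bin sits at $k = 0$, accumulating it needs no further twiddle multiplication, which is consistent with the abstract's remark about errors caused by twiddle-factor quantization.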

19 citations


Journal ArticleDOI
TL;DR: This paper presents a fast Fourier transform (FFT) algorithm for computing length-$q\times 2^{m}$ DFTs, and the proposed algorithm achieves reduction of arithmetic complexity over the related algorithms.
Abstract: The discrete Fourier transform (DFT) is widely used in almost all fields of science and engineering. The fast Fourier transform (FFT) is an efficient tool for computing the DFT. In this paper, we present a fast Fourier transform (FFT) algorithm for computing length-$q\times 2^{m}$ DFTs. The algorithm transforms all $q$-point sub-DFTs into three parts. In the second part, the operations of subtransformation contain only multiplications by real constant factors. By transformation, length-$2^{m}$ scaled DFTs (SDFT) are obtained. An extension of scaled radix-2/8 FFT (SR28FFT) is presented for computing these SDFTs, in which the real constant factors of SDFTs are attached to the coefficients of sub-DFTs to simplify multiplication operations. The proposed algorithm achieves reduction of arithmetic complexity over the related algorithms. It can achieve a further reduction of arithmetic complexity for computing a length-$N=q\times 2^{m}$ IDFT by $2N-4m$ real multiplications. In addition, the proposed algorithm is applied to real-data FFT, and is extended to $6^{m}$ DFTs.

18 citations


Journal ArticleDOI
TL;DR: A modeling scheme decomposes the discrete Fourier transform (DFT) matrix recursively into a set of sparse matrices and is able to obtain different FFT representations with fewer computation operations than the state of the art.

16 citations


Journal ArticleDOI
TL;DR: This work enables construction of conflict-free schedules using single-ported memory banks, which require less area than more traditional multi-ported designs.
Abstract: A conflict-free schedule lets an FFT run to completion without ever having to pause for memory-conflict resolution. We show how to build such schedules for FFTs having any number of butterfly units $B$ operating at any radix $R$, transforming any number of datapoints $D$. Our algorithm works for FFT datapaths with or without pipeline overlap, and for memory banks having any number of access ports. Specifically, it enables construction of conflict-free schedules using single-ported memory banks, which require less area than more traditional multi-ported designs.

Proceedings ArticleDOI
02 Mar 2015
TL;DR: The proposed architecture is based on Dual RAM Ping-Pong Burst I/O with efficient addressing techniques; it clocks at 385.804 MHz on a Xilinx Virtex-6 xc6vlx550t-2ff1759 and takes 16.376 µs to calculate one set of 1024-point FFT.
Abstract: This paper presents a Fast Fourier Transform (FFT) processor optimized for both area and frequency. The processor is a deeply pipelined, radix-2, 1024-point Decimation In Time (DIT) FFT processor on a Field Programmable Gate Array (FPGA), with 64-bit fixed-point input (32-bit real and 32-bit imaginary). The proposed architecture is based on Dual RAM Ping-Pong Burst I/O with efficient addressing techniques; it clocks at 385.804 MHz on a Xilinx Virtex-6 xc6vlx550t-2ff1759 and takes 16.376 µs to calculate one set of 1024-point FFT.

Proceedings ArticleDOI
TL;DR: The purpose of this paper is to improve an existing implementation of multi-scale retinex (MSR) by utilizing the fast Fourier transform within the illumination estimation step of the algorithm to improve the speed at which Gaussian blurring filters are applied to the original input image.
Abstract: Efficiency in terms of both accuracy and speed is highly important in any system, especially when it comes to image processing. The purpose of this paper is to improve an existing implementation of multi-scale retinex (MSR) by utilizing the fast Fourier transform (FFT) within the illumination estimation step of the algorithm to improve the speed at which Gaussian blurring filters are applied to the original input image. In addition, alpha-rooting can be used as a separate technique to achieve a sharper image in order to fuse its results with those of the retinex algorithm for the sake of achieving the best image possible, as shown by the values of the considered color image enhancement measure (EMEC).

Journal ArticleDOI
TL;DR: A new, simple, efficient and faster GPS acquisition method via sub-sampled fast Fourier transform (ssFFT), which exploits the recently developed sparse FFT (or sparse IFFT) that computes in sub-linear time.
Abstract: Acquisition is an important process and a challenging task for identifying visible satellites, coarse values of carrier frequency, and code phase of the satellite signals in designing a software-defined Global Positioning System (GPS) receiver. This paper presents a new, simple, efficient and faster GPS acquisition via sub-sampled fast Fourier transform (ssFFT). The proposed algorithm exploits the recently developed sparse FFT (or sparse IFFT) that computes in sub-linear time. Further, it uses a property of the Fourier transform (FT): aliasing a signal in the time domain corresponds to sub-sampling it in the frequency domain, and vice versa. The ssFFT is an FFT algorithm that computes a version of the data sub-sampled by an integer factor ‘d’, and hence the computational complexity is proportionately reduced by a factor of ‘d log d’ compared to conventional FFT-based algorithms for any length of the input GPS signal. The simulation results show that the proposed ssFFT-based GPS acquisition computation is 8.5571 times faster than the conventional FFT-based acquisition computation. The implementation of this method in an FPGA provides very fast processing of incoming GPS samples that satisfies real-time positioning requirements. Defence Science Journal, Vol. 65, No. 1, January 2015, pp.5-11, DOI:http://dx.doi.org/10.14429/dsj.65.5579
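
The Fourier property quoted in the abstract (aliasing in time corresponds to sub-sampling in frequency) can be checked numerically. The sketch below is only a demonstration of that property, not the ssFFT acquisition algorithm, whose details the abstract does not give. The function name `alias_in_time` and the test signal are assumptions made for this example.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def alias_in_time(x, d):
    """Fold x into len(x)/d samples by summing d equally spaced blocks."""
    if len(x) % d:
        raise ValueError("d must divide the signal length")
    b = len(x) // d
    return [sum(x[n + j * b] for j in range(d)) for n in range(b)]

x = [0.3, -1.2, 2.0, 0.7, 1.1, -0.4, 0.0, 2.5, -1.9, 0.6, 1.4, -0.8]
d = 3

full = dft(x)                     # length-12 spectrum
small = dft(alias_in_time(x, d))  # length-4 spectrum of the folded signal

# Each bin of the short transform equals every d-th bin of the long one.
max_err = max(abs(small[k] - full[k * d]) for k in range(len(small)))
```

The short transform works on `len(x) / d` samples, which is where the reduction in work comes from when only the sub-sampled spectrum is needed.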

Proceedings ArticleDOI
19 Apr 2015
TL;DR: Hardware architectures are presented for computing the real FFT that exploit the conjugate symmetry property and process the inputs in a serial manner; this is facilitated by pushing the twiddle factor values across various butterfly stages.
Abstract: The fast Fourier transform (FFT) is an important operation in digital signal processing applications. In applications such as biomedical signal processing, the signals are real. Real-valued signals exhibit conjugate symmetry, giving rise to redundant values in the outputs. This property can be exploited to reduce arithmetic computations, area, and power consumption. This paper presents hardware architectures for computing the real FFT that exploit this conjugate symmetry property and process the inputs in a serial manner. This is facilitated by pushing the twiddle factor values across various butterfly stages. In this paper, two different serial FFT architectures are presented: one using real and the other using hybrid datapaths. These architectures process one sample per clock cycle and are well suited for low-sample-rate applications such as biomedical signal processing. These architectures are also modified so that two independent computations can be interleaved in the same datapath. The advantage of interleaving is a reduction in area, which is attractive for applications where FFT computation of two independent real signals is required.
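
The conjugate symmetry mentioned in the abstract, $X[N-k] = \overline{X[k]}$ for real input, means that only bins $0$ to $\lfloor N/2 \rfloor$ need to be computed. The following sketch illustrates the redundancy in software. It says nothing about the paper's serial hardware datapaths, and the names `rdft_half` and `expand_full` are invented for the example.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def rdft_half(x):
    """Compute only bins 0..N//2 of the DFT of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def expand_full(half, n):
    """Rebuild all N bins from the non-redundant half via X[N-k] = conj(X[k])."""
    full = list(half)
    for k in range(n // 2 + 1, n):
        full.append(half[n - k].conjugate())
    return full

x = [0.5, 1.0, -2.0, 3.0, 0.0, -1.5, 2.5, 1.0]
half = rdft_half(x)
rebuilt = expand_full(half, len(x))
reference = dft(x)
max_err = max(abs(a - b) for a, b in zip(rebuilt, reference))
```

For $N = 8$ this computes five bins instead of eight; the remaining three follow by conjugation.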

Proceedings ArticleDOI
05 Mar 2015
TL;DR: An architecture for real-time hardware implementation of the Hilbert Transform (HT) using the Fast Fourier Transform (FFT) is presented; it is implemented on a Xilinx Kintex-7 based FPGA, and the results acquired are compared with results obtained through MATLAB simulations.
Abstract: This paper presents an architecture for real-time hardware implementation of the Hilbert Transform (HT) using the Fast Fourier Transform (FFT). HT is studied and its various application areas are discussed in the paper. Two different architectures are proposed using the Fast Fourier Transform (FFT) for the implementation. Implementations of HT using the proposed FFT-based architectures are compared with implementations using the Discrete Fourier Transform (DFT) and the Discrete Hartley Transform (DHT). The proposed FFT-based architectures are implemented on a Xilinx Kintex-7 based FPGA, and the results acquired are presented in comparison to results obtained through MATLAB simulations. The architecture implemented supports a transform length of 8192 points, as a demonstrator of the idea, using 24-bit fixed-point arithmetic. A detailed comparison study in terms of resource utilization and timing analysis is also carried out and the results are reported.
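
A common textbook way to obtain the Hilbert transform from a Fourier transform is to zero the negative-frequency bins, double the positive ones, invert, and take the imaginary part. The sketch below follows that recipe in floating point, with a direct DFT standing in for the FFT. It is an assumption that the paper's two architectures resemble this approach; the abstract does not describe them, and the function names are invented here.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT standing in for an FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft, including the 1/N scaling."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def hilbert_via_dft(x):
    """Hilbert transform of a real sequence via the analytic signal."""
    n = len(x)
    X = dft(x)
    gain = [0.0] * n          # negative frequencies stay at zero
    gain[0] = 1.0             # keep DC unchanged
    if n % 2 == 0:
        gain[n // 2] = 1.0    # keep the Nyquist bin unchanged
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        gain[k] = 2.0         # double the positive frequencies
    analytic = idft([X[k] * gain[k] for k in range(n)])
    return [z.imag for z in analytic]

n, m = 16, 2
cosine = [math.cos(2 * math.pi * m * t / n) for t in range(n)]
sine = [math.sin(2 * math.pi * m * t / n) for t in range(n)]
max_err = max(abs(a - b) for a, b in zip(hilbert_via_dft(cosine), sine))
```

The check uses the fact that the Hilbert transform of a cosine with a whole number of periods in the window is the matching sine.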

Journal ArticleDOI
TL;DR: A novel algorithm, the eigenvalue decomposition and least squares algorithm (EDLSA), is proposed, not requiring any a priori information, which can replace the Fourier transform in depth-resolved interferometry with improved depth resolution.
Abstract: Depth resolution of depth-resolved interferometry evaluated by Fourier transform is limited by the range of phase shifting. A novel algorithm, the eigenvalue decomposition and least squares algorithm (EDLSA), is proposed. Experimental results obtained using depth-resolved wavenumber-scanning interferometry demonstrate that the EDLSA performs better than the Fourier transform and complex number least squares algorithm. Not requiring any a priori information, the algorithm can replace the Fourier transform in depth-resolved interferometry with improved depth resolution.

Journal ArticleDOI
TL;DR: A frequency offset estimation (FOE) algorithm based on an improved fast Fourier transform (FFT) is investigated for coherent optical systems; it adopts multi-step interpolation with an increasing number of samples to improve the estimation accuracy gradually.
Abstract: We investigate a frequency offset estimation (FOE) algorithm based on an improved fast Fourier transform (FFT) for coherent optical systems. The algorithm implements the FFT operation with a small number of samples and then adopts multi-step interpolation with an increasing number of samples to improve the estimation accuracy gradually. In a 28-GBd coherent quaternary phase-shift keying system, simulation results show that the proposed algorithm reaches the same estimation accuracy as the least-squares FOE algorithm that utilizes 64 time spans (LS-64) under the same total number of samples $L$, but the number of complex multiplications required by the proposed algorithm is just 7.26% and 6.75% of that required by LS-64 at $L=1024$ and $L=2048$, respectively.

Journal ArticleDOI
TL;DR: A unified model of the Sb-SDFT methods is proposed, with the aim of designing a frequency-adaptive control loop that mitigates the problems associated with an improper sampling frequency.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: A new extended-OFDM transceiver is designed by modifying the modulation, encoding-decoding, and FFT techniques, applying QAM, Optimum Frequency domain response encoding, and a 64 lines FFT with Radix4 Decimation in Time (DIT) by the prime factor algorithm, respectively.
Abstract: To increase the quality of real-time applications, it is essential to improve the throughput and capability of the network in the new generation of WLAN. In recent days, the wide local area networks are combined with MIMO and OFDM wireless technologies. It is well known that OFDM provides a high data rate with low complexity. The robustness of OFDM provides high data rates. It comprises multiplexing in modulations, so that OFDM is able to increase the interference of the channel. This paper focuses on improving the quality of service of 802.11n standard MIMO-OFDM wireless systems. To do this, a new extended-OFDM transceiver is designed by modifying the modulation, encoding-decoding, and FFT techniques, applying QAM, Optimum Frequency domain response encoding, and a 64 lines FFT with Radix4 Decimation in Time (DIT) by the prime factor algorithm, respectively. The main unit of OFDM is the FFT. Optimizing the FFT requires close attention. Hence, by generalizing the FFT algorithm, the performance of the system is improved in terms of high throughput and low complexity. The simulation results show that the proposed transceiver is more efficient than the existing system.

Proceedings ArticleDOI
08 Oct 2015
TL;DR: A pipelined architecture is presented that combines low-power techniques, such as sign swap and subexpression elimination, with area-reduction techniques, such as “In Place” addressing and a single butterfly element per stage.
Abstract: The Fast Fourier Transform is an elevated form of the Discrete Fourier Transform that is much simpler, more effective, and faster, requires fewer computations, and has come to dominate in various fields. As the gate length of CMOS goes deeper and deeper into Ultra Deep Sub-Micron (UDSM), the leakage power, which was negligible before, is tending towards the dynamic power range, increasing the requirement for low-power devices. This paper presents several low-power techniques, such as sign swap and subexpression elimination, along with several area-reduction techniques, such as “In Place” addressing and a single butterfly element per stage, using the pipelined architecture. In this paper, the pipelined architecture with low-power techniques is implemented on both radix-2 and radix-4 FFT processors and compared. Results show that the pipelined radix-4 FFT consumes 11% less power than the radix-2 FFT for a 16-point implementation.

Proceedings ArticleDOI
10 May 2015
TL;DR: A low-complexity multiplierless approximation for the 8-point DFT is presented for RF multi-beamforming using only 26 hardware adders; it provides eight simultaneous aperture beams that closely resemble the antenna array patterns of an FFT-based beamformer.
Abstract: A low-complexity multiplierless approximation for the 8-point DFT is presented for RF multi-beamforming using only 26 hardware adders. The algorithm provides eight simultaneous aperture beams that closely resemble the antenna array patterns of an FFT-based beamformer. The multiplicative complexity is used as a benchmark for comparing the performance-complexity-power trade-offs between the traditional FFT and the proposed approximate DFT algorithms. Metrics based on maximum throughput, chip area, and power consumption are used for the comparison. The paper discusses the theory behind the proposed new algorithm, and the proposed 8-point DFT is presented in the form of an 8 × 8 matrix. Furthermore, simulation examples are provided for both 1-D and 2-D antenna array patterns along with synthesized results for 45 nm CMOS technology at 1.1 V supply voltage. Cadence designs show a reduction of 30.6%, 33.2%, 29.0%, 26.1%, and 52.0% in chip area, dynamic power consumption, critical path delay, gate count, and area-time, respectively, and an increase of 45.5% in maximum clock frequency (throughput) for the proposed 8-point DFT approximation in comparison with a traditional radix-2 FFT algorithm, where both algorithms assumed 16-bit input signals.

Journal ArticleDOI
TL;DR: An improved low-complexity sum-product decoding algorithm is presented for low-density parity-check (LDPC) codes and can achieve a reduction of 42%–67% of the total number of arithmetic operations required for the decoding process.
Abstract: In this paper, an improved low-complexity sum-product decoding algorithm is presented for low-density parity-check (LDPC) codes. In the proposed algorithm, reduction in computational complexity is achieved by utilizing fast Fourier transform (FFT) with time shift in the check node process. The improvement in the decoding performance is achieved by utilizing an optimized integer constant in the variable node process. Simulation results show that the proposed algorithm achieves an overall coding gain improvement ranging from 0.04 to 0.46 dB. Moreover, when compared with the sum-product algorithm (SPA), the proposed decoding algorithm can achieve a reduction of 42%–67% of the total number of arithmetic operations required for the decoding process.

Proceedings ArticleDOI
01 Nov 2015
TL;DR: This paper describes the development of a decimation-in-time radix-2 FFT algorithm with 16 and 32 points, using VHDL as the description language and ISE Design Suite as the Integrated Development Environment (IDE).
Abstract: The Fast Fourier Transform (FFT) is an important algorithm used in the field of Digital Signal Processing and Communication Systems. The FFT has applications in a wide variety of areas, such as linear filtering, correlation, and spectrum analysis, among many others. This paper describes the development of a decimation-in-time radix-2 FFT algorithm with 16 and 32 points. VHDL was used as the description language, and ISE Design Suite as the Integrated Development Environment (IDE).
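
For readers unfamiliar with the decimation-in-time radix-2 structure, a minimal recursive software sketch is given below. The paper's implementation is in VHDL and is not reproduced in the abstract, so this Python version only illustrates the algorithm. The function name `fft_radix2_dit` is chosen for this example.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft_radix2_dit(x):
    """Recursive radix-2 decimation-in-time FFT for power-of-two lengths."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2_dit(x[0::2])   # DFT of even-indexed samples
    odd = fft_radix2_dit(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-length results with one twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

x16 = [float((3 * t) % 7) - 2.0 for t in range(16)]
max_err = max(abs(a - b) for a, b in zip(fft_radix2_dit(x16), dft(x16)))
```

Splitting the input into even and odd samples is what makes this decimation in time; a hardware design would typically unroll the recursion into $\log_2 N$ butterfly stages.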

Journal ArticleDOI
TL;DR: This paper proposes a shared multiplier scheduling scheme (SMSS) for area-efficient fast Fourier transform (FFT)/inverse FFT processors that can reduce the total number of complex multipliers by up to 28%.
Abstract: This paper proposes a shared multiplier scheduling scheme (SMSS) for area-efficient fast Fourier transform (FFT)/inverse FFT processors. SMSS can reduce the total number of complex multipliers by up to 28%. The proposed mixed-radix multipath delay commutator processors can support 128/256 and 256/512-point FFTs using SMSS. The proposed processors have been designed and implemented with 90-nm CMOS technology, which can reduce the total hardware complexity by 20%. The proposed processors, which have eight parallel data paths, can achieve a throughput rate of up to 27.5 GS/s at 430 MHz. In addition, the proposed processors can support any FFT size using additional stages.

Journal ArticleDOI
TL;DR: A two-dimensional analog of the Cooley-Tukey algorithm is constructed for a rectangular signal with the number of samples $2^{s} \times 2^{s+\ell}$, and testing on image-type signals shows that the proposed algorithm computes the FFT about 1.7 times faster than the algorithm by rows and columns.
Abstract: The one-dimensional fast Fourier transform (FFT) is the most popular tool for computing the two-dimensional Fourier transform. As a rule, a standard method of combining one-dimensional FFTs, the so-called algorithm "by rows and columns" [1], is used in the literature. In [2, 3], the authors showed how to compute the FFT for a signal with the number of samples $2^{s} \times 2^{s}$ with the use of an analog of the Cooley-Tukey algorithm. In the present paper, a two-dimensional analog of the Cooley-Tukey algorithm is constructed for a rectangular signal with the number of samples $2^{s} \times 2^{s+\ell}$. The number of operations in this algorithm is much less than that in the successive application of a one-dimensional FFT by rows and columns. The testing of the algorithm on image-type signals shows that the speed of computation of the FFT by the proposed algorithm is about 1.7 times higher than that of the algorithm by rows and columns.
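
The baseline "by rows and columns" method that the abstract compares against can be sketched as follows: apply a one-dimensional transform to every row, then to every column of the result. This is only the reference method, not the paper's two-dimensional Cooley-Tukey analog. A direct DFT stands in for the one-dimensional FFT, and the helper names are invented for the example.

```python
import cmath

def dft(x):
    """Direct O(N^2) 1-D DFT standing in for a 1-D FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft2_rows_columns(a):
    """2-D DFT computed by 1-D transforms over rows, then over columns."""
    rows = [dft(r) for r in a]                  # transform each row
    cols = [dft(list(c)) for c in zip(*rows)]   # transform each column
    return [list(r) for r in zip(*cols)]        # transpose back to row-major

def dft2_direct(a):
    """2-D DFT straight from the definition, for checking only."""
    m, n = len(a), len(a[0])
    return [[sum(a[p][q] * cmath.exp(-2j * cmath.pi * (u * p / m + v * q / n))
                 for p in range(m) for q in range(n))
             for v in range(n)]
            for u in range(m)]

image = [[1.0, 2.0, 0.0, -1.0],
         [0.5, -0.5, 3.0, 1.5]]   # a 2 x 4 rectangular "signal"
fast = dft2_rows_columns(image)
slow = dft2_direct(image)
max_err = max(abs(fast[u][v] - slow[u][v]) for u in range(2) for v in range(4))
```

The row-column method works because the two-dimensional DFT kernel separates into a product of two one-dimensional kernels.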

Journal Article
TL;DR: The results show that the new algorithm based on down sampling and blocking FFT has an accurate matching result and a fine flexibility of the environment and can reduce the matching time by 4/5.
Abstract: Image matching is one of the important applications in computer vision. The common template-matching algorithm is based on the normalized correlation coefficient. However, the original algorithm requires a large amount of calculation and a long matching time. In order to optimize it, a new method based on down sampling and blocking FFT was proposed. Two experiments were designed for comparison. The results show that the new algorithm not only has an accurate matching result and a fine flexibility of the environment, but also can reduce the matching time by 4/5.

Proceedings ArticleDOI
04 Apr 2015
TL;DR: Prime-size DFTs are calculated using the polynomial-based WFFT and compared with the conventional definition-based algorithm in terms of arithmetic operations; the WFFT drastically reduces area and increases speed.
Abstract: Most communication standards, such as Digital Terrestrial/Television Multimedia Broadcasting (DMB-T), require non-power-of-two size Discrete Fourier Transforms (DFTs). The Winograd Fast Fourier Transform algorithm (WFFT) is a fast Fourier algorithm that fulfils this requirement by calculating non-power-of-two size DFTs. In this paper, prime-size DFTs are calculated using the polynomial-based WFFT and compared with the conventional definition-based algorithm in terms of arithmetic operations; the WFFT drastically reduces area and increases speed. Theoretically, the WFFT is a very complex algorithm involving the Chinese Remainder Theorem (CRT) for polynomial calculations, while for practical implementation it is one of the best optimized FFT algorithms. This paper includes prime-size WFFT architectures, implemented in Verilog and simulated using Xilinx ISE 13.1. One future application in the communication domain is the use of the DFT architectures to compute the IDFT, which can serve on the receiving side of an orthogonal frequency division multiplexing (OFDM) system.
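
The Chinese Remainder Theorem also underlies the Good–Thomas prime-factor index mapping, which combines small DFTs of coprime lengths without twiddle factors. The sketch below shows that mapping for $N = 3 \times 5$. It is a related technique offered for illustration, not the Winograd polynomial algorithm of the paper, and the small DFTs are computed directly rather than with Winograd modules. The function name `pfa_dft` is invented, and the modular inverse via `pow(a, -1, m)` needs Python 3.8 or later.

```python
import cmath
from math import gcd

def dft(x):
    """Direct O(N^2) DFT, used for the short transforms and as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def pfa_dft(x, n1, n2):
    """Length n1*n2 DFT via the Good-Thomas mapping; n1 and n2 must be coprime."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1*n2 with gcd(n1, n2) == 1")
    # Input map: n = (n2*i1 + n1*i2) mod N arranges x as an n1 x n2 grid.
    grid = [[x[(n2 * i1 + n1 * i2) % n] for i2 in range(n2)] for i1 in range(n1)]
    grid = [dft(row) for row in grid]            # length-n2 DFTs along rows
    cols = [dft(list(c)) for c in zip(*grid)]    # length-n1 DFTs; cols[k2][k1]
    # Output map (CRT): k is the unique index with k = k1 mod n1, k = k2 mod n2.
    inv2 = pow(n2, -1, n1)
    inv1 = pow(n1, -1, n2)
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            k = (n2 * inv2 * k1 + n1 * inv1 * k2) % n
            out[k] = cols[k2][k1]
    return out

x15 = [float((7 * t) % 11) - 4.0 for t in range(15)]
max_err = max(abs(a - b) for a, b in zip(pfa_dft(x15, 3, 5), dft(x15)))
```

Because the two index maps make the exponent split exactly into a length-3 part and a length-5 part, no multiplications are needed between the two stages, unlike in the Cooley-Tukey factorization.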

Journal ArticleDOI
TL;DR: In this article, a new harmonic frequency estimation method is proposed, based on spectrum analysis techniques often used to estimate the direction of angle, the most popular of which is the multiple signal classification (MUSIC) algorithm; an improved-MUSIC approximation algorithm is introduced and compared with the FFT-based algorithm.
Abstract: This paper presents a new harmonic frequency estimation method. Unlike the conventional harmonic frequency estimation method (the fast Fourier transform), the new algorithm is based on spectrum analysis techniques often used to estimate the direction of angle, the most popular of which is the multiple signal classification (MUSIC) algorithm. The drawbacks of the MUSIC algorithm are summarized. An improved-MUSIC approximation algorithm is introduced and compared with the FFT-based algorithm for harmonic frequency estimation. Theoretical analysis and simulations show that this algorithm is a super-resolution algorithm with small data length.

Journal ArticleDOI
TL;DR: Simulations and experiments indicate that the derived SQNR reliably captures the quantization effects of the fixed-point radix-$2^{k}$ FFT, and the proposed joint optimization strategy is capable of providing better solutions to implement the radix-$2^{k}$ FFT processor efficiently.
Abstract: The radix-$2^{k}$ algorithm plays a crucial role in the pipelined implementation of the fast Fourier transform (FFT). This paper presents a fixed-point analysis and hardware evaluation of the radix-$2^{k}$ FFT under the framework of the single-path delay feedback (SDF) and multi-path delay commutator (MDC) pipelined structures. The investigation is carried out with variable operating word-lengths to ensure generality. Furthermore, the two main approaches to FFT coefficient weighting, namely, the approach using complex multipliers and the one adopting memoryless CORDIC units, are both considered in the analysis. Based on these derivations, a joint optimization of the radix-$2^{k}$ algorithm and operating word-length is discussed to achieve a reasonable trade-off between computational accuracy and hardware expenditure. Simulations and experiments indicate that the derived SQNR reliably captures the quantization effects of the fixed-point radix-$2^{k}$ FFT. In addition, the proposed joint optimization strategy is capable of providing better solutions to implement the radix-$2^{k}$ FFT processor efficiently.