
Showing papers on "Fast Fourier transform" published in 2017


Journal ArticleDOI
TL;DR: FIt-SNE, a sped-up version of t-SNE, enables visualization of rare cell types in large datasets by obviating the need for downsampling.
Abstract: t-distributed Stochastic Neighbor Embedding (t-SNE) is a method for dimensionality reduction and visualization that has become widely popular in recent years. Efficient implementations of t-SNE are available, but they scale poorly to datasets with hundreds of thousands to millions of high-dimensional data points. We present Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE), which dramatically accelerates the computation of t-SNE. The most time-consuming step of t-SNE is a convolution that we accelerate by interpolating onto an equispaced grid and subsequently using the fast Fourier transform to perform the convolution. We also optimize the computation of input similarities in high dimensions using multi-threaded approximate nearest neighbors. We further present a modification to t-SNE called "late exaggeration," which allows for easier identification of clusters in t-SNE embeddings. Finally, for datasets that cannot be loaded into memory, we present out-of-core randomized principal component analysis (oocPCA), so that the top principal components of a dataset can be computed without ever fully loading the matrix, allowing t-SNE of large datasets to be computed on resource-limited machines.
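The grid trick at the heart of FIt-SNE is easy to sketch. Below is a minimal 1D version, assuming nearest-grid-point binning in place of the paper's higher-order polynomial interpolation, and using the Cauchy kernel 1/(1+d^2) that appears in t-SNE's repulsive forces:

```python
import numpy as np

def grid_fft_convolve(y, q, n_grid=1024):
    """Approximate phi_i = sum_j q_j / (1 + (y_i - y_j)^2) by binning
    the charges q onto an equispaced grid and doing one FFT convolution.
    Toy 1D version: FIt-SNE uses polynomial interpolation, not binning."""
    lo, hi = y.min(), y.max()
    h = (hi - lo) / (n_grid - 1)
    idx = np.rint((y - lo) / h).astype(int)          # nearest grid point
    grid_q = np.bincount(idx, weights=q, minlength=n_grid)
    # Kernel at all grid displacements, embedded in a length-2n circulant
    # so the circular FFT convolution reproduces the linear one.
    d = np.arange(-n_grid + 1, n_grid) * h
    kern = np.append(1.0 / (1.0 + d**2), 0.0)
    k_arr = np.roll(kern, -(n_grid - 1))             # offset 0 at index 0
    m = 2 * n_grid
    conv = np.fft.irfft(np.fft.rfft(k_arr) * np.fft.rfft(grid_q, m), m)
    return conv[:n_grid][idx]                        # gather back to points
```

Because the N scattered points only interact with a fixed grid, the O(N^2) pairwise sum collapses to O(N) scatter/gather steps plus one FFT convolution on the grid.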

281 citations


Proceedings ArticleDOI
14 Oct 2017
TL;DR: The CirCNN architecture is proposed, a universal DNN inference engine that can be implemented on various hardware/software platforms with a configurable network architecture (e.g., layer type, size, scales), in which the FFT is used as the key computing kernel, ensuring universal and small-footprint implementations.
Abstract: Large-scale deep neural networks (DNNs) are both compute and memory intensive. As the size of DNNs continues to grow, it is critical to improve the energy efficiency and performance while maintaining accuracy. For DNNs, the model size is an important factor affecting performance, scalability and energy efficiency. Weight pruning achieves good compression ratios but suffers from three drawbacks: 1) the irregular network structure after pruning, which affects performance and throughput; 2) the increased training complexity; and 3) the lack of a rigorous guarantee of compression ratio and inference accuracy. To overcome these limitations, this paper proposes CirCNN, a principled approach to represent weights and process neural networks using block-circulant matrices. CirCNN utilizes the Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (both in inference and training) from O(n²) to O(n log n) and the storage complexity from O(n²) to O(n), with negligible accuracy loss. Compared to other approaches, CirCNN is distinct due to its mathematical rigor: the DNNs based on CirCNN can converge to the same "effectiveness" as DNNs without compression. We propose the CirCNN architecture, a universal DNN inference engine that can be implemented on various hardware/software platforms with configurable network architecture (e.g., layer type, size, scales, etc.). In the CirCNN architecture: 1) due to the recursive property, the FFT can be used as the key computing kernel, which ensures universal and small-footprint implementations; 2) the compressed but regular network structure avoids the pitfalls of network pruning and facilitates high performance and throughput with a highly pipelined and parallel design. To demonstrate the performance and energy efficiency, we test CirCNN on FPGA, ASIC and embedded processors. Our results show that the CirCNN architecture achieves very high energy efficiency and performance with a small hardware footprint. Based on the FPGA implementation and ASIC synthesis results, CirCNN achieves 6-102X energy efficiency improvements compared with the best state-of-the-art results.
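The source of the savings is that a circulant block is determined by a single column and acts on a vector as a circular convolution, which the FFT evaluates in O(n log n). A minimal numerical sketch (not the paper's hardware design):

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by x in
    O(n log n) via the convolution theorem, instead of O(n^2)."""
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

# Cross-check against the explicit dense circulant (illustration only).
n = 8
c, x = np.random.randn(n), np.random.randn(n)
C = np.stack([np.roll(c, k) for k in range(n)], axis=1)  # C[i, j] = c[(i-j) mod n]
assert np.allclose(C @ x, circulant_matvec(c, x))
```

In a block-circulant layer, the weight matrix is partitioned into square circulant blocks, so each block stores n values instead of n² and is applied exactly this way.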

262 citations


Proceedings ArticleDOI
Jonathan T. Barron1, Yun-Ta Tsai1
01 Jul 2017
TL;DR: Fast Fourier Color Constancy (FFCC) as discussed by the authors is a color constancy algorithm which solves illuminant estimation by reducing it to a spatial localization task on a torus.
Abstract: We present Fast Fourier Color Constancy (FFCC), a color constancy algorithm which solves illuminant estimation by reducing it to a spatial localization task on a torus. By operating in the frequency domain, FFCC produces lower error rates than the previous state-of-the-art by 13–20% while being 250-3000 times faster. This unconventional approach introduces challenges regarding aliasing, directional statistics, and preconditioning, which we address. By producing a complete posterior distribution over illuminants instead of a single illuminant estimate, FFCC enables better training techniques, an effective temporal smoothing technique, and richer methods for error analysis. Our implementation of FFCC runs at ~700 frames per second on a mobile device, allowing it to be used as an accurate, real-time, temporally-coherent automatic white balance algorithm.
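The localization step itself reduces to a circular 2D convolution and an argmax. A minimal sketch, assuming a chroma histogram already wrapped onto the torus and a stand-in filter for FFCC's learned one (training, aliasing handling, and preconditioning are outside this sketch):

```python
import numpy as np

def torus_localize(hist, filt):
    """Score every bin of a wrapped log-chroma histogram by circular
    convolution with a filter, computed in the frequency domain, and
    return the highest-scoring bin; `filt` stands in for FFCC's
    learned filter."""
    score = np.fft.ifft2(np.fft.fft2(hist) * np.fft.fft2(filt)).real
    u, v = np.unravel_index(np.argmax(score), score.shape)
    return u, v   # illuminant bin, defined modulo the torus dimensions
```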

186 citations


Journal ArticleDOI
TL;DR: A new strategy to increase the speed of FSI by two orders of magnitude is reported, which binarizes the Fourier basis patterns based on upsampling and error diffusion dithering; it may find broad imaging applications at wavebands that are not accessible using conventional two-dimensional image sensors.
Abstract: Fourier single-pixel imaging (FSI) employs Fourier basis patterns for encoding spatial information and is capable of reconstructing high-quality two-dimensional and three-dimensional images. Fourier-domain sparsity in natural scenes allows FSI to recover sharp images from undersampled data. The original FSI demonstration, however, requires grayscale Fourier basis patterns for illumination. This requirement imposes a limitation on the imaging speed, as digital micro-mirror devices (DMDs) generate grayscale patterns at a low refresh rate. In this paper, we report a new strategy to increase the speed of FSI by two orders of magnitude. In this strategy, we binarize the Fourier basis patterns based on upsampling and error diffusion dithering. We demonstrate a 20,000 Hz projection rate using a DMD and capture 256-by-256-pixel dynamic scenes at a speed of 10 frames per second. The reported technique substantially accelerates the image acquisition speed of FSI. It may find broad imaging applications at wavebands that are not accessible using conventional two-dimensional image sensors.
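Error-diffusion dithering is what lets a binary DMD stand in for grayscale illumination. The sketch below uses the classic Floyd-Steinberg kernel and a 4x upsampling factor; both are stand-ins here rather than the paper's exact choices:

```python
import numpy as np

def floyd_steinberg(img):
    """Binarize a grayscale pattern in [0, 1] by error diffusion:
    quantize each pixel and push the residual onto unvisited neighbors,
    trading gray levels for high-frequency dither noise."""
    p = img.astype(float).copy()
    h, w = p.shape
    out = np.zeros_like(p)
    for y in range(h):
        for x in range(w):
            out[y, x] = 1.0 if p[y, x] >= 0.5 else 0.0
            err = p[y, x] - out[y, x]
            if x + 1 < w:               p[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     p[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               p[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: p[y + 1, x + 1] += err * 1 / 16
    return out

# Binarize an upsampled Fourier basis pattern (4x upsampling assumed).
N, fx, fy = 64, 3, 2
xx, yy = np.meshgrid(np.arange(4 * N), np.arange(4 * N))
pattern = 0.5 + 0.5 * np.cos(2 * np.pi * (fx * xx + fy * yy) / (4 * N))
binary = floyd_steinberg(pattern)
```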

175 citations


Journal ArticleDOI
TL;DR: Simulation results illustrate that the proposed methodologies can outperform some counterparts, providing sequences with good autocorrelation features, especially in the discrete-phase/binary case.
Abstract: This paper is focused on the design of phase sequences with good (aperiodic) autocorrelation properties in terms of peak sidelobe level and integrated sidelobe level. The problem is formulated as a biobjective Pareto optimization forcing either a continuous or a discrete phase constraint at the design stage. An iterative procedure based on the coordinate descent method is introduced to deal with the resulting optimization problems, which are nonconvex and NP-hard in general. Each iteration of the devised method requires the solution of a nonconvex min-max problem. It is handled either through a novel bisection method or an FFT-based method, respectively, for the continuous and the discrete phase constraint. Additionally, a heuristic approach to initialize the procedures, employing the $l_p$-norm minimization technique, is proposed. Simulation results illustrate that the proposed methodologies can outperform some counterparts, providing sequences with good autocorrelation features, especially in the discrete-phase/binary case.
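The two sidelobe metrics being optimized are themselves cheap to evaluate with one zero-padded FFT pair, which is also where the FFT enters the discrete-phase inner step. A sketch for a unimodular sequence:

```python
import numpy as np

def sidelobe_metrics(phase):
    """One-sided peak (PSL) and integrated (ISL) sidelobe levels of the
    aperiodic autocorrelation of x_n = exp(j*phase_n), in O(N log N)."""
    x = np.exp(1j * phase)
    n = len(x)
    X = np.fft.fft(x, 2 * n)              # zero-pad to >= 2N - 1
    r = np.fft.ifft(np.abs(X) ** 2)[:n]   # r[k] = sum_m x_m conj(x_{m-k})
    sidelobes = np.abs(r[1:])             # r[0] = N is the mainlobe
    return sidelobes.max(), np.sum(sidelobes ** 2)
```

The negative lags have the same magnitudes by conjugate symmetry, so the one-sided values above capture both sides up to a factor of two in the ISL.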

175 citations


Journal ArticleDOI
TL;DR: A Fourier pseudo-spectral method that conserves mass and energy is developed for a two-dimensional nonlinear Schrödinger equation, and the optimal rate of convergence is proved in the discrete L² norm without any restrictions on the grid ratio.

112 citations


Journal ArticleDOI
TL;DR: The aim is to render the method transparent and accessible, so that researchers who are new to this method can implement it efficiently; its potential is demonstrated using two examples, each with a different material model.

108 citations


Journal ArticleDOI
TL;DR: In this paper, a new feature extraction step that combines the classical wavelet packet decomposition energy distribution technique and a feature extraction technique based on the selection of the most impulsive frequency bands is presented.

99 citations




Journal ArticleDOI
TL;DR: In this article, a new unconditionally stable implicit difference method, derived from the weighted and shifted Grunwald formula, converges with the second-order accuracy in both time and space variables.
Abstract: In this paper we intend to establish fast numerical approaches to solve a class of initial-boundary problems of time-space fractional convection-diffusion equations. We present a new unconditionally stable implicit difference method, which is derived from the weighted and shifted Grünwald formula, and converges with second-order accuracy in both time and space variables. Then, we show that the discretizations lead to Toeplitz-like systems of linear equations that can be efficiently solved by Krylov subspace solvers with suitable circulant preconditioners. These methods reduce the memory requirement of the proposed implicit difference scheme from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$ and the computational complexity from $\mathcal{O}(N^3)$ to $\mathcal{O}(N\log N)$ per iterative step, where N is the number of grid nodes. Extensive numerical examples are reported to support our theoretical findings and show the utility of these methods over traditional direct solvers of the implicit difference method in terms of computational cost and memory requirements.
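The payoff of circulant preconditioning is that a circulant matrix is diagonalized by the DFT, so applying its inverse costs two FFTs and a pointwise division per Krylov iteration. A sketch using Strang's classic circulant approximation for a symmetric Toeplitz first column (one common choice, not necessarily the paper's exact preconditioner):

```python
import numpy as np

def strang_eigs(t):
    """Eigenvalues of Strang's circulant approximation to the symmetric
    Toeplitz matrix with first column t; a circulant is diagonalized by
    the DFT, so its eigenvalues are the FFT of its first column."""
    n = len(t)
    c = np.asarray(t, dtype=float).copy()
    for k in range(n // 2 + 1, n):
        c[k] = t[n - k]                   # wrap the tail: c_k = t_{n-k}
    return np.fft.fft(c)

def apply_preconditioner(eigs, r):
    """Solve C z = r in O(N log N): z = F^{-1}(F r / eigs)."""
    return np.fft.ifft(np.fft.fft(r) / eigs).real
```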

94 citations


Journal ArticleDOI
TL;DR: A new method for sampling stochastic displacements in Brownian Dynamics (BD) simulations of colloidal-scale particles is presented, which circumvents the super-linear scaling exhibited by all known iterative sampling methods applied directly to the RPY tensor and instead scales linearly with the number of particles.
Abstract: We present a new method for sampling stochastic displacements in Brownian Dynamics (BD) simulations of colloidal scale particles. The method relies on a new formulation for Ewald summation of the Rotne-Prager-Yamakawa (RPY) tensor, which guarantees that the real-space and wave-space contributions to the tensor are independently symmetric and positive-definite for all possible particle configurations. Brownian displacements are drawn from a superposition of two independent samples: a wave-space (far-field or long-ranged) contribution, computed using techniques from fluctuating hydrodynamics and non-uniform fast Fourier transforms; and a real-space (near-field or short-ranged) correction, computed using a Krylov subspace method. The combined computational complexity of drawing these two independent samples scales linearly with the number of particles. The proposed method circumvents the super-linear scaling exhibited by all known iterative sampling methods applied directly to the RPY tensor, which results from the power-law growth of the condition number of the tensor with the number of particles. For geometrically dense microstructures (fractal dimension equal to three), the performance is independent of volume fraction, while for tenuous microstructures (fractal dimension less than three), such as gels and polymer solutions, the performance improves with decreasing volume fraction. This is in stark contrast with other related linear-scaling methods such as the force coupling method and the fluctuating immersed boundary method, for which performance degrades with decreasing volume fraction. Calculations for hard sphere dispersions and colloidal gels are illustrated and used to explore the role of microstructure on performance of the algorithm. In practice, the logarithmic part of the predicted scaling is not observed and the algorithm scales linearly for up to 4×10⁶ particles, obtaining speed-ups of over an order of magnitude over existing iterative methods, and making the cost of computing Brownian displacements comparable to the cost of computing deterministic displacements in BD simulations. A high-performance implementation employing non-uniform fast Fourier transforms implemented on graphics processing units and integrated with the software package HOOMD-blue is used for benchmarking.

Journal ArticleDOI
TL;DR: In this article, a nonlinear finite element (FE) solver was proposed to solve nonlinear problems in a general history-dependent and time-dependent material model, where the kernel is derived from an auxiliary homogeneous linear problem, which renders the extension of FFT-based schemes to nonlinear problem conceptually difficult.
Abstract: Fourier solvers have become efficient tools to establish structure–property relations in heterogeneous materials. Introduced as an alternative to the finite element (FE) method, they are based on fixed-point solutions of the Lippmann–Schwinger type integral equation. Their computational efficiency results from handling the kernel of this equation by the fast Fourier transform (FFT). However, the kernel is derived from an auxiliary homogeneous linear problem, which renders the extension of FFT-based schemes to nonlinear problems conceptually difficult. This paper aims to establish a link between FE-based and FFT-based methods in order to develop a solver applicable to general history-dependent and time-dependent material models. For this purpose, we follow the standard steps of the FE method, starting from the weak form, proceeding to the Galerkin discretization and the numerical quadrature, up to the solution of nonlinear equilibrium equations by an iterative Newton–Krylov solver. No auxiliary linear problem is thus needed. By analyzing a two-phase laminate with nonlinear elastic, elastoplastic, and viscoplastic phases and by elastoplastic simulations of a dual-phase steel microstructure, we demonstrate that the solver exhibits robust convergence. These results are achieved by re-using the nonlinear FE technology, with the potential of further extensions beyond small-strain inelasticity considered in this paper.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This work generalizes existing discriminative approaches by using more powerful regularization, based on convolutional neural networks, and proposes a simple, yet effective, boundary adjustment method that alleviates the problematic circular convolution assumption, which is necessary for FFT-based deconvolution.
Abstract: This work addresses the task of non-blind image deconvolution. Motivated to keep up with the constant increase in image size, with megapixel images becoming the norm, we aim at pushing the limits of efficient FFT-based techniques. Based on an analysis of traditional and more recent learning-based methods, we generalize existing discriminative approaches by using more powerful regularization, based on convolutional neural networks. Additionally, we propose a simple, yet effective, boundary adjustment method that alleviates the problematic circular convolution assumption, which is necessary for FFT-based deconvolution. We evaluate our approach on two common non-blind deconvolution benchmarks and achieve state-of-the-art results even when including methods which are computationally considerably more expensive.
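The FFT baseline that such methods refine is a regularized inverse filter. A sketch with Tikhonov regularization, using replicate padding as a crude stand-in for the paper's boundary adjustment (plain FFT deconvolution otherwise assumes circular convolution at the borders):

```python
import numpy as np

def fft_deconv(blurred, kernel, lam=1e-2, pad=32):
    """Non-blind deconvolution by a Tikhonov-regularized inverse filter,
    x = F^{-1}[conj(K) Y / (|K|^2 + lam)], with edge-replicate padding
    to soften circular wrap-around artifacts at the image borders."""
    y = np.pad(blurred, pad, mode='edge')
    kh, kw = kernel.shape
    kpad = np.zeros_like(y, dtype=float)
    kpad[:kh, :kw] = kernel
    kpad = np.roll(kpad, (-(kh // 2), -(kw // 2)), axis=(0, 1))  # center at origin
    K, Y = np.fft.fft2(kpad), np.fft.fft2(y)
    x = np.real(np.fft.ifft2(np.conj(K) * Y / (np.abs(K) ** 2 + lam)))
    return x[pad:-pad, pad:-pad]
```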

Journal ArticleDOI
TL;DR: An efficient iterative thresholding method for multi-phase image segmentation that has the optimal complexity O(N log N) per iteration and has the total-energy-decaying property is proposed.

Journal ArticleDOI
TL;DR: Multirate signal processing techniques are introduced that improve the FFT-based methods for detecting faults in IM by reducing spectral leakage with fractional resampling.
Abstract: Fault detection in induction motors (IM) has been studied during the past decades due to the role that these electric machines play in industry. Regular monitoring is performed on IM to diagnose their operating condition using vibration and stator current analysis. The acquired signal is then processed to extract the characteristic parameters of the fault. The fast Fourier transform (FFT) is used for this task, but it has intrinsic limitations such as sensitivity to low signal-to-noise ratios, overlapping of closely located spectral components, nonstationary signals, and spectral leakage. These limitations have been studied to improve the spectrum estimation, but spectral leakage has not received enough attention, even though its effects can be significant. This paper introduces multirate signal processing techniques that improve the FFT-based methods by reducing spectral leakage with fractional resampling. The methodology is applied to experimental signals to show the improvement of the FFT-based methods for detecting faults in IM.
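The leakage mechanism is that a fault line rarely spans an integer number of cycles in the analyzed record, so its energy smears across neighboring bins; fractional resampling can fix the cycle count. A sketch with illustrative rates and fault frequency, where scipy's polyphase resampler stands in for whichever multirate filter one prefers:

```python
import numpy as np
from fractions import Fraction
from scipy.signal import resample_poly

fs, n = 10_000, 16_384
f_fault = 58.7                        # example component, off the FFT grid
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f_fault * t) + 0.1 * np.random.randn(n)

# n samples span f*n/fs ~ 96.17 cycles: non-integer, so the line leaks.
# Resample by p/q so that n samples at the new rate span exactly 96
# cycles, placing the line on bin 96 of the FFT.
cycles = f_fault * n / fs
ratio = Fraction(cycles / round(cycles)).limit_denominator(10_000)
p, q = ratio.numerator, ratio.denominator
xr = resample_poly(x, p, q)[:n]       # f_fault * n * q / (p * fs) ~ 96
spec = np.abs(np.fft.rfft(xr)) / n    # peak concentrated in one bin
```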

Posted Content
Minsik Cho1, Daniel Brand1
TL;DR: This work proposes a memory-efficient convolution (MEC) with compact lowering, which substantially reduces memory overhead and accelerates the convolution process, achieving significant memory savings with good speedup on both mobile and server platforms.
Abstract: Convolution is a critical component in modern deep neural networks, and thus several algorithms for convolution have been developed. Direct convolution is simple but suffers from poor performance. As an alternative, multiple indirect methods have been proposed, including im2col-based convolution, FFT-based convolution, and the Winograd-based algorithm. However, all these indirect methods have high memory overhead, which creates performance degradation and offers a poor trade-off between performance and memory consumption. In this work, we propose a memory-efficient convolution (MEC) with compact lowering, which reduces memory overhead substantially and accelerates the convolution process. MEC lowers the input matrix in a simple yet efficient and compact way (i.e., with much less memory overhead), and then executes multiple small matrix multiplications in parallel to complete the convolution. Additionally, the reduced memory footprint improves memory sub-system efficiency, further improving performance. Our experimental results show that MEC reduces memory consumption significantly with good speedup on both mobile and server platforms, compared with other indirect convolution algorithms.

Journal ArticleDOI
TL;DR: Results of experimental analysis on the simulation signals and vibration signal collected from rolling bearings indicate that LMD-MHD is effective for extracting the weak fault features and performs well for bearing fault diagnosis.
Abstract: Extraction of weak fault features under strong background noise is crucial to early fault diagnosis in bearings. A new method called local mean decomposition (LMD)-based multilayer hybrid denoising (LMD-MHD) is proposed for signal denoising in this paper. LMD is a novel self-adaptive time-frequency analysis method. It can decompose the signal into a set of product functions (PFs), and is thus particularly suitable for processing multicomponent amplitude-modulated and frequency-modulated signals. The first filtering layer of LMD-MHD uses a multi-criteria decision to select the effective PF components. The second filtering layer uses wavelet threshold denoising (WTD) as the prefilter of the singular value decomposition (SVD) implementation, which enables the preserved singular values to retain the most important information of the PFs. The final denoising layer of LMD-MHD uses SVD to extract the most important principal features from the Hankel matrix of the PFs. The order of effective ranks of the Hankel matrix is determined by the number of main frequencies in the fast Fourier transform (FFT) result of the signal. Filtering is completed by reconstructing the signal from the SVD decomposition result. Experimental analysis on simulated signals and vibration signals collected from rolling bearings indicates that LMD-MHD is effective for extracting weak fault features and performs well for bearing fault diagnosis.
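The SVD layer is the easiest part of the chain to make concrete. Below is a sketch of rank-truncated Hankel denoising, with the retained rank tied to the number of dominant FFT lines as the paper prescribes; the window length and the rank rule are simplified assumptions:

```python
import numpy as np

def hankel_svd_denoise(x, n_freqs):
    """Embed the signal in a Hankel matrix, keep the leading singular
    pairs, and average anti-diagonals back into a signal. Rank
    2*n_freqs keeps one conjugate pair per real sinusoid, with n_freqs
    read off as the number of dominant lines in the signal's FFT."""
    n = len(x)
    w = n // 2                                          # window length
    H = np.lib.stride_tricks.sliding_window_view(x, w)  # H[i, j] = x[i+j]
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    r = 2 * n_freqs
    Hr = (U[:, :r] * s[:r]) @ Vt[:r]
    out, cnt = np.zeros(n), np.zeros(n)
    for i in range(Hr.shape[0]):        # average the anti-diagonals
        out[i:i + w] += Hr[i]
        cnt[i:i + w] += 1
    return out / cnt
```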

Journal ArticleDOI
TL;DR: A virtual active element pattern (AEP) expansion method is presented in which each AEP in an unequally spaced array is considered to be the pattern radiated by a subarray of some equally spaced virtual elements, which can be efficiently evaluated by fast Fourier transform (FFT).
Abstract: A virtual active element pattern (AEP) expansion method is presented in which each AEP in an unequally spaced array is considered to be the pattern radiated by a subarray of some equally spaced virtual elements. With the help of this method, the pattern of an unequally spaced array including mutual coupling can be efficiently evaluated by fast Fourier transform (FFT). By incorporating this idea into the iterative Fourier transform procedure, we develop a novel iterative synthesis method, which can apply the iterative FFT to efficiently synthesize unequally spaced arrays including mutual coupling. Different excitation constraints, such as phase-only control and amplitude-phase optimization with a prescribed dynamic range ratio, can be easily added into the proposed synthesis procedure. A set of synthesis examples for different antenna arrays with pencil and shaped beam patterns are provided to validate the effectiveness and advantages of the proposed method.
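The reason casting every AEP onto equally spaced virtual elements pays off is that, for an equispaced array, the far-field pattern is a DFT of the element excitations, so a zero-padded FFT samples the pattern densely. A sketch of that kernel (sign and normalization conventions vary):

```python
import numpy as np

def array_factor(weights, d_lambda, n_pts=4096):
    """Array factor of an equally spaced linear array (spacing d_lambda
    wavelengths) via one zero-padded FFT: bin m samples the factor at
    psi_m = 2*pi*m/n_pts, where psi = 2*pi*d_lambda*cos(theta)."""
    af = np.fft.fft(weights, n_pts)
    psi = 2 * np.pi * np.fft.fftfreq(n_pts)   # psi in [-pi, pi)
    u = psi / (2 * np.pi * d_lambda)          # u = cos(theta)
    vis = np.abs(u) <= 1                      # visible region only
    return np.arccos(u[vis]), np.abs(af[vis])
```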

Posted Content
TL;DR: This paper proposes a frequency synchronization scheme for multiuser orthogonal frequency division multiplexing uplink with a large-scale uniform linear array at base station (BS) by exploiting the angle information of users to perform carrier frequency offset estimation for each user individually through a joint spatial-frequency alignment procedure.
Abstract: In this paper, we propose a frequency synchronization scheme for multiuser orthogonal frequency division multiplexing (OFDM) uplink with a large-scale uniform linear array (ULA) at the base station (BS) by exploiting the angle information of users. Considering that the incident signal at the BS from each user can be restricted within a certain angular spread, the proposed scheme can perform carrier frequency offset (CFO) estimation for each user individually through a joint spatial-frequency alignment procedure, completed efficiently with the aid of the fast Fourier transform (FFT). A multi-branch receive beamforming scheme is further designed to yield an equivalent single-user transmission model for which conventional single-user channel estimation and data detection can be carried out. To make the study complete, a theoretical performance analysis of the CFO estimation is also conducted. We further develop a user grouping scheme to deal with the unexpected scenarios in which some users may not be separated well in the spatial domain. Finally, various numerical results are provided to verify the proposed studies.

Journal ArticleDOI
TL;DR: In this article, the propagation of spatially correlated digital elevation model errors into gravimetric terrain corrections is modeled using the 2D Fourier transform, which can be applied to planar terrain correction.
Abstract: We have identified a gap in the literature on error propagation in the gravimetric terrain correction. Therefore, we have derived a mathematical framework to model the propagation of spatially correlated digital elevation model errors into gravimetric terrain corrections. As an example, we have determined how such an error model can be formulated for the planar terrain correction and then be evaluated efficiently using the 2D Fourier transform. We have computed 18.3 billion linear terrain corrections and corresponding error estimates for a 1 arc-second (∼30 m) digital elevation model covering the whole of the Australian continent.

Journal ArticleDOI
TL;DR: In this article, the fast Fourier transform was used to decompose standard conversion matrices between coefficients in classical orthogonal polynomial expansions into diagonally-scaled Hadamard products involving Toeplitz and Hankel matrices.
Abstract: Many standard conversion matrices between coefficients in classical orthogonal polynomial expansions can be decomposed using diagonally-scaled Hadamard products involving Toeplitz and Hankel matrices. This allows us to derive $\mathcal{O}(N(\log N)^2)$ algorithms, based on the fast Fourier transform, for converting coefficients of a degree $N$ polynomial in one polynomial basis to coefficients in another. Numerical results show that this approach is competitive with state-of-the-art techniques, requires no precomputational cost, can be implemented in a handful of lines of code, and is easily adapted to extended precision arithmetic.
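These conversions sit on top of the standard FFT-based transforms between values and coefficients. As a flavor of that building block (not the paper's Toeplitz-Hankel algorithm itself), here is the classic values-to-Chebyshev-coefficients transform via one FFT of an even extension:

```python
import numpy as np

def cheb_coeffs(f, n):
    """Chebyshev coefficients c_0..c_n of f from its values at the
    n+1 Chebyshev-Lobatto points x_j = cos(pi*j/n), via one FFT of
    the even (mirror) extension: an O(N log N) transform."""
    x = np.cos(np.pi * np.arange(n + 1) / n)
    v = f(x)
    V = np.concatenate([v, v[-2:0:-1]])        # even extension, length 2n
    c = np.fft.fft(V).real[:n + 1] / n
    c[0] /= 2
    c[n] /= 2
    return c

# f = 2*T_3 + 0.5*T_1, so c[1] = 0.5 and c[3] = 2 up to rounding.
c = cheb_coeffs(lambda x: 8 * x**3 - 5.5 * x, 8)
```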

Journal ArticleDOI
TL;DR: A numerical methodology is presented to compute the solution of an adhesive normal contact problem on rough surfaces with the Boundary Element Method; based on the fast Fourier transform and Westergaard's fundamental solution, it efficiently solves the constrained minimization problem.
Abstract: We introduce a numerical methodology to compute the solution of an adhesive normal contact problem on rough surfaces with the Boundary Element Method. Based on the fast Fourier transform and Westergaard's fundamental solution, the proposed algorithm solves the constrained minimization problem efficiently: the numerical solution strictly verifies contact orthogonality, and the algorithm takes advantage of the constraints to speed up the minimization. Comparisons with the analytical solution of the Hertz case demonstrate the quality of the numerical computation. The method is also used to compute normal adhesive contact between rough surfaces made of multiple asperities.

Journal ArticleDOI
TL;DR: A BOTDR sensor is implemented that combines complementary coding with the fast Fourier transform (FFT) technique for high-performance distributed sensing; the complementary coding provides an enhanced signal-to-noise ratio, which leads to high-accuracy measurement.
Abstract: We implement a BOTDR sensor that combines complementary coding with the fast Fourier transform (FFT) technique for high-performance distributed sensing. The employment of complementary coding provides an enhanced signal-to-noise ratio of the sensing system, which leads to high-accuracy measurement. Meanwhile, the FFT technique is employed in the BOTDR to sharply reduce the measurement time compared with the classical frequency-sweeping technique. In addition, a pre-depletion two-wavelength probe pulse is proposed to suppress the distortion of the coded probe pulse induced by the EDFA. Experiments are carried out beyond 10 km of single-mode fiber, and the results show the capability of the proposed scheme to achieve 2 m spatial resolution with 0.37 MHz frequency uncertainty, which corresponds to ∼0.37 °C temperature resolution or ∼7.4 με strain resolution. In theory, the measurement can be tens of times faster than the traditional frequency-sweeping method.

Journal ArticleDOI
TL;DR: The FFT bispectrum estimator presented in this paper offers speed and simplicity benefits over a direct sampling approach, and can be applied to polyspectra of any order, such as the trispectrum, at the cost of only a handful of FFTs.
Abstract: In this paper we establish the accuracy and robustness of a fast estimator for the bispectrum - the "FFT bispectrum estimator". The implementation of the estimator presented here offers speed and simplicity benefits over a direct sampling approach. We also generalise the derivation so it may easily be applied to polyspectra of any order, such as the trispectrum, at the cost of only a handful of FFTs. All lower-order statistics can also be calculated simultaneously for little extra cost. To test the estimator we make use of a non-linear density field, and for a more strongly non-Gaussian test case we use a toy model of reionization in which ionized bubbles at a given redshift are all of equal size and are randomly distributed. Our tests find that the FFT estimator remains accurate over a wide range of k, and so should be extremely useful for analysis of 21-cm observations. The speed of the FFT bispectrum estimator makes it suitable for sampling applications, such as Bayesian inference. The algorithm we describe should prove valuable in the analysis of simulations and observations, and whilst we apply it within the field of cosmology, this estimator is useful in any field that deals with non-Gaussian data.
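The estimator's structure is compact enough to sketch: filter the field to each k-shell in Fourier space, inverse-FFT, multiply the three real-space maps, and normalize by the same construction applied to unit shells, which counts the closed triangles. Binning and normalization conventions below are illustrative:

```python
import numpy as np

def fft_bispectrum(delta, k1, k2, k3, dk, boxsize):
    """FFT estimator of the bin-averaged bispectrum of a cubic 3-D
    field. Returns the mean of delta_k1*delta_k2*delta_k3 over closed
    triangles; the dimensional prefactor depends on the Fourier
    convention in use."""
    n = delta.shape[0]
    dfield = np.fft.fftn(delta)
    kf = 2 * np.pi / boxsize                    # fundamental mode
    k1d = np.fft.fftfreq(n, d=1.0 / n) * kf
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing='ij')
    kmag = np.sqrt(kx**2 + ky**2 + kz**2)

    def shell(kc):
        mask = np.abs(kmag - kc) < dk / 2       # spherical k-bin
        d_x = np.fft.ifftn(dfield * mask).real  # shell-filtered field
        i_x = np.fft.ifftn(mask.astype(float)).real   # unit shell
        return d_x, i_x

    (d1, i1), (d2, i2), (d3, i3) = shell(k1), shell(k2), shell(k3)
    return np.sum(d1 * d2 * d3) / np.sum(i1 * i2 * i3)
```

Only a handful of FFTs are needed per triangle configuration, and reusing the shell maps across configurations amortizes the cost further.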

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This is the first code that achieves the optimum robustness in terms of tolerating stragglers or failures for computing Fourier transforms, and the reconstruction process for coded FFT can be mapped to MDS decoding, which can be solved efficiently.
Abstract: We consider the problem of computing the Fourier transform of high-dimensional vectors, distributedly over a cluster of machines consisting of a master node and multiple worker nodes, where the worker nodes can only store and process a fraction of the inputs. We show that by exploiting the algebraic structure of the Fourier transform operation and leveraging concepts from coding theory, one can efficiently deal with the straggler effects. In particular, we propose a computation strategy, named as coded FFT, which achieves the optimal recovery threshold, defined as the minimum number of workers that the master node needs to wait for in order to compute the output. This is the first code that achieves the optimum robustness in terms of tolerating stragglers or failures for computing Fourier transforms. Furthermore, the reconstruction process for coded FFT can be mapped to MDS decoding, which can be solved efficiently. Moreover, we extend coded FFT to settings including computing general n-dimensional Fourier transforms, and provide the optimal computing strategy for those settings.

Journal ArticleDOI
TL;DR: The proposed algorithm directly optimizes the weighted integrated sidelobe level (WISL) without limitations on the weights, filling a gap in the open literature, and involves an FFT operation at each iteration, which ensures fast convergence.

Posted Content
TL;DR: In this paper, a multiple classifiers fusion localization technique using received signal strengths (RSSs) of visible light is proposed, in which the proposed system transmits different intensity modulated sinusoidal signals by LEDs and the signals received by a Photo Diode (PD) placed at various grid points.
Abstract: A multiple classifiers fusion localization technique using received signal strengths (RSSs) of visible light is proposed, in which the proposed system transmits different intensity-modulated sinusoidal signals via LEDs and the signals are received by a photodiode (PD) placed at various grid points. First, we obtain approximate received signal strength (RSS) fingerprints by capturing the peaks of the power spectral density (PSD) of the received signals at each given grid point. Unlike existing RSS-based algorithms, several representative machine learning approaches are adopted to train multiple classifiers based on these RSS fingerprints. The multiple-classifier localization estimators outperform the classical RSS-based LED localization approaches in accuracy and robustness. To further improve the localization performance, two robust fusion localization algorithms, namely grid-independent least squares (GI-LS) and grid-dependent least squares (GD-LS), are proposed to combine the outputs of these classifiers. We also use a singular value decomposition (SVD) based LS (LS-SVD) method to mitigate the numerical stability problem when the prediction matrix is singular. Experiments conducted on intensity modulation/direct detection (IM/DD) systems have demonstrated the effectiveness of the proposed algorithms. The experimental results show that the probability of having a mean square positioning error (MSPE) of less than 5 cm achieved by GD-LS is improved by 93.03% and 93.15%, respectively, as compared to those of the RSS ratio (RSSR) and RSS matching methods with an FFT length of 2000.

Journal ArticleDOI
TL;DR: A fast algorithm that does not search the target's motion parameters is proposed to address the detection of radar maneuvering targets with jerk motion; comparisons with other representative algorithms in computational cost, motion parameter estimation performance, and detection ability indicate that the proposed algorithm achieves a good balance between computational cost and detection ability.
Abstract: The detection performance of radar maneuvering targets with jerk motion is affected by range migration (RM) and Doppler frequency migration (DFM). To address these problems, a fast algorithm that does not search the target's motion parameters is proposed. In this algorithm, the second-order keystone transform is first applied to eliminate the quadratic coupling between the range frequency and slow time. Then, by employing a newly defined symmetric autocorrelation function, the scaled Fourier transform, and the inverse fast Fourier transform, the target's initial range and velocity are estimated. With these two estimates, the azimuth echoes along the target's trajectory, which can be modeled as a cubic phase signal (CPS), are extracted. Thereafter, the target's radial acceleration and jerk are estimated by approaches for parameter estimation of the CPS. Finally, by constructing a compensation function, the RM and DFM are compensated simultaneously, followed by coherent integration and target detection. Comparisons with other representative algorithms in computational cost, motion parameter estimation performance, and detection ability indicate that the proposed algorithm achieves a good balance between computational cost and detection ability. The simulation and raw data processing results demonstrate the effectiveness of the proposed algorithm.

Journal ArticleDOI
TL;DR: In this paper, a mathematical model based on the concept of modal strain energy and a signal processing method based on the Hilbert-Huang transform (HHT) were proposed to identify cracks.

Journal ArticleDOI
TL;DR: A fast method based on the sequence reversing transform is proposed to eliminate the RW and achieve coherent accumulation; it avoids the blind-speed sidelobe and strikes a good balance between computational cost and detection ability.
Abstract: This paper addresses the coherent integration problem for detecting a high-speed target, involving range walk (RW) within the coherent processing time. A fast method based on the sequence reversing transform is proposed to eliminate the RW and achieve coherent accumulation. The proposed method is simple and fast, since it can be easily implemented using complex multiplications, the fast Fourier transform (FFT), and the inverse FFT. Compared with the Radon-Fourier transform and moving target detection methods, the presented algorithm avoids the blind-speed sidelobe and strikes a good balance between computational cost and detection ability. Finally, several simulations are provided to demonstrate its effectiveness.