
Showing papers on "Conjugate gradient method published in 2014"


Journal ArticleDOI
TL;DR: This work proposes an alternative formulation of multitarget tracking as the minimization of a continuous energy, focusing on designing an energy that corresponds to a more complete representation of the problem rather than one that is amenable to global optimization.
Abstract: Many recent advances in multiple target tracking aim at finding a (nearly) optimal set of trajectories within a temporal window. To handle the large space of possible trajectory hypotheses, it is typically reduced to a finite set by some form of data-driven or regular discretization. In this work, we propose an alternative formulation of multitarget tracking as minimization of a continuous energy. Contrary to recent approaches, we focus on designing an energy that corresponds to a more complete representation of the problem, rather than one that is amenable to global optimization. Besides the image evidence, the energy function takes into account physical constraints, such as target dynamics, mutual exclusion, and track persistence. In addition, partial image evidence is handled with explicit occlusion reasoning, and different targets are disambiguated with an appearance model. To nevertheless find strong local minima of the proposed nonconvex energy, we construct a suitable optimization scheme that alternates between continuous conjugate gradient descent and discrete transdimensional jump moves. These moves, which are executed such that they always reduce the energy, allow the search to escape weak minima and explore a much larger portion of the search space of varying dimensionality. We demonstrate the validity of our approach with an extensive quantitative evaluation on several public data sets.

616 citations


Journal ArticleDOI
01 Jul 2014
TL;DR: A modified preconditioned Conjugate Gradient (CG) method is presented that removes the costly global synchronization steps from the standard CG algorithm by only performing a single non-blocking reduction per iteration.
Abstract: Scalability of Krylov subspace methods suffers from costly global synchronization steps that arise in dot-products and norm calculations on parallel machines. In this work, a modified preconditioned Conjugate Gradient (CG) method is presented that removes the costly global synchronization steps from the standard CG algorithm by only performing a single non-blocking reduction per iteration. This global communication phase can be overlapped by the matrix-vector product, which typically only requires local communication. The resulting algorithm will be referred to as pipelined CG. An alternative pipelined method, mathematically equivalent to the Conjugate Residual (CR) method, which makes different trade-offs with regard to scalability and serial runtime, is also considered. These methods are compared to a recently proposed asynchronous CG algorithm by Gropp. Extensive numerical experiments demonstrate the numerical stability of the methods. Moreover, it is shown that hiding the global synchronization step improves scalability on distributed memory machines using the message passing paradigm and leads to significant speedups compared to standard preconditioned CG.
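The baseline that this cost argument refers to is worth making concrete. Below is a minimal NumPy sketch of the textbook preconditioned CG iteration with its two per-iteration dot products marked; these are the global reductions that the pipelined variant described above fuses into a single non-blocking reduction and overlaps with the matrix-vector product. The sketch is illustrative only and does not reproduce the authors' pipelined recurrences.

```python
import numpy as np

def preconditioned_cg(A, b, M_inv, tol=1e-8, max_iter=1000):
    """Textbook preconditioned CG. The two dot products per iteration are the
    global reductions that pipelined CG fuses and hides behind the SpMV."""
    x = np.zeros_like(b)
    r = b - A @ x
    u = M_inv @ r                    # apply preconditioner
    p = u.copy()
    gamma = r @ u                    # global reduction no. 1
    for _ in range(max_iter):
        s = A @ p                    # SpMV: only local/neighbour communication
        alpha = gamma / (p @ s)      # global reduction no. 2
        x += alpha * p
        r -= alpha * s
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        u = M_inv @ r
        gamma_new = r @ u
        p = u + (gamma_new / gamma) * p
        gamma = gamma_new
    return x
```

With M_inv taken as the inverse diagonal of A this is Jacobi-preconditioned CG; in a distributed-memory setting each marked dot product implies a blocking all-reduce, which is exactly the synchronization the paper targets.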

157 citations


Journal ArticleDOI
TL;DR: In this article, a truncated Newton method is proposed for the reconstruction of P-wave velocity models, in which the model update is computed through a matrix-free conjugate gradient resolution of the Newton linear system so as to better account for the effect of the inverse Hessian operator.
Abstract: Full Waveform Inversion (FWI) is a powerful tool for quantitative seismic imaging from wide-azimuth seismic data. The method is based on the minimization of the misfit between observed and simulated data. This amounts to the resolution of a large-scale nonlinear minimization problem. The inverse Hessian operator plays a crucial role in this reconstruction process. Accounting accurately for the effect of this operator within the minimization scheme should correct for illumination deficits, restore the amplitude of the subsurface parameters, and help to remove artifacts generated by energetic multiple reflections. Conventional preconditioned gradient-based minimization methods only roughly approximate the effect of this operator. In this study, we are interested in another class of minimization methods, known as truncated Newton methods. These methods are based on the computation of the model update through a matrix-free conjugate gradient resolution of the Newton linear system. The aim of this study is to present a feasible implementation of this method for the FWI problem, based on a second-order adjoint state formulation for the computation of Hessian-vector products. We compare this method with the nonlinear conjugate gradient and the l-BFGS method within the context of 2D acoustic frequency-domain FWI for the reconstruction of P-wave velocity models. Two test cases are investigated. The first is the synthetic BP 2004 model, representative of the Gulf of Mexico geology, with high velocity contrasts associated with the presence of salt structures. The second is a 2D real data set from the Valhall oil field in the North Sea. These tests emphasize the interesting properties of the truncated Newton method compared with conventional optimization methods within the context of FWI.

136 citations


Journal ArticleDOI
TL;DR: An iterative image-domain decomposition method for noise suppression in DECT is proposed, using the full variance-covariance matrix of the decomposed images; it shows superior noise suppression performance while maintaining high image spatial resolution and low-contrast detectability.
Abstract: Purpose: Dual energy CT (DECT) imaging plays an important role in advanced imaging applications due to its capability of material decomposition. Direct decomposition via matrix inversion suffers from significant degradation of image signal-to-noise ratios, which reduces clinical values of DECT. Existing denoising algorithms achieve suboptimal performance since they suppress image noise either before or after the decomposition and do not fully explore the noise statistical properties of the decomposition process. In this work, the authors propose an iterative image-domain decomposition method for noise suppression in DECT, using the full variance-covariance matrix of the decomposed images. Methods: The proposed algorithm is formulated in the form of least-square estimation with smoothness regularization. Based on the design principles of a best linear unbiased estimator, the authors include the inverse of the estimated variance-covariance matrix of the decomposed images as the penalty weight in the least-square term. The regularization term enforces the image smoothness by calculating the square sum of neighboring pixel value differences. To retain the boundary sharpness of the decomposed images, the authors detect the edges in the CT images before decomposition. These edge pixels have small weights in the calculation of the regularization term. Distinct from the existing denoising algorithms applied on the images before or after decomposition, the method has an iterative process for noise suppression, with decomposition performed in each iteration. The authors implement the proposed algorithm using a standard conjugate gradient algorithm. The method performance is evaluated using an evaluation phantom (Catphan©600) and an anthropomorphic head phantom. The results are compared with those generated using direct matrix inversion with no noise suppression, a denoising method applied on the decomposed images, and an existing algorithm with similar formulation as the proposed method but with an edge-preserving regularization term. Results: On the Catphan phantom, the method maintains the same spatial resolution on the decomposed images as that of the CT images before decomposition (8 pairs/cm) while significantly reducing their noise standard deviation. Compared to that obtained by the direct matrix inversion, the noise standard deviation in the images decomposed by the proposed algorithm is reduced by over 98%. Without considering the noise correlation properties in the formulation, the denoising scheme degrades the spatial resolution to 6 pairs/cm for the same level of noise suppression. Compared to the edge-preserving algorithm, the method achieves better low-contrast detectability. A quantitative study is performed on the contrast-rod slice of Catphan phantom. The proposed method achieves lower electron density measurement error as compared to that by the direct matrix inversion, and significantly reduces the error variation by over 97%. On the head phantom, the method reduces the noise standard deviation of decomposed images by over 97% without blurring the sinus structures. Conclusions: The authors propose an iterative image-domain decomposition method for DECT. The method combines noise suppression and material decomposition into an iterative process and achieves both goals simultaneously. By exploring the full variance-covariance properties of the decomposed images and utilizing the edge predetection, the proposed algorithm shows superior performance on noise suppression with high image spatial resolution and low-contrast detectability.
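The estimator described above has the general shape of a penalized, variance-weighted least-squares problem solved with conjugate gradients. The sketch below shows only that generic shape, with a diagonal weight standing in for the inverse variance-covariance estimate and a first-difference operator standing in for the neighboring-pixel smoothness penalty; the actual DECT decomposition operators, edge predetection, and per-iteration covariance updates from the paper are not reproduced.

```python
import numpy as np
from scipy.sparse.linalg import cg

def penalized_wls(A, b, w, lam):
    """Generic penalized weighted least squares:
        min_x  (A x - b)^T W (A x - b) + lam * ||D x||^2,
    with W = diag(w) (a stand-in for the inverse variance-covariance estimate)
    and D a first-difference smoothness operator, solved by CG on the SPD
    normal equations (A^T W A + lam * D^T D) x = A^T W b."""
    n = A.shape[1]
    D = np.diff(np.eye(n), axis=0)                  # (n-1, n) first differences
    H = A.T @ (w[:, None] * A) + lam * (D.T @ D)    # normal-equation operator
    g = A.T @ (w * b)
    x, info = cg(H, g)
    if info != 0:
        raise RuntimeError("CG did not converge")
    return x
```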

128 citations


Journal ArticleDOI
TL;DR: This work demonstrates that the Moulinec-Suquet setting is actually equivalent to a Galerkin discretization of the cell problem, based on approximation spaces spanned by trigonometric polynomials and a suitable numerical integration scheme, and proves convergence of the approximate solution to the weak solution.
Abstract: In 1994, Moulinec and Suquet introduced an efficient technique for the numerical resolution of the cell problem arising in homogenization of periodic media. The scheme is based on a fixed-point iterative solution to an integral equation of the Lippmann-Schwinger type, with action of its kernel efficiently evaluated by the Fast Fourier Transform techniques. The aim of this work is to demonstrate that the Moulinec-Suquet setting is actually equivalent to a Galerkin discretization of the cell problem, based on approximation spaces spanned by trigonometric polynomials and a suitable numerical integration scheme. For the latter framework and scalar elliptic problems, we prove convergence of the approximate solution to the weak solution, including a-priori estimates for the rate of convergence for sufficiently regular data and the effects of numerical integration. Moreover, we also show that the variational structure implies that the resulting non-symmetric system of linear equations can be solved by the conjugate gradient method. Apart from providing a theoretical support to Fast Fourier Transform-based methods for numerical homogenization, these findings significantly improve on the performance of the original solver and pave the way to similar developments for its many generalizations proposed in the literature.

127 citations


Journal ArticleDOI
TL;DR: By constructing an objective function and using the gradient search, a gradient-based iteration is established for solving the coupled matrix equations, and the authors prove that the gradient solution is convergent for any initial values.
Abstract: By constructing an objective function and using the gradient search, a gradient-based iteration is established for solving the coupled matrix equations $A_i X B_i = F_i$, $i = 1, 2, \ldots, p$. The authors prove that the gradient solution is convergent for any initial values. By analysing the spectral radius of the iterative matrix, the authors obtain an optimal convergence factor. An example is provided to illustrate the effectiveness of the proposed algorithm and to verify the conclusions established in this study.
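As a rough illustration of a gradient iteration of this form (not the authors' algorithm or their optimal convergence factor), the sketch below performs gradient descent on the least-squares objective 0.5 * sum_i ||F_i - A_i X B_i||_F^2 with a fixed, conservatively chosen step size; the default step-size bound is an assumption made for the sketch.

```python
import numpy as np

def gradient_iteration(As, Bs, Fs, mu=None, n_iter=5000, tol=1e-10):
    """Gradient iteration for the coupled equations A_i X B_i = F_i, i = 1..p:
        X <- X + mu * sum_i A_i^T (F_i - A_i X B_i) B_i^T,
    which is gradient descent on 0.5 * sum_i ||F_i - A_i X B_i||_F^2.
    The default mu is a crude spectral-norm bound (an assumption), not the
    optimal convergence factor derived in the paper."""
    m, n = As[0].shape[1], Bs[0].shape[0]
    X = np.zeros((m, n))
    if mu is None:
        mu = 1.0 / sum(np.linalg.norm(A, 2) ** 2 * np.linalg.norm(B, 2) ** 2
                       for A, B in zip(As, Bs))
    for _ in range(n_iter):
        R = [F - A @ X @ B for A, B, F in zip(As, Bs, Fs)]
        if max(np.linalg.norm(Ri) for Ri in R) < tol:
            break
        X = X + mu * sum(A.T @ Ri @ B.T for A, B, Ri in zip(As, Bs, R))
    return X
```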

119 citations


Journal ArticleDOI
TL;DR: This work compared the performances of three types of training algorithms in feed forward neural network for brain hematoma classification on the basis of mean square error, accuracy, rate of convergence and correctness of the classification.
Abstract: Classification is one of the most important tasks in application areas of artificial neural networks (ANN). Training neural networks is a complex task in the supervised learning field of research. The main difficulty in adopting ANN is to find the most appropriate combination of learning, transfer and training functions for the classification task. We compared the performance of three types of training algorithms in a feed forward neural network for brain hematoma classification. In this work we selected the gradient descent based algorithms: Gradient Descent backpropagation, Gradient Descent with momentum, and Resilient backpropagation. Under the conjugate gradient based algorithms, we selected Scaled Conjugate Gradient backpropagation, Conjugate Gradient backpropagation with Polak-Ribiere updates (CGP), and Conjugate Gradient backpropagation with Fletcher-Reeves updates (CGF). The last category is the quasi-Newton based algorithms, under which the BFGS and Levenberg-Marquardt algorithms are selected. The proposed work compares the training algorithms on the basis of mean square error, accuracy, rate of convergence and correctness of the classification. Our conclusion about the training functions is based on the simulation results.

116 citations


Proceedings ArticleDOI
TL;DR: In this article, a conjugate gradient (CG) method was proposed to reduce the complexity of data detection and precoding in massive MIMO systems, and a novel way of computing the signal-to-interference-and-noise ratio (SINR) directly within the CG algorithm was proposed.
Abstract: Massive multiple-input multiple-output (MIMO) promises improved spectral efficiency, coverage, and range, compared to conventional (small-scale) MIMO wireless systems. Unfortunately, these benefits come at the cost of significantly increased computational complexity, especially for systems with realistic antenna configurations. To reduce the complexity of data detection (in the uplink) and precoding (in the downlink) in massive MIMO systems, we propose to use conjugate gradient (CG) methods. While precoding using CG is rather straightforward, soft-output minimum mean-square error (MMSE) detection requires the computation of the post-equalization signal-to-interference-and-noise-ratio (SINR). To enable CG for soft-output detection, we propose a novel way of computing the SINR directly within the CG algorithm at low complexity. We investigate the performance/complexity trade-offs associated with CG-based soft-output detection and precoding, and we compare it to exact and approximate methods. Our results reveal that the proposed method outperforms existing algorithms for massive MIMO systems with realistic antenna configurations.
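For orientation, the MMSE equalization step that CG replaces amounts to solving a Hermitian positive-definite system. The sketch below is a plain complex-valued CG solve of (H^H H + N0 I) x = H^H y; the paper's actual contribution, computing the post-equalization SINR inside the CG recursion, is not reproduced here.

```python
import numpy as np

def cg_mmse_detect(H, y, N0, n_iter=20):
    """MMSE data detection via CG: solve (H^H H + N0*I) x = H^H y, i.e. the
    regularized normal equations, without an explicit matrix inverse.
    H: (B x U) complex channel matrix, y: (B,) received vector."""
    U = H.shape[1]
    A = H.conj().T @ H + N0 * np.eye(U)   # Gram matrix, formed explicitly for clarity
    b = H.conj().T @ y                    # matched-filter output
    x = np.zeros(U, dtype=complex)
    r = b - A @ x
    p = r.copy()
    rs = np.vdot(r, r).real
    for _ in range(n_iter):
        Ap = A @ p
        alpha = rs / np.vdot(p, Ap).real
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r).real
        if np.sqrt(rs_new) < 1e-12:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

The appeal in the massive MIMO regime is that a handful of CG iterations can stand in for the explicit matrix inverse required by exact MMSE detection.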

114 citations


Proceedings ArticleDOI
01 Dec 2014
TL;DR: The proposed conjugate gradient (CG) methods are able to outperform existing methods for massive MIMO systems with realistic antenna configurations and a novel way of computing the SINR directly within the CG algorithm at low complexity is proposed.
Abstract: Massive multiple-input multiple-output (MIMO) promises improved spectral efficiency, coverage, and range, compared to conventional (small-scale) MIMO wireless systems. Unfortunately, these benefits come at the cost of significantly increased computational complexity, especially for systems with realistic antenna configurations. To reduce the complexity of data detection (in the uplink) and precoding (in the downlink) in massive MIMO systems, we propose to use conjugate gradient (CG) methods. While precoding using CG is rather straightforward, soft-output minimum mean-square error (MMSE) detection requires the computation of the post-equalization signal-to-interference-and-noise-ratio (SINR). To enable CG for soft-output detection, we propose a novel way of computing the SINR directly within the CG algorithm at low complexity. We investigate the performance/complexity trade-offs associated with CG-based soft-output detection and precoding, and we compare it to existing exact and approximate methods. Our results reveal that the proposed algorithm is able to outperform existing methods for massive MIMO systems with realistic antenna configurations.

112 citations


Journal ArticleDOI
TL;DR: Two preconditioned iterative methods, namely, the preconditioned generalized minimal residual (preconditioned GMRES) method and the preconditioned conjugate gradient for normal residual (preconditioned CGNR) method, are proposed to solve relevant linear systems.

100 citations


Journal ArticleDOI
TL;DR: How reconstruction quality degrades with uncertainties in the scan positions is explored, and it is shown experimentally that large errors in the assumed scan positions on the sample can be numerically determined and corrected using conjugate gradient descent methods.
Abstract: Ptychographic coherent x-ray diffractive imaging is a form of scanning microscopy that does not require optics to image a sample. A series of scanned coherent diffraction patterns recorded from multiple overlapping illuminated regions on the sample are inverted numerically to retrieve its image. The technique recovers the phase lost by detecting the diffraction patterns by using experimentally known constraints, in this case the measured diffraction intensities and the assumed scan positions on the sample. The spatial resolution of the recovered image of the sample is limited by the angular extent over which the diffraction patterns are recorded and how well these constraints are known. Here, we explore how reconstruction quality degrades with uncertainties in the scan positions. We show experimentally that large errors in the assumed scan positions on the sample can be numerically determined and corrected using conjugate gradient descent methods. We also explore in simulations the limits, based on the signal to noise of the diffraction patterns and amount of overlap between adjacent scan positions, of just how large these errors can be and still be rendered tractable by this method.

Journal ArticleDOI
TL;DR: It is shown that the methods of the suggested class of Dai–Liao conjugate gradient methods are globally convergent for uniformly convex objective functions.
Abstract: Based on an eigenvalue study, a descent class of Dai–Liao conjugate gradient methods is proposed. An interesting feature of the proposed class is its inclusion of the efficient nonlinear conjugate gradient methods proposed by Hager and Zhang, and Dai and Kou, as special cases. It is shown that the methods of the suggested class are globally convergent for uniformly convex objective functions. Numerical results are reported; they demonstrate the efficiency of the proposed methods in the sense of the performance profile introduced by Dolan and Moré.
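For readers unfamiliar with the Dai–Liao family, the sketch below implements a nonlinear CG iteration with one common form of the Dai–Liao update, beta_k = g_{k+1}^T (y_k - t s_k) / (d_k^T y_k), using a fixed parameter t and a simple Armijo backtracking line search. Both of these choices are simplifying assumptions; the class proposed in the paper selects the parameter from an eigenvalue analysis and its convergence theory relies on Wolfe-type line searches.

```python
import numpy as np

def dai_liao_cg(f, grad, x0, t=0.1, tol=1e-6, max_iter=500):
    """Nonlinear CG with a Dai-Liao-type direction update:
        d_{k+1} = -g_{k+1} + beta_k * d_k,
        beta_k  = g_{k+1}^T (y_k - t*s_k) / (d_k^T y_k),
    where s_k = x_{k+1} - x_k and y_k = g_{k+1} - g_k.
    Fixed t and Armijo backtracking are simplifications for this sketch."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha, fx, gTd = 1.0, f(x), g @ d
        while alpha > 1e-12 and f(x + alpha * d) > fx + 1e-4 * alpha * gTd:
            alpha *= 0.5                      # Armijo backtracking
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        denom = d @ y
        beta = (g_new @ (y - t * s)) / denom if abs(denom) > 1e-16 else 0.0
        d = -g_new + beta * d
        if g_new @ d >= 0:                    # safeguard: restart with steepest descent
            d = -g_new
        x, g = x_new, g_new
    return x
```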

Journal ArticleDOI
TL;DR: This paper proposes a modified Polak-Ribiere-Polyak (PRP) CG algorithm for solving a nonsmooth unconstrained convex minimization problem and shows that the algorithm converges globally to an optimal solution.

Journal ArticleDOI
TL;DR: A computational framework is presented for materials science models that come from energy gradient flows, which includes higher order derivative models and vector problems and results from a fast, graphical processing unit implementation for a three-dimensional model are shown.

Journal ArticleDOI
TL;DR: Two modified conjugate gradient methods are proposed by Dai and Liao and it is briefly shown that the methods are globally convergent when the line search fulfills the strong Wolfe conditions.

Journal ArticleDOI
TL;DR: A new method for simulating subsurface flow in a system of fractures based on a PDE-constrained optimization reformulation is introduced, removing all difficulties related to mesh generation and providing an easily parallel approach to the problem.

Proceedings Article
21 Jun 2014
TL;DR: An efficient method, called Riemannian Pursuit, is proposed that addresses ill-conditioned and large-scale low-rank matrix recovery through a sequence of fixed-rank optimization problems and substantially outperforms existing methods when applied to large-scale and ill-conditioned matrices.
Abstract: Low rank matrix recovery is a fundamental task in many real-world applications. The performance of existing methods, however, deteriorates significantly when applied to ill-conditioned or large-scale matrices. In this paper, we therefore propose an efficient method, called Riemannian Pursuit (RP), that aims to address these two problems simultaneously. Our method consists of a sequence of fixed-rank optimization problems. Each subproblem, solved by a nonlinear Riemannian conjugate gradient method, aims to correct the solution in the most important subspace of increasing size. Theoretically, RP converges linearly under mild conditions and experimental results show that it substantially outperforms existing methods when applied to large-scale and ill-conditioned matrices.

Journal ArticleDOI
TL;DR: It is demonstrated that the performances of these preconditioners are independent of the polynomial order (p independence) and mesh resolution for maximum continuity B-splines, as verified by various numerical tests.

Journal ArticleDOI
TL;DR: It is shown that the classical Jacobi Over-Relaxation method (JOR) should not be used as its convergence requires a proper value of the relaxation parameter, whereas other strategies should be preferred.
Abstract: In this paper, we investigate various numerical strategies to compute the direct space polarization energy and associated forces in the context of the point dipole approximation (including damping) used in polarizable molecular dynamics. We present a careful mathematical analysis of the algorithms that have been implemented in popular production packages and applied to large test systems. We show that the classical Jacobi Over-Relaxation method (JOR) should not be used as its convergence requires a proper value of the relaxation parameter, whereas other strategies should be preferred. On a single node, Preconditioned Conjugate Gradient methods (PCG) and Jacobi algorithm coupled with the Direct Inversion in the Iterative Subspace (JI/DIIS) provide reliable stability/convergence and are roughly twice as fast as JOR. Moreover, both algorithms are suitable for massively parallel implementations. The lower requirements in terms of processes communications make JI/DIIS the method of choice for MPI and hybrid Op...

Journal ArticleDOI
TL;DR: Variational data assimilation problems in meteorology and oceanography require the solution of a regularized nonlinear least‐squares problem, and the dual formulation can reduce both memory usage and computational cost.
Abstract: Variational data assimilation problems in meteorology and oceanography require the solution of a regularized nonlinear least-squares problem. Practical solution algorithms are based on the incremental (truncated Gauss–Newton) approach, which involves the iterative solution of a sequence of linear least-squares (quadratic minimization) sub-problems. Each sub-problem can be solved using a primal approach, where the minimization is performed in a space spanned by vectors of the size of the model control vector, or a dual approach, where the minimization is performed in a space spanned by vectors of the size of the observation vector. The dual formulation can be advantageous for two reasons. First, the dimension of the minimization problem with the dual formulation does not increase when additional control variables are considered, such as those accounting for model error in a weak-constraint formulation. Second, whenever the dimension of observation space is significantly smaller than that of the model control space, the dual formulation can reduce both memory usage and computational cost. In this article, a new dual-based algorithm called Restricted B-preconditioned Lanczos (RBLanczos) is introduced, where B denotes the background-error covariance matrix. RBLanczos is the Lanczos formulation of the Restricted B-preconditioned Conjugate Gradient (RBCG) method. RBLanczos generates mathematically equivalent iterates to those of RBCG and the corresponding B-preconditioned Conjugate Gradient and Lanczos algorithms used in the primal approach. All these algorithms can be implemented without the need for a square-root factorization of B. RBCG and RBLanczos, as well as the corresponding primal algorithms, are implemented in two operational ocean data assimilation systems and numerical results are presented. Practical diagnostic formulae for monitoring the convergence properties of the minimization are also presented.

Journal ArticleDOI
TL;DR: The Lanczos method finds the lowest eigenvalue in a Krylov subspace of increasing size, while the other methods search in a smaller subspace spanned by the set of previous search directions, and hence the theoretical efficiency of the minimum mode finding methods is bounded by the Lanczos method.
Abstract: Minimum mode following algorithms are widely used for saddle point searching in chemical and material systems. Common to these algorithms is a component to find the minimum curvature mode of the second derivative, or Hessian matrix. Several methods, including Lanczos, dimer, Rayleigh-Ritz minimization, shifted power iteration, and locally optimal block preconditioned conjugate gradient, have been proposed for this purpose. Each of these methods finds the lowest curvature mode iteratively without calculating the Hessian matrix, since the full matrix calculation is prohibitively expensive in the high dimensional spaces of interest. Here we unify these iterative methods in the same theoretical framework using the concept of the Krylov subspace. The Lanczos method finds the lowest eigenvalue in a Krylov subspace of increasing size, while the other methods search in a smaller subspace spanned by the set of previous search directions. We show that these smaller subspaces are contained within the Krylov space for which the Lanczos method explicitly finds the lowest curvature mode, and hence the theoretical efficiency of the minimum mode finding methods is bounded by the Lanczos method. Numerical tests demonstrate that the dimer method combined with second-order optimizers approaches, but does not exceed, the efficiency of the Lanczos method for minimum mode optimization.
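As a concrete point of reference for the Lanczos bound discussed above, the sketch below estimates the minimum curvature mode at a point x using a short Lanczos iteration in which Hessian-vector products are approximated by central differences of the gradient, so the Hessian is never formed. It is a bare-bones illustration (no reorthogonalization or convergence control), not the implementation compared in the paper.

```python
import numpy as np

def lowest_mode_lanczos(grad, x, k=15, h=1e-5, seed=0):
    """Estimate the lowest-curvature eigenpair of the Hessian at x with a
    k-step Lanczos iteration; Hessian-vector products use central differences
    of the gradient, so the full Hessian matrix is never built."""
    n = x.size
    hessvec = lambda v: (grad(x + h * v) - grad(x - h * v)) / (2 * h)

    V = np.zeros((n, k))
    alphas, betas = [], []
    q = np.random.default_rng(seed).standard_normal(n)
    q /= np.linalg.norm(q)
    q_prev, beta_prev = np.zeros(n), 0.0
    for j in range(k):
        V[:, j] = q
        w = hessvec(q) - beta_prev * q_prev
        alpha = q @ w
        w -= alpha * q
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        if beta < 1e-12:                 # Krylov space exhausted
            k = j + 1
            break
        betas.append(beta)
        q_prev, q, beta_prev = q, w / beta, beta
    # tridiagonal Rayleigh quotient matrix of the Krylov subspace
    T = np.diag(alphas) + np.diag(betas[:k - 1], 1) + np.diag(betas[:k - 1], -1)
    evals, evecs = np.linalg.eigh(T)
    mode = V[:, :k] @ evecs[:, 0]        # lowest mode lifted back to full space
    return evals[0], mode / np.linalg.norm(mode)
```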

Book
22 Dec 2014
TL;DR: This book treats the Riesz map and operator preconditioning in Hilbert spaces, the conjugate gradient method in Hilbert spaces and its matrix formulation in finite-dimensional spaces, and the relation of preconditioning to the Galerkin discretization.
Abstract: Contents: Preface; 1. Introduction; 2. Linear elliptic partial differential equations; 3. Elements of functional analysis; 4. Riesz map and operator preconditioning; 5. Conjugate gradient method in Hilbert spaces; 6. Finite dimensional Hilbert spaces and the matrix formulation of the conjugate gradient method; 7. Comments on the Galerkin discretization; 8. Preconditioning of the algebraic system as transformation of the discretization basis; 9. Fundamental theorem on discretization; 10. Local and global information in discretization and in computation; 11. Limits of the condition number-based descriptions; 12. Inexact computations, a posteriori error analysis, and stopping criteria; 13. Summary and outlook; Bibliography; Index.

Posted Content
TL;DR: An optimization framework for problems whose solutions are well-approximated by Hierarchical Tucker (HT) tensors, an efficient structured tensor format based on recursive subspace factorizations, and finds that the organization of the tensor can have a major impact on completion from realistic seismic acquisition geometries.
Abstract: In this work, we develop an optimization framework for problems whose solutions are well-approximated by Hierarchical Tucker (HT) tensors, an efficient structured tensor format based on recursive subspace factorizations. By exploiting the smooth manifold structure of these tensors, we construct standard optimization algorithms such as Steepest Descent and Conjugate Gradient for completing tensors from missing entries. Our algorithmic framework is fast and scalable to large problem sizes as we do not require SVDs on the ambient tensor space, as required by other methods. Moreover, we exploit the structure of the Gramian matrices associated with the HT format to regularize our problem, reducing overfitting for high subsampling ratios. We also find that the organization of the tensor can have a major impact on completion from realistic seismic acquisition geometries. These samplings are far from idealized randomized samplings that are usually considered in the literature but are realizable in practical scenarios. Using these algorithms, we successfully interpolate large-scale seismic data sets and demonstrate the competitive computational scaling of our algorithms as the problem sizes grow.

Journal ArticleDOI
TL;DR: This paper proposes an efficient GPU-based algorithm to generate local element information and to assemble the global linear system associated with the FEM discretization of an elliptic PDE and proposes a new fine-grained parallelism strategy, a corresponding multigrid cycling stage and efficient data mapping to the many-core architecture of GPU.

Journal ArticleDOI
TL;DR: Simulations show that the proposed WC-CCM algorithm performs better than existing robust beamforming algorithms, and the performance of the proposed low-complexity algorithms is equivalent to or better than that of existing robust algorithms, whereas the complexity is more than an order of magnitude lower.
Abstract: The authors present a robust adaptive beamforming algorithm based on the worst-case (WC) criterion and the constrained constant modulus (CCM) approach, which exploits the constant modulus property of the desired signal. Similar to the existing worst-case beamformer with the minimum variance design, the problem can be reformulated as a second-order cone programme and solved with interior point methods. An analysis of the optimisation problem is carried out and conditions are obtained for enforcing its convexity and for adjusting its parameters. Furthermore, low-complexity robust adaptive beamforming algorithms based on the modified conjugate gradient and an alternating optimisation strategy are proposed. The proposed low-complexity algorithms can compute the existing WC constrained minimum variance and the proposed WC-CCM designs with a quadratic cost in the number of parameters. Simulations show that the proposed WC-CCM algorithm performs better than existing robust beamforming algorithms. Moreover, the numerical results also show that the performance of the proposed low-complexity algorithms is equivalent to or better than that of existing robust algorithms, whereas the complexity is more than an order of magnitude lower.

Journal ArticleDOI
TL;DR: A new hybrid conjugate gradient method for solving unconstrained optimization problems is proposed, whose search direction not only satisfies the well-known Dai–Liao (D-L) conjugacy condition but also accords with the Newton direction under suitable conditions.

Journal ArticleDOI
TL;DR: This work proposes an efficient GPU-based parallel implementation of the L-BFGS-B method, the limited-memory Broyden–Fletcher–Goldfarb–Shanno method with bound constraints, which is a sophisticated yet efficient optimization method widely used in computer graphics as well as general scientific computation.

Journal ArticleDOI
TL;DR: This work provides the first quantitative analysis of the maximum attainable accuracy of communication-avoiding Krylov subspace methods in finite precision and derives a bound on the deviation of the true and updated residuals in communication- avoiding conjugate gradient.
Abstract: Krylov subspace methods are a popular class of iterative methods for solving linear systems with large, sparse matrices. On modern computer architectures, both sequential and parallel performance of classical Krylov methods is limited by costly data movement, or communication, required to update the approximate solution in each iteration. These motivated communication-avoiding Krylov methods, based on $s$-step formulations, reduce data movement by a factor of $O(s)$ by reordering the computations in classical Krylov methods to exploit locality. Studies on the finite precision behavior of communication-avoiding Krylov methods in the literature have thus far been empirical in nature; in this work, we provide the first quantitative analysis of the maximum attainable accuracy of communication-avoiding Krylov subspace methods in finite precision. Following the analysis for classical Krylov methods, we derive a bound on the deviation of the true and updated residuals in communication-avoiding conjugate gradient...

Journal ArticleDOI
TL;DR: Results show that the GPU-based Chebyshev preconditioner can reach around a 46× speedup for the largest test system and the conjugate gradient method can gain more than a 4× speedup, demonstrating great potential for GPU acceleration in power system applications.

Journal ArticleDOI
Salah Mehanee
TL;DR: In this article, a regularized conjugate gradient method was proposed to fit the observed data by a class of geometrically simple anomalous bodies, including the semi-infinite vertical cylinder, infinitely long horizontal cylinder, and sphere models using the logarithms of the model parameters [log(z) and log(|A|)] rather than the parameters themselves in its iterative minimization scheme.
Abstract: A very fast and efficient approach for gravity data inversion based on the regularized conjugate gradient method has been developed. This approach simultaneously inverts for the depth (z), and the amplitude coefficient (A) of a buried anomalous body from the gravity data measured along a profile. The developed algorithm fits the observed data by a class of some geometrically simple anomalous bodies, including the semi-infinite vertical cylinder, infinitely long horizontal cylinder, and sphere models using the logarithms of the model parameters [log(z) and log(|A|)] rather than the parameters themselves in its iterative minimization scheme. The presented numerical experiments have shown that the original (non-logarithmed) minimization scheme, which uses the parameters themselves (z and |A|) instead of their logarithms, encountered a variety of convergence problems. The aforementioned transformation of the objective functional subjected to minimization into the space of logarithms of z and |A| overcomes these convergence problems. The reliability and the applicability of the developed algorithm have been demonstrated on several synthetic data sets with and without noise. It is then successfully and carefully applied to seven real data examples with bodies buried in different complex geologic settings and at various depths inside the earth. The method is shown to be highly applicable for mineral exploration, and for both shallow and deep earth imaging, and is of particular value in cases where the observed gravity data is due to an isolated body embedded in the subsurface.
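To make the log-parameter idea concrete, the sketch below inverts a single-body gravity profile for p = [log(z), log(A)] with SciPy's nonlinear conjugate gradient minimizer, using a sphere-type forward model g(x) = A z / (x^2 + z^2)^(3/2). The specific forward model, the assumption that A > 0 (so its sign need not be carried separately), and the absence of regularization are simplifications of this sketch, not the author's algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def sphere_forward(x, z, A):
    """Assumed sphere-type forward model: g(x) = A * z / (x^2 + z^2)^(3/2)."""
    return A * z / (x ** 2 + z ** 2) ** 1.5

def invert_log_params(x_obs, g_obs, z0=1.0, A0=1.0):
    """Invert for depth z and amplitude coefficient A of one buried body,
    optimizing over the logarithms p = [log z, log A] (A assumed positive)
    with SciPy's nonlinear conjugate gradient method."""
    def misfit(p):
        z, A = np.exp(p)
        return np.sum((sphere_forward(x_obs, z, A) - g_obs) ** 2)

    res = minimize(misfit, x0=np.log([z0, A0]), method="CG")
    z_est, A_est = np.exp(res.x)
    return z_est, A_est

# synthetic run: profile generated with z = 5 and A = 120, no noise
x_prof = np.linspace(-20.0, 20.0, 81)
g_synth = sphere_forward(x_prof, 5.0, 120.0)
print(invert_log_params(x_prof, g_synth))
```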