Showing papers on "Conjugate gradient method" published in 1990


Journal ArticleDOI
TL;DR: To improve the global convergence properties of these basic algorithms, hybrid methods based on Powell's dogleg strategy are proposed, as well as linesearch backtracking procedures.
Abstract: Several implementations of Newton-like iteration schemes based on Krylov subspace projection methods for solving nonlinear equations are considered. The simplest such class of methods is Newton's algorithm in which a (linear) Krylov method is used to solve the Jacobian system approximately. A method in this class is referred to as a Newton–Krylov algorithm. To improve the global convergence properties of these basic algorithms, hybrid methods based on Powell's dogleg strategy are proposed, as well as linesearch backtracking procedures. The main advantage of the class of methods considered in this paper is that the Jacobian matrix is never needed explicitly.

745 citations
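The abstract above notes that the Jacobian matrix is never needed explicitly. A common way to achieve this inside a Krylov solver is to approximate each Jacobian-vector product by a finite difference of the nonlinear residual. The sketch below illustrates that idea only; the function name `jacobian_vector_product`, the step size `eps`, and the two-equation test system are assumptions made for illustration and are not taken from the paper.

```python
def jacobian_vector_product(F, x, v, eps=1e-7):
    """Approximate J(x) v by a forward difference, without ever forming J.

    F   : callable mapping a list of floats to a list of floats
    x   : current iterate
    v   : direction supplied by the Krylov method
    eps : finite-difference step (an assumed, untuned value)
    """
    x_shift = [xi + eps * vi for xi, vi in zip(x, v)]
    f_base = F(x)
    f_shift = F(x_shift)
    return [(a - b) / eps for a, b in zip(f_shift, f_base)]


def F(x):
    # Hypothetical test system: F(x, y) = (x^2 + y - 3, x + y^2 - 5)
    return [x[0] ** 2 + x[1] - 3.0, x[0] + x[1] ** 2 - 5.0]


x = [1.0, 2.0]
v = [1.0, -1.0]
jv = jacobian_vector_product(F, x, v)
# The analytic Jacobian at (1, 2) is [[2, 1], [1, 4]], so J v = [1, -3];
# jv agrees with this up to the finite-difference error.
```

A Krylov method only needs the action of the Jacobian on vectors, so a routine like this can stand in for the matrix at the cost of one extra residual evaluation per product.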


Journal ArticleDOI
01 Nov 1990
TL;DR: A supervised learning algorithm (Scaled Conjugate Gradient, SCG) with superlinear convergence rate is introduced and it is shown that SCG handles long ravines characterized by sharp curvature effectively.
Abstract: A supervised learning algorithm (Scaled Conjugate Gradient, SCG) with superlinear convergence rate is introduced. The algorithm is based upon a class of optimization techniques well known in numerical analysis as the Conjugate Gradient Methods. SCG uses second order information from the neural network but requires only O(N) memory usage, where N is the number of weights in the network. The performance of SCG is benchmarked against the performance of the standard backpropagation algorithm (BP), the conjugate gradient backpropagation (CGB) and the one-step Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton algorithm (BFGS). SCG yields a speed-up of at least an order of magnitude relative to BP. The speed-up depends on the convergence criterion, i.e., the bigger demand for reduction in error the bigger the speed-up. SCG is fully automated including no user dependent parameters and avoids a time consuming line-search, which CGB and BFGS use in each iteration in order to determine an appropriate step size. Incorporating problem dependent structural information in the architecture of a neural network often lowers the overall complexity. The smaller the complexity of the neural network relative to the problem domain, the bigger the possibility that the weight space contains long ravines characterized by sharp curvature. While BP is inefficient on these ravine phenomena, it is shown that SCG handles them effectively.

638 citations


Journal ArticleDOI
01 May 1990
TL;DR: A conjugate gradient algorithm is proposed for the training of multilayer feedforward neural networks. The performance of the algorithm is superior to that of the conventional backpropagation algorithm and is based on strong theoretical reasons supported by the numerical results of three examples.
Abstract: A novel approach is presented for the training of multilayer feedforward neural networks, using a conjugate gradient algorithm. The algorithm updates the input weights to each neuron in an efficient parallel way, similar to the one used by the well known backpropagation algorithm. The performance of the algorithm is superior to that of the conventional backpropagation algorithm and is based on strong theoretical reasons supported by the numerical results of three examples.

383 citations


Journal ArticleDOI
TL;DR: A recursive way of constructing preconditioning matrices for the stiffness matrix in the discretization of selfadjoint second order elliptic boundary value problems is proposed, based on a sequence of nested finite element spaces with the usual nodal basis functions.
Abstract: A recursive way of constructing preconditioning matrices for the stiffness matrix in the discretization of selfadjoint second order elliptic boundary value problems is proposed. It is based on a sequence of nested finite element spaces with the usual nodal basis functions. Using a node ordering corresponding to the nested meshes, the finite element stiffness matrix is recursively split up into two-level block structures and is factored approximately in such a way that any successive Schur complement is replaced (approximated) by a matrix defined recursively and therefore only implicitly given. To solve a system with this matrix we need to perform a fixed number \(v\) of iterations on the preceding level using as an iteration matrix the preconditioning matrix already defined on that level. It is shown that by a proper choice of iteration parameters it suffices to use \(v > \left(1 - \gamma^2\right)^{-\tfrac{1}{2}}\) iterations for the so constructed \(v\)-fold V-cycle (where \(v = 2\) corresponds to a W-cycle) preconditioning matrices to be spectrally equivalent to the stiffness matrix. The conditions involve only the constant \(\gamma\) in the strengthened C.-B.-S. inequality for the corresponding two-level hierarchical basis function spaces and are therefore independent of the regularity of the solution, for instance. If we use successive uniform refinements of the meshes, the method is of optimal order of computational complexity if \(\gamma^2 < \tfrac{8}{9}\). Under reasonable assumptions on the finite element mesh, the condition numbers turn out to be so small that there are in practice few reasons to use an accelerated iterative method like the conjugate gradient method, for instance.

289 citations
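The two conditions quoted in the abstract above can be related by a short worked step. This is a reader's check, not a derivation from the paper:

```latex
\[
\gamma^2 < \tfrac{8}{9}
\;\Longrightarrow\;
1 - \gamma^2 > \tfrac{1}{9}
\;\Longrightarrow\;
\left(1 - \gamma^2\right)^{-\tfrac{1}{2}} < 3 ,
\]
```

so under that bound on \(\gamma^2\) the choice \(v = 3\) satisfies \(v > \left(1 - \gamma^2\right)^{-\tfrac{1}{2}}\). Presumably, and this is an assumption rather than a statement from the abstract, three inner iterations per level remain cheap enough because uniform refinement of a two-dimensional mesh multiplies the number of unknowns by roughly four from one level to the next.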


Journal ArticleDOI
TL;DR: It is shown that any CG method for $Ax = b$ is characterized by an hpd inner product matrix B and a left preconditioning matrix C and how eigenvalue estimates may be obtained from the iteration parameters, generalizing the well-known connection between CG and Lanczos.
Abstract: The conjugate gradient method of Hestenes and Stiefel is an effective method for solving large, sparse Hermitian positive definite (hpd) systems of linear equations, $Ax = b$. Generalizations to non-hpd matrices have long been sought. The recent theory of Faber and Manteuffel gives necessary and sufficient conditions for the existence of a CG method. This paper uses these conditions to develop and organize such methods. It is shown that any CG method for $Ax = b$ is characterized by an hpd inner product matrix B and a left preconditioning matrix C. At each step the method minimizes the B-norm of the error over a Krylov subspace. This characterization is then used to classify known and new methods. Finally, it is shown how eigenvalue estimates may be obtained from the iteration parameters, generalizing the well-known connection between CG and Lanczos. Such estimates allow implementation of a stopping criterion based more nearly on the true error.

251 citations
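For orientation, the classical Hestenes-Stiefel iteration that the abstract above generalizes can be sketched in a few lines for the real symmetric positive definite case (a special case of hpd). In this classical setting the error is minimized in the A-norm over the Krylov subspace. The function names, the dense list-of-lists matrix format, and the tolerance are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    """Solve A x = b for a real symmetric positive definite A (dense lists)."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # makes the next direction A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
# The exact solution of this 2-by-2 system is x = [1/11, 7/11].
```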


Journal ArticleDOI
TL;DR: New criteria for restarting conjugate gradient algorithms that prove to be computationally very efficient are given, and these criteria provide a descent property and global convergence for any conjugate gradient algorithm using a nonnegative update β.
Abstract: Descent property and global convergence proofs are given for a new hybrid conjugate gradient algorithm. Computational results for this algorithm are also given and compared with those of the Fletcher-Reeves method and the Polak-Ribiere method, showing a considerable improvement over the latter two methods. We also give new criteria for restarting conjugate gradient algorithms that prove to be computationally very efficient. These criteria provide a descent property and global convergence for any conjugate gradient algorithm using a nonnegative update β.

220 citations
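The update parameter β mentioned in the abstract above scales the previous search direction in nonlinear conjugate gradient methods. The sketch below writes out the standard Fletcher-Reeves and Polak-Ribiere formulas, together with one simple way of forcing β to be nonnegative. The clipping rule is an illustrative assumption and is not the hybrid rule or the restart criteria of the paper; all function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def beta_fletcher_reeves(g_new, g_old):
    # beta_FR = ||g_{k+1}||^2 / ||g_k||^2, always nonnegative
    return dot(g_new, g_new) / dot(g_old, g_old)


def beta_polak_ribiere(g_new, g_old):
    # beta_PR = g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2, may be negative
    y = [a - b for a, b in zip(g_new, g_old)]
    return dot(g_new, y) / dot(g_old, g_old)


def beta_nonnegative(g_new, g_old):
    # Illustrative safeguard: clip Polak-Ribiere at zero, which amounts to
    # restarting along the steepest descent direction when beta_PR < 0.
    return max(0.0, beta_polak_ribiere(g_new, g_old))


g_old = [2.0, 0.0]
g_new = [1.0, 0.0]
fr = beta_fletcher_reeves(g_new, g_old)   # 1 / 4 = 0.25
pr = beta_polak_ribiere(g_new, g_old)     # (1 * -1) / 4 = -0.25
clipped = beta_nonnegative(g_new, g_old)  # negative value is clipped to 0.0
```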


Journal ArticleDOI
TL;DR: A 2-D finite-difference code is used to solve the forward- and backward-propagation problems in an iterative search for the model that best explains the seismic waveform data; the method can handle arbitrary 2-D source-receiver configurations and lateral heterogeneities.
Abstract: Interpretation of seismic waveforms can be expressed as an optimization problem based on a non‐linear least‐squares criterion to find the model which best explains the data. An initial model is corrected iteratively using a gradient method (conjugate gradient). At each iteration, computation of the direction of the model perturbation requires the forward propagation of the actual sources and the reverse‐time propagation of the residuals (misfit between the data and the synthetics); the two wave fields thus obtained are then correlated. An extra forward propagation is required to compute the amplitude of the perturbation along the conjugate‐gradient direction. The number of propagations to be simulated numerically in each iteration equals three times the number of shots. Since a 2-D finite‐difference code is employed to solve forward‐ and backward‐propagation problems, the method is general and can handle arbitrary 2-D source‐receiver configurations and lateral heterogeneities. Using conventional velocity ...

219 citations


Book ChapterDOI
11 Aug 1990
TL;DR: It is shown that very large sparse systems can be solved efficiently by using combinations of structured Gaussian elimination and the conjugate gradient, Lanczos, and Wiedemann methods.
Abstract: Many of the fast methods for factoring integers and computing discrete logarithms require the solution of large sparse linear systems of equations over finite fields. This paper presents the results of implementations of several linear algebra algorithms. It shows that very large sparse systems can be solved efficiently by using combinations of structured Gaussian elimination and the conjugate gradient, Lanczos, and Wiedemann methods.

218 citations


Journal ArticleDOI
TL;DR: The numerical implementation of a systematic method for the exact boundary controllability of the wave equation, concentrating on the particular case of Dirichlet controls, is discussed.
Abstract: In this paper we discuss the numerical implementation of a systematic method for the exact boundary controllability of the wave equation, concentrating on the particular case of Dirichlet controls. The numerical methods described here consist in a combination of: finite element approximations for the space discretization; explicit finite difference schemes for the time discretization; a preconditioned conjugate gradient algorithm for the solution of the discrete problems; a pre/post processing technique based on a biharmonic Tychonoff regularization. The efficiency of the computational methodology is illustrated by the results of numerical experiments.

205 citations


Journal ArticleDOI
TL;DR: In this article, the residual vectors can be made mutually orthogonal by means of a two-term recursion relation which leads to the well-known conjugate gradient (CG) method.
Abstract: Discretization of steady-state eddy-current equations may lead to a linear system Ax=b in which the complex matrix A is not Hermitian, but may be chosen symmetric. In the positive definite Hermitian case, an iterative algorithm for solving this system can be defined. The residual vectors can be made mutually orthogonal by means of a two-term recursion relation which leads to the well-known conjugate gradient (CG) method. The proposed method is illustrated by comparing it with other methods for some eddy current examples.

201 citations


Journal ArticleDOI
TL;DR: An alternative to multigrid relaxation that is much easier to implement and more generally applicable is presented and the relationship of this approach to other multiresolution relaxation and representation schemes is discussed.
Abstract: An alternative to multigrid relaxation that is much easier to implement and more generally applicable is presented. Conjugate gradient descent is used in conjunction with a hierarchical (multiresolution) set of basis functions. The resultant algorithm uses a pyramid to smooth the residual vector before the direction is computed. Simulation results showing the speed of convergence and its dependence on the choice of interpolator, the number of smoothing levels, and other factors are presented. The relationship of this approach to other multiresolution relaxation and representation schemes is also discussed.


Journal ArticleDOI
TL;DR: Solving Newton’s linear system using updated matrix factorizations or the (unpreconditioned) conjugate gradient iteration gives the most effective algorithms.
Abstract: Several variants of Newton’s method are used to obtain estimates of solution vectors and residual vectors for the linear model $Ax = b + e = b_{true} $ using an iteratively reweighted least squares criterion, which tends to diminish the influence of outliers compared with the standard least squares criterion. Algorithms appropriate for dense and sparse matrices are presented. Solving Newton’s linear system using updated matrix factorizations or the (unpreconditioned) conjugate gradient iteration gives the most effective algorithms. Four weighting functions are compared, and results are given for sparse well-conditioned and ill-conditioned problems.
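To make the reweighting idea in the abstract above concrete, here is a minimal sketch of the classical iteratively reweighted least squares loop for the simplest possible model, in which A is a single column of ones, so that x is a robust location estimate. It uses the plain fixed-point reweighting rather than the Newton variants studied in the paper, and the Huber weight function, the threshold `delta`, and the data are all assumptions chosen for illustration.

```python
def irls_location(b, delta=1.0, iterations=50):
    """Robust estimate of x in the model 1 * x = b + e using Huber weights."""
    x = sum(b) / len(b)                 # start from the ordinary least squares fit
    for _ in range(iterations):
        weights = []
        for bi in b:
            r = abs(bi - x)
            # Huber weight: full weight for small residuals, down-weighted otherwise
            weights.append(1.0 if r <= delta else delta / r)
        # Weighted least squares solve; for this one-column model it is a weighted mean
        x = sum(w * bi for w, bi in zip(weights, b)) / sum(weights)
    return x


data = [1.0, 1.1, 0.9, 1.0, 10.0]       # the last observation is an outlier
x_ls = sum(data) / len(data)            # ordinary least squares: 2.8
x_irls = irls_location(data)            # converges to about 1.25
```

The outlier pulls the ordinary least squares estimate far from the bulk of the data, while the reweighted estimate stays close to it. For a general matrix A, the weighted mean would be replaced by a weighted linear system, which is where factorization updates or the conjugate gradient iteration come in.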

Journal ArticleDOI
TL;DR: In this article, a sufficient condition for the stability of low-order mixed finite element methods is introduced, and two stabilisation procedures for the popular $Q_1$-$P_0$ mixed method are theoretically analysed.
Abstract: In this paper, a sufficient condition for the stability of low-order mixed finite element methods is introduced. To illustrate the possibilities, two stabilisation procedures for the popular $Q_1$-$P_0$ mixed method are theoretically analysed. The effectiveness of these procedures in practice is assessed by comparing results with those obtained using a conventional penalty formulation, for a standard test problem. It is demonstrated that with appropriate stabilisation, efficient iterative solution techniques of conjugate gradient type can be applied directly to the discrete Stokes system.

Journal ArticleDOI
TL;DR: In this article, a method for shaping electromagnetic field pulses to achieve chemical selectivity is extended and applied to a simple multiple level model system, where both the time-dependent Schrödinger equation and the constant pulse energy are used as constraints on the variational scheme.
Abstract: A previously reported method for shaping electromagnetic field pulses to achieve chemical selectivity is extended and applied to a simple multiple level model system. The pulse shaping approach is based on optimal control theory, where both the time‐dependent Schrödinger equation and the constant pulse energy are used as constraints on the variational scheme. A conjugate gradient direction method is used to direct the convergence of the iterative process used to calculate the optimum pulse shape. The method is applied to a five‐level system interacting with an optical (laser) field. Results demonstrating selectivity and stability are compared to those of other recent related investigations.

Journal ArticleDOI
TL;DR: Three different conjugate gradient type approaches with iterates defined by a minimal residual property, a Galerkin type condition, and a Euclidean error minimization are investigated and numerical experiments for matrices arising from finite difference approximations to the complex Helmholtz equation are reported on.
Abstract: We consider conjugate gradient type methods for the solution of large linear systems $Ax = b$ with complex coefficient matrices of the type $A = T + i\sigma I$, where $T$ is Hermitian and $\sigma$ a real scalar. Three different conjugate gradient type approaches with iterates defined by a minimal residual property, a Galerkin type condition, and a Euclidean error minimization, respectively, are investigated. In particular, we propose numerically stable implementations based on the ideas behind Paige and Saunders' SYMMLQ and MINRES for real symmetric matrices and derive error bounds for all three methods. It is shown how the special shift structure of $A$ can be preserved by using polynomial preconditioning, and results on the optimal choice of the polynomial preconditioner are given. Also, we report on some numerical experiments for matrices arising from finite difference approximations to the complex Helmholtz equation.

Journal ArticleDOI
TL;DR: In this article, the structure of the discrete-dipole approximation is investigated, and the matrix formed by this approximation is identified to be a symmetric, block-Toeplitz matrix.
Abstract: The discrete-dipole approximation is used to study the problem of light scattering by homogeneous rectangular particles. The structure of the discrete-dipole approximation is investigated, and the matrix formed by this approximation is identified to be a symmetric, block-Toeplitz matrix. Special properties of block-Toeplitz arrays are explored, and an efficient algorithm to solve the dipole scattering problem is provided. Timings for conjugate gradient, Linpack, and block-Toeplitz solvers are given; the results indicate the advantages of the block-Toeplitz algorithm. A practical test of the algorithm was performed on a system of 1400 dipoles, which corresponds to direct inversion of an 8400 × 8400 real matrix. A short discussion of the limitations of the discrete-dipole approximation is provided, and some results for cubes and parallelepipeds are given. We briefly consider how the algorithm may be improved further.

Journal ArticleDOI
TL;DR: A modification of Davidson's eigenvalue algorithm, based on the conjugate gradient method, is described, making it practical for very large problems where disk storage is the limiting factor, without the necessity of restarting or discarding some expansion vectors.
Abstract: A modification of Davidson's eigenvalue algorithm, based on the conjugate gradient method, is described. This method needs storage only for a few vectors (five to seven, depending on the implementation), making it practical for very large problems where disk storage is the limiting factor, without the necessity of restarting or discarding some expansion vectors. The convergence characteristics of the modified method are essentially identical with those of the original Davidson method if all expansion vectors are retained in the latter.

Journal ArticleDOI
TL;DR: In this article, the problem of scattering from frequency-selective surfaces (FSSs) has been investigated by expanding the unknown current distribution with three different sets of basis functions, namely the roof top, surface patch, and triangular patch.
Abstract: The problem of scattering from frequency-selective surfaces (FSSs) has been investigated by expanding the unknown current distribution with three different sets of basis functions, namely the roof top, surface patch, and triangular patch. The boundary condition on the total electric field on the FSS due to this current distribution is tested either by a line integral or by the Galerkin procedure. This results in an operator equation that can be solved either by a direct matrix inversion method or by an iterative procedure, namely the conjugate gradient method (CGM). The performance of each of these basis and testing functions is evaluated. It is found that the roof-top and the surface-patch basis functions in conjunction with the Galerkin testing are superior in computational efficiency to other combinations of basis and testing functions that have been studied. Comparison of the CPU times on a Cray X-MP/48 supercomputer in solving the operator equation by the direct matrix inversion method and the CGM is provided. Frequency responses of free-standing, periodic arrays of conducting and resistive plates are also presented.

Journal ArticleDOI
TL;DR: The biconjugate gradient (BCG) method for solving linear systems is shown to be more efficient than the conjugate gradient method for several examples from electromagnetic scattering as discussed by the authors.
Abstract: The biconjugate gradient (BCG) method for solving linear systems is shown to be more efficient than the conjugate gradient (CG) method for several examples from electromagnetic scattering. A remedy for the occasional stagnation of the algorithm is proposed. The potential flaw in the BCG algorithm may be avoided when encountered by restarting the algorithm with a perturbed estimate of the solution.

Journal ArticleDOI
TL;DR: The compressible Navier-Stokes equations are solved in thin-layer form for a variety of two-dimensional inviscid and viscous problems by preconditioned conjugate gradient-like algorithms, which is found to be competitive with the best current schemes, but has wide applications in parallel computing and unstructured mesh computations.
Abstract: The compressible Navier-Stokes equations are solved for a variety of two-dimensional inviscid and viscous problems by preconditioned conjugate gradient-like algorithms. Roe's flux difference splitting technique is used to discretize the inviscid fluxes. The viscous terms are discretized by using central differences. An algebraic turbulence model is also incorporated. The system of linear equations which arises out of the linearization of a fully implicit scheme is solved iteratively by the well known methods of GMRES (Generalized Minimum Residual technique) and Chebyshev iteration. Incomplete LU factorization and block diagonal factorization are used as preconditioners. The resulting algorithm is competitive with the best current schemes, but has wide applications in parallel computing and unstructured mesh computations.

Proceedings ArticleDOI
01 May 1990
TL;DR: A new preconditioner for solving a symmetric Toeplitz system of equations by the conjugate gradient method leads to an algorithm which is particularly suitable for parallel computations and has a better asymptotic convergence rate and a lower arithmetic cost per iteration.
Abstract: We introduce a new preconditioner for solving a symmetric Toeplitz system of equations by the conjugate gradient method. This choice leads to an algorithm which is particularly suitable for parallel computations and, compared to the circulant preconditioner of [C33, has a better asymptotic convergence rate and a lower arithmetic cost per iteration.

Journal ArticleDOI
TL;DR: An elementary theory giving bounds on the condition numbers which do not depend on the number of elements if a sparse system with only few variables per element is solved in each iteration is developed.
Abstract: We study a class of substructuring methods well-suited for iterative solution of large systems of linear equations arising from the p-version finite element method. The p-version offers a natural decomposition with every element treated as a substructure. We use the preconditioned conjugate gradient method with preconditioning constructed by a decomposition of the local function space on each element. We develop an elementary theory giving bounds on the condition numbers which do not depend on the number of elements if a sparse system with only few variables per element is solved in each iteration. This bound can be evaluated considering one element at a time and we compute such condition numbers numerically for various elements.

01 Jan 1990
TL;DR: Three new parallel iterative domain decomposition algorithms are proposed and analyzed for the large, sparse linear systems that arise when the finite element method is applied to elasticity problems, including an additive Schwarz algorithm that works equally well in two or three dimensions.
Abstract: The use of the finite element method for elasticity problems results in extremely large, sparse linear systems. Historically these have been solved using direct solvers like Choleski's method. These linear systems are often ill-conditioned and hence require good preconditioners if they are to be solved iteratively. We propose and analyze three new, parallel iterative domain decomposition algorithms for the solution of these linear systems. The algorithms are also useful for other elliptic partial differential equations. Domain decomposition algorithms are designed to take advantage of a new generation of parallel computers. The domain is decomposed into overlapping or nonoverlapping subdomains. The discrete approximation to a partial differential equation is then obtained iteratively by solving problems associated with each subdomain. The algorithms are often accelerated using the conjugate gradient method. The first new algorithm presented here borrows heavily from multi-level type algorithms. It involves a local change of basis on the interfaces between the substructures to accelerate the convergence. It works well only in two dimensions. The second algorithm is optimal in that the condition number of the iteration operator is bounded independently of the number of subdomains and unknowns. It uses non-overlapping subdomains, but overlapping regions of the interfaces between subdomains. This is an additive Schwarz algorithm, which works equally well in two or three dimensions. The third algorithm is designed for problems in three dimensions. It includes a coarse problem associated with the unknowns on the wirebaskets of the subdomains. The new method offers more potential parallelism than previous algorithms proposed for three dimensional problems since it allows for the simultaneous solution of the coarse problem and the local problems.

Journal ArticleDOI
TL;DR: Existing predictive and nonpredictive parameter-estimation methods are reviewed, the inverse surface irrigation problem is formulated, and conjugate gradient and variable metric techniques are used to search for the parameter set that minimizes the errors between field observations and the linearized zero-inertia model.
Abstract: The basic concepts and procedures for the estimation of roughness and infiltration parameters encountered in surface irrigation are presented. A review of existing predictive and nonpredictive parameter-estimation methods is presented with a formulation of the inverse surface irrigation problem. Conjugate gradient and variable metric techniques are used for the search of the parameter set that minimizes the errors between field observations and the linearized zero-inertia model. Appropriate constraints that restrict the variation of parameters within physically realistic limits are also imposed on the objective function. The performance and radius of convergence of the search algorithm are studied by numerical tests that demonstrate the steps in the development of the associated objective function and the strategy required for convergence to the correct values of the field parameters. It is concluded that a key role is played by the formulation of the direct problem and its numerical solution, and that the nonlinear field-parameter search converges quickly when the influence of independent parameters can be decoupled during construction of the objective function.

Journal ArticleDOI
TL;DR: In this article, the authors consider a linear ill-posed operator equation Ax = y in Hilbert spaces and show that the method of conjugate gradients for solving this equation together with a stopping rule yields a regularization method.
Abstract: We consider a linear ill-posed operator equation $Ax = y$ in Hilbert spaces. An algorithm $R_\varepsilon : Y \to X$ for solving this equation with given inexact right-hand side $y_\varepsilon$, such that [...], is called order optimal if it provides best possible error estimates under the assumption that the minimal norm solution $x^*$ of this operator equation fulfils some smoothness condition. It is shown that if such an algorithm is slightly modified to [...] then it is a regularization method, i.e., we have [...] without additional conditions on $x^*$. We apply this result to show that the method of conjugate gradients for solving linear ill-posed equations together with a stopping rule yields a regularization method.

Journal ArticleDOI
TL;DR: In this article, the electromagnetic characterization of the transmission and scattering properties of an aperture in a thick conducting plane filled with an inhomogeneous composite material for transverse electric polarization is discussed.
Abstract: The electromagnetic characterization of the transmission and scattering properties of an aperture in a thick conducting plane filled with an inhomogeneous composite material for transverse electric polarization is discussed. Of particular interest in this analysis is the introduction of a new technique that combines the finite element and boundary integral methods. To allow the treatment of large apertures, the conjugate gradient method (CGM) and fast Fourier transform (FFT) are also incorporated for the solution of the resulting system. Numerical examples that demonstrate the validity, versatility, and capability of the technique are presented.

Journal ArticleDOI
TL;DR: In this article, the authors show that the norm of the residual, which is usually used to assess the search direction, can be an arbitrarily poor predictor of a good search direction for nonlinear optimization problems.

Journal ArticleDOI
TL;DR: In this paper, the condition number of the Schur complement is shown to be smaller than the condition number obtained by the block-diagonal preconditioning, and the results are applied to the p-version finite element method, where the first block of variables consists of degrees of freedom of a low order.
Abstract: We study symmetric positive definite linear systems, with a 2-by-2 block matrix preconditioned by inverting directly one of the diagonal blocks and suitably preconditioning the other. Using an approximate version of Young's "Property A", we show that the condition number of the Schur complement is smaller than the condition number obtained by the block-diagonal preconditioning. We also get bounds on both condition numbers from a strengthened Cauchy inequality. For systems arising from the finite element method, the bounds do not depend on the number of elements and can be obtained from element-by-element computations. The results are applied to the p-version finite element method, where the first block of variables consists of degrees of freedom of a low order.

Journal ArticleDOI
TL;DR: This iterative method converges for systems with coefficient matrices that are symmetric positive definite or positive real or irreducible L-matrices with a strong diagonal dominance and is very suitable for parallel implementation on a multiprocessor system, such as the CRAY X-MP.
Abstract: In this paper we consider the arithmetic mean method for solving large sparse systems of linear equations. This iterative method converges for systems with coefficient matrices that are symmetric positive definite or positive real or irreducible L-matrices with a strong diagonal dominance. The method is very suitable for parallel implementation on a multiprocessor system, such as the CRAY X-MP. Some numerical experiments on systems resulting from the discretization, by means of the usual 5-point difference formulae, of an elliptic partial differential equation are presented.