
Showing papers in "SIAM Journal on Scientific Computing in 1993"


Journal ArticleDOI
TL;DR: A unifying characterization of various regularization methods is given, showing that the appropriate measure of “size” depends on the particular regularization method chosen, and a new, efficiently implementable method is proposed for choosing the regularization parameter based on the L-curve.
Abstract: Regularization algorithms are often used to produce reasonable solutions to ill-posed problems. The L-curve is a plot—for all valid regularization parameters—of the size of the regularized solution versus the size of the corresponding residual. Two main results are established. First a unifying characterization of various regularization methods is given and it is shown that the measurement of “size” is dependent on the particular regularization method chosen. For example, the 2-norm is appropriate for Tikhonov regularization, but a 1-norm in the coordinate system of the singular value decomposition (SVD) is relevant to truncated SVD regularization. Second, a new method is proposed for choosing the regularization parameter based on the L-curve, and it is shown how this method can be implemented efficiently. The method is compared to generalized cross validation and this new method is shown to be more robust in the presence of correlated errors.
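The SVD/filter-factor view of Tikhonov regularization that underlies the paper's analysis can be sketched in a few lines of NumPy. The test matrix, singular-value decay, and noise level below are illustrative choices of ours, not the paper's:

```python
import numpy as np

# Minimal sketch: Tikhonov filter factors via the SVD, tracing the L-curve.
rng = np.random.default_rng(0)
n = 20
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -8, n)                 # rapidly decaying singular values
A = U @ np.diag(s) @ V.T
b = A @ (V[:, 0] + 0.5 * V[:, 1]) + 1e-6 * rng.standard_normal(n)

beta = U.T @ b
lams = np.logspace(-10, 0, 50)
sol_norms, res_norms = [], []
for lam in lams:
    f = s**2 / (s**2 + lam**2)            # Tikhonov filter factors
    x = V @ (f * beta / s)
    sol_norms.append(np.linalg.norm(x))   # "size" of the regularized solution
    res_norms.append(np.linalg.norm(A @ x - b))
# Plotting log res_norms against log sol_norms traces the characteristic "L";
# the corner balances the two norms.
```

As the regularization parameter grows, the solution norm shrinks while the residual norm grows, which is what produces the two branches of the "L".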

2,841 citations


Journal ArticleDOI
TL;DR: A variant of the GMRES algorithm is presented that allows changes in the preconditioning at every step; one consequence of this flexibility is that any iterative method can be used as a preconditioner.
Abstract: A variant of the GMRES algorithm is presented that allows changes in the preconditioning at every step. There are many possible applications of the new algorithm, some of which are briefly discussed. In particular, a result of the flexibility of the new variant is that any iterative method can be used as a preconditioner. For example, the standard GMRES algorithm itself can be used as a preconditioner, as can CGNR (or CGNE), the conjugate gradient method applied to the normal equations. However, the more appealing utilization of the method is in conjunction with relaxation techniques, possibly multilevel techniques. The possibility of changing preconditioners may be exploited to develop efficient iterative methods and to enhance robustness. A few numerical experiments are reported to illustrate this fact.
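The key structural change over standard GMRES is that the preconditioned vectors are stored so the preconditioner may differ at every step. A minimal dense sketch, with interface and variable names of our own choosing (not the paper's notation):

```python
import numpy as np

def fgmres(A, b, M_apply, m=30):
    """Flexible GMRES sketch: M_apply(v, j) may change with the step j."""
    n = len(b)
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))              # Arnoldi basis
    Z = np.zeros((n, m))                  # preconditioned vectors (stored!)
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    k = m
    for j in range(m):
        Z[:, j] = M_apply(V[:, j], j)     # preconditioner may vary with j
        w = A @ Z[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:           # lucky breakdown: solution found
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
    return Z[:, :k] @ y                   # update uses Z, not V

rng = np.random.default_rng(1)
n = 40
A = 4 * np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
d = np.diag(A)
# The preconditioner alternates between Jacobi and the identity per step.
x = fgmres(A, b, lambda v, j: v / d if j % 2 == 0 else v, m=n)
```

The solution update `Z[:, :k] @ y` (rather than `V @ y`) is what makes step-varying preconditioning legitimate.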

1,348 citations


Journal ArticleDOI
TL;DR: In this paper, a group of algorithms is presented generalizing the fast Fourier transform to the case of noninteger frequencies and nonequispaced nodes on the interval $[ - \pi ,\pi ]$.
Abstract: A group of algorithms is presented generalizing the fast Fourier transform to the case of noninteger frequencies and nonequispaced nodes on the interval $[ - \pi ,\pi ]$. The schemes of this paper are based on a combination of certain analytical considerations with the classical fast Fourier transform and generalize both the forward and backward FFTs. Each of the algorithms requires $O(N\cdot \log N + N\cdot \log (1/\varepsilon ))$ arithmetic operations, where $\varepsilon $ is the precision of computations and N is the number of nodes. The efficiency of the approach is illustrated by several numerical examples.
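For concreteness, these are the sums the algorithms accelerate, written out as a direct $O(N^2)$ evaluation; the paper's schemes reduce the cost to $O(N\log N + N\log(1/\varepsilon))$. Node and coefficient choices below are ours:

```python
import numpy as np

# Evaluate f(x_j) = sum_k c_k exp(i k x_j) at nonequispaced nodes x_j in
# [-pi, pi). This direct dense evaluation is the O(N^2) reference.
rng = np.random.default_rng(2)
N = 64
x = np.sort(rng.uniform(-np.pi, np.pi, N))      # nonequispaced nodes
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)
k = np.arange(-N // 2, N // 2)                  # integer frequencies here
F = np.exp(1j * np.outer(x, k))                 # nonuniform DFT matrix
f_direct = F @ c                                # O(N^2) reference evaluation
```

On equispaced nodes this reduces to an ordinary DFT, which is the degenerate case the classical FFT already handles.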

848 citations


Journal ArticleDOI
TL;DR: The biconjugate gradient method for solving general non-Hermitian linear systems and its transpose-free variant, the conjugate gradients squared algorithm (CGS), both typically exhib...
Abstract: The biconjugate gradient method (BCG) for solving general non-Hermitian linear systems $Ax = b$ and its transpose-free variant, the conjugate gradients squared algorithm (CGS), both typically exhib...

643 citations


Journal ArticleDOI
TL;DR: In this article, a class of vector-space bases is introduced for sparse representation of discretizations of integral operators, where an operator with a smooth, nonoscillatory kernel possessing a finite number of singularities in each row or column is represented in these bases as a sparse matrix, to high precision.
Abstract: A class of vector-space bases is introduced for the sparse representation of discretizations of integral operators. An operator with a smooth, nonoscillatory kernel possessing a finite number of singularities in each row or column is represented in these bases as a sparse matrix, to high precision. A method is presented that employs these bases for the numerical solution of second-kind integral equations in time bounded by $O(n\log ^2 n)$, where n is the number of points in the discretization. Numerical results are given which demonstrate the effectiveness of the approach, and several generalizations and applications of the method are discussed.

378 citations


Journal ArticleDOI
TL;DR: The Riemann problem for two-dimensional gas dynamics with isentropic or polytropic gas is considered and the required relations for the initial data and the symmetry properties of the solutions are given.
Abstract: The Riemann problem for two-dimensional gas dynamics with isentropic or polytropic gas is considered. The initial data is constant in each quadrant and chosen so that only a rarefaction wave, shock wave, or slip line connects two neighboring constant initial states. With this restriction sixteen (respectively, fifteen) genuinely different wave combinations for isentropic (respectively, polytropic) gas exist. For each configuration the numerical solution is analyzed and illustrated by contour plots. Additionally, the required relations for the initial data and the symmetry properties of the solutions are given. The chosen calculations correspond closely to the cases studied by T. Zhang and Y. Zheng [SIAM J. Math. Anal., 21 (1990), pp. 593–630], so that the analytical theory can be directly compared to our numerical study.

355 citations


Journal ArticleDOI
TL;DR: A new index reduction algorithm for DAEs is developed that yields results with an accuracy comparable to that obtained for the corresponding state-space ODE.
Abstract: A new index reduction algorithm for DAEs is developed. In the usual manner, parts of the DAE are differentiated analytically and appended to the original system. For each additional equation, a derivative is selected to be replaced by a new algebraic variable called a dummy derivative. The resulting augmented system is at most index 1, but no longer overdetermined. The dummy derivatives are not subject to discretization; their purpose is to annihilate part of the dynamics in the DAE, leaving only what corresponds to the dynamics of a state-space form. No constraint stabilization is necessary in the subsequent numerical treatment. Numerical tests indicate that the method yields results with an accuracy comparable to that obtained for the corresponding state-space ODE.

330 citations


Journal ArticleDOI
TL;DR: An implementation of a look-ahead version of the Lanczos algorithm is presented that overcomes breakdowns and near-breakdowns in the standard nonsymmetric Lanczos process.
Abstract: The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. An implementation is presented of a look-ahead version of the Lanczos algorithm that, except for the very special situation of an incurable breakdown, overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and requires the same number of matrix-vector products and inner products as the standard Lanczos process without look-ahead.

323 citations


Journal ArticleDOI
TL;DR: For real nonsymmetric matrices the author presents a method, BICGSTAB2, in which the second polynomial factor may have complex conjugate zeros; versions suitable for complex matrices are given for both methods.
Abstract: Recently Van der Vorst [SIAM J. Sci. Statist. Comput., 13 (1992), pp. 631–644] proposed for solving nonsymmetric linear systems $Az = b$ a biconjugate gradient (BICG)-based Krylov space method called BICGSTAB that, like the biconjugate gradient squared (BICGS) method of Sonneveld, does not require matrix–vector multiplications with the transposed matrix $A^T $, and that has typically a much smoother convergence behavior than BICG and BICGS. Its nth residual polynomial is the product of that of BICG (i.e., the nth Lanczos polynomial) with a polynomial of the same degree with real zeros. Therefore, nonreal eigenvalues of A are not approximated well by the second polynomial factor. Here, the author presents for real nonsymmetric matrices a method BICGSTAB2 in which the second factor may have complex conjugate zeros. Moreover, versions suitable for complex matrices are given for both methods.

296 citations


Journal ArticleDOI
TL;DR: Five summation methods and their variations are analyzed here and no one method is uniformly more accurate than the others, but some guidelines are given on the choice of method in particular cases.
Abstract: The usual recursive summation technique is just one of several ways of computing the sum of n floating point numbers. Five summation methods and their variations are analyzed here. The accuracy of the methods is compared using rounding error analysis and numerical experiments. Four of the methods are shown to be special cases of a general class of methods, and an error analysis is given for this class. No one method is uniformly more accurate than the others, but some guidelines are given on the choice of method in particular cases.
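Two of the methods analyzed side by side: plain recursive summation versus Kahan's compensated summation. The data below are a synthetic stress case of ours, not one of the paper's experiments:

```python
import numpy as np

def recursive_sum(xs):
    # The usual left-to-right recursive summation.
    s = 0.0
    for v in xs:
        s += v
    return s

def kahan_sum(xs):
    # Compensated summation: c carries the low-order bits lost at each step.
    s, c = 0.0, 0.0
    for v in xs:
        y = v - c
        t = s + y
        c = (t - s) - y              # bits lost when forming t
        s = t
    return s

xs = np.full(10**5, 0.1)             # 0.1 is not exactly representable
err_rec = abs(recursive_sum(xs) - 10**4)
err_kahan = abs(kahan_sum(xs) - 10**4)
# Compensated summation keeps the error at a few ulps, independent of n;
# recursive summation accumulates an error that grows with n.
```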

269 citations


Journal ArticleDOI
TL;DR: Experimental results obtained on an Intel iPSC/860 are presented, which demonstrate that, for graphs arising from finite element applications, the heuristic exhibits scalable performance and generates colorings usually within three or four colors of the best-known linear time sequential heuristics.
Abstract: The problem of computing good graph colorings arises in many diverse applications, such as in the estimation of sparse Jacobians and in the development of efficient, parallel iterative methods for solving sparse linear systems. This paper presents an asynchronous graph coloring heuristic well suited to distributed memory parallel computers. Experimental results obtained on an Intel iPSC/860 are presented, which demonstrate that, for graphs arising from finite element applications, the heuristic exhibits scalable performance and generates colorings usually within three or four colors of the best-known linear time sequential heuristics. For bounded degree graphs, it is shown that the expected running time of the heuristic under the P-RAM computation model is bounded by $EO(\log (n)/\log \log (n))$. This bound is an improvement over the previously known best upper bound for the expected running time of a random heuristic for the graph coloring problem.
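A sequential greedy colorer of the kind the parallel heuristic is measured against: each vertex receives the smallest color not used by an already-colored neighbor. The toy graph and function name below are ours:

```python
def greedy_color(adj):
    # adj maps each vertex to its neighbor list (symmetric adjacency).
    color = {}
    for v in adj:                     # visit vertices in dictionary order
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:              # smallest color absent among neighbors
            c += 1
        color[v] = c
    return color

# 4-cycle with one chord (chromatic number 3)
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
coloring = greedy_color(adj)
```

Greedy coloring never needs more than (max degree + 1) colors, which is the baseline the distributed heuristic's color counts are compared to.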

Journal ArticleDOI
TL;DR: Two sparse Cholesky factorization algorithms are examined in a systematic and consistent fashion, both to illustrate the strengths of the blocking techniques in general and to obtain a fair evaluation of the two approaches.
Abstract: As with many other linear algebra algorithms, devising a portable implementation of sparse Cholesky factorization that performs well on the broad range of computer architectures currently available is a formidable challenge. Even after limiting the attention to machines with only one processor, as has been done in this paper, there are still several interesting issues to consider. For dense matrices, it is well known that block factorization algorithms are the best means of achieving this goal. This approach is taken for sparse factorization as well. This paper has two primary goals. First, two sparse Cholesky factorization algorithms, the multifrontal method and a blocked left-looking sparse Cholesky method, are examined in a systematic and consistent fashion, both to illustrate the strengths of the blocking techniques in general and to obtain a fair evaluation of the two approaches. Second, the impact of various implementation techniques on time and storage efficiency is assessed, paying particularly clo...

Journal ArticleDOI
TL;DR: The authors develop and test variable step symplectic Runge–Kutta–Nystrom algorithms for the integration of Hamiltonian systems of ordinary differential equations and suggest that, for symplectic formulae, moving from constant to variable stepsizes results in a marked decrease in efficiency.
Abstract: The authors develop and test variable step symplectic Runge–Kutta–Nystrom algorithms for the integration of Hamiltonian systems of ordinary differential equations. Numerical experiments suggest that, for symplectic formulae, moving from constant to variable stepsizes results in a marked decrease in efficiency. On the other hand, symplectic formulae with constant stepsizes may outperform available standard (nonsymplectic) variable-step codes. For the model situation consisting in the long-time integration of the two-body problem, our experimental findings are backed by theoretical analysis.
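The property that constant-stepsize symplectic integration preserves, and that naive variable stepsizes tend to destroy, is bounded long-time energy error. A sketch of ours using fixed-step Stormer–Verlet (a basic symplectic method, not the paper's RKN formulae) on the harmonic oscillator $H = (p^2 + q^2)/2$:

```python
def verlet(q, p, h, steps, force):
    # Fixed-stepsize Stormer-Verlet (leapfrog): kick - drift - kick.
    for _ in range(steps):
        p += 0.5 * h * force(q)
        q += h * p
        p += 0.5 * h * force(q)
    return q, p

force = lambda q: -q                      # harmonic oscillator
q0, p0 = 1.0, 0.0
E0 = 0.5 * (p0**2 + q0**2)
q1, p1 = verlet(q0, p0, 0.1, 100_000, force)
drift = abs(0.5 * (p1**2 + q1**2) - E0)   # stays O(h^2), not growing with t
```

Over 100,000 steps the energy error oscillates at the $O(h^2)$ level instead of drifting secularly, which is the qualitative behavior a nonsymplectic integrator of the same order would lose.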

Journal ArticleDOI
TL;DR: In this paper, scaling factors are given for functions of Gaussian type with finite support, and the numerical results show that only a reasonable number of Hermite functions is required to solve differential equations.
Abstract: Although Hermite functions were widely used for many practical problems, numerical experiments with the standard (normalized) Hermite functions $\psi _n (v)$ worked poorly in the sense that too many Hermite functions are required to solve differential equations. In order to obtain accurate numerical solutions, it is necessary to choose a scaling factor $\alpha $ and use $\psi _n (\alpha v)$ as the basis functions. In this paper the scaling factors are given for functions that are of Gaussian type, which have finite supports $[ - M,M]$. The scaling factor used is $\max _{0 \leqq j \leqq N} \{ \gamma _j \} / M$, where $\{ \gamma _j \} _{j = 0}^N $ are the roots of $\psi _{N + 1} (v)$ and $N + 1$ is the number of the truncated terms used. The numerical results show that after using this scaling factor, only reasonable numbers of the Hermite functions are required to solve differential equations.
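The scaling-factor recipe fits in a few lines: the roots of $\psi_{N+1}$ coincide with the Gauss–Hermite nodes of $H_{N+1}$, so NumPy's `hermgauss` supplies the $\gamma_j$. The values of N and M below are illustrative choices of ours:

```python
import numpy as np

N = 32
M = 8.0                                              # support [-M, M]
nodes, _ = np.polynomial.hermite.hermgauss(N + 1)    # roots of H_{N+1}
alpha = np.max(np.abs(nodes)) / M                    # max_j |gamma_j| / M
# The basis is then psi_n(alpha * v) rather than psi_n(v), so that the
# outermost collocation node lands at the edge of the support.
```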

Journal ArticleDOI
TL;DR: The analysis and experimental results lead to the conclusion that many network training problems are ill conditioned and may not be solved more efficiently by higher-order optimization methods.
Abstract: The training problem for feedforward neural networks is nonlinear parameter estimation that can be solved by a variety of optimization techniques. Much of the literature on neural networks has focused on variants of gradient descent. The training of neural networks using such techniques is known to be a slow process with more sophisticated techniques not always performing significantly better. This paper shows that feedforward neural networks can have ill-conditioned Hessians and that this ill-conditioning can be quite common. The analysis and experimental results in this paper lead to the conclusion that many network training problems are ill conditioned and may not be solved more efficiently by higher-order optimization methods. While the analyses used in this paper are for completely connected layered networks, they extend to networks with sparse connectivity as well. The results suggest that neural networks can have considerable redundancy in parameterizing the function space in a neighborhood of a lo...

Journal ArticleDOI
TL;DR: It is shown here that a necessary condition for achieving this goal is that the truncated system inherit the symmetry properties of the original infinite-dimensional system, leading to efficient finite truncations.
Abstract: The proper orthogonal decomposition (POD) (also called Karhunen–Loeve expansion) has been recently used in turbulence to derive optimally fast converging bases of spatial functions, leading to efficient finite truncations. Whether a finite number of these modes can be used in numerical simulations to derive an “accurate” finite set of ordinary differential equations, over a certain range of bifurcation parameter values, still remains an open question. It is shown here that a necessary condition for achieving this goal is that the truncated system inherit the symmetry properties of the original infinite-dimensional system. In most cases, this leads to a systematic involvement of the symmetry group in deriving a new expansion basis called the symmetric POD basis. The Kuramoto–Sivashinsky equation with periodic boundary conditions is used as a paradigm to illustrate this point of view. However, the conclusion is general and can be applied to other equations, such as the Navier–Stokes equations, the complex G...

Journal ArticleDOI
TL;DR: A new and detailed analysis of the basic Uzawa algorithm for decoupling of the pressure and the velocity in the steady and unsteady Stokes operator is presented, focusing on explicit construction of the Uzawa pressure-operator spectrum for a semiperiodic model problem.
Abstract: A new and detailed analysis of the basic Uzawa algorithm for decoupling of the pressure and the velocity in the steady and unsteady Stokes operator is presented. The paper focuses on the following new aspects: explicit construction of the Uzawa pressure-operator spectrum for a semiperiodic model problem; general relationship of the convergence rate of the Uzawa procedure to classical inf-sup discretization analysis; and application of the method to high-order variational discretization.

Journal ArticleDOI
TL;DR: A method is presented for computing the exact value of the LMS estimate in multiple linear regression, based on the fact that each LQS estimate is the Chebyshev (or minimax) fit to some q-element subset of the data.
Abstract: The difficulty in computing the least median of squares (LMS) estimate in multiple linear regression is due to the nondifferentiability and many local minima of the objective function. Several approximate, but not exact, algorithms have been suggested. This paper presents a method for computing the exact value of the LMS estimate in multiple linear regression. The LMS estimate is a special case of the least quantile of squares (LQS) estimate, which minimizes the qth smallest squared residual for a given data set. For LMS, $q = [n/2] + [(p + 1)/2]$ where $[ \, ]$ is the greatest integer function, n is the sample size, and p is the number of columns in the X matrix. The algorithm can compute a range of exact LQS estimates in multiple linear regression by considering $\binom{n}{p+1}$ possible $\theta $ values. It is based on the fact that each LQS estimate is the Chebyshev (or minimax) fit to some q element subset of the data. This yields a surprisingly ea...
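In the intercept-only special case ($p = 1$) the exact LMS estimate can be computed directly: it is the midpoint of the shortest half of the sorted data, i.e., the Chebyshev (minimax) fit to that q-point subset. The data and function name below are illustrative choices of ours:

```python
import numpy as np

def lms_location(x):
    # Exact LMS for a location parameter: midpoint of the shortest q-window.
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    q = n // 2 + 1                        # q = [n/2] + [(p+1)/2] with p = 1
    widths = x[q - 1:] - x[:n - q + 1]    # width of each contiguous q-window
    i = int(np.argmin(widths))
    return 0.5 * (x[i] + x[i + q - 1])    # minimax fit to the shortest window

data = [0.9, 1.0, 1.1, 1.2, 9.0, 10.0]    # four inliers, two gross outliers
theta = lms_location(data)                # robust: the outliers are ignored
```

Here q = 4, the shortest 4-point window is [0.9, 1.2], and the estimate is its midpoint, untouched by the two outliers.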

Journal ArticleDOI
TL;DR: Techniques are developed for accelerating multigrid convergence in general, and for advection-diffusion and incompressible flow problems with small viscosity in particular, showing very significant improvement in convergence rates at little cost.
Abstract: Techniques are developed for accelerating multigrid convergence in general, and for advection-diffusion and incompressible flow problems with small viscosity in particular. It is shown by analysis that the slowing down of convergence is due mainly to poor coarse-grid correction to certain error components, and means for dealing with this problem are suggested, analyzed, and tested by numerical experiments, showing very significant improvement in convergence rates at little cost.

Journal ArticleDOI
TL;DR: A new numerical method for computing the GSVD of two matrices A and B is presented, a variation on Paige's method, which differs from previous algorithms in guaranteeing both backward stability and convergence.
Abstract: We present a new numerical method for computing the GSVD [36, 27] of two matrices A and B. This method is a variation on Paige's method [30]. It differs from previous algorithms in guaranteeing both backward stability and convergence. There are two innovations. The first is a new preprocessing step which reduces A and B to upper triangular forms satisfying certain rank conditions. The second is a new 2 by 2 triangular GSVD algorithm, which constitutes the inner loop of Paige's method. We present proofs of stability and convergence of our method, and demonstrate examples on which all previous algorithms fail.

Journal ArticleDOI
TL;DR: It is shown how the method of scale-by-scale multiresolution yields robust and fast convergence and gives a natural regularization approach which is complementary to Tikhonov's regularization.
Abstract: A multiresolution method for distributed parameter estimation (or inverse problems) is studied numerically. The identification of the coefficient of an elliptic equation in one dimension is considered as our model problem. First, multiscale bases are used to analyze the degree of ill-posedness of the inverse problem. Second, based on some numerical results, it is shown that the method of scale-by-scale multiresolution yields robust and fast convergence. Finally, it is shown how the method gives a natural regularization approach which is complementary to Tikhonov’s regularization.

Journal ArticleDOI
TL;DR: A task-to-processor mapping algorithm is described for computing the parallel multifrontal Cholesky factorization of irregular sparse problems on distributed-memory multiprocessors that is nearly as efficient on a collection of problems with irregular sparsity structure as it is for the regular grid problems.
Abstract: A task-to-processor mapping algorithm is described for computing the parallel multifrontal Cholesky factorization of irregular sparse problems on distributed-memory multiprocessors. The performance of the mapping algorithm is compared with the only general mapping algorithm previously reported. Using this mapping, the distributed multifrontal algorithm is nearly as efficient on a collection of problems with irregular sparsity structure as it is for the regular grid problems.

Journal ArticleDOI
TL;DR: A higher-order Godunov method is presented for hyperbolic systems of conservation laws with stiff, relaxing source terms, and it is shown that this method produces higher-order accurate results.
Abstract: A higher-order Godunov method is presented for hyperbolic systems of conservation laws with stiff, relaxing source terms. The goal is to develop a Godunov method that produces higher-order accurate...

Journal ArticleDOI
TL;DR: A significant collection of two-point boundary value problems is shown to give rise to linear systems of algebraic equations on which Gaussian elimination with row partial pivoting is unstable when standard solution techniques are used.
Abstract: A significant collection of two-point boundary value problems is shown to give rise to linear systems of algebraic equations on which Gaussian elimination with row partial pivoting is unstable when standard solution techniques are used.

Journal ArticleDOI
TL;DR: Using degree theory, a closed form for the cardinality of solutions to the multivariate eigenvalue problem is first proved; a convergence property of Horst's method, formulated as a generalization of the so-called power method, is then established.
Abstract: Multivariate eigenvalue problems for symmetric and positive definite matrices arise from multivariate statistics theory where coefficients are to be determined so that the resulting linear combinations of sets of random variables are maximally correlated. By using the method of Lagrange multipliers such an optimization problem can be reduced to the multivariate eigenvalue problem. For over 30 years an iterative method proposed by Horst [Psychometrika, 26 (1961), pp. 129–149] has been used for solving the multivariate eigenvalue problem. Yet the theory of convergence has never been complete. The number of solutions to the multivariate eigenvalue problem also remains unknown. This paper contains two new results. By using the degree theory, a closed form on the cardinality of solutions for the multivariate eigenvalue problem is first proved. A convergence property of Horst's method by forming it as a generalization of the so-called power method is then proved. The discussion leads to new formulations of nume...
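Horst's iteration for the multivariate eigenvalue problem $Ax = \Lambda x$, $\Lambda = \mathrm{diag}(\lambda_i I_{n_i})$, $\|x_i\| = 1$ on each block, is exactly a blockwise power method: multiply by A, then renormalize every block separately. The SPD test matrix and block sizes below are small synthetic choices of ours:

```python
import numpy as np

rng = np.random.default_rng(3)
blocks = [(0, 3), (3, 5)]                 # two blocks, sizes 3 and 2
n = 5
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)               # symmetric positive definite

x = np.ones(n)
for lo, hi in blocks:                     # blockwise unit-norm start
    x[lo:hi] /= np.linalg.norm(x[lo:hi])
for _ in range(1000):                     # Horst's method
    y = A @ x
    for lo, hi in blocks:
        x[lo:hi] = y[lo:hi] / np.linalg.norm(y[lo:hi])
# At a fixed point, (A x)_i = lambda_i x_i with lambda_i = ||(A x)_i||.
lams = [np.linalg.norm((A @ x)[lo:hi]) for lo, hi in blocks]
```

With a single block this reduces to the ordinary power method, which is the observation the paper's convergence analysis generalizes.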

Journal ArticleDOI
TL;DR: It is proved that with high probability the algorithms produce well-balanced storage for sufficiently large matrices with bounded number of nonzeros in each row and column, but no other restrictions on structure.
Abstract: This paper investigates the balancing of distributed compressed storage of large sparse matrices on a massively parallel computer. For fast computation of matrix–vector and matrix–matrix products on a rectangular processor array with efficient communications along its rows and columns it is required that the nonzero elements of each matrix row or column be distributed among the processors located within the same array row or column, respectively. Randomized packing algorithms are constructed with such properties, and it is proved that with high probability the algorithms produce well-balanced storage for sufficiently large matrices with bounded number of nonzeros in each row and column, but no other restrictions on structure. Then basic matrix–vector multiplication routines are described with fully parallel interprocessor communications and intraprocessor gather and scatter operations. Their efficiency is demonstrated on the 16,384-processor MasPar computer.

Journal ArticleDOI
TL;DR: A method for computing a few eigenpairs of sparse symmetric matrices is presented and analyzed that combines the power of preconditioning techniques with the efficiency of the Lanczos algorithm.
Abstract: A method for computing a few eigenpairs of sparse symmetric matrices is presented and analyzed that combines the power of preconditioning techniques with the efficiency of the Lanczos algorithm. The method is related to Davidson’s method and its generalizations, but can be less expensive for matrices that are fairly sparse. A double iteration is used. An effective termination criterion is given for the inner iteration. Quadratic convergence with respect to the outer loop is shown.

Journal ArticleDOI
TL;DR: A straightforward parallelization of the left-looking supernodal sparse Cholesky factorization algorithm for shared-memory MIMD multiprocessors, which improves performance by reducing indirect addressing and memory traffic.
Abstract: This paper presents a parallel sparse Cholesky factorization algorithm for shared-memory MIMD multiprocessors. The algorithm is particularly well suited for vector supercomputers with multiple processors, such as the Cray Y-MP. The new algorithm is a straightforward parallelization of the left-looking supernodal sparse Cholesky factorization algorithm. Like its sequential predecessor, it improves performance by reducing indirect addressing and memory traffic. Experimental results on a Cray Y-MP demonstrate the effectiveness of the new algorithm. On eight processors of a Cray Y-MP, the new routine performs the factorization at rates exceeding one Gflop for several test problems from the Harwell–Boeing sparse matrix collection.

Journal ArticleDOI
TL;DR: The parallel solution of a sparse system Lx = b with triangular matrix L, which is often a performance bottleneck in parallel computation, is considered, and the inverse of L is represented as a product of a few sparse factors.
Abstract: This paper considers the parallel solution of a sparse system $Lx = b$ with triangular matrix L, which is often a performance bottleneck in parallel computation. When many systems with the same matrix are to be solved, the parallel efficiency can be improved by representing the inverse of L as a product of a few sparse factors. The factorization with the smallest number of factors is constructed, subject to the requirement that no new nonzero elements are created. Applications are to iterative solvers with triangular preconditioners, to structural analysis, or to power systems applications. Experimental results on the Connection Machine show the method to be highly valuable.

Journal ArticleDOI
TL;DR: In this article, the authors construct and test symplectic, explicit Runge-Kutta-Nystrom (RKN) methods of order 8, which preserve the symplectic structure in phase space, thus reproducing the main qualitative property of solutions of Hamiltonian systems.
Abstract: A numerical method for ordinary differential equations is called symplectic if, when applied to Hamiltonian problems, it preserves the symplectic structure in phase space, thus reproducing the main qualitative property of solutions of Hamiltonian systems. The authors construct and test symplectic, explicit Runge–Kutta–Nystrom (RKN) methods of order 8. The outcome of the investigation is that existing high-order, symplectic RKN formulae require so many evaluations per step that they are much less efficient than conventional eighth-order nonsymplectic, variable-step-size integrators even for low accuracy. However, symplectic integration is of use in the study of qualitative features of the systems being integrated.