
Showing papers on "Matrix multiplication published in 1997"


Journal ArticleDOI
TL;DR: Although the derivation of the algorithm is presented in terms of matrices, no matrix-matrix multiplications are needed and only the nonzero matrix elements have to be stored, making the method useful for very large molecules.
Abstract: In this article, we present a new LINear Constraint Solver (LINCS) for molecular simulations with bond constraints. The algorithm is inherently stable, as the constraints themselves are reset instead of derivatives of the constraints, thereby eliminating drift. Although the derivation of the algorithm is presented in terms of matrices, no matrix-matrix multiplications are needed and only the nonzero matrix elements have to be stored, making the method useful for very large molecules. At the same accuracy, the LINCS algorithm is three to four times faster than the SHAKE algorithm. Parallelization of the algorithm is straightforward. © 1997 John Wiley & Sons, Inc.
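
The point that only sparse matrix-vector products are needed can be illustrated with a generic truncated power-series solve; the sketch below is a hypothetical, simplified illustration of that idea, not the LINCS update itself.

```python
import numpy as np
from scipy.sparse import csr_matrix

def truncated_series_solve(A, b, order=4):
    """Approximate (I - A)^{-1} b by the truncated series
    b + A b + A^2 b + ... + A^order b, using only sparse matrix-vector
    products (no matrix-matrix multiplication, no dense storage).
    Valid when the spectral radius of A is well below 1."""
    x = b.copy()
    term = b.copy()
    for _ in range(order):
        term = A @ term          # sparse matrix-vector product only
        x += term
    return x

# Usage sketch with a small sparse coupling matrix (hypothetical data)
A = csr_matrix(np.array([[0.0, 0.2, 0.0],
                         [0.2, 0.0, 0.1],
                         [0.0, 0.1, 0.0]]))
b = np.array([1.0, 2.0, 3.0])
x = truncated_series_solve(A, b)
```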

12,699 citations


Book
01 Mar 1997
TL;DR: In this article, an integrated treatment of the theory of nonnegative matrices and some related classes of positive matrices, concentrating on connections with game theory, combinatorics, inequalities, optimisation and mathematical economics is presented.
Abstract: This book provides an integrated treatment of the theory of nonnegative matrices (matrices with only positive numbers or zero as entries) and some related classes of positive matrices, concentrating on connections with game theory, combinatorics, inequalities, optimisation and mathematical economics. The wide variety of applications, which include price fixing, scheduling and the fair division problem, have been carefully chosen both for their elegant mathematical content and for their accessibility to students with minimal preparation. Many results in matrix theory are also presented. The treatment is rigorous and almost all results are proved completely. These results and applications will be of great interest to researchers in linear programming, statistics and operations research. The minimal prerequisites also make the book accessible to first-year graduate students.

555 citations


Journal ArticleDOI
01 Apr 1997
TL;DR: It is demonstrated that binary decision diagrams are an efficient representation for every special-case matrix in common use, notably sparse matrices, and that complete pivoting is no more difficult over these matrices than partial pivoting.
Abstract: In this paper, we discuss the use of binary decision diagrams to represent general matrices. We demonstrate that binary decision diagrams are an efficient representation for every special-case matrix in common use, notably sparse matrices. In particular, we demonstrate that for any matrix, the BDD representation can be no larger than the corresponding sparse-matrix representation. Further, the BDD representation is often smaller than any other conventional special-case representation: for the n×n Walsh matrix, for example, the BDD representation is of size O(log n). No other special-case representation in common use represents this matrix in space less than O(n²). We describe termwise, row, column, block, and diagonal selection over these matrices, standard and Strassen matrix multiplication, and LU factorization. We demonstrate that the complexity of each of these operations over the BDD representation is no greater than that over any standard representation. Further, we demonstrate that complete pivoting is no more difficult over these matrices than partial pivoting. Finally, we consider an example, the Walsh Spectrum of a Boolean function.

432 citations


Journal ArticleDOI
01 Apr 1997
TL;DR: A treatment founded in Boolean algebras is presented and algorithms and results in several areas of application are discussed: Matrix multiplication, shortest path algorithms, and direct methods for numerical linear algebra.
Abstract: In this paper we present theory and experimental results on Algebraic Decision Diagrams. These diagrams extend BDDs by allowing values from an arbitrary finite domain to be associated with the terminal nodes of the diagram. We present a treatment founded in Boolean algebras and discuss algorithms and results in several areas of application: Matrix multiplication, shortest path algorithms, and direct methods for numerical linear algebra. Although we report an essentially negative result for Gaussian elimination per se, we propose a modified form of ADDs which appears to circumvent the difficulties in some cases. We discuss the relevance of our findings and point to directions for future work.

274 citations


Proceedings ArticleDOI
18 Dec 1997
TL;DR: The data locality characteristics of the compressed sparse row representation are examined, improvements in locality through matrix permutation are considered, and modified sparse matrix representations are evaluated.
Abstract: We analyze single node performance of sparse matrix vector multiplication by investigating issues of data locality and fine grained parallelism. We examine the data locality characteristics of the compressed sparse row representation and consider improvements in locality through matrix permutation. Motivated by potential improvements in fine grained parallelism, we evaluate modified sparse matrix representations. The results lead to general conclusions about improving single node performance of sparse matrix vector multiplication in parallel libraries of sparse iterative solvers.
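
For reference, the compressed sparse row (CSR) kernel analyzed here has the following standard form; this is a textbook sketch with illustrative array names, not the paper's tuned implementation.

```python
import numpy as np

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x for A stored in compressed sparse row form:
    values[k]  -- k-th nonzero, scanned row by row
    col_idx[k] -- its column index
    row_ptr[i] -- start of row i in values/col_idx (length n_rows + 1)."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]   # indirect access to x: the locality issue studied
        y[i] = acc
    return y
```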

168 citations


Journal ArticleDOI
TL;DR: An inverse-free, highly parallel, spectral divide and conquer algorithm that can compute either an invariant subspace of a nonsymmetric matrix, or a pair of left and right deflating subspaces of a regular matrix pencil.
Abstract: We discuss an inverse-free, highly parallel, spectral divide and conquer algorithm. It can compute either an invariant subspace of a nonsymmetric matrix \(A\), or a pair of left and right deflating subspaces of a regular matrix pencil \(A - \lambda B\). This algorithm is based on earlier ones of Bulgakov, Godunov and Malyshev, but improves on them in several ways. This algorithm only uses easily parallelizable linear algebra building blocks: matrix multiplication and QR decomposition, but not matrix inversion. Similar parallel algorithms for the nonsymmetric eigenproblem use the matrix sign function, which requires matrix inversion and is faster but can be less stable than the new algorithm.
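
From general knowledge, the inverse-free repeated-squaring iteration underlying this family of methods (due to Malyshev and refined by the authors) can be sketched as follows; the exact partitioning, scaling, and stopping rules used in the paper are not reproduced here, so treat this as an assumption-laden outline only.

```python
import numpy as np

def inverse_free_iteration(A, B, steps=20):
    """One common form of the inverse-free iteration for the pencil
    A - lambda*B: each step implicitly squares inv(B) @ A while using
    only a QR factorization and matrix multiplications (no inversion)."""
    n = A.shape[0]
    for _ in range(steps):
        Q, _ = np.linalg.qr(np.vstack([B, -A]), mode='complete')  # full (2n x 2n) Q
        Q12 = Q[:n, n:]          # top-right n x n block
        Q22 = Q[n:, n:]          # bottom-right n x n block
        A, B = Q12.conj().T @ A, Q22.conj().T @ B
    return A, B
```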

145 citations


Proceedings ArticleDOI
21 Jun 1997
TL;DR: An elementary, machine-independent, recursive algorithm for matrix multiplication C+=A*B provides implicit blocking at every level of the memory hierarchy and tests out faster than classically optimized code, tracking hand-coded BLAS3 routines.
Abstract: An elementary, machine-independent, recursive algorithm for matrix multiplication C+=A*B provides implicit blocking at every level of the memory hierarchy and tests out faster than classically optimized code, tracking hand-coded BLAS3 routines. Proof of concept is demonstrated by racing the in-place algorithm against manufacturer's hand-tuned BLAS3 routines; it can win. The recursive code bifurcates naturally at the top level into independent block-oriented processes, each of which writes to a disjoint and contiguous region of memory. Experience has shown that the indexing vastly improves the patterns of memory access at all levels of the memory hierarchy, independently of the sizes of caches or pages and without ad hoc programming. It also exposed a weakness in SGI's C compilers that merrily unroll loops for the super-scalar R8000 processor, but do not analogously unfold the base cases of the most elementary recursions. Such deficiencies might deter future programmers from using this rich class of recursive algorithms.
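
A minimal Python rendering of the recursive C += A*B idea (the paper's code is in C and operates in place on submatrix views); quadrant splitting gives the implicit blocking described above. Assumes square matrices; the cutoff and base-case kernel are illustrative.

```python
import numpy as np

def rec_matmul(C, A, B, cutoff=64):
    """C += A @ B by recursive quadrant splitting. The recursion adapts to
    every level of the memory hierarchy without explicit block sizes;
    below `cutoff` a plain product (standing in for a tuned kernel) takes over."""
    n = A.shape[0]
    if n <= cutoff:
        C += A @ B
        return
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    C11, C12, C21, C22 = C[:h, :h], C[:h, h:], C[h:, :h], C[h:, h:]
    rec_matmul(C11, A11, B11, cutoff); rec_matmul(C11, A12, B21, cutoff)
    rec_matmul(C12, A11, B12, cutoff); rec_matmul(C12, A12, B22, cutoff)
    rec_matmul(C21, A21, B11, cutoff); rec_matmul(C21, A22, B21, cutoff)
    rec_matmul(C22, A21, B12, cutoff); rec_matmul(C22, A22, B22, cutoff)
```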

128 citations


Journal ArticleDOI
TL;DR: In this paper, a new technique which employs both Picard iteration and expansion in shifted Chebyshev polynomials is used to symbolically approximate the fundamental solution matrix for linear time-periodic dynamical systems of arbitrary dimension explicitly as a function of the system parameters and time.

105 citations


Journal ArticleDOI
TL;DR: It is shown that the problem of deciding whether a pair of 48 × 48 integer matrices is mortal is undecidable, and that the problem of deciding, for a given k, whether a pair of matrices is k-mortal is NP-complete.

98 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that all zero-energy eigenstates of an arbitrary m-state quantum spin chain Hamiltonian with nearest-neighbour interaction in the bulk and single site boundary terms, which can also describe the dynamics of stochastic models, can be written as matrix product states.
Abstract: We show that all zero-energy eigenstates of an arbitrary m-state quantum spin chain Hamiltonian with nearest-neighbour interaction in the bulk and single site boundary terms, which can also describe the dynamics of stochastic models, can be written as matrix product states. This means that the weights in these states can be expressed as expectation values in a Fock representation of an algebra generated by 2m operators fulfilling quadratic relations which are defined by the Hamiltonian.
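
Generically, a matrix product state assigns each configuration a weight of the form ⟨W| D_{s1} D_{s2} ... D_{sL} |V⟩. The sketch below evaluates such a weight for given finite-dimensional matrices and boundary vectors, purely to illustrate the form of the ansatz; the matrices and vectors shown are arbitrary, not a representation of the algebra derived in the paper.

```python
import numpy as np

def mps_weight(config, ops, w, v):
    """Weight of a configuration (s_1, ..., s_L) in a matrix product state:
    <w| D[s_1] D[s_2] ... D[s_L] |v>, where ops[s] is the matrix assigned
    to local state s and w, v are boundary vectors."""
    vec = v
    for s in reversed(config):        # apply matrices right-to-left onto |v>
        vec = ops[s] @ vec
    return w @ vec

# Illustrative 2-state example with arbitrary 2x2 matrices (not from the paper)
D = {0: np.array([[1.0, 0.0], [1.0, 1.0]]),
     1: np.array([[1.0, 1.0], [0.0, 1.0]])}
w = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
print(mps_weight((0, 1, 1, 0), D, w, v))
```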

90 citations


Journal ArticleDOI
TL;DR: In this article, it has been shown how an arbitrary matrix polynomial in any number of symmetric 3 × 3 matrices may be expressed in a canonical form in the orthogonal transformation group.
Abstract: In previous papers [2, 3] it has been shown how an arbitrary matrix polynomial in any number of symmetric 3 × 3 matrices may be expressed in a canonical form. From these results an integrity basis under the orthogonal transformation group for an arbitrary number of symmetric 3 × 3 matrices has been derived. This consists of traces of products formed from the matrices which have total degree six or less in the matrices. In deriving these results a number of theorems were obtained which enabled us to express a product formed from any number of 3 × 3 matrices, whether symmetric or non-symmetric, as a sum of products of particular types formed from these matrices, with coefficients which are polynomials in traces of products formed from the matrices.

Journal ArticleDOI
Y. Inouye, K. Hirano
TL;DR: This work addresses the blind identification problem of the linear MIMO system driven by unobservable colored inputs using higher order statistics (HOS) and shows that the transfer function matrix of an unknown system is identified only up to post-multiplication by a g-matrix.
Abstract: The blind identification problem of a linear multi-input-multi-output (MIMO) system is widely noticed by many researchers in diverse fields due to its relevance to blind signal separation. However, such a problem is ill-posed and has no unique solution. Therefore, we can only find a solution of the problem within an equivalence class. We address the blind identification problem of the linear MIMO system driven by unobservable colored inputs using higher order statistics (HOS), particularly the fourth-order cumulants, of the outputs, where the unobservable inputs are mutually independent but temporally colored linear processes. We first define the set, which is denoted by S, of stable scalar transfer functions and then define the notion of a generalized permutation matrix (which is abbreviated by a g-matrix) over S. Then, it is shown that the transfer function matrix of an unknown system is identified only up to post-multiplication by a g-matrix. This result is applied to identifying FIR systems for blind signal separation.

Journal ArticleDOI
TL;DR: In this paper, a precise time-step integration method for dynamic problems is presented, where the second-order differential equations are manipulated directly and a general damping matrix is considered.
Abstract: In this paper, a precise time-step integration method for dynamic problems is presented. The second-order differential equations for dynamic problems are manipulated directly. A general damping matrix is considered. The transient responses are expressed in terms of the steady-state responses, the given initial conditions and the step-response and impulsive-response matrices. The steady-state responses for various types of excitations are readily obtainable. The computation of the step-response and impulsive-response matrices and their time derivatives is studied in this paper. A direct computation of these matrices using the Taylor series solutions is not efficient when the time-step size Δt is not small. In this paper, the recurrence formulae relating the response matrices at t=Δt to those at t=Δt/2 are constructed. A recursive procedure is proposed to evaluate these matrices at t=Δt from the matrices at t=Δt/2^m. The matrices at t=Δt/2^m are obtained from the Taylor series solutions. To improve the computational efficiency, the relations between the response matrices and their time derivatives are investigated. In addition, these matrices are expressed in terms of two symmetric matrices that can also be evaluated recursively. Besides, from the physical point of view, these matrices should be banded for small Δt. Both the stability and accuracy characteristics of the present algorithm are studied. Three numerical examples are used to illustrate the highly precise and stable algorithm. © 1997 John Wiley & Sons, Ltd.
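
The recursive doubling exploited here is easiest to see on the first-order transition matrix T(Δt) = exp(HΔt). The sketch below is the standard precise-integration doubling (truncated Taylor series at Δt/2^m, then m squarings), given only as background: the paper works directly with the second-order equations and also doubles the step- and impulsive-response matrices, which is not shown.

```python
import numpy as np

def transition_matrix(H, dt, m=20, taylor_terms=4):
    """T(dt) = exp(H*dt) via precise-integration doubling.
    Write T = I + Ta and track only Ta, so the '+I' part does not swamp
    the tiny increment at tau = dt / 2**m in finite precision."""
    H = np.asarray(H, dtype=float)
    n = H.shape[0]
    tau = dt / 2.0 ** m
    Htau = H * tau
    Ta = np.zeros_like(H)
    term = np.eye(n)
    for k in range(1, taylor_terms + 1):   # truncated Taylor series at tau
        term = term @ Htau / k
        Ta = Ta + term
    for _ in range(m):                     # T(2t) = T(t) T(t)  =>  Ta <- 2*Ta + Ta @ Ta
        Ta = 2.0 * Ta + Ta @ Ta
    return np.eye(n) + Ta
```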

Journal ArticleDOI
TL;DR: In this paper, the hopping motion of classical particles on a chain coupled to reservoirs at both ends is studied for parallel dynamics with arbitrary probabilities, and the stationary state is obtained in the form of an alternating matrix product.
Abstract: The hopping motion of classical particles on a chain coupled to reservoirs at both ends is studied for parallel dynamics with arbitrary probabilities. The stationary state is obtained in the form of an alternating matrix product. The properties of one- and two-dimensional representations are studied in detail and a general relation of the matrix algebra to that of the sequential limit is found. In this way the general phase diagram of the model is obtained. The mechanism of the sequential limit, the formulation as a vertex model, and other aspects are discussed.

Journal ArticleDOI
TL;DR: In this paper, a generalization of the matrix product ansatz (MPA) is proposed to construct a stationary state of one-dimensional integrable systems in the form of products of matrices.
Abstract: Proposed is a method to construct a stationary state of one-dimensional integrable systems in the form of products of matrices. This is a generalization of the so-called “matrix product ansatz (MPA)”. The key idea is that the matrices are chosen to constitute the Zamolodchikov-Faddeev algebra (ZF-algebra) for the R-matrix and the K-matrix, which are the solutions of the Yang-Baxter equation (YBE) and the reflection equation (RE). It is shown that a matrix product state gives a simultaneous stationary state of commuting operators which are expressed in terms of the R-matrices and the K-matrices. As an example, a solution for the isotropic Heisenberg spin chain with non-diagonal boundary fields is given. The connection to the conventional MPA is clarified. Applications to other models and the relationship to the algebraic Bethe ansatz (ABA) are also discussed.

Journal ArticleDOI
TL;DR: In this article, the Hadamard product of matrices is used to obtain an explicit matrix formulation for the differential quadrature (DQ) and differential cubature (DC) solutions of nonlinear differential and integro-differential equations.
Abstract: This article points out that the differential quadrature (DQ) and differential cubature (DC) methods, due to their global domain property, are more efficient for nonlinear problems than the traditional numerical techniques such as finite element and finite difference methods. By introducing the Hadamard product of matrices, we obtain an explicit matrix formulation for the DQ and DC solutions of nonlinear differential and integro-differential equations. Due to its simplicity and flexibility, the present Hadamard product approach makes the DQ and DC methods much easier to use. Many studies on the Hadamard product can be fully exploited for the DQ and DC nonlinear computations. Furthermore, we first present the SJT product of matrix and vector to compute accurately and efficiently the Frechet derivative matrix in the Newton-Raphson method for the solution of the nonlinear formulations. We also propose a simple approach to simplify the DQ or DC formulations for some nonlinear differential operators and, thus, the computational efficiency of these methods is significantly improved. We give the matrix multiplication formulas to efficiently compute the weighting coefficient matrices of the DC method. The spherical harmonics are suggested as the test functions in the DC method to handle the nonlinear differential equations occurring in global and hemispheric weather forecasting problems. Some examples are analyzed to demonstrate the simplicity and efficiency of the presented techniques. It is emphasized that the innovations presented are applicable to the nonlinear computations of the other numerical methods as well. © 1997 John Wiley & Sons, Inc.
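
As a toy illustration of the explicit matrix formulation (not the paper's derivation), a differential-quadrature-style discretization of a nonlinear term such as u·u_x becomes a Hadamard (elementwise) product between the nodal vector and a weighting-matrix product. The weighting matrix D below is a crude finite-difference stand-in with illustrative names.

```python
import numpy as np

def dq_nonlinear_term(D, u):
    """Evaluate u ∘ (D u): the nodal approximation of u * du/dx, written
    entirely with a matrix multiplication and an elementwise (Hadamard) product."""
    return u * (D @ u)          # '*' is the Hadamard product

# Minimal usage on a uniform 1-D grid with a crude central-difference D
n = 5
x = np.linspace(0.0, 1.0, n)
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * (x[1] - x[0]))
u = np.sin(np.pi * x)
print(dq_nonlinear_term(D, u))
```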

Proceedings ArticleDOI
11 Jul 1997
TL;DR: Performance analysis shows that the proposed generalized Cannon’s algorithm (GCA) requires fewer page faults than a previously proposed algorithm (SUMMA), and it is shown that GCA maintains higher performance for large matrices than SUMMA.
Abstract: Cannon’s algorithm is a memory-efficient matrix multiplication technique for parallel computers with toroidal mesh interconnections. This algorithm assumes that input matrices are block distributed, but it is not clear how it can deal with block-cyclic distributed matrices. This paper generalizes Cannon’s algorithm for the case when input matrices are block-cyclic distributed across a two-dimensional processor array with an arbitrary number of processors and toroidal mesh interconnections. An efficient scheduling technique is proposed so that the number of communication steps is reduced to the least common multiple of P and Q for a given P × Q processor array. In addition, a partitioning and communication scheme is proposed to reduce the number of page faults for the case when matrices are too large to fit into main memory. Performance analysis shows that the proposed generalized Cannon’s algorithm (GCA) requires fewer page faults than a previously proposed algorithm (SUMMA). Experimental results on Intel Paragon show that GCA performs better than SUMMA when blocks of size larger than about (65 × 65) are used. However, GCA performance degrades if the block size is relatively small, while SUMMA maintains the same performance. It is also shown that GCA maintains higher performance for large matrices than SUMMA.
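
For context, classic Cannon's algorithm on a p × p torus of block-distributed matrices can be simulated serially as below. This is the textbook block version only, not the block-cyclic generalization (GCA) developed in the paper; the helper names are illustrative.

```python
import numpy as np

def cannon_matmul(A, B, p):
    """Serially simulate Cannon's algorithm on a p x p process grid.
    A and B are n x n with n divisible by p; 'process' (i, j) owns one
    block of A and B and accumulates one block of C."""
    n = A.shape[0]
    b = n // p
    blk = lambda M, i, j: M[i * b:(i + 1) * b, j * b:(j + 1) * b]
    # Initial alignment: skew row i of A left by i, column j of B up by j.
    Ablk = [[blk(A, i, (j + i) % p).copy() for j in range(p)] for i in range(p)]
    Bblk = [[blk(B, (i + j) % p, j).copy() for j in range(p)] for i in range(p)]
    C = np.zeros_like(A)
    for _ in range(p):
        for i in range(p):
            for j in range(p):
                # Local multiply-accumulate on each 'process'
                C[i * b:(i + 1) * b, j * b:(j + 1) * b] += Ablk[i][j] @ Bblk[i][j]
        # Shift A blocks one step left along rows, B blocks one step up along columns.
        Ablk = [[Ablk[i][(j + 1) % p] for j in range(p)] for i in range(p)]
        Bblk = [[Bblk[(i + 1) % p][j] for j in range(p)] for i in range(p)]
    return C

# Sanity check on random data
rng = np.random.default_rng(0)
A, B = rng.standard_normal((6, 6)), rng.standard_normal((6, 6))
assert np.allclose(cannon_matmul(A, B, 3), A @ B)
```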

Journal ArticleDOI
TL;DR: A graphical user interface is designed to let the user choose the spectral decomposition according to specified regions in the complex plane of a spectral divide-and-conquer algorithm with Newton iteration.
Abstract: The implementation and performance of a class of divide-and-conquer algorithms for computing the spectral decomposition of nonsymmetric matrices on distributed memory parallel computers are studied in this paper. After presenting a general framework, we focus on a spectral divide-and-conquer (SDC) algorithm with Newton iteration. Although the algorithm requires several times as many floating point operations as the best serial QR algorithm, it can be simply constructed from a small set of highly parallelizable matrix building blocks within Level 3 basic linear algebra subroutines (BLAS). Efficient implementations of these building blocks are available on a wide range of machines. In some ill-conditioned cases, the algorithm may lose numerical stability, but this can easily be detected and compensated for. The algorithm reached 31% efficiency with respect to the underlying PUMMA matrix multiplication and 82% efficiency with respect to the underlying ScaLAPACK matrix inversion on a 256 processor Intel Touchstone Delta system, and 41% efficiency with respect to the matrix multiplication in CMSSL on a 32 node Thinking Machines CM-5 with vector units. Our performance model predicts the performance reasonably accurately. To take advantage of the geometric nature of SDC algorithms, we have designed a graphical user interface to let the user choose the spectral decomposition according to specified regions in the complex plane.
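
For comparison, the Newton iteration for the matrix sign function that typically drives such spectral divide-and-conquer codes is sketched below in a standard textbook form (column-pivoted QR stands in for a rank-revealing factorization). It uses matrix inversion, which is why this route is faster but can be less stable; the paper's scaling, deflation, and stopping criteria are not reproduced.

```python
import numpy as np
from scipy.linalg import qr

def sign_function_split(A, iters=30):
    """Spectral division of A across the imaginary axis via the Newton
    iteration X <- (X + inv(X)) / 2, which converges to sign(A).
    Assumes A has no eigenvalues on the imaginary axis."""
    n = A.shape[0]
    X = A.astype(float).copy()
    for _ in range(iters):
        X = 0.5 * (X + np.linalg.inv(X))     # Newton step (requires inversion)
    P = 0.5 * (np.eye(n) + X)                # spectral projector for Re(lambda) > 0
    Q, R, piv = qr(P, pivoting=True)         # column-pivoted QR as a rank-revealing stand-in
    k = int(round(np.trace(P)))              # rank of the projector = subspace dimension
    return Q[:, :k]                          # orthonormal basis of the invariant subspace
```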

Journal ArticleDOI
TL;DR: A taxonomy for this family of related matrix multiplication algorithms of the form C = αAB + βC on two-dimensional process grid topologies is offered and it is concluded that no single algorithm always achieves the best performance on different matrix and grid shapes.
Abstract: In this paper, we present several new and generalized parallel dense matrix multiplication algorithms of the form C = αAB + βC on two-dimensional process grid topologies. These algorithms can deal with rectangular matrices distributed on rectangular grids. We classify these algorithms coherently into three categories according to the communication primitives used and thus we offer a taxonomy for this family of related algorithms. All these algorithms are represented in the data distribution independent approach and thus do not require a specific data distribution for correctness. The algorithmic compatibility condition result shown here ensures the correctness of the matrix multiplication. We define and extend the data distribution functions and introduce permutation compatibility and algorithmic compatibility. We also discuss a permutation compatible data distribution (modified virtual 2D data distribution). We conclude that no single algorithm always achieves the best performance on different matrix and grid shapes. A practical approach to resolve this dilemma is to use poly-algorithms. We analyze the characteristics of each of these matrix multiplication algorithms and provide initial heuristics for using the poly-algorithm. All these matrix multiplication algorithms have been tested on the IBM SP2 system. The experimental results are presented in order to demonstrate their relative performance characteristics, motivating the combined value of the taxonomy and new algorithms introduced here. © 1997 by John Wiley & Sons, Ltd.

Proceedings Article
01 Jan 1997
TL;DR: This paper discusses volumetric deformable models for modeling human body parts and organs in surgery simulation systems, built from finite element models of linear elastic materials, with condensation applied to achieve real-time response.
Abstract: This paper discusses volumetric deformable models for modeling human body parts and organs in surgery simulation systems. These models are built using finite element models of linear elastic materials. To achieve real-time response, condensation has been applied to the system stiffness matrix, and selective matrix vector multiplication has been used to minimize the computational cost.

01 Dec 1997
TL;DR: A systematic discussion of algorithms to multiply a vector by a matrix expressed as the Kronecker product of sparse matrices, extending previous work in a unified notational framework to define new algorithms for the solution of large structured Markov models.
Abstract: We present a systematic discussion of algorithms to multiply a vector by a matrix expressed as the Kronecker product of sparse matrices, extending previous work in a unified notational framework. Then, we use our results to define new algorithms for the solution of large structured Markov models. In addition to a comprehensive overview of existing approaches, we give new results with respect to: (1) managing certain types of state-dependent behavior without incurring extra cost; (2) supporting both Jacobi-style and Gauss-Seidel-style methods by appropriate multiplication algorithms; (3) speeding up algorithms that consider probability vectors of size equal to the "actual" state space instead of the "potential" state space.
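
The basic trick for multiplying a vector by a two-factor Kronecker product without forming it explicitly is the standard identity shown below, given here for dense factors; the paper's algorithms extend this idea to many sparse factors and to state-dependent variants.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Compute y = (A kron B) @ x without forming the Kronecker product,
    using the identity (A kron B) vec(X) = vec(B X A^T) with column-major vec."""
    p, m = A.shape
    q, n = B.shape
    X = x.reshape(m, n).T          # column-major "unvec" of x into an n x m matrix
    Y = B @ X @ A.T                # q x p intermediate
    return Y.T.reshape(-1)         # column-major vec of Y

# Sanity check against the explicit Kronecker product on a small random example
rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 4)), rng.standard_normal((2, 5))
x = rng.standard_normal(4 * 5)
assert np.allclose(kron_matvec(A, B, x), np.kron(A, B) @ x)
```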

Book ChapterDOI
12 Mar 1997
TL;DR: In this article, the authors present a paradigm for solving external-memory problems, and illustrate it by algorithms for matrix multiplication, sorting and list ranking based on the use of BSP algorithms.
Abstract: In this paper we present a paradigm for solving external-memory problems, and illustrate it by algorithms for matrix multiplication, sorting and list ranking. Our paradigm is based on the use of BSP algorithms. The correspondence is almost perfect, and especially the notion of x-optimality carries over to algorithms designed according to our paradigm.

Patent
28 Aug 1997
TL;DR: In this article, a method, system, and data structure are provided which facilitate matrix multiplication with advantageous computational efficiency, which has applicability to numerous fields, including linear programming, where a great deal of multiplication of large, sparse matrices is performed.
Abstract: A method, system, and data structure are provided which facilitate matrix multiplication with advantageous computational efficiency. The invention, as variously implemented as a processing system, method, or data structure in a recording medium such as a memory, has applicability to numerous fields, including linear programming, where a great deal of multiplication of large, sparse matrices is performed. The method of the invention includes the steps of creating a first submatrix block from non-zero terms of a sparse matrix, such that all of the terms within a given column of the submatrix block are from a respective column of the sparse matrix, and creating a corresponding second index submatrix block of the same dimensions as the first block, such that each term of the second block identifies the position of the corresponding term of the first block within the sparse matrix, in terms of a row and column index. Finally, the method includes reordering terms of the first and second blocks correspondingly, as necessary to produce a final configuration within the first and second blocks such that all of the row indices within any given row of the second block are distinct.

Patent
02 Jul 1997
TL;DR: In this article, a method and apparatus are described for converting frequency-coefficient matrices between a configuration in which the matrices are transforms of unoverlapped image-data matrices and a configuration in which the matrices are transforms of overlapped image-data matrices, the image-data terms corresponding to pixels from an original image.
Abstract: A method and apparatus are disclosed for converting frequency-coefficient matrices between a configuration in which the matrices are transforms of unoverlapped image-data matrices and a configuration in which the matrices are transforms of overlapped image-data matrices, the image-data matrices comprising image-data terms corresponding to pixels from an original image, the method comprising the steps of: deriving a conversion matrix; transposing the conversion matrix; matrix multiplying a first frequency-coefficient matrix of one configuration by the conversion matrix; matrix multiplying a second frequency-coefficient matrix of the same configuration by the transpose conversion matrix; and combining the product results to form a matrix formatted in the other configuration.

Proceedings ArticleDOI
01 Jul 1997
TL;DR: The matrix structure is exploited and the time complexity of constructing such matrices is decreased to roughly quadratic in the matrix dimension, whereas the previous methods had cubic complexity.
Abstract: Resultants characterize the existence of roots of systems of multivariate nonlinear polynomial equations, while their matrices reduce the computation of all common zeros to a problem in linear algebra. Sparse elimination theory has introduced the sparse resultant, which takes into account the sparse structure of the polynomials. The construction of sparse resultant, or Newton, matrices is a critical step in the computation of the resultant and the solution of the system. We exploit the matrix structure and decrease the time complexity of constructing such matrices to roughly quadratic in the matrix dimension, whereas the previous methods had cubic complexity. The space complexity is also decreased by one order of magnitude. These results imply similar improvements in the complexity of computing the resultant itself and of solving zero-dimensional systems. We apply some novel techniques for determining the rank of rectangular matrices by an exact or numerical computation. Finally, we improve the existing complexity for polynomial multiplication under our model of sparseness, offering bounds linear in the number of variables and the number of nonzero terms.

Proceedings ArticleDOI
21 Apr 1997
TL;DR: This work introduces PHiPAC, a coding methodology for developing portable high-performance numerical libraries in ANSI C, and develops code for optimized matrix multiply routines that can achieve over 90% of peak performance on a variety of current workstations and are often faster than vendor-supplied optimized libraries.
Abstract: We introduce PHiPAC, a coding methodology for developing portable high-performance numerical libraries in ANSI C. Using this methodology, we have developed code for optimized matrix multiply routines. These routines can achieve over 90% of peak performance on a variety of current workstations, and are often faster than vendor-supplied optimized libraries. We then describe the bunch-mode back-propagation algorithm and how it can use the PHiPAC derived matrix multiply routines. Using a set of plots, we investigate the tradeoffs between bunch size, convergence rate, and training speed using a standard speech recognition data set and show how use of the PHiPAC routines can lead to a significantly faster back-propagation learning algorithm.
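
Bunch-mode (batched) back-propagation turns per-pattern vector operations into matrix-matrix products, which is exactly where an optimized GEMM such as PHiPAC's pays off. The sketch below is a generic one-hidden-layer example with illustrative names and a squared-error loss, not the paper's network or code.

```python
import numpy as np

def backprop_bunch(X, T, W1, W2, lr=0.1):
    """One gradient step on a bunch (mini-batch) X of shape (bunch, n_in)
    with targets T of shape (bunch, n_out). Every heavy operation is a
    matrix-matrix multiply, so a fast GEMM dominates the runtime."""
    H = np.tanh(X @ W1)                 # forward pass, hidden layer (GEMM)
    Y = H @ W2                          # forward pass, linear output layer (GEMM)
    dY = (Y - T) / X.shape[0]           # output error for mean squared error
    dW2 = H.T @ dY                      # GEMM: gradient for W2
    dH = (dY @ W2.T) * (1.0 - H ** 2)   # back-propagate through tanh
    dW1 = X.T @ dH                      # GEMM: gradient for W1
    return W1 - lr * dW1, W2 - lr * dW2
```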

Journal ArticleDOI
TL;DR: It is argued that perturbing towards the orthonormal polar factor is an attractive choice, and it is shown that the perturbations improve the departure from orthonormality without significantly degrading the finite-time global error bound for the ODE solution.
Abstract: Certain applications produce initial value ODEs whose solutions, regarded as time-dependent matrices, preserve orthonormality. Such systems arise in the computation of Lyapunov exponents and the construction of smooth singular value decompositions of parametrized matrices. For some special problem classes, there exist time-stepping methods that automatically inherit the orthonormality preservation. However, a more widely applicable approach is to apply a standard integrator and regularly replace the approximate solution by an orthonormal matrix. Typically, the approximate solution is replaced by the factor Q from its QR decomposition (computed, for example, by the modified Gram-Schmidt method). However, the optimal replacement—the one that is closest in the Frobenius norm—is given by the orthonormal polar factor. Quadratically convergent iteration schemes can be used to compute this factor. In particular, there is a matrix multiplication based iteration that is ideally suited to modern computer architectures. Hence, we argue that perturbing towards the orthonormal polar factor is an attractive choice, and we consider performing a fixed number of iterations. Using the optimality property we show that the perturbations improve the departure from orthonormality without significantly degrading the finite-time global error bound for the ODE solution. Our analysis allows for adaptive time-stepping, where a local error control process is driven by a user-supplied tolerance. Finally, using a recent result of Sun, we show how the global error bound carries through to the case where the orthonormal QR factor is used instead of the orthonormal polar factor.
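
The matrix-multiplication-based iteration for the orthonormal polar factor is commonly taken to be the Newton-Schulz iteration; the sketch below applies a fixed number of steps to a nearly orthonormal Q, as in the setting described. This is stated from general knowledge, not copied from the paper.

```python
import numpy as np

def newton_schulz_orthonormalize(Q, steps=2):
    """Push a nearly orthonormal matrix Q towards its orthonormal polar factor
    using Q <- Q (3I - Q^T Q) / 2. Quadratically convergent when
    ||Q^T Q - I|| < 1, and built entirely from matrix multiplications."""
    I = np.eye(Q.shape[1])
    for _ in range(steps):
        Q = Q @ (1.5 * I - 0.5 * (Q.T @ Q))
    return Q
```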

Journal ArticleDOI
TL;DR: In this article, two new ordering methods that can be used to reduce the elements in the inverse factors of a sparse matrix are proposed, which are based on the diagonalization of A via the use of a transformation matrix, C.
Abstract: Two new ordering methods that can be used to reduce the elements in the inverse factors of a sparse matrix are proposed. Compared with all other commonly used ordering methods, the new methods will produce less fill-in elements. The proposed methods are based on the diagonalization of A via the use of a transformation matrix, C. A new node sequence for the power network and all the elements of the C matrix are generated in only a single stage instead of the conventional LDU decomposition followed by a series of multiplications for W-matrix. The methods may be used for the parallel solution of sparse matrix equations. Test results show that the proposed methods can reduce the computation burden effectively.

Journal ArticleDOI
TL;DR: In this paper, a fast O(n²)-approximation algorithm for symmetric Gaussian elimination with partial diagonal pivoting for Hermitian Toeplitz-like matrices is presented.

Journal ArticleDOI
John S. Oliver1
TL;DR: The use of DNA to perform an analog calculation illustrates a new approach to computing with DNA for Boolean matrices or matrices containing positive, real numbers.
Abstract: A DNA-based method for calculating the product of Boolean matrices or matrices containing positive, real numbers is presented. In the case of matrices containing real numbers, the manipulation of reaction conditions allows a quantitative calculation to be performed. The use of DNA to perform an analog calculation illustrates a new approach to computing with DNA.
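
For reference, the Boolean matrix product that the DNA protocol evaluates replaces sum and product with OR and AND; the conventional computation of that product (not the DNA procedure itself) is simply:

```python
def boolean_matmul(A, B):
    """C[i][j] = OR over k of (A[i][k] AND B[k][j]) for 0/1 matrices
    given as nested lists; this is the product the DNA procedure encodes."""
    n, m, p = len(A), len(B), len(B[0])
    return [[int(any(A[i][k] and B[k][j] for k in range(m))) for j in range(p)]
            for i in range(n)]

# Example: two 2x2 Boolean matrices
print(boolean_matmul([[1, 0], [1, 1]], [[0, 1], [1, 0]]))   # [[0, 1], [1, 1]]
```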