
Showing papers on "Matrix multiplication published in 2002"


Journal ArticleDOI
Uri Zwick1
TL;DR: Two new algorithms for solving the All Pairs Shortest Paths (APSP) problem for weighted directed graphs using fast matrix multiplication algorithms are presented.
Abstract: We present two new algorithms for solving the All Pairs Shortest Paths (APSP) problem for weighted directed graphs. Both algorithms use fast matrix multiplication algorithms. The first algorithm solves the APSP problem for weighted directed graphs in which the edge weights are integers of small absolute value in Õ(n^(2+μ)) time, where μ satisfies the equation ω(1, μ, 1) = 1 + 2μ and ω(1, μ, 1) is the exponent of the multiplication of an n × n^μ matrix by an n^μ × n matrix. Currently, the best available bounds on ω(1, μ, 1), obtained by Coppersmith, imply that μ < 0.575, so the running time of the algorithm is O(n^2.575). The second algorithm solves the APSP problem almost exactly for directed graphs with arbitrary non-negative real edge weights. It runs in Õ((n^ω/ε) log(W/ε)) time, where ω < 2.376 is the exponent of square matrix multiplication, ε > 0 is an error parameter and W is the largest edge weight in the graph, after the edge weights are scaled so that the smallest non-zero edge weight in the graph is 1. It returns estimates of all the distances in the graph with a stretch of at most 1 + ε. Corresponding paths can also be found efficiently.
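
The algorithmic core both results build on is the distance ((min, +)) product: squaring the weight matrix under (min, +) about ⌈log₂ n⌉ times yields all-pairs distances, and the paper's contribution is computing such products quickly via fast matrix multiplication. A minimal sketch of that baseline idea (numpy assumed; illustrative, not the paper's algorithm):

```python
import numpy as np

def min_plus_product(A, B):
    """(min,+) product: C[i,j] = min_k A[i,k] + B[k,j]."""
    # Broadcasting forms all sums A[i,k] + B[k,j], then reduces over k.
    return (A[:, :, None] + B[None, :, :]).min(axis=1)

def apsp(W):
    """All-pairs shortest paths by repeated squaring of the weight matrix.
    W[i,j] = edge weight, np.inf if no edge, 0 on the diagonal."""
    n = W.shape[0]
    D = W.copy()
    steps = 1
    while steps < n - 1:          # D covers paths of <= steps edges
        D = min_plus_product(D, D)
        steps *= 2
    return D

W = np.array([[0, 3, np.inf],
              [np.inf, 0, 1],
              [2, np.inf, 0]], dtype=float)
print(apsp(W))  # e.g. distance 0 -> 2 is 3 + 1 = 4
```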

286 citations


Journal ArticleDOI
TL;DR: In this paper, the determinant of the target matrix is log-normally distributed, whereas the remainder is a surprisingly complicated function of a parameter characterizing the norm of the matrix and its skewness.
Abstract: We derive analytic expressions for infinite products of random 2 × 2 matrices. The determinant of the target matrix is log-normally distributed, whereas the remainder is a surprisingly complicated function of a parameter characterizing the norm of the matrix and a parameter characterizing its skewness. The distribution may have importance as an uncommitted prior in statistical image analysis.
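
Why the determinant comes out log-normal is standard reasoning (not quoted from the paper): determinants are multiplicative, so for i.i.d. random factors A_1, …, A_n,

```latex
\log\Bigl|\det\prod_{i=1}^{n} A_i\Bigr| \;=\; \sum_{i=1}^{n} \log\bigl|\det A_i\bigr|,
```

a sum of i.i.d. random variables that the central limit theorem drives to a Gaussian; exponentiating gives the log-normal law.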

277 citations


Journal ArticleDOI
TL;DR: In this article, the authors give explicit inverse formulae for 2 × 2 block matrices with three different partitions and apply these results to obtain inverses of block triangular matrices and various structured matrices such as Hamiltonian, per-Hermitian, and centro-Hermitian matrices.
Abstract: In this paper, the authors give explicit inverse formulae for 2 × 2 block matrices with three different partitions. Then these results are applied to obtain inverses of block triangular matrices and various structured matrices such as Hamiltonian, per-Hermitian, and centro-Hermitian matrices.
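
For reference, the prototypical formula of this kind is the Schur-complement inverse, valid when A and S = D − CA⁻¹B are invertible (the paper's contribution is the analogous formulae for the other partitions):

```latex
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}
=
\begin{pmatrix}
A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\
-S^{-1}CA^{-1} & S^{-1}
\end{pmatrix},
\qquad S = D - CA^{-1}B.
```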

261 citations


Proceedings ArticleDOI
01 Jul 2002
TL;DR: A natural and geometrically meaningful definition of scalar multiples and a commutative addition of transformations based on the matrix representation are derived, given that the matrices have no negative real eigenvalues.
Abstract: Geometric transformations are most commonly represented as square matrices in computer graphics. Following simple geometric arguments we derive a natural and geometrically meaningful definition of scalar multiples and a commutative addition of transformations based on the matrix representation, given that the matrices have no negative real eigenvalues. Together, these operations allow the linear combination of transformations. This provides the ability to create weighted combinations of transformations, to interpolate between transformations, and to construct or use arbitrary transformations in a structure similar to a basis of a vector space. These basic techniques are useful for synthesis and analysis of motions or animations. Animations through a set of key transformations are generated using standard techniques such as subdivision curves. For analysis and progressive compression a PCA can be applied to sequences of transformations. We describe an implementation of the techniques that enables an easy-to-use and transparent way of dealing with geometric transformations in graphics software. We compare and relate our approach to other techniques such as matrix decomposition and quaternion interpolation.
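
Such operations can be realized through matrix logarithms and exponentials, e.g. combining transformations as exp(Σᵢ wᵢ log Tᵢ); a minimal sketch assuming SciPy (function and variable names are illustrative, and the restriction to no negative real eigenvalues is what keeps the principal logarithm well defined):

```python
import numpy as np
from scipy.linalg import expm, logm

def combine(transforms, weights):
    """Weighted combination of transformations: exp(sum_i w_i * log T_i).
    Requires matrices with no negative real eigenvalues so that the
    principal matrix logarithm is well defined."""
    L = sum(w * logm(T) for w, T in zip(weights, transforms))
    return expm(L).real

# Interpolate halfway between the identity and a 90-degree rotation.
R90 = np.array([[0.0, -1.0],
                [1.0,  0.0]])
half = combine([np.eye(2), R90], [0.5, 0.5])
print(half)  # approximately a 45-degree rotation matrix
```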

253 citations


Journal ArticleDOI
TL;DR: The basic ideas of ℋ- and ℋ²-matrices are introduced and an algorithm that adaptively computes approximations of general matrices in the latter format is presented.
Abstract: A class of matrices (ℋ²-matrices) has recently been introduced for storing discretisations of elliptic problems and integral operators from the BEM. These matrices have the following properties: (i) They are sparse in the sense that only few data are needed for their representation. (ii) The matrix-vector multiplication is of linear complexity. (iii) In general, sums and products of these matrices are no longer in the same set, but after truncation to the ℋ²-matrix format these operations are again of quasi-linear complexity. We introduce the basic ideas of ℋ- and ℋ²-matrices and present an algorithm that adaptively computes approximations of general matrices in the latter format.
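
A toy illustration of property (i) (numpy; illustrative only, not the paper's adaptive algorithm): for interactions between two well-separated point clusters, the corresponding matrix block is numerically low-rank, which is exactly what the data-sparse format stores as outer products:

```python
import numpy as np

# 1D model problem: kernel block K[i, j] = log|x_i - y_j| for two
# well-separated clusters -- a typical admissible block.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(3.0, 4.0, 200)
K = np.log(np.abs(x[:, None] - y[None, :]))

# Truncated SVD: keep only singular values above ~1e-10 relative.
U, s, Vt = np.linalg.svd(K)
r = int(np.sum(s > 1e-10 * s[0]))
A, B = U[:, :r] * s[:r], Vt[:r, :]   # K ~= A @ B with small rank r

print(r, np.linalg.norm(K - A @ B) / np.linalg.norm(K))
# Storing A and B costs O((m + n) r) instead of O(m n), and K @ v
# becomes A @ (B @ v): two thin products, hence the fast
# matrix-vector multiplication.
```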

247 citations


Journal ArticleDOI
TL;DR: In this paper, an explicit expression for an associative ∗-product on the fuzzy complex projective space CP^(N−1)_F is derived, which generalises previous results for the fuzzy 2-sphere.

184 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a sparse minimum-variance reconstructor for a conventional natural guide star AO system using a sparse approximation for turbulence statistics and recognizing that the nonsparse matrix terms arising from LGS position uncertainty are low-rank adjustments that can be evaluated by using the matrix inversion lemma.
Abstract: The complexity of computing conventional matrix multiply wave-front reconstructors scales as O(n^3) for most adaptive optical (AO) systems, where n is the number of deformable mirror (DM) actuators. This is impractical for proposed systems with extremely large n. It is known that sparse matrix methods improve this scaling for least-squares reconstructors, but sparse techniques are not immediately applicable to the minimum-variance reconstructors now favored for multiconjugate adaptive optical (MCAO) systems with multiple wave-front sensors (WFSs) and DMs. Complications arise from the nonsparse statistics of atmospheric turbulence, and the global tip/tilt WFS measurement errors associated with laser guide star (LGS) position uncertainty. A description is given of how sparse matrix methods can still be applied by use of a sparse approximation for turbulence statistics and by recognizing that the nonsparse matrix terms arising from LGS position uncertainty are low-rank adjustments that can be evaluated by using the matrix inversion lemma. Sample numerical results for AO and MCAO systems illustrate that the approximation made to turbulence statistics has negligible effect on estimation accuracy, the time to compute the sparse minimum-variance reconstructor for a conventional natural guide star AO system scales as O(n^(3/2)) and is only a few seconds for n = 3500, and sparse techniques reduce the reconstructor computations by a factor of 8 for sample MCAO systems with 2417 DM actuators and 4280 WFS subapertures. With extrapolation to 9700 actuators and 17,120 subapertures, a reduction by a factor of approximately 30 or 40 to 1 is predicted.
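
The matrix inversion lemma invoked here is the Sherman–Morrison–Woodbury identity: for a low-rank update UCVᵀ of an easily inverted (here, sparse) matrix A,

```latex
(A + UCV^{T})^{-1}
= A^{-1} - A^{-1}U\,(C^{-1} + V^{T}A^{-1}U)^{-1}\,V^{T}A^{-1},
```

so the nonsparse low-rank terms cost only sparse solves with A plus one small dense solve of rank-sized dimension.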

178 citations


Journal ArticleDOI
TL;DR: It is shown that the computational complexity, measured as the number of matrix multiplications, is essentially independent of system size even for metallic materials with a vanishing band gap.
Abstract: A purification algorithm for expanding the single-particle density matrix in terms of the Hamiltonian operator is proposed. The scheme works with a predefined occupation and requires less than half the number of matrix-matrix multiplications compared to existing methods at low (<10%) and high (>90%) occupancies. The expansion can be used with a fixed chemical potential, in which case it is an asymmetric generalization of, and a substantial improvement over, grand canonical McWeeny purification. It is shown that the computational complexity, measured as the number of matrix multiplications, is essentially independent of system size even for metallic materials with a vanishing band gap.
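
For context, a minimal numpy sketch of the classical McWeeny purification that the proposed scheme generalizes (illustrative only; the paper's algorithm fixes the occupation differently and needs fewer multiplications):

```python
import numpy as np

def mcweeny_purify(P, iterations=30):
    """Classical McWeeny purification: P <- 3P^2 - 2P^3 pushes every
    eigenvalue of P toward 0 or 1, i.e. toward an idempotent density
    matrix. Each sweep costs two matrix-matrix multiplications."""
    for _ in range(iterations):
        P2 = P @ P
        P = 3.0 * P2 - 2.0 * (P2 @ P)
    return P

# Start from a Hermitian matrix with eigenvalues in (0, 1).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
P0 = Q @ np.diag([0.9, 0.8, 0.7, 0.3, 0.2, 0.1]) @ Q.T
P = mcweeny_purify(P0)
print(np.round(np.linalg.eigvalsh(P), 6))  # three 0s and three 1s
```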

143 citations


Journal ArticleDOI
TL;DR: This paper proves quadratic lower bounds for depth-3 arithmetic circuits over fields of characteristic zero for the elementary symmetric functions, the (trace of) iterated matrix multiplication, and the determinant, and gives new shorter formulae of constant depth for the Elementary symmetrical functions.
Abstract: In this paper we prove quadratic lower bounds for depth-3 arithmetic circuits over fields of characteristic zero. Such bounds are obtained for the elementary symmetric functions, the (trace of) iterated matrix multiplication, and the determinant. As corollaries we get the first nontrivial lower bounds for computing polynomials of constant degree, and a gap between the power of depth-3 arithmetic circuits and depth-4 arithmetic circuits. We also give new shorter formulae of constant depth for the elementary symmetric functions. The main technical contribution relates the complexity of computing a polynomial in this model to the wealth of partial derivatives it has on every affine subspace of small co-dimension. Lower bounds for related models utilize an algebraic analog of the Neciporuk lower bound on Boolean formulae.

141 citations


Journal ArticleDOI
TL;DR: Novel recursive blocked algorithms for two-sided matrix equations, which include matrix product terms such as AXB^T, are presented, and the performance improvements are remarkable, including 10-fold speedups or more, compared to standard algorithms.
Abstract: We continue our study of high-performance algorithms for solving triangular matrix equations. They appear naturally in different condition estimation problems for matrix equations and various eigenspace computations, and as reduced systems in standard algorithms. Building on our successful recursive approach applied to one-sided matrix equations (Part I), we now present novel recursive blocked algorithms for two-sided matrix equations, which include matrix product terms such as AXB^T. Examples are the discrete-time standard and generalized Sylvester and Lyapunov equations. The means for achieving high performance is the recursive variable blocking, which has the potential of matching the memory hierarchies of today's high-performance computing systems, and level-3 computations which mainly are performed as GEMM operations. Different implementation issues are discussed, including the design of efficient new algorithms for two-sided matrix products. We present uniprocessor and SMP parallel performance results of recursive blocked algorithms and routines in the state-of-the-art SLICOT library. Although our recursive algorithms with optimized kernels for the two-sided matrix equations perform more operations, the performance improvements are remarkable, including 10-fold speedups or more, compared to standard algorithms.
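
To make the problem class concrete, SciPy ships direct solvers for small instances of these equations; the discrete-time Lyapunov equation below contains exactly the two-sided product term AXAᵀ the paper targets (standard SciPy calls, not the paper's recursive blocked algorithms):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_sylvester

rng = np.random.default_rng(1)
A = 0.3 * rng.standard_normal((4, 4))  # eigenvalues well inside the unit disk
Q = np.eye(4)

# Discrete-time Lyapunov equation A X A^T - X + Q = 0 (two-sided term A X A^T).
X = solve_discrete_lyapunov(A, Q)
print(np.allclose(A @ X @ A.T - X + Q, 0))  # True

# One-sided Sylvester equation A Y + Y B = C (the Part I problem class).
B = rng.standard_normal((4, 4))
C = rng.standard_normal((4, 4))
Y = solve_sylvester(A, B, C)
print(np.allclose(A @ Y + Y @ B, C))        # True
```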

98 citations


Proceedings ArticleDOI
19 May 2002
TL;DR: For any c = c(m) ≥ 1, a lower bound of Ω(m² log_{2c} m) is obtained for the size of any arithmetic circuit for the product of two matrices, as long as the circuit doesn't use products with field elements of absolute value larger than c.

Abstract: We prove a lower bound of Ω(m² log m) for the size of any arithmetic circuit for the product of two matrices, over the real or complex numbers, as long as the circuit doesn't use products with field elements of absolute value larger than 1 (where m × m is the size of each matrix). That is, our lower bound is super-linear in the number of inputs and applies to circuits that use addition gates, product gates and products with field elements of absolute value up to 1. More generally, for any c = c(m) ≥ 1, we obtain a lower bound of Ω(m² log_{2c} m) for the size of any arithmetic circuit for the product of two matrices (over the real or complex numbers), as long as the circuit doesn't use products with field elements of absolute value larger than c. We also prove size-depth tradeoffs for such circuits.

Journal ArticleDOI
01 Jan 2002
TL;DR: A method for the data-sparse approximation of matrices resulting from the discretisation of non-local operators occurring in boundary integral methods or as the inverses of partial differential operators is given.
Abstract: We give a short introduction to a method for the data-sparse approximation of matrices resulting from the discretisation of non-local operators occurring in boundary integral methods or as the inverses of partial differential operators. The result of the approximation will be the so-called hierarchical matrices (or ℋ-matrices for short). These matrices form a subset of the set of all matrices and have a data-sparse representation. The essential operations for these matrices (matrix-vector and matrix-matrix multiplication, addition and inversion) can be performed in, up to logarithmic factors, optimal complexity.

Journal ArticleDOI
TL;DR: Five recursive layouts with successively increasing complexity of address computation are evaluated and it is shown that addressing overheads can be kept in control even for the most computationally demanding of these layouts.
Abstract: The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditional column-major or row-major array layouts to incur high variability in memory system performance as matrix size varies. This paper investigates the use of recursive array layouts to improve performance and reduce variability. Previous work on recursive matrix multiplication is extended to examine several recursive array layouts and three recursive algorithms: standard matrix multiplication and the more complex algorithms of Strassen (1969) and Winograd. While recursive layouts significantly outperform traditional layouts (reducing execution times by a factor of 1.2-2.5) for the standard algorithm, they offer little improvement for Strassen's and Winograd's algorithms. For a purely sequential implementation, it is possible to reorder computation to conserve memory space and improve performance by 10 to 20 percent. Carrying the recursive layout down to the level of individual matrix elements is shown to be counterproductive; a combination of recursive layouts down to canonically ordered matrix tiles instead yields higher performance. Five recursive layouts with successively increasing complexity of address computation are evaluated and it is shown that addressing overheads can be kept in control even for the most computationally demanding of these layouts.
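
A small sketch of the simplest recursive layout involved, Morton (Z) order, which interleaves the bits of the row and column indices so that each quadrant, sub-quadrant, etc. of the matrix is stored contiguously (illustrative Python, not the paper's implementation):

```python
def morton_index(i, j, bits=16):
    """Interleave the bits of (i, j) to get the Z-order position."""
    z = 0
    for b in range(bits):
        z |= ((i >> b) & 1) << (2 * b + 1)  # row bits to odd positions
        z |= ((j >> b) & 1) << (2 * b)      # column bits to even positions
    return z

# Lay out a 4x4 matrix: each 2x2 quadrant becomes contiguous.
order = sorted((morton_index(i, j), (i, j))
               for i in range(4) for j in range(4))
print([ij for _, ij in order])
# [(0,0), (0,1), (1,0), (1,1), (0,2), (0,3), (1,2), (1,3), ...]
```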

Journal ArticleDOI
TL;DR: The irbleigs code is an implementation of an implicitly restarted block-Lanczos method for computing a few selected nearby eigenvalues and associated eigenvectors of a large, possibly sparse, Hermitian matrix A, which makes it well suited for large-scale problems.
Abstract: The irbleigs code is an implementation of an implicitly restarted block-Lanczos method for computing a few selected nearby eigenvalues and associated eigenvectors of a large, possibly sparse, Hermitian matrix A. The code requires only the evaluation of matrix-vector products with A; in particular, factorization of A is not demanded, nor is the solution of linear systems of equations with the matrix A. This, together with a fairly small storage requirement, makes the irbleigs code well suited for large-scale problems. Applications of the irbleigs code to certain generalized eigenvalue problems and to the computation of a few singular values and associated singular vectors are also discussed. Numerous computed examples illustrate the performance of the method and provide comparisons with other available codes.
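
A bare-bones sketch of the Lanczos principle underneath (numpy; irbleigs adds blocking, implicit restarts, and careful reorthogonalization on top of this): eigenvalue estimates come from a small tridiagonal projection built from matrix-vector products alone.

```python
import numpy as np

def lanczos_extreme_eigs(matvec, n, m=40, seed=0):
    """m steps of plain Lanczos; the eigenvalues (Ritz values) of the
    small tridiagonal T approximate extreme eigenvalues of the
    Hermitian operator behind matvec -- no factorization of A needed."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    Q = [q / np.linalg.norm(q)]
    alpha, beta = [], []
    for k in range(m):
        w = matvec(Q[-1])
        a = Q[-1] @ w
        w = w - a * Q[-1] - (beta[-1] * Q[-2] if k > 0 else 0)
        w = w - Q[-1] * (Q[-1] @ w)   # one cheap reorthogonalization pass
        b = np.linalg.norm(w)
        alpha.append(a); beta.append(b)
        if b < 1e-12:
            break
        Q.append(w / b)
    T = (np.diag(alpha) + np.diag(beta[:len(alpha) - 1], 1)
                        + np.diag(beta[:len(alpha) - 1], -1))
    return np.linalg.eigvalsh(T)

A = np.diag(np.arange(1.0, 501.0))            # spectrum 1..500
ritz = lanczos_extreme_eigs(lambda v: A @ v, 500)
print(ritz[0], ritz[-1])                      # near 1 and 500
```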

Proceedings ArticleDOI
16 Dec 2002
TL;DR: These designs significantly reduce the latency as well as the area, and improve on previous designs in terms of the area/speed metric, where the speed denotes the maximum achievable running frequency.

Abstract: We develop new algorithms and architectures for matrix multiplication on configurable hardware. These designs significantly reduce the latency as well as the area. Our designs improve on previous designs in terms of the area/speed metric, where the speed denotes the maximum achievable running frequency. The area/speed metrics for the previous designs and our design are 14.45, 4.93, and 2.35, respectively, for 4 × 4 matrix multiplication. The latency of one of the previous designs is 0.57 μs, while our design takes 0.15 μs using 18% less area. The area of our designs is smaller by 11%-46% compared with the best known systolic designs with the same latency for matrices of sizes 3 × 3 to 12 × 12. The performance improvements tend to grow with the problem size.

01 Jan 2002
TL;DR: In this paper, the authors present a fast matrix multiplication algorithm taken from [10] in a refined compact "analytical" form and demonstrate that it can be implemented as quite efficient computer code.
Abstract: The main purpose of this paper is to present a fast matrix multiplication algorithm taken from [10] in a refined compact "analytical" form and to demonstrate that it can be implemented as quite efficient computer code. Our improved presentation enables us to simplify substantially the analysis of the computational complexity and numerical stability of the algorithm as well as its computer implementation. The algorithm multiplies two N × N matrices using O(N^2.7760) arithmetic operations. In the case where N = 18·48^k, for a positive integer k, the total number of flops required by the algorithm is 4.893N^2.7760 − 16.165N^2, which is quite competitive with a similar estimate for the Winograd algorithm, 3.732N^2.8074 − 5N^2 flops, N = 8·2^k, the latter being the current record bound among all known practical algorithms. Moreover, we present a pseudo-code of the algorithm which demonstrates its very moderate working memory requirements, much smaller than that of the best available implementations of the Strassen and Winograd algorithms. We also reexamine an algorithm from [11] with operation count 3.682N^2.7734 − 7.303N^2, N = 8·12^k, which performs well even for medium matrix sizes, e.g., N < 2000. For matrices of medium-large size (say, 2000 ≤ N < 10000) we consider one-level algorithms and compare them with the (multilevel) Strassen and Winograd algorithms. The results of numerical tests clearly indicate that our accelerated matrix multiplication routines implementing two or three disjoint product-based algorithms are comparable in computational time with an implementation of the Winograd algorithm and clearly outperform it with respect to working space and (especially) numerical stability. The tests were performed for matrices of order up to 7000, both in double and single precision.
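
For orientation, a compact recursive Strassen multiplication with the kind of recursion cutoff all such practical comparisons rely on (numpy; power-of-two sizes only for brevity; this is the standard baseline, not the paper's algorithm):

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen's algorithm: 7 recursive half-size multiplies instead
    of 8; falls back to ordinary multiplication below the cutoff,
    where the classical algorithm is faster in practice."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    C = np.empty_like(A)
    C[:m, :m] = M1 + M4 - M5 + M7
    C[:m, m:] = M3 + M5
    C[m:, :m] = M2 + M4
    C[m:, m:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(256, 256); B = np.random.rand(256, 256)
print(np.allclose(strassen(A, B), A @ B))  # True, up to rounding
```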

Journal ArticleDOI
TL;DR: Several algorithmic advances are made in this paper, including an oscillating iterative algorithm for matrix multiplication and a variable recursion cutoff criterion for Strassen's algorithm; the need to standardize linear algebra kernel interfaces, distinct from the BLAS, for writing portable high-performance code is also exposed.
Abstract: Despite extensive research, optimal performance has not easily been available previously for matrix multiplication (especially for large matrices) on most architectures because of the lack of a structured approach and the limitations imposed by matrix storage formats. A simple but effective framework is presented here that lays the foundation for building high-performance matrix-multiplication codes in a structured, portable and efficient manner. The resulting codes are validated on three different representative RISC and CISC architectures on which they significantly outperform highly optimized libraries such as ATLAS and other competing methodologies reported in the literature. The main component of the proposed approach is a hierarchical storage format that efficiently generalizes the applicability of the memory hierarchy friendly Morton ordering to arbitrary-sized matrices. The storage format supports polyalgorithms, which are shown here to be essential for obtaining the best possible performance for a range of problem sizes. Several algorithmic advances are made in this paper, including an oscillating iterative algorithm for matrix multiplication and a variable recursion cutoff criterion for Strassen's algorithm. The authors expose the need to standardize linear algebra kernel interfaces, distinct from the BLAS, for writing portable high-performance code. These kernel routines operate on small blocks that fit in the L1 cache. The performance advantages of the proposed framework can be effectively delivered to new and existing applications through the use of object-oriented or compiler-based approaches. Copyright © 2002 John Wiley & Sons, Ltd.

Patent
06 Jun 2002
TL;DR: A group of instructions, block4 and block4v, is described for a matrix processor (16) that rearranges data between vector and matrix forms of an A×B matrix of data (120), where the data matrix includes one or more 4×4 sub-matrices of data (160-166).
Abstract: This invention discloses a group of instructions, block4 and block4v, in a matrix processor (16) that rearranges data between vector and matrix forms of an A×B matrix of data (120), where the data matrix includes one or more 4×4 sub-matrices of data (160-166). The instructions simultaneously swap rows or columns between the first (140), second (142), third (144), and fourth (146) matrix registers, performing predefined matrix tensor operations on the data matrix from one of the following groups of operations: swapping rows between the individual matrix registers, or swapping columns between the individual matrix registers. Additionally, successive iterations or combinations of the block4 and/or block4v instructions perform standard tensor matrix operations from the following group: transpose, shuffle, and deal.

Book ChapterDOI
03 Jul 2002
TL;DR: An algorithm is given to solve the minimum cycle basis problem for regular matroids, based upon Seymour's decomposition theorem; the Gomory-Hu tree, which is essentially the solution for cographic matroids; and the corresponding result for graphs.
Abstract: An algorithm is given to solve the minimum cycle basis problem for regular matroids. The result is based upon Seymour's decomposition theorem for regular matroids; the Gomory-Hu tree, which is essentially the solution for cographic matroids; and the corresponding result for graphs. The complexity of the algorithm is O((n + m)^4), provided that a regular matroid is represented as a binary n × m matrix. The complexity decreases to O((n + m)^3.376) using fast matrix multiplication.

01 Jan 2002
TL;DR: In this paper, the perturbation theory for the eigenvalue problem of a formal matrix product A_1^(s_1) ··· A_p^(s_p), where all A_k are square and s_k ∈ {−1, 1}, is studied.
Abstract: We study the perturbation theory for the eigenvalue problem of a formal matrix product A_1^(s_1) ··· A_p^(s_p), where all A_k are square and s_k ∈ {−1, 1}. We generalize the classical perturbation results for matrices and matrix pencils to perturbation results for generalized deflating subspaces and eigenvalues of such formal matrix products. As an application we then extend the structured perturbation theory for the eigenvalue problem of Hamiltonian matrices to Hamiltonian/skew-Hamiltonian pencils. AMS subject classification: 65F15, 93B40, 93B60, 65H17.

BookDOI
01 Jan 2002
TL;DR: In this paper, the authors present a model of planar and spatial rigid-body systems with a general universal joint and a set of constraints, including the shortest distance between two rotation axes.
Abstract: Table of contents:
1. Introduction
2. Planar and spatial vectors, matrices, and vector functions
3. Constraint equations and constraint reaction forces of mechanisms
4. Dynamics of planar and spatial rigid-body systems
5. Model equations of planar and spatial joints
6. Constitutive relations of planar and spatial external forces and torques
A. Appendix
A.1 Special vector and matrix operations used in mechanics
A.1.1 Euclidean vector space
A.1.2 Scalar product and cross product of planar vectors
A.1.3 Cross product of spatial vectors
A.1.4 Time derivatives of planar orientation matrices and of planar vectors in different frames
A.1.5 Time derivatives of spatial orientation matrices and of spatial vectors in different frames
A.1.6 Derivatives of vector functions
A.2.1 Kinetic energy of an unconstrained rigid body
A.2.3 Spatial equations of motion of a constrained rigid body
A.4 Constraint equations of a general universal joint
A.4.1 Notation and abbreviations
A.4.2 Computation of constraint equations
A.4.2.1 First constraint equation
A.4.2.2 Second constraint equation
A.4.2.3 Third constraint equation
A.4.2.4 Fourth constraint equation
A.4.3 Computation of the shortest distance between two rotation axes
References
List of figures

Patent
03 Sep 2002
TL;DR: In this article, an integrated VMM (vector-matrix multiplier) module, including an electro-optical VMM component that multiplies an input vector by a matrix to produce an output vector, and an electronic VPU (vector processing unit) that processes at least one of the input and output vectors are discussed.
Abstract: An integrated VMM (vector-matrix multiplier) module, including an electro-optical VMM component that multiplies an input vector by a matrix to produce an output vector; and an electronic VPU (vector processing unit) that processes at least one of the input and output vectors. Various error reducing mechanisms are also discussed.

Patent
04 Jan 2002
TL;DR: Matrices to be used for the random orthogonal transformation of blocks of data in a transmission chain are generated by dividing a square matrix with orthogonal rows and columns into M matrices, each having M*n rows and n columns, where M is an integer larger than one.
Abstract: Matrices to be used for the random orthogonal transformation of blocks of data in a transmission chain are generated. A square matrix with orthogonal column vectors and orthogonal row vectors is divided to create M matrices. The number of rows of each of these matrices is equal to M*n, where n is the number of columns of each of the matrices and M is an integer larger than one. Each of the M matrices is allocated to a transmitter in a transmission chain or, alternatively, a plurality of the M matrices are allocated to one base station of a wireless transmission system.

Journal ArticleDOI
TL;DR: In this article, the perturbation theory for the eigenvalue problem of a formal matrix product A_1^(s_1) ··· A_p^(s_p), where all A_k are square and s_k ∈ {−1, 1}, is studied.
Abstract: We study the perturbation theory for the eigenvalue problem of a formal matrix product A_1^(s_1) ··· A_p^(s_p), where all A_k are square and s_k ∈ {−1, 1}. We generalize the classical perturbation results for matrices and matrix pencils to perturbation results for generalized deflating subspaces and eigenvalues of such formal matrix products. As an application we then extend the structured perturbation theory for the eigenvalue problem of Hamiltonian matrices to Hamiltonian/skew-Hamiltonian pencils.

Journal ArticleDOI
TL;DR: This paper proposes an efficient methodology to evaluate the reliability of large and complex systems based on minimal path sets and presents an improved multi-variable inversion (MVI) algorithm to evaluate system reliability in a compact form.
Abstract: Reliability evaluation of a large and complex system is quite an involved and time-consuming process, and its state of the art is far from satisfactory. This is mainly due to the fact that unionizing path sets results in a large number of terms in the reliability expression. Thereafter, the process of computing the numerical value of system reliability from its expression is not free from the build-up of round-off errors. The demands of the entire process also rule out the use of a low-end PC for computing the system reliability of such systems. In this paper, we propose an efficient methodology to evaluate the reliability of large and complex systems based on minimal path sets; the path sets enumeration procedure used in this paper generates path sets in lexicographic and increasing order of cardinality, a condition which is helpful in obtaining the sum of disjoint products (SDP) form of the system reliability expression in a compact manner. Although we make use of the system connection matrix, no complicated matrix operations are performed to obtain the results. The paper further presents an improved multi-variable inversion (MVI) algorithm to evaluate system reliability in a compact form. Our approach offers an extensive reduction in the number of mutually disjoint terms and provides a minimized and compact system reliability expression. The procedure not only results in substantial savings of CPU time but can also be run on a low-end PC. To demonstrate this capability, we solve several problems of varied complexities on a low-end PC and also provide a comparison of our approach with earlier techniques available for the purpose.
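
A tiny illustration of the computation being optimized: brute-force inclusion-exclusion over minimal path sets, whose exponential growth in the number of path sets is precisely what the SDP/MVI approach avoids (Python; the toy network and probabilities are made up):

```python
from itertools import combinations

def reliability(path_sets, p):
    """System reliability from minimal path sets via inclusion-exclusion:
    P(at least one path has all its components working), components
    failing independently. Cost grows as 2^len(path_sets)."""
    total = 0.0
    for r in range(1, len(path_sets) + 1):
        for subset in combinations(path_sets, r):
            comps = set().union(*subset)
            term = 1.0
            for c in comps:
                term *= p[c]
            total += (-1) ** (r + 1) * term
    return total

# Two parallel series paths: {a, b} and {c, d}.
p = {'a': 0.9, 'b': 0.9, 'c': 0.8, 'd': 0.8}
print(reliability([{'a', 'b'}, {'c', 'd'}], p))
# 0.81 + 0.64 - 0.81 * 0.64 = 0.9316
```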

Book ChapterDOI
02 Sep 2002
TL;DR: These designs significantly reduce the energy dissipation and latency compared with the state-of-the-art FPGA-based designs and improve the energy performance of the optimized design from the recent Xilinx library by 32% to 88% without any increase in area-latency product.
Abstract: We develop new algorithms and architectures for matrix multiplication on configurable devices. These designs significantly reduce the energy dissipation and latency compared with the state-of-the-art FPGA-based designs. We derive functions to represent the impact of algorithmic level design choices on the system-wide energy dissipation, latency, and area by capturing algorithm and architecture details including features of the target FPGA. The functions are used to optimize energy performance under latency and area constraints for a family of candidate algorithms and architectures. As a result, our designs improve the energy performance of the optimized design from the recent Xilinx library by 32% to 88% without any increase in area-latency product. In terms of comprehensive metrics such as EAT (Energy-Area-Time) and E/AT (Energy/Area-Time), our designs offer superior performance compared with the Xilinx design by 50%-79% and 13%-44%, respectively. We also address how to exploit further increases in density of future FPGA devices for asymptotic improvement in latency and energy dissipation for multiplication of larger size matrices.

Proceedings ArticleDOI
16 Dec 2002
TL;DR: FPGAs can multiply two n × n matrices with both lower latency and lower energy consumption than the other two types of devices, which makes FPGAs the ideal choice for matrix multiplication in signal processing applications.
Abstract: Advances in their underlying technologies have positioned FPGAs and embedded processors to compete with digital signal processors (DSPs). In this paper, we evaluate the performance, in terms of both latency and energy-efficiency, of FPGAs, embedded processors, and DSPs in multiplying two n × n matrices. As specific examples, we have chosen a representative of each type of device. Our results show that FPGAs can multiply two n × n matrices with both lower latency and lower energy consumption than the other two types of devices. This makes FPGAs the ideal choice for matrix multiplication in signal processing applications.

Patent
04 Sep 2002
TL;DR: In this paper, a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result is proposed, which can fully utilize the entire resources of a 128b by 128b multiplier regardless of the operand size.
Abstract: The present invention provides a system and method for improving the performance of general-purpose processors by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the entire resources of a 128b by 128b multiplier regardless of the operand size, as the number of elements of the matrix and vector operands increase as operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest-possible intermediate accuracy with modest resources.

Journal ArticleDOI
TL;DR: In this paper, the Khatri-Rao and Tracy-Singh products for partitioned matrices are viewed as generalized Hadamard and generalized Kronecker products, respectively.
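
In the most common (column-partitioned) special case, the Khatri-Rao product reduces to the column-wise Kronecker product; a quick numerical check (scipy.linalg.khatri_rao implements this case, while the paper treats general block partitions):

```python
import numpy as np
from scipy.linalg import khatri_rao

A = np.arange(6.0).reshape(2, 3)
B = np.arange(9.0).reshape(3, 3)

# Column j of the result is kron(A[:, j], B[:, j]).
KR = khatri_rao(A, B)
manual = np.column_stack([np.kron(A[:, j], B[:, j]) for j in range(3)])
print(np.allclose(KR, manual), KR.shape)  # True (6, 3)
```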

Journal ArticleDOI
TL;DR: In this paper, an optimal set of vectors with a specified inner product structure is constructed from a given set of vectors in a complex Hilbert space, and the optimal vectors are chosen to minimize the sum of the squared norms of the errors between the constructed vectors and the given vectors.
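
In the special case where the specified structure is orthonormality, the least-squares-optimal construction is the classical symmetric (Löwdin) orthogonalization, computable from an SVD; a sketch of that instance only (numpy; the paper handles general inner product structures):

```python
import numpy as np

def nearest_orthonormal(X):
    """Orthonormal columns minimizing the sum of squared distances to
    the columns of X: the polar factor U @ Vt from the SVD of X."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 3))
Q = nearest_orthonormal(X)
print(np.allclose(Q.T @ Q, np.eye(3)))  # orthonormal columns
print(np.linalg.norm(Q - X))            # minimal over all such Q
```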