Journal Article

Elemental: A New Framework for Distributed Memory Dense Matrix Computations

TLDR
The paper reviews lessons learned since the introduction of ScaLAPACK and PLAPACK and proposes a simple yet effective alternative. Preliminary performance results show the new solution achieves competitive, if not superior, performance on large clusters.
Abstract
Parallelizing dense matrix computations to distributed memory architectures is a well-studied subject and generally considered to be among the best understood domains of parallel computing. Two packages, developed in the mid 1990s, still enjoy regular use: ScaLAPACK and PLAPACK. With the advent of many-core architectures, which may very well take the shape of distributed memory architectures within a single processor, these packages must be revisited since the traditional MPI-based approaches will likely need to be extended. Thus, this is a good time to review lessons learned since the introduction of these two packages and to propose a simple yet effective alternative. Preliminary performance results show the new solution achieves competitive, if not superior, performance on large clusters.


Citations

How to… home page

Challenger Tafe, +1 more
TL;DR: In this article, the authors developed a center to address state-of-the-art research, create innovative educational programs, and support technology transfer of commercially viable results, assisting the Army Research Laboratory in developing the next-generation Future Combat System in the telecommunications sector, which is intended to assure prevention of perceived threats and non-line-of-sight/beyond-line-of-sight lethal support.
Proceedings Article

Dask: Parallel Computation with Blocked algorithms and Task Scheduling

TL;DR: This work couples blocked algorithms with dynamic, memory-aware task scheduling to achieve a parallel and out-of-core NumPy clone, and shows how this extends the effective scale of modern hardware to larger datasets.
Journal Article

BLIS: A Framework for Rapidly Instantiating BLAS Functionality

TL;DR: Preliminary performance of level-2 and level-3 operations is observed to be competitive with two mature open source libraries (OpenBLAS and ATLAS) as well as an established commercial product (Intel MKL).
Journal Article

The ELPA library: scalable parallel eigenvalue solutions for electronic structure theory and computational science

TL;DR: ELPA (Eigenvalue soLvers for Petascale Applications) is a library for solving symmetric and Hermitian eigenvalue problems for dense matrices with real-valued or complex-valued entries.
References
Book

The algebraic eigenvalue problem

TL;DR: Contents: Theoretical background; Perturbation theory; Error analysis; Solution of linear algebraic equations; Hermitian matrices; Reduction of a general matrix to condensed form; Eigenvalues of matrices of condensed forms; The LR and QR algorithms; Iterative methods; Bibliography.
Book

LAPACK Users' Guide

Ed Anderson
TL;DR: The third edition of the LAPACK Users' Guide covers installation and troubleshooting of the routines and includes guidance on converting from LINPACK or EISPACK to LAPACK.
Book

MPI: The Complete Reference

TL;DR: MPI: The Complete Reference is an annotated manual for the latest 1.1 version of the standard that illuminates the more advanced and subtle features of MPI and covers such advanced issues in parallel computing and programming as true portability, deadlock, high-performance message passing, and libraries for distributed and parallel computing.
Journal Article

A set of level 3 basic linear algebra subprograms

TL;DR: This paper describes an extension to the set of Basic Linear Algebra Subprograms targeted at matrix-matrix operations that should provide for efficient and portable implementations of algorithms for high-performance computers.
