Home
/
Authors
/
Michael A. Heroux

Author

Michael A. Heroux

Other affiliations: University of Tennessee, Cray, Colorado State University

Bio: Michael A. Heroux is an academic researcher from Sandia National Laboratories. The author has contributed to research in topics: Solver & Software. The author has an hindex of 33, co-authored 150 publications receiving 6126 citations. Previous affiliations of Michael A. Heroux include University of Tennessee & Cray.

Topics: Solver, Software, Exascale computing, Fault tolerance, Iterative method ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
1999
1998
1995
1994
1993
1992
1991
1989
1986

Papers

PDF

Open Access

More filters

Journal Article•DOI•

An overview of the Trilinos project

[...]

Michael A. Heroux¹, Roscoe A. Bartlett¹, Vicki E. Howle¹, Robert J. Hoekstra¹, Jonathan Joseph Hu¹, Tamara G. Kolda¹, Richard B. Lehoucq¹, Kevin Long¹, Roger P. Pawlowski¹, Eric T. Phipps¹, Andrew G. Salinger¹, Heidi K. Thornquist¹, Raymond S. Tuminaro¹, James M. Willenbring¹, Alan B. Williams¹, Kendall S. Stanley² - Show less +12 more•Institutions (2)

Sandia National Laboratories¹, Oberlin College²

01 Sep 2005-ACM Transactions on Mathematical Software

TL;DR: The overall Trilinos design is presented, describing the use of abstract interfaces and default concrete implementations and how packages can be combined to rapidly develop new algorithms.

...read moreread less

Abstract: The Trilinos Project is an effort to facilitate the design, development, integration, and ongoing support of mathematical software libraries within an object-oriented framework for the solution of large-scale, complex multiphysics engineering and scientific problems. Trilinos addresses two fundamental issues of developing software for these problems: (i) providing a streamlined process and set of tools for development of new algorithmic implementations and (ii) promoting interoperability of independently developed software.Trilinos uses a two-level software structure designed around collections of packages. A Trilinos package is an integral unit usually developed by a small team of experts in a particular algorithms area such as algebraic preconditioners, nonlinear solvers, etc. Packages exist underneath the Trilinos top level, which provides a common look-and-feel, including configuration, documentation, licensing, and bug-tracking.Here we present the overall Trilinos design, describing our use of abstract interfaces and default concrete implementations. We discuss the services that Trilinos provides to a prospective package and how these services are used by various packages. We also illustrate how packages can be combined to rapidly develop new algorithms. Finally, we discuss how Trilinos facilitates high-quality software engineering practices that are increasingly required from simulation software.

...read moreread less

1,109 citations

Journal Article•DOI•

The International Exascale Software Project roadmap

[...]

Jack Dongarra¹, Pete Beckman¹, Terry Moore¹, Patrick Aerts¹, Giovanni Aloisio¹, Jean-Claude Andre¹, David Barkai¹, Jean-Yves Berthou¹, Taisuke Boku¹, Bertrand Braunschweig¹, Franck Cappello¹, Barbara Chapman¹, Xuebin Chi¹, Alok Choudhary¹, Sudip S. Dosanjh¹, Thom H. Dunning¹, Sandro Fiore¹, Al Geist¹, Bill Gropp¹, Robert W. Harrison¹, Mark Hereld¹, Michael A. Heroux¹, Adolfy Hoisie¹, Koh Hotta¹, Zhong Jin¹, Yutaka Ishikawa¹, Fred Johnson¹, Sanjay Kale¹, Richard Kenway¹, David E. Keyes¹, Bill Kramer¹, Jesús Labarta¹, Alain Lichnewsky¹, Thomas Lippert¹, Bob Lucas¹, Barney Maccabe¹, Satoshi Matsuoka¹, Paul Messina¹, Peter Michielse¹, Bernd Mohr¹, Matthias S. Mueller¹, Wolfgang E. Nagel¹, Hiroshi Nakashima¹, Michael E. Papka¹, Daniel A. Reed¹, Mitsuhisa Sato¹, Edward Seidel¹, John Shalf¹, David Skinner¹, Marc Snir¹, Thomas Sterling¹, Rick Stevens¹, Frederick H. Streitz¹, Bob Sugar¹, Shinji Sumimoto¹, William Tang¹, John Taylor¹, Rajeev Thakur¹, Anne E. Trefethen¹, Mateo Valero¹, Aad J. van der Steen¹, Jeffrey S. Vetter¹, Peg Williams¹, Robert W. Wisniewski¹, Katherine Yelick¹ - Show less +61 more•Institutions (1)

University of Tennessee¹

01 Feb 2011

TL;DR: The work of the community to prepare for the challenges of exascale computing is described, ultimately combing their efforts in a coordinated International Exascale Software Project.

...read moreread less

Abstract: Over the last 20 years, the open-source community has provided more and more software on which the worldâs high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. However, although the investments in these separate software elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and key integration of technologies necessary to make them work together smoothly and efficiently, both within individual petascale systems and between different systems. It seems clear that this completely uncoordinated development model will not provide the software needed to support the unprecedented parallelism required for peta/ exascale computation on millions of cores, or the flexibility required to exploit new hardware models and features, such as transactional memory, speculative execution, and graphics processing units. This report describes the work of the community to prepare for the challenges of exascale computing, ultimately combing their efforts in a coordinated International Exascale Software Project.

...read moreread less

736 citations

Journal Article•

An Updated Set of Basic Linear Algebra Subprograms (BLAS)

[...]

Susan Blackford, James Demmel, Jack Dongarra, Iain S. Duff, Sven Hammarling, Greg Henry, Michael A. Heroux, Linda Kaufman, Andrew Lumsdaine, A. Petitet, Roldan Pozo, Karin A Remington, Clint Whaley - Show less +9 more

01 Jun 2002-ACM Transactions on Mathematical Software

TL;DR: In this paper, the authors present a list of the companies that have contributed to the development of the Numerical Algorithms Group (NALG), including Intel, Sandia National Laboratories, and IBM.

...read moreread less

Abstract: L. SUSAN BLACKFORD Myricom, Inc. JAMES DEMMEL University of California, Berkeley JACK DONGARRA The University of Tennessee IAIN DUFF Rutherford Appleton Laboratory and CERFACS SVEN HAMMARLING Numerical Algorithms Group, Ltd. GREG HENRY Intel Corporation MICHAEL HEROUX Sandia National Laboratories LINDA KAUFMAN William Patterson University ANDREW LUMSDAINE Indiana University ANTOINE PETITET Sun Microsystems ROLDAN POZO National Institute of Standards and Technology KARIN REMINGTON The Center for Advancement of Genomics and R. CLINT WHALEY Florida State University

...read moreread less

595 citations

Report•DOI•

Improving Performance via Mini-applications

[...]

Paul Stewart Crozier, Heidi K. Thornquist, Robert W. Numrich¹, Alan B. Williams, Harold C. Edwards, Eric R. Keiter, Mahesh Rajan, James M. Willenbring, Douglas W. Doerfler, Michael A. Heroux - Show less +6 more•Institutions (1)

University of Minnesota¹

01 Sep 2009

TL;DR: This paper discusses a collection of mini-applications and demonstrates how they are used to analyze and improve application performance on new and future computer platforms.

...read moreread less

Abstract: Application performance is determined by a combination of many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, we find that the use of mini-applications - small self-contained proxies for real applications - is an excellent approach for rapidly exploring the parameter space of all these choices. Furthermore, use of mini-applications enriches the interaction between application, library and computer system developers by providing explicit functioning software and concrete performance results that lead to detailed, focused discussions of design trade-offs, algorithm choices and runtime performance issues. In this paper we discuss a collection of mini-applications and demonstrate how we use them to analyze and improve application performance on new and future computer platforms.

...read moreread less

462 citations

Report•DOI•

An overview of Trilinos.

[...]

Kevin Long, Raymond S. Tuminaro, Roscoe A. Bartlett, Robert J. Hoekstra, Eric T. Phipps, Tamara G. Kolda, Richard B. Lehoucq, Heidi K. Thornquist, Jonathan Joseph Hu, Alan B. Williams, Andrew G. Salinger, Victoria E. Howle, Roger P. Pawlowski, James M. Willenbring, Michael A. Heroux - Show less +11 more

01 Aug 2003

TL;DR: The Trilinos Project is an effort to develop parallel solver algorithms and libraries within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific applications.

...read moreread less

Abstract: The Trilinos Project is an effort to facilitate the design, development, integration and ongoing support of mathematical software libraries. In particular, our goal is to develop parallel solver algorithms and libraries within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific applications. Our emphasis is on developing robust, scalable algorithms in a software framework, using abstract interfaces for flexible interoperability of components while providing a full-featured set of concrete classes that implement all abstract interfaces. Trilinos uses a two-level software structure designed around collections of packages. A Trilinos package is an integral unit usually developed by a small team of experts in a particular algorithms area such as algebraic preconditioners, nonlinear solvers, etc. Packages exist underneath the Trilinos top level, which provides a common look-and-feel, including configuration, documentation, licensing, and bug-tracking. Trilinos packages are primarily written in C++, but provide some C and Fortran user interface support. We provide an open architecture that allows easy integration with other solver packages and we deliver our software to the outside community via the Gnu Lesser General Public License (LGPL). This report provides an overview of Trilinos, discussing the objectives, history, current development and future plans of the project.

...read moreread less

348 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32

Collapse

Cited by

PDF

Open Access

More filters

Fast parallel algorithms for short-range molecular dynamics

[...]

Steven J. Plimpton¹•Institutions (1)

Sandia National Laboratories¹

01 May 1993

TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.

...read moreread less

Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

...read moreread less

29,323 citations

Journal Article•

The Observation of Gravitational Waves from a Binary Black Hole Merger

[...]

Duncan A. Brown

16 Mar 2016-Bulletin of the American Physical Society

TL;DR: The first direct detection of gravitational waves and the first observation of a binary black hole merger were reported in this paper, with a false alarm rate estimated to be less than 1 event per 203,000 years, equivalent to a significance greater than 5.1σ.

...read moreread less

Abstract: On September 14, 2015 at 09:50:45 UTC the two detectors of the Laser Interferometer Gravitational-Wave Observatory simultaneously observed a transient gravitational-wave signal. The signal sweeps upwards in frequency from 35 to 250 Hz with a peak gravitational-wave strain of 1.0×10(-21). It matches the waveform predicted by general relativity for the inspiral and merger of a pair of black holes and the ringdown of the resulting single black hole. The signal was observed with a matched-filter signal-to-noise ratio of 24 and a false alarm rate estimated to be less than 1 event per 203,000 years, equivalent to a significance greater than 5.1σ. The source lies at a luminosity distance of 410(-180)(+160) Mpc corresponding to a redshift z=0.09(-0.04)(+0.03). In the source frame, the initial black hole masses are 36(-4)(+5)M⊙ and 29(-4)(+4)M⊙, and the final black hole mass is 62(-4)(+4)M⊙, with 3.0(-0.5)(+0.5)M⊙c(2) radiated in gravitational waves. All uncertainties define 90% credible intervals. These observations demonstrate the existence of binary stellar-mass black hole systems. This is the first direct detection of gravitational waves and the first observation of a binary black hole merger.

...read moreread less

4,375 citations

Book•

Automated Solution of Differential Equations by the Finite Element Method: The FEniCS Book

[...]

Anders Logg, Kent-Andre Mardal, Garth N. Wells

24 Feb 2012

TL;DR: This book is a tutorial written by researchers and developers behind the FEniCS Project and explores an advanced, expressive approach to the development of mathematical software.

...read moreread less

Abstract: This book is a tutorial written by researchers and developers behind the FEniCS Project and explores an advanced, expressive approach to the development of mathematical software. The presentation spans mathematical background, software design and the use of FEniCS in applications. Theoretical aspects are complemented with computer code which is available as free/open source software. The book begins with a special introductory tutorial for beginners. Followingare chapters in Part I addressing fundamental aspects of the approach to automating the creation of finite element solvers. Chapters in Part II address the design and implementation of the FEnicS software. Chapters in Part III present the application of FEniCS to a wide range of applications, including fluid flow, solid mechanics, electromagnetics and geophysics.

...read moreread less

2,372 citations

The Landscape of Parallel Computing Research: A View from Berkeley

[...]

Krste Asanovic, Ras Bodik, Bryan Catanzaro, Joseph Gebis, Parry Husbands, Kurt Keutzer, David A. Patterson, William Plishker, John Shalf, Samuel Williams, Katherine Yelick - Show less +7 more

18 Dec 2006

TL;DR: The parallel landscape is frame with seven questions, and the following are recommended to explore the design space rapidly: • The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems • The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (Million Instructions per Second) per watt, MIPS per area of silicon, and MIPS each development dollar.

...read moreread less

Abstract: Author(s): Asanovic, K; Bodik, R; Catanzaro, B; Gebis, J; Husbands, P; Keutzer, K; Patterson, D; Plishker, W; Shalf, J; Williams, SW | Abstract: The recent switch to parallel microprocessors is a milestone in the history of computing. Industry has laid out a roadmap for multicore designs that preserves the programming paradigm of the past via binary compatibility and cache coherence. Conventional wisdom is now to double the number of cores on a chip with each silicon generation. A multidisciplinary group of Berkeley researchers met nearly two years to discuss this change. Our view is that this evolutionary approach to parallel hardware and software may work from 2 or 8 processor systems, but is likely to face diminishing returns as 16 and 32 processor systems are realized, just as returns fell with greater instruction-level parallelism. We believe that much can be learned by examining the success of parallelism at the extremes of the computing spectrum, namely embedded computing and high performance computing. This led us to frame the parallel landscape with seven questions, and to recommend the following: • The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems • The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (Million Instructions per Second) per watt, MIPS per area of silicon, and MIPS per development dollar. • Instead of traditional benchmarks, use 13 “Dwarfs” to design and evaluate parallel programming models and architectures. (A dwarf is an algorithmic method that captures a pattern of computation and communication.) • “Autotuners” should play a larger role than conventional compilers in translating parallel programs. • To maximize programmer productivity, future programming models must be more human-centric than the conventional focus on hardware or applications. • To be successful, programming models should be independent of the number of processors. • To maximize application efficiency, programming models should support a wide range of data types and successful models of parallelism: task-level parallelism, word-level parallelism, and bit-level parallelism. 1 The Landscape of Parallel Computing Research: A View From Berkeley • Architects should not include features that significantly affect performance or energy if programmers cannot accurately measure their impact via performance counters and energy counters. • Traditional operating systems will be deconstructed and operating system functionality will be orchestrated using libraries and virtual machines. • To explore the design space rapidly, use system emulators based on Field Programmable Gate Arrays (FPGAs) that are highly scalable and low cost. Since real world applications are naturally parallel and hardware is naturally parallel, what we need is a programming model, system software, and a supporting architecture that are naturally parallel. Researchers have the rare opportunity to re-invent these cornerstones of computing, provided they simplify the efficient programming of highly parallel systems.

...read moreread less

2,262 citations

Journal Article•DOI•

Numerical solution of saddle point problems

[...]

Michele Benzi¹, Gene H. Golub², Jörg Liesen³•Institutions (3)

Emory University¹, Stanford University², Technical University of Berlin³

01 May 2005-Acta Numerica

TL;DR: A large selection of solution methods for linear systems in saddle point form are presented, with an emphasis on iterative methods for large and sparse problems.

...read moreread less

Abstract: Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has been a surge of interest in saddle point problems, and numerous solution techniques have been proposed for this type of system. The aim of this paper is to present and discuss a large selection of solution methods for linear systems in saddle point form, with an emphasis on iterative methods for large and sparse problems.

...read moreread less

2,253 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse