Author

Patrick G. Bridges

Other affiliations: University of Arizona, Sandia National Laboratories

Bio: Patrick G. Bridges is an academic researcher from the University of New Mexico. The author has contributed to research on system software and virtualization, has an h-index of 22, and has co-authored 89 publications receiving 2204 citations. Previous affiliations of Patrick G. Bridges include the University of Arizona and Sandia National Laboratories.

Topics: System software, Virtualization, Scalability, Fault tolerance, Software ...

Papers published on a yearly basis (chart; years shown: 1996, 1997, 1999 to 2023)

Papers

Proceedings Article

Evaluating the viability of process replication reliability for exascale systems

Kurt B. Ferreira¹, Jon Stearley¹, James H. Laros¹, Ron A. Oldfield¹, Kevin Pedretti¹, Ron Brightwell¹, Rolf Riesen², Patrick G. Bridges³, Dorian Arnold³

Sandia National Laboratories¹, IBM², University of New Mexico³

12 Nov 2011

TL;DR: Results show that state machine replication is a potentially useful technique for meeting the fault tolerance demands of HPC applications on future exascale platforms.

Abstract: As high-end computing machines continue to grow in size, issues such as fault tolerance and reliability limit application scalability. Current techniques to ensure progress across faults, like checkpoint-restart, are increasingly problematic at these scales due to excessive overheads predicted to more than double an application's time to solution. Replicated computing techniques, particularly state machine replication, long used in distributed and mission critical systems, have been suggested as an alternative to checkpoint-restart. In this paper, we evaluate the viability of using state machine replication as the primary fault tolerance mechanism for upcoming exascale systems. We use a combination of modeling, empirical analysis, and simulation to study the costs and benefits of this approach in comparison to checkpoint/restart on a wide range of system parameters. These results, which cover different failure distributions, hardware mean time to failures, and I/O bandwidths, show that state machine replication is a potentially useful technique for meeting the fault tolerance demands of HPC applications on future exascale platforms.

250 citations

Proceedings Article

Characterizing application sensitivity to OS interference using kernel-level noise injection

Kurt B. Ferreira¹, Patrick G. Bridges¹, Ron Brightwell²

University of New Mexico¹, Sandia National Laboratories²

15 Nov 2008

TL;DR: This paper examines the sensitivity of real-world, large-scale applications to a range of OS noise patterns using a kernel-based noise injection mechanism implemented in the Catamount lightweight kernel, and demonstrates the importance of how noise is generated, in terms of frequency and duration, and how this impact changes with application scale.

Abstract: Operating system noise has been shown to be a key limiter of application scalability in high-end systems. While several studies have attempted to quantify the sources and effects of system interference using user-level mechanisms, there are few published studies on the effect of different kinds of kernel-generated noise on application performance at scale. In this paper, we examine the sensitivity of real-world, large-scale applications to a range of OS noise patterns using a kernel-based noise injection mechanism implemented in the Catamount lightweight kernel. Our results demonstrate the importance of how noise is generated, in terms of frequency and duration, and how this impact changes with application scale. For example, our results show that 2.5% net processor noise at 10,000 nodes can have no impact or can result in over a factor of 20 slowdown for the same application, depending solely on how the noise is generated. We also discuss how the characteristics of the applications we studied, for example computation/communication ratios, collective communication sizes, and other characteristics, related to their tendency to amplify or absorb noise. Finally, we discuss the implications of our findings on the design of new operating systems, middleware, and other system services for high-end parallel systems.

216 citations

Proceedings Article

Palacios and Kitten: New high performance operating systems for scalable virtualized and native supercomputing

John R. Lange¹, Kevin Pedretti², Trammell Hudson², Peter A. Dinda¹, Zheng Cui³, Lei Xia¹, Patrick G. Bridges³, Andy Gocke¹, Steven Jaconette¹, Mike Levenhagen², Ron Brightwell²

Northwestern University¹, Sandia National Laboratories², University of New Mexico³

19 Apr 2010

TL;DR: This work describes the design, implementation, and integration of Palacios, a new open-source VMM under development at Northwestern University and the University of New Mexico that enables applications executing in a virtualized environment to achieve scalable high performance on large machines.

Abstract: Palacios is a new open-source VMM under development at Northwestern University and the University of New Mexico that enables applications executing in a virtualized environment to achieve scalable high performance on large machines. Palacios functions as a modularized extension to Kitten, a high performance operating system being developed at Sandia National Laboratories to support large-scale supercomputing applications. Together, Palacios and Kitten provide a thin layer over the hardware to support full-featured virtualized environments alongside Kitten's lightweight native environment. Palacios supports existing, unmodified applications and operating systems by using the hardware virtualization technologies in recent AMD and Intel processors. Additionally, Palacios leverages Kitten's simple memory management scheme to enable low-overhead pass-through of native devices to a virtualized environment. We describe the design, implementation, and integration of Palacios and Kitten. Our benchmarks show that Palacios provides near native (within 5%), scalable performance for virtualized environments running important parallel applications. This new architecture provides an incremental path for applications to use supercomputers, running specialized lightweight host operating systems, that is not significantly performance-compromised.

170 citations

Proceedings Article

Toba: Java For Applications: A Way Ahead of Time (WAT) Compiler

Todd A. Proebsting¹, Gregg M. Townsend¹, Patrick G. Bridges¹, John H. Hartman¹, Tim Newsham¹, Scott A. Watterson¹

University of Arizona¹

16 Jun 1997

TL;DR: Toba is a system for generating efficient standalone Java applications that includes a Java-bytecode-to-C compiler, a garbage collector, a threads package, and Java API support.

Abstract: Toba is a system for generating efficient standalone Java applications. Toba includes a Java-bytecode-to-C compiler, a garbage collector, a threads package, and Java API support. Toba-compiled Java applications execute 1.5-4.2 times faster than interpreted and Just-In-Time compiled applications.

145 citations

Fault-tolerant iterative methods via selective reliability.

Kurt B. Ferreira¹, Patrick G. Bridges¹, Michael A. Heroux¹, Mark Hoemmen¹

Sandia National Laboratories¹

01 Jun 2011

TL;DR: This work shows that if the system lets applications apply reliability selectively, they can develop iterations that compute the right answer despite faults, and illustrates convergence for a sample algorithm, Fault-Tolerant GMRES, for representative test problems and fault rates.

Abstract: Current iterative methods for solving linear equations assume reliability of data (no “bit flips”) and arithmetic (correct up to rounding error). If faults occur, the solver usually either aborts, or computes the wrong answer without indication. System reliability guarantees consume energy or reduce performance. As processor counts continue to grow, these costs will become unbearable. Instead, we show that if the system lets applications apply reliability selectively, we can develop iterations that compute the right answer despite faults. These “fault-tolerant” methods either converge eventually, at a rate that degrades gracefully with increased fault rate, or return a clear failure indication in the rare case that they cannot converge. If faults are infrequent, these algorithms spend most of their time in unreliable mode. This can save energy, improve performance, and avoid restarting from checkpoints. We illustrate convergence for a sample algorithm, Fault-Tolerant GMRES, for representative test problems and fault rates.

113 citations

Cited by

Fast parallel algorithms for short-range molecular dynamics

Steven J. Plimpton¹

Sandia National Laboratories¹

01 May 1993

TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.

Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently: those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers: the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 times faster than a single Cray C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations

Journal Article

The Click modular router

Eddie Kohler¹, Robert Morris¹, Benjie Chen¹, John Jannotti¹, M. Frans Kaashoek¹

Massachusetts Institute of Technology¹

01 Aug 2000 - ACM Transactions on Computer Systems

TL;DR: On conventional PC hardware, the Click IP router achieves a maximum loss-free forwarding rate of 333,000 64-byte packets per second, demonstrating that Click's modular and flexible architecture is compatible with good performance.

Abstract: Click is a new software architecture for building flexible and configurable routers. A Click router is assembled from packet processing modules called elements. Individual elements implement simple router functions like packet classification, queuing, scheduling, and interfacing with network devices. A router configuration is a directed graph with elements at the vertices; packets flow along the edges of the graph. Several features make individual elements more powerful and complex configurations easier to write, including pull connections, which model packet flow driven by transmitting hardware devices, and flow-based router context, which helps an element locate other interesting elements. Click configurations are modular and easy to extend. A standards-compliant Click IP router has 16 elements on its forwarding path; some of its elements are also useful in Ethernet switches and IP tunnelling configurations. Extending the IP router to support dropping policies, fairness among flows, or Differentiated Services simply requires adding a couple of elements at the right place. On conventional PC hardware, the Click IP router achieves a maximum loss-free forwarding rate of 333,000 64-byte packets per second, demonstrating that Click's modular and flexible architecture is compatible with good performance.

2,595 citations

The Transmission Control Protocol.

Aleksander Malinowski, Bogdan M. Wilamowski

01 Jan 2005

1,360 citations

Proceedings Article

Soot: a Java bytecode optimization framework

Raja Vallée-Rai¹, Phong Co¹, Etienne Gagnon¹, Laurie Hendren¹, Patrick Lam¹, Vijay Sundaresan¹

McGill University¹

01 Nov 2010

TL;DR: Soot, a framework for optimizing Java bytecode, is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of bytecode which is simple to manipulate; Jimple, a typed 3-address intermediate representation suitable for optimization; and Grimp, an aggregated version of Jimple suitable for decompilation.

Abstract: This paper presents Soot, a framework for optimizing Java bytecode. The framework is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of bytecode which is simple to manipulate; Jimple, a typed 3-address intermediate representation suitable for optimization; and Grimp, an aggregated version of Jimple suitable for decompilation. We describe the motivation for each representation, and the salient points in translating from one representation to another. In order to demonstrate the usefulness of the framework, we have implemented intraprocedural and whole program optimizations. To show that whole program bytecode optimization can give performance improvements, we provide experimental results for 12 large benchmarks, including 8 SPECjvm98 benchmarks running on JDK 1.2 for GNU/Linux. These results show up to 8% improvement when the optimized bytecode is run using the interpreter and up to 21% when run using the JIT compiler.

1,160 citations

Journal Article

Communications of the ACM

Daniel Gooch

01 Dec 2011 - ACM Crossroads Student Magazine

TL;DR: CACM is really essential reading for students, it keeps tabs on the latest in computer science and is a valuable asset for us students, who tend to delve deep into a particular area of CS and forget everything that is happening around us.

Abstract: Communications of the ACM (CACM for short, not the best sounding acronym around) is the ACM’s flagship magazine. Started in 1957, CACM is handy for keeping up to date on current research being carried out across all topics of computer science and real-world applications. CACM has had an illustrious past with many influential pieces of work and debates started within its pages. These include Hoare’s presentation of the Quicksort algorithm; Rivest, Shamir and Adleman’s description of the first public-key cryptosystem RSA; and Dijkstra’s famous letter against the use of GOTO. In addition to the print edition, which is released monthly, there is a fantastic website (http://cacm.acm.org/) that showcases not only the most recent edition but all previous CACM articles as well, readable online as well as downloadable as a PDF. In addition, the website lets you browse for articles by subject, a handy feature if you want to focus on a particular topic. CACM is really essential reading. Pretty much guaranteed to contain content that is interesting to anyone, it keeps tabs on the latest in computer science. It is a valuable asset for us students, who tend to delve deep into a particular area of CS and forget everything that is happening around us.

856 citations
