Patrick Caffrey
Researcher at IBM
Publications - 5
Citations - 283
Patrick Caffrey is an academic researcher at IBM. The author has contributed to research on the topics of scalability and executables. The author has an h-index of 5 and has co-authored 5 publications receiving 280 citations. Previous affiliations of Patrick Caffrey include Lawrence Livermore National Laboratory and the University of California.
Papers
Proceedings Article
Improving the Scalability of Parallel Jobs by adding Parallel Awareness to the Operating System
Terry Jones, Shawn Dawson, R. Neely, William G. Tuel, Larry Bert Brenner, Jeffrey Fier, Robert S. Blackmore, Patrick Caffrey, Brian Maskell, Paul Tomlinson, Mark Roberts +10 more
TL;DR: The paper presents a novel co-scheduling scheme for improving the performance of fine-grain collective activities such as barriers and reductions, describes an implementation consisting of operating system kernel modifications and a run-time system, and reports empirical results comparing the technique with traditional operating system scheduling.
Patent
Method of performing checkpoint/restart of a parallel program
Kalman Meth, Anton Prenneis, Adnan Agbaria, Patrick Caffrey, William J. Ferrante, Su-Hsuan Huang, Demetrios K. Michailaros, William G. Tuel +7 more
TL;DR: A checkpoint of a parallel program is taken to provide a consistent state of the program in case it must be restarted. The timing of when each process should take the checkpoint is the responsibility of a coordinating process.
Patent
Parallel-aware, dedicated job co-scheduling method and system
Terry Jones, Pythagoras Watson, William G. Tuel, Larry Bert Brenner, Patrick Caffrey, Jeffrey Fier +5 more
TL;DR: A parallel-aware co-scheduling method and system is presented for improving the performance and scalability of a dedicated parallel job that has synchronizing collective operations.
Patent
Program products for performing checkpoint/restart of a parallel program
Kalman Meth, Anton Prenneis, Adnan Agbaria, Patrick Caffrey, William J. Ferrante, Su-Hsuan Huang, Demetrios K. Michailaros, William G. Tuel +7 more
TL;DR: A checkpoint of a parallel program is taken to provide a consistent state of the program in case it must be restarted. The timing of when each process should take the checkpoint is the responsibility of a coordinating process.
Patent
Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes
Terry Jones, Pythagoras Watson, William G. Tuel, Larry Bert Brenner, Patrick Caffrey, Jeffrey Fier +5 more
TL;DR: A parallel-aware co-scheduling method and system is presented for improving the performance and scalability of a dedicated parallel job that has synchronizing collective operations.