Kathryn Mohror
Researcher at Lawrence Livermore National Laboratory
Publications - 94
Citations - 2095
Kathryn Mohror is an academic researcher at Lawrence Livermore National Laboratory. The author has contributed to research on topics including file systems and scalability, has an h-index of 20, and has co-authored 81 publications receiving 1788 citations. Previous affiliations of Kathryn Mohror include Portland State University and the University of Tennessee at Chattanooga.
Papers
Proceedings Article
Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System
TL;DR: This paper presents the Scalable Checkpoint/Restart (SCR) library, a multi-level checkpoint system that writes checkpoints to RAM, Flash, or disk on the compute nodes in addition to the parallel file system. It shows that multi-level checkpointing improves efficiency on existing large-scale systems and that this benefit increases as the system size grows.
Proceedings Article
There goes the neighborhood: performance degradation due to nearby jobs
TL;DR: This paper focuses on Cray machines and investigates potential causes for performance variability such as OS jitter, shape of the allocated partition, and interference from other jobs sharing the same network links.
Proceedings Article
Design and modeling of a non-blocking checkpointing system
Kento Sato, Kathryn Mohror, Adam Moody, Todd Gamblin, B R de Supinski, Naoya Maruyama, Satoshi Matsuoka
TL;DR: This paper presents the design of a non-blocking checkpointing system and shows that it can improve efficiency by 1.1x to 2.0x on future machines, and that applications using it can achieve high efficiency even when using a PFS with lower bandwidth.
Report
Detailed Modeling, Design, and Evaluation of a Scalable Multi-level Checkpointing System
TL;DR: The goal is to design lightweight checkpoints that handle the most common failure modes and to rely on more expensive checkpoints for less common but more severe failures. The work aims to develop low-cost checkpoint schemes that are 100x-1000x faster than the parallel file system and effective against 85% of system failures.
Proceedings Article
An ephemeral burst-buffer file system for scientific applications
TL;DR: This study designs an ephemeral Burst Buffer File System (BurstFS) that supports scalable and efficient aggregation of I/O bandwidth from burst buffers while having the same life cycle as a batch-submitted job.