Kathryn Mohror

Researcher at Lawrence Livermore National Laboratory

Publications - 94
Citations - 2095

Kathryn Mohror is an academic researcher at Lawrence Livermore National Laboratory. The author has contributed to research on the topics of file systems and scalability. The author has an h-index of 20 and has co-authored 81 publications receiving 1788 citations. Previous affiliations of Kathryn Mohror include Portland State University and the University of Tennessee at Chattanooga.

Papers
Proceedings Article

Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System

TL;DR: The Scalable Checkpoint/Restart (SCR) library is a multi-level checkpoint system that writes checkpoints to RAM, Flash, or disk on the compute nodes in addition to the parallel file system. Multi-level checkpointing improves efficiency on existing large-scale systems, and this benefit increases as the system size grows.
Proceedings Article

There goes the neighborhood: performance degradation due to nearby jobs

TL;DR: This paper focuses on Cray machines and investigates potential causes for performance variability such as OS jitter, shape of the allocated partition, and interference from other jobs sharing the same network links.
Proceedings Article

Design and modeling of a non-blocking checkpointing system

TL;DR: This paper presents the design of the system and shows that it can improve efficiency by 1.1x to 2.0x on future machines, and that applications using the checkpointing system can achieve high efficiency even when using a parallel file system (PFS) with lower bandwidth.
Report

Detailed Modeling, Design, and Evaluation of a Scalable Multi-level Checkpointing System

TL;DR: The goal is to design light-weight checkpoints to handle the most common failure modes and rely on more expensive checkpoints for less common, but more severe failures, and to develop low-cost checkpoint schemes that are 100x-1000x faster than the parallel file system and effective against 85% of system failures.
Proceedings Article

An ephemeral burst-buffer file system for scientific applications

TL;DR: This study designs an ephemeral Burst Buffer File System (BurstFS) that supports scalable and efficient aggregation of I/O bandwidth from burst buffers while having the same life cycle as a batch-submitted job.