Author

Galen M. Shipman

Bio: Galen M. Shipman is an academic researcher from Los Alamos National Laboratory. The author has contributed to research on topics including file systems and Lustre (file system). The author has an h-index of 27 and has co-authored 83 publications receiving 2,103 citations. Previous affiliations of Galen M. Shipman include the National Center for Computational Sciences and Oak Ridge National Laboratory.


Papers
Proceedings ArticleDOI
25 Sep 2006
TL;DR: This work describes Open MPI's architecture for heterogeneous network and processor support, and demonstrates the transparency to the application developer while maintaining very high levels of performance.
Abstract: The growth in the number of generally available, distributed, heterogeneous computing systems places increasing importance on the development of user-friendly tools that enable application developers to use these resources efficiently. Open MPI provides support for several aspects of heterogeneity within a single, open-source MPI implementation. Through careful abstractions, heterogeneous support maintains efficient use of uniform computational platforms. We describe Open MPI's architecture for heterogeneous network and processor support. A key design feature of this implementation is its transparency to the application developer while maintaining very high levels of performance. This is demonstrated with the results of several numerical experiments.
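As a loose illustration of what "transparent to the application developer" means in practice, the sketch below is my own example, not taken from the paper; it uses the mpi4py bindings rather than Open MPI's C API. It exchanges a NumPy buffer between two ranks with no architecture-specific code: any datatype or endianness conversion between heterogeneous hosts happens inside the MPI layer.

from mpi4py import MPI      # assumes an MPI implementation such as Open MPI underneath
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

buf = np.arange(8, dtype=np.float64)
if rank == 0:
    # Sender and receiver may run on different architectures;
    # the application code does not need to know or care.
    comm.Send([buf, MPI.DOUBLE], dest=1, tag=0)
elif rank == 1:
    comm.Recv([buf, MPI.DOUBLE], source=0, tag=0)
    print("rank 1 received:", buf)

Run with, for example, mpirun -np 2 python exchange.py; the script name is illustrative.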

152 citations

Patent
28 Jan 2011
TL;DR: An optimized redundant array of solid-state devices may include one or more optimized solid-state devices and a controller coupled to those devices for managing them.
Abstract: An optimized redundant array of solid state devices may include an array of one or more optimized solid-state devices and a controller coupled to the solid-state devices for managing the solid-state devices. The controller may be configured to globally coordinate the garbage collection activities of each of said optimized solid-state devices, for instance, to minimize the degraded performance time and increase the optimal performance time of the entire array of devices.
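A minimal sketch of the coordination idea follows, under my own assumptions: the class names, the threshold policy, and the write-steering heuristic are illustrative and not taken from the patent. A controller bounds how many devices reclaim blocks at once and steers new writes away from devices that are currently garbage collecting.

class SSD:
    """Toy model of one device in the array."""
    def __init__(self, name, gc_threshold=0.2):
        self.name = name
        self.free_ratio = 1.0          # fraction of free blocks
        self.gc_threshold = gc_threshold
        self.in_gc = False

    def needs_gc(self):
        return self.free_ratio < self.gc_threshold


class ArrayController:
    """Globally coordinates GC so only a bounded number of devices
    are in a degraded (garbage-collecting) state at any time."""
    def __init__(self, devices, max_concurrent_gc=1):
        self.devices = devices
        self.max_concurrent_gc = max_concurrent_gc

    def schedule_gc(self):
        active = sum(d.in_gc for d in self.devices)
        for d in self.devices:
            if active >= self.max_concurrent_gc:
                break
            if d.needs_gc() and not d.in_gc:
                d.in_gc = True         # permit this device to reclaim blocks
                active += 1

    def pick_write_target(self):
        # Steer incoming writes toward devices not currently in GC.
        candidates = [d for d in self.devices if not d.in_gc] or self.devices
        return max(candidates, key=lambda d: d.free_ratio)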

138 citations

Proceedings ArticleDOI
10 Apr 2011
TL;DR: This paper examines the GC process and proposes a semi-preemptive GC scheme that can preempt ongoing GC processing to service pending I/O requests in the queue, and that further enhances flash performance by pipelining internal GC operations and merging them with pending I/O requests whenever possible.
Abstract: NAND flash memory is a preferred storage medium for various platforms ranging from embedded systems to enterprise-scale systems. Flash devices do not have any mechanical moving parts and provide low-latency access. They also require less power compared to rotating media. Unlike hard disks, flash devices use out-of-place updates and require a garbage collection (GC) process to reclaim invalid pages and create free blocks. This GC process is a major cause of performance degradation when running concurrently with other I/O operations, as internal bandwidth is consumed to reclaim these invalid pages. The invocation of the GC process is generally governed by a low watermark on free blocks and other internal device metrics that different workloads meet at different intervals. This results in I/O performance that is highly dependent on workload characteristics. In this paper, we examine the GC process and propose a semi-preemptive GC scheme that can preempt on-going GC processing and service pending I/O requests in the queue. Moreover, we further enhance flash performance by pipelining internal GC operations and merging them with pending I/O requests whenever possible. Our experimental evaluation of this semi-preemptive GC scheme with realistic workloads demonstrates both improved performance and reduced performance variability. Write-dominant workloads show up to a 66.56% improvement in average response time with an 83.30% reduction in response-time variance compared to the non-preemptive GC scheme.
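To make the mechanism concrete, here is a rough sketch of the preemption and merge logic under my own assumptions; the flash, victim_block, and request objects are hypothetical stand-ins for driver state, and this is not the authors' implementation. GC relocates valid pages one at a time, checks the pending I/O queue at each page boundary, and skips relocating a page whose logical address is about to be overwritten anyway.

def semi_preemptive_gc(victim_block, io_queue, flash):
    """io_queue: list of pending host requests (each with .op, .lpn, .data).
    flash / victim_block: hypothetical driver objects, for illustration only."""
    for page in list(victim_block.valid_pages):
        merged = False
        # Preemption point: service I/O that arrived while GC was running.
        while io_queue and not merged:
            req = io_queue.pop(0)
            if req.op == "write" and req.lpn == page.lpn:
                # Merge: the host overwrites this page anyway, so service
                # the write now and skip relocating the stale copy.
                flash.write(req.lpn, req.data)
                victim_block.valid_pages.remove(page)
                merged = True
            else:
                flash.service(req)
        if not merged:
            flash.copy_page(page, victim_block)   # relocate valid data
    flash.erase(victim_block)                     # reclaim the free block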

118 citations

Journal ArticleDOI
TL;DR: Several recent applications of big and deep data analysis methods are reviewed that visualize, compress, and translate multidimensional structural and functional data into physically and chemically relevant information.
Abstract: The development of electron and scanning probe microscopies in the second half of the twentieth century has produced spectacular images of the internal structure and composition of matter with nanometer, molecular, and atomic resolution. Largely, this progress was enabled by computer-assisted methods of microscope operation, data acquisition, and analysis. Advances in imaging technology in the beginning of the twenty-first century have opened the proverbial floodgates on the availability of high-veracity information on structure and functionality. From the hardware perspective, high-resolution imaging methods now routinely resolve atomic positions with approximately picometer precision, allowing for quantitative measurements of individual bond lengths and angles. Similarly, functional imaging often leads to multidimensional data sets containing partial or full information on properties of interest, acquired as a function of multiple parameters (time, temperature, or other external stimuli). Here, we review several recent applications of the big and deep data analysis methods to visualize, compress, and translate this multidimensional structural and functional data into physically and chemically relevant information.
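As a concrete, if simplified, example of compressing such multidimensional data, the sketch below applies a truncated SVD to a hyperspectral image cube. The array shapes, the random placeholder data, and the choice of plain SVD are my own assumptions, not the specific methods reviewed in the paper.

import numpy as np

nx, ny, nspec = 64, 64, 500                # pixels x pixels x spectrum length
data = np.random.rand(nx, ny, nspec)       # placeholder for a measured data set

X = data.reshape(nx * ny, nspec)           # unfold: one spectrum per pixel
X = X - X.mean(axis=0)                     # center before decomposition

U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 4                                      # number of retained components
scores = U[:, :k] * S[:k]                  # per-pixel component weights
maps = scores.reshape(nx, ny, k)           # spatial loading maps
spectra = Vt[:k]                           # component spectra

X_compressed = scores @ spectra            # rank-k approximation of the data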

101 citations


Cited by
Journal ArticleDOI
TL;DR: Several of the fundamental algorithms used in LAMMPS are described along with the design strategies which have made it flexible for both users and developers, and some capabilities recently added to the code which were enabled by this flexibility are highlighted.

1,956 citations

Journal ArticleDOI
TL;DR: The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a simulator for particle-based modeling of materials at length scales ranging from atomic to mesoscale to continuum.

1,517 citations

Proceedings ArticleDOI
11 Jun 2012
TL;DR: This paper collects detailed traces from Facebook's Memcached deployment, arguably the world's largest, and analyzes the workloads from multiple angles, including: request composition, size, and rate; cache efficacy; temporal patterns; and application use cases.
Abstract: Key-value stores are a vital component in many scale-out enterprises, including social networks, online retail, and risk analysis. Accordingly, they are receiving increased attention from the research community in an effort to improve their performance, scalability, reliability, cost, and power consumption. To be effective, such efforts require a detailed understanding of realistic key-value workloads. And yet little is known about these workloads outside of the companies that operate them. This paper aims to address this gap. To this end, we have collected detailed traces from Facebook's Memcached deployment, arguably the world's largest. The traces capture over 284 billion requests from five different Memcached use cases over several days. We analyze the workloads from multiple angles, including: request composition, size, and rate; cache efficacy; temporal patterns; and application use cases. We also propose a simple model of the most representative trace to enable the generation of more realistic synthetic workloads by the community. Our analysis details many characteristics of the caching workload. It also reveals a number of surprises: a GET/SET ratio of 30:1 that is higher than assumed in the literature; some applications of Memcached behave more like persistent storage than a cache; strong locality metrics, such as keys accessed many millions of times a day, do not always suffice for a high hit rate; and there is still room for efficiency and hit rate improvements in Memcached's implementation. Toward the last point, we make several suggestions that address the exposed deficiencies.
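For flavor, here is a toy sketch of the kind of request-composition analysis described above. The trace file name and its whitespace-separated "timestamp op key size" line format are my assumptions (the actual Facebook traces are not in this format), so this is illustrative only.

from collections import Counter

ops, key_hits = Counter(), Counter()
with open("memcached_trace.txt") as trace:       # hypothetical trace file
    for line in trace:
        _ts, op, key, _size = line.split()[:4]
        op = op.upper()
        ops[op] += 1                             # request composition
        if op == "GET":
            key_hits[key] += 1                   # key popularity / locality

ratio = ops["GET"] / max(ops["SET"], 1)
print(f"GET:SET ratio = {ratio:.1f}:1")
print("hottest keys:", key_hits.most_common(5))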

880 citations

Journal ArticleDOI
TL;DR: This study aims to provide a common basis for CPM climate simulations by giving a holistic review of the topic, and presents the consolidated outcome of studies that addressed the added value of CPM climate simulations compared to LSMs.
Abstract: Regional climate modeling using convection-permitting models (CPMs; horizontal grid spacing <4 km) has emerged as a promising framework for providing more reliable climate information on regional to local scales than traditionally used large-scale models (LSMs; grid spacing >10 km). CPMs no longer rely on convection parameterization schemes, which had been identified as a major source of errors and uncertainties in LSMs. Moreover, CPMs allow for a more accurate representation of surface and orography fields. The drawback of CPMs is their high demand on computational resources. For this reason, the first CPM climate simulations only appeared a decade ago. In this study, we aim to provide a common basis for CPM climate simulations by giving a holistic review of the topic. The most important components in CPMs, such as physical parameterizations and dynamical formulations, are discussed critically. An overview of weaknesses and an outlook on required future developments are provided. Most importantly, this review presents the consolidated outcome of studies that addressed the added value of CPM climate simulations compared to LSMs. Improvements are evident mostly for climate statistics related to deep convection, mountainous regions, or extreme events. The climate change signals of CPM simulations suggest an increase in flash floods, changes in hail storm characteristics, and reductions in snowpack over mountains. In conclusion, CPMs are a very promising tool for future climate research. However, coordinated modeling programs are crucially needed to advance parameterizations of unresolved physics and to assess the full potential of CPMs.

833 citations