scispace - formally typeset
Search or ask a question
Author

Ralph Butler

Other affiliations: Argonne National Laboratory
Bio: Ralph Butler is an academic researcher from Middle Tennessee State University. The author has contributed to research in topics: System software & Reactive programming. The author has an hindex of 6, co-authored 14 publications receiving 2104 citations. Previous affiliations of Ralph Butler include Argonne National Laboratory.

Papers
More filters
Journal ArticleDOI
TL;DR: The subsystem approach is described, the first release of the growing library of populated subsystems is offered, and the SEED is the first annotation environment that supports this model of annotation.
Abstract: The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.

1,896 citations

Journal ArticleDOI
TL;DR: The recent updates to the PATRIC resource are reported, including new web-based comparative analysis tools, eight new services and the release of a command-line interface to access, query and analyze data.
Abstract: The PathoSystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center funded by the National Institute of Allergy and Infectious Diseases (https://www.patricbrc.org). PATRIC supports bioinformatic analyses of all bacteria with a special emphasis on pathogens, offering a rich comparative analysis environment that provides users with access to over 250 000 uniformly annotated and publicly available genomes with curated metadata. PATRIC offers web-based visualization and comparative analysis tools, a private workspace in which users can analyze their own data in the context of the public collections, services that streamline complex bioinformatic workflows and command-line tools for bulk data analysis. Over the past several years, as genomic and other omics-related experiments have become more cost-effective and widespread, we have observed considerable growth in the usage of and demand for easy-to-use, publicly available bioinformatic tools and services. Here we report the recent updates to the PATRIC resource, including new web-based comparative analysis tools, eight new services and the release of a command-line interface to access, query and analyze data.

554 citations

Journal ArticleDOI
TL;DR: It is shown that the calculated transverse sum rule has large contributions from two-body currents, indicating that these mechanisms lead to a significant enhancement of the quasielastic transverse response.
Abstract: An ab initio calculation of the $^{12}\mathrm{C}$ elastic form factor and sum rules of longitudinal and transverse response functions measured in inclusive ($e$, ${e}^{\ensuremath{'}}$) scattering are reported, based on realistic nuclear potentials and electromagnetic currents. The longitudinal elastic form factor and sum rule are found to be in satisfactory agreement with available experimental data. A direct comparison between theory and experiment is difficult for the transverse sum rule. However, it is shown that the calculated transverse sum rule has large contributions from two-body currents, indicating that these mechanisms lead to a significant enhancement of the quasielastic transverse response. This fact may have implications for the anomaly observed in recent neutrino quasielastic charge-changing scattering experiments on $^{12}\mathrm{C}$.

92 citations

01 Jan 2010
TL;DR: This is the story of a simple programming model, its implementation for extreme computing, and a breakthrough in nuclear physics, which addresses fundamental issues in the origins of the authors' Universe and the library developed to enable this application on the largest computers may have applications beyond this one.
Abstract: This is the story of a simple programming model, its implementation for extreme computing, and a breakthrough in nuclear physics. A critical issue for the future of high-performance computing is the programming model to use on next-generation architectures. Described here is a promising approach: program very large machines by combining a simplified programming model with a scalable library implementation. The presentation takes the form of a case study in nuclear physics. The chosen application addresses fundamental issues in the origins of our Universe, while the library developed to enable this application on the largest computers may have applications beyond this one.

53 citations

Proceedings ArticleDOI
10 Mar 2006
TL;DR: A customizable portable virtual machine (CPVM) environment that makes it possible for students and professors to develop a customized, fully loaded, fully functional virtual machine that they can run on any computer that has a USB port or an ftp client without compromising the host.
Abstract: We present our experiences with a customizable portable virtual machine (CPVM) environment that makes it possible for students and professors to develop a customized, fully loaded, fully functional virtual machine that they can run on any computer that has a USB port or an ftp client without compromising the host. We also make available to students and faculty a copy of a portable virtual machine which provides compilers, window managers, and IDEs as well as the virtual operating system. This copy is identical to the environment found in our labs. CPVM has allowed students to use their personal Windows computers as if they were the Linux machines in the lab without having to install Linux. It has also enabled professors to effectively take their office environments with them to remote locations such as conferences. CPVM is composed of free, open-source software, and is available via anonymous ftp.

11 citations


Cited by
More filters
01 Jun 2012
TL;DR: SPAdes as mentioned in this paper is a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler and on popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal ArticleDOI
TL;DR: A fully automated service for annotating bacterial and archaeal genomes that identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user.
Abstract: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

9,397 citations

Journal ArticleDOI
TL;DR: A new method for metagenomic biomarker discovery is described and validates by way of class comparison, tests of biological consistency and effect size estimation to address the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities.
Abstract: This study describes and validates a new method for metagenomic biomarker discovery by way of class comparison, tests of biological consistency and effect size estimation. This addresses the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities, which is a central problem to the study of metagenomics. We extensively validate our method on several microbiomes and a convenient online interface for the method is provided at http://huttenhower.sph.harvard.edu/lefse/.

9,057 citations

Journal ArticleDOI
TL;DR: The interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources are described.
Abstract: In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

3,415 citations

Journal ArticleDOI
TL;DR: The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes that is stable, extensible, and freely available to all researchers.
Abstract: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics shifted from generating to analyzing sequences. High-throughput, low-cost next-generation sequencing has provided access to metagenomics to a wide range of researchers. A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats. The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis – the availability of high-performance computing for annotating the data. http://metagenomics.nmpdr.org

3,322 citations