Author

Catherine A. Lozupone

Bio: Catherine A. Lozupone is an academic researcher from the Anschutz Medical Campus. The author has contributed to research on the topics Microbiome and Gut flora. The author has an h-index of 61 and has co-authored 129 publications receiving 91,999 citations. Previous affiliations of Catherine A. Lozupone include the University of Pennsylvania and Colorado State University.


Papers
Journal Article
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Abstract: Supplementary Figure 1 Overview of the analysis pipeline. Supplementary Table 1 Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion Expanded discussion of QIIME analyses presented in the main text; Sequencing of 16S rRNA gene amplicons; QIIME analysis notes; Expanded Figure 1 legend; Links to raw data and processed output from the runs with and without denoising.

28,911 citations

Journal Article
Evan Bolyen1, Jai Ram Rideout1, Matthew R. Dillon1, Nicholas A. Bokulich1, Christian C. Abnet2, Gabriel A. Al-Ghalith3, Harriet Alexander4, Harriet Alexander5, Eric J. Alm6, Manimozhiyan Arumugam7, Francesco Asnicar8, Yang Bai9, Jordan E. Bisanz10, Kyle Bittinger11, Asker Daniel Brejnrod7, Colin J. Brislawn12, C. Titus Brown4, Benjamin J. Callahan13, Andrés Mauricio Caraballo-Rodríguez14, John Chase1, Emily K. Cope1, Ricardo Silva14, Christian Diener15, Pieter C. Dorrestein14, Gavin M. Douglas16, Daniel M. Durall17, Claire Duvallet6, Christian F. Edwardson, Madeleine Ernst14, Madeleine Ernst18, Mehrbod Estaki17, Jennifer Fouquier19, Julia M. Gauglitz14, Sean M. Gibbons20, Sean M. Gibbons15, Deanna L. Gibson17, Antonio Gonzalez14, Kestrel Gorlick1, Jiarong Guo21, Benjamin Hillmann3, Susan Holmes22, Hannes Holste14, Curtis Huttenhower23, Curtis Huttenhower24, Gavin A. Huttley25, Stefan Janssen26, Alan K. Jarmusch14, Lingjing Jiang14, Benjamin D. Kaehler25, Benjamin D. Kaehler27, Kyo Bin Kang14, Kyo Bin Kang28, Christopher R. Keefe1, Paul Keim1, Scott T. Kelley29, Dan Knights3, Irina Koester14, Tomasz Kosciolek14, Jorden Kreps1, Morgan G. I. Langille16, Joslynn S. Lee30, Ruth E. Ley31, Ruth E. Ley32, Yong-Xin Liu, Erikka Loftfield2, Catherine A. Lozupone19, Massoud Maher14, Clarisse Marotz14, Bryan D Martin20, Daniel McDonald14, Lauren J. McIver23, Lauren J. McIver24, Alexey V. Melnik14, Jessica L. Metcalf33, Sydney C. Morgan17, Jamie Morton14, Ahmad Turan Naimey1, Jose A. Navas-Molina34, Jose A. Navas-Molina14, Louis-Félix Nothias14, Stephanie B. Orchanian, Talima Pearson1, Samuel L. Peoples35, Samuel L. Peoples20, Daniel Petras14, Mary L. Preuss36, Elmar Pruesse19, Lasse Buur Rasmussen7, Adam R. Rivers37, Michael S. Robeson38, Patrick Rosenthal36, Nicola Segata8, Michael Shaffer19, Arron Shiffer1, Rashmi Sinha2, Se Jin Song14, John R. Spear39, Austin D. Swafford, Luke R. Thompson40, Luke R. Thompson41, Pedro J. Torres29, Pauline Trinh20, Anupriya Tripathi14, Peter J. Turnbaugh10, Sabah Ul-Hasan42, Justin J. J. van der Hooft43, Fernando Vargas, Yoshiki Vázquez-Baeza14, Emily Vogtmann2, Max von Hippel44, William A. Walters32, Yunhu Wan2, Mingxun Wang14, Jonathan Warren45, Kyle C. Weber37, Kyle C. Weber46, Charles H. D. Williamson1, Amy D. Willis20, Zhenjiang Zech Xu14, Jesse R. Zaneveld20, Yilong Zhang47, Qiyun Zhu14, Rob Knight14, J. Gregory Caporaso1
TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K.; partial support was also provided by grants NIH U54CA143925 and U54MD012388.
Abstract: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K. Partial support was also provided by the following: grants NIH U54CA143925 (J.G.C. and T.P.) and U54MD012388 (J.G.C. and T.P.); grants from the Alfred P. Sloan Foundation (J.G.C. and R.K.); ERCSTG project MetaPG (N.S.); the Strategic Priority Research Program of the Chinese Academy of Sciences QYZDB-SSW-SMC021 (Y.B.); the Australian National Health and Medical Research Council APP1085372 (G.A.H., J.G.C., Von Bing Yap and R.K.); the Natural Sciences and Engineering Research Council (NSERC) to D.L.G.; and the State of Arizona Technology and Research Initiative Fund (TRIF), administered by the Arizona Board of Regents, through Northern Arizona University. All NCI coauthors were supported by the Intramural Research Program of the National Cancer Institute. S.M.G. and C. Diener were supported by the Washington Research Foundation Distinguished Investigator Award.

8,821 citations

Journal Article
Curtis Huttenhower1, Curtis Huttenhower2, Dirk Gevers2, Rob Knight3, +250 more; Institutions (42)
14 Jun 2012, Nature
TL;DR: The Human Microbiome Project Consortium reports the first results of their analysis of microbial communities from distinct, clinically relevant body habitats in a human cohort; the insights into the microbial communities of a healthy population lay foundations for future exploration of the epidemiology, ecology and translational applications of the human microbiome.
Abstract: The Human Microbiome Project Consortium reports the first results of their analysis of microbial communities from distinct, clinically relevant body habitats in a human cohort; the insights into the microbial communities of a healthy population lay foundations for future exploration of the epidemiology, ecology and translational applications of the human microbiome.

8,410 citations

Journal Article
TL;DR: This work sequences a diverse array of 25 environmental samples and three known “mock communities” at a depth averaging 3.1 million reads per sample to demonstrate excellent consistency in taxonomic recovery and recapture diversity patterns that were previously reported on the basis of metaanalysis of many studies from the literature.
Abstract: The ongoing revolution in high-throughput sequencing continues to democratize the ability of small groups of investigators to map the microbial component of the biosphere. In particular, the coevolution of new sequencing platforms and new software tools allows data acquisition and analysis on an unprecedented scale. Here we report the next stage in this coevolutionary arms race, using the Illumina GAIIx platform to sequence a diverse array of 25 environmental samples and three known “mock communities” at a depth averaging 3.1 million reads per sample. We demonstrate excellent consistency in taxonomic recovery and recapture diversity patterns that were previously reported on the basis of metaanalysis of many studies from the literature (notably, the saline/nonsaline split in environmental samples and the split between host-associated and free-living communities). We also demonstrate that 2,000 Illumina single-end reads are sufficient to recapture the same relationships among samples that we observe with the full dataset. The results thus open up the possibility of conducting large-scale studies analyzing thousands of samples simultaneously to survey microbial communities at an unprecedented spatial and temporal resolution.

6,767 citations

Journal Article
TL;DR: The results illustrate that UniFrac provides a new way of characterizing microbial communities, using the wealth of environmental rRNA sequences, and allows quantitative insight into the factors that underlie the distribution of lineages among environments.
Abstract: We introduce here a new method for computing differences between microbial communities based on phylogenetic information. This method, UniFrac, measures the phylogenetic distance between sets of taxa in a phylogenetic tree as the fraction of the branch length of the tree that leads to descendants from either one environment or the other, but not both. UniFrac can be used to determine whether communities are significantly different, to compare many communities simultaneously using clustering and ordination techniques, and to measure the relative contributions of different factors, such as chemistry and geography, to similarities between samples. We demonstrate the utility of UniFrac by applying it to published 16S rRNA gene libraries from cultured isolates and environmental clones of bacteria in marine sediment, water, and ice. Our results reveal that (i) cultured isolates from ice, water, and sediment resemble each other and environmental clone sequences from sea ice, but not environmental clone sequences from sediment and water; (ii) the geographical location does not correlate strongly with bacterial community differences in ice and sediment from the Arctic and Antarctic; and (iii) bacterial communities differ between terrestrially impacted seawater (whether polar or temperate) and warm oligotrophic seawater, whereas those in individual seawater samples are not more similar to each other than to those in sediment or ice samples. These results illustrate that UniFrac provides a new way of characterizing microbial communities, using the wealth of environmental rRNA sequences, and allows quantitative insight into the factors that underlie the distribution of lineages among environments.

6,679 citations
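
The UniFrac abstract above defines the distance as the fraction of branch length that leads to descendants from either one environment or the other, but not both. A minimal Python sketch of that definition on a toy tree follows. The tree encoding (nested lists of `(child, branch_length)` pairs), the function name `unweighted_unifrac`, and the sample names are all hypothetical choices made for illustration; this is not the authors' implementation, and it ignores details such as significance testing.

```python
from typing import Dict, List, Set, Tuple, Union

# Hypothetical toy encoding: a node is either a leaf name (str) or a list of
# (child, branch_length) pairs. Not the format of any real library.
Node = Union[str, List[Tuple["Node", float]]]


def unweighted_unifrac(tree: Node, leaf_envs: Dict[str, Set[str]],
                       env_a: str, env_b: str) -> float:
    """Fraction of branch length leading to descendants from only one of
    env_a / env_b. Branches with no descendants in either are ignored."""
    unique = 0.0  # branch length seen by exactly one environment
    total = 0.0   # branch length seen by at least one environment
    pair = {env_a, env_b}

    def visit(node: Node) -> Set[str]:
        nonlocal unique, total
        if isinstance(node, str):
            # A leaf contributes the environments it was observed in.
            return leaf_envs.get(node, set()) & pair
        seen: Set[str] = set()
        for child, length in node:
            child_envs = visit(child)
            if child_envs:
                total += length
                if len(child_envs) == 1:
                    unique += length
            seen |= child_envs
        return seen

    visit(tree)
    return unique / total if total > 0 else 0.0


# Toy tree: two clades of two leaves each; leaf branches 1.0, clade branches 0.5.
TOY_TREE: Node = [
    ([("s1", 1.0), ("s2", 1.0)], 0.5),
    ([("s3", 1.0), ("s4", 1.0)], 0.5),
]

if __name__ == "__main__":
    # Each clade belongs to a single environment: every branch is unique.
    separate = {"s1": {"ice"}, "s2": {"ice"},
                "s3": {"sediment"}, "s4": {"sediment"}}
    print(unweighted_unifrac(TOY_TREE, separate, "ice", "sediment"))  # 1.0

    # Environments interleaved: only the four leaf branches are unique (4.0 of 5.0).
    mixed = {"s1": {"ice"}, "s2": {"sediment"},
             "s3": {"ice"}, "s4": {"sediment"}}
    print(unweighted_unifrac(TOY_TREE, mixed, "ice", "sediment"))  # 0.8
```

In this sketch a value of 1.0 means the two environments share no branch length, and 0.0 means every branch is shared; a pairwise matrix of such values is the kind of input that clustering and ordination would then consume.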


Cited by
Journal Article
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Abstract: Supplementary Figure 1 Overview of the analysis pipeline. Supplementary Table 1 Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion Expanded discussion of QIIME analyses presented in the main text; Sequencing of 16S rRNA gene amplicons; QIIME analysis notes; Expanded Figure 1 legend; Links to raw data and processed output from the runs with and without denoising.

28,911 citations

Journal Article
TL;DR: The extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.
Abstract: SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.

18,256 citations

Journal Article
TL;DR: mothur is used in a case study to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments.
Abstract: mothur aims to be a comprehensive software package that allows users to use a single piece of software to analyze community sequence data. It builds upon previous tools to provide a flexible and powerful software package for analyzing sequencing data. As a case study, we used mothur to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the alpha and beta diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments. This analysis of more than 222,000 sequences was completed in less than 2 h with a laptop computer.

17,350 citations

Journal Article
TL;DR: The RDP Classifier can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes, and the majority of the classification errors appear to be due to anomalies in the current taxonomies.
Abstract: The Ribosomal Database Project (RDP) Classifier, a naive Bayesian classifier, can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes (2nd ed., release 5.0, Springer-Verlag, New York, NY, 2004). It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The majority of classifications (98%) were of high estimated confidence (≥95%) and high accuracy (98%). In addition to being tested with the corpus of 5,014 type strain sequences from Bergey's outline, the RDP Classifier was tested with a corpus of 23,095 rRNA sequences as assigned by the NCBI into their alternative higher-order taxonomy. The results from leave-one-out testing on both corpora show that the overall accuracies at all levels of confidence for near-full-length and 400-base segments were 89% or above down to the genus level, and the majority of the classification errors appear to be due to anomalies in the current taxonomies. For shorter rRNA segments, such as those that might be generated by pyrosequencing, the error rate varied greatly over the length of the 16S rRNA gene, with segments around the V2 and V4 variable regions giving the lowest error rates. The RDP Classifier is suitable both for the analysis of single rRNA sequences and for the analysis of libraries of thousands of sequences. Another related tool, RDP Library Compare, was developed to facilitate microbial-community comparison based on 16S rRNA gene sequence libraries. It combines the RDP Classifier with a statistical test to flag taxa differentially represented between samples. The RDP Classifier and RDP Library Compare are available online at http://rdp.cme.msu.edu/.

16,048 citations

Journal Article
TL;DR: RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML) that has been used to compute ML trees on two of the largest alignments to date.
Abstract: Summary: RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML). Low-level technical optimizations, a modification of the search algorithm, and the use of the GTR+CAT approximation as replacement for GTR+Γ yield a program that is between 2.7 and 52 times faster than the previous version of RAxML. A large-scale performance comparison with GARLI, PHYML, IQPNNI and MrBayes on real data containing 1000 up to 6722 taxa shows that RAxML requires at least 5.6 times less main memory and yields better trees in similar times than the best competing program (GARLI) on datasets up to 2500 taxa. On datasets ≥4000 taxa it also runs 2-3 times faster than GARLI. RAxML has been parallelized with MPI to conduct parallel multiple bootstraps and inferences on distinct starting trees. The program has been used to compute ML trees on two of the largest alignments to date containing 25 057 (1463 bp) and 2182 (51 089 bp) taxa, respectively. Availability: icwww.epfl.ch/~stamatak. Contact: Alexandros.Stamatakis@epfl.ch. Supplementary information: Supplementary data are available at Bioinformatics online.

14,847 citations