Author

Arron Shiffer

Bio: Arron Shiffer is an academic researcher at Northern Arizona University. The author has contributed to research on the topics Microbiome and Raw data, has an h-index of 7, and has co-authored 12 publications that have received 4372 citations.

Topics: Microbiome, Raw data, UniFrac, Taxonomic rank, Phylogenetic tree

Papers

Journal Article•DOI•

Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2

Evan Bolyen¹, Jai Ram Rideout¹, Matthew R. Dillon¹, Nicholas A. Bokulich¹, Christian C. Abnet², Gabriel A. Al-Ghalith³, Harriet Alexander⁴, Harriet Alexander⁵, Eric J. Alm⁶, Manimozhiyan Arumugam⁷, Francesco Asnicar⁸, Yang Bai⁹, Jordan E. Bisanz¹⁰, Kyle Bittinger¹¹, Asker Daniel Brejnrod⁷, Colin J. Brislawn¹², C. Titus Brown⁴, Benjamin J. Callahan¹³, Andrés Mauricio Caraballo-Rodríguez¹⁴, John Chase¹, Emily K. Cope¹, Ricardo Silva¹⁴, Christian Diener¹⁵, Pieter C. Dorrestein¹⁴, Gavin M. Douglas¹⁶, Daniel M. Durall¹⁷, Claire Duvallet⁶, Christian F. Edwardson, Madeleine Ernst¹⁸, Madeleine Ernst¹⁴, Mehrbod Estaki¹⁷, Jennifer Fouquier¹⁹, Julia M. Gauglitz¹⁴, Sean M. Gibbons¹⁵, Sean M. Gibbons²⁰, Deanna L. Gibson¹⁷, Antonio Gonzalez¹⁴, Kestrel Gorlick¹, Jiarong Guo²¹, Benjamin Hillmann³, Susan Holmes²², Hannes Holste¹⁴, Curtis Huttenhower²³, Curtis Huttenhower²⁴, Gavin A. Huttley²⁵, Stefan Janssen²⁶, Alan K. Jarmusch¹⁴, Lingjing Jiang¹⁴, Benjamin D. Kaehler²⁷, Benjamin D. Kaehler²⁵, Kyo Bin Kang²⁸, Kyo Bin Kang¹⁴, Christopher R. Keefe¹, Paul Keim¹, Scott T. Kelley²⁹, Dan Knights³, Irina Koester¹⁴, Tomasz Kosciolek¹⁴, Jorden Kreps¹, Morgan G. I. Langille¹⁶, Joslynn S. Lee³⁰, Ruth E. Ley³¹, Ruth E. Ley³², Yong-Xin Liu, Erikka Loftfield², Catherine A. Lozupone¹⁹, Massoud Maher¹⁴, Clarisse Marotz¹⁴, Bryan D Martin²⁰, Daniel McDonald¹⁴, Lauren J. McIver²³, Lauren J. McIver²⁴, Alexey V. Melnik¹⁴, Jessica L. Metcalf³³, Sydney C. Morgan¹⁷, Jamie Morton¹⁴, Ahmad Turan Naimey¹, Jose A. Navas-Molina¹⁴, Jose A. Navas-Molina³⁴, Louis-Félix Nothias¹⁴, Stephanie B. Orchanian, Talima Pearson¹, Samuel L. Peoples²⁰, Samuel L. Peoples³⁵, Daniel Petras¹⁴, Mary L. Preuss³⁶, Elmar Pruesse¹⁹, Lasse Buur Rasmussen⁷, Adam R. Rivers³⁷, Michael S. Robeson³⁸, Patrick Rosenthal³⁶, Nicola Segata⁸, Michael Shaffer¹⁹, Arron Shiffer¹, Rashmi Sinha², Se Jin Song¹⁴, John R. Spear³⁹, Austin D. Swafford, Luke R. Thompson⁴⁰, Luke R. Thompson⁴¹, Pedro J. 
Torres²⁹, Pauline Trinh²⁰, Anupriya Tripathi¹⁴, Peter J. Turnbaugh¹⁰, Sabah Ul-Hasan⁴², Justin J. J. van der Hooft⁴³, Fernando Vargas, Yoshiki Vázquez-Baeza¹⁴, Emily Vogtmann², Max von Hippel⁴⁴, William A. Walters³², Yunhu Wan², Mingxun Wang¹⁴, Jonathan Warren⁴⁵, Kyle C. Weber³⁷, Kyle C. Weber⁴⁶, Charles H. D. Williamson¹, Amy D. Willis²⁰, Zhenjiang Zech Xu¹⁴, Jesse R. Zaneveld²⁰, Yilong Zhang⁴⁷, Qiyun Zhu¹⁴, Rob Knight¹⁴, J. Gregory Caporaso¹

Institutions (47):

Northern Arizona University¹, National Institutes of Health², University of Minnesota³, University of California, Davis⁴, Woods Hole Oceanographic Institution⁵, Massachusetts Institute of Technology⁶, University of Copenhagen⁷, University of Trento⁸, Chinese Academy of Sciences⁹, University of California, San Francisco¹⁰, University of Pennsylvania¹¹, Pacific Northwest National Laboratory¹², North Carolina State University¹³, University of California, San Diego¹⁴, Institute for Systems Biology¹⁵, Dalhousie University¹⁶, University of British Columbia¹⁷, Statens Serum Institut¹⁸, Anschutz Medical Campus¹⁹, University of Washington²⁰, Michigan State University²¹, Stanford University²², Broad Institute²³, Harvard University²⁴, Australian National University²⁵, University of Düsseldorf²⁶, University of New South Wales²⁷, Sookmyung Women's University²⁸, San Diego State University²⁹, Howard Hughes Medical Institute³⁰, Cornell University³¹, Max Planck Society³², Colorado State University³³, Google³⁴, Syracuse University³⁵, Webster University³⁶, United States Department of Agriculture³⁷, University of Arkansas for Medical Sciences³⁸, Colorado School of Mines³⁹, University of Southern Mississippi⁴⁰, National Oceanic and Atmospheric Administration⁴¹, University of California, Merced⁴², Wageningen University and Research Centre⁴³, University of Arizona⁴⁴, Environment Agency⁴⁵, University of Florida⁴⁶, Merck & Co.⁴⁷

01 Aug 2019-Nature Biotechnology

TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K.; partial support was also provided by NIH grants U54CA143925 and U54MD012388.

Abstract: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K. Partial support was also provided by the following: grants NIH U54CA143925 (J.G.C. and T.P.) and U54MD012388 (J.G.C. and T.P.); grants from the Alfred P. Sloan Foundation (J.G.C. and R.K.); ERCSTG project MetaPG (N.S.); the Strategic Priority Research Program of the Chinese Academy of Sciences QYZDB-SSW-SMC021 (Y.B.); the Australian National Health and Medical Research Council APP1085372 (G.A.H., J.G.C., Von Bing Yap and R.K.); the Natural Sciences and Engineering Research Council (NSERC) to D.L.G.; and the State of Arizona Technology and Research Initiative Fund (TRIF), administered by the Arizona Board of Regents, through Northern Arizona University. All NCI coauthors were supported by the Intramural Research Program of the National Cancer Institute. S.M.G. and C. Diener were supported by the Washington Research Foundation Distinguished Investigator Award.

8,821 citations

Posted Content•DOI•

QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science

Evan Bolyen¹, Jai Ram Rideout¹, Matthew R. Dillon¹, Nicholas A. Bokulich¹, Christian C. Abnet, Gabriel A. Al-Ghalith², Harriet Alexander³, Harriet Alexander⁴, Eric J. Alm⁵, Manimozhiyan Arumugam⁶, Francesco Asnicar⁷, Yang Bai⁸, Jordan E. Bisanz⁹, Kyle Bittinger¹⁰, Asker Daniel Brejnrod⁶, Colin J. Brislawn¹¹, C. Titus Brown³, Benjamin J. Callahan¹², Andrés Mauricio Caraballo-Rodríguez¹³, John Chase¹, Emily K. Cope¹, Ricardo Silva¹³, Pieter C. Dorrestein¹³, Gavin M. Douglas¹⁴, Daniel M. Durall¹⁵, Claire Duvallet⁵, Christian F. Edwardson¹⁶, Madeleine Ernst¹³, Mehrbod Estaki¹⁵, Jennifer Fouquier¹⁷, Julia M. Gauglitz¹³, Deanna L. Gibson¹⁵, Antonio Gonzalez¹⁸, Kestrel Gorlick¹, Jiarong Guo¹⁹, Benjamin Hillmann², Susan Holmes²⁰, Hannes Holste¹⁸, Curtis Huttenhower²¹, Curtis Huttenhower²², Gavin A. Huttley²³, Stefan Janssen²⁴, Alan K. Jarmusch¹³, Lingjing Jiang¹⁸, Benjamin D. Kaehler²³, Kyo Bin Kang²⁵, Kyo Bin Kang¹³, Christopher R. Keefe¹, Paul Keim¹, Scott T. Kelley²⁶, Dan Knights², Irina Koester¹⁸, Irina Koester¹³, Tomasz Kosciolek¹⁸, Jorden Kreps¹, Morgan G. I. Langille¹⁴, Joslynn S. Lee²⁷, Ruth E. Ley²⁸, Ruth E. Ley²⁹, Yong-Xin Liu⁸, Erikka Loftfield, Catherine A. Lozupone¹⁷, Massoud Maher¹⁸, Clarisse Marotz¹⁸, Bryan D Martin³⁰, Daniel McDonald¹⁸, Lauren J. McIver²², Lauren J. McIver²¹, Alexey V. Melnik¹³, Jessica L. Metcalf³¹, Sydney C. Morgan¹⁵, Jamie Morton¹⁸, Ahmad Turan Naimey¹, Jose A. Navas-Molina¹⁸, Jose A. Navas-Molina³², Louis-Félix Nothias¹³, Stephanie B. Orchanian¹⁸, Talima Pearson¹, Samuel L. Peoples³³, Samuel L. Peoples³⁰, Daniel Petras¹³, Mary L. Preuss³⁴, Elmar Pruesse¹⁷, Lasse Buur Rasmussen⁶, Adam R. Rivers³⁵, Ii Michael S Robeson³⁶, Patrick Rosenthal³⁴, Nicola Segata⁷, Michael Shaffer¹⁷, Arron Shiffer¹, Rashmi Sinha, Se Jin Song¹⁸, John R. Spear³⁷, Austin D. Swafford¹⁸, Luke R. Thompson³⁸, Luke R. Thompson³⁹, Pedro J. Torres²⁶, Pauline Trinh³⁰, Anupriya Tripathi¹⁸, Anupriya Tripathi¹³, Peter J. Turnbaugh⁹, Sabah Ul-Hasan⁴⁰, Justin J. J. 
van der Hooft⁴¹, Fernando Vargas¹⁸, Yoshiki Vázquez-Baeza¹⁸, Emily Vogtmann, Max von Hippel⁴², William A. Walters²⁹, Yunhu Wan, Mingxun Wang¹³, Jonathan Warren⁴³, Kyle C. Weber³⁵, Kyle C. Weber⁴⁴, Chase H. D. Williamson¹, Amy D. Willis³⁰, Zhenjiang Zech Xu¹⁸, Jesse R. Zaneveld³⁰, Yilong Zhang⁴⁵, Rob Knight¹⁸, J. Gregory Caporaso¹

Institutions (45):

Northern Arizona University¹, University of Minnesota², University of California, Davis³, Woods Hole Oceanographic Institution⁴, Massachusetts Institute of Technology⁵, University of Copenhagen⁶, University of Trento⁷, Chinese Academy of Sciences⁸, University of California, San Francisco⁹, Children's Hospital of Philadelphia¹⁰, Pacific Northwest National Laboratory¹¹, North Carolina State University¹², University of Montana¹³, Dalhousie University¹⁴, University of British Columbia¹⁵, Shedd Aquarium¹⁶, University of Colorado Denver¹⁷, University of California, San Diego¹⁸, Michigan State University¹⁹, Stanford University²⁰, Broad Institute²¹, Harvard University²², Australian National University²³, University of Düsseldorf²⁴, Sookmyung Women's University²⁵, San Diego State University²⁶, Howard Hughes Medical Institute²⁷, Cornell University²⁸, Max Planck Society²⁹, University of Washington³⁰, Colorado State University³¹, Google³², Syracuse University³³, Webster University³⁴, United States Department of Agriculture³⁵, University of Arkansas for Medical Sciences³⁶, Colorado School of Mines³⁷, University of Southern Mississippi³⁸, Atlantic Oceanographic and Meteorological Laboratory³⁹, University of California, Merced⁴⁰, Wageningen University and Research Centre⁴¹, University of Arizona⁴², Environment Agency⁴³, University of Florida⁴⁴, Merck & Co.⁴⁵

24 Oct 2018-PeerJ

TL;DR: QIIME 2 provides new features that will drive the next generation of microbiome research, including interactive spatial and temporal analysis and visualization tools, support for metabolomics and shotgun metagenomics analysis, and automated data provenance tracking to ensure reproducible, transparent microbiome data science.

Abstract: We present QIIME 2, an open-source microbiome data science platform accessible to users spanning the microbiome research ecosystem, from scientists and engineers to clinicians and policy makers. QIIME 2 provides new features that will drive the next generation of microbiome research. These include interactive spatial and temporal analysis and visualization tools, support for metabolomics and shotgun metagenomics analysis, and automated data provenance tracking to ensure reproducible, transparent microbiome data science.

875 citations

Journal Article•DOI•

Author Correction: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2.

Evan Bolyen¹, Jai Ram Rideout¹, Matthew R. Dillon¹, Nicholas A. Bokulich¹, Christian C. Abnet², Gabriel A. Al-Ghalith³, Harriet Alexander⁴, Harriet Alexander⁵, Eric J. Alm⁶, Manimozhiyan Arumugam⁷, Francesco Asnicar⁸, Yang Bai⁹, Jordan E. Bisanz¹⁰, Kyle Bittinger¹¹, Asker Daniel Brejnrod⁷, Colin J. Brislawn¹², C. Titus Brown⁴, Benjamin J. Callahan¹³, Andrés Mauricio Caraballo-Rodríguez¹⁴, John Chase¹, Emily K. Cope¹, Ricardo Silva¹⁴, Christian Diener¹⁵, Pieter C. Dorrestein¹⁴, Gavin M. Douglas¹⁶, Daniel M. Durall¹⁷, Claire Duvallet⁶, Christian F. Edwardson, Madeleine Ernst¹⁸, Madeleine Ernst¹⁴, Mehrbod Estaki¹⁷, Jennifer Fouquier¹⁹, Julia M. Gauglitz¹⁴, Sean M. Gibbons¹⁵, Sean M. Gibbons²⁰, Deanna L. Gibson¹⁷, Antonio Gonzalez²¹, Kestrel Gorlick¹, Jiarong Guo²², Benjamin Hillmann³, Susan Holmes²³, Hannes Holste²¹, Curtis Huttenhower²⁴, Curtis Huttenhower²⁵, Gavin A. Huttley²⁶, Stefan Janssen²⁷, Alan K. Jarmusch¹⁴, Lingjing Jiang²¹, Benjamin D. Kaehler²⁶, Benjamin D. Kaehler²⁸, Kyo Bin Kang¹⁴, Kyo Bin Kang²⁹, Christopher R. Keefe¹, Paul Keim¹, Scott T. Kelley³⁰, Dan Knights³, Irina Koester¹⁴, Irina Koester²¹, Tomasz Kosciolek²¹, Jorden Kreps¹, Morgan G. I. Langille¹⁶, Joslynn S. Lee³¹, Ruth E. Ley³², Ruth E. Ley³³, Yong-Xin Liu, Erikka Loftfield², Catherine A. Lozupone¹⁹, Massoud Maher²¹, Clarisse Marotz²¹, Bryan D Martin²⁰, Daniel McDonald²¹, Lauren J. McIver²⁴, Lauren J. McIver²⁵, Alexey V. Melnik¹⁴, Jessica L. Metcalf³⁴, Sydney C. Morgan¹⁷, Jamie Morton²¹, Ahmad Turan Naimey¹, Jose A. Navas-Molina³⁵, Jose A. Navas-Molina²¹, Louis-Félix Nothias¹⁴, Stephanie B. Orchanian, Talima Pearson¹, Samuel L. Peoples³⁶, Samuel L. Peoples²⁰, Daniel Petras¹⁴, Mary L. Preuss³⁷, Elmar Pruesse¹⁹, Lasse Buur Rasmussen⁷, Adam R. Rivers³⁸, Michael S. Robeson³⁹, Patrick Rosenthal³⁷, Nicola Segata⁸, Michael Shaffer¹⁹, Arron Shiffer¹, Rashmi Sinha², Se Jin Song²¹, John R. Spear⁴⁰, Austin D. Swafford, Luke R. Thompson⁴¹, Luke R. Thompson⁴², Pedro J. 
Torres³⁰, Pauline Trinh²⁰, Anupriya Tripathi²¹, Anupriya Tripathi¹⁴, Peter J. Turnbaugh¹⁰, Sabah Ul-Hasan⁴³, Justin J. J. van der Hooft⁴⁴, Fernando Vargas, Yoshiki Vázquez-Baeza²¹, Emily Vogtmann², Max von Hippel⁴⁵, William A. Walters³³, Yunhu Wan², Mingxun Wang¹⁴, Jonathan Warren⁴⁶, Kyle C. Weber⁴⁷, Kyle C. Weber³⁸, Charles H. D. Williamson¹, Amy D. Willis²⁰, Zhenjiang Zech Xu²¹, Jesse R. Zaneveld²⁰, Yilong Zhang⁴⁸, Qiyun Zhu²¹, Rob Knight²¹, J. Gregory Caporaso¹

Institutions (48):

Northern Arizona University¹, National Institutes of Health², University of Minnesota³, University of California, Davis⁴, Woods Hole Oceanographic Institution⁵, Massachusetts Institute of Technology⁶, University of Copenhagen⁷, University of Trento⁸, Chinese Academy of Sciences⁹, University of California, San Francisco¹⁰, University of Pennsylvania¹¹, Pacific Northwest National Laboratory¹², North Carolina State University¹³, University of Montana¹⁴, Institute for Systems Biology¹⁵, Dalhousie University¹⁶, University of British Columbia¹⁷, Statens Serum Institut¹⁸, Anschutz Medical Campus¹⁹, University of Washington²⁰, University of California, San Diego²¹, Michigan State University²², Stanford University²³, Harvard University²⁴, Broad Institute²⁵, Australian National University²⁶, University of Düsseldorf²⁷, University of New South Wales²⁸, Sookmyung Women's University²⁹, San Diego State University³⁰, Howard Hughes Medical Institute³¹, Cornell University³², Max Planck Society³³, Colorado State University³⁴, Google³⁵, Syracuse University³⁶, Webster University³⁷, United States Department of Agriculture³⁸, University of Arkansas for Medical Sciences³⁹, Colorado School of Mines⁴⁰, National Oceanic and Atmospheric Administration⁴¹, University of Southern Mississippi⁴², University of California, Merced⁴³, Wageningen University and Research Centre⁴⁴, University of Arizona⁴⁵, Environment Agency⁴⁶, University of Florida⁴⁷, Merck & Co.⁴⁸

01 Sep 2019-Nature Biotechnology

TL;DR: An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Abstract: In the version of this article initially published, some reference citations were incorrect. The three references to Jupyter Notebooks should have cited Kluyver et al. instead of Gonzalez et al. The reference to Qiita should have cited Gonzalez et al. instead of Schloss et al. The reference to mothur should have cited Schloss et al. instead of McMurdie & Holmes. The reference to phyloseq should have cited McMurdie & Holmes instead of Huber et al. The reference to Bioconductor should have cited Huber et al. instead of Franzosa et al. And the reference to the biobakery suite should have cited Franzosa et al. instead of Kluyver et al. The errors have been corrected in the HTML and PDF versions of the article.

301 citations

Journal Article•DOI•

mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking

Nicholas A. Bokulich¹, Jai Ram Rideout¹, William G. Mercurio¹, Arron Shiffer¹, Benjamin E. Wolfe², Corinne F. Maurice³, Rachel J. Dutton⁴, Peter J. Turnbaugh⁵, Rob Knight⁴, J. Gregory Caporaso¹

Institutions (5):

Northern Arizona University¹, Tufts University², McGill University³, University of California, San Diego⁴, University of California, San Francisco⁵

25 Oct 2016

TL;DR: This work presents mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, and describes how the resource is intended to expand and evolve to meet the changing needs of the omics community.

Abstract: Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

83 citations

Journal Article•DOI•

ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses

Jennifer Fouquier¹, Jai Ram Rideout², Evan Bolyen², John Chase², Arron Shiffer², Daniel McDonald³, Rob Knight⁴, J. Gregory Caporaso², Scott T. Kelley¹

Institutions (4):

San Diego State University¹, Northern Arizona University², Institute for Systems Biology³, University of California, San Diego⁴

24 Feb 2016-Microbiome

TL;DR: The Silva/UNITE-based ghost tree presented here can be easily integrated into existing fungal analysis pipelines to enhance the resolution of fungal community differences and improve understanding of these communities in built environments.

Abstract: Fungi play critical roles in many ecosystems, cause serious diseases in plants and animals, and pose significant threats to human health and structural integrity problems in built environments. While most fungal diversity remains unknown, the development of PCR primers for the internal transcribed spacer (ITS) combined with next-generation sequencing has substantially improved our ability to profile fungal microbial diversity. Although the high sequence variability in the ITS region facilitates more accurate species identification, it also makes multiple sequence alignment and phylogenetic analysis unreliable across evolutionarily distant fungi because the sequences are hard to align accurately. To address this issue, we created ghost-tree, a bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach starts with a “foundation” phylogeny based on one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families). Then, “extension” phylogenies are built for more closely related organisms (e.g., fungal species or strains) using a second more rapidly evolving genetic marker. These smaller phylogenies are then grafted onto the foundation tree by mapping taxonomic names such that each corresponding foundation-tree tip would branch into its new “extension tree” child. We applied ghost-tree to graft fungal extension phylogenies derived from ITS sequences onto a foundation phylogeny derived from fungal 18S sequences. Our analysis of simulated and real fungal ITS data sets found that phylogenetic distances between fungal communities computed using ghost-tree phylogenies explained significantly more variance than non-phylogenetic distances. 
The phylogenetic metrics also improved our ability to distinguish small differences (effect sizes) between microbial communities, though results were similar to non-phylogenetic methods for larger effect sizes. The Silva/UNITE-based ghost tree presented here can be easily integrated into existing fungal analysis pipelines to enhance the resolution of fungal community differences and improve understanding of these communities in built environments. The ghost-tree software package can also be used to develop phylogenetic trees for other marker gene sets that afford different taxonomic resolution, or for bridging genome trees with amplicon trees. ghost-tree is pip-installable. All source code, documentation, and test code are available under the BSD license at https://github.com/JTFouquier/ghost-tree .
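The grafting step described in the abstract can be illustrated with a minimal sketch. This is not the ghost-tree package's API: the nested-tuple tree representation, the `graft` and `to_newick` helpers, and the example taxa are all hypothetical, and branch lengths are ignored. It only shows the idea of replacing each foundation-tree tip with the extension tree that shares its taxonomic name.

```python
# Toy illustration of grafting extension trees onto a foundation tree.
# Hypothetical data structures and names; not the ghost-tree package's API.
from typing import Dict, Union

Tree = Union[str, tuple]  # a tip name, or a tuple of child subtrees


def graft(foundation: Tree, extensions: Dict[str, Tree]) -> Tree:
    """Replace each foundation tip that has a matching extension tree."""
    if isinstance(foundation, str):
        # Tip: graft the extension tree with the same name, else keep the tip.
        return extensions.get(foundation, foundation)
    # Internal node: recurse into every child.
    return tuple(graft(child, extensions) for child in foundation)


def to_newick(tree: Tree) -> str:
    """Serialize the nested-tuple tree as a Newick string without lengths."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Foundation tree: tips are higher-level groups (made-up example, genus level).
foundation = (("Aspergillus", "Penicillium"), "Candida")

# Extension trees: finer-resolution trees keyed by the group they extend.
extensions = {
    "Aspergillus": ("A_niger", "A_flavus"),
    "Candida": (("C_albicans", "C_glabrata"), "C_auris"),
}

hybrid = graft(foundation, extensions)
print(to_newick(hybrid) + ";")
```

Tips without a matching extension tree (here `Penicillium`) are left unchanged, which is one possible design choice; the abstract does not say how unmatched tips are handled.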

52 citations

Cited by

Journal Article•DOI•

IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era.

Bui Quang Minh¹, Heiko A. Schmidt², Olga Chernomor², Dominik Schrempf², Dominik Schrempf³, Michael D. Woodhams⁴, Arndt von Haeseler², Arndt von Haeseler⁵, Robert Lanfear¹

Institutions (5):

Australian National University¹, Medical University of Vienna², Eötvös Loránd University³, University of Tasmania⁴, University of Vienna⁵

01 May 2020-Molecular Biology and Evolution

TL;DR: Some notable features of IQ-TREE version 2 are described and the key advantages over other software are highlighted.

Abstract: IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.

4,337 citations

Journal Article•DOI•

Evolution of Protein Molecules

S. Jeffery

01 Apr 1979-Biochemical Society Transactions

3,734 citations

Journal Article•DOI•

Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin

Nicholas A. Bokulich¹, Benjamin D. Kaehler², Jai Ram Rideout¹, Matthew R. Dillon¹, Evan Bolyen¹, Rob Knight³, Gavin A. Huttley², J. Gregory Caporaso¹

Institutions (3):

Northern Arizona University¹, Australian National University², University of California, San Diego³

17 May 2018-Microbiome

TL;DR: The results illustrate the importance of parameter tuning for optimizing classifier performance, and the authors make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions.

Abstract: Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated “novel” marker-gene sequences, are available in our extensible benchmarking framework, tax-credit ( https://github.com/caporaso-lab/tax-credit-data ). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.

2,475 citations

Journal Article•

FastTree: Computing Large Minimum-Evolution Trees with Profiles instead of a Distance Matrix

Morgan N. Price, Paramvir S. Dehal, Adam P. Arkin

18 Jun 2009-Lawrence Berkeley National Laboratory

TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, uses heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.

Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
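As a rough check on the O(N^2) space figure, the size of a pairwise distance matrix for the abstract's 16S example can be estimated in a few lines. The storage layout is an assumption, not something the abstract states: this sketch assumes one 4-byte float per distance and that only the N(N-1)/2 unordered pairs are kept.

```python
# Back-of-the-envelope size of a pairwise distance matrix.
# Assumptions (not from the abstract): 4 bytes per distance, and only the
# N*(N-1)/2 unordered pairs are stored.
n_sequences = 158_022      # number of distinct 16S sequences in the abstract
bytes_per_distance = 4     # assumed single-precision float

n_pairs = n_sequences * (n_sequences - 1) // 2
matrix_gb = n_pairs * bytes_per_distance / 1e9  # decimal gigabytes

print(f"pairs: {n_pairs:,}")
print(f"distance matrix: {matrix_gb:.1f} GB")
```

Under these assumptions the script reports 12,485,397,231 pairs and about 49.9 GB, which is in line with the 50 gigabytes quoted in the abstract.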

2,436 citations

Machine learning with Python

Pedro Ferreira, Christopher L. Simons

25 Apr 2017

TL;DR: This presentation is a case study taken from the travel and holiday industry and describes the effectiveness of various techniques as well as the performance of Python-based libraries such as Python Data Analysis Library (Pandas), and Scikit-learn (built on NumPy, SciPy and matplotlib).

Abstract: This presentation is a case study taken from the travel and holiday industry. Paxport/Multicom, based in the UK and Sweden, have recently adopted a recommendation system for holiday accommodation bookings. Machine learning techniques such as Collaborative Filtering have been applied using Python (3.5.1), with Jupyter (4.0.6) as the main framework. Data scale and sparsity present significant challenges in the case study, and so the effectiveness of various techniques is described, as well as the performance of Python-based libraries such as Python Data Analysis Library (Pandas) and Scikit-learn (built on NumPy, SciPy and matplotlib). The presentation is suitable for all levels of programmers.

1,338 citations
