Author

Evguenia Kopylova

Other affiliations: University of Colorado Boulder, French Institute for Research in Computer Science and Automation, McMaster University ...

Bio: Evguenia Kopylova is an academic researcher from the University of California, San Diego, who has contributed to research on the topics Ribosomal RNA and Genome. The author has an h-index of 12 and has co-authored 17 publications receiving 3951 citations. Previous affiliations of Evguenia Kopylova include University of Colorado Boulder and the French Institute for Research in Computer Science and Automation.

Papers

Journal Article•DOI•

SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data.

Evguenia Kopylova¹, Laurent Noé¹, Hélène Touzet¹

Laboratoire d'Informatique Fondamentale de Lille¹

01 Dec 2012-Bioinformatics

TL;DR: SortMeRNA, a new software designed to rapidly filter rRNA fragments from metatranscriptomic data, is presented, capable of handling large sets of reads and sorting out all fragments matching to the rRNA database with high sensitivity and low running time.

Abstract: MOTIVATION: The application of Next-Generation Sequencing (NGS) technologies to RNAs directly extracted from a community of organisms yields a mixture of fragments characterizing both coding and non-coding types of RNAs. The tasks to distinguish among these and to further categorize the families of messenger RNAs and ribosomal RNAs is an important step for examining gene expression patterns of an interactive environment and the phylogenetic classification of the constituting species. RESULTS: We present SortMeRNA, a new software designed to rapidly filter ribosomal RNA fragments from metatranscriptomic data. It is capable of handling large sets of reads and sorting out all fragments matching to the rRNA database with high sensitivity and low running time. AVAILABILITY: http://bioinfo.lifl.fr/RNA/sortmerna CONTACT: evguenia.kopylova@lifl.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

1,868 citations

Journal Article•DOI•

A communal catalogue reveals Earth’s multiscale microbial diversity

Luke R. Thompson¹,²,³, Jon G. Sanders¹, Daniel McDonald¹, Amnon Amir¹, Joshua Ladau⁴, Kenneth J. Locey⁵, Robert J. Prill⁶, Anupriya Tripathi¹, Sean M. Gibbons⁷,⁸, Gail Ackermann¹, Jose A. Navas-Molina¹, Stefan Janssen¹, Evguenia Kopylova¹, Yoshiki Vázquez-Baeza¹, Antonio Gonzalez¹, James T. Morton¹, Siavash Mirarab¹, Zhenjiang Zech Xu¹, Lingjing Jiang¹, Mohamed F. Haroon⁹, Jad N. Kanbar¹, Qiyun Zhu¹, Se Jin Song¹, Tomasz Kosciolek¹, Nicholas A. Bokulich¹⁰, Joshua P Lefler¹, Colin J. Brislawn¹¹, Gregory Humphrey¹, Sarah M. Owens¹², Jarrad T. Hampton-Marcell¹³,¹², Donna Berg-Lyons¹⁴, Valerie J. McKenzie¹⁴, Noah Fierer¹⁵,¹⁴, Jed A. Fuhrman¹⁶, Aaron Clauset¹⁴, Rick Stevens¹²,¹⁷, Ashley Shade¹⁸, Katherine S. Pollard⁴, Kelly D. Goodwin³, Janet K. Jansson¹¹, Jack A. Gilbert¹⁷,¹², Rob Knight¹

University of California, San Diego¹, University of Southern Mississippi², Atlantic Oceanographic and Meteorological Laboratory³, University of California, San Francisco⁴, Indiana University⁵, IBM⁶, Massachusetts Institute of Technology⁷, Broad Institute⁸, Harvard University⁹, Northern Arizona University¹⁰, Pacific Northwest National Laboratory¹¹, Argonne National Laboratory¹², University of Illinois at Chicago¹³, University of Colorado Boulder¹⁴, Cooperative Institute for Research in Environmental Sciences¹⁵, University of Southern California¹⁶, University of Chicago¹⁷, Michigan State University¹⁸

01 Nov 2017-Nature

TL;DR: A meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project is presented, creating both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth’s microbial diversity.

Abstract: Our growing awareness of the microbial world’s importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth’s microbial diversity.

1,676 citations

Journal Article•DOI•

Deblur Rapidly Resolves Single-Nucleotide Community Sequence Patterns

Amnon Amir¹, Daniel McDonald¹, Jose A. Navas-Molina¹, Evguenia Kopylova¹, James T. Morton¹, Zhenjiang Zech Xu¹, Eric P. Kightley², Luke R. Thompson¹, Embriette R. Hyde¹, Antonio Gonzalez¹, Rob Knight¹

University of California, San Diego¹, University of Colorado Boulder²

21 Apr 2017

TL;DR: Deblur is a novel sub-operational-taxonomic-unit (sOTU) approach that uses error profiles to obtain putative error-free sequences from Illumina MiSeq and HiSeq sequencing platforms; it substantially reduces computational demands relative to similar sOTU methods and does so with similar or better sensitivity and specificity.

Abstract: High-throughput sequencing of 16S ribosomal RNA gene amplicons has facilitated understanding of complex microbial communities, but the inherent noise in PCR and DNA sequencing limits differentiation of closely related bacteria. Although many scientific questions can be addressed with broad taxonomic profiles, clinical, food safety, and some ecological applications require higher specificity. Here we introduce a novel sub-operational-taxonomic-unit (sOTU) approach, Deblur, that uses error profiles to obtain putative error-free sequences from Illumina MiSeq and HiSeq sequencing platforms. Deblur substantially reduces computational demands relative to similar sOTU methods and does so with similar or better sensitivity and specificity. Using simulations, mock mixtures, and real data sets, we detected closely related bacterial sequences with single nucleotide differences while removing false positives and maintaining stability in detection, suggesting that Deblur is limited only by read length and diversity within the amplicon sequences. Because Deblur operates on a per-sample level, it scales to modern data sets and meta-analyses. To highlight Deblur's ability to integrate data sets, we include an interactive exploration of its application to multiple distinct sequencing rounds of the American Gut Project. Deblur is open source under the Berkeley Software Distribution (BSD) license, easily installable, and downloadable from https://github.com/biocore/deblur. IMPORTANCE Deblur provides a rapid and sensitive means to assess ecological patterns driven by differentiation of closely related taxa. This algorithm provides a solution to the problem of identifying real ecological differences between taxa whose amplicons differ by a single base pair, is applicable in an automated fashion to large-scale sequencing data sets, and can integrate sequencing runs collected over time.

1,181 citations

Journal Article•DOI•

Microbiome analyses of blood and tissues suggest cancer diagnostic approach

Gregory D. Poore¹, Evguenia Kopylova¹, Qiyun Zhu¹, Carolina S. Carpenter¹, Serena Fraraccio¹, Stephen Wandro¹, Tomasz Kosciolek¹,², Stefan Janssen¹,³, Jessica L. Metcalf⁴, Se Jin Song¹, Jad N. Kanbar¹, Sandrine Miller-Montgomery¹, Robert K. Heaton¹, Rana R. McKay¹, Sandip Pravin Patel¹, Austin D. Swafford¹, Rob Knight

University of California, San Diego¹, Jagiellonian University², University of Giessen³, Colorado State University⁴

11 Mar 2020-Nature

TL;DR: Microbial nucleic acids are detected in samples of tissues and blood from more than 10,000 patients with cancer, and machine learning is used to show that these can be used to discriminate between and among different types of cancer, suggesting a new microbiome-based diagnostic approach.

Abstract: Systematic characterization of the cancer microbiome provides the opportunity to develop techniques that exploit non-human, microorganism-derived molecules in the diagnosis of a major human disease. Following recent demonstrations that some types of cancer show substantial microbial contributions [1–10], we re-examined whole-genome and whole-transcriptome sequencing studies in The Cancer Genome Atlas [11] (TCGA) of 33 types of cancer from treatment-naive patients (a total of 18,116 samples) for microbial reads, and found unique microbial signatures in tissue and blood within and between most major types of cancer. These TCGA blood signatures remained predictive when applied to patients with stage Ia–IIc cancer and cancers lacking any genomic alterations currently measured on two commercial-grade cell-free tumour DNA platforms, despite the use of very stringent decontamination analyses that discarded up to 92.3% of total sequence data. In addition, we could discriminate among samples from healthy, cancer-free individuals (n = 69) and those from patients with multiple types of cancer (prostate, lung, and melanoma; 100 samples in total) solely using plasma-derived, cell-free microbial nucleic acids. This potential microbiome-based oncology diagnostic tool warrants further exploration.

524 citations

Journal Article•DOI•

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0.

Francesco Asnicar¹, Andrew Maltez Thomas¹, Francesco Beghini¹, Claudia Mengoni¹, Serena Manara¹, Paolo Manghi¹, Qiyun Zhu², Mattia Bolzan¹, Fabio Cumbo¹, Uyen May², Jon G. Sanders³,², Moreno Zolfo¹, Evguenia Kopylova², Edoardo Pasolli⁴,¹, Rob Knight, Siavash Mirarab², Curtis Huttenhower⁵, Nicola Segata¹

University of Trento¹, University of California, San Diego², Cornell University³, University of Naples Federico II⁴, Harvard University⁵

19 May 2020-Nature Communications

TL;DR: PhyloPhlAn 3.0 can assign genomes from isolate sequencing or MAGs to species-level genome bins built from >230,000 publicly available sequences, and reconstructs strain-level phylogenies from among the closest species using clade-specific maximally informative markers.

Abstract: Microbial genomes are available at an ever-increasing pace, as cultivation and sequencing become cheaper and obtaining metagenome-assembled genomes (MAGs) becomes more effective. Phylogenetic placement methods to contextualize hundreds of thousands of genomes must thus be efficiently scalable and sensitive from closely related strains to divergent phyla. We present PhyloPhlAn 3.0, an accurate, rapid, and easy-to-use method for large-scale microbial genome characterization and phylogenetic analysis at multiple levels of resolution. PhyloPhlAn 3.0 can assign genomes from isolate sequencing or MAGs to species-level genome bins built from >230,000 publicly available sequences. For individual clades of interest, it reconstructs strain-level phylogenies from among the closest species using clade-specific maximally informative markers. At the other extreme of resolution, it scales to large phylogenies comprising >17,000 microbial species. Examples including Staphylococcus aureus isolates, gut metagenomes, and meta-analyses demonstrate the ability of PhyloPhlAn 3.0 to support genomic and metagenomic analyses.

277 citations

Cited by

Journal Article•DOI•

Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2

Evan Bolyen¹, Jai Ram Rideout¹, Matthew R. Dillon¹, Nicholas A. Bokulich¹, Christian C. Abnet², Gabriel A. Al-Ghalith³, Harriet Alexander⁴,⁵, Eric J. Alm⁶, Manimozhiyan Arumugam⁷, Francesco Asnicar⁸, Yang Bai⁹, Jordan E. Bisanz¹⁰, Kyle Bittinger¹¹, Asker Daniel Brejnrod⁷, Colin J. Brislawn¹², C. Titus Brown⁴, Benjamin J. Callahan¹³, Andrés Mauricio Caraballo-Rodríguez¹⁴, John Chase¹, Emily K. Cope¹, Ricardo Silva¹⁴, Christian Diener¹⁵, Pieter C. Dorrestein¹⁴, Gavin M. Douglas¹⁶, Daniel M. Durall¹⁷, Claire Duvallet⁶, Christian F. Edwardson, Madeleine Ernst¹⁸,¹⁴, Mehrbod Estaki¹⁷, Jennifer Fouquier¹⁹, Julia M. Gauglitz¹⁴, Sean M. Gibbons²⁰,¹⁵, Deanna L. Gibson¹⁷, Antonio Gonzalez¹⁴, Kestrel Gorlick¹, Jiarong Guo²¹, Benjamin Hillmann³, Susan Holmes²², Hannes Holste¹⁴, Curtis Huttenhower²³,²⁴, Gavin A. Huttley²⁵, Stefan Janssen²⁶, Alan K. Jarmusch¹⁴, Lingjing Jiang¹⁴, Benjamin D. Kaehler²⁷,²⁵, Kyo Bin Kang¹⁴,²⁸, Christopher R. Keefe¹, Paul Keim¹, Scott T. Kelley²⁹, Dan Knights³, Irina Koester¹⁴, Tomasz Kosciolek¹⁴, Jorden Kreps¹, Morgan G. I. Langille¹⁶, Joslynn S. Lee³⁰, Ruth E. Ley³¹,³², Yong-Xin Liu, Erikka Loftfield², Catherine A. Lozupone¹⁹, Massoud Maher¹⁴, Clarisse Marotz¹⁴, Bryan D Martin²⁰, Daniel McDonald¹⁴, Lauren J. McIver²³,²⁴, Alexey V. Melnik¹⁴, Jessica L. Metcalf³³, Sydney C. Morgan¹⁷, Jamie Morton¹⁴, Ahmad Turan Naimey¹, Jose A. Navas-Molina¹⁴,³⁴, Louis-Félix Nothias¹⁴, Stephanie B. Orchanian, Talima Pearson¹, Samuel L. Peoples³⁵,²⁰, Daniel Petras¹⁴, Mary L. Preuss³⁶, Elmar Pruesse¹⁹, Lasse Buur Rasmussen⁷, Adam R. Rivers³⁷, Michael S. Robeson³⁸, Patrick Rosenthal³⁶, Nicola Segata⁸, Michael Shaffer¹⁹, Arron Shiffer¹, Rashmi Sinha², Se Jin Song¹⁴, John R. Spear³⁹, Austin D. Swafford, Luke R. Thompson⁴⁰,⁴¹, Pedro J. Torres²⁹, Pauline Trinh²⁰, Anupriya Tripathi¹⁴, Peter J. Turnbaugh¹⁰, Sabah Ul-Hasan⁴², Justin J. J. van der Hooft⁴³, Fernando Vargas, Yoshiki Vázquez-Baeza¹⁴, Emily Vogtmann², Max von Hippel⁴⁴, William A. Walters³¹, Yunhu Wan², Mingxun Wang¹⁴, Jonathan Warren⁴⁵, Kyle C. Weber⁴⁶,³⁷, Charles H. D. Williamson¹, Amy D. Willis²⁰, Zhenjiang Zech Xu¹⁴, Jesse R. Zaneveld²⁰, Yilong Zhang⁴⁷, Qiyun Zhu¹⁴, Rob Knight¹⁴, J. Gregory Caporaso¹

Northern Arizona University¹, National Institutes of Health², University of Minnesota³, University of California, Davis⁴, Woods Hole Oceanographic Institution⁵, Massachusetts Institute of Technology⁶, University of Copenhagen⁷, University of Trento⁸, Chinese Academy of Sciences⁹, University of California, San Francisco¹⁰, University of Pennsylvania¹¹, Pacific Northwest National Laboratory¹², North Carolina State University¹³, University of California, San Diego¹⁴, Institute for Systems Biology¹⁵, Dalhousie University¹⁶, University of British Columbia¹⁷, Statens Serum Institut¹⁸, Anschutz Medical Campus¹⁹, University of Washington²⁰, Michigan State University²¹, Stanford University²², Broad Institute²³, Harvard University²⁴, Australian National University²⁵, University of Düsseldorf²⁶, University of New South Wales²⁷, Sookmyung Women's University²⁸, San Diego State University²⁹, Howard Hughes Medical Institute³⁰, Max Planck Society³¹, Cornell University³², Colorado State University³³, Google³⁴, Syracuse University³⁵, Webster University³⁶, United States Department of Agriculture³⁷, University of Arkansas for Medical Sciences³⁸, Colorado School of Mines³⁹, University of Southern Mississippi⁴⁰, National Oceanic and Atmospheric Administration⁴¹, University of California, Merced⁴², Wageningen University and Research Centre⁴³, University of Arizona⁴⁴, Environment Agency⁴⁵, University of Florida⁴⁶, Merck & Co.⁴⁷

01 Aug 2019-Nature Biotechnology

TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K.; partial support was also provided by grants NIH U54CA143925 and U54MD012388, among others.

Abstract: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K. Partial support was also provided by the following: grants NIH U54CA143925 (J.G.C. and T.P.) and U54MD012388 (J.G.C. and T.P.); grants from the Alfred P. Sloan Foundation (J.G.C. and R.K.); ERCSTG project MetaPG (N.S.); the Strategic Priority Research Program of the Chinese Academy of Sciences QYZDB-SSW-SMC021 (Y.B.); the Australian National Health and Medical Research Council APP1085372 (G.A.H., J.G.C., Von Bing Yap and R.K.); the Natural Sciences and Engineering Research Council (NSERC) to D.L.G.; and the State of Arizona Technology and Research Initiative Fund (TRIF), administered by the Arizona Board of Regents, through Northern Arizona University. All NCI coauthors were supported by the Intramural Research Program of the National Cancer Institute. S.M.G. and C. Diener were supported by the Washington Research Foundation Distinguished Investigator Award.

8,821 citations

Journal Article•DOI•

VSEARCH: a versatile open source tool for metagenomics

Torbjørn Rognes¹,², Tomas Flouri³,⁴, Ben Nichols⁵, Christopher Quince⁵,⁶, Frédéric Mahé⁷

University of Oslo¹, Oslo University Hospital², Heidelberg Institute for Theoretical Studies³, Karlsruhe Institute of Technology⁴, University of Glasgow⁵, University of Warwick⁶, Kaiserslautern University of Technology⁷

18 Oct 2016-PeerJ

TL;DR: VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging.

Abstract: Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences, a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results: VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion: VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.

5,850 citations

Journal Article•DOI•

Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin

Nicholas A. Bokulich¹, Benjamin D. Kaehler², Jai Ram Rideout¹, Matthew R. Dillon¹, Evan Bolyen¹, Rob Knight³, Gavin A. Huttley², J. Gregory Caporaso¹

Northern Arizona University¹, Australian National University², University of California, San Diego³

17 May 2018-Microbiome

TL;DR: The results illustrate the importance of parameter tuning for optimizing classifier performance, and recommendations are made regarding parameter choices for these classifiers under a range of standard operating conditions.

Abstract: Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated “novel” marker-gene sequences, are available in our extensible benchmarking framework, tax-credit ( https://github.com/caporaso-lab/tax-credit-data ). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.

2,475 citations

Journal Article•

FastTree: Computing Large Minimum-Evolution Trees with Profiles instead of a Distance Matrix

Morgan N. Price, Paramvir S. Dehal, Adam P. Arkin

18 Jun 2009-Lawrence Berkeley National Laboratory

TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, applies heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.

Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations

Journal Article•DOI•

Gut Microbiota Regulate Motor Deficits and Neuroinflammation in a Model of Parkinson’s Disease

Timothy R. Sampson¹, Justine W. Debelius², Taren Thron¹, Stefan Janssen², Gauri G. Shastri¹, Zehra Esra Ilhan³, Collin Challis¹, Catherine E. Schretter¹, Sandra Rocha⁴, Viviana Gradinaru¹, Marie-Françoise Chesselet⁵, Ali Keshavarzian⁶, Kathleen M. Shannon⁶, Rosa Krajmalnik-Brown³, Pernilla Wittung-Stafshede⁴, Rob Knight², Sarkis K. Mazmanian¹

California Institute of Technology¹, University of California, San Diego², Arizona State University³, Chalmers University of Technology⁴, University of California, Los Angeles⁵, Rush University Medical Center⁶

01 Dec 2016-Cell

TL;DR: It is reported herein that gut microbiota are required for motor deficits, microglia activation, and αSyn pathology, and suggested that alterations in the human microbiome represent a risk factor for PD.

2,142 citations
