Author

Pelin Yilmaz

Other affiliations: University of British Columbia, Jacobs University Bremen

Bio: Pelin Yilmaz is an academic researcher at the Max Planck Society. The author has contributed to research on the topics Metagenomics and Phylum. The author has an h-index of 27 and has co-authored 56 publications receiving 18,308 citations. Previous affiliations of Pelin Yilmaz include the University of British Columbia and Jacobs University Bremen.

Topics: Metagenomics, Phylum, Metadata, Phylogenetic tree, Ribosomal RNA

Papers published on a yearly basis (chart, 2010 to 2021)

Papers

Journal Article

The SILVA ribosomal RNA gene database project: improved data processing and web-based tools

Christian Quast¹, Elmar Pruesse², Pelin Yilmaz², Jan Gerken², Timmy Schweer², Pablo Yarza², Jörg Peplies², Frank Oliver Glöckner²

Max Planck Society¹, Jacobs University Bremen²

28 Nov 2012-Nucleic Acids Research

TL;DR: The extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.

Abstract: SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.

18,256 citations

Journal Article

The SILVA and "All-species Living Tree Project (LTP)" taxonomic frameworks.

Pelin Yilmaz¹, Laura Wegener Parfrey¹, Pablo Yarza², Jan E. Gerken², Elmar Pruesse², Christian Quast², Timmy Schweer², Jörg Peplies², Wolfgang Ludwig¹, Frank Oliver Glöckner²

University of British Columbia¹, Max Planck Society²

01 Jan 2014-Nucleic Acids Research

TL;DR: The improvements the SILVA taxonomy has undergone in the last 3 years are described, focusing on the curation process, the various resources used for curation and the comparison of the SILVA taxonomy with Greengenes and RDP-II taxonomies.

Abstract: SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive resource for up-to-date quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. SILVA provides a manually curated taxonomy for all three domains of life, based on representative phylogenetic trees for the small- and large-subunit rRNA genes. This article describes the improvements the SILVA taxonomy has undergone in the last 3 years. Specifically, we focus on the curation process, the various resources used for curation and the comparison of the SILVA taxonomy with Greengenes and RDP-II taxonomies. Our comparisons not only revealed a reasonable overlap between the taxa names, but also pointed to significant differences in both names and numbers of taxa between the three resources.

2,187 citations

Journal Article

Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences

Pablo Yarza¹, Pelin Yilmaz², Elmar Pruesse², Frank Oliver Glöckner³, Wolfgang Ludwig⁴, Karl-Heinz Schleifer⁴, William B. Whitman⁵, Jean Euzeby⁶, Rudolf Amann², Ramon Rosselló-Móra¹

University of the Balearic Islands¹, Max Planck Society², Jacobs University Bremen³, Technische Universität München⁴, University of Georgia⁵, École nationale vétérinaire de Toulouse⁶

01 Sep 2014-Nature Reviews Microbiology

TL;DR: This article proposes rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggests a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea.

Abstract: Publicly available sequence databases of the small subunit ribosomal RNA gene, also known as 16S rRNA in bacteria and archaea, are growing rapidly, and the number of entries currently exceeds 4 million. However, a unified classification and nomenclature framework for all bacteria and archaea does not yet exist. In this Analysis article, we propose rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggest a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea. Our analyses show that only nearly complete 16S rRNA sequences give accurate measures of taxonomic diversity. In addition, our analyses suggest that most of the 16S rRNA sequences of the high taxa will be discovered in environmental surveys by the end of the current decade.

1,755 citations

Journal Article

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

Robert M. Bowers¹, Nikos C. Kyrpides¹, Ramunas Stepanauskas², Miranda Harmon-Smith¹, Devin F. R. Doud¹, T. B. K. Reddy¹, Frederik Schulz¹, Jessica K. Jarett¹, Adam R. Rivers¹,³, Emiley A. Eloe-Fadrosh¹, Susannah G. Tringe¹,⁴, Natalia Ivanova¹, Alex Copeland¹, Alicia Clum¹, Eric D. Becraft², Rex R. Malmstrom¹, Bruce W. Birren⁵, Mircea Podar⁶, Peer Bork, George M. Weinstock, George M. Garrity⁷, Jeremy A. Dodsworth⁸, Shibu Yooseph⁹, Granger G. Sutton⁹, Frank Oliver Gloeckner¹⁰, Jack A. Gilbert¹¹, William C. Nelson¹², Steven J. Hallam¹³, Sean P. Jungbluth¹,¹⁴, Thijs J. G. Ettema¹⁵, Scott Tighe¹⁶, Konstantinos T. Konstantinidis¹⁷, Wen Tso Liu¹⁸, Brett J. Baker¹⁹, Thomas Rattei²⁰, Jonathan A. Eisen²¹, Brian P. Hedlund²², Katherine D. McMahon²³, Noah Fierer²⁴, Rob Knight²⁵, Robert D. Finn²⁶, Guy Cochrane²⁶, Ilene Karsch-Mizrachi²⁷, Gene W. Tyson²⁸, Christian Rinke²⁸, Alla Lapidus²⁹, Folker Meyer¹¹, Pelin Yilmaz¹⁰, Donovan H. Parks²⁸, A. M. Eren, Lynn M. Schriml, Jillian F. Banfield³⁰, Philip Hugenholtz²⁸, Tanja Woyke¹⁰

Joint Genome Institute¹, Bigelow Laboratory For Ocean Sciences², United States Department of Agriculture³, University of California, Merced⁴, Broad Institute⁵, Oak Ridge National Laboratory⁶, Michigan State University⁷, California State University, San Bernardino⁸, J. Craig Venter Institute⁹, Max Planck Society¹⁰, Argonne National Laboratory¹¹, Pacific Northwest National Laboratory¹², University of British Columbia¹³, University of Southern California¹⁴, Science for Life Laboratory¹⁵, University of Vermont¹⁶, Georgia Institute of Technology¹⁷, University of Illinois at Urbana–Champaign¹⁸, University of Texas at Austin¹⁹, University of Vienna²⁰, University of California, Davis²¹, University of Nevada, Las Vegas²², University of Wisconsin-Madison²³, Cooperative Institute for Research in Environmental Sciences²⁴, University of California, San Diego²⁵, European Bioinformatics Institute²⁶, National Institutes of Health²⁷, University of Queensland²⁸, Saint Petersburg State University²⁹, University of California, Berkeley³⁰

01 Jul 2018-Nature Biotechnology

TL;DR: Two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences are presented: the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), which cover, among other things, estimates of genome completeness and contamination.

Abstract: We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

1,171 citations

Journal Article

Minimum Information about a Biosynthetic Gene cluster.

Marnix H. Medema¹,², Renzo Kottmann¹, Pelin Yilmaz¹ and 161 more authors (84 institutions)

18 Aug 2015-Nature Chemical Biology

TL;DR: This work proposes the Minimum Information about a Biosynthetic Gene cluster (MIBiG) data standard, to facilitate consistent and systematic deposition and retrieval of data on biosynthetic gene clusters.

Abstract: A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploit. To facilitate consistent and systematic deposition and retrieval of data on biosynthetic gene clusters, we propose the Minimum Information about a Biosynthetic Gene cluster (MIBiG) data standard.

633 citations

Cited by

Journal Article

MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets

Sudhir Kumar¹, Glen Stecher², Koichiro Tamura³

King Abdulaziz University¹, Temple University², Tokyo Metropolitan University³

22 Mar 2016-Molecular Biology and Evolution

TL;DR: The latest version of the Molecular Evolutionary Genetics Analysis (Mega) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine, has been optimized for use on 64-bit computing systems for analyzing larger datasets.

Abstract: We present the latest version of the Molecular Evolutionary Genetics Analysis (Mega) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, Mega has been optimized for use on 64-bit computing systems for analyzing larger datasets. Researchers can now explore and analyze tens of thousands of sequences in Mega. The new version also provides an advanced wizard for building timetrees and includes a new functionality to automatically predict gene duplication events in gene family trees. The 64-bit Mega is made available in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows application that can also be used on Mac OS X. The command line Mega is available as native applications for Windows, Linux, and Mac OS X. They are intended for use in high-throughput and scripted analysis. Both versions are available from www.megasoftware.net free of charge.

33,048 citations

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that is shown to improve on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article

VSEARCH: a versatile open source tool for metagenomics

Torbjørn Rognes¹,², Tomas Flouri³,⁴, Ben Nichols⁵, Christopher Quince⁵,⁶, Frédéric Mahé⁷

Oslo University Hospital¹, University of Oslo², Heidelberg Institute for Theoretical Studies³, Karlsruhe Institute of Technology⁴, University of Glasgow⁵, University of Warwick⁶, Kaiserslautern University of Technology⁷

18 Oct 2016-PeerJ

TL;DR: VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging and dereplication.

Abstract: Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences, a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results: VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. 
VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion: VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.

5,850 citations

Journal Article

Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies.

Seok Hwan Yoon¹, Sung-Min Ha¹, Soonjae Kwon¹, Jeongmin Lim¹, Yeseul Kim¹, Hyungseok Seo¹, Jongsik Chun¹

Seoul National University¹

30 May 2017-International Journal of Systematic and Evolutionary Microbiology

TL;DR: An integrated database, called EzBioCloud, that holds the taxonomic hierarchy of the Bacteria and Archaea, which is represented by quality-controlled 16S rRNA gene and genome sequences, with accompanying bioinformatics tools.

Abstract: The recent advent of DNA sequencing technologies facilitates the use of genome sequencing data that provide means for more informative and precise classification and identification of members of the Bacteria and Archaea. Because the current species definition is based on the comparison of genome sequences between type and other strains in a given species, building a genome database with correct taxonomic information is of paramount need to enhance our efforts in exploring prokaryotic diversity and discovering novel species as well as for routine identifications. Here we introduce an integrated database, called EzBioCloud, that holds the taxonomic hierarchy of the Bacteria and Archaea, which is represented by quality-controlled 16S rRNA gene and genome sequences. Whole-genome assemblies in the NCBI Assembly Database were screened for low quality and subjected to a composite identification bioinformatics pipeline that employs gene-based searches followed by the calculation of average nucleotide identity. As a result, the database is made of 61 700 species/phylotypes, including 13 132 with validly published names, and 62 362 whole-genome assemblies that were identified taxonomically at the genus, species and subspecies levels. Genomic properties, such as genome size and DNA G+C content, and the occurrence in human microbiome data were calculated for each genus or higher taxa. This united database of taxonomy, 16S rRNA gene and genome sequences, with accompanying bioinformatics tools, should accelerate genome-based classification and identification of members of the Bacteria and Archaea. The database and related search tools are available at www.ezbiocloud.net/.

5,027 citations
