Author

Yiyuan Li

Other affiliations: University of Texas at Austin
Bio: Yiyuan Li is an academic researcher from the University of Notre Dame. The author has contributed to research on the topics Environmental DNA and Genome. The author has an h-index of 20 and has co-authored 28 publications receiving 3415 citations. Previous affiliations of Yiyuan Li include the University of Texas at Austin.

Papers
Journal Article
Bernhard Misof, Shanlin Liu, Karen Meusemann, Ralph S. Peters, Alexander Donath, Christoph Mayer, Paul B. Frandsen, Jessica L. Ware, Tomas Flouri, Rolf G. Beutel, Oliver Niehuis, Malte Petersen, Fernando Izquierdo-Carrasco, Torsten Wappler, Jes Rust, Andre J. Aberer, Ulrike Aspöck, Horst Aspöck, Daniela Bartel, Alexander Blanke, Simon Berger, Alexander Böhm, Thomas R. Buckley, Brett Calcott, Junqing Chen, Frank Friedrich, Makiko Fukui, Mari Fujita, Carola Greve, Peter Grobe, Shengchang Gu, Ying Huang, Lars S. Jermiin, Akito Y. Kawahara, Lars Krogmann, Martin Kubiak, Robert Lanfear, Harald Letsch, Yiyuan Li, Zhenyu Li, Jiguang Li, Haorong Lu, Ryuichiro Machida, Yuta Mashimo, Pashalia Kapli, Duane D. McKenna, Guanliang Meng, Yasutaka Nakagaki, José Luis Navarrete-Heredia, Michael Ott, Yanxiang Ou, Günther Pass, Lars Podsiadlowski, Hans Pohl, Björn M. von Reumont, Kai Schütte, Kaoru Sekiya, Shota Shimizu, Adam Slipinski, Alexandros Stamatakis, Wenhui Song, Xu Su, Nikolaus U. Szucsich, Meihua Tan, Xuemei Tan, Min Tang, Jingbo Tang, Gerald Timelthaler, Shigekazu Tomizuka, Michelle D. Trautwein, Xiaoli Tong, Toshiki Uchifune, Manfred Walzl, Brian M. Wiegmann, Jeanne Wilbrandt, Benjamin Wipfler, Thomas K. F. Wong, Qiong Wu, Gengxiong Wu, Yinlong Xie, Shenzhou Yang, Qing Yang, David K. Yeates, Kazunori Yoshizawa, Qing Zhang, Rui Zhang, Wenwei Zhang, Yunhui Zhang, Jing Zhao, Chengran Zhou, Lili Zhou, Tanja Ziesmann, Shijie Zou, Yingrui Li, Xun Xu, Yong Zhang, Huanming Yang, Jian Wang, Jun Wang, Karl M. Kjer, Xin Zhou
07 Nov 2014, Science
TL;DR: The phylogeny of all major insect lineages reveals how and when insects diversified and provides a comprehensive reliable scaffold for future comparative analyses of evolutionary innovations among insects.
Abstract: Insects are the most speciose group of animals, but the phylogenetic relationships of many major lineages remain unresolved. We inferred the phylogeny of insects from 1478 protein-coding genes. Phylogenomic analyses of nucleotide and amino acid sequences, with site-specific nucleotide or domain-specific amino acid substitution models, produced statistically robust and congruent results resolving previously controversial phylogenetic relationships. We dated the origin of insects to the Early Ordovician [~479 million years ago (Ma)], of insect flight to the Early Devonian (~406 Ma), of major extant lineages to the Mississippian (~345 Ma), and the major diversification of holometabolous insects to the Early Cretaceous. Our phylogenomic study provides a comprehensive reliable scaffold for future comparative analyses of evolutionary innovations among insects.

1,998 citations

Journal Article
TL;DR: A mitogenome toolkit, MitoZ, is developed, consisting of independent modules of de novo assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, that can generate a mitogenome assembly together with annotation and visualization results from HTS raw reads.
Abstract: Mitochondrial genome (mitogenome) plays important roles in evolutionary and ecological studies. It becomes routine to utilize multiple genes on mitogenome or the entire mitogenomes to investigate phylogeny and biodiversity of focal groups with the onset of High Throughput Sequencing (HTS) technologies. We developed a mitogenome toolkit MitoZ, consisting of independent modules of de novo assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, that can generate mitogenome assembly together with annotation and visualization results from HTS raw reads. We evaluated its performance using a total of 50 samples of which mitogenomes are publicly available. The results showed that MitoZ can recover more full-length mitogenomes with higher accuracy compared to the other available mitogenome assemblers. Overall, MitoZ provides a one-click solution to construct the annotated mitogenome from HTS raw data and will facilitate large scale ecological and evolutionary studies. MitoZ is free open source software distributed under GPLv3 license and available at https://github.com/linzhi2013/MitoZ.

465 citations

Journal Article
TL;DR: The results illustrate the potential for eDNA sampling and metabarcoding approaches to improve quantification of aquatic species diversity in natural environments and point the way towards using eDNA metabarcoding as an index of macrofaunal species abundance.
Abstract: Freshwater fauna are particularly sensitive to environmental change and disturbance. Management agencies frequently use fish and amphibian biodiversity as indicators of ecosystem health and a way to prioritize and assess management strategies. Traditional aquatic bioassessment that relies on capture of organisms via nets, traps and electrofishing gear typically has low detection probabilities for rare species and can injure individuals of protected species. Our objective was to determine whether environmental DNA (eDNA) sampling and metabarcoding analysis can be used to accurately measure species diversity in aquatic assemblages with differing structures. We manipulated the density and relative abundance of eight fish and one amphibian species in replicated 206-L mesocosms. Environmental DNA was filtered from water samples, and six mitochondrial gene fragments were Illumina-sequenced to measure species diversity in each mesocosm. Metabarcoding detected all nine species in all treatment replicates. Additionally, we found a modest, but positive relationship between species abundance and sequencing read abundance. Our results illustrate the potential for eDNA sampling and metabarcoding approaches to improve quantification of aquatic species diversity in natural environments and point the way towards using eDNA metabarcoding as an index of macrofaunal species abundance.

312 citations

Journal Article
TL;DR: The ability of the new Illumina PCR-free pipeline for DNA metabarcoding to detect small arthropod specimens and its tendency to avoid most, if not all, false positives suggests its great potential in biodiversity-related surveillance, such as in biomonitoring programs.
Abstract: Next-generation-sequencing (NGS) technologies combined with a classic DNA barcoding approach have enabled fast and credible measurement for biodiversity of mixed environmental samples. However, the PCR amplification involved in nearly all existing NGS protocols inevitably introduces taxonomic biases. In the present study, we developed new Illumina pipelines without PCR amplifications to analyze terrestrial arthropod communities. Mitochondrial enrichment directly followed by Illumina shotgun sequencing, at an ultra-high sequence volume, enabled the recovery of Cytochrome c Oxidase subunit 1 (COI) barcode sequences, which allowed for the estimation of species composition at high fidelity for a terrestrial insect community. With 15.5 Gbp Illumina data, approximately 97% and 92% were detected out of the 37 input Operational Taxonomic Units (OTUs), whether the reference barcode library was used or not, respectively, while only 1 novel OTU was found for the latter. Additionally, relatively strong correlation between the sequencing volume and the total biomass was observed for species from the bulk sample, suggesting a potential solution to reveal relative abundance. The ability of the new Illumina PCR-free pipeline for DNA metabarcoding to detect small arthropod specimens and its tendency to avoid most, if not all, false positives suggests its great potential in biodiversity-related surveillance, such as in biomonitoring programs. However, further improvement for mitochondrial enrichment is likely needed for the application of the new pipeline in analyzing arthropod communities at higher diversity.

247 citations

Journal Article
TL;DR: A novel multiplex sequencing and assembly pipeline allowing for simultaneous acquisition of full mitogenomes from pooled animals without DNA enrichment or amplification is developed and demonstrates the plausibility of a multi-locus mito-metagenomics approach as the next phase of the current single-locus metabarcoding method.
Abstract: The advent in high-throughput-sequencing (HTS) technologies has revolutionized conventional biodiversity research by enabling parallel capture of DNA sequences possessing species-level diagnosis. However, polymerase chain reaction (PCR)-based implementation is biased by the efficiency of primer binding across lineages of organisms. A PCR-free HTS approach will alleviate this artefact and significantly improve upon the multi-locus method utilizing full mitogenomes. Here we developed a novel multiplex sequencing and assembly pipeline allowing for simultaneous acquisition of full mitogenomes from pooled animals without DNA enrichment or amplification. By concatenating assemblies from three de novo assemblers, we obtained high-quality mitogenomes for all 49 pooled taxa, with 36 species >15 kb and the remaining >10 kb, including 20 complete mitogenomes and nearly all protein coding genes (99.6%). The assembly quality was carefully validated with Sanger sequences, reference genomes and conservativeness of protein coding genes across taxa. The new method was effective even for closely related taxa, e.g. three Drosophila spp., demonstrating its broad utility for biodiversity research and mito-phylogenomics. Finally, the in silico simulation showed that by recruiting multiple mito-loci, taxon detection was improved at a fixed sequencing depth. Combined, these results demonstrate the plausibility of a multi-locus mito-metagenomics approach as the next phase of the current single-locus metabarcoding method.

240 citations


Cited by
Journal Article
Abstract: Preface to the Princeton Landmarks in Biology Edition; Preface; Symbols Used; 1. The Importance of Islands; 2. Area and Number of Species; 3. Further Explanations of the Area-Diversity Pattern; 4. The Strategy of Colonization; 5. Invasibility and the Variable Niche; 6. Stepping Stones and Biotic Exchange; 7. Evolutionary Changes Following Colonization; 8. Prospect; Glossary; References; Index

14,171 citations

Journal Article
TL;DR: PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses that includes the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, and new output formats to facilitate interoperability with downstream software.
Abstract: PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses. PartitionFinder 2 is substantially faster and more efficient than version 1, and incorporates many new methods and features. These include the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, new output formats to facilitate interoperability with downstream software, and many new models of molecular evolution. PartitionFinder 2 is freely available under an open source license and works on Windows, OSX, and Linux operating systems. It can be downloaded from www.robertlanfear.com/partitionfinder. The source code is available at https://github.com/brettc/partitionfinder.

3,445 citations

Journal Article
TL;DR: RAxML-NG is presented, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML, which offers improved accuracy, flexibility, speed, scalability, and usability compared with RAxML/ExaML.
Abstract: Motivation: Phylogenies are important for fundamental biological research, but also have numerous applications in biotechnology, agriculture and medicine. Finding the optimal tree under the popular maximum likelihood (ML) criterion is known to be NP-hard. Thus, highly optimized and scalable codes are needed to analyze constantly growing empirical datasets. Results: We present RAxML-NG, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML. RAxML-NG offers improved accuracy, flexibility, speed, scalability, and usability compared with RAxML/ExaML. On taxon-rich datasets, RAxML-NG typically finds higher-scoring trees than IQ-Tree, an increasingly popular recent tool for ML-based phylogenetic inference (although IQ-Tree shows better stability). Finally, RAxML-NG introduces several new features, such as the detection of terraces in tree space and the recently introduced transfer bootstrap support metric. Availability and implementation: The code is available under GNU GPL at https://github.com/amkozlov/raxml-ng. RAxML-NG web service (maintained by Vital-IT) is available at https://raxml-ng.vital-it.ch/. Supplementary information: Supplementary data are available at Bioinformatics online.

1,765 citations

Journal Article
TL;DR: This work presents BUSCO v3 with example analyses that highlight the wide‐ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.
Abstract: Genomics promises comprehensive surveying of genomes and metagenomes, but rapidly changing technologies and expanding data volumes make evaluation of completeness a challenging task. Technical sequencing quality metrics can be complemented by quantifying completeness of genomic data sets in terms of the expected gene content of Benchmarking Universal Single-Copy Orthologs (BUSCO, http://busco.ezlab.org). The latest software release implements a complete refactoring of the code to make it more flexible and extendable to facilitate high-throughput assessments. The original six lineage assessment data sets have been updated with improved species sampling, 34 new subsets have been built for vertebrates, arthropods, fungi, and prokaryotes that greatly enhance resolution, and data sets are now also available for nematodes, protists, and plants. Here, we present BUSCO v3 with example analyses that highlight the wide-ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.

1,575 citations

Journal Article
TL;DR: GetOrganelle reassembles the circular plastomes from 47 of 50 published plant datasets, and its assemblies are more accurate than published and/or NOVOPlasty-reassembled plastomes as assessed by mapping.
Abstract: GetOrganelle is a state-of-the-art toolkit to accurately assemble organelle genomes from whole genome sequencing data. It recruits organelle-associated reads using a modified “baiting and iterative mapping” approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published plant datasets, we are able to reassemble the circular plastomes from 47 datasets using GetOrganelle. GetOrganelle assemblies are more accurate than published and/or NOVOPlasty-reassembled plastomes as assessed by mapping. We also assemble complete mitochondrial genomes using GetOrganelle. GetOrganelle is freely released under a GPL-3 license ( https://github.com/Kinggerm/GetOrganelle ).

1,160 citations