Author

Shanlin Liu

Other affiliations: University of Copenhagen
Bio: Shanlin Liu is an academic researcher from China Agricultural University. The author has contributed to research in topics: DNA barcoding & Population. The author has an h-index of 20 and has co-authored 49 publications receiving 1647 citations. Previous affiliations of Shanlin Liu include the University of Copenhagen.

Papers
Journal Article
TL;DR: NextPolish is a tool that efficiently corrects sequence errors in genomes assembled with long reads. It consists of two interlinked modules designed to score and count K-mers from high-quality short reads and to polish genome assemblies containing large numbers of base errors.
Abstract: MOTIVATION: Although long-read sequencing technologies can produce genomes with long contiguity, they suffer from high error rates. Thus, we developed NextPolish, a tool that efficiently corrects sequence errors in genomes assembled with long reads. This new tool consists of two interlinked modules that are designed to score and count K-mers from high-quality short reads, and to polish genome assemblies containing large numbers of base errors. RESULTS: When evaluated for speed and efficiency using human and plant (Arabidopsis thaliana) genomes, NextPolish outperformed Pilon by correcting sequence errors faster and with higher correction accuracy. AVAILABILITY AND IMPLEMENTATION: NextPolish is implemented in C and Python. The source code is available from https://github.com/Nextomics/NextPolish. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

383 citations

Journal Article
TL;DR: The observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA.
Abstract: Ancient DNA research has been revolutionized following development of next-generation sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output is comparable with the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform on DNA extracted from 8 historic and ancient dog and wolf samples. The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (P = 0.371) and average sequence length (P = 0.718) of endogenous nuclear DNA, sequence GC content (P = 0.311), double-stranded DNA damage rate (P = 0.309), and sequence clonality (P = 0.093). Small significant differences were found in single-strand DNA damage rate (δS; slightly lower for the BGISEQ-500, P = 0.011) and the background rate of difference from the reference genome (θ; slightly higher for BGISEQ-500, P = 0.012). This may result from the differences in amplification cycles used to polymerase chain reaction-amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (P = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from 3 of the samples with overall very low levels of endogenous DNA. Although we acknowledge that our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA.

282 citations

Journal Article
TL;DR: It is demonstrated that the most recent common ancestor of Lepidoptera is considerably older than previously hypothesized, and it is shown that multiple lineages of moths independently evolved hearing organs well before the origin of bats, rejecting the hypothesis that lepidopteran hearing organs arose in response to these predators.
Abstract: Butterflies and moths (Lepidoptera) are one of the major superradiations of insects, comprising nearly 160,000 described extant species. As herbivores, pollinators, and prey, Lepidoptera play a fundamental role in almost every terrestrial ecosystem. Lepidoptera are also indicators of environmental change and serve as models for research on mimicry and genetics. They have been central to the development of coevolutionary hypotheses, such as butterflies with flowering plants and moths' evolutionary arms race with echolocating bats. However, these hypotheses have not been rigorously tested, because a robust lepidopteran phylogeny and timing of evolutionary novelties are lacking. To address these issues, we inferred a comprehensive phylogeny of Lepidoptera, using the largest dataset assembled for the order (2,098 orthologous protein-coding genes from transcriptomes of 186 species, representing nearly all superfamilies), and dated it with carefully evaluated synapomorphy-based fossils. The oldest members of the Lepidoptera crown group appeared in the Late Carboniferous (∼300 Ma) and fed on nonvascular land plants. Lepidoptera evolved the tube-like proboscis in the Middle Triassic (∼241 Ma), which allowed them to acquire nectar from flowering plants. This morphological innovation, along with other traits, likely promoted the extraordinary diversification of superfamily-level lepidopteran crown groups. The ancestor of butterflies was likely nocturnal, and our results indicate that butterflies became day-flying in the Late Cretaceous (∼98 Ma). Moth hearing organs arose multiple times before the evolutionary arms race between moths and bats, perhaps initially detecting a wide range of sound frequencies before being co-opted to specifically detect bat sonar. Our study provides an essential framework for future comparative studies on butterfly and moth evolution.

236 citations

Journal Article
17 Feb 2021-Nature
TL;DR: In this paper, the authors report the recovery of genome-wide data from three mammoth specimens dating to the Early and Middle Pleistocene subepochs, two of which are more than one million years old.
Abstract: Temporal genomic data hold great potential for studying evolutionary processes such as speciation. However, sampling across speciation events would, in many cases, require genomic time series that stretch well back into the Early Pleistocene subepoch. Although theoretical models suggest that DNA should survive on this timescale, the oldest genomic data recovered so far are from a horse specimen dated to 780–560 thousand years ago. Here we report the recovery of genome-wide data from three mammoth specimens dating to the Early and Middle Pleistocene subepochs, two of which are more than one million years old. We find that two distinct mammoth lineages were present in eastern Siberia during the Early Pleistocene. One of these lineages gave rise to the woolly mammoth and the other represents a previously unrecognized lineage that was ancestral to the first mammoths to colonize North America. Our analyses reveal that the Columbian mammoth of North America traces its ancestry to a Middle Pleistocene hybridization between these two lineages, with roughly equal admixture proportions. Finally, we show that the majority of protein-coding changes associated with cold adaptation in woolly mammoths were already present one million years ago. These findings highlight the potential of deep-time palaeogenomics to expand our understanding of speciation and long-term adaptive evolution. Siberian mammoth genomes from the Early and Middle Pleistocene subepochs reveal adaptive changes and a key hybridization event, highlighting the value of deep-time palaeogenomics for studies of speciation and long-term evolutionary trends.

127 citations


Cited by
Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

01 Jan 2016
TL;DR: The modern applied statistics with s is universally compatible with any devices to read, and is available in the digital library an online access to it is set as public so you can download it instantly.
Abstract: Thank you very much for downloading modern applied statistics with s. As you may know, people have search hundreds times for their favorite readings like this modern applied statistics with s, but end up in harmful downloads. Rather than reading a good book with a cup of coffee in the afternoon, instead they cope with some harmful virus inside their laptop. modern applied statistics with s is available in our digital library an online access to it is set as public so you can download it instantly. Our digital library saves in multiple countries, allowing you to get the most less latency time to download any of our books like this one. Kindly say, the modern applied statistics with s is universally compatible with any devices to read.

5,249 citations

Journal Article
TL;DR: Efforts have been put to improve efficiency, flexibility, support for 'big data' (R's long vectors), ease of use and quality check before a new release of ape.
Abstract: Summary: After more than fifteen years of existence, the R package ape has continuously grown its contents, and has been used by a growing community of users. The release of version 5.0 has marked a leap towards a modern software for evolutionary analyses. Efforts have been put to improve efficiency, flexibility, support for 'big data' (R's long vectors), ease of use and quality check before a new release. These changes will hopefully make ape a useful software for the study of biodiversity and evolution in a context of increasing data quantity. Availability and implementation: ape is distributed through the Comprehensive R Archive Network: http://cran.r-project.org/package=ape. Further information may be found at http://ape-package.ird.fr/.

4,303 citations

Journal Article

3,734 citations

01 Jan 2011
TL;DR: The sheer volume and scope of genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations