Journal ArticleDOI

PRICE: software for the targeted assembly of components of (Meta) genomic sequence data.

01 May 2013-G3: Genes, Genomes, Genetics (Genetics Society of America)-Vol. 3, Iss: 5, pp 865-880
TL;DR: The assembly strategy implemented by PRICE is described, and examples are provided of its application to the sequence of particular genes, transcripts, and virus genomes from complex multicomponent datasets, including an assembly of the BCBL-1 strain of Kaposi’s sarcoma-associated herpesvirus.
Abstract: Low-cost DNA sequencing technologies have expanded the role for direct nucleic acid sequencing in the analysis of genomes, transcriptomes, and the metagenomes of whole ecosystems. Human and machine comprehension of such large datasets can be simplified via synthesis of sequence fragments into long, contiguous blocks of sequence (contigs), but most of the progress in the field of assembly has focused on genomes in isolation rather than metagenomes. Here, we present software for paired-read iterative contig extension (PRICE), a strategy for focused assembly of particular nucleic acid species using complex metagenomic data as input. We describe the assembly strategy implemented by PRICE and provide examples of its application to the sequence of particular genes, transcripts, and virus genomes from complex multicomponent datasets, including an assembly of the BCBL-1 strain of Kaposi’s sarcoma-associated herpesvirus. PRICE is open-source and available for free download (derisilab.ucsf.edu/software/price/ or sourceforge.net/projects/pricedenovo/).
Citations
Journal ArticleDOI
TL;DR: This review focuses on the application of untargeted metagenomic next-generation sequencing to the clinical diagnosis of infectious diseases, particularly in areas in which conventional diagnostic approaches have limitations.
Abstract: Nearly all infectious agents contain DNA or RNA genomes, making sequencing an attractive approach for pathogen detection. The cost of high-throughput or next-generation sequencing has been reduced ...

630 citations


Cites background from "PRICE: software for the targeted as..."

  • ...In these cases, de novo assembly may be attempted if the pathogen sequence data are readily abundant in the specimen or if an isolate is obtainable (59, 60)....

    [...]

Journal ArticleDOI
TL;DR: This review describes the analytical strategies and specific tools that can be applied to metagenomic data and the considerations and caveats associated with their use and documents how metagenomes can be analyzed to quantify community structure and diversity.
Abstract: Environmental DNA sequencing has revealed the expansive biodiversity of microorganisms and clarified the relationship between host-associated microbial communities and host phenotype. Shotgun metagenomic DNA sequencing is a relatively new and powerful environmental sequencing approach that provides insight into community biodiversity and function. However, the analysis of metagenomic sequences is complicated by the complex structure of the data. Fortunately, new tools and data resources have been developed to circumvent these complexities and allow researchers to determine which microbes are present in the community and what they might be doing. This review describes the analytical strategies and specific tools that can be applied to metagenomic data and the considerations and caveats associated with their use. Specifically, it documents how metagenomes can be analyzed to quantify community structure and diversity, assemble novel genomes, identify new taxa and genes, and determine which metabolic pathways are encoded in the community. It also discusses several methods that can be used to compare metagenomes to identify taxa and functions that differentiate communities.

445 citations


Cites background or methods from "PRICE: software for the targeted as..."

  • ...PRICE implements a series of data reduction procedures to minimize the complexity associated with generating an initial set of contigs and then uses paired-end information associated with reads to merge contigs (Ruby et al., 2013)....

    [...]

  • ...In some instances, complete or nearly complete genomes can be assembled, which provides insight into the genomic composition of uncultured organisms found in a community (Iverson et al., 2012; Wrighton et al., 2012; Ruby et al., 2013)....

    [...]

Journal ArticleDOI
31 Oct 2014-Science
TL;DR: In this article, a combination of transcriptomics and evolutionary approaches detected a Y-specific sex-determinant candidate, OGI, that displays male-specific conservation among Diospyros species. OGI encodes a small RNA targeting the autosomal MeGI gene, a homeodomain transcription factor regulating anther fertility in a dosage-dependent fashion.
Abstract: In plants, multiple lineages have evolved sex chromosomes independently, providing a powerful comparative framework, but few specific determinants controlling the expression of a specific sex have been identified. We investigated sex determinants in the Caucasian persimmon, Diospyros lotus, a dioecious plant with heterogametic males (XY). Male-specific short nucleotide sequences were used to define a male-determining region. A combination of transcriptomics and evolutionary approaches detected a Y-specific sex-determinant candidate, OGI, that displays male-specific conservation among Diospyros species. OGI encodes a small RNA targeting the autosomal MeGI gene, a homeodomain transcription factor regulating anther fertility in a dosage-dependent fashion. This identification of a feminizing gene suppressed by a Y-chromosome-encoded small RNA contributes to our understanding of the evolution of sex chromosome systems in higher plants.

310 citations

Journal ArticleDOI
TL;DR: This work introduces DASH (Depletion of Abundant Sequences by Hybridization), a broadly applicable method to remove unwanted high-abundance species prior to sequencing that can be adapted for any sample type and increases sequencing yield without additional cost.
Abstract: Next-generation sequencing has generated a need for a broadly applicable method to remove unwanted high-abundance species prior to sequencing. We introduce DASH (Depletion of Abundant Sequences by Hybridization). Sequencing libraries are ‘DASHed’ with recombinant Cas9 protein complexed with a library of guide RNAs targeting unwanted species for cleavage, thus preventing them from consuming sequencing space. We demonstrate a more than 99 % reduction of mitochondrial rRNA in HeLa cells, and enrichment of pathogen sequences in patient samples. We also demonstrate an application of DASH in cancer. This simple method can be adapted for any sample type and increases sequencing yield without additional cost.

258 citations


Cites background from "PRICE: software for the targeted as..."

  • ...2 [51] such that only read pairs with less than five ambiguous base calls (defined as Ns or positions with <95 % confidence based on Phred score) were retained....

    [...]
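The read-pair filter quoted in the snippet above can be sketched in a few lines. This is only an illustration, not the cited authors' code: the function names, the Phred+33 quality encoding, and the choice to count ambiguous calls across both mates of a pair (rather than per read) are all assumptions. A Phred score Q corresponds to a base-call confidence of 1 - 10^(-Q/10), so "<95 % confidence" works out to Q of 13 or lower.

```python
def base_confidence(phred_q: int) -> float:
    """Probability that a base call is correct, given its Phred score."""
    return 1.0 - 10.0 ** (-phred_q / 10.0)


def count_ambiguous(seq: str, qual: str, min_conf: float = 0.95, offset: int = 33) -> int:
    """Count bases that are N or whose confidence falls below min_conf.

    `qual` is an ASCII quality string; offset=33 (Phred+33) is an assumption.
    """
    count = 0
    for base, qchar in zip(seq, qual):
        q = ord(qchar) - offset
        if base.upper() == "N" or base_confidence(q) < min_conf:
            count += 1
    return count


def keep_pair(read1, read2, max_ambiguous: int = 5) -> bool:
    """Retain a pair with fewer than max_ambiguous ambiguous calls in total.

    Each read is a (sequence, quality) tuple. Summing over both mates is an
    assumption; the quoted text does not say whether the limit is per read.
    """
    total = count_ambiguous(*read1) + count_ambiguous(*read2)
    return total < max_ambiguous
```

For example, keep_pair(("ACGTN", "IIIII"), ("NNNA", "IIII")) counts four ambiguous calls and keeps the pair, while a pair with five or more would be dropped.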

Journal ArticleDOI
TL;DR: This review discusses techniques, including nucleic acid extraction from different environments, sample preparation and high-throughput sequencing platforms, that are becoming more widely used to study whole communities of prokaryotes in many niches.

240 citations

References
Journal ArticleDOI
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations

Journal ArticleDOI
TL;DR: A new method for determining nucleotide sequences in DNA is described, which makes use of the 2′,3′-dideoxy and arabinonucleoside analogues of the normal deoxynucleoside triphosphates, which act as specific chain-terminating inhibitors of DNA polymerase.
Abstract: A new method for determining nucleotide sequences in DNA is described. It is similar to the “plus and minus” method [Sanger, F. & Coulson, A. R. (1975) J. Mol. Biol. 94, 441-448] but makes use of the 2′,3′-dideoxy and arabinonucleoside analogues of the normal deoxynucleoside triphosphates, which act as specific chain-terminating inhibitors of DNA polymerase. The technique has been applied to the DNA of bacteriophage ϕX174 and is more rapid and more accurate than either the plus or the minus method.

62,728 citations


"PRICE: software for the targeted as..." refers methods in this paper

  • ...Much effort has been applied to the development of computational methods for the de novo assembly of genomes using the type of data generated by these technologies: typically, shorter reads and/or higher error frequencies vs. traditional Sanger sequencing (Sanger et al. 1977; Glenn 2011)....

    [...]

Journal ArticleDOI
TL;DR: This work presents the Trinity method for de novo assembly of full-length transcripts and evaluates it on samples from fission yeast, mouse, and whitefly (whose reference genome is not yet available), providing a unified solution for transcriptome reconstruction in any sample.
Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

15,665 citations


"PRICE: software for the targeted as..." refers methods in this paper

  • .../Trinity.pl --seqType fq --JM 20G --left [first-read _sequence.txt file] --right [second-read _sequence.txt file] --output [output file] --CPU 40.”...

    [...]

  • ...The second failure mode applied to IDBA-UD and Trinity and was characterized by extensive coverage of the LSV2 genome (Figure 2H) and longer assembled fragments (Figure 2I), but also by high redundancy, with many contigs containing slight sequence variations repeatedly covering the same parts of the genome (Figure 2J)....

    [...]

  • ...(G) Contigs from assemblies performed on the same paired-read dataset as above (Dryad repository: doi:10.5061/dryad.9n8rh) using MetaVelvet (Zerbino and Birney 2008; Namiki et al. 2012) (blue), SOAPdenovo (Li et al. 2010b) (orange), IDBA-UD (Peng et al. 2012) (green), and Trinity (Grabherr et al. 2011) (red)....

    [...]

  • ...…of the de Bruijn graph assembler Velvet (Zerbino and Birney 2008); the de Bruijn graph assembler SOAPdenovo (Li et al. 2010b); the metagenome-optimized assembler IDBA-UD (Peng et al. 2012); and the transcriptome assembler Trinity (Grabherr et al. 2011) (see Materials and Methods for details)....

    [...]

Journal ArticleDOI
TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.

11,844 citations


"PRICE: software for the targeted as..." refers methods in this paper

  • ...…were performed using a constrained, fail-fast, semiglobal implementation of the dynamic programming alignment algorithm (the Needleman-Wunsch algorithm (Needleman and Wunsch 1970), modified to not penalize initial gaps and to trace back from the highest-scoring terminal node of either sequence)....

    [...]
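The two modifications named in the snippet above (no penalty for initial gaps, and traceback starting from the highest-scoring terminal node of either sequence) can be illustrated with a minimal score-only sketch. It is not PRICE's implementation: the "constrained, fail-fast" aspects are omitted, the function name is hypothetical, and the match, mismatch, and gap values are assumed for illustration.

```python
def semiglobal_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Score a semiglobal alignment of a and b.

    Returns (best_score, end_i, end_j), where (end_i, end_j) is the cell a
    traceback would start from. Scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # First row and first column stay at 0: initial gaps are not penalized.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Terminal nodes: cells where either sequence has been fully consumed,
    # i.e. the last row (a exhausted) or the last column (b exhausted).
    best = (score[n][m], n, m)
    for j in range(m + 1):
        if score[n][j] > best[0]:
            best = (score[n][j], n, j)
    for i in range(n + 1):
        if score[i][m] > best[0]:
            best = (score[i][m], i, m)
    return best
```

With these settings, a suffix-prefix overlap such as semiglobal_align("AAACGT", "CGTTTT") is scored on the shared "CGT" only, since the unaligned leading "AAA" and trailing "TTT" incur no gap cost.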

Journal ArticleDOI
TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).

10,262 citations

Trending Questions (1)
What does PRICE mean?

PRICE stands for Paired-Read Iterative Contig Extension, a software for targeted assembly of specific nucleic acid components from complex metagenomic data, aiding in genomic and transcriptomic analysis.