Journal ArticleDOI

Sequencing technologies-the next generation

01 Jan 2010-Nature Reviews Genetics (Nature Publishing Group)-Vol. 11, Iss: 1, pp 31-46
TL;DR: A technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments is presented.
Abstract: Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.


Summary

  • DNA sequencing is one of the most important platforms for study in biological systems today.
  • High-throughput next-generation sequencing technologies deliver fast, inexpensive, and accurate genome information.
  • Next generation sequencing can produce over 100 times more data than methods based on Sanger sequencing.
  • The next-generation sequencing technologies offered by Illumina/Solexa, ABI/SOLiD, 454/Roche, and Helicos have provided an unprecedented opportunity for high-throughput functional genomic research.
  • Next-generation sequencing technologies offer novel and rapid ways for genome-wide characterization and profiling of mRNAs, transcription factor regions, and DNA patterns.

ABSTRACT
Conclusion and Future Work
Next Generation Sequencing
CONTACT INFO
Data Analysis Comparisons
Downstream Analysis
REFERENCES
DNA sequencing is one of the most important platforms for
study in biological systems today. High-throughput
next-generation sequencing technologies deliver fast,
inexpensive, and accurate genome information. Next
generation sequencing can produce over 100 times more data
than methods based on Sanger sequencing. The next
generation sequencing technologies offered by Illumina/
Solexa, ABI/SOLiD, 454/Roche, and Helicos have provided
an unprecedented opportunity for high-throughput functional
genomic research. Next generation sequencing technologies
offer novel and rapid ways for genome-wide characterization
and profiling of mRNAs, transcription factor regions, and DNA
patterns.
Fig. 7) This is a plot of the frequency of each percentage covered for all nodes.
BLAST is in blue, MUMmer is in green.
Sequencing Technologies – the Next Generation,
Michael L. Metzker
Next Generation Sequencing Pipeline Development and Data Analysis
Fig. 9) This is a plot of the coverage of each Node. BLAST points are blue,
MUMmer points are red.
Fig. 6) This is a plot of the frequency of each percentage covered for all contigs.
BLAST is in blue, MUMmer is in green.
454/Roche – 454 Life Sciences is a biotechnology company
that is part of Roche and is based in Branford, Connecticut.
The center develops ultra-fast high-throughput DNA
sequencing methods and tools.
Illumina/Solexa – Illumina is a company that develops and
manufactures integrated systems for the analysis of genetic
variation. Solexa was founded to develop genome
sequencing technology.
ABI/SOLiD – SOLiD (Sequencing by Oligonucleotide Ligation and
Detection) is a next-generation DNA sequencing technology
developed by Life Technologies that has been commercially
available since 2006. This technology generates hundreds of
millions to billions of short sequence reads at one time.
Helicos - Helicos's technology images the extension of
individual DNA molecules using a defined primer and
individual fluorescently labeled nucleotides, which contain a
"virtual terminator" preventing incorporation of multiple
nucleotides per cycle.
Julian Pierre 1, Jordan Taylor 2, Amit Upadhyay 3, Bhanu Rekepalli 3
Fig. 8) This is a plot of the coverage of each Contig. BLAST points are blue,
MUMmer points are red.
Using the coverage of each individual contig ID, the results for both
BLAST and MUMmer were plotted. While BLAST hit more contigs,
MUMmer hit more contigs with higher coverage.
Using the data gathered from both BLAST and MUMmer, the frequency
of each coverage percentage was plotted for the contigs. From Fig. 6,
it can be inferred that MUMmer hit the contigs more accurately.
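The frequency plots described above can be approximated by binning per-sequence coverage percentages. The following is a minimal sketch, not the authors' program: the function name `coverage_histogram`, the 10-point bin width, and the example values are all assumptions made for illustration, and the bin width is assumed to divide 100 evenly.

```python
from collections import Counter


def coverage_histogram(coverages, bin_width=10):
    """Count how many sequences fall into each percent-coverage bin.

    Keys of the returned dict are the lower edge of each bin.
    Assumes bin_width divides 100 evenly (hypothetical default: 10).
    """
    counts = Counter()
    for pct in coverages:
        if not 0 <= pct <= 100:
            raise ValueError(f"coverage out of range: {pct}")
        # Place exactly 100% in the top bin instead of opening a new one.
        start = min(int(pct // bin_width) * bin_width, 100 - bin_width)
        counts[start] += 1
    return dict(sorted(counts.items()))


# Made-up coverage percentages for six contigs (not data from the poster).
example = [5.0, 12.5, 18.0, 95.0, 100.0, 99.9]
hist = coverage_histogram(example)
for start, n in hist.items():
    print(f"{start:3d}-{start + 10:3d}%: {n}")
```

Running the same function on the BLAST and MUMmer coverage lists separately would give the two series compared in Fig. 6 and Fig. 7.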
Fig 4) from main.g2.bx.psu.edu
Once the results were found using both the BLAST and
MUMmer search tools, we created a program to see which
search tool had the most hits per contig. The total
number of contigs in the database file is 160,749 and the
total number of nodes in the query file is 552,305. BLAST
returned a total of 123,070 hits and MUMmer returned a
total of 121,829 hits. From the results, MUMmer hit more
accurately than BLAST, while BLAST hit more contigs than
MUMmer.
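A hits-per-contig count of the kind described above might look like the sketch below. It is an illustration only: it assumes the hits have been written to a tab-separated table in which the second column holds the contig (subject) ID, as in BLAST's default tabular output; the helper names `hits_per_subject` and `fraction_hit` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def hits_per_subject(table_text, subject_col=1):
    """Count rows per subject ID in a tab-separated hit table."""
    counts = Counter()
    for row in csv.reader(io.StringIO(table_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        counts[row[subject_col]] += 1
    return counts


def fraction_hit(counts, total_subjects):
    """Fraction of all subject sequences that received at least one hit."""
    return len(counts) / total_subjects


# Made-up rows: query ID, subject ID, percent identity, alignment length.
sample = (
    "node_1\tcontig_A\t98.5\t120\n"
    "node_2\tcontig_A\t97.0\t80\n"
    "node_3\tcontig_B\t99.1\t200\n"
)
counts = hits_per_subject(sample)
print(counts.most_common())     # contigs ordered by number of hits
print(fraction_hit(counts, 4))  # share of a 4-contig database that was hit
```

Applying the same two functions to each tool's table gives a like-for-like comparison of how many contigs each tool hit and how the hits are distributed.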
In next-generation sequencing, data analysis is one of the
most expensive processes. While the cost of genome
sequencing keeps falling, the cost of analyzing the data
remains high. In the future, the “$1,000 genome will come with
a $20,000 analysis price tag.”
The same process was done with the nodes. From Fig. 7, it can be
inferred that BLAST hit the nodes more accurately. However, there
are more BLAST results with lower coverage.
The future of next generation sequencing can be broken
down into a variety of categories such as personalized
medicine, bio fuels, climate change, and other life science
fields.
Personalized Medicine is a medical model that proposes
the customization of medical decisions, tailored to the
individual patient.
Bio Fuels present a source of alternative energy.
Microalgal biofuels use algae to synthesize the fuel. In
order to optimize the process, an understanding of the
gene-function relationship of algae would prove helpful.
Climate change research is the active study of theoretical
models that use past climate data to make future
projections.
In conclusion, we hope to apply the knowledge we have
gained to fields such as these.
The same process was done with the nodes. While BLAST hit more
nodes, more of the nodes hit by BLAST have lower coverage.
1 Texas Southern University, 2 Austin Peay State University, 3 University of Tennessee
Next-generation sequencing uses a wide array of tools to obtain results based
on the genome sequence. The most widely used tools are BLAST,
HMMER, and MUMmer.
BLAST (Basic Local Alignment Search Tool) is a sequence similarity
search tool developed at the NIH (National Institutes of Health). It is
used to find regions of local similarity between sequences and then
compare them.
MUMmer (named for Maximal Unique Matches) is a system for rapidly
aligning entire genomes. It can also align incomplete
genomes and can easily handle thousands of contigs from a shotgun
sequencing project.
HMMER (Hidden Markov Modeler) is used for searching sequence
databases for homologs of protein sequences, and for making protein
sequence alignments. It implements methods using probabilistic
models called profile hidden Markov models (HMMs).
Genome Assembly
Sequence Analysis refers to the process of subjecting a DNA, RNA or peptide sequence to a wide range of analytical methods to:
  • Compare sequences to find similarities and infer whether they are homologous
  • Identify the features of the sequence, such as gene structure, distribution, introns and exons, and regulation of gene expression
  • Identify sequence differences and variations, such as mutations
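As a small illustration of the last task listed above, the sketch below reports the positions at which two already-aligned sequences of equal length differ. It is a toy example under that assumption, not a variant caller; the function name `find_differences` and the sequences are made up.

```python
def find_differences(ref, query):
    """List (1-based position, ref base, query base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, r, q)
        for pos, (r, q) in enumerate(zip(ref, query), start=1)
        if r != q
    ]


# Made-up aligned sequences that differ at two positions.
reference = "ACGTACGT"
sample = "ACCTACGA"
diffs = find_differences(reference, sample)
for pos, r, q in diffs:
    print(f"position {pos}: {r} -> {q}")
```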
Fig. 1) This figure shows three different Next Generation Sequencing methods. [2]
Fig. 2) Taken from A Hitchhiker’s Guide to Next-Generation Sequencing, by Gabe Rudy
Fig. 3) Taken from bio.davidson.edu/courses. Shows alignment results for yeast.
Fig 5) from jcvi.org shows the mapping of chr6 of a Human Genome
Julian Pierre – julz_pierre@yahoo.com
Jordan Taylor – jtaylor74@my.apsu.edu
Amit Upadhyay – aupadhy1@utk.edu
Bhanu Rekepalli – brekapal@utk.edu
http://www.roche.com/research_and_development/r_d_overview/r_d_sites.htm?id=18
http://www.pnas.org/content/99/6/3712/F1.expansion.html
http://www.yerkes.emory.edu/nhp_genomics_core/Services/Sequencing.html
http://www.illumina.com/technology/solexa_technology.ilmn
http://blast.ncbi.nlm.nih.gov/Blast.cgi
https://main.g2.bx.psu.edu/u/dan/p/fastq
http://ori.dhhs.gov/education/products/n_illinois_u/datamanagement/datopic.html
http://www.jcvi.org/medicago/include/images/chr6.BamHI.maps.jpg
Gabe Rudy, (2010) A Hitchhiker’s Guide to Next-Generation Sequencing, :1-9, Golden Helix
[1] John D. McPherson, (2009) Next-Generation Gap, 6:1-4, Nature Methods Supplement
[2] Michael L. Metzker, (2010) Sequencing technologies - the next generation, 11:31-46, Nature Reviews Genetics
Md. Fakruddin, Khanjada Shahnewaj Bin Mannan, (2012) Next Generation sequencing technologies – Principles and prospects, 6:1-9, Research and Reviews in Biosciences
Misra N., Panda P. K., Parida B. K., Mishra B. K., (2012) Phylogenomic Study of Lipid Genes Involved in Microalgal Biofuel Production – Candidate Gene Mining and Metabolic Pathway Analyses, Evolutionary Bioinformatics 8:545-564, doi: 10.4137/EBO.S10159
Galaxy is an open, web-based platform for data-intensive
biomedical research. It can be used on its own free public
server, where you can perform, reproduce, and share complete
analyses.
An example of how Galaxy presents its data is shown in Fig 5.
Two FASTA files related to the same nucleotide sequence
were input into both BLAST and MUMmer and the results
were parsed into tables. Then, the coverage of all hit contigs
and nodes from both programs was found.
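The source does not define how coverage was computed. One common reading, sketched below as an assumption, is the percentage of a sequence's bases that fall inside at least one hit, found by merging overlapping hit intervals. The names `covered_bases` and `percent_coverage` are hypothetical, and coordinates are taken to be 1-based and inclusive, with reversed pairs allowed for minus-strand hits.

```python
def covered_bases(intervals):
    """Number of positions covered by at least one interval.

    Intervals are (start, end) pairs, 1-based and inclusive. Pairs given
    in reverse order (end < start) are normalized before merging.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end + 1:
            # A gap before this interval: close the current merged run.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def percent_coverage(intervals, sequence_length):
    """Percentage of a sequence covered by the given hit intervals."""
    return 100.0 * covered_bases(intervals) / sequence_length


# Made-up hits on a 500-base contig; the third pair is a reversed hit.
hits = [(1, 100), (51, 150), (300, 201)]
print(covered_bases(hits))          # bases covered after merging overlaps
print(percent_coverage(hits, 500))  # coverage as a percentage
```

Merging before counting keeps overlapping hits from being counted twice, which matters when one tool reports many short, overlapping matches on the same contig.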
Citations
Journal ArticleDOI
TL;DR: The range of bat viromes, including viruses from mammals, insects, fungi, plants, and phages, in 11 insectivorous bat species common in six provinces of China are described and the complete or partial genome sequences of 13 novel mammalian viruses are identified.
Abstract: Bats are natural hosts for a large variety of zoonotic viruses. This study aimed to describe the range of bat viromes, including viruses from mammals, insects, fungi, plants, and phages, in 11 insectivorous bat species (216 bats in total) common in six provinces of China. To analyze viromes, we used sequence-independent PCR amplification and next-generation sequencing technology (Solexa Genome Analyzer II; Illumina). The viromes were identified by sequence similarity comparisons to known viruses. The mammalian viruses included those of the Adenoviridae, Herpesviridae, Papillomaviridae, Retroviridae, Circoviridae, Rhabdoviridae, Astroviridae, Flaviridae, Coronaviridae, Picornaviridae, and Parvovirinae; insect viruses included those of the Baculoviridae, Iflaviridae, Dicistroviridae, Tetraviridae, and Densovirinae; fungal viruses included those of the Chrysoviridae, Hypoviridae, Partitiviridae, and Totiviridae; and phages included those of the Caudovirales, Inoviridae, and Microviridae and unclassified phages. In addition to the viruses and phages associated with the insects, plants, and bacterial flora related to the diet and habitation of bats, we identified the complete or partial genome sequences of 13 novel mammalian viruses. These included herpesviruses, papillomaviruses, a circovirus, a bocavirus, picornaviruses, a pestivirus, and a foamy virus. Pairwise alignments and phylogenetic analyses indicated that these novel viruses showed little genetic similarity with previously reported viruses. This study also revealed a high prevalence and diversity of bat astroviruses and coronaviruses in some provinces. These findings have expanded our understanding of the viromes of bats in China and hinted at the presence of a large variety of unknown mammalian viruses in many common bat species of mainland China.

245 citations


Cites background from "Sequencing technologies-the next ge..."

  • ...As described in our previous study (67), DNA libraries based on the PCR products described above were constructed according to the manufacturer’s instructions (Illumina)....

  • ...In this study, the Illumina Solexa GA II method was used for a single read of 81 bp, and 11 species were separately sequenced....

  • ...A series of in-house Perl scripts was then employed for further quality control, and reads were culled according to the following criteria: (i) reads filtered with Illumina’s Consensus Assessment of Sequence and Variation (CASAVA) software, (ii) reads with no call sites, (iii) reads with similarity to the sequencing adaptor and the primer K sequence, and (iv) duplicate reads and low-complexity reads....

  • ...One bat virome analysis conducted by Ge et al. (18) with the Illumina Solexa GA method for a single read of 35 bp mainly described the insect viruses in some bat species of China....

  • ...The amplified viral nucleic acid libraries of the 11 bat species were then sequenced with an Illumina GA II (one lane per species; Table 1)....

Journal ArticleDOI
TL;DR: Although exome sequencing has been proven to be a promising approach to study Mendelian disorders, several shortcomings of this method must be noted, such as the inability to capture regulatory or evolutionary conserved sequences in non-coding regions and the incomplete capturing of all exons.
Abstract: Over the past several years, more focus has been placed on dissecting the genetic basis of complex diseases and traits through genome-wide association studies. In contrast, Mendelian disorders have received little attention mainly due to the lack of newer and more powerful methods to study these disorders. Linkage studies have previously been the main tool to elucidate the genetics of Mendelian disorders; however, extremely rare disorders or sporadic cases caused by de novo variants are not amenable to this study design. Exome sequencing has now become technically feasible and more cost-effective due to the recent advances in high-throughput sequence capture methods and next-generation sequencing technologies which have offered new opportunities for Mendelian disorder research. Exome sequencing has been swiftly applied to the discovery of new causal variants and candidate genes for a number of Mendelian disorders such as Kabuki syndrome, Miller syndrome and Fowler syndrome. In addition, de novo variants were also identified for sporadic cases, which would have not been possible without exome sequencing. Although exome sequencing has been proven to be a promising approach to study Mendelian disorders, several shortcomings of this method must be noted, such as the inability to capture regulatory or evolutionary conserved sequences in non-coding regions and the incomplete capturing of all exons.

245 citations


Cites background from "Sequencing technologies-the next ge..."

  • ...This development coupled with the high-throughput sequencing data produced by next-generation sequencing (NGS) technologies ensures an adequate depth of sequencing coverage to accurately detect the variants in the exome or targeted regions (Mamanova et al. 2010; Turner et al. 2010; Koboldt et al. 2010; Metzker 2010; Shendure and Ji 2008)....

01 Jan 2011
TL;DR: In this paper, a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen is conducted, and the results show that libraries captured with NimbleGen kits aligned more accurately to the target regions.
Abstract: Techniques enabling targeted re-sequencing of the protein coding sequences of the human genome on next generation sequencing instruments are of great interest. We conducted a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen. A control DNA sample was captured with all four capture methods and prepared for Illumina GAII sequencing. Sequence data from additional samples prepared with the same protocols were also used in the comparison. We developed a bioinformatics pipeline for quality control, short read alignment, variant identification and annotation of the sequence data. In our analysis, a larger percentage of the high quality reads from the NimbleGen captures than from the Agilent captures aligned to the capture target regions. High GC content of the target sequence was associated with poor capture success in all exome enrichment methods. Comparison of mean allele balances for heterozygous variants indicated a tendency to have more reference bases than variant bases in the heterozygous variant positions within the target regions in all methods. There was virtually no difference in the genotype concordance compared to genotypes derived from SNP arrays. A minimum of 11× coverage was required to make a heterozygote genotype call with 99% accuracy when compared to common SNPs on genome-wide association arrays. Libraries captured with NimbleGen kits aligned more accurately to the target regions. The updated NimbleGen kit most efficiently covered the exome with a minimum coverage of 20×, yet none of the kits captured all the Consensus Coding Sequence annotated exons.

244 citations

Journal ArticleDOI
TL;DR: This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events.
Abstract: Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next- generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review we describe and summarise the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events.

242 citations


Cites methods from "Sequencing technologies-the next ge..."

  • ...They allow for the sequencing of millions of short (few hundreds bp) DNA fragments (reads) simultaneously and may process a whole human genome in three days at 500-fold less cost than previous methods (Voelkerding et al., 2009; Metzker, 2010)....

Journal ArticleDOI
TL;DR: Barley is a resilient crop with much potential that can be realised in the future; substantial gains in crucial sustainability characteristics should be achievable, together with increased understanding of the physiological basis of many agronomic traits, particularly water and nutrient use efficiency.
Abstract: Barley is cultivated both in highly productive agricultural systems and also in marginal and subsistence environments. Its distribution is worldwide and is of considerable economic importance for animal feed and alcohol production. The overall importance of barley as a human food is minor but there is much potential for new uses exploiting the health benefits of whole grain and beta-glucans. The barley supply chains are complex and show added value at many stages. Germplasm resources for barley are considerable, with much potential for exploitation of its biodiversity available through the use of recently developed genomic and breeding tools. Consequently, substantial gains in crucial sustainability characteristics should be achievable in the future, together with increased understanding of the physiological basis of many agronomic traits, particularly water and nutrient use efficiency. Barley’s ability to adapt to multiple biotic and abiotic stresses will be crucial to its future exploitation and increased emphasis on these traits in elite germplasm is needed to equip the crop for environmental change. Similarly, resource use efficiency should become a higher priority to ensure the crop’s sustainability in the long-term. Clearly barley is a resilient crop with much potential which can be realised in the future.

241 citations


Cites background from "Sequencing technologies-the next ge..."

  • ...In addition, the new sequencing capabilities driven by rapid technological advances will find applications within barley breeding (including MAS) as it has already in genomic research (Metzker 2009; Varshney et al. 2009)....

References
Journal ArticleDOI
TL;DR: The RNA-Seq approach to transcriptome profiling that uses deep-sequencing technologies provides a far more precise measurement of levels of transcripts and their isoforms than other methods.
Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.

11,528 citations


"Sequencing technologies-the next ge..." refers background in this paper

  • ...For example, in gene-expression studies microarrays are now being replaced by seq-based methods, which can identify and quantify rare transcripts without prior knowledge of a particular gene and can provide information regarding alternative splicing and sequence variation in identified gene...

Journal ArticleDOI
TL;DR: Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies and is in close agreement with simulated results without read-pair information.
Abstract: We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.

9,389 citations

Journal ArticleDOI
15 Sep 2005-Nature
TL;DR: A scalable, highly parallel sequencing system is described with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments; it is demonstrated by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.
Abstract: The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

8,434 citations

Journal ArticleDOI
20 Feb 2009-Cell
TL;DR: This work has revealed unexpected diversity in their biogenesis pathways and the regulatory mechanisms that they access, which has direct implications for fundamental biology as well as disease etiology and treatment.

4,490 citations


"Sequencing technologies-the next ge..." refers background in this paper

  • ...and to elucidate the role of non-coding RNAs in health and diseas...

Journal ArticleDOI
20 Feb 2009-Cell
TL;DR: The evolution of long noncoding RNAs and their roles in transcriptional regulation, epigenetic gene regulation, and disease are reviewed.

4,277 citations


"Sequencing technologies-the next ge..." refers background in this paper

  • ...and to elucidate the role of non-coding RNAs in health and diseas...
