Open Access, Journal Article

Repetitive DNA and next-generation sequencing: computational challenges and solutions.

TLDR
The computational problems surrounding repeats are discussed and strategies used by current bioinformatics systems to solve them are described.
Abstract
Repetitive DNA sequences are abundant in a broad range of species, from bacteria to mammals, and they cover nearly half of the human genome. Repeats have always presented technical challenges for sequence alignment and assembly programs. Next-generation sequencing projects, with their short read lengths and high data volumes, have made these challenges more difficult. From a computational perspective, repeats create ambiguities in alignment and assembly, which, in turn, can produce biases and errors when interpreting results. Simply ignoring repeats is not an option, as this creates problems of its own and may mean that important biological phenomena are missed. We discuss the computational problems surrounding repeats and describe strategies used by current bioinformatics systems to solve them.
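The alignment ambiguity described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy reference sequence, the `find_exact_hits` helper, and the use of exact string matching in place of a real aligner. It only shows why a short read that falls entirely inside a repeat has more than one candidate position, while a longer read that reaches into unique flanking sequence does not.

```python
from typing import List


def find_exact_hits(reference: str, read: str) -> List[int]:
    """Return every 0-based start position where `read` matches `reference` exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Resume one base further along so overlapping matches are also found.
        start = reference.find(read, start + 1)
    return hits


# Hypothetical toy reference: two copies of the same 12-bp repeat,
# separated and flanked by unique sequence.
REPEAT = "ACGTTGCAACGT"
reference = "GGATCC" + REPEAT + "TTAGGCATCA" + REPEAT + "CCTAGA"

short_read = REPEAT[2:10]             # lies entirely inside the repeat
long_read = "ATCC" + REPEAT + "TTAG"  # spans the repeat plus unique flanks

short_hits = find_exact_hits(reference, short_read)
long_hits = find_exact_hits(reference, long_read)

print("short read hits:", short_hits)
print("long read hits:", long_hits)
```

In this toy case the short read matches once in each copy of the repeat, so its true origin cannot be determined from the sequence alone, whereas the long read matches a single position. Real aligners must additionally tolerate mismatches and gaps, which this sketch does not attempt.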


Citations

Integrative Genomics Viewer

TL;DR: The sheer volume and scope of the data pose a significant challenge to the development of efficient and intuitive visualization tools that can scale to very large data sets and flexibly integrate multiple data types, including clinical data.
Journal Article

The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.

TL;DR: The Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains is presented, demonstrating that the approach exhibits unrivaled speed while maintaining the accuracy of existing methods.
Journal Article

ChIP-seq and Beyond: new and improved methodologies to detect and characterize protein-DNA interactions

TL;DR: The latest advances in methods to detect and functionally characterize DNA-bound proteins are described, which are being used to identify variability in the functions of DNA-binding proteins across genomes and individuals.
Journal Article

Discovery of microbial natural products by activation of silent biosynthetic gene clusters.

TL;DR: This Review discusses the strategies that have been developed in bacteria and fungi to identify and induce the expression of silent BGCs, and briefly summarizes methods for the isolation and structural characterization of their metabolic products.
References
Journal Article

The Sequence Alignment/Map format and SAMtools

TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant calling and alignment viewing, and thus provides universal tools for processing read alignments.
Journal Article

Fast and accurate short read alignment with Burrows–Wheeler transform

TL;DR: The Burrows-Wheeler Alignment tool (BWA) is a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT); it efficiently aligns short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Journal Article

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome

TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches; multiple processor cores can be used simultaneously to achieve even greater alignment speeds.
Journal Article

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

TL;DR: The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
Related Papers (5)

Initial sequencing and analysis of the human genome.

Eric S. Lander et al., 15 Feb 2001