Open Access, Journal Article

ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data

TLDR
ANNOVAR is a tool developed to annotate single nucleotide variants and insertions/deletions, for example by examining their functional consequences on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP.
Abstract
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
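The stepwise 'variants reduction' idea described in the abstract can be illustrated with a minimal Python sketch. This is not ANNOVAR's code or command-line interface. The `Variant` fields, the effect labels, the order of the filtering steps, and the "at least two surviving variants per gene" rule for a recessive disease are all assumptions made for illustration; the abstract states only that variants unlikely to be causal were excluded step by step.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical effect labels; real annotation tools use their own vocabularies.
PROTEIN_CHANGING = {"nonsynonymous", "frameshift", "stopgain", "splicing"}


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str        # gene assigned by a gene-based annotation step
    effect: str      # functional consequence on that gene
    conserved: bool  # lies in a conserved region
    in_1000g: bool   # reported in the 1000 Genomes Project
    in_dbsnp: bool   # reported in dbSNP


# Each step is (label, predicate); a variant survives when the predicate is True.
# The order and criteria are assumptions, not the paper's exact protocol.
STEPS = [
    ("protein-changing", lambda v: v.effect in PROTEIN_CHANGING),
    ("in conserved region", lambda v: v.conserved),
    ("absent from 1000 Genomes", lambda v: not v.in_1000g),
    ("absent from dbSNP", lambda v: not v.in_dbsnp),
]


def reduce_variants(variants, steps=STEPS):
    """Apply the filters in order and record how many variants remain after each."""
    remaining = list(variants)
    counts = [("input", len(remaining))]
    for label, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append((label, len(remaining)))
    return remaining, counts


def candidate_genes(variants, min_hits=2):
    """Simplified recessive model: keep genes with at least `min_hits` surviving variants."""
    hits = Counter(v.gene for v in variants)
    return sorted(gene for gene, n in hits.items() if n >= min_hits)


def demo():
    # Toy input with made-up gene names and coordinates.
    variants = [
        Variant("chr1", 100, "A", "G", "GENE_A", "nonsynonymous", True, False, False),
        Variant("chr1", 250, "C", "T", "GENE_A", "stopgain", True, False, False),
        Variant("chr2", 300, "G", "A", "GENE_B", "synonymous", True, False, False),
        Variant("chr3", 400, "T", "C", "GENE_C", "nonsynonymous", True, True, True),
        Variant("chr4", 500, "A", "T", "GENE_D", "nonsynonymous", False, False, False),
        Variant("chr5", 600, "G", "C", "GENE_E", "frameshift", True, False, False),
    ]
    survivors, counts = reduce_variants(variants)
    return survivors, counts, candidate_genes(survivors)


if __name__ == "__main__":
    survivors, counts, genes = demo()
    for label, n in counts:
        print(f"{label}: {n}")
    print("candidate genes:", genes)
```

Recording the count after every step makes it easy to see which filter removes the most variants, and keeping the steps in a plain list lets a filter be reordered or dropped without touching the rest of the sketch.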


Citations

A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3

TL;DR: It appears that the 5′ and 3′ UTRs are reservoirs for genetic variations that change the termini of proteins during evolution of the Drosophila genus.

The Ensembl Variant Effect Predictor.

TL;DR: The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

Genetic studies of body mass index yield new insights for obesity biology

Adam E. Locke, +481 more
TL;DR: A genome-wide association study and Metabochip meta-analysis of body mass index (BMI), a measure commonly used to define obesity and assess adiposity, in up to 339,224 individuals provide strong support for a role of the central nervous system in obesity susceptibility.

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

TL;DR: This work shows that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery.
References

SIFT: predicting amino acid changes that affect protein function

TL;DR: SIFT is a program that predicts whether an amino acid substitution affects protein function so that users can prioritize substitutions for further study and can distinguish between functionally neutral and deleterious amino acid changes in mutagenesis studies and on human polymorphisms.

NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

TL;DR: The National Center for Biotechnology Information Reference Sequence (RefSeq) database provides a non-redundant collection of sequences representing genomic data, transcripts and proteins that pragmatically includes sequence data that are currently publicly available in the archival databases.

Accurate whole human genome sequencing using reversible terminator chemistry

David R. Bentley, +201 more - 06 Nov 2008
TL;DR: An approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost is reported, effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.

Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes

TL;DR: A comprehensive search for conserved elements in vertebrate genomes is conducted using genome-wide multiple alignments of five vertebrate species (human, mouse, rat, chicken, and Fugu rubripes) and a two-state phylogenetic hidden Markov model (phylo-HMM).

Human non‐synonymous SNPs: server and survey

TL;DR: A World Wide Web server is presented to predict the effect of an nsSNP on protein structure and function, and the dependence of selective pressure on the structural and functional properties of proteins is studied.