Topic

SNP annotation

About: SNP annotation is a research topic. Over its lifetime, 64 publications have appeared on this topic, receiving 10,416 citations.


Papers
Journal ArticleDOI
TL;DR: ANNOVAR is a tool developed to annotate single nucleotide variants and insertions/deletions, for example by examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP.
Abstract: High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.

10,461 citations
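
The stepwise "variants reduction" idea described in the abstract above can be illustrated with a minimal sketch. This is not ANNOVAR's code, command line, or exact filter order: the `Variant` fields, the filter steps, the gene names, and the "two or more surviving variants per gene" rule for a recessive disease are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str           # hypothetical gene label
    effect: str         # e.g. "nonsynonymous", "synonymous", "intronic"
    in_conserved: bool  # falls in a conserved region
    in_1000g: bool      # reported in the 1000 Genomes Project
    in_dbsnp: bool      # reported in dbSNP


def reduce_variants(variants):
    """Apply filters one at a time; return survivors and a per-step count."""
    steps = [
        ("protein-altering",
         lambda v: v.effect in {"nonsynonymous", "splicing", "frameshift"}),
        ("in conserved region", lambda v: v.in_conserved),
        ("absent from 1000 Genomes", lambda v: not v.in_1000g),
        ("absent from dbSNP", lambda v: not v.in_dbsnp),
    ]
    remaining = list(variants)
    trace = []
    for name, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        trace.append((name, len(remaining)))
    return remaining, trace


def recessive_candidates(variants):
    """Genes with at least two surviving variants (assumed recessive proxy)."""
    counts = {}
    for v in variants:
        counts[v.gene] = counts.get(v.gene, 0) + 1
    return sorted(gene for gene, n in counts.items() if n >= 2)


# Toy input; every value here is made up.
DEMO = [
    Variant("GENE_A", "nonsynonymous", True, False, False),
    Variant("GENE_A", "nonsynonymous", True, False, False),
    Variant("GENE_B", "nonsynonymous", True, False, True),
    Variant("GENE_C", "synonymous", True, False, False),
    Variant("GENE_D", "nonsynonymous", True, False, False),
]

if __name__ == "__main__":
    survivors, trace = reduce_variants(DEMO)
    for name, count in trace:
        print(f"{name}: {count} variants remain")
    print("candidate genes:", recessive_candidates(survivors))
```

Recording the count after each step mirrors how such a protocol is usually reported: each filter shrinks the candidate set, and the final gene list is what is left for manual review.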

Journal ArticleDOI
TL;DR: The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up to a full chromosome and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data.
Abstract: The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up to a full chromosome and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data. The database is optimized for fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. In the past year, 22 new assemblies and several new sets of human variation annotation have been released. New features include VisiGene, a fully integrated in situ hybridization image browser; phyloGif, for drawing evolutionary tree diagrams; a redesigned Custom Track feature; an expanded SNP annotation track; and many new display options. The Genome Browser, other tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.

1,061 citations

Journal ArticleDOI
TL;DR: A GWIA approach that selects combinations of SNPs for interaction analysis based on a priori information is presented and various meaningful GWIA strategies that can be conducted using INTERSNP are introduced.
Abstract: Summary: Genome-wide association studies (GWAS) have led to the identification of hundreds of genomic regions associated with complex diseases. Nevertheless, a large fraction of their heritability remains unexplained. Interaction between genetic variants is one of several putative explanations for the ‘case of missing heritability’ and, therefore, a compelling next analysis step. However, genome-wide interaction analysis (GWIA) of all pairs of SNPs from a standard marker panel is computationally unfeasible without massive parallelization. Furthermore, GWIA of all SNP triples is utopian. In order to overcome these computational constraints, we present a GWIA approach that selects combinations of SNPs for interaction analysis based on a priori information. Sources of information are statistical evidence (single marker association at a moderate level), genetic relevance (genomic location) and biologic relevance (SNP function class and pathway information). We introduce the software package INTERSNP that implements a logistic regression framework as well as log-linear models for joint analysis of multiple SNPs. Automatic handling of SNP annotation and pathways from the KEGG database is provided. In addition, Monte Carlo simulations to judge genome-wide significance are implemented. We introduce various meaningful GWIA strategies that can be conducted using INTERSNP. Typical examples are the analysis of all pairs of non-synonymous SNPs, or the analysis of all combinations of three SNPs that lie in a common pathway and that are among the top 50 000 single-marker results. We demonstrate the feasibility of these and other GWIA strategies by application to a GWAS dataset and discuss promising results. Availability: The software is available at http://intersnp.meb.unibonn.de

153 citations
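
To make the a priori selection idea in the abstract above concrete, here is a minimal sketch of one strategy it mentions: restrict to the top-ranked single-marker results, keep the non-synonymous SNPs, and enumerate only those pairs. This is not INTERSNP's implementation or interface; the record layout (`id`, `p_value`, `function`) and both function names are hypothetical.

```python
from itertools import combinations


def n_pairs(n):
    """Number of unordered SNP pairs among n markers: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def select_pairs(snps, top_n):
    """Pairs of non-synonymous SNPs among the top_n single-marker results."""
    ranked = sorted(snps, key=lambda s: s["p_value"])[:top_n]
    nonsyn = [s["id"] for s in ranked if s["function"] == "nonsynonymous"]
    return list(combinations(nonsyn, 2))


# Toy records; identifiers and p-values are made up.
DEMO_SNPS = [
    {"id": "rs_a", "p_value": 0.001, "function": "nonsynonymous"},
    {"id": "rs_b", "p_value": 0.20, "function": "nonsynonymous"},
    {"id": "rs_c", "p_value": 0.01, "function": "intronic"},
    {"id": "rs_d", "p_value": 0.03, "function": "nonsynonymous"},
    {"id": "rs_e", "p_value": 0.04, "function": "nonsynonymous"},
]

if __name__ == "__main__":
    # Exhaustive pairing of 50 000 markers is already over a billion tests.
    print(n_pairs(50_000))
    print(select_pairs(DEMO_SNPS, top_n=4))
```

The quadratic growth of `n_pairs` is the reason for filtering first: each annotation-based restriction shrinks `n` before any interaction model is fitted.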

Journal ArticleDOI
TL;DR: NGS-SNP is a collection of command-line scripts for providing rich annotations for SNPs identified by the sequencing of whole genomes from any organism with reference sequences in Ensembl, including the results of detailed comparisons with orthologous sequences.
Abstract: Summary: NGS-SNP is a collection of command-line scripts for providing rich annotations for SNPs identified by the sequencing of whole genomes from any organism with reference sequences in Ensembl. Included among the annotations, several of which are not available from any existing SNP annotation tools, are the results of detailed comparisons with orthologous sequences. These comparisons can, for example, identify SNPs that affect conserved residues, or alter residues or genes linked to phenotypes in another species. Availability: NGS-SNP is available both as a set of scripts and as a virtual machine. The virtual machine consists of a Linux operating system with all the NGS-SNP dependencies pre-installed. The source code and virtual machine are freely available for download at http://stothard.afns.ualberta.ca/downloads/NGS-SNP/. Contact: stothard@ualberta.ca Supplementary information: Supplementary data are available at Bioinformatics online.

104 citations
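
The ortholog comparison mentioned in the abstract above (flagging SNPs that affect conserved residues) can be sketched as a simple check. This is an illustrative assumption, not NGS-SNP's scoring method: the function name is hypothetical, and "conserved" is taken here to mean that every supplied ortholog carries the same amino acid as the reference at the aligned position.

```python
def affects_conserved_residue(ref_aa, alt_aa, ortholog_aas):
    """Return True if the SNP changes a residue shared by all orthologs.

    ref_aa       -- reference amino acid at the affected position
    alt_aa       -- amino acid produced by the variant allele
    ortholog_aas -- amino acids at the aligned position in other species
    """
    if ref_aa == alt_aa or not ortholog_aas:
        # Synonymous change, or nothing to compare against.
        return False
    return all(aa == ref_aa for aa in ortholog_aas)


if __name__ == "__main__":
    print(affects_conserved_residue("R", "H", ["R", "R", "R"]))
    print(affects_conserved_residue("R", "H", ["R", "K", "R"]))
```

A real pipeline would work from multiple sequence alignments and would likely tolerate gaps or conservative substitutions; the strict all-identical rule is just the simplest version of the idea.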

Journal ArticleDOI
TL;DR: The present results effectively narrowed down the associated regions compared to previous QTL studies and revealed haplotypes and candidate genes on SSC12 for meat quality traits in pigs.
Abstract: Pork quality is an economically important trait and one of the main selection criteria for breeding in the swine industry. In this genome-wide association study (GWAS), 455 pigs from a porcine Large White × Minzhu intercross population were genotyped using the Illumina PorcineSNP60K Beadchip, and phenotyped for intramuscular fat content (IMF), marbling, moisture, color L*, color a*, color b* and color score in the longissimus muscle (LM). Association tests between each trait and the SNPs were performed via the Genome Wide Rapid Association using the Mixed Model and Regression-Genomic Control (GRAMMAR-GC) approach. From the Ensembl porcine database, SNP annotation was implemented using Sus scrofa Build 9. A total of 45 SNPs showed significant association with one or multiple meat quality traits. Of the 45 SNPs, 36 were located on SSC12. These significantly associated SNPs aligned to or were in close proximity to previously reported quantitative trait loci (QTL), and some were located within introns of previously reported candidate genes. Two haplotype blocks, ASGA0100525-ASGA0055225-ALGA0067099-MARC0004712-DIAS0000861 and ASGA0085522-H3GA0056170, were detected in the significant region. The first block contained the genes MYH1, MYH2 and MYH4. A SNP (ASGA0094812) within an intron of the USP43 gene was significantly associated with five meat quality traits. The present results effectively narrowed down the associated regions compared to previous QTL studies and revealed haplotypes and candidate genes on SSC12 for meat quality traits in pigs.

82 citations

Network Information
Related Topics (5)
Genome: 74.2K papers, 3.8M citations, 73% related
Gene: 211.7K papers, 10.3M citations, 71% related
Gene expression: 113.3K papers, 5.5M citations, 69% related
Locus (genetics): 42.7K papers, 2M citations, 68% related
Regulation of gene expression: 85.4K papers, 5.8M citations, 68% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2021  5
2020  5
2019  4
2018  3
2017  1
2016  6