VAT: A computational framework to functionally annotate variants in personal genomes within a cloud-computing environment
Lukas Habegger, Suganthi Balasubramanian, David Z. Chen, Ekta Khurana, Andrea Sboner, Arif Harmanci, Joel Rozowsky, Declan Clarke, Michael Snyder, Mark Gerstein
TL;DR: The Variant Annotation Tool (VAT) was developed to functionally annotate variants from multiple personal genomes at the transcript level and to obtain summary statistics across genes and individuals.
Abstract:
Summary: The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment.
Availability and Implementation: VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org.
Contact: lukas.habegger@yale.edu or mark.gerstein@yale.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
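The abstract's point that a variant can affect alternatively spliced transcripts differently can be illustrated with a minimal Python sketch. This is not VAT's implementation (VAT is written in C and PHP); the `Transcript` class, the `annotate_position` and `fraction_exonic` functions, the label names and the toy coordinates are all hypothetical and chosen only for illustration. The sketch labels one position separately for each transcript of a gene and then computes a simple per-gene summary.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical transcript model with 1-based, inclusive exon intervals."""
    transcript_id: str
    gene: str
    chrom: str
    exons: Tuple[Tuple[int, int], ...]


def annotate_position(chrom: str, pos: int,
                      transcripts: List[Transcript]) -> Dict[str, str]:
    """Label a single variant position separately for every transcript."""
    labels: Dict[str, str] = {}
    for tx in transcripts:
        if tx.chrom != chrom:
            continue  # transcript lies on a different chromosome
        tx_start = min(start for start, _ in tx.exons)
        tx_end = max(end for _, end in tx.exons)
        if not tx_start <= pos <= tx_end:
            labels[tx.transcript_id] = "outside"
        elif any(start <= pos <= end for start, end in tx.exons):
            labels[tx.transcript_id] = "exonic"
        else:
            labels[tx.transcript_id] = "intronic"
    return labels


def fraction_exonic(labels: Dict[str, str]) -> float:
    """Summary statistic: share of annotated transcripts hit in an exon."""
    if not labels:
        return 0.0
    hits = sum(1 for label in labels.values() if label == "exonic")
    return hits / len(labels)


# Toy data (assumed, not from the paper): two isoforms of one gene,
# where tx2 skips the middle exon of tx1.
transcripts = [
    Transcript("tx1", "GENE_A", "chr1", ((100, 200), (300, 400), (500, 600))),
    Transcript("tx2", "GENE_A", "chr1", ((100, 200), (500, 600))),
]

labels = annotate_position("chr1", 350, transcripts)
print(labels)
print(fraction_exonic(labels))
```

In this toy case the same position falls in an exon of one isoform and in an intron of the other, which is why a single coordinate intersection per gene would lose information.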
Citations
Journal Article
Whole-genome sequence variation, population structure and demographic history of the Dutch population
Laurent C. Francioli, Androniki Menelaou, Sara L. Pulit, Freerk van Dijk, Pier Francesco Palamara, Clara C. Elbers, Pieter B. Neerincx, Kai Ye, Victor Guryev, Wigard P. Kloosterman, Patrick Deelen, Abdel Abdellaoui, Elisabeth M. van Leeuwen, Mannis van Oven, Martijn Vermaat, Mingkun Li, Jeroen F. J. Laros, Lennart C. Karssen, Alexandros Kanterakis, Najaf Amin, Jouke-Jan Hottenga, Eric-Wubbo Lameijer, Mathijs Kattenberg, Martijn Dijkstra, Heorhiy Byelas, Jessica van Setten, Barbera D. C. van Schaik, Jan Bot, Isaac J. Nijman, Ivo Renkens, Tobias Marschall, Alexander Schönhuth, Jayne Y. Hehir-Kwa, Robert E. Handsaker, Paz Polak, Mashaal Sohail, Dana Vuzman, Fereydoun Hormozdiari, David van Enckevort, Hailiang Mei, Vyacheslav Koval, Matthijs Moed, K. Joeri van der Velde, Fernando Rivadeneira, Karol Estrada, Carolina Medina-Gomez, Aaron Isaacs, Steven A. McCarroll, Marian Beekman, Anton J. M. de Craen, H. Eka D. Suchiman, Albert Hofman, Ben A. Oostra, André G. Uitterlinden, Gonneke Willemsen, Mathieu Platteel, Jan H. Veldink, Leonard H. van den Berg, Steven J. Pitts, Shobha Potluri, Purnima Sundar, David R. Cox, Shamil R. Sunyaev, Johan T. den Dunnen, Mark Stoneking, Peter de Knijff, Manfred Kayser, Qibin Li, Yingrui Li, Yuanping Du, Ruoyan Chen, Hongzhi Cao, Ning Li, Sujie Cao, Jun Wang, Jasper A. Bovenberg, Itsik Pe'er, P. Eline Slagboom, Cornelia M. van Duijn, Dorret I. Boomsma, Gert-Jan B. van Ommen, Paul I.W. de Bakker, Morris A. Swertz, Cisca Wijmenga
TL;DR: The Genome of the Netherlands (GoNL) Project is described, in which the whole genomes of 250 Dutch parent-offspring families were sequenced and a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions was constructed.
Journal Article
Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR
TL;DR: A protocol to use the ANNOVAR (ANNOtate VARiation) software to facilitate fast and easy variant annotations, including gene-based, region-based and filter-based annotations on a variant call format (VCF) file generated from human genomes.
Journal Article
Integrative Annotation of Variants from 1092 Humans: Application to Cancer Genomics
Ekta Khurana, Yao Fu, Vincenza Colonna, Xinmeng Jasmine Mu, Hyun Min Kang, Tuuli Lappalainen, Andrea Sboner, Lucas Lochovsky, Jieming Chen, Arif Harmanci, Jishnu Das, Alexej Abyzov, Suganthi Balasubramanian, Kathryn Beal, Dimple Chakravarty, Danny Challis, Yuan Chen, Declan Clarke, Laura Clarke, Fiona Cunningham, Uday S. Evani, Paul Flicek, Robert Fragoza, Erik Garrison, Richard A. Gibbs, Zeynep H. Gümüş, Javier Herrero, Naoki Kitabayashi, Yong Kong, Kasper Lage, Vaja Liluashvili, Steven M. Lipkin, Daniel G. MacArthur, Gabor T. Marth, Donna M. Muzny, Tune H. Pers, Graham R. S. Ritchie, Jeffrey A. Rosenfeld, Cristina Sisu, Xiaomu Wei, Michael Wilson, Yali Xue, Fuli Yu, Emmanouil T. Dermitzakis, Haiyuan Yu, Mark A. Rubin, Chris Tyler-Smith, Mark Gerstein
TL;DR: In this article, the authors used patterns of polymorphisms in functionally annotated regions in 1092 humans to identify deleterious variants; then, they experimentally validated candidates, finding regions particularly sensitive to mutations and variants that are disruptive because of mechanistic effects on transcription-factor binding.
Journal Article
FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer.
Yao Fu, Zhu Liu, Shaoke Lou, Jason Bedford, Xinmeng Jasmine Mu, Kevin Y. Yip, Ekta Khurana, Mark Gerstein
TL;DR: FunSeq2 is a computational framework that annotates and prioritizes noncoding drivers among the thousands of somatic alterations in a typical tumor, combining an adjustable data context that integrates large-scale genomics and cancer resources with a streamlined variant-prioritization pipeline.
Journal Article
Phenolyzer: phenotype-based prioritization of candidate genes for human diseases
TL;DR: Phenolyzer is a tool that uses prior information to implicate genes involved in diseases, and exhibits superior performance over competing methods for prioritizing Mendelian and complex disease genes, based on disease or phenotype terms entered as free text.
References
Journal Article
ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data
TL;DR: The ANNOVAR tool was developed to annotate single nucleotide variants and insertions/deletions, for example by examining their functional consequences on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP.
Journal Article
The variant call format and VCFtools
Petr Danecek, Adam Auton, Gonçalo R. Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert E. Handsaker, Gerton Lunter, Gabor T. Marth, Stephen T. Sherry, Gilean McVean, Richard Durbin
TL;DR: VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API.
Journal Article
A Map of Human Genome Variation From Population-Scale Sequencing
Gonçalo R. Abecasis, David Altshuler, Adam Auton, Lisa D Brooks, Richard Durbin, Richard A. Gibbs, Matthew E. Hurles, Gil McVean
TL;DR: The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. This paper presents the results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms.
Journal Article
Table S2: Trans-factors and trinucleotide repeat instability Trans-factor
Journal Article
SIFT: predicting amino acid changes that affect protein function
Pauline C. Ng, Steven Henikoff
TL;DR: SIFT is a program that predicts whether an amino acid substitution affects protein function, so that users can prioritize substitutions for further study. It can distinguish between functionally neutral and deleterious amino acid changes in mutagenesis studies and in human polymorphisms.