Topic

Variant Call Format

About: Variant Call Format (VCF) is a research topic. Over its lifetime, 179 publications have appeared within this topic, receiving 93,623 citations.

Papers

Journal Article

A standard variation file format for human genome sequences

Martin G. Reese, Barry Moore¹, Colin Batchelor², Fidel Salas, Fiona Cunningham³, Gabor T. Marth⁴, Lincoln Stein⁵, Paul Flicek³, Mark Yandell¹, Karen Eilbeck¹

University of Utah¹, Royal Society of Chemistry², European Bioinformatics Institute³, Boston College⁴, Ontario Institute for Cancer Research⁵

26 Aug 2010-Genome Biology

TL;DR: The Genome Variation Format (GVF), an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files, which uses Sequence Ontology to describe genome variation data.

Abstract: Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files, which uses Sequence Ontology to describe genome variation data. The 10Gen dataset, ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and from an Amazon elastic block storage (EBS) snapshot for use in Amazon's EC2 cloud computing environment.
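To make "tab-delimited extension of GFF3" concrete, here is a minimal Python sketch of reading one GVF-style record. It assumes the nine standard GFF3 columns and `key=value` attribute pairs separated by semicolons. The function name `parse_gvf_line` and the example record are hypothetical and are not taken from the paper or the 10Gen dataset.

```python
# Column names follow the nine-column GFF3 layout that GVF extends.
GFF3_COLUMNS = [
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
]

def parse_gvf_line(line):
    """Parse one tab-delimited GVF/GFF3 feature line into a dict (illustrative helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based integers in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes are semicolon-separated key=value pairs; values may be comma lists.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value.split(",")
    record["attributes"] = attributes
    return record

# A made-up record for illustration only; the type column holds a Sequence Ontology term.
EXAMPLE = "chr16\texample_caller\tSNV\t49291141\t49291141\t.\t+\t.\tID=ID_1;Variant_seq=A,G;Reference_seq=G"

record = parse_gvf_line(EXAMPLE)
print(record["type"], record["start"], record["attributes"]["Variant_seq"])
```

The sketch skips header pragmas (lines starting with `##`) and URL-escaped characters, which a real parser would need to handle.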

92 citations

Journal Article

VCF-kit: assorted utilities for the variant call format.

Daniel E. Cook¹, Erik C. Andersen¹

Northwestern University¹

15 May 2017-Bioinformatics

TL;DR: VCF‐kit adds essential utilities to process and analyze VCF files, including primer generation for variant validation, dendrogram production, genotype imputation from sequence data in linkage studies, and additional tools.

Abstract: Summary: The variant call format (VCF) is a popular standard for storing genetic variation data. As a result, a large collection of tools has been developed that perform diverse analyses using VCF files. However, some tasks common to statistical and population geneticists have not been created yet. To streamline these types of analyses, we created novel tools that analyze or annotate VCF files and organized these tools into a command-line based utility named VCF-kit. VCF-kit adds essential utilities to process and analyze VCF files, including primer generation for variant validation, dendrogram production, genotype imputation from sequence data in linkage studies, and additional tools. Availability and Implementation: https://github.com/AndersenLab/VCF-kit. Contact: erik.andersen@northwestern.edu.

86 citations

Journal Article

The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments.

Jean-Simon Brouard¹, Flavio S Schenkel², Andrew Marete¹, Nathalie Bissonnette¹

Agriculture and Agri-Food Canada¹, University of Guelph²

21 Jun 2019-Journal of Animal Science and Biotechnology

TL;DR: The GATK joint genotyping method for calling variants on RNA-seq data was validated by comparing this approach to a so-called “per-sample” method, indicating that both approaches are very close in their capacity to detect reference variants and that the joint genotyping method is more sensitive than the per-sample method.

Abstract: The Genome Analysis Toolkit (GATK) is a popular set of programs for discovering and genotyping variants from next-generation sequencing data. The current GATK recommendation for RNA sequencing (RNA-seq) is to perform variant calling from individual samples, with the drawback that only variable positions are reported. Versions 3.0 and above of GATK offer the possibility of calling DNA variants on cohorts of samples using the HaplotypeCaller algorithm in Genomic Variant Call Format (GVCF) mode. Using this approach, variants are called individually on each sample, generating one GVCF file per sample that lists genotype likelihoods and their genome annotations. In a second step, variants are called from the GVCF files through a joint genotyping analysis. This strategy is more flexible and reduces computational challenges in comparison to the traditional joint discovery workflow. Using a GVCF workflow for mining SNPs in RNA-seq data provides substantial advantages, including reporting homozygous genotypes for the reference allele as well as missing data. Taking advantage of RNA-seq data derived from primary macrophages isolated from 50 cows, the GATK joint genotyping method for calling variants on RNA-seq data was validated by comparing this approach to a so-called “per-sample” method. In addition, pair-wise comparisons of the two methods were performed to evaluate their respective sensitivity, precision and accuracy using DNA genotypes from a companion study including the same 50 cows genotyped using either genotyping-by-sequencing or with the Bovine SNP50 Beadchip (imputed to the Bovine high density). Results indicate that both approaches are very close in their capacity to detect reference variants and that the joint genotyping method is more sensitive than the per-sample method. Given that the joint genotyping method is more flexible and technically easier, we recommend this approach for variant calling in RNA-seq experiments.

72 citations

Journal Article

Jannovar: a Java library for exome annotation.

Marten Jäger¹, Kai Wang², Sebastian Bauer¹, Damian Smedley³, Peter Krawitz¹,⁴, Peter N. Robinson

Charité¹, University of Southern California², Wellcome Trust Sanger Institute³, Max Planck Society⁴

01 May 2014-Human Mutation

TL;DR: Jannovar, a stand‐alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis, uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society‐compliant annotations.

Abstract: Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar.
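The interval-tree lookup mentioned in the abstract can be sketched in Python as below. This is a generic centered interval tree, not Jannovar's actual Java implementation. The class name `IntervalTree`, the transcript labels, and the coordinates are all hypothetical, and intervals are treated as closed on both ends.

```python
class IntervalTree:
    """Centered interval tree over closed intervals given as (start, end, label)."""

    def __init__(self, intervals):
        self.center = None
        self.left = None
        self.right = None
        self.by_start = []
        self.by_end = []
        if not intervals:
            return
        # Use the median endpoint as the center; at least one interval contains it.
        points = sorted(p for start, end, _ in intervals for p in (start, end))
        self.center = points[len(points) // 2]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        here = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        # Two sorted views of the overlapping intervals allow early exit in queries.
        self.by_start = sorted(here, key=lambda iv: iv[0])
        self.by_end = sorted(here, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalTree(left) if left else None
        self.right = IntervalTree(right) if right else None

    def query(self, pos):
        """Return the labels of all intervals that contain position pos."""
        if self.center is None:
            return []
        hits = []
        if pos < self.center:
            for start, _, label in self.by_start:
                if start > pos:
                    break
                hits.append(label)
            if self.left:
                hits += self.left.query(pos)
        elif pos > self.center:
            for _, end, label in self.by_end:
                if end < pos:
                    break
                hits.append(label)
            if self.right:
                hits += self.right.query(pos)
        else:
            hits = [label for _, _, label in self.by_start]
        return hits

# Made-up transcript coordinates on a single chromosome, for illustration only.
transcripts = [(100, 500, "tx1"), (300, 900, "tx2"), (1000, 1200, "tx3")]
tree = IntervalTree(transcripts)
print(sorted(tree.query(350)))
```

A variant position is looked up once and every transcript spanning it is returned, so overlapping transcripts can all be annotated without scanning the full transcript list.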

63 citations

Journal Article

A standard file format for data from DNA sequencing instruments.

Simon Dear¹, Rodger Staden¹

Laboratory of Molecular Biology¹

01 Jan 1992-DNA Sequence

TL;DR: A machine independent format for storing data derived from automatic sequencing machines is described, which can store the derived sequence, the traces and a set of confidence measures for each base.

Abstract: There are now a number of machines for determining DNA sequences. These devices are currently of two types: those such as the Applied Biosystems 373A and the Pharmacia A.L.F. which interpret the sequences of samples as they run on gels within the machine, and those, such as the Bio-Rad and Amersham readers that scan and analyse conventional autoradiographs. Both types of machine can produce their data in the form of traces which represent the band intensity of each of the four base types at each position in the sequence. At present all the machines write files in different formats. We describe a machine independent format for storing data derived from automatic sequencing machines. Files in this format can store the derived sequence, the traces and a set of confidence measures for each base. We have adopted the format as the standard for our sequence handling software.

60 citations

Performance Metrics

Papers: 179
Citations: 121,241

No. of papers in the topic in previous years
Year	Papers
2022	1
2021	20
2020	17
2019	22
2018	17
2017	16