Open Access, Journal Article

BCFtools/csq: haplotype-aware variant consequences.

Petr Danecek, +1 more
01 Jul 2017, Vol. 33, Iss. 13, pp. 2037-2039
TLDR
BCFtools/csq is a fast program for haplotype-aware consequence calling that can take known phase into account. Its predictions match those of existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory.
Abstract
Motivation: Prediction of functional variant consequences is an important part of sequencing pipelines, allowing the categorization and prioritization of genetic variants for follow up analysis. However, current predictors analyze variants as isolated events, which can lead to incorrect predictions when adjacent variants alter the same codon, or when a frame-shifting indel is followed by a frame-restoring indel. Exploiting known haplotype information when making consequence predictions can resolve these issues.

Results: BCFtools/csq is a fast program for haplotype-aware consequence calling which can take into account known phase. Consequence predictions are changed for 501 of 5019 compound variants found in the 81.7M variants in the 1000 Genomes Project data, with an average of 139 compound variants per haplotype. Predictions match existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory.

Availability and implementation: The program is freely available for commercial and non-commercial use in the BCFtools package which is available for download from http://samtools.github.io/bcftools.

Contact: pd3@sanger.ac.uk.

Supplementary information: Supplementary data are available at Bioinformatics online.
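
The problem of adjacent variants altering the same codon can be illustrated with a small sketch. The following Python is not the BCFtools/csq implementation; it is a toy example under stated assumptions: the function names (`apply_snvs`, `consequence`), the consequence labels, and the chosen codon are all hypothetical, and the codon table covers only the four codons needed here. It compares predicting each substitution in isolation against predicting both together, as one would when the two variants are known to lie on the same haplotype.

```python
# Toy illustration only: NOT the BCFtools/csq algorithm.
# Partial codon table (standard genetic code) with just the codons used below.
CODON_TABLE = {"CAG": "Q", "TAG": "*", "CGG": "R", "TGG": "W"}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify the change with illustrative (hypothetical) labels."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    return f"missense {ref_aa}>{alt_aa}"


# Hypothetical reference codon and two SNVs on the same haplotype.
ref = "CAG"
snvs = [(0, "T"), (1, "G")]

# Each variant analyzed as an isolated event.
isolated = [consequence(ref, apply_snvs(ref, [v])) for v in snvs]

# Both variants applied together, using the known phase.
joint = consequence(ref, apply_snvs(ref, snvs))

print("isolated:", isolated)
print("joint:   ", joint)
```

In this example the first substitution alone appears to introduce a stop codon (CAG to TAG), whereas the two substitutions together yield TGG, a missense change. This is the kind of discrepancy that a haplotype-aware predictor is meant to resolve.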


Citations
Journal Article

Rapid genotyping of targeted viral samples using Illumina short-read sequencing data

TL;DR: This paper presents a pipeline designed to reconstruct the dominant consensus genome of viral samples and analyze their within-host variability. The approach was benchmarked on numerous datasets, which showed that the consensus could be obtained reliably without further manual data curation.
Posted Content

Temporal GWAS identifies a widely distributed putative adhesin contributing to pathogen success in Shigella spp

TL;DR: The results indicate the potential importance of Stv in controlling Shigella and other infections, support the validity of a tGWAS approach for identifying biological drivers underpinning the evolution and expansion of AMR pathogens over time, and highlight the effectiveness of using tGWAS on historical isolate collections for identifying novel contributors to pathogen success.
Journal Article

Ancient mitochondrial genome diversity in South America: Contributions from Quebrada del Toro, Northwestern Argentina.

TL;DR: In this article, the authors analyzed the complete ancient mitogenome of individuals from the Ojo de Agua archeological site (970 BP) in Quebrada del Toro (Salta, Argentina).
Journal Article

Extensive genome introgression between domestic ferret and European polecat during population recovery in Great Britain

TL;DR: This article carried out population-level whole-genome sequencing on 8 domestic ferrets, 19 British European polecats, and 15 European polecats from the European mainland, and found high degrees of genome introgression in British polecats outside their previous stronghold, even in those individuals phenotyped as “pure” polecats.
Posted Content

Machine-learning prediction of resistance to sub-inhibitory antimicrobial concentrations from Escherichia coli genomes

TL;DR: In this paper, the authors used a high throughput phenotypic assay to measure bacterial growth of a systematic collection of natural Escherichia coli strains and then employed machine learning models to predict bacterial growth from genomic data under non-therapeutic sub-inhibitory concentrations of antimicrobials.