BCFtools/csq: haplotype-aware variant consequences.
Petr Danecek,Shane A. McCarthy +1 more
TLDR
BCFtools/csq is a fast program for haplotype-aware consequence calling that can take known phase into account. Predictions match existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory.
Abstract:
Motivation: Prediction of functional variant consequences is an important part of sequencing pipelines, allowing the categorization and prioritization of genetic variants for follow-up analysis. However, current predictors analyze variants as isolated events, which can lead to incorrect predictions when adjacent variants alter the same codon, or when a frame-shifting indel is followed by a frame-restoring indel. Exploiting known haplotype information when making consequence predictions can resolve these issues.
Results: BCFtools/csq is a fast program for haplotype-aware consequence calling which can take into account known phase. Consequence predictions are changed for 501 of 5019 compound variants found in the 81.7M variants in the 1000 Genomes Project data, with an average of 139 compound variants per haplotype. Predictions match existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory.
Availability and implementation: The program is freely available for commercial and non-commercial use in the BCFtools package, which is available for download from http://samtools.github.io/bcftools.
Contact: pd3@sanger.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
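The motivating case in the abstract, adjacent variants altering the same codon, can be illustrated with a small Python sketch. This is not the BCFtools/csq implementation: the reference codon, the two SNVs, the helper names, and the simplified consequence labels are all hypothetical choices made for illustration, and the sketch assumes both variants lie on the same haplotype.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    seq = list(codon)
    for offset, alt in snvs:
        seq[offset] = alt
    return "".join(seq)


def classify(ref_codon, alt_codon):
    """Assign a simplified consequence label (illustrative, not exhaustive)."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


# Hypothetical example: reference codon CGA (Arg) carrying two SNVs
# that are assumed to sit on the same haplotype.
REF_CODON = "CGA"
HAPLOTYPE_SNVS = [(0, "T"), (1, "C")]

# Isolated analysis: each variant is applied to the reference on its own.
isolated = [
    classify(REF_CODON, apply_snvs(REF_CODON, [snv])) for snv in HAPLOTYPE_SNVS
]

# Haplotype-aware analysis: both variants are applied together.
joint = classify(REF_CODON, apply_snvs(REF_CODON, HAPLOTYPE_SNVS))

print("isolated:", isolated)
print("haplotype-aware:", joint)
```

In this toy case the isolated analysis reports a stop gain for the first SNV (CGA to TGA), whereas applying both SNVs together yields TCA (Ser), a missense change.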
Citations
Journal Article
Rapid genotyping of targeted viral samples using Illumina short-read sequencing data
TL;DR: This paper presents a pipeline designed to reconstruct the dominant consensus genome of viral samples and analyze their within-host variability. The authors benchmarked the approach on numerous datasets and showed that the consensus genome could be obtained reliably without further manual data curation.
Posted Content
Temporal GWAS identifies a widely distributed putative adhesin contributing to pathogen success in Shigella spp
Rebecca J. Bennett,P. Malaka De Silva,Rebecca J. Bengtsson,Malcolm J. Horsburgh,Tim R. Blower,Kate S. Baker +5 more
TL;DR: The results indicate the potential importance of Stv in controlling Shigella and other infections and the validity of a tGWAS approach for identifying biological drivers underpinning the evolution and expansion of AMR pathogens over time. They also highlight the effectiveness of using tGWAS on historical isolate collections for identifying novel contributors to pathogen success.
Journal Article
Ancient mitochondrial genome diversity in South America: Contributions from Quebrada del Toro, Northwestern Argentina.
María Gabriela Russo,Valeria Arencibia,Matthew V. Emery,P. Mercolli,Lucas Luciano Maldonado,Laura Kamenetzky,Sergio Alejandro Avena,Cristina B. Dejean,Anne C. Stone +8 more
TL;DR: In this article, the authors analyzed the complete ancient mitogenome of individuals from the Ojo de Agua archeological site (970 BP) in Quebrada del Toro (Salta, Argentina).
Journal Article
Extensive genome introgression between domestic ferret and European polecat during population recovery in Great Britain
TL;DR: The authors carried out population-level whole-genome sequencing on 8 domestic ferrets, 19 British European polecats, and 15 European polecats from the European mainland, and found high degrees of genome introgression in British polecats outside their previous stronghold, even in those individuals phenotyped as “pure” polecats.
Posted Content
Machine-learning prediction of resistance to sub-inhibitory antimicrobial concentrations from Escherichia coli genomes
Sam Benkwitz-Bedford,Martin Palm,Talip Yasir Demirtas,Ville Mustonen,Anne Farewell,Jonas Warringer,Danesh Moradigaravand,Leopold Parts +8 more
TL;DR: In this paper, the authors used a high throughput phenotypic assay to measure bacterial growth of a systematic collection of natural Escherichia coli strains and then employed machine learning models to predict bacterial growth from genomic data under non-therapeutic sub-inhibitory concentrations of antimicrobials.