Open Access Journal Article

Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications

TL;DR
The performance of Platypus is demonstrated by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.
Abstract
High-throughput DNA sequencing technology has transformed genetic research and is starting to make an impact on clinical practice. However, analyzing high-throughput sequencing data remains challenging, particularly in clinical settings where accuracy and turnaround times are critical. We present a new approach to this problem, implemented in a software package called Platypus. Platypus achieves high sensitivity and specificity for SNPs, indels and complex polymorphisms by using local de novo assembly to generate candidate variants, followed by local realignment and probabilistic haplotype estimation. It is an order of magnitude faster than existing tools and generates calls from raw aligned read data without preprocessing. We demonstrate the performance of Platypus in clinically relevant experimental designs by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.
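The probabilistic haplotype step mentioned in the abstract can be illustrated with a minimal sketch. This is not the Platypus implementation: it assumes a toy ungapped error model with a fixed per-base error rate, candidate haplotypes that are already given, and equal weight for both chromosomes of a diploid sample. All function and variable names (read_likelihood, genotype_log_likelihoods, the example haplotypes and reads) are hypothetical and chosen for illustration only.

```python
import math
from itertools import combinations_with_replacement


def read_likelihood(read, haplotype, offset, error_rate=0.01):
    """P(read | haplotype) under a toy model: ungapped alignment at a known
    offset, independent base errors at a fixed rate (an assumption)."""
    p = 1.0
    for i, base in enumerate(read):
        if base == haplotype[offset + i]:
            p *= 1.0 - error_rate
        else:
            # A miscalled base is assumed equally likely to be any other base.
            p *= error_rate / 3.0
    return p


def genotype_log_likelihoods(reads, haplotypes, error_rate=0.01):
    """Log-likelihood of every unordered diploid genotype (haplotype pair).

    Each read is assumed to come from either chromosome with probability 0.5.
    """
    result = {}
    for h1, h2 in combinations_with_replacement(sorted(haplotypes), 2):
        total = 0.0
        for read, offset in reads:
            p1 = read_likelihood(read, haplotypes[h1], offset, error_rate)
            p2 = read_likelihood(read, haplotypes[h2], offset, error_rate)
            total += math.log(0.5 * p1 + 0.5 * p2)
        result[(h1, h2)] = total
    return result


# Hypothetical candidate haplotypes for one small window: reference and a SNP.
haplotypes = {"ref": "ACGTACGT", "alt": "ACGTTCGT"}
# Hypothetical reads as (sequence, start offset); two support each haplotype.
reads = [("GTAC", 2), ("GTTC", 2), ("TACG", 3), ("TTCG", 3)]

log_liks = genotype_log_likelihoods(reads, haplotypes)
best = max(log_liks, key=log_liks.get)
print(best)
```

With two reads supporting each haplotype, the heterozygous pair scores highest. A production caller would additionally handle indels through realignment, use per-base quality scores, and place priors on haplotype frequencies, none of which is modeled here.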

