Journal Article

A global reference for human genetic variation.

Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, and 514 more authors (90 institutions)
01 Oct 2015 - Nature (Nature Publishing Group) - Vol. 526, Iss. 7571, pp. 68-74
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
Citations
Journal Article
TL;DR: The addition of bevacizumab may increase the pCR after standard neoadjuvant chemotherapy for patients with TNBC with BRCA1/2 mutations, and in patients treated with anthracycline and taxane-based chemotherapy, pCR was a weaker predictor of DFS than for patients without mutations.
Abstract: Purpose: BRCA1/2 mutations are frequent in patients with triple-negative breast cancer (TNBC). These patients are often treated with primary systemic chemotherapy. The aim of this study was to analyze the effects of BRCA1/2 mutations on pathologic complete response (pCR) and disease-free survival (DFS) in a cohort of patients with TNBC treated with anthracycline and taxane-containing chemotherapy, with or without bevacizumab. Patients and Methods: Germline DNA was sequenced to identify mutations in BRCA1 and BRCA2 in 493 patients with TNBC from the GeparQuinto study. The pCR rates were compared in patients with and without mutation, as well as in patients treated with and without bevacizumab. In addition, the influence of BRCA1/2 mutation status and pCR status on DFS was evaluated relative to treatment. Results: BRCA1/2 mutations were detected in 18.3% of patients with TNBC. Overall, patients with mutations had a pCR rate of 50%, compared with 31.5% in patients without a mutation (odds ratio [OR], 2.17; 95% CI, 1...

73 citations

Journal Article
01 Jul 2018 - Genetics
TL;DR: It is shown that using parsimony to infer the ancestral state at a specific site seriously breaks down in two situations, and a method is presented that is capable of providing nearly unbiased estimates of ancestral state probabilities on a site-by-site basis and the uSFS.
Abstract: It is known that the allele ancestral to the variation at a polymorphic site cannot be assigned with certainty, and that the most frequently used method to assign the ancestral state, maximum parsimony, is prone to misinference. Estimates of counts of sites that have a certain number of copies of the derived allele in a sample (the unfolded site frequency spectrum, uSFS) made by parsimony are therefore also biased. We previously developed a maximum likelihood method to estimate the uSFS for a focal species using information from two outgroups while assuming simple models of nucleotide substitution. Here, we extend this approach to allow multiple outgroups (implemented for three outgroups), potentially any phylogenetic tree topology, and more complex models of nucleotide substitution. We find, however, that two outgroups and the Kimura two-parameter model are adequate for uSFS inference in most cases. We show that using parsimony to infer the ancestral state at a specific site seriously breaks down in two situations. The first is where the outgroups provide no information about the ancestral state of variation in the focal species. In this case, nucleotide variation will be underestimated if such sites are excluded. The second is where the minor allele in the focal species agrees with the allelic state of the outgroups. In this situation, parsimony tends to overestimate the probability of the major allele being derived, because it fails to account for the fact that sites with a high frequency of the derived allele tend to be rare. We present a method that corrects this deficiency and is capable of providing nearly unbiased estimates of ancestral state probabilities on a site-by-site basis and the uSFS.

73 citations

Journal Article
TL;DR: The genomic investigation provides insight into the people associated with this long-standing megalith funerary tradition, including their social dynamics, and shows kin relations among the buried individuals and an overrepresentation of males, suggesting that at least some of these funerary monuments were used by patrilineal societies.
Abstract: Paleogenomic and archaeological studies show that Neolithic lifeways spread from the Fertile Crescent into Europe around 9000 BCE, reaching northwestern Europe by 4000 BCE. Starting around 4500 BCE, a new phenomenon of constructing megalithic monuments, particularly for funerary practices, emerged along the Atlantic facade. While it has been suggested that the emergence of megaliths was associated with the territories of farming communities, the origin and social structure of the groups that erected them has remained largely unknown. We generated genome sequence data from human remains, corresponding to 24 individuals from five megalithic burial sites, encompassing the widespread tradition of megalithic construction in northern and western Europe, and analyzed our results in relation to the existing European paleogenomic data. The various individuals buried in megaliths show genetic affinities with local farming groups within their different chronological contexts. Individuals buried in megaliths display (past) admixture with local hunter-gatherers, similar to that seen in other Neolithic individuals in Europe. In relation to the tomb populations, we find significantly more males than females buried in the megaliths of the British Isles. The genetic data show close kin relationships among the individuals buried within the megaliths, and for the Irish megaliths, we found a kin relation between individuals buried in different megaliths. We also see paternal continuity through time, including the same Y-chromosome haplotypes reoccurring. These observations suggest that the investigated funerary monuments were associated with patrilineal kindred groups. Our genomic investigation provides insight into the people associated with this long-standing megalith funerary tradition, including their social dynamics.

72 citations

Journal Article
21 Mar 2019 - Cell
TL;DR: The field of population genomics holds great potential for providing further insights into the evolution of human disease, and has shown that demographic processes shaped the distribution and frequency of disease-associated variants over time.

72 citations

Journal Article
TL;DR: The contribution of the noncoding genome and its alteration in the development and progression of cancer is showcased, and the opportunities to translate the biological characterization of genetic and epigenetic alterations in the noncoding cancer genome into novel approaches to treat or monitor disease are highlighted.
Abstract: The emergence of whole-genome annotation approaches is paving the way for the comprehensive annotation of the human genome across diverse cell and tissue types exposed to various environmental conditions. This has already unmasked the positions of thousands of functional cis-regulatory elements integral to transcriptional regulation, such as enhancers, promoters, and anchors of chromatin interactions that populate the noncoding genome. Recent studies have shown that cis-regulatory elements are commonly the targets of genetic and epigenetic alterations associated with aberrant gene expression in cancer. Here, we review these findings to showcase the contribution of the noncoding genome and its alteration in the development and progression of cancer. We also highlight the opportunities to translate the biological characterization of genetic and epigenetic alterations in the noncoding cancer genome into novel approaches to treat or monitor disease. Significance: The majority of genetic and epigenetic alterations accumulate in the noncoding genome throughout oncogenesis. Discriminating driver from passenger events is a challenge that holds great promise to improve our understanding of the etiology of different cancer types. Advancing our understanding of the noncoding cancer genome may thus identify new therapeutic opportunities and accelerate our capacity to find improved biomarkers to monitor various stages of cancer development. Cancer Discov; 6(11); 1215–29. ©2016 AACR.

72 citations

References
Journal Article
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations

Journal Article
TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, a variant caller and an alignment viewer, and thus provides universal tools for processing read alignments.
Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. Availability: http://samtools.sourceforge.net

45,957 citations

Journal Article
TL;DR: A new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format, which allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.
Abstract: Motivation: Testing for correlations between different sets of genomic features is a fundamental task in genomics research. However, searching for overlaps between features with existing web-based methods is complicated by the massive datasets that are routinely produced with current sequencing technologies. Fast and flexible tools are therefore required to ask complex questions of these data in an efficient manner. Results: This article introduces a new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. BEDTools can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions of large genomic datasets. Availability and implementation: BEDTools was written in C++. Source code and a comprehensive user manual are freely available at http://code.google.com/p/bedtools

18,858 citations

Journal Article
06 Sep 2012 - Nature
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal Article
TL;DR: VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API.
Abstract: Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API. Availability: http://vcftools.sourceforge.net

10,164 citations