Journal ArticleDOI

A global reference for human genetic variation.

Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, and 514 more authors (90 institutions)
01 Oct 2015-Nature (Nature Publishing Group)-Vol. 526, Iss: 7571, pp 68-74
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
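As a quick check of the totals quoted in the abstract, the three variant classes can be summed directly. This is only an arithmetic sketch: the counts are the rounded figures from the abstract, and the variable names are illustrative.

```python
# Rounded counts as quoted in the abstract (not exact call-set sizes).
snps = 84_700_000               # single nucleotide polymorphisms
indels = 3_600_000              # short insertions/deletions
structural_variants = 60_000

total = snps + indels + structural_variants
snp_share = snps / total        # fraction of all variants that are SNPs

print(f"total variants: {total:,}")
print(f"SNP share: {snp_share:.1%}")
```

The sum comes to 88,360,000, consistent with the abstract's "over 88 million variants".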
Citations
Journal ArticleDOI
TL;DR: Pilot studies suggest that genetic testing can also provide similar diagnostic insight among adult patients, and the creation of evidence-based guidelines for the utilization and implementation of genetic testing in nephrology will help to translate genetic knowledge into improved clinical outcomes for patients with kidney disease.
Abstract: Technologies such as next-generation sequencing and chromosomal microarray have advanced the understanding of the molecular pathogenesis of a variety of renal disorders. Genetic findings are increasingly used to inform the clinical management of many nephropathies, enabling targeted disease surveillance, choice of therapy, and family counselling. Genetic analysis has excellent diagnostic utility in paediatric nephrology, as illustrated by sequencing studies of patients with congenital anomalies of the kidney and urinary tract and steroid-resistant nephrotic syndrome. Although additional investigation is needed, pilot studies suggest that genetic testing can also provide similar diagnostic insight among adult patients. Reaching a genetic diagnosis first involves choosing the appropriate testing modality, as guided by the clinical presentation of the patient and the number of potential genes associated with the suspected nephropathy. Genome-wide sequencing increases diagnostic sensitivity relative to targeted panels, but holds the challenges of identifying causal variants in the vast amount of data generated and interpreting secondary findings. In order to realize the promise of genomic medicine for kidney disease, many technical, logistical, and ethical questions that accompany the implementation of genetic testing in nephrology must be addressed. The creation of evidence-based guidelines for the utilization and implementation of genetic testing in nephrology will help to translate genetic knowledge into improved clinical outcomes for patients with kidney disease.

96 citations

Journal ArticleDOI
TL;DR: Structural variants from 795 newly diagnosed patients are analyzed and translocations involving the immunoglobulin lambda (IgL) locus are reported to be present in 10% of patients, indicative of poor prognosis, and IgL-MYC-translocated myeloma is being misclassified.
Abstract: Multiple myeloma is a malignancy of antibody-secreting plasma cells. Most patients benefit from current therapies; however, 20% of patients relapse or die within two years and are deemed high risk. Here we analyze structural variants from 795 newly diagnosed patients as part of the CoMMpass study. We report translocations involving the immunoglobulin lambda (IgL) locus are present in 10% of patients, and indicative of poor prognosis. This is particularly true for IgL-MYC translocations, which coincide with focal amplifications of enhancers at both loci. Importantly, 78% of IgL-MYC translocations co-occur with hyperdiploid disease, a marker of standard risk, suggesting that IgL-MYC-translocated myeloma is being misclassified. Patients with IgL-translocations fail to benefit from IMiDs, which target IKZF1, a transcription factor that binds the IgL enhancer at some of the highest levels in the myeloma epigenome. These data implicate IgL translocation as a driver of poor prognosis which may be due to IMiD resistance. Multiple myeloma is frequently characterised by translocation of genes next to the immunoglobulin heavy chain locus. In this study, the authors sequence a large cohort of high risk myeloma samples and find translocations of cMyc to the immunoglobulin heavy chain locus and this is associated with poor prognosis.

96 citations

Journal ArticleDOI
TL;DR: These studies provide the framework for comprehensive system-level analysis of the GRN underlying the development of a single sensory neuron, the rod photoreceptor, and define the relationship of NRL with other transcriptional regulators and downstream cognate effectors.

96 citations

Journal ArticleDOI
17 Oct 2019-Cell
TL;DR: Whole-genome sequencing of 4,810 Singapore Chinese, Malays, and Indians identified 98.3 million SNPs and small insertions or deletions, over half of which are novel.

96 citations

Journal ArticleDOI
TL;DR: This work designed SELECT, an algorithmic approach to systematically identify evolutionary dependencies from alteration patterns, which provides a framework for the design of strategies to predict cancer progression and therapeutic response.

95 citations

References
Journal ArticleDOI
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations

Journal ArticleDOI
TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments.
Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. Availability: http://samtools.sourceforge.net Contact: [email protected]

45,957 citations

Journal ArticleDOI
TL;DR: A new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format, which allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.
Abstract: Motivation: Testing for correlations between different sets of genomic features is a fundamental task in genomics research. However, searching for overlaps between features with existing web-based methods is complicated by the massive datasets that are routinely produced with current sequencing technologies. Fast and flexible tools are therefore required to ask complex questions of these data in an efficient manner. Results: This article introduces a new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. BEDTools can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions of large genomic datasets. Availability and implementation: BEDTools was written in C++. Source code and a comprehensive user manual are freely available at http://code.google.com/p/bedtools
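The feature-overlap task described in this abstract can be illustrated with a minimal Python sketch. It is not BEDTools' implementation or API: the `Interval` type and the `parse_bed_line` and `intersect` helpers are hypothetical names introduced here, and the pairwise comparison is a naive stand-in for the indexed search a real tool would use. The only format fact relied on is that BED coordinates are 0-based and half-open.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)


def parse_bed_line(line: str) -> Interval:
    """Read the three required BED columns: chrom, start, end."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return Interval(chrom, int(start), int(end))


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the shared region of two features, or None if they do not overlap."""
    if a.chrom != b.chrom:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    # Half-open coordinates: intervals that merely touch share no bases.
    return Interval(a.chrom, start, end) if start < end else None


gene = parse_bed_line("chr1\t100\t200\tgeneA")
peak = parse_bed_line("chr1\t150\t300\tpeak1")
print(intersect(gene, peak))
```

Because the end coordinate is exclusive, two features such as 100-200 and 200-300 are adjacent rather than overlapping, which is why the comparison uses a strict `start < end`.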

18,858 citations

Journal ArticleDOI
06 Sep 2012-Nature
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of the authors' genes and genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal ArticleDOI
TL;DR: VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API.
Abstract: Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API. Availability: http://vcftools.sourceforge.net Contact: [email protected]
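To make the record layout concrete, here is a minimal Python sketch that reads one VCF data line. It is not part of VCFtools or its Perl API: `VariantRecord`, `parse_info`, `parse_vcf_line` and `is_snp` are hypothetical helpers, and the SNP test is a deliberately simple assumption (single-base REF and single-base ALT alleles). It relies only on the fixed tab-separated columns CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, with POS being 1-based.

```python
from typing import Dict, List, NamedTuple, Union


class VariantRecord(NamedTuple):
    chrom: str
    pos: int                 # 1-based position on the reference
    id: str
    ref: str
    alt: List[str]           # one entry per alternate allele
    info: Dict[str, Union[str, bool]]


def parse_info(field: str) -> Dict[str, Union[str, bool]]:
    """Split the INFO column into key/value pairs; bare keys become flags."""
    if field == ".":
        return {}
    info: Dict[str, Union[str, bool]] = {}
    for item in field.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def parse_vcf_line(line: str) -> VariantRecord:
    """Parse the fixed columns of a single (non-header) VCF line."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt = cols[:5]
    return VariantRecord(chrom, int(pos), vid, ref, alt.split(","), parse_info(cols[7]))


def is_snp(rec: VariantRecord) -> bool:
    """Simplified check: REF and every ALT allele are single bases."""
    return len(rec.ref) == 1 and all(len(a) == 1 and a in "ACGT" for a in rec.alt)


snp = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2")
indel = parse_vcf_line("20\t1234567\tmicrosat1\tGTC\tG,GTCT\t50\tPASS\tNS=3")
print(snp.pos, snp.alt, snp.info["AF"], is_snp(snp), is_snp(indel))
```

INFO values are kept as strings here; converting them to numbers would require the type declarations in the file's header lines, which this sketch does not read.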

10,164 citations