Showing papers by "Adam Auton" published in 2015
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
12,661 citations
01 Oct 2015
TL;DR: This paper reports completion of the 1000 Genomes Project, which set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. The project reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
3,247 citations
University of Washington (1); University of Maryland, Baltimore (2); Broad Institute (3); Harvard University (4); Mayo Clinic (5); Yale University (6); Washington University in St. Louis (7); University of Michigan (8); University of Texas Health Science Center at Houston (9); Louisiana State University (10); University of North Carolina at Charlotte (11); Wellcome Trust (12); University of Texas MD Anderson Cancer Center (13); Boston College (14); Yeshiva University (15); Bilkent University (16); University of California, San Diego (17); National Institutes of Health (18); Leiden University (19); Baylor College of Medicine (20); Cornell University (21); University of Oxford (22); Utrecht University (23); Icahn School of Medicine at Mount Sinai (24); Kyoto University (25); Virginia Commonwealth University (26); Heidelberg University (27); Ewha Womans University (28)
TL;DR: The authors describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations.
Abstract: Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
1,971 citations
TL;DR: The results confirm that aging is associated with the accumulation of somatic mutations and strongly suggest that the level of genome instability of normal cells, modified by both endogenous and environmental factors, is the main risk factor for cancer.
Abstract: Aging is associated with an increased risk of cancer, possibly in part because of an age-related increase in mutations in normal tissues. Due to their extremely low abundance, somatic mutations in normal tissues frequently escape detection. Tumors, as clonal expansions of single cells, can provide information about the somatic mutations present in these cells prior to tumorigenesis. Here, we used data from The Cancer Genome Atlas (TCGA) to systematically study the frequency and spectrum of somatic mutations in a total of 6,969 patients and 34 different tumor types as a function of the age of the patient. After using linear modeling to control for the age structure of different tumor types, we found that the number of identified somatic mutations increases exponentially with age. Using additional data from the literature, we found that accumulation of somatic mutations is associated with cell division rate, cancer risk and cigarette smoking, with the latter also associated with a distinct spectrum of mutations. Our results confirm that aging is associated with the accumulation of somatic mutations, and strongly suggest that the level of genome instability of normal cells, modified by both endogenous and environmental factors, is the main risk factor for cancer.
96 citations
TL;DR: Results suggest that both rare and common DNA variations in PNPLA3 and SAMM50 may be correlated with NAFLD in this small population study, while common DNA variations in CHUK and ERLIN1 may have a protective interaction.
Abstract: We explored potential genetic risk factors implicated in nonalcoholic fatty liver disease (NAFLD) within a Caribbean-Hispanic population in New York City. A total of 316 individuals, including 40 subjects with biopsy-proven NAFLD, 24 ethnically matched non-NAFLD controls, and an ethnically mixed random sample of 252 individuals from Bronx County, New York, were analyzed. Genotype analysis was performed to determine allelic frequencies of 74 known single-nucleotide polymorphisms (SNPs) associated with NAFLD risk based on previous genome-wide association studies (GWAS) and candidate gene studies. Additionally, the entire coding region of PNPLA3, the gene showing the strongest association with NAFLD, was subjected to Sanger sequencing. Results suggest that both rare and common DNA variations in PNPLA3 and SAMM50 may be correlated with NAFLD in this small population study, while common DNA variations in CHUK and ERLIN1 may have a protective interaction. Common SNPs in ENPP1 and ABCC2 show a suggestive association with fatty liver, but with less compelling significance. In conclusion, Hispanic patients of Caribbean ancestry may have different interactions with NAFLD genetic modifiers; therefore, further investigation of this Caribbean-Hispanic population with a larger sample size is warranted.
16 citations