Author

Adam Auton

Other affiliations: Broad Institute, Cornell University, University of Oxford
Bio: Adam Auton is an academic researcher from Albert Einstein College of Medicine. The author has contributed to research in topics: Genome-wide association study & Population. The author has an h-index of 47 and has co-authored 94 publications receiving 51,799 citations. Previous affiliations of Adam Auton include Broad Institute & Cornell University.


Papers
Posted Content
24 Jul 2018-bioRxiv
TL;DR: This article introduces LDpred-funct, a new method for polygenic prediction that leverages trait-specific functional priors to increase prediction accuracy, and applies it to predict 21 highly heritable traits in the UK Biobank.
Abstract: Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N=373K) and samples of other European ancestries as validation data (avg N=22K), to minimize confounding. LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R2=0.144; highest R2=0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N=1107K; higher heritability in UK Biobank cohort) increased prediction R2 to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.

49 citations

Journal Article
28 Aug 2019
TL;DR: A genome-wide association study of knee pain in the UK Biobank is reported, identifying two loci in or near GDF5 and COL27A1 that are associated with knee pain.
Abstract: Knee pain is one of the most common musculoskeletal complaints that brings people to medical attention. Approximately 50% of individuals over the age of 50 report an experience of knee pain within the past 12 months. We sought to identify the genetic variants associated with knee pain in 171,516 subjects from the UK Biobank cohort and seek supporting evidence in cohorts from 23andMe, the Osteoarthritis Initiative, and the Johnston County Osteoarthritis Project. We identified two loci that reached genome-wide significance in the UK Biobank: rs143384, located in GDF5 (P = 1.32 × 10^−12), a gene previously implicated in osteoarthritis; and rs2808772, located near COL27A1 (P = 1.49 × 10^−8). These findings were supported in cohorts with self-reported osteoarthritis/radiographic knee osteoarthritis without pain information. In this report on genome-wide association of knee pain, we identified two loci in or near GDF5 and COL27A1 that are associated with knee pain. Weihua Meng, Mark Adams et al. report a genome-wide association study of knee pain in the UK Biobank, identifying two loci near GDF5 and COL27A1 as significantly associated. These findings are supported by association data in additional cohorts, using self-reported osteoarthritis or radiographic knee osteoarthritis as a proxy for knee pain.

47 citations

Journal Article
TL;DR: The data suggests that dogs have similar broad scale properties of recombination to humans, while fine scale recombination is similar to other species lacking PRDM9.
Abstract: Meiotic recombination in mammals has been shown to largely cluster into hotspots, which are targeted by the chromatin modifier PRDM9. The canid family, including wolves and dogs, has undergone a series of disrupting mutations in this gene, rendering PRDM9 inactive. Given the importance of PRDM9, it is of great interest to learn how its absence in the dog genome affects patterns of recombination placement. We have used genotypes from domestic dog pedigrees to generate sex-specific genetic maps of recombination in this species. On a broad scale, we find that placement of recombination events in dogs is consistent with that in mice and apes, in that the majority of recombination occurs toward the telomeres in males, while female crossing over is more frequent and evenly spread along chromosomes. It has been previously suggested that dog recombination is more uniform in distribution than that of humans; however, we found that recombination in dogs is less uniform than in humans. We examined the distribution of recombination within the genome, and found that recombination is elevated immediately upstream of the transcription start site and around CpG islands, in agreement with previous studies, but that this effect is stronger in male dogs. We also found evidence for positive crossover interference influencing the spacing between recombination events in dogs, as has been observed in other species including humans and mice. Overall our data suggests that dogs have similar broad scale properties of recombination to humans, while fine scale recombination is similar to other species lacking PRDM9.

46 citations

Journal Article
TL;DR: A large-scale meta-analysis of heart failure GWAS, with replication in a comparably sized cohort, identifies one known and two novel loci associated with heart failure. Fine-mapping of one locus reveals a putative causal variant in a cardiac muscle-specific regulatory region, activated during cardiomyocyte differentiation, that binds to the ACTN2 gene.
Abstract: Heart failure is a major public health problem affecting over 23 million people worldwide. In this study, we present the results of a large scale meta-analysis of heart failure GWAS and replication in a comparable sized cohort to identify one known and two novel loci associated with heart failure. Heart failure sub-phenotyping shows that a new locus in chromosome 1 is associated with left ventricular adverse remodeling and clinical heart failure, in response to different initial cardiac muscle insults. Functional characterization and fine-mapping of that locus reveal a putative causal variant in a cardiac muscle specific regulatory region activated during cardiomyocyte differentiation that binds to the ACTN2 gene, a crucial structural protein inside the cardiac sarcolemma (Hi-C interaction p-value = 0.00002). Genome-editing in human embryonic stem cell-derived cardiomyocytes confirms the influence of the identified regulatory region in the expression of ACTN2. Our findings extend our understanding of biological mechanisms underlying heart failure. Heart failure has a heterogeneous etiology and the genetic underpinnings are not well understood. Here, Arvanitis et al. perform GWAS meta-analysis including 10,976 heart failure cases and 437,573 controls, identify new loci near ABO and ACTN2 and show that deletion of an ACTN2 enhancer leads to reduced ACTN2 expression in differentiating cardiomyocytes.

44 citations

Journal Article
TL;DR: This paper introduces LDpred-funct, a method for polygenic prediction that leverages trait-specific functional priors to increase prediction accuracy, and applies it to 21 highly heritable traits in the UK Biobank; for height, meta-analyzing training data from the UK Biobank and 23andMe cohorts further increased prediction accuracy.
Abstract: Polygenic risk prediction is a widely investigated topic because of its promising clinical applications. Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, including coding, conserved, regulatory, and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank (avg N = 373 K as training data). LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R2 = 0.144; highest R2 = 0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (N = 1107 K) increased prediction R2 to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.

43 citations


Cited by
Journal Article
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations
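
The index idea mentioned in the abstract above can be illustrated with a minimal sketch of exact-match counting by backward search over a Burrows-Wheeler transform. This is a toy illustration only, not Bowtie 2's implementation: the function names are hypothetical, the transform is built by naively sorting rotations, and gapped alignment by dynamic programming is not covered.

```python
def bwt_from_text(text):
    """Build the Burrows-Wheeler transform by sorting all rotations (toy-sized input only)."""
    text += "$"  # sentinel, assumed absent from the text and smaller than every other symbol
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt, pattern):
    """Count exact occurrences of pattern using backward search over the BWT."""
    # first_row[c]: number of symbols in the text that sort strictly before c
    symbol_counts = {}
    for symbol in bwt:
        symbol_counts[symbol] = symbol_counts.get(symbol, 0) + 1
    first_row = {}
    running_total = 0
    for symbol in sorted(symbol_counts):
        first_row[symbol] = running_total
        running_total += symbol_counts[symbol]

    def occ(symbol, end):
        # Occurrences of symbol in bwt[:end]; a real index precomputes sampled counts
        return bwt[:end].count(symbol)

    lo, hi = 0, len(bwt)  # half-open range of sorted rotations matching the current suffix
    for symbol in reversed(pattern):
        if symbol not in first_row:
            return 0
        lo = first_row[symbol] + occ(symbol, lo)
        hi = first_row[symbol] + occ(symbol, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

For example, with `bwt = bwt_from_text("ACGTACGTAC")`, `count_occurrences(bwt, "AC")` returns 3 and `count_occurrences(bwt, "GG")` returns 0. Each pattern symbol narrows the range of matching rotations, so the number of steps depends on the pattern length rather than on a scan of the text.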

28 Jul 2005
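TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.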

18,940 citations

Journal Article
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, +514 more; Institutions (90)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal Article
TL;DR: The authors argue that the flood of diverse genome-wide data calls for efficient and intuitive visualization tools that scale to very large data sets and flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

10,798 citations

Journal Article
TL;DR: VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API.
Abstract: Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API. Availability: http://vcftools.sourceforge.net

10,164 citations
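
To make the VCF description above concrete, here is a minimal sketch of reading one tab-separated VCF data line into its fixed leading columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). It is not part of VCFtools or its Perl API: the `VcfRecord` type and `parse_vcf_line` function are hypothetical names, and header lines, genotype columns, and compressed or indexed input are deliberately left out.

```python
from typing import NamedTuple


class VcfRecord(NamedTuple):
    """Hypothetical container for a subset of the fixed VCF columns."""
    chrom: str
    pos: int
    variant_id: str
    ref: str
    alts: list
    info: dict


def parse_vcf_line(line):
    """Parse a single VCF data line (not a '#' header line) into a VcfRecord."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt = fields[:5]
    info = {}
    # INFO is the eighth column: ';'-separated entries, either KEY=VALUE or a bare flag
    if len(fields) > 7 and fields[7] != ".":
        for item in fields[7].split(";"):
            key, _, value = item.partition("=")
            info[key] = value if value else True
    # ALT may list several comma-separated alleles
    return VcfRecord(chrom, int(pos), variant_id, ref, alt.split(","), info)
```

Calling `parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB")` yields a record with `pos == 14370`, `alts == ["A"]`, `info["AF"] == "0.5"` and `info["DB"] is True`. INFO values are kept as strings here, since their types are declared in the file header, which this sketch does not read.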