Author

Allan Motyer

Other affiliations: University of New South Wales
Bio: Allan Motyer is an academic researcher from the University of Melbourne. The author has contributed to research in topics: Genome-wide association study & Biobank. The author has an h-index of 9 and has co-authored 17 publications receiving 3,106 citations. Previous affiliations of Allan Motyer include the University of New South Wales.

Papers
Journal ArticleDOI
11 Oct 2018-Nature
TL;DR: Deep phenotype and genome-wide genetic data from approximately 500,000 UK Biobank participants are described, including population structure and relatedness in the cohort, and imputation that increases the number of testable variants to around 96 million.
Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

Posted ContentDOI
20 Jul 2017-bioRxiv
TL;DR: The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged 40-69 at recruitment; the authors conduct a set of analyses that reveal properties of the genetic data, such as population structure and relatedness, that can be important for downstream analyses.
Abstract: The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged 40-69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offer novel opportunities for assessing quality issues, although the wide range of ancestries of the individuals in the cohort also creates particular challenges. We also conducted a set of analyses that reveal properties of the genetic data – such as population structure and relatedness – that can be important for downstream analyses. In addition, we phased and imputed genotypes into the dataset, using computationally efficient methods combined with the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. This increases the number of testable variants by over 100-fold to ~96 million variants. We also imputed classical allelic variation at 11 human leukocyte antigen (HLA) genes, and as a quality control check of this imputation, we replicate signals of known associations between HLA alleles and many common diseases. We describe tools that allow efficient genome-wide association studies (GWAS) of multiple traits and fast phenome-wide association studies (PheWAS), which work together with a new compressed file format that has been used to distribute the dataset. As a further check of the genotyped and imputed datasets, we performed a test-case genome-wide association scan on a well-studied human trait, standing height.

659 citations

Journal ArticleDOI
TL;DR: A new Bayesian analysis framework is developed that exploits the hierarchical structure of diagnosis classifications to analyze genetic variants against UK Biobank disease phenotypes derived from self-reporting and hospital episode statistics and identifies new associations between classical human leukocyte antigen (HLA) alleles and common immune-mediated diseases (IMDs).
Abstract: Genetic discovery from the multitude of phenotypes extractable from routine healthcare data can transform understanding of the human phenome and accelerate progress toward precision medicine. However, a critical question when analyzing high-dimensional and heterogeneous data is how best to interrogate increasingly specific subphenotypes while retaining statistical power to detect genetic associations. Here we develop and employ a new Bayesian analysis framework that exploits the hierarchical structure of diagnosis classifications to analyze genetic variants against UK Biobank disease phenotypes derived from self-reporting and hospital episode statistics. Our method displays a more than 20% increase in power to detect genetic effects over other approaches and identifies new associations between classical human leukocyte antigen (HLA) alleles and common immune-mediated diseases (IMDs). By applying the approach to genetic risk scores (GRSs), we show the extent of genetic sharing among IMDs and expose differences in disease perception or diagnosis with potential clinical implications.

58 citations

Journal ArticleDOI
TL;DR: Genetic variants for IgE-mediated peanut allergy are yet to be fully characterized, and to date only one genome-wide association study (GWAS) has been published.
Abstract: Background: Genetic variants for IgE-mediated peanut allergy are yet to be fully characterized, and to date only one genome-wide association study (GWAS) has been published. Objective: To identify genetic variants associated with challenge-proven peanut allergy. Methods: We carried out a GWAS comparing 73 infants with challenge-proven IgE-mediated peanut allergy against 148 non-allergic infants (all ~1 year old). We tested a total of 3.8 million single nucleotide polymorphisms (SNPs), as well as imputed HLA alleles and amino acids. Replication was assessed by de novo genotyping in an additional panel of 117 cases and 380 controls, and by in silico testing in two independent GWAS cohorts. Results: We identified 21 independent associations at P ≤ 5×10⁻⁵ but were unable to replicate these. The most significant HLA association was the previously reported amino acid variant located at position 71, within the peptide-binding groove of HLA-DRB1 (P = 2×10⁻⁴). Our study therefore reproduced previous findings for the association between peanut allergy and HLA-DRB1 in this Australian population. Conclusions & Clinical Relevance: Genetic determinants for challenge-proven peanut allergy include alleles at the HLA-DRB1 locus.

40 citations

Journal ArticleDOI
TL;DR: In this paper, the authors considered the class of level-independent quasi-birth-and-death (QBD) processes and derived simple conditions for possible decay rates of the stationary distribution of the 'level' process.
Abstract: We consider the class of level-independent quasi-birth-and-death (QBD) processes that have countably many phases and generator matrices with tridiagonal blocks that are themselves tridiagonal and phase independent. We derive simple conditions for possible decay rates of the stationary distribution of the 'level' process. It may be possible to obtain decay rates satisfying these conditions by varying only the transition structure at level 0. Our results generalize those of Kroese, Scheinhardt, and Taylor, who studied in detail a particular example, the tandem Jackson network, from the class of QBD processes studied here. The conditions derived here are applied to three practical examples.

32 citations


Cited by
Journal ArticleDOI
30 May 2018-eLife
TL;DR: MR-Base is a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR, and includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions.
Abstract: Results from genome-wide association studies (GWAS) can be used to infer causal relationships between phenotypes, using a strategy known as 2-sample Mendelian randomization (2SMR) and bypassing the need for individual-level data. However, 2SMR methods are evolving rapidly and GWAS results are often insufficiently curated, undermining efficient implementation of the approach. We therefore developed MR-Base (http://www.mrbase.org): a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR. The software includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions. The database currently comprises 11 billion single nucleotide polymorphism-trait associations from 1673 GWAS and is updated on a regular basis. Integrating data with software ensures more rigorous application of hypothesis-driven analyses and allows millions of potential causal relationships to be efficiently evaluated in phenome-wide association studies.

2,520 citations

Journal ArticleDOI
TL;DR: Genome-wide polygenic risk scores derived from GWAS data for five common diseases can identify subgroups of the population with risk approaching or exceeding that of a monogenic mutation.
Abstract: A key public health need is to identify individuals at high risk for a given disease to enable enhanced screening or preventive therapies. Because most common diseases have a genetic component, one important approach is to stratify individuals based on inherited DNA variation [1]. Proposed clinical applications have largely focused on finding carriers of rare monogenic mutations at several-fold increased risk. Although most disease risk is polygenic in nature [2-5], it has not yet been possible to use polygenic predictors to identify individuals at risk comparable to monogenic mutations. Here, we develop and validate genome-wide polygenic scores for five common diseases. The approach identifies 8.0, 6.1, 3.5, 3.2, and 1.5% of the population at greater than threefold increased risk for coronary artery disease, atrial fibrillation, type 2 diabetes, inflammatory bowel disease, and breast cancer, respectively. For coronary artery disease, this prevalence is 20-fold higher than the carrier frequency of rare monogenic mutations conferring comparable risk [6]. We propose that it is time to contemplate the inclusion of polygenic risk prediction in clinical care, and discuss relevant issues.

1,962 citations

Journal ArticleDOI
James J. Lee, Robbee Wedow, Aysu Okbay, Edward Kong, Omeed Maghzian, Meghan Zacher, Tuan Anh Nguyen-Viet, Peter Bowers, Julia Sidorenko, Richard Karlsson Linnér, Mark Alan Fontana, Tushar Kundu, Chanwook Lee, Hui Li, Ruoxi Li, Rebecca Royer, Pascal Timshel, Raymond K. Walters, Emily A. Willoughby, Loic Yengo, Maris Alver, Yanchun Bao, David W. Clark, Felix R. Day, Nicholas A. Furlotte, Peter K. Joshi, Kathryn E. Kemper, Aaron Kleinman, Claudia Langenberg, Reedik Mägi, Joey W. Trampush, Shefali S. Verma, Yang Wu, Max Lam, Jing Hua Zhao, Zhili Zheng, Jason D. Boardman, Harry Campbell, Jeremy Freese, Kathleen Mullan Harris, Caroline Hayward, Pamela Herd, Meena Kumari, Todd Lencz, Jian'an Luan, Anil K. Malhotra, Andres Metspalu, Lili Milani, Ken K. Ong, John R. B. Perry, David J. Porteous, Marylyn D. Ritchie, Melissa C. Smart, Blair H. Smith, Joyce Y. Tung, Nicholas J. Wareham, James F. Wilson, Jonathan P. Beauchamp, Dalton Conley, Tõnu Esko, Steven F. Lehrer, Patrik K. E. Magnusson, Sven Oskarsson, Tune H. Pers, Matthew R. Robinson, Kevin Thom, Chelsea Watson, Christopher F. Chabris, Michelle N. Meyer, David Laibson, Jian Yang, Magnus Johannesson, Philipp Koellinger, Patrick Turley, Peter M. Visscher, Daniel J. Benjamin, David Cesarini
TL;DR: A joint (multi-phenotype) analysis of educational attainment and three related cognitive phenotypes generates polygenic scores that explain 11–13% of the variance in educational attainment and 7–10% of the variance in cognitive performance, which substantially increases the utility of polygenic scores as tools in research.
Abstract: Here we conducted a large-scale genetic association analysis of educational attainment in a sample of approximately 1.1 million individuals and identify 1,271 independent genome-wide-significant SNPs. For the SNPs taken together, we found evidence of heterogeneous effects across environments. The SNPs implicate genes involved in brain-development processes and neuron-to-neuron communication. In a separate analysis of the X chromosome, we identify 10 independent genome-wide-significant SNPs and estimate a SNP heritability of around 0.3% in both men and women, consistent with partial dosage compensation. A joint (multi-phenotype) analysis of educational attainment and three related cognitive phenotypes generates polygenic scores that explain 11-13% of the variance in educational attainment and 7-10% of the variance in cognitive performance. This prediction accuracy substantially increases the utility of polygenic scores as tools in research.

1,658 citations

Journal ArticleDOI
TL;DR: A genetic meta-analysis of depression found 269 associated genes that highlight several potential drug repositioning opportunities, and relationships with depression were found for neuroticism and smoking.
Abstract: Major depression is a debilitating psychiatric illness that is typically associated with low mood and anhedonia. Depression has a heritable component that has remained difficult to elucidate with current sample sizes due to the polygenic nature of the disorder. To maximize sample size, we meta-analyzed data on 807,553 individuals (246,363 cases and 561,190 controls) from the three largest genome-wide association studies of depression. We identified 102 independent variants, 269 genes, and 15 genesets associated with depression, including both genes and gene pathways associated with synaptic structure and neurotransmission. An enrichment analysis provided further evidence of the importance of prefrontal brain regions. In an independent replication sample of 1,306,354 individuals (414,055 cases and 892,299 controls), 87 of the 102 associated variants were significant after multiple testing correction. These findings advance our understanding of the complex genetic architecture of depression and provide several future avenues for understanding etiology and developing new treatment approaches.

1,312 citations