Journal Article

Validating therapeutic targets through human genetics

TL;DR: The concept of dose–response curves derived from experiments of nature is described, with an emphasis on human genetics as a valuable tool to prioritize molecular targets in drug development.
Abstract: More than 90% of the compounds that enter clinical trials fail to demonstrate sufficient safety and efficacy to gain regulatory approval. Most of this failure is due to the limited predictive value of preclinical models of disease, and our continued ignorance regarding the consequences of perturbing specific targets over long periods of time in humans. 'Experiments of nature' - naturally occurring mutations in humans that affect the activity of a particular protein target or targets - can be used to estimate the probable efficacy and toxicity of a drug targeting such proteins, as well as to establish causal rather than reactive relationships between targets and outcomes. Here, we describe the concept of dose-response curves derived from experiments of nature, with an emphasis on human genetics as a valuable tool to prioritize molecular targets in drug development. We discuss empirical examples of drug-gene pairs that support the role of human genetics in testing therapeutic hypotheses at the stage of target validation, provide objective criteria to prioritize genetic findings for future drug discovery efforts and highlight the limitations of a target validation approach that is anchored in human genetics.
Citations
Journal Article
11 Oct 2018-Nature
TL;DR: Deep phenotype and genome-wide genetic data from 500,000 individuals in the UK Biobank are described, including population structure and relatedness in the cohort, and imputation that increases the number of testable variants to around 96 million.
Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

Journal Article
TL;DR: Data access is improved with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database.
Abstract: The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10⁹ individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.

2,878 citations

Journal Article
TL;DR: The MR-PRESSO test detects and corrects horizontal pleiotropy in multi-instrument Mendelian randomization (MR) analyses; horizontal pleiotropy introduced distortions in the causal estimates in MR that ranged on average from –131% to 201%, and simulations show that the MR-PRESSO test is best suited when horizontal pleiotropy occurs in <50% of instruments.
Abstract: Horizontal pleiotropy occurs when the variant has an effect on disease outside of its effect on the exposure in Mendelian randomization (MR). Violation of the ‘no horizontal pleiotropy’ assumption can cause severe bias in MR. We developed the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) test to identify horizontal pleiotropic outliers in multi-instrument summary-level MR testing. We showed using simulations that the MR-PRESSO test is best suited when horizontal pleiotropy occurs in <50% of instruments.

2,362 citations

Journal Article
Yukinori Okada, Di Wu, Gosia Trynka, Towfique Raj, Chikashi Terao, Katsunori Ikari, Yuta Kochi, Koichiro Ohmura, Akari Suzuki, Shinji Yoshida, Robert R. Graham, A. Manoharan, Ward Ortmann, Tushar Bhangale, Joshua C. Denny, Robert J. Carroll, Anne E. Eyler, Jeff Greenberg, Joel M. Kremer, Dimitrios A. Pappas, Lei Jiang, Jian Yin, Lingying Ye, Ding Feng Su, Jian Yang, Gang Xie, E.C. Keystone, Harm-Jan Westra, Tõnu Esko, Andres Metspalu, Xuezhong Zhou, Namrata Gupta, Daniel B. Mirel, Eli A. Stahl, Dorothee Diogo, Jing Cui, Katherine P. Liao, Michael H. Guo, Keiko Myouzen, Takahisa Kawaguchi, Marieke J H Coenen, Piet L. C. M. van Riel, Mart A F J van de Laar, Henk-Jan Guchelaar, Tom W J Huizinga, Philippe Dieudé, Xavier Mariette, S. Louis Bridges, Alexandra Zhernakova, René E. M. Toes, Paul P. Tak, Corinne Miceli-Richard, So Young Bang, Hye Soon Lee, Javier Martin, Miguel A. Gonzalez-Gay, Luis Rodriguez-Rodriguez, Solbritt Rantapää-Dahlqvist, Lisbeth Ärlestig, Hyon K. Choi, Yoichiro Kamatani, Pilar Galan, Mark Lathrop, Steve Eyre, John Bowes, Anne Barton, Niek de Vries, Larry W. Moreland, Lindsey A. Criswell, Elizabeth W. Karlson, Atsuo Taniguchi, Ryo Yamada, Michiaki Kubo, Jun Liu, Sang Cheol Bae, Jane Worthington, Leonid Padyukov, Lars Klareskog, Peter K. Gregersen, Soumya Raychaudhuri, Barbara E. Stranger, Philip L. De Jager, Lude Franke, Peter M. Visscher, Matthew A. Brown, Hisashi Yamanaka, Tsuneyo Mimori, Atsushi Takahashi, Huji Xu, Timothy W. Behrens, Katherine A. Siminovitch, Shigeki Momohara, Fumihiko Matsuda, Kazuhiko Yamamoto, Robert M. Plenge
20 Feb 2014-Nature
TL;DR: A genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries provides empirical evidence that the genetics of RA can provide important information for drug discovery, and sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis.
Abstract: A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological data sets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA) (ref. 1). Here we performed a genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ~10 million single-nucleotide polymorphisms. We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 101 (refs 2, 3, 4). We devised an in silico pipeline using established bioinformatics methods based on functional annotation (ref. 5), cis-acting expression quantitative trait loci (ref. 6) and pathway analyses (refs 7, 8, 9), as well as novel methods based on genetic overlap with human primary immunodeficiency, haematological cancer somatic mutations and knockout mouse phenotypes, to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery.

1,910 citations

References
Journal Article
TL;DR: A new method and the corresponding software tool, PolyPhen-2, are presented; PolyPhen-2 differs from the earlier tool PolyPhen in its set of predictive features, alignment pipeline, and method of classification, and its performance, as shown by receiver operating characteristic curves, was consistently superior.
Abstract: To the Editor: Applications of rapidly advancing sequencing technologies exacerbate the need to interpret individual sequence variants. Sequencing of phenotyped clinical subjects will soon become a method of choice in studies of the genetic causes of Mendelian and complex diseases. New exon capture techniques will direct sequencing efforts towards the most informative and easily interpretable protein-coding fraction of the genome. Thus, the demand for computational predictions of the impact of protein sequence variants will continue to grow. Here we present a new method and the corresponding software tool, PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), which is different from the early tool PolyPhen (ref. 1) in the set of predictive features, alignment pipeline, and the method of classification (Fig. 1a). PolyPhen-2 uses eight sequence-based and three structure-based predictive features (Supplementary Table 1) which were selected automatically by an iterative greedy algorithm (Supplementary Methods). The majority of these features involve comparison of a property of the wild-type (ancestral, normal) allele and the corresponding property of the mutant (derived, disease-causing) allele, which together define an amino acid replacement. The most informative features characterize how well the two human alleles fit into the pattern of amino acid replacements within the multiple sequence alignment of homologous proteins, how distant the protein harboring the first deviation from the human wild-type allele is from the human protein, and whether the mutant allele originated at a hypermutable site (ref. 2). The alignment pipeline selects the set of homologous sequences for the analysis using a clustering algorithm and then constructs and refines their multiple alignment (Supplementary Fig. 1). The functional significance of an allele replacement is predicted from its individual features (Supplementary Figs. 2–4) by a Naive Bayes classifier (Supplementary Methods).
Figure 1: PolyPhen-2 pipeline and prediction accuracy. (a) Overview of the algorithm. (b) Receiver operating characteristic (ROC) curves for predictions made by PolyPhen-2 using five-fold cross-validation on HumDiv (red) and HumVar (light green; ref. 3). UniRef100 (solid ...
We used two pairs of datasets to train and test PolyPhen-2. We compiled the first pair, HumDiv, from all 3,155 damaging alleles with known effects on the molecular function causing human Mendelian diseases, present in the UniProt database, together with 6,321 differences between human proteins and their closely related mammalian homologs, assumed to be non-damaging (Supplementary Methods). The second pair, HumVar (ref. 3), consists of all the 13,032 human disease-causing mutations from UniProt, together with 8,946 human nsSNPs without annotated involvement in disease, which were treated as non-damaging. We found that PolyPhen-2 performance, as presented by its receiver operating characteristic curves, was consistently superior compared to PolyPhen (Fig. 1b) and it also compared favorably with the three other popular prediction tools (refs 4–6; Fig. 1c). For a false positive rate of 20%, PolyPhen-2 achieves the rate of true positive predictions of 92% and 73% on HumDiv and HumVar, respectively (Supplementary Table 2). One reason for a lower accuracy of predictions on HumVar is that nsSNPs assumed to be non-damaging in HumVar contain a sizable fraction of mildly deleterious alleles. In contrast, most of the amino acid replacements assumed non-damaging in HumDiv must be close to selective neutrality. Because alleles that are even mildly but unconditionally deleterious cannot be fixed in the evolving lineage, no method based on comparative sequence analysis is ideal for discriminating between drastically and mildly deleterious mutations, which are assigned to the opposite categories in HumVar. Another reason is that HumDiv uses an extra criterion to avoid possible erroneous annotations of damaging mutations.
For a mutation, PolyPhen-2 calculates the Naive Bayes posterior probability that this mutation is damaging and reports estimates of false positive (the chance that the mutation is classified as damaging when it is in fact non-damaging) and true positive (the chance that the mutation is classified as damaging when it is indeed damaging) rates. A mutation is also appraised qualitatively, as benign, possibly damaging, or probably damaging (Supplementary Methods). The user can choose between HumDiv- and HumVar-trained PolyPhen-2. Diagnostics of Mendelian diseases requires distinguishing mutations with drastic effects from all the remaining human variation, including abundant mildly deleterious alleles. Thus, HumVar-trained PolyPhen-2 should be used for this task. In contrast, HumDiv-trained PolyPhen-2 should be used for evaluating rare alleles at loci potentially involved in complex phenotypes, dense mapping of regions identified by genome-wide association studies, and analysis of natural selection from sequence data, where even mildly deleterious alleles must be treated as damaging.

11,571 citations

Journal Article
TL;DR: This protocol describes the use of the 'Sorting Tolerant From Intolerant' (SIFT) algorithm in predicting whether an AAS affects protein function.
Abstract: The effect of genetic mutation on phenotype is of significant interest in genetics. The type of genetic mutation that causes a single amino acid substitution (AAS) in a protein sequence is called a non-synonymous single nucleotide polymorphism (nsSNP). An nsSNP could potentially affect the function of the protein, subsequently altering the carrier's phenotype. This protocol describes the use of the 'Sorting Tolerant From Intolerant' (SIFT) algorithm in predicting whether an AAS affects protein function. To assess the effect of a substitution, SIFT assumes that important positions in a protein sequence have been conserved throughout evolution and therefore substitutions at these positions may affect protein function. Thus, by using sequence homology, SIFT predicts the effects of all possible substitutions at each position in the protein sequence. The protocol typically takes 5–20 min, depending on the input. SIFT is available as an online tool ( http://sift-dna.org ).

6,154 citations

Journal Article
John W. Belmont, Andrew Boudreau, Suzanne M. Leal, Paul Hardenbol, +229 more
27 Oct 2005
TL;DR: A public database of common variation in the human genome: more than one million single nucleotide polymorphisms for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted.
Abstract: Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.

5,479 citations

Journal Article
TL;DR: Specific standards designed to maintain rigor while also promoting communication are proposed for the interpretation of linkage results in genetic studies under way for many complex traits.
Abstract: Genetic studies are under way for many complex traits, spurred by the recent feasibility of whole genome scans. Clear guidelines for the interpretation of linkage results are needed to avoid a flood of false positive claims. At the same time, an overly cautious approach runs the risk of causing true hints of linkage to be missed. We address this problem by proposing specific standards designed to maintain rigor while also promoting communication.

5,317 citations

Journal Article
15 Apr 2005-Science
TL;DR: A genome-wide screen for polymorphisms associated with age-related macular degeneration revealed a polymorphism in linkage disequilibrium with the risk allele representing a tyrosine-histidine change at amino acid 402 in the complement factor H gene.
Abstract: Age-related macular degeneration (AMD) is a major cause of blindness in the elderly. We report a genome-wide screen of 96 cases and 50 controls for polymorphisms associated with AMD. Among 116,204 single-nucleotide polymorphisms genotyped, an intronic and common variant in the complement factor H gene (CFH) is strongly associated with AMD (nominal P value <10⁻⁷). In individuals homozygous for the risk allele, the likelihood of AMD is increased by a factor of 7.4 (95% confidence interval 2.9 to 19). Resequencing revealed a polymorphism in linkage disequilibrium with the risk allele representing a tyrosine-histidine change at amino acid 402. This polymorphism is in a region of CFH that binds heparin and C-reactive protein. The CFH gene is located on chromosome 1 in a region repeatedly linked to AMD in family-based studies.

4,459 citations

Related Papers (5)
18 Aug 2016-Nature
Monkol Lek, Konrad J. Karczewski, Eric Vallabh Minikel, Kaitlin E. Samocha, Eric Banks, Timothy Fennell, Anne H. O’Donnell-Luria, James S. Ware, Andrew J. Hill, Beryl B. Cummings, Taru Tukiainen, Daniel P. Birnbaum, Jack A. Kosmicki, Laramie E. Duncan, Karol Estrada, Fengmei Zhao, James Zou, Emma Pierce-Hoffman, Joanne Berghout, David Neil Cooper, Nicole A. Deflaux, Mark A. DePristo, Ron Do, Jason Flannick, Menachem Fromer, Laura D. Gauthier, Jackie Goldstein, Namrata Gupta, Daniel P. Howrigan, Adam Kiezun, Mitja I. Kurki, Ami Levy Moonshine, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso, Ryan Poplin, Manuel A. Rivas, Valentin Ruano-Rubio, Samuel A. Rose, Douglas M. Ruderfer, Khalid Shakir, Peter D. Stenson, Christine Stevens, Brett Thomas, Grace Tiao, María Teresa Tusié-Luna, Ben Weisburd, Hong-Hee Won, Dongmei Yu, David Altshuler, Diego Ardissino, Michael Boehnke, John Danesh, Stacey Donnelly, Roberto Elosua, Jose C. Florez, Stacey Gabriel, Gad Getz, Stephen J. Glatt, Christina M. Hultman, Sekar Kathiresan, Markku Laakso, Steven A. McCarroll, Mark I. McCarthy, Dermot P.B. McGovern, Ruth McPherson, Benjamin M. Neale, Aarno Palotie, Shaun Purcell, Danish Saleheen, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan, Jaakko Tuomilehto, Ming T. Tsuang, Hugh Watkins, James G. Wilson, Mark J. Daly, Daniel G. MacArthur