Showing papers by "Wellcome Trust Centre for Human Genetics" published in 2014


Journal ArticleDOI
TL;DR: The performance of Platypus is demonstrated by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.
Abstract: High-throughput DNA sequencing technology has transformed genetic research and is starting to make an impact on clinical practice. However, analyzing high-throughput sequencing data remains challenging, particularly in clinical settings where accuracy and turnaround times are critical. We present a new approach to this problem, implemented in a software package called Platypus. Platypus achieves high sensitivity and specificity for SNPs, indels and complex polymorphisms by using local de novo assembly to generate candidate variants, followed by local realignment and probabilistic haplotype estimation. It is an order of magnitude faster than existing tools and generates calls from raw aligned read data without preprocessing. We demonstrate the performance of Platypus in clinically relevant experimental designs by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.

940 citations


Journal ArticleDOI
07 Mar 2014-Science
TL;DR: This work mapped interindividual variation in gene expression as a quantitative trait, defining expression quantitative trait loci (eQTLs), and found that trans associations to the major histocompatibility complex are dependent on context, paralleling the expression of class II genes.
Abstract: To systematically investigate the impact of immune stimulation upon regulatory variant activity, we exposed primary monocytes from 432 healthy Europeans to interferon-γ (IFN-γ) or differing durations of lipopolysaccharide and mapped expression quantitative trait loci (eQTLs). More than half of cis-eQTLs identified, involving hundreds of genes and associated pathways, are detected specifically in stimulated monocytes. Induced innate immune activity reveals multiple master regulatory trans-eQTLs including the major histocompatibility complex (MHC), coding variants altering enzyme and receptor function, an IFN-β cytokine network showing temporal specificity, and an interferon regulatory factor 2 (IRF2) transcription factor-modulated network. Induced eQTL are significantly enriched for genome-wide association study loci, identifying context-specific associations to putative causal genes including CARD9, ATM, and IRF8. Thus, applying pathophysiologically relevant immune stimuli assists resolution of functional genetic variants.

726 citations


Journal ArticleDOI
14 Feb 2014-Science
TL;DR: An atlas of worldwide human admixture history, constructed by using genetic data alone and encompassing over 100 events occurring over the past 4000 years, reveals admixture to be an almost universal force shaping human populations.
Abstract: Modern genetic data combined with appropriate statistical methods have the potential to contribute substantially to our understanding of human history. We have developed an approach that exploits the genomic structure of admixed populations to date and characterize historical mixture events at fine scales. We used this to produce an atlas of worldwide human admixture history, constructed by using genetic data alone and encompassing over 100 events occurring over the past 4000 years. We identified events whose dates and participants suggest they describe genetic impacts of the Mongol empire, Arab slave trade, Bantu expansion, first millennium CE migrations in Eastern Europe, and European colonialism, as well as unrecorded events, revealing admixture to be an almost universal force shaping human populations.

688 citations


Journal ArticleDOI
Sofia Khan1, Dario Greco1,2, Kyriaki Michailidou3 +158 more; Institutions (54)
12 Nov 2014-PLOS ONE
TL;DR: Five miRNA binding site SNPs associated significantly with breast cancer risk are located in the 3′ UTR of CASP8, HDDC3, DROSHA, MUSTN1, and MYCL1, respectively; DROSHA belongs to the miRNA machinery genes and has a central role in initial miRNA processing.
Abstract: Genetic variations, such as single nucleotide polymorphisms (SNPs) in microRNAs (miRNA) or in the miRNA binding sites may affect the miRNA dependent gene expression regulation, which has been implicated in various cancers, including breast cancer, and may alter individual susceptibility to cancer. We investigated associations between miRNA related SNPs and breast cancer risk. First we evaluated 2,196 SNPs in a case-control study combining nine genome wide association studies (GWAS). Second, we further investigated 42 SNPs with suggestive evidence for association using 41,785 cases and 41,880 controls from 41 studies included in the Breast Cancer Association Consortium (BCAC). Combining the GWAS and BCAC data within a meta-analysis, we estimated main effects on breast cancer risk as well as risks for estrogen receptor (ER) and age defined subgroups. Five miRNA binding site SNPs associated significantly with breast cancer risk: rs1045494 (odds ratio (OR) 0.92; 95% confidence interval (CI): 0.88-0.96), rs1052532 (OR 0.97; 95% CI: 0.95-0.99), rs10719 (OR 0.97; 95% CI: 0.94-0.99), rs4687554 (OR 0.97; 95% CI: 0.95-0.99), and rs3134615 (OR 1.03; 95% CI: 1.01-1.05) located in the 3' UTR of CASP8, HDDC3, DROSHA, MUSTN1, and MYCL1, respectively. DROSHA belongs to miRNA machinery genes and has a central role in initial miRNA processing. The remaining genes are involved in different molecular functions, including apoptosis and gene expression regulation. Further studies are warranted to elucidate whether the miRNA binding site SNPs are the causative variants for the observed risk effects.

686 citations
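
The odds ratios and 95% confidence intervals quoted in the abstract above can be turned back into a standard error and z-score on the log-odds scale, which is the form a meta-analysis typically works with. The following is a minimal sketch, assuming the intervals are symmetric Wald intervals on the log scale; the function names are hypothetical and this is not the authors' analysis code.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def log_or_se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Standard error of log(OR), assuming a symmetric Wald CI on the log scale."""
    if lower <= 0 or upper <= lower:
        raise ValueError("need 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z)


def z_score(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate z-score for an odds ratio given its 95% CI."""
    return math.log(odds_ratio) / log_or_se_from_ci(lower, upper)


if __name__ == "__main__":
    # rs1045494 as reported in the abstract: OR 0.92, 95% CI 0.88-0.96
    se = log_or_se_from_ci(0.88, 0.96)
    print(f"SE(log OR) = {se:.4f}, z = {z_score(0.92, 0.88, 0.96):.2f}")
```

Because the published values are rounded to two decimals, anything recovered this way is only approximate.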


Journal ArticleDOI
21 Aug 2014-Nature
TL;DR: The first three-dimensional structure of a GABAAR, the human β3 homopentamer, at 3 Å resolution is presented and reveals architectural elements unique to eukaryotic Cys-loop receptors and shows an unexpected structural role for a conserved N-linked glycan.
Abstract: Type-A γ-aminobutyric acid receptors (GABAARs) are the principal mediators of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signalling triggers hyperactive neurological disorders such as insomnia, anxiety and epilepsy. Here we present the first three-dimensional structure of a GABAAR, the human β3 homopentamer, at 3 Å resolution. This structure reveals architectural elements unique to eukaryotic Cys-loop receptors, explains the mechanistic consequences of multiple human disease mutations and shows an unexpected structural role for a conserved N-linked glycan. The receptor was crystallized bound to a previously unknown agonist, benzamidine, opening a new avenue for the rational design of GABAAR modulators. The channel region forms a closed gate at the base of the pore, representative of a desensitized state. These results offer new insights into the signalling mechanisms of pentameric ligand-gated ion channels and enhance current understanding of GABAergic neurotransmission. GABAA receptors are the principal mediators of rapid inhibitory synaptic transmission in the brain, and a decline in GABAA signalling leads to diseases including epilepsy, insomnia, anxiety and autism; here, the first X-ray crystal structure of a human GABAA receptor, the human β3 homopentamer, reveals structural features unique for this receptor class and uncovers the locations of key disease-causing mutations. Paul Miller and Radu Aricescu report the first X-ray crystal structure of the human GABAA receptor, a pentameric ligand-gated ion channel and the principal mediator of rapid inhibitory synaptic transmission in the brain. The overall structure resembles those of other Cys-loop receptors but there are also several unique features, including the presence of an extended glycan sheath that would restrict interactions with other synaptic proteins.
The authors discuss how specific mutations may be linked to specific diseases, and since the structure was obtained in the presence of benzamidine, a GABAA receptor agonist, it is hoped that this work could contribute to the design of new therapeutic agents.

629 citations


Journal ArticleDOI
Michael V. Holmes1,2, Caroline Dale3, Luisa Zuccolo +167 more; Institutions (62)
10 Jul 2014-BMJ
TL;DR: In this article, the causal role of alcohol consumption in cardiovascular disease was investigated using a Mendelian randomisation meta-analysis of 56 epidemiological studies, including 20 259 coronary heart disease cases and 10 164 stroke events.
Abstract: OBJECTIVE: To use the rs1229984 variant in the alcohol dehydrogenase 1B gene (ADH1B) as an instrument to investigate the causal role of alcohol in cardiovascular disease. DESIGN: Mendelian randomisation meta-analysis of 56 epidemiological studies. PARTICIPANTS: 261 991 individuals of European descent, including 20 259 coronary heart disease cases and 10 164 stroke events. Data were available on ADH1B rs1229984 variant, alcohol phenotypes, and cardiovascular biomarkers. MAIN OUTCOME MEASURES: Odds ratio for coronary heart disease and stroke associated with the ADH1B variant in all individuals and by categories of alcohol consumption. RESULTS: Carriers of the A-allele of ADH1B rs1229984 consumed 17.2% fewer units of alcohol per week (95% confidence interval 15.6% to 18.9%), had a lower prevalence of binge drinking (odds ratio 0.78 (95% CI 0.73 to 0.84)), and had higher abstention (odds ratio 1.27 (1.21 to 1.34)) than non-carriers. Rs1229984 A-allele carriers had lower systolic blood pressure (-0.88 (-1.19 to -0.56) mm Hg), interleukin-6 levels (-5.2% (-7.8 to -2.4%)), waist circumference (-0.3 (-0.6 to -0.1) cm), and body mass index (-0.17 (-0.24 to -0.10) kg/m²). Rs1229984 A-allele carriers had lower odds of coronary heart disease (odds ratio 0.90 (0.84 to 0.96)). The protective association of the ADH1B rs1229984 A-allele variant remained the same across all categories of alcohol consumption (P=0.83 for heterogeneity). Although no association of rs1229984 was identified with the combined subtypes of stroke, carriers of the A-allele had lower odds of ischaemic stroke (odds ratio 0.83 (0.72 to 0.95)). CONCLUSIONS: Individuals with a genetic variant associated with non-drinking and lower alcohol consumption had a more favourable cardiovascular profile and a reduced risk of coronary heart disease than those without the genetic variant.
This suggests that reduction of alcohol consumption, even for light to moderate drinkers, is beneficial for cardiovascular health.

571 citations


Journal ArticleDOI
TL;DR: It is found that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations, and a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals is developed.
Abstract: Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally ‘unrelated’ individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors, detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.

555 citations
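
The switch error rate used to compare phasing methods in the abstract above can be illustrated with a small sketch. It assumes the common definition: among consecutive heterozygous sites, the fraction of adjacent pairs where the inferred phase flips relative to the true phase. The function name and the toy haplotypes are hypothetical; this is not SHAPEIT2 or duoHMM code.

```python
from typing import Sequence


def switch_error_rate(truth: Sequence[int], inferred: Sequence[int]) -> float:
    """Switch error rate for one individual.

    Both inputs give the allele (0 or 1) carried by the first haplotype at
    each heterozygous site, in genomic order. At a heterozygous site the
    second haplotype is the complement, so one haplotype is sufficient.
    """
    if len(truth) != len(inferred):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        return 0.0  # no adjacent pair of sites, so no switch is possible
    # True where the inferred haplotype matches the true one at a site.
    agree = [t == i for t, i in zip(truth, inferred)]
    # A switch occurs whenever agreement changes between neighbouring sites.
    switches = sum(a != b for a, b in zip(agree, agree[1:]))
    return switches / (len(agree) - 1)


if __name__ == "__main__":
    truth = [0, 1, 0, 1, 1, 0]
    inferred = [0, 1, 1, 0, 0, 0]  # phase flips after site 2 and back after site 5
    print(switch_error_rate(truth, inferred))
```

Under this definition a haplotype that is inverted at every site has no switch errors, since the labelling of the two haplotypes is arbitrary.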


Journal ArticleDOI
05 Feb 2014-Neuron
TL;DR: Genome-wide association and linkage results provide constraints on the allele frequencies and effect sizes of susceptibility loci, which are used to interpret the voluminous candidate gene literature.

507 citations


Journal ArticleDOI
TL;DR: This work maps and examines the function of human islet cis-regulatory networks, identifies genomic sequences that are targeted by islet transcription factors to drive islet-specific gene activity, and shows that most such sequences reside in clusters of enhancers that form physical three-dimensional chromatin domains.
Abstract: Type 2 diabetes affects over 300 million people, causing severe complications and premature death, yet the underlying molecular mechanisms are largely unknown. Pancreatic islet dysfunction is central in type 2 diabetes pathogenesis, and understanding islet genome regulation could therefore provide valuable mechanistic insights. We have now mapped and examined the function of human islet cis-regulatory networks. We identify genomic sequences that are targeted by islet transcription factors to drive islet-specific gene activity and show that most such sequences reside in clusters of enhancers that form physical three-dimensional chromatin domains. We find that sequence variants associated with type 2 diabetes and fasting glycemia are enriched in these clustered islet enhancers and identify trait-associated variants that disrupt DNA binding and islet enhancer activity. Our studies illustrate how islet transcription factors interact functionally with the epigenome and provide systematic evidence that the dysregulation of islet enhancers is relevant to the mechanisms underlying type 2 diabetes.

476 citations


Journal ArticleDOI
TL;DR: Mutations of the SHANK genes were detected in the whole spectrum of autism with a gradient of severity in cognitive impairment and the clinical relevance of these genes remains to be ascertained.
Abstract: SHANK genes code for scaffold proteins located at the post-synaptic density of glutamatergic synapses. In neurons, SHANK2 and SHANK3 have a positive effect on the induction and maturation of dendritic spines, whereas SHANK1 induces the enlargement of spine heads. Mutations in SHANK genes have been associated with autism spectrum disorders (ASD), but their prevalence and clinical relevance remain to be determined. Here, we performed a new screen and a meta-analysis of SHANK copy-number and coding-sequence variants in ASD. Copy-number variants were analyzed in 5,657 patients and 19,163 controls, coding-sequence variants were ascertained in 760 to 2,147 patients and 492 to 1,090 controls (depending on the gene), and individuals carrying de novo or truncating SHANK mutations underwent an extensive clinical investigation. Copy-number variants and truncating mutations in SHANK genes were present in ∼1% of patients with ASD: mutations in SHANK1 were rare (0.04%) and present in males with normal IQ and autism; mutations in SHANK2 were present in 0.17% of patients with ASD and mild intellectual disability; mutations in SHANK3 were present in 0.69% of patients with ASD and up to 2.12% of the cases with moderate to profound intellectual disability. In summary, mutations of the SHANK genes were detected in the whole spectrum of autism with a gradient of severity in cognitive impairment. Given the rare frequency of SHANK1 and SHANK2 deleterious mutations, the clinical relevance of these genes remains to be ascertained. In contrast, the frequency and the penetrance of SHANK3 mutations in individuals with ASD and intellectual disability (more than 1 in 50) warrant its consideration for mutation screening in clinical practice.

452 citations


Journal ArticleDOI
TL;DR: In this article, a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data is presented, drawing on issues experienced and solutions developed by the GIANT Consortium.
Abstract: Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.

Journal ArticleDOI
TL;DR: Using a set of validation genotypes at SNP and biallelic indels it is shown that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low frequency variants.
Abstract: A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.

Journal ArticleDOI
TL;DR: In this paper, the authors compared exome sequence data on 3,000 Finns to the same number of non-Finnish Europeans and discovered that the average Finn has more low-frequency loss-of-function variants and complete gene knockouts.
Abstract: Exome sequencing studies in complex diseases are challenged by the allelic heterogeneity, large number and modest effect sizes of associated variants on disease risk and the presence of large numbers of neutral variants, even in phenotypically relevant genes. Isolated populations with recent bottlenecks offer advantages for studying rare variants in complex diseases as they have deleterious variants that are present at higher frequencies as well as a substantial reduction in rare neutral variation. To explore the potential of the Finnish founder population for studying low-frequency (0.5-5%) variants in complex diseases, we compared exome sequence data on 3,000 Finns to the same number of non-Finnish Europeans and discovered that, despite having fewer variable sites overall, the average Finn has more low-frequency loss-of-function variants and complete gene knockouts. We then used several well-characterized Finnish population cohorts to study the phenotypic effects of 83 enriched loss-of-function variants across 60 phenotypes in 36,262 Finns. Using a deep set of quantitative traits collected on these cohorts, we show 5 associations (p<5×10⁻⁸) including splice variants in LPA that lowered plasma lipoprotein(a) levels (P = 1.5×10⁻¹¹⁷). Through accessing the national medical records of these participants, we evaluate the LPA finding via Mendelian randomization and confirm that these splice variants confer protection from cardiovascular disease (OR = 0.84, P = 3×10⁻⁴), demonstrating for the first time the correlation between very low levels of LPA in humans with potential therapeutic implications for cardiovascular diseases. More generally, this study articulates substantial advantages for studying the role of rare variation in complex phenotypes in founder populations like the Finns and by combining a unique population genetic history with data from large population cohorts and centralized research access to National Health Registers.

Journal ArticleDOI
Karani Santhanakrishnan Vimaleswaran1,2, Alana Cavadino1, Diane J. Berry1, Rolf Jorde3, Aida Karina Dieffenbach4, Chen Lu5, Alexessander Couto Alves6,7, Hiddo J.L. Heerspink8, Emmi Tikkanen9, J. G. Eriksson10, Andrew Wong11, Massimo Mangino12, Kathleen A. Jablonski13, Ilja M. Nolte8, Denise K. Houston14, Tarunveer S. Ahluwalia15,16, Peter J. van der Most8, Dorota Pasko17, Lina Zgaga18,19, Elisabeth Thiering20, Veronique Vitart19, Ross M. Fraser19, Jennifer E. Huffman19, Rudolf A. de Boer8, Ben Schöttker4, Kai-Uwe Saum4, Mark I. McCarthy21,22, Josée Dupuis5, Karl-Heinz Herzig23,7, Sylvain Sebert7, Anneli Pouta23,24, Jaana Laitinen25, Marcus E. Kleber26, Gerjan Navis8, Mattias Lorentzon10, Karen A. Jameson27, Nigel K Arden22,27, Jackie A. Cooper11, Jayshree Acharya11, Rebecca Hardy11, Olli T. Raitakari28,29, Samuli Ripatti9, Liana K. Billings, Jari Lahti9, Clive Osmond27, Brenda W.J.H. Penninx30, Lars Rejnmark31, Kurt Lohman14, Lavinia Paternoster32, Ronald P. Stolk8, Dena G. Hernandez24, Liisa Byberg33, Emil Hagström33, Håkan Melhus33, Erik Ingelsson33,21,34, Dan Mellström10, Östen Ljunggren33, Ioanna Tzoulaki6, Stela McLachlan19, Evropi Theodoratou19, Carla M. T. Tiesler20, Antti Jula24, Pau Navarro19, Alan F. Wright19, Ozren Polasek35, James F. Wilson19, Igor Rudan19, Veikko Salomaa24, Joachim Heinrich, Harry Campbell19, Jacqueline F. Price19, Magnus Karlsson36, Lars Lind33, Karl Michaëlsson33, Stefania Bandinelli, Timothy M. Frayling17, Catharina A. Hartman8, Thorkild I. A. Sørensen37,15, Stephen B. Kritchevsky14, Bente L. Langdahl31, Johan G. Eriksson, Jose C. Florez38, Tim D. Spector12, Terho Lehtimäki39, Diana Kuh11, Steve E. Humphries11, Cyrus Cooper27,22, Claes Ohlsson10, Winfried März26,40,41, Martin H. de Borst8, Meena Kumari11, Mika Kivimäki11, Thomas J. Wang42, Chris Power1, Hermann Brenner4, Guri Grimnes3, Pim van der Harst8, Harold Snieder8, Aroon D. Hingorani11, Stefan Pilz40, John C. Whittaker43, Marjo-Riitta Järvelin, Elina Hyppönen44,1
TL;DR: In this article, the authors used a mendelian randomisation approach to test whether low plasma 25-hydroxyvitamin D (25[OH]D) concentration is causally associated with blood pressure and hypertension risk.

Journal ArticleDOI
TL;DR: It is established that human colorectal cancer lines are representative of the main subtypes of primary tumors at the genomic level, further validating their utility as tools to investigate colorectal cancer biology and drug responses.
Abstract: Human colorectal cancer cell lines are used widely to investigate tumor biology, experimental therapy, and biomarkers. However, to what extent these established cell lines represent and maintain the genetic diversity of primary cancers is uncertain. In this study, we profiled 70 colorectal cancer cell lines for mutations and DNA copy number by whole-exome sequencing and SNP microarray analyses, respectively. Gene expression was defined using RNA-Seq. Cell line data were compared with those published for primary colorectal cancers in The Cancer Genome Atlas. Notably, we found that exome mutation and DNA copy-number spectra in colorectal cancer cell lines closely resembled those seen in primary colorectal tumors. Similarities included the presence of two hypermutation phenotypes, as defined by signatures for defective DNA mismatch repair and DNA polymerase ε proofreading deficiency, along with concordant mutation profiles in the broadly altered WNT, MAPK, PI3K, TGFβ, and p53 pathways. Furthermore, we documented mutations enriched in genes involved in chromatin remodeling (ARID1A, CHD6, and SRCAP) and histone methylation or acetylation (ASH1L, EP300, EP400, MLL2, MLL3, PRDM2, and TRRAP). Chromosomal instability was prevalent in nonhypermutated cases, with similar patterns of chromosomal gains and losses. Although paired cell lines derived from the same tumor exhibited considerable mutation and DNA copy-number differences, in silico simulations suggest that these differences mainly reflected a preexisting heterogeneity in the tumor cells. In conclusion, our results establish that human colorectal cancer lines are representative of the main subtypes of primary tumors at the genomic level, further validating their utility as tools to investigate colorectal cancer biology and drug responses.

Journal ArticleDOI
01 Jan 2014-Stroke
TL;DR: In this article, the authors conducted a genome-wide analysis to evaluate the extent of shared genetic determination of the two diseases and found substantial overlap in the genetic risk of IS and particularly the LAS subtype with CAD.
Abstract: BACKGROUND AND PURPOSE: Ischemic stroke (IS) and coronary artery disease (CAD) share several risk factors and each has a substantial heritability. We conducted a genome-wide analysis to evaluate the extent of shared genetic determination of the two diseases. METHODS: Genome-wide association data were obtained from the METASTROKE, Coronary Artery Disease Genome-wide Replication and Meta-analysis (CARDIoGRAM), and Coronary Artery Disease (C4D) Genetics consortia. We first analyzed common variants reaching a nominal threshold of significance (P<0.01) for CAD for their association with IS and vice versa. We then examined specific overlap across phenotypes for variants that reached a high threshold of significance. Finally, we conducted a joint meta-analysis on the combined phenotype of IS or CAD. Corresponding analyses were performed restricted to the 2167 individuals with the ischemic large artery stroke (LAS) subtype. RESULTS: Common variants associated with CAD at P<0.01 were associated with a significant excess risk for IS and for LAS and vice versa. Among the 42 known genome-wide significant loci for CAD, 3 and 5 loci were significantly associated with IS and LAS, respectively. In the joint meta-analyses, 15 loci passed genome-wide significance (P<5×10⁻⁸) for the combined phenotype of IS or CAD and 17 loci passed genome-wide significance for LAS or CAD. Because these loci had prior evidence for genome-wide significance for CAD, we specifically analyzed the respective signals for IS and LAS and found evidence for association at chr12q24/SH2B3 (P_IS=1.62×10⁻⁷) and ABO (P_IS=2.6×10⁻⁴), as well as at HDAC9 (P_LAS=2.32×10⁻¹²), 9p21 (P_LAS=3.70×10⁻⁶), RAI1-PEMT-RASD1 (P_LAS=2.69×10⁻⁵), EDNRA (P_LAS=7.29×10⁻⁴), and CYP17A1-CNNM2-NT5C2 (P_LAS=4.9×10⁻⁴). CONCLUSIONS: Our results demonstrate substantial overlap in the genetic risk of IS and particularly the LAS subtype with CAD.

Journal ArticleDOI
TL;DR: It is shown, using the specific example of Parkinson disease, that identification of protein–protein interactions can help determine the most likely candidate for several GWAS loci, and it is proposed that three different genes for PD have a common biological function.
Abstract: Mutations in leucine-rich repeat kinase 2 (LRRK2) cause inherited Parkinson disease (PD), and common variants around LRRK2 are a risk factor for sporadic PD. Using protein–protein interaction arrays, we identified BCL2-associated athanogene 5, Rab7L1 (RAB7, member RAS oncogene family-like 1), and Cyclin-G–associated kinase as binding partners of LRRK2. The latter two genes are candidate genes for risk for sporadic PD identified by genome-wide association studies. These proteins form a complex that promotes clearance of Golgi-derived vesicles through the autophagy–lysosome system both in vitro and in vivo. We propose that three different genes for PD have a common biological function. More generally, data integration from multiple unbiased screens can provide insight into human disease mechanisms.

Journal ArticleDOI
TL;DR: These results illustrate the impact of post-zygotic mosaicism on disease risk, could explain why males are more frequently affected by cancer and suggest that chromosome Y is important in processes beyond sex determination.
Abstract: Incidence and mortality for sex-unspecific cancers are higher among men, a fact that is largely unexplained. Furthermore, age-related loss of chromosome Y (LOY) is frequent in normal hematopoietic cells, but the phenotypic consequences of LOY have been elusive. From analysis of 1,153 elderly men, we report that LOY in peripheral blood was associated with risks of all-cause mortality (hazard ratio (HR) = 1.91, 95% confidence interval (CI) = 1.17-3.13; 637 events) and non-hematological cancer mortality (HR = 3.62, 95% CI = 1.56-8.41; 132 events). LOY affected at least 8.2% of the subjects in this cohort, and median survival times among men with LOY were 5.5 years shorter. Association of LOY with risk of all-cause mortality was validated in an independent cohort (HR = 3.66) in which 20.5% of subjects showed LOY. These results illustrate the impact of post-zygotic mosaicism on disease risk, could explain why males are more frequently affected by cancer and suggest that chromosome Y is important in processes beyond sex determination. LOY in blood could become a predictive biomarker of male carcinogenesis.

Journal ArticleDOI
01 Jun 2014-Diabetes
TL;DR: By assembling extensive data on continuous glycemic traits, this work has exposed the diverse mechanisms whereby type 2 diabetes risk variants impact disease predisposition.
Abstract: Patients with established type 2 diabetes display both β-cell dysfunction and insulin resistance. To define fundamental processes leading to the diabetic state, we examined the relationship between type 2 diabetes risk variants at 37 established susceptibility loci, and indices of proinsulin processing, insulin secretion, and insulin sensitivity. We included data from up to 58,614 nondiabetic subjects with basal measures and 17,327 with dynamic measures. We used additive genetic models with adjustment for sex, age, and BMI, followed by fixed-effects, inverse-variance meta-analyses. Cluster analyses grouped risk loci into five major categories based on their relationship to these continuous glycemic phenotypes. The first cluster (PPARG, KLF14, IRS1, GCKR) was characterized by primary effects on insulin sensitivity. The second cluster (MTNR1B, GCK) featured risk alleles associated with reduced insulin secretion and fasting hyperglycemia. ARAP1 constituted a third cluster characterized by defects in insulin processing. A fourth cluster (TCF7L2, SLC30A8, HHEX/IDE, CDKAL1, CDKN2A/2B) was defined by loci influencing insulin processing and secretion without a detectable change in fasting glucose levels. The final group contained 20 risk loci with no clear-cut associations to continuous glycemic traits. By assembling extensive data on continuous glycemic traits, we have exposed the diverse mechanisms whereby type 2 diabetes risk variants impact disease predisposition.
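
The fixed-effects, inverse-variance meta-analysis named in the abstract above combines per-study effect estimates by weighting each one by the inverse of its variance. A minimal sketch follows, assuming each study contributes an effect estimate and its standard error; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple


def fixed_effects_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effects inverse-variance pooled estimate and its standard error.

    betas: per-study effect estimates (e.g. regression coefficients)
    ses:   per-study standard errors, all strictly positive
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / se_i^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


if __name__ == "__main__":
    # Hypothetical effect estimates from three studies.
    beta, se = fixed_effects_meta([0.10, 0.14, 0.08], [0.04, 0.05, 0.03])
    print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, z = {beta / se:.2f}")
```

More precise studies dominate the pooled estimate, and the pooled standard error is never larger than the smallest per-study standard error.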


Journal ArticleDOI
01 Jun 2014-Diabetes
TL;DR: In this paper, the authors used RNA sequencing to map transcripts expressed in five palmitate-treated human islet preparations, observing 1,325 modified genes, including PAX4 and GATA6.
Abstract: Pancreatic β-cell dysfunction and death are central in the pathogenesis of type 2 diabetes (T2D). Saturated fatty acids cause β-cell failure and contribute to diabetes development in genetically predisposed individuals. Here we used RNA sequencing to map transcripts expressed in five palmitate-treated human islet preparations, observing 1,325 modified genes. Palmitate induced fatty acid metabolism and endoplasmic reticulum (ER) stress. Functional studies identified novel mediators of adaptive ER stress signaling. Palmitate modified genes regulating ubiquitin and proteasome function, autophagy, and apoptosis. Inhibition of autophagic flux and lysosome function contributed to lipotoxicity. Palmitate inhibited transcription factors controlling β-cell phenotype, including PAX4 and GATA6. Fifty-nine T2D candidate genes were expressed in human islets, and 11 were modified by palmitate. Palmitate modified expression of 17 splicing factors and shifted alternative splicing of 3,525 transcripts. Ingenuity Pathway Analysis of modified transcripts and genes confirmed that top changed functions related to cell death. Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis of transcription factor binding sites in palmitate-modified transcripts revealed a role for PAX4, GATA, and the ER stress response regulators XBP1 and ATF6. This human islet transcriptome study identified novel mechanisms of palmitate-induced β-cell dysfunction and death. The data point to cross talk between metabolic stress and candidate genes at the β-cell level.

Journal ArticleDOI
TL;DR: This study provides the baseline prevalence of K13-propeller mutations in sub-Saharan Africa and detects 22 unique mutations, of which 7 were nonsynonymous.
Abstract: Mutations in the Plasmodium falciparum K13-propeller domain have recently been shown to be important determinants of artemisinin resistance in Southeast Asia. This study investigated the prevalence of K13-propeller polymorphisms across sub-Saharan Africa. A total of 1212 P. falciparum samples collected from 12 countries were sequenced. None of the K13-propeller mutations previously reported in Southeast Asia were found, but 22 unique mutations were detected, of which 7 were nonsynonymous. Allele frequencies ranged between 1% and 3%. Three mutations were observed in >1 country, and the A578S was present in parasites from 5 countries. This study provides the baseline prevalence of K13-propeller mutations in sub-Saharan Africa.

Journal ArticleDOI
TL;DR: A mass spectrometry-based non-targeted metabolomics study for association with incident CHD events identified four lipid-related metabolites with evidence for clinical utility, as well as a causal role in CHD development.
Abstract: Analyses of circulating metabolites in large prospective epidemiological studies could lead to improved prediction and better biological understanding of coronary heart disease (CHD). We performed a mass spectrometry-based non-targeted metabolomics study for association with incident CHD events in 1,028 individuals (131 events; 10 y. median follow-up) with validation in 1,670 individuals (282 events; 3.9 y. median follow-up). Four metabolites were replicated and independent of main cardiovascular risk factors [lysophosphatidylcholine 18:1 (hazard ratio [HR] per standard deviation [SD] increment = 0.77, P-value<0.001), lysophosphatidylcholine 18:2 (HR = 0.81, P-value<0.001), monoglyceride 18:2 (MG 18:2; HR = 1.18, P-value = 0.011) and sphingomyelin 28:1 (HR = 0.85, P-value = 0.015)]. Together they contributed to moderate improvements in discrimination and re-classification in addition to traditional risk factors (C-statistic: 0.76 vs. 0.75; NRI: 9.2%). MG 18:2 was associated with CHD independently of triglycerides. Lysophosphatidylcholines were negatively associated with body mass index, C-reactive protein and with less evidence of subclinical cardiovascular disease in additional 970 participants; a reverse pattern was observed for MG 18:2. MG 18:2 showed an enrichment (P-value = 0.002) of significant associations with CHD-associated SNPs (P-value = 1.2×10⁻⁷ for association with rs964184 in the ZNF259/APOA5 region) and a weak, but positive causal effect (odds ratio = 1.05 per SD increment in MG 18:2, P-value = 0.05) on CHD, as suggested by Mendelian randomization analysis. In conclusion, we identified four lipid-related metabolites with evidence for clinical utility, as well as a causal role in CHD development.

Journal ArticleDOI
TL;DR: WGS on six patients with severe early-onset epilepsy who had previously been refractory to molecular diagnosis, and their parents, reveals two novel genes for Ohtahara Syndrome, KCNT1 and PIGQ and uncovers unexpected genetic mechanisms.
Abstract: In severe early-onset epilepsy, precise clinical and molecular genetic diagnosis is complex, as many metabolic and electro-physiological processes have been implicated in disease causation. The clinical phenotypes share many features such as complex seizure types and developmental delay. Molecular diagnosis has historically been confined to sequential testing of candidate genes known to be associated with specific sub-phenotypes, but the diagnostic yield of this approach can be low. We conducted whole-genome sequencing (WGS) on six patients with severe early-onset epilepsy who had previously been refractory to molecular diagnosis, and their parents. Four of these patients had a clinical diagnosis of Ohtahara Syndrome (OS) and two patients had severe non-syndromic early-onset epilepsy (NSEOE). In two OS cases, we found de novo non-synonymous mutations in the genes KCNQ2 and SCN2A. In a third OS case, WGS revealed paternal isodisomy for chromosome 9, leading to identification of the causal homozygous missense variant in KCNT1, which produced a substantial increase in potassium channel current. The fourth OS patient had a recessive mutation in PIGQ that led to exon skipping and defective glycophosphatidyl inositol biosynthesis. The two patients with NSEOE had likely pathogenic de novo mutations in CBL and CSNK1G1, respectively. Mutations in these genes were not found among 500 additional individuals with epilepsy. This work reveals two novel genes for OS, KCNT1 and PIGQ. It also uncovers unexpected genetic mechanisms and emphasizes the power of WGS as a clinical tool for making molecular diagnoses, particularly for highly heterogeneous disorders.

Journal ArticleDOI
TL;DR: This work identifies and characterises sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences, revealing that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
Abstract: Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25–0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1–5.0). From extrapolations we estimate that 8.2% (7.1–9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.

Journal ArticleDOI
TL;DR: The role of CNTNAP2 is examined in the context of larger neurogenetic networks during development and disorder, given what is known regarding the regulation and function of this gene.
Abstract: The genetic basis of complex neurological disorders involving language is poorly understood, partly due to the multiple additive genetic risk factors that are thought to be responsible. Furthermore, these conditions are often syndromic in that they have a range of endophenotypes that may be associated with the disorder and that may be present in different combinations in patients. However, the emergence of individual genes implicated across multiple disorders has suggested that they might share similar underlying genetic mechanisms. The CNTNAP2 gene is an excellent example of this, as it has recently been implicated in a broad range of phenotypes including autism spectrum disorder (ASD), schizophrenia, intellectual disability, dyslexia and language impairment. This review considers the evidence implicating CNTNAP2 in these conditions, the genetic risk factors and mutations that have been identified in patient and population studies and how these relate to patient phenotypes. The role of CNTNAP2 is examined in the context of larger neurogenetic networks during development and disorder, given what is known regarding the regulation and function of this gene. Understanding the role of CNTNAP2 in diverse neurological disorders will further our understanding of how combinations of individual genetic risk factors can contribute to complex conditions.


Journal ArticleDOI
TL;DR: A panel of genetic biomarkers for capecitabine monotherapy toxicity would currently comprise only the four DPYD and TYMS variants above, but the test panel might be extended to include additional, rare DPYD variants functionally equivalent to *2A and 2846A, though insufficient evidence supports its use in bolus, infusional, or combination FU.
Abstract: Purpose: Fluorouracil (FU) is a mainstay of chemotherapy, although toxicities are common. Genetic biomarkers have been used to predict these adverse events, but their utility is uncertain. Patients and Methods

Journal ArticleDOI
TL;DR: It is suggested that commonly encountered natural environmental stresses can accelerate the accumulation and change the profiles of novel inherited variants in plants.
Abstract: Evolution is fueled by phenotypic diversity, which is in turn due to underlying heritable genetic (and potentially epigenetic) variation. While environmental factors are well known to influence the accumulation of novel variation in microorganisms and human cancer cells, the extent to which the natural environment influences the accumulation of novel variation in plants is relatively unknown. Here we use whole-genome and whole-methylome sequencing to test if a specific environmental stress (high-salinity soil) changes the frequency and molecular profile of accumulated mutations and epimutations (changes in cytosine methylation status) in mutation accumulation (MA) lineages of Arabidopsis thaliana. We first show that stressed lineages accumulate ∼100% more mutations, and that these mutations exhibit a distinctive molecular mutational spectrum (specific increases in relative frequency of transversion and insertion/deletion [indel] mutations). We next show that stressed lineages accumulate ∼45% more differentially methylated cytosine positions (DMPs) at CG sites (CG-DMPs) than controls, and also show that while many (∼75%) of these CG-DMPs are inherited, some can be lost in subsequent generations. Finally, we show that stress-associated CG-DMPs arise more frequently in genic than in nongenic regions of the genome. We suggest that commonly encountered natural environmental stresses can accelerate the accumulation and change the profiles of novel inherited variants in plants. Our findings are significant because stress exposure is common among plants in the wild, and they suggest that environmental factors may significantly alter the rates and patterns of incidence of the inherited novel variants that fuel plant evolution.

Journal ArticleDOI
TL;DR: The authors found that the relative effect of a type 2 diabetes genetic risk score is greater in younger and leaner participants, and the high absolute risk associated with obesity at any level of genetic risk highlights the importance of universal rather than targeted approaches to lifestyle intervention.
Abstract: Background: Understanding of the genetic basis of type 2 diabetes (T2D) has progressed rapidly, but the interactions between common genetic variants and lifestyle risk factors have not been systematically investigated in studies with adequate statistical power. Therefore, we aimed to quantify the combined effects of genetic and lifestyle factors on risk of T2D in order to inform strategies for prevention. Methods and Findings: The InterAct study includes 12,403 incident T2D cases and a representative sub-cohort of 16,154 individuals from a cohort of 340,234 European participants with 3.99 million person-years of follow-up. We studied the combined effects of an additive genetic T2D risk score and modifiable and non-modifiable risk factors using Prentice-weighted Cox regression and random effects meta-analysis methods. The effect of the genetic score was significantly greater in younger individuals (p for interaction = 1.20×10⁻⁴). Relative genetic risk (per standard deviation [4.4 risk alleles]) was also larger in participants who were leaner, both in terms of body mass index (p for interaction = 1.50×10⁻³) and waist circumference (p for interaction = 7.49×10⁻⁹). Examination of absolute risks by strata showed the importance of obesity for T2D risk. The 10-y cumulative incidence of T2D rose from 0.25% to 0.89% across extreme quartiles of the genetic score in normal weight individuals, compared to 4.22% to 7.99% in obese individuals. We detected no significant interactions between the genetic score and sex, diabetes family history, physical activity, or dietary habits assessed by a Mediterranean diet score. Conclusions: The relative effect of a T2D genetic risk score is greater in younger and leaner participants. However, this subgroup is at low absolute risk and would not be a logical target for preventive interventions. The high absolute risk associated with obesity at any level of genetic risk highlights the importance of universal rather than targeted approaches to lifestyle intervention.
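The contrast between relative and absolute risk drawn in this abstract can be made concrete with the four cumulative incidence figures it quotes. The short sketch below is only arithmetic on those published percentages. The helper name is hypothetical, and the ratio of cumulative incidences is used as a simple stand-in for relative risk, not as the study's own hazard-based estimate.

```python
def risk_contrast(low_pct, high_pct):
    """Return (ratio, absolute difference in percentage points)
    between the highest and lowest genetic-score quartiles."""
    return high_pct / low_pct, high_pct - low_pct

# 10-year cumulative incidence of T2D (%), lowest vs highest genetic-score quartile.
normal_ratio, normal_diff = risk_contrast(0.25, 0.89)  # normal weight
obese_ratio, obese_diff = risk_contrast(4.22, 7.99)    # obese

print(f"normal weight: ratio {normal_ratio:.2f}, difference {normal_diff:.2f} points")
print(f"obese:         ratio {obese_ratio:.2f}, difference {obese_diff:.2f} points")
```

The ratio is larger in normal-weight participants, while the absolute difference is larger in obese participants. That is the pattern behind the authors' argument for universal rather than genetically targeted lifestyle intervention.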