Showing papers by "Wellcome Trust Sanger Institute" published in 2014

Journal Article

Pfam: the protein families database

Robert D. Finn¹, Alex Bateman², Jody Clements¹, Penelope Coggill², Ruth Y. Eberhardt², Sean R. Eddy¹, Andreas Heger, Kirstie Hetherington³, Liisa Holm, Jaina Mistry², Erik L. L. Sonnhammer⁴, John Tate², Marco Punta²

Howard Hughes Medical Institute¹, European Bioinformatics Institute², Wellcome Trust Sanger Institute³, Stockholm University⁴

01 Jan 2014-Nucleic Acids Research

TL;DR: Pfam is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0, with 1182 new families generated since the 2012 update article.

Abstract: Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.

9,415 citations

Journal Article

InterProScan 5: genome-scale protein function classification

Philip Jones¹, David Binns¹, Hsin-Yu Chang¹, Matthew Fraser¹, Weizhong Li¹, Craig McAnulla¹, Hamish McWilliam¹, John Maslen¹, Alex L. Mitchell¹, Gift Nuka¹, Sebastien Pesseat¹, Antony F. Quinn¹, Amaia Sangrador-Vegas¹, Maxim Scheremetjew¹, Siew-Yit Yong¹, Rodrigo Lopez¹, Sarah Hunter¹

Wellcome Trust Sanger Institute¹

01 May 2014-Bioinformatics

TL;DR: A new Java-based architecture for the widely used protein function prediction software package InterProScan is described, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis.

Abstract: Motivation: Robust, large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterise many millions of sequences. Here we describe a new Java-based architecture for the widely-used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete re-implementation of the software framework, resulting in a flexible and stable system that is able to utilise both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBL-EBI FTP site and the (open) source code is hosted at Google Code. Availability: InterProScan is distributed via FTP at ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/ and the source code is available from http://code.google.com/p/interproscan/. Contact: http://www.ebi.ac.uk/support or interhelp@ebi.ac.uk

5,434 citations

Journal Article

Reagent and laboratory contamination can critically impact sequence-based microbiome analyses

Susannah J. Salter¹, Michael J. Cox², Elena M. Turek², Szymon T. Calus³, William O.C.M. Cookson², Miriam F. Moffatt², Paul Turner⁴, Paul Turner⁵, Julian Parkhill¹, Nicholas J. Loman³, Alan W. Walker⁶, Alan W. Walker¹

Wellcome Trust Sanger Institute¹, National Institutes of Health², University of Birmingham³, University of Oxford⁴, Mahidol University⁵, University of Aberdeen⁶

12 Nov 2014-BMC Biology

TL;DR: It is demonstrated that contaminating DNA is ubiquitous in commonly used DNA extraction kits and other laboratory reagents, varies greatly in composition between different kits and kit batches, and that this contamination critically impacts results obtained from samples containing a low microbial biomass.

Abstract: The study of microbial communities has been revolutionised in recent years by the widespread adoption of culture independent analytical techniques such as 16S rRNA gene sequencing and metagenomics. One potential confounder of these sequence-based approaches is the presence of contamination in DNA extraction kits and other laboratory reagents. In this study we demonstrate that contaminating DNA is ubiquitous in commonly used DNA extraction kits and other laboratory reagents, varies greatly in composition between different kits and kit batches, and that this contamination critically impacts results obtained from samples containing a low microbial biomass. Contamination impacts both PCR-based 16S rRNA gene surveys and shotgun metagenomics. We provide an extensive list of potential contaminating genera, and guidelines on how to mitigate the effects of contamination. These results suggest that caution should be advised when applying sequence-based techniques to the study of microbiota present in low biomass environments. Concurrent sequencing of negative control samples is strongly advised.
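The advice to sequence negative controls alongside samples can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the 1% threshold, and the genus labels and read counts are all hypothetical placeholders chosen for illustration. The idea is simply to flag any genus seen in a sample that also makes up a non-trivial share of the negative control.

```python
from typing import Dict, List


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts per genus into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        return {genus: 0.0 for genus in counts}
    return {genus: n / total for genus, n in counts.items()}


def flag_possible_contaminants(
    sample: Dict[str, int],
    negative_control: Dict[str, int],
    min_control_fraction: float = 0.01,  # assumed threshold, not from the paper
) -> List[str]:
    """Return genera present in the sample that are also abundant in the control."""
    control_fractions = relative_abundance(negative_control)
    return sorted(
        genus
        for genus, reads in sample.items()
        if reads > 0 and control_fractions.get(genus, 0.0) >= min_control_fraction
    )


# Made-up counts for a low-biomass sample and its extraction-kit blank.
sample_counts = {"genus_A": 40, "genus_B": 900, "genus_C": 60}
control_counts = {"genus_A": 70, "genus_C": 30}

flagged = flag_possible_contaminants(sample_counts, control_counts)
print(flagged)
```

Flagged genera are candidates for closer inspection rather than automatic removal, since a genuine community member can also appear in a blank through cross-contamination.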

2,459 citations

Journal Article

Synaptic, transcriptional and chromatin genes disrupted in autism

Silvia De Rubeis¹, Xin-Xin He², Arthur P. Goldberg¹, Christopher S. Poultney¹, Kaitlin E. Samocha³, A. Ercument Cicek², Yan Kou¹, Li Liu², Menachem Fromer¹, Menachem Fromer³, R. Susan Walker⁴, Tarjinder Singh⁵, Lambertus Klei⁶, Jack A. Kosmicki³, Shih-Chen Fu¹, Branko Aleksic⁷, Monica Biscaldi⁸, Patrick Bolton⁹, Jessica M. Brownfeld¹, Jinlu Cai¹, Nicholas G. Campbell¹⁰, Angel Carracedo¹¹, Angel Carracedo¹², Maria H. Chahrour³, Andreas G. Chiocchetti, Hilary Coon¹³, Emily L. Crawford¹⁰, Lucy Crooks⁵, Sarah Curran⁹, Geraldine Dawson¹⁴, Eftichia Duketis, Bridget A. Fernandez¹⁵, Louise Gallagher¹⁶, Evan T. Geller¹⁷, Stephen J. Guter¹⁸, R. Sean Hill¹⁹, R. Sean Hill³, Iuliana Ionita-Laza²⁰, Patricia Jiménez González, Helena Kilpinen, Sabine M. Klauck²¹, Alexander Kolevzon¹, Irene Lee²², Jing Lei², Terho Lehtimäki, Chiao-Feng Lin¹⁷, Avi Ma'ayan¹, Christian R. Marshall⁴, Alison L. McInnes²³, Benjamin M. Neale²⁴, Michael John Owen²⁵, Norio Ozaki⁷, Mara Parellada²⁶, Jeremy R. Parr²⁷, Shaun Purcell¹, Kaija Puura, Deepthi Rajagopalan⁴, Karola Rehnström⁵, Abraham Reichenberg¹, Aniko Sabo²⁸, Michael Sachse, Stephen Sanders²⁹, Chad M. Schafer², Martin Schulte-Rüther³⁰, David Skuse²², David Skuse³¹, Christine Stevens²⁴, Peter Szatmari³², Kristiina Tammimies⁴, Otto Valladares¹⁷, Annette Voran³³, Li-San Wang¹⁷, Lauren A. Weiss²⁹, A. Jeremy Willsey²⁹, Timothy W. Yu¹⁹, Timothy W. Yu³, Ryan K. C. Yuen⁴, Edwin H. Cook¹⁸, Christine M. Freitag, Michael Gill¹⁶, Christina M. Hultman³⁴, Thomas Lehner³⁵, Aarno Palotie³⁶, Aarno Palotie²⁴, Aarno Palotie³, Gerard D. Schellenberg¹⁷, Pamela Sklar¹, Matthew W. State²⁹, James S. Sutcliffe¹⁰, Christopher A. Walsh¹⁹, Christopher A. Walsh³, Stephen W. Scherer⁴, Michael E. Zwick³⁷, Jeffrey C. Barrett⁵, David J. Cutler³⁷, Kathryn Roeder², Bernie Devlin⁶, Mark J. Daly²⁴, Mark J. Daly³, Joseph D. Buxbaum¹ - Show less +96 more•Institutions (37)

Icahn School of Medicine at Mount Sinai¹, Carnegie Mellon University², Harvard University³, University of Toronto⁴, Wellcome Trust Sanger Institute⁵, University of Pittsburgh⁶, Nagoya University⁷, University of Freiburg⁸, King's College London⁹, Vanderbilt University¹⁰, King Abdulaziz University¹¹, University of Santiago de Compostela¹², University of Utah¹³, Duke University¹⁴, Memorial University of Newfoundland¹⁵, Trinity College, Dublin¹⁶, University of Pennsylvania¹⁷, University of Illinois at Chicago¹⁸, Boston Children's Hospital¹⁹, Columbia University²⁰, German Cancer Research Center²¹, University College London²², Kaiser Permanente²³, Broad Institute²⁴, Cardiff University²⁵, Complutense University of Madrid²⁶, Newcastle University²⁷, Baylor College of Medicine²⁸, University of California, San Francisco²⁹, RWTH Aachen University³⁰, National Health Service³¹, McMaster University³², Saarland University³³, Karolinska Institutet³⁴, National Institutes of Health³⁵, University of Helsinki³⁶, Emory University³⁷

13 Nov 2014-Nature

TL;DR: Using exome sequencing, it is shown that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate of < 0.05, plus a set of 107 genes strongly enriched for those likely to affect risk (FDR < 0.30).

Abstract: The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers, most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.
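To make the two FDR thresholds concrete, here is a small sketch of how a gene list shrinks or grows with the cutoff. The abstract does not say how FDR was computed, so the Benjamini-Hochberg adjustment below is a swapped-in standard method, and the p-values are invented for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical per-gene p-values, not taken from the study.
gene_p_values = [0.001, 0.02, 0.04, 0.5]
q = benjamini_hochberg(gene_p_values)

strict = sum(1 for value in q if value < 0.05)   # analogous to FDR < 0.05
relaxed = sum(1 for value in q if value < 0.30)  # analogous to FDR < 0.30
print(strict, relaxed)
```

A looser cutoff admits more genes at the cost of a larger expected share of false positives, which is why the larger gene set is described as enriched for risk genes rather than individually implicated.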

2,228 citations

Journal Article

Defining the role of common variation in the genomic and biological architecture of adult human height

Andrew R. Wood¹, Tõnu Esko², Jian Yang³, Sailaja Vedantam⁴ and 441 more authors (132 institutions)

01 Nov 2014-Nature Genetics

TL;DR: This article identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height, and all common variants together captured 60% of heritability.

Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

1,872 citations

Journal Article

A molecular marker of artemisinin-resistant Plasmodium falciparum malaria

02 Jan 2014-Nature

TL;DR: Strong correlations between the presence of a mutant allele, in vitro parasite survival rates and in vivo parasite clearance rates indicate that K13-propeller mutations are important determinants of artemisinin resistance.

Abstract: Plasmodium falciparum resistance to artemisinin derivatives in southeast Asia threatens malaria control and elimination activities worldwide. To monitor the spread of artemisinin resistance, a molecular marker is urgently needed. Here, using whole-genome sequencing of an artemisinin-resistant parasite line from Africa and clinical parasite isolates from Cambodia, we associate mutations in the PF3D7_1343700 kelch propeller domain ('K13-propeller') with artemisinin resistance in vitro and in vivo. Mutant K13-propeller alleles cluster in Cambodian provinces where resistance is prevalent, and the increasing frequency of a dominant mutant K13-propeller allele correlates with the recent spread of resistance in western Cambodia. Strong correlations between the presence of a mutant allele, in vitro parasite survival rates and in vivo parasite clearance rates indicate that K13-propeller mutations are important determinants of artemisinin resistance. K13-propeller polymorphism constitutes a useful molecular marker for large-scale surveillance efforts to contain artemisinin resistance in the Greater Mekong Subregion and prevent its global spread.

1,639 citations

Journal Article

De novo mutations in schizophrenia implicate synaptic networks

Menachem Fromer¹, Andrew Pocklington², David J. Kavanagh², Hywel Williams², Sarah Dwyer², Padhraig Gormley³, Lyudmila Georgieva², Elliott Rees², Priit Palta³, Douglas M. Ruderfer¹, Noa Carrera², Isla Humphreys², Jessica S. Johnson¹, Panos Roussos¹, Douglas Barker⁴, Eric Banks⁴, Vihra Milanova⁵, Seth G. N. Grant⁶, Eilis Hannon², Samuel A. Rose⁴, Kimberly Chambert⁴, Milind Mahajan¹, Edward M. Scolnick⁴, Jennifer L. Moran⁴, George Kirov², Aarno Palotie³, Steven A. McCarroll⁷, Peter Holmans², Pamela Sklar¹, Michael John Owen², Shaun Purcell¹, Michael Conlon O'Donovan²

Icahn School of Medicine at Mount Sinai¹, Cardiff University², Wellcome Trust Sanger Institute³, Massachusetts Institute of Technology⁴, Sofia Medical University⁵, University of Edinburgh⁶, Harvard University⁷

13 Feb 2014-Nature

TL;DR: Genes affected by mutations in schizophrenia overlap those mutated in autism and intellectual disability, as do mutation-enriched synaptic pathways, revealing pathophysiology shared with other neurodevelopmental disorders.

Abstract: Inherited alleles account for most of the genetic risk for schizophrenia. However, new (de novo) mutations, in the form of large chromosomal copy number changes, occur in a small fraction of cases and disproportionally disrupt genes encoding postsynaptic proteins. Here we show that small de novo mutations, affecting one or a few nucleotides, are overrepresented among glutamatergic postsynaptic proteins comprising activity-regulated cytoskeleton-associated protein (ARC) and N-methyl-d-aspartate receptor (NMDAR) complexes. Mutations are additionally enriched in proteins that interact with these complexes to modulate synaptic strength, namely proteins regulating actin filament dynamics and those whose messenger RNAs are targets of fragile X mental retardation protein (FMRP). Genes affected by mutations in schizophrenia overlap those mutated in autism and intellectual disability, as do mutation-enriched synaptic pathways. Aligning our findings with a parallel case–control study, we demonstrate reproducible insights into aetiological mechanisms for schizophrenia and reveal pathophysiology shared with other neurodevelopmental disorders.

1,501 citations

Journal Article

A polygenic burden of rare disruptive mutations in schizophrenia

Shaun Purcell¹, Jennifer L. Moran², Menachem Fromer¹, Douglas M. Ruderfer¹, Nadia Solovieff³, Panos Roussos¹, Colm O'Dushlaine², Kimberly Chambert², Sarah E. Bergen⁴, Anna K. Kähler⁴, Laramie E. Duncan³, Eli A. Stahl¹, Giulio Genovese², Esperanza Fernández⁵, Mark O. Collins⁶, Noboru H. Komiyama⁶, Jyoti S. Choudhary⁶, Patrik K. E. Magnusson⁴, Eric Banks², Khalid Shakir², Kiran V. Garimella², Timothy Fennell², Mark A. DePristo², Seth G. N. Grant⁷, Stephen J. Haggarty³, Stacey Gabriel², Edward M. Scolnick², Eric S. Lander², Christina M. Hultman⁴, Patrick F. Sullivan⁸, Steven A. McCarroll³, Pamela Sklar¹

Icahn School of Medicine at Mount Sinai¹, Broad Institute², Harvard University³, Karolinska Institutet⁴, Katholieke Universiteit Leuven⁵, Wellcome Trust Sanger Institute⁶, University of Edinburgh⁷, University of North Carolina at Chapel Hill⁸

13 Feb 2014-Nature

TL;DR: In this article, the exome sequences of 2,536 schizophrenia cases and 2,543 controls were analyzed and the authors demonstrated a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes.

Abstract: Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.

1,323 citations

Journal Article

Guidelines for investigating causality of sequence variants in human disease

Daniel G. MacArthur¹, Teri A. Manolio², David Dimmock³, Heidi L. Rehm¹, Jay Shendure⁴, Gonçalo R. Abecasis⁵, David R. Adams², Russ B. Altman⁶, Stylianos E. Antonarakis⁷, Euan A. Ashley⁶, Jeffrey C. Barrett⁸, Leslie G. Biesecker², Donald F. Conrad⁹, Gregory M. Cooper, Nancy J. Cox¹⁰, Mark J. Daly¹, Mark Gerstein¹¹, David Goldstein¹², Joel N. Hirschhorn¹³, Suzanne M. Leal¹⁴, Len A. Pennacchio¹⁵, John A. Stamatoyannopoulos⁴, Shamil R. Sunyaev¹, David Valle¹⁶, Benjamin F. Voight¹⁷, Wendy Winckler¹⁸, Chris Gunter

Harvard University¹, National Institutes of Health², Medical College of Wisconsin³, University of Washington⁴, University of Michigan⁵, Stanford University⁶, University of Geneva⁷, Wellcome Trust Sanger Institute⁸, Washington University in St. Louis⁹, University of Chicago¹⁰, Yale University¹¹, Duke University¹², Boston Children's Hospital¹³, Baylor College of Medicine¹⁴, Lawrence Berkeley National Laboratory¹⁵, Johns Hopkins University¹⁶, University of Pennsylvania¹⁷, Broad Institute¹⁸

24 Apr 2014-Nature

TL;DR: The key challenges of assessing sequence variants in human disease are discussed, integrating both gene-level and variant-level support for causality and guidelines for summarizing confidence in variant pathogenicity are proposed.

Abstract: The discovery of rare genetic variants is accelerating, and clear guidelines for distinguishing disease-causing sequence variants from the many potentially functional variants present in any human genome are urgently needed. Without rigorous standards we risk an acceleration of false-positive reports of causality, which would impede the translation of genomic research findings into the clinical diagnostic setting and hinder biological understanding of disease. Here we discuss the key challenges of assessing sequence variants in human disease, integrating both gene-level and variant-level support for causality. We propose guidelines for summarizing confidence in variant pathogenicity and highlight several areas that require further resource development.

1,165 citations

Journal Article

Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library

Hiroko Koike-Yusa¹, Yang Li¹, E-Pien Tan¹, Martin Del Castillo Velasco-Herrera¹, Kosuke Yusa¹

Wellcome Trust Sanger Institute¹

01 Mar 2014-Nature Biotechnology

TL;DR: The results demonstrate the potential for efficient loss-of-function screening using the CRISPR-Cas9 system and identify 27 known and 4 previously unknown genes implicated in these phenotypes.

Abstract: Identification of genes influencing a phenotype of interest is frequently achieved through genetic screening by RNA interference (RNAi) or knockouts. However, RNAi may only achieve partial depletion of gene activity, and knockout-based screens are difficult in diploid mammalian cells. Here we took advantage of the efficiency and high throughput of genome editing based on type II, clustered, regularly interspaced, short palindromic repeats (CRISPR)-CRISPR-associated (Cas) systems to introduce genome-wide targeted mutations in mouse embryonic stem cells (ESCs). We designed 87,897 guide RNAs (gRNAs) targeting 19,150 mouse protein-coding genes and used a lentiviral vector to express these gRNAs in ESCs that constitutively express Cas9. Screening the resulting ESC mutant libraries for resistance to either Clostridium septicum alpha-toxin or 6-thioguanine identified 27 known and 4 previously unknown genes implicated in these phenotypes. Our results demonstrate the potential for efficient loss-of-function screening using the CRISPR-Cas9 system.

1,001 citations

Journal Article

An atlas of genetic influences on human blood metabolites

So-Youn Shin¹, Eric B. Fauman², Ann-Kristin Petersen, Jan Krumsiek, Rita Santos³, Jie Huang¹, Matthias Arnold, Idil Erte⁴, Vincenzo Forgetta⁵, Tsun-Po Yang¹, Klaudia Walter¹, Cristina Menni⁴, Lu Chen¹, Lu Chen⁶, Louella Vasquez¹, Ana M. Valdes⁷, Ana M. Valdes⁴, Craig L. Hyde², Vicky Wang², Daniel Ziemek², Phoebe M. Roberts², Li Xi², Elin Grundberg⁵, Melanie Waldenberger, J. Brent Richards⁵, Robert P. Mohney⁸, Michael V. Milburn⁸, Sally John², Jeff K. Trimmer², Fabian J. Theis⁹, John P. Overington³, Karsten Suhre¹⁰, M. Julia Brosnan², Christian Gieger, Gabi Kastenmüller, Tim D. Spector⁴, Nicole Soranzo¹

Wellcome Trust Sanger Institute¹, Pfizer², European Bioinformatics Institute³, King's College London⁴, McGill University⁵, University of Cambridge⁶, University of Nottingham⁷, Durham University⁸, Technische Universität München⁹, Cornell University¹⁰

01 Jun 2014-Nature Genetics

TL;DR: The most comprehensive exploration of genetic loci influencing human metabolism thus far, comprising 7,824 adult individuals from 2 European population studies, identifies genome-wide significant associations at 145 metabolic loci and their biochemical connectivity with more than 400 metabolites in human blood.

Abstract: Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism thus far, comprising 7,824 adult individuals from 2 European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity with more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information on gene expression, heritability and overlap with known loci for complex disorders, inborn errors of metabolism and pharmacological targets. We further developed a database and web-based resources for data mining and results visualization. Our findings provide new insights into the role of inherited variation in blood metabolic diversity and identify potential new opportunities for drug development and for understanding disease.

Journal Article

Spatial and temporal diversity in genomic instability processes defines lung cancer evolution

Elza C de Bruin¹, Nicholas McGranahan², Nicholas McGranahan¹, Richard Mitter², Max Salm², David C. Wedge³, Lucy R. Yates⁴, Lucy R. Yates³, Mariam Jamal-Hanjani¹, Seema Shafi¹, Nirupa Murugaesu¹, Andrew Rowan², Eva Grönroos², Madiha A. Muhammad¹, Stuart Horswell², Marco Gerlinger², Ignacio Varela⁵, David T. Jones³, John Marshall³, Thierry Voet³, Thierry Voet⁶, Peter Van Loo³, Peter Van Loo⁶, Doris Rassl⁷, Robert C. Rintoul⁷, Sam M. Janes¹, Siow Ming Lee¹, Martin Forster¹, Tanya Ahmad¹, David Lawrence¹, Mary Falzon¹, Arrigo Capitanio¹, Timothy T. Harkins⁸, Clarence C. Lee⁸, Warren Tom⁸, Enock Teefe⁸, Shann-Ching Chen⁸, Sharmin Begum², Adam Rabinowitz², Benjamin Phillimore², Bradley Spencer-Dene², Gordon Stamp², Zoltan Szallasi⁹, Zoltan Szallasi¹⁰, Nik Matthews², Aengus Stewart², Peter J. Campbell³, Charles Swanton¹, Charles Swanton²

University College London¹, London Research Institute², Wellcome Trust Sanger Institute³, University of Cambridge⁴, Spanish National Research Council⁵, Katholieke Universiteit Leuven⁶, Papworth Hospital⁷, Thermo Fisher Scientific⁸, Technical University of Denmark⁹, Harvard University¹⁰

10 Oct 2014-Science

TL;DR: Sequencing of 25 spatially distinct regions from seven operable NSCLCs revealed branched evolution, with driver mutations arising before and after subclonal diversification, and pronounced intratumor heterogeneity in copy number alterations, translocations, and mutations associated with APOBEC cytidine deaminase activity.

Abstract: Spatial and temporal dissection of the genomic changes occurring during the evolution of human non–small cell lung cancer (NSCLC) may help elucidate the basis for its dismal prognosis. We sequenced 25 spatially distinct regions from seven operable NSCLCs and found evidence of branched evolution, with driver mutations arising before and after subclonal diversification. There was pronounced intratumor heterogeneity in copy number alterations, translocations, and mutations associated with APOBEC cytidine deaminase activity. Despite maintained carcinogen exposure, tumors from smokers showed a relative decrease in smoking-related mutations over time, accompanied by an increase in APOBEC-associated mutations. In tumors from former smokers, genome-doubling occurred within a smoking-signature context before subclonal diversification, which suggested that a long period of tumor latency had preceded clinical detection. The regionally separated driver mutations, coupled with the relentless and heterogeneous nature of the genome instability processes, are likely to confound treatment success in NSCLC.

Journal Article

Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility

Anubha Mahajan¹, Min Jin Go, Weihua Zhang², Jennifer E. Below³ and 392 more authors (104 institutions)

01 Mar 2014-Nature Genetics

TL;DR: In this paper, the authors aggregated published meta-analyses of genome-wide association studies (GWAS), including 26,488 cases and 83,964 controls of European, east Asian, south Asian and Mexican and Mexican American ancestry.

Abstract: To further understanding of the genetic basis of type 2 diabetes (T2D) susceptibility, we aggregated published meta-analyses of genome-wide association studies (GWAS), including 26,488 cases and 83,964 controls of European, east Asian, south Asian and Mexican and Mexican American ancestry. We observed a significant excess in the directional consistency of T2D risk alleles across ancestry groups, even at SNPs demonstrating only weak evidence of association. By following up the strongest signals of association from the trans-ethnic meta-analysis in an additional 21,491 cases and 55,647 controls of European ancestry, we identified seven new T2D susceptibility loci. Furthermore, we observed considerable improvements in the fine-mapping resolution of common variant association signals at several T2D susceptibility loci. These observations highlight the benefits of trans-ethnic GWAS for the discovery and characterization of complex trait loci and emphasize an exciting opportunity to extend insight into the genetic architecture and pathogenesis of human diseases across populations of diverse ancestry.

Journal Article

A framework for the interpretation of de novo mutation in human disease

Kaitlin E. Samocha¹, Elise B. Robinson¹, Stephen Sanders², Christine Stevens³, Aniko Sabo⁴, Lauren M. McGrath¹, Jack A. Kosmicki⁵, Karola Rehnström⁶, Swapan Mallick¹, Andrew Kirby¹, Dennis P. Wall⁵, Daniel G. MacArthur³, Daniel G. MacArthur¹, Stacey Gabriel³, Mark A. DePristo, Shaun Purcell⁷, Shaun Purcell³, Shaun Purcell¹, Aarno Palotie⁶, Eric Boerwinkle⁸, Joseph D. Buxbaum⁷, Edwin H. Cook⁹, Richard A. Gibbs⁴, Gerard D. Schellenberg¹⁰, James S. Sutcliffe¹¹, Bernie Devlin¹², Kathryn Roeder¹³, Benjamin M. Neale³, Benjamin M. Neale¹, Mark J. Daly³, Mark J. Daly¹

Harvard University¹, Yale University², Broad Institute³, Baylor College of Medicine⁴, Beth Israel Deaconess Medical Center⁵, Wellcome Trust Sanger Institute⁶, Icahn School of Medicine at Mount Sinai⁷, University of Texas Health Science Center at Houston⁸, University of Illinois at Chicago⁹, University of Pennsylvania¹⁰, Vanderbilt University¹¹, University of Pittsburgh¹², Carnegie Mellon University¹³

01 Sep 2014-Nature Genetics

TL;DR: This model is used to identify ∼1,000 genes that are significantly lacking in functional coding variation in non-ASD samples and are enriched for de novo loss-of-function mutations identified in ASD cases, suggesting that the role of de novo mutations in ASDs might reside in fundamental neurodevelopmental processes.

Abstract: Mark Daly and colleagues present a statistical framework to evaluate the role of de novo mutations in human disease by calibrating a model of de novo mutation rates at the individual gene level. The mutation probabilities defined by their model and list of constrained genes can be used to help identify genetic variants that have a significant role in disease.

Journal Article

Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity

Sébastien A. Smallwood¹, Heather J. Lee¹, Heather J. Lee², Christof Angermueller³, Felix Krueger¹, Heba Saadeh¹, Julian R. Peat¹, Simon Andrews¹, Oliver Stegle³, Wolf Reik⁴, Wolf Reik¹, Wolf Reik², Gavin Kelsey⁴, Gavin Kelsey¹

Babraham Institute¹, Wellcome Trust Sanger Institute², European Bioinformatics Institute³, University of Cambridge⁴

01 Aug 2014-Nature Methods

TL;DR: In this article, a single-cell bisulfite sequencing (scBS-seq) method was used to accurately measure DNA methylation at up to 48.4% of CpG sites.

Abstract: We report a single-cell bisulfite sequencing (scBS-seq) method that can be used to accurately measure DNA methylation at up to 48.4% of CpG sites. Embryonic stem cells grown in serum or in 2i medium displayed epigenetic heterogeneity, with '2i-like' cells present in serum culture. Integration of 12 individual mouse oocyte datasets largely recapitulated the whole DNA methylome, which makes scBS-seq a versatile tool to explore DNA methylation in rare cells and heterogeneous populations.


Journal Article•DOI•

Loss-of-function mutations in APOC3, triglycerides, and coronary disease

[...]

Jacy R Crosby¹, Gina M. Peloso², Gina M. Peloso³, Paul L. Auer⁴, David R. Crosslin⁵, Nathan O. Stitziel⁶, Leslie A. Lange⁷, Yingchang Lu⁸, Zheng-Zheng Tang⁷, He Zhang⁹, George Hindy¹⁰, Nicholas G. D. Masca¹¹, Kathleen Stirrups¹², Stavroula Kanoni¹², Ron Do², Ron Do³, Goo Jun⁹, Youna Hu⁹, Hyun Min Kang⁹, Chenyi Xue⁹, Anuj Goel¹³, Martin Farrall¹³, Stefano Duga¹⁴, Pier Angelica Merlini, Rosanna Asselta¹⁴, Domenico Girelli¹⁵, Oliviero Olivieri¹⁵, Nicola Martinelli¹⁵, Wu Yin¹⁶, Dermot F. Reilly¹⁶, Elizabeth K. Speliotes⁹, Caroline S. Fox¹⁷, Kristian Hveem¹⁸, Oddgeir L. Holmen¹⁹, Majid Nikpay²⁰, Deborah N. Farlow³, Themistocles L. Assimes²¹, Nora Franceschini⁷, Jennifer G. Robinson²², Kari E. North⁷, Lisa W. Martin²³, Mark A. DePristo³, Namrata Gupta³, Stefan A. Escher¹⁰, Jan-Håkan Jansson²⁴, Natalie R. van Zuydam²⁵, Colin N. A. Palmer²⁵, Nicholas J. Wareham²⁶, Werner Koch²⁷, Thomas Meitinger²⁷, Annette Peters, Wolfgang Lieb²⁸, Raimund Erbel, Inke R. König²⁹, Jochen Kruppa²⁹, Franziska Degenhardt³⁰, Omri Gottesman⁸, Erwin P. Bottinger⁸, Christopher J. O'Donnell¹⁷, Bruce M. Psaty⁵, Bruce M. Psaty³¹, Christie M. Ballantyne³², Christie M. Ballantyne³³, Gonçalo R. Abecasis⁹, Jose M. Ordovas³⁴, Jose M. Ordovas³⁵, Olle Melander¹⁰, Hugh Watkins¹³, Marju Orho-Melander¹⁰, Diego Ardissino, Ruth J. F. Loos⁸, Ruth McPherson²⁰, Cristen J. Willer⁹, Jeanette Erdmann²⁹, Alistair S. Hall³⁶, Nilesh J. Samani¹¹, Panos Deloukas³⁷, Panos Deloukas³⁸, Panos Deloukas¹², Heribert Schunkert²⁷, James G. Wilson³⁹, Charles Kooperberg⁴⁰, Stephen S. Rich⁴¹, Russell P. Tracy⁴², Danyu Lin⁷, David Altshuler², David Altshuler³, Stacey Gabriel³, Deborah A. Nickerson⁵, Gail P. Jarvik⁵, L. Adrienne Cupples²⁶, L. Adrienne Cupples⁴³, Alexander P. Reiner⁵, Alexander P. Reiner⁴⁰, Eric Boerwinkle³³, Sekar Kathiresan², Sekar Kathiresan³ - Show less +93 more•Institutions (43)

University of Texas Health Science Center at Houston¹, Harvard University², Broad Institute³, University of Wisconsin–Milwaukee⁴, University of Washington⁵, Washington University in St. Louis⁶, University of North Carolina at Chapel Hill⁷, Icahn School of Medicine at Mount Sinai⁸, University of Michigan⁹, Lund University¹⁰, University of Leicester¹¹, Queen Mary University of London¹², University of Oxford¹³, University of Milan¹⁴, University of Verona¹⁵, Merck & Co.¹⁶, National Institutes of Health¹⁷, Levanger Hospital¹⁸, Norwegian University of Science and Technology¹⁹, University of Ottawa²⁰, Stanford University²¹, University of Iowa²², George Washington University²³, Umeå University²⁴, University of Dundee²⁵, Cambridge University Hospitals NHS Foundation Trust²⁶, Technische Universität München²⁷, University of Kiel²⁸, University of Lübeck²⁹, University of Bonn³⁰, Group Health Cooperative³¹, Houston Methodist Hospital³², Baylor College of Medicine³³, IMDEA³⁴, Tufts University³⁵, University of Leeds³⁶, King Abdulaziz University³⁷, Wellcome Trust Sanger Institute³⁸, University of Mississippi³⁹, Fred Hutchinson Cancer Research Center⁴⁰, University of Virginia⁴¹, University of Vermont⁴², Boston University⁴³

02 Jul 2014-The New England Journal of Medicine

TL;DR: Rare mutations that disrupt APOC3 function were associated with lower levels of plasma triglycerides and APOC3, and carriers of these mutations were found to have a reduced risk of coronary heart disease.


Abstract: Background: Plasma triglyceride levels are heritable and are correlated with the risk of coronary heart disease. Sequencing of the protein-coding regions of the human genome (the exome) has the potential to identify rare mutations that have a large effect on phenotype. Methods: We sequenced the protein-coding regions of 18,666 genes in each of 3734 participants of European or African ancestry in the Exome Sequencing Project. We conducted tests to determine whether rare mutations in coding sequence, individually or in aggregate within a gene, were associated with plasma triglyceride levels. For mutations associated with triglyceride levels, we subsequently evaluated their association with the risk of coronary heart disease in 110,970 persons. Results: An aggregate of rare mutations in the gene encoding apolipoprotein C3 (APOC3) was associated with lower plasma triglyceride levels. Among the four mutations that drove this result, three were loss-of-function mutations: a nonsense mutation (R19X) and two splice-site mutations (IVS2+1G→A and IVS3+1G→T). The fourth was a missense mutation (A43T). Approximately 1 in 150 persons in the study was a heterozygous carrier of at least one of these four mutations. Triglyceride levels in the carriers were 39% lower than levels in noncarriers (P<1×10⁻²⁰), and circulating levels of APOC3 in carriers were 46% lower than levels in noncarriers (P = 8×10⁻¹⁰). The risk of coronary heart disease among 498 carriers of any rare APOC3 mutation was 40% lower than the risk among 110,472 noncarriers (odds ratio, 0.60; 95% confidence interval, 0.47 to 0.75; P = 4×10⁻⁶). Conclusions: Rare mutations that disrupt APOC3 function were associated with lower levels of plasma triglycerides and APOC3. Carriers of these mutations were found to have a reduced risk of coronary heart disease. (Funded by the National Heart, Lung, and Blood Institute and others.)


Journal Article•DOI•

Inferring human population size and separation history from multiple genome sequences

[...]

Stephan Schiffels¹, Richard Durbin¹•Institutions (1)

Wellcome Trust Sanger Institute¹

01 Aug 2014-Nature Genetics

TL;DR: Results from applying the multiple sequentially Markovian coalescent (MSMC) to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago and give information about human population history as recent as 2,000 years ago.


Abstract: The availability of complete human genome sequences from populations across the world has given rise to new population genetic inference methods that explicitly model ancestral relationships under recombination and mutation. So far, application of these methods to evolutionary history more recent than 20,000-30,000 years ago and to population separations has been limited. Here we present a new method that overcomes these shortcomings. The multiple sequentially Markovian coalescent (MSMC) analyzes the observed pattern of mutations in multiple individuals, focusing on the first coalescence between any two individuals. Results from applying MSMC to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago and give information about human population history as recent as 2,000 years ago, including the bottleneck in the peopling of the Americas and separations within Africa, East Asia and Europe.


Journal Article•DOI•

Intratumor heterogeneity in localized lung adenocarcinomas delineated by multiregion sequencing

[...]

Jianjun Zhang¹, Junya Fujimoto¹, Jianhua Zhang¹, David C. Wedge², Xingzhi Song¹, Jiexin Zhang¹, Sahil Seth¹, Chi Wan Chow¹, Yu Cao¹, Curtis Gumbs¹, Kathryn A. Gold¹, Neda Kalhor¹, Latasha Little¹, Harshad S. Mahadeshwar¹, Cesar A. Moran¹, Alexei Protopopov¹, Huandong Sun¹, Jiabin Tang¹, Xifeng Wu¹, Yuanqing Ye¹, William N. William¹, J. Jack Lee¹, John V. Heymach¹, Waun Ki Hong¹, Stephen G. Swisher¹, Ignacio I. Wistuba¹, Andrew Futreal¹, Andrew Futreal² - Show less +24 more•Institutions (2)

University of Texas MD Anderson Cancer Center¹, Wellcome Trust Sanger Institute²

10 Oct 2014-Science

TL;DR: WES data indicate that a larger subclonal mutation fraction may be associated with increased likelihood of postsurgical relapse in patients with localized lung adenocarcinomas. Different mutations are present in different regions of any given lung cancer, and their pattern may predict patient relapse.


Abstract: Cancers are composed of populations of cells with distinct molecular and phenotypic features, a phenomenon termed intratumor heterogeneity (ITH). ITH in lung cancers has not been well studied. We applied multiregion whole-exome sequencing (WES) on 11 localized lung adenocarcinomas. All tumors showed clear evidence of ITH. On average, 76% of all mutations and 20 out of 21 known cancer gene mutations were identified in all regions of individual tumors, which suggested that single-region sequencing may be adequate to identify the majority of known cancer gene mutations in localized lung adenocarcinomas. With a median follow-up of 21 months after surgery, three patients have relapsed, and all three patients had significantly larger fractions of subclonal mutations in their primary tumors than patients without relapse. These data indicate that a larger subclonal mutation fraction may be associated with increased likelihood of postsurgical relapse in patients with localized lung adenocarcinomas.


Journal Article•DOI•

The genomic substrate for adaptive radiation in African cichlid fish

[...]

David Brawand¹, David Brawand², Catherine E. Wagner³, Catherine E. Wagner⁴, Yang I. Li², Milan Malinsky⁵, Milan Malinsky⁶, Irene Keller⁴, Shaohua Fan⁷, Oleg Simakov⁷, Alvin Yu Jin Ng⁸, Zhi Wei Lim⁸, Etienne Bezault⁹, Jason Turner-Maier¹, Jeremy A. Johnson¹, Rosa Alcazar¹⁰, Hyun Ji Noh¹, Pamela Russell¹¹, Bronwen Aken⁶, Jessica Alföldi¹, Chris T. Amemiya¹², Naoual Azzouzi¹³, Jean-François Baroiller, Frédérique Barloy-Hubler¹³, Aaron M. Berlin¹, Ryan F. Bloomquist¹⁴, Karen L. Carleton¹⁵, Matthew A. Conte¹⁵, Helena D'Cotta, Orly Eshel, Leslie Gaffney¹, Francis Galibert¹³, Hugo F. Gante¹⁶, Sante Gnerre¹, Lucie Greuter⁴, Lucie Greuter³, Richard Guyon¹³, Natalie S. Haddad¹⁴, Wilfried Haerty², Robert M Harris¹⁷, Hans A. Hofmann¹⁷, Thibaut Hourlier⁶, Gideon Hulata, David B. Jaffe¹, Marcia Lara¹, Alison P. Lee⁸, Iain MacCallum¹, Salome Mwaiko³, Masato Nikaido¹⁸, Hidenori Nishihara¹⁸, Catherine Ozouf-Costaz¹⁹, David J. Penman²⁰, Dariusz Przybylski¹, Michaelle Rakotomanga¹³, Suzy C. P. Renn⁹, Filipe J. Ribeiro¹, Micha Ron, Walter Salzburger¹⁶, Luis Sanchez-Pulido², M. Emília Santos¹⁶, Steve Searle⁶, Ted Sharpe¹, Ross Swofford¹, Frederick J. Tan²¹, Louise Williams¹, Sarah Young¹, Shuangye Yin¹, Norihiro Okada²², Norihiro Okada¹⁸, Thomas D. Kocher¹⁵, Eric A. Miska⁵, Eric S. Lander¹, Byrappa Venkatesh⁸, Russell D. Fernald¹⁰, Axel Meyer⁷, Chris P. Ponting², J. Todd Streelman¹⁴, Kerstin Lindblad-Toh²³, Kerstin Lindblad-Toh¹, Ole Seehausen³, Ole Seehausen⁴, Federica Di Palma¹, Federica Di Palma²⁴ - Show less +79 more•Institutions (24)

18 Sep 2014-Nature

TL;DR: This article found an excess of gene duplications in the East African lineage compared to Nile tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs.


Abstract: Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.


Journal Article•DOI•

The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data

[...]

Sebastian Köhler¹, Sandra C. Doelken¹, Christopher J. Mungall², Sebastian Bauer¹, Helen V. Firth³, Helen V. Firth⁴, Isabelle Bailleul-Forestier⁵, Graeme C.M. Black⁶, Danielle L. Brown⁷, Michael Brudno⁸, Jennifer Campbell⁷, Jennifer Campbell⁹, David R. FitzPatrick, Janan T. Eppig, Andrew P. Jackson, Kathleen Freson¹⁰, Marta Girdea⁸, Ingo Helbig¹¹, Jane A. Hurst¹², Johanna A. Jähn¹¹, Laird G. Jackson¹³, Anne M. Kelly¹⁴, David H. Ledbetter¹⁵, Sahar Mansour¹⁶, Christa Lese Martin¹⁵, Celia Moss, Andrew D Mumford¹⁷, Willem H. Ouwehand³, Willem H. Ouwehand¹⁴, Soo Mi Park⁴, Erin Rooney Riggs¹⁵, Richard H. Scott¹², Sanjay M. Sisodiya¹², Steven Van Vooren, Ronald J. Wapner¹⁸, Andrew O.M. Wilkie¹⁹, Caroline F. Wright³, Anneke T. Vulto-van Silfhout²⁰, Nicole de Leeuw²⁰, Bert B.A. de Vries²⁰, Nicole L. Washington², Cynthia L. Smith, Monte Westerfield²¹, Paul N. Schofield¹⁴, Barbara J. Ruef²¹, Georgios V. Gkoutos²², Melissa A. Haendel, Damian Smedley³, Suzanna E. Lewis², Peter N. Robinson¹, Peter N. Robinson²³ - Show less +47 more•Institutions (23)

01 Jan 2014-Nucleic Acids Research

TL;DR: The updated HPO database is described, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO, allowing integration of existing datasets and interoperability with multiple biomedical resources.


Abstract: The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online.


Journal Article•DOI•

Resetting Transcription Factor Control Circuitry toward Ground-State Pluripotency in Human

[...]

Yasuhiro Takashima¹, Ge Guo¹, Remco Loos², Jennifer Nichols¹, Gabriella Ficz³, Felix Krueger⁴, David Oxley⁴, Fátima Santos⁴, James Clarke¹, William Mansfield¹, Wolf Reik⁵, Paul Bertone², Paul Bertone¹, Austin Smith¹ - Show less +10 more•Institutions (5)

University of Cambridge¹, European Bioinformatics Institute², University of London³, Babraham Institute⁴, Wellcome Trust Sanger Institute⁵

11 Sep 2014-Cell

TL;DR: Short-term expression of two components, NANOG and KLF2, is reported to be sufficient to ignite other elements of the network and reset the human pluripotent state, demonstrating the feasibility of installing and propagating functional control circuitry for ground-state pluripotency in human cells.


Journal Article•DOI•

Efficient genome modification by CRISPR-Cas9 nickase with minimal off-target effects.

[...]

Bin Shen¹, Wensheng Zhang², Jun Zhang¹, Jiankui Zhou¹, Jianying Wang¹, Li Chen¹, Lu Wang³, Alex Hodgkins², Vivek Iyer², Xingxu Huang¹, William C. Skarnes² - Show less +7 more•Institutions (3)

Nanjing University¹, Wellcome Trust Sanger Institute², Beijing Institute of Genomics³

01 Apr 2014-Nature Methods

TL;DR: This work shows that co-microinjection of mouse embryos with Cas9 mRNA and single guide RNAs induces on-target and off-target mutations that are transmissible to offspring, but that Cas9 nickase can be used to efficiently mutate genes without detectable damage at known off-target sites.


Abstract: Bacterial RNA-directed Cas9 endonuclease is a versatile tool for site-specific genome modification in eukaryotes. Co-microinjection of mouse embryos with Cas9 mRNA and single guide RNAs induces on-target and off-target mutations that are transmissible to offspring. However, Cas9 nickase can be used to efficiently mutate genes without detectable damage at known off-target sites. This method is applicable for genome editing of any model organism and minimizes confounding problems of off-target mutations.


Journal Article•DOI•

Heterogeneity of genomic evolution and mutational profiles in multiple myeloma

[...]

Niccolo Bolli¹, Hervé Avet-Loiseau², David C. Wedge¹, Peter Van Loo¹, Ludmil B. Alexandrov¹, Inigo Martincorena¹, Kevin J. Dawson¹, Francesco Iorio¹, Serena Nik-Zainal¹, Graham R. Bignell¹, Jonathan Hinton¹, Yang Li¹, Jose M. C. Tubio¹, Stuart McLaren¹, Sarah O' Meara¹, Adam Butler¹, Jon W. Teague¹, Laura Mudie¹, Elizabeth Anderson¹, Naim U. Rashid³, Yu-Tzu Tai³, Masood A. Shammas³, Adam S. Sperling³, Mariateresa Fulciniti³, Paul G. Richardson³, Giovanni Parmigiani³, Florence Magrangeas⁴, Stephane Minvielle⁴, Philippe Moreau, Michel Attal², Thierry Facon, P. Andrew Futreal¹, Kenneth C. Anderson³, Peter J. Campbell¹, Nikhil C. Munshi³ - Show less +31 more•Institutions (4)

Wellcome Trust Sanger Institute¹, French Institute of Health and Medical Research², Harvard University³, University of Nantes⁴

16 Jan 2014-Nature Communications

TL;DR: The myeloma genome is heterogeneous across the cohort, and exhibits diversity in clonal admixture and in dynamics of evolution, which may impact prognostic stratification, therapeutic approaches and assessment of disease response to treatment.


Abstract: Multiple myeloma is an incurable plasma cell malignancy with a complex and incompletely understood molecular pathogenesis. Here we use whole-exome sequencing, copy-number profiling and cytogenetics to analyse 84 myeloma samples. Most cases have a complex subclonal structure and show clusters of subclonal variants, including subclonal driver mutations. Serial sampling reveals diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Diverse processes contribute to the mutational repertoire, including kataegis and somatic hypermutation, and their relative contribution changes over time. We find heterogeneity of mutational spectrum across samples, with few recurrent genes. We identify new candidate genes, including truncations of SP140, LTB, ROBO1 and clustered missense mutations in EGR1. The myeloma genome is heterogeneous across the cohort, and exhibits diversity in clonal admixture and in dynamics of evolution, which may impact prognostic stratification, therapeutic approaches and assessment of disease response to treatment.


Journal Article•DOI•

Mechanisms underlying mutational signatures in human cancers

[...]

Thomas Helleday¹, Saeed Eshtad¹, Serena Nik-Zainal²•Institutions (2)

Science for Life Laboratory¹, Wellcome Trust Sanger Institute²

01 Sep 2014-Nature Reviews Genetics

TL;DR: Mutational signatures can be used as a physiological readout of the biological history of a cancer and also have potential use for discerning ongoing mutational processes from historical ones, thus possibly revealing new targets for anticancer therapies.


Abstract: The collective somatic mutations observed in a cancer are the outcome of multiple mutagenic processes that have been operative over the lifetime of a patient. Each process leaves a characteristic imprint--a mutational signature--on the cancer genome, which is defined by the type of DNA damage and DNA repair processes that result in base substitutions, insertions and deletions or structural variations. With the advent of whole-genome sequencing, researchers are identifying an increasing array of these signatures. Mutational signatures can be used as a physiological readout of the biological history of a cancer and also have potential use for discerning ongoing mutational processes from historical ones, thus possibly revealing new targets for anticancer therapies.


Journal Article•DOI•

Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression.

[...]

Benjamin P. Fairfax¹, Peter Humburg¹, Seiko Makino¹, Vivek Naranbhai¹, Daniel Wong¹, Evelyn Lau¹, Luke Jostins¹, Katharine Plant¹, Robert Andrews², Chris J. McGee², Julian C. Knight¹ - Show less +7 more•Institutions (2)

Wellcome Trust Centre for Human Genetics¹, Wellcome Trust Sanger Institute²

07 Mar 2014-Science

TL;DR: This work mapped interindividual variation in gene expression as a quantitative trait, defining expression quantitative trait loci (eQTLs), and found that trans associations to the major histocompatibility complex are dependent on context, paralleling the expression of class II genes.


Abstract: To systematically investigate the impact of immune stimulation upon regulatory variant activity, we exposed primary monocytes from 432 healthy Europeans to interferon-γ (IFN-γ) or differing durations of lipopolysaccharide and mapped expression quantitative trait loci (eQTLs). More than half of cis-eQTLs identified, involving hundreds of genes and associated pathways, are detected specifically in stimulated monocytes. Induced innate immune activity reveals multiple master regulatory trans-eQTLs including the major histocompatibility complex (MHC), coding variants altering enzyme and receptor function, an IFN-β cytokine network showing temporal specificity, and an interferon regulatory factor 2 (IRF2) transcription factor-modulated network. Induced eQTL are significantly enriched for genome-wide association study loci, identifying context-specific associations to putative causal genes including CARD9, ATM, and IRF8. Thus, applying pathophysiologically relevant immune stimuli assists resolution of functional genetic variants.


Journal Article•DOI•

Defining functional DNA elements in the human genome

[...]

Manolis Kellis¹, Barbara J. Wold², Michael Snyder³, Bradley E. Bernstein⁴, Anshul Kundaje⁵, Georgi K. Marinov², Lucas D. Ward⁵, Ewan Birney, Gregory E. Crawford⁶, Job Dekker⁷, Ian Dunham, Laura Elnitski⁸, Peggy J. Farnham⁹, Elise A. Feingold⁸, Mark Gerstein¹⁰, Morgan C. Giddings, David M. Gilbert¹¹, Thomas R. Gingeras¹², Eric D. Green⁸, Roderic Guigó, Tim Hubbard¹³, Jim Kent¹⁴, Jason D. Lieb¹⁵, Richard M. Myers, Michael J. Pazin⁸, Bing Ren¹⁶, John A. Stamatoyannopoulos¹⁷, Zhiping Weng⁷, Kevin P. White¹⁸, Ross C. Hardison¹⁹ - Show less +26 more•Institutions (19)

Massachusetts Institute of Technology¹, California Institute of Technology², Stanford University³, Harvard University⁴, Broad Institute⁵, Duke University⁶, University of Massachusetts Medical School⁷, National Institutes of Health⁸, University of Southern California⁹, Yale University¹⁰, Florida State University¹¹, Cold Spring Harbor Laboratory¹², Wellcome Trust Sanger Institute¹³, University of California, Santa Cruz¹⁴, Princeton University¹⁵, University of California, San Diego¹⁶, University of Washington¹⁷, University of Chicago¹⁸, Pennsylvania State University¹⁹

29 Apr 2014-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies are reviewed.


Abstract: With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.


Journal Article•DOI•

DNA methylation and body-mass index: a genome-wide analysis

[...]

Katherine J. Dick¹, Katherine J. Dick², Christopher P. Nelson², Christopher P. Nelson¹, Loukia Tsaprouni³, Johanna K. Sandling³, Johanna K. Sandling⁴, Dylan Aïssi⁵, Dylan Aïssi⁶, Dylan Aïssi⁷, Simone Wahl, Eshwar Meduri³, Pierre-Emmanuel Morange⁸, Harald Grallert, Melanie Waldenberger, Annette Peters, Jeanette Erdmann⁹, Christian Hengstenberg¹⁰, François Cambien⁷, François Cambien⁶, François Cambien⁵, Alison H. Goodall², Alison H. Goodall¹, Willem H. Ouwehand¹¹, Willem H. Ouwehand¹², Willem H. Ouwehand³, Heribert Schunkert¹⁰, John R. Thompson², Tim D. Spector¹³, Christian Gieger, David-Alexandre Trégouët⁶, David-Alexandre Trégouët⁵, David-Alexandre Trégouët⁷, Panos Deloukas³, Panos Deloukas¹⁴, Panos Deloukas¹⁵, Nilesh J. Samani², Nilesh J. Samani¹ - Show less +34 more•Institutions (15)

National Institute for Health Research¹, University of Leicester², Wellcome Trust Sanger Institute³, Science for Life Laboratory⁴, Institute of Chartered Accountants of Nigeria⁵, University of Paris⁶, French Institute of Health and Medical Research⁷, Aix-Marseille University⁸, University of Lübeck⁹, Technische Universität München¹⁰, National Health Service¹¹, University of Cambridge¹², King's College London¹³, Queen Mary University of London¹⁴, King Abdulaziz University¹⁵

07 Jun 2014-The Lancet

TL;DR: Increased BMI in adults of European origin is associated with increased methylation at the HIF3A locus in blood cells and in adipose tissue, and perturbation of hypoxia inducible transcription factor pathways could have an important role in the response to increased weight in people.


The genomic substrate for adaptive radiation in African cichlid fish

[...]

David Brawand¹, David Brawand², Catherine E. Wagner³, Catherine E. Wagner⁴, Yang I. Li¹, Milan Malinsky⁵, Milan Malinsky⁶, Irene Keller³, Shaohua Fan⁷, Oleg Simakov⁷, Alvin Yu Jin Ng⁸, Zhi Wei Lim⁸, Etienne Bezault⁹, Jason Turner-Maier², Jeremy A. Johnson², Rosa Alcazar¹⁰, Hyun Ji Noh², Pamela Russell¹¹, Bronwen Aken⁵, Jessica Alföldi², Chris T. Amemiya¹², Naoual Azzouzi¹³, Jean-François Baroiller, Frédérique Barloy-Hubler¹³, Aaron M. Berlin², Ryan F. Bloomquist¹⁴, Karen L. Carleton¹⁵, Matthew A. Conte¹⁵, Helena D'Cotta, Orly Eshel, Leslie Gaffney², Francis Galibert¹³, Hugo F. Gante¹⁶, Sante Gnerre², Lucie Greuter⁴, Lucie Greuter³, Richard Guyon¹³, Natalie S. Haddad¹⁴, Wilfried Haerty¹, Robert M Harris¹⁷, Hans A. Hofmann¹⁷, Thibaut Hourlier⁵, Gideon Hulata, David B. Jaffe², Marcia Lara², Alison P. Lee⁸, Iain MacCallum², Salome Mwaiko⁴, Masato Nikaido¹⁸, Hidenori Nishihara¹⁸, Catherine Ozouf-Costaz¹⁹, David J. Penman²⁰, Dariusz Przybylski², Michaelle Rakotomanga¹³, Suzy C. P. Renn⁹, Filipe J. Ribeiro², Micha Ron, Walter Salzburger¹⁶, Luis Sanchez-Pulido¹, M. Emília Santos¹⁶, Steve Searle⁵, Ted Sharpe², Ross Swofford², Frederick J. Tan²¹, Louise Williams², Sarah Young², Shuangye Yin², Norihiro Okada²², Norihiro Okada¹⁸, Thomas D. Kocher¹⁵, Eric A. Miska⁶, Eric S. Lander², Byrappa Venkatesh⁸, Russell D. Fernald¹⁰, Axel Meyer⁷, Chris P. Ponting¹, J. Todd Streelman¹⁴, Kerstin Lindblad-Toh², Kerstin Lindblad-Toh²³, Ole Seehausen³, Ole Seehausen⁴, Federica Di Palma²⁴, Federica Di Palma² - Show less +79 more•Institutions (24)

01 Sep 2014

TL;DR: It is concluded that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.


Journal Article•DOI•

PredictProtein—an open resource for online prediction of protein structural and functional features

[...]

Guy Yachdav¹, Edda Kloppmann¹, László Kaján¹, Maximilian Hecht¹, Tatyana Goldberg¹, Tobias Hamp¹, Peter Hönigschmid¹, Andrea Schafferhans¹, Manfred Roos¹, Michael Bernhofer¹, Lothar Richter¹, Haim Ashkenazy², Marco Punta³, Avner Schlessinger⁴, Yana Bromberg⁵, Reinhard Schneider⁶, Gerrit Vriend, Chris Sander, Nir Ben-Tal⁷, Burkhard Rost¹ - Show less +16 more•Institutions (7)

Technische Universität München¹, Tel Aviv University², Wellcome Trust Sanger Institute³, European Bioinformatics Institute⁴, Icahn School of Medicine at Mount Sinai⁵, Rutgers University⁶, Memorial Sloan Kettering Cancer Center⁷

01 Jul 2014-Nucleic Acids Research

TL;DR: The goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics, and the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures.


Abstract: PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.

Journal Article•DOI•

A General Approach for Haplotype Phasing across the Full Spectrum of Relatedness

[...]

Jared O'Connell¹, Deepti Gurdasani², Olivier Delaneau³, Nicola Pirastu⁴, Sheila Ulivi⁴, Massimiliano Cocca⁵, Michela Traglia⁵, Jie Huang², Jennifer E. Huffman⁶, Igor Rudan⁶, Ruth McQuillan⁶, Ross M. Fraser⁶, Harry Campbell⁶, Ozren Polasek⁷, Gershim Asiki⁸, Kenneth Ekoru⁹, Caroline Hayward⁶, Alan F. Wright⁶, Veronique Vitart⁶, Pau Navarro⁶, Jean-François Zagury⁹, James F. Wilson⁶, Daniela Toniolo⁵, Paolo Gasparini⁴, Nicole Soranzo², Manjinder S. Sandhu², Jonathan Marchini¹

Wellcome Trust Centre for Human Genetics¹, Wellcome Trust Sanger Institute², University of Oxford³, University of Trieste⁴, Vita-Salute San Raffaele University⁵, University of Edinburgh⁶, University of Split⁷, Uganda Virus Research Institute⁸, Conservatoire national des arts et métiers⁹

17 Apr 2014-PLOS Genetics

TL;DR: It is found that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations, and a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals is developed.

Abstract: Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally ‘unrelated’ individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of identity-by-descent (IBD) sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results, we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.
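
The abstract compares phasing methods by switch error rate. As a rough illustration only, the sketch below computes that metric for one individual under a common definition: the fraction of consecutive heterozygous-site pairs at which the inferred phase flips relative to the truth. The function name `switch_error_rate` and the input encoding (alleles of the first haplotype at heterozygous sites only) are assumptions made for this example; this is not code from SHAPEIT2 or duoHMM, and the paper may define the metric with additional details.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Illustrative switch error rate for one individual (hypothetical helper).

    Both arguments list the 0/1 alleles carried on the first haplotype,
    restricted to that individual's heterozygous sites, so the second
    haplotype is implicitly the complement of the first.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")

    # match[i] is True when the inferred allele agrees with the truth at site i.
    match = [t == i for t, i in zip(true_hap, inferred_hap)]

    # Each pair of consecutive heterozygous sites is one switch opportunity.
    opportunities = len(match) - 1
    if opportunities < 1:
        return 0.0

    # A switch occurs when agreement with the truth changes between neighbours.
    switches = sum(1 for a, b in zip(match, match[1:]) if a != b)
    return switches / opportunities


truth = [0, 1, 0, 1, 1, 0]
inferred = [0, 1, 1, 0, 0, 0]
rate = switch_error_rate(truth, inferred)  # 2 switches over 5 opportunities
```

Note that an inferred haplotype that is the exact complement of the truth scores zero under this definition, because labelling which haplotype is "first" is arbitrary and only changes in relative phase count as errors.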
