Showing papers by "Broad Institute" published in 2009


Journal ArticleDOI
Shaun Purcell, Naomi R. Wray, Jennifer Stone, Peter M. Visscher, Michael Conlon O'Donovan, Patrick F. Sullivan, Pamela Sklar, Douglas M. Ruderfer, Andrew McQuillin, Derek W. Morris, Colm O'Dushlaine, Aiden Corvin, Peter Holmans, Stuart MacGregor, Hugh Gurling, Douglas Blackwood, Nicholas John Craddock, Michael Gill, Christina M. Hultman, George Kirov, Paul Lichtenstein, Walter J. Muir, Michael John Owen, Carlos N. Pato, Edward M. Scolnick, David St Clair, Nigel Williams, Lyudmila Georgieva, Ivan Nikolov, Nadine Norton, Hywel Williams, Draga Toncheva, Vihra Milanova, Emma Flordal Thelander, Patrick Sullivan, Elaine Kenny, Emma M. Quinn, Khalid Choudhury, Susmita Datta, Jonathan Pimm, Srinivasa Thirumalai, Vinay Puri, Robert Krasucki, Jacob Lawrence, Digby Quested, Nicholas Bass, Caroline Crombie, Gillian Fraser, Soh Leh Kuan, Nicholas Walker, Kevin A. McGhee, Ben S. Pickard, P. Malloy, Alan W Maclean, Margaret Van Beck, Michele T. Pato, Helena Medeiros, Frank A. Middleton, Célia Barreto Carvalho, Christopher P. Morley, Ayman H. Fanous, David V. Conti, James A. Knowles, Carlos Ferreira, António Macedo, M. Helena Azevedo, Andrew Kirby, Manuel A. R. Ferreira, Mark J. Daly, Kimberly Chambert, Finny G Kuruvilla, Stacey Gabriel, Kristin G. Ardlie, Jennifer L. Moran
06 Aug 2009-Nature
TL;DR: The extent to which common genetic variation underlies the risk of schizophrenia is shown using two analytic approaches: the major histocompatibility complex is implicated, and a substantial polygenic component involving thousands of common alleles of very small effect is demonstrated.
Abstract: Schizophrenia is a severe mental disorder with a lifetime risk of about 1%, characterized by hallucinations, delusions and cognitive deficits, with heritability estimated at up to 80%(1,2). We performed a genome-wide association study of 3,322 European individuals with schizophrenia and 3,587 controls. Here we show, using two analytic approaches, the extent to which common genetic variation underlies the risk of schizophrenia. First, we implicate the major histocompatibility complex. Second, we provide molecular genetic evidence for a substantial polygenic component to the risk of schizophrenia involving thousands of common alleles of very small effect. We show that this component also contributes to the risk of bipolar disorder, but not to several non-psychiatric diseases.

4,573 citations


Journal ArticleDOI
12 Mar 2009-Nature
TL;DR: It is demonstrated that specific lincRNAs are transcriptionally regulated by key transcription factors in these processes such as p53, NFκB, Sox2, Oct4 (also known as Pou5f1) and Nanog, defining a unique collection of functional lincRNAs that are highly conserved and implicated in diverse biological processes.
Abstract: There is growing recognition that mammalian cells produce many thousands of large intergenic transcripts. However, the functional significance of these transcripts has been particularly controversial. Although there are some well-characterized examples, most (>95%) show little evidence of evolutionary conservation and have been suggested to represent transcriptional noise. Here we report a new approach to identifying large non-coding RNAs using chromatin-state maps to discover discrete transcriptional units intervening known protein-coding loci. Our approach identified ~1,600 large multi-exonic RNAs across four mouse cell types. In sharp contrast to previous collections, these large intervening non-coding RNAs (lincRNAs) show strong purifying selection in their genomic loci, exonic sequences and promoter regions, with greater than 95% showing clear evolutionary conservation. We also developed a functional genomics approach that assigns putative functions to each lincRNA, demonstrating a diverse range of roles for lincRNAs in processes from embryonic stem cell pluripotency to cell proliferation. We obtained independent functional validation for the predictions for over 100 lincRNAs, using cell-based assays. In particular, we demonstrate that specific lincRNAs are transcriptionally regulated by key transcription factors in these processes such as p53, NFκB, Sox2, Oct4 (also known as Pou5f1) and Nanog. Together, these results define a unique collection of functional lincRNAs that are highly conserved and implicated in diverse biological processes.

3,875 citations


Journal ArticleDOI
29 Oct 2009-Nature
TL;DR: It is shown that SCFA–GPR43 interactions profoundly affect inflammatory responses, and GPR43 binding of SCFAs potentially provides a molecular link between diet, gastrointestinal bacterial metabolism, and immune and inflammatory responses.
Abstract: The immune system responds to pathogens by a variety of pattern recognition molecules such as the Toll-like receptors (TLRs), which promote recognition of dangerous foreign pathogens. However, recent evidence indicates that normal intestinal microbiota might also positively influence immune responses, and protect against the development of inflammatory diseases. One of these elements may be short-chain fatty acids (SCFAs), which are produced by fermentation of dietary fibre by intestinal microbiota. A feature of human ulcerative colitis and other colitic diseases is a change in 'healthy' microbiota such as Bifidobacterium and Bacteroides, and a concurrent reduction in SCFAs. Moreover, increased intake of fermentable dietary fibre, or SCFAs, seems to be clinically beneficial in the treatment of colitis. SCFAs bind the G-protein-coupled receptor 43 (GPR43, also known as FFAR2), and here we show that SCFA-GPR43 interactions profoundly affect inflammatory responses. Stimulation of GPR43 by SCFAs was necessary for the normal resolution of certain inflammatory responses, because GPR43-deficient (Gpr43−/−) mice showed exacerbated or unresolving inflammation in models of colitis, arthritis and asthma. This seemed to relate to increased production of inflammatory mediators by Gpr43−/− immune cells, and increased immune cell recruitment. Germ-free mice, which are devoid of bacteria and express little or no SCFAs, showed a similar dysregulation of certain inflammatory responses. GPR43 binding of SCFAs potentially provides a molecular link between diet, gastrointestinal bacterial metabolism, and immune and inflammatory responses.

2,515 citations


Journal ArticleDOI
07 May 2009-Nature
TL;DR: The results define over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalogue of human enhancers and highlighting the role of these elements in cell-type-specific gene expression.
Abstract: The human body is composed of diverse cell types with distinct functions. Although it is known that lineage specification depends on cell-specific gene expression, which in turn is driven by promoters, enhancers, insulators and other cis-regulatory DNA sequences for each gene, the relative roles of these regulatory elements in this process are not clear. We have previously developed a chromatin-immunoprecipitation-based microarray method (ChIP-chip) to locate promoters, enhancers and insulators in the human genome. Here we use the same approach to identify these elements in multiple cell types and investigate their roles in cell-type-specific gene expression. We observed that the chromatin state at promoters and CTCF-binding at insulators is largely invariant across diverse cell types. In contrast, enhancers are marked with highly cell-type-specific histone modification patterns, strongly correlate to cell-type-specific gene expression programs on a global scale, and are functionally active in a cell-type-specific manner. Our results define over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalogue of human enhancers and highlighting the role of these elements in cell-type-specific gene expression.

2,320 citations


Journal ArticleDOI
21 Aug 2009-Cell
TL;DR: Global gene expression analyses show that salinomycin treatment results in the loss of expression of breast CSC genes previously identified by analyses of breast tissues isolated directly from patients, demonstrating the ability to identify agents with specific toxicity for epithelial CSCs.

2,258 citations


Journal ArticleDOI
TL;DR: Several of the likely causal genes are highly expressed or known to act in the central nervous system (CNS), emphasizing, as in rare monogenic forms of obesity, the role of the CNS in predisposition to obesity.
Abstract: Common variants at only two loci, FTO and MC4R, have been reproducibly associated with body mass index (BMI) in humans. To identify additional loci, we conducted meta-analysis of 15 genome-wide association studies for BMI (n > 32,000) and followed up top signals in 14 additional cohorts (n > 59,000). We strongly confirm FTO and MC4R and identify six additional loci (P < 5 × 10⁻⁸): TMEM18, KCTD15, GNPDA2, SH2B1, MTCH2 and NEGR1 (where a 45-kb deletion polymorphism is a candidate causal variant). Several of the likely causal genes are highly expressed or known to act in the central nervous system (CNS), emphasizing, as in rare monogenic forms of obesity, the role of the CNS in predisposition to obesity.

1,710 citations


Journal ArticleDOI
24 Sep 2009-Nature
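TL;DR: Analysis of 25 diverse groups in India provides strong evidence for two ancient, genetically divergent populations that are ancestral to most Indians today. ANI ancestry is higher in traditionally upper-caste groups and Indo-European speakers, and an excess of recessive diseases is predicted in India, which should be possible to screen and map genetically.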
Abstract: India has been underrepresented in genome-wide surveys of human variation. We analyse 25 diverse groups in India to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most Indians today. One, the 'Ancestral North Indians' (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, whereas the other, the 'Ancestral South Indians' (ASI), is as distinct from ANI and East Asians as they are from each other. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39-71% in most Indian groups, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India. However, the indigenous Andaman Islanders are unique in being ASI-related groups without ANI ancestry. Allele frequency differences between groups in India are larger than in Europe, reflecting strong founder effects whose signatures have been maintained for thousands of years owing to endogamy. We therefore predict that there will be an excess of recessive diseases in India, which should be possible to screen and map genetically.

1,457 citations


Journal ArticleDOI
TL;DR: A capture method is described that uses biotinylated RNA 'baits' to fish targets out of a 'pond' of DNA fragments; uniformity was such that ∼60% of target bases in the exonic 'catch', and ∼80% in the regional catch, had at least half the mean coverage.
Abstract: Targeting genomic loci by massively parallel sequencing requires new methods to enrich templates to be sequenced. We developed a capture method that uses biotinylated RNA 'baits' to fish targets out of a 'pond' of DNA fragments. The RNA is transcribed from PCR-amplified oligodeoxynucleotides originally synthesized on a microarray, generating sufficient bait for multiple captures at concentrations high enough to drive the hybridization. We tested this method with 170-mer baits that target >15,000 coding exons (2.5 Mb) and four regions (1.7 Mb total) using Illumina sequencing as read-out. About 90% of uniquely aligning bases fell on or near bait sequence; up to 50% lay on exons proper. The uniformity was such that approximately 60% of target bases in the exonic 'catch', and approximately 80% in the regional catch, had at least half the mean coverage. One lane of Illumina sequence was sufficient to call high-confidence genotypes for 89% of the targeted exon space.

1,444 citations


Journal ArticleDOI
Brian J. Haas, Sophien Kamoun, Michael C. Zody, Rays H. Y. Jiang, Robert E. Handsaker, Liliana M. Cano, Manfred Grabherr, Chinnappa D. Kodira, Sylvain Raffaele, Trudy Torto-Alalibo, Tolga O. Bozkurt, Audrey M. V. Ah-Fong, Lucia Alvarado, Vicky L. Anderson, Miles R. Armstrong, Anna O. Avrova, Laura Baxter, Jim Beynon, Petra C. Boevink, Stephanie R. Bollmann, Jorunn I. B. Bos, Vincent Bulone, Guohong Cai, Cahid Cakir, James C. Carrington, Megan Chawner, Lucio Conti, Stefano Costanzo, Richard Ewan, Noah Fahlgren, Michael A. Fischbach, Johanna Fugelstad, Eleanor M. Gilroy, Sante Gnerre, Pamela J. Green, Laura J. Grenville-Briggs, John Griffith, Niklaus J. Grünwald, Karolyn Horn, Neil R. Horner, Chia-Hui Hu, Edgar Huitema, Dong-Hoon Jeong, Alexandra M. E. Jones, Jonathan D. G. Jones, Richard W. Jones, Elinor K. Karlsson, Sridhara G. Kunjeti, Kurt Lamour, Zhenyu Liu, Li-Jun Ma, Dan MacLean, Marcus C. Chibucos, Hayes McDonald, Jessica McWalters, Harold J. G. Meijer, William Morgan, Paul Morris, Carol A. Munro, Keith O'Neill, Manuel D. Ospina-Giraldo, Andrés Pinzón, Leighton Pritchard, Bernard H Ramsahoye, Qinghu Ren, Silvia Restrepo, Sourav Roy, Ari Sadanandom, Alon Savidor, Sebastian Schornack, David C. Schwartz, Ulrike Schumann, Ben Schwessinger, Lauren Seyer, Ted Sharpe, Cristina Silvar, Jing Song, David J. Studholme, Sean M. Sykes, Marco Thines, Peter J. I. van de Vondervoort, Vipaporn Phuntumart, Stephan Wawra, R. Weide, Joe Win, Carolyn A. Young, Shiguo Zhou, William E. Fry, Blake C. Meyers, Pieter van West, Jean B. Ristaino, Francine Govers, Paul R. J. Birch, Stephen C. Whisson, Howard S. Judelson, Chad Nusbaum
17 Sep 2009-Nature
TL;DR: The sequence of the P. infestans genome is reported, which at ∼240 megabases (Mb) is by far the largest and most complex genome sequenced so far in the chromalveolates. Its fast-evolving effector genes are localized to highly dynamic and expanded regions of the genome, which probably plays a crucial part in the rapid adaptability of the pathogen to host plants and underpins its evolutionary potential.
Abstract: Phytophthora infestans is the most destructive pathogen of potato and a model organism for the oomycetes, a distinct lineage of fungus-like eukaryotes that are related to organisms such as brown algae and diatoms. As the agent of the Irish potato famine in the mid-nineteenth century, P. infestans has had a tremendous effect on human history, resulting in famine and population displacement(1). To this day, it affects world agriculture by causing the most destructive disease of potato, the fourth largest food crop and a critical alternative to the major cereal crops for feeding the world's population(1). Current annual worldwide potato crop losses due to late blight are conservatively estimated at $6.7 billion(2). Management of this devastating pathogen is challenged by its remarkable speed of adaptation to control strategies such as genetically resistant cultivars(3,4). Here we report the sequence of the P. infestans genome, which at ∼240 megabases (Mb) is by far the largest and most complex genome sequenced so far in the chromalveolates. Its expansion results from a proliferation of repetitive DNA accounting for ∼74% of the genome. Comparison with two other Phytophthora genomes showed rapid turnover and extensive expansion of specific families of secreted disease effector proteins, including many genes that are induced during infection or are predicted to have activities that alter host physiology. These fast-evolving effector genes are localized to highly dynamic and expanded regions of the P. infestans genome. This probably plays a crucial part in the rapid adaptability of the pathogen to host plants and underpins its evolutionary potential.

1,341 citations


Journal ArticleDOI
Sekar Kathiresan, Benjamin F. Voight, Shaun Purcell, Kiran Musunuru, Diego Ardissino, Pier Mannuccio Mannucci, Sonia S. Anand, James C. Engert, Nilesh J. Samani, Heribert Schunkert, Jeanette Erdmann, Muredach P. Reilly, Daniel J. Rader, Thomas M. Morgan, John A. Spertus, Monika Stoll, Domenico Girelli, Pascal P. McKeown, Christopher Patterson, David S. Siscovick, Christopher J. O'Donnell, Roberto Elosua, Leena Peltonen, Veikko Salomaa, Stephen M. Schwartz, Olle Melander, David Altshuler, Pier Angelica Merlini, Carlo Berzuini, Luisa Bernardinelli, Flora Peyvandi, Marco Tubaro, Patrizia Celli, Maurizio Ferrario, Raffaela Fetiveau, Nicola Marziliano, Giorgio Casari, Michele Galli, Flavio Ribichini, Marco Rossi, Francesco Bernardi, Pietro Zonzin, Alberto Piazza, Jean Yee, Yechiel Friedlander, Jaume Marrugat, Gavin Lucas, Isaac Subirana, Joan Sala, Rafael Ramos, James B. Meigs, Gordon H. Williams, David M. Nathan, Calum A. MacRae, Aki S. Havulinna, Göran Berglund, Joel N. Hirschhorn, Rosanna Asselta, Stefano Duga, Marta Spreafico, Mark J. Daly, James Nemesh, Joshua M. Korn, Steven A. McCarroll, Aarti Surti, Candace Guiducci, Lauren Gianniny, Daniel B. Mirel, Melissa Parkin, Noël P. Burtt, Stacey Gabriel, John R. Thompson, Peter S. Braund, Benjamin J. Wright, Anthony J. Balmforth, Stephen G. Ball, Alistair S. Hall, Patrick Linsel-Nitschke, Wolfgang Lieb, Andreas Ziegler, Inke R. König, Christian Hengstenberg, Marcus Fischer, Klaus Stark, Anika Grosshennig, Michael Preuss, H-Erich Wichmann, Stefan Schreiber, Willem H. Ouwehand, Panos Deloukas, Michael Scholz, François Cambien, Mingyao Li, Zhen Chen, Robert L. Wilensky, William H. Matthai, Atif Qasim, Hakon Hakonarson, Joe Devaney, Mary-Susan Burnett, Augusto D. Pichard, Kenneth M. Kent, Lowell F. Satler, Joseph M. Lindsay, Ron Waksman, Stephen E. Epstein, Thomas Scheffold, Klaus Berger, Andreas Huge, Nicola Martinelli, Oliviero Olivieri, Roberto Corrocher, Hilma Holm, Gudmar Thorleifsson, Unnur Thorsteinsdottir, Kari Stefansson, Ron Do, Changchun Xie
TL;DR: SNPs at nine loci were reproducibly associated with myocardial infarction, but tests of common and rare CNVs failed to identify additional associations with myocardial infarction risk.
Abstract: We conducted a genome-wide association study testing single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) for association with early-onset myocardial infarction in 2,967 cases and 3,075 controls. We carried out replication in an independent sample with an effective sample size of up to 19,492. SNPs at nine loci reached genome-wide significance: three are newly identified (21q22 near MRPS6-SLC5A3-KCNE2, 6p24 in PHACTR1 and 2q33 in WDR12) and six replicated prior observations(1,2,3,4) (9p21, 1p13 near CELSR2-PSRC1-SORT1, 10q11 near CXCL12, 1q41 in MIA3, 19p13 near LDLR and 1p32 near PCSK9). We tested 554 common copy number polymorphisms (>1% allele frequency) and none met the pre-specified threshold for replication (P < 10⁻³). We identified 8,065 rare CNVs but did not detect a greater CNV burden in cases compared to controls, in genes compared to the genome as a whole, or at any individual locus. SNPs at nine loci were reproducibly associated with myocardial infarction, but tests of common and rare CNVs failed to identify additional associations with myocardial infarction risk.

1,092 citations


Journal ArticleDOI
TL;DR: A multilaboratory study to assess reproducibility, recovery, linear dynamic range and limits of detection and quantification of multiplexed, MRM-based assays, conducted by NCI-CPTAC demonstrates that these assays can be highly reproducible within and across laboratories and instrument platforms.
Abstract: Verification of candidate biomarkers relies upon specific, quantitative assays optimized for selective detection of target proteins, and is increasingly viewed as a critical step in the discovery pipeline that bridges unbiased biomarker discovery to preclinical validation. Although individual laboratories have demonstrated that multiple reaction monitoring (MRM) coupled with isotope dilution mass spectrometry can quantify candidate protein biomarkers in plasma, reproducibility and transferability of these assays between laboratories have not been demonstrated. We describe a multilaboratory study to assess reproducibility, recovery, linear dynamic range and limits of detection and quantification of multiplexed, MRM-based assays, conducted by NCI-CPTAC. Using common materials and standardized protocols, we demonstrate that these assays can be highly reproducible within and across laboratories and instrument platforms, and are sensitive to low µg/ml protein concentrations in unfractionated plasma. We provide data and benchmarks against which individual laboratories can compare their performance and evaluate new technologies for biomarker verification in plasma.

Journal ArticleDOI
16 Jul 2009-Nature
TL;DR: Analysis of the 363 megabase nuclear genome of the blood fluke, the first sequenced flatworm and a representative of the Lophotrochozoa, offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs.
Abstract: Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.

Journal ArticleDOI
04 Jun 2009-Nature
TL;DR: There are significant expansions of cell wall, secreted and transporter gene families in pathogenic species, suggesting adaptations associated with virulence in Candida species.
Abstract: Candida species are the most common cause of opportunistic fungal infection worldwide. Here we report the genome sequences of six Candida species and compare these and related pathogens and non-pathogens. There are significant expansions of cell wall, secreted and transporter gene families in pathogenic species, suggesting adaptations associated with virulence. Large genomic tracts are homozygous in three diploid species, possibly resulting from recent recombination events. Surprisingly, key components of the mating and meiosis pathways are missing from several species. These include major differences at the mating-type loci (MTL); Lodderomyces elongisporus lacks MTL, and components of the a1/α2 cell identity determinant were lost in other species, raising questions about how mating and cell types are controlled. Analysis of the CUG leucine-to-serine genetic-code change reveals that 99% of ancestral CUG codons were erased and new ones arose elsewhere. Lastly, we revise the Candida albicans gene catalogue, identifying many new genes.

Journal ArticleDOI
TL;DR: The method for SNP genotyping described in this unit is based on the commercially available Sequenom MassARRAY platform and uses MALDI‐TOF mass spectrometry to identify the SNP allele using the distinct mass of the extended primer.
Abstract: The method for SNP genotyping described in this unit is based on the commercially available Sequenom MassARRAY platform. The assay consists of an initial locus-specific PCR reaction, followed by single base extension using mass-modified dideoxynucleotide terminators of an oligonucleotide primer which anneals immediately upstream of the polymorphic site of interest. Using MALDI-TOF mass spectrometry, the distinct mass of the extended primer identifies the SNP allele.

Journal ArticleDOI
TL;DR: A peak of genomic amplification on chromosome 3q26.33 found in squamous cell carcinomas of the lung and esophagus contains the transcription factor gene SOX2, which is necessary for normal esophageal squamous development, promotes differentiation and proliferation of basal tracheal cells and cooperates in induction of pluripotent stem cells.
Abstract: Lineage-survival oncogenes are activated by somatic DNA alterations in cancers arising from the cell lineages in which these genes play a role in normal development. Here we show that a peak of genomic amplification on chromosome 3q26.33 found in squamous cell carcinomas (SCCs) of the lung and esophagus contains the transcription factor gene SOX2, which is mutated in hereditary human esophageal malformations, is necessary for normal esophageal squamous development, promotes differentiation and proliferation of basal tracheal cells and cooperates in induction of pluripotent stem cells. SOX2 expression is required for proliferation and anchorage-independent growth of lung and esophageal cell lines, as shown by RNA interference experiments. Furthermore, ectopic expression of SOX2 here cooperated with FOXE1 or FGFR2 to transform immortalized tracheobronchial epithelial cells. SOX2-driven tumors show expression of markers of both squamous differentiation and pluripotency. These characteristics identify SOX2 as a lineage-survival oncogene in lung and esophageal SCC.


Journal ArticleDOI
TL;DR: The association observed between low-density lipoprotein and an infrequent variant in AR suggests the potential of such a cohort for identifying associations with both common, low-impact and rarer, high-impact quantitative trait loci.
Abstract: Genome-wide association studies (GWAS) of longitudinal birth cohorts enable joint investigation of environmental and genetic influences on complex traits. We report GWAS results for nine quantitative metabolic traits (triglycerides, high-density lipoprotein, low-density lipoprotein, glucose, insulin, C-reactive protein, body mass index, and systolic and diastolic blood pressure) in the Northern Finland Birth Cohort 1966 (NFBC1966), drawn from the most genetically isolated Finnish regions. We replicate most previously reported associations for these traits and identify nine new associations, several of which highlight genes with metabolic functions: high-density lipoprotein with NR1H3 (LXRA), low-density lipoprotein with AR and FADS1-FADS2, glucose with MTNR1B, and insulin with PANK1. Two of these new associations emerged after adjustment of results for body mass index. Gene-environment interaction analyses suggested additional associations, which will require validation in larger samples. The currently identified loci, together with quantified environmental exposures, explain little of the trait variation in NFBC1966. The association observed between low-density lipoprotein and an infrequent variant in AR suggests the potential of such a cohort for identifying associations with both common, low-impact and rarer, high-impact quantitative trait loci.

Journal ArticleDOI
TL;DR: It is reported that the susceptibility allele near IRF8, which encodes a transcription factor known to function in type I interferon signaling, is associated with higher mRNA expression of interferon-response pathway genes in subjects with MS.
Abstract: We report the results of a meta-analysis of genome-wide association scans for multiple sclerosis (MS) susceptibility that includes 2,624 subjects with MS and 7,220 control subjects. Replication in an independent set of 2,215 subjects with MS and 2,116 control subjects validates new MS susceptibility loci at TNFRSF1A (combined P = 1.59 × 10⁻¹¹), IRF8 (P = 3.73 × 10⁻⁹) and CD6 (P = 3.79 × 10⁻⁹). TNFRSF1A harbors two independent susceptibility alleles: rs1800693 is a common variant with modest effect (odds ratio = 1.2), whereas rs4149584 is a nonsynonymous coding polymorphism of low frequency but with stronger effect (allele frequency = 0.02; odds ratio = 1.6). We also report that the susceptibility allele near IRF8, which encodes a transcription factor known to function in type I interferon signaling, is associated with higher mRNA expression of interferon-response pathway genes in subjects with MS.

Journal ArticleDOI
TL;DR: It is reported that uORFs correlate with significantly reduced protein expression of the downstream ORF, based on analysis of 11,649 matched mRNA and protein measurements from 4 published mammalian studies, and that 5 uORF-altering mutations, detected within genes previously linked to human diseases, dramatically silence expression of the downstream protein.
Abstract: Upstream ORFs (uORFs) are mRNA elements defined by a start codon in the 5′ UTR that is out-of-frame with the main coding sequence. Although uORFs are present in approximately half of human and mouse transcripts, no study has investigated their global impact on protein expression. Here, we report that uORFs correlate with significantly reduced protein expression of the downstream ORF, based on analysis of 11,649 matched mRNA and protein measurements from 4 published mammalian studies. Using reporter constructs to test 25 selected uORFs, we estimate that uORFs typically reduce protein expression by 30–80%, with a modest impact on mRNA levels. We additionally identify polymorphisms that alter uORF presence in 509 human genes. Finally, we report that 5 uORF-altering mutations, detected within genes previously linked to human diseases, dramatically silence expression of the downstream protein. Together, our results suggest that uORFs influence the protein expression of thousands of mammalian genes and that variation in these elements can influence human phenotype and disease.

Journal ArticleDOI
06 Nov 2009-Science
TL;DR: The analysis reveals an evolutionarily new centromere on equine chromosome 11 that displays properties of an immature but fully functioning centromere and is devoid of centromeric satellite sequence, suggesting that centromeric function may arise before satellite repeat accumulation.
Abstract: We report a high-quality draft sequence of the genome of the horse (Equus caballus). The genome is relatively repetitive but has little segmental duplication. Chromosomes appear to have undergone few historical rearrangements: 53% of equine chromosomes show conserved synteny to a single human chromosome. Equine chromosome 11 is shown to have an evolutionarily new centromere devoid of centromeric satellite DNA, suggesting that centromeric function may arise before satellite repeat accumulation. Linkage disequilibrium, showing the influences of early domestication of large herds of female horses, is intermediate in length between dog and human, and there is long-range haplotype sharing among breeds.

Journal ArticleDOI
12 Nov 2009-Nature
TL;DR: It is demonstrated that direct, high-affinity binding of the hydrocarbon-stapled peptide SAHM1 prevents assembly of the active transcriptional complex at a critical protein–protein interface in the NOTCH transactivation complex.
Abstract: Direct inhibition of transcription factor complexes remains a central challenge in the discipline of ligand discovery. In general, these proteins lack surface involutions suitable for high-affinity binding by small molecules. Here we report the design of synthetic, cell-permeable, stabilized α-helical peptides that target a critical protein–protein interface in the NOTCH transactivation complex. We demonstrate that direct, high-affinity binding of the hydrocarbon-stapled peptide SAHM1 prevents assembly of the active transcriptional complex. Inappropriate NOTCH activation is directly implicated in the pathogenesis of several disease states, including T-cell acute lymphoblastic leukaemia (T-ALL). The treatment of leukaemic cells with SAHM1 results in genome-wide suppression of NOTCH-activated genes. Direct antagonism of the NOTCH transcriptional program causes potent, NOTCH-specific anti-proliferative effects in cultured cells and in a mouse model of NOTCH1-driven T-ALL. The NOTCH complex is of tremendous interest because of its role as a master developmental regulator of gene transcription, a substrate for γ-secretase and an oncogene inappropriately activated in many cancers including T-cell leukaemias. Like the majority of transcription factors, NOTCH was thought to be untargetable by synthetic cell-permeable molecules. But now a promising NOTCH antagonist has been designed, and found to be effective in reducing leukaemia growth in a mouse model. The hydrocarbon-stapled peptide SAHM1 acts by preventing assembly of the active transcriptional complex, providing a potentially valuable tool for studies of the role of NOTCH and a starting point for therapeutic agents. In addition, the direct targeting of transactivation complexes may be applicable to several other transcription factor complexes previously considered untargetable. It is notoriously difficult to target transcription factors with aberrant activity in cancer. 
Inappropriate activation of the NOTCH complex of transcription factors is directly implicated in the pathogenesis of several disease states, including T-cell acute lymphoblastic leukaemia. The design of synthetic, cell-permeable, stabilized α-helical peptides that disrupt protein–protein interactions in NOTCH is now described.

Journal ArticleDOI
Inga Prokopenko1, Claudia Langenberg2, Jose C. Florez3, Jose C. Florez4, Richa Saxena3, Richa Saxena4, Nicole Soranzo5, Nicole Soranzo6, Gudmar Thorleifsson7, Ruth J. F. Loos2, Alisa K. Manning8, Anne U. Jackson9, Yurii S. Aulchenko10, Simon C. Potter6, Michael R. Erdos11, Serena Sanna, Jouke-Jan Hottenga12, Eleanor Wheeler6, Marika Kaakinen13, Valeriya Lyssenko14, Wei-Min Chen15, Kourosh R. Ahmadi5, Jacques S. Beckmann16, Jacques S. Beckmann17, Richard N. Bergman18, Murielle Bochud17, Lori L. Bonnycastle11, Thomas A. Buchanan18, Antonio Cao, Alessandra C. L. Cervino5, Lachlan J. M. Coin19, Francis S. Collins11, Laura Crisponi, Eco J. C. de Geus12, Abbas Dehghan10, Panos Deloukas6, Alex S. F. Doney20, Paul Elliott19, Nelson B. Freimer21, Vesela Gateva9, Christian Herder22, Albert Hofman10, Thomas Edward Hughes23, Sarah E. Hunt6, Thomas Illig, Michael Inouye6, Bo Isomaa, Toby Johnson16, Toby Johnson24, Toby Johnson17, Augustine Kong7, Maria Krestyaninova25, Johanna Kuusisto26, Markku Laakso26, Noha Lim27, Ulf Lindblad14, Cecilia M. Lindgren1, O. T. McCann6, Karen L. Mohlke28, Andrew D. Morris20, Silvia Naitza, Marco Orru, Colin N. A. Palmer20, Anneli Pouta29, Joshua C. Randall1, Wolfgang Rathmann22, Jouko Saramies, Paul Scheet9, Laura J. Scott9, Angelo Scuteri11, Stephen J. Sharp2, Eric J.G. Sijbrands10, Jan H. Smit30, Kijoung Song27, Valgerdur Steinthorsdottir7, Heather M. Stringham9, Tiinamaija Tuomi31, Jaakko Tuomilehto, André G. Uitterlinden10, Benjamin F. Voight4, Benjamin F. Voight3, Dawn M. Waterworth27, H-Erich Wichmann32, Gonneke Willemsen12, Jacqueline C.M. Witteman10, Xin Yuan27, Jing Hua Zhao2, Eleftheria Zeggini1, David Schlessinger11, Manjinder S. Sandhu2, Manjinder S. Sandhu33, Dorret I. Boomsma12, Manuela Uda, Tim D. Spector5, Brenda W.J.H. Penninx34, Brenda W.J.H. Penninx33, Brenda W.J.H. Penninx35, David Altshuler4, David Altshuler3, Peter Vollenweider17, Marjo-Riitta Järvelin19, Marjo-Riitta Järvelin13, Edward G. Lakatta11, Gérard Waeber17, Caroline S. Fox11, Caroline S. Fox36, Leena Peltonen6, Leena Peltonen37, Leif Groop14, Vincent Mooser27, L. Adrienne Cupples8, Unnur Thorsteinsdottir38, Unnur Thorsteinsdottir7, Michael Boehnke9, Inês Barroso6, Cornelia M. van Duijn10, Josée Dupuis8, Richard M. Watanabe18, Kari Stefansson7, Kari Stefansson38, Mark I. McCarthy1, Mark I. McCarthy39, Nicholas J. Wareham2, James B. Meigs3, Gonçalo R. Abecasis9
TL;DR: Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten genome-wide association scans, and previous associations of fasting glucose with variants at the G6PC2 and GCK loci are confirmed.
Abstract: To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06–0.08) mmol/l in fasting glucose levels (P = 3.2 × 10⁻⁵⁰) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 × 10⁻¹⁵). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05–1.12), per G allele P = 3.3 × 10⁻⁷) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 × 10⁻⁵⁷) and GCK (rs4607517, P = 1.0 × 10⁻²⁵) loci.

Journal ArticleDOI
Cecilia M. Lindgren1, Iris M. Heid2, Joshua C. Randall1, Claudia Lamina3 +152 more; Institutions (36)
TL;DR: By focusing on anthropometric measures of central obesity and fat distribution, a meta-analysis of 16 genome-wide association studies informative for adult waist circumference and waist–hip ratio identified three loci implicated in the regulation of human adiposity.
Abstract: To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist–hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evidence of association with measures of central adiposity (WC and/or WHR) was strong and disproportionate to that for overall adiposity or height. Follow-up studies in a maximum of 70,689 individuals identified two loci strongly associated with measures of central adiposity; these map near TFAP2B (WC, P = 1.9 × 10⁻¹¹) and MSRA (WC, P = 8.9 × 10⁻⁹). A third locus, near LYPLAL1, was associated with WHR in women only (P = 2.6 × 10⁻⁸). The variants near TFAP2B appear to influence central adiposity through an effect on overall obesity/fat-mass, whereas LYPLAL1 displays a strong female-only association with fat distribution. By focusing on anthropometric measures of central obesity and fat distribution, we have identified three loci implicated in the regulation of human adiposity.

Journal ArticleDOI
24 Dec 2009-Cell
TL;DR: A combination of yeast two-hybrid analysis and genome-wide expression profiling implicated hundreds of human factors in mediating viral-host interactions and pointed to potential roles for some unanticipated host and viral proteins in viral infection and the host response.

Journal ArticleDOI
TL;DR: The results affirm the importance of MEK dependency in BRAF-mutant melanoma and suggest novel mechanisms of resistance to MEK and B-RAF inhibitors that may have important clinical implications.
Abstract: Genetic alterations that activate the mitogen-activated protein kinase (MAP kinase) pathway occur commonly in cancer. For example, the majority of melanomas harbor mutations in the BRAF oncogene, which are predicted to confer enhanced sensitivity to pharmacologic MAP kinase inhibition (e.g., RAF or MEK inhibitors). We investigated the clinical relevance of MEK dependency in melanoma by massively parallel sequencing of resistant clones generated from a MEK1 random mutagenesis screen in vitro, as well as tumors obtained from relapsed patients following treatment with AZD6244, an allosteric MEK inhibitor. Most mutations conferring resistance to MEK inhibition in vitro populated the allosteric drug binding pocket or α-helix C and showed robust (≈100-fold) resistance to allosteric MEK inhibition. Other mutations affected MEK1 codons located within or abutting the N-terminal negative regulatory helix (helix A), which also undergo gain-of-function germline mutations in cardio-facio-cutaneous (CFC) syndrome. One such mutation, MEK1(P124L), was identified in a resistant metastatic focus that emerged in a melanoma patient treated with AZD6244. Both MEK1(P124L) and MEK1(Q56P), which disrupts helix A, conferred cross-resistance to PLX4720, a selective B-RAF inhibitor. However, exposing BRAF-mutant melanoma cells to AZD6244 and PLX4720 in combination prevented emergence of resistant clones. These results affirm the importance of MEK dependency in BRAF-mutant melanoma and suggest novel mechanisms of resistance to MEK and B-RAF inhibitors that may have important clinical implications.

Journal ArticleDOI
TL;DR: Evidence is presented that the region harboring this variant is a transcriptional enhancer, that the alleles of rs6983267 differentially bind transcription factor 7-like 2 (TCF7L2) and that the risk region physically interacts with the MYC proto-oncogene.
Abstract: An inherited variant on chromosome 8q24, rs6983267, is significantly associated with cancer pathogenesis. We present evidence that the region harboring this variant is a transcriptional enhancer, that the alleles of rs6983267 differentially bind transcription factor 7-like 2 (TCF7L2) and that the risk region physically interacts with the MYC proto-oncogene. These data provide strong support for a biological mechanism underlying this non-protein-coding risk variant.

Journal ArticleDOI
Lauren A. Weiss1, Lauren A. Weiss2, Dan E. Arking3, Mark J. Daly2 +211 more; Institutions (54)
08 Oct 2009-Nature
TL;DR: A linkage and association mapping study using half a million genome-wide single nucleotide polymorphisms in a common set of 1,031 multiplex autism families, implicating SEMA5A as an autism susceptibility gene.
Abstract: Although autism is a highly heritable neurodevelopmental disorder, attempts to identify specific susceptibility genes have thus far met with limited success. Genome-wide association studies using half a million or more markers, particularly those with very large sample sizes achieved through meta-analysis, have shown great success in mapping genes for other complex genetic traits. Consequently, we initiated a linkage and association mapping study using half a million genome-wide single nucleotide polymorphisms (SNPs) in a common set of 1,031 multiplex autism families (1,553 affected offspring). We identified regions of suggestive and significant linkage on chromosomes 6q27 and 20p13, respectively. Initial analysis did not yield genome-wide significant associations; however, genotyping of top hits in additional families revealed an SNP on chromosome 5p15 (between SEMA5A and TAS2R1) that was significantly associated with autism (P = 2 × 10⁻⁷). We also demonstrated that expression of SEMA5A is reduced in brains from autistic patients, further implicating SEMA5A as an autism susceptibility gene. The linkage regions reported here provide targets for rare variation screening, whereas the discovery of a single novel association demonstrates the action of common variants.

Journal ArticleDOI
TL;DR: It is found that chronic social defeat stress in mice causes a transient decrease, followed by a persistent increase, in levels of acetylated histone H3 in the nucleus accumbens, an important limbic brain region.
Abstract: Persistent symptoms of depression suggest the involvement of stable molecular adaptations in brain, which may be reflected at the level of chromatin remodeling. We find that chronic social defeat stress in mice causes a transient decrease, followed by a persistent increase, in levels of acetylated histone H3 in the nucleus accumbens, an important limbic brain region. This persistent increase in H3 acetylation is associated with decreased levels of histone deacetylase 2 (HDAC2) in the nucleus accumbens. Similar effects were observed in the nucleus accumbens of depressed humans studied postmortem. These changes in H3 acetylation and HDAC2 expression mediate long-lasting positive neuronal adaptations, since infusion of HDAC inhibitors into the nucleus accumbens, which increases histone acetylation, exerts robust antidepressant-like effects in the social defeat paradigm and other behavioral assays. HDAC inhibitor [N-(2-aminophenyl)-4-[N-(pyridine-3-ylmethoxy-carbonyl)aminomethyl]benzamide (MS-275)] infusion also reverses the effects of chronic defeat stress on global patterns of gene expression in the nucleus accumbens, as determined by microarray analysis, with striking similarities to the effects of the standard antidepressant fluoxetine. Stress-regulated genes whose expression is normalized selectively by MS-275 may provide promising targets for the future development of novel antidepressant treatments. Together, these findings provide new insight into the underlying molecular mechanisms of depression and antidepressant action, and support the antidepressant potential of HDAC inhibitors and perhaps other agents that act at the level of chromatin structure.

Journal ArticleDOI
TL;DR: HAPMIX will be of particular utility for mapping disease genes in recently admixed populations, as its accurate estimates of local ancestry permit admixture and case-control association signals to be combined, enabling more powerful tests of association than with either signal alone.
Abstract: Identifying the ancestry of chromosomal segments of distinct ancestry has a wide range of applications from disease mapping to learning about history. Most methods require the use of unlinked markers; but, using all markers from genome-wide scanning arrays, it should in principle be possible to infer the ancestry of even very small segments with exquisite accuracy. We describe a method, HAPMIX, which employs an explicit population genetic model to perform such local ancestry inference based on fine-scale variation data. We show that HAPMIX outperforms other methods, and we explore its utility for inferring ancestry, learning about ancestral populations, and inferring dates of admixture. We validate the method empirically by applying it to populations that have experienced recent and ancient admixture: 935 African Americans from the United States and 29 Mozabites from North Africa. HAPMIX will be of particular utility for mapping disease genes in recently admixed populations, as its accurate estimates of local ancestry permit admixture and case-control association signals to be combined, enabling more powerful tests of association than with either signal alone.

Journal ArticleDOI
TL;DR: A precipitous drop in costs and increase in sequencing efficiency is anticipated, with concomitant development of improved annotation technology, and it is proposed to create a collection of tissue and DNA specimens for 10,000 vertebrate species specifically designated for whole-genome sequencing in the very near future.
Abstract: American Genetic Association, Gordon and Betty Moore Foundation, NHGRI Intramural Sequencing Center, and UCSC Alumni Association to cost of the Genome 10K workshop; Howard Hughes Medical Institute to D. H.; Gordon and Betty Moore Foundation to S. C. S.; A