Showing papers on "Gene" published in 2011


Journal ArticleDOI
Debra A. Bell1, Andrew Berchuck2, Michael J. Birrer3, Jeremy Chien1 +282 more; Institutions (35)
30 Jun 2011-Nature
TL;DR: It is reported that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes.
Abstract: A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology.

5,878 citations


Journal ArticleDOI
19 May 2011-Nature
TL;DR: Using a quantitative model, the first genome-scale prediction of synthesis rates of mRNAs and proteins is obtained and it is found that the cellular abundance of proteins is predominantly controlled at the level of translation.
Abstract: Gene expression is a multistep process that involves the transcription, translation and turnover of messenger RNAs and proteins. Although it is one of the most fundamental processes of life, the entire cascade has never been quantified on a genome-wide scale. Here we simultaneously measured absolute mRNA and protein abundance and turnover by parallel metabolic pulse labelling for more than 5,000 genes in mammalian cells. Whereas mRNA and protein levels correlated better than previously thought, corresponding half-lives showed no correlation. Using a quantitative model we have obtained the first genome-scale prediction of synthesis rates of mRNAs and proteins. We find that the cellular abundance of proteins is predominantly controlled at the level of translation. Genes with similar combinations of mRNA and protein stability shared functional properties, indicating that half-lives evolved under energetic and dynamic constraints. Quantitative information about all stages of gene expression provides a rich resource and helps to provide a greater understanding of the underlying design principles.

5,635 citations
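The link between measured abundances, half-lives and synthesis rates described in the abstract above can be illustrated with a minimal sketch. It assumes simple steady-state, first-order kinetics (synthesis balances degradation) and ignores dilution by cell growth, so it is a simplification and not the authors' published model. The function names and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def degradation_rate(half_life_h: float) -> float:
    """First-order decay constant (per hour) from a half-life in hours."""
    return math.log(2) / half_life_h


def steady_state_synthesis_rates(mrna_copies: float, mrna_half_life_h: float,
                                 protein_copies: float, protein_half_life_h: float):
    """Return (transcription rate, translation rate) at steady state.

    Assumption: synthesis exactly balances first-order degradation.
    transcription rate is in mRNAs per hour;
    translation rate is in proteins per mRNA per hour.
    """
    k_dm = degradation_rate(mrna_half_life_h)
    k_dp = degradation_rate(protein_half_life_h)
    transcription = mrna_copies * k_dm
    translation = protein_copies * k_dp / mrna_copies
    return transcription, translation


# Hypothetical gene; these values are illustrative, not data from the paper.
transcription, translation = steady_state_synthesis_rates(
    mrna_copies=20, mrna_half_life_h=9.0,
    protein_copies=50_000, protein_half_life_h=46.0,
)
print(f"transcription: {transcription:.2f} mRNAs/h")
print(f"translation:   {translation:.2f} proteins/(mRNA*h)")
```

Under these assumptions, two genes with the same mRNA level can differ widely in protein abundance if their translation rates differ, which is one way to read the abstract's conclusion that abundance is controlled mainly at the level of translation.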


Journal ArticleDOI
Xun Xu1, Shengkai Pan1, Shifeng Cheng1, Bo Zhang1, Mu D1, Peixiang Ni1, Gengyun Zhang1, Shuang Yang1, Ruiqiang Li1, Jun Wang1, Gisella Orjeda2, Frank Guzman2, Torres M2, Roberto Lozano2, Olga Ponce2, Diana Martinez2, De la Cruz G3, Chakrabarti Sk3, Patil Vu3, Konstantin G. Skryabin4, Boris B. Kuznetsov4, Nikolai V. Ravin4, Tatjana V. Kolganova4, Alexey V. Beletsky4, Andrey V. Mardanov4, Di Genova A5, Dan Bolser5, David M. A. Martin5, Li G, Yang Y, Hanhui Kuang6, Hu Q6, Xiong X7, Gerard J. Bishop8, Boris Sagredo, Nilo Mejía, Zagorski W9, Robert Gromadka9, Jan Gawor9, Pawel Szczesny9, Sanwen Huang, Zhang Z, Liang C, He J, Li Y, He Y, Xu J, Youjun Zhang, Xie B, Du Y, Qu D, Merideth Bonierbale10, Marc Ghislain10, Herrera Mdel R, Giovanni Giuliano, Marco Pietrella, Gaetano Perrotta, Paolo Facella, O'Brien K11, Sergio Enrique Feingold, Barreiro Le, Massa Ga, Luis Aníbal Diambra12, Brett R Whitty13, Brieanne Vaillancourt13, Lin H13, Alicia N. Massa13, Geoffroy M13, Lundback S13, Dean DellaPenna13, Buell Cr14, Sanjeev Kumar Sharma14, David Marshall14, Robbie Waugh14, Glenn J. Bryan14, Destefanis M15, Istvan Nagy15, Dan Milbourne15, Susan Thomson16, Mark Fiers16, Jeanne M. E. Jacobs16, Kåre Lehmann Nielsen17, Mads Sønderkær17, Marina Iovene18, Giovana Augusta Torres18, Jiming Jiang18, Richard E. Veilleux19, Christian W. B. Bachem20, de Boer J20, Theo Borm20, Bjorn Kloosterman20, van Eck H20, Erwin Datema20, Hekkert Bt20, Aska Goverse20, van Ham Rc20, Richard G. F. Visser20 
10 Jul 2011-Nature
TL;DR: The potato genome sequence, for which 39,031 protein-coding genes are predicted and which shows evidence of at least two genome duplication events indicative of a palaeopolyploid origin, provides a platform for genetic improvement of this vital crop.
Abstract: Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.

1,813 citations


Journal ArticleDOI
Xiaowu Wang1, Hanzhong Wang, Jun Wang2, Jun Wang3, Jun Wang4, Rifei Sun, Jian Wu, Shengyi Liu, Yinqi Bai3, Jeong-Hwan Mun5, Ian Bancroft6, Feng Cheng, Sanwen Huang, Xixiang Li, Wei Hua, Junyi Wang3, Xiyin Wang7, Xiyin Wang8, Michael Freeling9, J. Chris Pires10, Andrew H. Paterson7, Boulos Chalhoub, Bo Wang3, Alice Hayward11, Alice Hayward12, Andrew G. Sharpe13, Beom-Seok Park5, Bernd Weisshaar14, Binghang Liu3, Bo Li3, Bo Liu, Chaobo Tong, Chi Song3, Chris Duran11, Chris Duran15, Chunfang Peng3, Geng Chunyu3, Chushin Koh13, Chuyu Lin3, David Edwards15, David Edwards11, Desheng Mu3, Di Shen, Eleni Soumpourou6, Fei Li, Fiona Fraser6, Gavin C. Conant10, Gilles Lassalle16, Graham J.W. King2, Guusje Bonnema17, Haibao Tang9, Haiping Wang, Harry Belcram, Heling Zhou3, Hideki Hirakawa, Hiroshi Abe, Hui Guo7, Hui Wang, Huizhe Jin7, Isobel A. P. Parkin18, Jacqueline Batley12, Jacqueline Batley11, Jeong-Sun Kim5, Jérémy Just, Jianwen Li3, Jiaohui Xu3, Jie Deng, Jin A Kim5, Jingping Li7, Jingyin Yu, Jinling Meng19, Jinpeng Wang8, Jiumeng Min3, Julie Poulain20, Katsunori Hatakeyama, Kui Wu3, Li Wang8, Lu Fang, Martin Trick6, Matthew G. Links18, Meixia Zhao, Mina Jin5, Nirala Ramchiary21, Nizar Drou22, Paul J. Berkman15, Paul J. Berkman11, Qingle Cai3, Quanfei Huang3, Ruiqiang Li3, Satoshi Tabata, Shifeng Cheng3, Shu Zhang3, Shujiang Zhang, Shunmou Huang, Shusei Sato, Silong Sun, Soo-Jin Kwon5, Su-Ryun Choi21, Tae-Ho Lee7, Wei Fan3, Xiang Zhao3, Xu Tan7, Xun Xu3, Yan Wang, Yang Qiu, Ye Yin3, Yingrui Li3, Yongchen Du, Yongcui Liao, Yong Pyo Lim21, Yoshihiro Narusaka, Yupeng Wang8, Zhenyi Wang8, Zhenyu Li3, Zhiwen Wang3, Zhiyong Xiong10, Zhonghua Zhang 
TL;DR: This work reports the annotation and analysis of the draft genome sequence of Brassica rapa accession Chiifu-401-42, a Chinese cabbage, and uses Arabidopsis thaliana as an outgroup to investigate the consequences of genome triplication, such as structural and functional evolution.
Abstract: We report the annotation and analysis of the draft genome sequence of Brassica rapa accession Chiifu-401-42, a Chinese cabbage. We modeled 41,174 protein coding genes in the B. rapa genome, which has undergone genome triplication. We used Arabidopsis thaliana as an outgroup for investigating the consequences of genome triplication, such as structural and functional evolution. The extent of gene loss (fractionation) among triplicated genome segments varies, with one of the three copies consistently retaining a disproportionately large fraction of the genes expected to have been present in its ancestor. Variation in the number of members of gene families present in the genome may contribute to the remarkable morphological plasticity of Brassica species. The B. rapa genome sequence provides an important resource for studying the evolution of polyploid genomes and underpins the genetic improvement of Brassica oil and vegetable crops.

1,811 citations


Journal ArticleDOI
07 Apr 2011-Nature
TL;DR: HOTTIP RNA binds the adaptor protein WDR5 directly and targets WDR5/MLL complexes across HOXA, driving histone H3 lysine 4 trimethylation and gene transcription.
Abstract: A major question in developmental biology is how functionally related groups of genes are switched on at the right time and in the right place. Long intergenic non-coding RNAs (lincRNAs) have been implicated in both gene silencing and activation, and could be a means of long-range control of gene expression. A lincRNA termed HOTTIP that coordinates the activation of multiple 5′ HOXA regulatory genes has now been identified at the 5′ tip of the HOXA locus. Chromosomal looping brings HOTTIP close to its target genes, where it facilitates histone H3 lysine 4 trimethylation and gene transcription. The genome is extensively transcribed into long intergenic noncoding RNAs (lincRNAs), many of which are implicated in gene silencing [1,2]. Potential roles of lincRNAs in gene activation are much less understood [3,4,5]. Development and homeostasis require coordinate regulation of neighbouring genes through a process termed locus control [6]. Some locus control elements and enhancers transcribe lincRNAs [7,8,9,10], hinting at possible roles in long-range control. In vertebrates, 39 Hox genes, encoding homeodomain transcription factors critical for positional identity, are clustered in four chromosomal loci; the Hox genes are expressed in nested anterior-posterior and proximal-distal patterns colinear with their genomic position from 3′ to 5′ of the cluster [11]. Here we identify HOTTIP, a lincRNA transcribed from the 5′ tip of the HOXA locus that coordinates the activation of several 5′ HOXA genes in vivo. Chromosomal looping brings HOTTIP into close proximity to its target genes. HOTTIP RNA binds the adaptor protein WDR5 directly and targets WDR5/MLL complexes across HOXA, driving histone H3 lysine 4 trimethylation and gene transcription. Induced proximity is necessary and sufficient for HOTTIP RNA activation of its target genes. Thus, by serving as key intermediates that transmit information from higher order chromosomal looping into chromatin modifications, lincRNAs may organize chromatin domains to coordinate long-range gene activation.

1,782 citations


Journal ArticleDOI
16 Jun 2011-Nature
TL;DR: High-throughput genome engineering highlighted by this study is broadly applicable to rat and human stem cells and provides a foundation for future genome-wide efforts aimed at deciphering the function of all genes encoded by the mammalian genome.
Abstract: Gene targeting in embryonic stem cells has become the principal technology for manipulation of the mouse genome, offering unrivalled accuracy in allele design and access to conditional mutagenesis. To bring these advantages to the wider research community, large-scale mouse knockout programmes are producing a permanent resource of targeted mutations in all protein-coding genes. Here we report the establishment of a high-throughput gene-targeting pipeline for the generation of reporter-tagged, conditional alleles. Computational allele design, 96-well modular vector construction and high-efficiency gene-targeting strategies have been combined to mutate genes on an unprecedented scale. So far, more than 12,000 vectors and 9,000 conditional targeted alleles have been produced in highly germline-competent C57BL/6N embryonic stem cells. High-throughput genome engineering highlighted by this study is broadly applicable to rat and human stem cells and provides a foundation for future genome-wide efforts aimed at deciphering the function of all genes encoded by the mammalian genome.

1,538 citations


Journal ArticleDOI
15 Sep 2011-Nature
TL;DR: These sequences provide a starting point for a new era in the functional analysis of a key model organism and show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus.
Abstract: We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.

1,453 citations


Journal ArticleDOI
24 Mar 2011-Nature
TL;DR: 111,195 new elements are identified, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches.
Abstract: Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development.

1,427 citations


Journal ArticleDOI
John K. Colbourne1, Michael E. Pfrender2, Michael E. Pfrender3, Donald L. Gilbert1, W. Kelley Thomas4, Abraham E. Tucker1, Abraham E. Tucker4, Todd H. Oakley5, Shin-ichi Tokishita6, Andrea Aerts7, Georg J. Arnold8, Malay Kumar Basu9, Malay Kumar Basu10, Darren J Bauer4, Carla E. Cáceres11, Liran Carmel10, Liran Carmel12, Claudio Casola1, Jeong Hyeon Choi1, John C. Detter7, Qunfeng Dong13, Qunfeng Dong1, Serge Dusheyko7, Brian D. Eads1, Thomas Fröhlich8, Kerry Geiler-Samerotte14, Kerry Geiler-Samerotte5, Daniel Gerlach15, Daniel Gerlach16, Phil Hatcher4, Sanjuro Jogdeo17, Sanjuro Jogdeo4, Jeroen Krijgsveld18, Evgenia V. Kriventseva15, Dietmar Kültz19, Christian Laforsch8, Erika Lindquist7, Jacqueline Lopez1, J. Robert Manak20, J. Robert Manak21, Jean Muller22, Jasmyn Pangilinan7, Rupali P Patwardhan23, Rupali P Patwardhan1, Samuel Pitluck7, Ellen J. Pritham24, Andreas Rechtsteiner1, Andreas Rechtsteiner25, Mina Rho1, Igor B. Rogozin10, Onur Sakarya26, Onur Sakarya5, Asaf Salamov7, Sarah Schaack24, Sarah Schaack1, Harris Shapiro7, Yasuhiro Shiga6, Courtney Skalitzky20, Zachary Smith1, Alexander Souvorov10, Way Sung4, Zuojian Tang1, Zuojian Tang27, Dai Tsuchiya1, Hank Tu7, Hank Tu26, Harmjan R. Vos18, Mei Wang7, Yuri I. Wolf10, Hideo Yamagata6, Takuji Yamada, Yuzhen Ye1, Joseph R. Shaw1, Justen Andrews1, Teresa J. Crease28, Haixu Tang1, Susan Lucas7, Hugh M. Robertson11, Peer Bork, Eugene V. Koonin10, Evgeny M. Zdobnov29, Evgeny M. Zdobnov15, Igor V. Grigoriev7, Michael Lynch1, Jeffrey L. Boore30, Jeffrey L. Boore7 
04 Feb 2011-Science
TL;DR: The Daphnia genome reveals a multitude of genes and shows adaptation through gene family expansions, and the coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random.
Abstract: We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia's genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage. The coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random, and the analysis of gene expression under different environmental conditions reveals that numerous paralogs acquire divergent expression patterns soon after duplication. Daphnia-specific genes, including many additional loci within sequenced regions that are otherwise devoid of annotations, are the most responsive genes to ecological challenges.

1,204 citations


Journal ArticleDOI
Ci Chu1, Kun Qu1, Franklin L. Zhong1, Steven E. Artandi1, Howard Y. Chang1 
TL;DR: ChIRP-seq of three lncRNAs reveals that RNA occupancy sites in the genome are focal, sequence-specific, and numerous; the method is generally applicable to illuminating the intersection of RNA and chromatin with newfound precision genome-wide.

1,095 citations


Journal ArticleDOI
TL;DR: Global gene expression analysis demonstrated that exogenous IRF5 upregulated or downregulated expression of established phenotypic markers of M1 or M2 macrophages, respectively, suggesting a critical role for IRF5 in M1 macrophage polarization and defining a previously unknown function for IRF5 as a transcriptional repressor.
Abstract: Polymorphisms in the gene encoding the transcription factor IRF5 that lead to higher mRNA expression are associated with many autoimmune diseases. Here we show that IRF5 expression in macrophages was reversibly induced by inflammatory stimuli and contributed to the plasticity of macrophage polarization. High expression of IRF5 was characteristic of M1 macrophages, in which it directly activated transcription of the genes encoding interleukin 12 subunit p40 (IL-12p40), IL-12p35 and IL-23p19 and repressed the gene encoding IL-10. Consequently, those macrophages set up the environment for a potent T helper type 1 (T(H)1)-T(H)17 response. Global gene expression analysis demonstrated that exogenous IRF5 upregulated or downregulated expression of established phenotypic markers of M1 or M2 macrophages, respectively. Our data suggest a critical role for IRF5 in M1 macrophage polarization and define a previously unknown function for IRF5 as a transcriptional repressor.

Journal ArticleDOI
19 May 2011-Nature
TL;DR: It is proposed that TET1 fine-tunes transcription, opposes aberrant DNA methylation at CpG-rich sequences and thereby contributes to the regulation of DNA methylation fidelity.
Abstract: Enzymes catalysing the methylation of the 5-position of cytosine (mC) have essential roles in regulating gene expression and maintaining cellular identity. Recently, TET1 was found to hydroxylate the methyl group of mC, converting it to 5-hydroxymethyl cytosine (hmC). Here we show that TET1 binds throughout the genome of embryonic stem cells, with the majority of binding sites located at transcription start sites (TSSs) of CpG-rich promoters and within genes. The hmC modification is found in gene bodies and, in contrast to mC, is also enriched at CpG-rich TSSs. We further provide evidence that TET1 has a role in transcriptional repression. TET1 binds a significant proportion of Polycomb group target genes. Furthermore, TET1 associates and colocalizes with the SIN3A co-repressor complex. We propose that TET1 fine-tunes transcription, opposes aberrant DNA methylation at CpG-rich sequences and thereby contributes to the regulation of DNA methylation fidelity.

Journal ArticleDOI
TL;DR: In this article, an ultra-high-density array that tiles the promoters of 56 cell-cycle genes was used to interrogate 108 samples representing diverse perturbations, identifying 216 transcribed regions that encode putative lncRNAs, many with RT-PCR-validated periodic expression during the cell cycle.
Abstract: Transcription of long noncoding RNAs (lncRNAs) within gene regulatory elements can modulate gene activity in response to external stimuli, but the scope and functions of such activity are not known. Here we use an ultrahigh-density array that tiles the promoters of 56 cell-cycle genes to interrogate 108 samples representing diverse perturbations. We identify 216 transcribed regions that encode putative lncRNAs, many with RT-PCR-validated periodic expression during the cell cycle, show altered expression in human cancers and are regulated in expression by specific oncogenic stimuli, stem cell differentiation or DNA damage. DNA damage induces five lncRNAs from the CDKN1A promoter, and one such lncRNA, named PANDA, is induced in a p53-dependent manner. PANDA interacts with the transcription factor NF-YA to limit expression of pro-apoptotic genes; PANDA depletion markedly sensitized human fibroblasts to apoptosis by doxorubicin. These findings suggest potentially widespread roles for promoter lncRNAs in cell-growth control.

Journal ArticleDOI
30 Sep 2011-Science
TL;DR: Comprising tandem, polymorphic amino acid repeats that individually specify contiguous nucleotides in DNA, this domain is being deployed in DNA targeting for applications ranging from understanding gene function in model organisms to improving traits in crop plants to treating genetic disorders in people.
Abstract: Generating and applying new knowledge from the wealth of available genomic information is hindered, in part, by the difficulty of altering nucleotide sequences and expression of genes in living cells in a targeted fashion. Progress has been made in engineering DNA binding domains to direct proteins to particular sequences for mutagenesis or manipulation of transcription; however, achieving the requisite specificities has been challenging. Transcription activator-like (TAL) effectors of plant pathogenic bacteria contain a modular DNA binding domain that appears to overcome this challenge. Comprising tandem, polymorphic amino acid repeats that individually specify contiguous nucleotides in DNA, this domain is being deployed in DNA targeting for applications ranging from understanding gene function in model organisms to improving traits in crop plants to treating genetic disorders in people.

Journal ArticleDOI
TL;DR: This method uses the T4 bacteriophage β-glucosyltransferase to transfer an engineered glucose moiety containing an azide group onto the hydroxyl group of 5-hmC, a recently identified epigenetic modification present in substantial amounts in certain mammalian cell types.
Abstract: In contrast to 5-methylcytosine (5-mC), which has been studied extensively, little is known about 5-hydroxymethylcytosine (5-hmC), a recently identified epigenetic modification present in substantial amounts in certain mammalian cell types. Here we present a method for determining the genome-wide distribution of 5-hmC. We use the T4 bacteriophage β-glucosyltransferase to transfer an engineered glucose moiety containing an azide group onto the hydroxyl group of 5-hmC. The azide group can be chemically modified with biotin for detection, affinity enrichment and sequencing of 5-hmC-containing DNA fragments in mammalian genomes. Using this method, we demonstrate that 5-hmC is present in human cell lines beyond those previously recognized. We also find a gene expression level-dependent enrichment of intragenic 5-hmC in mouse cerebellum and an age-dependent acquisition of this modification in specific gene bodies linked to neurodegenerative disorders.

Journal ArticleDOI
TL;DR: By combining next-generation sequencing and copy number analysis, it is shown that the DLBCL coding genome contains, on average, more than 30 clonally represented gene alterations per case and novel dysregulated pathways underlying its pathogenesis are identified.
Abstract: Diffuse large B-cell lymphoma (DLBCL) is the most common form of human lymphoma. Although a number of structural alterations have been associated with the pathogenesis of this malignancy, the full spectrum of genetic lesions that are present in the DLBCL genome, and therefore the identity of dysregulated cellular pathways, remains unknown. By combining next-generation sequencing and copy number analysis, we show that the DLBCL coding genome contains, on average, more than 30 clonally represented gene alterations per case. This analysis also revealed mutations in genes not previously implicated in DLBCL pathogenesis, including those regulating chromatin methylation (MLL2; 24% of samples) and immune recognition by T cells. These results provide initial data on the complexity of the DLBCL coding genome and identify novel dysregulated pathways underlying its pathogenesis.

01 Jun 2011
TL;DR: This work uses an ultrahigh-density array that tiles the promoters of 56 cell-cycle genes to interrogate 108 samples representing diverse perturbations and identifies 216 transcribed regions that encode putative lncRNAs, many with RT-PCR–validated periodic expression during the cell cycle.
Abstract: Transcription of long noncoding RNAs (lncRNAs) within gene regulatory elements can modulate gene activity in response to external stimuli, but the scope and functions of such activity are not known. Here we use an ultrahigh-density array that tiles the promoters of 56 cell-cycle genes to interrogate 108 samples representing diverse perturbations. We identify 216 transcribed regions that encode putative lncRNAs, many with RT-PCR-validated periodic expression during the cell cycle, show altered expression in human cancers and are regulated in expression by specific oncogenic stimuli, stem cell differentiation or DNA damage. DNA damage induces five lncRNAs from the CDKN1A promoter, and one such lncRNA, named PANDA, is induced in a p53-dependent manner. PANDA interacts with the transcription factor NF-YA to limit expression of pro-apoptotic genes; PANDA depletion markedly sensitized human fibroblasts to apoptosis by doxorubicin. These findings suggest potentially widespread roles for promoter lncRNAs in cell-growth control.

Journal ArticleDOI
TL;DR: This work provides the most comprehensive genetic characterization of a sterol catabolic pathway to date, suggests putative roles for uncharacterized virulence genes, and precisely maps genes encoding potential drug targets.
Abstract: The pathways that comprise cellular metabolism are highly interconnected, and alterations in individual enzymes can have far-reaching effects. As a result, global profiling methods that measure gene expression are of limited value in predicting how the loss of an individual function will affect the cell. In this work, we employed a new method of global phenotypic profiling to directly define the genes required for the growth of Mycobacterium tuberculosis. A combination of high-density mutagenesis and deep-sequencing was used to characterize the composition of complex mutant libraries exposed to different conditions. This allowed the unambiguous identification of the genes that are essential for Mtb to grow in vitro, and proved to be a significant improvement over previous approaches. To further explore functions that are required for persistence in the host, we defined the pathways necessary for the utilization of cholesterol, a critical carbon source during infection. Few of the genes we identified had previously been implicated in this adaptation by transcriptional profiling, and only a fraction were encoded in the chromosomal region known to encode sterol catabolic functions. These genes comprise an unexpectedly large percentage of those previously shown to be required for bacterial growth in mouse tissue. Thus, this single nutritional change accounts for a significant fraction of the adaptation to the host. This work provides the most comprehensive genetic characterization of a sterol catabolic pathway to date, suggests putative roles for uncharacterized virulence genes, and precisely maps genes encoding potential drug targets.

Journal ArticleDOI
TL;DR: This review compares the MYB and bHLH gene families from structural, evolutionary and functional perspectives and suggests that the next few years are likely to witness an increasing understanding of the extent to which conserved transcription factors participate at similar positions in gene regulatory networks across plant species.
Abstract: The expansion of gene families encoding regulatory proteins is typically associated with the increase in complexity characteristic of multi-cellular organisms. The MYB and basic helix-loop-helix (bHLH) families provide excellent examples of how gene duplication and divergence within particular groups of transcription factors are associated with, if not driven by, the morphological and metabolic diversity that characterize the higher plants. These gene families expanded dramatically in higher plants; for example, there are approximately 339 and 162 MYB and bHLH genes, respectively, in Arabidopsis, and approximately 230 and 111, respectively, in rice. In contrast, the Chlamydomonas genome has only 38 MYB genes and eight bHLH genes. In this review, we compare the MYB and bHLH gene families from structural, evolutionary and functional perspectives. The knowledge acquired on the role of many of these factors in Arabidopsis provides an excellent reference to explore sequence-function relationships in crops and other plants. The physical interaction and regulatory synergy between particular sub-classes of MYB and bHLH factors is perhaps one of the best examples of combinatorial plant gene regulation. However, members of the MYB and bHLH families also interact with a number of other regulatory proteins, forming complexes that either activate or repress the expression of sets of target genes that are increasingly being identified through a diversity of high-throughput genomic approaches. The next few years are likely to witness an increasing understanding of the extent to which conserved transcription factors participate at similar positions in gene regulatory networks across plant species.

Journal ArticleDOI
22 Apr 2011-Science
TL;DR: This work established various gene trap cell lines and transgenic cell lines expressing a short-lived luciferase protein from an unstable mRNA, and recorded bioluminescence in real time in single cells, demonstrating that bursting kinetics are highly gene-specific.
Abstract: In prokaryotes and eukaryotes, most genes appear to be transcribed during short periods called transcriptional bursts, interspersed by silent intervals. We describe how such bursts generate gene-specific temporal patterns of messenger RNA (mRNA) synthesis in mammalian cells. To monitor transcription at high temporal resolution, we established various gene trap cell lines and transgenic cell lines expressing a short-lived luciferase protein from an unstable mRNA, and recorded bioluminescence in real time in single cells. Mathematical modeling identified gene-specific on- and off-switching rates in transcriptional activity and mean numbers of mRNAs produced during the bursts. Transcriptional kinetics were markedly altered by cis-regulatory DNA elements. Our analysis demonstrated that bursting kinetics are highly gene-specific, reflecting refractory periods during which genes stay inactive for a certain time before switching on again.
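The on- and off-switching rates and mean burst sizes mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest two-state ("telegraph") promoter model, simulated with the Gillespie algorithm. The function name, the parameter names (`k_on`, `k_off`, `k_syn`, `k_deg`) and the rate values are hypothetical and are not taken from the paper; this simple model also has exponentially distributed off times, so it does not reproduce the refractory periods the authors describe.

```python
import random

def simulate_telegraph(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of a two-state promoter (illustrative only).

    All rates must be positive. Returns (times, mrna_counts, burst_sizes),
    where each burst size is the number of mRNAs made during one ON period.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, False, 0
    times, counts = [0.0], [0]
    burst_sizes, current_burst = [], 0
    while True:
        # Propensities of the reactions available in the current state.
        switch = k_off if gene_on else k_on
        synth = k_syn if gene_on else 0.0
        decay = k_deg * mrna
        total = switch + synth + decay
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t >= t_end:
            break
        r = rng.random() * total
        if r < switch:
            if gene_on:
                # An ON period just ended: record how many mRNAs it produced.
                burst_sizes.append(current_burst)
            current_burst = 0
            gene_on = not gene_on
        elif r < switch + synth:
            mrna += 1
            current_burst += 1
        else:
            mrna -= 1
        times.append(t)
        counts.append(mrna)
    return times, counts, burst_sizes

# Hypothetical rates (arbitrary time units): rare, short, intense bursts.
times, counts, bursts = simulate_telegraph(
    k_on=0.1, k_off=1.0, k_syn=10.0, k_deg=0.5, t_end=20000.0
)
mean_burst = sum(bursts) / len(bursts)
print(f"bursts: {len(bursts)}, mean burst size: {mean_burst:.2f}")
```

In this model the expected number of mRNAs per burst is `k_syn / k_off`, so the simulated mean should come out close to 10 for the rates above.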

Journal ArticleDOI
TL;DR: The 207-Mb genome sequence of the North American Arabidopsis lyrata strain MN47, based on 8.3× dideoxy sequence coverage, is reported, indicating pervasive selection for a smaller genome in this outcrossing species.
Abstract: We present the 207 Mb genome sequence of the outcrosser Arabidopsis lyrata, which diverged from the self-fertilizing species A. thaliana about 10 million years ago. It is generally assumed that the much smaller A. thaliana genome, which is only 125 Mb, constitutes the derived state for the family. Apparent genome reduction in this genus can be partially attributed to the loss of DNA from large-scale rearrangements, but the main cause lies in the hundreds of thousands of small deletions found throughout the genome. These occurred primarily in non-coding DNA and transposons, but protein-coding multi-gene families are smaller in A. thaliana as well. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome.

Book
01 Jan 2011
TL;DR: Achilles cleavage actin and actin homologs AIDS AIDS HIV enzymes Alzheimer's disease amino acid synthesis annexins antibody molecules antisense oligonucleotides arabidopsis genome autoantibodies and autoimmunity automation in genome research.
Abstract: Achilles cleavage actin and actin homologs AIDS AIDS HIV enzymes Alzheimer's disease amino acid synthesis annexins antibody molecules antisense oligonucleotides arabidopsis genome autoantibodies and autoimmunity automation in genome research bacterial growth and division bacterial pathogenesis bacteriorhodopsin biochemical genetics biodegradation of organic wastes bioelectronics bioenergetics of the cell bioinorganic chemistry biomaterials for organ regeneration biomolecular electronics and applications bioorganic chemistry bioprocess engineering bioreactor transport processes biosensors biotechnology breast cancer calcium biochemistry/cancer carbohydrate analysis carbohydrate antigens cardiovascular diseases cell-cell interactions cell death and ageing chaperones chemiluminescence and bioluminescence repressor-operator recognition restriction endonucleases and methyltransferases for the modification of DNA restriction landmark genomic scanning method retinoblastoma retinoids ribosome preparations and protein synthesis techniques ribozyme chemistry RNA scanning tunnelling microscopy in sequencing of DNA sequence alignment of proteins and nucleic acids sequence analysis sequence divergence estimation steroid hormones and receptors superantigens synthetic peptide libraries theoretical molecular biology transgenic animal patents transgenic fish/transgenic mammals translation of RNA protein transport proteins transposons in the human genome triple-helix forming oligonucleotides tumour suppressor genes ultraviolet radiation damage to DNA vaccine biotechnology viral envelope assembly and budding viruses vitamins X-ray diffraction of biomolecules yeast artificial chromosomes techniques yeast genetics zinc finger DNA binding motifs. (Part contents).

Journal ArticleDOI
19 May 2011-Nature
TL;DR: The results indicate that 5hmC has a probable role in transcriptional regulation, and suggest a model in which 5hmC contributes to the ‘poised’ chromatin signature found at developmentally-regulated genes in ES cells.
Abstract: 5-hydroxymethylcytosine (5hmC) is a modified base present at low levels in diverse cell types in mammals. 5hmC is generated by the TET family of Fe(II) and 2-oxoglutarate-dependent enzymes through oxidation of 5-methylcytosine (5mC). 5hmC and TET proteins have been implicated in stem cell biology and cancer, but information on the genome-wide distribution of 5hmC is limited. Here we describe two novel and specific approaches to profile the genomic localization of 5hmC. The first approach, termed GLIB (glucosylation, periodate oxidation, biotinylation) uses a combination of enzymatic and chemical steps to isolate DNA fragments containing as few as a single 5hmC. The second approach involves conversion of 5hmC to cytosine 5-methylenesulphonate (CMS) by treatment of genomic DNA with sodium bisulphite, followed by immunoprecipitation of CMS-containing DNA with a specific antiserum to CMS. High-throughput sequencing of 5hmC-containing DNA from mouse embryonic stem (ES) cells showed strong enrichment within exons and near transcriptional start sites. 5hmC was especially enriched at the start sites of genes whose promoters bear dual histone 3 lysine 27 trimethylation (H3K27me3) and histone 3 lysine 4 trimethylation (H3K4me3) marks. Our results indicate that 5hmC has a probable role in transcriptional regulation, and suggest a model in which 5hmC contributes to the 'poised' chromatin signature found at developmentally-regulated genes in ES cells.

Journal ArticleDOI
TL;DR: A strong genetic component to inter-individual variation in DNA methylation profiles is demonstrated, and there was an enrichment of SNPs that affect both methylation and gene expression, providing evidence for shared mechanisms in a fraction of genes.
Abstract: DNA methylation is an essential epigenetic mechanism involved in gene regulation and disease, but little is known about the mechanisms underlying inter-individual variation in methylation profiles. Here we measured methylation levels at 22,290 CpG dinucleotides in lymphoblastoid cell lines from 77 HapMap Yoruba individuals, for which genome-wide gene expression and genotype data were also available. Association analyses of methylation levels with more than three million common single nucleotide polymorphisms (SNPs) identified 180 CpG-sites in 173 genes that were associated with nearby SNPs (putatively in cis, usually within 5 kb) at a false discovery rate of 10%. The most intriguing trans signal was obtained for SNP rs10876043 in the disco-interacting protein 2 homolog B gene (DIP2B, previously postulated to play a role in DNA methylation), that had a genome-wide significant association with the first principal component of patterns of methylation; however, we found only modest signal of trans-acting associations overall. As expected, we found significant negative correlations between promoter methylation and gene expression levels measured by RNA-sequencing across genes. Finally, there was a significant overlap of SNPs that were associated with both methylation and gene expression levels. Our results demonstrate a strong genetic component to inter-individual variation in DNA methylation profiles. Furthermore, there was an enrichment of SNPs that affect both methylation and gene expression, providing evidence for shared mechanisms in a fraction of genes.
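The abstract reports associations "at a false discovery rate of 10%" without saying how the rate was controlled. As a hedged illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice for this kind of thresholding; the function name and the example p-values are hypothetical and do not come from the study.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return a list of booleans marking which tests are called significant.

    Assumes the Benjamini-Hochberg step-up procedure (an illustrative
    choice, not necessarily the method used in the paper).
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most fdr * k / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is called significant.
    significant = [False] * m
    for i in order[:cutoff_rank]:
        significant[i] = True
    return significant

# Made-up p-values for ten hypothetical SNP-CpG association tests.
example_p = [0.5, 0.001, 0.041, 0.9, 0.008, 0.039, 0.7, 0.042, 0.09, 0.2]
calls = benjamini_hochberg(example_p, fdr=0.10)
print(sum(calls), "of", len(example_p), "tests pass at FDR 10%")
```

Because the procedure is step-up, the tests with p = 0.039 and p = 0.041 are called significant even though each exceeds its own rank threshold: the test ranked fifth (p = 0.042) still falls under its threshold of 0.05, and that sets the cutoff for everything ranked below it.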

Journal ArticleDOI
TL;DR: A draft genomic sequence of the CHO-K1 ancestral cell line is presented and it is discussed how the availability of this genome sequence may facilitate genome-scale science for the optimization of biopharmaceutical protein production.
Abstract: Chinese hamster ovary (CHO)-derived cell lines are the preferred host cells for the production of therapeutic proteins. Here we present a draft genomic sequence of the CHO-K1 ancestral cell line. The assembly comprises 2.45 Gb of genomic sequence, with 24,383 predicted genes. We associate most of the assembled scaffolds with 21 chromosomes isolated by microfluidics to identify chromosomal locations of genes. Furthermore, we investigate genes involved in glycosylation, which affect therapeutic protein quality, and viral susceptibility genes, which are relevant to cell engineering and regulatory concerns. Homologs of most human glycosylation-associated genes are present in the CHO-K1 genome, although 141 of these homologs are not expressed under exponential growth conditions. Many important viral entry genes are also present in the genome but not expressed, which may explain the unusual viral resistance property of CHO cell lines. We discuss how the availability of this genome sequence may facilitate genome-scale science for the optimization of biopharmaceutical protein production.

Journal ArticleDOI
Jo Ann Banks1, Tomoaki Nishiyama2, Mitsuyasu Hasebe3, Mitsuyasu Hasebe4, John L. Bowman5, John L. Bowman6, Michael Gribskov1, Claude W. dePamphilis7, Victor A. Albert8, Naoki Aono4, Tsuyoshi Aoyama3, Tsuyoshi Aoyama4, Barbara A. Ambrose9, Neil W. Ashton10, Michael J. Axtell7, Elizabeth I. Barker10, Michael S. Barker11, Jeffrey L. Bennetzen12, Nicholas D. Bonawitz1, Clint Chapple1, Chaoyang Cheng, Luiz Gustavo Guedes Corrêa13, Michael Dacre14, Jeremy D. DeBarry12, Ingo Dreyer13, Marek Eliáš15, Eric M. Engstrom16, Mark Estelle17, Liang Feng12, Cédric Finet18, Sandra K. Floyd6, Wolf B. Frommer19, Tomomichi Fujita20, Lydia Gramzow21, Michael Gutensohn22, Michael Gutensohn1, Jesper Harholt23, Mitsuru Hattori24, Mitsuru Hattori25, Alexander Heyl26, Tadayoshi Hirai27, Yuji Hiwatashi3, Yuji Hiwatashi4, Masaki Ishikawa, Mineko Iwata, Kenneth G. Karol9, Barbara Koehler13, Uener Kolukisaoglu28, Uener Kolukisaoglu29, Minoru Kubo, Tetsuya Kurata30, Sylvie Lalonde19, Kejie Li1, Ying Li31, Ying Li1, Amy Litt9, Eric Lyons32, Gerard Manning14, Takeshi Maruyama20, Todd P. Michael33, Koji Mikami20, Saori Miyazaki4, Saori Miyazaki34, Shin-Ichi Morinaga24, Shin-Ichi Morinaga4, Takashi Murata3, Takashi Murata4, Bernd Mueller-Roeber35, David R. Nelson36, Mari Obara, Yasuko Oguri, Richard G. Olmstead37, Naoko T. Onodera38, Bent O. Petersen23, Birgit Pils39, Michael J. Prigge17, Stefan A. Rensing40, Diego Mauricio Riaño-Pachón35, Diego Mauricio Riaño-Pachón41, Alison W. Roberts42, Yoshikatsu Sato, Henrik Vibe Scheller43, Henrik Vibe Scheller32, Burkhard Schulz1, Christian Schulz44, Eugene V. Shakirov45, Nakako Shibagaki46, Naoki Shinohara20, Dorothy E. Shippen45, Iben Sørensen47, Iben Sørensen23, Ryo Sotooka20, Nagisa Sugimoto, Mamoru Sugita25, Naomi Sumikawa4, Milos Tanurdzic48, Günter Theißen21, Peter Ulvskov23, Sachiko Wakazuki, Jing-Ke Weng14, Jing-Ke Weng1, William G.T. Willats23, Daniel Wipf49, Paul G. Wolf50, Lixing Yang12, Andreas Zimmer40, Qihui Zhu12, Therese Mitros32, Uffe Hellsten51, Dominique Loqué43, Robert Otillar51, Asaf Salamov51, Jeremy Schmutz51, Harris Shapiro51, Erika Lindquist51, Susan Lucas51, Daniel S. Rokhsar32, Daniel S. Rokhsar51, Igor V. Grigoriev51
20 May 2011-Science
TL;DR: The genome sequence of the lycophyte Selaginella moellendorffii (Selaginella), the first nonseed vascular plant genome reported, shows that the transition from a gametophyte- to a sporophyte-dominated life cycle required far fewer new genes than the transition from a nonseed vascular to a flowering plant.
Abstract: Vascular plants appeared ~410 million years ago, then diverged into several lineages of which only two survive: the euphyllophytes (ferns and seed plants) and the lycophytes. We report here the genome sequence of the lycophyte Selaginella moellendorffii (Selaginella), the first nonseed vascular plant genome reported. By comparing gene content in evolutionarily diverse taxa, we found that the transition from a gametophyte- to a sporophyte-dominated life cycle required far fewer new genes than the transition from a nonseed vascular to a flowering plant, whereas secondary metabolic genes expanded extensively and in parallel in the lycophyte and angiosperm lineages. Selaginella differs in posttranscriptional gene regulation, including small RNA regulation of repetitive elements, an absence of the trans-acting small interfering RNA pathway, and extensive RNA editing of organellar genes.

Journal ArticleDOI
01 Dec 2011-Nature
TL;DR: In a comparison of genome-wide DNA methylation among 10 A. thaliana lines, differentially methylated sites were farther from transposable elements and showed less association with short interfering RNA expression than invariant positions; the biased distribution and frequent reversion of epimutations have important implications for the potential contribution of sequence-independent epialleles to plant evolution.
Abstract: Heritable epigenetic polymorphisms, such as differential cytosine methylation, can underlie phenotypic variation. Moreover, wild strains of the plant Arabidopsis thaliana differ in many epialleles, and these can influence the expression of nearby genes. However, to understand their role in evolution, it is imperative to ascertain the emergence rate and stability of epialleles, including those that are not due to structural variation. We have compared genome-wide DNA methylation among 10 A. thaliana lines, derived 30 generations ago from a common ancestor. Epimutations at individual positions were easily detected, and close to 30,000 cytosines in each strain were differentially methylated. In contrast, larger regions of contiguous methylation were much more stable, and the frequency of changes was in the same low range as that of DNA mutations. Like individual positions, the same regions were often affected by differential methylation in independent lines, with evidence for recurrent cycles of forward and reverse mutations. Transposable elements and short interfering RNAs have been causally linked to DNA methylation. In agreement, differentially methylated sites were farther from transposable elements and showed less association with short interfering RNA expression than invariant positions. The biased distribution and frequent reversion of epimutations have important implications for the potential contribution of sequence-independent epialleles to plant evolution.

Journal ArticleDOI
22 Sep 2011-Nature
TL;DR: Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0.
Abstract: Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.

Journal ArticleDOI
09 Jun 2011-Blood
TL;DR: Using chromatin immunoprecipitation linked to high throughput sequencing, HIF-binding sites across the genome are identified, indicating that these sites operate over long genomic intervals, and epigenetic regulation of chromatin may have an important role in defining the response to hypoxia.

Journal ArticleDOI
TL;DR: It is proposed that the universal bias in gene loss between the genomes of this ancient tetraploid, and perhaps all tetraploids, is the result of selection against loss of the gene responsible for the majority of total expression for a duplicate gene pair.
Abstract: Ancient tetraploidies are found throughout the eukaryotes. After duplication, one copy of each duplicate gene pair tends to be lost (fractionate). For all studied tetraploidies, the loss of duplicated genes, known as homeologs, homoeologs, ohnologs, or syntenic paralogs, is uneven between duplicate regions. In maize, a species that experienced a tetraploidy 5–12 million years ago, we show that in addition to uneven ancient gene loss, the two complete genomes contained within maize are differentiated by ongoing fractionation among diverse inbreds as well as by a pattern of overexpression of genes from the genome that has experienced less gene loss. These expression differences are consistent over a range of experiments quantifying RNA abundance in different tissues. We propose that the universal bias in gene loss between the genomes of this ancient tetraploid, and perhaps all tetraploids, is the result of selection against loss of the gene responsible for the majority of total expression for a duplicate gene pair. Although the tetraploidy of maize is ancient, biased gene loss and expression continue today and explain, at least in part, the remarkable genetic diversity found among modern maize cultivars.