Showing papers on "Intron published in 2013"


Journal ArticleDOI
01 Feb 2013-RNA
TL;DR: High-throughput sequencing of libraries prepared from ribosome-depleted RNA with or without digestion with the RNA exonuclease showed that ecircRNAs are abundant, stable, conserved and nonrandom products of RNA splicing that could be involved in control of gene expression.
Abstract: Circular RNAs composed of exonic sequence have been described in a small number of genes. Thought to result from splicing errors, circular RNA species possess no known function. To delineate the universe of endogenous circular RNAs, we performed high-throughput sequencing (RNA-seq) of libraries prepared from ribosome-depleted RNA with or without digestion with the RNA exonuclease, RNase R. We identified >25,000 distinct RNA species in human fibroblasts that contained non-colinear exons (a "backsplice") and were reproducibly enriched by exonuclease degradation of linear RNA. These RNAs were validated as circular RNA (ecircRNA), rather than linear RNA, and were more stable than associated linear mRNAs in vivo. In some cases, the abundance of circular molecules exceeded that of associated linear mRNA by >10-fold. By conservative estimate, we identified ecircRNAs from 14.4% of actively transcribed genes in human fibroblasts. Application of this method to murine testis RNA identified 69 ecircRNAs in precisely orthologous locations to human circular RNAs. Of note, paralogous kinases HIPK2 and HIPK3 produce abundant ecircRNA from their second exon in both humans and mice. Though HIPK3 circular RNAs contain an AUG translation start, it and other ecircRNAs were not bound to ribosomes. Circular RNAs could be degraded by siRNAs and, therefore, may act as competing endogenous RNAs. Bioinformatic analysis revealed shared features of circularized exons, including long bordering introns that contained complementary ALU repeats. These data show that ecircRNAs are abundant, stable, conserved and nonrandom products of RNA splicing that could be involved in control of gene expression.

3,310 citations
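
The "backsplice" criterion in the abstract above (exons joined in non-colinear order) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Segment` type, the `is_backsplice` function and the coordinates are hypothetical, and the sketch assumes a read has already been split into two aligned segments listed in transcript (5' to 3') order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a split read (hypothetical structure)."""
    chrom: str
    strand: str  # "+" or "-"
    start: int   # 0-based genomic start
    end: int     # exclusive genomic end


def is_backsplice(first: Segment, second: Segment) -> bool:
    """Return True if two segments, given in transcript order, are non-colinear.

    In a linear transcript the second segment lies downstream of the first.
    For a backsplice the order is reversed: on the "+" strand the second
    segment maps at lower coordinates, on the "-" strand at higher ones.
    """
    if first.chrom != second.chrom or first.strand != second.strand:
        return False  # different chromosome or strand: not a simple backsplice
    if first.strand == "+":
        return second.end <= first.start
    return second.start >= first.end


# Illustrative coordinates only (assumed, not taken from the paper).
linear = is_backsplice(Segment("chr1", "+", 1000, 1050),
                       Segment("chr1", "+", 5000, 5050))
circular = is_backsplice(Segment("chr1", "+", 5000, 5050),
                         Segment("chr1", "+", 1000, 1050))
```

Here `linear` is `False` and `circular` is `True`. A real analysis would also need to check that the junction sits at annotated splice sites and, as the abstract describes, that the candidate is enriched after RNase R digestion.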


Journal ArticleDOI
TL;DR: Using an improved computational approach for circular RNA identification, widespread circular RNA expression is found in Drosophila melanogaster and it is estimated that in humans, circular RNA may account for 1% as many molecules as poly(A) RNA.
Abstract: Thousands of loci in the human and mouse genomes give rise to circular RNA transcripts; at many of these loci, the predominant RNA isoform is a circle. Using an improved computational approach for circular RNA identification, we found widespread circular RNA expression in Drosophila melanogaster and estimate that in humans, circular RNA may account for 1% as many molecules as poly(A) RNA. Analysis of data from the ENCODE consortium revealed that the repertoire of genes expressing circular RNA, the ratio of circular to linear transcripts for each gene, and even the pattern of splice isoforms of circular RNAs from each gene were cell-type specific. These results suggest that biogenesis of circular RNA is an integral, conserved, and regulated feature of the gene expression program.

1,567 citations


Journal ArticleDOI
TL;DR: RNA from leukocyte-depleted platelets is deep-sequenced to capture the complex profile of all expressed transcripts, revealing diverse classes of non-coding RNAs, including: pervasive antisense transcripts to protein-coding loci; numerous, previously unreported and abundant microRNAs; retrotransposons; and thousands of novel un-annotated long and short intronic transcripts, an intriguing finding considering the anucleate nature of platelets.
Abstract: Human blood platelets are essential to maintaining normal hemostasis, and platelet dysfunction often causes bleeding or thrombosis. Estimates of genome-wide platelet RNA expression using microarrays have provided insights to the platelet transcriptome but were limited by the number of known transcripts. The goal of this effort was to deep-sequence RNA from leukocyte-depleted platelets to capture the complex profile of all expressed transcripts. From each of four healthy individuals we generated long RNA (≥40 nucleotides) profiles from total and ribosomal-RNA depleted RNA preparations, as well as short RNA (<40 nucleotides) profiles. Analysis of ~1 billion reads revealed that coding and non-coding platelet transcripts span a very wide dynamic range (≥16 PCR cycles beyond β-actin), a result we validated through qRT-PCR on many dozens of platelet messenger RNAs. Surprisingly, ribosomal-RNA depletion significantly and adversely affected estimates of the relative abundance of transcripts. Of the known protein-coding loci, ~9,500 are present in human platelets. We observed a strong correlation between mRNAs identified by RNA-seq and microarray for well-expressed mRNAs, but RNASeq identified many more transcripts of lower abundance and permitted discovery of novel transcripts. Our analyses revealed diverse classes of non-coding RNAs, including: pervasive antisense transcripts to protein-coding loci; numerous, previously unreported and abundant microRNAs; retrotransposons; and thousands of novel un-annotated long and short intronic transcripts, an intriguing finding considering the anucleate nature of platelets. The data are available through a local mirror of the UCSC genome browser and can be accessed at: http://cm.jefferson.edu/platelets_2012/ .

562 citations


Journal ArticleDOI
TL;DR: The results show the feasibility of deep sequencing full-length RNA from complex eukaryotic transcriptomes on a single-molecule level and high-confidence mappings are consistent with GENCODE annotations.
Abstract: Global RNA studies have become central to understanding biological processes, but methods such as microarrays and short-read sequencing are unable to describe an entire RNA molecule from 5' to 3' end. Here we use single-molecule long-read sequencing technology from Pacific Biosciences to sequence the polyadenylated RNA complement of a pooled set of 20 human organs and tissues without the need for fragmentation or amplification. We show that full-length RNA molecules of up to 1.5 kb can readily be monitored with little sequence loss at the 5' ends. For longer RNA molecules more 5' nucleotides are missing, but complete intron structures are often preserved. In total, we identify ∼14,000 spliced GENCODE genes. High-confidence mappings are consistent with GENCODE annotations, but >10% of the alignments represent intron structures that were not previously annotated. As a group, transcripts mapping to unannotated regions have features of long, noncoding RNAs. Our results show the feasibility of deep sequencing full-length RNA from complex eukaryotic transcriptomes on a single-molecule level.

540 citations


Journal ArticleDOI
TL;DR: The data show that the GGGGCC repeat is bidirectionally translated into five distinct DPR proteins that co-aggregate in the characteristic p62-positive TDP-43 negative inclusions found in FTLD/ALS cases with C9orf72 repeat expansion.
Abstract: Massive GGGGCC repeat expansion in the first intron of the gene C9orf72 is the most common known cause of familial frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). Despite its intronic localization and lack of an ATG start codon, the repeat region is translated in all three reading frames into aggregating dipeptide-repeat (DPR) proteins, poly-(Gly-Ala), poly-(Gly-Pro) and poly-(Gly-Arg). We took an antibody-based approach to further validate the translation of DPR proteins. To test whether the antisense repeat RNA transcript is also translated, we raised antibodies against the predicted products, poly-(Ala-Pro) and poly-(Pro-Arg). Both antibodies stained p62-positive neuronal cytoplasmic inclusions throughout the cerebellum and hippocampus indicating that not only sense but also antisense strand repeats are translated into DPR proteins in the absence of ATG start codons. Protein products of both strands co-aggregate suggesting concurrent translation of both strands. Moreover, an antibody targeting the putative carboxyl terminus of DPR proteins can detect inclusion pathology in C9orf72 repeat expansion carriers suggesting that the non-ATG translation continues through the entire repeat and beyond. A highly sensitive monoclonal antibody against poly-(Gly-Arg), visualized abundant inclusion pathology in all cortical regions and some inclusions also in motoneurons. Together, our data show that the GGGGCC repeat is bidirectionally translated into five distinct DPR proteins that co-aggregate in the characteristic p62-positive TDP-43 negative inclusions found in FTLD/ALS cases with C9orf72 repeat expansion. Novel monoclonal antibodies against poly-(Gly-Arg) will facilitate pathological diagnosis of C9orf72 FTLD/ALS.

417 citations


Journal ArticleDOI
01 Aug 2013-Cell
TL;DR: Bioinformatic analyses of transcriptomic and proteomic data of normal white blood cell differentiation reveal IR as a physiological mechanism of gene expression control and establish that IR coupled with NMD is a conserved mechanism in normal granulopoiesis.

411 citations


Journal ArticleDOI
Nian Liu, Marc Parisien, Qing Dai, Guanqun Zheng, Chuan He, Tao Pan
01 Dec 2013-RNA
TL;DR: This work develops a method that accurately determines m(6)A status at any site in mRNA/lncRNA, termed site-specific cleavage and radioactive-labeling followed by ligation-assisted extraction and thin-layer chromatography (SCARLET). The method determines the precise location of the m(6)A residue and its modification fraction, which are crucial parameters in probing the cellular dynamics of m(6)A modification.
Abstract: N(6)-methyladenosine (m(6)A) is the most abundant modification in mammalian mRNA and long noncoding RNA (lncRNA). Recent discoveries of two m(6)A demethylases and cell-type and cell-state-dependent m(6)A patterns indicate that m(6)A modifications are highly dynamic and likely play important biological roles for RNA akin to DNA methylation or histone modification. Proposed functions for m(6)A modification include mRNA splicing, export, stability, and immune tolerance; but m(6)A studies have been hindered by the lack of methods for its identification at single nucleotide resolution. Here, we develop a method that accurately determines m(6)A status at any site in mRNA/lncRNA, termed site-specific cleavage and radioactive-labeling followed by ligation-assisted extraction and thin-layer chromatography (SCARLET). The method determines the precise location of the m(6)A residue and its modification fraction, which are crucial parameters in probing the cellular dynamics of m(6)A modification. We applied the method to determine the m(6)A status at several sites in two human lncRNAs and three human mRNAs and found that m(6)A fraction varies between 6% and 80% among these sites. We also found that many m(6)A candidate sites in these RNAs are however not modified. The precise determination of m(6)A status in a long noncoding RNA also enables the identification of an m(6)A-containing RNA structural motif.

395 citations


Journal ArticleDOI
14 Mar 2013-Cell
TL;DR: This extensive crosstalk between gene regulatory layers takes advantage of dynamic spatial, physical, and temporal organizational properties of the cell nucleus, and further emphasizes the importance of developing a multidimensional understanding of splicing control.

390 citations


Journal ArticleDOI
TL;DR: Quantitative analysis of APA isoforms indicated that promoter-distal pAs, regardless of intron or exon locations, become more abundant during embryonic development and cell differentiation and that upregulated isoforms have stronger pA, suggesting global modulation of the 3′ end–processing activity in development and differentiation.
Abstract: Alternative cleavage and polyadenylation (APA) generates diverse mRNA isoforms. We developed 3' region extraction and deep sequencing (3'READS) to address mispriming issues that commonly plague poly(A) site (pA) identification, and we used the method to comprehensively map pAs in the mouse genome. Thorough annotation of gene 3' ends revealed over 5,000 previously overlooked pAs (∼8% of total) flanked by A-rich sequences, underscoring the necessity of using an accurate tool for pA mapping. About 79% of mRNA genes and 66% of long noncoding RNA genes undergo APA, but these two gene types have distinct usage patterns for pAs in introns and upstream exons. Quantitative analysis of APA isoforms by 3'READS indicated that promoter-distal pAs, regardless of intron or exon locations, become more abundant during embryonic development and cell differentiation and that upregulated isoforms have stronger pAs, suggesting global modulation of the 3' end-processing activity in development and differentiation.

383 citations


Journal ArticleDOI
TL;DR: It is shown that RNA editing sites can be called with high confidence using RNA sequencing data from multiple samples across either individuals or species, without the need for matched genomic DNA sequence.
Abstract: We show that RNA editing sites can be called with high confidence using RNA sequencing data from multiple samples across either individuals or species, without the need for matched genomic DNA sequence. We identified many previously unidentified editing sites in both humans and Drosophila; our results nearly double the known number of human protein recoding events. We also found that human genes harboring conserved editing sites within Alu repeats are enriched for neuronal functions.

335 citations


Journal ArticleDOI
TL;DR: It is shown here that binding in distal intronic regions by Rbfox splicing factors important in development is extensive and is an active mode of splicing regulation.
Abstract: Alternative splicing (AS) enables programmed diversity of gene expression across tissues and development. We show here that binding in distal intronic regions (>500 nucleotides (nt) from any exon) by Rbfox splicing factors important in development is extensive and is an active mode of splicing regulation. Similarly to exon-proximal sites, distal sites contain evolutionarily conserved GCATG sequences and are associated with AS activation and repression upon modulation of Rbfox abundance in human and mouse experimental systems. As a proof of principle, we validated the activity of two specific Rbfox enhancers in KIF21A and ENAH distal introns and showed that a conserved long-range RNA-RNA base-pairing interaction (an RNA bridge) is necessary for Rbfox-mediated exon inclusion in the ENAH gene. Thus we demonstrate a previously unknown RNA-mediated mechanism for AS control by distally bound RNA-binding proteins.
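
The notion of a "distal" GCATG site (more than 500 nt from any exon) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' method: the function name `distal_motif_sites` is hypothetical, the input is assumed to be a single intron sequence in DNA alphabet, and only the two flanking exons are considered when measuring distance.

```python
def distal_motif_sites(intron_seq: str, motif: str = "GCATG",
                       min_dist: int = 500) -> list[int]:
    """Return 0-based start positions of motif hits deep inside an intron.

    A hit counts as distal when more than `min_dist` nucleotides separate it
    from both intron ends, i.e. from the upstream and downstream exons.
    """
    seq = intron_seq.upper()
    sites = []
    pos = seq.find(motif)
    while pos != -1:
        dist_upstream = pos                            # nt before the motif
        dist_downstream = len(seq) - (pos + len(motif))  # nt after the motif
        if dist_upstream > min_dist and dist_downstream > min_dist:
            sites.append(pos)
        pos = seq.find(motif, pos + 1)  # continue, allowing overlapping hits
    return sites


# Synthetic sequences for illustration only.
deep = "A" * 600 + "GCATG" + "A" * 600
near_exon = "A" * 10 + "GCATG" + "A" * 2000
deep_hits = distal_motif_sites(deep)
near_hits = distal_motif_sites(near_exon)
```

With these synthetic inputs, `deep_hits` contains the single position 600 and `near_hits` is empty, because the second motif lies only 10 nt from the upstream intron boundary. Conservation filtering and binding evidence, which the abstract relies on, are outside the scope of this sketch.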

Journal ArticleDOI
TL;DR: ASprofile identifies putative alternative splicing events in 16 different human tissues, providing a dynamic picture of splicing variation across the tissues, and detects 26,989 potential exon skipping events representing differences in splicing patterns among the tissues.
Abstract: Alternative splicing is widely recognized for its roles in regulating genes and creating gene diversity. However, despite many efforts, the repertoire of gene splicing variation is still incompletely characterized, even in humans. Here we describe a new computational system, ASprofile, and its application to RNA-seq data from Illumina’s Human Body Map project (>2.5 billion reads). Using the system, we identified putative alternative splicing events in 16 different human tissues, which provide a dynamic picture of splicing variation across the tissues. We detected 26,989 potential exon skipping events representing differences in splicing patterns among the tissues. A large proportion of the events (>60%) were novel, involving new exons (~3000), new introns (~16000), or both. When tracing these events across the sixteen tissues, only a small number (4-7%) appeared to be differentially expressed (‘switched’) between two tissues, while 30-45% showed little variation, and the remaining 50-65% were not present in one or both tissues compared. Novel exon skipping events appeared to be slightly less variable than known events, but were more tissue-specific. Our study represents the first effort to build a comprehensive catalog of alternative splicing in normal human tissues from RNA-seq data, while providing insights into the role of alternative splicing in shaping tissue transcriptome differences. The catalog of events and the ASprofile software are freely available from the Zenodo repository ( http://zenodo.org/record/7068 ; doi: 10.5281/zenodo.7068 ) and from our web site http://ccb.jhu.edu/software/ASprofile .

Journal ArticleDOI
TL;DR: RNA editing alters the identity of nucleotides in RNA molecules such that the information for a protein in the mRNA differs from the prediction of the genomic DNA; in the organelles of flowering plants, each editing site is addressed by a specific protein, and the genes for these proteins are amplified in plant species with organellar RNA editing.
Abstract: RNA editing alters the identity of nucleotides in RNA molecules such that the information for a protein in the mRNA differs from the prediction of the genomic DNA. In chloroplasts and mitochondria of flowering plants, RNA editing changes C nucleotides to U nucleotides; in ferns and mosses, it also changes U to C. The approximately 500 editing sites in mitochondria and 40 editing sites in plastids of flowering plants are individually addressed by specific proteins, genes for which are amplified in plant species with organellar RNA editing. These proteins contain repeat elements that bind to cognate RNA sequence motifs just 5′ to the edited nucleotide. In flowering plants, the site-specific proteins interact selectively with individual members of a different, smaller family of proteins. These latter proteins may be connectors between the site-specific proteins and the as yet unknown deaminating enzymatic activity.

Journal ArticleDOI
14 Nov 2013-Nature
TL;DR: It is shown that the U6 snRNA catalyses both of the two splicing reactions by positioning divalent metals that stabilize the leaving groups during each reaction, indicating that RNA mediates catalysis within the spliceosome.
Abstract: In nuclear pre-messenger RNA splicing, introns are excised by the spliceosome, a dynamic machine composed of both proteins and small nuclear RNAs (snRNAs). Over thirty years ago, after the discovery of self-splicing group II intron RNAs, the snRNAs were proposed to catalyse splicing. However, no definitive evidence for a role of either RNA or protein in catalysis by the spliceosome has been reported so far. By using metal rescue strategies in spliceosomes from budding yeast, here we show that the U6 snRNA catalyses both of the two splicing reactions by positioning divalent metals that stabilize the leaving groups during each reaction. Notably, all of the U6 catalytic metal ligands we identified correspond to the ligands observed to position catalytic, divalent metals in crystal structures of a group II intron RNA. These findings indicate that group II introns and the spliceosome share common catalytic mechanisms and probably common evolutionary origins. Our results demonstrate that RNA mediates catalysis within the spliceosome.

Journal ArticleDOI
TL;DR: It is found that these SR proteins promote both inclusion and skipping of exons in vivo, but their binding patterns do not explain such opposite responses, and specific effects on regulated splicing by one SR protein actually depend on a complex set of relationships with multiple other SR proteins in mammalian genomes.

Journal ArticleDOI
27 Oct 2013-Nature
TL;DR: The molecular basis for the specific and modular recognition of RNA bases A, G and U is revealed and provides an important framework for potential biotechnological applications of PPR proteins in RNA-related research areas.
Abstract: Pentatricopeptide repeat (PPR) proteins represent a large family of sequence-specific RNA-binding proteins that are involved in multiple aspects of RNA metabolism. PPR proteins, which are found in exceptionally large numbers in the mitochondria and chloroplasts of terrestrial plants(1-5), recognize single-stranded RNA (ssRNA) in a modular fashion(6-8). The maize chloroplast protein PPR10 binds to two similar RNA sequences from the ATPI-ATPH and PSAJ-RPL33 intergenic regions, referred to as ATPH and PSAJ, respectively(9,10). By protecting the target RNA elements from 5' or 3' exonucleases, PPR10 defines the corresponding 5' and 3' messenger RNA termini(9-11). Despite rigorous functional characterizations, the structural basis of sequence-specific ssRNA recognition by PPR proteins remains to be elucidated. Here we report the crystal structures of PPR10 in RNA-free and RNA-bound states at resolutions of 2.85 and 2.45 angstrom, respectively. In the absence of RNA binding, the nineteen repeats of PPR10 are assembled into a right-handed superhelical spiral. PPR10 forms an antiparallel, intertwined homodimer and exhibits considerable conformational changes upon binding to its target ssRNA, an 18-nucleotide PSAJ element. Six nucleotides of PSAJ are specifically recognized by six corresponding PPR10 repeats following the predicted code. The molecular basis for the specific and modular recognition of RNA bases A, G and U is revealed. The structural elucidation of RNA recognition by PPR proteins provides an important framework for potential biotechnological applications of PPR proteins in RNA-related research areas.

Journal ArticleDOI
TL;DR: This work solved the solution structure of the TDP-43 RRMs in complex with UG-rich RNA and revealed not only how TDP-43 recognizes UG repeats but also how RNA binding-dependent inter-RRM interactions are crucial for TDP-43 function.
Abstract: TDP-43 encodes an alternative-splicing regulator with tandem RNA-recognition motifs (RRMs). The protein regulates cystic fibrosis transmembrane regulator (CFTR) exon 9 splicing through binding to long UG-rich RNA sequences and is found in cytoplasmic inclusions of several neurodegenerative diseases. We solved the solution structure of the TDP-43 RRMs in complex with UG-rich RNA. Ten nucleotides are bound by both RRMs, and six are recognized sequence specifically. Among these, a central G interacts with both RRMs and stabilizes a new tandem RRM arrangement. Mutations that eliminate recognition of this key nucleotide or crucial inter-RRM interactions disrupt RNA binding and TDP-43-dependent splicing regulation. In contrast, point mutations that affect base-specific recognition in either RRM have weaker effects. Our findings reveal not only how TDP-43 recognizes UG repeats but also how RNA binding-dependent inter-RRM interactions are crucial for TDP-43 function.

Journal ArticleDOI
TL;DR: This review discusses how the spliceosome can successfully define exons and introns in a huge variety of pre‐mRNA molecules with nucleotide‐precision through a complex combinatorial control resulting from many different factors/influences.
Abstract: One of the fundamental issues in RNA splicing research is represented by understanding how the spliceosome can successfully define exons and introns in a huge variety of pre-mRNA molecules with nucleotide-precision. Since its first description, researchers in this field have identified and characterized many fundamental elements and players capable of affecting the splicing process, both in a negative and positive manner. Indeed, it can be argued that today we know a great deal about the forces that make an exon, an exon and an intron, an intron. As will be discussed in this review, these decisions are a result of a complex combinatorial control resulting from many different factors/influences. Most importantly, these influences act across several levels of complexity starting from the relatively simple interaction between two consensus 5' and 3' splice sites to much more complex factors: such as the interplay between silencer or enhancer sequences, transcriptional processivity, genomic milieu, nucleosome positioning, and histone modifications at the chromatin level. Depending on local contexts, all these factors will act either antagonistically or synergistically to decide the exon/intron fate of any given RNA sequence. At present, however, what we still lack is a precise understanding of how all these processes add up to help the spliceosome reach a decision. Therefore, it is expected that future challenges in splicing research will be the careful characterization of all these influences to improve our ability to predict splicing choices in different organisms or in specific contexts.

Journal ArticleDOI
11 Jul 2013-Nature
TL;DR: It is shown that knocking out the P. falciparum variant-silencing SET gene (here termed PfSETvs), which encodes an orthologue of Drosophila melanogaster ASH1 and controls histone H3 lysine 36 trimethylation (H3K36me3) on var genes, results in the transcription of virtually all var genes in the single parasite nuclei and their expression as proteins on the surface of individual infected red blood cells.
Abstract: The variant antigen Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), which is expressed on the surface of P. falciparum-infected red blood cells, is a critical virulence factor for malaria. Each parasite has 60 antigenically distinct var genes that each code for a different PfEMP1 protein. During infection the clonal parasite population expresses only one gene at a time before switching to the expression of a new variant antigen as an immune-evasion mechanism to avoid the host antibody response. The mechanism by which 59 of the 60 var genes are silenced remains largely unknown. Here we show that knocking out the P. falciparum variant-silencing SET gene (here termed PfSETvs), which encodes an orthologue of Drosophila melanogaster ASH1 and controls histone H3 lysine 36 trimethylation (H3K36me3) on var genes, results in the transcription of virtually all var genes in the single parasite nuclei and their expression as proteins on the surface of individual infected red blood cells. PfSETvs-dependent H3K36me3 is present along the entire gene body, including the transcription start site, to silence var genes. With low occupancy of PfSETvs at both the transcription start site of var genes and the intronic promoter, expression of var genes coincides with transcription of their corresponding antisense long noncoding RNA. These results uncover a previously unknown role of PfSETvs-dependent H3K36me3 in silencing var genes in P. falciparum that might provide a general mechanism by which orthologues of PfSETvs repress gene expression in other eukaryotes. PfSETvs knockout parasites expressing all PfEMP1 proteins may also be applied to the development of a malaria vaccine.

Journal ArticleDOI
TL;DR: Analysis of the compact, function-rich genome of P. purpureum suggests that ancestral lineages of red algae acted as mediators of horizontal gene transfer between prokaryotes and photosynthetic eukaryotes, thereby significantly enriching genomes across the tree of photosynthetic life.
Abstract: The limited knowledge we have about red algal genomes comes from the highly specialized extremophiles, Cyanidiophyceae. Here, we describe the first genome sequence from a mesophilic, unicellular red alga, Porphyridium purpureum. The 8,355 predicted genes in P. purpureum, hundreds of which are likely to be implicated in a history of horizontal gene transfer, reside in a genome of 19.7 Mbp with 235 spliceosomal introns. Analysis of light-harvesting complex proteins reveals a nuclear-encoded phycobiliprotein in the alga. We uncover a complex set of carbohydrate-active enzymes, identify the genes required for the methylerythritol phosphate pathway of isoprenoid biosynthesis, and find evidence of sexual reproduction. Analysis of the compact, function-rich genome of P. purpureum suggests that ancestral lineages of red algae acted as mediators of horizontal gene transfer between prokaryotes and photosynthetic eukaryotes, thereby significantly enriching genomes across the tree of photosynthetic life.

Journal ArticleDOI
TL;DR: The results confirm that SQSTM1 gene mutations could be the cause or genetic susceptibility factor of ALS in some patients and identify two novel missense mutations in two SALS that were not detected in 360 control subjects.
Abstract: Mutations in SQSTM1 encoding the sequestosome 1/p62 protein have recently been identified in familial and sporadic cases of amyotrophic lateral sclerosis (ALS). p62 is a component of the ubiquitin inclusions detected in degenerating neurons in ALS patients. We sequenced SQSTM1 in 90 French patients with familial ALS (FALS) and 74 autopsied cases with sporadic ALS (SALS). We identified, in the heterozygous state, one missense mutation, c.1175C>T, p.Pro392Leu (exon 8), in one of our FALS and one substitution in intron 7 (c.1165+1G>A, previously called IVS7+1 G-A, A390X) affecting the exon 7 splicing site in one SALS. These mutations, which are located in the ubiquitin-associated domain (UBA domain) of the p62 protein, have already been described in Paget’s disease, and both ALS patients carrying them also had concomitant Paget’s disease. However, we also identified two novel missense mutations in two SALS: c.259A>G, p.Met87Val in exon 2 and c.304A>G, p.Lys102Glu in exon 3. These mutations, which were not detected in 360 control subjects, are possibly pathogenic. Neuropathology analysis of three patients carrying SQSTM1 variants revealed the presence of large round p62 inclusions in motor neurons, and immunoblot analysis showed increased p62 and TDP-43 protein levels in the spinal cord. Our results confirm that SQSTM1 gene mutations could be the cause or genetic susceptibility factor of ALS in some patients.
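
The coding-sequence positions and amino-acid numbers quoted in this abstract are related by simple arithmetic, which the following Python sketch makes explicit. The function names are hypothetical, and the sketch assumes the usual convention that coding position 1 is the A of the ATG start codon, so codon n covers positions 3n-2 to 3n.

```python
def codon_number(cds_pos: int) -> int:
    """Map a 1-based coding-sequence position to its 1-based codon number."""
    return (cds_pos - 1) // 3 + 1


def position_in_codon(cds_pos: int) -> int:
    """Return 1, 2 or 3: which base of the codon the position falls on."""
    return (cds_pos - 1) % 3 + 1


# Variants as written in the abstract: coding position -> reported residue.
reported = {1175: 392, 259: 87, 304: 102}
computed = {pos: codon_number(pos) for pos in reported}
```

The computed codon numbers (392, 87 and 102) agree with the residues reported for p.Pro392Leu, p.Met87Val and p.Lys102Glu. The position within the codon is also consistent with the genetic code: c.1175C>T hits the second base (CCN to CTN, Pro to Leu), while c.259A>G and c.304A>G hit the first base (ATG to GTG, Met to Val; AAR to GAR, Lys to Glu).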

Journal ArticleDOI
TL;DR: This study highlights the robustness of splicing patterns in plants and the importance of ongoing annotation and visualization of RNA-Seq data using interactive tools such as Integrated Genome Browser.
Abstract: Pollen grains of Arabidopsis (Arabidopsis thaliana) contain two haploid sperm cells enclosed in a haploid vegetative cell. Upon germination, the vegetative cell extrudes a pollen tube that carries the sperm to an ovule for fertilization. Knowing the identity, relative abundance, and splicing patterns of pollen transcripts will improve our understanding of pollen and allow investigation of tissue-specific splicing in plants. Most Arabidopsis pollen transcriptome studies have used the ATH1 microarray, which does not assay splice variants and lacks specific probe sets for many genes. To investigate the pollen transcriptome, we performed high-throughput sequencing (RNA-Seq) of Arabidopsis pollen and seedlings for comparison. Gene expression was more diverse in seedling, and genes involved in cell wall biogenesis were highly expressed in pollen. RNA-Seq detected at least 4,172 protein-coding genes expressed in pollen, including 289 assayed only by nonspecific probe sets. Additional exons and previously unannotated 5′ and 3′ untranslated regions for pollen-expressed genes were revealed. We detected regions in the genome not previously annotated as expressed; 14 were tested and 12 were confirmed by polymerase chain reaction. Gapped read alignments revealed 1,908 high-confidence new splicing events supported by 10 or more spliced read alignments. Alternative splicing patterns in pollen and seedling were highly correlated. For most alternatively spliced genes, the ratio of variants in pollen and seedling was similar, except for some encoding proteins involved in RNA splicing. This study highlights the robustness of splicing patterns in plants and the importance of ongoing annotation and visualization of RNA-Seq data using interactive tools such as Integrated Genome Browser.

Journal ArticleDOI
TL;DR: This report describes a novel long noncoding RNA that is induced by cigarette smoke extract both in vitro and in vivo and is elevated in numerous lung cancer cell lines, and identifies a novel and intriguing noncoding RNA that may act downstream of NRF2 to regulate gene expression and mediate oxidative stress protection in airway epithelial cells.
Abstract: The incidence of lung diseases and cancer caused by cigarette smoke is increasing. The molecular mechanisms of gene regulation induced by cigarette smoke that ultimately lead to cancer remain unclear. This report describes a novel long noncoding RNA (lncRNA) that is induced by cigarette smoke extract (CSE) both in vitro and in vivo and is elevated in numerous lung cancer cell lines. We have termed this lncRNA the smoke and cancer–associated lncRNA–1 (SCAL1). This lncRNA is located in chromosome 5, and initial sequencing analysis reveals a transcript with four exons and three introns. The expression of SCAL1 is regulated transcriptionally by nuclear factor erythroid 2–related factor (NRF2), as determined by the small, interfering RNA (siRNA) knockdown of NRF2 and kelch-like ECH-associated protein 1 (KEAP1). A nuclear factor erythroid-derived 2 (NF-E2) motif was identified in the promoter region that shows binding to NRF2 after its activation. Functionally, the siRNA knockdown of SCAL1 in human bronchial epithelial cells shows a significant potentiation of cytotoxicity induced by CSE in vitro. Altogether, these results identify a novel and intriguing new noncoding RNA that may act downstream of NRF2 to regulate gene expression and mediate oxidative stress protection in airway epithelial cells.

Journal ArticleDOI
TL;DR: These findings suggest that the foci containing GRSF1 and RNase P correspond to sites where primary RNA transcripts converge to be processed, and are termed “mitochondrial RNA granules.”

Journal ArticleDOI
TL;DR: It is found that the derived allele of this site is less efficient than the ancestral allele in activating transcription from a reporter construct, and is a plausible candidate for having caused a recent selective sweep in the FOXP2 gene.
Abstract: The FOXP2 gene is required for normal development of speech and language. By isolating and sequencing FOXP2 genomic DNA fragments from a 49,000-year-old Iberian Neandertal and 50 present-day humans, we have identified substitutions in the gene shared by all or nearly all present-day humans but absent or polymorphic in Neandertals. One such substitution is localized in intron 8 and affects a binding site for the transcription factor POU3F2, which is highly conserved among vertebrates. We find that the derived allele of this site is less efficient than the ancestral allele in activating transcription from a reporter construct. The derived allele also binds less POU3F2 dimers than POU3F2 monomers compared with the ancestral allele. Because the substitution in the POU3F2 binding site is likely to alter the regulation of FOXP2 expression, and because it is localized in a region of the gene associated with a previously described signal of positive selection, it is a plausible candidate for having caused a recent selective sweep in the FOXP2 gene.

Journal ArticleDOI
31 Jan 2013-Nature
TL;DR: The crystal structure of yeast Prp8 in complex with Aar2, a U5 small nuclear ribonucleoprotein particle assembly factor, provides crucial insights into the architecture of the spliceosome active site, and reinforces the notion that nuclear pre-mRNA splicing and group II intron splicing have a common origin.
Abstract: The active centre of the spliceosome consists of an intricate network formed by U5, U2 and U6 small nuclear RNAs, and a pre-messenger-RNA substrate. Prp8, a component of the U5 small nuclear ribonucleoprotein particle, crosslinks extensively with this RNA catalytic core. Here we present the crystal structure of yeast Prp8 (residues 885-2413) in complex with Aar2, a U5 small nuclear ribonucleoprotein particle assembly factor. The structure reveals tightly associated domains of Prp8 resembling a bacterial group II intron reverse transcriptase and a type II restriction endonuclease. Suppressors of splice-site mutations, and an intron branch-point crosslink, map to a large cavity formed by the reverse transcriptase thumb, and the endonuclease-like and RNaseH-like domains. This cavity is large enough to accommodate the catalytic core of group II intron RNA. The structure provides crucial insights into the architecture of the spliceosome active site, and reinforces the notion that nuclear pre-mRNA splicing and group II intron splicing have a common origin.

Journal ArticleDOI
01 Jan 2013-RNA
TL;DR: It is shown that SR and hnRNP splicing factors exploit similar mechanisms to positively or negatively influence splice site selection, based on their binding location relative to regulated 5' splice sites.
Abstract: Alternative splicing is regulated by splicing factors that modulate splice site selection. In some cases, however, splicing factors show antagonistic activities by either activating or repressing splicing. Here, we show that these opposing outcomes are based on their binding location relative to regulated 5′ splice sites. SR proteins enhance splicing only when they are recruited to the exon. However, they interfere with splicing by simply relocating them to the opposite intronic side of the splice site. hnRNP splicing factors display analogous opposing activities, but in a reversed position dependence. Activation by SR or hnRNP proteins increases splice site recognition at the earliest steps of exon definition, whereas splicing repression promotes the assembly of nonproductive complexes that arrest spliceosome assembly prior to splice site pairing. Thus, SR and hnRNP splicing factors exploit similar mechanisms to positively or negatively influence splice site selection.

Journal ArticleDOI
TL;DR: This study systematically revealed splicing signatures of the three most common types of breast tumors using RNA sequencing and validated the presence of novel hybrid isoforms of critical molecules like CDK4, LARP1, ADD3, and PHLPP2.
Abstract: Breast cancer transcriptome acquires a myriad of regulation changes, and splicing is critical for the cell to "tailor-make" specific functional transcripts. We systematically revealed splicing signatures of the three most common types of breast tumors using RNA sequencing: TNBC, non-TNBC and HER2-positive breast cancer. We discovered subtype specific differentially spliced genes and splice isoforms not previously recognized in human transcriptome. Further, we showed that exon skip and intron retention are predominant splice events in breast cancer. In addition, we found that differential expression of primary transcripts and promoter switching are significantly deregulated in breast cancer compared to normal breast. We validated the presence of novel hybrid isoforms of critical molecules like CDK4, LARP1, ADD3, and PHLPP2. Our study provides the first comprehensive portrait of transcriptional and splicing signatures specific to breast cancer sub-types, as well as previously unknown transcripts that prompt the need for complete annotation of tissue and disease specific transcriptome.

Journal ArticleDOI
01 Jul 2013-RNA
TL;DR: It is found that group II intron RTs differ from the retroviral enzymes in template switching with minimal base-pairing to the 3' ends of new RNA templates, making it possible to efficiently and seamlessly link adaptors containing PCR-primer binding sites to cDNA ends without an RNA ligase step.
Abstract: Mobile group II introns encode reverse transcriptases (RTs) that function in intron mobility (“retrohoming”) by a process that requires reverse transcription of a highly structured, 2–2.5-kb intron RNA with high processivity and fidelity. Although the latter properties are potentially useful for applications in cDNA synthesis and next-generation RNA sequencing (RNA-seq), group II intron RTs have been difficult to purify free of the intron RNA, and their utility as research tools has not been investigated systematically. Here, we developed general methods for the high-level expression and purification of group II intron-encoded RTs as fusion proteins with a rigidly linked, noncleavable solubility tag, and we applied them to group II intron RTs from bacterial thermophiles. We thus obtained thermostable group II intron RT fusion proteins that have higher processivity, fidelity, and thermostability than retroviral RTs, synthesize cDNAs at temperatures up to 81°C, and have significant advantages for qRT-PCR, capillary electrophoresis for RNA-structure mapping, and next-generation RNA sequencing. Further, we find that group II intron RTs differ from the retroviral enzymes in template switching with minimal base-pairing to the 3′ ends of new RNA templates, making it possible to efficiently and seamlessly link adaptors containing PCR-primer binding sites to cDNA ends without an RNA ligase step. This novel template-switching activity enables facile and less biased cloning of nonpolyadenylated RNAs, such as miRNAs or protein-bound RNA fragments. Our findings demonstrate novel biochemical activities and inherent advantages of group II intron RTs for research, biotechnological, and diagnostic methods, with potentially wide applications.

Journal ArticleDOI
01 May 2013-Genetics
TL;DR: The state of knowledge for tRNA post-transcriptional processing, turnover, and subcellular dynamics is addressed, highlighting the questions that remain.
Abstract: Transfer RNAs (tRNAs) are essential for protein synthesis. In eukaryotes, tRNA biosynthesis employs a specialized RNA polymerase that generates initial transcripts that must be subsequently altered via a multitude of post-transcriptional steps before the tRNAs become mature molecules that function in protein synthesis. Genetic, genomic, biochemical, and cell biological approaches possible in the powerful Saccharomyces cerevisiae system have led to exciting advances in our understandings of tRNA post-transcriptional processing as well as to novel insights into tRNA turnover and tRNA subcellular dynamics. tRNA processing steps include removal of transcribed leader and trailer sequences, addition of CCA to the 3′ mature sequence and, for tRNAHis, addition of a 5′ G. About 20% of yeast tRNAs are encoded by intron-containing genes. The three-step splicing process to remove the introns surprisingly occurs in the cytoplasm in yeast and each of the splicing enzymes appears to moonlight in functions in addition to tRNA splicing. There are 25 different nucleoside modifications that are added post-transcriptionally, creating tRNAs in which ∼15% of the residues are nucleosides other than A, G, U, or C. These modified nucleosides serve numerous important functions including tRNA discrimination, translation fidelity, and tRNA quality control. Mature tRNAs are very stable, but nevertheless yeast cells possess multiple pathways to degrade inappropriately processed or folded tRNAs. Mature tRNAs are also dynamic in cells, moving from the cytoplasm to the nucleus and back again to the cytoplasm; the mechanism and function of this retrograde process is poorly understood. Here, the state of knowledge for tRNA post-transcriptional processing, turnover, and subcellular dynamics is addressed, highlighting the questions that remain.