Open access, Journal Article, DOI: 10.1016/J.MOLCEL.2020.12.018

Co-transcriptional splicing regulates 3' end cleavage during mammalian erythropoiesis.

04 Mar 2021, Molecular Cell (Elsevier), Vol. 81, Iss. 5
Abstract: Pre-mRNA processing steps are tightly coordinated with transcription in many organisms. To determine how co-transcriptional splicing is integrated with transcription elongation and 3' end formation in mammalian cells, we performed long-read sequencing of individual nascent RNAs and precision run-on sequencing (PRO-seq) during mouse erythropoiesis. Splicing was not accompanied by transcriptional pausing and was detected when RNA polymerase II (Pol II) was within 75-300 nucleotides of 3' splice sites (3'SSs), often during transcription of the downstream exon. Interestingly, several hundred introns displayed abundant splicing intermediates, suggesting that splicing delays can take place between the two catalytic steps. Overall, splicing efficiencies were correlated among introns within the same transcript, and intron retention was associated with inefficient 3' end cleavage. Remarkably, a thalassemia patient-derived mutation introducing a cryptic 3'SS improved both splicing and 3' end cleavage of individual β-globin transcripts, demonstrating functional coupling between the two co-transcriptional processes as a determinant of productive gene output.

Topics: RNA splicing (69%), Exon (64%), Intron (64%)

20 results found

Open access, Journal Article, DOI: 10.1016/J.MOLCEL.2021.02.034
Rui Sousa-Luís, Gwendal Dujardin, Inna Zukher, Hiroshi Kimura, +5 more; Institutions (4)
06 May 2021, Molecular Cell
Abstract: Mammalian chromatin is the site of both RNA polymerase II (Pol II) transcription and coupled RNA processing. However, molecular details of such co-transcriptional mechanisms remain obscure, partly because of technical limitations in purifying authentic nascent transcripts. We present a new approach to characterize nascent RNA, called polymerase intact nascent transcript (POINT) technology. This three-pronged methodology maps nascent RNA 5′ ends (POINT-5), establishes the kinetics of co-transcriptional splicing patterns (POINT-nano), and profiles whole transcription units (POINT-seq). In particular, we show by depletion of the nuclear exonuclease Xrn2 that this activity acts selectively on cleaved 5′ P-RNA at polyadenylation sites. Furthermore, POINT-nano reveals that co-transcriptional splicing either occurs immediately after splice site transcription or is delayed until Pol II transcribes downstream sequences. Finally, we connect RNA cleavage and splicing with either premature or full-length transcript termination. We anticipate that POINT technology will afford full dissection of the complexity of co-transcriptional RNA processing.

Topics: RNA polymerase II (62%), Transcription (biology) (60%), RNA splicing (60%)

10 Citations

Open access, Journal Article, DOI: 10.1186/S13059-021-02296-0
01 Mar 2021, Genome Biology
Abstract: Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools, improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.

6 Citations
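
The splice junctions discussed in the abstract above are usually read off spliced alignments. As a minimal sketch (not the 2passtools method, and with no filtering step), the following Python function lists the introns implied by a SAM record, assuming the standard SAM conventions that POS is 1-based and that the CIGAR operation `N` marks a skipped reference region. The function name `introns_from_cigar` is hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def introns_from_cigar(pos, cigar):
    """Return introns as 0-based, half-open (start, end) reference intervals.

    pos   -- 1-based leftmost mapping position (SAM POS field)
    cigar -- CIGAR string of the alignment
    """
    ref = pos - 1  # convert to 0-based coordinates
    introns = []
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            # A skipped region on the reference is treated as an intron.
            introns.append((ref, ref + length))
        if op in REF_CONSUMING:
            ref += length
    return introns


print(introns_from_cigar(101, "50M200N30M2I20M1000N40M"))
```

A junction filter of the kind the abstract describes would then score each returned interval; that scoring is outside the scope of this sketch.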

Open access, Journal Article, DOI: 10.3389/FONC.2021.666937
Ken Asada, Syuzo Kaneko, Ken Takasawa, Hidenori Machino, +5 more; Institutions (1)
Abstract: With the completion of the International Human Genome Project, we have entered what is known as the post-genome era, and efforts to apply genomic information to medicine have become more active. In particular, with the announcement of the Precision Medicine Initiative by U.S. President Barack Obama in his State of the Union address at the beginning of 2015, "precision medicine," which aims to divide patients and potential patients into subgroups with respect to disease susceptibility, has become the focus of worldwide attention. The field of oncology is also actively adopting the precision oncology approach, which is based on molecular profiling, such as genomic information, to select the appropriate treatment. However, the current precision oncology is dominated by a method called targeted-gene panel (TGP), which uses next-generation sequencing (NGS) to analyze a limited number of specific cancer-related genes and suggest optimal treatments, but this method causes the problem that the number of patients who benefit from it is limited. In order to steadily develop precision oncology, it is necessary to integrate and analyze more detailed omics data, such as whole genome data and epigenome data. On the other hand, with the advancement of analysis technologies such as NGS, the amount of data obtained by omics analysis has become enormous, and artificial intelligence (AI) technologies, mainly machine learning (ML) technologies, are being actively used to make more efficient and accurate predictions. In this review, we will focus on whole genome sequencing (WGS) analysis and epigenome analysis, introduce the latest results of omics analysis using ML technologies for the development of precision oncology, and discuss the future prospects.

Topics: Precision medicine (55%)

5 Citations

Open access, Journal Article, DOI: 10.1016/J.JMB.2021.166975
Abstract: Folding of RNA into secondary structures through intramolecular base pairing determines an RNA's three-dimensional architecture and associated function. Simple RNA structures like stem loops can provide specialized functions independent of coding capacity, such as protein binding, regulation of RNA processing and stability, stimulation or inhibition of translation. RNA catalysis is dependent on tertiary structures found in the ribosome, tRNAs and group I and II introns. While the extent to which non-coding RNAs contribute to cellular maintenance is generally appreciated, the fact that both non-coding and coding RNA can assume relevant structural states has only recently gained attention. In particular, the co-transcriptional folding of nascent RNA of all classes has the potential to regulate co-transcriptional processing, RNP (ribonucleoprotein particle) formation, and transcription itself. Riboswitches are established examples of co-transcriptionally folded coding RNAs that directly regulate transcription, mainly in prokaryotes. Here we discuss recent studies in both prokaryotes and eukaryotes showing that structure formation may carry a more widespread regulatory logic during RNA synthesis. Local structures forming close to the catalytic center of RNA polymerases have the potential to regulate transcription by reducing backtracking. In addition, stem loops or more complex structures may alter co-transcriptional RNA processing or its efficiency. Several examples of functional structures have been identified to date, and this review provides an overview of physiologically distinct processes where co-transcriptionally folded RNA plays a role. Experimental approaches such as single-molecule FRET and in vivo structural probing to further advance our insight into the significance of co-transcriptional structure formation are discussed.

Topics: RNA (69%), Riboswitch (64%), Intron (64%)

5 Citations

Journal Article, DOI: 10.1002/WRNA.1657
Abstract: The polycomb repressive complexes 1 and 2 (PRCs; PRC1 and PRC2) are conserved histone-modifying enzymes that often function cooperatively to repress gene expression. The PRCs are regulated by long noncoding RNAs (lncRNAs) in complex ways. On the one hand, specific lncRNAs cause the PRCs to engage with chromatin and repress gene expression over genomic regions that can span megabases. On the other hand, the PRCs bind RNA with seemingly little sequence specificity, and at least in the case of PRC2, direct RNA-binding has the effect of inhibiting the enzyme. Thus, some RNAs appear to promote PRC activity, while others may inhibit it. The reasons behind this apparent dichotomy are unclear. The most potent PRC-activating lncRNAs associate with chromatin and are predominantly unspliced or harbor unusually long exons. Emerging data imply that these lncRNAs promote PRC activity through internal RNA sequence elements that arise and disappear rapidly in evolutionary time. These sequence elements may function by interacting with common subsets of RNA-binding proteins that recruit or stabilize PRCs on chromatin. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Recognition; RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes; RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications.

Topics: RNA (56%), Chromatin (55%), PRC2 (53%)

3 Citations


99 results found

Open access, Journal Article, DOI: 10.1093/BIOINFORMATICS/BTP352
Heng Li, Bob Handsaker, Alec Wysoker, T. J. Fennell, +5 more; Institutions (4)
01 Aug 2009, Bioinformatics
Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments.

Topics: Variant Call Format (62%), Stockholm format (61%), FASTQ format (56%)

35,747 Citations

Open access, Journal Article, DOI: 10.1093/BIOINFORMATICS/BTS635
01 Jan 2013, Bioinformatics
Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license.

Topics: MRNA Sequencing (57%)

20,172 Citations

Open access, Journal Article, DOI: 10.1186/GB-2009-10-3-R25
04 Mar 2009, Genome Biology
Abstract: Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source.

Topics: Hybrid genome assembly (51%)

18,079 Citations

Open access, Journal Article, DOI: 10.1093/BIOINFORMATICS/BTQ033
Aaron R. Quinlan, Ira M. Hall; Institutions (1)
15 Mar 2010, Bioinformatics
Abstract: Motivation: Testing for correlations between different sets of genomic features is a fundamental task in genomics research. However, searching for overlaps between features with existing web-based methods is complicated by the massive datasets that are routinely produced with current sequencing technologies. Fast and flexible tools are therefore required to ask complex questions of these data in an efficient manner. Results: This article introduces a new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. BEDTools can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions of large genomic datasets. Availability and implementation: BEDTools was written in C++. Source code and a comprehensive user manual are freely available.

Topics: Software suite (52%), Source code (50%)

14,088 Citations
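
The overlap search that the BEDTools abstract refers to can be illustrated with a deliberately naive sketch. It assumes BED-style coordinates (0-based start, exclusive end) and compares every pair of features on the same chromosome; it is not the algorithm BEDTools itself uses, and the names `Interval` and `intersect` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive


def intersect(a_features, b_features):
    """Return the overlapping portion of every (a, b) pair on the same chromosome."""
    by_chrom = defaultdict(list)
    for b in b_features:
        by_chrom[b.chrom].append(b)

    hits = []
    for a in a_features:
        for b in by_chrom.get(a.chrom, []):
            start = max(a.start, b.start)
            end = min(a.end, b.end)
            if start < end:  # adjacent half-open intervals do not overlap
                hits.append(Interval(a.chrom, start, end))
    return hits


a = [Interval("chr1", 100, 200), Interval("chr2", 50, 60)]
b = [Interval("chr1", 150, 250), Interval("chr1", 200, 300), Interval("chr2", 0, 50)]
print(intersect(a, b))
```

The pairwise comparison is quadratic per chromosome, so for datasets of the size the abstract mentions an indexed or sorted-sweep approach would be needed.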

Journal Article, DOI: 10.14806/EJ.17.1.200
Marcel Martin; Institutions (1)
02 May 2011, EMBnet.journal
Abstract: When small RNA is sequenced on current sequencing machines, the resulting reads are usually longer than the RNA and therefore contain parts of the 3' adapter. That adapter must be found and removed error-tolerantly from each read before read mapping. Previous solutions are either hard to use or do not offer required features, in particular support for color space data. As an easy-to-use alternative, we developed the command-line tool cutadapt, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features. Cutadapt, including its MIT-licensed source code, is available for download.

Topics: Adapter (genetics) (50%)

13,576 Citations
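
To make "removed error-tolerantly" in the cutadapt abstract concrete, here is a minimal sketch of 3' adapter trimming that tolerates mismatches only. It is a simplification chosen for illustration, not cutadapt's own alignment algorithm: it ignores insertions and deletions, and the function name and the parameters `max_error_rate` and `min_overlap` are assumptions of this example.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.2, min_overlap=3):
    """Remove a (possibly partial) 3' adapter from a read.

    Scans start positions left to right and cuts at the first position where
    the read suffix matches a prefix of the adapter with at most
    int(max_error_rate * overlap) mismatches. Mismatches only, no indels.
    """
    for start in range(len(read) - min_overlap + 1):
        # The adapter may run off the end of the read, so compare only the overlap.
        overlap = min(len(adapter), len(read) - start)
        allowed = int(max_error_rate * overlap)
        mismatches = sum(
            1 for r, a in zip(read[start:start + overlap], adapter) if r != a
        )
        if mismatches <= allowed:
            return read[:start]
    return read  # no adapter found


adapter = "TTTTGGGG"  # toy adapter, for illustration only
print(trim_3prime_adapter("ACACACAC" + "TTTAGGGG", adapter))
```

In the usage line the adapter copy carries one substituted base, which is within the allowed error for an 8-base overlap at the default rate, so the read is still cut at the adapter start.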

Related Papers (5)
STAR: ultrafast universal RNA-seq aligner. 01 Jan 2013, Bioinformatics

Alexander Dobin, Carrie A. Davis +7 more

Splicing of Nascent RNA Coincides with Intron Exit from RNA Polymerase II. 07 Apr 2016, Cell

Fernando Carrillo Oesterreich, Lydia Herzel +5 more