Showing papers on "Exon published in 2008"


Journal ArticleDOI
27 Nov 2008-Nature
TL;DR: An in-depth analysis of 15 diverse human tissue and cell line transcriptomes on the basis of deep sequencing of complementary DNA fragments yielding a digital inventory of gene and mRNA isoform expression suggested common involvement of specific factors in tissue-level regulation of both splicing and polyadenylation.
Abstract: Through alternative processing of pre-messenger RNAs, individual mammalian genes often produce multiple mRNA and protein isoforms that may have related, distinct or even opposing functions. Here we report an in-depth analysis of 15 diverse human tissue and cell line transcriptomes on the basis of deep sequencing of complementary DNA fragments, yielding a digital inventory of gene and mRNA isoform expression. Analyses in which sequence reads are mapped to exon-exon junctions indicated that 92-94% of human genes undergo alternative splicing, 86% with a minor isoform frequency of 15% or more. Differences in isoform-specific read densities indicated that most alternative splicing and alternative cleavage and polyadenylation events vary between tissues, whereas variation between individuals was approximately twofold to threefold less common. Extreme or 'switch-like' regulation of splicing between tissues was associated with increased sequence conservation in regulatory regions and with generation of full-length open reading frames. Patterns of alternative splicing and alternative cleavage and polyadenylation were strongly correlated across tissues, suggesting coordinated regulation of these processes, and sequence conservation of a subset of known regulatory motifs in both alternative introns and 3' untranslated regions suggested common involvement of specific factors in tissue-level regulation of both splicing and polyadenylation.

4,711 citations


Journal ArticleDOI
Qun Pan1, Ofer Shai1, Leo J. Lee1, Brendan J. Frey1, Benjamin J. Blencowe1 
TL;DR: It is estimated that transcripts from ∼95% of multiexon genes undergo alternative splicing and that there are ∼100,000 intermediate- to high-abundance alternative splicing events in major human tissues.
Abstract: We carried out the first analysis of alternative splicing complexity in human tissues using mRNA-Seq data. New splice junctions were detected in approximately 20% of multiexon genes, many of which are tissue specific. By combining mRNA-Seq and EST-cDNA sequence data, we estimate that transcripts from approximately 95% of multiexon genes undergo alternative splicing and that there are approximately 100,000 intermediate- to high-abundance alternative splicing events in major human tissues. From a comparison with quantitative alternative splicing microarray profiling data, we also show that mRNA-Seq data provide reliable measurements for exon inclusion levels.
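An exon inclusion level of the kind mentioned above is commonly expressed as "percent spliced in" computed from junction-spanning reads. The following is a minimal sketch under that assumption, not the authors' pipeline; the function name and the three read-count parameters are hypothetical.

```python
def exon_inclusion_level(upstream_junction_reads, downstream_junction_reads,
                         skipping_junction_reads):
    """Estimate percent inclusion of a cassette exon from junction read counts.

    Hypothetical helper: inclusion is supported by two junctions (upstream
    exon to cassette, cassette to downstream exon), skipping by one junction
    (upstream exon directly to downstream exon).
    """
    # Average the two inclusion junctions so that inclusion and skipping
    # are each represented by the equivalent of a single junction.
    inclusion = (upstream_junction_reads + downstream_junction_reads) / 2.0
    total = inclusion + skipping_junction_reads
    if total == 0:
        # No junction reads at all: the inclusion level is undefined.
        return None
    return 100.0 * inclusion / total
```

For example, 30 reads on each inclusion junction and 10 skipping reads give an inclusion level of 75%. Real analyses usually also require a minimum read depth before reporting a value.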

3,455 citations


Journal ArticleDOI
27 Nov 2008-Nature
TL;DR: A genome-wide means of mapping protein–RNA binding sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP), which revealed a large number of Nova–RNA interactions in 3′ untranslated regions, leading to the discovery that Nova regulates alternative polyadenylation in the brain.
Abstract: Protein-RNA interactions have critical roles in all aspects of gene expression. However, applying biochemical methods to understand such interactions in living tissues has been challenging. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova revealed extremely reproducible RNA-binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3' untranslated regions, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.

1,313 citations


Journal ArticleDOI
15 Aug 2008-Science
TL;DR: A global survey of messenger RNA splicing events identified 94,241 splice junctions and showed that exon skipping is the most prevalent form of alternative splicing.
Abstract: The functional complexity of the human transcriptome is not yet fully elucidated. We report a high-throughput sequence of the human transcriptome from a human embryonic kidney and a B cell line. We used shotgun sequencing of transcripts to generate randomly distributed reads. Of these, 50% mapped to unique genomic locations, of which 80% corresponded to known exons. We found that 66% of the polyadenylated transcriptome mapped to known genes and 34% to nonannotated genomic regions. On the basis of known transcripts, RNA-Seq can detect 25% more genes than can microarrays. A global survey of messenger RNA splicing events identified 94,241 splice junctions (4096 of which were previously unidentified) and showed that exon skipping is the most prevalent form of alternative splicing.

1,288 citations


Journal ArticleDOI
01 May 2008-RNA
TL;DR: The current state of knowledge of splicing cis-regulatory elements and their context-dependent effects on splicing are summarized, emphasizing recent global/genome-wide studies and open questions.
Abstract: Alternative splicing of pre-mRNAs is a major contributor to both proteomic diversity and control of gene expression levels. Splicing is tightly regulated in different tissues and developmental stages, and its disruption can lead to a wide range of human diseases. An important long-term goal in the splicing field is to determine a set of rules or “code” for splicing that will enable prediction of the splicing pattern of any primary transcript from its sequence. Outside of the core splice site motifs, the bulk of the information required for splicing is thought to be contained in exonic and intronic cis-regulatory elements that function by recruitment of sequence-specific RNA-binding protein factors that either activate or repress the use of adjacent splice sites. Here, we summarize the current state of knowledge of splicing cis-regulatory elements and their context-dependent effects on splicing, emphasizing recent global/genome-wide studies and open questions.

970 citations


Journal ArticleDOI
TL;DR: A simple and effective mechanism by which PCa cells can synthesize a constitutively active AR and thus circumvent androgen ablation is described.
Abstract: The standard systemic treatment for prostate cancer (PCa) is androgen ablation, which causes tumor regression by inhibiting activity of the androgen receptor (AR). Invariably, PCa recurs with a fatal androgen-refractory phenotype. Importantly, the growth of androgen-refractory PCa remains dependent on the AR through various mechanisms of aberrant AR activation. Here, we studied the 22Rv1 PCa cell line, which was derived from a CWR22 xenograft that relapsed during androgen ablation. Three AR isoforms are expressed in 22Rv1 cells: a full-length version with duplicated exon 3 and two truncated versions lacking the COOH terminal domain (CTD). We found that CTD-truncated AR isoforms are encoded by mRNAs that have a novel exon 2b at their 3' end. Functionally, these AR isoforms are constitutively active and promote the expression of endogenous AR-dependent genes, as well as the proliferation of 22Rv1 cells in a ligand-independent manner. AR mRNAs containing exon 2b and their protein products are expressed in commonly studied PCa cell lines. Moreover, exon 2b-derived species are enriched in xenograft-based models of therapy-resistant PCa. Together, our data describe a simple and effective mechanism by which PCa cells can synthesize a constitutively active AR and thus circumvent androgen ablation.

828 citations


Journal ArticleDOI
19 Dec 2008-Science
TL;DR: A technique is described that can be used to identify the DNA strand of origin for any particular RNA transcript, and quantify the number of sense and antisense transcripts from expressed genes at a global level.
Abstract: Transcription in mammalian cells can be assessed at a genome-wide level, but it has been difficult to reliably determine whether individual transcripts are derived from the plus or minus strands of chromosomes. This distinction can be critical for understanding the relationship between known transcripts (sense) and the complementary antisense transcripts that may regulate them. Here, we describe a technique that can be used to (i) identify the DNA strand of origin for any particular RNA transcript, and (ii) quantify the number of sense and antisense transcripts from expressed genes at a global level. We examined five different human cell types and in each case found evidence for antisense transcripts in 2900 to 6400 human genes. The distribution of antisense transcripts was distinct from that of sense transcripts, was nonrandom across the genome, and differed among cell types. Antisense transcripts thus appear to be a pervasive feature of human cells, which suggests that they are a fundamental component of gene regulation.

532 citations


Journal ArticleDOI
TL;DR: The results show that the high-resolution ASO-tiling approach can identify cis-elements that modulate splicing positively or negatively and highlight the therapeutic potential of some of these ASOs in the context of SMA.
Abstract: survival of motor neuron 2, centromeric (SMN2) is a gene that modifies the severity of spinal muscular atrophy (SMA), a motor-neuron disease that is the leading genetic cause of infant mortality. Increasing inclusion of SMN2 exon 7, which is predominantly skipped, holds promise to treat or possibly cure SMA; one practical strategy is the disruption of splicing silencers that impair exon 7 recognition. By using an antisense oligonucleotide (ASO)-tiling method, we systematically screened the proximal intronic regions flanking exon 7 and identified two intronic splicing silencers (ISSs): one in intron 6 and a recently described one in intron 7. We analyzed the intron 7 ISS by mutagenesis, coupled with splicing assays, RNA-affinity chromatography, and protein overexpression, and found two tandem hnRNP A1/A2 motifs within the ISS that are responsible for its inhibitory character. Mutations in these two motifs, or ASOs that block them, promote very efficient exon 7 inclusion. We screened 31 ASOs in this region and selected two optimal ones to test in human SMN2 transgenic mice. Both ASOs strongly increased hSMN2 exon 7 inclusion in the liver and kidney of the transgenic animals. Our results show that the high-resolution ASO-tiling approach can identify cis-elements that modulate splicing positively or negatively. Most importantly, our results highlight the therapeutic potential of some of these ASOs in the context of SMA.

492 citations


Journal ArticleDOI
TL;DR: A multiplex reverse transcription-PCR system is developed that captures all in-frame fusions of EML4 to exon 20 of ALK; the results reinforce the importance of accurate diagnosis of EML4-ALK–positive tumors for the optimization of treatment strategies.
Abstract: Purpose: EML4-ALK is a fusion-type protein tyrosine kinase that is generated by inv(2)(p21p23) in the genome of non–small cell lung cancer (NSCLC). To allow sensitive detection of EML4-ALK fusion transcripts, we have now developed a multiplex reverse transcription-PCR (RT-PCR) system that captures all in-frame fusions between the two genes. Experimental Design: Primers were designed to detect all possible in-frame fusions of EML4 to exon 20 of ALK, and a single-tube multiplex RT-PCR assay was done with total RNA from 656 solid tumors of the lung (n = 364) and 10 other organs. Results: From consecutive lung adenocarcinoma cases (n = 253), we identified 11 specimens (4.35%) positive for fusion transcripts, 9 of which were positive for the previously identified variants 1, 2, and 3. The remaining two specimens harbored novel transcript isoforms in which exon 14 (variant 4) or exon 2 (variant 5) of EML4 was connected to exon 20 of ALK. No fusion transcripts were detected for other types of lung cancer (n = 111) or for tumors from 10 other organs (n = 292). Genomic rearrangements responsible for the fusion events in NSCLC cells were confirmed by genomic PCR analysis and fluorescence in situ hybridization. The novel isoforms of EML4-ALK manifested marked oncogenic activity, and they yielded a pattern of cytoplasmic staining with fine granular foci in immunohistochemical analysis of NSCLC specimens. Conclusions: These data reinforce the importance of accurate diagnosis of EML4-ALK–positive tumors for the optimization of treatment strategies.
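The sample counts quoted in this abstract can be cross-checked with a few lines of arithmetic. This is only a sketch using the numbers stated above; the variable names are illustrative.

```python
# Counts as stated in the abstract.
lung_adenocarcinoma = 253   # consecutive lung adenocarcinoma cases
other_lung_cancer = 111     # other types of lung cancer
other_organs = 292          # tumors from 10 other organs
fusion_positive = 11        # adenocarcinoma specimens positive for fusion transcripts

# Totals implied by the individual groups.
lung_total = lung_adenocarcinoma + other_lung_cancer
all_tumors = lung_total + other_organs

# Share of adenocarcinoma cases positive for EML4-ALK, in percent.
positive_rate = round(100 * fusion_positive / lung_adenocarcinoma, 2)

print(lung_total, all_tumors, positive_rate)
```

The totals agree with the 364 lung tumors and 656 solid tumors reported, and the positive rate matches the stated 4.35%.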

478 citations


Journal ArticleDOI
TL;DR: It is shown that massively parallel sequencing of randomly primed cDNAs, using a next-generation sequencing-by-synthesis technology, offers the potential to generate relative measures of mRNA and individual exon abundance while simultaneously profiling the prevalence of both annotated and novel exons and exon-splicing events.
Abstract: Sequence-based methods for transcriptome characterization have typically relied on generation of either serial analysis of gene expression tags or expressed sequence tags. Although such approaches have the potential to enumerate transcripts by counting sequence tags derived from them, they typically do not robustly survey the majority of transcripts along their entire length. Here we show that massively parallel sequencing of randomly primed cDNAs, using a next-generation sequencing-by-synthesis technology, offers the potential to generate relative measures of mRNA and individual exon abundance while simultaneously profiling the prevalence of both annotated and novel exons and exon-splicing events. This technique identifies known single nucleotide polymorphisms (SNPs) as well as novel single-base variants. Analysis of these variants, and previously unannotated splicing events in the HeLa S3 cell line, reveals an overrepresentation of gene categories including those previously implicated in cancer.

470 citations


Journal ArticleDOI
TL;DR: The findings implicate BANK1 as a susceptibility gene for SLE, with variants affecting regulatory sites and key functional domains, which could contribute to sustained B cell–receptor signaling and B-cell hyperactivity characteristic of this disease.
Abstract: Systemic lupus erythematosus (SLE) is a prototypical autoimmune disease characterized by production of autoantibodies and complex genetic inheritance. In a genome-wide scan using 85,042 SNPs, we identified an association between SLE and a nonsynonymous substitution (rs10516487, R61H) in the B-cell scaffold protein with ankyrin repeats gene, BANK1. We replicated the association in four independent case-control sets (combined P = 3.7 × 10⁻¹⁰; OR = 1.38). We analyzed BANK1 cDNA and found two isoforms, one full-length and the other alternatively spliced and lacking exon 2 (Δ2), encoding a protein without a putative IP3R-binding domain. The transcripts were differentially expressed depending on a branch point-site SNP, rs17266594, in strong linkage disequilibrium (LD) with rs10516487. A third associated variant was found in the ankyrin domain (rs3733197, A383T). Our findings implicate BANK1 as a susceptibility gene for SLE, with variants affecting regulatory sites and key functional domains. The disease-associated variants could contribute to sustained B cell-receptor signaling and B-cell hyperactivity characteristic of this disease.

Journal ArticleDOI
TL;DR: Two slightly different fusion cDNAs in which exon 6 of EML4 was joined to exon 20 of ALK were each identified in two individuals of the cohort and exhibited marked transforming activity in vitro as well as oncogenic activity in vivo.
Abstract: The genome of a subset of non-small-cell lung cancers (NSCLC) harbors a small inversion within chromosome 2 that gives rise to a transforming fusion gene, EML4-ALK, which encodes an activated protein tyrosine kinase. Although breakpoints within EML4 have been identified in introns 13 and 20, giving rise to variants 1 and 2, respectively, of EML4-ALK, it has remained unclear whether other isoforms of the fusion gene are present in NSCLC cells. We have now screened NSCLC specimens for other in-frame fusion cDNAs that contain both EML4 and ALK sequences. Two slightly different fusion cDNAs in which exon 6 of EML4 was joined to exon 20 of ALK were each identified in two individuals of the cohort. Whereas one cDNA contained only exons 1 to 6 of EML4 (variant 3a), the other also contained an additional 33-bp sequence derived from intron 6 of EML4 (variant 3b). The protein encoded by the latter cDNA thus contained an insertion of 11 amino acids between the EML4 and ALK sequences of that encoded by the former. Both variants 3a and 3b of EML4-ALK exhibited marked transforming activity in vitro as well as oncogenic activity in vivo. A lung cancer cell line expressing endogenous variant 3 of EML4-ALK underwent cell death on exposure to a specific inhibitor of ALK catalytic activity. These data increase the frequency of EML4-ALK-positive NSCLC tumors and bolster the clinical relevance of this oncogenic kinase.

Journal ArticleDOI
TL;DR: The aim of this work is to provide the basic facts about TDP-43 and an assessment of the multiple functions ascribed to this protein, including evidence suggesting that it may be involved in other cellular processes such as microRNA biogenesis, apoptosis, and cell division.
Abstract: TDP-43 is an RNA/DNA binding protein that structurally resembles a typical hnRNP protein family member and displays a significant specificity for binding the common microsatellite region (GU/GT)n. Initially described as a regulator of HIV-1 gene expression, it has been reported in the past to affect both normal and pathological RNA splicing events. In particular, it has been shown to play a fundamental role in the occurrence of several monosymptomatic/full forms of Cystic Fibrosis caused by pathological skipping of CFTR exon 9 from the mature mRNA. Recently, and in a way probably unrelated to splicing, a hyperphosphorylated form of TDP-43 has also been found to accumulate in the cytoplasm of neuronal cells of patients affected by frontotemporal lobar degenerations. In addition to its role in transcription and splicing regulation, a growing body of evidence indirectly suggests that TDP-43 may be involved in other cellular processes such as microRNA biogenesis, apoptosis, and cell division. The aim of this work is to provide the basic facts about TDP-43 and an assessment of the multiple functions ascribed to this protein.

Journal ArticleDOI
TL;DR: Regulation of splicing by growth and splice factors is identified as a key event in determining the relative pro- versus anti-angiogenic expression of VEGF isoforms, and it is suggested that p38 MAPK-Clk/sty kinases are responsible for the TGFβ1-induced DSS selection.
Abstract: Vascular endothelial growth factor A (VEGFA; hereafter referred to as VEGF) is a key regulator of physiological and pathological angiogenesis. Two families of VEGF isoforms are generated by alternate splice-site selection in the terminal exon. Proximal splice-site selection (PSS) in exon 8 results in pro-angiogenic VEGFxxx isoforms (xxx is the number of amino acids), whereas distal splice-site selection (DSS) results in anti-angiogenic VEGFxxxb isoforms. To investigate control of PSS and DSS, we investigated the regulation of isoform expression by extracellular growth factor administration and intracellular splicing factors. In primary epithelial cells VEGFxxxb formed the majority of VEGF isoforms (74%). IGF1 and TNFα treatment favoured PSS (increasing VEGFxxx) whereas TGFβ1 favoured DSS, increasing VEGFxxxb levels. TGFβ1-induced DSS selection was prevented by inhibition of p38 MAPK and the Clk/sty (CDC-like kinase, CLK1) splicing factor kinase family, but not ERK1/2. Clk phosphorylates SR protein splicing factors ASF/SF2, SRp40 and SRp55. To determine whether SR splicing factors alter VEGF splicing, they were overexpressed in epithelial cells, and VEGF isoform production assessed. ASF/SF2 and SRp40 both favoured PSS, whereas SRp55 upregulated VEGFxxxb (DSS) isoforms relative to VEGFxxx. SRp55 knockdown reduced expression of VEGF165b. Moreover, SRp55 bound to a 35 nucleotide region of the 3'UTR immediately downstream of the stop codon in exon 8b. These results identify regulation of splicing by growth and splice factors as a key event in determining the relative pro- versus anti-angiogenic expression of VEGF isoforms, and suggest that p38 MAPK-Clk/sty kinases are responsible for the TGFβ1-induced DSS selection, and identify SRp55 as a key regulatory splice factor.

Journal ArticleDOI
TL;DR: The results demonstrate that the regulatory effects of genetic variation in a normal human population are far more complex than previously observed, and this extra layer of molecular diversity may account for natural phenotypic variation and disease susceptibility.
Abstract: We have performed a genome-wide analysis of common genetic variation controlling differential expression of transcript isoforms in the CEU HapMap population using a comprehensive exon tiling microarray covering 17,897 genes. We detected 324 genes with significant associations between flanking SNPs and transcript levels. Of these, 39% reflected changes in whole gene expression and 55% reflected transcript isoform changes such as splicing variants (exon skipping, alternative splice site use, intron retention), differential 5' UTR (initiation of transcription) use, and differential 3' UTR (alternative polyadenylation) use. These results demonstrate that the regulatory effects of genetic variation in a normal human population are far more complex than previously observed. This extra layer of molecular diversity may account for natural phenotypic variation and disease susceptibility.

Journal ArticleDOI
TL;DR: EGFR exon 20 mutations in NSCLC patients result in poorer responsiveness to gefitinib treatment; different exon 20 mutations and coexisting mutations seemed to influence the response differently, and variability exists between individuals.
Abstract: Purpose: Clinical reports about responsiveness to gefitinib treatment in patients of non-small cell lung cancer (NSCLC) with mutations in exon 20 of epidermal growth factor receptor (EGFR) are limited. To increase understanding of the influence of exon 20 mutations on NSCLC treatment with gefitinib, we investigated the clinical features of lung cancer in patients with exon 20 mutations and analyzed the gefitinib treatment response. Experimental Design: We surveyed the clinical data and mutational studies of NSCLC patients with EGFR exon 20 mutations in the National Taiwan University Hospital and reviewed the literature reports about EGFR exon 20 mutations and the gefitinib treatment response. Results: Twenty-three patients with mutations in exon 20 were identified. Nine (39%) had coexisting mutations in EGFR exons other than exon 20. Sixteen patients received gefitinib treatment, and a response was noted in 4 patients. The gefitinib response rate of NSCLC with exon 20 mutations was 25%, far lower than those with deletions in exon 19 and L858R mutations. Interestingly, different exon 20 mutations and coexisting mutations seemed to have a different influence on gefitinib response. Conclusions: EGFR exon 20 mutations of NSCLC patients result in poorer responsiveness to gefitinib treatment, but variability exists between different individuals.

Journal ArticleDOI
TL;DR: The first genome-scale expression compendium of human alternative splicing events is generated using custom whole-transcript microarrays monitoring expression of 24,426 alternative splicing events in 48 diverse human samples, providing a rich resource for studying splicing regulation.
Abstract: Alternative pre-messenger RNA splicing influences development, physiology and disease, but its regulation in humans is not well understood, partially because of the limited scale at which the expression of specific splicing events has been measured. We generated the first genome-scale expression compendium of human alternative splicing events using custom whole-transcript microarrays monitoring expression of 24,426 alternative splicing events in 48 diverse human samples. Over 11,700 genes and 9,500 splicing events were differentially expressed, providing a rich resource for studying splicing regulation. An unbiased, systematic screen of 21,760 4-mer to 7-mer words for cis-regulatory motifs identified 143 RNA 'words' enriched near regulated cassette exons, including six clusters of motifs represented by UCUCU, UGCAUG, UGCU, UGUGU, UUUU and AGGG, which map to trans-acting regulators PTB, Fox, Muscleblind, CELF/CUG-BP, TIA-1 and hnRNP F/H, respectively. Each cluster showed a distinct pattern of genomic location and tissue specificity. For example, UCUCU occurs 110 to 35 nucleotides preceding cassette exons upregulated in brain and striated muscle but depleted in other tissues. UCUCU and UGCAUG seem to have similar function but independent action, occurring 5' and 3', respectively, of 33% of the cassette exons upregulated in skeletal muscle but co-occurring for only 2%.
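The figure of 21,760 words is exactly the number of possible RNA sequences of length 4 through 7 over a four-letter alphabet (4^4 + 4^5 + 4^6 + 4^7). A short sketch of such an enumeration is shown here; the generator name `rna_words` and its parameters are illustrative, and the study's enrichment statistics are not reproduced.

```python
from itertools import product


def rna_words(min_len=4, max_len=7, alphabet="ACGU"):
    """Yield every RNA word with length between min_len and max_len inclusive."""
    for k in range(min_len, max_len + 1):
        # product() yields tuples of letters; join each tuple into a string.
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)


# Count the candidate words without storing them all.
word_count = sum(1 for _ in rna_words())

# Representative motifs named in the abstract; each is 4 to 6 nt long.
reported_motifs = ["UCUCU", "UGCAUG", "UGCU", "UGUGU", "UUUU", "AGGG"]
all_words = set(rna_words())
motifs_covered = all(motif in all_words for motif in reported_motifs)

print(word_count, motifs_covered)
```

A screen of this kind would then count occurrences of each word near regulated versus unregulated exons; that scoring step depends on the study's data and is left out.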

Journal ArticleDOI
01 Oct 2008-Genetics
TL;DR: The genetic redundancy suggests that the presence of duplicated copies of phyA genes accounts for the generation of photoperiod insensitivity, while protecting against the deleterious effects of mutation.
Abstract: Gene and genome duplications underlie the origins of evolutionary novelty in plants. Soybean, Glycine max, is considered to be a paleopolyploid species with a complex genome. We found multiple homologs of the phytochrome A gene (phyA) in the soybean genome and determined the DNA sequences of two paralogs designated GmphyA1 and GmphyA2. Analysis of the GmphyA2 gene from the lines carrying a recessive allele at a photoperiod insensitivity locus, E4, revealed that a Ty1/copia-like retrotransposon was inserted in exon 1 of the gene, which resulted in dysfunction of the gene. Mapping studies suggested that GmphyA2 is encoded by E4. The GmphyA1 gene was mapped to a region of linkage group O, which is homeologous to the region harboring E4 in linkage group I. Plants homozygous for the e4 allele were etiolated under continuous far red light, but the de-etiolation occurred partially, indicating that the mutation alone did not cause a complete loss of phyA function. The genetic redundancy suggests that the presence of duplicated copies of phyA genes accounts for the generation of photoperiod insensitivity, while protecting against the deleterious effects of mutation. Thus, this phenomenon provides a link between gene duplication and establishment of an adaptive response of plants to environments.

Journal ArticleDOI
01 Feb 2008-Blood
TL;DR: Several somatic mutations of JAK2 exon 12 can be found in a myeloproliferative disorder that is mainly characterized by erythrocytosis, as detected in patients with polycythemia vera and familial PV.

Journal ArticleDOI
TL;DR: The first genome-wide screen for SNPs that associate with alternative splicing and gene expression in human primary cells, evaluating 93 autopsy-collected cortical brain tissue samples with no defined neuropsychiatric condition and 80 peripheral blood mononucleated cell samples collected from living healthy donors, suggests that splicing effects may be of more phenotypic significance than overall gene expression changes.
Abstract: Numerous genome-wide screens for polymorphisms that influence gene expression have provided key insights into the genetic control of transcription. Despite this work, the relevance of specific polymorphisms to in vivo expression and splicing remains unclear. We carried out the first genome-wide screen, to our knowledge, for SNPs that associate with alternative splicing and gene expression in human primary cells, evaluating 93 autopsy-collected cortical brain tissue samples with no defined neuropsychiatric condition and 80 peripheral blood mononucleated cell samples collected from living healthy donors. We identified 23 high confidence associations with total expression and 80 with alternative splicing as reflected by expression levels of specific exons. Fewer than 50% of the implicated SNPs however show effects in both tissue types, reflecting strong evidence for distinct genetic control of splicing and expression in the two tissue types. The data generated here also suggest the possibility that splicing effects may be responsible for up to 13 out of 84 reported genome-wide significant associations with human traits. These results emphasize the importance of establishing a database of polymorphisms affecting splicing and expression in primary tissue types and suggest that splicing effects may be of more phenotypic significance than overall gene expression changes.

Journal ArticleDOI
05 Sep 2008-Science
TL;DR: It is shown that normal endometrial stromal cells contain a specific chimeric RNA joining 5′ exons of the JAZF1 gene on chromosome 7p15 to 3′ exons of the Polycomb group gene JJAZ1/SUZ12 on chromosome 17q11 and that this RNA is translated into JAZF1-JJAZ1, a protein with anti-apoptotic activity.
Abstract: Chromosomal rearrangements that create gene fusions are common features of human tumors. The prevailing view is that the resultant chimeric transcripts and proteins are abnormal, tumor-specific products that provide tumor cells with a growth and/or survival advantage. We show that normal endometrial stromal cells contain a specific chimeric RNA joining 5' exons of the JAZF1 gene on chromosome 7p15 to 3' exons of the Polycomb group gene JJAZ1/SUZ12 on chromosome 17q11 and that this RNA is translated into JAZF1-JJAZ1, a protein with anti-apoptotic activity. The JAZF1-JJAZ1 RNA appears to arise from physiologically regulated trans-splicing between precursor messenger RNAs for JAZF1 and JJAZ1. The chimeric RNA and protein are identical to those produced from a gene fusion found in human endometrial stromal tumors. These observations suggest that certain gene fusions may be pro-neoplastic owing to constitutive expression of chimeric gene products normally generated by trans-splicing of RNAs in developing tissues.

Journal ArticleDOI
TL;DR: The observation that 18,024 of 144,079 peptides did not match current gene models suggests that 13% of the Arabidopsis proteome was incomplete due to approximately equal numbers of missing and incorrect gene models.
Abstract: Gene annotation underpins genome science. Most often protein coding sequence is inferred from the genome based on transcript evidence and computational predictions. While generally correct, gene models suffer from errors in reading frame, exon border definition, and exon identification. To ascertain the error rate of Arabidopsis thaliana gene models, we isolated proteins from a sample of Arabidopsis tissues and determined the amino acid sequences of 144,079 distinct peptides by tandem mass spectrometry. The peptides corresponded to 1 or more of 3 different translations of the genome: a 6-frame translation, an exon splice-graph, and the currently annotated proteome. The majority of the peptides (126,055) resided in existing gene models (12,769 confirmed proteins), comprising 40% of annotated genes. Surprisingly, 18,024 novel peptides were found that do not correspond to annotated genes. Using the gene finding program AUGUSTUS and 5,426 novel peptides that occurred in clusters, we discovered 778 new protein-coding genes and refined the annotation of an additional 695 gene models. The remaining 13,449 novel peptides provide high quality annotation (>99% correct) for thousands of additional genes. Our observation that 18,024 of 144,079 peptides did not match current gene models suggests that 13% of the Arabidopsis proteome was incomplete due to approximately equal numbers of missing and incorrect gene models.
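The peptide tallies in this abstract can be checked directly. The sketch below uses only the counts quoted above and illustrative variable names.

```python
# Counts as stated in the abstract.
total_peptides = 144_079       # distinct peptides sequenced
in_existing_models = 126_055   # peptides residing in existing gene models
novel_peptides = 18_024        # peptides not matching annotated genes

# The two categories should account for every peptide.
accounted_for = in_existing_models + novel_peptides

# Fraction of peptides outside current gene models, in percent.
novel_percent = 100 * novel_peptides / total_peptides

print(accounted_for == total_peptides, round(novel_percent, 1))
```

The two categories sum to the stated total, and the novel fraction is about 12.5%, which the abstract reports as 13%.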

Journal ArticleDOI
TL;DR: A large family of genes coding for small proteins has been identified in D. discoideum and two groups of very similar genes from this family have been shown to be specifically expressed in prestalk cells during development.
Abstract: The social amoeba Dictyostelium discoideum executes a multicellular development program upon starvation. This morphogenetic process requires the differential regulation of a large number of genes and is coordinated by extracellular signals. The MADS-box transcription factor SrfA is required for several stages of development, including slug migration and spore terminal differentiation. Subtractive hybridization allowed the isolation of a gene, sigN (SrfA-induced gene N), that was dependent on the transcription factor SrfA for expression at the slug stage of development. Homology searches detected the existence of a large family of sigN-related genes in the Dictyostelium discoideum genome. The 13 most similar genes are grouped in two regions of chromosome 2 and have been named Group1 and Group2 sigN genes. The putative encoded proteins are 87–89 amino acids long. All these genes have a similar structure, composed of a first exon containing a 13-nucleotide-long open reading frame and a second exon comprising the remainder of the putative coding region. The expression of these genes is induced at 10 hours of development. Analyses of their promoter regions indicate that these genes are expressed in the prestalk region of developing structures. The addition of antibodies raised against SigN Group 2 proteins induced disintegration of multicellular structures at the mound stage of development. A large family of genes coding for small proteins has been identified in D. discoideum. Two groups of very similar genes from this family have been shown to be specifically expressed in prestalk cells during development. Functional studies using antibodies raised against Group 2 SigN proteins indicate that these genes could play a role during multicellular development.

Journal ArticleDOI
TL;DR: The authors comprehensively predict the targets of the brain- and muscle-specific splicing factor Fox-1 (A2BP1) and its paralog Fox-2 (RBM9) and systematically define the corresponding splicing regulatory networks genome-wide.
Abstract: The precise regulation of many alternative splicing (AS) events by specific splicing factors is essential to determine tissue types and developmental stages. However, the molecular basis of tissue-specific AS regulation and the properties of splicing regulatory networks (SRNs) are poorly understood. Here we comprehensively predict the targets of the brain- and muscle-specific splicing factor Fox-1 (A2BP1) and its paralog Fox-2 (RBM9) and systematically define the corresponding SRNs genome-wide. Fox-1/2 are conserved from worm to human, and specifically recognize the RNA element UGCAUG. We integrate Fox-1/2-binding specificity with phylogenetic conservation, splicing microarray data, and additional computational and experimental characterization. We predict thousands of Fox-1/2 targets with conserved binding sites, at a false discovery rate (FDR) of ∼24%, including many validated experimentally, suggesting a surprisingly extensive SRN. The preferred position of the binding sites differs according to AS pattern, and determines either activation or repression of exon recognition by Fox-1/2. Many predicted targets are important for neuromuscular functions, and have been implicated in several genetic diseases. We also identified instances of binding site creation or loss in different vertebrate lineages and human populations, which likely reflect fine-tuning of gene expression regulation during evolution.

Journal ArticleDOI
TL;DR: PPMO M23D-B, designed to force skipping of the stop-codon-containing dystrophin exon 23, is investigated in an mdx mouse model of Duchenne muscular dystrophy; this is the first report of oligonucleotide-mediated exon skipping and dystrophin protein induction in the heart of treated animals.

Journal ArticleDOI
TL;DR: Discovery of silent mutations and intronic mutations of the tau gene in some individuals with frontotemporal dementia with Parkinsonism linked to chromosome 17 (FTDP-17) demonstrates that dysregulation of tau exon 10 alternative splicing, and consequently of the 3R-tau/4R-tau balance, is sufficient to cause neurodegeneration and dementia.
Abstract: Abnormalities of microtubule-associated protein tau play a central role in neurofibrillary degeneration in several neurodegenerative disorders that are collectively called tauopathies. Six isoforms of tau are expressed in adult human brain, which result from alternative splicing of pre-mRNA generated from a single tau gene. Alternative splicing of tau exon 10 results in tau isoforms containing either three or four microtubule-binding repeats (3R-tau and 4R-tau, respectively). Approximately equal levels of 3R-tau and 4R-tau are expressed in normal adult human brain, but the 3R-tau/4R-tau ratio is altered in the brain in several tauopathies. Discovery of silent mutations and intronic mutations of the tau gene in some individuals with frontotemporal dementia with Parkinsonism linked to chromosome 17 (FTDP-17), which only disrupt tau exon 10 splicing but do not alter tau's primary sequence, demonstrates that dysregulation of tau exon 10 alternative splicing and consequently of 3R-tau/4R-tau balance is sufficient to cause neurodegeneration and dementia. Here, we review the gene structure, transcripts and protein isoforms of tau, followed by the regulation of exon 10 splicing that determines the expression of 3R-tau or 4R-tau. Finally, dysregulation of exon 10 splicing of tau in several tauopathies is discussed. Understanding the molecular mechanisms by which tau exon 10 splicing is regulated and how it is disrupted in tauopathies will provide new insight into the mechanisms of these tauopathies and help identify new therapeutic targets to treat these disorders.

Journal ArticleDOI
TL;DR: It is suggested that splicing silencers play a more prominent role in alternative splicing regulation than previously anticipated and that defects in genes involved in the regulation of splicing in cancerous tissues affect the delicate regulation of the inclusion level of alternatively skipped exons, shifting their mode of splicing back to constitutive.
Abstract: Splicing is a molecular mechanism, by which introns are removed from an mRNA precursor and exons are ligated to form a mature mRNA. Mutations that cause defects in the splicing mechanism are known to be responsible for many diseases, including cystic fibrosis and familial dysautonomia. If mutations that cause defects in splicing are responsible for such severe deleterious phenotypic differences, it is possible that mutations in splicing are also responsible for mildly deleterious phenotypic differences. Although deleterious mutations are rapidly eliminated from the population by purifying selection, the selection against mild deleterious effects is not as strong. Since mildly deleterious mutations have a chance of surviving natural selection, we might be mistakenly referring to these mutations as neutral variation between individuals. Splicing has also been shown to be seriously affected in cancer. Examination of cancerous tissues revealed alterations in expression levels of genes involved in mRNA processing and also a slight reduction in the level of exon skipping--the most common form of alternative splicing in humans. This implies that defects in genes involved in the regulation of splicing in cancerous tissues affect the delicate regulation of the inclusion level of alternatively skipped exons, shifting their mode of splicing back to constitutive. It may be that splicing silencers play a more prominent role in alternative splicing regulation than previously anticipated.

Journal ArticleDOI
TL;DR: In silico protein predictions suggest that the identified cancer-specific splice variants encode proteins with potentially altered functions, indicating that they may be involved in pathogenesis and hence represent novel therapeutic targets.

Journal ArticleDOI
TL;DR: This review summarizes, from an evolutionary perspective, current knowledge about the origin of alternative splicing, the mechanism by which multiple transcripts are generated from a single mRNA precursor.
Abstract: Alternative splicing is a well-characterized mechanism by which multiple transcripts are generated from a single mRNA precursor. By allowing production of several protein isoforms from one pre-mRNA, alternative splicing contributes to proteomic diversity. But what do we know about the origin of this mechanism? Do the same evolutionary forces apply to alternatively and constitutively spliced exons? Do similar forces act on all types of alternative splicing? Are the products generated by alternative splicing functional? Why is "improper" recognition of exons and introns allowed by the splicing machinery? In this review, we summarize the current knowledge regarding these issues from an evolutionary perspective.

Journal ArticleDOI
TL;DR: The present results, which are in contrast with previously published data, suggest that, with the notable exception of male and female reproduction, ERβ is not required in the mouse for the development and homeostasis of the major body systems.
Abstract: Estrogen signaling is mediated by estrogen receptors α (ERα) and β (ERβ). Although a consensus has now been reached concerning many physiological functions of ERα, those of ERβ are still controversial: When housed and examined in two distant laboratories, mice originating from the same initial ERβ mutant exhibited widely different phenotypes, which were themselves different from the phenotype of another ERβ mutant previously generated in our laboratory. Because, in addition to a knockout insertion in exon 3, all these mouse mutants displayed alternative splicing transcripts, we have now constructed an ERβ mouse mutant (ERβSTL−/L−) in which exon 3 was cleanly deleted by Cre/LoxP-mediated excision and which was devoid of any transcript downstream of exon 3. Both females and males were sterile. The histology of the ovary was mildly affected, and no histological defects were detected in other organs, either in females or in males. Our present results, which are in contrast with previously published data, suggest that, with the notable exception of male and female reproduction, ERβ is not required in the mouse for the development and homeostasis of the major body systems.