
Showing papers by "James B. Brown" published in 2018


Journal ArticleDOI
TL;DR: The iterative random forest algorithm (iRF) is developed and its utility is demonstrated for high-order interaction discovery in two prediction problems: enhancer activity in the early Drosophila embryo and alternative splicing of primary transcripts in human-derived cell lines.
Abstract: Genomics has revolutionized biology, enabling the interrogation of whole transcriptomes, genome-wide binding sites for proteins, and many other molecular processes. However, individual genomic assays measure elements that interact in vivo as components of larger molecular machines. Understanding how these high-order interactions drive gene expression presents a substantial statistical challenge. Building on random forests (RFs) and random intersection trees (RITs) and through extensive, biologically inspired simulations, we developed the iterative random forest algorithm (iRF). iRF trains a feature-weighted ensemble of decision trees to detect stable, high-order interactions with the same order of computational cost as the RF. We demonstrate the utility of iRF for high-order interaction discovery in two prediction problems: enhancer activity in the early Drosophila embryo and alternative splicing of primary transcripts in human-derived cell lines. In Drosophila, among the 20 pairwise transcription factor interactions iRF identifies as stable (returned in more than half of bootstrap replicates), 80% have been previously reported as physical interactions. Moreover, third-order interactions, e.g., between Zelda (Zld), Giant (Gt), and Twist (Twi), suggest high-order relationships that are candidates for follow-up experiments. In human-derived cells, iRF rediscovered a central role of H3K36me3 in chromatin-mediated splicing regulation and identified interesting fifth- and sixth-order interactions, indicative of multivalent nucleosomes with specific roles in splicing regulation. By decoupling the order of interactions from the computational cost of identification, iRF opens additional avenues of inquiry into the molecular mechanisms underlying genome biology.

216 citations


Journal ArticleDOI
TL;DR: The positive correlation between methylation of internal exons and gene expression potentially points to an evolutionarily conserved mechanism, whereas the negative regulation of gene expression via methylation of promoters and exon 1 is potentially a secondary mechanism that evolved in vertebrates.
Abstract: DNA methylation is an evolutionarily ancient epigenetic modification that is phylogenetically widespread. Comparative studies of the methylome across a diverse range of non-conventional and conventional model organisms are expected to help reveal how the landscape of DNA methylation and its functions have evolved. Here, we explore the DNA methylation profile of two species of the crustacean Daphnia using whole genome bisulfite sequencing. We then compare our data with the methylomes of two insects and two mammals to achieve a better understanding of the function of DNA methylation in Daphnia. Using RNA-sequencing data for all six species, we investigate the correlation between DNA methylation and gene expression. DNA methylation in Daphnia is mainly enriched within the coding regions of genes, with the highest methylation levels observed at exons 2-4. In contrast, vertebrate genomes are globally methylated, with methylation increasing towards its highest levels at exon 2 and remaining high across the rest of the gene body. Although DNA methylation patterns differ among all species, their methylation profiles share a bimodal distribution across the genomes. Genes with low levels of CpG methylation and gene expression are mainly enriched for species-specific genes. In contrast, genes associated with highly methylated CpG sites are highly transcribed and evolutionarily conserved across all species. Finally, the positive correlation between methylation of internal exons and gene expression potentially points to an evolutionarily conserved mechanism, whereas the negative regulation of gene expression via methylation of promoters and exon 1 is potentially a secondary mechanism that evolved in vertebrates.

44 citations


Journal ArticleDOI
TL;DR: Approximately one‐third of the Daphnia genes, enriched for metabolism, cell signalling and general stress response, are found to drive the early transcriptional response to environmental stress, and this response is shared among genetic backgrounds.
Abstract: Natural habitats are exposed to an increasing number of environmental stressors that cause important ecological consequences. However, the multifarious nature of environmental change, the strength and the relative timing of each stressor largely limit our understanding of biological responses to environmental change. In particular, early response to unpredictable environmental change, critical to survival and fitness in later life stages, is largely uncharacterized. Here, we characterize the early transcriptional response of the keystone species Daphnia magna to twelve environmental perturbations, including biotic and abiotic stressors. We first perform a differential expression analysis aimed at identifying differential regulation of individual genes in response to stress. This preliminary analysis revealed that a few individual genes were responsive to environmental perturbations and that they were modulated in a stressor- and genotype-specific manner. Given the limited number of differentially regulated genes, we were unable to identify pathways involved in stress response. Hence, to gain a better understanding of the genetic and functional foundation of tolerance to multiple environmental stressors, we leveraged the correlative nature of networks and performed a weighted gene co-expression network analysis. We discovered that approximately one-third of the Daphnia genes, enriched for metabolism, cell signalling and general stress response, drive the early transcriptional response to environmental stress, and that this response is shared among genetic backgrounds. This initial response is followed by a genotype- and/or condition-specific transcriptional response with a strong genotype-by-environment interaction. Intriguingly, the genotype- and condition-specific transcriptional response is found in genes not conserved beyond crustaceans, suggesting niche-specific adaptation.

42 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: The proposed TRNG has been successfully validated on three different processes and they all passed the National Institute of Standards and Technology (NIST) tests, making it a suitable candidate for future cryptographically secured applications in the internet of things (IoT).
Abstract: A novel True Random Number Generator (TRNG), using random telegraph noise (RTN) as the entropy source, is proposed to address speed, design area, power and cost simultaneously. For the first time, the proposed design breaks the inherent speed limitation and generates true random numbers at up to 3 Mbps with ultra-low power. This is over 10 times faster than the state-of-the-art RTN-TRNG [6]. Moreover, the new design does not require selection of devices and thus avoids the use of a large transistor array and a laborious post-selection process. This reduces the circuit area and the cost. The proposed TRNG has been successfully validated on three different processes and they all passed the National Institute of Standards and Technology (NIST) tests, making it a suitable candidate for future cryptographically secured applications in the internet of things (IoT).

27 citations


Journal ArticleDOI
29 Jun 2018-RNA
TL;DR: RNA-seq analysis of nonsense-mediated decay-inhibited cells revealed previously undescribed splice junctions that connect constitutive exons 4 and 5 to highly conserved cryptic cassette exons within the intron, and it is proposed that these exons function as decoys that engage the intron-terminal splice sites, thereby blocking cross-intron interactions required for excision.
Abstract: During terminal erythropoiesis, the splicing machinery in differentiating erythroblasts executes a robust intron retention (IR) program that impacts expression of hundreds of genes. We studied IR mechanisms in the SF3B1 splicing factor gene, which expresses ∼50% of its transcripts in late erythroblasts as a nuclear isoform that retains intron 4. RNA-seq analysis of nonsense-mediated decay (NMD)-inhibited cells revealed previously undescribed splice junctions, rare or not detected in normal cells, that connect constitutive exons 4 and 5 to highly conserved cryptic cassette exons within the intron. Minigene splicing reporter assays showed that these cassettes promote IR. Genome-wide analysis of splice junction reads demonstrated that cryptic noncoding cassettes are much more common in large (>1 kb) retained introns than they are in small retained introns or in nonretained introns. Functional assays showed that heterologous cassettes can promote retention of intron 4 in the SF3B1 splicing reporter. Although many of these cryptic exons were spliced inefficiently, they exhibited substantial binding of U2AF1 and U2AF2 adjacent to their splice acceptor sites. We propose that these exons function as decoys that engage the intron-terminal splice sites, thereby blocking cross-intron interactions required for excision. Developmental regulation of decoy function underlies a major component of the erythroblast IR program.

27 citations



Posted ContentDOI
18 Jan 2018-bioRxiv
TL;DR: It is shown that two classes of enhancer are active during early Drosophila embryogenesis and that by focusing on a single, relatively homogeneous class of elements, over 98% prediction accuracy can be achieved in a balanced, completely held-out test set.
Abstract: Identifying functional enhancer elements in metazoan systems is a major challenge. For example, large-scale validation of enhancers predicted by ENCODE reveals false positive rates of at least 70%. Here we use the pregastrula patterning network of Drosophila melanogaster to demonstrate that loss in accuracy in held-out data results from heterogeneity of functional signatures in enhancer elements. We show that two classes of enhancer are active during early Drosophila embryogenesis and that by focusing on a single, relatively homogeneous class of elements, over 98% prediction accuracy can be achieved in a balanced, completely held-out test set. The class of well-predicted elements is composed predominantly of enhancers driving multi-stage segmentation patterns, which we designate segmentation driving enhancers (SDE). Prediction is driven by the DNA occupancy of early developmental transcription factors, with almost no additional power derived from histone modifications. We further show that improved accuracy is not a property of a particular prediction method: after conditioning on the SDE set, naive Bayes and logistic regression perform as well as more sophisticated tools. Applying this method to a genome-wide scan, we predict 1,640 SDEs that cover 16% of the genome, 916 of which are novel. An analysis of 32 novel SDEs using wholemount embryonic imaging of stably integrated reporter constructs chosen throughout our prediction rank-list showed that >90% drove expression patterns. We achieved 86.7% precision on a genome-wide scan, with an estimated recall of at least 98%, indicating high accuracy and completeness in annotating this class of functional elements.

9 citations


Journal ArticleDOI
05 Dec 2018
TL;DR: This work presents a meta-analysis of the determinants of infectious disease in eight types of inflammatory bowel disease and shows clear patterns in response to antibiotics and in particular the immune response to E. coli.
Abstract: (1) Denotes equal contribution. (a) Department of Biological Statistics and Computational Biology, Cornell University; (b) Department of Statistical Science, Cornell University; (c) Statistics Department, University of California, Berkeley; (d) Centre for Computational Biology, School of Biosciences, University of Birmingham; (e) Molecular Ecosystems Biology Department, Biosciences Area, Lawrence Berkeley National Laboratory; (f) Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. DOI: 10.21105/joss.01077

Posted ContentDOI
09 Mar 2018-bioRxiv
TL;DR: RNA-seq analysis of nonsense-mediated decay (NMD)-inhibited cells revealed previously undescribed splice junctions that connect constitutive exons 4 and 5 to highly conserved cryptic cassette exons within the intron, suggesting developmental regulation of decoy function underlies a major component of the erythroblast IR program.
Abstract: During terminal erythropoiesis, differentiating erythroblasts execute a robust program of intron retention (IR). We studied IR mechanisms in the SF3B1 splicing factor gene, which expresses ~50% of its transcripts in late erythroblasts as a nuclear isoform that retains intron 4. RNA-seq splice junction reads from nonsense-mediated decay (NMD)-inhibited cells revealed that highly conserved intron sequences encode cryptic cassette exons, and minigene splicing reporter assays showed that these cassettes function as decoys that promote IR. Novel decoy exons were common in large (>1kb) retained introns, and heterologous decoys promoted retention of intron 4. Although most decoys were spliced inefficiently, they exhibited substantial binding of U2AF1 and U2AF2 adjacent to their splice acceptor sites. We propose that decoy exons engage intron-terminal splice sites, blocking cross-intron interactions required for excision, and that developmental regulation of decoy function underlies a major component of the erythroblast IR program.