Showing papers by "Jens Allmer" published in 2022


Book ChapterDOI
TL;DR: The authors point out open challenges for computational modelling and for the general understanding of miRNA-based regulation, and show how investigating them is beneficial. They argue that computational prediction is necessary because investigating all possible pairs of a miRNA and its target is not feasible.
Abstract: Mature microRNAs (miRNAs) are short RNA sequences about 18-24 nucleotides long, which provide the recognition key within RISC for the posttranscriptional regulation of target RNAs. Considering the canonical pathway, mature miRNAs are produced via a multistep process. Their transcription (pri-miRNAs) and first processing step via the microprocessor complex (pre-miRNAs) occur in the nucleus. Then they are exported into the cytosol, processed again by Dicer (dsRNA) and finally a single strand (mature miRNA) is incorporated into RISC (miRISC). The sequence of the incorporated miRNA provides the function of RNA target recognition via hybridization. Following binding of the target, the mRNA is either degraded or translation is inhibited, which ultimately leads to less protein production. Conversely, it has been shown that binding within the 5' UTR of the mRNA can lead to an increase in protein product. Regulation of homeostasis is very important for a cell; therefore, all steps in the miRNA-based regulation pathway, from transcription to the incorporation of the mature miRNA into RISC, are under tight control. While much research effort has been exerted in this area, the knowledge base is not sufficient for accurately modelling miRNA regulation computationally. The computational prediction of miRNAs is, however, necessary because it is not feasible to investigate all possible pairs of a miRNA and its target, let alone miRNAs and their targets. Here we point out open challenges important for computational modelling or for our general understanding of miRNA-based regulation and show how their investigation is beneficial. It is our hope that this collection of challenges will lead to their resolution in the near future.

2 citations


Journal ArticleDOI
TL;DR: A central or shared database enforcing community reporting and quality standards is needed in the future for ncRNAs, and a small number of highly complementary ncRNA databases are discussed in this work.
Abstract: Diseases such as cancer are often defined by dysregulation of gene expression. Noncoding RNAs (ncRNAs) such as microRNAs are involved in gene expression and cell-cell communication. Many other ncRNAs exist, such as circular RNAs and small nucleolar RNAs. A wealth of knowledge is available for many ncRNAs, but the information is federated in many databases. A small number of highly complementary ncRNA databases are discussed in this work. Their relevance for cancer research is highlighted, and some of the current problems and limitations are revealed. A central or shared database enforcing community reporting and quality standards is needed in the future.
Keywords: RNA-seq, Noncoding RNAs, Databases, Data repositories

1 citation


Book ChapterDOI
TL;DR: In this paper, an ensemble classifier of multiple two-class random forest classifiers was designed, where each random forest was trained on one species-clade pair and the approach was tested with different sampling methods on a dataset that was taken from miRBase version 21 and evaluated using a hierarchical F-measure.
Abstract: Gene regulation is of utmost importance to cell homeostasis; thus, any dysregulation in it often leads to disease. MicroRNAs (miRNAs) are involved in posttranscriptional gene regulation and consequently, their dysregulation has been associated with many diseases. miRBase version 21 contains microRNAs from about 200 species organized into about 70 clades. It has been shown that not all miRNAs collected in the database are likely to be real and, therefore, novel routes to delineate between correct and false miRNAs should be explored. We introduce a novel approach based on k-mer frequencies and machine learning that assigns an unknown/unlabeled miRNA to its most likely clade/species of origin. A simple way to filter new data would be to ensure that the novel miRNA categorizes closely to the species it is said to originate from. For that, an ensemble classifier of multiple two-class random forest classifiers was designed, where each random forest was trained on one species-clade pair. The approach was tested with different sampling methods on a dataset that was taken from miRBase version 21 and it was evaluated using a hierarchical F-measure. The approach predicted 81% to 94% of the test data correctly, depending on the sampling method. This is the first classifier that can classify miRNAs to their species of origin. This method will aid in the evaluation of miRNA database integrity and analysis of noisy miRNA samples.
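
The two building blocks named in this abstract, k-mer frequency features and a hierarchical F-measure, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the choice of k = 2, the example sequence, and the taxonomy labels are all assumptions made for illustration, and the hierarchical F-measure shown is one common ancestor-overlap definition, which may differ from the variant used in the chapter. Training the random forests themselves is not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; T is mapped to U below


def kmer_frequencies(sequence, k=2):
    """Relative frequency of every possible k-mer over ACGU in `sequence`."""
    seq = sequence.upper().replace("T", "U")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Windows containing other symbols (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def hierarchical_f(true_path, pred_path):
    """Hierarchical F-measure for one prediction; paths run root -> label.

    Assumed definition: precision and recall are computed on the overlap
    between the ancestor sets of the predicted and the true label.
    """
    true_set, pred_set = set(true_path), set(pred_path)
    overlap = len(true_set & pred_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(true_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical usage: a short example sequence and illustrative taxonomy paths.
features = kmer_frequencies("UGAGGUAGUAGGUUGUAUAGUU", k=2)
score = hierarchical_f(
    ["Metazoa", "Vertebrata", "Homo sapiens"],
    ["Metazoa", "Vertebrata", "Mus musculus"],
)
```

Under these assumptions, `features` is a fixed-length vector of 4^k values that could serve as input to any two-class classifier, and `score` gives partial credit to a prediction that lands in the correct clade but the wrong species.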