Journal ISSN: 2226-6089

EMBnet.journal 

EMBnet Stichting
About: EMBnet.journal is an open-access academic journal published by EMBnet Stichting. It publishes mainly in the areas of EMBnet and Genome. Its ISSN is 2226-6089. Over its lifetime, the journal has published 308 publications, which have received 21,202 citations.


Papers
Journal Article
TL;DR: The command-line tool cutadapt is developed, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features.
Abstract: When small RNA is sequenced on current sequencing machines, the resulting reads are usually longer than the RNA and therefore contain parts of the 3' adapter. That adapter must be found and removed error-tolerantly from each read before read mapping. Previous solutions are either hard to use or do not offer required features, in particular support for color space data. As an easy to use alternative, we developed the command-line tool cutadapt, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features. Cutadapt, including its MIT-licensed source code, is available for download at http://code.google.com/p/cutadapt/

20,255 citations

Journal Article
TL;DR: It is observed that many RNA-seq datasets have not reached saturation for detection of expressed genes and that the relative proportion of different transcript biotypes changes with increasing sequencing depth, and a novel differential expression methodology, NOISeq, that is robust to the number of reads is proposed.
Abstract: Next Generation Sequencing (NGS) technologies have brought a revolution to research on genomes and genome regulation. One of the most groundbreaking applications of NGS is transcriptome analysis. RNA-seq has revealed exciting new data on gene models, alternative splicing and extra-genic expression. RNA-seq also permits the quantification of gene expression across a large dynamic range and with more reproducibility than microarrays. Several methods for the assessment of differential expression from count data have been proposed, but biases associated with transcript length and transcript frequency distributions have been reported. It is still not clear how many sequencing reads should be generated in an RNA-seq experiment to obtain reliable results, or what exactly is being detected. In general, we observed that many RNA-seq datasets have not reached saturation for detection of expressed genes and that the relative proportion of different transcript biotypes changes with increasing sequencing depth. In this work we investigate the effect that library size has on the assessment of differential expression on different aspects of the selected genes. We show that current statistical methods suffer from a strong dependency of their significant calls on the number of mapped reads considered, and we propose a novel differential expression methodology, NOISeq, that is robust to the number of reads.

135 citations

Journal Article
TL;DR: ForCon is a software tool for the conversion of nucleic acid and amino acid sequence alignments that runs on IBM-compatible computers under a Microsoft Windows environment.
Abstract: ForCon is a software tool for the conversion of nucleic acid and amino acid sequence alignments that runs on IBM-compatible computers under a Microsoft Windows environment. The program converts alignment formats used by all popular software packages for sequence alignment and phylogenetic tree inference. ForCon is available for free on request from the authors or can be downloaded via internet at URL http://bioc-www.uia.ac.be/u/jraes/index.html. It is also included in the software package TREECON for Windows (see http://bioc-www.uia.ac.be/u/yvdp/index.html).

45 citations

Journal Article
TL;DR: To bring sequencing to the foreground, scientists have to overcome obstacles and find alternative ways to approach the issue of data volume; out-of-the-box solutions may ease the typical research workflow until technological development meets the needs of bioinformatics.
Abstract: Over the last decades, there has been a vast data explosion in bioinformatics. Big data centres are trying to face this data crisis by reaching high storage capacity levels. Although several scientific giants are examining how to handle the enormous pile of information in their cupboards, the problem remains unsolved. On a daily basis, a massive quantity of extensive information is permanently lost due to infrastructure and storage space problems. The motivation for sequencing has fallen behind. Sometimes, more time is spent solving storage space problems than collecting and analysing data. To bring sequencing to the foreground, scientists have to overcome such obstacles and find alternative ways to approach the issue of data volume. The scientific community is experiencing a data crisis era, in which out-of-the-box solutions may ease the typical research workflow until technological development meets the needs of bioinformatics.

41 citations

Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2023    8
2022    10
2021    13
2020    4
2019    3
2018    5