Journal ISSN: 1574-8936

Current Bioinformatics 

Bentham Science Publishers
About: Current Bioinformatics is an academic journal published by Bentham Science Publishers. The journal publishes mainly in the areas of Computer science and Gene. Its ISSN is 1574-8936. Over its lifetime, the journal has published 838 papers, which have received 9111 citations.


Papers
Journal Article
TL;DR: This article provides a review of the most widely used ensemble learning methods and their application in various bioinformatics problems, including the main topics of gene expression, mass spectrometry-based proteomics, gene-gene interaction identification from genome-wide association studies, and prediction of regulatory elements from DNA and protein sequences.
Abstract: Ensemble learning is an intensively studied technique in machine learning and pattern recognition. Recent work in computational biology has seen an increasing use of ensemble learning methods due to their unique advantages in dealing with small sample sizes, high dimensionality, and complex data structures. The aim of this article is two-fold. First, it provides a review of the most widely used ensemble learning methods and their application in various bioinformatics problems, including the main topics of gene expression, mass spectrometry-based proteomics, gene-gene interaction identification from genome-wide association studies, and prediction of regulatory elements from DNA and protein sequences. Second, we try to identify and summarize future trends of ensemble methods in bioinformatics. Promising directions such as ensembles of support vector machines, meta-ensembles, and ensemble-based feature selection are discussed.

436 citations

Journal Article
TL;DR: A state-of-the-art overview of the data processing tools available is provided, with their advantages and disadvantages, and comparisons are made to guide the reader.
Abstract: Biological systems are increasingly being studied in a holistic manner, using omics approaches, to provide quantitative and qualitative descriptions of the diverse collection of cellular components. Among the omics approaches, metabolomics, which deals with the quantitative global profiling of small molecules or metabolites, is being used extensively to explore the dynamic response of living systems, such as organelles, cells, tissues, organs and whole organisms, under diverse physiological and pathological conditions. This technology is now used routinely in a number of applications, including basic and clinical research, agriculture, microbiology, food science, nutrition, pharmaceutical research, environmental science and the development of biofuels. Of the multiple analytical platforms available to perform such analyses, nuclear magnetic resonance and mass spectrometry have come to dominate, owing to the high resolution and large datasets that can be generated with these techniques. The large multidimensional datasets that result from such studies must be processed and analyzed to render the data meaningful. Thus, bioinformatics tools are essential for the efficient processing of huge datasets, the characterization of the detected signals, and the alignment of multiple datasets and their features. This paper provides a state-of-the-art overview of the data processing tools available, and reviews a collection of recent reports on the topic. Data conversion, pre-processing, alignment, normalization and statistical analysis are introduced, with their advantages and disadvantages, and comparisons are made to guide the reader.

285 citations

Journal Article
TL;DR: A review of automated methods for the discovery of Simple Sequence Repeats (SSRs) and Single Nucleotide Polymorphisms (SNPs) can be found in this paper.
Abstract: Molecular genetic markers represent one of the most powerful tools for the analysis of genomes and enable the association of heritable traits with underlying genomic variation. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based marker, Simple Sequence Repeats (SSRs), also known as microsatellites, and Single Nucleotide Polymorphisms (SNPs), now predominate in modern genetic analysis. The falling cost of DNA sequencing has led to the availability of large sequence data sets derived from whole genome sequencing and large-scale Expressed Sequence Tag (EST) discovery that enable the mining of SSRs and SNPs, which may then be applied to diversity analysis, genetic trait mapping, association studies, and marker-assisted selection. These markers are inexpensive, require minimal labour to produce and can frequently be associated with annotated genes. Here we review automated methods for the discovery of SSRs and SNPs and provide an overview of the diverse applications of these markers.

151 citations

Journal Article
TL;DR: This survey considers the major bioinformatics applications of Hidden Markov Models, such as alignment, labeling, and profiling of sequences, protein structure prediction, and pattern recognition, and provides a critical appraisal of the use and perspectives of HMMs.
Abstract: Hidden Markov Models (HMMs) have recently become important and popular among bioinformatics researchers, and many software tools are based on them. In this survey, we first consider in some detail the mathematical foundations of HMMs, describe the most important algorithms, and provide useful comparisons, pointing out advantages and drawbacks. We then consider the major bioinformatics applications, such as alignment, labeling, and profiling of sequences, protein structure prediction, and pattern recognition. Finally, we provide a critical appraisal of the use and perspectives of HMMs in bioinformatics.

146 citations

Journal Article
TL;DR: This review attempted to present a unified approach that considers both class-prediction and class-discovery, and discussed important issues such as preprocessing of gene expression data, curse of dimensionality, feature extraction/selection, and measuring or estimating classifier performance.
Abstract: In this review, we have discussed the class-prediction and class-discovery methods that are applied to gene expression data, along with the implications of the findings. We attempted to present a unified approach that considers both class-prediction and class-discovery. We devoted a substantial part of this review to an overview of pattern classification/recognition methods and discussed important issues such as preprocessing of gene expression data, curse of dimensionality, feature extraction/selection, and measuring or estimating classifier performance. We discussed and summarized important properties such as generalizability (sensitivity to overtraining), built-in feature selection, ability to report prediction strength, and transparency (ease of understanding of the operation) of different class-predictor design approaches to provide a quick and concise reference. We have also covered in detail the topic of biclustering, an emerging clustering method that processes the entries of the gene expression data matrix in both gene and sample directions simultaneously.

138 citations

Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2023    66
2022    77
2021    54
2020    74
2019    79
2018    29