Book Chapter

Soft Computing in Bioinformatics

01 Jan 2021-pp 431-446
TL;DR: This chapter explores soft computing based techniques for bioinformatics and discusses why these techniques are needed and how well they suit a wide spectrum of bioinformatics problems.
Abstract: In this chapter, we explore soft computing based techniques for bioinformatics. The necessity of soft computing techniques and their suitability for solving a wide spectrum of bioinformatics problems are reviewed. The basics of soft computing techniques are discussed, and their relevance to many bioinformatics problems is elaborated. Experimental results on two real-world bioinformatics datasets demonstrate the efficacy of soft computing techniques over conventional ones for biological data problems.
References
Journal Article
TL;DR: The application of ACO to this bioinformatics problem compares favourably with specialised, state-of-the-art methods for the 2D and 3D HP protein folding problem; the empirical results indicate that the rather simple ACO algorithm scales worse with sequence length but usually finds a more diverse ensemble of native states.
Abstract: The protein folding problem is a fundamental problem in computational molecular biology and biochemical physics. Various optimisation methods have been applied to formulations of the ab-initio folding problem that are based on reduced models of protein structure, including Monte Carlo methods, Evolutionary Algorithms, Tabu Search and hybrid approaches. In our work, we have introduced an ant colony optimisation (ACO) algorithm to address the non-deterministic polynomial-time hard (NP-hard) combinatorial problem of predicting a protein's conformation from its amino acid sequence under a widely studied, conceptually simple model, the 2-dimensional (2D) and 3-dimensional (3D) hydrophobic-polar (HP) model. We present an improvement of our previous ACO algorithm for the 2D HP model and its extension to the 3D HP model. We show that this new algorithm, dubbed ACO-HPPFP-3, performs better than previous state-of-the-art algorithms on sequences whose native conformations do not contain structural nuclei (parts of the native fold that predominantly consist of local interactions) at the ends, but rather in the middle of the sequence, and that it generally finds a more diverse set of native conformations. The application of ACO to this bioinformatics problem compares favourably with specialised, state-of-the-art methods for the 2D and 3D HP protein folding problem; our empirical results indicate that our rather simple ACO algorithm scales worse with sequence length but usually finds a more diverse ensemble of native states. Therefore, the development of ACO algorithms for more complex and realistic models of protein structure holds significant promise.
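To make the objective in the HP model concrete, the following is a minimal sketch of the standard 2D HP energy function, which any optimiser (ACO or otherwise) would try to minimise. It is not the ACO-HPPFP-3 algorithm itself. The function name `hp_energy` and the coordinate representation are assumptions chosen for illustration; the scoring rule (a contribution of -1 for every pair of H residues that are lattice neighbours but not chain neighbours) is the usual definition of the model.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D HP conformation on the square lattice.

    sequence: string over {'H', 'P'} (hydrophobic / polar residues).
    coords:   list of (x, y) integer lattice positions, one per residue.
    Returns -1 for every H-H pair that is adjacent on the lattice
    but not consecutive in the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive residues must occupy neighbouring lattice sites.
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

For example, folding the four-residue chain `"HPPH"` around a unit square brings its two H residues into contact, whereas a straight chain has no contacts at all.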

278 citations

Journal Article
TL;DR: The results showed that fuzzy c-means performed very well in all cases and remained stable even in the presence of outliers and overlapping, while all other clustering algorithms were strongly affected by the amount of overlapping and outliers.

263 citations

Journal Article
TL;DR: This study compares the classification accuracy of ECG signals using a well-known neural network architecture, the multi-layered perceptron (MLP) with the backpropagation training algorithm, and a new fuzzy clustering NN architecture (FCNN) for early diagnosis.

244 citations

Journal Article
TL;DR: A common misunderstanding of Gaussian-function-based kernel fuzzy clustering is corrected, and a kernel fuzzy c-means clustering-based fuzzy SVM algorithm (KFCM-FSVM) is developed to deal with the classification problems with outliers or noises.
Abstract: The support vector machine (SVM) has provided higher performance than traditional learning machines and has been widely applied in real-world classification problems and nonlinear function estimation problems. Unfortunately, the training process of the SVM is sensitive to the outliers or noises in the training set. In this paper, a common misunderstanding of Gaussian-function-based kernel fuzzy clustering is corrected, and a kernel fuzzy c-means clustering-based fuzzy SVM algorithm (KFCM-FSVM) is developed to deal with the classification problems with outliers or noises. In the KFCM-FSVM algorithm, we first use the FCM clustering to cluster each of two classes from the training set in the high-dimensional feature space. The farthest pair of clusters, where one cluster comes from the positive class and the other from the negative class, is then searched and forms one new training set with membership degrees. Finally, we adopt FSVM to induce the final classification results on this new training set. The computational complexity of the KFCM-FSVM algorithm is analyzed. A set of experiments is conducted on six benchmarking datasets and four artificial datasets for testing the generalization performance of the KFCM-FSVM algorithm. The results indicate that the KFCM-FSVM algorithm is robust for classification problems with outliers or noises.
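The clustering step mentioned in this abstract builds on fuzzy c-means, so a minimal sketch of the plain (non-kernel) algorithm may help. It uses ordinary Euclidean distance in the input space rather than the kernel-induced feature space used by KFCM-FSVM, and it is not the authors' implementation. The function names, the fuzzifier default `m=2.0`, and the fixed iteration count are assumptions made for illustration.

```python
import math

def fcm_memberships(points, centers, m=2.0, eps=1e-12):
    """Membership u[k][i] of point k in cluster i (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        dists = [max(math.dist(p, c), eps) for c in centers]
        row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
               for d_i in dists]
        memberships.append(row)
    return memberships

def fcm_centers(points, memberships, m=2.0):
    """Cluster centers as means of the points weighted by u ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)
```

Each point receives a graded membership in every cluster instead of a hard label. Membership degrees of this kind are what a fuzzy SVM can then use to down-weight outliers or noisy training samples.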

238 citations

Journal Article
TL;DR: This work developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes and achieved a significant improvement in accuracy.
Abstract: Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80–90% accurate in jackknife testing experiments for bacteria and 90–99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions is available via the web.

222 citations