Protein Folds Prediction with Hierarchical Structured SVM

doi:10.2174/157016461302160514000940

Journal Article

31 May 2016-Current Proteomics-Vol. 13, Iss: 2, pp 79-85

About: This article was published in Current Proteomics on 2016-05-31. It has received 112 citations to date. The article focuses on the topic: Structured support vector machine.

Citations

Journal Article

Sequence clustering in bioinformatics: an empirical study.

Quan Zou¹,², Gang Lin¹, Xingpeng Jiang³, Xiangrong Liu⁴, Xiangxiang Zeng⁴

Tianjin University¹, University of Electronic Science and Technology of China², Central China Normal University³, Xiamen University⁴

18 Sep 2018-Briefings in Bioinformatics

TL;DR: This review selected several popular clustering tools, briefly explained their key computing principles, analyzed their characteristics and compared them using two independent benchmark datasets, to help bioinformatics users employ suitable clustering tools effectively when analyzing big sequencing data.

Abstract: Sequence clustering is a basic bioinformatics task that is attracting renewed attention with the development of metagenomics and microbiomics. The latest sequencing techniques have decreased costs and, as a result, massive amounts of DNA/RNA sequences are being produced. The challenge is to cluster the sequence data using stable, quick and accurate methods. For microbiome sequencing data, 16S ribosomal RNA operational taxonomic units are typically used. However, there is often a gap between algorithm developers and bioinformatics users. Different software tools can produce diverse results, and users can find them difficult to analyze. Understanding the different clustering mechanisms is crucial to understanding the results that they produce. In this review, we selected several popular clustering tools, briefly explained the key computing principles, analyzed their characteristics and compared them using two independent benchmark datasets. Our aim is to assist bioinformatics users in employing suitable clustering tools effectively to analyze big sequencing data. Related data, code and software tools are accessible at http://lab.malab.cn/~lg/clustering/.

170 citations

Journal Article

iTerm-PseKNC: a sequence-based tool for predicting bacterial transcriptional terminators.

Chao-Qin Feng¹, Zhao-Yue Zhang¹, Xiao-Juan Zhu¹, Yan Lin², Wei Chen¹,³, Hua Tang, Hao Lin¹

University of Electronic Science and Technology of China¹, Sichuan Agricultural University², North China University of Science and Technology³

01 May 2019-Bioinformatics

TL;DR: This study develops iTerm-PseKNC, a support vector machine-based predictor that identifies transcription terminators from pseudo k-tuple nucleotide composition (PseKNC) features and could become a powerful tool for bacterial terminator recognition.

Abstract: Motivation: Transcription termination is an important regulatory step of gene expression. If a gene has no terminator, transcription cannot stop, which results in abnormal gene expression. Detecting such terminators can determine the operon structure in bacterial organisms and improve genome annotation. Thus, accurate identification of transcriptional terminators is essential in the research of transcription regulation. Results: In this study, we developed a new predictor called 'iTerm-PseKNC' based on support vector machine to identify transcription terminators. The binomial distribution approach was used to pick out the optimal feature subset derived from pseudo k-tuple nucleotide composition (PseKNC). The 5-fold cross-validation test results showed that our proposed method achieved an accuracy of 95%. To further evaluate the generalization ability of 'iTerm-PseKNC', the model was examined on independent datasets of experimentally confirmed Rho-independent terminators in the Escherichia coli and Bacillus subtilis genomes. As a result, all the terminators in E. coli and 87.5% of the terminators in B. subtilis were correctly identified, suggesting that the proposed model could become a powerful tool for bacterial terminator recognition. Availability and implementation: For the convenience of wet-lab researchers, the web server for 'iTerm-PseKNC' was established at http://lin-group.cn/server/iTerm-PseKNC/, by which users can easily obtain their desired result without the need to go through the detailed mathematical equations involved.

165 citations

Journal Article

Machine Learning for Drug-Target Interaction Prediction

Ruolan Chen¹, Xiangrong Liu¹, Shuting Jin¹, Jiawei Lin¹, Juan Liu¹

Xiamen University¹

31 Aug 2018-Molecules

TL;DR: A hierarchical classification scheme is adopted and several representative methods of each category of drug-target interaction prediction are introduced, especially the recent state-of-the-art methods.

Abstract: Identifying drug-target interactions will greatly narrow down the scope of search of candidate medications, and thus can serve as the vital first step in drug discovery. Considering that in vitro experiments are extremely costly and time-consuming, high efficiency computational prediction methods could serve as promising strategies for drug-target interaction (DTI) prediction. In this review, our goal is to focus on machine learning approaches and provide a comprehensive overview. First, we summarize a brief list of databases frequently used in drug discovery. Next, we adopt a hierarchical classification scheme and introduce several representative methods of each category, especially the recent state-of-the-art methods. In addition, we compare the advantages and limitations of methods in each category. Lastly, we discuss the remaining challenges and future outlook of machine learning in DTI prediction. This article may provide a reference and tutorial insights on machine learning-based DTI prediction for future researchers.

162 citations

Journal Article

Exploring sequence-based features for the improved prediction of DNA N4-methylcytosine sites in multiple species.

Leyi Wei¹, Shasha Luan¹, Luis Augusto Eijy Nagai², Ran Su¹, Quan Zou¹,³

Tianjin University¹, University of Tokyo², University of Electronic Science and Technology of China³

15 Apr 2019-Bioinformatics

TL;DR: This study proposes a machine learning based predictor, namely 4mcPred-SVM, for the genome-wide detection of DNA 4mC sites, and presents a new feature representation algorithm that sufficiently exploits sequence-based information.

Abstract: Motivation: As one of the important epigenetic modifications, DNA N4-methylcytosine (4mC) has recently been shown to play crucial roles in restriction-modification systems. For a better understanding of its functional mechanisms, it is fundamentally important to identify 4mC modification. Machine learning methods have recently emerged as an effective and efficient approach for the high-throughput identification of 4mC sites, although high predictive error rates are still challenging for existing methods. Therefore, it is highly desirable to develop a computational method to more accurately identify 4mC sites. Results: In this study, we propose a machine learning based predictor, namely 4mcPred-SVM, for the genome-wide detection of DNA 4mC sites. In this predictor, we present a new feature representation algorithm that sufficiently exploits sequence-based information. To improve the feature representation ability, we use a two-step feature optimization strategy, thereby obtaining the most representative features. Using the resulting features and Support Vector Machine (SVM), we adaptively train the optimal models for different species. Comparative results on benchmark datasets from six species indicate that our predictor is able to achieve generally better performance in predicting 4mC sites as compared to the state-of-the-art predictors. Importantly, the sequence-based features can reliably and robustly predict 4mC sites, facilitating the discovery of potentially important sequence characteristics for the prediction of 4mC sites. Availability and implementation: The user-friendly webserver that implements the proposed 4mcPred-SVM is well established, and is freely accessible at http://server.malab.cn/4mcPred-SVM. Supplementary information: Supplementary data are available at Bioinformatics online.

139 citations

Journal Article

iRNA-2OM: A Sequence-Based Predictor for Identifying 2'-O-Methylation Sites in Homo sapiens.

Hui Yang¹, Hao Lv¹, Hui Ding¹, Wei Chen¹,², Hao Lin¹

University of Electronic Science and Technology of China¹, North China University of Science and Technology²

09 Nov 2018-Journal of Computational Biology

TL;DR: This study proposed a support vector machine-based model to predict 2'-O-methylation sites in H. sapiens, and the RNA sequences were encoded with the optimal features obtained from feature selection.

Abstract: 2'-O-methylation plays an important biological role in gene expression. Owing to the explosive increase in genomic sequencing data, it is necessary to develop a method for quickly and efficiently identifying whether a sequence contains the 2'-O-methylation site. As an additional method to the experimental technique, a computational method may help to identify 2'-O-methylation sites. In this study, based on the experimental 2'-O-methylation data of Homo sapiens, we proposed a support vector machine-based model to predict 2'-O-methylation sites in H. sapiens. In this model, the RNA sequences were encoded with the optimal features obtained from feature selection. In the fivefold cross-validation test, the accuracy reached 97.95%.

124 citations
