Predicting Protein Function via Semantic Integration of Multiple Networks

TLDR
This paper proposes SimNet, a method that semantically integrates multiple functional association networks derived from heterogeneous data sources, and shows that SimNet not only achieves better (or comparable) results than related competing approaches but also takes much less time.
Abstract
Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapidly accumulating volume of proteomic and genomic data drives the development of computational models that automatically predict protein function at large scale. Recent approaches focus on integrating multiple heterogeneous data sources, and they often achieve better results than methods that use a single data source. In this paper, we investigate how to integrate multiple biological data sources with biological knowledge, i.e., the Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogeneous data sources. SimNet first uses the GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on that similarity. Next, SimNet constructs a composite network as a weighted sum of the individual networks and aligns this network with the kernel to obtain the weights assigned to the individual networks. It then applies a network-based classifier to the composite network to predict protein function. Experimental results on heterogeneous proteomic data sources of Yeast, Human, Mouse, and Fly show that SimNet not only achieves better (or comparable) results than related competing approaches but also takes much less time. The MATLAB code of SimNet is available at https://sites.google.com/site/guoxian85/simnet.
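
The three steps described in the abstract (semantic kernel, kernel-aligned composite network, network-based classifier) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation, and every concrete choice here is an assumption made for illustration: the semantic kernel is approximated by cosine similarity between GO annotation vectors, the network weights come from a simple normalized per-network alignment score rather than the optimization used in the paper, and the classifier is a standard closed-form label propagation. All function names are hypothetical.

```python
import numpy as np


def semantic_kernel(annotations):
    """Assumed stand-in for a GO semantic kernel: cosine similarity
    between rows of a binary protein-by-GO-term annotation matrix."""
    A = np.asarray(annotations, dtype=float)
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # unannotated proteins get zero similarity
    unit = A / norms
    return unit @ unit.T


def alignment_weights(networks, kernel):
    """Heuristic weights (an assumption, not the paper's solver): each
    network is scored by its normalized Frobenius inner product with the
    kernel; scores are clipped at zero and rescaled to sum to one."""
    k_norm = np.linalg.norm(kernel)
    scores = []
    for W in networks:
        w_norm = np.linalg.norm(W)
        if w_norm == 0 or k_norm == 0:
            scores.append(0.0)
        else:
            scores.append(max(np.sum(W * kernel) / (w_norm * k_norm), 0.0))
    scores = np.array(scores)
    total = scores.sum()
    if total == 0:
        # No network aligns with the kernel: fall back to uniform weights.
        return np.full(len(networks), 1.0 / len(networks))
    return scores / total


def composite_network(networks, weights):
    """Weighted sum of the individual networks."""
    return sum(w * W for w, W in zip(weights, networks))


def propagate_labels(W, Y, alpha=0.5):
    """Closed-form label propagation on the symmetrically normalized
    network: F = (I - alpha * S)^(-1) Y, with 0 < alpha < 1."""
    d = W.sum(axis=1)
    d[d == 0] = 1.0  # isolated nodes keep their own labels
    d_inv_sqrt = 1.0 / np.sqrt(d)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, np.asarray(Y, dtype=float))
```

In this sketch, a network whose edges connect proteins with similar GO annotations receives a larger weight, so noisy networks contribute less to the composite network on which labels are propagated.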
