
Showing papers by "Gianluca Pollastri" published in 2008


Proceedings Article
01 Jun 2008
TL;DR: Describes an effective approach for adapting a traditional neural network to learn ordinal categories. The method, a generalization of the perceptron method for ordinal regression, outperforms a neural network classification method on several benchmark datasets.
Abstract: Ordinal regression is an important type of learning, which has properties of both classification and regression. Here we describe an effective approach to adapt a traditional neural network to learn ordinal categories. Our approach is a generalization of the perceptron method for ordinal regression. On several benchmark datasets, our method (NNRank) outperforms a neural network classification method. Compared with the ordinal regression methods using Gaussian processes and support vector machines, NNRank achieves comparable performance. Moreover, NNRank has the advantages of traditional neural networks: learning in both online and batch modes, handling very large training datasets, and making rapid predictions. These features make NNRank a useful and complementary tool for large-scale data mining tasks such as information retrieval, Web page ranking, collaborative filtering, and protein ranking in bioinformatics. The neural network software is available at: http://www.cs.missouri.edu/~chengji/cheng_software.html.
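The abstract does not spell out how ordinal categories are presented to the network. One common way to adapt a neural network to ordinal targets, assumed here purely for illustration, is a cumulative encoding: rank k becomes k ones followed by zeros, and a prediction is read off by counting the leading outputs above a threshold. The function names and the 0.5 threshold below are hypothetical and are not taken from the NNRank software.

```python
def encode_ordinal(rank, num_ranks):
    """Encode a 1-based rank as a cumulative target vector.

    Assumed scheme: rank k -> k ones followed by (num_ranks - k) zeros.
    """
    if not 1 <= rank <= num_ranks:
        raise ValueError("rank must be between 1 and num_ranks")
    return [1.0 if i < rank else 0.0 for i in range(num_ranks)]


def decode_ordinal(outputs, threshold=0.5):
    """Decode network outputs into a rank.

    Scans the outputs in order and stops at the first one that does not
    exceed the threshold; the number of outputs passed is the rank.
    A result of 0 means that even the first output was below threshold.
    """
    rank = 0
    for value in outputs:
        if value <= threshold:
            break
        rank += 1
    return rank


if __name__ == "__main__":
    target = encode_ordinal(3, 5)
    print(target)
    # Later outputs above the threshold are ignored once the scan stops.
    print(decode_ordinal([0.9, 0.8, 0.3, 0.7, 0.1]))
```

Unlike a one-hot target, this encoding makes a prediction that is off by one rank differ from the truth in a single output, so the training signal reflects the ordering of the categories.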

160 citations


Journal Article
TL;DR: The global fitting of titrational events (GloFTE) method is applied to experimental data on five enzyme systems and on a single non-enzyme system, and it is shown that the extracted electrostatic interaction energies and effective dielectric constants for a subset of these systems agree excellently with experimentally determined values as well as with theoretical calculations.

31 citations


Journal Article
TL;DR: This work focuses on the problem of learning to predict more "physical" contact maps by first predicting contact maps through a traditional system (XXStout), and then filtering these maps by an ensemble of artificial neural networks.
Abstract: Protein topology representations such as residue contact maps are an important intermediate step towards ab initio prediction of protein structure, but the problem of predicting reliable contact maps is far from solved. One of the main pitfalls of existing contact map predictors is that they generally predict unphysical maps, i.e. maps that cannot be embedded into three-dimensional structures or, at best, violate a number of basic constraints observed in real protein structures, such as the maximum number of contacts for a residue. Here, we focus on the problem of learning to predict more "physical" contact maps. We do so by first predicting contact maps through a traditional system (XXStout), and then filtering these maps by an ensemble of artificial neural networks. The filter receives as input not only the bare predicted map but also a number of global or long-range features extracted from it. In a rigorous cross-validation test, we show that the filter greatly improves the predicted maps it is given. CASP7 results, which we report here, corroborate this finding. Importantly, since the approach we present here is fully modular, it may be beneficial to any other ab initio contact map predictor.
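To make the notion of a contact map and of a per-residue contact limit concrete, here is a minimal sketch. It is not the XXStout predictor or the neural network filter described above: it only builds a binary map from known coordinates and flags residues with more contacts than a chosen limit. The 8 Å cutoff, the limit value, and all function names are assumptions made for illustration.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Build a symmetric binary contact map from 3D coordinates.

    Two residues are in contact when their distance is below `cutoff`
    (an assumed threshold, in the same units as the coordinates).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


def overcrowded_residues(cmap, max_contacts):
    """Return indices of residues whose contact count exceeds the limit."""
    return [i for i, row in enumerate(cmap) if sum(row) > max_contacts]


if __name__ == "__main__":
    # Four points on a straight line, 3.8 units apart (toy data).
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    cmap = contact_map(chain)
    for row in cmap:
        print(row)
    print(overcrowded_residues(cmap, max_contacts=2))
```

A check of this kind could be applied to a predicted map as well: residues exceeding a plausible contact limit are one sign that the map may not correspond to any real three-dimensional structure.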

5 citations


Book Chapter
26 Mar 2008
TL;DR: It is argued in this paper that additional information for recognising nativelike structures can be obtained by regarding the final conformation as the result of a generative process reminiscent of the folding process that generates structures in nature.
Abstract: Many algorithms that attempt to predict proteins' native structure from sequence need to generate a large set of hypotheses in order to ensure that nearly correct structures are included, leading to the problem of assessing the quality of alternative 3D conformations. This problem has been mostly approached by focusing on the final 3D conformation, with machine learning techniques playing a leading role. We argue in this paper that additional information for recognising nativelike structures can be obtained by regarding the final conformation as the result of a generative process reminiscent of the folding process that generates structures in nature. We introduce a coarse representation of protein pseudo-folding based on binary trees, together with a kernel function for assessing their similarity. Kernel-based analysis techniques empirically demonstrate a significant correlation between the information contained in pseudo-folding trees and features of native folds in a large and non-redundant set of proteins.

2 citations