Showing papers by "Sushmita Paul" published in 2010


Journal Article
01 Nov 2010
TL;DR: A new feature selection algorithm is presented, based on rough set theory, to select a set of effective molecular descriptors from a given QSAR dataset by maximizing both relevance and significance of the descriptors.
Abstract: Quantitative structure-activity relationship (QSAR) modeling is an important discipline of computer-aided drug design that deals with the predictive modeling of molecular properties. In general, a QSAR dataset is small in size but has a large number of features, or descriptors. Of the many descriptors present in a QSAR dataset, only a small fraction is effective for the predictive modeling task. In this paper, a new feature selection algorithm based on rough set theory is presented to select a set of effective molecular descriptors from a given QSAR dataset. The proposed algorithm selects the set of molecular descriptors by maximizing both the relevance and the significance of the descriptors. An important finding is that the proposed algorithm is effective in selecting relevant and significant molecular descriptors from the QSAR dataset for predictive modeling. The performance of the proposed algorithm is studied using the R² statistic of the support vector regression method. The effectiveness of the proposed algorithm, along with a comparison with existing algorithms, is demonstrated on three QSAR datasets.
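
The abstract does not give the exact definitions of relevance and significance, so the following is only a minimal sketch under stated assumptions. It uses a common rough-set formulation: relevance is taken as the dependency of the class label on a single descriptor (the fraction of objects in the positive region), and significance as the gain in dependency when a candidate descriptor is paired with an already selected one. The greedy scoring rule, the function names, and the toy data are hypothetical, and descriptors and activities are assumed to be already discretized.

```python
from collections import defaultdict

def dependency(rows, features, labels):
    """Rough-set dependency: fraction of objects in the positive region."""
    if not rows:
        return 0.0
    # Group objects into equivalence classes by their values on `features`.
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[f] for f in features)].append(i)
    # A class belongs to the positive region when all its objects share a label.
    consistent = sum(len(idx) for idx in classes.values()
                     if len({labels[i] for i in idx}) == 1)
    return consistent / len(rows)

def significance(rows, chosen, candidate, labels):
    """Assumed definition: dependency gained by adding `candidate` to `chosen`."""
    return (dependency(rows, [chosen, candidate], labels)
            - dependency(rows, [chosen], labels))

def select_features(rows, labels, k):
    """Greedy sketch: start from the most relevant descriptor, then add the
    one maximizing relevance plus average significance (hypothetical rule)."""
    remaining = list(range(len(rows[0])))
    relevance = {f: dependency(rows, [f], labels) for f in remaining}
    selected = [max(remaining, key=lambda f: relevance[f])]
    remaining.remove(selected[0])
    while remaining and len(selected) < k:
        def score(f):
            avg_sig = sum(significance(rows, s, f, labels)
                          for s in selected) / len(selected)
            return relevance[f] + avg_sig
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy, made-up data: three discretized descriptors for six molecules.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1), (2, 0, 1), (2, 1, 1)]
labels = ["low", "low", "high", "high", "low", "high"]
print(select_features(rows, labels, k=2))
```

In this toy example, descriptor 1 has zero relevance on its own but resolves the ambiguity left by descriptor 0, so it is preferred over descriptor 2, which is slightly relevant but redundant. This is the kind of trade-off that a combined relevance and significance criterion is meant to capture.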

29 citations


Proceedings Article
01 Dec 2010
TL;DR: A rough set based gene selection algorithm is presented that selects the set of genes by maximizing the relevance and significance of the genes, which are calculated based on the theory of rough sets.
Abstract: Gene selection from microarray data is an important issue for gene expression based classification and for carrying out a diagnostic test. In this regard, a rough set based gene selection algorithm is presented. It selects the set of genes by maximizing the relevance and significance of the genes, which are calculated based on the theory of rough sets. Using the predictive accuracy of the K-nearest neighbor rule and the support vector machine, the performance of the proposed algorithm, along with a comparison with other related methods, is studied on five cancer and two arthritis microarray data sets. The proposed gene selection algorithm achieved promising performance, selecting relevant and significant genes from the microarray data sets in a reasonable time.
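
As an illustration of how a selected gene subset might be scored by the predictive accuracy of the K-nearest neighbor rule, here is a minimal sketch. The abstract does not state the validation protocol, so leave-one-out evaluation, Euclidean distance, the function names, and the toy expression values are all assumptions made for this example.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples nearest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(samples, labels, gene_idx, k=3):
    """Leave-one-out accuracy using only the genes listed in `gene_idx`."""
    reduced = [[s[g] for g in gene_idx] for s in samples]
    hits = 0
    for i in range(len(reduced)):
        # Hold out sample i and classify it from all remaining samples.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(train_x, train_y, reduced[i], k) == labels[i]
    return hits / len(reduced)

# Toy, made-up expression values: gene 0 separates the classes, gene 1 does not.
samples = [[0.10, 5.0], [0.20, 1.0], [0.15, 9.0],
           [0.90, 5.2], [1.00, 0.8], [0.95, 9.1]]
labels = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
print(loo_accuracy(samples, labels, gene_idx=[0], k=1))
print(loo_accuracy(samples, labels, gene_idx=[1], k=1))
```

Comparing the accuracy obtained with different gene subsets gives a simple, classifier-based check on whether the selected genes carry class information.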

12 citations