
Showing papers by "Sushmita Paul" published in 2011


Journal Article
TL;DR: A new feature selection algorithm is presented based on rough set theory that selects a set of genes from microarray data by maximizing the relevance and significance of the selected genes.

130 citations


Proceedings Article
12 Nov 2011
TL;DR: The rough-fuzzy c-means (RFCM) algorithm is applied to discover co-expressed gene clusters, and a method based on Dunn's cluster validity index is introduced to identify optimum values of the parameters of the initialization method and the RFCM algorithm.
Abstract: Clustering is one of the important analyses in functional genomics; it discovers groups of co-expressed genes from microarray data. In this paper, the rough-fuzzy c-means (RFCM) algorithm is applied to discover co-expressed gene clusters. One of the major issues in RFCM-based microarray data clustering is how to select the initial prototypes of the different clusters. To overcome this limitation, a method is proposed to select initial cluster centers. It enables the RFCM algorithm to converge to an optimum or near-optimum solution and helps to discover co-expressed gene clusters. A method based on Dunn's cluster validity index is also introduced to identify optimum values of the parameters of the initialization method and the RFCM algorithm. The effectiveness of the RFCM algorithm, along with a comparison with other related methods, is demonstrated on five yeast gene expression time-series data sets using the Silhouette index, the Davies-Bouldin index, and gene ontology based analysis.
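The parameter search described above can be illustrated with Dunn's index, which is the smallest distance between points of different clusters divided by the largest within-cluster diameter; larger values indicate compact, well-separated clusters. The following is a minimal sketch, not the authors' implementation: the function names `dunn_index` and `pick_best`, the Euclidean distance, and the crisp (hard) cluster assignments are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn's index for a hard clustering (list of lists of point tuples).

    Assumes at least two non-empty clusters and Euclidean distance.
    """
    # Largest distance between two points of the same cluster (diameter).
    max_diameter = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    # Smallest distance between two points of different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter


def pick_best(candidates):
    """Return the key (a hypothetical parameter value) whose clustering
    has the highest Dunn's index."""
    return max(candidates, key=lambda k: dunn_index(candidates[k]))


# Two candidate clusterings of the same four points.
good = [[(0, 0), (0, 1)], [(3, 0), (3, 1)]]
bad = [[(0, 0), (3, 0)], [(0, 1), (3, 1)]]
best = pick_best({"setting_a": good, "setting_b": bad})
```

Here `setting_a` wins because its clusters are tight (diameter 1) and far apart (separation 3), whereas `setting_b` mixes distant points into the same cluster.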

20 citations


Proceedings Article
19 Feb 2011
TL;DR: A fuzzy discretization method is proposed for a rough set based gene selection algorithm so that the relevance and significance of continuous-valued genes can be computed directly when selecting genes from microarray data.
Abstract: Selection of reliable genes from microarray gene expression data is essential to carry out a diagnostic test and successful treatment. In this regard, a rough set based gene selection algorithm was recently developed to select genes from microarray data. In this paper, a fuzzy discretization method is proposed for the rough set based gene selection algorithm to compute the relevance and significance of continuous-valued genes directly. The performance of the proposed fuzzy discretization method, along with a comparison with its crisp counterpart, is presented in terms of the classification accuracy of the K-nearest neighbor rule and support vector machine on seven microarray data sets. An important finding is that the proposed discretization method is effective in selecting relevant and significant genes from microarray data.
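The contrast between fuzzy and crisp discretization can be made concrete with a small sketch. The abstract does not state which membership functions the authors use, so the three triangular fuzzy sets ("low", "medium", "high") and the names `fuzzy_memberships` and `crisp_label` below are assumptions chosen only for illustration.

```python
def fuzzy_memberships(value, lo, hi):
    """Degrees of membership of an expression value in three hypothetical
    fuzzy sets (low, medium, high) defined by triangular functions over
    the gene's observed range [lo, hi]. The degrees sum to 1."""
    mid = (lo + hi) / 2.0
    half = (hi - lo) / 2.0
    if half == 0:
        return (0.0, 1.0, 0.0)  # constant gene: treat as fully "medium"
    x = min(max(value, lo), hi)  # clamp to the observed range
    low = max(0.0, (mid - x) / half)
    high = max(0.0, (x - mid) / half)
    medium = 1.0 - low - high
    return (low, medium, high)


def crisp_label(value, lo, hi):
    """Crisp counterpart: keep only the interval with the largest degree."""
    degrees = fuzzy_memberships(value, lo, hi)
    return ("low", "medium", "high")[degrees.index(max(degrees))]


# A value a quarter of the way through the range is half "low" and half
# "medium" under the fuzzy scheme; the crisp scheme must pick one label.
partial = fuzzy_memberships(2.5, 0.0, 10.0)
label = crisp_label(9.0, 0.0, 10.0)
```

The fuzzy version retains how close a value lies to an interval boundary, which crisp labels discard.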

5 citations


Book Chapter
19 Dec 2011
TL;DR: The performance of the rough set based gene selection algorithm, along with a comparison with other gene selection methods, is studied using the predictive accuracy of the K-nearest neighbor rule and support vector machine on two cancer and one arthritis microarray data sets.
Abstract: Selection of reliable genes from large gene expression data sets with high intergene correlation is essential to carry out a diagnostic test and successful treatment. In this regard, a rough set based gene selection algorithm is reported, which selects a set of genes by maximizing the relevance and significance of the selected genes. A gene ontology-based similarity measure is proposed to analyze the functional diversity of the selected genes. It also helps to analyze the effectiveness of different gene selection methods. The performance of the rough set based gene selection algorithm, along with a comparison with other gene selection methods, is studied using the predictive accuracy of the K-nearest neighbor rule and support vector machine on two cancer and one arthritis microarray data sets. An important finding is that the rough set based gene selection algorithm selects a more functionally diverse set of genes than the existing algorithms.
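The idea of selecting genes by relevance and significance can be sketched with the standard rough set dependency degree: the fraction of samples whose discretized gene values determine the class label unambiguously. The abstract does not give the exact criterion, so the greedy loop below, the additive score (relevance plus gain in dependency), and the names `dependency` and `select_genes` are assumptions and should not be read as the published algorithm.

```python
from collections import defaultdict


def dependency(rows, labels, genes):
    """Rough set dependency of the class labels on a set of gene indices:
    the share of samples lying in equivalence classes (identical values
    on `genes`) that contain a single class label."""
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        blocks[tuple(row[g] for g in genes)].append(label)
    consistent = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return consistent / len(rows)


def select_genes(rows, labels, k):
    """Greedily pick up to k genes. Assumed score for a candidate gene:
    its own dependency (relevance) plus the increase in dependency it
    brings to the already selected set (significance)."""
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < k:
        base = dependency(rows, labels, selected)

        def score(g):
            relevance = dependency(rows, labels, [g])
            significance = dependency(rows, labels, selected + [g]) - base
            return relevance + significance

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy discretized data: 4 samples, 3 genes, binary class labels.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
labels = [0, 0, 1, 1]
chosen = select_genes(rows, labels, 2)
```

In this toy data gene 0 alone determines the class, so it is chosen first; gene 2 follows because it is partially relevant while gene 1 carries no class information.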