scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Rough-Fuzzy Clustering for Grouping Functionally Similar Genes from Microarray Data

TL;DR: An efficient method is proposed to select initial prototypes of different gene clusters, which enables the proposed c-means algorithm to converge to an optimum or near optimum solutions and helps to discover coexpressed gene clusters.
Abstract: Gene expression data clustering is one of the important tasks of functional genomics as it provides a powerful tool for studying functional relationships of genes in a biological process. Identifying coexpressed groups of genes represents the basic challenge in gene clustering problem. In this regard, a gene clustering algorithm, termed as robust rough-fuzzy $(c)$-means, is proposed judiciously integrating the merits of rough sets and fuzzy sets. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in cluster definition, the integration of probabilistic and possibilistic memberships of fuzzy sets enables efficient handling of overlapping partitions in noisy environment. The concept of possibilistic lower bound and probabilistic boundary of a cluster, introduced in robust rough-fuzzy $(c)$-means, enables efficient selection of gene clusters. An efficient method is proposed to select initial prototypes of different gene clusters, which enables the proposed $(c)$-means algorithm to converge to an optimum or near optimum solutions and helps to discover coexpressed gene clusters. The effectiveness of the algorithm, along with a comparison with other algorithms, is demonstrated both qualitatively and quantitatively on 14 yeast microarray data sets.
Citations
More filters
Journal ArticleDOI
TL;DR: By integrating the notions of shadowed sets and multigranulation into rough-fuzzy clustering approaches, the overall topology of data can be captured well and the uncertain information implicated inData can be effectively addressed, including the uncertainty generated by fuzzification coefficient.

76 citations

Journal ArticleDOI
TL;DR: A novel gene selection method based on the neighborhood rough set model is proposed, which has the ability of dealing with real-value data whilst maintaining the original gene classification information, and an entropy measure is addressed under the frame of neighborhood rough sets for tackling the uncertainty and noisy of gene expression data.

75 citations


Cites background from "Rough-Fuzzy Clustering for Grouping..."

  • ...The further researches on gene selection from gene 15 expression data have significant influences on bioinformatics [4,5], tumor or 16 ∗ Corresponding author....

    [...]

Journal ArticleDOI
TL;DR: Experiments on real cancer gene expression profiles indicate that R DCFCE and A-RDCFCE works well on these data sets, and outperform most of the state-of-the-art tumor clustering algorithms.
Abstract: Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While there are quite a number of research works which perform tumor clustering, few of them considers how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we also introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and improve the performance of RDCFCE further by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE works well on these data sets, and outperform most of the state-of-the-art tumor clustering algorithms.

73 citations


Cites background from "Rough-Fuzzy Clustering for Grouping..."

  • ...…of Cancer Data Zhiwen Yu, Hantao Chen, Jane You, Jiming Liu, Hau-San Wong, Guoqiang Han, and Le Li Abstract Performing clustering analysis is one of the important research topics in cancer discovery using gene expression pro.les, which is crucial in facilitating the successful diagnosis…...

    [...]

Journal ArticleDOI
18 Mar 2014-PLOS ONE
TL;DR: A novel rough-fuzzy clustering method to detect overlapping protein complexes in protein-protein interaction (PPI) networks and provides a new insight of network division, and it can also be applied to identify overlapping community structure in social networks and LFR benchmark networks.
Abstract: In this paper, we present a novel rough-fuzzy clustering (RFC) method to detect overlapping protein complexes in protein-protein interaction (PPI) networks. RFC focuses on fuzzy relation model rather than graph model by integrating fuzzy sets and rough sets, employs the upper and lower approximations of rough sets to deal with overlapping complexes, and calculates the number of complexes automatically. Fuzzy relation between proteins is established and then transformed into fuzzy equivalence relation. Non-overlapping complexes correspond to equivalence classes satisfying certain equivalence relation. To obtain overlapping complexes, we calculate the similarity between one protein and each complex, and then determine whether the protein belongs to one or multiple complexes by computing the ratio of each similarity to maximum similarity. To validate RFC quantitatively, we test it in Gavin, Collins, Krogan and BioGRID datasets. Experiment results show that there is a good correspondence to reference complexes in MIPS and SGD databases. Then we compare RFC with several previous methods, including ClusterONE, CMC, MCL, GCE, OSLOM and CFinder. Results show the precision, sensitivity and separation are 32.4%, 42.9% and 81.9% higher than mean of the five methods in four weighted networks, and are 0.5%, 11.2% and 66.1% higher than mean of the six methods in five unweighted networks. Our method RFC works well for protein complexes detection and provides a new insight of network division, and it can also be applied to identify overlapping community structure in social networks and LFR benchmark networks.

55 citations


Cites background or methods from "Rough-Fuzzy Clustering for Grouping..."

  • ...There exist several rough-fuzzy clustering algorithms in previous studies [8,14,17,18,40], such as rough c-means clustering (RCM) [13,15], rough-fuzzy c-means clustering (RFCM) [8,18] and rough-fuzzy possibilistic c-means clustering (RFPCM) [17]....

    [...]

  • ...Each of them has limitations: some algorithms only work in unweighted networks, and can be applied to weighted data sets only after binarizing them by deleting edges whose weights are below a given threshold, while others need to assign the number of complexes firstly [8,9]....

    [...]

Journal ArticleDOI
TL;DR: The quadratic weights and Gini-Simpson diversity based fuzzy clustering model (QWGSD-FC), is first proposed as a basis of this work and two types of cross-domain, soft-partition clustering frameworks and their corresponding algorithms, referred to as type-I/type-II knowledge-transfer-oriented c-means (TI-KT-CM and TII-KT

46 citations


Cites methods from "Rough-Fuzzy Clustering for Grouping..."

  • ...[32] proposed another fuzzy subspace clustering method for handling high-dimensional, sparse data; and in addition, some application studies with respect to softpartition clustering were also conducted, such as image compression [33,34], image segmentation [35–37], real-time target tracking [38,39], and gene expression data analysis [40]....

    [...]

References
More filters
Book ChapterDOI

[...]

01 Jan 2012

139,059 citations


"Rough-Fuzzy Clustering for Grouping..." refers methods in this paper

  • ...Different clustering techniques such as hierarchical clustering [9], k-means algorithm [10], self organizing map [11], graph theoretical approaches [12], [13], [14], [15], modelbased clustering [16], [17], [18], [19], and density-based 286 IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL....

    [...]

Book
01 Aug 1996
TL;DR: A separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.
Abstract: A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one. The notions of inclusion, union, intersection, complement, relation, convexity, etc., are extended to such sets, and various properties of these notions in the context of fuzzy sets are established. In particular, a separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.

52,705 citations

Journal ArticleDOI
TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Abstract: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is de- scribed that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be inter- preted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly charac- terized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

16,371 citations

Book
31 Jul 1981
TL;DR: Books, as a source that may involve the facts, opinion, literature, religion, and many others are the great friends to join with, becomes what you need to get.
Abstract: New updated! The latest book from a very famous author finally comes out. Book of pattern recognition with fuzzy objective function algorithms, as an amazing reference becomes what you need to get. What's for is this book? Are you still thinking for what the book is? Well, this is what you probably will get. You should have made proper choices for your better life. Book, as a source that may involve the facts, opinion, literature, religion, and many others are the great friends to join with.

15,662 citations

Journal ArticleDOI
TL;DR: A new graphical display is proposed for partitioning techniques, where each cluster is represented by a so-called silhouette, which is based on the comparison of its tightness and separation, and provides an evaluation of clustering validity.

14,144 citations