Rough-Fuzzy Clustering for Grouping Functionally Similar Genes from Microarray Data

doi:10.1109/TCBB.2012.103

Journal Article•DOI•

01 Mar 2013-IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE)-Vol. 10, Iss: 2, pp 286-299

TL;DR: An efficient method is proposed to select initial prototypes of different gene clusters, which enables the proposed c-means algorithm to converge to an optimal or near-optimal solution and helps to discover coexpressed gene clusters.

Abstract: Gene expression data clustering is one of the important tasks of functional genomics, as it provides a powerful tool for studying functional relationships of genes in a biological process. Identifying coexpressed groups of genes is the basic challenge in the gene clustering problem. In this regard, a gene clustering algorithm, termed robust rough-fuzzy $c$-means, is proposed that judiciously integrates the merits of rough sets and fuzzy sets. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in cluster definition, the integration of probabilistic and possibilistic memberships of fuzzy sets enables efficient handling of overlapping partitions in a noisy environment. The concept of possibilistic lower bound and probabilistic boundary of a cluster, introduced in robust rough-fuzzy $c$-means, enables efficient selection of gene clusters. An efficient method is proposed to select initial prototypes of different gene clusters, which enables the proposed $c$-means algorithm to converge to an optimal or near-optimal solution and helps to discover coexpressed gene clusters. The effectiveness of the algorithm, along with a comparison with other algorithms, is demonstrated both qualitatively and quantitatively on 14 yeast microarray data sets.
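
The split of a cluster into a lower approximation and a boundary region can be illustrated with a minimal Python sketch. It is not the paper's robust rough-fuzzy c-means: it uses only standard probabilistic fuzzy c-means memberships together with a thresholding rule commonly used in rough-fuzzy clustering, and it leaves out the possibilistic memberships and the prototype initialization described in the abstract. The function names, the threshold `delta`, and the fuzzifier `m` are assumptions made for illustration, and at least two centroids are assumed.

```python
def fcm_memberships(point, centroids, m=2.0):
    """Probabilistic fuzzy c-means memberships of one point (they sum to 1)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid)) ** 0.5
             for centroid in centroids]
    # A point lying exactly on a centroid belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]


def rough_partition(points, centroids, delta=0.2, m=2.0):
    """Return (lower, boundary): lists of point indices for each cluster.

    Assumed rule (hypothetical parameter `delta`): if the gap between the
    two largest memberships exceeds `delta`, the point certainly belongs to
    the best cluster (lower approximation); otherwise it is placed in the
    boundary of both candidate clusters.
    """
    lower = [[] for _ in centroids]
    boundary = [[] for _ in centroids]
    for idx, point in enumerate(points):
        u = fcm_memberships(point, centroids, m)
        ranked = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
        best, second = ranked[0], ranked[1]
        if u[best] - u[second] > delta:
            lower[best].append(idx)
        else:
            boundary[best].append(idx)
            boundary[second].append(idx)
    return lower, boundary
```

In this sketch, an expression profile that is roughly equidistant from two prototypes ends up in both boundaries, which is how overlapping clusters are represented; a larger `delta` makes the lower approximations smaller and the boundaries larger.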

Citations

Journal Article•DOI•

Multigranulation rough-fuzzy clustering based on shadowed sets

Jie Zhou¹,², Zhihui Lai¹,², Duoqian Miao³, Can Gao¹,², Xiaodong Yue⁴

Hong Kong Polytechnic University¹, Shenzhen University², Tongji University³, Shanghai University⁴

01 Jan 2020-Information Sciences

TL;DR: By integrating the notions of shadowed sets and multigranulation into rough-fuzzy clustering approaches, the overall topology of data can be captured well and the uncertain information implicated in data can be effectively addressed, including the uncertainty generated by the fuzzification coefficient.

76 citations

Journal Article•DOI•

Gene selection for tumor classification using neighborhood rough sets and entropy measures

Yumin Chen¹, Zunjun Zhang², Jianzhong Zheng², Ying Ma¹, Yu Xue³

Xiamen University of Technology¹, Fujian University of Traditional Chinese Medicine², Nanjing University of Information Science and Technology³

01 Mar 2017-Journal of Biomedical Informatics

TL;DR: A novel gene selection method based on the neighborhood rough set model is proposed, which can deal with real-valued data while maintaining the original gene classification information, and an entropy measure is addressed within the framework of neighborhood rough sets to tackle the uncertainty and noise of gene expression data.

75 citations

Cites background from "Rough-Fuzzy Clustering for Grouping..."

...The further researches on gene selection from gene expression data have significant influences on bioinformatics [4,5], tumor or...

Journal Article•DOI•

Adaptive fuzzy consensus clustering framework for clustering analysis of cancer data

Zhiwen Yu¹, Hantao Chen¹, Jane You², Jiming Liu³, Hau-San Wong⁴, Guoqiang Han¹, Le Li¹

South China University of Technology¹, Hong Kong Polytechnic University², Hong Kong Baptist University³, City University of Hong Kong⁴

01 Jul 2015-IEEE/ACM Transactions on Computational Biology and Bioinformatics

TL;DR: Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE work well on these data sets and outperform most of the state-of-the-art tumor clustering algorithms.

Abstract: Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While quite a number of research works perform tumor clustering, few of them consider how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and further improve its performance by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE work well on these data sets and outperform most of the state-of-the-art tumor clustering algorithms.

73 citations

Cites background from "Rough-Fuzzy Clustering for Grouping..."

...…of Cancer Data Zhiwen Yu, Hantao Chen, Jane You, Jiming Liu, Hau-San Wong, Guoqiang Han, and Le Li Abstract Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis…...

Journal Article•DOI•

Detecting Overlapping Protein Complexes by Rough-Fuzzy Clustering in Protein-Protein Interaction Networks

Hao Wu¹, Lin Gao¹, Jihua Dong², Xiaofei Yang¹

Xidian University¹, Northwest A&F University²

18 Mar 2014-PLOS ONE

TL;DR: A novel rough-fuzzy clustering method is presented to detect overlapping protein complexes in protein-protein interaction (PPI) networks; it provides new insight into network division and can also be applied to identify overlapping community structure in social networks and LFR benchmark networks.

Abstract: In this paper, we present a novel rough-fuzzy clustering (RFC) method to detect overlapping protein complexes in protein-protein interaction (PPI) networks. RFC focuses on a fuzzy relation model rather than a graph model by integrating fuzzy sets and rough sets, employs the upper and lower approximations of rough sets to deal with overlapping complexes, and calculates the number of complexes automatically. A fuzzy relation between proteins is established and then transformed into a fuzzy equivalence relation. Non-overlapping complexes correspond to equivalence classes satisfying a certain equivalence relation. To obtain overlapping complexes, we calculate the similarity between one protein and each complex, and then determine whether the protein belongs to one or multiple complexes by computing the ratio of each similarity to the maximum similarity. To validate RFC quantitatively, we test it on the Gavin, Collins, Krogan and BioGRID datasets. Experimental results show that there is a good correspondence to reference complexes in the MIPS and SGD databases. We then compare RFC with several previous methods, including ClusterONE, CMC, MCL, GCE, OSLOM and CFinder. Results show that the precision, sensitivity and separation are 32.4%, 42.9% and 81.9% higher than the mean of the five methods in four weighted networks, and 0.5%, 11.2% and 66.1% higher than the mean of the six methods in five unweighted networks. Our method RFC works well for protein complex detection and provides new insight into network division, and it can also be applied to identify overlapping community structure in social networks and LFR benchmark networks.
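
The overlap step described in this abstract (comparing each similarity with the maximum similarity) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the similarity measure or the cut-off, so the function name, the precomputed `similarities` list, and the `ratio_threshold` value are all hypothetical.

```python
def assign_overlapping(similarities, ratio_threshold=0.8):
    """Return indices of every complex a protein is assigned to.

    `similarities[i]` is the (precomputed) similarity between the protein
    and complex i. A complex is kept when its similarity, divided by the
    largest similarity, reaches `ratio_threshold` (a hypothetical cut-off).
    """
    max_sim = max(similarities)
    if max_sim <= 0:
        # The protein is not similar to any complex.
        return []
    return [i for i, s in enumerate(similarities)
            if s / max_sim >= ratio_threshold]
```

A protein whose second-best similarity is close to its best one is assigned to both complexes, while a clear winner yields a single assignment.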

55 citations

Cites background or methods from "Rough-Fuzzy Clustering for Grouping..."

...There exist several rough-fuzzy clustering algorithms in previous studies [8,14,17,18,40], such as rough c-means clustering (RCM) [13,15], rough-fuzzy c-means clustering (RFCM) [8,18] and rough-fuzzy possibilistic c-means clustering (RFPCM) [17]....
...Each of them has limitations: some algorithms only work in unweighted networks, and can be applied to weighted data sets only after binarizing them by deleting edges whose weights are below a given threshold, while others need to assign the number of complexes firstly [8,9]....

Journal Article•DOI•

Cross-domain, soft-partition clustering with diversity measure and knowledge reference

Pengjiang Qian¹, Shouwei Sun², Yizhang Jiang², Kuan-Hao Su¹, Tongguang Ni², Shitong Wang², Raymond F. Muzic¹

Case Western Reserve University¹, Jiangnan University²

01 Feb 2016-Pattern Recognition

TL;DR: The quadratic weights and Gini-Simpson diversity based fuzzy clustering model (QWGSD-FC) is first proposed as a basis of this work and two types of cross-domain, soft-partition clustering frameworks and their corresponding algorithms, referred to as type-I/type-II knowledge-transfer-oriented c-means (TI-KT-CM and TII-KT

46 citations

Cites methods from "Rough-Fuzzy Clustering for Grouping..."

...[32] proposed another fuzzy subspace clustering method for handling high-dimensional, sparse data; and in addition, some application studies with respect to soft-partition clustering were also conducted, such as image compression [33,34], image segmentation [35–37], real-time target tracking [38,39], and gene expression data analysis [40]....

References

Book Chapter•DOI•

I and J

William Marsden

01 Jan 2012

139,059 citations

"Rough-Fuzzy Clustering for Grouping..." refers methods in this paper

...Different clustering techniques such as hierarchical clustering [9], k-means algorithm [10], self organizing map [11], graph theoretical approaches [12], [13], [14], [15], model-based clustering [16], [17], [18], [19], and density-based...

Book•

Fuzzy sets

Lotfi A. Zadeh

01 Aug 1996

TL;DR: A separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.

Abstract: A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one. The notions of inclusion, union, intersection, complement, relation, convexity, etc., are extended to such sets, and various properties of these notions in the context of fuzzy sets are established. In particular, a separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.
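
The operations named in this abstract can be made concrete with a short Python sketch. It represents a fuzzy set as a dictionary from objects to membership grades in [0, 1] and uses the standard max/min/one-minus definitions of union, intersection, and complement, with inclusion as a pointwise comparison of grades. The dictionary representation, the function names, and the gene labels in the usage note are assumptions made for illustration.

```python
def fuzzy_union(a, b):
    """Grade in A ∪ B is the larger of the two grades."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


def fuzzy_intersection(a, b):
    """Grade in A ∩ B is the smaller of the two grades."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


def fuzzy_complement(a, universe):
    """Grade in the complement is one minus the original grade."""
    return {x: 1.0 - a.get(x, 0.0) for x in universe}


def fuzzy_is_subset(a, b):
    """A is contained in B when no grade in A exceeds the grade in B."""
    return all(grade <= b.get(x, 0.0) for x, grade in a.items())
```

For example, with `a = {"g1": 0.2, "g2": 0.7}` and `b = {"g2": 0.4, "g3": 1.0}`, the object `g2` has grade 0.7 in the union and 0.4 in the intersection; objects missing from a dictionary are treated as having grade zero.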

52,705 citations

Journal Article•DOI•

Cluster analysis and display of genome-wide expression patterns

Michael B. Eisen¹, Paul T. Spellman¹, Patrick O. Brown¹, David Botstein¹

Stanford University¹

08 Dec 1998-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.

Abstract: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

16,371 citations

Book•

Pattern Recognition with Fuzzy Objective Function Algorithms

James C. Bezdek

31 Jul 1981

15,662 citations

Journal Article•DOI•

Silhouettes: a graphical aid to the interpretation and validation of cluster analysis

Peter J. Rousseeuw¹

University of Fribourg¹

01 Nov 1987-Journal of Computational and Applied Mathematics

TL;DR: A new graphical display is proposed for partitioning techniques, where each cluster is represented by a so-called silhouette, which is based on the comparison of its tightness and separation, and provides an evaluation of clustering validity.
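
The tightness-versus-separation comparison behind a silhouette can be written down directly. The sketch below computes the usual silhouette value s(i) = (b - a) / max(a, b) for each object, where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster. Euclidean distance, the function name, and the convention that a singleton cluster gets the value 0 are choices made for this illustration; at least two clusters are assumed.

```python
import math


def silhouette_values(points, labels):
    """Per-object silhouette values for a hard partition (>= 2 clusters)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Singleton cluster: tightness is undefined, use 0 by convention.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values
```

Values close to 1 indicate an object that sits well inside its cluster, values near 0 an object between two clusters, and negative values an object that is on average closer to another cluster than to its own.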

14,144 citations