Clustering Algorithms: Their Application to Gene Expression Data

doi:10.4137/BBI.S38316

Jelili Oyelade, +7 more

- 30 Nov 2016 -

Bioinformatics and Biology Insights

- Vol. 10, Iss: 10, pp 237-253

Abstract:

Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a tremendous opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes involved make it harder to comprehend and interpret the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Clustering is therefore a first step toward addressing these challenges and an essential part of the data mining process, revealing natural structures and identifying interesting patterns in the underlying data. Clustering of gene expression data has proven useful for revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. A further benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the clustering algorithms applicable to gene expression data in order to provide useful knowledge of the appropriate clustering technique, one that will guarantee stability and a high degree of accuracy in the analysis.
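
As a rough illustration of the clustering step the abstract describes, the sketch below groups genes by the similarity of their expression profiles using a bare-bones k-means (Lloyd's algorithm). It is not taken from the review: the gene names, the expression values, and the choice of the first k profiles as starting centroids are all assumptions made purely to keep the example small and deterministic.

```python
import math


def kmeans(profiles, k, n_iter=100):
    """Cluster equal-length expression profiles with Lloyd's algorithm.

    Assumption for this sketch: the first k profiles seed the centroids,
    which keeps the run deterministic but is not a robust initialisation.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each gene joins its nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its member profiles.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: rows are genes, columns are four conditions.
expression = {
    "gene_up_1": [0.1, 0.9, 2.0, 3.1],
    "gene_down_1": [3.0, 2.1, 1.0, 0.1],
    "gene_up_2": [0.0, 1.1, 2.1, 2.9],
    "gene_down_2": [2.9, 1.9, 1.1, 0.0],
    "gene_up_3": [0.2, 1.0, 1.9, 3.0],
    "gene_down_3": [3.1, 2.0, 0.9, 0.2],
}

labels, centroids = kmeans(list(expression.values()), k=2)
clusters = dict(zip(expression, labels))
for name, label in clusters.items():
    print(f"{name}: cluster {label}")
```

On real expression matrices the number of clusters, the distance measure, and the initialisation all affect the result, which is the kind of stability and accuracy question the review sets out to examine.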

Citations

Posted Content

Weighted Cox regression for the prediction of heterogeneous patient subgroups

Katrin Madjar, +1 more

- 19 Mar 2020 -

arXiv: Methodology

TL;DR: A penalized Cox regression model with a weighted version of the Cox partial likelihood that includes patients of all subgroups but assigns them individual weights based on their subgroup affiliation is proposed.

Journal ArticleDOI

Enhancing semantic belief function to handle decision conflicts in SoS using k-means clustering.

Eman K. Elsayed, +4 more

- 07 Apr 2021 -

PeerJ

TL;DR: The k-means clustering technique is adopted to improve the detection and resolution of conflicts that arise when integrating new systems into an existing SoS, enhancing the Ontology Belief Function System of Systems (OBFSoS).

Journal ArticleDOI

Landslide Susceptibility Mapping Using DIvisive ANAlysis (DIANA) and RObust Clustering Using linKs (ROCK) Algorithms, and Comparison of Their Performance

Deborah Simon Mwakapesa, +3 more

- 26 Feb 2023 -

Sustainability

TL;DR: This paper applies and compares the performance of the DIvisive ANAlysis (DIANA) and RObust Clustering using linKs (ROCK) algorithms for landslide susceptibility mapping in the Baota District, China.

Book ChapterDOI

UCSL : A Machine Learning Expectation-Maximization framework for Unsupervised Clustering driven by Supervised Learning

Robin Louiset, +5 more

TL;DR: In this paper, the authors proposed a general Expectation-Maximization ensemble framework called UCSL (Unsupervised Clustering driven by Supervised Learning), which can integrate any clustering method and can be driven by binary classification and regression.

Journal ArticleDOI

Logical analysis of built-in DBSCAN Functions in Popular Data Science Programming Languages

M. Amiruzzaman, +3 more

- 26 Jun 2022 -

MIST journal of science and technology

TL;DR: A scientific way to assess the results of built-in DBSCAN functions and their output inconsistencies is proposed; the analysis reveals various differences and advises caution when working with built-in functionality.

References

Journal ArticleDOI

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster, +2 more

- 01 Sep 1977 -

Journal of the Royal Statistical Society...

Some methods for classification and analysis of multivariate observations

James B. MacQueen

TL;DR: The k-means algorithm introduced in this paper partitions an N-dimensional population into k sets on the basis of a sample; the procedure generalizes the ordinary sample mean and is shown to give partitions that are reasonably efficient in the sense of within-class variance.

Proceedings Article

A density-based algorithm for discovering clusters in large spatial databases with noise

Martin Ester, +3 more

TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape; the approach can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.
