Open Access, Journal Article

Clustering Algorithms: Their Application to Gene Expression Data

Abstract
Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a major opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes involved make it harder to comprehend and interpret the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Clustering is therefore a first step toward addressing these challenges, and an essential part of the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. Clustering of gene expression data has proven useful for revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. A further benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the clustering algorithms applicable to gene expression data in order to identify the clustering techniques that can provide stability and a high degree of accuracy in the analysis.
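To make the idea of clustering expression profiles concrete, the following is a minimal sketch, not a method taken from the review. It assumes a tiny, made-up expression table (the gene names and values are hypothetical), standardizes each gene's profile, and groups genes with a plain k-means loop. The choice of k-means, the z-score normalization, and the deterministic initialization from the first `k` profiles are all assumptions made for illustration.

```python
import math

def zscore(profile):
    """Standardize one gene's expression profile to mean 0 and unit variance."""
    mean = sum(profile) / len(profile)
    var = sum((x - mean) ** 2 for x in profile) / len(profile)
    std = math.sqrt(var) or 1.0  # flat profiles: avoid division by zero
    return [(x - mean) / std for x in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, max_iter=100):
    """Plain k-means; centroids start at the first k profiles (an assumption
    made so the sketch is deterministic, not a recommended initialization)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [None] * len(profiles)
    for _ in range(max_iter):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy data: rows are genes, columns are four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],      # rising
    "geneB": [8.0, 6.1, 4.0, 2.1],      # falling
    "geneC": [10.0, 20.5, 30.0, 41.0],  # rising, on a different scale
    "geneD": [5.0, 3.9, 3.0, 2.0],      # falling
}

genes = list(expression)
profiles = [zscore(expression[g]) for g in genes]
labels = kmeans(profiles, k=2)

clusters = {}
for gene, label in zip(genes, labels):
    clusters.setdefault(label, []).append(gene)
print(clusters)
```

Standardizing each profile first means genes are grouped by the shape of their expression pattern across conditions, not by absolute expression level, which is why `geneA` and `geneC` end up together despite their different scales.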


Citations

The Self-Organizing Map

TL;DR: This article presents an overview of the self-organizing map algorithm, on which the papers in this issue are based.
Journal Article

Applications of machine learning to diagnosis and treatment of neurodegenerative diseases

TL;DR: How machine learning can aid early diagnosis and interpretation of medical images as well as the discovery and development of new therapies is discussed, and the latest developments in the use of machine learning to interrogate neurodegenerative disease-related datasets are described.
Journal Article

A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects

TL;DR: Clustering is an essential tool in data mining research and applications, and it is the subject of active research in many fields of study, such as computer science, data science, statistics, pattern recognition, artificial intelligence, and machine learning.
Journal Article

Deep learning-based clustering approaches for bioinformatics

TL;DR: In this article, the authors review state-of-the-art DL-based approaches for clustering analysis that are based on representation learning, which they hope will be useful for bioinformatics research.
Journal Article

Single-cell transcriptomic evidence for dense intracortical neuropeptide networks

TL;DR: Here, neuron-type-specific patterns of NP gene expression are used to offer specific, testable predictions regarding 37 peptidergic neuromodulatory networks that may play prominent roles in cortical homeostasis and plasticity.
References
Journal Article

Metagenes and molecular pattern discovery using matrix factorization.

TL;DR: Nonnegative matrix factorization is described, an algorithm based on decomposition by parts that can reduce the dimension of expression data from thousands of genes to a handful of metagenes. It is found to be less sensitive to a priori selection of genes or initial conditions, and able to detect alternative or context-dependent patterns of gene expression in complex biological systems.
Journal Article

On cluster validity for the fuzzy c-means model

TL;DR: Limitation analysis indicates, and numerical experiments confirm, that the Fukuyama-Sugeno index is sensitive to both high and low values of m and may therefore be unreliable. Calculations suggest that the best choice for m is probably in the interval [1.5, 2.5], whose mean and midpoint, m=2, has often been the preferred choice for many users of FCM.
Journal Article

Unsupervised optimal fuzzy clustering

TL;DR: The unsupervised fuzzy partition-optimal number of classes algorithm performs well in situations of large variability of cluster shapes, densities, and number of data points in each cluster.
Journal Article

A new approach to clustering

TL;DR: A new method of representation of the reduced data, based on the idea of “fuzzy sets,” is proposed to avoid some of the problems of current clustering procedures and to provide better insight into the structure of the original data.
Proceedings Article

STING: A Statistical Information Grid Approach to Spatial Data Mining

TL;DR: The idea is to capture statistical information associated with spatial cells in such a manner that whole classes of queries and clustering problems can be answered without recourse to the individual objects.
Trending Questions
What are applications of clustering algorithms?

Applications of clustering algorithms include revealing natural structures in gene expression data, understanding gene functions, identifying cell subtypes, mining information from noisy data, and aiding in vaccine design.