Clustering Algorithms: Their Application to Gene Expression Data
Jelili Oyelade, Itunuoluwa Isewon, Funke Oladipupo, Olufemi Aromolaran, Efosa Uwoghiren, Faridah Ameh, Moses Achas, Ezekiel Adebiyi
TLDR
This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure.
Abstract:
Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a significant opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the large number of genes increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Clustering techniques are therefore a first step toward addressing these challenges and are essential in the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. The clustering of gene expression data has proven useful for revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure.
Citations
The Self-Organizing Map
TL;DR: This article presents an overview of the self-organizing map algorithm, on which the papers in this issue are based.
Journal Article
Applications of machine learning to diagnosis and treatment of neurodegenerative diseases
Monika A Myszczynska, Poojitha N. Ojamies, Alix M. B. Lacoste, Daniel Neil, Amir Saffari, Richard J. Mead, Guillaume M. Hautbergue, Joanna D. Holbrook, Laura Ferraiuolo
TL;DR: How machine learning can aid early diagnosis and interpretation of medical images as well as the discovery and development of new therapies is discussed, and the latest developments in the use of machine learning to interrogate neurodegenerative disease-related datasets are described.
Journal Article
A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects
Ezugwu E. Absalom, Abiodun Motunrayo Ikotun, Olaide Nathaniel Oyelade, Laith Abualigah, Jeffrey O. Agushaka, Christopher Ifeanyi Eke, Andronicus Ayobami Akinyelu
TL;DR: Clustering is an essential tool in data mining research and applications, and it is the subject of active research in many fields of study, such as computer science, data science, statistics, pattern recognition, artificial intelligence, and machine learning.
Journal Article
Deep learning-based clustering approaches for bioinformatics
Md. Rezaul Karim, Oya Beyan, Achille Zappa, Ivan G. Costa, Dietrich Rebholz-Schuhmann, Michael Cochez, Stefan Decker
TL;DR: In this article, the authors present a review of state-of-the-art DL-based approaches for clustering analysis that are based on representation learning, which they hope to be useful for bioinformatics research.
Journal Article
Single-cell transcriptomic evidence for dense intracortical neuropeptide networks
Stephen J. Smith, Uygar Sümbül, Lucas T. Graybuck, Forrest Collman, Sharmishtaa Seshamani, Rohan Gala, Olga Gliko, Leila Elabbady, Jeremy A. Miller, Trygve E. Bakken, Jean Rossier, Zizhen Yao, Ed Lein, Hongkui Zeng, Bosiljka Tasic, Michael Hawrylycz
TL;DR: Here, neuron-type-specific patterns of NP gene expression are used to offer specific, testable predictions regarding 37 peptidergic neuromodulatory networks that may play prominent roles in cortical homeostasis and plasticity.