Clustering Algorithms: Their Application to Gene Expression Data
Jelili Oyelade, Itunuoluwa Isewon, Funke Oladipupo, Olufemi Aromolaran, Efosa Uwoghiren, Faridah Ameh, Moses Achas, Ezekiel Adebiyi
TLDR
This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure.
Abstract:
Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a tremendous opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes involved increase the challenge of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. The use of clustering techniques is therefore a first step toward addressing these challenges, and it is essential in the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. Clustering of gene expression data has proven useful for revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure.
Citations
Proceedings ArticleDOI
Multi-objective semi-supervised clustering algorithm based on constraint set optimization for gene expression data
TL;DR: Zhang et al. propose a semi-supervised clustering algorithm for gene expression data, based on constraint set optimization, to address the problem of noisy constraints; the proposed method filters the constraint sets under the guidance of clustering validity measures.
Posted ContentDOI
Identification of genes associated with abiotic stress tolerance in sweetpotato using weighted gene co-expression network analysis
Mercy N. Kitavi, Dorcus C. Gemenet, Joshua C. Wood, John P. Hamilton, Shanshan Yu, Zhangjun Fei, Awais Khan, C. Robin Buell
TL;DR: In this paper, the authors identify differentially expressed genes (DEGs) in leaves of the orange-fleshed cultivar "Beauregard" in order to find shared and unique gene coexpression networks under multiple abiotic stress conditions.
Proceedings ArticleDOI
Computational insights on the molecular mechanisms across breast cancer progression combining gene differential expression and co-expression
Emmanouil K. Ikonomakis, Marilena M. Bourdakou, George Kolios, Michael N. Vrahatis, George M. Spyrou
TL;DR: In this article, the authors combine gene differential analysis (single gene view) with gene co-expression analysis (systemic view) to provide insights on the implicated molecular mechanisms across breast cancer progression.
Journal ArticleDOI
Self-organizing map with granular competitive learning: Application to microarray clustering
TL;DR: The effectiveness of SOMGCL is demonstrated, in terms of cluster evaluation metrics and quantization error, in clustering both the samples and the genes of microarrays that have a large number of genes and classes.
Proceedings ArticleDOI
Speeding up the interpretation of differential gene expression analysis results
TL;DR: In this article, the authors propose a complete pipeline that makes understanding the results of differential gene expression analysis faster, easier, and more efficient. The pipeline takes in Gene Ontology terms along with descriptions of the collected genes and returns gene clusters, the topics they are related to, and a filtered list of the most common words found in each cluster.
References
Journal ArticleDOI
Maximum likelihood from incomplete data via the EM algorithm
Journal Article
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Posted Content
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Some methods for classification and analysis of multivariate observations
TL;DR: The k-means algorithm partitions an N-dimensional population into k sets on the basis of a sample. The k-means concept is a generalization of the ordinary sample mean, and the procedure is shown to give partitions that are reasonably efficient in the sense of within-class variance.
Proceedings Article
A density-based algorithm for discovering clusters in large spatial databases with noise
TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape, which can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.
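
The k-means entry in the reference list above describes partitioning a population into k sets with low within-class variance. As a rough illustration of that idea applied to expression profiles, here is a minimal sketch in plain Python. It is not the procedure used in the review: the gene names, the expression values, and the function names (sq_dist, kmeans) are all made up for the example, and the loop is the common Lloyd-style iteration rather than any particular published variant.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, n_iter=100, seed=0):
    """Lloyd-style k-means on a list of equal-length expression profiles.

    Returns (labels, centroids). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct profiles chosen at random.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy, invented log-ratio profiles: rows are genes, columns are conditions.
genes = {
    "geneA": [2.1, 2.3, 1.9, 2.2],
    "geneB": [2.0, 2.4, 2.1, 2.0],
    "geneC": [1.8, 2.2, 2.0, 2.3],
    "geneD": [-1.9, -2.1, -2.0, -1.8],
    "geneE": [-2.2, -1.8, -2.1, -2.0],
    "geneF": [-2.0, -2.0, -1.9, -2.2],
}
names = list(genes)
profiles = [genes[n] for n in names]

labels, centroids = kmeans(profiles, k=2)

# Within-cluster sum of squares, the quantity k-means tries to keep small.
wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(profiles, labels))

for name, lab in zip(names, labels):
    print(name, "-> cluster", lab)
print("within-cluster sum of squares:", round(wcss, 3))
```

On real data, the number of clusters, the distance measure, and the initialisation all affect the result, so a sketch like this is only a starting point; a maintained implementation such as the one in scikit-learn, also listed above, would be the more usual choice.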