Journal Article

Curve-based clustering of time course gene expression data using self-organizing maps

Abstract
There is increasing interest in clustering time course gene expression data to investigate a wide range of biological processes. However, developing a clustering algorithm well suited to time course gene expression data remains challenging. Because timing is an important factor in defining true clusters, a clustering algorithm should explore expression correlations between time points in order to achieve high clustering accuracy. Moreover, inter-cluster gene relationships are often desired to facilitate the computational inference of biological pathways and regulatory networks. In this paper, a new clustering algorithm called CurveSOM is developed to offer both features. It first represents each gene by a cubic smoothing spline fitted to its time course expression profile, and then groups genes into clusters by applying self-organizing map-based clustering to the resulting splines. CurveSOM has been tested on three well-studied yeast cell cycle datasets and compared with four popular programs: Cluster 3.0, GENECLUSTER, MCLUST, and SSClust. The results show that CurveSOM is a very promising tool for the exploratory analysis of time course expression data, as it is able not only to group genes into clusters with high accuracy but also to find true time-shifted correlations of expression patterns across clusters.
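The two-stage idea in the abstract (fit a cubic smoothing spline per gene, then cluster the fitted curves with a self-organizing map) can be illustrated with a minimal sketch. This is not the CurveSOM implementation. The function names, the synthetic data, the smoothing value `smooth=0.5`, the one-dimensional map with two nodes, and the choice to compare splines by Euclidean distance on a shared evaluation grid are all assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def fit_spline_curves(times, expr, grid, smooth):
    """Fit a cubic smoothing spline to each gene and sample it on a shared grid."""
    curves = np.empty((expr.shape[0], grid.size))
    for i, profile in enumerate(expr):
        # k=3 gives a cubic spline; s bounds the residual sum of squares.
        spline = UnivariateSpline(times, profile, k=3, s=smooth)
        curves[i] = spline(grid)
    return curves


def train_som(data, n_nodes, n_iter=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D self-organizing map (hypothetical helper, not CurveSOM's)."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen input curves.
    weights = data[rng.choice(len(data), size=n_nodes, replace=False)].copy()
    positions = np.arange(n_nodes)
    for step in range(n_iter):
        decay = 1.0 - step / n_iter
        lr = lr0 * decay                # learning rate shrinks linearly
        sigma = sigma0 * decay + 1e-3   # neighbourhood radius shrinks too
        x = data[rng.integers(len(data))]
        # Best-matching unit: the node whose weight vector is closest to x.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood on the map pulls nearby nodes along.
        influence = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * influence[:, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Label each curve with the index of its nearest map node."""
    dists = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


# Synthetic demo (assumed data): 40 noisy periodic profiles in two phase groups.
rng = np.random.default_rng(1)
times = np.linspace(0.0, 2.0 * np.pi, 18)
phases = np.repeat([0.0, np.pi], 20)
expr = np.sin(times[None, :] + phases[:, None]) + rng.normal(0.0, 0.2, (40, 18))

grid = np.linspace(times[0], times[-1], 50)
curves = fit_spline_curves(times, expr, grid, smooth=0.5)
weights = train_som(curves, n_nodes=2)
labels = assign_clusters(curves, weights)
print(np.bincount(labels, minlength=2))  # number of genes assigned to each node
```

Sampling every spline on the same grid turns each gene into a fixed-length vector, which keeps the map update simple. A method that works directly on spline coefficients or derivatives would be an equally plausible reading of the abstract.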


Citations
Journal Article

A Linear Mixed Model Spline Framework for Analysing Time Course 'Omics' Data

TL;DR: This work presents a novel, robust and powerful framework to analyze time course ‘omics’ data that consists of three stages: quality assessment and filtering, profile modelling, and analysis, and demonstrates the high sensitivity and specificity of the approach for differential expression analysis.
Journal Article

Clustering longitudinal profiles using P-splines and mixed effects models applied to time-course gene expression data

TL;DR: An alternative approach is proposed, which aims to alleviate some of the limitations to the techniques previously described, and exploits the connection between the linear mixed effects model and P-spline smoothing to simultaneously smooth the gene expression data to remove any measurement error/noise.
Journal Article

Analysis of gene expression profile identifies potential biomarkers for atherosclerosis.

TL;DR: The present study indicated that APH1B, JAM3, FBLN2, CSAD and PSTPIP2 may have important roles in the progression of atherosclerosis in females and may be potential biomarkers for early diagnosis and prognosis as well as treatment targets for this disease.
Journal Article

Identification of gene markers in the development of smoking-induced lung cancer.

TL;DR: Functional and pathway enrichment analyses were conducted separately for the DEGs and indicated gene-level differences between the two groups, and SVM analysis indicated that the five features had potential diagnostic value.
Journal Article

Identification of breast cancer recurrence risk factors based on functional pathways in tumor and normal tissues.

TL;DR: The results reveal that the integration of tumor and normal tissue functional analyses can comprehensively enhance the understanding of BC prognosis.
References
Journal Article

Cluster analysis and display of genome-wide expression patterns

TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Journal Article

Silhouettes: a graphical aid to the interpretation and validation of cluster analysis

TL;DR: A new graphical display is proposed for partitioning techniques, where each cluster is represented by a so-called silhouette, which is based on the comparison of its tightness and separation, and provides an evaluation of clustering validity.
Book

Self-Organizing Maps

TL;DR: The Self-Organizing Map (SOM) algorithm was introduced by the author in 1981; its many applications form one of the major approaches in the contemporary artificial neural networks field, and new technologies have already been based on it.
Journal Article

Comprehensive Identification of Cell Cycle–regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization

TL;DR: A comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle is created, and it is found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins.
Journal Article

Estimating the number of clusters in a data set via the gap statistic

TL;DR: In this paper, the authors proposed a method called the "gap statistic" for estimating the number of clusters (groups) in a set of data, which uses the output of any clustering algorithm (e.g. K-means or hierarchical), comparing the change in within-cluster dispersion with that expected under an appropriate reference null distribution.