Journal ArticleDOI

Noise-robust soft clustering of gene expression time-course data

TLDR
To overcome the limitations of hard clustering, this work applies soft clustering, which offers several advantages for researchers: it is more robust to noise, and a priori pre-filtering of genes can be avoided.
Abstract
Clustering is an important tool in microarray data analysis. This unsupervised learning technique is commonly used to reveal structures hidden in large gene expression data sets. The vast majority of clustering algorithms applied so far produce hard partitions of the data, i.e. each gene is assigned to exactly one cluster. Hard clustering is favourable if clusters are well separated. However, this is generally not the case for microarray time-course data, where gene clusters frequently overlap. Additionally, hard clustering algorithms are often highly sensitive to noise. To overcome the limitations of hard clustering, we applied soft clustering, which offers several advantages for researchers. First, it generates accessible internal cluster structures, i.e. it indicates how well corresponding clusters represent genes. This can be used for a more targeted search for regulatory elements. Second, the overall relation between clusters, and thus a global clustering structure, can be defined. Additionally, soft clustering is more robust to noise, and a priori pre-filtering of genes can be avoided. This prevents the exclusion of biologically relevant genes from the data analysis. Soft clustering was implemented here using the fuzzy c-means algorithm. Procedures to find optimal clustering parameters were developed. A software package for soft clustering has been developed based on the open-source statistical language R. The package, called Mfuzz, is freely available.
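
To make the contrast between hard and soft partitions concrete, the following is a minimal sketch of the standard fuzzy c-means iteration in plain Python. It is not the Mfuzz implementation: the function name `fuzzy_c_means`, the fuzzifier value `m=2.0`, the toy profiles, and the 0.7 membership cut-off used to pick "core" genes are all assumptions chosen for illustration.

```python
import math
import random


def fuzzy_c_means(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Soft-cluster `data` (a list of equal-length profiles) into `c` clusters.

    Returns (centers, memberships). memberships[i][j] is the degree, between
    0 and 1, to which profile i belongs to cluster j; each row sums to 1.
    The fuzzifier m > 1 controls how soft the partition is (assumed default).
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Centre update: mean of all profiles, weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * data[i][k] for i in range(n)) / w_sum
                 for k in range(dim)]
            )

        # Membership update: inversely related to distance from each centre.
        new_u = []
        for x in data:
            d = [math.dist(x, ctr) for ctr in centers]
            if min(d) == 0.0:
                # Profile sits exactly on a centre: full membership there.
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                new_u.append([h / sum(hits) for h in hits])
            else:
                inv = [dj ** (-2.0 / (m - 1.0)) for dj in d]
                s = sum(inv)
                new_u.append([v / s for v in inv])

        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


# Toy time-course profiles (made up for illustration, not real expression data).
profiles = [
    [0.0, 1.0, 2.0, 3.0],  # rising
    [0.1, 1.1, 2.1, 2.9],  # rising
    [3.0, 2.0, 1.0, 0.0],  # falling
    [2.9, 2.1, 0.9, 0.1],  # falling
    [1.5, 1.5, 1.5, 1.5],  # flat: fits neither pattern well
]
centers, memberships = fuzzy_c_means(profiles, c=2)

# Hypothetical cut-off: keep only genes that one cluster represents well.
core = [i for i, row in enumerate(memberships) if max(row) >= 0.7]
```

In this sketch the flat profile is not forced into either cluster. It receives a split membership and falls below the cut-off, while the rising and falling profiles remain as cluster cores. This mirrors the point made in the abstract: membership values indicate how well a cluster represents a gene, so poorly fitting genes need not be filtered out beforehand.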


Citations
Journal ArticleDOI

Global, in vivo, and site-specific phosphorylation dynamics in signaling networks.

TL;DR: A general mass spectrometric technology is developed and applied for identification and quantitation of phosphorylation sites as a function of stimulus, time, and subcellular location to provide a missing link in a global, integrative view of cellular regulation.
Journal ArticleDOI

Mfuzz: a software package for soft clustering of microarray data.

TL;DR: An R package termed Mfuzz is constructed implementing soft clustering tools for microarray data analysis, which can overcome shortcomings of conventional hard clustering techniques and offer further advantages.
Journal ArticleDOI

The NASA Twins Study: A multidimensional analysis of a year-long human spaceflight.

Francine E. Garrett-Bakelman, +88 more - 12 Apr 2019
TL;DR: Given that the majority of the biological and human health variables remained stable, or returned to baseline, after a 340-day space mission, these data suggest that human health can be mostly sustained over this duration of spaceflight.
Journal ArticleDOI

Fuzzy c-Means Algorithms for Very Large Data

TL;DR: This paper compares the efficacy of three different implementations of techniques aimed to extend fuzzy c-means (FCM) clustering to VL data and concludes by demonstrating the VL algorithms on a dataset with 5 billion objects and presenting a set of recommendations regarding the use of different VL FCM clustering schemes.
References
Journal ArticleDOI

Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation

TL;DR: Whole-genome mRNA quantitation is tested by applying it to three extensively studied regulatory systems in the yeast Saccharomyces cerevisiae: galactose response, heat shock, and mating type, and yielded all of the four relevant DNA motifs and most of the known a- and α-specific genes.
Journal ArticleDOI

Analysis of gene expression data using self‐organizing maps

TL;DR: The SOM algorithm is applied to analyze published data of yeast gene expression and it is shown that SOM is an excellent tool for the analysis and visualization of gene expression profiles.
Journal ArticleDOI

Fuzzy C-means method for clustering microarray data

TL;DR: By setting threshold levels for the membership values of the FCM method, genes which are tightly associated with a given cluster can be selected, and this selection increases the overall biological significance of the genes within the cluster.
Journal ArticleDOI

Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering

TL;DR: Fuzzy k-means clustering is a useful analytical tool for extracting biological insights from gene-expression data and suggests that a prevalent theme in the regulation of yeast gene expression is the condition-specific coregulation of overlapping sets of genes.
Journal ArticleDOI

Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters.

TL;DR: A simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure and is guaranteed to eventually find the globally optimal distribution of genes over clusters.