Journal Article
Noise-robust soft clustering of gene expression time-course data
TL;DR: To overcome the limitations of hard clustering, this work applies soft clustering, which is more robust to noise and avoids a priori pre-filtering of genes.
Abstract:
Clustering is an important tool in microarray data analysis. This unsupervised learning technique is commonly used to reveal structures hidden in large gene expression data sets. The vast majority of clustering algorithms applied so far produce hard partitions of the data, i.e. each gene is assigned to exactly one cluster. Hard clustering is favourable if clusters are well separated. However, this is generally not the case for microarray time-course data, where gene clusters frequently overlap. Additionally, hard clustering algorithms are often highly sensitive to noise. To overcome the limitations of hard clustering, we applied soft clustering, which offers several advantages for researchers. First, it generates accessible internal cluster structures, i.e. it indicates how well corresponding clusters represent genes. This can be used for a more targeted search for regulatory elements. Second, the overall relation between clusters, and thus a global clustering structure, can be defined. Additionally, soft clustering is more robust to noise, and a priori pre-filtering of genes can be avoided. This prevents the exclusion of biologically relevant genes from the data analysis. Soft clustering was implemented here using the fuzzy c-means algorithm. Procedures to find optimal clustering parameters were developed. A software package for soft clustering has been developed based on the open-source statistical language R. The package, called Mfuzz, is freely available.
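The abstract names fuzzy c-means as the algorithm behind the soft clustering. As a rough illustration of what "soft" means here, the following is a minimal sketch of the textbook fuzzy c-means iteration in plain Python. It is not the Mfuzz implementation (which is written in R), and the function name `fuzzy_c_means`, the fuzzifier value `m=2.0`, the stopping rule, and the toy profiles are all assumptions chosen for illustration rather than taken from the paper.

```python
import math
import random


def fuzzy_c_means(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means sketch (hypothetical helper, not the Mfuzz API).

    data: list of equal-length numeric profiles
    c:    number of clusters
    m:    fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships); each membership row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Start from random memberships, normalised per profile.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Cluster centers: means of the profiles weighted by membership^m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * data[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )

        # Memberships: proportional to distance^(-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [math.dist(data[i], centers[k]) for k in range(c)]
            if any(d == 0.0 for d in dists):
                # Profile coincides with a center: give it full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                new_row = [h / total for h in hits]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once memberships have stabilised.
        if shift < tol:
            break

    return centers, u


# Toy data (invented): two tight groups plus one profile halfway between them.
profiles = [
    [0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
    [5.0, 4.9], [4.9, 5.0], [5.0, 5.0],
    [2.5, 2.5],
]
centers, memberships = fuzzy_c_means(profiles, c=2)
for profile, row in zip(profiles, memberships):
    print(profile, [round(v, 3) for v in row])
```

In this toy run the profiles inside each tight group should receive a membership close to 1 for one cluster, while the halfway profile is split roughly evenly between the two. A hard partition would have to force that ambiguous profile into one cluster; the graded membership is what lets poorly represented genes be kept in the analysis and down-weighted rather than filtered out beforehand.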