Journal Article
Noise-robust soft clustering of gene expression time-course data
TL;DR: To overcome the limitations of hard clustering, this work applies soft clustering, which is more robust to noise and makes a priori pre-filtering of genes unnecessary.
Abstract:
Clustering is an important tool in microarray data analysis. This unsupervised learning technique is commonly used to reveal structures hidden in large gene expression data sets. The vast majority of clustering algorithms applied so far produce hard partitions of the data, i.e. each gene is assigned to exactly one cluster. Hard clustering is favourable if clusters are well separated. However, this is generally not the case for microarray time-course data, where gene clusters frequently overlap. Additionally, hard clustering algorithms are often highly sensitive to noise. To overcome the limitations of hard clustering, we applied soft clustering, which offers several advantages for researchers. First, it generates accessible internal cluster structures, i.e. it indicates how well corresponding clusters represent genes. This can be used for a more targeted search for regulatory elements. Second, the overall relation between clusters, and thus a global clustering structure, can be defined. Additionally, soft clustering is more robust to noise, and a priori pre-filtering of genes can be avoided. This prevents the exclusion of biologically relevant genes from the data analysis. Soft clustering was implemented here using the fuzzy c-means algorithm. Procedures to find optimal clustering parameters were developed. A software package for soft clustering has been developed based on the open-source statistical language R. The package, called Mfuzz, is freely available.
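The difference between hard and soft assignment can be illustrated with a minimal fuzzy c-means sketch. This is not the Mfuzz package or its API (Mfuzz is written in R); it is a plain Python illustration of the standard algorithm. The function name `fuzzy_c_means`, the toy expression profiles, and the fuzzifier value `m=2.0` are assumptions chosen for the example, not parameters taken from the paper.

```python
import math
import random


def fuzzy_c_means(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which item i belongs to cluster j, and each row sums to 1.
    The fuzzifier m must be greater than 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Each center is the mean of all items, weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * data[i][d] for i in range(n)) / wsum for d in range(dim)]
            )

        # Memberships are recomputed from the distances to every center.
        shift = 0.0
        for i in range(n):
            dist = [
                math.sqrt(sum((a - b) ** 2 for a, b in zip(data[i], centers[j])))
                for j in range(c)
            ]
            if min(dist) == 0.0:
                # Item sits exactly on one or more centers: share membership there.
                zeros = [j for j in range(c) if dist[j] == 0.0]
                new = [1.0 / len(zeros) if j in zeros else 0.0 for j in range(c)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


# Hypothetical time-course profiles: two rising, two falling, one flat.
profiles = [
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.0, 2.9],
    [3.0, 2.0, 1.0, 0.0],
    [2.9, 2.1, 0.9, 0.1],
    [1.5, 1.5, 1.5, 1.5],
]
centers, memberships = fuzzy_c_means(profiles, c=2)
for profile, row in zip(profiles, memberships):
    print(profile, [round(v, 2) for v in row])
```

In this toy data the rising and falling profiles receive a high membership in one cluster each, while the flat profile, which fits neither trend well, receives intermediate values in both. A hard partition would force the flat profile into one cluster; the membership values instead indicate how well each cluster represents a gene.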
Citations
Journal Article
A Transcriptomics Resource Reveals a Transcriptional Transition During Ordered Sarcomere Morphogenesis in Flight Muscle
Maria L. Spletter, Christiane Barz, Assa Yeroslaviz, Xu Zhang, Sandra B. Lemke, Adrien Bonnard, Erich Brunner, Giovanni Cardone, Konrad Basler, Bianca Habermann, Frank Schnorrer
TL;DR: An ordered sarcomere morphogenesis process under precise transcriptional control is defined, a concept that may also apply to vertebrate muscle or heart development.
Journal Article
m6A RNA modifications are measured at single-base resolution across the mammalian transcriptome
Lulu Hu, Shun Liu, Yong Peng, Ruiqi Ge, Rui Su, Chamara Senevirathne, Bryan T. Harada, Qing Dai, Jiangbo Wei, Li-Sheng Zhang, Ziyang Hao, Liangzhi Luo, Huanyu Wang, Yuru Wang, Minkui Luo, Mengjie Chen, Jianjun Chen, Chuan He
TL;DR: m6A-SAC-seq is a quantitative method to dissect the dynamics and functional roles of m6A sites in diverse biological processes using limited input RNA.
Journal Article
Dissecting the proteome dynamics of the early heat stress response leading to plant survival or death in Arabidopsis.
Sira Echevarría-Zomeño, Lourdes Fernández-Calvino, Ana B. Castro-Sanz, Juan Antonio López, Jesús Vázquez, M. Mar Castellano
TL;DR: Thermotolerance assays of mutants in genes with an uncharacterized role in heat stress demonstrate the relevance of this study to uncover both positive and negative heat regulators and pinpoint a pivotal role of JR1 and BAG6 in heat tolerance.
Journal Article
Low oxygen levels as a trigger for enhancement of respiratory metabolism in Saccharomyces cerevisiae.
Eija Rintala, Mervi Toivari, Juha-Pekka Pitkänen, Marilyn G. Wiebe, Laura Ruohonen, Merja Penttilä
TL;DR: Global upregulation of genes encoding components of the respiratory pathway under conditions of intermediate oxygen suggested a regulatory mechanism to control these genes as a response to the need of more efficient energy production.
Journal Article
Clustering gene expression time course data using mixtures of multivariate t-distributions
TL;DR: A very general and flexible model-based technique is used to cluster longitudinal data, using mixtures of multivariate t-distributions with a linear model for the mean and a modified Cholesky-decomposed covariance structure.
References
Journal Article
Cluster analysis and display of genome-wide expression patterns
TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. In the budding yeast Saccharomyces cerevisiae, clustering gene expression data efficiently groups together genes of known similar function.
Book
Pattern Recognition with Fuzzy Objective Function Algorithms
Related Papers (5)
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Yoav Benjamini, Yosef Hochberg
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
Gene Ontology: tool for the unification of biology
M Ashburner, Catherine A. Ball, Judith A. Blake, David Botstein, Heather Butler, J. M. Cherry, Allan Peter Davis, Kara Dolinski, Selina S. Dwight, J.T. Eppig, Midori A. Harris, David P. Hill, Laurie Issel-Tarver, Andrew Kasarskis, Suzanna E. Lewis, John C. Matese, Joel E. Richardson, M. Ringwald, Gerald M. Rubin, Gavin Sherlock