Book Chapter

A Least Squares Fitting-Based Modeling of Gene Regulatory Sub-networks

TL;DR: This paper presents a simple and novel least squares fitting-based modeling approach for extracting simple gene regulatory sub-networks from biclusters in microarray time series gene expression data.
Abstract: This paper presents a simple and novel least squares fitting-based modeling approach for extracting simple gene regulatory sub-networks from biclusters in microarray time series gene expression data. Preprocessing helps retain the strongly interacting gene regulatory pairs. The methodology was applied to public-domain yeast data sets, and the experimental results were biologically validated against standard databases and information from the literature.
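
The least squares fitting idea in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that one gene's expression profile is modeled as a low-degree polynomial of another gene's profile and that the residual sum of squares indicates how well the pair fits. The function names, the degree, and the sample profiles are hypothetical and are not taken from the paper.

```python
def polyfit_least_squares(x, y, degree):
    """Fit y ~ c0 + c1*x + ... + cd*x^d by solving the normal equations.

    Assumes x contains at least degree + 1 distinct values, so the
    normal-equation matrix is non-singular.
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


def residual_sum_of_squares(x, y, coeffs):
    """Sum of squared differences between y and the fitted polynomial."""
    pred = [sum(c * xi ** k for k, c in enumerate(coeffs)) for xi in x]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred))


# Hypothetical expression profiles for one gene pair (not real data).
regulator = [0.0, 1.0, 2.0, 3.0, 4.0]
target = [1.0 + 2.0 * v + 0.5 * v * v for v in regulator]

coeffs = polyfit_least_squares(regulator, target, degree=2)
rss = residual_sum_of_squares(regulator, target, coeffs)
```

In a sketch like this, a small residual for a pair would suggest that the two profiles are closely related. How the paper actually scores and filters gene pairs is not stated in the abstract.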


Citations
Patent
10 Aug 2010
TL;DR: Methods and custom computing apparatuses are presented for identifying gene-gene interactions from gene expression data, based on which a gene regulatory sub-network can be built.
Abstract: Disclosed are methods and custom computing apparatuses for identifying gene-gene interactions from gene expression data, based on which a gene regulatory sub-network can be built. In particular, relationships in which multiple genes co-regulate one target gene can also be identified.

2 citations

References
Journal Article
TL;DR: The genome-wide characterization of mRNA transcript levels during the cell cycle of the budding yeast S. cerevisiae indicates a mechanism for local chromosomal organization in global mRNA regulation and links a range of human genes to cell cycle period-specific biological functions.

2,232 citations


"A Least Squares Fitting-Based Model..." refers methods in this paper

  • ...Yeast cell-cycle CDC28 data [6], a collection of 6220 genes for 17 time points, taken at 10-minute intervals, were chosen for applying our methodology....

    [...]

Journal Article
TL;DR: This work develops algorithms for identifying generalized hierarchies and uses these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes and eukaryotes, finding that TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell.
Abstract: A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace “chain-of-command” structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein–protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are “control bottlenecks” in the hierarchy, and this great degree of control for “middle managers” has parallels in efficient social structures in various corporate and governmental settings.

355 citations


"A Least Squares Fitting-Based Model..." refers background in this paper

  • ...While identifying the hierarchical structure of regulatory networks [8], it was reported that YHR084W-YNL192W forms a TF-T gene pair....

    [...]

Journal Article
TL;DR: Novel methods for estimation of missing values in microarray data sets that are based on the least squares principle, and that utilize correlations between both genes and arrays are presented.
Abstract: Microarray experiments generate data sets with information on the expression levels of thousands of genes in a set of biological samples. Unfortunately, such experiments often produce multiple missing expression values, normally due to various experimental problems. As many algorithms for gene expression analysis require a complete data matrix as input, the missing values have to be estimated in order to analyze the available data. Alternatively, genes and arrays can be removed until no missing values remain. However, for genes or arrays with only a small number of missing values, it is desirable to impute those values. For the subsequent analysis to be as informative as possible, it is essential that the estimates for the missing gene expression values are accurate. A small number of badly estimated missing values in the data might be enough for clustering methods, such as hierarchical clustering or K-means clustering, to produce misleading results. Thus, accurate methods for missing value estimation are needed. We present novel methods for estimation of missing values in microarray data sets that are based on the least squares principle, and that utilize correlations between both genes and arrays. For this set of methods, we use the common reference name LSimpute. We compare the estimation accuracy of our methods with the widely used KNNimpute on three complete data matrices from public data sets by randomly knocking out data (labeling as missing). From these tests, we conclude that our LSimpute methods produce estimates that consistently are more accurate than those obtained using KNNimpute. Additionally, we examine a more classic approach to missing value estimation based on expectation maximization (EM). We refer to our EM implementations as EMimpute, and the estimate errors using the EMimpute methods are compared with those our novel methods produce. The results indicate that on average, the estimates from our best performing LSimpute method are at least as accurate as those from the best EMimpute algorithm.
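
The least squares principle behind this kind of imputation can be sketched in a much simplified form. The example below is not the LSimpute algorithm itself. It assumes a single complete candidate gene is chosen by absolute Pearson correlation with the target gene over the observed arrays, and that a simple linear regression on that gene predicts the missing entries. All names and values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector has zero variance."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def impute_by_regression(target, candidates):
    """Fill None entries of `target` using one correlated complete profile.

    target: expression values with None marking missing entries.
    candidates: complete profiles (no missing values) of other genes.
    """
    observed = [i for i, v in enumerate(target) if v is not None]
    t_obs = [target[i] for i in observed]
    # Choose the candidate most correlated (in absolute value) with the target.
    best = max(
        candidates,
        key=lambda g: abs(pearson([g[i] for i in observed], t_obs)),
    )
    x = [best[i] for i in observed]
    mx, my = mean(x), mean(t_obs)
    # Ordinary least squares for target = intercept + slope * best.
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, t_obs)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    intercept = my - slope * mx
    return [
        v if v is not None else intercept + slope * best[i]
        for i, v in enumerate(target)
    ]


# Hypothetical profiles: the third array of the target gene is missing.
target_gene = [1.0, 2.0, None, 4.0]
other_genes = [[2.0, 4.0, 6.0, 8.0], [5.0, 1.0, 3.0, 2.0]]
filled = impute_by_regression(target_gene, other_genes)
```

The abstract describes methods that use correlations between both genes and arrays. This sketch uses only one correlated gene and is meant to show the regression step, not to reproduce those methods.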

344 citations


"A Least Squares Fitting-Based Model..." refers methods in this paper

  • ...Eventually, a total of 6029 genes were taken for imputation of missing values according to the methodology provided in [7]....

    [...]

Journal Article
TL;DR: A novel multi-objective evolutionary biclustering framework is introduced by incorporating local search strategies and a new quantitative measure to evaluate the goodness of the biclusters is developed.

253 citations


"A Least Squares Fitting-Based Model..." refers methods in this paper

  • ...The algorithm followed is discussed in detail in [2]....

    [...]

  • ...In this paper we propose the method of least squares fitting using polynomials in the framework of continuous-column multiobjective evolutionary biclustering [2] to extract the interaction between gene pairs....

    [...]

Book
21 Aug 2002
TL;DR: The book discusses the foundations for analyzing microarray data sets, genomic data-mining, the creation of standardized nomenclature and data models, clinical applications of functional genomics research, and the future of functional genomics.
Abstract: From the Publisher: Functional genomics--the deconstruction of the genome to determine the biological function of genes and gene interactions--is one of the most fruitful new areas of biology. The growing use of DNA microarrays allows researchers to assess the expression of tens of thousands of genes at a time. This quantitative change has led to qualitative progress in our ability to understand regulatory processes at the cellular level. This book provides a systematic introduction to the use of DNA microarrays as an investigative tool for functional genomics. The presentation is appropriate for readers from biology or bioinformatics. After presenting a framework for the design of microarray-driven functional genomics experiments, the book discusses the foundations for analyzing microarray data sets, genomic data-mining, the creation of standardized nomenclature and data models, clinical applications of functional genomics research, and the future of functional genomics.

194 citations