Author

Siva Sivaganesan

Bio: Siva Sivaganesan is an academic researcher from the University of Cincinnati. The author has contributed to research on the topics of prior probability and Bayesian probability. The author has an h-index of 18 and has co-authored 51 publications receiving 1986 citations. Previous affiliations of Siva Sivaganesan include the University of Cincinnati Academic Health Center.


Papers
Journal Article
01 Jun 1994 - Test
TL;DR: An overview of the subject of robust Bayesian analysis is provided, one that is accessible to statisticians outside the field, and recent developments in the area are reviewed.
Abstract: Robust Bayesian analysis is the study of the sensitivity of Bayesian answers to uncertain inputs. This paper seeks to provide an overview of the subject, one that is accessible to statisticians outside the field. Recent developments in the area are also reviewed, though with very uneven emphasis.

587 citations

Journal Article
TL;DR: A clustering procedure based on the Bayesian infinite mixture model is applied to gene expression profiles; it allows the uncertainties involved in model selection to be incorporated into the final assessment of confidence in similarities of expression profiles.
Abstract: MOTIVATION: The biologic significance of results obtained through cluster analyses of gene expression data generated in microarray experiments has been demonstrated in many studies. In this article we focus on the development of a clustering procedure based on the concept of Bayesian model-averaging and a precise statistical model of expression data. RESULTS: We developed a clustering procedure based on the Bayesian infinite mixture model and applied it to clustering gene expression profiles. Clusters of genes with similar expression patterns are identified from the posterior distribution of clusterings defined implicitly by the stochastic data-generation model. The posterior distribution of clusterings is estimated by a Gibbs sampler. We summarized the posterior distribution of clusterings by calculating posterior pairwise probabilities of co-expression and used the complete linkage principle to create clusters. This approach has several advantages over usual clustering procedures. The analysis allows for incorporation of a reasonable probabilistic model for generating data. The method does not require specifying the number of clusters, and the resulting optimal clustering is obtained by averaging over models with all possible numbers of clusters. Expression profiles that are not similar to any other profile are automatically detected, the method incorporates experimental replicates, and it can be extended to accommodate missing data. This approach represents a qualitative shift in the model-based cluster analysis of expression data because it allows for incorporation of uncertainties involved in the model selection in the final assessment of confidence in similarities of expression profiles. We also demonstrated the importance of incorporating the information on experimental variability into the clustering model.
AVAILABILITY: The MS Windows(TM) based program implementing the Gibbs sampler and supplemental material is available at http://homepages.uc.edu/~medvedm/BioinformaticsSupplement.htm CONTACT: medvedm@email.uc.edu
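
The abstract above summarizes the posterior distribution of clusterings by posterior pairwise probabilities of co-expression. A minimal sketch of that summary step follows, assuming the Gibbs sampler output is available as a list of cluster-label vectors (one per retained draw). The function name `pairwise_coclustering` and the toy draws are hypothetical and are not taken from the authors' program.

```python
from itertools import combinations


def pairwise_coclustering(samples):
    """Estimate P(gene i and gene j share a cluster) from sampled labelings.

    `samples` is a list of label vectors, one per retained Gibbs draw
    (an assumed input format). Label values only matter within a draw,
    so relabeling clusters between draws does not change the result.
    """
    n_genes = len(samples[0])
    n_draws = len(samples)
    probs = [[0.0] * n_genes for _ in range(n_genes)]
    for i in range(n_genes):
        probs[i][i] = 1.0  # a gene always co-clusters with itself
    for i, j in combinations(range(n_genes), 2):
        together = sum(1 for labels in samples if labels[i] == labels[j])
        probs[i][j] = probs[j][i] = together / n_draws
    return probs


# Toy input: four draws over four genes (illustrative values only).
draws = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 2, 2],
    [0, 1, 1, 1],
]
probs = pairwise_coclustering(draws)
```

One plausible way to continue, consistent with the complete linkage step mentioned in the abstract, is to pass `1 - probs[i][j]` as a distance to a hierarchical clustering routine; that step is not shown here.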

312 citations

Journal Article
Alexandra B. Keenan, Sherry L. Jenkins, Kathleen M. Jagodnik, Simon Koplev, Edward He, Denis Torre, Zichen Wang, Anders B. Dohlman, Moshe C. Silverstein, Alexander Lachmann, Maxim V. Kuleshov, Avi Ma'ayan, Vasileios Stathias, Raymond Terryn, Daniel J. Cooper, Michele Forlin, Amar Koleti, Dusica Vidovic, Caty Chung, Stephan C. Schürer, Jouzas Vasiliauskas, Marcin Pilarczyk, Behrouz Shamsaei, Mehdi Fazel, Yan Ren, Wen Niu, Nicholas A. Clark, Shana White, Naim Al Mahi, Lixia Zhang, Michal Kouril, John F. Reichard, Siva Sivaganesan, Mario Medvedovic, Jaroslaw Meller, Rick J. Koch, Marc R. Birtwistle, Ravi Iyengar, Eric A. Sobie, Evren U. Azeloglu, Julia A. Kaye, Jeannette Osterloh, Kelly Haston, Jaslin Kalra, Steve Finkbiener, Jonathan Z. Li, Pamela Milani, Miriam Adam, Renan Escalante-Chong, Karen Sachs, Alexander LeNail, Divya Ramamoorthy, Ernest Fraenkel, Gavin Daigle, Uzma Hussain, Alyssa Coye, Jeffrey D. Rothstein, Dhruv Sareen, Loren Ornelas, Maria G. Banuelos, Berhan Mandefro, Ritchie Ho, Clive N. Svendsen, Ryan G. Lim, Jennifer Stocksdale, Malcolm Casale, Terri G. Thompson, Jie Wu, Leslie M. Thompson, Victoria Dardov, Vidya Venkatraman, Andrea Matlock, Jennifer E. Van Eyk, Jacob D. Jaffe, Malvina Papanastasiou, Aravind Subramanian, Todd R. Golub, Sean D. Erickson, Mohammad Fallahi-Sichani, Marc Hafner, Nathanael S. Gray, Jia-Ren Lin, Caitlin E. Mills, Jeremy L. Muhlich, Mario Niepel, Caroline E. Shamu, Elizabeth H. Williams, David Wrobel, Peter K. Sorger, Laura M. Heiser, Joe W. Gray, James E. Korkola, Gordon B. Mills, Mark A. LaBarge, Heidi S. Feiler, Mark A. Dane, Elmar Bucher, Michel Nederlof, Damir Sudar, Sean M. Gross, David Kilburn, Rebecca Smith, Kaylyn Devlin, Ron Margolis, Leslie Derr, Albert Lee, Ajay Pillai
TL;DR: The LINCS program focuses on cellular physiology shared among tissues and cell types relevant to an array of diseases, including cancer, heart disease, and neurodegenerative disorders.
Abstract: The Library of Integrated Network-Based Cellular Signatures (LINCS) is an NIH Common Fund program that catalogs how human cells globally respond to chemical, genetic, and disease perturbations. Resources generated by LINCS include experimental and computational methods, visualization tools, molecular and imaging data, and signatures. By assembling an integrated picture of the range of responses of human cells exposed to many perturbations, the LINCS program aims to better understand human disease and to advance the development of new therapies. Perturbations under study include drugs, genetic perturbations, tissue micro-environments, antibodies, and disease-causing mutations. Responses to perturbations are measured by transcript profiling, mass spectrometry, cell imaging, and biochemical methods, among other assays. The LINCS program focuses on cellular physiology shared among tissues and cell types relevant to an array of diseases, including cancer, heart disease, and neurodegenerative disorders. This Perspective describes LINCS technologies, datasets, tools, and approaches to data accessibility and reusability.

300 citations

Journal Article
TL;DR: A Bayesian hierarchical normal model is used to define a novel Intensity-Based Moderated T-statistic (IBMT), which is completely data-dependent using empirical Bayes philosophy to estimate hyperparameters, and thus does not require specification of any free parameters.
Abstract: The small sample sizes often used for microarray experiments result in poor estimates of variance if each gene is considered independently. Yet accurately estimating variability of gene expression measurements in microarray experiments is essential for correctly identifying differentially expressed genes. Several recently developed methods for testing differential expression of genes utilize hierarchical Bayesian models to "pool" information from multiple genes. We have developed a statistical testing procedure that further improves upon current methods by incorporating the well-documented relationship between the absolute gene expression level and the variance of gene expression measurements into the general empirical Bayes framework. We present a novel Bayesian moderated-T, which we show to perform favorably in simulations, with two real, dual-channel microarray experiments and in two controlled single-channel experiments. In simulations, the new method achieved greater power while correctly estimating the true proportion of false positives, and in the analysis of two publicly-available "spike-in" experiments, the new method performed favorably compared to all tested alternatives. We also applied our method to two experimental datasets and discuss the additional biological insights as revealed by our method in contrast to the others. The R-source code for implementing our algorithm is freely available at http://eh3.uc.edu/ibmt. We use a Bayesian hierarchical normal model to define a novel Intensity-Based Moderated T-statistic (IBMT). The method is completely data-dependent using empirical Bayes philosophy to estimate hyperparameters, and thus does not require specification of any free parameters. IBMT has the strength of balancing two important factors in the analysis of microarray data: the degree of independence of variances relative to the degree of identity (i.e. t-tests vs. equal variance assumption), and the relationship between variance and signal intensity. When this variance-intensity relationship is weak or does not exist, IBMT reduces to a previously described moderated t-statistic. Furthermore, our method may be directly applied to any array platform and experimental design. Together, these properties show IBMT to be a valuable option in the analysis of virtually any microarray experiment.
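
To illustrate the balance described above between gene-wise variances (ordinary t-tests) and a shared variance, here is a minimal sketch of the generic empirical Bayes variance shrinkage used by moderated t-statistics. It is not the IBMT implementation: the function names are hypothetical, the prior variance `s0_sq` and prior degrees of freedom `d0` are supplied by hand rather than estimated from data, and the way IBMT lets the prior variance depend on signal intensity is not reproduced.

```python
import math


def moderated_variance(s_sq, d, s0_sq, d0):
    """Shrink a gene-wise variance toward a prior variance.

    s_sq, d   : sample variance of one gene and its degrees of freedom
    s0_sq, d0 : prior variance and prior degrees of freedom (assumed given;
                in an intensity-based variant s0_sq would vary with intensity)
    """
    return (d0 * s0_sq + d * s_sq) / (d0 + d)


def moderated_t(effect, s_sq, d, s0_sq, d0, unscaled_var):
    """t-like statistic computed with the shrunken variance."""
    s_sq_tilde = moderated_variance(s_sq, d, s0_sq, d0)
    return effect / math.sqrt(s_sq_tilde * unscaled_var)


# Toy numbers: a gene with three replicates and an unusually small variance.
effect, s_sq, d = 1.0, 0.01, 2
s0_sq, d0 = 0.25, 4          # hypothetical prior values
unscaled_var = 1.0 / 3.0     # variance factor of a mean of three replicates

s_sq_tilde = moderated_variance(s_sq, d, s0_sq, d0)
t_moderated = moderated_t(effect, s_sq, d, s0_sq, d0, unscaled_var)
t_ordinary = effect / math.sqrt(s_sq * unscaled_var)
```

With `d0 = 0` the statistic falls back to the ordinary t-test, and as `d0` grows it approaches a test with a common variance, which are the two extremes the abstract contrasts.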

253 citations

Journal Article
TL;DR: In this paper, the authors consider the problem of robustness or sensitivity of given Bayesian posterior criteria to specification of the prior distribution, including the posterior mean, variance and probability of a set (for credible regions and hypothesis testing).
Abstract: We consider the problem of robustness or sensitivity of given Bayesian posterior criteria to specification of the prior distribution. Criteria considered include the posterior mean, variance and probability of a set (for credible regions and hypothesis testing). Uncertainty in an elicited prior, $\pi_0$, is modelled by an $\varepsilon$-contamination class $\Gamma = \{\pi = (1 - \varepsilon)\pi_0 + \varepsilon q, q \in Q\}$, where $\varepsilon$ reflects the amount of probabilistic uncertainty in $\pi_0$, and $Q$ is a class of allowable contaminations. For $Q = \{\text{all unimodal distributions}\}$ and $Q = \{\text{all symmetric unimodal distributions}\}$, we determine the ranges of the various posterior criteria as $\pi$ varies over $\Gamma$.
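
The $\varepsilon$-contamination idea can be sketched numerically on a small discrete parameter grid. The example below is an illustration only: the grid, likelihood values, base prior, and the three candidate contaminations are all hypothetical, and the finite candidate set stands in for $Q$ rather than reproducing the unimodal classes treated in the paper.

```python
def posterior_mean(prior, likelihood, thetas):
    """Posterior mean of theta on a discrete grid via Bayes' rule."""
    weights = [p * l for p, l in zip(prior, likelihood)]
    marginal = sum(weights)
    return sum(t * w for t, w in zip(thetas, weights)) / marginal


def contaminate(pi0, q, eps):
    """Form the contaminated prior (1 - eps) * pi0 + eps * q."""
    return [(1 - eps) * a + eps * b for a, b in zip(pi0, q)]


thetas = [0.0, 1.0, 2.0]        # parameter grid (assumed)
likelihood = [0.2, 0.5, 0.3]    # f(x | theta) at the observed x (assumed)
pi0 = [0.5, 0.3, 0.2]           # elicited base prior (assumed)
eps = 0.2                       # amount of contamination (assumed)

# A small, hypothetical set of contaminations standing in for Q.
candidates = {
    "point mass at 0": [1.0, 0.0, 0.0],
    "point mass at 2": [0.0, 0.0, 1.0],
    "uniform": [1 / 3, 1 / 3, 1 / 3],
}

base_mean = posterior_mean(pi0, likelihood, thetas)
means = {
    name: posterior_mean(contaminate(pi0, q, eps), likelihood, thetas)
    for name, q in candidates.items()
}
lowest, highest = min(means.values()), max(means.values())
```

The spread between `lowest` and `highest` plays the role of the range of the posterior mean over $\Gamma$: a narrow range suggests the conclusion is robust to the prior, while a wide one signals sensitivity.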

131 citations


Cited by
Journal Article
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article
TL;DR: New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments, and the voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline.
Abstract: New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well as or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.

4,475 citations

Journal Article
TL;DR: The approach is general because it offers the definition, identification, estimation, and sensitivity analysis of causal mediation effects without reference to any specific statistical model and can accommodate linear and nonlinear relationships, parametric and nonparametric models, continuous and discrete mediators, and various types of outcome variables.
Abstract: Traditionally in the social sciences, causal mediation analysis has been formulated, understood, and implemented within the framework of linear structural equation models. We argue and demonstrate that this is problematic for 3 reasons: the lack of a general definition of causal mediation effects independent of a particular statistical model, the inability to specify the key identification assumption, and the difficulty of extending the framework to nonlinear models. In this article, we propose an alternative approach that overcomes these limitations. Our approach is general because it offers the definition, identification, estimation, and sensitivity analysis of causal mediation effects without reference to any specific statistical model. Further, our approach explicitly links these 4 elements closely together within a single framework. As a result, the proposed framework can accommodate linear and nonlinear relationships, parametric and nonparametric models, continuous and discrete mediators, and various types of outcome variables. The general definition and identification result also allow us to develop sensitivity analysis in the context of commonly used models, which enables applied researchers to formally assess the robustness of their empirical conclusions to violations of the key assumption. We illustrate our approach by applying it to the Job Search Intervention Study. We also offer easy-to-use software that implements all our proposed methods.

2,393 citations

Journal Article
TL;DR: Two new effect size indices for mediation are described: a residual-based index that quantifies the amount of variance explained in both the mediator and the outcome, and an index that quantifies the indirect effect as the proportion of the maximum possible indirect effect that could have been obtained, given the scales of the variables involved.
Abstract: The statistical analysis of mediation effects has become an indispensable tool for helping scientists investigate processes thought to be causal. Yet, in spite of many recent advances in the estimation and testing of mediation effects, little attention has been given to methods for communicating effect size and the practical importance of those effect sizes. Our goals in this article are to (a) outline some general desiderata for effect size measures, (b) describe current methods of expressing effect size and practical importance for mediation, (c) use the desiderata to evaluate these methods, and (d) develop new methods to communicate effect size in the context of mediation analysis. The first new effect size index we describe is a residual-based index that quantifies the amount of variance explained in both the mediator and the outcome. The second new effect size index quantifies the indirect effect as the proportion of the maximum possible indirect effect that could have been obtained, given the scales of the variables involved. We supplement our discussion by offering easy-to-use R tools for the numerical and visual communication of effect size for mediation effects.

2,359 citations