Journal ArticleDOI

Constructing Pathway-Based Priors within a Gaussian Mixture Model for Bayesian Regression and Classification

TLDR
Simulations demonstrate that the GMM REMLP prior yields better performance than the EM algorithm for small data sets; the method is then applied to phenotype classification with prior knowledge drawn from colon cancer pathways.
Abstract
Gene-expression-based classification and regression are major concerns in translational genomics. If the feature-label distribution is known, then an optimal classifier can be derived. If the predictor-target distribution is known, then an optimal regression function can be derived. In practice, neither is known, data must be employed, and, for small samples, prior knowledge concerning the feature-label or predictor-target distribution can be used in the learning process. Optimal Bayesian classification and optimal Bayesian regression provide optimality under uncertainty. With optimal Bayesian classification (or regression), uncertainty is treated directly on the feature-label (or predictor-target) distribution. The fundamental engineering problem is prior construction. The Regularized Expected Mean Log-Likelihood Prior (REMLP) utilizes pathway information and provides viable priors for the feature-label distribution, assuming that the training data contain labels. In practice, the labels may not be observed. This paper extends the REMLP methodology to a Gaussian mixture model (GMM) when the labels are unknown. Prior construction bundled with prior update via Bayesian sampling results in Monte Carlo approximations to the optimal Bayesian regression function and optimal Bayesian classifier. Simulations demonstrate that the GMM REMLP prior yields better performance than the EM algorithm for small data sets. We apply it to phenotype classification when the prior knowledge consists of colon cancer pathways.
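As a rough illustration of the final step described above, the sketch below shows how posterior draws of the GMM parameters (e.g., obtained from the Bayesian sampling the paper describes) can be averaged into a Monte Carlo approximation of the optimal Bayesian classifier. The function name `obc_predict` and the structure of `posterior_samples` are assumptions made for illustration, not the paper's code.

```python
import numpy as np
from scipy.stats import multivariate_normal

def obc_predict(x, posterior_samples):
    """Monte Carlo approximation of the optimal Bayesian classifier (OBC).

    `posterior_samples` is a list of dicts, one per posterior draw of the
    feature-label distribution parameters, e.g. produced by a Gibbs sampler
    over the GMM posterior:
        {"weights": (c0, c1), "means": (m0, m1), "covs": (S0, S1)}
    The OBC assigns the label whose posterior-expected class-conditional
    density, weighted by the class prior probability, is largest.
    """
    scores = np.zeros(2)
    for s in posterior_samples:
        for y in (0, 1):
            scores[y] += s["weights"][y] * multivariate_normal.pdf(
                x, mean=s["means"][y], cov=s["covs"][y])
    # Averaging over the draws approximates the expectation over the
    # uncertainty class of feature-label distributions.
    return int(np.argmax(scores))
```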


Citations
Journal ArticleDOI

Autonomous efficient experiment design for materials discovery with Bayesian model averaging

TL;DR: In this article, the authors propose a framework that embeds Bayesian model averaging within Bayesian optimization in order to realize a system capable of autonomously and adaptively learning not only the most promising regions of the materials space but also the models that most efficiently guide such exploration.
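A minimal sketch of the general idea (not the authors' implementation): several candidate surrogates are weighted by approximate posterior model probabilities derived from their marginal likelihoods, and the next experiment maximizes a model-averaged expected-improvement acquisition. The helper name `bma_expected_improvement` and the particular GP kernels are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

def bma_expected_improvement(X_train, y_train, X_cand):
    """Pick the next experiment by expected improvement averaged over models."""
    models = [GaussianProcessRegressor(kernel=RBF(), normalize_y=True),
              GaussianProcessRegressor(kernel=Matern(nu=1.5), normalize_y=True)]
    log_evidence = []
    for m in models:
        m.fit(X_train, y_train)
        log_evidence.append(m.log_marginal_likelihood_value_)
    log_evidence = np.array(log_evidence)
    w = np.exp(log_evidence - log_evidence.max())
    w /= w.sum()                       # approximate posterior model probabilities

    y_best = y_train.max()
    ei = np.zeros(len(X_cand))
    for weight, m in zip(w, models):
        mu, sigma = m.predict(X_cand, return_std=True)
        z = (mu - y_best) / np.maximum(sigma, 1e-12)
        ei += weight * (sigma * (z * norm.cdf(z) + norm.pdf(z)))
    return int(np.argmax(ei))          # index of the suggested next experiment
```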
Journal ArticleDOI

Control of Gene Regulatory Networks Using Bayesian Inverse Reinforcement Learning

TL;DR: A Bayesian Inverse Reinforcement Learning (BIRL) approach is developed to address the realistic case in which the only available knowledge regarding the immediate cost function is provided by the sequence of measurements and interventions recorded in an experimental setting by an expert.
Journal ArticleDOI

Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors

TL;DR: The proposed general prior construction framework generalizes the prior construction methodology to a more flexible setting, yielding better inference when proper prior knowledge exists and enabling superior classifier design from small, unstructured data sets.
Journal ArticleDOI

Multivariate Calibration and Experimental Validation of a 3D Finite Element Thermal Model for Laser Powder Bed Fusion Metal Additive Manufacturing

TL;DR: In this article, the authors propose a surrogate modeling approach based on multivariate Gaussian processes (MVGPs) to calibrate the free parameters of multi-physics models against experiments, sidestepping prohibitively expensive Monte Carlo-based calibration.
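A heavily simplified, single-output sketch of surrogate-based calibration in the same spirit: a univariate GP stands in for the paper's multivariate Gaussian processes, and the name `calibrate_with_surrogate`, the flat grid prior, and the Gaussian noise model are assumptions for illustration only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def calibrate_with_surrogate(theta_sim, y_sim, y_exp, noise_sd, theta_grid):
    """Grid-based Bayesian calibration against one experimental observation.

    A GP is fit to (parameter, simulator-output) pairs so the expensive model
    never has to be re-run inside the calibration loop; the posterior over the
    free parameter is then evaluated on a grid of candidate values.
    """
    gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    gp.fit(theta_sim, y_sim)
    mu, sd = gp.predict(theta_grid, return_std=True)
    # Gaussian likelihood of the experimental value, with surrogate
    # uncertainty added to the measurement noise; flat prior over the grid.
    var = noise_sd**2 + sd**2
    log_post = -0.5 * (y_exp - mu) ** 2 / var - 0.5 * np.log(var)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()
```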
References
Journal ArticleDOI

Model Selection and Akaike's Information Criterion (AIC): The General Theory and Its Analytical Extensions.

TL;DR: In this article, Akaike's entropy-based information criterion (AIC) is extended in two ways, CAIC and CAICF, which make the criterion asymptotically consistent and penalize overparameterization more stringently, without violating Akaike's main principles.
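For reference, the standard forms of these criteria for a model with $k$ free parameters, maximized likelihood $L(\hat\theta)$, and sample size $n$ are given below; CAICF additionally involves the determinant of the estimated Fisher information matrix.

```latex
\mathrm{AIC}  = -2\ln L(\hat\theta) + 2k, \qquad
\mathrm{CAIC} = -2\ln L(\hat\theta) + k(\ln n + 1)
```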
Journal ArticleDOI

Mixture densities, maximum likelihood, and the EM algorithm

Richard A. Redner, +1 more (01 Apr 1984)
TL;DR: This work discusses the formulation, and the theoretical and practical properties, of the EM algorithm: a specialization, to the mixture-density context, of a general algorithm for approximating maximum-likelihood estimates in incomplete-data problems.
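The classic EM updates for a Gaussian mixture are easy to state in code; the sketch below is a plain, unregularized implementation for illustration (the function name `em_gmm` and its defaults are assumptions, not the reference's notation).

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    """Plain EM for a K-component Gaussian mixture (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(K, 1.0 / K)
    means = X[rng.choice(n, K, replace=False)]
    covs = np.array([np.cov(X, rowvar=False) + 1e-6 * np.eye(d)] * K)

    for _ in range(n_iter):
        # E-step: posterior probability that each point came from each component
        resp = np.column_stack([
            weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
            for k in range(K)])
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: weighted maximum-likelihood re-estimation of the parameters
        Nk = resp.sum(axis=0)
        weights = Nk / n
        means = (resp.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
    return weights, means, covs
```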
Journal ArticleDOI

On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)

TL;DR: In this paper, a hierarchical prior model is proposed that accommodates weak prior information while avoiding the mathematical pitfalls of improper priors in the mixture context; it serves as a basis for a thorough presentation of many aspects of the posterior distribution.
Journal ArticleDOI

Estimation of Finite Mixture Distributions Through Bayesian Sampling

TL;DR: In this paper, the posterior distribution and Bayes estimators are evaluated by Gibbs sampling, relying on the missing-data structure of the mixture model. The data augmentation method is shown to converge geometrically, since a duality principle transfers properties from the discrete missing-data chain to the parameters.
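The missing-data (data-augmentation) structure mentioned above leads to a particularly simple Gibbs sweep. The sketch below is a simplified univariate version with known component variance and conjugate priors; the name `gibbs_mixture`, the zero-mean Normal prior on the means with scale `tau`, and the symmetric Dirichlet prior on the weights are illustrative assumptions.

```python
import numpy as np

def gibbs_mixture(x, K, n_sweeps=1000, sigma=1.0, tau=10.0, seed=0):
    """Data-augmentation Gibbs sampler for a univariate K-component Gaussian
    mixture with known component variance sigma^2 (simplified sketch).

    Each sweep alternates between (i) sampling the latent allocations given
    the parameters and (ii) sampling the weights and means given the
    allocations, exploiting the missing-data structure of the mixture.
    """
    rng = np.random.default_rng(seed)
    means = rng.normal(x.mean(), x.std(), K)
    weights = np.full(K, 1.0 / K)
    draws = []
    for _ in range(n_sweeps):
        # (i) allocations z_i ~ Categorical( w_k * N(x_i | mu_k, sigma^2) )
        logp = np.log(weights) - 0.5 * ((x[:, None] - means) / sigma) ** 2
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(K, p=pi) for pi in p])

        # (ii) weights | z ~ Dirichlet, means | z, x ~ Normal (conjugate updates)
        counts = np.bincount(z, minlength=K)
        weights = rng.dirichlet(1.0 + counts)
        for k in range(K):
            xk = x[z == k]
            prec = 1.0 / tau**2 + len(xk) / sigma**2
            mu_post = (xk.sum() / sigma**2) / prec
            means[k] = rng.normal(mu_post, 1.0 / np.sqrt(prec))
        draws.append((weights.copy(), means.copy()))
    return draws
```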