Book Chapter (DOI)

A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks

29 Jun 2011, pp. 98–109
TL;DR: This relaxation means that MOPs are closed under the transformations required for multi-dimensional linear deterministic conditionals, such as Z = X + Y, and it allows the construction of MOP approximations of the probability density functions (PDFs) of multi-dimensional conditional linear Gaussian distributions using a MOP approximation of the PDF of the univariate standard normal distribution.
Abstract: We discuss some issues in using mixtures of polynomials (MOPs) for inference in hybrid Bayesian networks. MOPs were proposed by Shenoy and West for mitigating the problem of integration in inference in hybrid Bayesian networks. In defining MOPs for multi-dimensional functions, one requirement is that the pieces where the polynomials are defined are hypercubes. In this paper, we discuss relaxing this condition so that each piece is defined on regions called hyper-rhombuses. This relaxation means that MOPs are closed under transformations required for multi-dimensional linear deterministic conditionals, such as Z = X + Y. Also, this relaxation allows us to construct MOP approximations of the probability density functions (PDFs) of the multi-dimensional conditional linear Gaussian distributions using a MOP approximation of the PDF of the univariate standard normal distribution. We illustrate our method using conditional linear Gaussian PDFs in two and three dimensions.

Summary (2 min read)

1 Introduction

  • Each variable in a BN is associated with a set of conditional distributions, one for each state of its parents.
  • MTE functions are piecewise functions that are defined on regions called hypercubes, and the functions themselves are exponential functions of a linear function of the variables.
  • An advantage of the MOP method is that one can easily find MOP approximations of differentiable PDFs using the Taylor series expansion of the PDF [8], or by using Lagrange interpolating polynomials [6].
  • First, for dimensions two or greater, the hyper-rhombus condition is a generalization of the hypercube condition.
  • Second, MOP functions defined on hyper-rhombuses are closed under the operations required for multi-dimensional linear deterministic functions such as Z = X + Y; a sketch of why appears after this list.
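To see why the closure claim holds, consider the following minimal sketch in our own notation (not the paper's equations), assuming for simplicity that X and Y are independent. Marginalizing out X from the deterministic conditional Z = X + Y amounts to computing the convolution

    f_Z(z) = ∫ f_X(x) f_Y(z − x) dx.

If f_Y is a polynomial on a hypercube (interval) piece a ≤ y ≤ b, the substitution y = z − x turns that piece into the region a ≤ z − x ≤ b, whose bounds on x are the linear functions z − b and z − a rather than constants. Under the hypercube definition the result is no longer a MOP, but under the hyper-rhombus definition it is, because hyper-rhombus pieces permit exactly this kind of linear boundary.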

2.1 MOP Functions

  • The definition given in Equation (1) is exactly the same as in Shenoy and West [8].
  • The main motivation for defining MOP functions is that such functions are easy to integrate in closed form, and that they are closed under multiplication, integration, and addition, the main operations in making inferences in hybrid Bayesian networks.
  • The definition of an m-dimensional MOP function stated in Equation (3) is more general than the corresponding definition stated in Shenoy and West [8].
  • It is easy to see that an m-dimensional function satisfying the condition in Equation (5) will also satisfy the condition in Equation (3), but the converse is not true.
  • An advantage is that the authors can more easily construct high-dimensional conditional PDFs such as conditional linear Gaussian distributions.
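For readers without the paper at hand, the contrast between the two definitions can be sketched as follows (a paraphrase in our notation; the equation numbers (1), (3), and (5) are the paper's):

    One-dimensional MOP (Equation (1)): f(x) is a polynomial of degree n on each of finitely many disjoint intervals A_1, ..., A_k, and zero elsewhere.

    Hypercube condition (Equation (5)): each piece of an m-dimensional MOP is a hypercube A_1 × ... × A_m, i.e., every bound is a constant: l_j ≤ x_j ≤ u_j for j = 1, ..., m.

    Hyper-rhombus condition (Equation (3)): each piece has the form l_1 ≤ x_1 ≤ u_1, l_2(x_1) ≤ x_2 ≤ u_2(x_1), ..., l_m(x_1, ..., x_{m−1}) ≤ x_m ≤ u_m(x_1, ..., x_{m−1}), where each l_j and u_j is a linear function of the preceding variables.

Constant bounds are the special case in which these linear functions have zero slope, which is why a function satisfying the condition in Equation (5) also satisfies the one in Equation (3), but not conversely.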

3 Fitting MOPs to Two- and Three-Dimensional CLG PDFs

  • The authors will find MOP approximations of the PDFs of 2- and 3-dimensional conditional linear Gaussian (CLG) distributions based on a MOP approximation of the 1-dimensional standard normal PDF.
  • The authors' revised definition of multi-dimensional MOP functions in Equation (3) facilitates the task of finding MOP approximations of the PDFs of CLG conditional distributions.
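As an illustration of how such a construction proceeds, here is a minimal sketch in Python/SymPy (the paper works in Mathematica with its own MOP approximation of the standard normal PDF; the degree-8 Taylor polynomial, the piece boundary ±3, and the CLG parameters μ(x) = x, σ = 1 below are our illustrative assumptions, not the paper's):

    import sympy as sp

    x, z, t = sp.symbols('x z t')

    # Polynomial stand-in for the standard normal PDF phi(t) on the piece -3 <= t <= 3
    # (a Taylor expansion about 0; the paper uses its own MOP approximation instead).
    phi = sp.exp(-t**2 / 2) / sp.sqrt(2 * sp.pi)
    phi_poly = sp.series(phi, t, 0, 9).removeO()   # degree-8 polynomial in t

    # Assumed CLG conditional Z | x ~ N(x, 1): substituting the linear argument
    # t = z - x keeps the function a polynomial in x and z, and maps the interval
    # piece -3 <= t <= 3 onto the hyper-rhombus -3 <= z - x <= 3.
    f_z_given_x = sp.expand(phi_poly.subs(t, z - x))

    # Marginal of Z on one piece: with f_X given by the same polynomial on [-3, 3],
    # the two pieces overlap on z - 3 <= x <= 3 whenever 0 <= z <= 6.
    f_x = phi_poly.subs(t, x)
    marginal_piece = sp.integrate(sp.expand(f_x * f_z_given_x), (x, z - 3, 3))

The result is again a polynomial in z, which is the closure property that the hypercube definition lacks.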

3.4 Three-Dimensional CLG Distributions

  • As in the two-dimensional case, the authors will investigate how much of a time penalty one has to pay for using the hyper-rhombus condition.
  • The inner integral (with respect to y) required approximately 93 seconds (≈ 1.6 minutes) and resulted in a 2-dimensional, 7-degree MOP.
  • Thus, the two multiplications and the two integrations in Equation (17) require a total of approximately 269 seconds (≈ 4.5 minutes) using Mathematica® on a laptop computer.
  • In summary, the hyper-rhombus condition enables us to easily represent CLG conditionals in high dimensions; a rough analogue of this computation is sketched below.
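The kind of computation being timed can be reproduced in any computer algebra system. Below is a rough Python/SymPy analogue rather than the paper's Mathematica code; the stand-in polynomials, the ±3 piece, and the assumed CLG model Y | x ~ N(x, 1), Z | x, y ~ N(x + y, 1) are ours, so the degrees and timings will differ from those reported above:

    import time
    import sympy as sp

    x, y, z, t = sp.symbols('x y z t')

    # Stand-in MOP piece for the standard normal PDF on -3 <= t <= 3.
    phi = sp.series(sp.exp(-t**2 / 2) / sp.sqrt(2 * sp.pi), t, 0, 9).removeO()

    # Assumed CLG conditionals: Y | x ~ N(x, 1) and Z | x, y ~ N(x + y, 1).
    f_y_given_x = phi.subs(t, y - x)       # piece: x - 3 <= y <= x + 3 (a hyper-rhombus)
    f_z_given_xy = phi.subs(t, z - x - y)

    # One multiplication followed by the inner integration with respect to y;
    # the hyper-rhombus limits are linear in x, and the result is a polynomial in x, z.
    start = time.time()
    inner = sp.integrate(sp.expand(f_y_given_x * f_z_given_xy), (y, x - 3, x + 3))
    print(f"inner integral took {time.time() - start:.1f} s; "
          f"degree in z: {sp.degree(inner, z)}")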

4 Summary and Discussion

  • A major contribution of this paper is a re-definition of multi-dimensional mixtures of polynomials so that the regions where the polynomials are defined are hyper-rhombuses instead of hypercubes.
  • This re-definition allows us to use the MOP approximation of a one-dimensional standard normal PDF to define MOP approximations of high-dimensional CLG PDFs.
  • Shenoy [6] compares the practical implications of the hyper-rhombus condition with the hypercube condition.
  • He compares the time required for computation of marginals for a couple of simple Bayesian networks, and also the accuracy of the computed marginals.
  • This is a topic that needs further investigation.

Citations
Journal Article (DOI)
TL;DR: A new method for finding MOP approximations based on Lagrange interpolating polynomials (LIP) with Chebyshev points is described, along with how the LIP method can be used to find efficient MOP approximations of PDFs.

29 citations


Cites background from "A re-definition of mixtures of poly..."

  • ...A condensed version of one part of this manuscript has appeared as [24]....

Book Chapter (DOI)
01 Jan 2014

17 citations

Proceedings Article
01 Jan 2012
TL;DR: A structure for handling probability potentials called Sum-Product factorized potentials is proposed, and it is shown how these potentials facilitate efficient inference based on properties of the MoTBFs and ideas similar to the ones underlying Lazy propagation (postponing operations and keeping factorized representations of the potentials).
Abstract: In this paper we study the problem of exact inference in hybrid Bayesian networks using mixtures of truncated basis functions (MoTBFs). We propose a structure for handling probability potentials called Sum-Product factorized potentials, and show how these potentials facilitate efficient inference based on i) properties of the MoTBFs and ii) ideas similar to the ones underlying Lazy propagation (postponing operations and keeping factorized representations of the potentials). We report on preliminary experiments demonstrating the efficiency of the proposed method in comparison with existing algorithms.

17 citations


Cites methods from "A re-definition of mixtures of poly..."

  • ...Recently, the mixtures of polynomials (MOPs) model has been proposed as an alternative to the MTE model (Shenoy and West, 2011); the MOP model shares the advantages of MTEs, but it also provides a more flexible way of handling deterministic relationships among variables (Shenoy, 2011)....

01 Jan 2012
TL;DR: ProbModelXML is an XML format that can represent several kinds of models, such as Bayesian networks, Markov networks, influence diagrams, LIMIDs, and decision analysis networks, as well as temporal models, and it allows new types of networks and user-specific properties to be encoded without modifying the format definition.
Abstract: ProbModelXML is an XML format for encoding probabilistic graphical models. The main advantages of this format are that it can represent several kinds of models, such as Bayesian networks, Markov networks, influence diagrams, LIMIDs, decision analysis networks, as well as temporal models: dynamic Bayesian networks, MDPs, POMDPs, Markov processes with atemporal decisions (MPADs), DLIMIDs, etc., and the possibility of encoding new types of networks and user-specific properties without the need to modify the format definition.

9 citations


Cites background from "A re-definition of mixtures of poly..."

  • ...The tasks we have scheduled for the near future are to improve the syntax for certain properties in the light of the feedback we have received from some colleagues, and to extend it to cover new types of potentials (such as mixtures of polynomials (Shenoy, 2011; Shenoy and West, 2010)), submodels (as in GENIE), and new types of networks, such as object-oriented Bayesian networks (Koller and Pfeffer, 1997) and probabilistic relational models (Jaeger, 1997; Koller and Pfeffer, 1996)....

Dissertation
19 Jul 2013
TL;DR: The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs.
Abstract: Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired from multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

8 citations


Cites background from "A re-definition of mixtures of poly..."

  • ...This piecewise function that is defined by partitioning the domain of continuous variables into disjoint hyper-rhombuses [Shenoy, 2011] is called mixture of polynomials....

References
Journal Article (DOI)
TL;DR: MTE potentials that approximate standard PDFs, and applications of these potentials, will extend the types of inference problems that can be modelled with Bayesian networks, as demonstrated using three examples.
Abstract: Mixtures of truncated exponentials (MTE) potentials are an alternative to discretization and Monte Carlo methods for solving hybrid Bayesian networks. Any probability density function (PDF) can be approximated by an MTE potential, which can always be marginalized in closed form. This allows propagation to be done exactly using the Shenoy-Shafer architecture for computing marginals, with no restrictions on the construction of a join tree. This paper presents MTE potentials that approximate standard PDFs and applications of these potentials for solving inference problems in hybrid Bayesian networks. These approximations will extend the types of inference problems that can be modelled with Bayesian networks, as demonstrated using three examples.

78 citations


"A re-definition of mixtures of poly..." refers methods in this paper

  • ...[1] describe MTE approximations of several commonly used one-dimensional PDFs....

Book Chapter (DOI)
02 Jul 2003
TL;DR: This paper proposes a method to estimate conditional MTE densities using mixed trees, graphical structures similar to classification trees, with variable-selection and pruning criteria defined in terms of the mean square error and entropy-like measures.
Abstract: Mixtures of truncated exponential (MTE) distributions have been shown to be a powerful alternative to discretisation within the framework of Bayesian networks. One of the features of the MTE model is that standard propagation algorithms such as Shenoy-Shafer and Lazy propagation can be used. Estimating conditional MTE densities from data is a rather difficult problem since, as far as we know, such densities cannot be expressed in parametric form in the general case. In the univariate case, regression-based estimators have been successfully employed. In this paper, we propose a method to estimate conditional MTE densities using mixed trees, which are graphical structures similar to classification trees. Criteria for selecting the variables during the construction of the tree and for pruning the leaves are defined in terms of the mean square error and entropy-like measures.

40 citations


"A re-definition of mixtures of poly..." refers methods in this paper

  • ...[5] describe a mixed-tree method for representing an MTE approximation of a 2-dimensional CLG distribution....

  • ...[5] and the Taylor series method proposed by Shenoy and West [8] do not scale up to higher dimensions in practice, i.e., ....

Journal Article (DOI)
TL;DR: A new method for finding MOP approximations based on Lagrange interpolating polynomials (LIP) with Chebyshev points is described, along with how the LIP method can be used to find efficient MOP approximations of PDFs.

29 citations

Journal Article (DOI)
TL;DR: An architecture for solving large general hybrid Bayesian networks (BNs) with deterministic conditionals for continuous variables using local computation is described, and an extended version of the crop problem that includes non-conditional linear Gaussian distributions and non-linear deterministic functions is solved.

18 citations


Additional excerpts

  • ...architecture [7]....

Proceedings Article
08 Jul 2010
TL;DR: An extended Shenoy-Shafer architecture is described for propagating discrete, continuous, and utility potentials in hybrid influence diagrams that include deterministic chance variables and discrete and continuous decision variables.
Abstract: We describe a framework and an algorithm for solving hybrid influence diagrams with discrete, continuous, and deterministic chance variables, and discrete and continuous decision variables. A continuous chance variable in an influence diagram is said to be deterministic if its conditional distributions have zero variances. The solution algorithm is an extension of Shenoy's fusion algorithm for discrete influence diagrams. We describe an extended Shenoy-Shafer architecture for propagation of discrete, continuous, and utility potentials in hybrid influence diagrams that include deterministic chance variables. The algorithm and framework are illustrated by solving two small examples.

8 citations


"A re-definition of mixtures of poly..." refers methods in this paper

  • ...Constructing MOP approximations of the multi-dimensional log-normal distributions is of great interest in the finance literature where log-normal distributions are used to model stock price behavior [3]....

Frequently Asked Questions (1)
Q1. What have the authors contributed in "A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks"?

The authors discuss some issues in using mixtures of polynomials (MOPs) for inference in hybrid Bayesian networks. In defining MOPs for multi-dimensional functions, one requirement is that the pieces where the polynomials are defined are hypercubes; in this paper, the authors discuss relaxing this condition so that each piece is defined on regions called hyper-rhombuses.