Book Chapter

# A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks

29 Jun 2011, pp. 98–109

TL;DR: This relaxation means that MOPs are closed under the transformations required for multi-dimensional linear deterministic conditionals, such as Z = X + Y, and allows MOP approximations of the probability density functions (PDFs) of the multi-dimensional conditional linear Gaussian distributions to be constructed from a MOP approximation of the PDF of the univariate standard normal distribution.

Abstract: We discuss some issues in using mixtures of polynomials (MOPs) for inference in hybrid Bayesian networks. MOPs were proposed by Shenoy and West for mitigating the problem of integration in inference in hybrid Bayesian networks. In defining MOP for multi-dimensional functions, one requirement is that the pieces where the polynomials are defined are hypercubes. In this paper, we discuss relaxing this condition so that each piece is defined on regions called hyper-rhombuses. This relaxation means that MOPs are closed under transformations required for multi-dimensional linear deterministic conditionals, such as Z = X + Y. Also, this relaxation allows us to construct MOP approximations of the probability density functions (PDFs) of the multi-dimensional conditional linear Gaussian distributions using a MOP approximation of the PDF of the univariate standard normal distribution. We illustrate our method using conditional linear Gaussian PDFs in two and three dimensions.

Topics: Bayesian network (54%), Normal distribution (53%), Inference (52%)

### 1 Introduction

• Each variable in a BN is associated with conditional distributions for the variable, one for each state of its parents.
• MTE functions are piecewise functions that are defined on regions called hypercubes, and the functions themselves are exponential functions of a linear function of the variables.
• An advantage of the MOP method is that one can easily find MOP approximations of differentiable PDFs using the Taylor series expansion of the PDF [8], or by using Lagrange interpolating polynomials [6].
• For dimensions two or greater, the hyper-rhombus condition is a generalization of the hypercube condition.
• Second, MOP functions defined on hyper-rhombuses are closed under operations required for multidimensional linear deterministic functions.
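To make the hypercube vs. hyper-rhombus distinction concrete, here is a minimal membership-test sketch (illustrative Python, not code from the paper; the 3-sigma bounds are an assumption): in a hypercube piece every variable has constant bounds, while in a hyper-rhombus piece the bounds on one variable may be linear functions of the other variables.

```python
def in_hypercube(y, z):
    # Hypercube piece: constant bounds on every variable.
    return -3 <= y <= 3 and -3 <= z <= 3

def in_hyper_rhombus(y, z):
    # Hyper-rhombus piece: the bounds on z are linear in y,
    # e.g. the band y - 3 <= z <= y + 3 that arises when Z | y ~ N(y, 1)
    # is approximated on a 3-sigma-wide region around its mean.
    return -3 <= y <= 3 and y - 3 <= z <= y + 3

# The point (y, z) = (2, 4) lies in the hyper-rhombus but not in the hypercube.
print(in_hypercube(2.0, 4.0), in_hyper_rhombus(2.0, 4.0))  # False True
```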

### 2.1 MOP Functions

• The definition given in Equation (1) is exactly the same as in Shenoy and West [8].
• The main motivation for defining MOP functions is that such functions are easy to integrate in closed form, and that they are closed under multiplication, integration, and addition, the main operations in making inferences in hybrid Bayesian networks.
• The definition of an m-dimensional MOP function stated in Equation (3) is more general than the corresponding definition stated in Shenoy and West [8].
• It is easy to see that an m-dimensional function satisfying the condition in Equation (5) will also satisfy the condition in Equation (3), but the converse is not true.
• An advantage is that the authors can more easily construct high-dimensional conditional PDFs such as the conditional linear Gaussian distributions.
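The ease of closed-form integration can be illustrated with a toy one-dimensional MOP (a two-piece triangular density; illustrative only, not one of the paper's approximations): each piece is a polynomial, so its antiderivative is available exactly.

```python
import numpy as np

# Toy 2-piece MOP: the triangular density on [-1, 1].
# Each piece is (lo, hi, polynomial coefficients in increasing degree).
mop = [(-1.0, 0.0, np.array([1.0,  1.0])),   # 1 + x on [-1, 0]
       ( 0.0, 1.0, np.array([1.0, -1.0]))]   # 1 - x on [0, 1]

def integrate_mop(pieces):
    """Exact integral of a 1-D MOP: antiderivative of each polynomial
    piece, evaluated at the piece's bounds and summed."""
    total = 0.0
    for lo, hi, c in pieces:
        anti = np.polynomial.polynomial.polyint(c)
        total += (np.polynomial.polynomial.polyval(hi, anti)
                  - np.polynomial.polynomial.polyval(lo, anti))
    return total

print(integrate_mop(mop))  # 1.0 -- the density integrates to one, in closed form
```

The same bookkeeping extends to products and sums of MOPs, which is why the operations needed for inference stay within the family.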

### 3 Fitting MOPs to Two- and Three-Dimensional CLG PDFs

• The authors will find MOP approximations of the PDFs of 2- and 3-dimensional conditional linear Gaussian (CLG) distributions based on a MOP approximation of the 1-dimensional standard normal PDF.
• The authors' revised definition of multi-dimensional MOP functions in Equation (3) facilitates the task of finding MOP approximations of the PDFs of CLG conditional distributions.
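The construction described above can be sketched as follows (a minimal stand-in: a degree-16 least-squares polynomial fit plays the role of the paper's MOP approximation of the standard normal PDF, and the 3-sigma truncation is an assumption). Substituting t = (z - b*y)/sigma into the one-dimensional polynomial yields an approximation of the CLG PDF of Z | y ~ N(b*y, sigma^2); its validity region -3 <= (z - b*y)/sigma <= 3 is a hyper-rhombus in (y, z), not a hypercube.

```python
import math
import numpy as np

# Stand-in for a MOP approximation of the standard normal PDF on [-3, 3]:
# a single degree-16 least-squares polynomial fit (illustrative only).
grid = np.linspace(-3.0, 3.0, 601)
phi = np.exp(-grid**2 / 2.0) / math.sqrt(2.0 * math.pi)
coeffs = np.polynomial.polynomial.polyfit(grid, phi, 16)

def clg_pdf_approx(z, y, b=1.0, sigma=1.0):
    """Approximate PDF of Z | y where Z ~ N(b*y, sigma^2), obtained by
    substitution into the 1-D polynomial; nonzero only on the
    hyper-rhombus -3 <= (z - b*y)/sigma <= 3."""
    t = (z - b * y) / sigma
    if abs(t) > 3.0:
        return 0.0
    return np.polynomial.polynomial.polyval(t, coeffs) / sigma

# Compare against the exact CLG density at one point.
exact = math.exp(-0.5 * (1.7 - 0.5) ** 2) / math.sqrt(2.0 * math.pi)
print(abs(clg_pdf_approx(1.7, 0.5) - exact) < 1e-3)  # True
```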

### 3.4 Three-Dimensional CLG Distributions

• As in the two-dimensional case, the authors will investigate how much of a time penalty one has to pay for using the hyper-rhombus condition.
• The inner integral (with respect to y) required approximately 93 seconds (≈ 1.6 minutes), and resulted in a 2-dimensional, 7-degree, MOP.
• Thus, the two multiplications and the two integrations in Equation (17) require a total of approximately 269 seconds (≈ 4.5 minutes) using Mathematica® on a laptop computer.
• In summary, the hyper-rhombus condition enables us to easily represent CLG conditionals in high dimensions.
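The closure property that motivates the hyper-rhombus condition can be checked numerically in a toy case (illustrative, not the paper's Equation (17) computation): for Z = X + Y with X, Y ~ Uniform(0, 1), the support of the integrand in f_Z(z) = ∫ f_X(x) f_Y(z - x) dx is a hyper-rhombus in (x, z), so the limits of integration are linear in z and the marginal is again a MOP, namely f_Z(z) = z on [0, 1] and f_Z(z) = 2 - z on [1, 2].

```python
import numpy as np

def f_Z_numeric(z, n=200_000):
    """Numerical check of f_Z(z) = integral over x in [0, 1] of
    f_X(x) * f_Y(z - x) dx, for X, Y ~ Uniform(0, 1) and Z = X + Y.
    The indicator 0 <= z - x <= 1 carves out the hyper-rhombus piece."""
    x = np.linspace(0.0, 1.0, n)
    integrand = ((z - x >= 0.0) & (z - x <= 1.0)).astype(float)
    return integrand.mean()  # average value times interval length 1

# Matches the piecewise-polynomial marginal: z on [0, 1], 2 - z on [1, 2].
for z, exact in [(0.5, 0.5), (1.0, 1.0), (1.5, 0.5)]:
    assert abs(f_Z_numeric(z) - exact) < 1e-3
print("f_Z is again a MOP: z on [0, 1], 2 - z on [1, 2]")
```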

### 4 Summary and Discussion

• A major contribution of this paper is a re-definition of multi-dimensional mixtures of polynomials so that the regions where the polynomials are defined are hyper-rhombuses instead of hypercubes.
• This re-definition allows us to use the MOP approximation of a one-dimensional standard normal PDF to define MOP approximations of high-dimensional CLG PDFs.
• Shenoy [6] compares the practical implications of the hyper-rhombus condition with the hypercube condition.
• He compares the time required for computation of marginals for a couple of simple Bayesian networks, and also the accuracy of the computed marginals.
• This is a topic that needs further investigation.



##### Citations

Journal Article
TL;DR: A new method for finding MOP approximations based on Lagrange interpolating polynomials (LIP) with Chebyshev points is described, and it is shown how the LIP method can be used to find efficient MOP approximations of PDFs.
Abstract: We discuss two issues in using mixtures of polynomials (MOPs) for inference in hybrid Bayesian networks. MOPs were proposed by Shenoy and West for mitigating the problem of integration in inference in hybrid Bayesian networks. First, in defining MOP for multi-dimensional functions, one requirement is that the pieces where the polynomials are defined are hypercubes. In this paper, we discuss relaxing this condition so that each piece is defined on regions called hyper-rhombuses. This relaxation means that MOPs are closed under transformations required for multi-dimensional linear deterministic conditionals, such as Z=X+Y, etc. Also, this relaxation allows us to construct MOP approximations of the probability density functions (PDFs) of the multi-dimensional conditional linear Gaussian distributions using a MOP approximation of the PDF of the univariate standard normal distribution. Second, Shenoy and West suggest using the Taylor series expansion of differentiable functions for finding MOP approximations of PDFs. In this paper, we describe a new method for finding MOP approximations based on Lagrange interpolating polynomials (LIP) with Chebyshev points. We describe how the LIP method can be used to find efficient MOP approximations of PDFs. We illustrate our methods using conditional linear Gaussian PDFs in one, two, and three dimensions, and conditional log-normal PDFs in one and two dimensions. We compare the efficiencies of the hyper-rhombus condition with the hypercube condition. Also, we compare the LIP method with the Taylor series method.

29 citations

### Cites background from "A re-definition of mixtures of poly..."

• ...A condensed version of one part of this manuscript has appeared as [24]....


Book Chapter
01 Jan 2014

15 citations

Proceedings Article
01 Jan 2012
TL;DR: A structure for handling probability potentials called Sum-Product factorized potentials is proposed, and it is shown how these potentials facilitate efficient inference based on properties of the MoTBFs and ideas similar to the ones underlying Lazy propagation (postponing operations and keeping factorized representations of the potentials).
Abstract: In this paper we study the problem of exact inference in hybrid Bayesian networks using mixtures of truncated basis functions (MoTBFs). We propose a structure for handling probability potentials called Sum-Product factorized potentials, and show how these potentials facilitate efficient inference based on i) properties of the MoTBFs and ii) ideas similar to the ones underlying Lazy propagation (postponing operations and keeping factorized representations of the potentials). We report on preliminary experiments demonstrating the efficiency of the proposed method in comparison with existing algorithms.

15 citations

### Cites methods from "A re-definition of mixtures of poly..."

• ...Recently, the mixtures of polynomials (MOPs) model has been proposed as an alternative to the MTE model (Shenoy and West, 2011); the MOP model shares the advantages of MTEs, but it also provides a more flexible way of handling deterministic relationships among variables (Shenoy, 2011)....


01 Jan 2012
TL;DR: ProbModelXML can represent several kinds of models, such as Bayesian networks, Markov networks, influence diagrams, LIMIDs, and decision analysis networks, as well as temporal models, and allows the encoding of new types of networks and user-specific properties without the need to modify the format definition.
Abstract: ProbModelXML is an XML format for encoding probabilistic graphical models. The main advantages of this format are that it can represent several kinds of models, such as Bayesian networks, Markov networks, influence diagrams, LIMIDs, decision analysis networks, as well as temporal models: dynamic Bayesian networks, MDPs, POMDPs, Markov processes with atemporal decisions (MPADs), DLIMIDs, etc., and the possibility of encoding new types of networks and user-specific properties without the need to modify the format definition.

8 citations

### Cites background from "A re-definition of mixtures of poly..."


• ...The tasks we have scheduled for the near future are to improve the syntax for certain properties in the light of the feedback we have received from some colleagues, and to extend it to cover new types of potentials (such as mixtures of polynomials (Shenoy, 2011; Shenoy and West, 2010)), submodels (as in GENIE), and new types of networks, such as object-oriented Bayesian networks (Koller and Pfeffer, 1997) and probabilistic relational models (Jaeger, 1997; Koller and Pfeffer, 1996)....


Dissertation
19 Jul 2013
TL;DR: The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs.
Abstract: Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired from multi-dimensional Bayesian network classifiers.
It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

8 citations

### Cites background from "A re-definition of mixtures of poly..."

• ...This piecewise function that is defined by partitioning the domain of continuous variables into disjoint hyper-rhombuses [Shenoy, 2011] is called mixture of polynomials....


##### References

Journal Article

14,407 citations

Posted Content
Abstract: The information deviation between any two finite measures cannot be increased by any statistical operations (Markov morphisms). It is invariant if and only if the morphism is sufficient for these two measures.

4,282 citations

### "A re-definition of mixtures of poly..." refers methods in this paper

• ...First, we can use the Kullback-Leibler (KL) divergence [2] as a measure of the goodness of fit....


Book
01 Jan 1993
Abstract: Symbolic and quantitative approaches to reasoning with uncertainty.

647 citations

Book Chapter
19 Sep 2001
TL;DR: The properties of the MTE distribution are studied and it is shown how exact probability propagation can be carried out by means of a local computation algorithm.
Abstract: In this paper we propose the use of mixtures of truncated exponential (MTE) distributions in hybrid Bayesian networks. We study the properties of the MTE distribution and show how exact probability propagation can be carried out by means of a local computation algorithm. One feature of this model is that no restriction is made about the order among the variables either discrete or continuous. Computations are performed over a representation of probabilistic potentials based on probability trees, expanded to allow discrete and continuous variables simultaneously. Finally, a Markov chain Monte Carlo algorithm is described with the aim of dealing with complex networks.

228 citations

### "A re-definition of mixtures of poly..." refers background in this paper

• ...One solution to the integration problem is to approximate conditional PDFs by a family of functions called mixtures of truncated exponentials (MTEs) [4]....


Journal Article
TL;DR: The main goal of this paper is to describe inference in hybrid Bayesian networks (BNs) using mixture of polynomials (MOP) approximations of probability density functions (PDFs), which are similar in spirit to mixtures of truncated exponentials (MTEs) approximations.
Abstract: The main goal of this paper is to describe inference in hybrid Bayesian networks (BNs) using mixture of polynomials (MOP) approximations of probability density functions (PDFs). Hybrid BNs contain a mix of discrete, continuous, and conditionally deterministic random variables. The conditionals for continuous variables are typically described by conditional PDFs. A major hurdle in making inference in hybrid BNs is marginalization of continuous variables, which involves integrating combinations of conditional PDFs. In this paper, we suggest the use of MOP approximations of PDFs, which are similar in spirit to using mixtures of truncated exponentials (MTEs) approximations. MOP functions can be easily integrated, and are closed under combination and marginalization. This enables us to propagate MOP potentials in the extended Shenoy-Shafer architecture for inference in hybrid BNs that can include deterministic variables. MOP approximations have several advantages over MTE approximations of PDFs. They are easier to find, even for multi-dimensional conditional PDFs, and are applicable for a larger class of deterministic functions in hybrid BNs.

116 citations

### "A re-definition of mixtures of poly..." refers background or methods in this paper

• ...The definition of an m-dimensional MOP function stated in Equation (3) is more general than the corresponding definition stated in Shenoy and West [8], which is as follows: An m-dimensional function f : R^m → R is said to be a MOP function if:...


• ...The definition given in Equation (1) is exactly the same as in Shenoy and West [8]....


• ...In Shenoy and West [8], a 12-piece, 14-degree MOP approximation is found by covering the two-dimensional region −3 < z < 3, z − 3 < y < z + 3 by 12 squares (hypercubes in two dimensions), and then by using two-dimensional Taylor series approximation at the mid-point of each square....


• ...The definition we provide here is slightly more general than the definition provided in Shenoy and West [8] for the case of multi-dimensional functions....


• ...Although a detailed comparison of MTE and MOP methods has yet to be done, an advantage of the MOP method is that one can easily find MOP approximations of differentiable PDFs using the Taylor series expansion of the PDF [8], or by using Lagrange interpolating polynomials [6]....
