Proceedings Article

Loopy belief propagation for approximate inference: an empirical study

30 Jul 1999, pp. 467-475
TL;DR: This paper compares the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR, and finds that the loopy beliefs often converge and when they do, they give a good approximation to the correct marginals.
Abstract: Recently, researchers have demonstrated that "loopy belief propagation" -- the use of Pearl's polytree algorithm in a Bayesian network with loops -- can perform well in the context of error-correcting codes. The most dramatic instance of this is the near Shannon-limit performance of "Turbo Codes" -- codes whose decoding algorithm is equivalent to loopy belief propagation in a chain-structured Bayesian network. In this paper we ask: is there something special about the error-correcting code context, or does loopy propagation work as an approximate inference scheme in a more general setting? We compare the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR. We find that the loopy beliefs often converge and when they do, they give a good approximation to the correct marginals. However, on the QMR network, the loopy beliefs oscillated and had no obvious relationship to the correct posteriors. We present some initial investigations into the cause of these oscillations, and show that some simple methods of preventing them lead to the wrong results.
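
To make the setup concrete, here is a minimal sketch (illustrative only, not the authors' code) of loopy belief propagation on a tiny pairwise model with a single loop, compared against exact marginals obtained by brute-force enumeration. The graph, potentials, and function names are made up for illustration; the actual experiments use the ALARM and QMR networks described above.

import itertools
import numpy as np

nodes = [0, 1, 2]                         # three binary variables in one loop
edges = [(0, 1), (1, 2), (2, 0)]
K = 2

rng = np.random.default_rng(0)
node_pot = {i: rng.uniform(0.5, 1.5, size=K) for i in nodes}
edge_pot = {e: rng.uniform(0.5, 1.5, size=(K, K)) for e in edges}

neighbors = {i: [] for i in nodes}
for (i, j) in edges:
    neighbors[i].append(j)
    neighbors[j].append(i)

def pairwise(i, j):
    # Edge potential viewed as psi[x_i, x_j], whichever way the edge is stored.
    return edge_pot[(i, j)] if (i, j) in edge_pot else edge_pot[(j, i)].T

def exact_marginals():
    # Brute-force enumeration of the joint; only feasible for tiny models.
    probs = np.zeros((K,) * len(nodes))
    for x in itertools.product(range(K), repeat=len(nodes)):
        p = np.prod([node_pot[i][x[i]] for i in nodes])
        for (i, j) in edges:
            p *= edge_pot[(i, j)][x[i], x[j]]
        probs[x] = p
    probs /= probs.sum()
    return [probs.sum(axis=tuple(k for k in nodes if k != i)) for i in nodes]

def loopy_bp(max_iters=100, tol=1e-8):
    # Parallel ("flooding") message updates; returns beliefs and a convergence flag.
    msgs = {(i, j): np.ones(K) / K for i in nodes for j in neighbors[i]}
    for _ in range(max_iters):
        new = {}
        for (i, j) in msgs:
            prod = node_pot[i].copy()
            for k in neighbors[i]:
                if k != j:
                    prod *= msgs[(k, i)]          # all incoming messages except from j
            m = pairwise(i, j).T @ prod           # sum over x_i
            new[(i, j)] = m / m.sum()
        delta = max(np.abs(new[e] - msgs[e]).max() for e in msgs)
        msgs = new
        if delta < tol:
            break
    beliefs = []
    for i in nodes:
        b = node_pot[i].copy()
        for k in neighbors[i]:
            b *= msgs[(k, i)]
        beliefs.append(b / b.sum())
    return beliefs, delta < tol

exact = exact_marginals()
approx, converged = loopy_bp()
for i in nodes:
    print(f"node {i}: exact {exact[i].round(4)}  loopy {approx[i].round(4)}  converged={converged}")

On small graphs like this the messages typically converge and the beliefs are close to the exact marginals; monitoring delta across iterations is also how oscillation of the kind reported for QMR would show up.
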
Citations
01 Jan 2002
TL;DR: This thesis will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
Abstract: Dynamic Bayesian Networks: Representation, Inference and Learning by Kevin Patrick Murphy Doctor of Philosophy in Computer Science University of California, Berkeley Professor Stuart Russell, Chair Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data. In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
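
To make the factored-state idea above concrete, a small illustrative sketch (mine, not the thesis'): a DBN over D binary state variables specifies its transition model one variable at a time, while the equivalent flattened HMM needs a full 2^D-by-2^D transition matrix. The CPD shapes below are assumptions chosen for brevity; a real DBN would typically condition each variable only on a small parent set.

import itertools
import numpy as np

D = 3
rng = np.random.default_rng(0)

# One CPD per next-slice variable: P(X_i(t) = 1 | joint state at t-1).
cpds = [rng.uniform(0.1, 0.9, size=2 ** D) for _ in range(D)]

def flatten_to_hmm(cpds):
    # Expand the factored transition model into the 2^D x 2^D matrix an HMM would store.
    states = list(itertools.product([0, 1], repeat=D))
    T = np.zeros((len(states), len(states)))
    for s_idx in range(len(states)):
        for t_idx, t in enumerate(states):
            p = 1.0
            for i in range(D):
                p1 = cpds[i][s_idx]
                p *= p1 if t[i] == 1 else 1.0 - p1
            T[s_idx, t_idx] = p
    return T

T = flatten_to_hmm(cpds)
print("factored parameters:", sum(c.size for c in cpds))     # D * 2^D numbers
print("flat HMM parameters:", T.size)                        # (2^D)^2 numbers
print("rows sum to 1:", np.allclose(T.sum(axis=1), 1.0))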

2,757 citations

Journal ArticleDOI
TL;DR: This work explains how to obtain region-based free energy approximations that improve the Bethe approximation, and corresponding generalized belief propagation (GBP) algorithms, and describes empirical results showing that GBP can significantly outperform BP.
Abstract: Important inference problems in statistical physics, computer vision, error-correcting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems that is exact when the factor graph is a tree, but only approximate when the factor graph has cycles. We show that BP fixed points correspond to the stationary points of the Bethe approximation of the free energy for a factor graph. We explain how to obtain region-based free energy approximations that improve the Bethe approximation, and corresponding generalized belief propagation (GBP) algorithms. We emphasize the conditions a free energy approximation must satisfy in order to be a "valid" or "maxent-normal" approximation. We describe the relationship between four different methods that can be used to generate valid approximations: the "Bethe method", the "junction graph method", the "cluster variation method", and the "region graph method". Finally, we explain how to tell whether a region-based approximation, and its corresponding GBP algorithm, is likely to be accurate, and describe empirical results showing that GBP can significantly outperform BP.
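
For reference, a standard statement of the Bethe free energy mentioned above (generic notation that may differ in details from the article): with factor beliefs b_a, variable beliefs b_i, factors f_a, and d_i the number of factors containing variable i,

F_{\mathrm{Bethe}}
  = \sum_{a}\sum_{\mathbf{x}_a} b_a(\mathbf{x}_a)\,
        \ln\frac{b_a(\mathbf{x}_a)}{f_a(\mathbf{x}_a)}
  \;-\; \sum_{i}(d_i - 1)\sum_{x_i} b_i(x_i)\,\ln b_i(x_i),
\qquad\text{subject to}\quad
  b_i(x_i) = \sum_{\mathbf{x}_a \setminus x_i} b_a(\mathbf{x}_a)
  \ \text{ for all } a \ni i,
\qquad \sum_{x_i} b_i(x_i) = 1 .

Interior stationary points of this constrained functional coincide with BP fixed points; region-based approximations replace the factor and variable terms with terms over larger regions, giving the GBP family.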

1,827 citations


Cites background from "Loopy belief propagation for approx..."

  • ...Nevertheless, in such cases there are no guarantees, and sometimes the results are quite poor, or the algorithm fails to give any result at all because it does not converge [14]....

Book
01 Jan 2003
TL;DR: It is shown that BP can only converge to a fixed point that is also a stationary point of the Bethe approximation to the free energy, which enables connections to be made with variational approaches to approximate inference.
Abstract: "Inference" problems arise in statistical physics, computer vision, error-correcting coding theory, and AI. We explain the principles behind the belief propagation (BP) algorithm, which is an efficient way to solve inference problems based on passing local messages. We develop a unified approach, with examples, notation, and graphical models borrowed from the relevant disciplines. We explain the close connection between the BP algorithm and the Bethe approximation of statistical physics. In particular, we show that BP can only converge to a fixed point that is also a stationary point of the Bethe approximation to the free energy. This result helps explain the successes of the BP algorithm and enables connections to be made with variational approaches to approximate inference. The connection of BP with the Bethe approximation also suggests a way to construct new message-passing algorithms based on improvements to Bethe's approximation introduced by Kikuchi and others. The new generalized belief propagation (GBP) algorithms are significantly more accurate than ordinary BP for some problems. We illustrate how to construct GBP algorithms with a detailed example.

1,627 citations

Proceedings Article
02 Aug 2001
TL;DR: Expectation Propagation approximates the belief states by only retaining expectations, such as mean and variance, and iterates until these expectations are consistent throughout the network, which makes it applicable to hybrid networks with discrete and continuous nodes.
Abstract: This paper presents a new deterministic approximation technique in Bayesian networks. This method, "Expectation Propagation," unifies two previous techniques: assumed-density filtering, an extension of the Kalman filter, and loopy belief propagation, an extension of belief propagation in Bayesian networks. Loopy belief propagation, because it propagates exact belief states, is useful for a limited class of belief networks, such as those which are purely discrete. Expectation Propagation approximates the belief states by only retaining expectations, such as mean and variance, and iterates until these expectations are consistent throughout the network. This makes it applicable to hybrid networks with discrete and continuous nodes. Experiments with Gaussian mixture models show Expectation Propagation to be convincingly better than methods with similar computational cost: Laplace's method, variational Bayes, and Monte Carlo. Expectation Propagation also provides an efficient algorithm for training Bayes point machine classifiers.
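
A minimal sketch of the moment-matching operation at the heart of the method described above (illustrative only, not Minka's implementation): the Gaussian "cavity" distribution is multiplied by one exact, non-Gaussian factor, and the resulting tilted distribution is projected back onto a Gaussian by matching its mean and variance. The grid-based integration and the mixture factor are assumptions made to keep the example self-contained; the surrounding EP update and message bookkeeping are omitted.

import numpy as np

def project_to_gaussian(cavity_mean, cavity_var, factor, num_points=20001):
    # Mean and variance of cavity_Gaussian(x) * factor(x), computed on a dense grid.
    s = np.sqrt(cavity_var)
    grid = np.linspace(cavity_mean - 10 * s, cavity_mean + 10 * s, num_points)
    cavity = np.exp(-0.5 * (grid - cavity_mean) ** 2 / cavity_var)
    weights = cavity * factor(grid)
    weights /= weights.sum()                       # normalize the tilted distribution on the grid
    mean = np.sum(grid * weights)                  # first moment
    var = np.sum((grid - mean) ** 2 * weights)     # second central moment
    return mean, var

# Illustrative non-Gaussian likelihood factor: a two-component Gaussian mixture.
factor = lambda x: 0.7 * np.exp(-0.5 * (x - 1.0) ** 2) + 0.3 * np.exp(-0.125 * (x + 2.0) ** 2)
m, v = project_to_gaussian(cavity_mean=0.0, cavity_var=4.0, factor=factor)
print(f"moment-matched Gaussian: mean {m:.3f}, variance {v:.3f}")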

1,514 citations


Cites background from "Loopy belief propagation for approx..."


  • ...In belief networks with loops it is known that approximate marginal distributions can be obtained by iterating the belief propagation recursions, a process known as loopy belief propagation (Frey & MacKay, 1997; Murphy et al., 1999)....

Posted Content
TL;DR: This work develops an approach to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process, then learns a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data.
Abstract: A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable. Here, we develop an approach that simultaneously achieves both flexibility and tractability. The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data. This approach allows us to rapidly learn, sample from, and evaluate probabilities in deep generative models with thousands of layers or time steps, as well as to compute conditional and posterior probabilities under the learned model. We additionally release an open source reference implementation of the algorithm.
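
A minimal sketch of the forward ("structure-destroying") half of the process described above, under toy assumptions (one-dimensional bimodal data, a linear noise schedule): repeated small Gaussian perturbations drive the data distribution toward an isotropic Gaussian. The learned reverse process, which is the paper's actual contribution, is not shown.

import numpy as np

rng = np.random.default_rng(0)
x0 = rng.choice([-2.0, 2.0], size=1000) + 0.1 * rng.standard_normal(1000)   # toy bimodal data

T = 200
betas = np.linspace(1e-4, 0.05, T)            # per-step noise variances (illustrative schedule)

x = x0.copy()
for beta in betas:
    # q(x_t | x_{t-1}) = N(sqrt(1 - beta) * x_{t-1}, beta)
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)

print("initial mean/std:", round(float(x0.mean()), 3), round(float(x0.std()), 3))
print("final   mean/std:", round(float(x.mean()), 3), round(float(x.std()), 3))   # close to N(0, 1)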

1,481 citations

References
Book
01 Jan 1988
TL;DR: Probabilistic Reasoning in Intelligent Systems as mentioned in this paper is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and provides a coherent explication of probability as a language for reasoning with partial belief.
Abstract: From the Publisher: Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertainty—and offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network-propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning-systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, speech recognition—in short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information. Probabilistic Reasoning in Intelligent Systems will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

15,671 citations

Proceedings Article
01 Jan 1993

7,742 citations

Journal ArticleDOI
TL;DR: In this article, it was shown that probabilistic inference using belief networks is NP-hard, and that it therefore seems unlikely that an exact algorithm can be developed to perform inference efficiently over all classes of belief networks, so research should be directed toward the design of efficient special-case, average-case, and approximation algorithms.

1,877 citations

Journal ArticleDOI
TL;DR: It is shown that Pearl's algorithm can be used to routinely derive previously known iterative, but suboptimal, decoding algorithms for a number of other error-control systems, including Gallager's low-density parity-check codes, serially concatenated codes, and product codes.
Abstract: We describe the close connection between the now celebrated iterative turbo decoding algorithm of Berrou et al. (1993) and an algorithm that has been well known in the artificial intelligence community for a decade, but which is relatively unknown to information theorists: Pearl's (1982) belief propagation algorithm. We see that if Pearl's algorithm is applied to the "belief network" of a parallel concatenation of two or more codes, the turbo decoding algorithm immediately results. Unfortunately, however, this belief diagram has loops, and Pearl only proved that his algorithm works when there are no loops, so an explanation of the experimental performance of turbo decoding is still lacking. However, we also show that Pearl's algorithm can be used to routinely derive previously known iterative, but suboptimal, decoding algorithms for a number of other error-control systems, including Gallager's (1962) low-density parity-check codes, serially concatenated codes, and product codes. Thus, belief propagation provides a very attractive general methodology for devising low-complexity iterative decoding algorithms for hybrid coded systems.

989 citations


"Loopy belief propagation for approx..." refers background in this paper

  • ...Yet McEliece et al. conjectured that the performance of loopy belief propagation on the Turbo code structure was a special case of a more general phenomenon: We believe there are general undiscovered theorems about the performance of belief propagation on loopy DAGs....

  • ...These theorems, which may have nothing directly to do with coding or decoding will show that in some sense belief propagation "converges with high probability to a near-optimum value" of the desired belief on a class of loopy DAGs [10]....

  • ...years" [11] and have recently been shown [9, 10] to utilize an algorithm equivalent to belief propagation in a network with loops....


Book ChapterDOI
01 Jan 1989
TL;DR: Two algorithms were applied to this belief network: a message-passing algorithm by Pearl for probability updating in multiply connected networks using the method of conditioning and the Lauritzen-Spiegelhalter algorithm for local probability computations on graphical structures.
Abstract: ALARM (A Logical Alarm Reduction Mechanism) is a diagnostic application used to explore probabilistic reasoning techniques in belief networks. ALARM implements an alarm message system for patient monitoring; it calculates probabilities for a differential diagnosis based on available evidence. The medical knowledge is encoded in a graphical structure connecting 8 diagnoses, 16 findings and 13 intermediate variables. Two algorithms were applied to this belief network: (1) a message-passing algorithm by Pearl for probability updating in multiply connected networks using the method of conditioning; and (2) the Lauritzen-Spiegelhalter algorithm for local probability computations on graphical structures. The characteristics of both algorithms are analyzed and their specific applications and time complexities are shown.

835 citations