Local computations with probabilities on graphical structures and their application to expert systems

doi:10.1111/J.2517-6161.1988.TB01721.X

Journal Article•DOI•

Steffen L. Lauritzen¹, David Spiegelhalter•Institutions (1)

01 Jun 1990 - Journal of the Royal Statistical Society, Series B (Methodological) (Morgan Kaufmann Publishers Inc.) - Vol. 50, Iss. 2, pp. 415-448

About: This article was published in the Journal of the Royal Statistical Society, Series B (Methodological) on 1990-06-01. It has received 3582 citations to date. The article focuses on the topic: Expert system.

Citations

Book•

Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference

Judea Pearl¹•Institutions (1)

University of California, Los Angeles¹

01 Jan 1988

TL;DR: Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and it provides a coherent explication of probability as a language for reasoning with partial belief.

Abstract: From the Publisher: Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertainty and offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network-propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning-systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, and speech recognition; in short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information. Probabilistic Reasoning in Intelligent Systems will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

15,671 citations

Journal Article•DOI•

Factor graphs and the sum-product algorithm

Frank R. Kschischang¹, Brendan J. Frey², Hans-Andrea Loeliger•Institutions (2)

University of Toronto¹, University of Illinois at Urbana–Champaign²

01 Feb 2001-IEEE Transactions on Information Theory

TL;DR: This tutorial presents a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph and computes, either exactly or approximately, various marginal functions derived from the global function.

Abstract: Algorithms that must deal with complicated global functions of many variables often exploit the manner in which the given functions factor as a product of "local" functions, each of which depends on a subset of the variables. Such a factorization can be visualized with a bipartite graph that we call a factor graph. In this tutorial paper, we present a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single, simple computational rule, the sum-product algorithm computes, either exactly or approximately, various marginal functions derived from the global function. A wide variety of algorithms developed in artificial intelligence, signal processing, and digital communications can be derived as specific instances of the sum-product algorithm, including the forward/backward algorithm, the Viterbi algorithm, the iterative "turbo" decoding algorithm, Pearl's (1988) belief propagation algorithm for Bayesian networks, the Kalman filter, and certain fast Fourier transform (FFT) algorithms.

6,637 citations

Cites methods from "Local computations with probabiliti..."

...In [20, 24], similar general procedures are described for transforming a graphical probability model into cycle-free form....

Book•

Bayesian networks and decision graphs

Finn V. Jensen¹, Thomas D. Nielsen•Institutions (1)

Aalborg University¹

01 Jan 2001

TL;DR: The book introduces probabilistic graphical models and decision graphs, including Bayesian networks and influence diagrams, and presents a thorough introduction to state-of-the-art solution and analysis algorithms.

Abstract: Probabilistic graphical models and decision graphs are powerful modeling tools for reasoning and decision making under uncertainty. As modeling languages they allow a natural specification of problem domains with inherent uncertainty, and from a computational perspective they support efficient algorithms for automatic construction and query answering. This includes belief updating, finding the most probable explanation for the observed evidence, detecting conflicts in the evidence entered into the network, determining optimal strategies, analyzing for relevance, and performing sensitivity analysis. The book introduces probabilistic graphical models and decision graphs, including Bayesian networks and influence diagrams. The reader is introduced to the two types of frameworks through examples and exercises, which also instruct the reader on how to build these models. The book is a new edition of Bayesian Networks and Decision Graphs by Finn V. Jensen. The new edition is structured into two parts. The first part focuses on probabilistic graphical models. Compared with the previous book, the new edition also includes a thorough description of recent extensions to the Bayesian network modeling language, advances in exact and approximate belief updating algorithms, and methods for learning both the structure and the parameters of a Bayesian network. The second part deals with decision graphs, and in addition to the frameworks described in the previous edition, it also introduces Markov decision processes and partially ordered decision problems. The authors also provide a well-founded practical introduction to Bayesian networks, object-oriented Bayesian networks, decision trees, influence diagrams (and variants hereof), and Markov decision processes; give practical advice on the construction of Bayesian networks, decision trees, and influence diagrams from domain knowledge; give several examples and exercises exploiting computer systems for dealing with Bayesian networks and decision graphs; and present a thorough introduction to state-of-the-art solution and analysis algorithms. The book is intended as a textbook, but it can also be used for self-study and as a reference book.

4,566 citations

Book•

Graphical Models, Exponential Families, and Variational Inference

Martin J. Wainwright¹, Michael I. Jordan¹•Institutions (1)

University of California, Berkeley¹

16 Dec 2008

TL;DR: The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.

Abstract: The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.

4,335 citations

Cites background from "Local computations with probabiliti..."

...) This result underlies the junction tree algorithm [69] for exact inference on arbitrary graphs:...
...Our treatment is brief; further details can be found in various sources [1, 29, 63, 57, 69]....

Journal Article•DOI•

A Bayesian Method for the Induction of Probabilistic Networks from Data

Gregory F. Cooper¹, Edward H. Herskovits²•Institutions (2)

University of Pittsburgh¹, Stanford University²

01 Oct 1992-Machine Learning

TL;DR: This paper presents a Bayesian method for constructing probabilistic networks from databases, focusing on constructing Bayesian belief networks, and extends the basic method to handle missing data and hidden variables.

Abstract: This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computer-assisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. Finally, we relate the methods in this paper to previous work, and we discuss open problems.

3,971 citations

Cites background from "Local computations with probabiliti..."

...structure Bs is a directed acyclic graph in which nodes represent domain variables and arcs between nodes represent probabilistic dependencies (Cooper, 1989; Horvitz, Breese, & Henrion, 1988; Lauritzen & Spiegelhalter, 1988; Neapolitan, 1990; Pearl, 1986; Pearl, 1988; Shachter, 1988)....

References

Journal Article•DOI•

Learning representations by back-propagating errors

David E. Rumelhart¹, Geoffrey E. Hinton², Ronald J. Williams¹•Institutions (2)

University of California, San Diego¹, Carnegie Mellon University²

01 Jan 1988-Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

Abstract: We describe a new learning procedure, back-propagation, for networks of neurone-like units. The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector. As a result of the weight adjustments, internal ‘hidden’ units which are not part of the input or output come to represent important features of the task domain, and the regularities in the task are captured by the interactions of these units. The ability to create useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure.

23,814 citations

Journal Article•DOI•

Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images

Stuart Geman¹, Donald Geman²•Institutions (2)

Brown University¹, University of Massachusetts Amherst²

01 Nov 1984-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel "relaxation" algorithm for MAP estimation.

Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states ("annealing"), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel "relaxation" algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.

18,761 citations

Book•

A mathematical theory of evidence

Glenn Shafer

01 Jan 1976

TL;DR: This book develops an alternative to the additive set functions and the rule of conditioning of the Bayesian theory: set functions that need only be what Choquet called "monotone of order of infinity," and Dempster's rule for combining such set functions.

Abstract: Both in science and in practical affairs we reason by combining facts only inconclusively supported by evidence. Building on an abstract understanding of this process of combination, this book constructs a new theory of epistemic probability. The theory draws on the work of A. P. Dempster but diverges from Dempster's viewpoint by identifying his "lower probabilities" as epistemic probabilities and taking his rule for combining "upper and lower probabilities" as fundamental. The book opens with a critique of the well-known Bayesian theory of epistemic probability. It then proceeds to develop an alternative to the additive set functions and the rule of conditioning of the Bayesian theory: set functions that need only be what Choquet called "monotone of order of infinity," and Dempster's rule for combining such set functions. This rule, together with the idea of "weights of evidence," leads to both an extensive new theory and a better understanding of the Bayesian theory. The book concludes with a brief treatment of statistical inference and a discussion of the limitations of epistemic probability. Appendices contain mathematical proofs, which are relatively elementary and seldom depend on mathematics more advanced than the binomial theorem.

14,565 citations

Book•

Theory of probability

Harold Jeffreys, R. Bruce Lindsay

01 Jan 1939

TL;DR: The book covers fundamental notions, direct probabilities, estimation problems, approximate methods and simplifications, significance tests (both for one new parameter and for various complications), frequency definitions and direct methods, and general questions.

Abstract: 1. Fundamental notions 2. Direct probabilities 3. Estimation problems 4. Approximate methods and simplifications 5. Significance tests: one new parameter 6. Significance tests: various complications 7. Frequency definitions and direct methods 8. General questions

7,086 citations

Book•

Optimal Statistical Decisions

Morris H. DeGroot

01 Jun 1970

TL;DR: The book opens with a survey of probability theory and then treats subjective probability and utility, statistical decision problems, and sequential decisions.

Abstract: Foreword. Preface.
PART ONE. SURVEY OF PROBABILITY THEORY.
Chapter 1. Introduction.
Chapter 2. Experiments, Sample Spaces, and Probability. 2.1 Experiments and Sample Spaces. 2.2 Set Theory. 2.3 Events and Probability. 2.4 Conditional Probability. 2.5 Binomial Coefficients. Exercises.
Chapter 3. Random Variables, Random Vectors, and Distribution Functions. 3.1 Random Variables and Their Distributions. 3.2 Multivariate Distributions. 3.3 Sums and Integrals. 3.4 Marginal Distributions and Independence. 3.5 Vectors and Matrices. 3.6 Expectations, Moments, and Characteristic Functions. 3.7 Transformations of Random Variables. 3.8 Conditional Distributions. Exercises.
Chapter 4. Some Special Univariate Distributions. 4.1 Introduction. 4.2 The Bernoulli Distributions. 4.3 The Binomial Distribution. 4.4 The Poisson Distribution. 4.5 The Negative Binomial Distribution. 4.6 The Hypergeometric Distribution. 4.7 The Normal Distribution. 4.8 The Gamma Distribution. 4.9 The Beta Distribution. 4.10 The Uniform Distribution. 4.11 The Pareto Distribution. 4.12 The t Distribution. 4.13 The F Distribution. Exercises.
Chapter 5. Some Special Multivariate Distributions. 5.1 Introduction. 5.2 The Multinomial Distribution. 5.3 The Dirichlet Distribution. 5.4 The Multivariate Normal Distribution. 5.5 The Wishart Distribution. 5.6 The Multivariate t Distribution. 5.7 The Bilateral Bivariate Pareto Distribution. Exercises.
PART TWO. SUBJECTIVE PROBABILITY AND UTILITY.
Chapter 6. Subjective Probability. 6.1 Introduction. 6.2 Relative Likelihood. 6.3 The Auxiliary Experiment. 6.4 Construction of the Probability Distribution. 6.5 Verification of the Properties of a Probability Distribution. 6.6 Conditional Likelihoods. Exercises.
Chapter 7. Utility. 7.1 Preferences Among Rewards. 7.2 Preferences Among Probability Distributions. 7.3 The Definitions of a Utility Function. 7.4 Some Properties of Utility Functions. 7.5 The Utility of Monetary Rewards. 7.6 Convex and Concave Utility Functions. 7.7 The Axiomatic Development of Utility. 7.8 Construction of the Utility Function. 7.9 Verification of the Properties of a Utility Function. 7.10 Extension of the Properties of a Utility Function to the Class ?E. Exercises.
PART THREE. STATISTICAL DECISION PROBLEMS.
Chapter 8. Decision Problems. 8.1 Elements of a Decision Problem. 8.2 Bayes Risk and Bayes Decisions. 8.3 Nonnegative Loss Functions. 8.4 Concavity of the Bayes Risk. 8.5 Randomization and Mixed Decisions. 8.6 Convex Sets. 8.7 Decision Problems in Which Ω and D Are Finite. 8.8 Decision Problems with Observations. 8.9 Construction of Bayes Decision Functions. 8.10 The Cost of Observation. 8.11 Statistical Decision Problems in Which Both Ω and D Contain Two Points. 8.12 Computation of the Posterior Distribution When the Observations Are Made in More Than One Stage. Exercises.
Chapter 9. Conjugate Prior Distributions. 9.1 Sufficient Statistics. 9.2 Conjugate Families of Distributions. 9.3 Construction of the Conjugate Family. 9.4 Conjugate Families for Samples from Various Standard Distributions. 9.5 Conjugate Families for Samples from a Normal Distribution. 9.6 Sampling from a Normal Distribution with Unknown Mean and Unknown Precision. 9.7 Sampling from a Uniform Distribution. 9.8 A Conjugate Family for Multinomial Observations. 9.9 Conjugate Families for Samples from a Multivariate Normal Distribution. 9.10 Multivariate Normal Distributions with Unknown Mean Vector and Unknown Precision Matrix. 9.11 The Marginal Distribution of the Mean Vector. 9.12 The Distribution of a Correlation. 9.13 Precision Matrices Having an Unknown Factor. Exercises.
Chapter 10. Limiting Posterior Distributions. 10.1 Improper Prior Distributions. 10.2 Improper Prior Distributions for Samples from a Normal Distribution. 10.3 Improper Prior Distributions for Samples from a Multivariate Normal Distribution. 10.4 Precise Measurement. 10.5 Convergence of Posterior Distributions. 10.6 Supercontinuity. 10.7 Solutions of the Likelihood Equation. 10.8 Convergence of Supercontinuous Functions. 10.9 Limiting Properties of the Likelihood Function. 10.10 Normal Approximation to the Posterior Distribution. 10.11 Approximation for Vector Parameters. 10.12 Posterior Ratios. Exercises.
Chapter 11. Estimation, Testing Hypotheses, and Linear Statistical Models. 11.1 Estimation. 11.2 Quadratic Loss. 11.3 Loss Proportional to the Absolute Value of the Error. 11.4 Estimation of a Vector. 11.5 Problems of Testing Hypotheses. 11.6 Testing a Simple Hypothesis About the Mean of a Normal Distribution. 11.7 Testing Hypotheses about the Mean of a Normal Distribution. 11.8 Deciding Whether a Parameter Is Smaller or Larger Than a Specific Value. 11.9 Deciding Whether the Mean of a Normal Distribution Is Smaller or Larger Than a Specific Value. 11.10 Linear Models. 11.11 Testing Hypotheses in Linear Models. 11.12 Investigating the Hypothesis That Certain Regression Coefficients Vanish. 11.13 One-Way Analysis of Variance. Exercises.
PART FOUR. SEQUENTIAL DECISIONS.
Chapter 12. Sequential Sampling. 12.1 Gains from Sequential Sampling. 12.2 Sequential Decision Procedures. 12.3 The Risk of a Sequential Decision Procedure. 12.4 Backward Induction. 12.5 Optimal Bounded Sequential Decision Procedures. 12.6 Illustrative Examples. 12.7 Unbounded Sequential Decision Procedures. 12.8 Regular Sequential Decision Procedures. 12.9 Existence of an Optimal Procedure. 12.10 Approximating an Optimal Procedure by Bounded Procedures. 12.11 Regions for Continuing or Terminating Sampling. 12.12 The Functional Equation. 12.13 Approximations and Bounds for the Bayes Risk. 12.14 The Sequential Probability-ratio Test. 12.15 Characteristics of Sequential Probability-ratio Tests. 12.16 Approximating the Expected Number of Observations. Exercises.
Chapter 13. Optimal Stopping. 13.1 Introduction. 13.2 The Statistician's Reward. 13.3 Choice of the Utility Function. 13.4 Sampling Without Recall. 13.5 Further Problems of Sampling with Recall and Sampling without Recall. 13.6 Sampling without Recall from a Normal Distribution with Unknown Mean. 13.7 Sampling with Recall from a Normal Distribution with Unknown Mean. 13.8 Existence of Optimal Stopping Rules. 13.9 Existence of Optimal Stopping Rules for Problems of Sampling with Recall and Sampling without Recall. 13.10 Martingales. 13.11 Stopping Rules for Martingales. 13.12 Uniformly Integrable Sequences of Random Variables. 13.13 Martingales Formed from Sums and Products of Random Variables. 13.14 Regular Supermartingales. 13.15 Supermartingales and General Problems of Optimal Stopping. 13.16 Markov Processes. 13.17 Stationary Stopping Rules for Markov Processes. 13.18 Entrance-fee Problems. 13.19 The Functional Equation for a Markov Process. Exercises.
Chapter 14. Sequential Choice of Experiments. 14.1 Introduction. 14.2 Markovian Decision Processes with a Finite Number of Stages. 14.3 Markovian Decision Processes with an Infinite Number of Stages. 14.4 Some Betting Problems. 14.5 Two-armed-bandit Problems. 14.6 Two-armed-bandit Problems When the Value of One Parameter Is Known. 14.7 Two-armed-bandit Problems When the Parameters Are Dependent. 14.8 Inventory Problems. 14.9 Inventory Problems with an Infinite Number of Stages. 14.10 Control Problems. 14.11 Optimal Control When the Process Cannot Be Observed without Error. 14.12 Multidimensional Control Problems. 14.13 Control Problems with Actuation Errors. 14.14 Search Problems. 14.15 Search Problems with Equal Costs. 14.16 Uncertainty Functions and Statistical Decision Problems. 14.17 Sufficient Experiments. 14.18 Examples of Sufficient Experiments. Exercises.
References. Supplementary Bibliography. Name Index. Subject Index.

4,287 citations