Author

Persi Diaconis

Bio: Persi Diaconis is an academic researcher from Stanford University. The author has contributed to research in topics: Markov chain & Random walk. The author has an h-index of 80 and has co-authored 341 publications receiving 28,431 citations. Previous affiliations of Persi Diaconis include University of Nice Sophia Antipolis & Cornell University.


Papers
Book
01 Jan 1988

1,522 citations

Journal ArticleDOI
TL;DR: In this article, the empirical histogram is studied as an estimator of a probability density f on an interval I; the estimator is defined by choosing a reference point x0 in I and a cell width h.
Abstract: Let f be a probability density on an interval I, finite or infinite: I includes its finite endpoints, if any, and f vanishes outside of I. Let X1, ..., Xk be independent random variables with common density f. The empirical histogram for the X's is often used to estimate f. To define this object, choose a reference point x0 in I and a cell width h, and let Nj be the number of X's falling in the jth class interval [x0 + (j - 1)h, x0 + jh).
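To make the definition concrete, here is a minimal sketch of the estimator in Python (function names and the example are illustrative, not from the paper): the histogram estimate at a point x is Nj/(kh), where Nj counts the samples landing in the class interval that contains x.

```python
import numpy as np

def histogram_estimate(xs, x0, h):
    # Empirical histogram density estimate: a minimal sketch.
    # xs : samples X_1, ..., X_k with common density f
    # x0 : reference point, h : cell width
    xs = np.asarray(xs)
    k = len(xs)
    bins = np.floor((xs - x0) / h).astype(int)  # class-interval index per sample
    counts = {}
    for j in bins:
        counts[j] = counts.get(j, 0) + 1

    def fhat(x):
        # N_j / (k * h) for the class interval containing x
        j = int(np.floor((x - x0) / h))
        return counts.get(j, 0) / (k * h)

    return fhat

# Example: estimate a standard normal density from 10,000 draws.
rng = np.random.default_rng(0)
fhat = histogram_estimate(rng.standard_normal(10_000), x0=0.0, h=0.25)
print(fhat(0.0))  # close to 1/sqrt(2*pi), about 0.40
```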

1,336 citations

Journal ArticleDOI
TL;DR: The bootstrap method is examined and evaluated as an example of this new generation of statistical tools that take advantage of the high speed digital computer and free the statistician to attack more complicated problems.
Abstract: In the past few years there has been a surge in the development of new statistical theories and methods that take advantage of the high speed digital computer. The payoff for such intensive computation methods is freedom from two limiting factors that have dominated statistical theory since its beginning: the assumption that the data conform to a bell-shaped curve and the need to focus on statistical measures whose theoretical properties can be analyzed mathematically. The new methods free the statistician to attack more complicated problems, exploiting a wider array of statistical tools. The bootstrap method is examined and evaluated as an example of this new generation of statistical tools.
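As a concrete illustration of the method the article examines, here is a minimal sketch (names are illustrative): resample the data with replacement, recompute the statistic on each resample, and take the spread of the replicates as an estimate of the statistic's standard error; no bell-shaped-curve assumption is needed.

```python
import numpy as np

def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    # Bootstrap standard error: resample with replacement, recompute
    # the statistic, and report the standard deviation of the replicates.
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    reps = np.array([
        statistic(rng.choice(data, size=len(data), replace=True))
        for _ in range(n_boot)
    ])
    return reps.std(ddof=1)

# Example: standard error of the median of skewed (non-normal) data,
# a statistic with no simple closed-form theory.
sample = np.random.default_rng(1).exponential(scale=2.0, size=100)
print(bootstrap_se(sample, np.median))
```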

1,040 citations

Journal ArticleDOI
TL;DR: In this article, bounds on the second largest eigenvalue and spectral gap of a reversible Markov chain are developed in terms of geometric quantities such as the maximum degree, diameter, and covering number of associated graphs; the bounds give improved rates of convergence for the random walk associated with approximate computation of the permanent.
Abstract: We develop bounds for the second largest eigenvalue and spectral gap of a reversible Markov chain. The bounds depend on geometric quantities such as the maximum degree, diameter and covering number of associated graphs. The bounds compare well with exact answers for a variety of simple chains and seem better than bounds derived through Cheeger-like inequalities. They offer improved rates of convergence for the random walk associated to approximate computation of the permanent.
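The paper's bounds are geometric and require no eigenvalue computation; for comparison, the following minimal sketch computes the second largest eigenvalue and spectral gap directly for a small reversible chain. The lazy random walk on a cycle is an illustrative example, not one taken from the paper.

```python
import numpy as np

def spectral_gap(P):
    # Second largest eigenvalue and spectral gap of a reversible
    # transition matrix P (eigenvalues of a reversible chain are real).
    eigs = np.sort(np.linalg.eigvals(P).real)[::-1]
    return eigs[1], 1.0 - eigs[1]

# Lazy simple random walk on a cycle of n vertices.
n = 10
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5
    P[i, (i - 1) % n] += 0.25
    P[i, (i + 1) % n] += 0.25

lam2, gap = spectral_gap(P)
print(lam2, gap)  # the gap shrinks on the order of 1/n^2 as the cycle grows
```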

947 citations

Book
01 Jan 2003
TL;DR: The Lagrange dual of the fastest mixing Markov chain problem is derived, which gives a sophisticated method for obtaining (arbitrarily good) bounds on the optimal mixing rate, as well as the optimality conditions.
Abstract: We consider a symmetric random walk on a connected graph, where each edge is labeled with the probability of transition between the two adjacent vertices. The associated Markov chain has a uniform equilibrium distribution; the rate of convergence to this distribution, i.e., the mixing rate of the Markov chain, is determined by the second largest eigenvalue modulus (SLEM) of the transition probability matrix. In this paper we address the problem of assigning probabilities to the edges of the graph in such a way as to minimize the SLEM, i.e., the problem of finding the fastest mixing Markov chain on the graph. We show that this problem can be formulated as a convex optimization problem, which can in turn be expressed as a semidefinite program (SDP). This allows us to easily compute the (globally) fastest mixing Markov chain for any graph with a modest number of edges (say, 1,000) using standard numerical methods for SDPs. Larger problems can be solved by exploiting various types of symmetry and structure in the problem, and far larger problems (say, 100,000 edges) can be solved using a subgradient method we describe. We compare the fastest mixing Markov chain to those obtained using two commonly used heuristics: the maximum-degree method, and the Metropolis-Hastings algorithm. For many of the examples considered, the fastest mixing Markov chain is substantially faster than those obtained using these heuristic methods. We derive the Lagrange dual of the fastest mixing Markov chain problem, which gives a sophisticated method for obtaining (arbitrarily good) bounds on the optimal mixing rate, as well as the optimality conditions. Finally, we describe various extensions of the method, including a solution of the problem of finding the fastest mixing reversible Markov chain, on a fixed graph, with a given equilibrium distribution.
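As a concrete sketch of the convex formulation, assuming the cvxpy package (an added dependency, not something the paper uses): for a symmetric chain with uniform equilibrium, the SLEM is the spectral norm of P - 11^T/n, minimized over symmetric stochastic P supported on the graph's edges.

```python
import cvxpy as cp
import numpy as np

# Fastest mixing Markov chain on a small path graph: minimize the SLEM
# sigma_max(P - 11^T/n) over symmetric stochastic P that is zero off
# the graph's edges (self-loops allowed). A minimal sketch.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
allowed = np.eye(n, dtype=bool)
for i, j in edges:
    allowed[i, j] = allowed[j, i] = True
forbidden = 1.0 - allowed.astype(float)  # 1 where no edge exists

P = cp.Variable((n, n), symmetric=True)
ones = np.ones(n)
J = np.outer(ones, ones) / n

constraints = [
    P >= 0,                          # nonnegative transition probabilities
    P @ ones == ones,                # rows sum to one
    cp.multiply(forbidden, P) == 0,  # transitions only along edges
]
problem = cp.Problem(cp.Minimize(cp.sigma_max(P - J)), constraints)
problem.solve()
print("optimal SLEM:", problem.value)
```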

750 citations


Cited by
Journal ArticleDOI
TL;DR: The recently developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies; when all characters are perfectly compatible, it shows significant evidence for a group if that group is defined by three or more characters.
Abstract: The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
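The character-weighting trick described above is short in code. Below is a minimal sketch (the function name and setup are illustrative): one bootstrap resample of characters, returned as the per-character weights that an existing phylogeny program could consume.

```python
import numpy as np

def bootstrap_character_weights(n_characters, rng):
    # One bootstrap resample of characters, expressed as weights:
    # sampling character columns with replacement is equivalent to
    # weighting each character by how many times it was drawn.
    draws = rng.integers(0, n_characters, size=n_characters)
    return np.bincount(draws, minlength=n_characters)

rng = np.random.default_rng(42)
weights = bootstrap_character_weights(10, rng)
print(weights, weights.sum())  # weights always sum to the number of characters
```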

40,349 citations

Book
01 Jan 1995
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations

Journal ArticleDOI
TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
Abstract: In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube; only mild conditions are imposed on the univariate function. Our results settle an open question about representability in the class of single hidden layer neural networks. In particular, we show that arbitrary decision regions can be arbitrarily well approximated by continuous feedforward neural networks with only a single internal, hidden layer and any continuous sigmoidal nonlinearity. The paper discusses approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks.
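To see the result in action, the following sketch fits a network with a single hidden layer and a logistic (sigmoidal) activation to a continuous target on the unit interval. scikit-learn is an assumed dependency here, not something used in the paper.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(2000, 1))  # support in the unit interval
y = np.sin(2 * np.pi * X).ravel()          # a continuous target function

# One internal hidden layer, sigmoidal nonlinearity, as in the theorem.
net = MLPRegressor(hidden_layer_sizes=(50,), activation="logistic",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(X, y)

X_test = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
print(np.c_[np.sin(2 * np.pi * X_test),           # true values
            net.predict(X_test).reshape(-1, 1)])  # network approximation
```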

12,286 citations

Book
23 Nov 2005
TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
Abstract: A comprehensive and self-contained introduction to Gaussian processes, which provide a principled, practical, probabilistic approach to learning in kernel machines. Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.
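As a small illustration of the supervised regression setting the book treats, here is a minimal sketch using scikit-learn's GaussianProcessRegressor with an RBF covariance. The library is an assumed dependency; the book supplies its own code and datasets on the Web.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(30, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(30)  # noisy observations

# GP prior with an RBF covariance plus a noise term; fitting selects
# the kernel hyperparameters by maximizing the marginal likelihood.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel).fit(X, y)

X_test = np.linspace(0.0, 10.0, 5).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)
print(np.c_[mean, std])  # posterior mean and pointwise uncertainty
```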

11,357 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, approximate inference, sampling methods, and combining models in the context of machine learning.
Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations