A Bernoulli Two-armed Bandit

doi:10.1214/AOMS/1177692553

Donald A. Berry

- 01 Jun 1972 -

Annals of Mathematical Statistics

- Vol. 43, Iss: 3, pp 871-897

TLDR

In this article, one of two independent Bernoulli processes with unknown expectations is selected and observed at each of $n$ stages, and the objective is to maximize the expected number of successes from the $n$ selections.

Abstract:

One of two independent Bernoulli processes (arms) with unknown expectations $\rho$ and $\lambda$ is selected and observed at each of $n$ stages. The selection problem is sequential in that the process which is selected at a particular stage is a function of the results of previous selections as well as of prior information about $\rho$ and $\lambda$. The variables $\rho$ and $\lambda$ are assumed to be independent under the (prior) probability distribution. The objective is to maximize the expected number of successes from the $n$ selections. Sufficient conditions for the optimality of selecting one or the other of the arms are given and illustrated for example distributions. The stay-on-a-winner rule is proved.
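The finite-horizon problem described in the abstract can be illustrated by backward induction over posterior states. The following is a minimal sketch, not the paper's method: it assumes independent Beta priors on $\rho$ and $\lambda$ (the abstract allows general independent priors), and the function and parameter names are hypothetical.

```python
from functools import lru_cache


def optimal_expected_successes(n, prior_rho=(1, 1), prior_lam=(1, 1)):
    """Bayes-optimal expected number of successes over n stages.

    Assumption (not from the source): each arm has an independent
    Beta(a, b) prior, given as a tuple (a, b).
    """
    a1, b1 = prior_rho
    a2, b2 = prior_lam

    @lru_cache(maxsize=None)
    def value(m, s1, f1, s2, f2):
        # m stages remain; s_i and f_i count successes and failures on arm i.
        if m == 0:
            return 0.0
        # Posterior mean success probability of each arm.
        p1 = (a1 + s1) / (a1 + b1 + s1 + f1)
        p2 = (a2 + s2) / (a2 + b2 + s2 + f2)
        # Expected total if arm 1 is selected now, then play optimally.
        v1 = (p1 * (1.0 + value(m - 1, s1 + 1, f1, s2, f2))
              + (1.0 - p1) * value(m - 1, s1, f1 + 1, s2, f2))
        # Expected total if arm 2 is selected now, then play optimally.
        v2 = (p2 * (1.0 + value(m - 1, s1, f1, s2 + 1, f2))
              + (1.0 - p2) * value(m - 1, s1, f1, s2, f2 + 1))
        return max(v1, v2)

    return value(n, 0, 0, 0, 0)


if __name__ == "__main__":
    for stages in (1, 2, 5, 10):
        print(stages, optimal_expected_successes(stages))
```

With uniform priors on both arms and $n = 2$, the recursion gives $13/12$: a first-stage success raises that arm's posterior mean to $2/3$, so the same arm is selected again, which is consistent with the stay-on-a-winner rule stated in the abstract.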

Citations

Journal Article

Celebrating 70: An Interview with Don Berry

Dalene Stangl, +2 more

- 26 Mar 2012 -

arXiv: Methodology

TL;DR: Donald (Don) Arthur Berry, born May 26, 1940 in Southbridge, Massachusetts, earned his A.B. and Ph.D. in statistics from Yale University, served first on the faculty at the University of Minnesota, and subsequently held endowed chair positions at Duke University and The University of Texas M.D. Anderson Cancer Center.

Book Chapter

Bandit Problems with Random Discounting

Donald A. Berry

TL;DR: In this article, the decision problem is shown to be equivalent to one with nonrandom discounting in some versions, and the important case of geometric discounting arises in a natural way.

Journal Article

Celebrating 70: An Interview with Don Berry

Dalene Stangl, +2 more

- 01 Feb 2012 -

Statistical Science

TL;DR: Berry has published over 200 articles and 10 books, has mentored 24 Ph.D. and 16 M.S. students, and has served as Head of the Division of Quantitative Sciences and as Chairman and Professor of the Department of Biostatistics at UT M.D. Anderson Cancer Center.

Journal Article

Optimal Choice of Design Parameter in an Adaptive Design

Uttam Bandyopadhyay, +1 more

- 01 Mar 2002 -

Calcutta Statistical Association Bulleti...

TL;DR: The present paper provides an optimal choice of design parameter for such a rule with reference to the Michigan ECMO trial, a real life application of the rule.

Journal Article

Adaptive Clinical Trial Designs with Surrogates: When Should We Bother?

- 01 Mar 2022 -

Management Science

TL;DR: The paper proposes a Bayesian adaptive clinical trial design that simultaneously leverages both observed outcomes to inform trial decisions. The design can yield a 16% decrease in trial costs relative to existing clinical trial designs while maintaining the same Type I/II error rates.

Related Papers (5)

Some aspects of the sequential design of experiments

Bandit Processes and Dynamic Allocation Indices

Bandit problems: Sequential Allocation of Experiments

On the likelihood that one unknown probability exceeds another in view of the evidence of two samples

Multi-Armed Bandits and the Gittins Index