Open Access - Journal Article

A Bernoulli Two-armed Bandit

Donald A. Berry
- 01 Jun 1972
- The Annals of Mathematical Statistics, Vol. 43, Iss. 3, pp. 871-897
TLDR
In this article, one of two independent Bernoulli processes with unknown expectations is selected and observed at each of $n$ stages, and the objective is to maximize the expected number of successes from the $n$ selections.
Abstract
One of two independent Bernoulli processes (arms) with unknown expectations $\rho$ and $\lambda$ is selected and observed at each of $n$ stages. The selection problem is sequential in that the process which is selected at a particular stage is a function of the results of previous selections as well as of prior information about $\rho$ and $\lambda$. The variables $\rho$ and $\lambda$ are assumed to be independent under the (prior) probability distribution. The objective is to maximize the expected number of successes from the $n$ selections. Sufficient conditions for the optimality of selecting one or the other of the arms are given and illustrated for example distributions. The stay-on-a-winner rule is proved.
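
Because the optimal selection at each stage is characterized through the posterior distribution of $\rho$ and $\lambda$, a short dynamic-programming sketch may help fix ideas. The sketch below is not taken from the paper: it assumes the special case of independent Beta priors on $\rho$ and $\lambda$ (conjugate to the Bernoulli likelihood), and the names value and best_arm are illustrative only.

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def value(a1, b1, a2, b2, n):
        """Optimal expected number of successes in n remaining selections when
        arm i's unknown success probability has an independent Beta(ai, bi) posterior."""
        if n == 0:
            return 0.0
        p1 = a1 / (a1 + b1)  # posterior mean of arm 1
        p2 = a2 / (a2 + b2)  # posterior mean of arm 2
        # Pulling arm 1: with probability p1 it succeeds (reward 1, posterior -> Beta(a1+1, b1)),
        # otherwise it fails (posterior -> Beta(a1, b1+1)); then continue optimally.
        q1 = p1 * (1.0 + value(a1 + 1, b1, a2, b2, n - 1)) + (1.0 - p1) * value(a1, b1 + 1, a2, b2, n - 1)
        q2 = p2 * (1.0 + value(a1, b1, a2 + 1, b2, n - 1)) + (1.0 - p2) * value(a1, b1, a2, b2 + 1, n - 1)
        return max(q1, q2)

    def best_arm(a1, b1, a2, b2, n):
        """Arm (1 or 2) attaining the maximum in the recursion above; assumes n >= 1."""
        p1 = a1 / (a1 + b1)
        p2 = a2 / (a2 + b2)
        q1 = p1 * (1.0 + value(a1 + 1, b1, a2, b2, n - 1)) + (1.0 - p1) * value(a1, b1 + 1, a2, b2, n - 1)
        q2 = p2 * (1.0 + value(a1, b1, a2 + 1, b2, n - 1)) + (1.0 - p2) * value(a1, b1, a2, b2 + 1, n - 1)
        return 1 if q1 >= q2 else 2

    # Example: uniform (Beta(1, 1)) priors on both arms and n = 10 stages.
    print(value(1, 1, 1, 1, 10))    # optimal expected number of successes
    print(best_arm(2, 1, 1, 1, 9))  # after a success on arm 1, arm 1 is still preferred

For instances such as the one above, the recursion agrees with the stay-on-a-winner rule proved in the paper: an arm that was optimal to select and then produced a success remains optimal at the next stage.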


Citations
Journal Article

A note on structural properties of the Bernoulli two-armed bandit problem

TL;DR: In this article, a certain monotonicity property is proved for the optimal expected cumulative discounted reward in a finite-horizon dynamic programming model describing the Bernoulli two-armed bandit problem.
Journal Article

A Uniform Two-armed Bandit Problem--The Parameter of one Distribution is Known

TL;DR: The problem is to find the selection procedure that maximizes the expected value of the sum of the $n$ observations, that is, to decide which experiment to perform at each stage.
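
In the Bernoulli setting of the main article, the analogous "one parameter known" structure is easy to compute, because the known arm yields no information: the optimal policy turns out to be a stopping rule, switching to the known arm at most once and never back. The sketch below only illustrates that structure; it is not the cited paper's model (which concerns uniform observations), and LAM, the Beta-prior assumption, and the name value_known are illustrative.

    from functools import lru_cache

    LAM = 0.6  # known success probability of the second arm (illustrative value)

    @lru_cache(maxsize=None)
    def value_known(a, b, n):
        """Optimal expected successes with n selections left when the unknown arm has a
        Beta(a, b) posterior and the other arm succeeds with known probability LAM."""
        if n == 0:
            return 0.0
        p = a / (a + b)  # posterior mean of the unknown arm
        pull_unknown = p * (1.0 + value_known(a + 1, b, n - 1)) + (1.0 - p) * value_known(a, b + 1, n - 1)
        # The known arm gives reward LAM in expectation and leaves the posterior unchanged.
        pull_known = LAM + value_known(a, b, n - 1)
        return max(pull_unknown, pull_known)

    # Example: uniform prior on the unknown arm, 10 stages.
    print(value_known(1, 1, 10))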
Journal Article

A Note on Discounted Future Two-Armed Bandits

TL;DR: In this article, the problem of finding Bayes sequential designs for successively choosing between two given Bernoulli variables so as to maximize the total discounted expected sum is addressed; simple hypotheses concerning the success probabilities are assumed, and dynamic programming methods are used to characterize optimal designs.
Journal Article

Bernoulli Two-Armed Bandits with Geometric Termination

TL;DR: In this paper, the standard Bernoulli two-armed bandit model is modified by terminating the choice problem after the first unsuccessful trial, and both terminal reward situations and instances in which payoffs accrue with each success are considered.