Journal ArticleDOI

Iterated logarithm inequalities.

TL;DR: A sequence of independent, identically distributed random variables with mean 0, variance 1, and moment generating function φ(t) = E(e^{tx}) finite in some neighborhood of t = 0 is introduced, and inequalities are derived for the probability that |x̄_n| ≥ a_n for some n ≥ m.
Abstract: 1. Introduction. Let x, x_1, x_2, … be a sequence of independent, identically distributed random variables with mean 0, variance 1, and moment generating function φ(t) = E(e^{tx}) finite in some neighborhood of t = 0, and put S_n = x_1 + … + x_n, x̄_n = S_n/n. For any sequence of positive constants a_n, n ≥ 1, let P_m = P(|x̄_n| ≥ a_n for some n ≥ m).
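The quantity P_m above can be estimated numerically. The following is a minimal Monte Carlo sketch, not from the paper: the function name `empirical_Pm`, the truncated horizon, and the particular LIL-scaled boundary a_n = c·sqrt(2 log log n / n) are all illustrative assumptions.

```python
import math
import random

def empirical_Pm(m, horizon=500, trials=500, c=1.0, seed=0):
    """Monte Carlo estimate of P_m = P(|x̄_n| >= a_n for some m <= n <= horizon)
    for i.i.d. standard normal x_i and the illustrative boundary
    a_n = c * sqrt(2 * log log n / n). Truncating the infinite horizon means
    this under-estimates the true P_m; it is only a sketch."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s, crossed = 0.0, False
        for n in range(1, horizon + 1):
            s += rng.gauss(0.0, 1.0)
            # log log n requires n >= 3 to be positive
            if not crossed and n >= max(m, 3):
                a_n = c * math.sqrt(2.0 * math.log(math.log(n)) / n)
                if abs(s / n) >= a_n:
                    crossed = True
        hits += crossed
    return hits / trials
```

Because each call replays the same simulated paths (fixed seed, full paths drawn), the estimate is nonincreasing in m, mirroring the monotonicity of P_m.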
Citations
Proceedings Article
29 May 2014
TL;DR: A UCB procedure is proposed for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples; it is proved optimal up to constants, and simulations show superior performance relative to the state of the art.
Abstract: The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples. The procedure cannot be improved in the sense that the number of samples required to identify the best arm is within a constant factor of a lower bound based on the law of the iterated logarithm (LIL). Inspired by the LIL, we construct our confidence bounds to explicitly account for the infinite time horizon of the algorithm. In addition, by using a novel stopping time for the algorithm we avoid a union bound over the arms that has been observed in other UCB-type algorithms. We prove that the algorithm is optimal up to constants and also show through simulations that it provides superior performance with respect to the state-of-the-art.
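As a toy illustration of the idea in the abstract (not the paper's actual algorithm, stopping rule, or constants), a UCB-type best-arm loop with a LIL-inspired confidence bound might look like the sketch below; `lil_bound` and `best_arm_ucb` are hypothetical names and the bound's constants are chosen for illustration only.

```python
import math
import random

def lil_bound(t, delta=0.05, eps=0.01):
    """LIL-inspired deviation bound for a sample mean after t observations.
    The (1+eps) inflation and the log log term echo the LIL rate; the
    constants here are illustrative, not those of the paper."""
    t = max(t, 2)  # keep log(log(.)) well defined and positive
    return math.sqrt(2.0 * (1 + eps) * math.log(math.log((1 + eps) * t) / delta) / t)

def best_arm_ucb(means, budget=3000, seed=0):
    """Toy fixed-budget UCB-type routine: repeatedly sample the arm with the
    largest upper confidence bound, then return the index of the arm with the
    highest empirical mean. Rewards are Gaussian with the given true means."""
    rng = random.Random(seed)
    k = len(means)
    counts = [1] * k
    sums = [rng.gauss(means[i], 1.0) for i in range(k)]  # one pull per arm
    for _ in range(budget - k):
        ucb = [sums[i] / counts[i] + lil_bound(counts[i]) for i in range(k)]
        i = max(range(k), key=lambda j: ucb[j])
        sums[i] += rng.gauss(means[i], 1.0)
        counts[i] += 1
    return max(range(k), key=lambda i: sums[i] / counts[i])
```

With a clearly separated best arm, most of the budget concentrates on it, so the empirical best arm matches the true best arm with high probability.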

368 citations


Cites background or methods from "Iterated logarithm inequalities."

  • ...In 2002, the successive elimination procedure of Even-Dar et al. (2002) was shown to find the best arm with order Σ_{i≠i*} Δ_i^{-2} log(n Δ_i^{-2}) samples, where Δ_i = μ_{i*} − μ_i, coming within a logarithmic factor of the lower bound of Σ_{i≠i*} Δ_i^{-2}, shown in 2004 in Mannor and Tsitsiklis (2004). A similar bound was also obtained using a procedure known as LUCB1 that was originally designed for finding the m-best arms (Kalyanakrishnan et al., 2012). Recently, Jamieson et al. (2013) proposed a procedure called PRISM which succeeds with Σ_i Δ_i^{-2} log log(Σ_j Δ_j^{-2}) or Σ_i Δ_i^{-2} log(Δ_i^{-2}) samples depending on the parameterization of the algorithm, improving the result of Even-Dar et al. (2002) by at least a factor of log(n). The best sample complexity result for the fixed confidence setting comes from a procedure similar to PRISM, called exponential-gap elimination (Karnin et al., 2013), which guarantees best arm identification with high probability using order Σ_i Δ_i^{-2} log log Δ_i^{-2} samples, coming within a doubly logarithmic factor of the lower bound of Mannor and Tsitsiklis (2004). While the authors of Karnin et al. (2013) conjecture that the log log term cannot be avoided, it remained unclear as to whether the upper bound of Karnin et al. (2013) or the lower bound of Mannor and Tsitsiklis (2004) was loose. The classic work of Farrell (1964) answers this question....

  • ...The doubly logarithmic factor is a consequence of the law of the iterated logarithm (LIL) (Darling and Robbins, 1985)....


Journal ArticleDOI
TL;DR: A method is given for obtaining probability inequalities and related limit theorems concerning the behavior of an entire sequence of random variables with a specified joint probability distribution, extending an inequality of Ville and Wald.
Abstract: 1. Extension and applications of an inequality of Ville and Wald. Let x_1, x_2, … be a sequence of random variables with a specified joint probability distribution P. We shall give a method for obtaining probability inequalities and related limit theorems concerning the behavior of the entire sequence of x's.

254 citations

Journal ArticleDOI
TL;DR: The probability that a standard Wiener process crosses a boundary g(t), including boundaries growing like (2t log log t)^{1/2}, is computed, together with an invariance theorem transferring this probability to the partial sums S_n of any i.i.d. sequence with mean 0 and variance 1.
Abstract: 1. Introduction and summary. Let W(t) denote a standard Wiener process for 0 ≤ t < ∞. We compute the probability that W(t) ≥ g(t) for some t ≥ τ (or for some t > 0) for a certain class of functions g(t), including functions which are ~ (2t log log t)^{1/2} as t → ∞. We also prove an invariance theorem which states that this probability is the limit as m → ∞ of the probability that S_n ≥ m^{1/2} g(n/m) for some n ≥ τm (or for some n ≥ 1), where S_n is the nth partial sum of any sequence x_1, x_2, … of independent and identically distributed (i.i.d.) random variables with mean 0 and variance 1.

159 citations

Journal ArticleDOI
TL;DR: In this article, the authors develop confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions, including sub-Gaussian and Bernstein conditions, and matrix martingales.
Abstract: A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. Our work develops confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions. We draw connections between the Cramer–Chernoff method for exponential concentration, the law of the iterated logarithm (LIL) and the sequential probability ratio test—our confidence sequences are time-uniform extensions of the first; provide tight, nonasymptotic characterizations of the second; and generalize the third to nonparametric settings, including sub-Gaussian and Bernstein conditions, self-normalized processes and matrix martingales. We illustrate the generality of our proof techniques by deriving an empirical-Bernstein bound growing at a LIL rate, as well as a novel upper LIL for the maximum eigenvalue of a sum of random matrices. Finally, we apply our methods to covariance matrix estimation and to estimation of sample average treatment effect under the Neyman–Rubin potential outcomes model.
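To make the notion concrete, a classical normal-mixture confidence sequence in the Robbins tradition (a simplified textbook form, not the construction of this paper) has a radius that shrinks to zero at roughly a sqrt(log t / t) rate; the function name `mixture_cs_radius` and its defaults are illustrative assumptions.

```python
import math

def mixture_cs_radius(t, delta=0.05, rho=1.0):
    """Radius of a normal-mixture confidence sequence for the mean of
    1-sub-Gaussian observations: with probability >= 1 - delta,
        |x̄_t - μ| <= sqrt((t + rho) * log((t + rho) / (rho * delta^2))) / t
    simultaneously for all t >= 1. This is the common textbook form of the
    Robbins-style mixture bound, shown here only as an illustration; rho
    tunes where the sequence is tightest."""
    return math.sqrt((t + rho) * math.log((t + rho) / (rho * delta * delta))) / t
```

Note the uniform-in-time validity: unlike a fixed-sample confidence interval, the guarantee holds over the whole unbounded horizon, at the price of a slightly wider (log-factor) radius.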

129 citations


Cites background or methods from "Iterated logarithm inequalities."

  • ...Analogous, related results for the sub-Gaussian special case using ψ(λ) = λ2/2 can be found in Robbins and Siegmund [58], Section 4, and Lai [42], Theorem 2, in some cases under weaker assumptions on F ....


  • ...This was observed by Darling and Robbins [13], and is used in the implementation of Johari et al. [34], for example....


  • ...Their main example is the Beta mixture for i.i.d. Bernoulli observations, an example which originated with Ville [72] and discussed by Robbins [55] and Lai [41]....


  • ...Our definition of confidence sequences (1.1), based on Darling and Robbins [12] and Lai [43], differs from that of Johari, Pekelis and Walsh [35], who require that P(θτ ∈ CIτ ) ≥ 1 − α for all stopping times τ ....


  • ...A similar idea was considered by Darling and Robbins [14], using a mixture integral approximation instead of an epoch-based construction to derive closed-form bounds....


Posted Content
TL;DR: A class of exponential bounds is presented for the probability that a martingale sequence crosses a time-dependent linear threshold, unifying and strengthening many classical and modern tail bounds and quantifying the time-uniform concentration of scalar, matrix and Banach-space-valued martingales.
Abstract: We develop a class of exponential bounds for the probability that a martingale sequence crosses a time-dependent linear threshold. Our key insight is that it is both natural and fruitful to formulate exponential concentration inequalities in this way. We illustrate this point by presenting a single assumption and theorem that together unify and strengthen many tail bounds for martingales, including classical inequalities (1960-80) by Bernstein, Bennett, Hoeffding, and Freedman; contemporary inequalities (1980-2000) by Shorack and Wellner, Pinelis, Blackwell, van de Geer, and de la Peña; and several modern inequalities (post-2000) by Khan, Tropp, Bercu and Touati, Delyon, and others. In each of these cases, we give the strongest and most general statements to date, quantifying the time-uniform concentration of scalar, matrix, and Banach-space-valued martingales, under a variety of nonparametric assumptions in discrete and continuous time. In doing so, we bridge the gap between existing line-crossing inequalities, the sequential probability ratio test, the Cramér-Chernoff method, self-normalized processes, and other parts of the literature.
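The line-crossing formulation can be checked empirically. For a random walk with standard normal increments, applying Ville's inequality to the nonnegative supermartingale exp(2b·S_t − 2b²t) gives P(∃t: S_t ≥ a + bt) ≤ e^{−2ab} (this is the standard sub-Gaussian special case, not one of the paper's new bounds). The Monte Carlo sketch below, with an illustrative function name, truncated horizon, and parameters, estimates the left-hand side.

```python
import math
import random

def crossing_prob(a, b, horizon=1000, trials=2000, seed=0):
    """Monte Carlo estimate of P(exists t <= horizon: S_t >= a + b*t) for a
    Gaussian random walk S_t. Ville's inequality applied to the
    supermartingale exp(2b*S_t - 2*b^2*t) bounds this by exp(-2ab):
    on the crossing event, 2b*S_t - 2b^2*t >= 2ab."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0.0
        for t in range(1, horizon + 1):
            s += rng.gauss(0.0, 1.0)
            if s >= a + b * t:
                hits += 1
                break
    return hits / trials
```

The truncated empirical frequency should sit at or below the exponential bound e^{−2ab}, typically with room to spare since discrete sampling can only miss crossings of the continuous-time line.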

67 citations


Cites background from "Iterated logarithm inequalities."

  • ...It has long been known that the normal SPRT bound can be applied to sequential problems involving any i.i.d. sequence of sub-Gaussian observations (Darling and Robbins, 1967; Robbins, 1970)....


References
Journal ArticleDOI
01 Jan 1960
TL;DR: The original Kolmogorov inequality has been extended to a martingale inequality by Levy [8] and Ville [12], and later to a semimartingale inequality by Doob [3]; this note extends it further to a semimartingale inequality containing Doob's as a special case.
Abstract: The original Kolmogorov inequality [6] has been extended to a martingale inequality by Levy [8] and Ville [12] and later to a semimartingale inequality by Doob [3]. In this note we will extend (1) to a semimartingale inequality which contains Doob's inequality as a special case. As Kolmogorov's inequality is the key to the proof of the law of large numbers for a sequence of independent random variables, we will use our inequality to prove a "law of large numbers" for a martingale, which will be shown to include the extensions of Kolmogorov's law of large numbers for independent random variables [7] made by Brunk [1], Chung [2], Kawata and Udagawa [5], and Prohorov [11], and for dependent random variables made by Levy [8] and Loeve [9]. In the following (Ω, F, P) will be a probability space, c_1, c_2, … a nonincreasing sequence of positive numbers, x_1, x_2, … a sequence of random variables, y_k = x_1 + x_2 + … + x_k and F_k the Borel field generated by x_1, x_2, …, x_k for each k, and for a random variable z we put z⁺ = max(z, 0).

91 citations