Iterated logarithm inequalities.
Citations
368 citations
Cites background or methods from "Iterated logarithm inequalities."
...In 2002, the successive elimination procedure of Even-Dar et al. (2002) was shown to find the best arm with order $\sum_{i \neq i^*} \Delta_i^{-2} \log(n \Delta_i^{-2})$ samples, where $\Delta_i = \mu_{i^*} - \mu_i$, coming within a logarithmic factor of the lower bound of $\sum_{i \neq i^*} \Delta_i^{-2}$, shown in 2004 in Mannor and Tsitsiklis (2004). A similar bound was also obtained using a procedure known as LUCB1 that was originally designed for finding the m-best arms (Kalyanakrishnan et al., 2012). Recently, Jamieson et al. (2013) proposed a procedure called PRISM which succeeds with $\sum_i \Delta_i^{-2} \log\log\bigl(\sum_j \Delta_j^{-2}\bigr)$ or $\sum_i \Delta_i^{-2} \log(\Delta_i^{-2})$ samples depending on the parameterization of the algorithm, improving the result of Even-Dar et al. (2002) by at least a factor of $\log(n)$. The best sample complexity result for the fixed confidence setting comes from a procedure similar to PRISM, called exponential-gap elimination (Karnin et al., 2013), which guarantees best arm identification with high probability using order $\sum_i \Delta_i^{-2} \log\log \Delta_i^{-2}$ samples, coming within a doubly logarithmic factor of the lower bound of Mannor and Tsitsiklis (2004). While the authors of Karnin et al. (2013) conjecture that the log log term cannot be avoided, it remained unclear as to whether the upper bound of Karnin et al. (2013) or the lower bound of Mannor and Tsitsiklis (2004) was loose. The classic work of Farrell (1964) answers this question....
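To make the gap between these rates concrete, the order-level expressions quoted in the excerpt can be evaluated for a small instance. This is only a sketch: constants are dropped, the gap values are hypothetical, the function name `complexity_terms` is made up for illustration, and `n` is assumed to be the number of arms.

```python
import math

def complexity_terms(gaps):
    """Evaluate the order-level sample-complexity expressions (constants omitted).

    `gaps` holds the gaps Delta_i of the suboptimal arms. Each gap is assumed
    to be below exp(-1/2), so that log log(Delta_i^-2) is positive.
    """
    n = len(gaps) + 1  # assumed: n counts all arms, including the best one
    inv_sq = [d ** -2 for d in gaps]  # Delta_i^-2 for each suboptimal arm
    lower_bound = sum(inv_sq)
    successive_elimination = sum(h * math.log(n * h) for h in inv_sq)
    exponential_gap = sum(h * math.log(math.log(h)) for h in inv_sq)
    return lower_bound, successive_elimination, exponential_gap

gaps = [0.1, 0.2, 0.3, 0.4]  # hypothetical gaps, not taken from any cited paper
lower, se, ege = complexity_terms(gaps)
print(f"lower bound term:        {lower:.1f}")
print(f"successive elimination:  {se:.1f}")
print(f"exponential-gap elim.:   {ege:.1f}")
```

For these illustrative gaps the three quantities are ordered as the excerpt describes: the lower-bound term is smallest, the log log expression sits just above it, and the log(n) expression is largest.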
[...]
...The doubly logarithmic factor is a consequence of the law of the iterated logarithm (LIL) (Darling and Robbins, 1985)....
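For reference, the classical law of the iterated logarithm can be stated as follows. The notation is an assumption made here, not taken from the excerpt: $X_1, X_2, \dots$ are i.i.d. with zero mean and unit variance, and $S_n$ is their partial sum.

```latex
\limsup_{n \to \infty} \frac{S_n}{\sqrt{2 n \log\log n}} = 1
\quad \text{almost surely},
\qquad S_n = \sum_{i=1}^{n} X_i .
```

The $\sqrt{n \log\log n}$ envelope is what produces a $\log\log$ term when a bound on $S_n$ has to hold at all times $n$ simultaneously.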
[...]
129 citations
Cites background or methods from "Iterated logarithm inequalities."
...Analogous, related results for the sub-Gaussian special case using $\psi(\lambda) = \lambda^2/2$ can be found in Robbins and Siegmund [58], Section 4, and Lai [42], Theorem 2, in some cases under weaker assumptions on $F$....
[...]
...This was observed by Darling and Robbins [13], and is used in the implementation of Johari et al. [34], for example....
[...]
...Their main example is the Beta mixture for i.i.d. Bernoulli observations, an example which originated with Ville [72] and was discussed by Robbins [55] and Lai [41]....
[...]
...Our definition of confidence sequences (1.1), based on Darling and Robbins [12] and Lai [43], differs from that of Johari, Pekelis and Walsh [35], who require that $P(\theta_\tau \in \mathrm{CI}_\tau) \ge 1 - \alpha$ for all stopping times $\tau$....
[...]
...A similar idea was considered by Darling and Robbins [14], using a mixture integral approximation instead of an epoch-based construction to derive closed-form bounds....
[...]
67 citations
Cites background from "Iterated logarithm inequalities."
...It has long been known that the normal SPRT bound can be applied to sequential problems involving any i.i.d. sequence of sub-Gaussian observations (Darling and Robbins, 1967; Robbins, 1970)....
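One way to read this claim, offered as an assumption about what the excerpt means rather than a statement of the cited authors' construction: for 1-sub-Gaussian increments, $\exp(\lambda S_n - n\lambda^2/2)$ is a nonnegative supermartingale, so Ville's inequality gives a linear boundary that the partial sums $S_n$ cross with probability at most $\alpha$. The sketch below checks this by simulation with standard normal increments. The function names and parameter values are hypothetical.

```python
import math
import random

def linear_boundary(n, lam, alpha):
    """Line b_n = log(1/alpha)/lam + n*lam/2.

    For 1-sub-Gaussian increments, Ville's inequality applied to
    exp(lam*S_n - n*lam^2/2) gives P(exists n: S_n >= b_n) <= alpha.
    """
    return math.log(1.0 / alpha) / lam + n * lam / 2.0

def crossing_frequency(lam, alpha, horizon, trials, seed=0):
    """Fraction of simulated N(0, 1) random walks that cross the line by `horizon`."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    crossings = 0
    for _ in range(trials):
        s = 0.0
        for n in range(1, horizon + 1):
            s += rng.gauss(0.0, 1.0)
            if s >= linear_boundary(n, lam, alpha):
                crossings += 1
                break
    return crossings / trials

# Illustrative parameters only; the finite horizon truncates the "for all n" event.
freq = crossing_frequency(lam=1.0, alpha=0.1, horizon=200, trials=2000)
print(f"empirical crossing frequency: {freq:.3f} (guarantee: at most 0.1)")
```

The choice of `lam` trades off the intercept against the slope of the line, so a single value is tight only around one sample size. Mixture and epoch-based constructions address that limitation.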
[...]