Journal Article

A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations

01 Dec 1952 - Annals of Mathematical Statistics (Institute of Mathematical Statistics) - Vol. 23, Iss. 4, pp. 493-507
TL;DR: In this paper, it was shown that the likelihood ratio test for fixed sample size can be reduced to this form, and that for large samples, a sample of size $n$ with the first test gives about the same probabilities of error as a sample of size $en$ with the second test.
Abstract: In many cases an optimum or computationally convenient test of a simple hypothesis $H_0$ against a simple alternative $H_1$ may be given in the following form. Reject $H_0$ if $S_n = \sum^n_{j=1} X_j \leqq k,$ where $X_1, X_2, \cdots, X_n$ are $n$ independent observations of a chance variable $X$ whose distribution depends on the true hypothesis and where $k$ is some appropriate number. In particular the likelihood ratio test for fixed sample size can be reduced to this form. It is shown that with each test of the above form there is associated an index $\rho$. If $\rho_1$ and $\rho_2$ are the indices corresponding to two alternative tests $e = \log \rho_1/\log \rho_2$ measures the relative efficiency of these tests in the following sense. For large samples, a sample of size $n$ with the first test will give about the same probabilities of error as a sample of size $en$ with the second test. To obtain the above result, use is made of the fact that $P(S_n \leqq na)$ behaves roughly like $m^n$ where $m$ is the minimum value assumed by the moment generating function of $X - a$. It is shown that if $H_0$ and $H_1$ specify probability distributions of $X$ which are very close to each other, one may approximate $\rho$ by assuming that $X$ is normally distributed.
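To make the abstract's central quantity concrete, here is a minimal numerical sketch (our own illustration, not code from the paper; the Bernoulli model and all names are our choices) that computes $m = \min_t E[e^{t(X - a)}]$ for a Bernoulli variable and checks that $\frac{1}{n} \log P(S_n \leqq na)$ approaches $\log m$ as $n$ grows:

```python
# Sketch: the quantity m = min_t E[exp(t(X - a))] for X ~ Bernoulli(p),
# and a check that (1/n) log P(S_n <= n*a) approaches log m as n grows.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

def chernoff_m(mgf, a, bounds=(-50.0, 0.0)):
    """Minimum of exp(-t*a) * MGF_X(t) over t; for a < E[X] the
    minimizer has t < 0, hence the search bounds."""
    res = minimize_scalar(lambda t: np.exp(-t * a) * mgf(t),
                          bounds=bounds, method="bounded")
    return res.fun

p, a = 0.5, 0.4                            # X ~ Bernoulli(p), threshold a < E[X]
m = chernoff_m(lambda t: 1 - p + p * np.exp(t), a)

for n in (50, 200, 1000, 5000):
    logP = binom.logcdf(int(n * a), n, p)  # exact log P(S_n <= n*a)
    print(f"n={n:5d}  (1/n) log P = {logP / n:+.4f}   log m = {np.log(m):+.4f}")
```

Evaluating the same minimum under two competing test statistics gives the indices $\rho_1$ and $\rho_2$, and $e = \log \rho_1/\log \rho_2$ is then the sample-size ratio described in the abstract.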
Citations
Journal Article
TL;DR: In this article, the authors develop confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions, including sub-Gaussian and Bernstein conditions, and matrix martingales.
Abstract: A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. Our work develops confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions. We draw connections between the Cramér–Chernoff method for exponential concentration, the law of the iterated logarithm (LIL) and the sequential probability ratio test: our confidence sequences are time-uniform extensions of the first; provide tight, nonasymptotic characterizations of the second; and generalize the third to nonparametric settings, including sub-Gaussian and Bernstein conditions, self-normalized processes and matrix martingales. We illustrate the generality of our proof techniques by deriving an empirical-Bernstein bound growing at a LIL rate, as well as a novel upper LIL for the maximum eigenvalue of a sum of random matrices. Finally, we apply our methods to covariance matrix estimation and to estimation of sample average treatment effect under the Neyman–Rubin potential outcomes model.
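To illustrate the Cramér–Chernoff mixture machinery behind such confidence sequences, here is a minimal sketch (our own simplification, not the paper's general construction; `alpha`, `rho`, and the 1-sub-Gaussian assumption are our choices). It implements a Robbins-style normal-mixture boundary: with probability at least 1 − α, simultaneously for every n, the running mean stays within sqrt((n + ρ) log((n + ρ)/(ρα²)))/n of the true mean, a width that shrinks to zero.

```python
# Time-uniform confidence sequence for the mean of 1-sub-Gaussian observations,
# via a normal-mixture (Robbins-style) supermartingale boundary.
import numpy as np

def mixture_radius(n, alpha=0.05, rho=1.0):
    """Radius of the two-sided normal-mixture confidence sequence at time n."""
    n = np.asarray(n, dtype=float)
    return np.sqrt((n + rho) * np.log((n + rho) / (rho * alpha**2))) / n

rng = np.random.default_rng(1)
x = rng.normal(loc=0.3, scale=1.0, size=10_000)  # true mean 0.3, 1-sub-Gaussian
n = np.arange(1, x.size + 1)
means = np.cumsum(x) / n
radius = mixture_radius(n)

# Uniform coverage: the true mean stays inside the interval at every n.
print("covered at every n:", bool(np.all(np.abs(means - 0.3) < radius)))
print("width at n=100:", 2 * radius[99], " width at n=10000:", 2 * radius[-1])
```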

129 citations


Cites methods from "A Measure of Asymptotic Efficiency ..."

  • ...2), itself based on the classical Cramér–Chernoff method ([10, 11]; [9], Section 2....


Book
28 Feb 2018
TL;DR: This paper describes a completely general strategy that takes an arbitrary step of an ideal CRCW PRAM and automatically translates it to run efficiently and robustly on a PRAM in which processors are prone to failure.
Abstract: A parallel computing system becomes increasingly prone to failure as the number of processing elements in it increases. In this paper, we describe a completely general strategy that takes an arbitrary step of an ideal CRCW PRAM and automatically translates it to run efficiently and robustly on a PRAM in which processors are prone to failure. The strategy relies on efficient robust algorithms for solving a core problem, the Certified Write-All Problem. This problem characterizes the core of robustness because, as we show, its complexity is equal to that of any general strategy for realizing robustness in the model. We analyze the expected parallel time and work of various algorithms for solving this problem. Our results are a non-trivial generalization of Brent's Lemma. We consider the case where the number of available processors decreases dynamically over time, whereas Brent's Lemma is only applicable when the processor availability pattern is static.
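As a toy illustration of the Write-All problem at the heart of this strategy (our own simulation, far simpler than the certified algorithms analyzed here; the schedule and failure model are illustrative), the sketch below has n fail-stop processors cooperatively set an n-cell array to ones and reports the total work expended:

```python
# Toy Write-All simulation: n processors sweep an n-cell array from staggered
# starting points; each live processor may fail-stop at any step. Work counts
# the total number of write steps taken by live processors.
import random

def write_all(n, fail_prob=0.01, seed=0):
    rng = random.Random(seed)
    cells = [0] * n
    alive = set(range(n))
    offset = {p: p for p in alive}         # staggered schedule: p starts at cell p
    work = 0
    while alive and not all(cells):
        for p in list(alive):
            if rng.random() < fail_prob:   # fail-stop: p halts forever
                alive.discard(p)
                continue
            cells[offset[p] % n] = 1       # write the next cell in p's sweep
            offset[p] += 1
            work += 1
    return all(cells), work

done, work = write_all(1000)
print(f"completed={done}, total work={work} (n=1000 would suffice with no failures)")
```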

129 citations

Journal Article
TL;DR: It is shown that, in the absence of eavesdropping and without using cryptography, for any ε > 0 and t = n/(3 + ε), there is a randomized protocol with O(log n) expected number of rounds, which is an improvement on the lower bound of t + 1 rounds required for deterministic protocols.
Abstract: Byzantine Generals protocols enable processes to broadcast messages reliably in the presence of faulty processes. These protocols are run in a system that consists of n processes, t of which are faulty. The protocols are conducted in synchronous rounds of message exchange. It is shown that, in the absence of eavesdropping, without using cryptography, for any ε > 0 and t = n/(3 + ε), there is a randomized protocol with O(log n) expected number of rounds. If cryptographic methods are allowed, then, for ε > 0 and t = n/(2 + ε), there is a randomized protocol with O(log n) expected number of rounds. This is an improvement on the lower bound of t + 1 rounds required for deterministic protocols, and on a previous result of t/log n expected number of rounds for randomized noncryptographic protocols.

128 citations

Posted Content
01 Jan 2010
TL;DR: In this paper, a sequential adaptive sampling-and-refinement procedure called distilled sensing (DS) is proposed and analyzed, which can detect and localize far weaker signals than possible from non-adaptive measurements.
Abstract: Adaptive sampling results in dramatic improvements in the recovery of sparse signals in white Gaussian noise. A sequential adaptive sampling-and-refinement procedure called Distilled Sensing (DS) is proposed and analyzed. DS is a form of multi-stage experimental design and testing. Because of the adaptive nature of the data collection, DS can detect and localize far weaker signals than possible from non-adaptive measurements. In particular, reliable detection and localization (support estimation) using non-adaptive samples is possible only if the signal amplitudes grow logarithmically with the problem dimension. Here it is shown that using adaptive sampling, reliable detection is possible provided the amplitude exceeds a constant, and localization is possible when the amplitude exceeds any arbitrarily slowly growing function of the dimension.
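A compact sketch of the distillation idea follows (our own simplified version; the stage count, even budget split, and zero threshold are illustrative choices, not the paper's exact procedure): repeatedly observe the retained coordinates, discard those with non-positive observations, and concentrate the remaining budget on the survivors.

```python
# Distilled-Sensing-style sketch: null coordinates are discarded at roughly
# half per stage, so measurement precision grows for the survivors.
import numpy as np

def distilled_sensing(mu, n_stages=5, budget_per_stage=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    retained = np.arange(mu.size)
    for _ in range(n_stages):
        if retained.size == 0:
            break
        # Split the stage budget evenly over retained coordinates: fewer
        # survivors means higher effective precision per coordinate.
        precision = budget_per_stage * mu.size / retained.size
        y = np.sqrt(precision) * mu[retained] + rng.standard_normal(retained.size)
        retained = retained[y > 0]         # the distillation step
    return retained

n, k = 100_000, 100
mu = np.zeros(n)
mu[:k] = 3.0                               # k sparse positive signal entries
est = distilled_sensing(mu)
print(f"retained {est.size} coordinates; {np.sum(est < k)}/{k} signals survive")
```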

128 citations

Journal Article
TL;DR: There exists a fixed positive rate C_B below which it is possible (asymptotically in n) to obtain exponentially small error probability using bounded discrepancy decoding, and in many cases C_B is shown to be strictly less than the channel capacity.
Abstract: The following four channels are considered: (A) a class of discrete memoryless channels with q inputs and outputs, (B) the time-discrete, amplitude-continuous memoryless channel with additive Gaussian noise and amplitude constraint, (C) the same as channel B but with energy instead of amplitude constraint, (D) a class of time-discrete, amplitude-continuous memoryless channels with amplitude constraint and non-Gaussian noise. For each channel the theoretical capabilities of “bounded discrepancy decoding” are studied. The “discrepancy” between two vectors is a distance or distance-like quantity defined such that the optimal decoder is a “minimum discrepancy decoder.” For example, for channel A the discrepancy is the Hamming distance, and for channel B the discrepancy is the Euclidean distance. Bounded discrepancy decoding is a nonoptimal decoding scheme in which disjoint regions in the space of possible received vectors are constructed about each code word, each region consisting of those vectors within a fixed discrepancy of that code word. For example, in channels A and B the regions are spheres with centers at the code words and radius d/2, where d is the minimum distance between code words. If the received vector is in the region about code word i, it is decoded as code word i; otherwise the decoder announces an error. For all four classes of channels the following is shown to hold: there exists a fixed positive rate C_B below which it is possible (asymptotically in n) to obtain exponentially small error probability using bounded discrepancy decoding. In many cases C_B is shown to be strictly less than the channel capacity.
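For channel A the scheme is easy to state in code. Below is a small sketch (our own toy example: a binary repetition code and decoding radius ⌊(d − 1)/2⌋, which keeps the spheres disjoint):

```python
# Bounded discrepancy decoding with Hamming distance as the discrepancy:
# decode to the unique codeword within radius (d-1)//2 of the received word,
# otherwise announce an error (return None).
import numpy as np

def bounded_discrepancy_decode(received, codebook, d):
    radius = (d - 1) // 2
    for i, codeword in enumerate(codebook):
        if np.sum(received != codeword) <= radius:
            return i                       # received lies in codeword i's sphere
    return None                            # no sphere contains it: decoder error

n = 5
codebook = np.array([[0] * n, [1] * n])    # repetition code, minimum distance d = n
rng = np.random.default_rng(2)

sent = codebook[1]
flips = rng.random(n) < 0.2                # binary symmetric channel, crossover 0.2
received = sent ^ flips
print("decoded index:", bounded_discrepancy_decode(received, codebook, d=n))
```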

127 citations
