Journal Article

A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations

01 Dec 1952 - Annals of Mathematical Statistics (Institute of Mathematical Statistics) - Vol. 23, Iss. 4, pp. 493-507
TL;DR: This paper associates an index $\rho$ with each test that rejects $H_0$ when $S_n = \sum^n_{j=1} X_j \leqq k$, and shows that for large samples, a sample of size $n$ with the first of two such tests gives about the same probabilities of error as a sample of size $en$ with the second, where $e = \log \rho_1/\log \rho_2$.
Abstract: In many cases an optimum or computationally convenient test of a simple hypothesis $H_0$ against a simple alternative $H_1$ may be given in the following form. Reject $H_0$ if $S_n = \sum^n_{j=1} X_j \leqq k,$ where $X_1, X_2, \cdots, X_n$ are $n$ independent observations of a chance variable $X$ whose distribution depends on the true hypothesis and where $k$ is some appropriate number. In particular, the likelihood ratio test for fixed sample size can be reduced to this form. It is shown that with each test of the above form there is associated an index $\rho$. If $\rho_1$ and $\rho_2$ are the indices corresponding to two alternative tests, $e = \log \rho_1/\log \rho_2$ measures the relative efficiency of these tests in the following sense. For large samples, a sample of size $n$ with the first test will give about the same probabilities of error as a sample of size $en$ with the second test. To obtain the above result, use is made of the fact that $P(S_n \leqq na)$ behaves roughly like $m^n$, where $m$ is the minimum value assumed by the moment generating function of $X - a$. It is shown that if $H_0$ and $H_1$ specify probability distributions of $X$ which are very close to each other, one may approximate $\rho$ by assuming that $X$ is normally distributed.
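The statement that $P(S_n \leqq na)$ behaves roughly like $m^n$ can be checked numerically in a simple case. The sketch below is an illustration, not taken from the paper: it assumes $X$ is Bernoulli with success probability `p` and a threshold `a < p`, and every function name in it is hypothetical. It minimises the moment generating function of $X - a$ numerically, compares the result with the closed form available for this Bernoulli case, and prints the exact lower-tail probability next to $m^n$ on a per-observation log scale.

```python
import math


def mgf_shifted(t, p, a):
    """E[exp(t (X - a))] for X ~ Bernoulli(p) (illustrative assumption)."""
    return math.exp(-t * a) * (1.0 - p + p * math.exp(t))


def min_mgf(p, a, lo=-40.0, hi=0.0, iters=200):
    """Minimise the convex map t -> mgf_shifted(t, p, a) by ternary search.

    For 0 < a < p the minimiser is negative, so the interval [lo, 0] is searched.
    """
    for _ in range(iters):
        t1 = lo + (hi - lo) / 3.0
        t2 = hi - (hi - lo) / 3.0
        if mgf_shifted(t1, p, a) < mgf_shifted(t2, p, a):
            hi = t2
        else:
            lo = t1
    return mgf_shifted(0.5 * (lo + hi), p, a)


def closed_form_m(p, a):
    """Closed-form minimum of the shifted MGF in the Bernoulli case."""
    return (p / a) ** a * ((1.0 - p) / (1.0 - a)) ** (1.0 - a)


def lower_tail(n, k, p):
    """Exact P(S_n <= k) for S_n ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k + 1))


if __name__ == "__main__":
    p = 0.5                      # assumed distribution parameter
    for n in (50, 200, 800):
        k = 3 * n // 10          # threshold k = n * a with a = 0.3
        a = k / n
        m = min_mgf(p, a)
        tail = lower_tail(n, k, p)
        # Per-observation log of the exact tail next to log m.
        print(n, math.log(tail) / n, math.log(m))
```

In this Bernoulli case the exact tail never exceeds $m^n$, and the two printed columns move closer together as `n` grows, which is the sense in which the tail "behaves roughly like" $m^n$.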
Citations
Proceedings Article
01 Dec 1988
TL;DR: This paper proves a lower bound on the number of random examples required for distribution-free learning of a concept class C and shows that for many interesting concept classes, including k-CNF and k-DNF, the bound is actually tight to within a constant factor.

432 citations

01 Jan 2001
TL;DR: A brief review of the developments in several classical problems of sequential analysis and their applications to biomedicine, economics and engineering can be found in this paper.
Abstract: We give a brief review of the developments in several classical problems of sequential analysis and their applications to biomedicine, economics and engineering. Even though it can only focus on a limited number of topics, the review shows that sequential analysis is still a vibrant subject after six decades of continual development, with fresh ideas brought in from various fields of application and through interactions with other branches of statistics and probability. We conclude with some remarks on the opportunities and challenges ahead.

423 citations

Proceedings Article
01 Apr 1999
TL;DR: This paper defines an appropriate stochastic error model on the input, and proves that under the conditions of the model, the algorithm recovers the cluster structure with high probability, and presents a practical heuristic based on the same algorithmic ideas.
Abstract: Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multicondition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is $O(n^2 [\log(n)]^c)$. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its performance is demonstrated on simulate...

422 citations

Book
01 Jan 1987
TL;DR: This paper examines some properties which the S-boxes satisfy and attempts to determine a reason for such structure to exist in the Data Encryption Standard.
Abstract: The S-boxes used in the DES are the major cryptographic component of the system. Any structure which they possess can have far-reaching implications for the security of the algorithm. Structure may exist as a result of design principles intended to strengthen security. Structure could also exist as a "trapdoor" for breaking the system. This paper examines some properties which the S-boxes satisfy and attempts to determine a reason for such structure to exist. INTRODUCTION: The DES (Data Encryption Standard) was certified by the NBS in 1975 [NBS1]. A complete description of the DES can also be found in either [D] or [K]. The major nonlinear component of the DES is a function $f$ that involves the S-boxes. $f$ is a function that takes as input 32 bits of partially enciphered message and 48 bits of key and produces 32 bits of partially enciphered message as output. $f$ uses eight S-boxes. Each S-box is a function from 6 bits into 4 bits. To be more precise, let $a = (a_1 \ldots a_{32})$ be 32 bits of partially enciphered message and let $k = (k_1 \ldots k_{48})$ be 48 bits of key. Then to form $f(a,k)$, $a$ is expanded to a 48-bit $b$ by duplicating the bits that have an index that is 0 or 1 mod 4. Let $c_i = b_i + k_i$ for $1 \leq i \leq 48$. ($+$ will refer to addition mod 2 throughout this paper.) Let $c = (c_1 \ldots c_{48})$. Then $c_{6(i-1)+1} \ldots c_{6(i-1)+6}$ will be the 6 input bits into the $i$'th S-box. Let $d_{4(i-1)+1} \ldots d_{4(i-1)+4}$ be the output bits from the $i$'th S-box. Let $d = (d_1 \ldots d_{32})$. Then $f(a,k) = d$. * This work performed at Sandia National Laboratories, supported by the U.S. Department of Energy under contract number DE-AC04-76DP00789. ** This work performed while the author was visiting Bell Communications Research. A.M. Odlyzko (Ed.): Advances in Cryptology - CRYPTO '86, LNCS 263, pp. 3-8, 1987. © Springer-Verlag Berlin Heidelberg 1987

421 citations

Book Chapter
TL;DR: In this paper, it was shown that chi-square tests of simple and of some composite hypotheses for multinomial distributions are inferior to the corresponding likelihood ratio tests when the test size $\alpha_N \to 0$ at a suitable rate.
Abstract: Tests of simple and composite hypotheses for multinomial distributions are considered. It is assumed that the size $\alpha_N$ of a test tends to 0 as the sample size $N$ increases. The main concern of this paper is to substantiate the following proposition: If a given test of size $\alpha_N$ is “sufficiently different” from a likelihood ratio test, then there is a likelihood ratio test of size $\leqq \alpha_N$ which is considerably more powerful than the given test at “most” points in the set of alternatives when $N$ is large enough, provided that $\alpha_N \to 0$ at a suitable rate. In particular, it is shown that chi-square tests of simple and of some composite hypotheses are inferior, in the sense described, to the corresponding likelihood ratio tests. Certain Bayes tests are shown to share the above-mentioned property of a likelihood ratio test.
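For orientation, the two kinds of test statistic compared in this abstract can be computed side by side for a simple multinomial hypothesis. The abstract gives no formulas, so the sketch below assumes the standard textbook forms: Pearson's chi-square statistic and the log-likelihood-ratio statistic $2 \sum_i O_i \ln(O_i / (N p_i))$. The function names and the example counts are hypothetical.

```python
import math


def pearson_chi_square(counts, probs):
    """Pearson statistic: sum of (O_i - N p_i)^2 / (N p_i) over the cells."""
    n = sum(counts)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(counts, probs))


def likelihood_ratio_statistic(counts, probs):
    """Log-likelihood-ratio statistic: 2 * sum of O_i * ln(O_i / (N p_i)).

    Cells with a zero count contribute nothing (0 * ln 0 is taken as 0).
    """
    n = sum(counts)
    return 2.0 * sum(o * math.log(o / (n * p)) for o, p in zip(counts, probs) if o > 0)


if __name__ == "__main__":
    counts = [30, 50, 20]        # hypothetical observed cell counts
    probs = [0.25, 0.5, 0.25]    # cell probabilities under the simple hypothesis
    print(pearson_chi_square(counts, probs))
    print(likelihood_ratio_statistic(counts, probs))
```

Both statistics are zero when the observed counts equal their expected values and grow as the counts depart from them; the abstract's claim concerns how tests built on them compare in power as $N$ grows, which this sketch does not attempt to reproduce.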

419 citations
