Journal ArticleDOI

A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations

01 Dec 1952-Annals of Mathematical Statistics (Institute of Mathematical Statistics)-Vol. 23, Iss: 4, pp 493-507
TL;DR: In this paper, it is shown that each test of the form "reject $H_0$ if $S_n = \sum^n_{j=1} X_j \leqq k$" (to which the likelihood ratio test for fixed sample size can be reduced) has an associated index $\rho$, and that for large samples, a sample of size $n$ with the first test gives about the same probabilities of error as a sample of size $en$ with the second test, where $e = \log \rho_1/\log \rho_2$.
Abstract: In many cases an optimum or computationally convenient test of a simple hypothesis $H_0$ against a simple alternative $H_1$ may be given in the following form. Reject $H_0$ if $S_n = \sum^n_{j=1} X_j \leqq k,$ where $X_1, X_2, \cdots, X_n$ are $n$ independent observations of a chance variable $X$ whose distribution depends on the true hypothesis and where $k$ is some appropriate number. In particular the likelihood ratio test for fixed sample size can be reduced to this form. It is shown that with each test of the above form there is associated an index $\rho$. If $\rho_1$ and $\rho_2$ are the indices corresponding to two alternative tests $e = \log \rho_1/\log \rho_2$ measures the relative efficiency of these tests in the following sense. For large samples, a sample of size $n$ with the first test will give about the same probabilities of error as a sample of size $en$ with the second test. To obtain the above result, use is made of the fact that $P(S_n \leqq na)$ behaves roughly like $m^n$ where $m$ is the minimum value assumed by the moment generating function of $X - a$. It is shown that if $H_0$ and $H_1$ specify probability distributions of $X$ which are very close to each other, one may approximate $\rho$ by assuming that $X$ is normally distributed.
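To make the result concrete, here is a small numerical sketch (an assumed Bernoulli example, not taken from the paper) that computes $m = \min_t E\lbrack e^{t(X-a)}\rbrack$ and checks that the exact tail $P(S_n \leqq na)$ is bounded by $m^n$ and decays at roughly that rate:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

# Illustrative sketch (assumed Bernoulli example, not from the paper):
# Chernoff's key fact is that P(S_n <= n*a) behaves roughly like m**n,
# where m is the minimum of the moment generating function of X - a.
p, a = 0.5, 0.4          # X ~ Bernoulli(p), threshold a < E[X]

def mgf_shifted(t):
    """E[exp(t*(X - a))] for X ~ Bernoulli(p)."""
    return np.exp(-t * a) * ((1.0 - p) + p * np.exp(t))

m = minimize_scalar(mgf_shifted, bounds=(-30.0, 30.0), method="bounded").fun
print(f"m = min_t E[exp(t(X - a))] = {m:.6f}, so -log m = {-np.log(m):.6f}")

for n in (200, 2000, 20000):
    k = int(np.floor(n * a))
    tail = binom.cdf(k, n, p)   # exact P(S_n <= n*a)
    print(f"n={n:6d}  P(S_n <= na) = {tail:.3e}  <=  m^n = {m**n:.3e};  "
          f"-log P / n = {-np.log(tail)/n:.4f}")
```

In this Bernoulli case the minimized moment generating function reduces to the familiar relative-entropy rate, and the printed exponents $-\log P(S_n \leqq na)/n$ approach $-\log m$ as $n$ grows.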
Citations
Journal ArticleDOI
01 Jul 2012
TL;DR: This paper verifies that the two definitions of the frequent itemset are tightly connected and can be unified when the data size is large enough, and it provides baseline implementations of eight existing representative algorithms, testing their performance fairly with uniform measures.
Abstract: In recent years, due to the wide applications of uncertain data, mining frequent itemsets over uncertain databases has attracted much attention. In uncertain databases, the support of an itemset is a random variable instead of a fixed occurrence count of this itemset. Thus, unlike the corresponding problem in deterministic databases, where the frequent itemset has a unique definition, the frequent itemset under uncertain environments has two different definitions so far. The first definition, referred to as the expected support-based frequent itemset, employs the expectation of the support of an itemset to measure whether this itemset is frequent. The second definition, referred to as the probabilistic frequent itemset, uses the probability of the support of an itemset to measure its frequency. Thus, existing work on mining frequent itemsets over uncertain databases is divided into two different groups, and no study has been conducted to comprehensively compare the two definitions. In addition, since no uniform experimental platform exists, current solutions for the same definition even generate inconsistent results. In this paper, we first aim to clarify the relationship between the two definitions. Through extensive experiments, we verify that the two definitions have a tight connection and can be unified when the size of the data is large enough. Second, we provide baseline implementations of eight existing representative algorithms and test their performance fairly with uniform measures. Finally, according to fair tests over many different benchmark data sets, we clarify several existing inconsistent conclusions and discuss some new findings.
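As a concrete illustration of the two definitions (a hedged sketch with toy per-transaction probabilities, not data or code from the paper): the support of an itemset in an uncertain database follows a Poisson binomial distribution, so both the expected support and the probability of being frequent can be computed directly.

```python
import numpy as np

# Hedged sketch of the two frequentness notions described above (the toy
# probabilities are illustrative, not from the paper). In an uncertain
# database, transaction j contains the itemset with probability q[j], so the
# support is Poisson-binomial distributed.

def support_pmf(q):
    """Exact pmf of the Poisson binomial support via dynamic programming."""
    pmf = np.array([1.0])
    for qj in q:
        pmf = np.convolve(pmf, [1.0 - qj, qj])
    return pmf                            # pmf[s] = P(support == s)

q = [0.9, 0.8, 0.4, 0.7, 0.1, 0.6]        # toy per-transaction probabilities
minsup, tau = 3, 0.8                      # support threshold, probability threshold

pmf = support_pmf(q)
expected_support = sum(q)                 # expected-support-based definition
freq_prob = pmf[minsup:].sum()            # probabilistic definition: P(support >= minsup)

print(f"expected support = {expected_support:.2f} (frequent iff >= {minsup})")
print(f"P(support >= {minsup}) = {freq_prob:.3f} (probabilistic frequent iff >= {tau})")
```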

156 citations


Cites background from "A Measure of Asymptotic Efficiency ..."

  • ...Because the support of an itemset follows a Poisson binomial distribution, the Chernoff bound [16] is a well-known tight upper bound on the frequent probability.... (a short sketch of this bound appears after the excerpt)

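A minimal sketch of the bound mentioned in the excerpt, assuming the same toy per-transaction probabilities as in the earlier sketch (illustrative numbers, not from the cited paper): the Chernoff bound gives an upper bound on $P(\text{support} \geq s)$, which can be compared against the exact Poisson binomial tail.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Upper-bound P(support >= s) for a Poisson binomial support via the Chernoff bound:
#   P(support >= s) <= min_{t>0} exp(-t*s) * prod_j (1 - q[j] + q[j]*exp(t)).
q = np.array([0.9, 0.8, 0.4, 0.7, 0.1, 0.6])   # toy per-transaction probabilities
s = 5                                          # support threshold

def log_bound(t):
    return -t * s + np.sum(np.log(1.0 - q + q * np.exp(t)))

res = minimize_scalar(log_bound, bounds=(1e-6, 30.0), method="bounded")
print(f"Chernoff upper bound on P(support >= {s}): {np.exp(res.fun):.4f}")

# Exact value for comparison (same dynamic program as in the earlier sketch).
pmf = np.array([1.0])
for qj in q:
    pmf = np.convolve(pmf, [1.0 - qj, qj])
print(f"exact P(support >= {s}): {pmf[s:].sum():.4f}")
```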

Proceedings Article
01 Jan 1987
TL;DR: Three very different formal definitions of security for public-key cryptosystems have been proposed, and all of them are proved to be equivalent, providing evidence that the right formalization of the notion of security has been reached.
Abstract: Three very different formal definitions of security for public-key cryptosystems have been proposed--two by Goldwasser and Micali and one by Yao. We prove all of them to be equivalent. This equivalence provides evidence that the right formalization of the notion of security has been reached.

156 citations

Journal ArticleDOI
TL;DR: It is demonstrated that if one of the probability measures of the two classes is not known, it is still possible to define a universal discriminant function which performs as well as the optimal (likelihood ratio) discriminant function (which can be evaluated only if the probability measures of the two classes are available).
Abstract: Classification with empirically observed statistics is studied for finite alphabet sources. Efficient universal discriminant functions are described and shown to be related to universal data compression. It is demonstrated that if one of the probability measures of the two classes is not known, it is still possible to define a universal discriminant function which performs as well as the optimal (likelihood ratio) discriminant function (which can be evaluated only if the probability measures of the two classes are available). If neither of the probability measures is available but training vectors from at least one of the two classes are, it is demonstrated that no discriminant function can perform efficiently if the length of the training vectors does not grow at least linearly with the length of the classified vector. A universal discriminant function is introduced and shown to perform efficiently when the length of the training vectors grows linearly with the length of the classified sequence, in the sense that it yields an error exponent that is arbitrarily close to that of the optimal discriminant function.
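A deliberately simplified sketch of the idea (assuming a memoryless source and plug-in empirical distributions; the paper's actual universal discriminant function is tied to universal data compression and is not reproduced here): classify a test sequence by the smaller empirical cross-entropy under distributions estimated from the training vectors.

```python
import numpy as np

# Simplified i.i.d. stand-in for a universal discriminant function: the unknown
# class distributions are replaced by empirical distributions learned from
# training sequences, and a test sequence is assigned to the class giving it
# the higher estimated log-likelihood (lower empirical cross-entropy).

def empirical_dist(seq, alphabet_size, smoothing=1.0):
    counts = np.bincount(seq, minlength=alphabet_size) + smoothing  # Laplace smoothing
    return counts / counts.sum()

def cross_entropy(seq, dist):
    return -np.mean(np.log(dist[seq]))

rng = np.random.default_rng(1)
A = 4                                     # alphabet size
P0 = np.array([0.4, 0.3, 0.2, 0.1])       # true sources (unknown to the classifier)
P1 = np.array([0.1, 0.2, 0.3, 0.4])

train0 = rng.choice(A, size=2000, p=P0)   # training vectors
train1 = rng.choice(A, size=2000, p=P1)
test = rng.choice(A, size=500, p=P1)      # test vector actually drawn from class 1

Q0 = empirical_dist(train0, A)
Q1 = empirical_dist(train1, A)
decision = 0 if cross_entropy(test, Q0) < cross_entropy(test, Q1) else 1
print("classified as class", decision)
```

Consistent with the abstract, the training vectors here are a few times longer than the classified sequence; with much shorter training vectors the empirical distributions become too noisy for reliable discrimination.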

155 citations

01 Jan 2011
TL;DR: The main objective of this paper is to provide a broad perspective on this area of research known as "probabilistic robust control", and to address recent advances in a systematic manner.
Abstract: A novel approach based on probability and randomization has emerged to synergize with the standard deterministic methods for control of systems with uncertainty. The main objective of this paper is to provide a broad perspective on this area of research known as "probabilistic robust control", and to address recent advances in a systematic manner. The focal point is on design methods, based on the interplay between uncertainty randomization and convex optimization, and on the illustration of specific control applications.

154 citations


Additional excerpts

  • ...Hence, if we set a priori the accuracy $\epsilon \in (0, 1)$ and a confidence level $\delta \in (0, 1)$, that is, we set $2e^{-2N\epsilon^2} \leq \delta$, then we can "invert" this inequality for $N$ and obtain the so-called (additive) Chernoff bound (Chernoff, 1952) for the sample complexity $N \geq \frac{1}{2\epsilon^2} \log \frac{2}{\delta}$.... (a sketch computing this bound appears after the excerpt)

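A minimal sketch of the sample-complexity calculation quoted above (the "performance spec" and parameter values are illustrative placeholders, not from the cited survey): choose $N \geq \frac{1}{2\epsilon^2}\log\frac{2}{\delta}$ and estimate the probability of meeting a specification from $N$ random uncertainty samples.

```python
import math
import numpy as np

# Additive Chernoff (Hoeffding) sample complexity: to estimate a probability
# within additive accuracy eps with confidence 1 - delta, it suffices to take
#   N >= (1 / (2 * eps**2)) * log(2 / delta).
eps, delta = 0.02, 1e-3
N = math.ceil(math.log(2.0 / delta) / (2.0 * eps**2))
print(f"required samples: N = {N}")

# Randomized check with a toy "performance spec": draw N uncertainty samples and
# estimate the probability that the spec holds; the estimate is within eps of
# the true probability with confidence at least 1 - delta.
rng = np.random.default_rng(0)
uncertainty = rng.uniform(-1.0, 1.0, size=N)   # sampled uncertain parameter
spec_holds = np.abs(uncertainty) < 0.8         # illustrative stand-in specification
print(f"estimated probability of meeting the spec: {spec_holds.mean():.4f}")
```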

Journal ArticleDOI
TL;DR: T-estimators ("T" for tests), as discussed by the authors, are a family of model-selection-oriented estimators constructed by deriving estimators from families of tests, building on earlier ideas of Le Cam and Birge.
Abstract: This paper is devoted to the definition and study of a family of model-selection-oriented estimators that we shall call T-estimators (“T” for tests). Their construction is based on former ideas about deriving estimators from some families of tests due to Le Cam [L.M. Le Cam, Convergence of estimates under dimensionality restrictions, Ann. Statist. 1 (1973) 38–53 and L.M. Le Cam, On local and global properties in the theory of asymptotic normality of experiments, in: M. Puri (Ed.), Stochastic Processes and Related Topics, vol. 1, Academic Press, New York, 1975, pp. 13–54] and Birge [L. Birge, Approximation dans les espaces metriques et theorie de l'estimation, Z. Wahrscheinlichkeitstheorie Verw. Gebiete 65 (1983) 181–237, L. Birge, Sur un theoreme de minimax et son application aux tests, Probab. Math. Statist. 3 (1984) 259–282 and L. Birge, Stabilite et instabilite du risque minimax pour des variables independantes equidistribuees, Ann. Inst. H. Poincare Sect. B 20 (1984) 201–223] and about complexity-based model selection from Barron and Cover [A.R. Barron, T.M. Cover, Minimum complexity density estimation, IEEE Trans. Inform. Theory 37 (1991) 1034–1054]. It is well known that maximum likelihood estimators and, more generally, minimum contrast estimators suffer from various weaknesses, and so do their penalized versions. In particular, they are not robust and they require restrictive assumptions on both the models and the underlying parameter set to work correctly. We propose an alternative construction, which derives an estimator from many simultaneous tests between some probability balls in a suitable metric space. In many cases, although not in all, it results in a penalized M-estimator restricted to a suitable countable set of parameters. On the one hand, this construction should be considered as a theoretical rather than a practical tool because of its high computational complexity. On the other hand, it solves many of the previously mentioned difficulties provided that the tests involved in our construction exist, which is the case for various statistical frameworks including density estimation from i.i.d. variables or estimating the mean of a Gaussian sequence with a known variance. For all such frameworks, the robustness properties of our estimators allow us to deal with minimax estimation and model selection in a unified way, since bounding the minimax risk amounts to performing our method with a single, well-chosen model. This results, for those frameworks, in simple bounds for the minimax risk solely based on some metric properties of the parameter space. Moreover, the method applies to various statistical frameworks and can handle essentially all types of models, linear or not, parametric and non-parametric, simultaneously. It also provides a simple way of aggregating preliminary estimators. From these viewpoints, it is much more flexible than traditional methods and allows us to derive some results that do not presently seem to be accessible to them.

153 citations


Cites background from "A Measure of Asymptotic Efficiency ..."

  • ...More precise results in this direction can be found in Chernoff [25]....


References