Journal ArticleDOI

Convergence Rates for Empirical Bayes Two-Action Problems II. Continuous Case

01 Jun 1972-Annals of Mathematical Statistics (Institute of Mathematical Statistics)-Vol. 43, Iss: 3, pp 934-947
TL;DR: This article considers a sequence of two-action decision problems in which each observation has a probability density function of exponential type with parameter $\lambda$, where $\lambda$ is selected independently for each problem according to an unknown prior distribution $G(\lambda)$.
Abstract: A sequence of decision problems is considered where for each problem the observation has a probability density function of exponential type with parameter $\lambda$, where $\lambda$ is selected independently for each problem according to an unknown prior distribution $G(\lambda)$. It is supposed that in each of the problems, one of two possible actions (e.g., 'accept' or 'reject') must be taken. Under various assumptions, reasonably sharp upper bounds are found for the rate at which the risk of the $n$th problem approaches the smallest possible risk for certain refinements of the standard empirical Bayes procedures. For suitably chosen procedures, under situations likely to occur in practice, rates faster than $n^{-1 + \epsilon}$ may be obtained for arbitrarily small $\epsilon > 0$. Arbitrarily slow rates can occur in pathological situations. (Author)


Citations
Journal ArticleDOI
TL;DR: In this article, the authors have reviewed and explored the non-parametric density estimation approach for analysing various econometric functionals, and some limitations of the nonparametric approach are also examined, and potential future areas of applied and theoretical research have been indicated.
Abstract: In this paper we have reviewed and explored the non-parametric density estimation approach for analysing various econometric functionals. The applications of density estimation have been emphasized in the specification, estimation, and testing problems arising in econometrics. Some limitations of the non-parametric approach are also examined, and potential future areas of applied and theoretical research have been indicated.

116 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide a brief description of the history of compound decision theory and empirical Bayes (EB) and its impact and a number of important related developments in statistical decision making.
Abstract: 1. Introduction. Compound decision theory and empirical Bayes methodology, acclaimed as "two breakthroughs" by Neyman (1962), are the most important contributions of Herbert Robbins to statistics. The purpose of this paper is to provide a brief description of his work in these two intimately connected fields, its impact and a number of important related developments. Robbins introduced compound decision theory in 1950 at the Second Berkeley Symposium on Mathematical Statistics and Probability. Compound decision theory concerns a sequence of independent statistical decision problems of the same form. Its basic thrust is the possibility of gaining substantial reduction of total risk by allowing statistical procedures for the individual component problems to depend on the observations in the entire sequence. It demonstrates, against naive intuition, that stochastically independent experiments are not necessarily "noninformative" to each other in statistical decision making. Five years later, at the Third Berkeley Symposium, Robbins developed empirical Bayes (EB) theory. EB concerns experiments in which the unknown parameters are i.i.d. random variables with an unknown common prior distribution. EB methodologies provide statistical procedures which approximate the ideal Bayes rule for the true model, so that the goal of the Bayesian inference is nearly achieved without specifying a prior. EB procedures usually perform well conditionally on the unknown parameters and thus provide solutions to compound decision problems. EB methods also find applications in problems with more complex structures and for inference about multivariate and infinite-dimensional parameters in a single experiment. Compound decision theory and EB have had great influence on modern statistical thinking and practice. Since Robbins' pioneering papers, EB methods have been applied in a wide range of paradigms and to numerous real-life problems; cf.

94 citations


Cites methods from "Convergence Rates for Empirical Bay..."

  • ...…(1967) on methods based on estimates of the prior, Meeden (1972) on admissibility, Martz and Krutchkoff (1969) and Wind (1973) on regression, Johns and Van Ryzin (1971, 1972), Singh (1979) and Zhang (1997) on rates of convergence and asymptotic minimaxity, O’Bryan (1976) on problems with…...


Journal ArticleDOI
TL;DR: In this paper, an empirical Bayes (EB) estimator for the component problem is proposed and the best possible speed at which these estimators converge to the minimum EB risk is investigated.
Abstract: Asymptotically optimal (a.o.) empirical Bayes (EB) estimators are proposed. Speeds and the best possible speed at which these estimators are a.o. are investigated. The underlying component problem is the squared error loss estimation of $\theta$ based on an observation $X$ whose conditional (on $\theta$) $\operatorname{pdf}$ is of the form $u(x)C(\theta)\exp(\theta x)$. The function $u$ could have infinitely many discontinuities; $\theta$ is distributed according to an unknown and unspecified $G$ with support in $\Theta$, and $\Theta$ could be unbounded. Using $n$ independent past experiences of the component problem, EB estimators $\phi_n$ for the present problem are exhibited for each integer $r > 1$. The risks $R(\phi_n, G)$ due to $\phi_n$ are shown to converge to the minimum Bayes risk $R(G)$. In particular, for each $\delta$ in $\lbrack r^{-1}, 1\rbrack$, sufficient conditions are given under which $c_1n^{-2(r - 1)/(1 + 2r)} \leqslant R(\phi_n, G) - R(G) \leqslant c_2n^{-2(\delta r - 1)/(1 + 2r)}$, where $c_1$ and $c_2$ are positive constants. The right hand-side inequality holds uniformly in $G$ satisfying certain conditions, while the other holds at all degenerate $G$ and for all large $n$. (Thus with $\delta$ close to one, $\phi_n$ achieves almost the exact rate.) Examples of exponential families such as normal, gamma and one with $\operatorname{pdf's}$ having infinitely many discontinuities are given where the conditions for the above inequalities are satisfied uniformly in $G$ with $\int|\theta|^{2r\delta}dG(\theta) < \infty$.
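
The two-sided bound in this abstract can be explored numerically. The following is a minimal sketch, not code from the paper: the function names `lower_exponent` and `upper_exponent` are hypothetical, and the only inputs taken from the abstract are the exponents $2(r - 1)/(1 + 2r)$ and $2(\delta r - 1)/(1 + 2r)$ with $\delta \in [r^{-1}, 1]$. It uses exact rational arithmetic to show that the two exponents coincide at $\delta = 1$ and that both approach 1 as $r$ grows.

```python
from fractions import Fraction


def lower_exponent(r):
    """Exponent a in the lower bound c1 * n**(-a), a = 2(r - 1)/(1 + 2r)."""
    if r <= 1:
        raise ValueError("the abstract assumes an integer r > 1")
    return Fraction(2 * (r - 1), 1 + 2 * r)


def upper_exponent(r, delta):
    """Exponent b in the upper bound c2 * n**(-b), b = 2(delta*r - 1)/(1 + 2r)."""
    delta = Fraction(delta)
    if r <= 1 or not (Fraction(1, r) <= delta <= 1):
        raise ValueError("need integer r > 1 and delta in [1/r, 1]")
    return 2 * (delta * r - 1) / (1 + 2 * r)


if __name__ == "__main__":
    # Print both exponents for a few values of r at delta = 1.
    for r in (2, 5, 10, 100):
        print(r, lower_exponent(r), upper_exponent(r, 1))
```

At $\delta = 1$ the upper and lower exponents agree, which is the sense in which the abstract says the estimator "achieves almost the exact rate" for $\delta$ close to one.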

79 citations

Posted ContentDOI
TL;DR: In this paper, a sequence of independent random vectors $\{(X_i, \theta_i)\}$ is considered, where $X_i$, conditional on $\theta_i$, has a probability density of the form $f(x|\theta_i) = u(x)C(\theta_i)\exp(-x/\theta_i)$.
Abstract: Let $\{(X_i, \theta_i)\}$ be a sequence of independent random vectors where $X_i$, conditional on $\theta_i$, has the probability density of the form $f(x|\theta_i) = u(x)C(\theta_i)\exp(-x/\theta_i)$ and the unobservable $\theta_i$ are i.i.d. according to an unknown $G$ in some class $\mathcal{G}$ of prior distributions on $\Theta$, a subset of $\{\theta > 0 \mid C(\theta) = (\int u(x)\exp(-x/\theta)\,dx)^{-1} > 0\}$. For a $S(X_1, \ldots, X_n, X_{n+1})$-measurable function $\phi_n$, let $R_n = E(\phi_n - \theta_{n+1})^2$ denote the Bayes risk of $\phi_n$ and let $R(G)$ denote the infimum Bayes risk with respect to $G$. For each integer $s > 1$ we exhibit a class of $S(X_1, \ldots, X_n, X_{n+1})$-measurable functions $\phi_n$ such that for $\delta$ in $[s^{-1}, 1]$, $c_0 n^{-2s/(1 + 2s)} \leqslant R_n(\phi_n, G) - R(G) \leqslant c_1 n^{-2(s\delta - 1)/(1 + 2s)}$ under certain conditions on $u$ and $G$. No assumption on the form or smoothness of $u$ is made, however. Examples of functions $u$, including one with infinitely many discontinuities, are given for which our conditions reduce to some moment conditions on $G$. When $\Theta$ is bounded, for each integer $s > 1$, $S(X_1, \ldots, X_n, X_{n+1})$-measurable functions $\phi_n$ are exhibited such that for $\delta$ in $[2/s, 1]$, $c_0' n^{-2s/(1 + 2s)} \leqslant R_n(\phi_n, G) - R(G) \leqslant c_1' n^{-2s\delta/(1 + 2s)}$. Examples of functions $u$ and class $\mathcal{G}$ are given where the above lower and upper bounds are achieved.
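
To see why the bounds can be achieved when $\Theta$ is bounded but not in the general case, it helps to compare the rate exponents directly. The sketch below is illustrative only: the function name `rate_exponents` and the `bounded` flag are hypothetical, and the formulas are the exponents quoted in the abstract, namely lower exponent $2s/(1 + 2s)$ in both cases, upper exponent $2(s\delta - 1)/(1 + 2s)$ for $\delta \in [s^{-1}, 1]$ in general, and $2s\delta/(1 + 2s)$ for $\delta \in [2/s, 1]$ when $\Theta$ is bounded.

```python
from fractions import Fraction


def rate_exponents(s, delta, bounded):
    """Return (lower, upper) exponents a, b with c0*n**(-a) <= excess risk <= c1*n**(-b)."""
    delta = Fraction(delta)
    delta_min = Fraction(2, s) if bounded else Fraction(1, s)
    if s <= 1 or not (delta_min <= delta <= 1):
        raise ValueError("need integer s > 1 and delta in the stated interval")
    lower = Fraction(2 * s, 1 + 2 * s)
    if bounded:
        upper = 2 * s * delta / (1 + 2 * s)        # bounded parameter space
    else:
        upper = 2 * (s * delta - 1) / (1 + 2 * s)  # general case
    return lower, upper


if __name__ == "__main__":
    # Compare the two cases at delta = 1 for s = 3.
    print(rate_exponents(3, 1, bounded=False))
    print(rate_exponents(3, 1, bounded=True))
```

With $\delta = 1$ the bounded-case upper exponent equals the lower exponent, while in the general case a gap of $2/(1 + 2s)$ remains.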

28 citations


Cites background from "Convergence Rates for Empirical Bay..."

  • ...In empirical Bayes (EB) context (as introduced by Robbins (1955) and later developed in great detail by Johns (1957), Robbins (1963, 1964), Samuel (1963), Johns and Van Ryzin (1971, 1972), among others), one considers a sequence of statistical problems having the same generic structure being possessed by what is called the component problem....


References
Journal ArticleDOI
TL;DR: In this paper, the problems of estimating a probability density function and of determining its mode are discussed. Only estimates which are consistent and asymptotically normal are constructed.
Abstract: Given a sequence of independent identically distributed random variables with a common probability density function, the problems of the estimation of a probability density function and of determining the mode of a probability density function are discussed. Only estimates which are consistent and asymptotically normal are constructed. (Author)
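
One common way to build a density estimate from an i.i.d. sample, and to read off a mode from it, is a kernel density estimator. The sketch below is a minimal illustration under stated assumptions, not the construction from the paper: the Gaussian kernel, the bandwidth `h`, the grid search for the mode, and the function names `kde` and `estimate_mode` are all choices made here for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(sample, x, h):
    """Kernel density estimate at x: (1 / (n*h)) * sum_i K((x - X_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (len(sample) * h)


def estimate_mode(sample, h, grid_size=201):
    """Estimate the mode as the grid point that maximizes the density estimate."""
    lo, hi = min(sample), max(sample)
    grid = [lo + (hi - lo) * k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda x: kde(sample, x, h))


if __name__ == "__main__":
    data = [0.1, 0.2, 0.2, 0.3, 1.5]
    print(estimate_mode(data, h=0.2))
```

The bandwidth controls the trade-off between bias and variance, so in practice it would be chosen to shrink with the sample size rather than fixed by hand.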

10,114 citations

Book
01 Jan 1959
TL;DR: This book covers the general decision problem, the probability background, uniformly most powerful tests, unbiasedness (theory and first applications, and applications to normal distributions), invariance, and linear hypotheses.
Abstract: The General Decision Problem.- The Probability Background.- Uniformly Most Powerful Tests.- Unbiasedness: Theory and First Applications.- Unbiasedness: Applications to Normal Distributions.- Invariance.- Linear Hypotheses.- The Minimax Principle.- Multiple Testing and Simultaneous Inference.- Conditional Inference.- Basic Large Sample Theory.- Quadratic Mean Differentiable Families.- Large Sample Optimality.- Testing Goodness of Fit.- General Large Sample Methods.

6,480 citations

Journal ArticleDOI
01 Jan 1963

46 citations

Journal ArticleDOI
TL;DR: In this article, a sequence of decision problems is considered where for each problem the observation has a discrete probability function of the form $p(x) = h(x)\beta(\lambda)\lambda^x$, $x = 0, 1, 2, \ldots$, and where $\lambda$ is selected independently for each problem according to an unknown prior distribution $G(\lambda)$.
Abstract: A sequence of decision problems is considered where for each problem the observation has a discrete probability function of the form $p(x) = h(x)\beta(\lambda)\lambda^x$, $x = 0, 1, 2, \ldots$, and where $\lambda$ is selected independently for each problem according to an unknown prior distribution $G(\lambda)$. It is supposed that for each problem one of two possible actions (e.g., 'accept' or 'reject') must be selected. Under various assumptions about $h(x)$ and $G(\lambda)$ the rate at which the risk of the $n$th problem approaches the smallest possible risk is determined for standard empirical Bayes procedures. It is shown that for most practical situations, the rate of convergence to 'optimality' will be at least as fast as $L(n)/n$ where $L(n)$ is a slowly varying function (e.g., $\log n$). The rate cannot be faster than $1/n$ and this exact rate is achieved in some cases. Arbitrarily slow rates will occur in certain pathological situations. (Author)
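
A small sketch may help make the two-action setting concrete. It rests on assumptions that the abstract does not state: the loss is taken to be linear in $\lambda - c$ for a known cutoff `c`, the family is Poisson ($h(x) = 1/x!$, so $h(x)/h(x+1) = x + 1$), and the marginal probabilities are estimated by raw empirical frequencies of the past observations. For this family the posterior mean is $E[\lambda \mid x] = \frac{h(x)}{h(x+1)} \cdot \frac{p_G(x+1)}{p_G(x)}$, so under the assumed loss the Bayes rule depends on the sign of $\frac{h(x)}{h(x+1)} p_G(x+1) - c\,p_G(x)$. The function name `eb_two_action` and its parameters are hypothetical.

```python
from collections import Counter


def eb_two_action(past, x, c, h_ratio=lambda k: k + 1):
    """Empirical Bayes two-action rule (illustrative sketch).

    past    : earlier observations X_1..X_n (non-negative integers)
    x       : current observation
    c       : cutoff separating the two actions (assumed linear loss in lambda - c)
    h_ratio : k -> h(k)/h(k+1); the default k + 1 is the Poisson case (assumption)
    Returns (alpha_hat, action), where action is "reject" if alpha_hat > 0.
    """
    counts = Counter(past)
    n = len(past)
    p_x = counts[x] / n           # empirical estimate of the marginal p_G(x)
    p_x_next = counts[x + 1] / n  # empirical estimate of the marginal p_G(x + 1)
    alpha_hat = h_ratio(x) * p_x_next - c * p_x
    return alpha_hat, ("reject" if alpha_hat > 0 else "accept")


if __name__ == "__main__":
    history = [0, 0, 1, 1, 1, 2, 2, 3]
    print(eb_two_action(history, x=1, c=1.0))
```

Here "reject" stands for deciding that $\lambda$ exceeds the cutoff; as the number of past observations grows, the empirical frequencies approach the true marginals and the rule approaches the Bayes rule for the unknown $G$.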

40 citations