
Showing papers in "Annals of Statistics in 1991"


Journal ArticleDOI
TL;DR: In this article, a new method is presented for flexible regression modeling of high dimensional data, which takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data.
Abstract: A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data. This procedure is motivated by the recursive partitioning approach to regression and shares its attractive properties. Unlike recursive partitioning, however, this method produces continuous models with continuous derivatives. It has more power and flexibility to model relationships that are nearly additive or involve interactions in at most a few variables. In addition, the model can be represented in a form that separately identifies the additive contributions and those associated with the different multivariable interactions.

6,651 citations
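
To make the forward model-building step concrete, here is a minimal Python sketch of greedy selection of hinge (truncated linear spline) pairs, assuming degree-1 terms only, candidate knots at observed data values, and no backward pruning or GCV step; the function names and settings are illustrative, not the paper's implementation.

    import numpy as np

    def hinge(x, t, sign):
        # Truncated linear spline basis function: max(0, +/-(x - t))
        return np.maximum(0.0, sign * (x - t))

    def forward_pass(X, y, max_terms=4):
        """Greedy forward selection of hinge pairs (degree-1 sketch)."""
        n, d = X.shape
        B = [np.ones(n)]                      # current basis: intercept
        terms = []
        for _ in range(max_terms):
            best = None
            for j in range(d):                # candidate variable
                for t in np.unique(X[:, j]):  # candidate knot at each data value
                    cand = np.column_stack(B + [hinge(X[:, j], t, +1),
                                                hinge(X[:, j], t, -1)])
                    beta, *_ = np.linalg.lstsq(cand, y, rcond=None)
                    resid = y - cand @ beta
                    err = resid @ resid
                    if best is None or err < best[0]:
                        best = (err, j, t)
            _, j, t = best
            B += [hinge(X[:, j], t, +1), hinge(X[:, j], t, -1)]
            terms.append((j, t))
        beta, *_ = np.linalg.lstsq(np.column_stack(B), y, rcond=None)
        return terms, beta

    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(200, 2))
    y = np.maximum(0, X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, 200)
    print(forward_pass(X, y)[0])   # selected (variable, knot) pairs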


Journal ArticleDOI
TL;DR: In this paper, it was shown that the difficulty of deconvolution depends on the smoothness of the error distribution: the smoother the error distribution, the harder it is to estimate the density of the underlying random variable.
Abstract: Deconvolution problems arise in a variety of situations in statistics. An interesting problem is to estimate the density $f$ of a random variable $X$ based on $n$ i.i.d. observations from $Y = X + \varepsilon$, where $\varepsilon$ is a measurement error with a known distribution. In this paper, the effect of errors in variables of nonparametric deconvolution is examined. Insights are gained by showing that the difficulty of deconvolution depends on the smoothness of error distributions: the smoother, the harder. In fact, there are two types of optimal rates of convergence according to whether the error distribution is ordinary smooth or supersmooth. It is shown that optimal rates of convergence can be achieved by deconvolution kernel density estimators.

945 citations
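
A small Python sketch of a deconvolution kernel density estimator of the kind analyzed here, assuming a Gaussian kernel and a known Laplace ("ordinary smooth") error characteristic function; the grid sizes and bandwidth are illustrative.

    import numpy as np

    def deconv_kde(y, x_grid, h, phi_err, t_max=50.0, m=2001):
        """Deconvolution kernel density estimate via Fourier inversion."""
        t = np.linspace(-t_max, t_max, m)
        # empirical characteristic function of the contaminated sample Y
        ecf = np.exp(1j * np.outer(t, y)).mean(axis=1)
        phi_K = np.exp(-0.5 * (h * t) ** 2)      # cf of Gaussian kernel at ht
        integrand = ecf * phi_K / phi_err(t)
        est = np.trapz(np.exp(-1j * np.outer(x_grid, t)) * integrand,
                       t, axis=1).real / (2 * np.pi)
        return np.maximum(est, 0.0)              # clip small negative ripples

    rng = np.random.default_rng(1)
    n, sigma = 500, 0.4
    x = rng.normal(0, 1, n)                          # latent X ~ N(0,1)
    y = x + rng.laplace(0, sigma / np.sqrt(2), n)    # Laplace measurement error
    phi_laplace = lambda t: 1.0 / (1.0 + 0.5 * sigma**2 * t**2)
    grid = np.linspace(-3, 3, 121)
    fhat = deconv_kde(y, grid, h=0.3, phi_err=phi_laplace)
    print(grid[np.argmax(fhat)])    # mode recovered near 0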


Journal ArticleDOI
TL;DR: In this article, the logically consistent rules are determined for selecting a vector from any feasible set defined by linear constraints, when the permissible vectors are all $n$-vectors, those with positive components, or the probability vectors.
Abstract: An attempt is made to determine the logically consistent rules for selecting a vector from any feasible set defined by linear constraints, when either all $n$-vectors or those with positive components or the probability vectors are permissible. Some basic postulates are satisfied if and only if the selection rule is to minimize a certain function which, if a "prior guess" is available, is a measure of distance from the prior guess. Two further natural postulates restrict the permissible distances to the author's $f$-divergences and Bregman's divergences, respectively. As corollaries, axiomatic characterizations of the methods of least squares and minimum discrimination information are arrived at. Alternatively, the latter are also characterized by a postulate of composition consistency. As a special case, a derivation of the method of maximum entropy from a small set of natural axioms is obtained.

850 citations
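
A hedged numerical illustration of selection by minimum Kullback-Leibler distance from a prior guess under linear constraints, specializing to maximum entropy when the prior is uniform; the solver choice and tolerances are illustrative, not part of the paper.

    import numpy as np
    from scipy.optimize import minimize

    def min_discrimination_info(q, A, b):
        """Probability vector closest to prior guess q in KL divergence,
        subject to the linear constraints A p = b."""
        n = len(q)
        kl = lambda p: np.sum(p * np.log(p / q))
        cons = [{"type": "eq", "fun": lambda p: A @ p - b},
                {"type": "eq", "fun": lambda p: p.sum() - 1.0}]
        res = minimize(kl, np.full(n, 1.0 / n), constraints=cons,
                       bounds=[(1e-12, 1.0)] * n, method="SLSQP")
        return res.x

    # classic maximum entropy example: a die constrained to have mean 4.5
    q = np.full(6, 1.0 / 6.0)                 # uniform prior guess
    A = np.arange(1.0, 7.0).reshape(1, -1)    # constraint: E[face] = 4.5
    p = min_discrimination_info(q, A, np.array([4.5]))
    print(p.round(4))    # exponentially tilted, as max-entropy theory predicts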


Journal ArticleDOI
TL;DR: In this article, it is shown that when some functionals of the distribution of the data are known, one can get sharper inferences on other functionals by imposing the known values as constraints on the optimization.
Abstract: Empirical likelihood is a nonparametric method of inference. It has sampling properties similar to the bootstrap, but where the bootstrap uses resampling, it profiles a multinomial likelihood supported on the sample. Its properties in i.i.d. settings have been investigated in works by Owen, by Hall and by DiCiccio, Hall and Romano. This article extends the method to regression problems. Fixed and random regressors are considered, as are robust and heteroscedastic regressions. To make the extension, three variations on the original idea are considered. It is shown that when some functionals of the distribution of the data are known, one can get sharper inferences on other functionals by imposing the known values as constraints on the optimization. The result is first order equivalent to conditioning on a sample value of the known functional. The use of a Euclidean alternative to the likelihood function is investigated. A triangular array version of the empirical likelihood theorem is given. The one-way ANOVA and heteroscedastic regression models are considered in detail. An example is given in which inferences are drawn on the parameters of both the regression function and the conditional variance model.

704 citations
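
The profiling device at the core of empirical likelihood, sketched for the simplest case of a scalar mean; the paper's regression and constrained versions build on the same Lagrange-multiplier computation. Helper names are illustrative.

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import chi2

    def neg2_log_el(x, mu):
        """-2 log empirical likelihood ratio for the mean (Owen's profile)."""
        z = x - mu
        if z.min() >= 0 or z.max() <= 0:
            return np.inf            # mu outside the convex hull of the data
        # weights w_i = 1/(n(1 + lam*z_i)) must stay positive
        g = lambda lam: np.mean(z / (1.0 + lam * z))
        lam = brentq(g, -1.0 / z.max() + 1e-10, -1.0 / z.min() - 1e-10)
        return 2.0 * np.sum(np.log1p(lam * z))

    rng = np.random.default_rng(2)
    x = rng.exponential(1.0, 80)
    stat = neg2_log_el(x, mu=1.0)
    print(stat, stat <= chi2.ppf(0.95, df=1))   # chi-square(1) calibration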


Journal ArticleDOI
TL;DR: In this article, the authors present a general statistical model for data coarsening, which includes as special cases rounded, heaped, censored, partially categorized and missing data, and establish simple conditions under which the possible stochastic nature of the coarsening mechanism can be ignored when drawing Bayesian and likelihood inferences and thus the data can be validly treated as grouped data.
Abstract: We present a general statistical model for data coarsening, which includes as special cases rounded, heaped, censored, partially categorized and missing data. Formally, with coarse data, observations are made not in the sample space of the random variable of interest, but rather in its power set. Grouping is a special case in which the degree of coarsening is known and nonstochastic. We establish simple conditions under which the possible stochastic nature of the coarsening mechanism can be ignored when drawing Bayesian and likelihood inferences and thus the data can be validly treated as grouped data. The conditions are that the data be coarsened at random, a generalization of the condition missing at random, and that the parameters of the data and the coarsening process be distinct. Applications of the general model and the ignorability condition are illustrated in a numerical example and described briefly in a variety of special cases.

590 citations


Journal ArticleDOI
TL;DR: Finite-sample replacement breakdown points are derived for different types of estimators of multivariate location and covariance matrices in this paper, and the breakdown point is related to a measure of performance based on large deviations probabilities.
Abstract: Finite-sample replacement breakdown points are derived for different types of estimators of multivariate location and covariance matrices. The role of various equivariance properties is illustrated. The breakdown point is related to a measure of performance based on large deviations probabilities. Finally, we show that one-step reweighting preserves the breakdown point.

433 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that the empirical likelihood method for constructing confidence intervals is Bartlett-correctable, which means that a simple adjustment for the expected value of the log-likelihood ratio reduces coverage error to an extremely low $O(n^{-2})$, where $n$ denotes sample size.
Abstract: It is shown that, in a very general setting, the empirical likelihood method for constructing confidence intervals is Bartlett-correctable. This means that a simple adjustment for the expected value of log-likelihood ratio reduces coverage error to an extremely low $O(n^{-2})$, where $n$ denotes sample size. That fact makes empirical likelihood competitive with methods such as the bootstrap which are not Bartlett-correctable and which usually have coverage error of size $n^{-1}$. Most importantly, our work demonstrates a strong link between empirical likelihood and parametric likelihood, since the Bartlett correction had previously only been available for parametric likelihood. A general formula is given for the Bartlett correction, valid in a very wide range of problems, including estimation of mean, variance, covariance, correlation, skewness, kurtosis, mean ratio, mean difference, variance ratio, etc. The efficacy of the correction is demonstrated in a simulation study for the case of the mean.

410 citations
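
A simulation sketch of the Bartlett correction applied to empirical likelihood for a mean. The moment-based form of the Bartlett constant used below is an assumption on my part (flagged in the comments); the profile statistic is the same computation as in the earlier sketch.

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import chi2

    def neg2_log_el(x, mu):
        z = x - mu
        if z.min() >= 0 or z.max() <= 0:
            return np.inf
        g = lambda lam: np.mean(z / (1.0 + lam * z))
        lam = brentq(g, -1.0 / z.max() + 1e-10, -1.0 / z.min() - 1e-10)
        return 2.0 * np.sum(np.log1p(lam * z))

    def bartlett_constant(x):
        # assumed form for the mean: a = mu4/(2 mu2^2) - mu3^2/(3 mu2^3)
        z = x - x.mean()
        m2, m3, m4 = (z**2).mean(), (z**3).mean(), (z**4).mean()
        return m4 / (2 * m2**2) - m3**2 / (3 * m2**3)

    rng = np.random.default_rng(3)
    n, reps, crit = 30, 2000, chi2.ppf(0.90, df=1)
    cover_raw = cover_bc = 0
    for _ in range(reps):
        x = rng.exponential(1.0, n)
        s = neg2_log_el(x, 1.0)
        cover_raw += s <= crit
        cover_bc += s / (1.0 + bartlett_constant(x) / n) <= crit
    print(cover_raw / reps, cover_bc / reps)  # corrected coverage nearer 0.90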


Journal ArticleDOI
TL;DR: In this article, it was shown that there exists a pointwise estimate of the derivative of a random vector of order $m, where m is a nonnegative integer smaller than $p. The local Bahadur type is used to obtain some useful asymptotic results.
Abstract: Let $(X, Y)$ be a random vector such that $X$ is $d$-dimensional, $Y$ is real valued and $Y = \theta(X) + \varepsilon$, where $X$ and $\varepsilon$ are independent and the $\alpha$th quantile of $\varepsilon$ is $0$ ($\alpha$ is fixed such that $0 0$, and set $r = (p - m)/(2p + d)$, where $m$ is a nonnegative integer smaller than $p$. Let $T(\theta)$ denote a derivative of $\theta$ of order $m$. It is proved that there exists a pointwise estimate $\hat{T}_n$ of $T(\theta)$, based on a set of i.i.d. observations $(X_1, Y_1),\cdots,(S_n, Y_n)$, that achieves the optimal nonparametric rate of convergence $n^{-r}$ under appropriate regularity conditions. Further, a local Bahadur type representation is shown to hold for the estimate $\hat{T}_n$ and this is used to obtain some useful asymptotic results.

330 citations
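
A rough sketch of a local polynomial quantile fit of the kind studied here, assuming a uniform window and a generic optimizer for the check loss; the intercept is the fitted quantile and the higher coefficients play the role of derivative estimates. All names and tuning values are illustrative.

    import numpy as np
    from scipy.optimize import minimize

    def check_loss(u, alpha):
        # quantile regression loss: u * (alpha - 1{u < 0})
        return np.sum(u * (alpha - (u < 0)))

    def local_quantile(x, y, x0, alpha=0.5, h=0.1, degree=1):
        """Local polynomial alpha-quantile fit in a window around x0."""
        idx = np.abs(x - x0) <= h
        dx = x[idx] - x0
        Z = np.vander(dx, degree + 1, increasing=True)   # 1, dx, dx^2, ...
        obj = lambda beta: check_loss(y[idx] - Z @ beta, alpha)
        beta0 = np.zeros(degree + 1)
        beta0[0] = np.quantile(y[idx], alpha)
        res = minimize(obj, beta0, method="Nelder-Mead")
        return res.x[0]       # res.x[1] estimates the first derivative

    rng = np.random.default_rng(4)
    x = rng.uniform(0, 1, 400)
    y = np.sin(2 * np.pi * x) + rng.standard_t(3, 400) * 0.3  # heavy tails
    print(local_quantile(x, y, 0.5))    # near sin(pi) = 0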


Journal ArticleDOI
TL;DR: Simultaneous error bars are constructed for nonparametric kernel estimates of regression functions in this article, where resampling is done from a suitably estimated residual distribution, giving asymptotically correct coverage probabilities uniformly over any number of gridpoints.
Abstract: Simultaneous error bars are constructed for nonparametric kernel estimates of regression functions. The method is based on the bootstrap, where resampling is done from a suitably estimated residual distribution. The error bars are seen to give asymptotically correct coverage probabilities uniformly over any number of gridpoints. Applications to an economic problem are given and comparison to both pointwise and Bonferroni-type bars is presented through a simulation study.

320 citations
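
A minimal sketch of bootstrap simultaneous bands for a kernel regression estimate: resample centred residuals and calibrate the sup-deviation over the grid. It ignores the bias-handling refinements of the paper, and all tuning constants are illustrative.

    import numpy as np

    def nw(x, y, grid, h):
        # Nadaraya-Watson estimate with a Gaussian kernel
        w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
        return (w @ y) / w.sum(axis=1)

    rng = np.random.default_rng(5)
    n, h, B = 200, 0.1, 500
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    grid = np.linspace(0.05, 0.95, 30)

    mhat = nw(x, y, grid, h)
    fitted = nw(x, y, x, h)
    resid = y - fitted
    resid -= resid.mean()                     # recentred residual distribution

    sup_dev = np.empty(B)
    for b in range(B):
        ystar = fitted + rng.choice(resid, n, replace=True)
        sup_dev[b] = np.max(np.abs(nw(x, ystar, grid, h) - mhat))
    c = np.quantile(sup_dev, 0.95)            # simultaneous critical value
    lower, upper = mhat - c, mhat + c         # covers all gridpoints jointly
    print(c)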


Journal ArticleDOI
TL;DR: In this article, the Kalman recursion for state space models is extended to allow for likelihood evaluation and minimum mean square estimation given states with an arbitrarily large covariance matrix, and application is made to likelihood evaluation, state estimation, prediction and smoothing.
Abstract: The Kalman recursion for state space models is extended to allow for likelihood evaluation and minimum mean square estimation given states with an arbitrarily large covariance matrix. The extension is computationally minor. Application is made to likelihood evaluation, state estimation, prediction and smoothing.

314 citations
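
For contrast with the exact recursion developed here, a standard Kalman filter sketch in which a diffuse initial state is approximated by a large initial covariance (the naive device that an exact diffuse treatment makes unnecessary); the local level model and all settings are illustrative.

    import numpy as np

    def kalman_loglik(y, T, Z, Q, H, a0, P0):
        """Standard Kalman filter: log-likelihood and filtered states."""
        a, P, ll = a0.copy(), P0.copy(), 0.0
        filtered = []
        for yt in y:
            v = yt - Z @ a                          # prediction error
            F = Z @ P @ Z.T + H
            K = T @ P @ Z.T @ np.linalg.inv(F)
            ll += -0.5 * (len(v) * np.log(2 * np.pi)
                          + np.log(np.linalg.det(F)) + v @ np.linalg.solve(F, v))
            a = T @ a + K @ v
            P = T @ P @ T.T - K @ Z @ P @ T.T + Q
            filtered.append(a.copy())
        return ll, np.array(filtered)

    # local level model; diffuse initial level approximated by a huge variance
    rng = np.random.default_rng(6)
    level = np.cumsum(rng.normal(0, 0.5, 100)) + 10.0
    y = (level + rng.normal(0, 1.0, 100)).reshape(-1, 1)
    T = Z = np.eye(1)
    Q, H = np.array([[0.25]]), np.array([[1.0]])
    ll, f = kalman_loglik(y, T, Z, Q, H, a0=np.zeros(1), P0=np.eye(1) * 1e7)
    print(ll, f[-1])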


Journal ArticleDOI
TL;DR: In this paper, it was shown that for well-behaved loss functions, the complexity of the full infinite-dimensional composite testing problem is comparable to the difficulty of the hardest simple two-point testing subproblem.
Abstract: Consider estimating a functional $T(F)$ of an unknown distribution $F \in \mathbf{F}$ from data $X_1, \cdots, X_n$ i.i.d. $F$. Let $\omega(\varepsilon)$ denote the modulus of continuity of the functional $T$ over $\mathbf{F}$, computed with respect to Hellinger distance. For well-behaved loss functions $l(t)$, we show that $\inf_{T_n} \sup_{\mathbf{F}} E_F l(T_n - T(F))$ is equivalent to $l(\omega(n^{-1/2}))$ to within constants, whenever $T$ is linear and $\mathbf{F}$ is convex. The same conclusion holds in three nonlinear cases: estimating the rate of decay of a density, estimating the mode and robust nonparametric regression. We study the difficulty of testing between the composite, infinite dimensional hypotheses $H_0: T(F) \leq t$ and $H_1: T(F) \geq t + \Delta$. Our results hold, in the cases studied, because the difficulty of the full infinite-dimensional composite testing problem is comparable to the difficulty of the hardest simple two-point testing subproblem.

Journal ArticleDOI
TL;DR: In this paper, the problem of estimating a smooth monotone regression function $m$ is studied, where the estimator is composed of a smoothing step and an isotonisation step.
Abstract: The problem of estimating a smooth monotone regression function $m$ will be studied. We will consider the estimator $m_{SI}$ consisting of a smoothing step (application of a kernel estimator based on a kernel $K$) and of an isotonisation step (application of the pool adjacent violators algorithm). The estimator $m_{SI}$ will be compared with the estimator $m_{IS}$, where these two steps are interchanged. A higher order stochastic expansion of these estimators will be given which shows that $m_{SI}$ and $m_{IS}$ are asymptotically first order equivalent and that $m_{IS}$ has a smaller mean squared error than $m_{SI}$ if and only if the kernel function of the kernel estimator is not too smooth.
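
A small sketch of the two estimators being compared, assuming a Gaussian kernel smoother evaluated at sorted design points; pava below implements the pool adjacent violators algorithm. Bandwidth and sample sizes are illustrative.

    import numpy as np

    def pava(y):
        """Pool adjacent violators: least squares increasing fit."""
        vals, wts, cnts = [], [], []
        for v in y:
            vals.append(float(v)); wts.append(1.0); cnts.append(1)
            while len(vals) > 1 and vals[-2] > vals[-1]:
                w = wts[-2] + wts[-1]
                vals[-2] = (vals[-2] * wts[-2] + vals[-1] * wts[-1]) / w
                wts[-2] = w
                cnts[-2] += cnts[-1]
                del vals[-1], wts[-1], cnts[-1]
        return np.repeat(vals, cnts)

    def ksmooth(x, y, h):
        """Gaussian kernel smoother evaluated at the design points."""
        w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
        return (w @ y) / w.sum(axis=1)

    rng = np.random.default_rng(7)
    x = np.sort(rng.uniform(0, 1, 200))
    y = x**2 + rng.normal(0, 0.2, 200)        # monotone truth
    m_SI = pava(ksmooth(x, y, 0.05))          # smooth, then isotonize
    m_IS = ksmooth(x, pava(y), 0.05)          # isotonize, then smooth
    print(np.mean((m_SI - x**2) ** 2), np.mean((m_IS - x**2) ** 2))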

Journal ArticleDOI
TL;DR: Slicing Regression: A Link-Free Regression Method, by Naihua Duan and Ker-Chau Li, The Annals of Statistics, Vol. 19, No. 2 (Jun., 1991), pp. 505-530, published by the Institute of Mathematical Statistics.
Abstract: Slicing Regression: A Link-Free Regression Method. Author(s): Naihua Duan and Ker-Chau Li. Source: The Annals of Statistics, Vol. 19, No. 2 (Jun., 1991), pp. 505-530. Published by: Institute of Mathematical Statistics. Stable URL: http://www.jstor.org/stable/2242072
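
A hedged sketch in the spirit of the slicing idea: estimate the index direction of a single-index model from within-slice means of standardized predictors. This is a generic slicing construction, not necessarily the paper's exact estimator or its weighting.

    import numpy as np

    def slicing_direction(X, y, n_slices=10):
        """Slice on y, average standardized X within slices, and take the
        principal eigenvector of the between-slice covariance."""
        n, d = X.shape
        mu, S = X.mean(axis=0), np.cov(X, rowvar=False)
        L = np.linalg.cholesky(np.linalg.inv(S))
        Z = (X - mu) @ L                       # standardized predictors
        slices = np.array_split(np.argsort(y), n_slices)
        M = np.zeros((d, d))
        for s in slices:
            m = Z[s].mean(axis=0)
            M += (len(s) / n) * np.outer(m, m)
        vec = np.linalg.eigh(M)[1][:, -1]      # top eigenvector
        b = L @ vec                            # back to the original scale
        return b / np.linalg.norm(b)

    rng = np.random.default_rng(8)
    X = rng.normal(size=(1000, 4))
    beta = np.array([1.0, 2.0, 0.0, 0.0]) / np.sqrt(5)
    y = np.exp(X @ beta) + rng.normal(0, 0.1, 1000)   # unknown link g
    print(slicing_direction(X, y))                     # ~ +/- beta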

Journal ArticleDOI
TL;DR: Analogous convergence results for the relative entropy are shown to hold in general, for any class of log-density functions and sequence of finite-dimensional linear spaces having $L_2$ and $L_\infty$ approximation properties.
Abstract: Consider the estimation of a probability density function $p(x)$ defined on a bounded interval. We approximate the logarithm of the density by a basis function expansion consisting of polynomials, splines or trigonometric series. The expansion yields a regular exponential family within which we estimate the density by the method of maximum likelihood. This method of density estimation arises by application of the principle of maximum entropy or minimum relative entropy subject to empirical constraints. We show that if the logarithm of the density has $r$ square-integrable derivatives, $\int |D^r \log p|^2 < \infty$, then the sequence of density estimators $\hat{p}_n$ converges to $p$ in the sense of relative entropy (Kullback-Leibler divergence $\int p \log(p/\hat{p}_n)$) at rate $O_P(1/m^{2r} + m/n)$ as $m \rightarrow \infty$ and $m^2/n \rightarrow 0$ in the spline and trigonometric cases and $m^3/n \rightarrow 0$ in the polynomial case, where $m$ is the dimension of the family and $n$ is the sample size. Boundary conditions are assumed for the density in the trigonometric case. This convergence rate specializes to $O_P(n^{-2r/(2r+1)})$ by setting $m = n^{1/(2r+1)}$ when the log-density is known to have degree of smoothness at least $r$. Analogous convergence results for the relative entropy are shown to hold in general, for any class of log-density functions and sequence of finite-dimensional linear spaces having $L_2$ and $L_\infty$ approximation properties. The approximation of log-densities using polynomials has previously been considered by Neyman (1937) to define alternatives for goodness-of-fit tests, by Good (1963) as an application of the method of maximum entropy or minimum relative entropy, by Crain (1974, 1976a, b, 1977), who demonstrates existence and consistency of the maximum likelihood estimator, and by Mead and
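
A minimal sketch of the estimator: maximum likelihood in the exponential family spanned by a polynomial basis for the log-density, with the normalizer computed by numerical integration. Basis dimension and grid are illustrative.

    import numpy as np
    from scipy.optimize import minimize

    def fit_logdensity(x, m, grid_n=400):
        """ML fit of p_theta(t) proportional to exp(sum_k theta_k t^k)
        on [0,1], using a polynomial basis of dimension m."""
        t = np.linspace(0.0, 1.0, grid_n)
        Tgrid = np.vander(t, m + 1, increasing=True)[:, 1:]   # t, ..., t^m
        Tx = np.vander(x, m + 1, increasing=True)[:, 1:]
        def negloglik(theta):
            logZ = np.log(np.trapz(np.exp(Tgrid @ theta), t))  # normalizer
            return -(Tx @ theta).mean() + logZ
        theta = minimize(negloglik, np.zeros(m), method="BFGS").x
        dens = np.exp(Tgrid @ theta)
        return t, dens / np.trapz(dens, t)

    rng = np.random.default_rng(9)
    x = rng.beta(2.0, 5.0, 500)          # true density on [0,1]
    t, dens = fit_logdensity(x, m=4)
    print(t[np.argmax(dens)])             # near the Beta(2,5) mode 0.2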

Journal ArticleDOI
TL;DR: In this paper, it was shown that the standard bootstrap least squares estimate is asymptotically invalid when $\beta = 1$, even if the error distribution is assumed to be normal: the conditional limit distribution of the bootstrap estimate at $\beta = 1$ converges to a random distribution.
Abstract: Consider a first-order autoregressive process $X_t = \beta X_{t - 1} + \varepsilon_t$, where $\{\varepsilon_t\}$ are independent and identically distributed random errors with mean 0 and variance 1. It is shown that when $\beta = 1$ the standard bootstrap least squares estimate of $\beta$ is asymptotically invalid, even if the error distribution is assumed to be normal. The conditional limit distribution of the bootstrap estimate at $\beta = 1$ is shown to converge to a random distribution.
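
A quick simulation sketch of the phenomenon: the bootstrap law of the normalized estimator differs visibly across independent samples from the same unit-root process, consistent with a random limit distribution. Sample sizes and replication counts are illustrative.

    import numpy as np

    def ols_ar1(x):
        # least squares estimate of beta in X_t = beta X_{t-1} + eps_t
        return (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])

    def bootstrap_dist(x, B, rng):
        beta_hat = ols_ar1(x)
        resid = x[1:] - beta_hat * x[:-1]
        resid = resid - resid.mean()
        out = np.empty(B)
        for b in range(B):
            e = rng.choice(resid, len(resid), replace=True)
            xs = np.zeros(len(x))
            for t in range(1, len(x)):
                xs[t] = beta_hat * xs[t - 1] + e[t - 1]
            out[b] = ols_ar1(xs)
        return out

    rng = np.random.default_rng(10)
    n, B = 200, 300
    for _ in range(2):                       # two samples, same beta = 1
        x = np.cumsum(rng.normal(0, 1, n))   # random walk: unit root
        d = bootstrap_dist(x, B, rng)
        print(np.quantile(n * (d - ols_ar1(x)), [0.05, 0.95]))
    # the bootstrap quantiles fluctuate across samples: a random limit law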

Journal ArticleDOI
TL;DR: In this article, it was shown that the consistency and asymptotic normality of Hill's estimator, well known in the i.i.d. setting, extend considerably beyond the independent setting.
Abstract: Let $X_1, X_2,\ldots$ be possibly dependent random variables having the same marginal distribution. Consider the situation where $\bar{F}(x) := P\lbrack X_1 > x\rbrack$ is regularly varying at $\infty$ with an unknown index $- \alpha < 0$ which is to be estimated. In the i.i.d. setting, it is well known that Hill's estimator is consistent for $\alpha^{-1}$, and is asymptotically normally distributed. It is the purpose of this paper to demonstrate that such properties of Hill's estimator extend considerably beyond the independent setting. In addition to some basic results derived under very general conditions, the case where the observations are strictly stationary and satisfy a certain mixing condition is considered in detail. Also a finite moving average sequence is studied to illustrate the results.
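 
Hill's estimator itself is a one-liner; the sketch below applies it to a simple 1-dependent moving-maximum sequence with a Pareto-type tail, as a stand-in for dependent data (this construction is my own illustration, not one of the paper's examples).

    import numpy as np

    def hill(x, k):
        """Hill's estimator of 1/alpha from the k largest order statistics."""
        xs = np.sort(x)[::-1]                  # descending order
        return np.mean(np.log(xs[:k])) - np.log(xs[k])

    rng = np.random.default_rng(11)
    u = rng.uniform(size=5001)
    z = u ** (-1.0 / 2.0)                      # i.i.d. Pareto, tail index 2
    x = np.maximum(z[1:], z[:-1])              # 1-dependent, same tail index
    print(1.0 / hill(x, k=200))                # ~ alpha = 2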

Journal ArticleDOI
TL;DR: A class of redescending $M$-estimates of multivariate location and scatter is investigated in this article, where sufficient conditions are given to ensure the existence and uniqueness of the estimates.
Abstract: A class of redescending $M$-estimates of multivariate location and scatter are investigated. Sufficient conditions are given to ensure the existence and uniqueness of the estimates. These results are applied to the multivariate $t$-distribution with degrees of freedom $\nu \geq 1$.
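
A sketch of the fixed-point (iteratively reweighted) computation of the multivariate $t$ M-estimate of location and scatter, the special case to which the paper's existence and uniqueness results apply; the iteration count and degrees of freedom are illustrative.

    import numpy as np

    def t_location_scatter(X, nu=3.0, iters=200, tol=1e-8):
        """Fixed-point iteration for the multivariate t M-estimate."""
        n, p = X.shape
        mu, S = X.mean(axis=0), np.cov(X, rowvar=False)
        for _ in range(iters):
            d2 = np.einsum("ij,jk,ik->i", X - mu, np.linalg.inv(S), X - mu)
            w = (nu + p) / (nu + d2)       # weights decay with distance
            mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
            Xc = X - mu_new
            S_new = (w[:, None] * Xc).T @ Xc / n
            if np.abs(mu_new - mu).max() < tol:
                return mu_new, S_new
            mu, S = mu_new, S_new
        return mu, S

    rng = np.random.default_rng(12)
    X = rng.standard_t(3.0, size=(500, 2)) + np.array([1.0, -2.0])
    X[:10] += 25.0                              # gross outliers
    print(t_location_scatter(X)[0])             # near (1, -2) despite outliers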

Journal ArticleDOI
TL;DR: In contrast to the traditional approach of considering regression functions whose $m$th derivatives lie in a ball in the $L_\infty$ or $L_2$ norm, the authors consider the class of functions whose $(m - 1)$st derivative consists of at most $k$ monotone pieces.
Abstract: We propose a new nonparametric regression estimate. In contrast to the traditional approach of considering regression functions whose $m$th derivatives lie in a ball in the $L_\infty$ or $L_2$ norm, we consider the class of functions whose $(m - 1)$st derivative consists of at most $k$ monotone pieces. For many applications this class seems more natural than the classical ones. The least squares estimator of this class is studied. It is shown that the speed of convergence is as fast as in the classical case.

Journal ArticleDOI
TL;DR: In this article, a modified version of the Buckley-James estimator is proposed to get around the difficulties caused by the instability at the upper tail of the associated Kaplan-Meier estimate of the underlying error distribution.
Abstract: Buckley and James proposed an extension of the classical least squares estimator to the censored regression model. It has been found in some empirical and Monte Carlo studies that their approach provides satisfactory results and seems to be superior to other extensions of the least squares estimator in the literature. To develop a complete asymptotic theory for this approach, we introduce herein a slight modification of the Buckley-James estimator to get around the difficulties caused by the instability at the upper tail of the associated Kaplan-Meier estimate of the underlying error distribution and show that the modified Buckley-James estimator is consistent and asymptotically normal under certain regularity conditions. A simple formula for the asymptotic variance of the modified Buckley-James estimator is also derived and is used to study the asymptotic efficiency of the estimator. Extensions of these results to the multiple regression model are also given.
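
A rough sketch of the Buckley-James iteration: censored responses are imputed by their conditional expectation under the Kaplan-Meier estimate of the residual distribution, and least squares is refit. Normalizing by the remaining tail mass below is one simple choice; the upper tail of the Kaplan-Meier estimate is precisely the unstable region that motivates the paper's modification.

    import numpy as np

    def km_jumps(e, d):
        """Kaplan-Meier jump masses of the residual distribution."""
        order = np.argsort(e)
        e, d = e[order], d[order]
        n = len(e)
        at_risk = n - np.arange(n)
        factor = 1.0 - d / at_risk
        S_left = np.concatenate([[1.0], np.cumprod(factor)[:-1]])  # S(t-)
        return e, S_left * d / at_risk          # jump = S(t-) * d / risk

    def buckley_james(x, y, d, iters=20):
        X = np.column_stack([np.ones_like(x), x])
        b = np.linalg.lstsq(X[d == 1], y[d == 1], rcond=None)[0]
        for _ in range(iters):
            pts, mass = km_jumps(y - X @ b, d)
            ystar = y.copy()
            for i in np.where(d == 0)[0]:       # impute censored responses
                tail = pts > y[i] - X[i] @ b
                if mass[tail].sum() > 0:
                    ystar[i] = X[i] @ b + (mass[tail] * pts[tail]).sum() \
                               / mass[tail].sum()
            b = np.linalg.lstsq(X, ystar, rcond=None)[0]
        return b

    rng = np.random.default_rng(13)
    n = 300
    x = rng.uniform(0, 2, n)
    T = 1 + 2 * x + rng.normal(0, 0.5, n)       # true model
    C = rng.uniform(2, 8, n)                    # censoring times
    y, d = np.minimum(T, C), (T <= C).astype(int)
    print(buckley_james(x, y, d))               # ~ (1, 2)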

Journal ArticleDOI
TL;DR: In this paper, the nonparametric estimators of Darkhovsky and Carlstein are imbedded in a more general framework, where random seminorms are applied to empirical measures for making inference about change points.
Abstract: Consider a sequence $X_1, X_2,\ldots, X_n$ of independent random variables, where $X_1, X_2,\ldots, X_{n\theta}$ have distribution $P,$ and $X_{n\theta + 1}, X_{n\theta + 2},\ldots, X_n$ have distribution $Q$. The change-point $\theta \in (0,1)$ is an unknown parameter to be estimated, and $P$ and $Q$ are two unknown probability distributions. The nonparametric estimators of Darkhovsky and Carlstein are imbedded in a more general framework, where random seminorms are applied to empirical measures for making inference about $\theta$. Carlstein's and Darkhovsky's results about consistency are improved, and the limiting distributions of some particular estimators are derived in various models. Further we propose asymptotically valid confidence regions for the change point $\theta$ by inverting bootstrap tests. As an example this method is applied to the Nile data.
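
A small sketch of a change-point estimator in this family, using a weighted Kolmogorov-Smirnov-type seminorm of the difference between the two empirical measures; the weighting and the trimming of extreme split points are illustrative choices.

    import numpy as np

    def change_point(x):
        """Maximize a weighted KS distance between the empirical measures
        before and after each candidate split."""
        n = len(x)
        grid = np.sort(x)
        best, khat = -np.inf, 1
        for k in range(5, n - 5):               # trim extreme splits
            F1 = (x[:k, None] <= grid).mean(axis=0)
            F2 = (x[k:, None] <= grid).mean(axis=0)
            stat = (k * (n - k) / n**2) * np.max(np.abs(F1 - F2))
            if stat > best:
                best, khat = stat, k
        return khat / n

    rng = np.random.default_rng(14)
    x = np.concatenate([rng.normal(0, 1, 120), rng.normal(1.5, 1, 80)])
    print(change_point(x))     # ~ 0.6, the true change fraction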

Journal ArticleDOI
TL;DR: In this paper, a suitable definition of a differentiable functional is related to differentiability of the maps $\lambda \rightarrow dP^{1/2}_\lambda$ and $\lambda \rightarrow \psi(\lambda)$, and it is shown that regular estimability of a functional implies its differentiability.
Abstract: Given a sample of size $n$ from a distribution $P_\lambda$, one wants to estimate a functional $\psi(\lambda)$ of the (typically infinite-dimensional) parameter $\lambda$. Lower bounds on the performance of estimators can be based on the concept of a differentiable functional $P_\lambda \rightarrow \psi(\lambda)$. In this paper we relate a suitable definition of differentiable functional to differentiability of $\lambda \rightarrow dP^{1/2}_\lambda$ and $\lambda \rightarrow \psi(\lambda)$. Moreover, we show that regular estimability of a functional implies its differentiability.

Journal ArticleDOI
TL;DR: In this article, a class of rank estimators is introduced for regression analysis in the presence of both left-truncation and right-censoring on the response variable by making use of martingale theory and a tightness lemma for stochastic integrals of multiparameter empirical processes.
Abstract: A class of rank estimators is introduced for regression analysis in the presence of both left-truncation and right-censoring on the response variable. By making use of martingale theory and a tightness lemma for stochastic integrals of multiparameter empirical processes, the asymptotic normality of the estimators is established under certain assumptions. Adaptive choice of the score functions to give asymptotically efficient rank estimators is also discussed.

Journal ArticleDOI
TL;DR: In this article, a minor modification of the product-limit estimator is proposed for estimating a distribution function when the data are subject to either truncation or censoring, or to both, by independent but not necessarily identically distributed truncation-censoring variables.
Abstract: A minor modification of the product-limit estimator is proposed for estimating a distribution function (not necessarily continuous) when the data are subject to either truncation or censoring, or to both, by independent but not necessarily identically distributed truncation-censoring variables. Making use of martingale integral representations and empirical process theory, uniform strong consistency of the estimator is established and weak convergence results are proved for the entire observable range of the function. Numerical results are also given to illustrate the usefulness of the modification, particularly in the context of truncated data.
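
A sketch of the basic product-limit estimator under left truncation and right censoring, where the risk set at each event time contains subjects who entered before and exited after it; the small risk sets that can occur near the lower boundary are exactly the instability the paper's modification addresses. The simulation design is illustrative.

    import numpy as np

    def product_limit(L, y, d, t_grid):
        """Product-limit estimate of S(t) = P(T > t) under left truncation
        (entry times L) and right censoring (d = 1 if uncensored)."""
        S = np.ones_like(t_grid, dtype=float)
        events = y[d == 1]
        for te in np.unique(events):
            risk = np.sum((L <= te) & (y >= te))   # entered and still observed
            S[t_grid >= te] *= 1.0 - np.sum(events == te) / risk
        return S

    rng = np.random.default_rng(15)
    n = 4000
    T = 0.5 + rng.exponential(1.0, n)         # lifetimes of interest
    Lt = rng.uniform(0, 1.5, n)               # truncation (entry) times
    keep = T >= Lt                            # only subjects with T >= L seen
    T, Lt = T[keep], Lt[keep]
    C = Lt + rng.exponential(3.0, len(T))     # censoring after entry
    y, d = np.minimum(T, C), (T <= C).astype(int)
    grid = np.array([1.0, 1.5, 2.5])
    print(product_limit(Lt, y, d, grid))
    print(np.exp(-(grid - 0.5)))              # true survival for comparison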

Journal ArticleDOI
TL;DR: In this paper, a simple bandwidth selection procedure is proposed to stabilize the variation of the cross-validation bandwidth estimate and a plug-in estimate and an adjusted plugin estimate are also proposed, and their asymptotic distributions are obtained.
Abstract: The problem of automatic bandwidth selection for a kernel density estimator is considered. It is well recognized that the bandwidth estimate selected by the least squares cross-validation is subject to large sample variation. This difficulty limits the application of the cross-validation estimate. Based on characteristic functions, an important expression for the cross-validation bandwidth estimate is obtained. The expression clearly points out the source of variation. To stabilize the variation, a simple bandwidth selection procedure is proposed. It is shown that the stabilized bandwidth selector gives a strongly consistent estimate of the optimal bandwidth. Under commonly used smoothness conditions, the stabilized bandwidth estimate has a faster convergence rate than the convergence rate of the cross-validation estimate. For sufficiently smooth density functions, it is shown that the stabilized bandwidth estimate is asymptotically normal with a relative convergence rate $n^{-1/2}$ instead of the rate $n^{-1/10}$ of the cross-validation estimate. A plug-in estimate and an adjusted plug-in estimate are also proposed, and their asymptotic distributions are obtained. It is noted that the plug-in estimate is asymptotically efficient. The adjusted plug-in bandwidth estimate and the stabilized bandwidth estimate are shown to be asymptotically equivalent. The simulation results verify that the proposed procedures perform much better than the cross-validation for finite samples.
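
For context, a sketch of the least squares cross-validation criterion whose sampling variability the paper stabilizes, using the closed-form score for a Gaussian kernel; repeating the selection over independent samples displays the large spread at issue. All settings are illustrative.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def lscv(x, h):
        """Least squares cross-validation score, Gaussian kernel."""
        n = len(x)
        d = (x[:, None] - x[None, :]) / h
        conv = np.exp(-0.25 * d**2) / np.sqrt(4 * np.pi)   # (K * K)(d)
        K = np.exp(-0.5 * d**2) / np.sqrt(2 * np.pi)
        off = K.sum() - n / np.sqrt(2 * np.pi)             # drop diagonal
        return conv.sum() / (n**2 * h) - 2.0 * off / (n * (n - 1) * h)

    rng = np.random.default_rng(16)
    hs = []
    for _ in range(20):                      # repeat to see the sampling spread
        x = rng.normal(0, 1, 200)
        res = minimize_scalar(lambda h: lscv(x, h), bounds=(0.05, 1.5),
                              method="bounded")
        hs.append(res.x)
    print(np.mean(hs), np.std(hs))           # large spread: the instability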

Journal ArticleDOI
TL;DR: In this paper, a limiting distribution for the cross-validated bandwidth is proposed to adjust for the dependence effect on bandwidth selection for nonparametric regression in the case of dependent observations, and the bandwidths produced by these two methods are analyzed by further limiting distributions which reveal significantly different characteristics.
Abstract: For nonparametric regression. in the case of dependent observations. cross-validation is known to be severely affected by dependence. This effect is precisely quantified through a limiting distribution for the cross-validated bandwidth. The performance of two methods. the "leave-(2e+1)-out" version of cross-validation and partitioned cross-validation. which adjust for the dependence effect on bandwidth selection is investigated. The bandwidths produced by these two methods are analyzed by further limiting distributions which reveal significantly different characteristics. Simulations demonstrate that the asymptotic effects hold for reasonable sample sizes.
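
A sketch of the leave-$(2l+1)$-out device: each cross-validation prediction omits the $l$ temporal neighbours on either side of the held-out point. With positively correlated AR(1) errors, ordinary cross-validation ($l = 0$) visibly undersmooths relative to $l = 5$. All settings are illustrative.

    import numpy as np

    def cv_score(x, y, h, l=0):
        """Modified CV for kernel regression on a time series: leave out
        the 2l+1 observations nearest in time to each point."""
        n, score = len(x), 0.0
        for i in range(n):
            keep = np.abs(np.arange(n) - i) > l   # drop i and l neighbours
            w = np.exp(-0.5 * ((x[i] - x[keep]) / h) ** 2)
            score += (y[i] - w @ y[keep] / w.sum()) ** 2
        return score / n

    rng = np.random.default_rng(17)
    n = 300
    t = np.linspace(0, 1, n)
    e = np.zeros(n)
    for i in range(1, n):                         # AR(1) errors, rho = 0.8
        e[i] = 0.8 * e[i - 1] + rng.normal(0, 0.2)
    y = np.sin(2 * np.pi * t) + e
    hs = np.linspace(0.01, 0.3, 30)
    for l in (0, 5):
        print(l, hs[np.argmin([cv_score(t, y, h, l) for h in hs])])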

Journal ArticleDOI
TL;DR: In this article, asymptotic properties of the least squares estimator (LSE) in a regression model with long-memory stationary errors were investigated. But the LSE was not considered in this paper.
Abstract: We consider asymptotic properties of the least squares estimator (LSE) in a regression model with long-memory stationary errors. First we derive a necessary and sufficient condition that the LSE be asymptotically efficient relative to the best linear unbiased estimator (BLUE). Then we derive the asymptotic distribution of the LSE under a condition on the higher-order cumulants of the white-noise process of the errors.

Journal ArticleDOI
TL;DR: In this article, the authors considered large sample properties of estimators constructed from stratified cluster samples, and properties of large-sample confidence intervals, and established the results within the context of a sequence of finite populations generated from a superpopulation.
Abstract: Estimation of the finite population distribution function and related statistics, such as the median and interquartile range, is considered. Large-sample properties of estimators constructed from stratified cluster samples, and properties of large-sample confidence intervals, are established. The results are obtained within the context of a sequence of finite populations generated from a superpopulation.

Journal ArticleDOI
TL;DR: In this article, a relatively obscure eigenvalue inequality due to Wielandt is used to give a simple derivation of the asymptotic distribution of the eigenvalues of a random symmetric matrix.
Abstract: A relatively obscure eigenvalue inequality due to Wielandt is used to give a simple derivation of the asymptotic distribution of the eigenvalues of a random symmetric matrix. The asymptotic distributions are obtained under a fairly general setting. An application of the general theory to the bootstrap distribution of the eigenvalues of the sample covariance matrix is given.
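
A short sketch of the bootstrap application: resample rows, recompute sample covariance eigenvalues, and use the resampled spread to approximate the law of the normalized eigenvalues; the dimensions and eigenvalue spacing are illustrative.

    import numpy as np

    rng = np.random.default_rng(18)
    n, p = 300, 3
    X = rng.normal(size=(n, p)) @ np.diag([3.0, 1.5, 1.0])  # distinct spectrum
    eig = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))

    B = 500
    boot = np.empty((B, p))
    for b in range(B):
        Xb = X[rng.integers(0, n, n)]          # resample rows with replacement
        boot[b] = np.sort(np.linalg.eigvalsh(np.cov(Xb, rowvar=False)))
    # bootstrap approximation to the law of sqrt(n)*(eigenvalue - estimate)
    print(np.std(np.sqrt(n) * (boot - eig), axis=0))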

Journal ArticleDOI
TL;DR: In this paper, the principle of invariance is used to narrow the class of estimators under consideration to the equivariant ones, extending the one-sample results of Stein, Haff, Dey and Srinivasan to the two-sample problem.
Abstract: Let $S_1$ and $S_2$ be two independent $p \times p$ Wishart matrices with $S_1 \sim W_p(\Sigma_1, n_1)$ and $S_2 \sim W_p(\Sigma_2, n_2)$. We wish to estimate $(\Sigma_1, \Sigma_2)$ under the loss function $L(\hat{\Sigma}_1, \hat{\Sigma}_2; \Sigma_1, \Sigma_2) = \sum_i\{\operatorname{tr}(\Sigma^{-1}_i\hat{\Sigma}_i) - \log|\Sigma^{-1}_i\hat{\Sigma}_i| - p\}$. Our approach is to first utilize the principle of invariance to narrow the class of estimators under consideration to the equivariant ones. The unbiased estimates of risk of these estimators are then computed and promising estimators are derived from them. A Monte Carlo study is also conducted to evaluate the risk performances of these estimators. The results of this paper extend those of Stein, Haff, Dey and Srinivasan from the one sample problem to the two sample one.

Journal ArticleDOI
TL;DR: This paper presents a methodology that allows the fastest possible rate of convergence while using only nonnegative kernel estimators at all stages of the selection process.
Abstract: The asymptotically best bandwidth selectors for a kernel density estimator currently require the use of either unappealing higher order kernel pilot estimators or related Fourier transform methods. The point of this paper is to present a methodology which allows the fastest possible rate of convergence with the use of only nonnegative kernel estimators at all stages of the selection process. The essential idea is derived through careful study of factorizations of the pilot bandwidth in terms of the original bandwidth.