
Showing papers on "Asymptotic distribution" published in 2004


Posted Content
TL;DR: In this article, the authors proposed a new approach to estimation and inference in panel data models with a multifactor error structure where the unobserved common factors are correlated with exogenously given individual-specific regressors, and the factor loadings differ over the cross-section units.
Abstract: This paper presents a new approach to estimation and inference in panel data models with a multifactor error structure where the unobserved common factors are (possibly) correlated with exogenously given individual-specific regressors, and the factor loadings differ over the cross-section units. The basic idea behind the proposed estimation procedure is to filter the individual-specific regressors by means of (weighted) cross-section aggregates such that, asymptotically as the cross-section dimension (N) tends to infinity, the differential effects of unobserved common factors are eliminated. The estimation procedure has the advantage that it can be computed by OLS applied to an auxiliary regression where the observed regressors are augmented by (weighted) cross-sectional averages of the dependent variable and the individual-specific regressors. Two different but related problems are addressed: one that concerns the coefficients of the individual-specific regressors, and the other that focusses on the mean of the individual coefficients assumed random. In both cases appropriate estimators, referred to as common correlated effects (CCE) estimators, are proposed and their asymptotic distributions as N → ∞ with T (the time-series dimension) fixed, or as N and T → ∞ (jointly), are derived under different regularity conditions. One important feature of the proposed CCE mean group (CCEMG) estimator is its invariance to the (unknown but fixed) number of unobserved common factors as N and T → ∞ (jointly). The small sample properties of the various pooled estimators are investigated by Monte Carlo experiments that confirm the theoretical derivations and show that the pooled estimators have generally satisfactory small sample properties even for relatively small values of N and T.

3,170 citations
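
The auxiliary-regression idea in the abstract above can be illustrated with a minimal sketch. It assumes a single regressor, a single unobserved factor, and equal cross-sectional weights; the function names `cce_mean_group` and `simulate_panel` and the simulation design are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np


def cce_mean_group(y, x):
    """Mean-group slope from unit-by-unit OLS augmented with cross-sectional averages.

    y, x: arrays of shape (N, T). For each unit i, y_i is regressed on
    [1, x_i, y_bar, x_bar], and the slopes on x_i are averaged over units.
    """
    n_units, n_periods = y.shape
    y_bar = y.mean(axis=0)  # equal-weighted cross-sectional average of y
    x_bar = x.mean(axis=0)  # equal-weighted cross-sectional average of x
    slopes = np.empty(n_units)
    for i in range(n_units):
        design = np.column_stack([np.ones(n_periods), x[i], y_bar, x_bar])
        coef, *_ = np.linalg.lstsq(design, y[i], rcond=None)
        slopes[i] = coef[1]  # coefficient on the unit's own regressor
    return slopes.mean()


def simulate_panel(n_units=50, n_periods=40, beta=1.0, seed=0):
    """Toy panel (assumed design): one common factor loads on both x and y."""
    rng = np.random.default_rng(seed)
    factor = rng.normal(size=n_periods)
    load_y = rng.normal(1.0, 0.5, size=(n_units, 1))  # heterogeneous loadings
    load_x = rng.normal(1.0, 0.5, size=(n_units, 1))
    x = load_x * factor + rng.normal(size=(n_units, n_periods))
    y = beta * x + load_y * factor + rng.normal(size=(n_units, n_periods))
    return y, x


y, x = simulate_panel()
print("CCE mean group slope:", cce_mean_group(y, x))
```

Because the factor enters both `x` and `y` in this toy design, a regression that omits the cross-sectional averages would generally be biased; the averages act as observable proxies for the factor.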


Journal ArticleDOI
TL;DR: In this article, the oracle property that Fan and Li established for nonconcave penalized likelihood with a finite number of parameters is extended to the case where the number of parameters grows with the sample size, and the consistency of the sandwich formula of the covariance matrix is demonstrated.
Abstract: A class of variable selection procedures for parametric models via nonconcave penalized likelihood was proposed by Fan and Li to simultaneously estimate parameters and select important variables. They demonstrated that this class of procedures has an oracle property when the number of parameters is finite. However, in most model selection problems the number of parameters should be large and grow with the sample size. In this paper some asymptotic properties of the nonconcave penalized likelihood are established for situations in which the number of parameters tends to ∞ as the sample size increases. Under regularity conditions we have established an oracle property and the asymptotic normality of the penalized likelihood estimators. Furthermore, the consistency of the sandwich formula of the covariance matrix is demonstrated. Nonconcave penalized likelihood ratio statistics are discussed, and their asymptotic distributions under the null hypothesis are obtained by imposing some mild conditions on the penalty functions. The asymptotic results are augmented by a simulation study, and the newly developed methodology is illustrated by an analysis of a court case on the sexual discrimination of salary.

978 citations


Journal ArticleDOI
TL;DR: In this paper, asymptotic properties of the maximum likelihood estimator and the quasi-maximum likelihood estimator for the spatial autoregressive model are investigated, and it is shown that the convergence rates of those estimators may depend on some general features of the spatial weights matrix of the model.
Abstract: This paper investigates asymptotic properties of the maximum likelihood estimator and the quasi-maximum likelihood estimator for the spatial autoregressive model. The rates of convergence of those estimators may depend on some general features of the spatial weights matrix of the model. It is important to make the distinction with different spatial scenarios. Under the scenario that each unit will be influenced by only a few neighboring units, the estimators may have √n-rate of convergence and be asymptotically normal. When each unit can be influenced by many neighbors, irregularity of the information matrix may occur and various components of the estimators may have different rates of convergence.

905 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose unit root tests for large n and T panels in which the cross-sectional units are correlated and derive their asymptotic distribution under the null hypothesis of a unit root and local alternatives.

717 citations


Journal ArticleDOI
TL;DR: In this paper, a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance is proposed, which is based on a fixed interval of time (e.g., a day or week).
Abstract: This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities.

717 citations
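
As a rough illustration of the realized covariation quantities described in the abstract above, the following sketch computes the realized covariation matrix over one fixed interval and the regression and correlation derived from it. The function names and the toy return values are assumptions for illustration; the confidence intervals developed in the paper are not reproduced here.

```python
import numpy as np


def realized_covariation(returns):
    """Sum of outer products of high-frequency return vectors.

    returns: array of shape (M, d) holding M intraday returns on d assets
    observed over one fixed interval (e.g., a day).
    """
    r = np.asarray(returns, dtype=float)
    return r.T @ r


def realized_regression_and_correlation(returns):
    """Realized regression of asset 0 on asset 1, and their realized correlation."""
    rc = realized_covariation(returns)
    beta = rc[0, 1] / rc[1, 1]
    corr = rc[0, 1] / np.sqrt(rc[0, 0] * rc[1, 1])
    return beta, corr


# Toy example: three intraday returns on two assets (made-up numbers).
toy_returns = [[0.01, 0.02], [-0.02, -0.01], [0.03, 0.01]]
print(realized_covariation(toy_returns))
print(realized_regression_and_correlation(toy_returns))
```

Repeating the computation interval by interval gives a time series of realized covariances, regressions, and correlations, which is how such quantities can be tracked through time.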


Posted Content
TL;DR: In this paper, the limiting distribution of the largest eigenvalue of a complex Gaussian sample covariance matrix when both the number of samples and the number of variables in each sample become large is studied.
Abstract: We compute the limiting distributions of the largest eigenvalue of a complex Gaussian sample covariance matrix when both the number of samples and the number of variables in each sample become large. When all but finitely many, say $r$, eigenvalues of the covariance matrix are the same, the dependence of the limiting distribution of the largest eigenvalue of the sample covariance matrix on those distinguished $r$ eigenvalues of the covariance matrix is completely characterized in terms of an infinite sequence of new distribution functions that generalize the Tracy-Widom distributions of the random matrix theory. In particular, a phase transition phenomenon is observed. Our results also apply to a last passage percolation model and a queuing model.

713 citations


Posted ContentDOI
TL;DR: In this article, the authors developed a new approach to the problem of testing the existence of a long-run level relationship between a dependent variable and a set of regressors, when it is not known with certainty whether the underlying regressors are trend- or first-difference stationary.
Abstract: This paper develops a new approach to the problem of testing the existence of a long-run level relationship between a dependent variable and a set of regressors, when it is not known with certainty whether the underlying regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics used to test the significance of the lagged levels of the variables in a first-difference regression. The asymptotic distributions of these statistics are non-standard under the null hypothesis that there exists no level relationship between the dependent variable and the included regressors, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic critical values are provided: one set assuming that all the regressors are I(1), and another set assuming that they are all I(0). These two sets of critical values provide a band covering all possible classifications of the regressors into I(0), I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are proposed. It is shown that the proposed tests are consistent, and their asymptotic distributions under the null and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric model. This is a particularly relevant application as there is considerable doubt concerning the order of integration of variables such as the unemployment rate, the union strength and the wedge between the "real produce wage" and the "real consumption wage" that enter the earnings equation.

596 citations


Journal ArticleDOI
TL;DR: In this article, a semiparametric approach to multivariate extremes is proposed, based on the asymptotic form of the joint distribution conditional on one component being extreme. Unlike existing methods, it remains applicable when the extreme values of all the variables are unlikely to occur together or when interest is in regions of the support where only a subset of components is extreme; applied to air pollution data, it reveals complex extremal dependence behavior that is consistent with scientific understanding of the process.
Abstract: Multivariate extreme value theory and methods concern the characterization, estimation and extrapolation of the joint tail of the distribution of a d-dimensional random variable. Existing approaches are based on limiting arguments in which all components of the variable become large at the same rate. This limit approach is inappropriate when the extreme values of all the variables are unlikely to occur together or when interest is in regions of the support of the joint distribution where only a subset of components is extreme. In practice this restricts existing methods to applications where d is typically 2 or 3. Under an assumption about the asymptotic form of the joint distribution of a d-dimensional random variable conditional on its having an extreme component, we develop an entirely new semiparametric approach which overcomes these existing restrictions and can be applied to problems of any dimension. We demonstrate the performance of our approach and its advantages over existing methods by using theoretical examples and simulation studies. The approach is used to analyse air pollution data and reveals complex extremal dependence behaviour that is consistent with scientific understanding of the process. We find that the dependence structure exhibits marked seasonality, with extremal dependence between some pollutants being significantly greater than the dependence at non-extreme levels.

588 citations


Journal ArticleDOI
TL;DR: In this article, a two-stage least squares estimator of the threshold parameter and a generalized method of moments estimator of the slope parameters are developed for a threshold model with endogenous variables but an exogenous threshold variable.
Abstract: Threshold models (sample splitting models) have wide application in economics. Existing estimation methods are confined to regression models, which require that all right-hand-side variables are exogenous. This paper considers a model with endogenous variables but an exogenous threshold variable. We develop a two-stage least squares estimator of the threshold parameter and a generalized method of moments estimator of the slope parameters. We show that these estimators are consistent, and we derive the asymptotic distribution of the estimators. The threshold estimate has the same distribution as for the regression case (Hansen, 2000, Econometrica 68, 575–603), with a different scale. The slope parameter estimates are asymptotically normal with conventional covariance matrices. We investigate our distribution theory with a Monte Carlo simulation that indicates the applicability of the methods. We thank the two referees and co-editor for constructive comments. Hansen thanks the National Science Foundation for financial support. Caner thanks University of Pittsburgh Central Research Development Fund for financial support.

584 citations


Posted Content
01 Jan 2004
TL;DR: In this article, a robust version of the Dickey-Fuller t-statistic under contemporaneously correlated errors is suggested, a GLS t-statistic based on the transformed model is considered, and the test procedure is further generalized to accommodate individual-specific intercepts.
Abstract: In this paper alternative approaches for testing the unit root hypothesis in panel data are considered. First, a robust version of the Dickey-Fuller t-statistic under contemporaneously correlated errors is suggested. Second, the GLS t-statistic is considered, which is based on the t-statistic of the transformed model. The asymptotic power of both tests against a sequence of local alternatives is compared. To adjust for short-run serial correlation of the errors, a pre-whitening procedure is suggested that yields a test statistic with a standard normal limiting distribution as N and T tend to infinity. The test procedure is further generalized to accommodate individual-specific intercepts. From our Monte Carlo simulations it turns out that the robust OLS t-statistic performs well with respect to size and power, whereas the GLS t-statistic may suffer from severe size distortions in small and moderate sample sizes. To improve the small sample properties of the GLS test procedure, a bootstrap version of the test is available.

517 citations


01 Jan 2004
TL;DR: In this article, the theoretical properties of cross-validated smoothing parameter selection for the local linear kernel estimator are studied, including the rate of convergence of the selected smoothing parameters and the asymptotic normality of the resulting nonparametric estimator.
Abstract: Local linear kernel methods have been shown to dominate local constant methods for the nonparametric estimation of regression functions. In this paper we study the theoretical properties of cross-validated smoothing parameter selection for the local linear kernel estimator. We derive the rate of convergence of the cross-validated smoothing parameters to their optimal benchmark values, and we establish the asymptotic normality of the resulting nonparametric estimator. We then generalize our result to the mixed categorical and continuous regressor case which is frequently encountered in applied settings. Monte Carlo simulation results are reported to examine the finite sample performance of the local-linear based cross-validation smoothing parameter selector. We relate the theoretical and simulation results to a corrected AIC method (termed AICc) proposed by Hurvich, Simonoff and Tsai (1998) and find that AICc has impressive finite-sample properties.

Journal ArticleDOI
TL;DR: In this paper, two new approaches are proposed for estimating the regression coefficients in a semiparametric model and the asymptotic normality of the resulting estimators is established.
Abstract: Semiparametric regression models are very useful for longitudinal data analysis. The complexity of semiparametric models and the structure of longitudinal data pose new challenges to parametric inferences and model selection that frequently arise from longitudinal data analysis. In this article, two new approaches are proposed for estimating the regression coefficients in a semiparametric model. The asymptotic normality of the resulting estimators is established. An innovative class of variable selection procedures is proposed to select significant variables in the semiparametric models. The proposed procedures are distinguished from others in that they simultaneously select significant variables and estimate unknown parameters. Rates of convergence of the resulting estimators are established. With a proper choice of regularization parameters and penalty functions, the proposed variable selection procedures are shown to perform as well as an oracle estimator. A robust standard error formula is derived usi...

Journal ArticleDOI
TL;DR: In this article, the limiting distribution of a quantile autoregression estimator and its t-statistic is derived, which is a linear combination of the Dickey-Fuller distribution and the standard normal, with the weight determined by the correlation coefficient of related time series.
Abstract: We study statistical inference in quantile autoregression models when the largest autoregressive coefficient may be unity. The limiting distribution of a quantile autoregression estimator and its t-statistic is derived. The asymptotic distribution is not the conventional Dickey–Fuller distribution, but rather a linear combination of the Dickey–Fuller distribution and the standard normal, with the weight determined by the correlation coefficient of related time series. Inference methods based on the estimator are investigated asymptotically. Monte Carlo results indicate that the new inference procedures have power gains over the conventional least squares-based unit root tests in the presence of non-Gaussian disturbances. An empirical application of the model to U.S. macroeconomic time series data further illustrates the potential of the new approach.

Journal ArticleDOI
TL;DR: A simple natural test, seen as an asymptotic version of the well-known anova F-test, is proposed for testing the null hypothesis of equality of their respective mean functions.

Journal ArticleDOI
01 Dec 2004-Test
TL;DR: In this article, it is shown that Deheuvels' decomposition of the empirical copula process extends to the serial case, that the limiting processes have the same joint distribution as in the non-serial setting, and that linear rank statistics consequently have the same asymptotic distribution in both contexts.
Abstract: Deheuvels (1981a) described a decomposition of the empirical copula process into a finite number of asymptotically mutually independent sub-processes whose joint limiting distribution is tractable under the hypothesis that a multivariate distribution is equal to the product of its margins. It is proved here that this result can be extended to the serial case and that the limiting processes have the same joint distribution as in the non-serial setting. As a consequence, linear rank statistics have the same asymptotic distribution in both contexts. It is also shown how these facts can be exploited to construct simple statistics for detecting dependence graphically and testing it formally. Simulations are used to explore the finite-sample behavior of these statistics, which are found to be powerful against various types of alternatives.

Journal ArticleDOI
TL;DR: In this article, the root n consistent estimator for nonlinear models with measurement errors in the explanatory variables, when one repeated observation of each mismeasured regressor is available, is presented.
Abstract: This paper presents a solution to an important econometric problem, namely the root n consistent estimation of nonlinear models with measurement errors in the explanatory variables, when one repeated observation of each mismeasured regressor is available. While a root n consistent estimator has been derived for polynomial specifications (see Hausman, Ichimura, Newey, and Powell (1991)), such an estimator for general nonlinear specifications has so far not been available. Using the additional information provided by the repeated observation, the suggested estimator separates the measurement error from the “true” value of the regressors thanks to a useful property of the Fourier transform: The Fourier transform converts the integral equations that relate the distribution of the unobserved “true” variables to the observed variables measured with error into algebraic equations. The solution to these equations yields enough information to identify arbitrary moments of the “true,” unobserved variables. The value of these moments can then be used to construct any estimator that can be written in terms of moments, including traditional linear and nonlinear least squares estimators, or general extremum estimators. The proposed estimator is shown to admit a representation in terms of an influence function, thus establishing its root n consistency and asymptotic normality. Monte Carlo evidence and an application to Engel curve estimation illustrate the usefulness of this new approach.

Journal ArticleDOI
TL;DR: If the distribution of the new elements matches that of the parent set exactly, the algorithms will converge to the global optimum under three widely used selection schemes and a factorized distribution algorithm converges globally under proportional selection.
Abstract: We investigate the global convergence of estimation of distribution algorithms (EDAs). In EDAs, the distribution is estimated from a set of selected elements, i.e., the parent set, and then the estimated distribution model is used to generate new elements. In this paper, we prove that: 1) if the distribution of the new elements matches that of the parent set exactly, the algorithms will converge to the global optimum under three widely used selection schemes and 2) a factorized distribution algorithm converges globally under proportional selection.
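
The select, estimate, and sample loop described in the abstract above can be sketched as follows. This is a minimal illustration assuming a univariate (product-of-marginals) model, truncation selection, and the OneMax objective (count of ones in a bit string); these choices and the name `umda_onemax` are assumptions for illustration and are not the factorized distribution algorithm analyzed in the paper.

```python
import numpy as np


def umda_onemax(n_bits=20, pop_size=100, n_parents=50, n_generations=30, seed=0):
    """Toy estimation of distribution algorithm on OneMax.

    Each generation: sample a population from the current model, select the
    fittest individuals as the parent set, and re-estimate the model from them.
    Returns the final marginal probabilities and the best fitness seen.
    """
    rng = np.random.default_rng(seed)
    probs = np.full(n_bits, 0.5)  # initial model: independent fair bits
    best = 0
    for _ in range(n_generations):
        population = (rng.random((pop_size, n_bits)) < probs).astype(int)
        fitness = population.sum(axis=1)  # OneMax: number of ones
        best = max(best, int(fitness.max()))
        parent_idx = np.argsort(fitness)[-n_parents:]  # truncation selection
        probs = population[parent_idx].mean(axis=0)  # estimate marginals from parents
    return probs, best


probs, best = umda_onemax()
print("best fitness seen:", best)
print("final marginals:", np.round(probs, 2))
```

With a finite population the estimated model only approximates the parent distribution, so this sketch does not satisfy the exact-matching condition under which the convergence result is stated.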

Journal ArticleDOI
TL;DR: In this paper, the quasi-maximum likelihood estimator for the GARCH(1,1) parameters is shown to be consistent and asymptotically normal in the entire parameter region, including both stationary and explosive behavior.
Abstract: Consistency and asymptotic normality are established for the highly applied quasi-maximum likelihood estimator in the GARCH(1,1) model. Contrary to existing literature we allow the parameters to be in the region where no stationary version of the process exists. This has the important implication that the likelihood-based estimator for the GARCH parameters is consistent and asymptotically normal in the entire parameter region including both stationary and explosive behavior. In particular, there is no “knife edge result like the unit root case” as hypothesized in Lumsdaine (1996, Econometrica 64, 575–596). Anders Rahbek is grateful for support from the Danish Social Sciences Research Council, the Centre for Analytical Finance (CAF), and the EU network DYNSTOCH. Both authors thank the two anonymous referees and the editor for highly valuable and detailed comments that have, we believe, led to a much improved version of the paper, both in terms of the econometric theory and of the presentation.

Journal ArticleDOI
TL;DR: The present article proposes general single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors, such as the generalized family-wise error rate.
Abstract: The present article proposes general single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors, such as the generalized family-wise error rate. A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which single-step common-quantile and common-cut-off procedures asymptotically control the Type I error rate, for arbitrary data generating distributions, without the need for conditions such as subset pivotality. Inspired by this general characterization of a null distribution, we then propose as an explicit null distribution the asymptotic distribution of the vector of null value shifted and scaled test statistics. In the special case of family-wise error rate (FWER) control, our method yields the single-step minP and maxT procedures, based on minima of unadjusted p-values and maxima of test statistics, respectively, with the important distinction in the choice of null distribution. Single-step procedures based on consistent estimators of the null distribution are shown to also provide asymptotic control of the Type I error rate. A general bootstrap algorithm is supplied to conveniently obtain consistent estimators of the null distribution. The special cases of t- and F-statistics are discussed in detail. 
The companion articles focus on step-down multiple testing procedures for control of the FWER (van der Laan et al., 2004b) and on augmentations of FWER-controlling methods to control error rates such as tail probabilities for the number of false positives and for the proportion of false positives among the rejected hypotheses (van der Laan et al., 2004a). The proposed bootstrap multiple testing procedures are evaluated by a simulation study and applied to genomic data in the fourth article of the series (Pollard et al., 2004).

Journal ArticleDOI
TL;DR: A general transfer theorem is derived which allows us to establish a limit law on the basis of the recursive structure and the asymptotics of the first and second moments of the sequence, where the Zolotarev metric is used.
Abstract: Limit laws are proven by the contraction method for random vectors of a recursive nature as they arise as parameters of combinatorial structures such as random trees or recursive algorithms, where we use the Zolotarev metric. In comparison to previous applications of this method, a general transfer theorem is derived which allows us to establish a limit law on the basis of the recursive structure and the asymptotics of the first and second moments of the sequence. In particular, a general asymptotic normality result is obtained by this theorem which typically cannot be handled by the more common $\ell_2$ metrics. As applications we derive quite automatically many asymptotic limit results ranging from the size of tries or $m$-ary search trees and path lengths in digital structures to mergesort and parameters of random recursive trees, which were previously shown by different methods one by one. We also obtain a related local density approximation result as well as a global approximation result. For the proofs of these results we establish that a smoothed density distance as well as a smoothed total variation distance can be estimated from above by the Zolotarev metric, which is the main tool in this article.

Journal ArticleDOI
TL;DR: In this article, a simple and consistent estimation procedure for conditional moment restrictions is proposed, which is directly based on the definition of the conditional moments and does not require the selection of any user-chosen number.
Abstract: In econometrics, models stated as conditional moment restrictions are typically estimated by means of the generalized method of moments (GMM). The GMM estimation procedure can render inconsistent estimates since the number of arbitrarily chosen instruments is finite. In fact, consistency of the GMM estimators relies on additional assumptions that imply unclear restrictions on the data generating process. This article introduces a new, simple and consistent estimation procedure for these models that is directly based on the definition of the conditional moments. The main feature of our procedure is its simplicity, since its implementation does not require the selection of any user-chosen number, and statistical inference is straightforward since the proposed estimator is asymptotically normal. In addition, we suggest an asymptotically efficient estimator constructed by carrying out one Newton–Raphson step in the direction of the efficient GMM estimator.

Journal ArticleDOI
TL;DR: In this article, the authors consider three models of Gaussian random analytic functions distinguished by invariance of their zeroes distribution and prove asymptotic normality for smooth functionals (linear statistics) of the set of zeros.
Abstract: We consider three models (elliptic, flat and hyperbolic) of Gaussian random analytic functions distinguished by invariance of their zeroes distribution. Asymptotic normality is proven for smooth functionals (linear statistics) of the set of zeroes.

Journal ArticleDOI
TL;DR: In this article, an estimator for the probability of an extreme event that works both in the case of asymptotic independence and dependence is presented, and its consistency is proved.
Abstract: In the classical setting of bivariate extreme value theory, the procedures for estimating the probability of an extreme event are not applicable if the componentwise maxima of the observations are asymptotically independent. To cope with this problem, Ledford and Tawn proposed a submodel in which the penultimate dependence is characterized by an additional parameter. We discuss the asymptotic properties of two estimators for this parameter in an extended model. Moreover, we develop an estimator for the probability of an extreme event that works in the case of asymptotic independence as well as in the case of asymptotic dependence, and prove its consistency.

Journal ArticleDOI
TL;DR: In this article, the authors generalize the local Whittle estimator to address its slow rate of convergence and potentially large finite-sample bias by approximating the logarithm of the short-run component of the spectrum by a polynomial rather than approximating the component itself by a constant.
Abstract: The local Whittle (or Gaussian semiparametric) estimator of long range dependence, proposed by Künsch (1987) and analyzed by Robinson (1995a), has a relatively slow rate of convergence and a finite sample bias that can be large. In this paper, we generalize the local Whittle estimator to circumvent these problems. Instead of approximating the short-run component of the spectrum, ϕ(λ), by a constant in a shrinking neighborhood of frequency zero, we approximate its logarithm by a polynomial. This leads to a “local polynomial Whittle” (LPW) estimator. We specify a data-dependent adaptive procedure that adjusts the degree of the polynomial to the smoothness of ϕ(λ) at zero and selects the bandwidth. The resulting “adaptive LPW” estimator is shown to achieve the optimal rate of convergence, which depends on the smoothness of ϕ(λ) at zero, up to a logarithmic factor.

Journal ArticleDOI
TL;DR: In this paper, the asymptotic distribution of the Anderson-Darling test for uniformity is evaluated directly via series with two-term recursions, and, for any particular n, a procedure for evaluating the distribution to the fourth digit is given.
Abstract: Except for n = 1, only the limit as n approaches infinity for the distribution of the Anderson-Darling test for uniformity has been found, and that in so complicated a form that published values for a few percentiles had to be determined by numerical integration, saddlepoint or other approximation methods. We give here our method for evaluating that asymptotic distribution to great accuracy, directly, via series with two-term recursions. We also give, for any particular n, a procedure for evaluating the distribution to the fourth digit, based on empirical CDF's from samples of size $10^{10}$.
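
For orientation, the statistic whose distribution is discussed above can be computed as in the sketch below, which uses the standard formula for the Anderson-Darling statistic against the uniform distribution on (0, 1). The function name is a hypothetical choice, and the sketch computes only the statistic; the series evaluation of its distribution described in the abstract is not reproduced.

```python
import math


def anderson_darling_uniform(sample):
    """Anderson-Darling statistic A^2 for testing uniformity on (0, 1).

    A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln u_(i) + ln(1 - u_(n+1-i))],
    where u_(1) <= ... <= u_(n) are the ordered observations.
    All observations must lie strictly between 0 and 1.
    """
    u = sorted(sample)
    n = len(u)
    total = 0.0
    for i in range(1, n + 1):
        total += (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
    return -n - total / n


# Single observation at 0.5: A^2 = 2*ln(2) - 1.
print(anderson_darling_uniform([0.5]))
```

Large values of the statistic indicate departure from uniformity, so a p-value requires the upper tail of the distribution that the paper evaluates.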

Journal ArticleDOI
TL;DR: Results from simulations and channel measurements indicate that the derived expression provides an accurate approximation of the true channel outage capacity for a wide range of realistic system conditions.
Abstract: Closed form approximations of the outage capacity and the mutual information between the in- and outputs of a multi-input, multi-output (MIMO) narrowband system in the presence of correlated fading are considered. First, the limiting distribution of the squared singular values of the channel matrix is derived as the number of antennas at either the transmit or receive site increases. Spatial correlation is allowed according to a realistic stochastic channel model and correlation is allowed between the transmitted signals. The derived limiting distribution has the advantage of being closed form while simulations indicate that it provides a reasonable approximation of the true distribution also for realistic antenna array sizes. Second, the channel outage capacity is derived from the limiting distribution above for the case when the number of transmit antennas is large. The resulting outage capacity has a simple form and allows for spatially correlated channel elements as well as correlation among the transmitted signals. Results from simulations and channel measurements are presented that indicate that the derived expression provides an accurate approximation of the true channel outage capacity for a wide range of realistic system conditions.

Journal ArticleDOI
TL;DR: In this paper, the authors established consistency and asymptotic normality of the quasi-maximum likelihood estimator in the linear ARCH model and allowed the parameters to be in the region where no stationary version of the process exists.
Abstract: We establish consistency and asymptotic normality of the quasi-maximum likelihood estimator in the linear ARCH model. Contrary to the existing literature, we allow the parameters to be in the region where no stationary version of the process exists. This implies that the estimator is always asymptotically normal.

Journal ArticleDOI
TL;DR: A class of weighted estimators with general time-varying weights that are related to a class of estimators proposed by Robins, Rotnitzky, and Zhao are developed and shown to be consistent and asymptotically normal under appropriate conditions.
Abstract: The case-cohort design is a common means of reducing the cost of covariate measurements in large failure-time studies. Under this design, complete covariate data are collected only on the cases (i.e., the subjects whose failure times are uncensored) and on a subcohort randomly selected from the whole cohort. In many applications, certain covariates are readily measured on all cohort members, and surrogate measurements of the expensive covariates also may be available. The existing relative-risk estimators for the case-cohort design disregard the covariate data collected outside the case-cohort sample and thus incur loss of efficiency. To make better use of the available data, we develop a class of weighted estimators with general time-varying weights that are related to a class of estimators proposed by Robins, Rotnitzky, and Zhao. The estimators are shown to be consistent and asymptotically normal under appropriate conditions. We identify the estimator within this class that maximizes efficiency, numeri...

Journal ArticleDOI
TL;DR: In this article, the test statistic is chosen as Ln = max i≠j |ρij|, and its asymptotic distribution is derived by the Chen–Stein Poisson approximation method for a multivariate normal population, with similar results for the non-Gaussian case.
Abstract: Let Xn=(xij) be an n by p data matrix, where the n rows form a random sample of size n from a certain p-dimensional population distribution. Let Rn=(ρij) be the p×p sample correlation matrix of Xn; that is, the entry ρij is the usual Pearson's correlation coefficient between the ith column of Xn and the jth column of Xn. For contemporary data both n and p are large. When the population is a multivariate normal we study the test that H0: the p variates of the population are uncorrelated. A test statistic is chosen as Ln=max i≠j|ρij|. The asymptotic distribution of Ln is derived by using the Chen–Stein Poisson approximation method. Similar results for the non-Gaussian case are also derived.
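
The statistic Ln defined in this abstract can be computed directly from a data matrix. The sketch below assumes NumPy, with rows as observations and columns as variates; the function name `max_abs_offdiag_correlation` and the simulated data are hypothetical. It only evaluates Ln and does not reproduce the limiting distribution or any critical values from the paper.

```python
import numpy as np

def max_abs_offdiag_correlation(X):
    """Return L_n = max over i != j of |rho_ij| for an n-by-p data matrix X."""
    R = np.corrcoef(X, rowvar=False)      # p x p sample (Pearson) correlation matrix
    p = R.shape[0]
    off_diag = ~np.eye(p, dtype=bool)     # boolean mask that excludes the diagonal
    return np.abs(R[off_diag]).max()

# Illustrative data only: n = 200 observations of p = 50 independent normal variates.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
L_n = max_abs_offdiag_correlation(X)
```

Under H0 the individual correlations are small, but their maximum over p(p - 1)/2 pairs is not negligible, which is why the test needs the distribution of the maximum.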

Journal ArticleDOI
TL;DR: In this article, a modified likelihood ratio test is proposed for testing the number of components in a finite mixture model, where the estimates of the parameters are obtained from a modified likelihood function; simulations with normal, binomial and Poisson kernels suggest that the test performs well.
Abstract: We consider a finite mixture model with k components and a kernel distribution from a general one-parameter family. The problem of testing the hypothesis k = 2 versus k ≥ 3 is studied. There has been no general statistical testing procedure for this problem. We propose a modified likelihood ratio statistic where under the null and the alternative hypotheses the estimates of the parameters are obtained from a modified likelihood function. It is shown that estimators of the support points are consistent. The asymptotic null distribution of the modified likelihood ratio test proposed is derived and found to be relatively simple and easily applied. Simulation studies for the asymptotic modified likelihood ratio test based on finite mixture models with normal, binomial and Poisson kernels suggest that the test proposed performs well. Simulation studies are also conducted for a bootstrap method with normal kernels. An example involving foetal movement data from a medical study illustrates the testing procedure.