
Showing papers on "Asymptotic distribution published in 2021"


Journal ArticleDOI
TL;DR: In this article, optimal subsampling for quantile regression is investigated; algorithms based on the optimal subsampling probabilities are proposed, and the asymptotic distributions and optimality of the resulting estimators are established.
Abstract: We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the asymptotic variance-covariance matrix for a linearly transformed parameter estimator and the other minimizes that of the original parameter estimator. The former does not depend on the densities of the responses given covariates and is easy to implement. Algorithms based on optimal subsampling probabilities are proposed and asymptotic distributions and asymptotic optimality of the resulting estimators are established. Furthermore, we propose an iterative subsampling procedure based on the optimal subsampling probabilities in the linearly transformed parameter estimation which has great scalability to utilize available computational resources. In addition, this procedure yields standard errors for parameter estimators without estimating the densities of the responses given the covariates. We provide numerical examples based on both simulated and real data to illustrate the proposed method.

46 citations
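As a toy illustration of the subsampling idea in this abstract, the sketch below draws a non-uniform subsample and computes an inverse-probability-weighted median. The subsampling probabilities here are an arbitrary stand-in, not the paper's derived optimal ones, and the scalar setting replaces the full quantile regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Full data: n scalar responses; the target is the population median (tau = 0.5).
n, tau, r = 100_000, 0.5, 5_000
y = rng.standard_normal(n)

# Hypothetical non-uniform subsampling probabilities (proportional to a crude
# importance score; the paper derives optimal choices instead).
pi = np.abs(y) + 0.5
pi /= pi.sum()

# Draw a subsample of size r with replacement, weighting each draw by 1/pi.
idx = rng.choice(n, size=r, replace=True, p=pi)
w = 1.0 / pi[idx]

# Inverse-probability-weighted sample quantile: smallest subsampled y whose
# cumulative normalized weight reaches tau (a weighted check-loss minimizer).
order = np.argsort(y[idx])
cum_w = np.cumsum(w[order]) / w.sum()
est = y[idx][order][np.searchsorted(cum_w, tau)]
print(est)  # should be close to the true median, 0
```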


Journal ArticleDOI
TL;DR: To rapidly approximate maximum likelihood estimators for massive data, this paper studies the Optimal Subsampling Method under the A-optimality Criterion (OSMAC) for generalized linear models, and establishes the consistency and asymptotic normality of the estimator from a general subsampling algorithm.
Abstract: To rapidly approximate maximum likelihood estimators for massive data, this paper studies the Optimal Subsampling Method under the A-optimality Criterion (OSMAC) for generalized linear models. The consistency and asymptotic normality of the estimator from a general subsampling algorithm are established, and optimal subsampling probabilities under the A- and L-optimality criteria are derived. Furthermore, using Frobenius-norm matrix concentration inequalities, finite-sample properties of the subsample estimator based on optimal subsampling probabilities are also derived. Since the optimal subsampling probabilities depend on the full-data estimate, an adaptive two-step algorithm is developed. Asymptotic normality and optimality of the estimator from this adaptive algorithm are established. The proposed methods are illustrated and evaluated through numerical experiments on simulated and real datasets.

45 citations
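A hedged sketch of the two-step idea for logistic regression: a uniform pilot subsample gives a rough estimate, which is then used to form subsampling probabilities. The probability formula below is an L-optimality-style simplification (the A-optimal version also involves the inverse information matrix), and all sample sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_logistic(X, y, w=None, iters=25):
    """Weighted logistic MLE via Newton-Raphson."""
    n, d = X.shape
    w = np.ones(n) if w is None else w
    beta = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        hess = X.T @ (X * (w * p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated full data
n, d = 50_000, 3
X = rng.standard_normal((n, d))
beta_true = np.array([0.5, -1.0, 0.25])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

# Step 1: pilot estimate from a small uniform subsample.
idx0 = rng.choice(n, size=1_000, replace=False)
beta_pilot = fit_logistic(X[idx0], y[idx0])

# Step 2: subsampling probabilities pi_i proportional to |y_i - p_i|*||x_i||,
# then an inverse-probability-weighted fit on the second-stage subsample.
p_hat = 1.0 / (1.0 + np.exp(-X @ beta_pilot))
score = np.abs(y - p_hat) * np.linalg.norm(X, axis=1)
pi = score / score.sum()

idx1 = rng.choice(n, size=5_000, replace=True, p=pi)
beta_hat = fit_logistic(X[idx1], y[idx1], w=1.0 / pi[idx1])
print(beta_hat)  # close to beta_true
```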


Journal ArticleDOI
TL;DR: In this paper, generalized least squares (GLS) estimation for linear panel data models is considered and the covariance matrix used for the feasible GLS is estimated via the banding and thresholding method.
Abstract: This paper considers generalized least squares (GLS) estimation for linear panel data models. By estimating the large error covariance matrix consistently, the proposed feasible GLS estimator is more efficient than the ordinary least squares in the presence of heteroskedasticity, serial and cross-sectional correlations. The covariance matrix used for the feasible GLS is estimated via the banding and thresholding method. We establish the limiting distribution of the proposed estimator. A Monte Carlo study is considered. The proposed method is applied to an empirical application.

45 citations


Journal ArticleDOI
TL;DR: In this article, a distributionally robust stochastic optimization (DRO) framework is proposed to learn a model providing good performance against perturbations to the data-generating distribution.
Abstract: A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts or unmodeled temporal effects. We develop and analyze a distributionally robust stochastic optimization (DRO) framework that learns a model providing good performance against perturbations to the data-generating distribution. We give a convex formulation for the problem, providing several convergence guarantees. We prove finite-sample minimax upper and lower bounds, showing that distributional robustness sometimes comes at a cost in convergence rates. We give limit theorems for the learned parameters, where we fully specify the limiting distribution so that confidence intervals can be computed. On real tasks including generalizing to unknown subpopulations, fine-grained recognition and providing good tail performance, the distributionally robust approach often exhibits improved performance.

35 citations
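The worst-case expected loss over a divergence ball around the empirical measure reduces to a one-dimensional convex dual problem. The sketch below uses a KL ball purely for simplicity — the paper works with a broader family of distributionally robust formulations — so the choice of divergence, radius, and loss data are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)

# Per-sample losses of some fitted model (placeholder data).
losses = rng.exponential(scale=1.0, size=10_000)

def kl_dro_worst_case(losses, rho):
    """Worst-case expected loss over {Q : KL(Q || P_n) <= rho}, via the
    convex dual:  min_{lam > 0}  rho*lam + lam*log E_Pn[exp(loss/lam)]."""
    m = losses.max()
    def dual(lam):
        # log-sum-exp stabilized by subtracting the maximum loss m
        return rho * lam + m + lam * np.log(np.mean(np.exp((losses - m) / lam)))
    res = minimize_scalar(dual, bounds=(1e-3, 100.0), method="bounded")
    return res.fun

wc = kl_dro_worst_case(losses, rho=0.1)
print(losses.mean(), wc)  # the worst case exceeds the empirical mean
```

As the radius rho shrinks to zero the worst-case value approaches the empirical mean, and as rho grows it approaches the maximum observed loss.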


Journal ArticleDOI
TL;DR: In this article, a family of U-statistics is presented as unbiased estimators of the $\ell_p$-norms of marginal or low-dimensional features of a high-dimensional joint distribution.
Abstract: Many high-dimensional hypothesis tests aim to globally examine marginal or low-dimensional features of a high-dimensional joint distribution, such as testing of mean vectors, covariance matrices and regression coefficients. This paper constructs a family of U-statistics as unbiased estimators of the $\ell_{p}$-norms of those features. We show that under the null hypothesis, the U-statistics of different finite orders are asymptotically independent and normally distributed. Moreover, they are also asymptotically independent of the maximum-type test statistic, whose limiting distribution is an extreme value distribution. Based on the asymptotic independence property, we propose an adaptive testing procedure which combines $p$-values computed from the U-statistics of different orders. We further establish power analysis results and show that the proposed adaptive procedure maintains high power against various alternatives.

33 citations


Journal ArticleDOI
TL;DR: In this paper, the authors studied the fluctuation of the outlier singular vectors of the matrix denoising model under fully general assumptions on the structure of the singular vectors and the distribution of the signal matrix.
Abstract: In this paper, we study the matrix denoising model $Y=S+X$, where $S$ is a low rank deterministic signal matrix and $X$ is a random noise matrix, and both are $M\times n$. In the scenario that $M$ and $n$ are comparably large and the signals are supercritical, we study the fluctuation of the outlier singular vectors of $Y$, under fully general assumptions on the structure of $S$ and the distribution of $X$. More specifically, we derive the limiting distribution of angles between the principal singular vectors of $Y$ and their deterministic counterparts, the singular vectors of $S$. Further, we also derive the distribution of the distance between the subspace spanned by the principal singular vectors of $Y$ and that spanned by the singular vectors of $S$. It turns out that the limiting distributions depend on the structure of the singular vectors of $S$ and the distribution of $X$, and thus they are nonuniversal. Statistical applications of our results to singular vector and singular subspace inferences are also discussed.

32 citations
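A quick numerical check of the setting: for a strong (supercritical) rank-one signal, the leading singular vectors of Y = S + X align closely with those of S. The signal strength, dimensions, and noise normalization below are illustrative choices, not the paper's exact assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

M, n = 300, 600
# Rank-one signal S = d * u v^T with unit singular vectors u, v.
d = 10.0
u = rng.standard_normal(M); u /= np.linalg.norm(u)
v = rng.standard_normal(n); v /= np.linalg.norm(v)
S = d * np.outer(u, v)

# Noise with entries of variance 1/n (a common normalization).
X = rng.standard_normal((M, n)) / np.sqrt(n)
Y = S + X

# Principal singular vectors of Y vs. their deterministic counterparts.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
cos_left = abs(U[:, 0] @ u)    # |cos(angle)| between leading left vectors
cos_right = abs(Vt[0] @ v)
print(cos_left, cos_right)     # both near 1 for a strong signal
```

The paper characterizes the fluctuations of these angles around their deterministic limits; the snippet only verifies the first-order alignment.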


Journal ArticleDOI
TL;DR: In this article, the authors studied the asymptotic properties of the pooled CCE estimator under more realistic conditions and showed that the true number of common factors, r, can be larger than the number of estimated factors, which in CCE is given by k + 1, where k is the number number of regressors.

28 citations


Journal ArticleDOI
TL;DR: This article establishes strong consistency and asymptotic normality of the maximum likelihood estimator for stochastic time-varying parameter models driven by the score of the predictive conditional likelihood function.

25 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider a positive-valued time series whose conditional distribution has a time-varying mean, which may depend on exogenous variables and provide conditions for the existence of marginal moments and for the geometric decay of the beta-mixing coefficients.
Abstract: We consider a positive-valued time series whose conditional distribution has a time-varying mean, which may depend on exogenous variables. The main applications concern count or duration data. Under a contraction condition on the mean function, it is shown that stationarity and ergodicity hold when the mean and stochastic orders of the conditional distribution are the same. The latter condition holds for the exponential family parametrized by the mean, but also for many other distributions. We also provide conditions for the existence of marginal moments and for the geometric decay of the beta-mixing coefficients. We give conditions for consistency and asymptotic normality of the Exponential Quasi-Maximum Likelihood Estimator (QMLE) of the conditional mean parameters. Simulation experiments and illustrations on series of stock market volumes and of greenhouse gas concentrations show that the multiplicative-error form of usual duration models deserves to be relaxed, as allowed in the present paper.

24 citations


Journal ArticleDOI
TL;DR: This paper proposes distributed algorithms that account for heterogeneous distributions by allowing site-specific nuisance parameters, and establishes a non-asymptotic risk bound for the proposed distributed estimator along with its limiting distribution in the two-index asymptotic setting.
Abstract: In multicenter research, individual-level data are often protected against sharing across sites. To overcome the barrier of data sharing, many distributed algorithms, which only require sharing aggregated information, have been developed. The existing distributed algorithms usually assume the data are homogeneously distributed across sites. This assumption ignores the important fact that the data collected at different sites may come from various sub-populations and environments, which can lead to heterogeneity in the distribution of the data. Ignoring this heterogeneity may lead to erroneous statistical inference. In this paper, we propose distributed algorithms which account for the heterogeneous distributions by allowing site-specific nuisance parameters. The proposed methods extend the surrogate likelihood approach to the heterogeneous setting by applying a novel density ratio tilting method to the efficient score function. The proposed algorithms maintain the same communication cost as existing communication-efficient algorithms. We establish a non-asymptotic risk bound for the proposed distributed estimator and derive its limiting distribution in the two-index asymptotic setting, which allows both the sample size per site and the number of sites to go to infinity. In addition, we show that the asymptotic variance of the estimator attains the Cramér-Rao lower bound when the number of sites grows at a slower rate than the sample size at each site. Finally, we use simulation studies and a real data application to demonstrate the validity and feasibility of the proposed methods.

24 citations


Journal ArticleDOI
TL;DR: The proposed autoregressive model outperforms existing methods in two of the data sets, while the best empirical performance in the other two data sets is attained by existing methods based on functional transformations of the densities.
Abstract: Data consisting of time-indexed distributions of cross-sectional or intraday returns have been extensively studied in finance, and provide one example in which the data atoms consist of serially dependent probability distributions. Motivated by such data, we propose an autoregressive model for density time series by exploiting the tangent space structure on the space of distributions that is induced by the Wasserstein metric. The densities themselves are not assumed to have any specific parametric form, leading to flexible forecasting of future unobserved densities. The main estimation targets in the order-$p$ Wasserstein autoregressive model are Wasserstein autocorrelations and the vector-valued autoregressive parameter. We propose suitable estimators and establish their asymptotic normality, which is verified in a simulation study. The new order-$p$ Wasserstein autoregressive model leads to a prediction algorithm, which includes a data driven order selection procedure. Its performance is compared to existing prediction procedures via application to four financial return data sets, where a variety of metrics are used to quantify forecasting accuracy. For most metrics, the proposed model outperforms existing methods in two of the data sets, while the best empirical performance in the other two data sets is attained by existing methods based on functional transformations of the densities.

Journal ArticleDOI
TL;DR: The density estimator (or smoothed histogram) is closely related to the Dirichlet kernel estimator from Ouimet (2020), and can also be used to analyze compositional data.

Journal ArticleDOI
TL;DR: In this paper, a Laplace-type estimator is proposed for constructing estimates and confidence sets for the date of a structural change, based on the continuous record asymptotic framework.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the asymptotic behavior of the center of mass of the elephant random walk, which is a discrete-time random walk on integers with a complete memory of its whole history.

Journal ArticleDOI
Konrad Menzel
TL;DR: In this paper, a bootstrap procedure for data that may exhibit cluster-dependence in two or more dimensions is proposed, where the asymptotic distribution of the sample mean or other statistics may be non-Gaussian if observations are dependent but uncorrelated within clusters.
Abstract: We propose a bootstrap procedure for data that may exhibit cluster‐dependence in two or more dimensions. The asymptotic distribution of the sample mean or other statistics may be non‐Gaussian if observations are dependent but uncorrelated within clusters. We show that there exists no procedure for estimating the limiting distribution of the sample mean under two‐way clustering that achieves uniform consistency. However, we propose bootstrap procedures that achieve adaptivity with respect to different uniformity criteria. Important cases and extensions discussed in the paper include regression inference, U‐ and V‐statistics, subgraph counts for network data, and non‐exhaustive samples of matched data.
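One simple variant of a bootstrap for two-way cluster dependence multiplies centered observations by independent Rademacher weights drawn per row-cluster and per column-cluster. This is a hedged sketch of the general idea, not necessarily the exact adaptive procedure studied in the paper; the data-generating process is a toy additive two-way model.

```python
import numpy as np

rng = np.random.default_rng(4)

# Data indexed by two cluster dimensions (e.g. firms x markets):
# y_{gh} = a_g + b_h + e_{gh}, dependent within rows and within columns.
G, H = 30, 30
a = rng.standard_normal(G)[:, None]
b = rng.standard_normal(H)[None, :]
y = a + b + rng.standard_normal((G, H))

ybar = y.mean()

# Wild bootstrap with separate Rademacher weights per row- and per
# column-cluster applied to the centered data.
B = 2_000
dev = y - ybar
boot = np.empty(B)
for bi in range(B):
    wg = rng.choice([-1.0, 1.0], size=(G, 1))
    wh = rng.choice([-1.0, 1.0], size=(1, H))
    boot[bi] = (ybar + wg * wh * dev).mean()

ci = np.quantile(boot, [0.025, 0.975])
print(ci)  # bootstrap interval for the mean under two-way dependence
```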

Journal ArticleDOI
TL;DR: The asymptotic normality of underlying DRO estimators as well as the properties of an optimal (in a suitable sense) confidence region induced by the Wasserstein DRO formulation are studied.
Abstract: Wasserstein distributionally robust optimization estimators are obtained as solutions of min-max problems in which the statistician selects a parameter minimizing the worst-case loss among all probability models within a certain distance (in a Wasserstein sense) from the underlying empirical measure. While motivated by the need to identify optimal model parameters or decision choices that are robust to model misspecification, these distributionally robust estimators recover a wide range of regularized estimators, including square-root lasso and support vector machines, among others, as particular cases. This paper studies the asymptotic normality of these distributionally robust estimators as well as the properties of an optimal (in a suitable sense) confidence region induced by the Wasserstein distributionally robust optimization formulation. In addition, key properties of min-max distributionally robust optimization problems are also studied, for example, we show that distributionally robust estimators regularize the loss based on its derivative and we also derive general sufficient conditions which show the equivalence between the min-max distributionally robust optimization problem and the corresponding max-min formulation.

Journal ArticleDOI
TL;DR: In this paper, an estimator of factor strength is proposed and its consistency and asymptotic distribution are established; the estimator is based on the number of statistically significant factor loadings, taking account of the multiple testing problem.
Abstract: This paper proposes an estimator of factor strength and establishes its consistency and asymptotic distribution. The proposed estimator is based on the number of statistically significant factor loadings, taking account of the multiple testing problem. We focus on the case where the factors are observed which is of primary interest in many applications in macroeconomics and finance. We also consider using cross section averages as a proxy in the case of unobserved common factors. We face a fundamental factor identification issue when there are more than one unobserved common factors. We investigate the small sample properties of the proposed estimator by means of Monte Carlo experiments under a variety of scenarios. In general, we find that the estimator, and the associated inference, perform well. The test is conservative under the null hypothesis, but, nevertheless, has excellent power properties, especially when the factor strength is sufficiently high. Application of the proposed estimation strategy to factor models of asset returns shows that out of 146 factors recently considered in the finance literature, only the market factor is truly strong, while all other factors are at best semi-strong, with their strength varying considerably over time. Similarly, we only find evidence of semi-strong factors in an updated version of the Stock and Watson (2012) macroeconomic dataset.
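A minimal sketch of the counting idea behind such an estimator: if roughly [n^alpha] units have significant loadings on an observed factor, then log(count)/log(n) estimates the strength alpha. The Bonferroni-style critical value below is a simplification of the paper's multiple-testing correction, and the panel is simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Panel of n units over T periods with one observed factor f_t; only the
# first n0 = n^alpha units load on it, so the factor strength is alpha.
n, T, alpha_true = 500, 200, 0.8
n0 = int(n ** alpha_true)
f = rng.standard_normal(T)
loadings = np.zeros(n); loadings[:n0] = 1.0
x = loadings[:, None] * f[None, :] + rng.standard_normal((n, T))

# Unit-by-unit t-tests of the loading, with a Bonferroni-style critical
# value to account for multiple testing.
fc = f - f.mean()
xc = x - x.mean(axis=1, keepdims=True)
betas = xc @ fc / (fc @ fc)
resid = xc - betas[:, None] * fc[None, :]
se = np.sqrt((resid ** 2).sum(axis=1) / (T - 2) / (fc @ fc))
crit = stats.norm.ppf(1 - 0.05 / (2 * n))
n_sig = int(np.sum(np.abs(betas / se) > crit))

# If about [n^alpha] loadings are significant, log(n_sig)/log(n) estimates alpha.
alpha_hat = np.log(n_sig) / np.log(n)
print(alpha_hat)  # close to 0.8
```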

Journal ArticleDOI
TL;DR: This study establishes the almost complete convergence with rate of the expectile regression estimator, and obtains the asymptotic normality of the proposed estimator under some mild conditions.

Journal ArticleDOI
TL;DR: The results show that the bootstrap method is superior to previously used approaches relying on the asymptotic distribution of the tests that assumed the data come from a normal distribution.
Abstract: It is important to examine the symmetry of an underlying distribution before applying some statistical procedures to a data set. For example, in the Zuni School District case, a formula originally developed by the Department of Education trimmed 5% of the data symmetrically from each end. The validity of this procedure was questioned at the hearing by Chief Justice Roberts. Most tests of symmetry (even nonparametric ones) are not distribution free in finite sample sizes. Hence, using the asymptotic distribution may not yield an accurate type I error rate and/or may result in a loss of power in small samples. Bootstrap resampling from a symmetric empirical distribution function fitted to the data is proposed to improve the accuracy of the calculated p-value of several tests of symmetry. The results show that the bootstrap method is superior to previously used approaches relying on the asymptotic distribution of the tests, which assumed the data come from a normal distribution. Incorporating the bootstrap estimate in a recently proposed test due to Miao, Gel and Gastwirth (2006) preserves its level and shows it has reasonable power properties on the family of distributions evaluated.
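A hedged sketch of the resampling idea: bootstrap from the symmetrized empirical distribution (the data plus their reflections about the median), which satisfies the null of symmetry by construction. The mean-minus-median statistic is a simple placeholder, not the specific tests studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(6)

def sym_boot_pvalue(x, B=2000):
    """Bootstrap p-value for a simple symmetry statistic.

    Statistic: sqrt(n)*(mean - median)/sd.  Resampling is from the
    symmetrized empirical distribution, which is symmetric by construction
    and so respects the null hypothesis.
    """
    n = len(x)
    stat = np.sqrt(n) * (x.mean() - np.median(x)) / x.std(ddof=1)
    sym = np.concatenate([x, 2 * np.median(x) - x])  # symmetrized sample
    boots = np.empty(B)
    for b in range(B):
        xb = rng.choice(sym, size=n, replace=True)
        boots[b] = np.sqrt(n) * (xb.mean() - np.median(xb)) / xb.std(ddof=1)
    return float(np.mean(np.abs(boots) >= abs(stat)))

p_sym = sym_boot_pvalue(rng.standard_normal(300))    # symmetric data
p_skew = sym_boot_pvalue(rng.exponential(size=300))  # right-skewed data
print(p_sym, p_skew)  # the skewed sample should give the smaller p-value
```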

Journal ArticleDOI
TL;DR: For massive survival data, a subsampling algorithm is proposed to efficiently approximate the estimates of regression parameters in the additive hazards model and establishes consistency and asymptotic normality of the subsample‐based estimator given the full data.
Abstract: For massive survival data, we propose a subsampling algorithm to efficiently approximate the estimates of regression parameters in the additive hazards model. We establish consistency and asymptotic normality of the subsample-based estimator given the full data. The optimal subsampling probabilities are obtained via minimizing asymptotic variance of the resulting estimator. The subsample-based procedure can largely reduce the computational cost compared with the full data method. In numerical simulations, our method has low bias and satisfactory coverage probabilities. We provide an illustrative example on the survival analysis of patients with lymphoma cancer from the Surveillance, Epidemiology, and End Results Program.

Posted Content
TL;DR: In this paper, the authors considered the problem of proving a central limit theorem for the empirical optimal transport cost in the semi-discrete case, i.e., when the distribution $P$ is finitely supported, and showed that the asymptotic distribution is the supremum of a centered Gaussian process.
Abstract: We address the problem of proving a Central Limit Theorem for the empirical optimal transport cost, $\sqrt{n}\{\mathcal{T}_c(P_n,Q)-\mathcal{T}_c(P,Q)\}$, in the semi-discrete case, i.e. when the distribution $P$ is finitely supported. We show that the asymptotic distribution is the supremum of a centered Gaussian process, which is Gaussian under some additional conditions on the probability $Q$ and on the cost. Such results imply the central limit theorem for the $p$-Wasserstein distance, for $p\geq 1$. Finally, the semi-discrete framework provides a control on the second derivative of the dual formulation, which yields the first central limit theorem for the optimal transport potentials.
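When $P$ is finitely supported, the empirical cost $\mathcal{T}_c(P_n,Q)$ is the value of a finite linear program. The sketch below solves the Kantorovich LP with scipy for a small example; the squared-distance cost and the (also discrete) choice of $Q$ are assumptions made purely so the program stays finite-dimensional.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(7)

# Finitely supported P and a small discrete Q.
p_atoms = np.array([-1.0, 0.0, 2.0])
p_probs = np.array([0.3, 0.4, 0.3])
q_atoms = np.array([-0.5, 0.5, 1.0, 1.5])
q_probs = np.full(4, 0.25)

# Empirical measure P_n from an n-sample of P.
n_samp = 500
x = rng.choice(p_atoms, size=n_samp, p=p_probs)
support, counts = np.unique(x, return_counts=True)
pn = counts / n_samp

# Kantorovich LP for T_c(P_n, Q) with cost c(x, y) = |x - y|^2:
#   min <C, G>  subject to  row sums of G = pn,  column sums = q,  G >= 0.
C = (support[:, None] - q_atoms[None, :]) ** 2
n, m = C.shape
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0   # row-sum constraints
for j in range(m):
    A_eq[n + j, j::m] = 1.0            # column-sum constraints
b_eq = np.concatenate([pn, q_probs])
res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
print(res.fun)  # empirical optimal transport cost T_c(P_n, Q)
```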

Journal ArticleDOI
TL;DR: A scaling technique is proposed to determine a causal order of the node variables, and the scalings and dependence parameters are estimated based on asymptotic normality of the empirical spectral measure.

Journal ArticleDOI
TL;DR: In this article, a model-based bootstrap procedure is proposed for estimating the sparsified version of the underlying vector autoregressive model, and the asymptotic distribution of such estimators in the time series context is derived.
Abstract: Fitting sparse models to high-dimensional time series is an important area of statistical inference. In this paper, we consider sparse vector autoregressive models and develop appropriate bootstrap methods to infer properties of such processes. Our bootstrap methodology generates pseudo time series using a model-based bootstrap procedure which involves an estimated, sparsified version of the underlying vector autoregressive model. Inference is performed using so-called de-sparsified or de-biased estimators of the autoregressive model parameters. We derive the asymptotic distribution of such estimators in the time series context and establish asymptotic validity of the bootstrap procedure proposed for estimation and, appropriately modified, for testing purposes. In particular, we focus on testing that large groups of autoregressive coefficients equal zero. Our theoretical results are complemented by simulations which investigate the finite sample performance of the bootstrap methodology proposed. A real-life data application is also presented.
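A minimal sketch of a model-based VAR bootstrap: fit the model, sparsify it, then regenerate pseudo time series from the sparsified model with resampled residuals and refit on each. Hard thresholding of OLS estimates stands in here for the lasso-type estimators of the paper, and the refit is plain OLS rather than a de-sparsified estimator.

```python
import numpy as np

rng = np.random.default_rng(8)

# Simulate a sparse VAR(1): only two nonzero coefficients.
d, T = 5, 400
A = np.zeros((d, d)); A[0, 0], A[1, 3] = 0.5, -0.4
y = np.zeros((T, d))
for t in range(1, T):
    y[t] = y[t - 1] @ A.T + rng.standard_normal(d)

# Fit by OLS and sparsify via hard thresholding.
Ylag, Ycur = y[:-1], y[1:]
A_ols = np.linalg.lstsq(Ylag, Ycur, rcond=None)[0].T
A_sparse = np.where(np.abs(A_ols) > 0.1, A_ols, 0.0)

# Model-based bootstrap: regenerate pseudo series from the sparsified
# model with i.i.d. resampled residuals, then refit on each pseudo series.
resid = Ycur - Ylag @ A_sparse.T

def pseudo_series():
    yb = np.zeros((T, d))
    e = resid[rng.integers(0, T - 1, size=T)]
    for t in range(1, T):
        yb[t] = yb[t - 1] @ A_sparse.T + e[t]
    return yb

boot_coef = np.array([np.linalg.lstsq(ps[:-1], ps[1:], rcond=None)[0].T[0, 0]
                      for ps in (pseudo_series() for _ in range(200))])
print(A_sparse[0, 0], boot_coef.std())  # estimate and its bootstrap s.e.
```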

Journal ArticleDOI
TL;DR: An $L_2$-norm-based test is proposed and studied, and it is shown that under some regularity conditions and the null hypothesis, the test statistic and a chi-square-type mixture have the same normal or non-normal limiting distribution.

Journal ArticleDOI
TL;DR: For multiple change-points detection of high-dimensional time series, asymptotic theory concerning the consistency and the asymptotic distribution of the breakpoint statistics and estimations is provided.
Abstract: For multiple change-points detection of high-dimensional time series, we provide asymptotic theory concerning the consistency and the asymptotic distribution of the breakpoint statistics and estima...

Journal ArticleDOI
07 Jun 2021
TL;DR: In this article, the exact sampling distribution of the estimated optimal portfolio weights and their characteristics is derived via a stochastic representation, which also yields a high-dimensional asymptotic approximation of this distribution.
Abstract: Optimal portfolio selection problems are determined by the (unknown) parameters of the data generating process. If an investor wants to realize the position suggested by the optimal portfolios, he/she needs to estimate the unknown parameters and to account for the parameter uncertainty in the decision process. Most often, the parameters of interest are the population mean vector and the population covariance matrix of the asset return distribution. In this paper, we characterize the exact sampling distribution of the estimated optimal portfolio weights and their characteristics. This is done by deriving the sampling distribution through its stochastic representation. This approach possesses several advantages, e.g. (i) it determines the sampling distribution of the estimated optimal portfolio weights by expressions which can be used to draw samples from this distribution efficiently; (ii) the application of the derived stochastic representation provides an easy way to obtain the asymptotic approximation of the sampling distribution. The latter property is used to show that the high-dimensional asymptotic distribution of optimal portfolio weights is a multivariate normal and to determine its parameters. Moreover, a consistent estimator of optimal portfolio weights and their characteristics is derived under high-dimensional settings. Via an extensive simulation study, we investigate the finite-sample performance of the derived asymptotic approximation and study its robustness to the violation of the model assumptions used in the derivation of the theoretical results.
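For intuition, the plug-in global minimum-variance weights w = S^{-1}1 / (1'S^{-1}1) can be computed from estimated covariances, and their sampling distribution examined by brute-force Monte Carlo; the paper instead characterizes this distribution exactly via a stochastic representation. The equicorrelated covariance and sample sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)

# Asset returns: k assets, n observations, equicorrelated covariance
# (so the true GMV weights are equal, 1/k, by symmetry).
k, n = 5, 250
Sigma = 0.04 * (0.5 * np.eye(k) + 0.5)
R = rng.multivariate_normal(np.full(k, 0.001), Sigma, size=n)

def gmv_weights(returns):
    """Plug-in global minimum-variance weights w = S^{-1}1 / (1'S^{-1}1)."""
    S_inv = np.linalg.inv(np.cov(returns, rowvar=False))
    ones = np.ones(returns.shape[1])
    w = S_inv @ ones
    return w / (ones @ w)

w_hat = gmv_weights(R)

# Monte Carlo view of the sampling distribution of the estimated weights.
draws = np.array([gmv_weights(rng.multivariate_normal(np.full(k, 0.001),
                                                      Sigma, size=n))
                  for _ in range(500)])
print(w_hat.sum(), draws.mean(axis=0))  # weights sum to 1; mean near 1/k
```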

Journal ArticleDOI
TL;DR: In this article, a Gibbs version of the ABC approach is explored, which runs component-wise approximate Bayesian computation steps aimed at the corresponding conditional posterior distributions, and based on summary statistics of reduced dimensions.
Abstract: Approximate Bayesian computation methods are useful for generative models with intractable likelihoods. These methods are however sensitive to the dimension of the parameter space, requiring exponentially increasing resources as this dimension grows. To tackle this difficulty, we explore a Gibbs version of the ABC approach that runs component-wise approximate Bayesian computation steps aimed at the corresponding conditional posterior distributions, and based on summary statistics of reduced dimensions. While lacking the standard justifications for the Gibbs sampler, the resulting Markov chain is shown to converge in distribution under some partial independence conditions. The associated stationary distribution can further be shown to be close to the true posterior distribution and some hierarchical versions of the proposed mechanism enjoy a closed form limiting distribution. Experiments also demonstrate the gain in efficiency brought by the Gibbs version over the standard solution.
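A toy ABC-within-Gibbs sketch: each parameter is updated in turn by an accept/reject ABC step that matches a low-dimensional summary statistic, conditional on the current value of the other parameter. The normal model, the summaries, tolerances, and priors below are all illustrative choices, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(10)

# Toy model: y_i ~ N(mu, sigma^2); priors mu ~ N(0, 10), sigma ~ U(0.1, 5).
n = 200
y = rng.normal(2.0, 1.5, size=n)
s_mean, s_sd = y.mean(), y.std(ddof=1)

def abc_step(sample_param, simulate, summary, target, eps, n_try=200):
    """One component-wise ABC move: accept the first prior draw whose
    simulated summary lands within eps of the observed one."""
    for _ in range(n_try):
        theta = sample_param()
        if abs(summary(simulate(theta)) - target) < eps:
            return theta
    return None  # fall back: caller keeps the current value

mu, sigma = 0.0, 1.0
chain = []
for _ in range(300):
    # ABC update of mu given sigma, using the sample mean as summary.
    new = abc_step(lambda: rng.normal(0, np.sqrt(10)),
                   lambda m: rng.normal(m, sigma, size=n),
                   lambda z: z.mean(), s_mean, eps=0.2)
    if new is not None:
        mu = new
    # ABC update of sigma given mu, using the sample s.d. as summary.
    new = abc_step(lambda: rng.uniform(0.1, 5),
                   lambda s: rng.normal(mu, s, size=n),
                   lambda z: z.std(ddof=1), s_sd, eps=0.2)
    if new is not None:
        sigma = new
    chain.append((mu, sigma))

post = np.array(chain[100:])  # discard burn-in
print(post.mean(axis=0))      # approximate posterior means, near (2.0, 1.5)
```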

Journal ArticleDOI
TL;DR: In this paper, a new multivariate volatility model that belongs to the family of conditional correlation GARCH models is introduced, where the GARCH equations of this model contain a multiplicative deterministic component to describe long run movements in volatility and, in addition, the correlations are deterministically time-varying.

Journal ArticleDOI
TL;DR: In this paper, the authors examine likelihood ratio-based tests for detecting regime switching and establish their asymptotic distributions in the context of nonlinear models, allowing multiple parameters to be affected by regime switching.
Abstract: Markov regime-switching models are very common in economics and finance. Despite persisting interest in them, the asymptotic distributions of likelihood ratio-based tests for detecting regime switching remain unknown. This study examines such tests and establishes their asymptotic distributions in the context of nonlinear models, allowing multiple parameters to be affected by regime switching. The analysis addresses three difficulties: (i) some nuisance parameters are unidentified under the null hypothesis, (ii) the null hypothesis yields a local optimum, and (iii) the conditional regime probabilities follow stochastic processes that can only be represented recursively. Addressing these issues permits substantial power gains in empirically relevant settings. This study also presents the following results: (1) a characterization of the conditional regime probabilities and their derivatives with respect to the model’s parameters, (2) a high-order approximation to the log-likelihood ratio, (3) a refinement of the asymptotic distribution, and (4) a unified algorithm to simulate the critical values. For models that are linear under the null hypothesis, the elements needed for the algorithm can all be computed analytically. Furthermore, the above results explain why some bootstrap procedures can be inconsistent, and why standard information criteria can be sensitive to the hypothesis and the model structure. When applied to US quarterly real gross domestic product (GDP) growth rate data, the methods detect relatively strong evidence favouring the regime-switching specification. Lastly, we apply the methods in the context of dynamic stochastic equilibrium models and obtain similar results as the GDP case.

Journal ArticleDOI
TL;DR: This article establishes asymptotic normality of estimating function estimators in a very general setting of nonstationary point processes and adapts this result to the case of non stationary determinantal point processes, which are an important class of models for repulsive point patterns.
Abstract: Estimating function inference is indispensable for many common point process models where the joint intensities are tractable while the likelihood function is not. In this paper we establish asymptotic normality of estimating function estimators in a very general setting of non‐stationary point processes. We then adapt this result to the case of non‐stationary determinantal point processes which are an important class of models for repulsive point patterns. In practice often first and second order estimating functions are used. For the latter it is common practice to omit contributions for pairs of points separated by a distance larger than some truncation distance which is usually specified in an ad hoc manner. We suggest instead a data‐driven approach where the truncation distance is adapted automatically to the point process being fitted and where the approach integrates seamlessly with our asymptotic framework. The good performance of the adaptive approach is illustrated via simulation studies for non‐stationary determinantal point processes and by an application to a real dataset.