
Showing papers in "Bernoulli in 2018"


Journal ArticleDOI
TL;DR: In this paper, the authors investigate statistical properties of change point estimators based on moving sum statistics and derive rates of convergence for the estimation of the location of the change points and show that these rates are strict by deriving the limit distribution.
Abstract: In this work, we investigate statistical properties of change point estimators based on moving sum statistics. We extend results for testing in a classical situation with multiple deterministic change points by allowing for random exogenous change points that arise in Hidden Markov or regime switching models among others. To this end, we consider a multiple mean change model with possible time series errors and prove that the number and location of change points are estimated consistently by this procedure. Additionally, we derive rates of convergence for the estimation of the location of the change points and show that these rates are strict by deriving the limit distribution of properly scaled estimators. Because the small sample behavior depends crucially on how the asymptotic (long-run) variance of the error sequence is estimated, we propose to use moving sum type estimators for the (long-run) variance and derive their asymptotic properties. While they do not estimate the variance consistently at every point in time, they can still be used to consistently estimate the number and location of the changes. In fact, this inconsistency can even lead to more precise estimators for the change points. Finally, some simulations illustrate the behavior of the estimators in small samples, showing that their performance is very good compared to existing methods.
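As a toy illustration of the idea (not the paper's full procedure: the long-run variance is taken as known and equal to 1, and only a single mean change is planted), the moving-sum statistic can be computed from cumulative sums and its peak located:

```python
import numpy as np

def mosum(x, G):
    # Moving-sum statistic T_k = |sum_{i=k+1}^{k+G} x_i - sum_{i=k-G+1}^{k} x_i| / sqrt(2G),
    # evaluated for k = G, ..., n - G; peaks indicate mean changes.
    x = np.asarray(x, dtype=float)
    n = len(x)
    cs = np.concatenate(([0.0], np.cumsum(x)))
    k = np.arange(G, n - G + 1)
    right = cs[k + G] - cs[k]
    left = cs[k] - cs[k - G]
    return k, np.abs(right - left) / np.sqrt(2 * G)

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
k, T = mosum(x, G=50)
k_hat = k[np.argmax(T)]  # estimated change-point location (true change at 200)
```

Peaks of `T` exceeding a critical value indicate change points; the paper's moving-sum variance estimators would replace the unit normalisation used here.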

126 citations


Journal ArticleDOI
TL;DR: In this paper, the distance between the $n$th step distributions of two Markov chains is shown to be bounded when one of them satisfies a Wasserstein ergodicity condition.
Abstract: Perturbation theory for Markov chains addresses the question of how small differences in the transition probabilities of Markov chains are reflected in differences between their distributions. We prove powerful and flexible bounds on the distance of the nth step distributions of two Markov chains when one of them satisfies a Wasserstein ergodicity condition. Our work is motivated by the recent interest in approximate Markov chain Monte Carlo (MCMC) methods in the analysis of big data sets. By using an approach based on Lyapunov functions, we provide estimates for geometrically ergodic Markov chains under weak assumptions. In an autoregressive model, our bounds cannot be improved in general. We illustrate our theory by showing quantitative estimates for approximate versions of two prominent MCMC algorithms, the Metropolis-Hastings and stochastic Langevin algorithms.
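A minimal finite-state illustration of the phenomenon (a hypothetical 3-state chain, not the paper's Wasserstein setting): under a Dobrushin-type contraction, the gap between the $n$-step distributions of a chain and an $\varepsilon$-perturbed version stays of order $\varepsilon$ instead of growing with $n$:

```python
import numpy as np

# Exact n-step distributions of two nearby 3-state chains: under ergodicity
# the total-variation gap stays bounded by eps / (1 - delta), where delta is
# the Dobrushin coefficient, instead of accumulating linearly in n.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
eps = 0.02
Q = P.copy()
Q[0] = [0.5 + eps, 0.3 - eps, 0.2]   # perturb one row by eps in total variation

def tv(a, b):
    return 0.5 * np.abs(a - b).sum()

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 0.0, 0.0])
gaps = []
for n in range(30):
    a, b = a @ P, b @ Q
    gaps.append(tv(a, b))
```

Here the Dobrushin coefficient of `P` is 0.3, so the gap can never exceed `eps / 0.7`, roughly 0.029, however many steps are taken.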

98 citations


Journal ArticleDOI
TL;DR: In this article, the posterior probability distribution of the fixed parameters of a state-space dynamical system using a sequential Monte Carlo method is approximated in a purely recursive manner, in which the computational complexity of the recursive steps of the method introduced herein is constant over time.
Abstract: We address the problem of approximating the posterior probability distribution of the fixed parameters of a state-space dynamical system using a sequential Monte Carlo method. The proposed approach relies on a nested structure that employs two layers of particle filters to approximate the posterior probability measure of the static parameters and the dynamic state variables of the system of interest, in a vein similar to the recent “sequential Monte Carlo square” (SMC$^{2}$) algorithm. However, unlike the SMC$^{2}$ scheme, the proposed technique operates in a purely recursive manner. In particular, the computational complexity of the recursive steps of the method introduced herein is constant over time. We analyse the approximation of integrals of real bounded functions with respect to the posterior distribution of the system parameters computed via the proposed scheme. As a result, we prove, under regularity assumptions, that the approximation errors vanish asymptotically in $L_{p}$ ($p\ge1$) with convergence rate proportional to $\frac{1}{\sqrt{N}}+\frac{1}{\sqrt{M}}$, where $N$ is the number of Monte Carlo samples in the parameter space and $N\times M$ is the number of samples in the state space. This result also holds for the approximation of the joint posterior distribution of the parameters and the state variables. We discuss the relationship between the SMC$^{2}$ algorithm and the new recursive method and present a simple example in order to illustrate some of the theoretical findings with computer simulations.
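The inner layer of such nested schemes is a standard particle filter. A minimal bootstrap filter for a toy linear-Gaussian state-space model (a sketch of the building block only, not the two-layer algorithm itself) looks like:

```python
import numpy as np

def bootstrap_pf(y, a, sig_x, sig_y, N, rng):
    # Minimal bootstrap particle filter for the linear-Gaussian model
    # X_t = a X_{t-1} + N(0, sig_x^2),  Y_t = X_t + N(0, sig_y^2);
    # returns the filtered posterior means E[X_t | y_{1:t}].
    x = rng.normal(0.0, 1.0, N)
    means = []
    for yt in y:
        x = a * x + rng.normal(0.0, sig_x, N)        # propagate particles
        logw = -0.5 * ((yt - x) / sig_y) ** 2        # Gaussian log-likelihood weights
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))
        x = rng.choice(x, size=N, p=w)               # multinomial resampling
    return np.array(means)

rng = np.random.default_rng(1)
T, a, sig_x, sig_y = 100, 0.9, 0.5, 0.5
xs = np.zeros(T)
for t in range(1, T):
    xs[t] = a * xs[t - 1] + rng.normal(0.0, sig_x)
ys = xs + rng.normal(0.0, sig_y, T)
m = bootstrap_pf(ys, a, sig_x, sig_y, N=2000, rng=rng)
```

The nested scheme of the paper runs a second layer of particles over the static parameters, with each outer particle carrying an inner filter like this one.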

84 citations


Journal ArticleDOI
TL;DR: In this article, the essential boundedness of potential functions associated with the i-cSMC algorithm is shown to provide necessary and sufficient conditions for the uniform ergodicity of the i-cSMC Markov chain, together with quantitative bounds on its geometric rate of convergence.
Abstract: We establish quantitative bounds for rates of convergence and asymptotic variances for iterated conditional sequential Monte Carlo (i-cSMC) Markov chains and associated particle Gibbs samplers [J. R. Stat. Soc. Ser. B. Stat. Methodol. 72 (2010) 269–342]. Our main findings are that the essential boundedness of potential functions associated with the i-cSMC algorithm provides necessary and sufficient conditions for the uniform ergodicity of the i-cSMC Markov chain, as well as quantitative bounds on its (uniformly geometric) rate of convergence. Furthermore, we show that the i-cSMC Markov chain cannot even be geometrically ergodic if this essential boundedness does not hold in many applications of interest. Our sufficiency and quantitative bounds rely on a novel non-asymptotic analysis of the expectation of a standard normalizing constant estimate with respect to a “doubly conditional” SMC algorithm. In addition, our results for i-cSMC imply that the rate of convergence can be improved arbitrarily by increasing $N$, the number of particles in the algorithm, and that in the presence of mixing assumptions, the rate of convergence can be kept constant by increasing $N$ linearly with the time horizon. We translate the sufficiency of the boundedness condition for i-cSMC into sufficient conditions for the particle Gibbs Markov chain to be geometrically ergodic and quantitative bounds on its geometric rate of convergence, which imply convergence of properties of the particle Gibbs Markov chain to those of its corresponding Gibbs sampler. These results complement recently discovered, and related, conditions for the particle marginal Metropolis–Hastings (PMMH) Markov chain.

58 citations


Journal ArticleDOI
TL;DR: A Gaussian approximation result is developed for the maximum of a sum of weakly dependent vectors, where the data dimension is allowed to be exponentially larger than sample size.
Abstract: We develop a Gaussian approximation result for the maximum of a sum of weakly dependent vectors, where the data dimension is allowed to be exponentially larger than sample size. Our result is established under the physical/functional dependence framework. This work can be viewed as a substantive extension of Chernozhukov et al. (Ann. Statist. 41 (2013) 2786–2819) to time series based on a variant of Stein’s method developed therein.

53 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of estimating the residual variance, the proportion of explained variance, and the signal strength in a high-dimensional linear regression model with Gaussian random design.
Abstract: We consider the equivalent problems of estimating the residual variance, the proportion of explained variance $\eta$, and the signal strength in a high-dimensional linear regression model with Gaussian random design. Our aim is to understand the impact of not knowing the sparsity of the vector of regression coefficients and not knowing the distribution of the design on minimax estimation rates of $\eta$. Depending on the sparsity $k$ of the vector of regression coefficients, optimal estimators of $\eta$ either rely on estimating the vector of regression coefficients or are based on $U$-type statistics. In the important situation where $k$ is unknown, we build an adaptive procedure whose convergence rate simultaneously achieves the minimax risk over all $k$ up to a logarithmic loss which we prove to be unavoidable. Finally, the knowledge of the design distribution is shown to play a critical role. When the distribution of the design is unknown, consistent estimation of explained variance is indeed possible in much narrower regimes than for known design distribution.

50 citations


Journal ArticleDOI
TL;DR: In this article, the authors revisited the methodology of Stein (1975, 1986) for estimating a covariance matrix in the setting where the number of variables can be of the same magnitude as the sample size, and they proposed an alternative solution by minimizing the limiting expression of the unbiased estimator of risk under large-dimensional asymptotics, rather than the finite-sample expression.
Abstract: This paper revisits the methodology of Stein (1975, 1986) for estimating a covariance matrix in the setting where the number of variables can be of the same magnitude as the sample size. Stein proposed to keep the eigenvectors of the sample covariance matrix but to shrink the eigenvalues. By minimizing an unbiased estimator of risk, Stein derived an ‘optimal’ shrinkage transformation. Unfortunately, the resulting estimator has two pitfalls: the shrinkage transformation can change the ordering of the eigenvalues and even make some of them negative. Stein suggested an ad hoc isotonizing algorithm that post-processes the transformed eigenvalues and thereby fixes these problems. We offer an alternative solution by minimizing the limiting expression of the unbiased estimator of risk under large-dimensional asymptotics, rather than the finite-sample expression. Compared to the isotonized version of Stein’s estimator, our solution is theoretically more elegant and also delivers improved performance, as evidenced by Monte Carlo simulations.
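The phenomenon Stein's approach exploits is easy to reproduce numerically: sample eigenvalues are over-dispersed relative to the population ones, so shrinking them toward their grand mean reduces risk. The naive linear shrinkage below only illustrates this effect and is not the paper's optimal nonlinear transformation:

```python
import numpy as np

# Sample eigenvalues are more dispersed than the population ones (here all 1);
# shrinking them toward their grand mean reduces the Frobenius-type risk.
rng = np.random.default_rng(2)
p, n = 50, 100
X = rng.normal(size=(n, p))              # true covariance is the identity
S = X.T @ X / n
lam = np.linalg.eigvalsh(S)              # spread roughly over [0.09, 2.9] (Marchenko-Pastur)
lam_shrunk = 0.5 * lam + 0.5 * lam.mean()   # naive linear shrinkage, illustration only
loss_raw = np.mean((lam - 1.0) ** 2)
loss_shrunk = np.mean((lam_shrunk - 1.0) ** 2)
```

Note that this keeps the sample eigenvectors untouched, exactly as in Stein's proposal; only the eigenvalues are transformed.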

50 citations


Journal ArticleDOI
TL;DR: All max-linear models which are generated by a recursive structural equation model are characterized, and it is shown that its max-linear coefficient matrix is the solution of a fixed point equation.
Abstract: We consider a new structural equation model in which all random variables can be written as a max-linear function of their parents and independent noise variables. For the corresponding graph we assume that it is a directed acyclic graph. We show that the model is max-linear and detail the relation between the weights of the structural equation model and the max-linear coefficients. We characterize all max-linear models which are generated by this structural equation model. This leads to the presentation of a max-linear structural equation model as the solution of a fixed point equation and to a unique minimal DAG describing the relationships between the variables. The model structure introduces an order between the random variables, which yields certain model reductions, represented by subgraphs of the DAG which we call order DAGs. This also results in a reduced form for the regular conditional distributions compared to previous representations.
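A small sketch of the fixed-point structure under stated assumptions (the chain DAG, weights, and noise values are illustrative): the max-linear coefficient matrix is the max-times transitive closure of the edge-weight matrix, and the structural recursion and the max-linear form then produce identical variables:

```python
import numpy as np

def maxlinear_coeffs(C):
    # Max-times transitive closure of the edge-weight matrix C of a DAG
    # (C[i, j] = weight of edge j -> i): B[i, j] is the maximal product of
    # edge weights over paths from j to i, with B[i, i] = 1. B solves the
    # fixed-point equation B = max(I, C * B) in the max-times semiring.
    d = C.shape[0]
    B = np.maximum(C, np.eye(d))
    for k in range(d):   # Floyd-Warshall pass in the max-times semiring
        B = np.maximum(B, np.outer(B[:, k], B[k, :]))
    return B

# Chain DAG 0 -> 1 -> 2 with weights 2 and 3
C = np.zeros((3, 3))
C[1, 0], C[2, 1] = 2.0, 3.0
B = maxlinear_coeffs(C)

# Structural recursion X_i = max(max_j C[i, j] X_j, Z_i) agrees with the
# max-linear form X_i = max_j B[i, j] Z_j for any noise vector Z
Z = np.array([1.0, 5.0, 4.0])
X0 = Z[0]
X1 = max(C[1, 0] * X0, Z[1])
X2 = max(C[2, 1] * X1, Z[2])
X_maxlin = np.max(B * Z[None, :], axis=1)
```

In this example the path 0 → 1 → 2 yields the coefficient `B[2, 0] = 2 * 3 = 6`, so `X2 = max(6 Z0, 3 Z1, Z2)`.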

46 citations


Journal ArticleDOI
TL;DR: In this paper, the authors study various transport-information inequalities under three different notions of Ricci curvature in the discrete setting: the curvature-dimension condition of Bakry and Emery (In Seminaire de Probabilites, XIX, 1983/84 (1985) 177–206 Springer), the exponential curvature-dimension condition of Bauer et al. (2013), and the coarse Ricci curvature of Ollivier (J. Funct. Anal. 256 (2009) 810–864).
Abstract: We study various transport-information inequalities under three different notions of Ricci curvature in the discrete setting: the curvature-dimension condition of Bakry and Emery (In Seminaire de Probabilites, XIX, 1983/84 (1985) 177–206 Springer), the exponential curvature-dimension condition of Bauer et al. (Li-Yau Inequality on Graphs (2013)) and the coarse Ricci curvature of Ollivier (J. Funct. Anal. 256 (2009) 810–864). We prove that under a curvature-dimension condition or coarse Ricci curvature condition, an $L_{1}$ transport-information inequality holds; while under an exponential curvature-dimension condition, some weak-transport information inequalities hold. As an application, we establish a Bonnet–Myers theorem under the curvature-dimension condition $\operatorname{CD}(\kappa,\infty)$ of Bakry and Emery (In Seminaire de Probabilites, XIX, 1983/84 (1985) 177–206 Springer).

45 citations


Journal ArticleDOI
TL;DR: In this paper, a wavelet analysis of operator fractional Brownian motion (OFBM) is presented, where the authors consider the evolution along scales of the eigenstructure of the wavelet spectrum.
Abstract: Operator fractional Brownian motion (OFBM) is the natural vector-valued extension of the univariate fractional Brownian motion. Instead of a scalar parameter, the law of an OFBM scales according to a Hurst matrix that affects every component of the process. In this paper, we develop the wavelet analysis of OFBM, as well as a new estimator for the Hurst matrix of bivariate OFBM. For OFBM, the univariate-inspired approach of analyzing the entry-wise behavior of the wavelet spectrum as a function of the (wavelet) scales is fraught with difficulties stemming from mixtures of power laws. Instead we consider the evolution along scales of the eigenstructure of the wavelet spectrum. This is shown to yield consistent and asymptotically normal estimators of the Hurst eigenvalues, and also of the eigenvectors under assumptions. A simulation study is included to demonstrate the good performance of the estimators under finite sample sizes.

45 citations


Journal ArticleDOI
TL;DR: In this paper, the least squares estimator (LSE) is considered for estimating an unknown matrix from noisy observations under the constraint that the matrix is nondecreasing in both rows and columns.
Abstract: We consider the problem of estimating an unknown $n_{1}\times n_{2}$ matrix $\mathbf{\theta}^{*}$ from noisy observations under the constraint that $\mathbf{\theta}^{*}$ is nondecreasing in both rows and columns. We consider the least squares estimator (LSE) in this setting and study its risk properties. We show that the worst case risk of the LSE is $n^{-1/2}$, up to multiplicative logarithmic factors, where $n=n_{1}n_{2}$ and that the LSE is minimax rate optimal (up to logarithmic factors). We further prove that for some special $\mathbf{\theta}^{*}$, the risk of the LSE could be much smaller than $n^{-1/2}$; in fact, it could even be parametric, that is, $n^{-1}$ up to logarithmic factors. Such parametric rates occur when the number of “rectangular” blocks of $\mathbf{\theta}^{*}$ is bounded from above by a constant. We also derive an interesting adaptation property of the LSE which we term variable adaptation – the LSE adapts to the “intrinsic dimension” of the problem and performs as well as the oracle estimator when estimating a matrix that is constant along each row/column. Our proofs, which borrow ideas from empirical process theory, approximation theory and convex geometry, are of independent interest.
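The one-dimensional analogue of this estimator is classical isotonic regression, computable exactly by pool-adjacent-violators; the sketch below shows that building block only (the bimonotone matrix projection itself requires more machinery):

```python
def pava(y):
    # Pool-adjacent-violators: exact least-squares fit that is nondecreasing,
    # the one-dimensional analogue of the bimonotone matrix LSE. Pools each
    # new value backwards with its predecessors while they violate monotonicity.
    vals, counts = [], []
    for v in y:
        vals.append(float(v))
        counts.append(1)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            total = vals[-1] * counts[-1] + vals[-2] * counts[-2]
            cnt = counts[-1] + counts[-2]
            vals[-2:] = [total / cnt]
            counts[-2:] = [cnt]
    fit = []
    for v, c in zip(vals, counts):
        fit.extend([v] * c)
    return fit
```

For example, `pava([1.0, 3.0, 2.0, 4.0])` pools the violating pair (3, 2) into their average 2.5 and leaves the rest alone.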

Journal ArticleDOI
TL;DR: In this article, it was shown that the maximum pseudolikelihood estimate (MPLE) of the natural parameter is $\sqrt{a_{N}}$-consistent at a point whenever the log-partition function has order $a_{N}$ in a neighborhood of that point.
Abstract: The Ising spin glass is a one-parameter exponential family model for binary data with quadratic sufficient statistic. In this paper, we show that given a single realization from this model, the maximum pseudolikelihood estimate (MPLE) of the natural parameter is $\sqrt{a_{N}}$-consistent at a point whenever the log-partition function has order $a_{N}$ in a neighborhood of that point. This gives consistency rates of the MPLE for ferromagnetic Ising models on general weighted graphs in all regimes, extending the results of Chatterjee (Ann. Statist. 35 (2007) 1931–1946) where only $\sqrt{N}$-consistency of the MPLE was shown. It is also shown that consistent testing, and hence estimation, is impossible in the high temperature phase in ferromagnetic Ising models on a converging sequence of simple graphs, which include the Curie–Weiss model. In this regime, the sufficient statistic is distributed as a weighted sum of independent $\chi^{2}_{1}$ random variables, and the asymptotic power of the most powerful test is determined. We also illustrate applications of our results on synthetic and real-world network data.

Journal ArticleDOI
TL;DR: In this article, a notion of exponential families for CRMs, which are called exponential CRM likelihoods, was introduced, allowing automatic Bayesian nonparametric conjugate priors for exponential CRMs.
Abstract: We demonstrate how to calculate posteriors for general Bayesian nonparametric priors and likelihoods based on completely random measures (CRMs). We further show how to represent Bayesian nonparametric priors as a sequence of finite draws using a size-biasing approach – and how to represent full Bayesian nonparametric models via finite marginals. Motivated by conjugate priors based on exponential family representations of likelihoods, we introduce a notion of exponential families for CRMs, which we call exponential CRMs. This construction allows us to specify automatic Bayesian nonparametric conjugate priors for exponential CRM likelihoods. We demonstrate that our exponential CRMs allow particularly straightforward recipes for size-biased and marginal representations of Bayesian nonparametric models. Along the way, we prove that the gamma process is a conjugate prior for the Poisson likelihood process and the beta prime process is a conjugate prior for a process we call the odds Bernoulli process. We deliver a size-biased representation of the gamma process and a marginal representation of the gamma process coupled with a Poisson likelihood process.

Journal ArticleDOI
TL;DR: Formulae are obtained for the expected number and height distribution of critical points of smooth isotropic Gaussian random fields parameterized on Euclidean space or spheres of arbitrary dimension based on a characterization of the distribution of the Hessian of the Gaussian field by means of the family of Gaussian orthogonally invariant matrices.
Abstract: We obtain formulae for the expected number and height distribution of critical points of smooth isotropic Gaussian random fields parameterized on Euclidean space or spheres of arbitrary dimension. The results hold in general in the sense that there are no restrictions on the covariance function of the field except for smoothness and isotropy. The results are based on a characterization of the distribution of the Hessian of the Gaussian field by means of the family of Gaussian orthogonally invariant (GOI) matrices, of which the Gaussian orthogonal ensemble (GOE) is a special case. The obtained formulae depend on the covariance function only through a single parameter (Euclidean space) or two parameters (spheres), and include the special boundary case of random Laplacian eigenfunctions.

Journal ArticleDOI
TL;DR: In this article, the authors study linear Hawkes process with an exponential kernel in the asymptotic regime where the initial intensity of the Hawkes Process is large and establish large deviations for Hawkes processes in this regime as well as the regime when both initial intensity and the time are large.
Abstract: Hawkes process is a class of simple point processes that is self-exciting and has clustering effect. The intensity of this point process depends on its entire past history. It has wide applications in finance, insurance, neuroscience, social networks, criminology, seismology, and many other fields. In this paper, we study linear Hawkes process with an exponential kernel in the asymptotic regime where the initial intensity of the Hawkes process is large. We establish large deviations for Hawkes processes in this regime as well as the regime when both the initial intensity and the time are large. We illustrate the strength of our results by discussing the applications to insurance and queueing systems.
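For the exponential kernel the conditional intensity is Markovian, which makes Ogata-style thinning particularly simple. A minimal simulator follows (a sketch with illustrative parameter values, unrelated to the paper's large-initial-intensity regime):

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, rng):
    # Ogata thinning for a linear Hawkes process with intensity
    # lambda(t) = mu + g(t),  g(t) = alpha * sum_{t_i < t} exp(-beta (t - t_i)).
    # The exponential kernel makes g Markovian: between events it only decays,
    # so the current intensity is a valid upper bound for the thinning step.
    events, t, g = [], 0.0, 0.0
    while True:
        lam_bar = mu + g
        dt = rng.exponential(1.0 / lam_bar)
        t += dt
        if t > T:
            return events
        g *= np.exp(-beta * dt)           # decayed excitation at the candidate time
        if rng.uniform() * lam_bar <= mu + g:
            events.append(t)
            g += alpha                    # self-excitation jump at the new event
```

With `mu=1`, `alpha=0.5`, `beta=1` the process is subcritical and its stationary rate is `mu / (1 - alpha/beta) = 2` events per unit time.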

Journal ArticleDOI
TL;DR: Building on the distance correlation of Szekely et al., the authors establish the relevant asymptotic theory for the sample auto- and cross-distance correlation functions of stationary time series and apply the ADCF to the residuals of an autoregressive process as a test of goodness of fit.
Abstract: The use of empirical characteristic functions for inference problems, including estimation in some special parametric settings and testing for goodness of fit, has a long history dating back to the 70s. More recently, there has been renewed interest in using empirical characteristic functions in other inference settings. The distance covariance and correlation, developed by Szekely et al. (Ann. Statist. 35 (2007) 2769–2794) and Szekely and Rizzo (Ann. Appl. Stat. 3 (2009) 1236–1265) for measuring dependence and testing independence between two random vectors, are perhaps the best known illustrations of this. We apply these ideas to stationary univariate and multivariate time series to measure lagged auto- and cross-dependence in a time series. Assuming strong mixing, we establish the relevant asymptotic theory for the sample auto- and cross-distance correlation functions. We also apply the auto-distance correlation function (ADCF) to the residuals of an autoregressive process as a test of goodness of fit. Under the null that an autoregressive model is true, the limit distribution of the empirical ADCF can differ markedly from the corresponding one based on an i.i.d. sequence. We illustrate the use of the empirical auto- and cross-distance correlation functions for testing dependence and cross-dependence of time series in a variety of contexts.
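A compact version of the sample distance correlation, plus the lagged form underlying the ADCF (scalar series only, using the Szekely et al. V-statistic normalisation): it detects the nonlinear dependence between $x$ and $x^2$ that Pearson correlation misses entirely:

```python
import numpy as np

def dcor(x, y):
    # Sample distance correlation (Szekely et al.): double-centred distance
    # matrices, then a normalised inner product.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    def dc(z):
        D = np.abs(z[:, None] - z[None, :])
        return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
    A, B = dc(x), dc(y)
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

def adcf(x, k):
    # Auto-distance correlation at lag k: dcor between (X_t) and (X_{t+k})
    return dcor(x[:-k], x[k:])

x = np.linspace(-1.0, 1.0, 200)
y = x ** 2                      # uncorrelated with x, but fully dependent
r_pearson = np.corrcoef(x, y)[0, 1]
r_dist = dcor(x, y)
```

Here `r_pearson` is numerically zero by symmetry while `r_dist` is clearly positive, which is exactly the kind of dependence the ADCF is designed to pick up at nonzero lags.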

Journal ArticleDOI
TL;DR: In this paper, a class of multivariate spectral variance estimators for the asymptotic covariance matrix in the Markov chain central limit theorem and conditions for strong consistency are provided.
Abstract: Markov chain Monte Carlo (MCMC) algorithms are used to estimate features of interest of a distribution. The Monte Carlo error in estimation has an asymptotic normal distribution whose multivariate nature has so far been ignored in the MCMC community. We present a class of multivariate spectral variance estimators for the asymptotic covariance matrix in the Markov chain central limit theorem and provide conditions for strong consistency. We examine the finite sample properties of the multivariate spectral variance estimators and its eigenvalues in the context of a vector autoregressive process of order 1.
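A simpler member of the same family is the multivariate batch-means estimator; the sketch below checks it on an AR(1) chain whose asymptotic variance is known in closed form (this is an illustration of the target quantity, not the paper's spectral-window estimator):

```python
import numpy as np

def batch_means_cov(chain, b):
    # Multivariate batch-means estimator of the asymptotic covariance matrix
    # in the Markov chain CLT: b times the sample covariance of batch means.
    # (A simpler relative of the spectral variance estimators in the paper.)
    n, p = chain.shape
    a = n // b
    bm = chain[: a * b].reshape(a, b, p).mean(axis=1)
    centred = bm - chain[: a * b].mean(axis=0)
    return b * (centred.T @ centred) / (a - 1)

# One-dimensional check: x_t = rho x_{t-1} + eps_t with unit-variance noise
# has asymptotic variance 1 / (1 - rho)^2 = 4 for rho = 0.5.
rng = np.random.default_rng(6)
rho, n = 0.5, 20000
x = np.empty((n, 1))
x[0] = 0.0
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.normal()
sigma_hat = batch_means_cov(x, b=100)[0, 0]
```

The paper's estimators replace the implicit rectangular window here with spectral lag windows and establish strong consistency for the full matrix.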

Journal ArticleDOI
TL;DR: In this article, Ghosal et al. provide general conditions to check on the model and the prior to derive posterior concentration rates for data-dependent priors (or empirical Bayes approaches).
Abstract: In this paper we provide general conditions to check on the model and the prior to derive posterior concentration rates for data-dependent priors (or empirical Bayes approaches). We aim at providing conditions that are close to the conditions provided in the seminal paper by Ghosal & van der Vaart (2007). We then apply the general theorem to two different settings: the estimation of a density using Dirichlet process mixtures of Gaussian random variables with base measure depending on some empirical quantities and the estimation of the intensity of a counting process under the Aalen model. A simulation study for inhomogeneous Poisson processes also illustrates our results. In the former case we also derive some results on the estimation of the mixing density and on the deconvolution problem. In the latter, we provide a general theorem on posterior concentration rates for counting processes with Aalen multiplicative intensity with priors not depending on the data.

Journal ArticleDOI
TL;DR: This work proposes a constructive approach yielding a normalized spectral representation that solves an optimization problem related to the efficiency of simulating max-stable processes and has the potential of considerably reducing the simulation time of max- stable processes.
Abstract: The efficiency of simulation algorithms for max-stable processes relies on the choice of the spectral representation: different choices result in different sequences of finite approximations to the process. We propose a constructive approach yielding a normalized spectral representation that solves an optimization problem related to the efficiency of simulating max-stable processes. The simulation algorithm based on the normalized spectral representation can be regarded as max-importance sampling. Compared to existing simulation algorithms, our approach has at least two advantages. First, it allows the exact simulation of a comprehensive class of max-stable processes. Second, the algorithm has a stopping time with finite expectation. In practice, our approach has the potential of considerably reducing the simulation time of max-stable processes.

Journal ArticleDOI
TL;DR: In this article, the authors give a complete characterization of the statistical equivalence classes of CEGs and staged trees and show that all graphical representations of the same model share a common polynomial description.
Abstract: In this paper we give a complete characterization of the statistical equivalence classes of CEGs and of staged trees. We are able to show that all graphical representations of the same model share a common polynomial description. Then, simple transformations on that polynomial enable us to traverse the corresponding class of graphs. We illustrate our results with a real analysis of the implicit dependence relationships within a previously studied dataset.

Journal ArticleDOI
TL;DR: In this paper, the authors consider determinantal point processes on the unit sphere and characterize and construct isotropic DPP models, where it becomes essential to specify the eigenvalues and eigenfunctions in a spectral representation for the kernel.
Abstract: We consider determinantal point processes on the $d$-dimensional unit sphere $\mathbb{S}^{d}$. These are finite point processes exhibiting repulsiveness and with moment properties determined by a certain determinant whose entries are specified by a so-called kernel which we assume is a complex covariance function defined on $\mathbb{S}^{d}\times\mathbb{S}^{d}$. We review the appealing properties of such processes, including their specific moment properties, density expressions and simulation procedures. Particularly, we characterize and construct isotropic DPPs models on $\mathbb{S}^{d}$, where it becomes essential to specify the eigenvalues and eigenfunctions in a spectral representation for the kernel, and we figure out how repulsive isotropic DPPs can be. Moreover, we discuss the shortcomings of adapting existing models for isotropic covariance functions and consider strategies for developing new models, including a useful spectral approach.
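The repulsiveness that determinants encode is already visible in the simplest finite case (a 2-point ground set with a hypothetical kernel matrix, a discrete stand-in for the sphere setting of the paper):

```python
import numpy as np

# For a determinantal point process with kernel matrix K on a finite ground
# set, P(i in X) = K[i, i] and P({i, j} subset X) = det of the 2x2 submatrix,
# i.e. K_ii K_jj - K_ij^2, which is always at most P(i) P(j): repulsiveness.
K = np.array([[0.5, 0.3],
              [0.3, 0.4]])
p_i, p_j = K[0, 0], K[1, 1]
p_ij = np.linalg.det(K)        # = 0.5 * 0.4 - 0.3**2 = 0.11
```

Larger off-diagonal kernel values mean stronger negative dependence between the two points, and the same determinantal mechanism drives the repulsion bounds the paper derives on the sphere.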

Journal ArticleDOI
TL;DR: In this paper, a test for the stability over time of the covariance matrix of multivariate time series is proposed to ascertain changes due to instability in the eigenvalues and/or eigenvectors using strong invariance principles and Law of Large Numbers.
Abstract: We propose a test for the stability over time of the covariance matrix of multivariate time series. The analysis is extended to the eigensystem to ascertain changes due to instability in the eigenvalues and/or eigenvectors. Using strong invariance principles and laws of large numbers, we normalise the CUSUM-type statistics to calculate their supremum over the whole sample. The power properties of the test versus alternative hypotheses, including also the case of breaks close to the beginning/end of sample, are investigated theoretically and via simulation. We extend our theory to test for the stability of the covariance matrix of a multivariate regression model. The testing procedures are illustrated by studying the stability of the principal components of the term structure of 18 US interest rates.
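A stripped-down CUSUM statistic for second moments conveys the idea (the long-run variance normalisation is omitted, and the break location and magnitudes below are illustrative):

```python
import numpy as np

def cusum_cov_stat(X):
    # Sup-type CUSUM statistic for constancy of the covariance matrix:
    # maximal deviation of partial sums of x_t x_t' from their
    # full-sample linear trend, scaled by sqrt(n).
    n = X.shape[0]
    prods = np.einsum('ti,tj->tij', X, X).reshape(n, -1)
    cs = np.cumsum(prods, axis=0)
    full_mean = cs[-1] / n
    k = np.arange(1, n)
    dev = cs[:-1] - k[:, None] * full_mean
    return np.abs(dev).max() / np.sqrt(n)

rng = np.random.default_rng(7)
stable = rng.normal(size=(500, 2))
broken = np.concatenate([rng.normal(size=(250, 2)),
                         3.0 * rng.normal(size=(250, 2))])
s_stable = cusum_cov_stat(stable)
s_broken = cusum_cov_stat(broken)
```

A variance break mid-sample produces a statistic an order of magnitude larger than under stability; the paper's normalisation turns this into a statistic with a pivotal limit.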

Journal ArticleDOI
TL;DR: In this article, the Stein equation associated with the one-dimensional Gamma distribution is studied, and bounds for test functions supported by the whole real line are derived, and a non-central quantitative de Jong theorem for sequences of degenerate $U$-statistics satisfying minimal uniform integrability conditions.
Abstract: We study the Stein equation associated with the one-dimensional Gamma distribution, and provide novel bounds, allowing one to effectively deal with test functions supported by the whole real line. We apply our estimates to derive new quantitative results involving random variables that are non-linear functionals of random fields, namely: (i) a non-central quantitative de Jong theorem for sequences of degenerate $U$-statistics satisfying minimal uniform integrability conditions, significantly extending previous findings by de Jong (J. Multivariate Anal. 34 (1990) 275–289), Nourdin, Peccati and Reinert (Ann. Probab. 38 (2010) 1947–1985) and Dobler and Peccati (Electron. J. Probab. 22 (2017) no. 2), (ii) a new Gamma approximation bound on the Poisson space, refining previous estimates by Peccati and Thale (ALEA Lat. Am. J. Probab. Math. Stat. 10 (2013) 525–560) and (iii) new Gamma bounds on a Gaussian space, strengthening estimates by Nourdin and Peccati (Probab. Theory Related Fields 145 (2009) 75–118). As a by-product of our analysis, we also deduce a new inequality for Gamma approximations via exchangeable pairs, that is of independent interest.
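The Stein equation is built around the characterising identity $\mathbb{E}[Xf'(X) + (r - X)f(X)] = 0$ for $X \sim \mathrm{Gamma}(r, 1)$ and smooth bounded $f$; a quick Monte Carlo check with $f = \sin$:

```python
import numpy as np

# Monte Carlo check of the Stein identity characterising Gamma(r, 1):
# E[ X f'(X) + (r - X) f(X) ] = 0, here with f = sin, f' = cos.
rng = np.random.default_rng(8)
r = 2.0
x = rng.gamma(shape=r, scale=1.0, size=200_000)
stein_term = x * np.cos(x) + (r - x) * np.sin(x)
err = stein_term.mean()      # should be near zero up to Monte Carlo error
```

Bounding the solution of the corresponding Stein equation for test functions supported on the whole real line is what the paper's new estimates achieve.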

Journal ArticleDOI
TL;DR: In this article, the maximum likelihood threshold of a graph is defined as the smallest number of data points that guarantee that maximum likelihood estimates exist almost surely in the Gaussian graphical model associated to the graph.
Abstract: The maximum likelihood threshold of a graph is the smallest number of data points that guarantees that maximum likelihood estimates exist almost surely in the Gaussian graphical model associated to the graph. We show that this graph parameter is connected to the theory of combinatorial rigidity. In particular, if the edge set of a graph $G$ is an independent set in the $(n-1)$-dimensional generic rigidity matroid, then the maximum likelihood threshold of $G$ is less than or equal to $n$. This connection allows us to prove many results about the maximum likelihood threshold. We conclude by showing that these methods give exact bounds on the number of observations needed for the score matching estimator to exist with probability one.

Journal ArticleDOI
TL;DR: In this paper, the authors showed that with high probability the typical distance between the Stieltjes transform of the empirical spectral distribution (ESD) of the matrix $n^{-\frac{1}{2}}\mathbf{X}$ and Wigner's semicircle law is of order $(nv)^{-1}\log n$, where $v$ denotes the distance to the real line in the complex plane.
Abstract: We consider a random symmetric matrix $\mathbf{X}=[X_{jk}]_{j,k=1}^{n}$ with upper triangular entries being i.i.d. random variables with mean zero and unit variance. We additionally suppose that $\mathbb{E}|X_{11}|^{4+\delta}=:\mu_{4+\delta}<\infty$ for some $\delta>0$. The aim of this paper is to significantly extend a recent result of the authors Gotze, Naumov and Tikhomirov (2015) and show that with high probability the typical distance between the Stieltjes transform of the empirical spectral distribution (ESD) of the matrix $n^{-\frac{1}{2}}\mathbf{X}$ and Wigner’s semicircle law is of order $(nv)^{-1}\log n$, where $v$ denotes the distance to the real line in the complex plane. We apply this result to the rate of convergence of the ESD to the distribution function of the semicircle law as well as to rigidity of eigenvalues and eigenvector delocalization, significantly extending a recent result by Gotze, Naumov and Tikhomirov (2015). The result on delocalization is optimal by comparison with GOE ensembles. Furthermore, the techniques of this paper provide a new shorter proof for the optimal $O(n^{-1})$ rate of convergence of the expected ESD to the semicircle law.
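The limiting object is easy to visualise numerically: the spectrum of a normalised Wigner matrix concentrates on $[-2, 2]$ with semicircle mass (a sketch with Gaussian entries; the paper only requires $4+\delta$ moments):

```python
import numpy as np

# Empirical spectral distribution of a Wigner matrix versus the semicircle law
# on [-2, 2]: after this normalisation the off-diagonal entries have variance 1/n.
rng = np.random.default_rng(9)
n = 400
A = rng.normal(size=(n, n))
W = (A + A.T) / np.sqrt(2.0 * n)
eig = np.linalg.eigvalsh(W)
frac_mid = np.mean(np.abs(eig) <= 0.5)   # semicircle mass on [-0.5, 0.5] is about 0.315
```

Eigenvalue rigidity, one of the applications in the paper, says each ordered eigenvalue sits very close to its semicircle quantile, which is why `frac_mid` fluctuates so little across realisations.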

Journal ArticleDOI
TL;DR: In this paper, the authors provide a general methodology for unbiased estimation for intractable stochastic models, where the target distribution can be written as an appropriate limit of distributions, and where conventional approaches require truncation of such a representation leading to a systematic bias.
Abstract: We provide a general methodology for unbiased estimation for intractable stochastic models. We consider situations where the target distribution can be written as an appropriate limit of distributions, and where conventional approaches require truncation of such a representation leading to a systematic bias. For example, the target distribution might be representable as the $L^{2}$-limit of a basis expansion in a suitable Hilbert space; or alternatively the distribution of interest might be representable as the weak limit of a sequence of random variables, as in MCMC. Our main motivation comes from infinite-dimensional models which can be parameterised in terms of a series expansion of basis functions (such as that given by a Karhunen–Loeve expansion). We introduce and analyse schemes for direct unbiased estimation along such an expansion. However, a substantial component of our paper is devoted to the study of MCMC schemes which, due to their infinite dimensionality, cannot be directly implemented, but which can be effectively estimated unbiasedly. For all our methods we give theory to justify the numerical stability for robust Monte Carlo implementation, and in some cases we illustrate using simulations. Interestingly the computational efficiency of our methods is usually comparable to simpler methods which are biased. Crucial to the effectiveness of our proposed methodology is the construction of appropriate couplings, many of which resonate strongly with the Monte Carlo constructions used in the coupling from the past algorithm.
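The single-term (randomized truncation) construction behind such debiasing schemes can be sketched as follows; the toy series, the geometric randomization, and all constants are illustrative assumptions, not the paper's concrete couplings or MCMC schemes.

```python
import math
import numpy as np

# Sketch of a single-term unbiased estimator for a quantity defined as a
# series/limit, here mu = sum_{k>=0} 1/k! = e. Draw a random truncation
# level K with probabilities p(k) and return a_K / p(K); then
#   E[a_K / p(K)] = sum_k p(k) * a_k / p(k) = mu,
# so the estimator is unbiased even though each draw uses only one term.
rng = np.random.default_rng(1)
r = 0.5
N = 100_000

K = rng.geometric(1 - r, size=N) - 1          # support {0, 1, 2, ...}
p_K = (1 - r) * r ** K                        # p(k) = (1 - r) r^k
a_K = np.array([1.0 / math.factorial(int(k)) for k in K])
estimate = np.mean(a_K / p_K)
print(estimate)  # close to e = 2.71828...
```

The variance is finite here because $\sum_k a_k^2/p(k) < \infty$; choosing the randomization distribution to balance bias removal against variance is exactly the kind of trade-off the methodology addresses.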

Journal ArticleDOI
TL;DR: In this paper, an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems is derived from contraction rates of the related direct problem of estimating a transformed parameter of interest.
Abstract: In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related direct problem of estimating a transformed parameter of interest. An interesting aspect of this approach is that it allows us to derive contraction rates for priors that are not related to the singular value decomposition of the operator. We apply our result to several examples of linear inverse problems, both in the white noise sequence model and the nonparametric regression model, using priors based on the singular value decomposition of the operator, location-mixture priors and spline priors, and recover minimax adaptive contraction rates.

Journal ArticleDOI
TL;DR: In this article, a general asymptotic analysis of the misspecified case, in which the true covariance function does not belong to the parametric set used for estimation, is provided for independent observation points with uniform distribution.
Abstract: In parametric estimation of covariance function of Gaussian processes, it is often the case that the true covariance function does not belong to the parametric set used for estimation. This situation is called the misspecified case. In this case, it has been observed that, for irregular spatial sampling of observation points, Cross Validation can yield smaller prediction errors than Maximum Likelihood. Motivated by this comparison, we provide a general asymptotic analysis of the misspecified case, for independent observation points with uniform distribution. We prove that the Maximum Likelihood estimator asymptotically minimizes a Kullback-Leibler divergence, within the misspecified parametric set, while Cross Validation asymptotically minimizes the integrated square prediction error. In a Monte Carlo simulation, we show that the covariance parameters estimated by Maximum Likelihood and Cross Validation, and the corresponding Kullback-Leibler divergences and integrated square prediction errors, can be strongly contrasting. On a more technical level, we provide new increasing-domain asymptotic results for the situation where the eigenvalues of the covariance matrices involved are not upper bounded.
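A minimal numerical sketch of the two criteria being compared (not the paper's experiments): data are drawn from one covariance family and fitted with another, and a grid search minimizes either the Gaussian negative log-likelihood or the leave-one-out squared prediction error. The kernels, grid, sample size, and jitter are all illustrative assumptions.

```python
import numpy as np

# Sketch: Maximum Likelihood vs. leave-one-out Cross Validation under
# misspecification. Data come from an exponential covariance but are fitted
# with a squared-exponential family; all numerical settings are illustrative.
rng = np.random.default_rng(2)
n = 80
x = rng.uniform(0, 1, n)                      # uniform random observation points
d = np.abs(x[:, None] - x[None, :])

K_true = np.exp(-d / 0.2)                     # true (exponential) covariance
y = np.linalg.cholesky(K_true + 1e-10 * np.eye(n)) @ rng.standard_normal(n)

def nll(theta):
    """Gaussian negative log-likelihood under the misspecified kernel."""
    K = np.exp(-(d / theta) ** 2) + 1e-6 * np.eye(n)   # jitter for stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * (y @ alpha) + np.sum(np.log(np.diag(L)))

def loo(theta):
    """Leave-one-out squared prediction error via the shortcut
    residual_i = (K^{-1} y)_i / (K^{-1})_{ii}."""
    K = np.exp(-(d / theta) ** 2) + 1e-6 * np.eye(n)
    Kinv = np.linalg.inv(K)
    resid = (Kinv @ y) / np.diag(Kinv)
    return np.mean(resid ** 2)

grid = np.linspace(0.02, 1.0, 50)
theta_ml = grid[np.argmin([nll(t) for t in grid])]
theta_cv = grid[np.argmin([loo(t) for t in grid])]
print(theta_ml, theta_cv)
```

The two criteria optimize different targets (a Kullback-Leibler divergence versus the integrated square prediction error), so the selected parameters can differ, in line with the contrast the paper analyzes.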

Journal ArticleDOI
TL;DR: In this article, the authors established a large deviation principle for a type of stochastic partial differential equations with locally monotone coefficients driven by Levy noise, using the weak convergence method.
Abstract: We establish a large deviation principle for a type of stochastic partial differential equations (SPDEs) with locally monotone coefficients driven by Levy noise. The weak convergence method plays an important role.

Journal ArticleDOI
TL;DR: A general theory for a consensus-based combination of estimations of probability measures is introduced, together with characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families.
Abstract: We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques like boosting or bagging. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a “wide consensus” procedure. The approach is based on the consideration of trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Frechet means. In order to get quick applicability, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families. For these families, we provide an iterative algorithm for the effective computation of trimmed barycenters, based on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide setting of statistical problems.
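For location-scatter families such as Gaussians, the barycenter covariance solves a fixed-point equation that can be iterated directly. The following is a sketch of that standard fixed-point iteration under illustrative inputs; it is not the paper's trimmed-barycenter algorithm, and the trimming of discrepant measures is not implemented.

```python
import numpy as np

# Sketch: fixed-point iteration for the Wasserstein barycenter covariance of
# Gaussian measures N(m_i, Sigma_i) with weights w_i:
#   S <- S^{-1/2} ( sum_i w_i (S^{1/2} Sigma_i S^{1/2})^{1/2} )^2 S^{-1/2}.
# (The barycenter mean is simply the weighted mean of the m_i.)

def sqrtm_psd(M):
    """Symmetric PSD matrix square root via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def gaussian_barycenter_cov(sigmas, weights, iters=100):
    S = np.eye(sigmas[0].shape[0])
    for _ in range(iters):
        R = sqrtm_psd(S)
        Rinv = np.linalg.inv(R)
        mid = sum(w * sqrtm_psd(R @ Sig @ R) for w, Sig in zip(weights, sigmas))
        S = Rinv @ (mid @ mid) @ Rinv
    return S

# Illustrative inputs: two commuting (diagonal) covariances with equal weights.
sigmas = [np.diag([1.0, 4.0]), np.diag([9.0, 1.0])]
weights = [0.5, 0.5]
S = gaussian_barycenter_cov(sigmas, weights)
print(S)  # for commuting covariances: S = (sum_i w_i Sigma_i^{1/2})^2
```

In the commuting case above the fixed point is available in closed form, which gives a convenient correctness check for the iteration.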