
Showing papers on "Central limit theorem" published in 2016


Journal ArticleDOI
TL;DR: In this article, the authors provide a rigorous mathematical framework for analysing the SGLD algorithm and show that the algorithm is consistent, satisfies a central limit theorem (CLT), and has an asymptotic bias-variance decomposition that can be characterized by an explicit functional of the step-size sequence $(\delta_m)_{m \ge 0}$.
Abstract: Applying standard Markov chain Monte Carlo (MCMC) algorithms to large data sets is computationally expensive. Both the calculation of the acceptance probability and the creation of informed proposals usually require an iteration through the whole data set. The recently proposed stochastic gradient Langevin dynamics (SGLD) method circumvents this problem by generating proposals which are only based on a subset of the data, by skipping the accept-reject step and by using a decreasing step-size sequence $(\delta_m)_{m \ge 0}$. We provide in this article a rigorous mathematical framework for analysing this algorithm. We prove that, under verifiable assumptions, the algorithm is consistent, satisfies a central limit theorem (CLT) and its asymptotic bias-variance decomposition can be characterized by an explicit functional of the step-size sequence $(\delta_m)_{m \ge 0}$. We leverage this analysis to give practical recommendations for the notoriously difficult tuning of this algorithm: it is asymptotically optimal to use a step-size sequence of the type $\delta_m = m^{-1/3}$, leading to an algorithm whose mean squared error (MSE) decreases at rate $O(m^{-1/3})$.
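
The three ingredients named in the abstract (subsampled gradients, no accept-reject step, decreasing step sizes) can be sketched on a toy problem. The following is a minimal illustration only, not the authors' code: the Gaussian model (unit-variance likelihood, zero-mean Gaussian prior), the step-size constant `a`, the step-size-weighted average used as the estimator, and all function and parameter names are assumptions made for this example.

```python
import math
import random


def sgld_posterior_mean(data, n_iter=5000, batch_size=10, a=0.01,
                        prior_var=100.0, seed=0):
    """Estimate the posterior mean of theta for x_i ~ N(theta, 1) with SGLD.

    Assumed toy setting: prior theta ~ N(0, prior_var), step sizes
    delta_m = a * m**(-1/3), estimator = step-size-weighted average.
    """
    rng = random.Random(seed)
    n_data = len(data)
    theta = 0.0
    weighted_sum = 0.0
    weight_total = 0.0
    for m in range(1, n_iter + 1):
        delta = a * m ** (-1.0 / 3.0)  # decreasing step size
        # Gradient of the log-posterior estimated from a random subset only.
        batch = [rng.choice(data) for _ in range(batch_size)]
        grad_prior = -theta / prior_var
        grad_lik = (n_data / batch_size) * sum(x - theta for x in batch)
        # Langevin move with injected Gaussian noise; no accept-reject step.
        theta += 0.5 * delta * (grad_prior + grad_lik)
        theta += math.sqrt(delta) * rng.gauss(0.0, 1.0)
        weighted_sum += delta * theta
        weight_total += delta
    return weighted_sum / weight_total


def exact_posterior_mean(data, prior_var=100.0):
    """Closed-form posterior mean for the same conjugate toy model."""
    return sum(data) / (len(data) + 1.0 / prior_var)
```

For this conjugate model the estimate can be compared with `exact_posterior_mean`, which makes it easy to see how the choice of `a` and `n_iter` affects the error.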

196 citations


Journal ArticleDOI
TL;DR: In this paper, the authors generalize the construction of multivariate Hawkes processes to a possibly infinite network of counting processes on a directed graph G. The process is constructed as the solution to a system of Poisson driven stochastic differential equations, for which they prove pathwise existence and uniqueness under some reasonable conditions.
Abstract: We generalise the construction of multivariate Hawkes processes to a possibly infinite network of counting processes on a directed graph G. The process is constructed as the solution to a system of Poisson driven stochastic differential equations, for which we prove pathwise existence and uniqueness under some reasonable conditions. We next investigate how to approximate a standard N-dimensional Hawkes process by a simple inhomogeneous Poisson process in the mean-field framework where each pair of individuals interact in the same way, in the limit N → ∞. In the so-called linear case for the interaction, we further investigate the large time behaviour of the process. We study in particular the stability of the central limit theorem when exchanging the limits N, T → ∞ and exhibit different possible behaviours. We finally consider the case $G = \mathbb{Z}^d$ with nearest neighbour interactions. In the linear case, we prove some (large time) laws of large numbers and exhibit different behaviours, reminiscent of the infinite setting. Finally we study the propagation of a single impulsion started at a given point of $\mathbb{Z}^d$ at time 0. We compute the probability of extinction of such an impulsion and, in some particular cases, we can accurately describe how it propagates to the whole space. Mathematics Subject Classification (2010): 60F05, 60G55, 60G57.

108 citations


Journal ArticleDOI
TL;DR: In this paper, a central limit theorem is proved for random walks with finite variance on linear groups.
Abstract: We prove a central limit theorem for random walks with finite variance on linear groups.
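
As an illustration only (not taken from the paper), a random walk on a linear group can be pictured as a product $g_n \cdots g_1$ of independent random invertible matrices, and a CLT of this kind is usually stated for the quantity $\log\|g_n \cdots g_1 v\|$. The sketch below assumes 2×2 matrices drawn uniformly from a small list supplied by the caller; the function names and that sampling scheme are assumptions made for this example.

```python
import math
import random


def log_norm_of_product(matrices, v=(1.0, 0.0)):
    """Return log ||g_n ... g_1 v|| for a sequence of 2x2 matrices.

    The vector is renormalised after every step so that long products
    neither overflow nor underflow; the logs of the norms add up exactly.
    """
    x, y = v
    total = 0.0
    for (a, b), (c, d) in matrices:
        x, y = a * x + b * y, c * x + d * y
        norm = math.hypot(x, y)
        total += math.log(norm)
        x, y = x / norm, y / norm
    return total


def sample_walk_log_norm(generators, n_steps, v=(1.0, 0.0), seed=None):
    """Draw n_steps i.i.d. matrices from `generators` and return the log-norm."""
    rng = random.Random(seed)
    steps = [rng.choice(generators) for _ in range(n_steps)]
    return log_norm_of_product(steps, v)
```

Calling `sample_walk_log_norm` repeatedly with different seeds and a fixed `n_steps` gives an empirical sample whose histogram can be compared with a Gaussian.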

108 citations


Journal ArticleDOI
01 Feb 2016
TL;DR: In this article, a central limit theorem for the components of the largest eigenvectors of the adjacency matrix of a finite-dimensional random dot product graph whose true latent positions are unknown is proved.
Abstract: We prove a central limit theorem for the components of the largest eigenvectors of the adjacency matrix of a finite-dimensional random dot product graph whose true latent positions are unknown. We use the spectral embedding of the adjacency matrix to construct consistent estimates for the latent positions, and we show that the appropriately scaled differences between the estimated and true latent positions converge to a mixture of Gaussian random variables. We state several corollaries, including an alternate proof of a central limit theorem for the first eigenvector of the adjacency matrix of an Erdős-Rényi random graph.

99 citations


Journal ArticleDOI
TL;DR: A modified SGLD which removes the asymptotic bias due to the variance of the stochastic gradients up to first order in the step size is derived and bounds on the finite-time bias, variance and mean squared error are obtained.
Abstract: Applying standard Markov chain Monte Carlo (MCMC) algorithms to large data sets is computationally infeasible. The recently proposed stochastic gradient Langevin dynamics (SGLD) method circumvents this problem in three ways: it generates proposed moves using only a subset of the data, it skips the Metropolis-Hastings accept-reject step, and it uses sequences of decreasing step sizes. In Teh et al. (2014), we provided the mathematical foundations for the decreasing step size SGLD, including consistency and a central limit theorem. However, in practice the SGLD is run for a relatively small number of iterations, and its step size is not decreased to zero. The present article investigates the behaviour of the SGLD with fixed step size. In particular we characterise the asymptotic bias explicitly, along with its dependence on the step size and the variance of the stochastic gradient. On that basis a modified SGLD which removes the asymptotic bias due to the variance of the stochastic gradients up to first order in the step size is derived. Moreover, we are able to obtain bounds on the finite-time bias, variance and mean squared error (MSE). The theory is illustrated with a Gaussian toy model for which the bias and the MSE for the estimation of moments can be obtained explicitly. For this toy model we study the gain of the SGLD over the standard Euler method in the limit of large data sets.

96 citations


Journal ArticleDOI
Li-Xin Zhang1
TL;DR: In this article, the Kolmogorov type exponential inequalities of the partial sums of independent random variables as well as negatively dependent random variables under the sublinear expectation were established.
Abstract: Kolmogorov's exponential inequalities are basic tools for studying the strong limit theorems such as the classical laws of the iterated logarithm for both independent and dependent random variables. This paper establishes the Kolmogorov type exponential inequalities of the partial sums of independent random variables as well as negatively dependent random variables under the sub-linear expectations. As applications of the exponential inequalities, the laws of the iterated logarithm in the sense of non-additive capacities are proved for independent or negatively dependent identically distributed random variables with finite second order moments. For deriving a lower bound of an exponential inequality, a central limit theorem is also proved under the sub-linear expectation for random variables with only finite variances.

81 citations


Journal ArticleDOI
TL;DR: It is proved that q-quantile forests, close in spirit to Breiman's (2001) forests but easier to study, are able to combine inconsistent trees to obtain a final consistent prediction, thus highlighting the benefits of random forests compared to single trees.

79 citations


Journal ArticleDOI
TL;DR: In this paper, a new class of inequalities is proved, yielding bounds for the normal approximation of functionals of a general Poisson process. The approach is based on an iteration of the classical Poincare inequality, on Malliavin operators, on Stein's method, and on an integrated Mehler's formula, which provides a representation of the Ornstein-Uhlenbeck semigroup in terms of thinned Poisson processes.
Abstract: We prove a new class of inequalities, yielding bounds for the normal approximation in the Wasserstein and the Kolmogorov distance of functionals of a general Poisson process (Poisson random measure). Our approach is based on an iteration of the classical Poincare inequality, as well as on the use of Malliavin operators, of Stein’s method, and of an (integrated) Mehler’s formula, providing a representation of the Ornstein-Uhlenbeck semigroup in terms of thinned Poisson processes. Our estimates only involve first and second order difference operators, and have consequently a clear geometric interpretation. In particular we will show that our results are perfectly tailored to deal with the normal approximation of geometric functionals displaying a weak form of stabilization, and with non-linear functionals of Poisson shot-noise processes. We discuss two examples of stabilizing functionals in great detail: (i) the edge length of the k-nearest neighbour graph, (ii) intrinsic volumes of k-faces of Voronoi tessellations. In all these examples we obtain rates of convergence (in the Kolmogorov and the Wasserstein distance) that one can reasonably conjecture to be optimal, thus significantly improving previous findings in the literature. As a necessary step in our analysis, we also derive new lower bounds for variances of Poisson functionals.

76 citations


Journal ArticleDOI
TL;DR: In this article, fluctuations of linear statistics corresponding to smooth functions are studied for certain biorthogonal ensembles.
Abstract: We study fluctuations of linear statistics corresponding to smooth functions for certain biorthogonal ensembles. We study those biorthogonal ensembles for which the underlying biorthogonal family s

75 citations


Journal ArticleDOI
TL;DR: In this article, the exact distribution of the product of two correlated normal random variables is derived.

74 citations


Journal ArticleDOI
TL;DR: A key element in this approach is to demonstrate that when the classical phase variation assumptions of Functional Data Analysis are applied to the point process case, they become equivalent to conditions interpretable through the prism of the theory of optimal transportation of measure.
Abstract: We develop a canonical framework for the study of the problem of registration of multiple point processes subjected to warping, known as the problem of separation of amplitude and phase variation. The amplitude variation of a real random function $\{Y(x):x\in[0,1]\}$ corresponds to its random oscillations in the $y$-axis, typically encapsulated by its (co)variation around a mean level. In contrast, its phase variation refers to fluctuations in the $x$-axis, often caused by random time changes. We formalise similar notions for a point process, and nonparametrically separate them based on realisations of i.i.d. copies $\{\Pi_i\}$ of the phase-varying point process. A key element in our approach is to demonstrate that when the classical phase variation assumptions of Functional Data Analysis (FDA) are applied to the point process case, they become equivalent to conditions interpretable through the prism of the theory of optimal transportation of measure. We demonstrate that these induce a natural Wasserstein geometry tailored to the warping problem, including a formal notion of bias expressing over-registration. Within this framework, we construct nonparametric estimators that tend to avoid over-registration in finite samples. We show that they consistently estimate the warp maps, consistently estimate the structural mean, and consistently register the warped point processes, even in a sparse sampling regime. We also establish convergence rates, and derive $\sqrt{n}$-consistency and a central limit theorem in the Cox process case under dense sampling, showing rate optimality of our structural mean estimator in that case.

Journal ArticleDOI
TL;DR: In this paper, the authors develop a canonical framework for the registration of multiple point processes subjected to warping, known as the problem of separation of amplitude and phase variation, and construct nonparametric estimators that tend to avoid over-registration in finite samples.
Abstract: We develop a canonical framework for the study of the problem of registration of multiple point processes subjected to warping, known as the problem of separation of amplitude and phase variation. The amplitude variation of a real random function $\{Y(x):x\in[0,1]\}$ corresponds to its random oscillations in the $y$-axis, typically encapsulated by its (co)variation around a mean level. In contrast, its phase variation refers to fluctuations in the $x$-axis, often caused by random time changes. We formalise similar notions for a point process, and nonparametrically separate them based on realisations of i.i.d. copies $\{\Pi_i\}$ of the phase-varying point process. A key element in our approach is to demonstrate that when the classical phase variation assumptions of Functional Data Analysis (FDA) are applied to the point process case, they become equivalent to conditions interpretable through the prism of the theory of optimal transportation of measure. We demonstrate that these induce a natural Wasserstein geometry tailored to the warping problem, including a formal notion of bias expressing over-registration. Within this framework, we construct nonparametric estimators that tend to avoid over-registration in finite samples. We show that they consistently estimate the warp maps, consistently estimate the structural mean, and consistently register the warped point processes, even in a sparse sampling regime. We also establish convergence rates, and derive $\sqrt{n}$-consistency and a central limit theorem in the Cox process case under dense sampling, showing rate optimality of our structural mean estimator in that case.

Journal ArticleDOI
TL;DR: This paper aims to provide a law of large numbers for uncertain random variables, which states that the average of uncertain random variables converges in distribution to an uncertain variable.
Abstract: The law of large numbers in probability theory states that the average of random variables converges to its expected value in some sense under some conditions. Sometimes, random factors and human uncertainty exist simultaneously in complex systems, and a concept of uncertain random variable has been proposed to study this type of complex systems. This paper aims to provide a law of large numbers for uncertain random variables, which states that the average of uncertain random variables converges in distribution to an uncertain variable. As a byproduct, the convergence of a sequence of uncertain variables is also studied.

Journal ArticleDOI
TL;DR: In this paper, a rigorous analysis of directed exponential random graph models with binary and non-binary weighted edges is presented, and the uniform consistency and the asymptotic normality of the maximum likelihood estimator are established.
Abstract: Although asymptotic analyses of undirected network models based on degree sequences have started to appear in recent literature, it remains an open problem to study the statistical properties of directed network models. In this paper, we provide for the first time a rigorous analysis of directed exponential random graph models using the in-degrees and out-degrees as sufficient statistics with binary and non-binary weighted edges. We establish the uniform consistency and the asymptotic normality of the maximum likelihood estimator, when the number of parameters grows and only one realized observation of the graph is available. One key technique in the proofs is to approximate the inverse of the Fisher information matrix using a simple matrix with high accuracy. Along the way, we also establish a geometrically fast rate of convergence for the Newton iterative algorithm, which is used to obtain the maximum likelihood estimate. Numerical studies confirm our theoretical findings.

Journal ArticleDOI
TL;DR: For the two-dimensional one-component Coulomb plasma, this paper derives an asymptotic expansion of the free energy up to order $N$, the number of particles of the gas, with an effective error bound $N^{1-\kappa}$ for some constant $\kappa > 0$, and proves that the fluctuations of the linear statistics are given by a Gaussian free field at any positive temperature.
Abstract: For the two-dimensional one-component Coulomb plasma, we derive an asymptotic expansion of the free energy up to order $N$, the number of particles of the gas, with an effective error bound $N^{1-\kappa}$ for some constant $\kappa > 0$. This expansion is based on approximating the Coulomb gas by a quasi-free Yukawa gas. Further, we prove that the fluctuations of the linear statistics are given by a Gaussian free field at any positive temperature. Our proof of this central limit theorem uses a loop equation for the Coulomb gas, the free energy asymptotics, and rigidity bounds on the local density fluctuations of the Coulomb gas, which we obtained in a previous paper.

Posted Content
TL;DR: In this article, the persistent homology of a stationary point process on ${\bf R}^N$ is studied in a multiscale way, and the strong law of large numbers for persistence diagrams is proved.
Abstract: The persistent homology of a stationary point process on ${\bf R}^N$ is studied in this paper. As a generalization of continuum percolation theory, we study higher dimensional topological features of the point process such as loops, cavities, etc. in a multiscale way. The key ingredient is the persistence diagram, which is an expression of the persistent homology. We prove the strong law of large numbers for persistence diagrams as the window size tends to infinity and give a sufficient condition for the limiting persistence diagram to have the full support. We also discuss a central limit theorem for persistent Betti numbers.

Journal ArticleDOI
TL;DR: The elephant random walk (ERW) as mentioned in this paper is a non-Markovian discrete-time random walk with unbounded memory which exhibits a phase transition from diffusive to superdiffusive behavior.
Abstract: We study the so-called elephant random walk (ERW) which is a non-Markovian discrete-time random walk on $\mathbb{Z}$ with unbounded memory which exhibits a phase transition from diffusive to superdiffusive behaviour. We prove a law of large numbers and a central limit theorem. Remarkably the central limit theorem applies not only to the diffusive regime but also to the phase transition point which is superdiffusive. Inside the superdiffusive regime the ERW converges to a non-degenerate random variable which is not normal. We also obtain explicit expressions for the correlations of increments of the ERW.
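
The memory mechanism behind the ERW can be sketched in a few lines. The abstract does not spell out the update rule, so the sketch assumes the commonly used definition: the first step is +1 with probability `q`, and every later step copies a uniformly chosen earlier step with probability `p` or takes its negative with probability `1 - p`. The function name and parameters are assumptions made for this example.

```python
import random


def elephant_random_walk(n_steps, p, q=0.5, seed=None):
    """Simulate an elephant random walk on the integers.

    Assumed rule: step 1 is +1 with probability q (else -1); each later
    step picks one earlier step uniformly at random and repeats it with
    probability p, or reverses it with probability 1 - p.
    Returns the list of positions after each step.
    """
    rng = random.Random(seed)
    steps = [1 if rng.random() < q else -1]
    for _ in range(n_steps - 1):
        remembered = rng.choice(steps)  # unbounded memory: any past step
        steps.append(remembered if rng.random() < p else -remembered)
    positions = []
    position = 0
    for step in steps:
        position += step
        positions.append(position)
    return positions
```

Under this assumed parametrisation the transition between the diffusive and superdiffusive regimes is usually reported at `p = 3/4`; running the simulation for values of `p` on either side of that point shows the change in how the final position scales with `n_steps`.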

Journal ArticleDOI
TL;DR: In this article, a new test for structural changes in functional data is investigated; it relies on a new functional central limit theorem for the block bootstrap in a Hilbert space, and the methods are applied to hydrological data from Germany.
Abstract: A new test for structural changes in functional data is investigated. It is based on Hilbert space theory and critical values are deduced from bootstrap iterations. Thus a new functional central limit theorem for the block bootstrap in a Hilbert space is required. The test can also be used to detect changes in the marginal distribution of random vectors, which is supplemented by a simulation study. Our methods are applied to hydrological data from Germany. The Canadian Journal of Statistics 44: 300–322; 2016 © 2016 Statistical Society of Canada

Posted Content
TL;DR: In this paper, stochastic numerical quadratures involving determinantal point processes associated with multivariate orthogonal polynomials are proposed, and their root mean square errors are shown to decrease as $N^{-(1+1/d)/2}$, where $N$ is the number of integrand evaluations and $d$ is the dimension of the ambient space.
Abstract: We show that repulsive random variables can yield Monte Carlo methods with faster convergence rates than the typical $N^{-1/2}$, where $N$ is the number of integrand evaluations. More precisely, we propose stochastic numerical quadratures involving determinantal point processes associated with multivariate orthogonal polynomials, and we obtain root mean square errors that decrease as $N^{-(1+1/d)/2}$, where $d$ is the dimension of the ambient space. First, we prove a central limit theorem (CLT) for the linear statistics of a class of determinantal point processes, when the reference measure is a product measure supported on a hypercube, which satisfies the Nevai-class regularity condition, a result which may be of independent interest. Next, we introduce a Monte Carlo method based on these determinantal point processes, and prove a CLT with explicit limiting variance for the quadrature error, when the reference measure satisfies a stronger regularity condition. As a corollary, by taking a specific reference measure and using a construction similar to importance sampling, we obtain a general Monte Carlo method, which applies to any measure with continuously derivable density. Loosely speaking, our method can be interpreted as a stochastic counterpart to Gaussian quadrature, which, at the price of some convergence rate, is easily generalizable to any dimension and has a more explicit error term.

Journal ArticleDOI
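TL;DR: In this paper, a bound for the Kolmogorov distance between a Poisson functional and a Gaussian random variable is established, and it is shown that for U-statistics of Poisson point processes the resulting expression is, up to a constant, the same as for the Wasserstein distance.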
Abstract: Peccati, Sole, Taqqu, and Utzet recently combined Stein’s method and Malliavin calculus to obtain a bound for the Wasserstein distance of a Poisson functional and a Gaussian random variable. Convergence in the Wasserstein distance always implies convergence in the Kolmogorov distance at a possibly weaker rate. But there are many examples of central limit theorems having the same rate for both distances. The aim of this paper was to show this behavior for a large class of Poisson functionals, namely so-called U-statistics of Poisson point processes. The technique used by Peccati et al. is modified to establish a similar bound for the Kolmogorov distance of a Poisson functional and a Gaussian random variable. This bound is evaluated for a U-statistic, and it is shown that the resulting expression is up to a constant the same as it is for the Wasserstein distance.

Journal ArticleDOI
TL;DR: In this article, the authors introduce a class of two-parameter discrete dispersion models, obtained by combining convolution with a factorial tilting operation, similar to exponential dispersion models which combine convolution and exponential tilting.
Abstract: We introduce a class of two-parameter discrete dispersion models, obtained by combining convolution with a factorial tilting operation, similar to exponential dispersion models which combine convolution and exponential tilting. The equidispersed Poisson model has a special place in this approach, whereas several overdispersed discrete distributions, such as the Neyman Type A, Polya–Aeppli, negative binomial and Poisson-inverse Gaussian, turn out to be Poisson–Tweedie factorial dispersion models with power dispersion functions, analogous to ordinary Tweedie exponential dispersion models with power variance functions. Using the factorial cumulant generating function as tool, we introduce a dilation operation as a discrete analogue of scaling, generalizing binomial thinning. The Poisson–Tweedie factorial dispersion models are closed under dilation, which in turn leads to a Poisson–Tweedie asymptotic framework where Poisson–Tweedie models appear as dilation limits. This unifies many discrete convergence results and leads to Poisson and Hermite convergence results, similar to the law of large numbers and the central limit theorem, respectively. The dilation operator also leads to a duality transformation which in some cases transforms overdispersion into underdispersion and vice versa. Finally, we consider the multivariate factorial cumulant generating function, and introduce a multivariate notion of over- and underdispersion, and a multivariate zero inflation index.

Journal ArticleDOI
TL;DR: An exchangeable random matrix is a random matrix with distribution invariant under any permutation of the entries, and it is shown, as the dimension tends to infinity, that the empirical spectral distribution tends to the uniform law on the unit disc.
Abstract: An exchangeable random matrix is a random matrix with distribution invariant under any permutation of the entries. For such random matrices, we show, as the dimension tends to infinity, that the empirical spectral distribution tends to the uniform law on the unit disc. This is an instance of the universality phenomenon known as the circular law, for a model of random matrices with dependent entries, rows, and columns. It is also a non-Hermitian counterpart of a result of Chatterjee on the semi-circular law for random Hermitian matrices with exchangeable entries. The proof relies in particular on a reduction to a simpler model given by a random shuffle of a rigid deterministic matrix, on Hermitization, and also on combinatorial concentration of measure and combinatorial Central Limit Theorem. A crucial step is a polynomial bound on the smallest singular value of exchangeable random matrices, which may be of independent interest.

Journal ArticleDOI
TL;DR: The asymptotic normality of families of random variables $X$ which count the number of occupied sites in some large set is considered, and sufficient criteria are given, involving the location of the zeros of $P(z)$, for these families to satisfy a central limit theorem (CLT) and even a local CLT (LCLT).

Journal ArticleDOI
TL;DR: In this paper, the authors study the asymptotic covariances of general geometric functionals of Z ∩ W for an increasing observation window W, including convergence rates, and prove multivariate central limit theorems including Berry-Esseen bounds.
Abstract: Let Z be a Boolean model based on a stationary Poisson process η of compact, convex particles in Euclidean space Rᵈ. Let W denote a compact, convex observation window. For a large class of functionals, formulas for mean values of ψ(Z ∩ W) are available in the literature. The first aim of the present work is to study the asymptotic covariances of general geometric (additive, translation invariant and locally bounded) functionals of Z ∩ W for increasing observation window W, including convergence rates. Our approach is based on the Fock space representation associated with η. For the important special case of intrinsic volumes, the asymptotic covariance matrix is shown to be positive definite and can be explicitly expressed in terms of suitable moments of (local) curvature measures in the isotropic case. The second aim of the paper is to prove multivariate central limit theorems including Berry–Esseen bounds. These are based on a general normal approximation result obtained by the Malliavin–Stein method.

Proceedings ArticleDOI
19 Jun 2016
TL;DR: This work shows that any (n,k)-PMD is poly(k/σ)-close in total variation distance to the (appropriately discretized) multi-dimensional Gaussian with the same first two moments, removing the dependence on n from the Central Limit Theorem of Valiant and Valiant.
Abstract: An (n,k)-Poisson Multinomial Distribution (PMD) is the distribution of the sum of n independent random vectors supported on the set $B_k=\{e_1,\ldots,e_k\}$ of standard basis vectors in $\mathbb{R}^k$. We show that any (n,k)-PMD is poly(k/σ)-close in total variation distance to the (appropriately discretized) multi-dimensional Gaussian with the same first two moments, removing the dependence on n from the Central Limit Theorem of Valiant and Valiant. Interestingly, our CLT is obtained by bootstrapping the Valiant-Valiant CLT itself through the structural characterization of PMDs shown in recent work by Daskalakis, Kamath and Tzamos. In turn, our stronger CLT can be leveraged to obtain an efficient PTAS for approximate Nash equilibria in anonymous games, significantly improving the state of the art, and matching qualitatively the running time dependence on n and $1/\epsilon$ of the best known algorithm for two-strategy anonymous games. Our new CLT also enables the construction of covers for the set of (n,k)-PMDs, which are proper and whose size is shown to be essentially optimal. Our cover construction combines our CLT with the Shapley-Folkman theorem and recent sparsification results for Laplacian matrices by Batson, Spielman, and Srivastava. Our cover size lower bound is based on an algebraic geometric construction. Finally, leveraging the structural properties of the Fourier spectrum of PMDs we show that these distributions can be learned from $O_k(1/\epsilon^2)$ samples in $\mathrm{poly}_k(1/\epsilon)$-time, removing the quasi-polynomial dependence of the running time on $1/\epsilon$ from prior work.
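
The definition of an (n,k)-PMD in the first sentence of the abstract translates directly into a sampler, and the "first two moments" that the approximating Gaussian matches can be written down in closed form. The sketch below is illustrative only; the representation of a PMD as a list of n probability vectors of length k, and all function names, are assumptions made for this example.

```python
import random


def sample_pmd(prob_rows, rng):
    """Draw one sample from the PMD defined by n probability vectors.

    Row i gives the distribution of the i-th random basis vector; the
    sample is the sum of the n basis vectors, i.e. a vector of counts.
    """
    k = len(prob_rows[0])
    counts = [0] * k
    for probs in prob_rows:
        j = rng.choices(range(k), weights=probs)[0]
        counts[j] += 1
    return counts


def pmd_mean(prob_rows):
    """Mean vector: the coordinate-wise sum of the probability vectors."""
    k = len(prob_rows[0])
    return [sum(row[j] for row in prob_rows) for j in range(k)]


def pmd_covariance(prob_rows):
    """Covariance matrix: sum over rows of diag(p) - p p^T."""
    k = len(prob_rows[0])
    cov = [[0.0] * k for _ in range(k)]
    for p in prob_rows:
        for a in range(k):
            for b in range(k):
                cov[a][b] += (p[a] if a == b else 0.0) - p[a] * p[b]
    return cov
```

Comparing a histogram of many `sample_pmd` draws with a Gaussian of mean `pmd_mean` and covariance `pmd_covariance` gives an informal picture of the approximation that the CLT in the abstract quantifies.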

Journal ArticleDOI
TL;DR: In this paper, a central limit theorem is proved for random walks with finite variance on Gromov hyperbolic groups.
Abstract: We prove a central limit theorem for random walks with finite variance on Gromov hyperbolic groups.

Book
16 Dec 2016
TL;DR: In this paper, the authors use the framework of mod-$\phi$ convergence to prove precise large or moderate deviations for quite general sequences of random variables, where the random variables considered can be lattice or non-lattice distributed, and single or multi-dimensional; and one obtains precise estimates of the fluctuations.
Abstract: In this paper, we use the framework of mod-$\phi$ convergence to prove precise large or moderate deviations for quite general sequences of random variables $(X_n)_{n\in\mathbb{N}}$. The random variables considered can be lattice or non-lattice distributed, and single or multi-dimensional; and one obtains precise estimates of the fluctuations $\mathbb{P}[X_{n} \in t_{n}B]$, instead of the usual estimates for the rate of exponential decay $\log(\mathbb{P}[X_{n} \in t_{n}B])$. In the special setting of mod-Gaussian convergence, we shall see that our approach allows us to identify the scale at which the central limit theorem ceases to hold and we are able to quantify the "breaking of symmetry" at this critical scale thanks to the residue or limiting function occurring in mod-$\phi$ convergence. In particular this provides us with a systematic way to characterise the normality zone, that is the zone in which the Gaussian approximation for the tails is still valid. Besides, the residue function measures the extent to which this approximation fails to hold at the edge of the normality zone. The first sections of the article are devoted to a proof of these abstract results. We then propose new examples covered by this theory and coming from various areas of mathematics: classical probability theory (multi-dimensional random walks, random point processes), number theory (statistics of additive arithmetic functions), combinatorics (statistics of random permutations), random matrix theory (characteristic polynomials of random matrices in compact Lie groups), graph theory (number of subgraphs in a random Erdős-Rényi graph), and non-commutative probability theory (asymptotics of random character values of symmetric groups). In particular, we complete our theory of precise deviations by a concrete method of cumulants and dependency graphs, which applies to many examples of sums of "weakly dependent" random variables. Although the latter methods can only be applied in the more restrictive setting of mod-Gaussian convergence, the large number as well as the variety of examples which are covered there hint at a universality class for second order fluctuations.

Posted Content
TL;DR: If the system is sufficiently ergodic that this data satisfies a strong central limit theorem (as is known to hold for chaotic Lorenz systems), then the governing equations can be exactly recovered as the solution to an $\ell_1$ minimization problem -- even if a large percentage of the data is corrupted by outliers.
Abstract: Learning the governing equations in dynamical systems from time-varying measurements is of great interest across different scientific fields. This task becomes prohibitive when such data is moreover highly corrupted, for example, due to the recording mechanism failing over unknown intervals of time. When the underlying system exhibits chaotic behavior, such as sensitivity to initial conditions, it is crucial to recover the governing equations with high precision. In this work, we consider continuous time dynamical systems $\dot{x} = f(x)$ where each component of $f: \mathbb{R}^{d} \rightarrow \mathbb{R}^d$ is a multivariate polynomial of maximal degree $p$; we aim to identify $f$ exactly from possibly highly corrupted measurements $x(t_1), x(t_2), \dots, x(t_m)$. As our main theoretical result, we show that if the system is sufficiently ergodic that this data satisfies a strong central limit theorem (as is known to hold for chaotic Lorenz systems), then the governing equations $f$ can be exactly recovered as the solution to an $\ell_1$ minimization problem -- even if a large percentage of the data is corrupted by outliers. Numerically, we apply the alternating minimization method to solve the corresponding constrained optimization problem. Through several examples of 3D chaotic systems and higher dimensional hyperchaotic systems, we illustrate the power, generality, and efficiency of the algorithm for recovering governing equations from noisy and highly corrupted measurement data.
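As a rough illustration of why $\ell_1$ minimization can recover coefficients exactly despite outliers, here is a minimal sketch for a scalar toy model $\dot{x} = a x$. It is not the alternating minimization method of the paper: the names `weighted_median` and `l1_fit_slope`, the toy data, and the reduction to a weighted median are assumptions made for this example only. In one dimension, minimizing $\sum_i |y_i - a x_i|$ over $a$ is the same as taking a weighted median of the ratios $y_i / x_i$ with weights $|x_i|$.

```python
def weighted_median(values, weights):
    """Return a minimizer of sum_i weights[i] * |values[i] - a| over a."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("empty input")


def l1_fit_slope(xs, ys):
    """Minimize sum_i |ys[i] - a * xs[i]| over the scalar a."""
    kept = [(x, y) for x, y in zip(xs, ys) if x != 0]
    ratios = [y / x for x, y in kept]
    weights = [abs(x) for x, _ in kept]
    return weighted_median(ratios, weights)


# Hypothetical toy data: exact derivative samples y = a_true * x,
# with 4 of the 10 samples overwritten by gross outliers.
a_true = -2.0
xs = [0.5 + 0.1 * k for k in range(10)]
ys = [a_true * x for x in xs]
for k in (0, 3, 6, 9):
    ys[k] = 100.0

a_l1 = l1_fit_slope(xs, ys)  # l1 fit: unaffected by the outliers here
a_ls = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)  # least squares

print("l1 estimate:", a_l1)
print("least-squares estimate:", a_ls)
```

In this toy setting the uncorrupted samples carry more than half of the total weight, so the $\ell_1$ fit returns the true coefficient, while the least-squares fit is pulled far away by the outliers.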

Journal ArticleDOI
TL;DR: In this paper, a central limit theorem is established for linear eigenvalue statistics of real elliptic random matrices, a class that generalizes Wigner matrices and random matrices with iid entries, under the assumption that the test functions are analytic.
Abstract: We consider a class of elliptic random matrices which generalize two classical ensembles from random matrix theory: Wigner matrices and random matrices with iid entries. In particular, we establish a central limit theorem for linear eigenvalue statistics of real elliptic random matrices under the assumption that the test functions are analytic. As a corollary, we extend the results of Rider and Silverstein (Ann Probab 34(6):2118–2143, 2006) to real iid random matrices.

Posted Content
TL;DR: In this article, it was shown that random shifts of the zeta function converge to a non-trivial random generalized function, which in turn is identified as a product of a very well behaved random smooth function and a complex Gaussian multiplicative chaos distribution, and that, up to scalar multiples, the zeta function and the characteristic polynomial of a Haar distributed random unitary matrix have an identical distribution on the mesoscopic scale.
Abstract: We prove that if $\omega$ is uniformly distributed on $[0,1]$, then as $T\to\infty$, $t\mapsto \zeta(i\omega T+it+1/2)$ converges to a non-trivial random generalized function, which in turn is identified as a product of a very well behaved random smooth function and a random generalized function known as a complex Gaussian multiplicative chaos distribution. This demonstrates a novel rigorous connection between number theory and the theory of multiplicative chaos -- the latter is known to be connected to many other areas of mathematics. We also investigate the statistical behavior of the zeta function on the mesoscopic scale. We prove that if we let $\delta_T$ approach zero slowly enough as $T\to\infty$, then $t\mapsto \zeta(1/2+i\delta_T t+i\omega T)$ is asymptotically a product of a divergent scalar quantity suggested by Selberg's central limit theorem and a strictly Gaussian multiplicative chaos. We also prove a similar result for the characteristic polynomial of a Haar distributed random unitary matrix, where the scalar quantity is slightly different but the multiplicative chaos part is identical. This essentially says that up to scalar multiples, the zeta function and the characteristic polynomial of a Haar distributed random unitary matrix have an identical distribution on the mesoscopic scale.