
Showing papers in "Bernoulli in 2017"


Journal ArticleDOI
TL;DR: The formal foundations of the algorithm are developed through the construction of measures on smooth manifolds, and the theory is shown to naturally identify efficient implementations and motivate promising generalizations.
Abstract: Although Hamiltonian Monte Carlo has proven an empirical success, the lack of a rigorous theoretical understanding of the algorithm has in many ways impeded both principled developments of the method and use of the algorithm in practice. In this paper, we develop the formal foundations of the algorithm through the construction of measures on smooth manifolds, and demonstrate how the theory naturally identifies efficient implementations and motivates promising generalizations.
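As an illustration of the algorithm whose foundations the paper formalizes, here is a minimal HMC sketch in Python (NumPy only). The leapfrog integrator, step size `eps`, and path length `L` are standard textbook choices, not details taken from the paper.

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, L):
    """One leapfrog trajectory: volume-preserving and time-reversible."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * eps * grad_U(q)           # initial half step in momentum
    for _ in range(L - 1):
        q += eps * p                      # full step in position
        p -= eps * grad_U(q)              # full step in momentum
    q += eps * p
    p -= 0.5 * eps * grad_U(q)            # final half step in momentum
    return q, -p                          # negate momentum to make the map an involution

def hmc_step(q, log_prob, grad_U, eps=0.1, L=20, rng=None):
    """Single HMC update targeting exp(log_prob), accepted via Metropolis."""
    rng = rng or np.random.default_rng()
    p = rng.standard_normal(q.shape)      # resample Gaussian momentum
    H0 = -log_prob(q) + 0.5 * p @ p
    q_new, p_new = leapfrog(q, p, grad_U, eps, L)
    H1 = -log_prob(q_new) + 0.5 * p_new @ p_new
    if rng.random() < np.exp(H0 - H1):    # accept with prob min(1, e^{H0 - H1})
        return q_new
    return q
```

Because the leapfrog integrator nearly conserves the Hamiltonian, acceptance rates stay high even for long trajectories; that near-conservation is easy to check numerically on a Gaussian target.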

192 citations


Journal ArticleDOI
TL;DR: In this article, the authors obtain concentration inequalities and expectation bounds for the operator norm of the deviation of the sample covariance operator from the true covariance operator, for i.i.d. centered Gaussian random variables in a separable Banach space.
Abstract: Let $X,X_{1},\dots,X_{n},\dots$ be i.i.d. centered Gaussian random variables in a separable Banach space $E$ with covariance operator $\Sigma$: \[\Sigma:E^{\ast}\mapsto E,\qquad\Sigma u=\mathbb{E}\langle X,u\rangle X,\qquad u\in E^{\ast}.\] The sample covariance operator $\hat{\Sigma}:E^{\ast}\mapsto E$ is defined as \[\hat{\Sigma}u:=n^{-1}\sum_{j=1}^{n}\langle X_{j},u\rangle X_{j},\qquad u\in E^{\ast}.\] The goal of the paper is to obtain concentration inequalities and expectation bounds for the operator norm $\Vert \hat{\Sigma}-\Sigma\Vert $ of the deviation of the sample covariance operator from the true covariance operator. In particular, it is shown that \[\mathbb{E}\Vert \hat{\Sigma}-\Sigma\Vert \asymp\Vert \Sigma\Vert (\sqrt{\frac{{\mathbf{r}}(\Sigma)}{n}}\vee \frac{{\mathbf{r}}(\Sigma)}{n}),\] where \[{\mathbf{r}}(\Sigma):=\frac{(\mathbb{E}\Vert X\Vert )^{2}}{\Vert \Sigma\Vert }.\] Moreover, it is proved that, under the assumption that ${\mathbf{r}}(\Sigma)\leq n$, for all $t\geq1$, with probability at least $1-e^{-t}$ \[\vert \Vert \hat{\Sigma}-\Sigma\Vert -M\vert \lesssim\Vert \Sigma\Vert (\sqrt{\frac{t}{n}}\vee \frac{t}{n}),\] where $M$ is either the median, or the expectation of $\Vert \hat{\Sigma}-\Sigma\Vert $. On the other hand, under the assumption that ${\mathbf{r}}(\Sigma)\geq n$, for all $t\geq1$, with probability at least $1-e^{-t}$ \[\vert \Vert \hat{\Sigma}-\Sigma\Vert -M\vert \lesssim\Vert \Sigma\Vert (\sqrt{\frac{{\mathbf{r}}(\Sigma)}{n}}\sqrt{\frac{t}{n}}\vee \frac{t}{n}).\]
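A quick numerical check of the effective-rank bound, sketched in finite-dimensional Euclidean space (so $E=\mathbb{R}^d$ and the operator norm is the spectral norm); the eigenvalue decay below is an arbitrary illustrative choice, and the constant factors in the comparison are loose by design.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 50, 2000
# diagonal covariance with polynomial eigenvalue decay (illustrative choice)
evals = 1.0 / (1 + np.arange(d)) ** 1.5
Sigma = np.diag(evals)
X = rng.standard_normal((n, d)) * np.sqrt(evals)   # rows are i.i.d. N(0, Sigma)

Sigma_hat = X.T @ X / n
op_norm = np.linalg.norm(Sigma_hat - Sigma, 2)     # spectral norm of the deviation

# effective rank r(Sigma) = (E||X||)^2 / ||Sigma||, with E||X|| estimated empirically
r_eff = np.mean(np.linalg.norm(X, axis=1)) ** 2 / np.linalg.norm(Sigma, 2)
# the paper's expectation bound, up to universal constants
bound = np.linalg.norm(Sigma, 2) * max(np.sqrt(r_eff / n), r_eff / n)
```

With these parameters the observed deviation `op_norm` lands within a modest constant multiple of `bound`, as the two-sided expectation bound predicts.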

153 citations


Journal ArticleDOI
TL;DR: This paper shows that incorporating a simple correlation measure into the tuning parameter leads to nearly optimal prediction performance of the Lasso even for highly correlated covariates; however, it also reveals that for moderately correlated covariates the prediction performance can be mediocre irrespective of the choice of tuning parameter.
Abstract: Although the Lasso has been extensively studied, the relationship between its prediction performance and the correlations of the covariates is not fully understood. In this paper, we give new insights into this relationship in the context of multiple linear regression. We show, in particular, that the incorporation of a simple correlation measure into the tuning parameter leads to a nearly optimal prediction performance of the Lasso even for highly correlated covariates. However, we also reveal that for moderately correlated covariates, the prediction performance of the Lasso can be mediocre irrespective of the choice of the tuning parameter. For the illustration of our approach with an important application, we deduce nearly optimal rates for the least-squares estimator with total variation penalty.

96 citations


Journal ArticleDOI
TL;DR: This paper proposed a new empirical Bayes approach for inference in the $p\gg n$ normal linear model, which uses data in the prior in two ways, for centering and regularization.
Abstract: We propose a new empirical Bayes approach for inference in the $p\gg n$ normal linear model. The novelty is the use of data in the prior in two ways, for centering and regularization. Under suitable sparsity assumptions, we establish a variety of concentration rate results for the empirical Bayes posterior distribution, relevant for both estimation and model selection. Computation is straightforward and fast, and simulation results demonstrate the strong finite-sample performance of the empirical Bayes model selection procedure.

85 citations


Journal ArticleDOI
TL;DR: In this article, a semigroup approach is taken to obtain the averaging principles for stochastic partial differential equations (SPDEs) under two-time-scale formulation, where the SPDEs are either modulated by a continuous-time Markov chain with a finite state space or have an additional fast jump component.
Abstract: This paper focuses on stochastic partial differential equations (SPDEs) under two-time-scale formulation. Distinct from the work in the existing literature, the systems are driven by $\alpha $-stable processes with $\alpha \in(1,2)$. In addition, the SPDEs are either modulated by a continuous-time Markov chain with a finite state space or have an additional fast jump component. The inclusion of the Markov chain addresses the need to treat random environments, whereas the additional fast jump process enables the consideration of discontinuity in the sample paths of the fast processes. Assuming either a fast changing Markov switching or an additional fast-varying jump process, this work aims to obtain the averaging principles for such systems. There are several distinct difficulties. First, the noise is not square integrable. Second, in our setup the underlying SPDE admits only a mild solution, so only a mild Itô formula can be used. Moreover, another new aspect is the addition of the fast regime switching and of the fast-varying jump processes in the formulation, which enlarges the applicability of the underlying systems. To overcome these difficulties, a semigroup approach is taken. Under suitable conditions, it is proved that the $p$th moment convergence takes place with $p\in(1,\alpha )$, which is stronger than the usual weak convergence approaches.

81 citations


Journal ArticleDOI
TL;DR: In this article, a test statistic is proposed that is a kernel-based function of the estimated latent positions obtained from the adjacency spectral embedding of each graph; it converges to the statistic computed from the true but unknown latent positions, and hence the proposed test procedure is consistent across a broad range of alternatives.
Abstract: We consider the problem of testing whether two independent finite-dimensional random dot product graphs have generating latent positions that are drawn from the same distribution, or distributions that are related via scaling or projection. We propose a test statistic that is a kernel-based function of the estimated latent positions obtained from the adjacency spectral embedding for each graph. We show that our test statistic using the estimated latent positions converges to the test statistic obtained using the true but unknown latent positions and hence that our proposed test procedure is consistent across a broad range of alternatives. Our proof of consistency hinges upon a novel concentration inequality for the suprema of an empirical process in the estimated latent positions setting.
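The adjacency spectral embedding underlying the test statistic can be sketched in a few lines. The rank-one Erdős–Rényi check below is an illustrative special case of a random dot product graph (constant latent position $\sqrt{p}$), not an example from the paper, and the embedding is only identifiable up to sign.

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """d-dimensional ASE: eigenvectors of the adjacency matrix for the d
    largest-magnitude eigenvalues, scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]      # top-d eigenvalues by magnitude
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))
```

For an Erdős–Rényi graph with edge probability p = 0.25, each estimated one-dimensional latent position should be close to sqrt(0.25) = 0.5 in absolute value.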

67 citations


Journal ArticleDOI
TL;DR: In this paper, a kernel-type copula density estimator is proposed, which is based on the idea of transforming the uniform marginals of the copula densities into normal distributions via the probit function, estimating the density in the transformed domain, which can be accomplished without boundary problems, and obtaining an estimate of the Copula density through back-transformation.
Abstract: Copula modelling has become ubiquitous in modern statistics. Here, the problem of nonparametrically estimating a copula density is addressed. Arguably the most popular nonparametric density estimator, the kernel estimator is not suitable for the unit-square-supported copula densities, mainly because it is heavily affected by boundary bias issues. In addition, most common copulas admit unbounded densities, and kernel methods are not consistent in that case. In this paper, a kernel-type copula density estimator is proposed. It is based on the idea of transforming the uniform marginals of the copula density into normal distributions via the probit function, estimating the density in the transformed domain, which can be accomplished without boundary problems, and obtaining an estimate of the copula density through back-transformation. Although natural, a raw application of this procedure was, however, seen not to perform very well in the earlier literature. Here, it is shown that, if combined with local likelihood density estimation methods, the idea yields very good and easy-to-implement estimators, fixing boundary issues in a natural way and able to cope with unbounded copula densities. The asymptotic properties of the suggested estimators are derived, and a practical way of selecting the crucially important smoothing parameters is devised. Finally, extensive simulation studies and a real data analysis evidence their excellent performance compared to their main competitors.
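The probit-transformation idea can be sketched as follows. For simplicity this uses a plain Gaussian KDE in the transformed domain — the "raw application" the abstract mentions — whereas the paper advocates local likelihood density estimation there.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

def probit_copula_density(u_sample, u_eval):
    """Naive probit-transformation copula density estimator: map the
    pseudo-observations to R^2 via the probit function (removing the
    boundary), estimate the density there, and back-transform."""
    s = norm.ppf(u_sample)                # (0,1)^2 -> R^2
    kde = gaussian_kde(s.T)               # density estimate in transformed domain
    t = norm.ppf(u_eval)
    jac = norm.pdf(t).prod(axis=1)        # Jacobian of the probit map
    return kde(t.T) / jac                 # copula density on the unit square
```

Sanity check: for independent uniforms the true copula density is identically 1, so the estimate at the centre of the square should be close to 1.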

63 citations


Journal ArticleDOI
TL;DR: In this paper, the authors define growth-fragmentation processes by considering the family of sizes of cells alive at some fixed time, and establish a simple criterion for excessiveness in terms of $X$.
Abstract: Consider a Markov process $X$ on $[0,\infty)$ which has only negative jumps and converges as time tends to infinity a.s. We interpret $X(t)$ as the size of a typical cell at time $t$, and each jump as a birth event. More precisely, if $\Delta X(s)=-y<0$, then $s$ is the birth time of a daughter cell with size $y$ which then evolves independently and according to the same dynamics, that is, giving birth in turn to great-daughters, and so on. After having constructed rigorously such cell systems as a general branching process, we define growth-fragmentation processes by considering the family of sizes of cells alive at some fixed time. We introduce the notion of excessive functions for the latter, whose existence provides a natural sufficient condition for the non-explosion of the system. We establish a simple criterion for excessiveness in terms of $X$. The case when $X$ is self-similar is treated in detail, and connections with self-similar fragmentations and compensated fragmentations are emphasized.

56 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the magnitude, asymptotics and duration of drawdowns for some Lévy processes and derive the law of the duration of drawdowns, also known as the "Time to Recover" (TTR) the historical maximum.
Abstract: This paper considers magnitude, asymptotics and duration of drawdowns for some Lévy processes. First, we revisit some existing results on the magnitude of drawdowns for spectrally negative Lévy processes using an approximation approach. For any spectrally negative Lévy process whose scale functions are well-behaved at 0+, we then study the asymptotics of drawdown quantities when the threshold of drawdown magnitude approaches zero. We also show that such asymptotics is robust to perturbations of additional positive compound Poisson jumps. Finally, thanks to the asymptotic results and some recent works on the running maximum of Lévy processes, we derive the law of duration of drawdowns for a large class of Lévy processes (with a general spectrally negative part plus a positive compound Poisson structure). The duration of drawdowns is also known as the "Time to Recover" (TTR) the historical maximum, which is a widely used performance measure in the fund management industry. We find that the law of duration of drawdowns qualitatively depends on the path type of the spectrally negative component of the underlying Lévy process.

Keywords: Asymptotics; Drawdown; Duration; Lévy process; Magnitude; Parisian stopping time

MSC(2000): Primary 60G40; Secondary 60G51
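For a discretely observed path, the drawdown quantities studied here — magnitude and duration — reduce to simple bookkeeping against the running maximum; this sketch is that bookkeeping only, not the paper's Lévy-theoretic analysis.

```python
import numpy as np

def drawdown_stats(path):
    """Running maximum, drawdown magnitude, and drawdown duration
    (steps elapsed since the running maximum was last attained)."""
    run_max = np.maximum.accumulate(path)
    dd = run_max - path                       # drawdown magnitude at each step
    dur = np.zeros(len(path), dtype=int)
    for i in range(1, len(path)):
        dur[i] = 0 if dd[i] == 0 else dur[i - 1] + 1
    return run_max, dd, dur
```

For the toy path 1, 3, 2, 2, 4 the maximum is recovered at the last step, so the duration resets to zero there.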

52 citations


Journal ArticleDOI
TL;DR: In this paper, the particle-based, rapid incremental smoother (PaRIS) is proposed for efficient online approximation of smoothed expectations in general hidden Markov models, and is then used within a forward implementation of the classical expectation-maximization algorithm for online parameter estimation.
Abstract: This thesis consists of two papers studying online inference in general hidden Markov models using sequential Monte Carlo methods. The first paper presents a novel algorithm, the particle-based, rapid incremental smoother (PaRIS), aimed at efficiently performing online approximation of smoothed expectations of additive state functionals in general hidden Markov models. The algorithm has, under weak assumptions, linear computational complexity and very limited memory requirements. It is also furnished with a number of convergence results, including a central limit theorem. The second paper focuses on online estimation of parameters in a general hidden Markov model. The proposed method is based on a forward implementation of the classical expectation-maximization algorithm and uses PaRIS to achieve an efficient implementation.

52 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the random design regression model with square loss and establish sharp oracle inequalities for the risk of an aggregation method, showing that the excess risk rate matches the behavior of the minimax risk of function estimation in regression under the well-specified model.
Abstract: We consider the random design regression model with square loss. We propose a method that aggregates empirical minimizers (ERM) over appropriately chosen random subsets and reduces to ERM in the extreme case, and we establish sharp oracle inequalities for its risk. We show that, under the $\varepsilon^{-p}$ growth of the empirical $\varepsilon$-entropy, the excess risk of the proposed method attains the rate $n^{-2/(2+p)}$ for $p\in(0,2)$ and $n^{-1/p}$ for $p>2$ where $n$ is the sample size. Furthermore, for $p\in(0,2)$, the excess risk rate matches the behavior of the minimax risk of function estimation in regression problems under the well-specified model. This yields a conclusion that the rates of statistical estimation in well-specified models (minimax risk) and in misspecified models (minimax regret) are equivalent in the regime $p\in(0,2)$. In other words, for $p\in(0,2)$ the problem of statistical learning enjoys the same minimax rate as the problem of statistical estimation. On the contrary, for $p>2$ we show that the rates of the minimax regret are, in general, slower than for the minimax risk. Our oracle inequalities also imply the $v\log(n/v)/n$ rates for Vapnik–Chervonenkis type classes of dimension $v$ without the usual convexity assumption on the class; we show that these rates are optimal. Finally, for a slightly modified method, we derive a bound on the excess risk of $s$-sparse convex aggregation improving that of Lounici [Math. Methods Statist. 16 (2007) 246–259] and providing the optimal rate.

Journal ArticleDOI
TL;DR: In this paper, the authors extend Stein's method to the distribution of the product of independent mean zero normal random variables and obtain a Stein equation for this class of distributions, which reduces to the classical normal Stein equation in the case $n=1$.
Abstract: In this paper, we extend Stein’s method to the distribution of the product of $n$ independent mean zero normal random variables. A Stein equation is obtained for this class of distributions, which reduces to the classical normal Stein equation in the case $n=1$. This Stein equation motivates a generalisation of the zero bias transformation. We establish properties of this new transformation, and illustrate how they may be used together with the Stein equation to assess distributional distances for statistics that are asymptotically distributed as the product of independent central normal random variables. We end by proving some product normal approximation theorems.

Journal ArticleDOI
TL;DR: In this paper, a consistent estimator for the homology (an algebraic structure representing connected components and cycles) of level sets of both density and regression functions is introduced, based on kernel estimation.
Abstract: We introduce a consistent estimator for the homology (an algebraic structure representing connected components and cycles) of level sets of both density and regression functions. Our method is based on kernel estimation. We apply this procedure to two problems: (1) inferring the homology structure of manifolds from noisy observations, (2) inferring the persistent homology (a multi-scale extension of homology) of either density or regression functions. We prove consistency for both of these problems. In addition to the theoretical results, we demonstrate these methods on simulated data for binary regression and clustering applications.
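A minimal version of the plug-in idea for the zeroth homology (number of connected components) of a density level set: evaluate a kernel density estimate on a grid, threshold it, and count connected components. The two-cluster setup in the check below is an illustrative assumption, not data from the paper.

```python
import numpy as np
from scipy.ndimage import label
from scipy.stats import gaussian_kde

def level_set_betti0(points, level, grid_n=100):
    """Plug-in estimate of beta_0 of the upper level set {f >= level}
    of a kernel density estimate, via grid evaluation and labeling."""
    kde = gaussian_kde(points.T)
    lo, hi = points.min() - 1.0, points.max() + 1.0
    xs = np.linspace(lo, hi, grid_n)
    X, Y = np.meshgrid(xs, xs)
    dens = kde(np.vstack([X.ravel(), Y.ravel()])).reshape(X.shape)
    _, n_comp = label(dens >= level)      # connected components of the mask
    return n_comp
```

Two well-separated Gaussian clusters should yield two components at a suitably chosen level, which is the clustering application the abstract alludes to.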

Journal ArticleDOI
TL;DR: A superfamily of divergences which contains both the power divergence and the density power divergence families as special cases is considered, indicating that this superfamily has real utility, rather than just being a routine generalization.
Abstract: The power divergence (PD) and the density power divergence (DPD) families have proven to be useful tools in the area of robust inference. In this paper, we consider a superfamily of divergences which contains both of these families as special cases. The role of this superfamily is studied in several statistical applications, and desirable properties are identified and discussed. In many cases, it is observed that the most preferred minimum divergence estimator within the above collection lies outside the class of minimum PD or minimum DPD estimators, indicating that this superfamily has real utility, rather than just being a routine generalization. The limitation of the usual first order influence function as an effective descriptor of the robustness of the estimator is also demonstrated in this connection.

Journal ArticleDOI
TL;DR: In this paper, the authors establish Bernstein-type concentration inequalities for occupancy counts and the missing mass in an infinite urn scheme, show that the variance factors in these inequalities are tight when the sampling distribution satisfies a regular variation property, and derive tight confidence intervals for the Good-Turing estimator of the missing mass.
Abstract: An infinite urn scheme is defined by a probability mass function $(p_j)_{j\geq 1}$ over positive integers. A random allocation consists of a sample of $N$ independent drawings according to this probability distribution where $N$ may be deterministic or Poisson-distributed. This paper is concerned with occupancy counts, that is with the number of symbols with $r$ or at least $r$ occurrences in the sample, and with the missing mass that is the total probability of all symbols that do not occur in the sample. Without any further assumption on the sampling distribution, these random quantities are shown to satisfy Bernstein-type concentration inequalities. The variance factors in these concentration inequalities are shown to be tight if the sampling distribution satisfies a regular variation property. This regular variation property reads as follows. Let the number of symbols with probability larger than $x$ be $\nu(x)=|\{j\colon p_j\geq x\}|$. In a regularly varying urn scheme, $\nu$ satisfies $\lim_{\tau\rightarrow 0}\nu(\tau x)/\nu(\tau)=x^{-\alpha}$ for $\alpha\in[0,1]$ and the variance of the number of distinct symbols in a sample tends to infinity as the sample size tends to infinity. Among other applications, these concentration inequalities allow us to derive tight confidence intervals for the Good-Turing estimator of the missing mass.
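The Good-Turing estimator whose confidence intervals the paper studies is simple to state: the missing mass is estimated by the proportion of symbols seen exactly once in the sample.

```python
from collections import Counter

def good_turing_missing_mass(sample):
    """Good-Turing estimator of the missing mass: (# singletons) / n."""
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)   # symbols seen exactly once
    return n1 / len(sample)
```

For instance, in the sample 1, 1, 2, 3 the symbols 2 and 3 are singletons, so the estimated missing mass is 2/4 = 0.5.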

Journal ArticleDOI
TL;DR: In this paper, a robust estimator of the autocorrelation parameter is proposed, which is consistent and satisfies a central limit theorem in the Gaussian case; the classical inference approach is then followed by plugging this estimator into the criteria used for change-point estimation.
Abstract: We consider the problem of multiple change-point estimation in the mean of an $\operatorname{AR}(1)$ process. Taking into account the dependence structure does not allow us to use the dynamic programming algorithm, which is the only algorithm giving the optimal solution in the independent case. We propose a robust estimator of the autocorrelation parameter, which is consistent and satisfies a central limit theorem in the Gaussian case. Then, we propose to follow the classical inference approach, by plugging this estimator in the criteria used for change-point estimation. We show that the asymptotic properties of these estimators are the same as those of the classical estimators in the independent framework. The same plug-in approach is then used to approximate the modified BIC and choose the number of segments. This method is implemented in the R package AR1seg and is available from the Comprehensive R Archive Network (CRAN). This package is used in the simulation section in which we show that for finite sample sizes taking into account the dependence structure improves the statistical performance of the change-point estimators and of the selection criterion.

Journal ArticleDOI
TL;DR: In this paper, the authors derived large sample consistency results and rates of convergence for the problem of embedding points based on triple or quadruple distance comparisons, and bound the number of such comparisons needed to achieve consistency.
Abstract: Motivated by recent work on ordinal embedding (In Proceedings of the 27th Conference on Learning Theory (2014) 40–67), we derive large sample consistency results and rates of convergence for the problem of embedding points based on triple or quadruple distance comparisons. We also consider a variant of this problem where only local comparisons are provided. Finally, inspired by (In Communication, Control, and Computing (Allerton), 2011 49th Annual Allerton Conference on (2011) 1077–1084 IEEE), we bound the number of such comparisons needed to achieve consistency.

Journal ArticleDOI
TL;DR: In this article, an unbiased simulation method for multidimensional diffusions is proposed, based on the parametrix method for solving partial differential equations with Hölder continuous coefficients; it relies on an Euler scheme with random time steps.
Abstract: In this article, we consider an unbiased simulation method for multidimensional diffusions based on the parametrix method for solving partial differential equations with Hölder continuous coefficients. This Monte Carlo method, which is based on an Euler scheme with random time steps, can be considered as an infinite-dimensional extension of the multilevel Monte Carlo method for solutions of stochastic differential equations with Hölder continuous coefficients. In particular, we study the properties of the variance of the proposed method. In most cases, the method has infinite variance and therefore we propose an importance sampling method to resolve this issue.

Journal ArticleDOI
TL;DR: In this article, a Monte Carlo method for simulating a multi-dimensional diffusion process conditioned on hitting a fixed point at a fixed future time is developed, where proposals for such diffusion bridges are obtained by superimposing an additional guiding term to the drift of the process under consideration.
Abstract: A Monte Carlo method for simulating a multi-dimensional diffusion process conditioned on hitting a fixed point at a fixed future time is developed. Proposals for such diffusion bridges are obtained by superimposing an additional guiding term to the drift of the process under consideration. The guiding term is derived via approximation of the target process by a simpler diffusion process with known transition densities. Acceptance of a proposal can be determined by computing the likelihood ratio between the proposal and the target bridge, which is derived in closed form. We show under general conditions that the likelihood ratio is well defined and that a class of proposals with guiding term obtained from linear approximations falls under these conditions.
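A sketch of the guided-proposal construction: the simplest guiding term, obtained from approximating the target process by a Brownian bridge, adds the drift $(v-x)/(T-t)$ pulling the path to the endpoint $v$ at time $T$. The closed-form likelihood ratio that the paper derives to correct for this approximation is omitted here.

```python
import numpy as np

def guided_bridge(b, x0, v, T, n_steps, sigma=1.0, rng=None):
    """Euler scheme for a guided bridge proposal: drift b(x) plus the
    guiding term (v - x)/(T - t) forcing the path toward v at time T."""
    rng = rng or np.random.default_rng()
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for k in range(n_steps):
        t = k * dt
        drift = b(x) + (v - x) / (T - t)     # original drift + guiding pull
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
        path.append(x.copy())
    return np.array(path)
```

At the final step the guiding term moves the state essentially onto $v$, so the simulated endpoint differs from the target only by one increment of noise of order $\sqrt{dt}$.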

Journal ArticleDOI
TL;DR: Weak convergence of the empirical copula process with respect to weighted supremum distances is established for serially dependent observations; such results can be used to derive tests for stochastic independence or specific copula structures, and serve as a fundamental tool for the analysis of multivariate rank statistics.
Abstract: The empirical copula process plays a central role in the asymptotic analysis of many statistical procedures which are based on copulas or ranks. Among other applications, results regarding its weak convergence can be used to develop asymptotic theory for estimators of dependence measures or copula densities, they allow one to derive tests for stochastic independence or specific copula structures, or they may serve as a fundamental tool for the analysis of multivariate rank statistics. In the present paper, we establish weak convergence of the empirical copula process (for observations that are allowed to be serially dependent) with respect to weighted supremum distances. The usefulness of our results is illustrated by applications to general bivariate rank statistics and to estimation procedures for the Pickands dependence function arising in multivariate extreme-value theory.

Journal ArticleDOI
TL;DR: In this paper, a Multilevel Richardson-Romberg (MLRR) estimator is proposed, which combines the higher-order bias cancellation of the Multistep Richardson-Romberg (MSRR) method introduced in [Pages 07] with the variance control resulting from the stratification in the Multilevel Monte Carlo (MLMC) method.
Abstract: We propose and analyze a Multilevel Richardson-Romberg ($MLRR$) estimator which combines the higher order bias cancellation of the Multistep Richardson-Romberg ($MSRR$) method introduced in [Pages 07] and the variance control resulting from the stratification in the Multilevel Monte Carlo ($MLMC$) method (see [Heinrich, 01] and [Giles, 08]). Thus we show that in standard frameworks like discretization schemes of diffusion processes an assigned quadratic error $\varepsilon$ can be obtained with our ($MLRR$) estimator with a global complexity of $\log(1/\varepsilon)/\varepsilon^2$ instead of $(\log(1/\varepsilon))^2/\varepsilon^2$ with the standard ($MLMC$) method, at least when the weak error $E[Y_h]-E[Y_0]$ of the biased implemented estimator $Y_h$ can be expanded at any order in $h$. We analyze and compare these estimators on two numerical problems: the classical vanilla and exotic option pricing by Monte Carlo simulation and the less classical Nested Monte Carlo simulation.
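The multilevel telescoping that both MLMC and the proposed MLRR estimator build on can be sketched with a toy level family; the quadratic family below is an invented example (not from the paper) chosen so that the limiting expectation is known exactly.

```python
import numpy as np

def mlmc(sampler, L, n_per_level, rng):
    """Plain multilevel Monte Carlo estimator of E[Y_L] via the telescope
    E[Y_L] = E[Y_0] + sum_{l=1}^{L} E[Y_l - Y_{l-1}], with each correction
    estimated from independent samples."""
    est = 0.0
    for l in range(L + 1):
        fine, coarse = sampler(l, n_per_level[l], rng)
        est += np.mean(fine - coarse)
    return est

def toy_sampler(l, n, rng):
    """Toy biased family Y_l = X^2 * (1 - 2^{-(l+1)}), X ~ N(0,1).
    Fine and coarse levels share the same X: the coupling that makes
    the correction variances shrink geometrically."""
    x2 = rng.standard_normal(n) ** 2
    fine = x2 * (1.0 - 2.0 ** -(l + 1))
    coarse = x2 * (1.0 - 2.0 ** -l) if l > 0 else np.zeros(n)
    return fine, coarse
```

Since E[X^2] = 1, the estimator should return a value close to 1, with a residual bias of 2^{-(L+1)} at the finest level used.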

Journal ArticleDOI
TL;DR: It is shown that different seeds lead to different distributions of limiting trees from a total variation point of view, and statistics are constructed that measure, in a certain well-defined sense, global "balancedness" properties of such trees.
Abstract: We study the influence of the seed in random trees grown according to the uniform attachment model, also known as uniform random recursive trees. We show that different seeds lead to different distributions of limiting trees from a total variation point of view. To do this, we construct statistics that measure, in a certain well-defined sense, global “balancedness” properties of such trees. Our paper follows recent results on the same question for the preferential attachment model.
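Uniform random recursive trees are easy to simulate from a given seed, which is exactly the growth model of the paper; the degree-sequence helper below is one of many statistics one could compute, not the balancedness statistics constructed by the authors.

```python
import numpy as np

def uniform_recursive_tree(n, seed_edges, rng):
    """Grow a uniform random recursive tree on n vertices from a seed tree:
    each new vertex attaches to a uniformly chosen existing vertex."""
    edges = list(seed_edges)
    n_seed = max(max(e) for e in edges) + 1          # vertices 0..n_seed-1
    for v in range(n_seed, n):
        u = int(rng.integers(v))                     # uniform over 0..v-1
        edges.append((u, v))
    return edges

def degree_sequence(n, edges):
    """Vertex degrees, a crude summary one might compare across seeds."""
    deg = np.zeros(n, dtype=int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg
```

Comparing the laws of such summaries for trees grown from different seeds is, informally, the kind of distributional distinction the paper makes rigorous in total variation.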

Journal ArticleDOI
TL;DR: In this article, the authors investigate weak convergence of the finite-dimensional distributions of a random process with immigration at the epochs of a renewal process; the limits include Gaussian processes with explicitly given covariance functions and fractionally integrated stable Lévy motions and their sums when the law of $\xi$ belongs to the domain of attraction of a stable law with finite mean.
Abstract: Let $(X_{1},\xi_{1}),(X_{2},\xi_{2}),\ldots$ be i.i.d. copies of a pair $(X,\xi)$ where $X$ is a random process with paths in the Skorokhod space $D[0,\infty)$ and $\xi$ is a positive random variable. Define $S_{k}:=\xi_{1}+\cdots+\xi_{k}$, $k\in\mathbb{N}_{0}$ and $Y(t):=\sum_{k\geq0}X_{k+1}(t-S_{k})\mathbf{1}_{\{S_{k}\leq t\}}$, $t\geq0$. We call the process $(Y(t))_{t\geq0}$ random process with immigration at the epochs of a renewal process. We investigate weak convergence of the finite-dimensional distributions of $(Y(ut))_{u>0}$ as $t\to\infty$. Under the assumptions that the covariance function of $X$ is regularly varying in $(0,\infty)\times(0,\infty)$ in a uniform way, the class of limiting processes is rather rich and includes Gaussian processes with explicitly given covariance functions, fractionally integrated stable Levy motions and their sums when the law of $\xi$ belongs to the domain of attraction of a stable law with finite mean, and conditionally Gaussian processes with explicitly given (conditional) covariance functions, fractionally integrated inverse stable subordinators and their sums when the law of $\xi$ belongs to the domain of attraction of a stable law with infinite mean.

Journal ArticleDOI
TL;DR: In this article, two variational formulas for the quenched free energy of a random walk in random potential (RWRP) were given for the directed i.i.d. case, where the underlying walk is directed or undirected and the environment is stationary and ergodic.
Abstract: We give two variational formulas (qVar1) and (qVar2) for the quenched free energy of a random walk in random potential (RWRP) when (i) the underlying walk is directed or undirected, (ii) the environment is stationary and ergodic, and (iii) the potential is allowed to depend on the next step of the walk which covers random walk in random environment (RWRE). In the directed i.i.d. case, we also give two variational formulas (aVar1) and (aVar2) for the annealed free energy of RWRP. These four formulas are the same except that they involve infima over different sets, and the first two are modified versions of a previously known variational formula (qVar0) for which we provide a short alternative proof. Then, we show that (qVar0) always has a minimizer, (aVar2) never has any minimizers unless the RWRP is an RWRE, and (aVar1) has a minimizer if and only if the RWRP is in the weak disorder regime. In the latter case, the minimizer of (aVar1) is unique and it is also the unique minimizer of (qVar1), but (qVar2) has no minimizers except for RWRE. In the case of strong disorder, we give a sufficient condition for the nonexistence of minimizers of (qVar1) and (qVar2) which is satisfied for the log-gamma directed polymer with a sufficiently small parameter. We end with a conjecture which implies that (qVar1) and (qVar2) have no minimizers under very strong disorder.

Journal ArticleDOI
TL;DR: In this article, the posterior consistency of Dirichlet process mixtures of Gaussian kernels with various prior specifications on the covariance matrix is established, and posterior convergence rates are also discussed.
Abstract: Density estimation represents one of the most successful applications of Bayesian nonparametrics. In particular, Dirichlet process mixtures of normals are the gold standard for density estimation and their asymptotic properties have been studied extensively, especially in the univariate case. However, a gap between practitioners and the current theoretical literature is present. So far, posterior asymptotic results in the multivariate case are available only for location mixtures of Gaussian kernels with independent prior on the common covariance matrix, while in practice as well as from a conceptual point of view a location-scale mixture is often preferable. In this paper, we address posterior consistency for such general mixture models by adapting a convergence rate result which combines the usual low-entropy, high-mass sieve approach with a suitable summability condition. Specifically, we establish consistency for Dirichlet process mixtures of Gaussian kernels with various prior specifications on the covariance matrix. Posterior convergence rates are also discussed.

Journal ArticleDOI
TL;DR: In this article, the laws of the iterated logarithm (LIL) for sample paths, local times and ranges are established for symmetric jump processes on metric measure spaces.
Abstract: Based on two-sided heat kernel estimates for a class of symmetric jump processes on metric measure spaces, the laws of the iterated logarithm (LILs) for sample paths, local times and ranges are established. In particular, the LILs are obtained for $\beta$-stable-like processes on $\alpha$-sets with $\beta>0$.

Journal ArticleDOI
TL;DR: In this paper, the authors considered a stochastic diffusion process with an unknown parameter in the diffusion coefficient, and found easily verified conditions on approximate martingale estimating functions under which estimators are consistent, rate optimal, and efficient under high frequency (in-fill) asymptotics.
Abstract: Parametric estimation for diffusion processes is considered for high frequency observations over a fixed time interval. The processes solve stochastic differential equations with an unknown parameter in the diffusion coefficient. We find easily verified conditions on approximate martingale estimating functions under which estimators are consistent, rate optimal, and efficient under high frequency (in-fill) asymptotics. The asymptotic distributions of the estimators are shown to be normal variance-mixtures, where the mixing distribution generally depends on the full sample path of the diffusion process over the observation time interval. Utilising the concept of stable convergence, we also obtain the more easily applicable result that for a suitable data dependent normalisation, the estimators converge in distribution to a standard normal distribution. The theory is illustrated by a simulation study comparing an efficient and a non-efficient estimating function for an ergodic and a non-ergodic model.
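As a toy illustration of in-fill asymptotics for a parameter in the diffusion coefficient — far simpler than the paper's general approximate martingale estimating functions — consider the model $dX_t = \sigma\,dW_t$ observed at $n$ equispaced times on $[0,T]$. The realized-variance estimator of $\sigma^2$ is consistent as the mesh shrinks. The model and all numbers below are my own illustrative choices:

```python
import math
import random

random.seed(42)

# Toy model dX_t = sigma * dW_t observed at n high-frequency times on [0, T].
sigma, T, n = 0.5, 1.0, 10_000
dt = T / n
increments = [sigma * math.sqrt(dt) * random.gauss(0.0, 1.0) for _ in range(n)]

# Realized-variance estimator of sigma^2, consistent under in-fill
# (high-frequency) asymptotics; its relative error is of order sqrt(2 / n).
sigma2_hat = sum(dx * dx for dx in increments) / T
```

For a genuinely random diffusion coefficient (e.g. $\sigma = \sigma(X_t)$), the same quadratic-variation statistic converges to a mixture, which is the normal variance-mixture phenomenon described in the abstract.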

Journal ArticleDOI
TL;DR: In this article, it was shown that the conjecture that upper tail probabilities of Gaussian processes are 1.1527 −1/γ(1/δ)-1/α/γ 1/α for small δ is false for all δ > 0.
Abstract: Pickands’ constants $H_{\alpha}$ appear in various classical limit results about tail probabilities of suprema of Gaussian processes. It is an often quoted conjecture that perhaps $H_{\alpha}=1/\Gamma(1/\alpha)$ for all $0<\alpha \leq 2$, but it is also frequently observed that this does not seem compatible with evidence coming from simulations. We prove the conjecture is false for small $\alpha$, and in fact that $H_{\alpha}\geq (1.1527)^{1/\alpha}/\Gamma(1/\alpha)$ for all sufficiently small $\alpha$. The proof is a refinement of the “conditioning and comparison” approach to lower bounds for upper tail probabilities, developed in a previous paper of the author. Some calculations of hitting probabilities for Brownian motion are also involved.
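A purely numeric comparison of the two expressions (not a proof at any particular $\alpha$; the paper's bound holds for all sufficiently small $\alpha$) shows why the result kills the conjecture: the factor $(1.1527)^{1/\alpha}$ blows up as $\alpha\to 0$.

```python
import math

def conjectured(alpha):
    """The (disproved) conjectured value H_alpha = 1 / Gamma(1/alpha)."""
    return 1.0 / math.gamma(1.0 / alpha)

def proved_lower_bound(alpha):
    """The paper's lower bound H_alpha >= 1.1527**(1/alpha) / Gamma(1/alpha)."""
    return 1.1527 ** (1.0 / alpha) / math.gamma(1.0 / alpha)

# At alpha = 0.1 the lower bound already exceeds the conjectured value
# by the factor 1.1527**10, i.e. roughly fourfold.
ratio = proved_lower_bound(0.1) / conjectured(0.1)
```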

Journal ArticleDOI
TL;DR: In this paper, the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal likelihood in Bayesian inference are computed for the singularities of Gaussian latent tree and forest models.
Abstract: Gaussian latent tree models, or more generally, Gaussian latent forest models have Fisher-information matrices that become singular along interesting submodels, namely, models that correspond to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal likelihood in Bayesian inference. This provides the information needed for a recently introduced generalization of the Bayesian information criterion. Our mathematical developments treat the general setting of Laplace integrals whose phase functions are sums of squared differences between monomials and constants. We clarify how in this case real log-canonical thresholds can be computed using polyhedral geometry, and we show how to apply the general theory to the Laplace integrals associated with Gaussian latent tree and forest models. In simulations and a data example, we demonstrate how the mathematical knowledge can be applied in model selection.
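A minimal numeric sketch of the phenomenon the thresholds quantify (my own toy example, not one of the paper's Laplace integrals): for the one-dimensional phase function $x^4$ the real log-canonical threshold is $1/4$, so the Laplace integral $Z(n)=\int_{-1}^{1}e^{-nx^4}\,dx$ decays like $n^{-1/4}$ rather than the regular rate $n^{-1/2}$.

```python
import math

def laplace_integral(n, grid_points=20001):
    """Trapezoidal approximation of Z(n) = integral of exp(-n * x**4) over [-1, 1].

    The phase x**4 has real log-canonical threshold 1/4, so Z(n) scales
    like n**(-1/4) as n grows (a degenerate, non-Gaussian Laplace rate).
    """
    h = 2.0 / (grid_points - 1)
    xs = [-1.0 + i * h for i in range(grid_points)]
    vals = [math.exp(-n * x ** 4) for x in xs]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Multiplying n by 16 should shrink Z(n) by 16**(-1/4) = 1/2.
ratio = laplace_integral(16000) / laplace_integral(1000)
```

The threshold enters model selection through $\log Z(n) \approx -\lambda \log n$, which is exactly the correction exploited by the generalized Bayesian information criterion the abstract mentions.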

Journal ArticleDOI
TL;DR: Bounds for the hypergeometric distribution are established which include a finite sampling correction factor, but are otherwise analogous to bounds for the binomial distribution due to León and Perron and Talagrand.
Abstract: We establish exponential bounds for the hypergeometric distribution which include a finite sampling correction factor, but are otherwise analogous to bounds for the binomial distribution due to León and Perron (Statist. Probab. Lett. 62 (2003) 345–354) and Talagrand (Ann. Probab. 22 (1994) 28–76). We also extend a convex ordering of Kemperman (Nederl. Akad. Wetensch. Proc. Ser. A 76 = Indag. Math. 35 (1973) 149–164) for sampling without replacement from populations of real numbers between zero and one: a population of all zeros or ones (and hence yielding a hypergeometric distribution in the upper bound) gives the extreme case.
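To illustrate the flavor of such results — using the classical Hoeffding bound and Serfling's finite-sampling correction as stand-ins, not the sharper León–Perron/Talagrand-style bounds the paper establishes — one can compare the exact hypergeometric tail with both bounds on a small example:

```python
import math
from math import comb, exp

def hyper_tail(N, K, n, k):
    """Exact P(X >= k) for X ~ Hypergeometric(N, K, n): n draws without
    replacement from a population of N items, K of which are ones."""
    return sum(
        comb(K, j) * comb(N - K, n - j) for j in range(k, min(n, K) + 1)
    ) / comb(N, n)

# Sample n = 5 without replacement from N = 20 items of which K = 10 are ones.
N, K, n = 20, 10, 5
p = K / N
t = 0.3                       # deviation: bound P(X/n >= p + t)
k = math.ceil(n * (p + t))    # smallest count reaching the deviation

exact = hyper_tail(N, K, n, k)
hoeffding = exp(-2 * n * t * t)                      # also valid without replacement
serfling = exp(-2 * n * t * t / (1 - (n - 1) / N))   # finite-sampling correction
```

The correction factor $1-(n-1)/N$ shrinks to zero as the sample exhausts the population, tightening the bound — the same mechanism, in cruder form, as the correction factor in the abstract.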