
Showing papers in "Bernoulli in 2015"


Journal ArticleDOI
TL;DR: In this paper, the authors show that the geometric median of a collection of independent "weakly concentrated" estimators satisfies a much stronger deviation bound than each individual element in the collection, and illustrate the approach through several examples, including sparse linear regression and low-rank matrix recovery problems.
Abstract: In many real-world applications, collected data are contaminated by noise with heavy-tailed distribution and might contain outliers of large magnitude. In this situation, it is necessary to apply methods which produce reliable outcomes even if the input contains corrupted measurements. We describe a general method which allows one to obtain estimators with tight concentration around the true parameter of interest taking values in a Banach space. The suggested construction relies on the fact that the geometric median of a collection of independent “weakly concentrated” estimators satisfies a much stronger deviation bound than each individual element in the collection. Our approach is illustrated through several examples, including sparse linear regression and low-rank matrix recovery problems.

154 citations
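
The aggregation idea is simple to try out. Below is a minimal Python sketch: split heavy-tailed data into groups, estimate the mean on each group, then take the geometric median of the group estimates, computed with Weiszfeld's algorithm. The data, group count, and tolerance are illustrative choices, not the paper's.

```python
import numpy as np

def geometric_median(points, tol=1e-8, max_iter=500):
    """Weiszfeld iteration for the geometric median of the rows of `points`."""
    z = points.mean(axis=0)                  # start at the coordinate-wise mean
    for _ in range(max_iter):
        d = np.maximum(np.linalg.norm(points - z, axis=1), tol)  # avoid /0
        w = 1.0 / d
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

rng = np.random.default_rng(0)
data = rng.standard_t(df=2.1, size=(5000, 3))     # heavy-tailed, mean zero
groups = np.array_split(data, 10)
estimates = np.stack([g.mean(axis=0) for g in groups])
print(geometric_median(estimates))                # concentrates near zero
```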


Journal ArticleDOI
TL;DR: In this article, the authors consider multivariate counting processes depending on an unknown function to be estimated by linear combinations of a fixed dictionary, and propose an adaptive $\ell_1$-penalization methodology, where data-driven weights of the penalty are derived from new Bernstein type inequalities for martingales.
Abstract: Due to its low computational cost, Lasso is an attractive regularization method for high-dimensional statistical settings. In this paper, we consider multivariate counting processes depending on an unknown function to be estimated by linear combinations of a fixed dictionary. To select coefficients, we propose an adaptive $\ell_1$-penalization methodology, where data-driven weights of the penalty are derived from new Bernstein type inequalities for martingales. Oracle inequalities are established under assumptions on the Gram matrix of the dictionary. Non-asymptotic probabilistic results for multivariate Hawkes processes are proven, which allow us to check these assumptions by considering general dictionaries based on histograms, Fourier or wavelet bases. Motivated by problems of neuronal activity inference, we finally conduct a simulation study for multivariate Hawkes processes and compare our methodology with the adaptive Lasso procedure proposed by Zou. We observe an excellent behavior of our procedure with respect to the problem of support recovery. We rely on theoretical aspects for the essential question of tuning our methodology. Unlike the adaptive Lasso, our tuning procedure is proven to be robust with respect to all the parameters of the problem, revealing its potential for concrete purposes, in particular in neuroscience.

154 citations
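
The adaptive $\ell_1$-penalty above amounts to a Lasso with per-coefficient weights. A sketch of the generic weighted-$\ell_1$ device: minimizing $\|y - Z\beta\|^2 + \lambda\sum_j w_j|\beta_j|$ is a plain Lasso after dividing column $j$ of $Z$ by $w_j$ and mapping the coefficients back. The design, regularization level, and stand-in weights below are illustrative; the paper derives its weights from Bernstein-type martingale inequalities.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 200, 20
Z = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                     # sparse truth
y = Z @ beta + rng.normal(scale=0.5, size=n)

w = 1.0 + rng.uniform(size=p)                   # stand-in data-driven weights
fit = Lasso(alpha=0.05).fit(Z / w, y)           # column j scaled by 1/w_j
beta_hat = fit.coef_ / w                        # back to the original scale
print(np.nonzero(np.abs(beta_hat) > 1e-6)[0])   # recovered support
```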


Journal ArticleDOI
TL;DR: In this article, the authors provide a concrete criterion for convergence to equilibrium in terms of Wasserstein distance for Markov processes with two components, where the first component evolves according to one of finitely many underlying Markovian dynamics, with a choice of dynamics that changes at the jump times of the second component; a class of piecewise deterministic Markov processes is among the examples.
Abstract: We study a Markov process with two components: the first component evolves according to one of finitely many underlying Markovian dynamics, with a choice of dynamics that changes at the jump times of the second component. The second component is discrete and its jump rates may depend on the position of the whole process. Under regularity assumptions on the jump rates and Wasserstein contraction conditions for the underlying dynamics, we provide a concrete criterion for the convergence to equilibrium in terms of Wasserstein distance. The proof is based on a coupling argument and a weak form of the Harris theorem. In particular, we obtain exponential ergodicity in situations which do not verify any hypoellipticity assumption, but are not uniformly contracting either. We also obtain a bound in total variation distance under a suitable regularising assumption. Some examples are given to illustrate our result, including a class of piecewise deterministic Markov processes.

140 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a nonparametric estimator of the division rate for a growing and dividing population modelled by a piecewise deterministic Markov branching tree, where the individuals split into two offspring at a division rate B(x) that depends on their size x, whereas their size grows exponentially in time.
Abstract: We raise the issue of estimating the division rate for a growing and dividing population modelled by a piecewise deterministic Markov branching tree. Such models have broad applications, ranging from TCP/IP window size protocol to bacterial growth. Here, the individuals split into two offspring at a division rate B(x) that depends on their size x, whereas their size grows exponentially in time, at a rate that exhibits variability. The mean empirical measure of the model satisfies a growth-fragmentation type equation, and we bridge the deterministic and probabilistic viewpoints. We then construct a nonparametric estimator of the division rate B(x) based on the observation of the population over different sampling schemes of size n on the genealogical tree. Our estimator nearly achieves the rate $n^{-s/(2s+1)}$ in squared-loss error asymptotically, generalizing and improving on the rate $n^{-s/(2s+3)}$ obtained in [13, 15] through indirect observation schemes. Our method is consistently tested numerically and implemented on Escherichia coli data, which demonstrates its major interest for practical applications.

116 citations


Journal ArticleDOI
TL;DR: For sampling without replacement, the best general concentration inequality has been a Hoeffding inequality due to Serfling; the authors improve on this result, extend it to a Bernstein concentration bound, and derive an empirical version that does not require the variance to be known to the user.
Abstract: Concentration inequalities quantify the deviation of a random variable from a fixed value. In spite of numerous applications, such as opinion surveys or ecological counting procedures, few concentration results are known for the setting of sampling without replacement from a finite population. Until now, the best general concentration inequality has been a Hoeffding inequality due to Serfling [ Ann. Statist. 2 (1974) 39–48]. In this paper, we first improve on the fundamental result of Serfling [ Ann. Statist. 2 (1974) 39–48], and further extend it to obtain a Bernstein concentration bound for sampling without replacement. We then derive an empirical version of our bound that does not require the variance to be known to the user.

111 citations
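
The gain from sampling without replacement is easy to see empirically: the variance of the sample mean carries the finite population correction factor $(N-n)/(N-1)$, which is what the sharper concentration bounds exploit. A quick Monte Carlo check, where the population, sizes, and replication count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
N, n, reps = 1000, 100, 20_000
population = rng.exponential(size=N)
sigma2 = population.var()                  # population variance (ddof = 0)

means_wo = np.array([rng.choice(population, n, replace=False).mean()
                     for _ in range(reps)])
means_w = np.array([rng.choice(population, n, replace=True).mean()
                    for _ in range(reps)])

print(means_wo.var(), sigma2 / n * (N - n) / (N - 1))   # close to each other
print(means_w.var(), sigma2 / n)                        # usual i.i.d. variance
```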


Journal ArticleDOI
TL;DR: In this article, the authors tackle the problem of comparing distributions of random variables and defining a mean pattern for a sample of random events; using barycenters of measures in the Wasserstein space, they propose an iterative procedure to estimate the mean distribution.
Abstract: In this paper we tackle the problem of comparing distributions of random variables and defining a mean pattern for a sample of random events. Using barycenters of measures in the Wasserstein space, we propose an iterative procedure to estimate the mean distribution. Moreover, when the distributions are a common measure warped by a centered random operator, the barycenter enables one to recover this template distribution.

88 citations
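
In one dimension the Wasserstein-2 barycenter is explicit, which makes the idea concrete: its quantile function is the average of the input quantile functions. A minimal sketch, where the shift-and-scale warping of an N(0, 1) template is an illustrative toy model, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(3)
samples = [rng.normal(loc=rng.normal(0.0, 0.5),
                      scale=np.exp(rng.normal(0.0, 0.2)),
                      size=500)
           for _ in range(30)]                     # 30 warped observations

levels = np.linspace(0.01, 0.99, 99)               # quantile levels
quantiles = np.stack([np.quantile(s, levels) for s in samples])
barycenter = quantiles.mean(axis=0)                # averaged quantile function
print(barycenter[[9, 49, 89]])                     # close to N(0,1) quantiles
```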


Journal ArticleDOI
TL;DR: The approach provides a complete description of the copulas of all pairs $(Y_{t},Y_{t-k})$ and inherits the robustness properties of classical quantile regression, requiring no distributional assumptions such as the existence of finite moments.
Abstract: In this paper, we present an alternative method for the spectral analysis of a univariate, strictly stationary time series $\{Y_{t}\}_{t\in\mathbb{Z}}$. We define a “new” spectrum as the Fourier transform of the differences between copulas of the pairs $(Y_{t},Y_{t-k})$ and the independence copula. This object is called a copula spectral density kernel and allows one to separate the marginal and serial aspects of a time series. We show that this spectrum is closely related to the concept of quantile regression. Like quantile regression, which provides much more information about conditional distributions than classical location-scale regression models, copula spectral density kernels are more informative than traditional spectral densities obtained from classical autocovariances. In particular, copula spectral density kernels, in their population versions, provide (asymptotically provide, in their sample versions) a complete description of the copulas of all pairs $(Y_{t},Y_{t-k})$. Moreover, they inherit the robustness properties of classical quantile regression, and do not require any distributional assumptions such as the existence of finite moments. In order to estimate the copula spectral density kernel, we introduce rank-based Laplace periodograms which are calculated as bilinear forms of weighted $L_{1}$-projections of the ranks of the observed time series onto a harmonic regression model. We establish the asymptotic distribution of those periodograms, and the consistency of adequately smoothed versions. The finite-sample properties of the new methodology, and its potential for applications, are briefly investigated by simulations and a short empirical example.

77 citations


Journal ArticleDOI
TL;DR: The particle Gibbs sampler as discussed by the authors is a Markov chain Monte Carlo (MCMC) algorithm to sample from the full posterior distribution of a state-space model, which does so by executing Gibbs sampling steps on an extended target distribution defined on the space of the auxiliary variables generated by an interacting particle system.
Abstract: The particle Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm to sample from the full posterior distribution of a state-space model. It does so by executing Gibbs sampling steps on an extended target distribution defined on the space of the auxiliary variables generated by an interacting particle system. This paper makes the following contributions to the theoretical study of this algorithm. Firstly, we present a coupling construction between two particle Gibbs updates from different starting points and we show that the coupling probability may be made arbitrarily close to one by increasing the number of particles. We obtain as a direct corollary that the particle Gibbs kernel is uniformly ergodic. Secondly, we show how the inclusion of an additional Gibbs sampling step that reselects the ancestors of the particle Gibbs' extended target distribution, which is a popular approach in practice to improve mixing, does indeed yield a theoretically more efficient algorithm as measured by the asymptotic variance. Thirdly, we extend particle Gibbs to work with lower variance resampling schemes. A detailed numerical study is provided to demonstrate the efficiency of particle Gibbs and the proposed variants.

67 citations
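
For readers unfamiliar with the algorithm, here is a compact, self-contained sketch of one conditional SMC sweep, the core move of the particle Gibbs sampler (without the ancestor-resampling refinement studied in the paper), for a toy linear-Gaussian state-space model. The model, its parameters, and the particle count are illustrative assumptions.

```python
import numpy as np

def conditional_smc(y, x_ref, a=0.9, sx=1.0, sy=1.0, N=100, rng=None):
    """One CSMC pass: return a new trajectory given the reference x_ref."""
    rng = rng if rng is not None else np.random.default_rng()
    T = len(y)
    X = np.empty((T, N))                    # particle states
    A = np.empty((T, N), dtype=int)         # ancestor indices
    X[0] = rng.normal(0.0, sx, size=N)
    X[0, 0] = x_ref[0]                      # particle 0 carries the reference
    logw = -0.5 * (y[0] - X[0]) ** 2 / sy ** 2
    for t in range(1, T):
        w = np.exp(logw - logw.max()); w /= w.sum()
        A[t] = rng.choice(N, size=N, p=w)   # multinomial resampling
        A[t, 0] = 0                         # the reference keeps its ancestor
        X[t] = a * X[t - 1, A[t]] + rng.normal(0.0, sx, size=N)
        X[t, 0] = x_ref[t]                  # and its state
        logw = -0.5 * (y[t] - X[t]) ** 2 / sy ** 2
    w = np.exp(logw - logw.max()); w /= w.sum()
    k = rng.choice(N, p=w)                  # draw one final particle ...
    traj = np.empty(T)
    for t in range(T - 1, -1, -1):          # ... and trace its ancestry back
        traj[t] = X[t, k]
        if t > 0:
            k = A[t, k]
    return traj

rng = np.random.default_rng(4)
T = 50
x_true = np.cumsum(rng.normal(size=T))           # stand-in data
y = x_true + rng.normal(size=T)
x_ref = np.zeros(T)                              # arbitrary initial reference
for _ in range(20):                              # iterating CSMC = particle Gibbs
    x_ref = conditional_smc(y, x_ref, rng=rng)
print(x_ref[:5])
```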


Journal ArticleDOI
TL;DR: In this paper, the authors provide a unified analysis of the properties of the sample covariance matrix $\Sigma_{n}$ over the class of population covariance matrices of reduced effective rank $r_{e}(\Sigma)$.
Abstract: This work provides a unified analysis of the properties of the sample covariance matrix $\Sigma_{n}$ over the class of $p\times p$ population covariance matrices $\Sigma$ of reduced effective rank $r_{e}(\Sigma)$. This class includes scaled factor models and covariance matrices with decaying spectrum. We consider $r_{e}(\Sigma)$ as a measure of matrix complexity, and obtain sharp minimax rates on the operator and Frobenius norm of $\Sigma_{n}-\Sigma$, as a function of $r_{e}(\Sigma)$ and $\|\Sigma\|_{2}$, the operator norm of $\Sigma$. With guidelines offered by the optimal rates, we define classes of matrices of reduced effective rank over which $\Sigma_{n}$ is an accurate estimator. Within the framework of these classes, we perform a detailed finite sample theoretical analysis of the merits and limitations of the empirical scree plot procedure routinely used in PCA. We show that identifying jumps in the empirical spectrum that consistently estimate jumps in the spectrum of $\Sigma$ is not necessarily informative for other goals, for instance for the selection of those sample eigenvalues and eigenvectors that are consistent estimates of their population counterparts. The scree plot method can still be used for selecting consistent eigenvalues, for appropriate threshold levels. We provide a threshold construction and also give a rule for checking the consistency of the corresponding sample eigenvectors. We specialize these results and analysis to population covariance matrices with polynomially decaying spectra, and extend it to covariance operators with polynomially decaying spectra. An application to fPCA illustrates how our results can be used in functional data analysis.

65 citations
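
The complexity measure itself is elementary to compute: $r_{e}(\Sigma)=\operatorname{tr}(\Sigma)/\|\Sigma\|_{2}$. A short illustration with an assumed polynomially decaying spectrum:

```python
import numpy as np

def effective_rank(Sigma):
    """trace(Sigma) divided by the operator (spectral) norm of Sigma."""
    return np.trace(Sigma) / np.linalg.norm(Sigma, 2)

p = 50
Sigma = np.diag(1.0 / np.arange(1, p + 1) ** 2)   # decaying spectrum
print(effective_rank(Sigma))                       # about 1.63, far below p = 50
```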


Journal ArticleDOI
TL;DR: In this article, the authors establish a large deviation principle for two-dimensional stochastic Navier-Stokes equations driven by multiplicative Lévy noises, where the weak convergence method introduced by Budhiraja, Dupuis and Maroulas plays a key role.
Abstract: In this paper, we establish a large deviation principle for two-dimensional stochastic Navier–Stokes equations driven by multiplicative Lévy noises. The weak convergence method introduced by Budhiraja, Dupuis and Maroulas [Ann. Inst. Henri Poincaré Probab. Stat. 47 (2011) 725–747] plays a key role.

62 citations


Journal ArticleDOI
TL;DR: In this article, the authors construct several estimators for the integrated volatility of volatility in a high-frequency setting, all based on increments of spot volatility estimators; some are positive by construction, others are bias corrected in order to attain the optimal rate $n^{-1/4}$.
Abstract: In this paper, we are concerned with nonparametric inference on the volatility of volatility process in stochastic volatility models. We construct several estimators for its integrated version in a high-frequency setting, all based on increments of spot volatility estimators. Some of those are positive by construction, others are bias corrected in order to attain the optimal rate $n^{-1/4}$. Associated central limit theorems are proven which can be widely used in practice, as they are the key to essentially all tools in model validation for stochastic volatility models. As an illustration we give a brief idea on a goodness-of-fit test in order to check for a certain parametric form of volatility of volatility.

Journal ArticleDOI
TL;DR: In this paper, the existence and consistency of the maximum likelihood estimators for the extreme value index and normalization constants are proved within the framework of the block maxima method, under a first order extreme value condition only, without assuming that the block maxima are exactly GEV distributed.
Abstract: The maximum likelihood method offers a standard way to estimate the three parameters of a generalized extreme value (GEV) distribution. Combined with the block maxima method, it is often used in practice to assess the extreme value index and normalization constants of a distribution satisfying a first order extreme value condition, assuming implicitly that the block maxima are exactly GEV distributed. This is unsatisfactory since the GEV distribution is a good approximation of the block maxima distribution only for blocks of large size. The purpose of this paper is to provide a theoretical basis for this methodology. Under a first order extreme value condition only, we prove the existence and consistency of the maximum likelihood estimators for the extreme value index and normalization constants within the framework of the block maxima method.
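
The methodology the paper puts on rigorous footing is simple to run in practice: split the sample into blocks, take block maxima, and fit a GEV by maximum likelihood. A minimal sketch using SciPy, where the block size and the Pareto sampling distribution are illustrative choices; note that SciPy's shape parameter is the negative of the extreme value index.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(5)
data = rng.pareto(3.0, size=100_000)              # extreme value index 1/3
block_maxima = data.reshape(-1, 500).max(axis=1)  # 200 blocks of size 500

c, loc, scale = genextreme.fit(block_maxima)      # maximum likelihood fit
print(-c)   # estimated extreme value index; expect roughly 1/3
```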

Journal ArticleDOI
TL;DR: A class of depth-based classification procedures that are of a nearest-neighbor nature, which achieve Bayes consistency under virtually any absolutely continuous distributions is introduced, to stress the difference with the stronger universal consistency of the standard kNN classifiers.
Abstract: We introduce a class of depth-based classification procedures that are of a nearest-neighbor nature. Depth, after symmetrization, indeed provides the center-outward ordering that is necessary and sufficient to define nearest neighbors. Like all their depth-based competitors, the resulting classifiers are affine-invariant, hence in particular are insensitive to unit changes. Unlike the former, however, the latter achieve Bayes consistency under virtually any absolutely continuous distributions – a concept we call nonparametric consistency, to stress the difference with the stronger universal consistency of the standard $k$NN classifiers. We investigate the finite-sample performances of the proposed classifiers through simulations and show that they outperform affine-invariant nearest-neighbor classifiers obtained through an obvious standardization construction. We illustrate the practical value of our classifiers on two real data examples. Finally, we briefly discuss the possible uses of our depth-based neighbors in other inference problems.
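
For context, the "obvious standardization construction" that the depth-based classifiers are compared against can be sketched in a few lines: run kNN under the Mahalanobis metric induced by the pooled sample covariance, which yields an affine-invariant classifier. The data, neighbor count, and query point below are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(1.5, 1.0, size=(100, 2))])
y = np.repeat([0, 1], 100)

VI = np.linalg.inv(np.cov(X, rowvar=False))       # precision matrix of the pool
knn = KNeighborsClassifier(n_neighbors=5, algorithm="brute",
                           metric="mahalanobis", metric_params={"VI": VI})
knn.fit(X, y)
print(knn.predict([[0.75, 0.75]]))                # a point between the groups
```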

Journal ArticleDOI
TL;DR: Linearly interpolated density (LID) as discussed by the authors is a Bayesian quantile regression method, which uses a linear interpolation of the quantiles to approximate the likelihood, leading to higher global efficiency for all quantiles of interest.
Abstract: Quantile regression is often used when a comprehensive relationship between a response variable and one or more explanatory variables is desired. The traditional frequentists’ approach to quantile regression has been well developed around asymptotic theories and efficient algorithms. However, not much work has been published under the Bayesian framework. One challenging problem for Bayesian quantile regression is that the full likelihood has no parametric forms. In this paper, we propose a Bayesian quantile regression method, the linearly interpolated density (LID) method, which uses a linear interpolation of the quantiles to approximate the likelihood. Unlike most of the existing methods that aim at tackling one quantile at a time, our proposed method estimates the joint posterior distribution of multiple quantiles, leading to higher global efficiency for all quantiles of interest. Markov chain Monte Carlo algorithms are developed to carry out the proposed method. We provide convergence results that justify both the algorithmic convergence and statistical approximations to an integrated-likelihood-based posterior. From the simulation results, we verify that LID has a clear advantage over other existing methods in estimating quantities that relate to two or more quantiles.

Journal ArticleDOI
TL;DR: In this paper, two sample tests are constructed on the basis of functional principal component analysis and self-normalization, the latter of which is a new studentization technique recently developed for the inference of a univariate time series.
Abstract: Motivated by the need to statistically quantify the difference between two spatio-temporal datasets that arise in climate downscaling studies, we propose new tests to detect the differences of the covariance operators and their associated characteristics of two functional time series. Our two sample tests are constructed on the basis of functional principal component analysis and self-normalization, the latter of which is a new studentization technique recently developed for the inference of a univariate time series. Compared to the existing tests, our SN-based tests allow for weak dependence within each sample and are robust to the dependence between the two samples in the case of equal sample sizes. Asymptotic properties of the SN-based test statistics are derived under both the null and local alternatives. Through extensive simulations, our SN-based tests are shown to outperform existing alternatives in size and their powers are found to be respectable. The tests are then applied to the gridded climate model outputs and interpolated observations to detect the difference in their spatial dynamics.

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of testing whether a correlation matrix of a multivariate normal population is the identity matrix and derive a general lower bound applicable to various classes.
Abstract: We consider the problem of testing whether a correlation matrix of a multivariate normal population is the identity matrix. We focus on sparse classes of alternatives where only a few entries are nonzero and, in fact, positive. We derive a general lower bound applicable to various classes and study the performance of some near-optimal tests. We pay special attention to computational feasibility and construct near-optimal tests that can be computed efficiently. Finally, we apply our results to prove new lower bounds for the clique number of high-dimensional random geometric graphs.

Journal ArticleDOI
TL;DR: In this paper, the authors develop estimators for a general class of stationary GARCH models with possibly heavy tailed asymmetrically distributed errors, covering processes with symmetric and asymmetric feedback.
Abstract: We develop two new estimators for a general class of stationary GARCH models with possibly heavy tailed asymmetrically distributed errors, covering processes with symmetric and asymmetric feedback like GARCH, Asymmetric GARCH, VGARCH and Quadratic GARCH. The first estimator arises from negligibly trimming QML criterion equations according to error extremes. The second imbeds negligibly transformed errors into QML score equations for a Method of Moments estimator. In this case, we exploit a sub-class of redescending transforms that includes tail-trimming and functions popular in the robust estimation literature, and we re-center the transformed errors to minimize small sample bias. The negligible transforms allow both identification of the true parameter and asymptotic normality. We present a consistent estimator of the covariance matrix that permits classic inference without knowledge of the rate of convergence. A simulation study shows both of our estimators trump existing ones for sharpness and approximate normality including QML, Log-LAD, and two types of non-Gaussian QML (Laplace and Power-Law). Finally, we apply the tail-trimmed QML estimator to financial data.

Journal ArticleDOI
TL;DR: In this article, it is shown that the Kolmogorov distance between the distribution of $W$ and the standard normal distribution is bounded by $451\sum_{i,j=1}^{n}\mathbb{E}|X_{ij}|^{3}/n$, proved via Stein's method of exchangeable pairs together with a concentration inequality.
Abstract: Let $\mathbb{X}=\{X_{ij}\colon\ 1\le i,j\le n\}$ be an $n\times n$ array of independent random variables where $n\ge2$. Let $\pi$ be a uniform random permutation of $\{1,2,\dots,n\}$, independent of $\mathbb{X}$, and let $W=\sum_{i=1}^{n}X_{i\pi(i)}$. Suppose $\mathbb{X}$ is standardized so that $\mathbb{E}W=0$, $\operatorname{Var}(W)=1$. We prove that the Kolmogorov distance between the distribution of $W$ and the standard normal distribution is bounded by $451\sum_{i,j=1}^{n}\mathbb{E}|X_{ij}|^{3}/n$. Our approach is by Stein’s method of exchangeable pairs and the use of a concentration inequality.
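
The setting is easy to simulate. The sketch below draws $W=\sum_{i=1}^{n}X_{i\pi(i)}$ repeatedly for an array with i.i.d. mean-zero entries of variance $1/n$ (so that $\mathbb{E}W=0$ and $\operatorname{Var}(W)=1$; an illustrative standardization) and reports a grid-based Kolmogorov-type distance to the standard normal:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
n, reps = 30, 20_000
a = np.sqrt(3.0 / n)               # Uniform(-a, a) has variance a^2/3 = 1/n
sims = np.empty(reps)
for r in range(reps):
    X = rng.uniform(-a, a, size=(n, n))
    pi = rng.permutation(n)
    sims[r] = X[np.arange(n), pi].sum()

grid = np.linspace(-3, 3, 121)
emp = (sims[:, None] <= grid).mean(axis=0)    # empirical CDF on a grid
print(np.abs(emp - norm.cdf(grid)).max())     # small Kolmogorov-type distance
```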

Journal ArticleDOI
Abstract: We present a new proof of the Burkholder–Davis–Gundy inequalities for $1\leq p<\infty$. The novelty of our method is that these martingale inequalities are obtained as consequences of elementary deterministic counterparts. The latter have a natural interpretation in terms of robust hedging.

Journal ArticleDOI
TL;DR: In this paper, a data-driven selection rule from the family of kernel estimators is proposed and a pointwise oracle inequality is derived for it, and the proposed estimator is minimax and minimax adaptive over the scale of anisotropic Nikolskii classes.
Abstract: In this paper, we study the problem of pointwise estimation of a multivariate density. We provide a data-driven selection rule from the family of kernel estimators and derive for it a pointwise oracle inequality. Using the latter bound, we show that the proposed estimator is minimax and minimax adaptive over the scale of anisotropic Nikolskii classes. It is important to emphasize that our estimation method adjusts automatically to any independence structure of the underlying density. This, in turn, allows one to reduce significantly the influence of the dimension on the accuracy of estimation (the curse of dimensionality). The main technical tools used in our considerations are pointwise uniform bounds on empirical processes developed recently in Lepski [Math. Methods Statist. 22 (2013) 83–99].
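
As a rough illustration of the flavour of such data-driven selection rules, here is a heavily simplified one-dimensional Lepski-type bandwidth selector at a point: scan bandwidths from largest to smallest and keep the first one whose estimate agrees with all smaller bandwidths up to a noise level. The constant, grid, and Gaussian kernel are illustrative stand-ins, not the paper's calibrated multivariate procedure.

```python
import numpy as np

def kde_at(x, data, h):
    """Gaussian kernel density estimate at the single point x."""
    u = (x - data) / h
    return np.exp(-0.5 * u * u).sum() / (len(data) * h * np.sqrt(2.0 * np.pi))

def lepski_select(x, data, bandwidths, C=1.0):
    n = len(data)
    hs = np.sort(bandwidths)[::-1]             # large to small
    noise = lambda h: C * np.sqrt(np.log(n) / (n * h))
    for i, h in enumerate(hs):
        if all(abs(kde_at(x, data, h) - kde_at(x, data, h2)) <= noise(h2)
               for h2 in hs[i + 1:]):
            return h                           # largest "admissible" bandwidth
    return hs[-1]

rng = np.random.default_rng(8)
data = rng.normal(size=2000)
h_star = lepski_select(0.0, data, 0.5 ** np.arange(1, 8))
print(h_star, kde_at(0.0, data, h_star))       # true density at 0 is ~0.3989
```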

Journal ArticleDOI
TL;DR: In this paper, $\Xi_{\sigma}(t)$ is shown to be a characteristic function for any $\sigma\in\mathbb{R}$, and the Riemann hypothesis is proven to be true if and only if each $\Xi_{\sigma}(t)$ is a pretended-infinitely divisible characteristic function, a notion defined in this paper, for each $1/2<\sigma<1$.
Abstract: Let $\sigma,t\in\mathbb{R}$, $s=\sigma+\mathrm{{i}}t$, $\Gamma(s)$ be the Gamma function, $\zeta(s)$ be the Riemann zeta function and $\xi(s):=s(s-1)\pi^{-s/2}\Gamma(s/2)\zeta(s)$ be the complete Riemann zeta function. We show that $\Xi_{\sigma}(t):=\xi(\sigma-\mathrm{{i}}t)/\xi(\sigma)$ is a characteristic function for any $\sigma\in\mathbb{R}$ by giving the probability density function. Next we prove that the Riemann hypothesis is true if and only if each $\Xi_{\sigma}(t)$ is a pretended-infinitely divisible characteristic function, which is defined in this paper, for each $1/2<\sigma<1$.

Journal ArticleDOI
TL;DR: In this paper, the authors prove local limit theorems for the magnetization in the Curie-Weiss model at high temperature, the number of triangles and isolated vertices in Erdős-Renyi random graphs, and the independence number in a geometric random graph.
Abstract: In this article, we prove new inequalities between some common probability metrics. Using these inequalities, we obtain novel local limit theorems for the magnetization in the Curie–Weiss model at high temperature, the number of triangles and isolated vertices in Erdős–Renyi random graphs, as well as the independence number in a geometric random graph. We also give upper bounds on the rates of convergence for these local limit theorems and also for some other probability metrics. Our proofs are based on the Landau–Kolmogorov inequalities and new smoothing techniques.

Journal ArticleDOI
TL;DR: In this article, a unified framework for studying both latent and stochastic block models, which are used to cluster simultaneously rows and columns of a data matrix, is proposed, and the authors characterize whether it is possible to asymptotically recover the actual groups on the rows or columns of the matrix, relying on a consistent estimate of the parameter.
Abstract: We propose a unified framework for studying both latent and stochastic block models, which are used to cluster simultaneously rows and columns of a data matrix. In this new framework, we study the behaviour of the groups posterior distribution, given the data. We characterize whether it is possible to asymptotically recover the actual groups on the rows and columns of the matrix, relying on a consistent estimate of the parameter. In other words, we establish sufficient conditions for the groups posterior distribution to converge (as the size of the data increases) to a Dirac mass located at the actual (random) groups configuration. In particular, we highlight some cases where the model assumes symmetries in the matrix of connection probabilities that prevent recovering the original groups. We also discuss the validity of these results when the proportion of non-null entries in the data matrix converges to zero.

Journal ArticleDOI
TL;DR: In this paper, the row-wise maxima of a triangular array of Gaussian random vectors are shown to converge to a max-stable distribution that generalizes the class of Hüsler–Reiss distributions, and a new class of stationary, max-stable processes is defined as max-mixtures of Brown–Resnick processes.
Abstract: Let $X_{i,n}$, $n\in\mathbb{N}$, $1\leq i\leq n$, be a triangular array of independent $\mathbb{R}^{d}$-valued Gaussian random vectors with correlation matrices $\Sigma_{i,n}$. We give necessary conditions under which the row-wise maxima converge to some max-stable distribution which generalizes the class of Hüsler–Reiss distributions. In the bivariate case, the conditions will also be sufficient. Using these results, new models for bivariate extremes are derived explicitly. Moreover, we define a new class of stationary, max-stable processes as max-mixtures of Brown–Resnick processes. As an application, we show that these processes realize a large set of extremal correlation functions, a natural dependence measure for max-stable processes. This set includes all functions $\psi(\sqrt{\gamma(h)})$, $h\in\mathbb{R}^{d}$, where $\psi$ is a completely monotone function and $\gamma$ is an arbitrary variogram.

Journal ArticleDOI
TL;DR: In this paper, the smallest possible constant $c$ in the inequality $\mathbb{P}\{S_{n}\geq x\}\leq c\mathbb{P}\{\eta\geq x\}$ is found for weighted sums $S_{n}$ of independent identically distributed Rademacher random variables, where the optimal value is $c_{\ast}=(4\mathbb{P}\{\eta\geq\sqrt{2}\})^{-1}\approx3.178$.
Abstract: Let $\varepsilon_{1},\ldots,\varepsilon_{n}$ be independent identically distributed Rademacher random variables, that is $\mathbb{P}\{\varepsilon_{i}=\pm1\}=1/2$. Let $S_{n}=a_{1}\varepsilon_{1}+\cdots+a_{n}\varepsilon_{n}$, where $\mathbf{a}=(a_{1},\ldots,a_{n})\in\mathbb{R}^{n}$ is a vector such that ${a_{1}^{2}+\cdots+a_{n}^{2}\leq1}$. We find the smallest possible constant $c$ in the inequality \[\mathbb{P}\{S_{n}\geq x\}\leq c\mathbb{P}\{\eta\geq x\}\qquad\mbox{for all }x\in \mathbb{R},\] where $\eta\sim N(0,1)$ is a standard normal random variable. This optimal value is equal to \[c_{\ast}=(4\mathbb{P}\{\eta\geq\sqrt{2}\})^{-1}\approx3.178.\]
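
The optimal constant is easy to verify numerically from its closed form:

```python
# Numerical check of c* = (4 P{eta >= sqrt(2)})^{-1} via the normal survival function.
import numpy as np
from scipy.stats import norm

c_star = 1.0 / (4.0 * norm.sf(np.sqrt(2.0)))
print(c_star)   # approximately 3.178
```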

Journal ArticleDOI
TL;DR: In this article, a novel algorithm for noisy global optimisation and continuum-armed bandits is described, which works by reducing these problems to tree-armed bandits and achieves square-root regret without prior information, together with near-matching lower bounds on the regret in terms of the zooming dimension.
Abstract: We describe a novel algorithm for noisy global optimisation and continuum-armed bandits, with good convergence properties over any continuous reward function having finitely many polynomial maxima. Over such functions, our algorithm achieves square-root regret in bandits, and inverse-square-root error in optimisation, without prior information. Our algorithm works by reducing these problems to tree-armed bandits, and we also provide new results in this setting. We show it is possible to adaptively combine multiple trees so as to minimise the regret, and also give near-matching lower bounds on the regret in terms of the zooming dimension.

Journal ArticleDOI
TL;DR: In this paper, the authors adopt the geometric view of topic models as a data generating mechanism for points randomly sampled from the interior of a (convex) population polytope, whose extreme points correspond to the population structure variables of interest.
Abstract: We study the posterior contraction behavior of the latent population structure that arises in admixture models as the amount of data increases. We adopt the geometric view of admixture models – alternatively known as topic models – as a data generating mechanism for points randomly sampled from the interior of a (convex) population polytope, whose extreme points correspond to the population structure variables of interest. Rates of posterior contraction are established with respect to Hausdorff metric and a minimum matching Euclidean metric defined on polytopes. Tools developed include posterior asymptotics of hierarchical models and arguments from convex geometry.

Journal ArticleDOI
TL;DR: In this paper, the authors solve the Skorokhod embedding problem for a general time-homogeneous diffusion, and derive necessary and sufficient conditions under which there exists a bounded embedding.
Abstract: We solve the Skorokhod embedding problem (SEP) for a general time-homogeneous diffusion X: given a distribution $\rho$, we construct a stopping time T such that the stopped process X_T has the distribution $\rho$. Our solution method makes use of martingale representations (in a similar way to Bass [3], who solves the SEP for Brownian motion) and draws on law uniqueness of weak solutions of SDEs. Then we ask if there exist solutions of the SEP which are respectively finite almost surely, integrable or bounded, and when our proposed construction has these properties. We provide conditions that guarantee existence of finite time solutions. Then, we fully characterize the distributions that can be embedded with integrable stopping times. Finally, we derive necessary, respectively sufficient, conditions under which there exists a bounded embedding.

Journal ArticleDOI
TL;DR: In this paper, the authors present a comprehensive theory of generalized and weak generalized convolutions, illustrate it by a large number of examples, and discuss the related infinitely divisible distributions.
Abstract: In this paper, we present a comprehensive theory of generalized and weak generalized convolutions, illustrate it by a large number of examples, and discuss the related infinitely divisible distributions. We consider Lévy and additive processes with respect to generalized and weak generalized convolutions as certain Markov processes, and then study stochastic integrals with respect to such processes. We introduce the representability property of weak generalized convolutions. Under this property and the related weak summability, a stochastic integral with respect to random measures related to such convolutions is constructed.

Journal ArticleDOI
TL;DR: In this paper, the authors consider an insurance company exposed to a stochastic economic environment that contains two kinds of risk: the first kind is the insurance risk caused by traditional insurance claims, and the second kind is financial risk resulting from investments.
Abstract: Consider an insurance company exposed to a stochastic economic environment that contains two kinds of risk. The first kind is the insurance risk caused by traditional insurance claims, and the second kind is the financial risk resulting from investments. Its wealth process is described in a standard discrete-time model in which, during each period, the insurance risk is quantified as a real-valued random variable $X$ equal to the total amount of claims less premiums, and the financial risk as a positive random variable $Y$ equal to the reciprocal of the stochastic accumulation factor. This risk model builds an efficient platform for investigating the interplay of the two kinds of risk. We focus on the ruin probability and the tail probability of the aggregate risk amount. Assuming that every convex combination of the distributions of $X$ and $Y$ is of strongly regular variation, we derive some precise asymptotic formulas for these probabilities with both finite and infinite time horizons, all in the form of linear combinations of the tail probabilities of $X$ and $Y$. Our treatment is unified in the sense that no dominating relationship between $X$ and $Y$ is required.
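
The interplay of the two kinds of risk is easy to explore by simulation: ruin within the horizon occurs when the discounted aggregate risk $\max_{k\le n}\sum_{i=1}^{k}X_{i}\prod_{j=1}^{i}Y_{j}$ exceeds the initial wealth $x$. A Monte Carlo sketch, where the Pareto and lognormal stand-ins for $X$ and $Y$ are illustrative assumptions, not the paper's conditions:

```python
import numpy as np

rng = np.random.default_rng(9)
reps, horizon, x = 20_000, 20, 50.0

X = rng.pareto(2.5, size=(reps, horizon)) * 2.0 - 1.0      # heavy-tailed net losses
Y = np.exp(rng.normal(-0.03, 0.15, size=(reps, horizon)))  # stochastic discount factors
S = np.cumsum(X * np.cumprod(Y, axis=1), axis=1)           # discounted aggregate risk
print((S.max(axis=1) > x).mean())                          # finite-horizon ruin probability
```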