scispace - formally typeset
Search or ask a question

Showing papers by "Richard Arratia published in 1995"


Journal ArticleDOI
TL;DR: In this paper, the decay rate of the Ewens sampling formula with parameter $\theta has been studied in the context of assembly decomposable combinatorial constructions characterized by the exponential formula for generating functions: the number of constructions that can be formed from a set of size n, where n is the total number of structures consisting of a single component.
Abstract: Assemblies are the decomposable combinatorial constructions characterized by the exponential formula for generating functions: $\Sigma p(n)s^n/n! = \exp(\Sigma m_is^i/i!)$. Here $p(n)$ is the total number of constructions that can be formed from a set of size $n$, and $m_n$ is the number of these structures consisting of a single component. Examples of assemblies include permutations, graphs, 2-regular graphs, forests of rooted or unrooted trees, set partitions and mappings of a set into itself. If an assembly is chosen uniformly from all possibilities on a set of size $n$, the counts $C_i(n)$ of components of size $i$ are jointly distributed like independent nonidentically distributed Poisson variables $Z_i$ conditioned on the event $Z_1 + 2Z_2 + \cdots + nZ_n = n$. We consider assemblies for which the process of component-size counts has a nontrivial limit distribution, without renormalizing. These include permutations, mappings, forests of labelled trees and 2-regular graphs, but not graphs and not set partitions. For some of these assemblies, the distribution of the component sizes may be viewed as a perturbation of the Ewens sampling formula with parameter $\theta$. We consider $d_b(n)$, the total variation distance between $(Z_1, \ldots, Z_b)$ and $(C_1(n),\ldots,C_b(n))$, counting components of size at most $b$. If the generating function of an assembly satisfies a mild analytic condition, we can determine the decay rate of $d_b(n)$. In particular, for $b = b(n) = o(n/\log n)$ and $n \rightarrow \infty, d_b(n) = o(b/n)$ if $\theta = 1$ and $d_b(n) \sim c(b)b/n$ if $\theta eq 1$. The constant $c(b)$ is given explicitly in terms of the $m_i: c(b) = |1 - \theta|\mathbb{E}|T_{0b} - \mathbb{E}T_{0b}|/(2b)$, where $T_{0b} = Z_1 + 2Z_2 + \cdots + bZ_b$. Finally, we show that for $\theta eq 1$ there is a constant $c_\theta$ such that $c(b) \sim c_\theta b$ as $b \rightarrow \infty$. Our results are proved using coupling, large deviation bounds and singularity analysis of generating functions.

30 citations