Convergence of Chao Unseen Species Estimator

doi:10.1109/ISIT.2019.8849732

Open Access Proceedings Article

Nived Rajaraman, +3 more

pp. 46-50

Abstract:

Support size estimation and the related problem of unseen species estimation have wide applications in ecology and database analysis. Perhaps the most widely used support size estimator is the Chao estimator. Despite its widespread use, little is known about its theoretical properties. We analyze the Chao estimator and show that its worst-case mean squared error (MSE) is smaller than the MSE of the plug-in estimator by a factor of $\mathcal{O}\left((k/n)^2\right)$. Our main technical contribution is a new method to analyze rational estimators for discrete distribution properties, which may be of independent interest.
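For orientation, the two estimators compared in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code. It assumes the classical Chao (1984) form, observed species plus f1^2 / (2 f2), where f1 and f2 are the numbers of symbols seen exactly once and exactly twice. The function names and the bias-corrected fallback used when f2 = 0 are illustrative choices that the abstract does not specify.

```python
from collections import Counter


def plug_in_support_size(samples):
    """Plug-in estimate: the number of distinct symbols observed."""
    return len(set(samples))


def chao1_support_size(samples):
    """Classical Chao estimate: S_obs + f1^2 / (2 * f2).

    f1 = number of symbols seen exactly once (singletons)
    f2 = number of symbols seen exactly twice (doubletons)
    """
    counts = Counter(samples)                # symbol -> multiplicity
    freq_of_freq = Counter(counts.values())  # multiplicity -> number of symbols
    f1 = freq_of_freq[1]
    f2 = freq_of_freq[2]
    s_obs = len(counts)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Assumption: with no doubletons, fall back to the bias-corrected
    # form f1 * (f1 - 1) / (2 * (f2 + 1)) evaluated at f2 = 0.
    return s_obs + f1 * (f1 - 1) / 2


if __name__ == "__main__":
    sample = list("aabbcd")  # two doubletons (a, b), two singletons (c, d)
    print(plug_in_support_size(sample))
    print(chao1_support_size(sample))
```

The plug-in estimate never exceeds the true support size, since it counts only what was seen. The Chao correction adds an estimate of the unseen symbols derived from how many symbols appeared only once or twice.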
