Mixing Strategies for Density Estimation
TL;DR
Without knowing which strategy works best for the underlying density, a single strategy can be constructed by mixing the proposed ones so that it is adaptive in terms of statistical risks.

Abstract
General results on adaptive density estimation are obtained with respect to any countable collection of estimation strategies under Kullback-Leibler and squared $L_2$ losses. It is shown that, without knowing which strategy works best for the underlying density, a single strategy can be constructed by mixing the proposed ones so that it is adaptive in terms of statistical risks. A consequence is that, under some mild conditions, an asymptotically minimax-rate adaptive estimator exists for a given countable collection of density classes; that is, a single estimator can be constructed to be simultaneously minimax-rate optimal for all the function classes being considered. A demonstration is given for high-dimensional density estimation on $[0,1]^d$, where the constructed estimator adapts to smoothness and interaction order over some piecewise Besov classes and is consistent for all densities with finite entropy.
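The mixing idea in the abstract can be sketched numerically. The snippet below is a simplified illustration, not the paper's construction: it uses Gaussian kernel estimators indexed by bandwidth as the candidate strategies, a single half/half data split instead of the paper's progressive averaging, and held-out likelihood to form exponential mixing weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kde(train, h):
    """Return a kernel density estimate built from `train` with bandwidth h."""
    def pdf(x):
        x = np.atleast_1d(x)
        # average of Gaussian kernels centred at the training points
        z = (x[:, None] - train[None, :]) / h
        return np.exp(-0.5 * z**2).sum(axis=1) / (len(train) * h * np.sqrt(2 * np.pi))
    return pdf

def mix_strategies(data, bandwidths):
    """Combine candidate estimators by likelihood weights on a held-out half."""
    half = len(data) // 2
    fit, held = data[:half], data[half:]
    candidates = [gaussian_kde(fit, h) for h in bandwidths]
    # log-likelihood of each candidate on the held-out half
    loglik = np.array([np.log(p(held)).sum() for p in candidates])
    w = np.exp(loglik - loglik.max())
    w /= w.sum()  # posterior-style mixing weights
    mixture = lambda x: sum(wi * p(x) for wi, p in zip(w, candidates))
    return mixture, w

data = rng.normal(0.0, 1.0, size=400)
mixture, weights = mix_strategies(data, bandwidths=[0.05, 0.3, 1.5])
```

The mixture is itself a proper density (a convex combination of densities), which is what allows risk bounds of the "best candidate plus a small penalty" form.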
Citations
Book
Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems: École d'Été de Probabilités de Saint-Flour XXXVIII-2008
TL;DR: The purpose of these lecture notes is to provide an introduction to the general theory of empirical risk minimization with an emphasis on excess risk bounds and oracle inequalities in penalized problems.
Journal ArticleDOI
Adaptive Regression by Mixing
TL;DR: Under mild conditions, it is shown that the squared $L_2$ risk of the estimator based on ARM is bounded above by the risk of each candidate procedure plus a small penalty term of order $1/n$, giving ARM the automatically optimal rate of convergence.
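The ARM risk bound concerns combining candidate regression procedures by data splitting and exponentially weighted validation likelihood. A minimal sketch, assuming Gaussian noise with known variance and using polynomial fits as illustrative candidates (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# toy data: y = x^2 + Gaussian noise
x = rng.uniform(-1.0, 1.0, 200)
y = x**2 + rng.normal(0.0, 0.1, 200)

# split the sample: fit candidates on one half, weight them on the other
half = len(x) // 2
x_fit, y_fit = x[:half], y[:half]
x_val, y_val = x[half:], y[half:]

def poly_fit(deg):
    """Candidate procedure: least-squares polynomial of a given degree."""
    coef = np.polyfit(x_fit, y_fit, deg)
    return lambda t: np.polyval(coef, t)

candidates = [poly_fit(d) for d in (1, 2, 5)]

# ARM-style weights: exponential in the negative validation loss
sigma2 = 0.1**2  # noise variance, assumed known here for simplicity
losses = np.array([((f(x_val) - y_val) ** 2).sum() for f in candidates])
w = np.exp(-(losses - losses.min()) / (2 * sigma2))
w /= w.sum()

def combined(t):
    """Convex combination of the candidate predictions."""
    return sum(wi * f(t) for wi, f in zip(w, candidates))
```

Because the weights are a convex combination, the combined predictor's risk tracks the best candidate up to the penalty term the TL;DR describes.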
BookDOI
Oracle inequalities in empirical risk minimization and sparse recovery problems
TL;DR: The main tools involved in the analysis of these problems are Talagrand's concentration and deviation inequalities, together with other methods from empirical process theory (symmetrization inequalities, the contraction inequality for Rademacher sums, entropy and generic chaining bounds).
Journal ArticleDOI
PAC-Bayesian Stochastic Model Selection
TL;DR: A PAC-Bayesian performance guarantee is given for stochastic model selection that is superior to analogous guarantees for deterministic model selection, and it is shown that the posterior optimizing the performance guarantee is a Gibbs distribution.
Journal ArticleDOI
Combining forecasting procedures: Some theoretical results
TL;DR: In this paper, statistical risk bounds under squared error loss are obtained under distributional assumptions on the future given the current outside information and the past observations; the combined forecast automatically achieves the best performance among the candidate procedures up to a constant factor and an additive penalty term.
References
Book
Elements of information theory
Thomas M. Cover, Joy A. Thomas, et al.
TL;DR: The authors examine the role of entropy, inequalities, and randomness in the design and construction of codes.
Book
A Probabilistic Theory of Pattern Recognition
TL;DR: The Bayes error and Vapnik-Chervonenkis theory are applied as guides for empirical classifier selection, on the basis of explicit specification and explicit enforcement of the maximum likelihood principle.
Journal ArticleDOI
The context-tree weighting method: basic properties
TL;DR: The authors derive a natural upper bound on the cumulative redundancy of the method for individual sequences, showing that the proposed context-tree weighting procedure is optimal in the sense that it achieves the Rissanen (1984) lower bound.
Journal ArticleDOI
The Intrinsic Bayes Factor for Model Selection and Prediction
James O. Berger, Luis R. Pericchi, et al.
TL;DR: This article introduces a new criterion, the intrinsic Bayes factor, which is fully automatic in the sense of requiring only standard noninformative priors for its computation, and yet seems to correspond to very reasonable actual Bayes factors.