Open Access
Posted Content
Assessing Bayesian Nonparametric Log-Linear Models: an application to Disclosure Risk estimation
TLDR
A method is proposed for identifying models with good predictive performance in the family of Bayesian log-linear mixed models with Dirichlet process random effects for count data.
Abstract:
We present a method for identifying models with good predictive performance in the family of Bayesian log-linear mixed models with Dirichlet process random effects. Such a problem arises in many different applications; here we consider it in the context of disclosure risk estimation, an increasingly relevant issue raised by the growing demand for data collected under a pledge of confidentiality. Two different criteria are proposed and jointly used via a two-stage selection procedure, in an M-open view. The first stage is devoted to identifying a path of search; then, at the second stage, a small number of nonparametric models is evaluated through an application-specific, score-based Bayesian information criterion. We test our method on a variety of contingency tables based on microdata samples from the US Census Bureau and the Italian National Security Administration, treated here as populations, and carefully discuss its features. This leads us on a journey around different forms and sources of bias, along which we show that (i) while based on the so-called "score+search" paradigm, our method is by construction well protected from selection-induced bias, and (ii) models with good performance are invariably characterized by an extraordinarily simple structure of fixed effects. The complexity of model selection, a very challenging task in a strictly parametric context with large and sparse tables, is therefore significantly defused by our approach. An attractive collateral result of our analysis is a set of fruitful new ideas about modeling in small area estimation problems, where interest is in total counts over cells with a small number of observations.
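To make the quantity being estimated concrete, here is a minimal sketch of two disclosure risk measures that are common in this literature (for example in the Skinner and Shlomo paper listed under Citations): the number of sample uniques that are also population uniques, and the sum of 1/F_k over sample-unique cells, where f_k and F_k are the sample and population counts of cell k. The abstract does not say which score the authors use, so this is an illustration under assumptions, not the paper's criterion; the function name and the toy counts are hypothetical.

```python
def risk_measures(sample_counts, population_counts):
    """Return (tau1, tau2) for per-cell sample counts f_k and population counts F_k.

    tau1: number of cells with f_k == 1 and F_k == 1
    tau2: sum of 1 / F_k over cells with f_k == 1
    Illustrative definitions only; not taken from the abstract above.
    """
    if len(sample_counts) != len(population_counts):
        raise ValueError("sample and population tables must have the same cells")
    tau1 = 0
    tau2 = 0.0
    for f, F in zip(sample_counts, population_counts):
        if f > F:
            raise ValueError("a sample count cannot exceed its population count")
        if f == 1:  # sample unique
            tau1 += int(F == 1)  # also unique in the population
            tau2 += 1.0 / F      # chance of a correct match in this cell
    return tau1, tau2


# Toy contingency table flattened to a list of cells (made-up numbers).
population = [1, 4, 1, 2, 10, 1]
sample = [1, 1, 0, 1, 3, 0]
tau1, tau2 = risk_measures(sample, population)
print(tau1, tau2)
```

In practice the population counts are unknown and must be predicted by a model; treating a large microdata sample as the population, as the abstract describes, is what allows true values of such measures to be computed and compared with model-based estimates.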
Citations
Journal ArticleDOI
Subset Selection in Regression
TL;DR: Subset Selection in Regression, by Alan Miller (Chapman and Hall, Monographs on Statistics and Applied Probability, no. 40, 1990).
Posted Content
Discrete multivariate distributions
TL;DR: In this paper, the authors introduced two new discrete distributions: multivariate Binomial distribution and multivariate Poisson distribution, which were created in eventology as more correct generalizations of Binomial and Poisson distributions.
Posted Content
Assessing identification risk in survey microdata using log-linear models
Chris J. Skinner, Natalie Shlomo
TL;DR: This article developed new criteria for assessing the specification of a log-linear model in relation to the accuracy of risk estimates and found that within a class of "reasonable" models, risk estimates tend to decrease as the complexity of the model increases.
Posted Content
Optimal disclosure risk assessment
TL;DR: In this article, the authors study nonparametric estimation of the disclosure risk under the Poisson abundance model for sample records and establish a lower bound for the minimax NMSE for the estimation.
Journal ArticleDOI
Optimal disclosure risk assessment
TL;DR: This article proposes a class of linear estimators of $\tau_{1}$ that are simple, computationally efficient and scalable to massive datasets, and that provably estimate $\tau_{1}$ all of the way up to the sampling fraction, with vanishing normalized mean-square error (NMSE) for large $n$.
References
Journal ArticleDOI
Equation of state calculations by fast computing machines
TL;DR: In this article, a modified Monte Carlo integration over configuration space is used to investigate properties such as equations of state for substances consisting of interacting individual molecules; results for a two-dimensional rigid-sphere system are compared to the free volume equation of state and to a four-term virial coefficient expansion.
Journal ArticleDOI
A Bayesian Analysis of Some Nonparametric Problems
TL;DR: In this article, a class of prior distributions, called Dirichlet process priors, is proposed for nonparametric problems, for which treatment of many non-parametric statistical problems may be carried out, yielding results that are comparable to the classical theory.
Journal ArticleDOI
Hybrid Monte Carlo
TL;DR: In this article, a hybrid (molecular dynamics/Langevin) algorithm is used to guide a Monte Carlo simulation of lattice field theory; the method is especially efficient for systems such as quantum chromodynamics that contain fermionic degrees of freedom.
Journal ArticleDOI
Bayesian Density Estimation and Inference Using Mixtures
Michael Escobar, Mike West
TL;DR: In this article, the authors describe and illustrate Bayesian inference in models for density estimation using mixtures of Dirichlet processes and show convergence results for a general class of normal mixture models.
Journal ArticleDOI
Markov Chain Sampling Methods for Dirichlet Process Mixture Models
TL;DR: In this article, Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model are reviewed, and two new classes of methods are presented. The new methods are simple to implement and more efficient than previous ways of handling general Dirichlet process mixture models with non-conjugate priors.
Related Papers (5)
Sparse Estimation and Uncertainty with Application to Subgroup Analysis
Marc Ratkovic, Dustin Tingley