Showing papers in "arXiv: Methodology in 2007"

Posted Content•

Edoardo M. Airoldi¹, David M. Blei, Stephen E. Fienberg, Eric P. Xing•Institutions (1)

30 May 2007-arXiv: Methodology

TL;DR: The mixed membership stochastic blockmodel extends blockmodels for relational data to ones which capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation.

Abstract: Observations consisting of measurements on relationships for pairs of objects arise in many settings, such as protein interaction and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilistic models can be delicate because the simple exchangeability assumptions underlying many boilerplate models no longer hold. In this paper, we describe a latent variable model of such data called the mixed membership stochastic blockmodel. This model extends blockmodels for relational data to ones which capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation. We develop a general variational inference algorithm for fast approximate posterior inference. We explore applications to social and protein interaction networks.

1,546 citations

Posted Content•

Bayesian treed Gaussian process models with an application to computer modeling

Robert B. Gramacy¹, Herbert K. H. Lee²•Institutions (2)

University of Cambridge¹, University of California, Santa Cruz²

24 Oct 2007-arXiv: Methodology

TL;DR: This article explores nonstationary modeling methodologies that couple stationary Gaussian processes with treed partitioning and shows that this approach is effective in other arenas as well.

Abstract: Motivated by a computer experiment for the design of a rocket booster, this paper explores nonstationary modeling methodologies that couple stationary Gaussian processes with treed partitioning. Partitioning is a simple but effective method for dealing with nonstationarity. The methodological developments and statistical computing details which make this approach efficient are described in detail. In addition to providing an analysis of the rocket booster simulator, our approach is demonstrated to be effective in other arenas.

518 citations

Posted Content•

Enhancing Sparsity by Reweighted L1 Minimization

Emmanuel J. Candès, Michael B. Wakin, Stephen Boyd

10 Nov 2007-arXiv: Methodology

TL;DR: This article studies a method for sparse signal recovery that solves a sequence of weighted L1-minimization problems, where the weights used for the next iteration are computed from the value of the current solution; a series of experiments demonstrates the performance and broad applicability of the algorithm in sparse signal recovery, statistical estimation, error correction and image processing.

Abstract: It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained L1 minimization. In this paper, we study a novel method for sparse signal recovery that in many situations outperforms L1 minimization in the sense that substantially fewer measurements are needed for exact recovery. The algorithm consists of solving a sequence of weighted L1-minimization problems where the weights used for the next iteration are computed from the value of the current solution. We present a series of experiments demonstrating the remarkable performance and broad applicability of this algorithm in the areas of sparse signal recovery, statistical estimation, error correction and image processing. Interestingly, superior gains are also achieved when our method is applied to recover signals with assumed near-sparsity in overcomplete representations--not by reweighting the L1 norm of the coefficient sequence as is common, but by reweighting the L1 norm of the transformed object. An immediate consequence is the possibility of highly efficient data acquisition protocols by improving on a technique known as compressed sensing.
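
The iteration described in this abstract can be written compactly. What follows is a sketch under stated assumptions: the symbols $\Phi$ (measurement matrix), $y$ (measurements), $w^{(\ell)}$ (weights at pass $\ell$) and the small stabilizing constant $\epsilon > 0$ are notation introduced here for illustration, and the inverse-magnitude weight update is one natural choice that the abstract itself does not spell out.

```latex
\begin{aligned}
x^{(\ell)} &= \arg\min_{x} \sum_{i=1}^{n} w_i^{(\ell)} \,\lvert x_i \rvert
  \quad \text{subject to } y = \Phi x, \\
w_i^{(\ell+1)} &= \frac{1}{\lvert x_i^{(\ell)} \rvert + \epsilon},
  \qquad w_i^{(0)} = 1 .
\end{aligned}
```

With all weights equal to one, the first pass is ordinary L1 minimization; under the assumed update, later passes penalize small entries more heavily than large ones.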

384 citations

Journal Article•DOI•

Inflated Beta Distributions

Raydonal Ospina¹, Silvia Ferrari¹•Institutions (1)

University of São Paulo¹

04 May 2007-arXiv: Methodology

TL;DR: In this paper, the authors consider the issue of modeling fractional data observed in the interval [0,1), (0,1] or [0,1] and propose mixed continuous-discrete distributions.

Abstract: This paper considers the issue of modeling fractional data observed in the interval [0,1), (0,1] or [0,1]. Mixed continuous-discrete distributions are proposed. The beta distribution is used to describe the continuous component of the model since its density can have quite different shapes depending on the values of the two parameters that index the distribution. Properties of the proposed distributions are examined. Also, maximum likelihood and method of moments estimation are discussed. Finally, practical applications that employ real data are presented.
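
The mixed continuous-discrete construction in this abstract can be illustrated with a short Python sketch. It covers only the case of a single inflated endpoint; the function names, the shape parametrization `(a, b)` of the beta component, and the mixing weight `alpha` are assumptions made for illustration, not the paper's notation.

```python
import math


def beta_pdf(y, a, b):
    """Density of a Beta(a, b) distribution at a point 0 < y < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(y) + (b - 1) * math.log(1 - y))


def inflated_beta_density(y, alpha, a, b, c=0.0):
    """Mixed density: probability mass alpha at the endpoint c (0 or 1),
    and (1 - alpha) times the beta density on the open interval (0, 1).
    All argument names are hypothetical."""
    if y == c:
        return alpha
    if 0.0 < y < 1.0:
        return (1.0 - alpha) * beta_pdf(y, a, b)
    return 0.0


def inflated_beta_mean(alpha, a, b, c=0.0):
    """Mean of the mixture: alpha * c + (1 - alpha) * a / (a + b)."""
    return alpha * c + (1.0 - alpha) * a / (a + b)
```

Setting `c=0.0` gives a distribution on [0,1) and `c=1.0` one on (0,1]; the [0,1] case would need a second point mass, which this sketch omits.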

279 citations

Posted Content•

Instantaneous and lagged measurements of linear and nonlinear dependence between groups of multivariate time series: frequency decomposition

Roberto D. Pascual-Marqui

09 Nov 2007-arXiv: Methodology

TL;DR: This paper defines measures of linear dependence (coherence) and nonlinear dependence (phase synchronization) between any number of multivariate time series, each expressed as the sum of lagged dependence and instantaneous dependence.

Abstract: Measures of linear dependence (coherence) and nonlinear dependence (phase synchronization) between any number of multivariate time series are defined. The measures are expressed as the sum of lagged dependence and instantaneous dependence. The measures are non-negative, and take the value zero only when there is independence of the pertinent type. These measures are defined in the frequency domain and are applicable to stationary and non-stationary time series. These new results extend and refine significantly those presented in a previous technical report (Pascual-Marqui 2007, arXiv:0706.1776 [stat.ME]), and have been largely motivated by the seminal paper on linear feedback by Geweke (1982, JASA 77:304-313). One important field of application is neurophysiology, where the time series consist of electric neuronal activity at several brain locations. Coherence and phase synchronization are interpreted as "connectivity" between locations. However, any measure of dependence is highly contaminated with an instantaneous, non-physiological contribution due to volume conduction and low spatial resolution. The new techniques remove this confounding factor considerably. Moreover, the measures of dependence can be applied to any number of brain areas jointly, i.e. distributed cortical networks, whose activity can be estimated with eLORETA (Pascual-Marqui 2007, arXiv:0710.3341 [math-ph]).

227 citations

Posted Content•

Modeling homophily and stochastic equivalence in symmetric relational data

Peter D. Hoff¹•Institutions (1)

University of Washington¹

07 Nov 2007-arXiv: Methodology

TL;DR: This article presents a latent variable model for inference and prediction of symmetric relational data, based on the idea of the eigenvalue decomposition, that generalizes other popular latent variable models.

Abstract: This article discusses a latent variable model for inference and prediction of symmetric relational data. The model, based on the idea of the eigenvalue decomposition, represents the relationship between two nodes as the weighted inner-product of node-specific vectors of latent characteristics. This "eigenmodel" generalizes other popular latent variable models, such as latent class and distance models: it is shown mathematically that any latent class or distance model has a representation as an eigenmodel, but not vice versa. The practical implications of this are examined in the context of three real datasets, for which the eigenmodel has out-of-sample predictive performance as good as or better than the other two models.
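
The weighted inner product at the core of this abstract is simple to write down. The Python sketch below is illustrative only: the function names are hypothetical, the weights are treated as a plain list that may contain negative entries (an assumption of this sketch), and any link function or covariate terms a full model would add are left out.

```python
def eigenmodel_score(u_i, u_j, weights):
    """Weighted inner product sum_k weights[k] * u_i[k] * u_j[k] for one node pair."""
    if not (len(u_i) == len(u_j) == len(weights)):
        raise ValueError("latent vectors and weights must share one length")
    return sum(w * a * b for w, a, b in zip(weights, u_i, u_j))


def score_matrix(latent_vectors, weights):
    """Symmetric matrix of pairwise scores for all nodes (hypothetical helper)."""
    n = len(latent_vectors)
    return [[eigenmodel_score(latent_vectors[i], latent_vectors[j], weights)
             for j in range(n)] for i in range(n)]
```

Because the score is symmetric in its two node arguments, the resulting matrix suits symmetric relational data such as undirected networks.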

194 citations

Posted Content•

On semiparametric regression with O'Sullivan penalised splines

Matt P. Wand, John T. Ormerod

02 Jul 2007-arXiv: Methodology

TL;DR: This paper discusses the use of O'Sullivan penalised splines in contemporary semiparametric regression, including mixed model and Bayesian formulations, and obtains exact expressions for the O'Sullivan penalty matrix.

Abstract: This is an exposé on the use of O'Sullivan penalised splines in contemporary semiparametric regression, including mixed model and Bayesian formulations. O'Sullivan penalised splines are similar to P-splines, but have the advantage of being a direct generalisation of smoothing splines. Exact expressions for the O'Sullivan penalty matrix are obtained. Comparison of the two reveals that O'Sullivan penalised splines more closely mimic the natural boundary behaviour of smoothing splines. Implementation in modern computing environments such as Matlab, R and BUGS is discussed.

164 citations

Journal Article•DOI•

Statistical and Clinical Aspects of Hospital Outcomes Profiling

Sharon-Lise T. Normand¹, David M. Shahian•Institutions (1)

Harvard University¹

25 Oct 2007-arXiv: Methodology

TL;DR: This work reviews the historical evolution of hospital profiling with special emphasis on outcomes; presents a detailed history of cardiac surgery report cards, the paradigm for modern provider profiling; discusses the potential unintended negative consequences of public report cards; and describes various statistical methodologies for quantifying the relative performance of cardiac surgery programs.

Abstract: Hospital profiling involves a comparison of a health care provider's structure, processes of care, or outcomes to a standard, often in the form of a report card. Given the ubiquity of report cards and similar consumer ratings in contemporary American culture, it is notable that these are a relatively recent phenomenon in health care. Prior to the 1986 release of Medicare hospital outcome data, little such information was publicly available. We review the historical evolution of hospital profiling with special emphasis on outcomes; present a detailed history of cardiac surgery report cards, the paradigm for modern provider profiling; discuss the potential unintended negative consequences of public report cards; and describe various statistical methodologies for quantifying the relative performance of cardiac surgery programs. Outstanding statistical issues are also described.

157 citations

Posted Content•

Particle Filters for Partially Observed Diffusions

Paul Fearnhead¹, Omiros Papaspiliopoulos², Gareth O. Roberts²•Institutions (2)

Lancaster University¹, University of Warwick²

23 Oct 2007-arXiv: Methodology

TL;DR: In this paper, the authors propose a particle filter scheme for a class of partially-observed multivariate diffusions, which does not require approximations of the transition and/or the observation density using time-discretisations.

Abstract: In this paper we introduce a novel particle filter scheme for a class of partially-observed multivariate diffusions. We consider a variety of observation schemes, including diffusion observed with error, observation of a subset of the components of the multivariate diffusion and arrival times of a Poisson process whose intensity is a known function of the diffusion (Cox process). Unlike currently available methods, our particle filters do not require approximations of the transition and/or the observation density using time-discretisations. Instead, they build on recent methodology for the exact simulation of the diffusion process and the unbiased estimation of the transition density as described in Beskos et al. (2006). We introduce the Generalised Poisson Estimator, which generalises the Poisson Estimator of Beskos et al. (2006). A central limit theorem is given for our particle filter scheme.

120 citations

Posted Content•

Sparse inverse covariance estimation with the lasso

Jerome H. Friedman¹, Trevor Hastie¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

27 Aug 2007-arXiv: Methodology

TL;DR: A simple algorithm, using a coordinate descent procedure for the lasso, is developed that solves a 1000 node problem in at most a minute, and is 30 to 4000 times faster than competing methods.

Abstract: We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm, the Graphical Lasso, that is remarkably fast: it solves a 1000 node problem (~500,000 parameters) in at most a minute, and is 30 to 4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen & Bühlmann (2006). We illustrate the method on some cell-signaling data from proteomics.
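
The "lasso penalty applied to the inverse covariance matrix" can be stated as an optimization problem. The formulation below is a sketch assuming Gaussian data; the symbols $S$ (empirical covariance matrix), $\Theta$ (inverse covariance matrix) and $\rho \ge 0$ (penalty level) are notation chosen here for illustration and do not appear in the abstract.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}
  \Bigl\{ \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta) \;-\; \rho \,\lVert \Theta \rVert_{1} \Bigr\},
\qquad
\lVert \Theta \rVert_{1} = \sum_{j,k} \lvert \Theta_{jk} \rvert .
```

Zero entries of $\hat{\Theta}$ correspond to missing edges in the estimated graph, so a larger $\rho$ yields a sparser graph.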

101 citations

Journal Article•DOI•

A Review of Accelerated Test Models

Luis A. Escobar¹, William Q. Meeker²•Institutions (2)

Louisiana State University¹, Iowa State University²

02 Aug 2007-arXiv: Methodology

TL;DR: This article reviews many of the accelerated test (AT) models that have been used successfully in the manufacturing industries to acquire reliability information quickly.

Abstract: Engineers in the manufacturing industries have used accelerated test (AT) experiments for many decades. The purpose of AT experiments is to acquire reliability information quickly. Test units of a material, component, subsystem or entire systems are subjected to higher-than-usual levels of one or more accelerating variables such as temperature or stress. Then the AT results are used to predict life of the units at use conditions. The extrapolation is typically justified (correctly or incorrectly) on the basis of physically motivated models or a combination of empirical model fitting with a sufficient amount of previous experience in testing similar units. The need to extrapolate in both time and the accelerating variables generally necessitates the use of fully parametric models. Statisticians have made important contributions in the development of appropriate stochastic models for AT data [typically a distribution for the response and regression relationships between the parameters of this distribution and the accelerating variable(s)], statistical methods for AT planning (choice of accelerating variable levels and allocation of available test units to those levels) and methods of estimation of suitable reliability metrics. This paper provides a review of many of the AT models that have been used successfully in this area.

Posted Content•

Active Set and EM Algorithms for Log-Concave Densities Based on Complete and Censored Data

Lutz Duembgen, Andre Huesler, Kaspar Rufibach

31 Jul 2007-arXiv: Methodology

TL;DR: An active set algorithm for the maximum likelihood estimation of a log-concave density based on complete data and an EM algorithm to treat arbitrarily censored or binned data are developed.

Abstract: We develop an active set algorithm for the maximum likelihood estimation of a log-concave density based on complete data. Building on this fast algorithm, we indicate an EM algorithm to treat arbitrarily censored or binned data.

Journal Article•DOI•

Treelets--An adaptive multi-scale basis for sparse unordered data

Ann B. Lee, Boaz Nadler, Larry Wasserman

03 Jul 2007-arXiv: Methodology

TL;DR: Treelets extend wavelets to nonsmooth signals; the method returns a hierarchical tree and an orthonormal basis which both reflect the internal structure of the data, and it is especially well-suited as a dimensionality reduction and feature selection tool prior to regression and classification.

Abstract: In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered--with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity: the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this paper we present treelets--a novel construction of multi-scale bases that extends wavelets to nonsmooth signals. The method is fully adaptive, as it returns a hierarchical tree and an orthonormal basis which both reflect the internal structure of the data. Treelets are especially well-suited as a dimensionality reduction and feature selection tool prior to regression and classification, in situations where sample sizes are small and the data are sparse with unknown groupings of correlated or collinear variables. The method is also simple to implement and analyze theoretically. Here we describe a variety of situations where treelets perform better than principal component analysis, as well as some common variable selection and cluster averaging schemes. We illustrate treelets on a blocked covariance model and on several data sets (hyperspectral image data, DNA microarray data, and internet advertisements) with highly complex dependencies between variables.

Posted Content•

Coherence and phase synchronization: generalization to pairs of multivariate time series, and removal of zero-lag contributions

Roberto D. Pascual-Marqui

12 Jun 2007-arXiv: Methodology

TL;DR: The new connectivity measures proposed here can be applied to pairs of univariate EEG/MEG signals, as is traditional in the published literature, but these calculations cannot be interpreted as connectivity, since it is in general incorrect to associate an extracranial electrode or sensor to the underlying cortex.

Abstract: Coherence and phase synchronization between time series corresponding to different spatial locations are usually interpreted as indicators of the connectivity between locations. In neurophysiology, time series of electric neuronal activity are essential for studying brain interconnectivity. Such signals can either be invasively measured from depth electrodes, or computed from very high time resolution, non-invasive, extracranial recordings of scalp electric potential differences (EEG: electroencephalogram) and magnetic fields (MEG: magnetoencephalogram) by means of a tomography such as sLORETA (standardized low resolution brain electromagnetic tomography). There are two problems in this case. First, in the usual situation of unknown cortical geometry, the estimated signal at each brain location is a vector with three components (i.e. a current density vector), which means that coherence and phase synchronization must be generalized to pairs of multivariate time series. Second, the inherent low spatial resolution of the EEG/MEG tomography introduces artificially high zero-lag coherence and phase synchronization. In this report, solutions to both problems are presented. Two additional generalizations are briefly mentioned: (1) conditional coherence and phase synchronization; and (2) non-stationary time-frequency analysis. Finally, a non-parametric randomization method for connectivity significance testing is outlined. The new connectivity measures proposed here can be applied to pairs of univariate EEG/MEG signals, as is traditional in the published literature. However, these calculations cannot be interpreted as connectivity, since it is in general incorrect to associate an extracranial electrode or sensor to the underlying cortex.

Posted Content•

Adaptive optimal allocation in stratified sampling methods

Pierre Etore, Benjamin Jourdain¹•Institutions (1)

University of Paris¹

28 Nov 2007-arXiv: Methodology

TL;DR: In this article, a stratified sampling algorithm is proposed in which the random drawings made in the strata to compute the expectation of interest are also used to adaptively modify the proportion of further drawings in each stratum.

Abstract: In this paper, we propose a stratified sampling algorithm in which the random drawings made in the strata to compute the expectation of interest are also used to adaptively modify the proportion of further drawings in each stratum. These proportions converge to the optimal allocation in terms of variance reduction, and our stratified estimator is asymptotically normal with asymptotic variance equal to the minimal one. Numerical experiments confirm the efficiency of our algorithm.
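
The adaptive idea in this abstract can be sketched in Python. This is a minimal illustration under assumptions, not the authors' algorithm: it uses the standard variance-minimizing allocation with proportions proportional to (stratum probability × stratum standard deviation), re-estimated after each round from all draws so far. The function names, the round structure and the minimum of two draws per stratum are choices made here.

```python
import random
import statistics


def optimal_allocation(probs, stds):
    """Proportions q_i proportional to p_i * sigma_i (variance-minimizing allocation)."""
    products = [p * s for p, s in zip(probs, stds)]
    total = sum(products)
    if total == 0.0:
        return list(probs)  # degenerate case: fall back to proportional allocation
    return [x / total for x in products]


def adaptive_stratified_estimate(samplers, probs, rounds=5, draws_per_round=200, seed=0):
    """Estimate an expectation while adapting the per-stratum share of draws.

    samplers[i](rng) must return one draw of the integrand within stratum i;
    probs[i] is the known probability of stratum i.
    """
    rng = random.Random(seed)
    samples = [[] for _ in samplers]
    proportions = list(probs)  # first round: proportional allocation
    for _ in range(rounds):
        for i, sampler in enumerate(samplers):
            n_i = max(2, round(proportions[i] * draws_per_round))
            samples[i].extend(sampler(rng) for _ in range(n_i))
        # Reuse every draw so far to re-estimate the stratum standard deviations.
        stds = [statistics.stdev(s) for s in samples]
        proportions = optimal_allocation(probs, stds)
    estimate = sum(p * statistics.mean(s) for p, s in zip(probs, samples))
    return estimate, proportions
```

The same draws serve two purposes, as in the abstract: they feed the final stratified estimate and they steer where later draws are spent.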

Journal Article•DOI•

Quantile and Probability Curves Without Crossing

Victor Chernozhukov¹, Ivan Fernandez-Val², Alfred Galichon³•Institutions (3)

Massachusetts Institute of Technology¹, Boston University², École Polytechnique³

27 Apr 2007-arXiv: Methodology

TL;DR: In this paper, the authors proposed a method to address the problem of lack of monotonicity in estimation of conditional and structural quantile functions, also known as the quantile crossing problem.

Abstract: This paper proposes a method to address the longstanding problem of lack of monotonicity in estimation of conditional and structural quantile functions, also known as the quantile crossing problem. The method consists in sorting or monotone rearranging the original estimated non-monotone curve into a monotone rearranged curve. We show that the rearranged curve is closer to the true quantile curve in finite samples than the original curve, establish a functional delta method for rearrangement-related operators, and derive functional limit theory for the entire rearranged curve and its functionals. We also establish validity of the bootstrap for estimating the limit law of the entire rearranged curve and its functionals. Our limit results are generic in that they apply to every estimator of a monotone econometric function, provided that the estimator satisfies a functional central limit theorem and the function satisfies some smoothness conditions. Consequently, our results apply to estimation of other econometric functions with monotonicity restrictions, such as demand, production, distribution, and structural distribution functions. We illustrate the results with an application to estimation of structural quantile functions using data on Vietnam veteran status and earnings.
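
The sorting step named in this abstract is easy to demonstrate. The Python sketch below assumes the estimated quantile curve has been evaluated on an equally spaced grid of quantile indices, so that sorting the fitted values yields the rearranged curve on that grid; the function names and the numbers in the usage note are hypothetical.

```python
def rearrange(curve_values):
    """Monotone rearrangement on a grid: return the fitted values in increasing order."""
    return sorted(curve_values)


def squared_error(estimate, truth):
    """Sum of squared differences between two curves on the same grid."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))
```

For example, a fitted curve `[0.1, 0.4, 0.3, 0.7]` crosses between its second and third grid points; `rearrange` removes the crossing, and against an increasing target such as `[0.1, 0.3, 0.5, 0.7]` the squared error does not increase.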

Journal Article•DOI•

SiZer for time series: A new approach to the analysis of trends

Vitaliana Rondonotti¹, James Stephen Marron, Cheolwoo Park²•Institutions (2)

European Central Bank¹, University of Georgia²

28 Jun 2007-arXiv: Methodology

TL;DR: SiZer-like ideas are developed for time series analysis to assess the significance of trends, and a new visualization is proposed that shows the statistician the range of trade-offs that are available.

Abstract: Smoothing methods and SiZer are useful statistical tools for discovering statistically significant structure in data. Based on scale space ideas originally developed in the computer vision literature, SiZer (SIgnificant ZERo crossing of the derivatives) is a graphical device to assess which observed features are 'really there' and which are just spurious sampling artifacts. In this paper, we develop SiZer-like ideas in time series analysis to address the important issue of significance of trends. This is not a straightforward extension, since one data set does not contain the information needed to distinguish 'trend' from 'dependence'. A new visualization is proposed, which shows the statistician the range of trade-offs that are available. Simulation and real data results illustrate the effectiveness of the method.

Posted Content•

Parameter Estimation for Partially Observed Hypoelliptic Diffusions

Yvo Pokern¹, Andrew M. Stuart¹, Petter Wiberg²•Institutions (2)

University of Warwick¹, Goldman Sachs²

29 Oct 2007-arXiv: Methodology

TL;DR: In this paper, parameters of partially observed hypoelliptic diffusions are estimated by combining approximate likelihoods in a deterministic scan Gibbs sampler that alternates between the missing data in the unobserved solution components and the parameters.

Abstract: Hypoelliptic diffusion processes can be used to model a variety of phenomena in applications ranging from molecular dynamics to audio signal analysis. We study parameter estimation for such processes in situations where we observe some components of the solution at discrete times. Since exact likelihoods for the transition densities are typically not known, approximations are used that are expected to work well in the limit of small inter-sample times $\Delta t$ and large total observation times $N\Delta t$. Hypoellipticity together with partial observation leads to ill-conditioning requiring a judicious combination of approximate likelihoods for the various parameters to be estimated. We combine these in a deterministic scan Gibbs sampler alternating between missing data in the unobserved solution components, and parameters. Numerical experiments illustrate asymptotic consistency of the method when applied to simulated data. The paper concludes with application of the Gibbs sampler to molecular dynamics data.

Posted Content•

Stability of the Gibbs Sampler for Bayesian Hierarchical Models

Omiros Papaspiliopoulos, Gareth O. Roberts

23 Oct 2007-arXiv: Methodology

TL;DR: In this article, the convergence of the Gibbs sampler is characterised for hierarchical linear models with arbitrary symmetric error distributions; the convergence can be uniform, geometric or sub-geometric depending on the relative tail behaviour of the error distributions, and on the parametrisation chosen.

Abstract: We characterise the convergence of the Gibbs sampler which samples from the joint posterior distribution of parameters and missing data in hierarchical linear models with arbitrary symmetric error distributions. We show that the convergence can be uniform, geometric or sub-geometric depending on the relative tail behaviour of the error distributions, and on the parametrisation chosen. Our theory is applied to characterise the convergence of the Gibbs sampler on latent Gaussian process models. We indicate how the theoretical framework we introduce will be useful in analyzing more complex models.

Journal Article•DOI•

The Use of Unlabeled Data in Predictive Modeling

Feng Liang¹, Sayan Mukherjee², Mike West²•Institutions (2)

University of Illinois at Urbana–Champaign¹, Duke University²

25 Oct 2007-arXiv: Methodology

TL;DR: The fundamental statistical foundations for predictive modeling and the general questions associated with unlabeled data are overviewed, highlighting the relevance of venerable concepts of sampling design and prior specification.

Abstract: The incorporation of unlabeled data in regression and classification analysis is an increasing focus of the applied statistics and machine learning literatures, with a number of recent examples demonstrating the potential for unlabeled data to contribute to improved predictive accuracy. The statistical basis for this semisupervised analysis does not appear to have been well delineated; as a result, the underlying theory and rationale may be underappreciated, especially by nonstatisticians. There is also room for statisticians to become more fully engaged in the vigorous research in this important area of intersection of the statistical and computer sciences. Much of the theoretical work in the literature has focused, for example, on geometric and structural properties of the unlabeled data in the context of particular algorithms, rather than probabilistic and statistical questions. This paper overviews the fundamental statistical foundations for predictive modeling and the general questions associated with unlabeled data, highlighting the relevance of venerable concepts of sampling design and prior specification. This theory, illustrated with a series of central illustrative examples and two substantial real data analyses, shows precisely when, why and how unlabeled data matter.

Posted Content•

Dated ancestral trees from binary trait data and its application to the diversification of languages

Geoff K. Nicholls, Russell D. Gray

12 Nov 2007-arXiv: Methodology

TL;DR: This work proposes a model‐based analysis of binary trait data and presents a Markov chain Monte Carlo algorithm that can sample from the resulting posterior distribution, based on using a birth–death process for the evolution of the elements of sets of traits.

Abstract: Binary trait data record the presence or absence of distinguishing traits in individuals. We treat the problem of estimating ancestral trees with time depth from binary trait data. Simple analysis of such data is problematic. Each homology class of traits has a unique birth event on the tree, and the birth event of a trait visible at the leaves is biased towards the leaves. We propose a model-based analysis of such data, and present an MCMC algorithm that can sample from the resulting posterior distribution. Our model is based on using a birth-death process for the evolution of the elements of sets of traits. Our analysis correctly accounts for the removal of singleton traits, which are commonly discarded in real data sets. We illustrate Bayesian inference for two binary-trait data sets which arise in historical linguistics. The Bayesian approach allows for the incorporation of information from ancestral languages. The marginal prior distribution of the root time is uniform. We present a thorough analysis of the robustness of our results to model misspecification, through analysis of predictive distributions for external data, and fitting data simulated under alternative observation models. The reconstructed ages of tree nodes are relatively robust, whilst posterior probabilities for topology are not reliable.

Journal Article•DOI•

Robust estimates in generalized partially linear models

Graciela Boente, Xuming He, Jianhui Zhou

01 Aug 2007-arXiv: Methodology

TL;DR: In this article, a family of robust estimates for the parametric and nonparametric components under a generalized partially linear model is introduced, where the data are modeled by $y_i|(\mathbf{x}_i,t_i)\sim F(\cdot,\mu_i)$ with $\mu_i=H(\eta(t_i)+\mathbf{x}_i^{\mathrm{T}}\beta)$.

Abstract: In this paper, we introduce a family of robust estimates for the parametric and nonparametric components under a generalized partially linear model, where the data are modeled by $y_i|(\mathbf{x}_i,t_i)\sim F(\cdot,\mu_i)$ with $\mu_i=H(\eta(t_i)+\mathbf{x}_i^{\mathrm{T}}\beta)$, for some known distribution function $F$ and link function $H$. It is shown that the estimates of $\beta$ are root-n consistent and asymptotically normal. Through a Monte Carlo study, the performance of these estimators is compared with that of the classical ones.


Posted Content•

Bayesian Covariance Matrix Estimation using a Mixture of Decomposable Graphical Models


Helen Armstrong¹, Chris Carter¹, Kevin Wong², Robert Kohn¹•Institutions (2)

University of New South Wales¹, Harvard University²

09 Jun 2007-arXiv: Methodology

TL;DR: It is shown empirically that the prior that assigns equal probability over graph sizes outperforms the prior that assigns equal probability over all graphs, both in identifying the correct decomposable graph and in estimating the covariance matrix more efficiently.


Abstract: A Bayesian approach is used to estimate the covariance matrix of Gaussian data. Ideas from Gaussian graphical models and model selection are used to construct a prior for the covariance matrix that is a mixture over all decomposable graphs. For this prior the probability of each graph size is specified by the user and graphs of equal size are assigned equal probability. Most previous approaches assume that all graphs are equally probable. We show empirically that the prior that assigns equal probability over graph sizes outperforms the prior that assigns equal probability over all graphs, both in identifying the correct decomposable graph and in more efficiently estimating the covariance matrix.


Posted Content•

Maximum Likelihood Estimation in Latent Class Models For Contingency Table Data


Stephen E. Fienberg, Patricia Hersh, Alessandro Rinaldo, Yi Zhou

21 Sep 2007-arXiv: Methodology

TL;DR: In this article, the basic latent class model originally proposed by the sociologist Paul F. Lazarsfeld for categorical variables is studied and its geometric structure is explained. The authors draw parallels between the statistical and geometric properties of latent class models and illustrate geometrically the causes of many problems associated with maximum likelihood estimation and related statistical inference.


Abstract: Statistical models with latent structure have a history going back to the 1950s and have seen widespread use in the social sciences and, more recently, in computational biology and in machine learning. Here we study the basic latent class model proposed originally by the sociologist Paul F. Lazarsfeld for categorical variables, and we explain its geometric structure. We draw parallels between the statistical and geometric properties of latent class models and we illustrate geometrically the causes of many problems associated with maximum likelihood estimation and related statistical inference. In particular, we focus on issues of non-identifiability and determination of the model dimension, of maximization of the likelihood function and on the effect of symmetric data. We illustrate these phenomena with a variety of synthetic and real-life tables, of different dimensions and complexity. Much of the motivation for this work stems from the “100 Swiss Francs” problem, which we introduce and describe in detail.


Posted Content•

Application of Girsanov Theorem to Particle Filtering of Discretely Observed Continuous-Time Non-Linear Systems


Simo Särkkä¹, Tommi Sottinen²•Institutions (2)

Helsinki University of Technology¹, Reykjavík University²

11 May 2007-arXiv: Methodology

TL;DR: In this article, the Girsanov theorem is used for evaluating the likelihood ratios needed in importance sampling in a continuous-discrete optimal filtering problem, where the system model is a stochastic differential equation and noisy measurements are obtained at discrete instances of time.


Abstract: This article considers the application of particle filtering to continuous-discrete optimal filtering problems, where the system model is a stochastic differential equation, and noisy measurements of the system are obtained at discrete instances of time. It is shown how the Girsanov theorem can be used for evaluating the likelihood ratios needed in importance sampling. It is also shown how the methodology can be applied to a class of models where the driving noise process has lower dimensionality than the state, and thus the laws of state and noise are not absolutely continuous. Rao-Blackwellization of conditionally Gaussian models and unknown static parameter models is also considered.


Journal Article•DOI•

Wavelet methods in statistics: Some recent developments and their applications


Anestis Antoniadis

03 Dec 2007-arXiv: Methodology

TL;DR: In this paper, a selective review of wavelet shrinkage and thresholding estimators for nonparametric curve estimation is presented, with a short introduction to wavelet theory and a broad range of applications in major areas of statistics.


Abstract: The development of wavelet theory has in recent years spawned applications in signal processing, in fast algorithms for integral transforms, and in image and function representation methods. This last application has stimulated interest in wavelet applications to statistics and to the analysis of experimental data, with many successes in the efficient analysis, processing, and compression of noisy signals and images. This is a selective review article that attempts to synthesize some recent work on "nonlinear" wavelet methods in nonparametric curve estimation and their role in a variety of applications. After a short introduction to wavelet theory, we discuss in detail several wavelet shrinkage and wavelet thresholding estimators, scattered in the literature and developed, under more or less standard settings, for density estimation from i.i.d. observations or to denoise data modeled as observations of a signal with additive noise. Most of these methods are fitted into the general concept of regularization with appropriately chosen penalty functions. A wide range of applications in major areas of statistics is also discussed, such as partial linear regression models and functional index models. The usefulness of all these methods is illustrated by means of simulations and practical examples.
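
The shrinkage and thresholding estimators mentioned in the abstract act coefficient by coefficient on a wavelet decomposition. The following is a minimal sketch of the two most common rules, not code from the review itself; the function names and the threshold parameter `lam` are illustrative assumptions. Soft thresholding is the minimizer of 0.5*(c - t)**2 + lam*abs(t) over t, which is one way to see the link to penalized regularization.

```python
import math


def soft_threshold(coeffs, lam):
    """Shrink each coefficient toward zero by lam; set small ones to zero.

    Hypothetical helper: `coeffs` is a plain list of wavelet coefficients
    and `lam` is a non-negative threshold chosen by the user.
    """
    result = []
    for c in coeffs:
        shrunk = abs(c) - lam
        # Coefficients with magnitude at most lam are treated as pure noise.
        result.append(0.0 if shrunk <= 0 else math.copysign(shrunk, c))
    return result


def hard_threshold(coeffs, lam):
    """Keep coefficients whose magnitude exceeds lam; zero out the rest."""
    return [c if abs(c) > lam else 0.0 for c in coeffs]
```

Soft thresholding is continuous in the data but biases large coefficients by `lam`, while hard thresholding leaves large coefficients untouched at the cost of a discontinuity at the threshold.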


Journal Article•DOI•

Distributions associated with general runs and patterns in hidden Markov models


John A. D. Aston, Donald E. K. Martin

27 Jun 2007-arXiv: Methodology

TL;DR: In this article, a method for computing distributions associated with patterns in the state sequence of a hidden Markov model, conditional on observing all or part of the observation sequence, is presented.


Abstract: This paper gives a method for computing distributions associated with patterns in the state sequence of a hidden Markov model, conditional on observing all or part of the observation sequence. Probabilities are computed for very general classes of patterns (competing patterns and generalized later patterns), and thus, the theory includes as special cases results for a large class of problems that have wide application. The unobserved state sequence is assumed to be Markovian with a general order of dependence. An auxiliary Markov chain is associated with the state sequence and is used to simplify the computations. Two examples are given to illustrate the use of the methodology. Whereas the first application is more to illustrate the basic steps in applying the theory, the second is a more detailed application to DNA sequences, and shows that the methods can be adapted to include restrictions related to biological knowledge.


Journal Article•DOI•

A General Framework for the Parametrization of Hierarchical Models


Omiros Papaspiliopoulos¹, Gareth O. Roberts², Martin Sköld•Institutions (2)

University of Warwick¹, Lancaster University²

28 Aug 2007-arXiv: Methodology

TL;DR: In this paper, the authors describe centering and noncentering methodology as complementary techniques for use in parametrization of broad classes of hierarchical models, with a view to the construction of effective MCMC algorithms for exploring posterior distributions from these models.


Abstract: In this paper, we describe centering and noncentering methodology as complementary techniques for use in parametrization of broad classes of hierarchical models, with a view to the construction of effective MCMC algorithms for exploring posterior distributions from these models. We give a clear qualitative understanding as to when centering and noncentering work well, and introduce theory concerning the convergence time complexity of Gibbs samplers using centered and noncentered parametrizations. We give general recipes for the construction of noncentered parametrizations, including an auxiliary variable technique called the state-space expansion technique. We also describe partially noncentered methods, and demonstrate their use in constructing robust Gibbs sampler algorithms whose convergence properties are not overly sensitive to the data.
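
To make the contrast between centering and noncentering concrete, here is a minimal sketch on a toy model that is an assumption of this example, not a reproduction of the paper's algorithms: a single observation y = x + noise with noise standard deviation `sigma_y`, a latent x drawn from a normal with mean theta and standard deviation `sigma_x`, and a flat prior on theta. The centered sampler alternates between x and theta; the noncentered one replaces x by x_tilde = x - theta, which is independent of theta a priori. All names are hypothetical.

```python
import random


def gibbs_theta_chain(y, sigma_x, sigma_y, n_iter, centered, seed=0):
    """Run a two-block Gibbs sampler and return the sampled theta values."""
    rng = random.Random(seed)
    prec = 1.0 / sigma_x ** 2 + 1.0 / sigma_y ** 2
    cond_sd = (1.0 / prec) ** 0.5
    theta = y  # start at the posterior mean of theta
    chain = []
    for _ in range(n_iter):
        if centered:
            # x | theta, y combines the prior N(theta, sigma_x^2) with the data.
            mean_x = (y / sigma_y ** 2 + theta / sigma_x ** 2) / prec
            x = rng.gauss(mean_x, cond_sd)
            # theta | x under a flat prior depends on x only.
            theta = rng.gauss(x, sigma_x)
        else:
            # x_tilde | theta, y combines the prior N(0, sigma_x^2) with y - theta.
            mean_xt = ((y - theta) / sigma_y ** 2) / prec
            x_tilde = rng.gauss(mean_xt, cond_sd)
            # theta | x_tilde, y under a flat prior.
            theta = rng.gauss(y - x_tilde, sigma_y)
        chain.append(theta)
    return chain


def lag1_autocorr(chain):
    """Empirical lag-1 autocorrelation of a list of floats."""
    n = len(chain)
    mean = sum(chain) / n
    num = sum((chain[i] - mean) * (chain[i + 1] - mean) for i in range(n - 1))
    den = sum((c - mean) ** 2 for c in chain)
    return num / den
```

In this toy model the theta chain is autoregressive with coefficient sigma_y^2 / (sigma_x^2 + sigma_y^2) under centering and sigma_x^2 / (sigma_x^2 + sigma_y^2) under noncentering, so the centered sampler mixes well when the observation is informative relative to the latent variance and the noncentered one mixes well in the opposite regime.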


Book Chapter•DOI•

Statistical inverse problems in active network tomography


Earl Lawrence¹, George Michailidis, Vijayan N. Nair•Institutions (1)

Los Alamos National Laboratory¹

01 Jan 2007-arXiv: Methodology

TL;DR: This paper is concerned with active network tomography where the goal is to recover information about quality-of-service parameters at the link level from aggregate data measured on end-to-end network paths.


Abstract: The analysis of computer and communication networks gives rise to some interesting inverse problems. This paper is concerned with active network tomography where the goal is to recover information about quality-of-service (QoS) parameters at the link level from aggregate data measured on end-to-end network paths. The estimation and monitoring of QoS parameters, such as loss rates and delays, are of considerable interest to network engineers and Internet service providers. The paper provides a review of the inverse problems and recent research on inference for loss rates and delay distributions. Some new results on parametric inference for delay distributions are also developed. In addition, a real application on Internet telephony is discussed.
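
The inverse problem for loss rates can be illustrated on the smallest possible case. The sketch below assumes a hypothetical two-leaf multicast tree (one shared link from the source to a branch node, then one link to each receiver) with independent losses on each link; this topology and the function name are assumptions of the example, not taken from the chapter. If a0, a1 and a2 are the link pass rates, the end-to-end receipt probabilities are a0*a1, a0*a2 and a0*a1*a2 (both receivers), and these three equations can be inverted.

```python
def estimate_link_pass_rates(probes):
    """Estimate per-link pass rates from end-to-end probe outcomes.

    `probes` is a list of (received_at_leaf1, received_at_leaf2) booleans,
    one pair per multicast probe. Returns (shared, leaf1, leaf2) pass rates.
    """
    n = len(probes)
    if n == 0:
        raise ValueError("at least one probe is required")
    g1 = sum(1 for a, _ in probes if a) / n          # leaf 1 received
    g2 = sum(1 for _, b in probes if b) / n          # leaf 2 received
    g12 = sum(1 for a, b in probes if a and b) / n   # both received
    if g12 == 0:
        raise ValueError("no probe reached both leaves; rates not identifiable")
    # Moment equations: g1 = a0*a1, g2 = a0*a2, g12 = a0*a1*a2.
    shared = g1 * g2 / g12
    leaf1 = g12 / g2
    leaf2 = g12 / g1
    return shared, leaf1, leaf2
```

The loss rate on each link is one minus the corresponding pass rate. The example shows why joint (multicast-style) measurements matter: the marginal receipt rates alone cannot separate the shared link from the leaf links.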


Posted Content•

Bandwidth Selection for Weighted Kernel Density Estimation


Bin Wang¹, Xiaofeng Wang²•Institutions (2)

University of South Alabama¹, Cleveland Clinic²

11 Sep 2007-arXiv: Methodology

TL;DR: Three bandwidth selection methods based on mean integrated squared error are introduced, and the least-squares cross-validation method, the adaptive weight kernel density estimator and boundary problems are also studied.


Abstract: Weighted kernel-density estimates (wKDE) are broadly used in many statistical areas, for instance, density estimation under right-censoring. However, bandwidth selection can become problematic when the kernels are reweighted. In this paper, we investigate the methods of bandwidth selection for wKDE. Three mean integrated squared error based bandwidth selection methods are introduced. The least-squares cross-validation method, the adaptive weight kernel density estimator and boundary problems are also studied. Monte Carlo simulations were conducted to demonstrate the performance of the proposed bandwidth selection methods. Finally, the performance of wKDE is illustrated via an application to a biased sampling problem and a real data application.
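
For orientation, a weighted kernel density estimate replaces the equal 1/n weights of an ordinary kernel estimate with observation-specific weights. The sketch below assumes a Gaussian kernel and weights that are normalized to sum to one; the function name and arguments are hypothetical, and the bandwidth is supplied by the caller rather than chosen by any of the paper's selection methods.

```python
import math


def weighted_kde(x, data, weights, bandwidth):
    """Evaluate a Gaussian weighted kernel density estimate at the point x.

    `weights` need not sum to one; they are normalized internally.
    """
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    total = sum(weights)
    if total <= 0 or bandwidth <= 0:
        raise ValueError("weights must have a positive sum and bandwidth must be positive")
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    density = 0.0
    for xi, wi in zip(data, weights):
        z = (x - xi) / bandwidth
        # Each observation contributes a Gaussian bump scaled by its weight.
        density += (wi / total) * norm * math.exp(-0.5 * z * z)
    return density
```

With equal weights this reduces to the ordinary kernel density estimate; with unequal weights a few heavily weighted observations dominate, which is one reason a bandwidth tuned for the unweighted case may no longer be appropriate.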
