
Showing papers on "Nonparametric statistics published in 2018"


Journal ArticleDOI
TL;DR: In this article, a general class of weights, called balancing weights, is defined to balance the weighted distributions of the covariates between treatment groups, and a new weighting scheme, the overlap weights, is proposed to minimize the asymptotic variance of the weighted average treatment effect within the class of balancing weights.
Abstract: Covariate balance is crucial for unconfounded descriptive or causal comparisons. However, lack of balance is common in observational studies. This article considers weighting strategies for balancing covariates. We define a general class of weights—the balancing weights—that balance the weighted distributions of the covariates between treatment groups. These weights incorporate the propensity score to weight each group to an analyst-selected target population. This class unifies existing weighting methods, including commonly used weights such as inverse-probability weights as special cases. General large-sample results on nonparametric estimation based on these weights are derived. We further propose a new weighting scheme, the overlap weights, in which each unit’s weight is proportional to the probability of that unit being assigned to the opposite group. The overlap weights are bounded, and minimize the asymptotic variance of the weighted average treatment effect among the class of balancing weights.
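
As a minimal sketch of the idea (assuming propensity scores have already been estimated; the function names and the simple weighted-difference estimator below are illustrative, not the authors' implementation):

```python
import numpy as np

def balancing_weights(propensity, treatment, scheme="overlap"):
    """Illustrative balancing weights built from estimated propensity scores e(x).

    scheme="ipw"     -> inverse-probability weights (targets the combined population)
    scheme="overlap" -> overlap weights: each unit is weighted by the probability of
                        being assigned to the opposite group (1 - e for treated, e for controls)
    """
    e = np.clip(propensity, 1e-6, 1 - 1e-6)
    if scheme == "ipw":
        return np.where(treatment == 1, 1.0 / e, 1.0 / (1.0 - e))
    if scheme == "overlap":
        return np.where(treatment == 1, 1.0 - e, e)
    raise ValueError(f"unknown scheme: {scheme}")

def weighted_difference(y, treatment, w):
    """Weighted difference in outcome means between groups (a simple target estimand)."""
    t = treatment == 1
    return np.average(y[t], weights=w[t]) - np.average(y[~t], weights=w[~t])
```

Unlike inverse-probability weights, the overlap weights above never exceed one, which is one way to see the boundedness claimed in the abstract.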

508 citations


Journal ArticleDOI
TL;DR: In this paper, the authors show that the standard central limit theorem (CLT) results do not hold for means of nonparametric, conditional efficiency estimators, and provide new CLTs that permit applied researchers to make valid inference about mean conditional efficiency or to compare mean efficiency across groups of producers.
Abstract: This paper demonstrates that standard central limit theorem (CLT) results do not hold for means of nonparametric, conditional efficiency estimators, and provides new CLTs that permit applied researchers to make valid inference about mean conditional efficiency or to compare mean efficiency across groups of producers. The new CLTs are used to develop a test of the restrictive “separability” condition that is necessary for second-stage regressions of efficiency estimates on environmental variables. We show that if this condition is violated, not only are second-stage regressions difficult to interpret and perhaps meaningless, but also first-stage, unconditional efficiency estimates are misleading. As such, the test developed here is of fundamental importance to applied researchers using nonparametric methods for efficiency estimation. The test is shown to be consistent and its local power is examined. Our simulation results indicate that our tests perform well both in terms of size and power. We provide a real-world empirical example by re-examining Aly et al. (R. E. Stat., 1990) and rejecting the separability assumption implicitly assumed by Aly et al., calling into question results that appear in hundreds of papers that have been published in recent years.

152 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider selective inference with a randomized response and prove a selective central limit theorem that transfers procedures valid under asymptotic normality without selection to their corresponding selective counterparts.
Abstract: Inspired by sample splitting and the reusable holdout introduced in the field of differential privacy, we consider selective inference with a randomized response. We discuss two major advantages of using a randomized response for model selection. First, the selectively valid tests are more powerful after randomized selection. Second, it allows consistent estimation and weak convergence of selective inference procedures. Under independent sampling, we prove a selective (or privatized) central limit theorem that transfers procedures valid under asymptotic normality without selection to their corresponding selective counterparts. This allows selective inference in nonparametric settings. Finally, we propose a framework of inference after combining multiple randomized selection procedures. We focus on the classical asymptotic setting, leaving the interesting high-dimensional asymptotic questions for future work.

124 citations


Journal ArticleDOI
TL;DR: In this article, the performance of least squares estimators over closed convex sets is studied in shape-constrained regression models under Gaussian and sub-Gaussian noise.
Abstract: The performance of Least Squares (LS) estimators is studied in shape-constrained regression models under Gaussian and sub-Gaussian noise. General bounds on the performance of LS estimators over closed convex sets are provided. These results have the form of sharp oracle inequalities that account for the model misspecification error. In the presence of misspecification, these bounds imply that the LS estimator estimates the projection of the true parameter at the same rate as in the well-specified case. In isotonic and unimodal regression, the LS estimator achieves the nonparametric rate $n^{-2/3}$ as well as a parametric rate of order $k/n$ up to logarithmic factors, where $k$ is the number of constant pieces of the true parameter. In univariate convex regression, the LS estimator satisfies an adaptive risk bound of order $q/n$ up to logarithmic factors, where $q$ is the number of affine pieces of the true regression function. This adaptive risk bound holds for any collection of design points. While Guntuboyina and Sen [Probab. Theory Related Fields 163 (2015) 379–411] established that the nonparametric rate of convex regression is of order $n^{-4/5}$ for equispaced design points, we show that the nonparametric rate of convex regression can be as slow as $n^{-2/3}$ for some worst-case design points. This phenomenon can be explained as follows: Although convexity brings more structure than unimodality, for some worst-case design points this extra structure is uninformative and the nonparametric rates of unimodal regression and convex regression are both $n^{-2/3}$. Higher order cones, such as the cone of $\beta $-monotone sequences, are also studied.
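
A quick illustration of the isotonic case, using scikit-learn's least-squares isotonic fit (the data and the square-root signal are made up for the example; this only illustrates the estimator, not the paper's oracle inequalities):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 500
x = np.sort(rng.uniform(0, 1, n))
f_true = np.sqrt(x)                      # a nondecreasing regression function
y = f_true + 0.3 * rng.normal(size=n)    # Gaussian noise

iso = IsotonicRegression()               # least squares over the monotone cone
f_hat = iso.fit_transform(x, y)
print(np.mean((f_hat - f_true) ** 2))    # empirical risk shrinks roughly at the n^{-2/3} rate
```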

91 citations


Journal ArticleDOI
TL;DR: Tau (τ), a nonparametric rank-order correlation statistic, has been applied to single-case experimental designs with promising results; Tau-U, a family of related coefficients, partitions variance.
Abstract: Tau (τ), a nonparametric rank order correlation statistic, has been applied to single-case experimental designs with promising results. Tau-U, a family of related coefficients, partitions variance ...
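
For intuition, a toy phase-contrast computation of Tau with SciPy (the phase indicator and scores are hypothetical, and this simple version does not include Tau-U's baseline-trend correction):

```python
from scipy.stats import kendalltau

# Hypothetical single-case data: a baseline (A) phase followed by an intervention (B) phase.
baseline = [3, 4, 2, 5, 4]
intervention = [6, 7, 7, 9, 8]
phase = [0] * len(baseline) + [1] * len(intervention)
scores = baseline + intervention

tau, p_value = kendalltau(phase, scores)  # rank-order association between phase and outcome
print(tau, p_value)
```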

91 citations


Proceedings ArticleDOI
19 Jul 2018
TL;DR: This paper introduces generalized score functions for causal discovery based on the characterization of general (conditional) independence relationships between random variables, without assuming particular model classes.
Abstract: Discovery of causal relationships from observational data is a fundamental problem. Roughly speaking, there are two types of methods for causal discovery, constraint-based ones and score-based ones. Score-based methods avoid the multiple testing problem and enjoy certain advantages compared to constraint-based ones. However, most of them need strong assumptions on the functional forms of causal mechanisms, as well as on data distributions, which limit their applicability. In practice the precise information of the underlying model class is usually unknown. If the above assumptions are violated, both spurious and missing edges may result. In this paper, we introduce generalized score functions for causal discovery based on the characterization of general (conditional) independence relationships between random variables, without assuming particular model classes. In particular, we exploit regression in RKHS to capture the dependence in a nonparametric way. The resulting causal discovery approach produces asymptotically correct results in rather general cases, which may have nonlinear causal mechanisms, a wide class of data distributions, mixed continuous and discrete data, and multidimensional variables. Experimental results on both synthetic and real-world data demonstrate the efficacy of our proposed approach.
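
A hedged sketch of the RKHS-regression ingredient, using kernel ridge regression from scikit-learn on made-up data (the generalized score function the paper builds on top of such regressions is not reproduced here):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(3)
x = rng.normal(size=(300, 1))
y = np.tanh(2 * x[:, 0]) + 0.1 * rng.normal(size=300)   # nonlinear causal mechanism

# Nonparametric regression of y on a candidate parent set via an RBF kernel.
model = KernelRidge(kernel="rbf", alpha=0.1).fit(x, y)
residuals = y - model.predict(x)   # residual dependence is what a generalized score can assess
print(np.round(residuals[:5], 3))
```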

89 citations


Journal ArticleDOI
TL;DR: This work presents a Bayesian formulation of weighted stochastic block models that can be used to infer the large-scale modular structure of weighted networks, including their hierarchical organization, and gives a comprehensive treatment of different kinds of edge weights.
Abstract: We present a Bayesian formulation of weighted stochastic block models that can be used to infer the large-scale modular structure of weighted networks, including their hierarchical organization. Our method is nonparametric, and thus does not require the prior knowledge of the number of groups or other dimensions of the model, which are instead inferred from data. We give a comprehensive treatment of different kinds of edge weights (i.e., continuous or discrete, signed or unsigned, bounded or unbounded), as well as arbitrary weight transformations, and describe an unsupervised model selection approach to choose the best network description. We illustrate the application of our method to a variety of empirical weighted networks, such as global migrations, voting patterns in congress, and neural connections in the human brain.

84 citations


Journal ArticleDOI
TL;DR: In this paper, a method for using instrumental variables (IV) to draw inference about causal effects for individuals other than those affected by the instrument at hand has been proposed, where both the IV estimand and many treatment parameters can be expressed as weighted averages of the same underlying marginal treatment effects.
Abstract: We propose a method for using instrumental variables (IV) to draw inference about causal effects for individuals other than those affected by the instrument at hand. Policy relevance and external validity turn on the ability to do this reliably. Our method exploits the insight that both the IV estimand and many treatment parameters can be expressed as weighted averages of the same underlying marginal treatment effects. Since the weights are identified, knowledge of the IV estimand generally places some restrictions on the unknown marginal treatment effects, and hence on the values of the treatment parameters of interest. We show how to extract information about the treatment parameter of interest from the IV estimand and, more generally, from a class of IV‐like estimands that includes the two stage least squares and ordinary least squares estimands, among others. Our method has several applications. First, it can be used to construct nonparametric bounds on the average causal effect of a hypothetical policy change. Second, our method allows the researcher to flexibly incorporate shape restrictions and parametric assumptions, thereby enabling extrapolation of the average effects for compliers to the average effects for different or larger populations. Third, our method can be used to test model specification and hypotheses about behavior, such as no selection bias and/or no selection on gain.
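
The key representation the abstract refers to can be written, in standard marginal-treatment-effect notation (a sketch of the well-known identity, not the authors' exact statement), as:

```latex
% Both the IV estimand and a target parameter \beta_s are weighted averages of the
% marginal treatment effect MTE(u), with weights that are identified from the data:
\beta_{\mathrm{IV}} = \int_0^1 \omega_{\mathrm{IV}}(u)\,\mathrm{MTE}(u)\,du,
\qquad
\beta_{s} = \int_0^1 \omega_{s}(u)\,\mathrm{MTE}(u)\,du .
```

Knowing the IV estimand and its weights restricts the set of admissible MTE functions, which in turn bounds the target parameter of interest.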

84 citations


Journal ArticleDOI
TL;DR: In this paper, a Bayesian methodology is proposed to estimate and test the Kendall rank correlation coefficient τ using test statistics rather than data, and the combined result is an inferential methodology that yields a posterior distribution for Kendall's τ.
Abstract: This article outlines a Bayesian methodology to estimate and test the Kendall rank correlation coefficient τ. The nonparametric nature of rank data implies the absence of a generative model and the lack of an explicit likelihood function. These challenges can be overcome by modeling test statistics rather than data (Johnson, 2005). We also introduce a method for obtaining a default prior distribution. The combined result is an inferential methodology that yields a posterior distribution for Kendall's τ.

82 citations


Proceedings Article
29 Apr 2018
TL;DR: This work proposes the first trainable probabilistic deep architecture for hybrid domains that features tractable queries and relieves the user from deciding a-priori the parametric form of the random variables but is still expressive enough to effectively approximate any distribution and permits efficient learning and inference.
Abstract: While all kinds of mixed data---from personal data, through panel and scientific data, to public and commercial data---are collected and stored, building probabilistic graphical models for these hybrid domains becomes more difficult. Users spend significant amounts of time identifying the parametric form of the random variables (Gaussian, Poisson, Logit, etc.) involved and learning the mixed models. To make this difficult task easier, we propose the first trainable probabilistic deep architecture for hybrid domains that features tractable queries. It is based on Sum-Product Networks (SPNs) with piecewise polynomial leaf distributions together with novel nonparametric decomposition and conditioning steps using the Hirschfeld-Gebelein-Renyi Maximum Correlation Coefficient. This relieves the user from deciding a priori the parametric form of the random variables but is still expressive enough to effectively approximate any distribution and permits efficient learning and inference. Our experiments show that the architecture, called Mixed SPNs, can indeed capture complex distributions across a wide range of hybrid domains.

80 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of nonparametric regression under shape constraints, and study the behavior of the risk of the least squares estimator (LSE) and its pointwise limiting distribution.
Abstract: We consider the problem of nonparametric regression under shape constraints. The main examples include isotonic regression (with respect to any partial order), unimodal/convex regression, additive shape-restricted regression and the constrained single index model. We review some of the theoretical properties of the least squares estimator (LSE) in these problems, emphasizing the adaptive nature of the LSE. In particular, we study the behavior of the risk of the LSE, and its pointwise limiting distribution theory, with special emphasis on isotonic regression. We survey various methods for constructing pointwise confidence intervals around these shape-restricted functions. We also briefly discuss the computation of the LSE and indicate some open research problems and future directions.

Proceedings ArticleDOI
01 Jul 2018
TL;DR: A novel non-parametric generative model for location trajectories that tries to capture the statistical features of human mobility, in contrast with existing models that generate trajectories in a sequential manner is proposed and evaluated.
Abstract: Modeling human mobility and synthesizing realistic trajectories play a fundamental role in urban planning and privacy-preserving location data analysis. Due to its high dimensionality and also the diversity of its applications, existing trajectory generative models do not preserve the geometric (and more importantly) semantic features of human mobility, especially for longer trajectories. In this paper, we propose and evaluate a novel non-parametric generative model for location trajectories that tries to capture the statistical features of human mobility as a whole. This is in contrast with existing models that generate trajectories in a sequential manner. We design a new representation of locations, and use generative adversarial networks to produce data points in that representation space which will be then transformed to a time-series location trajectory form. We evaluate our method on realistic location trajectories and compare our synthetic traces with multiple existing methods on how they preserve geographic and semantic features of real traces at both aggregated and individual levels. The empirical results prove the capability of our model in preserving the utility of real data.

Journal ArticleDOI
TL;DR: A functional approach to estimating data ensemble properties, based entirely on the empirical observations of discrete data samples and the relative proximity of these points in the data space, and hence named empirical data analysis (EDA), is proposed; it is well suited to the current move to a data-rich environment.
Abstract: Based on a critical analysis of data analytics and its foundations, we propose a functional approach to estimate data ensemble properties, which is based entirely on the empirical observations of discrete data samples and the relative proximity of these points in the data space and hence named empirical data analysis (EDA). The ensemble functions include the nonparametric square centrality (a measure of closeness used in graph theory) and typicality (an empirically derived quantity which resembles probability). A distinctive feature of the proposed new functional approach to data analysis is that it does not assume randomness or determinism of the empirically observed data, nor independence. The typicality is derived from the discrete data directly in contrast to the traditional approach, where a continuous probability density function is assumed a priori. The typicality is expressed in a closed analytical form that can be calculated recursively and, thus, is computationally very efficient. The proposed nonparametric estimators of the ensemble properties of the data can also be interpreted as a discrete form of the information potential (known from the information theoretic learning theory as well as the Parzen windows). Therefore, EDA is very suitable for the current move to a data-rich environment, where the understanding of the underlying phenomena behind the available vast amounts of data is often not clear. We also present an extension of EDA for inference. The areas of applications of the new methodology of the EDA are wide because it concerns the very foundation of data analysis. Preliminary tests show its good performance in comparison to traditional techniques.
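
One plausible reading of the distance-based quantities, offered only as a hedged sketch (the exact definitions of square centrality and typicality are given in the paper; the formula below is illustrative):

```python
import numpy as np
from scipy.spatial.distance import cdist

def typicality(X):
    """Illustrative distance-based typicality in the spirit of EDA.

    Centrality of a point is taken here as the inverse of its average squared
    distance to the other samples; typicality is centrality normalised to sum
    to one, so it resembles a discrete probability derived purely from the data.
    """
    d2 = cdist(X, X, metric="sqeuclidean")
    avg = d2.sum(axis=1) / (len(X) - 1)
    centrality = 1.0 / avg
    return centrality / centrality.sum()

X = np.random.default_rng(4).normal(size=(100, 2))
print(typicality(X).sum())   # 1.0 by construction
```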

Posted Content
TL;DR: A new notion of regularization is discovered, called the generator-discriminator-pair regularization, that sheds light on the advantage of GANs compared to classical parametric and nonparametric approaches for explicit distribution estimation.
Abstract: This paper studies the rates of convergence for learning distributions implicitly with the adversarial framework and Generative Adversarial Networks (GANs), which subsume Wasserstein, Sobolev, MMD GAN, and Generalized/Simulated Method of Moments (GMM/SMM) as special cases. We study a wide range of parametric and nonparametric target distributions under a host of objective evaluation metrics. We investigate how to obtain valid statistical guarantees for GANs through the lens of regularization. On the nonparametric end, we derive the optimal minimax rates for distribution estimation under the adversarial framework. On the parametric end, we establish a theory for general neural network classes (including deep leaky ReLU networks) that characterizes the interplay on the choice of generator and discriminator pair. We discover and isolate a new notion of regularization, called the generator-discriminator-pair regularization, that sheds light on the advantage of GANs compared to classical parametric and nonparametric approaches for explicit distribution estimation. We develop novel oracle inequalities as the main technical tools for analyzing GANs, which are of independent interest.


Journal ArticleDOI
TL;DR: In this paper, an extension of the stochastic block model for recurrent interaction events in continuous time is proposed, where every individual belongs to a latent group and conditional interactions between two individuals follow an inhomogeneous Poisson process with intensity driven by the individuals' latent groups.
Abstract: We propose an extension of the stochastic block model for recurrent interaction events in continuous time, where every individual belongs to a latent group and conditional interactions between two individuals follow an inhomogeneous Poisson process with intensity driven by the individuals’ latent groups. We show that the model is identifiable and estimate it with a semiparametric variational expectation-maximization algorithm. We develop two versions of the method, one using a nonparametric histogram approach with an adaptive choice of the partition size, and the other using kernel intensity estimators. We select the number of latent groups by an integrated classification likelihood criterion. We demonstrate the performance of our procedure on synthetic experiments, analyse two datasets to illustrate the utility of our approach, and comment on competing methods.


Journal ArticleDOI
TL;DR: This work proposes a non-parametric chance-constrained optimization approach to operate and plan energy storage units in power distribution grids and develops new closed-form stochastic models for the key operational parameters in the system.
Abstract: By considering the specific characteristics of random variables in active distribution grids, such as their statistical dependencies and often irregularly-shaped probability distributions, we propose a non-parametric chance-constrained optimization approach to operate and plan energy storage units in power distribution grids. In particular, we develop new closed-form stochastic models for the key operational parameters in the system. Our approach is analytical and allows formulating tractable optimization problems. Yet, it does not involve any restricting assumption on the distribution of random parameters, hence, it results in accurate modeling of uncertainties. Different case studies are presented to compare the proposed approach with the conventional deterministic and parametric stochastic approaches, where the latter is based on approximating random variables with Gaussian probability distributions.
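
To make the nonparametric chance-constraint idea concrete, a heavily simplified sketch (the scenario data, feeder-limit constraint, and all names here are hypothetical; the paper's closed-form stochastic models are not reproduced):

```python
import numpy as np

def min_discharge(net_load_scenarios, feeder_limit, epsilon=0.05):
    """Smallest storage discharge d such that Pr(net_load - d > feeder_limit) <= epsilon,
    enforced with the empirical (1 - epsilon)-quantile of the scenarios instead of a
    Gaussian (parametric) approximation of the net load."""
    q = np.quantile(net_load_scenarios, 1 - epsilon)
    return max(q - feeder_limit, 0.0)

scenarios = np.random.default_rng(5).gamma(shape=9.0, scale=0.5, size=2000)  # skewed net load (MW)
print(min_discharge(scenarios, feeder_limit=5.0))
```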

BookDOI
03 Sep 2018
TL;DR: The author introduces event history and survival analysis, with emphasis on Cox regression and its application in R through the eha and survival packages.
Abstract: Contents: Preface. Event History and Survival Data: Introduction; Survival Data; Right Censoring; Left Truncation; Time Scales; Event History Data; More Data Sets. Single Sample Data: Introduction; Continuous Time Model Descriptions; Discrete Time Models; Nonparametric Estimators; Doing it in R. Cox Regression: Introduction; Proportional Hazards; The Log-Rank Test; Proportional Hazards in Continuous Time; Estimation of the Baseline Hazard; Explanatory Variables; Interactions; Interpretation of Parameter Estimates; Proportional Hazards in Discrete Time; Model Selection; Male Mortality. Poisson Regression: Introduction; The Poisson Distribution; The Connection to Cox Regression; The Connection to the Piecewise Constant Hazards Model; Tabular Lifetime Data. More on Cox Regression: Introduction; Time-Varying Covariates; Communal Covariates; Tied Event Times; Stratification; Sampling of Risk Sets; Residuals; Checking Model Assumptions; Fixed Study Period Survival; Left- or Right-Censored Data. Parametric Models: Introduction; Proportional Hazards Models; Accelerated Failure Time Models; Proportional Hazards or AFT Model?; Discrete Time Models. Multivariate Survival Models: Introduction; Frailty Models; Parametric Frailty Models; Stratification. Competing Risks Models: Introduction; Some Mathematics; Estimation; Meaningful Probabilities; Regression; R Code for Competing Risks. Causality and Matching: Introduction; Philosophical Aspects of Causality; Causal Inference; Aalen's Additive Hazards Model; Dynamic Path Analysis; Matching; Conclusion. Basic Statistical Concepts: Introduction; Statistical Inference; Asymptotic Theory; Model Selection. Survival Distributions: Introduction; Relevant Distributions in R; Parametric Proportional Hazards and Accelerated Failure Time Models. A Brief Introduction to R: R in General; Some Standard R Functions; Writing Functions; Graphics; Probability Functions; Help in R; Functions in eha and survival; Reading Data into R. Survival Packages in R: Introduction; eha; survival; Other Packages. Bibliography. Index.
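
The book works in R with the eha and survival packages; purely as a hedged illustration of the Cox proportional hazards idea it covers, here is a minimal Python sketch using the lifelines package (the column names and data are hypothetical):

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical right-censored data: follow-up time, event indicator, and one covariate.
df = pd.DataFrame({
    "time":  [5, 8, 12, 3, 9, 15, 7, 11],
    "event": [1, 0, 1, 1, 0, 1, 1, 0],     # 1 = event observed, 0 = right-censored
    "x":     [0, 1, 0, 1, 1, 0, 1, 0],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")   # partial-likelihood estimation
cph.print_summary()
```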

Journal ArticleDOI
TL;DR: This letter proposes a machine learning-based linear programming model that quickly establishes the nonparametric prediction intervals of wind power by integrating extreme learning machine and quantile regression.
Abstract: This letter proposes a machine learning-based linear programming model that quickly establishes the nonparametric prediction intervals of wind power by integrating extreme learning machine and quantile regression. The proportions of quantiles can be adaptively determined via sensitivity analysis. The proposed method has been proven to be significantly efficient and reliable, with a high application potential in power systems.
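
As a hedged sketch of the quantile-based interval idea (using a generic quantile regressor from scikit-learn rather than the letter's extreme learning machine, and synthetic data in place of wind-power measurements):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(500, 1))                          # e.g. forecast wind speed
y = np.sin(X[:, 0]) + rng.normal(scale=0.2 + 0.05 * X[:, 0])   # heteroscedastic "power" signal

lower = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)

X_new = np.array([[2.0], [8.0]])
print(np.column_stack([lower.predict(X_new), upper.predict(X_new)]))  # 90% nonparametric intervals
```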

Journal ArticleDOI
TL;DR: Symmetric rank covariances are a new class of multivariate nonparametric measures of dependence that generalise Kendall's tau, Hoeffding's $D$ and the Bergsma–Dassios sign covariance, and lead naturally to multivariate extensions of the latter.
Abstract: SummaryThe need to test whether two random vectors are independent has spawned many competing measures of dependence. We focus on nonparametric measures that are invariant under strictly increasing transformations, such as Kendall’s tau, Hoeffding’s $D$, and the Bergsma–Dassios sign covariance. Each exhibits symmetries that are not readily apparent from their definitions. Making these symmetries explicit, we define a new class of multivariate nonparametric measures of dependence that we call symmetric rank covariances. This new class generalizes the above measures and leads naturally to multivariate extensions of the Bergsma–Dassios sign covariance. Symmetric rank covariances may be estimated unbiasedly using U-statistics, for which we prove results on computational efficiency and large-sample behaviour. The algorithms we develop for their computation include, to the best of our knowledge, the first efficient algorithms for Hoeffding’s $D$ statistic in the multivariate setting.

Posted Content
TL;DR: In this paper, a deep-network-based approach that leverages adversarial learning to address a key challenge in modern time-to-event modeling is presented. Unlike most time-to-event models, the approach focuses on the estimation of time-to-event distributions rather than time ordering.
Abstract: Modern health data science applications leverage abundant molecular and electronic health data, providing opportunities for machine learning to build statistical models to support clinical practice. Time-to-event analysis, also called survival analysis, stands as one of the most representative examples of such statistical models. We present a deep-network-based approach that leverages adversarial learning to address a key challenge in modern time-to-event modeling: nonparametric estimation of event-time distributions. We also introduce a principled cost function to exploit information from censored events (events that occur subsequent to the observation window). Unlike most time-to-event models, we focus on the estimation of time-to-event distributions, rather than time ordering. We validate our model on both benchmark and real datasets, demonstrating that the proposed formulation yields significant performance gains relative to a parametric alternative, which we also propose.

Journal ArticleDOI
TL;DR: In this paper, a non-parametric panel data model with multidimensional, unobserved individual effects was studied and sufficient conditions for point identification of all parameters of the model were provided.
Abstract: This article studies non-parametric panel data models with multidimensional, unobserved individual effects when the number of time periods is fixed. I focus on models where the unobservables have a factor structure and enter an unknown structural function non-additively. The setup allows the individual effects to impact outcomes differently in different time periods and it allows for heterogeneous marginal effects. I provide sufficient conditions for point identification of all parameters of the model. Furthermore, I present a non-parametric sieve maximum likelihood estimator as well as flexible semiparametric and parametric estimators. Monte Carlo experiments demonstrate that the estimators perform well in finite samples. Finally, in an empirical application, I use these estimators to investigate the relationship between teaching practice and student achievement. The results differ considerably from those obtained with commonly used panel data methods.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed two semi-parametric model averaging schemes for nonlinear dynamic time series regression models with a very large number of covariates including exogenous regressors and auto-regressive lags.
Abstract: We propose two semiparametric model averaging schemes for nonlinear dynamic time series regression models with a very large number of covariates including exogenous regressors and auto-regressive lags. Our objective is to obtain more accurate estimates and forecasts of time series by using a large number of conditioning variables in a nonparametric way. In the first scheme, we introduce a Kernel Sure Independence Screening (KSIS) technique to screen out the regressors whose marginal regression (or auto-regression) functions do not make a significant contribution to estimating the joint multivariate regression function; we then propose a semiparametric penalized method of Model Averaging MArginal Regression (MAMAR) for the regressors and auto-regressors that survive the screening procedure, to further select the regressors that have significant effects on estimating the multivariate regression function and predicting the future values of the response variable. In the second scheme, we impose an app...

Posted Content
18 Oct 2018
TL;DR: This paper develops non-asymptotic confidence sequences that achieve arbitrary precision under nonparametric conditions and strengthens and generalizes existing constructions of finite-time iterated logarithm ("finite LIL") bounds.
Abstract: A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. In this paper, we develop confidence sequences whose widths go to zero, with non-asymptotic coverage guarantees under nonparametric conditions. Our technique draws a connection between the classical Cramér–Chernoff method for exponential concentration bounds, the law of the iterated logarithm (LIL), and the sequential probability ratio test: our confidence sequences extend the first to time-uniform concentration bounds; provide tight, non-asymptotic characterizations of the second; and generalize the third to nonparametric settings, including sub-Gaussian and Bernstein conditions, self-normalized processes, and matrix martingales. We illustrate the generality of our proof techniques by deriving an empirical-Bernstein bound growing at a LIL rate, as well as a novel upper LIL for the maximum eigenvalue of a sum of random matrices. Finally, we apply our methods to covariance matrix estimation and to estimation of sample average treatment effect under the Neyman-Rubin potential outcomes model.

Journal ArticleDOI
TL;DR: In this paper, the authors develop asymptotic approximations for kernel-based semiparametric estimators under assumptions accommodating slower than usual rates of convergence of their nonparametric ingredients.
Abstract: This paper develops asymptotic approximations for kernel-based semiparametric estimators under assumptions accommodating slower-than-usual rates of convergence of their nonparametric ingredients. Our first main result is a distributional approximation for semiparametric estimators that differs from existing approximations by accounting for a bias. This bias is nonnegligible in general, and therefore poses a challenge for inference. Our second main result shows that some (but not all) nonparametric bootstrap distributional approximations provide an automatic method of correcting for the bias. Our general theory is illustrated by means of examples and its main finite sample implications are corroborated in a simulation study.

Journal ArticleDOI
TL;DR: In this paper, a locally adaptive nonparametric curve fitting method that operates within a fully Bayesian framework is presented. But the method uses shrinkage priors to induce sparsity in order-k differences in the latent trend function, providing a combination of local adaptation and global control.
Abstract: We present a locally adaptive nonparametric curve fitting method that operates within a fully Bayesian framework. This method uses shrinkage priors to induce sparsity in order-k differences in the latent trend function, providing a combination of local adaptation and global control. Using a scale mixture of normals representation of shrinkage priors, we make explicit connections between our method and kth order Gaussian Markov random field smoothing. We call the resulting processes shrinkage prior Markov random fields (SPMRFs). We use Hamiltonian Monte Carlo to approximate the posterior distribution of model parameters because this method provides superior performance in the presence of the high dimensionality and strong parameter correlations exhibited by our models. We compare the performance of three prior formulations using simulated data and find the horseshoe prior provides the best compromise between bias and precision. We apply SPMRF models to two benchmark data examples frequently used to test nonparametric methods. We find that this method is flexible enough to accommodate a variety of data generating models and offers the adaptive properties and computational tractability to make it a useful addition to the Bayesian nonparametric toolbox.

Proceedings ArticleDOI
17 Jun 2018
TL;DR: This work builds on a new representation of the communication constraint, which leads to a tight characterization of the problem of estimating high-dimensional and nonparametric distributions in distributed networks.
Abstract: We consider the problem of estimating high-dimensional and nonparametric distributions in distributed networks, where each sensor in the network observes an independent sample from the underlying distribution and can communicate it to a central processor by writing at most $k$ bits on a public blackboard. We obtain matching upper and lower bounds for the minimax risk of estimating the underlying distribution under $L_1$ loss. Our results reveal that the minimax risk reduces exponentially in $k$. Instead of relying on strong data processing inequalities for the converse as commonly done in the literature, we build on a new representation of the communication constraint, which leads to a tight characterization of the problem.
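
A naive baseline protocol, for intuition only (each sensor writes its k-bit bin index on the blackboard and the centre forms a histogram; this is not the rate-optimal scheme analysed in the paper):

```python
import numpy as np

def distributed_histogram(samples, k, low=0.0, high=1.0):
    """Each of n sensors communicates k bits: the index of the bin containing its sample.
    The central processor returns the resulting histogram estimate of the distribution."""
    bins = 2 ** k
    idx = np.clip(((samples - low) / (high - low) * bins).astype(int), 0, bins - 1)
    counts = np.bincount(idx, minlength=bins)
    return counts / counts.sum()

samples = np.random.default_rng(6).beta(2, 5, size=10_000)
print(distributed_histogram(samples, k=3))   # 8-bin estimate from 3 bits per sensor
```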

Journal ArticleDOI
23 May 2018 - PLOS ONE
TL;DR: A semi-nonparametric Poisson regression model is developed to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US to provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity.
Abstract: This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highway is a topic of interest in the area of transportation safety due to higher driving speeds and the resultant severity level. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate an unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments are conducted to demonstrate that the SNP distribution can well mimic a large family of distributions, including normal distributions, log-gamma distributions, bimodal and trimodal distributions. Empirical estimation results show that such flexibility offered by the SNP distribution can greatly improve model precision and the overall goodness-of-fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When estimated coefficients in empirical models are compared, SNP and NB models are found to have a substantially different coefficient for the dummy variable indicating the lane width. The SNP model with better statistical performance suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
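
For context, a hedged sketch of the Negative Binomial baseline that the semi-nonparametric model is compared against, fitted with statsmodels on synthetic crash counts (all covariates and coefficients are made up; the SNP mixing distribution itself is not reproduced here):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
log_aadt = np.log(rng.lognormal(9, 0.5, n))          # log traffic volume (hypothetical)
lane_width = rng.normal(12, 1, n)                    # feet (hypothetical)
X = sm.add_constant(np.column_stack([log_aadt, lane_width]))

mu = np.exp(-8 + 1.0 * log_aadt - 0.05 * lane_width)
crashes = rng.poisson(mu * rng.gamma(2.0, 0.5, n))   # gamma mixing -> overdispersed counts

nb = sm.NegativeBinomial(crashes, X).fit(disp=0)
print(nb.params)                                     # baseline to compare with an SNP-mixed Poisson fit
```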

Posted Content
TL;DR: In this article, adaptive inference methods for regular (semiparametric) and non-regular (non-parametric) linear functionals of the conditional expectation function are provided.
Abstract: We provide adaptive inference methods, based on $\ell_1$ regularization, for regular (semiparametric) and non-regular (nonparametric) linear functionals of the conditional expectation function. Examples of regular functionals include average treatment effects, policy effects, and derivatives. Examples of non-regular functionals include average treatment effects, policy effects, and derivatives conditional on a covariate subvector fixed at a point. We construct a Neyman orthogonal equation for the target parameter that is approximately invariant to small perturbations of the nuisance parameters. To achieve this property, we include the Riesz representer for the functional as an additional nuisance parameter. Our analysis yields weak "double sparsity robustness": either the approximation to the regression or the approximation to the representer can be "completely dense" as long as the other is sufficiently "sparse". Our main results are non-asymptotic and imply asymptotic uniform validity over large classes of models, translating into honest confidence bands for both global and local parameters.