
Showing papers on "Nonparametric statistics published in 2021"



Journal ArticleDOI
TL;DR: It is proved that the multi-period difference-in-differences estimator is equivalent to the weighted 2FE estimator with some observations having negative weights, implying that, contrary to popular belief, the 2FE estimator does not represent a design-based, nonparametric estimation strategy for causal inference.
Abstract: The two-way linear fixed effects regression (2FE) has become a default method for estimating causal effects from panel data. Many applied researchers use the 2FE estimator to adjust for unobserved unit-specific and time-specific confounders at the same time. Unfortunately, we demonstrate that the ability of the 2FE model to simultaneously adjust for these two types of unobserved confounders critically relies upon the assumption of linear additive effects. Another common justification for the use of the 2FE estimator is based on its equivalence to the difference-in-differences estimator under the simplest setting with two groups and two time periods. We show that this equivalence does not hold under more general settings commonly encountered in applied research. Instead, we prove that the multi-period difference-in-differences estimator is equivalent to the weighted 2FE estimator with some observations having negative weights. These analytical results imply that in contrast to the popular belief, the 2FE estimator does not represent a design-based, nonparametric estimation strategy for causal inference. Instead, its validity fundamentally rests on the modeling assumptions.
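For reference, a minimal sketch of the two estimators being compared (notation assumed here for illustration, not taken from the paper): the 2FE least-squares problem and the canonical two-group, two-period difference-in-differences estimator to which it is equivalent only in that simplest setting.

```latex
% Two-way fixed effects (2FE) estimator (notation assumed for illustration):
\hat{\beta}_{\mathrm{2FE}} \;=\; \arg\min_{\beta,\,\alpha,\,\gamma}
  \sum_{i=1}^{N}\sum_{t=1}^{T}
  \bigl(Y_{it} - \alpha_i - \gamma_t - \beta X_{it}\bigr)^2

% Canonical 2-group / 2-period difference-in-differences estimator:
\hat{\tau}_{\mathrm{DID}} \;=\;
  \bigl(\bar{Y}_{\mathrm{treated,\,post}} - \bar{Y}_{\mathrm{treated,\,pre}}\bigr)
  \;-\;
  \bigl(\bar{Y}_{\mathrm{control,\,post}} - \bar{Y}_{\mathrm{control,\,pre}}\bigr)
```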

192 citations


Proceedings ArticleDOI
10 Oct 2021
TL;DR: This paper proposes a new algorithm called ART-C for conducting contrast tests within the Aligned Rank Transform (ART) paradigm and validates it on 72,000 synthetic data sets, finding that ART-C has more statistical power than a t-test, Mann-Whitney U test, Wilcoxon signed-rank test, and ART.
Abstract: Data from multifactor HCI experiments often violates the assumptions of parametric tests (i.e., nonconforming data). The Aligned Rank Transform (ART) has become a popular nonparametric analysis in HCI that can find main and interaction effects in nonconforming data, but leads to incorrect results when used to conduct post hoc contrast tests. We created a new algorithm called ART-C for conducting contrast tests within the ART paradigm and validated it on 72,000 synthetic data sets. Our results indicate that ART-C does not inflate Type I error rates, unlike contrasts based on ART, and that ART-C has more statistical power than a t-test, Mann-Whitney U test, Wilcoxon signed-rank test, and ART. We also extended an open-source tool called ARTool with our ART-C algorithm for both Windows and R. Our validation had some limitations (e.g., only six distribution types, no mixed factorial designs, no random slopes), and data drawn from Cauchy distributions should not be analyzed with ART-C.

155 citations


Journal ArticleDOI
TL;DR: In this article, the authors develop confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions, including sub-Gaussian and Bernstein conditions, and matrix martingales.
Abstract: A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. Our work develops confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions. We draw connections between the Cramer–Chernoff method for exponential concentration, the law of the iterated logarithm (LIL) and the sequential probability ratio test—our confidence sequences are time-uniform extensions of the first; provide tight, nonasymptotic characterizations of the second; and generalize the third to nonparametric settings, including sub-Gaussian and Bernstein conditions, self-normalized processes and matrix martingales. We illustrate the generality of our proof techniques by deriving an empirical-Bernstein bound growing at a LIL rate, as well as a novel upper LIL for the maximum eigenvalue of a sum of random matrices. Finally, we apply our methods to covariance matrix estimation and to estimation of sample average treatment effect under the Neyman–Rubin potential outcomes model.
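As a stylized illustration of the object being constructed (constants omitted and sub-Gaussianity assumed; this is not the paper's exact boundary), a time-uniform confidence sequence for a mean has width shrinking at the law-of-the-iterated-logarithm rate:

```latex
% Stylized sub-Gaussian confidence sequence (constants omitted, illustration only):
\Pr\!\Bigl(\forall\, t \ge 2:\;
  \bigl|\hat{\mu}_t - \mu\bigr|
  \;\le\; C\,\sigma\,\sqrt{\tfrac{\log\log t + \log(1/\alpha)}{t}}\Bigr)
  \;\ge\; 1-\alpha,
\qquad
\hat{\mu}_t = \tfrac{1}{t}\sum_{s=1}^{t} X_s .
```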

129 citations


Journal ArticleDOI
01 Mar 2021-Test
TL;DR: This paper provides a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics, and considers developments for the exploratory analysis of directional data.
Abstract: Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere, and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper, we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (Wiley 1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, space situational awareness, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed.
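A concrete example of the distributional models reviewed for circular data is the von Mises density with mean direction μ and concentration κ (standard notation, not specific to the paper):

```latex
% von Mises density on the unit circle; I_0 is the modified Bessel function
% of the first kind of order zero.
f(\theta;\,\mu,\kappa) \;=\; \frac{1}{2\pi I_0(\kappa)}\,
  \exp\bigl(\kappa \cos(\theta-\mu)\bigr),
\qquad \theta \in [0, 2\pi).
```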

76 citations


Journal ArticleDOI
TL;DR: It is shown that ignoring the matching step results in asymptotically valid standard errors if matching is done without replacement and the regression model is correctly specified relative to the population regression function of the outcome variable on the treatment variable and all the covariates used for matching.
Abstract: Nearest-neighbor matching is a popular nonparametric tool to create balance between treatment and control groups in observational studies. As a preprocessing step before regression, matching reduce...

74 citations


Journal ArticleDOI
Qiang Chen, Xinqi Yu, Mingxuan Sun, Chun Wu, Zijun Fu
TL;DR: With the proposed ARLC scheme, a high steady-state tracking accuracy is guaranteed, and comparative experiments are provided to demonstrate the effectiveness and superiority of the proposed method.
Abstract: In this article, an adaptive repetitive learning control (ARLC) scheme is proposed for permanent magnet synchronous motor (PMSM) servo systems with bounded nonparametric uncertainties, which are divided into two separated parts. The periodically nonparametric part is involved in an unknown desired control input, and a fully saturated repetitive learning law with a continuous switching function is developed to ensure that the estimate of the unknown desired control input is continuous and confined with a prespecified region. The nonperiodically nonparametric part is transformed into the parametric form and compensated by designing the adaptive updating laws, such that a prior knowledge on the bounds of uncertainties is not required in the controller design. With the proposed ARLC scheme, a high steady-state tracking accuracy is guaranteed, and comparative experiments are provided to demonstrate the effectiveness and superiority of the proposed method.

52 citations


Journal ArticleDOI
TL;DR: In this paper, the authors extend the second step of UMAP to a parametric optimization over neural network weights, learning a relationship between data and embedding, and demonstrate that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of learned parametric mapping.
Abstract: UMAP is a nonparametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) computing a graphical representation of a data set (fuzzy simplicial complex) and (2) through stochastic gradient descent, optimizing a low-dimensional embedding of the graph. Here, we extend the second step of UMAP to a parametric optimization over neural network weights, learning a parametric relationship between data and embedding. We first demonstrate that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of a learned parametric mapping (e.g., fast online embeddings for new data). We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semisupervised learning by capturing structure in unlabeled data.
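A minimal usage sketch, assuming the umap-learn package's TensorFlow-based ParametricUMAP class (an assumption about the library interface, not code from the paper; details may differ by version):

```python
# Sketch only: assumes umap-learn with its ParametricUMAP implementation is
# installed (requires TensorFlow); interface details may differ by version.
import numpy as np
from umap.parametric_umap import ParametricUMAP

X_train = np.random.rand(1000, 50)   # placeholder high-dimensional data
X_new = np.random.rand(100, 50)      # data arriving after training

embedder = ParametricUMAP(n_components=2)    # a neural network learns data -> embedding
emb_train = embedder.fit_transform(X_train)  # graph construction + parametric optimization
emb_new = embedder.transform(X_new)          # fast online embedding of unseen points
```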

50 citations


Journal ArticleDOI
TL;DR: The Box-Cox power transformation family for non-negative responses in linear models has a long and interesting history in both statistical practice and theory, which is summarized in this article.
Abstract: The Box-Cox power transformation family for non-negative responses in linear models has a long and interesting history in both statistical practice and theory, which we summarize. The relationship between generalized linear models and log transformed data is illustrated. Extensions investigated include the transform both sides model and the Yeo-Johnson transformation for observations that can be positive or negative. The paper also describes an extended Yeo-Johnson transformation that allows positive and negative responses to have different power transformations. Analyses of data show this to be necessary. Robustness enters in the fan plot for which the forward search provides an ordering of the data. Plausible transformations are checked with an extended fan plot. These procedures are used to compare parametric power transformations with nonparametric transformations produced by smoothing.
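For reference, the two transformation families discussed are the Box-Cox transformation for positive responses and the Yeo-Johnson transformation for responses of either sign:

```latex
% Box-Cox power transformation (y > 0):
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda}-1}{\lambda}, & \lambda \neq 0,\\[1ex]
  \log y, & \lambda = 0.
\end{cases}

% Yeo-Johnson transformation (y may be positive or negative):
\psi(y,\lambda) =
\begin{cases}
  \dfrac{(y+1)^{\lambda}-1}{\lambda}, & y \ge 0,\ \lambda \neq 0,\\[1ex]
  \log(y+1), & y \ge 0,\ \lambda = 0,\\[1ex]
  -\dfrac{(1-y)^{2-\lambda}-1}{2-\lambda}, & y < 0,\ \lambda \neq 2,\\[1ex]
  -\log(1-y), & y < 0,\ \lambda = 2.
\end{cases}
```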

49 citations


Journal ArticleDOI
TL;DR: It will be shown that the modeling choice of kernel density functions plays perhaps the most impactful role in determining the posterior contraction rates in the misspecified situations.
Abstract: We study posterior contraction behaviors for parameters of interest in the context of Bayesian mixture modeling, where the number of mixing components is unknown while the model itself may or may not be correctly specified. Two representative types of prior specification will be considered: one requires explicitly a prior distribution on the number of mixture components, while the other places a nonparametric prior on the space of mixing distributions. The former is shown to yield an optimal rate of posterior contraction on the model parameters under minimal conditions, while the latter can be utilized to consistently recover the unknown number of mixture components, with the help of a fast probabilistic post-processing procedure. We then turn the study of these Bayesian procedures to the realistic settings of model misspecification. It will be shown that the modeling choice of kernel density functions plays perhaps the most impactful role in determining the posterior contraction rates in the misspecified situations. Drawing on concrete posterior contraction rates established in this paper, we wish to highlight some aspects of the interesting tradeoffs between model expressiveness and interpretability that a statistical modeler must negotiate in the rich world of mixture modeling.

36 citations


Journal ArticleDOI
TL;DR: In this article, the authors used satellite-recorded nighttime lights in a measurement error model framework to estimate the relationship between nighttime light growth and national accounts growth, as well as the nonparametric distribution of errors in both measures, and obtained three key results: (i) the elasticity of nighttime lights to GDP is about 1.3; (ii) national accounts GDP growth measures are less precise for low and middle income countries, and nighttime lights can play a big role in improving such measures; and (iii) their new measure of GDP growth, based on the optimal combination

Journal ArticleDOI
TL;DR: In this paper, the authors extended the recent work of Vannman and Albing (2007) regarding the new family of quantile-based process capability indices (qPCI) CMA(τ, v).
Abstract: This article extends the recent work of Vannman and Albing (2007) regarding the new family of quantile-based process capability indices (qPCI) CMA(τ, v). We develop both asymptotic parametric and nonparametric confidence limits and testing procedures for CMA(τ, v). A kernel density estimator of the process was proposed to find a consistent estimator of the variance of the nonparametric consistent estimator of CMA(τ, v). Therefore, the proposed procedure is ready for practical implementation in any process. Illustrative examples are also provided to show the steps of implementing the proposed methods directly on real-life problems. We also present a simulation study on the sample size required for using asymptotic results.

Journal ArticleDOI
TL;DR: This work models anomalies as persistent outliers and proposes to detect them via a cumulative sum-like algorithm; an asymptotic lower bound and an asymptotic approximation for the average false alarm period of the proposed algorithm are provided.
Abstract: Timely detection of abrupt anomalies is crucial for real-time monitoring and security of modern systems producing high-dimensional data. With this goal, we propose effective and scalable algorithms. Proposed algorithms are nonparametric as both the nominal and anomalous multivariate data distributions are assumed unknown. We extract useful univariate summary statistics and perform anomaly detection in a single-dimensional space. We model anomalies as persistent outliers and propose to detect them via a cumulative sum-like algorithm. In case the observed data have a low intrinsic dimensionality, we find a submanifold in which the nominal data are embedded and evaluate whether the sequentially acquired data persistently deviate from the nominal submanifold. Further, in the general case, we determine an acceptance region for nominal data via Geometric Entropy Minimization and evaluate whether the sequentially observed data persistently fall outside the acceptance region. We provide an asymptotic lower bound and an asymptotic approximation for the average false alarm period of the proposed algorithm. Moreover, we provide a sufficient condition to asymptotically guarantee that the decision statistic of the proposed algorithm does not diverge in the absence of anomalies. Experiments illustrate the effectiveness of the proposed schemes in quick and accurate anomaly detection in high-dimensional settings.
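A minimal sketch of the general idea: run a CUSUM-like test on a univariate outlyingness score so that isolated outliers are forgotten but persistent ones accumulate. The score, drift, and threshold below are illustrative placeholders, not the paper's choices.

```python
# Illustrative CUSUM-like detector for persistent outliers in a univariate
# summary statistic; the score, drift, and threshold are placeholders.
import numpy as np

def cusum_alarm_time(scores, drift=0.5, threshold=10.0):
    """Return the first index at which accumulated evidence of persistent
    outliers exceeds `threshold`, or None if no alarm is raised."""
    g = 0.0
    for t, s in enumerate(scores):
        # `s` is an outlyingness score, e.g., distance of the new point from
        # the nominal submanifold or acceptance region.
        g = max(0.0, g + s - drift)   # accumulate evidence, forget isolated outliers
        if g > threshold:
            return t
    return None

rng = np.random.default_rng(0)
nominal = rng.exponential(scale=0.3, size=500)    # nominal outlyingness scores
anomalous = rng.exponential(scale=1.5, size=200)  # persistently larger scores
print(cusum_alarm_time(np.concatenate([nominal, anomalous])))
```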

Journal ArticleDOI
TL;DR: In this paper, the authors present a review of the existing online methods for testing the two hypotheses of randomness and exchangeability, focusing on the online mode of testing, when the observations arrive sequentially.
Abstract: The hypothesis of randomness is fundamental in statistical machine learning and in many areas of nonparametric statistics; it says that the observations are assumed to be independent and coming from the same unknown probability distribution. This hypothesis is close, in certain respects, to the hypothesis of exchangeability, which postulates that the distribution of the observations is invariant with respect to their permutations. This paper reviews known methods of testing the two hypotheses concentrating on the online mode of testing, when the observations arrive sequentially. All known online methods for testing these hypotheses are based on conformal martingales, which are defined and studied in detail. An important variety of online testing is change detection, where the use of conformal martingales leads to conformal versions of the CUSUM and Shiryaev–Roberts procedures; these versions work in the nonparametric setting where the data is assumed IID according to a completely unknown distribution before the change. The paper emphasizes conceptual and practical aspects and states two kinds of results. Validity results limit the probability of a false alarm or, in the case of change detection, the frequency of false alarms for various procedures based on conformal martingales. Efficiency results establish connections between randomness, exchangeability, and conformal martingales.
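A minimal sketch of a conformal test martingale for online testing of exchangeability; the nonconformity score (distance to the running mean) and the power betting function are simple illustrative choices, not the paper's specific procedures.

```python
# Illustrative conformal test martingale: large final values are evidence
# against exchangeability of the observed stream.
import numpy as np

def conformal_martingale(xs, eps=0.2, rng=None):
    rng = rng or np.random.default_rng(0)
    log_m, history = 0.0, []
    for x in xs:
        history.append(x)
        a = np.abs(np.array(history) - np.mean(history))   # nonconformity scores
        a_n = a[-1]
        theta = rng.uniform()
        # smoothed conformal p-value of the newest observation
        p = (np.sum(a > a_n) + theta * np.sum(a == a_n)) / len(a)
        log_m += np.log(eps * p ** (eps - 1.0))             # power betting function
    return np.exp(log_m)

rng = np.random.default_rng(1)
iid = rng.normal(size=300)                                              # exchangeable
shifted = np.concatenate([rng.normal(size=150), rng.normal(3.0, 1.0, 150)])  # change
print(conformal_martingale(iid, rng=rng), conformal_martingale(shifted, rng=rng))
```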

Journal ArticleDOI
TL;DR: In this article, entropy balancing is extended to continuous exposure settings, and population dose-response curves are estimated by combining nonparametric estimation with entropy balancing weights, focusing on factors that would be important to applied researchers in medical or health services research.
Abstract: Weighted estimators are commonly used for estimating exposure effects in observational settings to establish causal relations. These estimators have a long history of development when the exposure of interest is binary and where the weights are typically functions of an estimated propensity score. Recent developments in optimization-based estimators for constructing weights in binary exposure settings, such as those based on entropy balancing, have shown more promise in estimating treatment effects than those methods that focus on the direct estimation of the propensity score using likelihood-based methods. This paper explores recent developments of entropy balancing methods to continuous exposure settings and the estimation of population dose-response curves using nonparametric estimation combined with entropy balancing weights, focusing on factors that would be important to applied researchers in medical or health services research. The methods developed here are applied to data from a study assessing the effect of non-randomized components of an evidence-based substance use treatment program on emotional and substance use clinical outcomes.
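A minimal sketch of entropy balancing in its generic moment-balancing form, solved via the standard log-sum-exp dual: the weights stay as close to uniform as possible (in KL divergence) while forcing weighted covariate moments to match specified targets. The covariates, targets, and function names below are placeholders; the continuous-exposure version balances cross-moments of exposure and covariates, which is not shown here.

```python
# Illustrative entropy balancing: weights close to uniform (in KL) whose
# weighted moments match given targets, obtained from the unconstrained dual.
import numpy as np
from scipy.optimize import minimize

def entropy_balance(C, target):
    """C: (n, k) matrix of moment functions; target: length-k target moments.
    Returns nonnegative weights summing to 1 with C.T @ w ~= target."""
    D = C - target                              # centered moment functions

    def dual(lam):                              # log-sum-exp dual objective
        z = -D @ lam
        m = z.max()                             # numerical stabilization
        return m + np.log(np.exp(z - m).sum())

    res = minimize(dual, np.zeros(C.shape[1]), method="BFGS")
    z = -D @ res.x
    w = np.exp(z - z.max())
    return w / w.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                   # placeholder covariates
w = entropy_balance(X, target=np.array([0.2, 0.0, -0.1]))
print(w @ X)                                    # approximately the target moments
```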

Journal ArticleDOI
23 Mar 2021
TL;DR: The percentile bootstrap is the Swiss Army knife of statistics as mentioned in this paper, which is a nonparametric method based on data-driven simulations and can be applied to many statistical problems, as a substitute to sta...
Abstract: The percentile bootstrap is the Swiss Army knife of statistics: It is a nonparametric method based on data-driven simulations. It can be applied to many statistical problems, as a substitute to sta...
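A minimal sketch of the percentile bootstrap for a generic statistic (the statistic, number of resamples, and coverage level below are illustrative defaults):

```python
# Illustrative percentile bootstrap confidence interval for any statistic.
import numpy as np

def percentile_bootstrap_ci(x, stat=np.median, n_boot=5000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    boot = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(n_boot)])
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

x = np.random.default_rng(1).lognormal(size=60)   # skewed sample
print(percentile_bootstrap_ci(x))                 # 95% CI for the median
```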

Journal ArticleDOI
TL;DR: In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response, in other words, to gauge the variable imp... as mentioned in this paper.
Abstract: In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response — in other words, to gauge the variable imp...

Journal ArticleDOI
Geyu Zhou, Hongyu Zhao
TL;DR: This work develops a summary statistics-based nonparametric method that does not rely on validation datasets to tune parameters and is adaptive to different genetic architectures, statistically robust, and computationally efficient.
Abstract: Genetic prediction of complex traits has great promise for disease prevention, monitoring, and treatment. The development of accurate risk prediction models is hindered by the wide diversity of genetic architecture across different traits, limited access to individual level data for training and parameter tuning, and the demand for computational resources. To overcome the limitations of most existing methods, which make explicit assumptions on the underlying genetic architecture and need a separate validation data set for parameter tuning, we develop a summary statistics-based nonparametric method that does not rely on validation datasets to tune parameters. In our implementation, we refine the commonly used likelihood assumption to deal with the discrepancy between the summary statistics and the external reference panel. We also leverage the block structure of the reference linkage disequilibrium matrix for implementation of a parallel algorithm. Through simulations and applications to twelve traits, we show that our method is adaptive to different genetic architectures, statistically robust, and computationally efficient. Our method is available at https://github.com/eldronzhou/SDPR.

Journal ArticleDOI
TL;DR: In this article, a general framework for distribution-free nonparametric testing in multi-dimensions, based on a notion of multivariate ranks defined using the theory of measure transportation, is proposed.
Abstract: In this paper, we propose a general framework for distribution-free nonparametric testing in multi-dimensions, based on a notion of multivariate ranks defined using the theory of measure transportation. Unlike other existing proposals in the literature, these multivariate ranks share a number of useful properties with the usual one-dimensional ranks; most importantly, these ranks are distribution-free. This crucial observation allows us to design nonparametric tests that are exactly distribution-free under the null hypothesis. We demonstrate the applicability of this approach by constructing exact distribution-free tests for two classical nonparametric problems: (I) testing for mutual independence between random vectors, and (II) testing for the equality of multivariate distributions. In particular, we propose (multivariate) rank versions of distance covariance (Szekely et al. [117]) and energy statistic (Szekely and Rizzo [116]) for testing scenarios (I) and (II) respectively. In both these problems we derive the asymptotic null distribution of the proposed test statistics. We further show that our tests are consistent against all fixed alternatives. Moreover, the proposed tests are computationally feasible and are well-defined under minimal assumptions on the underlying distributions (e.g., they do not need any moment assumptions). We also demonstrate the efficacy of these procedures via extensive simulations. In the process of analyzing the theoretical properties of our procedures, we end up proving some new results in the theory of measure transportation and in the limit theory of permutation statistics using Stein's method for exchangeable pairs, which may be of independent interest.
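A minimal sketch of the empirical construction behind measure-transportation ranks (the reference grid and cost are illustrative choices): assign the n data points one-to-one to n fixed "rank" points, such as a uniform grid on [0,1]^d, by minimizing the total squared distance; the grid point assigned to an observation serves as its multivariate rank.

```python
# Illustrative empirical multivariate ranks via optimal assignment of data
# points to a fixed uniform grid on [0,1]^2 (the grid plays the role of the
# reference distribution in measure transportation).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def multivariate_ranks(X, grid):
    cost = cdist(X, grid, metric="sqeuclidean")   # transport cost matrix
    rows, cols = linear_sum_assignment(cost)      # optimal one-to-one coupling
    ranks = np.empty_like(grid)
    ranks[rows] = grid[cols]                      # rank of observation i
    return ranks

rng = np.random.default_rng(0)
n_side = 10                                       # 10 x 10 = 100 observations
X = rng.multivariate_normal([0, 0], [[1, .5], [.5, 1]], size=n_side**2)
g = (np.arange(n_side) + 0.5) / n_side
grid = np.array([(a, b) for a in g for b in g])   # uniform grid on [0,1]^2
print(multivariate_ranks(X, grid)[:5])            # distribution-free "ranks"
```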

Journal ArticleDOI
TL;DR: In this paper, a Hidden Semi-Markov Model with Hierarchical prior was proposed to detect brain activity under different flight tasks and a dynamic student mixture model was used to detect the outlier of emission probability of HSMM.
Abstract: The evaluation of pilot brain activity is very important for flight safety. This study proposes a Hidden semi-Markov Model with Hierarchical prior to detect brain activity under different flight tasks. A dynamic student mixture model is proposed to detect the outlier of emission probability of HSMM. Instantaneous spectrum features are also extracted from EEG signals. Compared with other latent variable models, the proposed model shows excellent performance for the automatic inference of brain cognitive activity of pilots. The results indicate that the consideration of hierarchical model and the emission probability with t mixture model improves the recognition performance for Pilots' fatigue cognitive level.

Journal ArticleDOI
TL;DR: In this paper, the coefficient function of the leading differential operator is estimated from observations of a linear stochastic partial differential equation (SPDE) based on continuous time observations which are localised in space.
Abstract: The coefficient function of the leading differential operator is estimated from observations of a linear stochastic partial differential equation (SPDE). The estimation is based on continuous time observations which are localised in space. For the asymptotic regime with fixed time horizon and with the spatial resolution of the observations tending to zero, we provide rate-optimal estimators and establish scaling limits of the deterministic PDE and of the SPDE on growing domains. The estimators are robust to lower order perturbations of the underlying differential operator and achieve the parametric rate even in the nonparametric setup with a spatially varying coefficient. A numerical example illustrates the main results.

Journal ArticleDOI
Riko Kelter
TL;DR: The underlying assumptions, models and their implications for practical research of recently proposed Bayesian two-sample tests are explored and contrasted with the frequentist solutions, and an extensive simulation study demonstrates that the proposedBayesian tests achieve better type I error control at slightly increased type II error rates.
Abstract: Testing for differences between two groups is among the most frequently carried out statistical methods in empirical research. The traditional frequentist approach is to make use of null hypothesis significance tests which use p values to reject a null hypothesis. Recently, a lot of research has emerged which proposes Bayesian versions of the most common parametric and nonparametric frequentist two-sample tests. These proposals include Student’s two-sample t-test and its nonparametric counterpart, the Mann–Whitney U test. In this paper, the underlying assumptions, models and their implications for practical research of recently proposed Bayesian two-sample tests are explored and contrasted with the frequentist solutions. An extensive simulation study is provided, the results of which demonstrate that the proposed Bayesian tests achieve better type I error control at slightly increased type II error rates. These results are important, because balancing the type I and II errors is a crucial goal in a variety of research, and shifting towards the Bayesian two-sample tests while simultaneously increasing the sample size yields smaller type I error rates. What is more, the results highlight that the differences in type II error rates between frequentist and Bayesian two-sample tests depend on the magnitude of the underlying effect.

Journal ArticleDOI
TL;DR: Different semiparametric and nonparametric methods are proposed to estimate the first-passage time (FPT) distribution of dependent bivariate degradation data; the saddlepoint approximation and bootstrap methods are used to estimate the marginal FPT distributions empirically, and the empirical copula is used to estimate the joint distribution of the two dependent degradation processes nonparametrically.

Report SeriesDOI
TL;DR: This paper develops multivariate time series models using Bayesian additive regression trees that posit nonlinear relationships among macroeconomic variables, their lags, and possibly the lags of the errors.
Abstract: We develop novel multivariate time series models using Bayesian additive regression trees that posit nonlinear relationships among macroeconomic variables, their lags, and possibly the lags of the errors. The variance of the errors can be stable, driven by stochastic volatility (SV), or follow a novel nonparametric specification. Estimation is carried out using scalable Markov chain Monte Carlo estimation algorithms for each specification. We evaluate the real-time density and tail forecasting performance of the various models for a set of US macroeconomic and financial indicators. Our results suggest that using nonparametric models generally leads to improved forecast accuracy. In particular, when interest centers on the tails of the posterior predictive, flexible models improve upon standard VAR models with SV. Another key finding is that if we allow for nonlinearities in the conditional mean, allowing for heteroskedasticity becomes less important. A scenario analysis reveals highly nonlinear relations between the predictive distribution and financial conditions.

Journal ArticleDOI
TL;DR: In this paper, the nonparametric estimation and specification testing for partially linear functional-coefficient dynamic panel data models is studied, where the effects of some covariates on the dependent variable are investigated.
Abstract: We study the nonparametric estimation and specification testing for partially linear functional-coefficient dynamic panel data models, where the effects of some covariates on the dependent variable...

Journal ArticleDOI
TL;DR: The novelty of the proposed technique lies in the fact that it inherently helps in identifying the component(s) responsible for the signal, which is not straightforward with the traditional schemes for surveillance of a bivariate process.

Proceedings Article
18 Mar 2021
TL;DR: DoSE as discussed by the authors is a density of states estimator for out-of-distribution (OOD) detection, which is based on the statistical physics notion of "density of states" and utilizes the "probability of the model probability," or the frequency of any reasonable statistic.
Abstract: Perhaps surprisingly, recent studies have shown probabilistic model likelihoods have poor specificity for out-of-distribution (OOD) detection and often assign higher likelihoods to OOD data than in-distribution data. To ameliorate this issue we propose DoSE, the density of states estimator. Drawing on the statistical physics notion of "density of states," the DoSE decision rule avoids direct comparison of model probabilities, and instead utilizes the "probability of the model probability," or indeed the frequency of any reasonable statistic. The frequency is calculated using nonparametric density estimators (e.g., KDE and one-class SVM) which measure the typicality of various model statistics given the training data and from which we can flag test points with low typicality as anomalous. Unlike many other methods, DoSE requires neither labeled data nor OOD examples. DoSE is modular and can be trivially applied to any existing, trained model. We demonstrate DoSE's state-of-the-art performance against other unsupervised OOD detectors on previously established "hard" benchmarks.
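A minimal sketch of the general recipe (the summary statistic, density estimator, and data below are placeholder choices, not the paper's exact configuration): fit a density estimator to a model statistic, here per-example log-likelihoods, computed on training data, and flag test points whose statistic has low estimated density.

```python
# Illustrative density-of-states-style OOD scoring: estimate the density of a
# model statistic (per-example log-likelihood) on training data and flag test
# points whose statistic is atypical. Statistic and bandwidth are placeholders.
import numpy as np
from sklearn.neighbors import KernelDensity

def fit_typicality(train_stats, bandwidth=0.2):
    kde = KernelDensity(bandwidth=bandwidth)
    kde.fit(train_stats.reshape(-1, 1))
    return kde

def typicality_scores(kde, test_stats):
    # low score = the statistic itself is unlikely under the training data
    return kde.score_samples(test_stats.reshape(-1, 1))

rng = np.random.default_rng(0)
train_ll = rng.normal(-100.0, 5.0, size=2000)   # stand-in for model log-likelihoods
in_dist = rng.normal(-100.0, 5.0, size=5)
ood = rng.normal(-60.0, 5.0, size=5)            # higher likelihood can still be OOD
print(typicality_scores(fit_typicality(train_ll), np.concatenate([in_dist, ood])))
```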

Journal ArticleDOI
TL;DR: This work designs two types of data-driven pricing and ordering (DDPO) policies for the cases of nonparametric and parametric noise distributions, and demonstrates that the approach significantly outperforms the historical decisions made by the leading supermarket chain.
Abstract: We consider a retailer that sells a perishable product, making joint pricing and inventory ordering decisions over a finite time horizon of T periods with lost sales. Exploring a real-life data set from a leading supermarket chain, we identify several distinctive challenges faced by such a retailer that have not been jointly studied in the literature: the retailer does not have perfect information on (1) the demand-price relationship, (2) the demand noise distribution, (3) the inventory perishability rate, and (4) how the demand-price relationship changes over time. Furthermore, the demand noise distribution is nonparametric for some products but parametric for others. To tackle these challenges, we design two types of data-driven pricing and ordering (DDPO) policies for the cases of nonparametric and parametric noise distributions. Measuring performance by regret, i.e., the profit loss caused by not knowing (1)-(4), we prove that the T-period regret of our DDPO policies is of the order of T^{2/3}(log T)^{1/2} and T^{1/2} log T in the cases of nonparametric and parametric noise distributions, respectively. These are the best achievable growth rates of regret in these settings (up to logarithmic terms). Implementing our policies in the context of the aforementioned real-life data set, we show that our approach significantly outperforms the historical decisions made by the supermarket chain. Moreover, we characterize parameter regimes that quantify the relative significance of the changing environment and product perishability. Finally, we extend our model to allow for age-dependent perishability and demand censoring, and modify our policies to address these issues.