
Showing papers on "Nonparametric statistics published in 2003"


Journal ArticleDOI
TL;DR: The problems of some applications of correlation and regression methods to these studies are described, using recent examples from this literature, and the 95% limits of agreement approach and a similar, appropriate, regression technique are described.
Abstract: The study of measurement error, observer variation and agreement between different methods of measurement are frequent topics in the imaging literature. We describe the problems of some applications of correlation and regression methods to these studies, using recent examples from this literature. We use a simulated example to show how these problems and misinterpretations arise. We describe the 95% limits of agreement approach and a similar, appropriate, regression technique. We discuss the difference vs. mean plot, and the pitfalls of plotting difference against one variable only. We stress that these are questions of estimation, not significance tests, and show how confidence intervals can be found for these estimates.

1,254 citations
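The 95% limits of agreement approach described above reduces to a short computation: the bias is the mean of the paired differences, and the limits are bias ± 1.96 standard deviations of those differences. A minimal sketch (the paired readings are invented for illustration):

```python
import numpy as np

def limits_of_agreement(a, b, z=1.96):
    """Bland-Altman 95% limits of agreement between two methods.

    Returns (bias, lower, upper): the mean paired difference and
    bias +/- z sample standard deviations of the differences.
    """
    d = np.asarray(a, float) - np.asarray(b, float)
    bias = d.mean()
    sd = d.std(ddof=1)          # sample SD of the paired differences
    return bias, bias - z * sd, bias + z * sd

# Hypothetical paired readings from two imaging methods
a = [10.1, 9.8, 10.5, 10.0, 9.9, 10.3]
b = [10.0, 9.9, 10.2, 10.1, 9.7, 10.4]
bias, lower, upper = limits_of_agreement(a, b)
```

The difference-vs-mean plot then draws each difference against the pair average with horizontal lines at these three values; as the abstract stresses, these are estimates, so confidence intervals for the bias and limits should accompany them.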


Journal ArticleDOI
TL;DR: In this article, identification and estimation results are given for structural functions defined by nonparametric conditional moment restrictions, with sufficient identification conditions for exponential families and discrete variables and a consistent nonparametric two-stage least squares estimator.
Abstract: In econometrics there are many occasions where knowledge of the structural relationship among dependent variables is required to answer questions of interest. This paper gives identification and estimation results for nonparametric conditional moment restrictions. We characterize identification of structural functions as completeness of certain conditional distributions, and give sufficient identification conditions for exponential families and discrete variables. We also give a consistent, nonparametric estimator of the structural function. The estimator is nonparametric two-stage least squares based on series approximation, which overcomes an ill-posed inverse problem by placing bounds on integrals of higher-order derivatives.

852 citations
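The series-based nonparametric two-stage least squares idea can be sketched in a few lines: approximate the structural function with a small polynomial basis in the endogenous regressor, project that basis onto a richer basis in the instrument, then regress the outcome on the projections. This is an illustrative toy, not the paper's estimator — truncating the basis stands in for the paper's derivative bounds, and the data-generating process is invented:

```python
import numpy as np

def series_2sls(y, x, z, kx=3, kz=6):
    """Nonparametric 2SLS with polynomial series approximation.

    Estimates g in y = g(x) + e with E[e | z] = 0: project a
    polynomial basis in the endogenous x onto a richer basis in the
    instrument z (first stage), then regress y on the projections
    (second stage).  Truncating kx crudely regularizes the
    ill-posed inverse problem.
    """
    P = np.vander(x, kx, increasing=True)            # basis in x
    Q = np.vander(z, kz, increasing=True)            # instrument basis
    Phat = Q @ np.linalg.lstsq(Q, P, rcond=None)[0]  # first-stage fits
    beta = np.linalg.lstsq(Phat, y, rcond=None)[0]   # second stage
    return lambda t: np.vander(np.atleast_1d(t), kx, increasing=True) @ beta

rng = np.random.default_rng(0)
z = rng.normal(size=2000)
u = rng.normal(size=2000)                 # unobservable driving endogeneity
x = 0.8 * z + 0.5 * u                     # x is correlated with u
y = 1.0 + x + 0.5 * x**2 + u + 0.1 * rng.normal(size=2000)
ghat = series_2sls(y, x, z)               # naive OLS of y on x would be biased
```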


Journal ArticleDOI
TL;DR: It is shown that Bayesian posterior probabilities are significantly higher than corresponding nonparametric bootstrap frequencies for true clades, but also that erroneous conclusions will be made more often.
Abstract: Many empirical studies have revealed considerable differences between nonparametric bootstrapping and Bayesian posterior probabilities in terms of the support values for branches, despite claimed predictions about their approximate equivalence. We investigated this problem by simulating data, which were then analyzed by maximum likelihood bootstrapping and Bayesian phylogenetic analysis using identical models and reoptimization of parameter values. We show that Bayesian posterior probabilities are significantly higher than corresponding nonparametric bootstrap frequencies for true clades, but also that erroneous conclusions will be made more often. These errors are strongly accentuated when the models used for analyses are underparameterized. When data are analyzed under the correct model, nonparametric bootstrapping is conservative. Bayesian posterior probabilities are also conservative in this respect, but less so.

620 citations


Proceedings ArticleDOI
18 Jun 2003
TL;DR: A nonparametric belief propagation (NBP) algorithm is developed that extends particle filtering methods to the more general vision problems that graphical models can describe, and is applied to infer component interrelationships in a parts-based face model, allowing location and reconstruction of occluded features.
Abstract: In many applications of graphical models arising in computer vision, the hidden variables of interest are most naturally specified by continuous, non-Gaussian distributions. There exist inference algorithms for discrete approximations to these continuous distributions, but for the high-dimensional variables typically of interest, discrete inference becomes infeasible. Stochastic methods such as particle filters provide an appealing alternative. However, existing techniques fail to exploit the rich structure of the graphical models describing many vision problems. Drawing on ideas from regularized particle filters and belief propagation (BP), this paper develops a nonparametric belief propagation (NBP) algorithm applicable to general graphs. Each NBP iteration uses an efficient sampling procedure to update kernel-based approximations to the true, continuous likelihoods. The algorithm can accommodate an extremely broad class of potential functions, including nonparametric representations. Thus, NBP extends particle filtering methods to the more general vision problems that graphical models can describe. We apply the NBP algorithm to infer component interrelationships in a parts-based face model, allowing location and reconstruction of occluded features.

513 citations


Journal ArticleDOI
TL;DR: In this article, a shape-invariant Engel curve system with endogenous total expenditure was studied, in which the shape-invariant specification involves a common shift parameter for each demographic group in a pooled system of nonparametric Engel curves.
Abstract: This paper studies a shape-invariant Engel curve system with endogenous total expenditure, in which the shape-invariant specification involves a common shift parameter for each demographic group in a pooled system of nonparametric Engel curves. We focus on the identification and estimation of both the nonparametric shapes of the Engel curves and the parametric specification of the demographic scaling parameters. The identification condition relates to the bounded completeness and the estimation procedure applies the sieve minimum distance estimation of conditional moment restrictions, allowing for endogeneity. We establish a new root mean squared convergence rate for the nonparametric instrumental variable regression when the endogenous regressor could have unbounded support. Root-n asymptotic normality and semiparametric efficiency of the parametric components are also given under a set of "low-level" sufficient conditions. Our empirical application using the U.K. Family Expenditure Survey shows the importance of adjusting for endogeneity in terms of both the nonparametric curvatures and the demographic parameters of systems of Engel curves.

462 citations


Journal ArticleDOI
TL;DR: Sample selection models provide an important way of accounting for economic decisions that combine discrete and continuous choices and of correcting for nonrandom sampling; nonparametric estimators for these models are developed that can be used for estimating shapes and important economic quantities, as in standard nonparametric regression.
Abstract: Sample selection models provide an important way of accounting for economic decisions that combine discrete and continuous choices and of correcting for nonrandom sampling. Nonparametric estimators for these models are developed in this paper. These can be used for estimating shapes and important economic quantities, as in standard nonparametric regression. Endogeneity of regressors of interest is allowed for. Series estimators for these models are developed, which are useful for imposing additivity restrictions that arise from selection corrections. Convergence rates and asymptotic normality results are derived. An application to returns to schooling among Australian young females is given.

435 citations


Book
06 Jan 2003
TL;DR: This book covers probability, summarizing data, sampling distributions and confidence intervals, hypothesis testing, least squares regression and Pearson's correlation, bootstrap methods, ANOVA, comparisons of independent and dependent groups, detection of outliers in multivariate data, and rank-based and nonparametric methods.
Abstract: Introduction; Probability and Related Concepts; Summarizing Data; Sampling Distributions and Confidence Intervals; Hypothesis Testing; Least Squares Regression and Pearson's Correlation; Basic Bootstrap Methods; Comparing Two Independent Groups; One-Way ANOVA; Two-Way ANOVA; Comparing Dependent Groups; Multiple Comparisons; Detecting Outliers in Multivariate Data; More Regression Methods; Rank-Based and Nonparametric Methods

427 citations


Journal ArticleDOI
TL;DR: In this article, two nonparametric approaches, based on kernel methods and orthogonal series, are proposed to estimate regression functions in the presence of instrumental variables, and the authors derive optimal convergence rates, and show that they are attained by particular estimators.
Abstract: We suggest two nonparametric approaches, based on kernel methods and orthogonal series, to estimating regression functions in the presence of instrumental variables. For the first time in this class of problems, we derive optimal convergence rates, and show that they are attained by particular estimators. In the presence of instrumental variables the relation that identifies the regression function also defines an ill-posed inverse problem, the "difficulty" of which depends on eigenvalues of a certain integral operator which is determined by the joint density of endogenous and instrumental variables. We delineate the role played by problem difficulty in determining both the optimal convergence rate and the appropriate choice of smoothing parameter.

423 citations


Journal ArticleDOI
TL;DR: Existing nonparametric imputation methods—both for the additive and the multiplicative approach—are reviewed, essential properties of the multiplicative method are given, and for missing values a generalization of the multiplicative approach is proposed.
Abstract: The statistical analysis of compositional data based on logratios of parts is not suitable when zeros are present in a data set. Nevertheless, if there is interest in using this modeling approach, several strategies published in the specialized literature can be used. In particular, substitution or imputation strategies are available for rounded zeros. In this paper, existing nonparametric imputation methods—both for the additive and the multiplicative approach—are reviewed and essential properties of the latter are given. For missing values a generalization of the multiplicative approach is proposed.

414 citations
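The multiplicative strategy for rounded zeros is simple to state: impute each zero with a small value δ and shrink the nonzero parts multiplicatively so the composition still closes to one. A minimal sketch — in practice δ would reflect each part's detection limit rather than a single constant:

```python
import numpy as np

def multiplicative_replacement(x, delta=1e-3):
    """Multiplicative imputation of rounded zeros in a composition.

    Zeros become delta; nonzero parts are rescaled by the common
    factor (1 - #zeros * delta), so the parts still sum to one.
    """
    x = np.asarray(x, float)
    zeros = x == 0
    return np.where(zeros, delta, x * (1.0 - delta * zeros.sum()))

comp = [0.5, 0.3, 0.2, 0.0]               # closed composition, one rounded zero
filled = multiplicative_replacement(comp)
```

The multiplicative form matters because it leaves the ratios between nonzero parts unchanged, which is exactly what logratio analysis operates on.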


Journal ArticleDOI
TL;DR: In this paper, the authors present estimators for nonparametric functions that are nonadditive in unobservable random terms, where the distributions of the unobservable random terms are assumed to be unknown.
Abstract: We present estimators for nonparametric functions that are nonadditive in unobservable random terms. The distributions of the unobservable random terms are assumed to be unknown. We show that when a nonadditive, nonparametric function is strictly monotone in an unobservable random term, and it satisfies some other properties that may be implied by economic theory, such as homogeneity of degree one or separability, the function and the distribution of the unobservable random term are identified. We also present convenient normalizations, to use when the properties of the function, other than strict monotonicity in the unobservable random term, are unknown. The estimators for the nonparametric function and for the distribution of the unobservable random term are shown to be consistent and asymptotically normal. We extend the results to functions that depend on a multivariate random term. The results of a limited simulation study are presented.

407 citations


Book
01 Mar 2003
TL;DR: This book surveys the history, mathematical tools, and applications of stress-strength models, and presents general estimation procedures, parametric point estimation and statistical inference, nonparametric methods, special cases and generalizations, and worked examples with details on applications.
Abstract: Stress-strength models: history, mathematical tools and survey of applications; theory and general estimation procedures; parametric point estimation; parametric statistical inference; nonparametric methods; special cases and generalizations; examples and details on applications

Journal ArticleDOI
TL;DR: This paper summarizes the main results of Cazals et al. (2002) on robust nonparametric frontier estimators, proposes a methodology implementing the tool, and shows how it can be used for detecting outliers when using the classical DEA/FDH estimators or any parametric technique.
Abstract: In frontier analysis, most of the nonparametric approaches (DEA, FDH) are based on envelopment ideas which suppose that with probability one, all the observed units belong to the attainable set. In these "deterministic" frontier models, statistical theory is now mostly available (Simar and Wilson, 2000a). In the presence of superefficient outliers, envelopment estimators can behave dramatically since they are very sensitive to extreme observations. Some recent results from Cazals et al. (2002) on robust nonparametric frontier estimators may be used in order to detect outliers by defining a new DEA/FDH "deterministic" type estimator which does not envelop all the data points and so is more robust to extreme data points. In this paper, we summarize the main results of Cazals et al. (2002) and we show how this tool can be used for detecting outliers when using the classical DEA/FDH estimators or any parametric technique. We propose a methodology implementing the tool and we illustrate through some numerical examples with simulated and real data. The method should be used in a first step, as an exploratory data analysis, before using any frontier estimation.
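The plain FDH envelopment estimator below shows why a single superefficient outlier is so influential: every score is benchmarked against the observed dominating units, so one extreme point can re-benchmark many others. Only the classical FDH step is sketched here, with invented data; the robust order-m estimator of Cazals et al. instead averages such benchmarks over random subsamples.

```python
import numpy as np

def fdh_input_efficiency(X, Y):
    """Input-oriented FDH efficiency scores (envelopment estimator).

    X: (n, p) inputs, Y: (n, q) outputs.  Unit i's score is the
    largest proportional input contraction still dominated by some
    observed unit producing at least as much of every output.
    """
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    theta = np.empty(len(X))
    for i in range(len(X)):
        dom = np.all(Y >= Y[i], axis=1)            # units matching i's outputs
        theta[i] = np.max(X[dom] / X[i], axis=1).min()
    return theta

X = [[1.0], [2.0], [4.0]]   # single input (toy data)
Y = [[1.0], [1.0], [2.0]]   # single output; unit 1 is dominated by unit 0
theta = fdh_input_efficiency(X, Y)
```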

Journal ArticleDOI
TL;DR: It is argued that methods for implementing the bootstrap with time‐series data are not as well understood as methods for data that are independent random samples, and there is a considerable need for further research.
Abstract: The chapter gives a review of the literature on bootstrap methods for time series data. It describes various possibilities on how the bootstrap method, initially introduced for independent random variables, can be extended to a wide range of dependent variables in discrete time, including parametric or nonparametric time series models, autoregressive and Markov processes, long range dependent time series and nonlinear time series, among others. Relevant bootstrap approaches, namely the intuitive residual bootstrap and Markovian bootstrap methods, the prominent block bootstrap methods as well as frequency domain resampling procedures, are described. Further, conditions for consistent approximations of distributions of parameters of interest by these methods are presented. The presentation is deliberately kept non-technical in order to allow for an easy understanding of the topic, indicating which bootstrap scheme is advantageous under a specific dependence situation and for a given class of parameters of interest. Moreover, the chapter contains an extensive list of relevant references for bootstrap methods for time series.
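Among the block schemes reviewed above, the moving-block bootstrap is the easiest to sketch: resample overlapping blocks of consecutive observations so that short-range dependence inside each block survives into the pseudo-series. A minimal version for the standard error of a time-series mean (data and tuning constants are invented):

```python
import numpy as np

def block_bootstrap_se(x, block=20, reps=2000, seed=0):
    """Moving-block bootstrap standard error of a time-series mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    n = len(x)
    k = -(-n // block)                                 # blocks per replicate
    starts = rng.integers(0, n - block + 1, size=(reps, k))
    idx = (starts[:, :, None] + np.arange(block)).reshape(reps, -1)[:, :n]
    return x[idx].mean(axis=1).std(ddof=1)

# AR(1) series: positive dependence inflates the variance of the mean,
# which an iid-style standard error misses entirely.
rng = np.random.default_rng(1)
e = rng.normal(size=1000)
x = np.empty(1000)
x[0] = e[0]
for t in range(1, 1000):
    x[t] = 0.7 * x[t - 1] + e[t]
se_block = block_bootstrap_se(x)
se_iid = x.std(ddof=1) / np.sqrt(len(x))               # ignores dependence
```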

Journal ArticleDOI
TL;DR: The regression framework is described, which is used to compare two prostate‐specific antigen biomarkers and to evaluate the dependence of biomarker accuracy on the time prior to clinical diagnosis of prostate cancer.
Abstract: Accurate diagnosis of disease is a critical part of health care. New diagnostic and screening tests must be evaluated based on their abilities to discriminate diseased from nondiseased states. The partial area under the receiver operating characteristic (ROC) curve is a measure of diagnostic test accuracy. We present an interpretation of the partial area under the curve (AUC), which gives rise to a nonparametric estimator. This estimator is more robust than existing estimators, which make parametric assumptions. We show that the robustness is gained with only a moderate loss in efficiency. We describe a regression modeling framework for making inference about covariate effects on the partial AUC. Such models can refine knowledge about test accuracy. Model parameters can be estimated using binary regression methods. We use the regression framework to compare two prostate-specific antigen biomarkers and to evaluate the dependence of biomarker accuracy on the time prior to clinical diagnosis of prostate cancer.
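A crude empirical counterpart to the nonparametric partial-AUC estimator can be written directly from its pair-counting interpretation; this is only a sketch (the quantile cutoff handles ties approximately, and it omits the paper's regression modeling):

```python
import numpy as np

def partial_auc(diseased, healthy, max_fpr=0.2):
    """Empirical partial AUC over false-positive rates in [0, max_fpr].

    Counts diseased/nondiseased score pairs in which the diseased
    score is higher (ties count half), restricted to nondiseased
    scores in the upper max_fpr tail.  The value lies in
    [0, max_fpr]; max_fpr itself means perfect discrimination there.
    """
    y = np.asarray(diseased, float)[:, None]
    x = np.asarray(healthy, float)
    q = np.quantile(x, 1.0 - max_fpr)   # score cutoff giving FPR = max_fpr
    xr = x[x >= q][None, :]             # the allowed false positives
    wins = (y > xr).sum() + 0.5 * (y == xr).sum()
    return wins / (y.size * x.size)

# Perfectly separated (made-up) biomarker scores
pauc = partial_auc(np.arange(10) + 100.0, np.arange(10.0))
```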

Journal ArticleDOI
TL;DR: In this article, a method to constrain the values of the first and second derivatives of nonparametric local polynomial estimators was developed to estimate the state price density, or risk-neutral density, implicit in the market prices of options.

Journal ArticleDOI
TL;DR: A new nonparametric tool for studying the relationship between a curve, considered as a functional predictor, and a categorical response is proposed and its practical performance is pointed out by means of a simulation study.

Journal ArticleDOI
TL;DR: In this article, a dependence measure that characterises dependence at the bivariate level, for all pairs and all higher orders up to and including the dimension of the variable, is presented, and necessary and sufficient conditions are given for subsets of dependence measures to be self-consistent.
Abstract: We present properties of a dependence measure that arises in the study of extreme values in multivariate and spatial problems. For multivariate problems the dependence measure characterises dependence at the bivariate level, for all pairs and all higher orders up to and including the dimension of the variable. Necessary and sufficient conditions are given for subsets of dependence measures to be self‐consistent, that is to guarantee the existence of a distribution with such a subset of values for the dependence measure. For pairwise dependence, these conditions are given in terms of positive semidefinite matrices and non‐differentiable, positive definite functions. We construct new nonparametric estimators for the dependence measure which, unlike all naive nonparametric estimators, impose these self‐consistency properties. As the new estimators provide an improvement on the naive methods, both in terms of the inferential and interpretability properties, their use in exploratory extreme value analyses should aid the identification of appropriate dependence models. The methods are illustrated through an analysis of simulated multivariate data, which shows that a lack of self‐consistency is frequently a problem with the existing estimators, and by a spatial analysis of daily rainfall extremes in south‐west England, which finds a smooth decay in extremal dependence with distance.


Posted Content
01 Jan 2003
TL;DR: In this article, a collection of techniques for analyzing nonparametric and semiparametric regression models is provided, including simple goodness of fit tests and residual regression tests, which can be used to test hypotheses such as parametric and semi-parametric specifications, significance, monotonicity and additive separability.
Abstract: This book provides an accessible collection of techniques for analyzing nonparametric and semiparametric regression models. Worked examples include estimation of Engel curves and equivalence scales, scale economies, semiparametric Cobb-Douglas, translog and CES cost functions, household gasoline consumption, hedonic housing prices, option prices and state price density estimation. The book should be of interest to a broad range of economists including those working in industrial organization, labor, development, urban, energy and financial economics. A variety of testing procedures are covered including simple goodness of fit tests and residual regression tests. These procedures can be used to test hypotheses such as parametric and semiparametric specifications, significance, monotonicity and additive separability. Other topics include endogeneity of parametric and nonparametric effects, as well as heteroskedasticity and autocorrelation in the residuals. Bootstrap procedures are provided.

Journal ArticleDOI
TL;DR: In this article, the authors apply revealed preference theory to the nonparametric statistical analysis of consumer demand and derive tightest bounds on indifference surfaces and welfare measures using an algorithm for which revealed preference conditions are shown to guarantee convergence.
Abstract: This paper applies revealed preference theory to the nonparametric statistical analysis of consumer demand. Knowledge of expansion paths is shown to improve the power of nonparametric tests of revealed preference. The tightest bounds on indifference surfaces and welfare measures are derived using an algorithm for which revealed preference conditions are shown to guarantee convergence. Nonparametric Engel curves are used to estimate expansion paths and provide a stochastic structure within which to examine the consistency of household level data and revealed preference theory. An application is made to a long time series of repeated cross-sections from the Family Expenditure Survey for Britain. The consistency of these data with revealed preference theory is examined. For periods of consistency with revealed preference, tight bounds are placed on true cost of living indices.
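Computationally, the revealed preference machinery behind these tests is compact: build the direct revealed-preference relation from observed price-quantity pairs, take its transitive closure, and look for a cycle containing a strict preference. A GARP check only — the paper's expansion-path and bounds machinery is not shown, and the demand data below are invented:

```python
import numpy as np

def satisfies_garp(prices, quantities):
    """Test observed demands for consistency with GARP.

    prices, quantities: (T, k) arrays.  Bundle s is directly revealed
    preferred to t when p_s . q_s >= p_s . q_t; GARP holds when the
    transitive closure of that relation never clashes with a strict
    reverse preference.
    """
    P = np.asarray(prices, float)
    Q = np.asarray(quantities, float)
    E = P @ Q.T                        # E[s, t]: cost of bundle t at prices s
    own = np.diag(E)
    R = own[:, None] >= E              # direct revealed preference
    for k in range(len(R)):            # Warshall transitive closure
        R = R | (R[:, [k]] & R[[k], :])
    strict = own[:, None] > E          # strictly revealed preferred
    return not np.any(R & strict.T)    # cycle with a strict preference?

# Demands consistent with utility maximization (Cobb-Douglas style)
consistent = satisfies_garp([[1.0, 1.0], [1.0, 2.0]],
                            [[5.0, 5.0], [5.0, 2.5]])
```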

Journal ArticleDOI
Jushan Bai1
TL;DR: In this article, a nonparametric test for parametric conditional distributions of dynamic models is proposed, coupled with Khmaladze's martingale transformation, which is asymptotically distribution-free and has nontrivial power against root-n local alternatives.
Abstract: This paper proposes a nonparametric test for parametric conditional distributions of dynamic models. The test is of the Kolmogorov type coupled with Khmaladze's martingale transformation. It is asymptotically distribution-free and has nontrivial power against root-n local alternatives. The method is applicable for various dynamic models, including autoregressive and moving average models, generalized autoregressive conditional heteroskedasticity (GARCH), integrated GARCH, and general nonlinear time series regressions. The method is also applicable for cross-sectional models. Finally, we apply the procedure to testing conditional normality and the conditional t-distribution in a GARCH model for the NYSE equal-weighted returns.
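A back-of-the-envelope version of this idea is the probability integral transform: under a correctly specified conditional distribution the transformed observations are iid Uniform(0,1), and a Kolmogorov-type statistic can be computed on them. The sketch below deliberately omits the parameter-estimation effect that the Khmaladze martingale transformation removes, so it is a rough diagnostic only; the data are simulated for illustration:

```python
import numpy as np
from scipy import stats

def pit_ks(returns, cond_mean, cond_sd):
    """Probability-integral-transform check of a conditional normal model.

    Transforms each observation by its fitted conditional CDF; under
    a correct model the transforms are iid Uniform(0,1).  A plain KS
    p-value ignores parameter-estimation effects (Bai's transform is
    what makes the test distribution-free), so read it loosely.
    """
    u = stats.norm.cdf((np.asarray(returns) - cond_mean) / cond_sd)
    return stats.kstest(u, "uniform")

rng = np.random.default_rng(0)
sd = np.sqrt(0.5 + 0.5 * rng.random(500))   # made-up conditional variances
r = rng.normal(0.0, sd)                     # data consistent with the model
stat, pval = pit_ks(r, 0.0, sd)
```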


Journal ArticleDOI
TL;DR: A collection of 1,220,000 simulated benchmark data sets generated under 51 different cluster models and the null hypothesis is presented, to be used for power evaluations and to compare the power of the spatial scan statistic, the maximized excess events test and the nonparametric M statistic.

Journal ArticleDOI
TL;DR: Nonparametric permutation methods can be used to make robust statistical inference about group SAM data; they showed robust group activation at the P < 0.05 (corrected) level, while no significant clusters were found using the conventional parametric approach.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the possibility of using nonparametric methods to estimate the univariate marginal distributions in each of the products, as well as the mixing proportion in a mixture of two distributions, each having independent components.
Abstract: Suppose k-variate data are drawn from a mixture of two distributions, each having independent components. It is desired to estimate the univariate marginal distributions in each of the products, as well as the mixing proportion. This is the setting of two-class, fully parametrized latent models that has been proposed for estimating the distributions of medical test results when disease status is unavailable. The problem is one of inference in a mixture of distributions without training data, and until now it has been tackled only in a fully parametric setting. We investigate the possibility of using nonparametric methods. Of course, when k=1 the problem is not identifiable from a nonparametric viewpoint. We show that the problem is "almost" identifiable when k=2; there, the set of all possible representations can be expressed, in terms of any one of those representations, as a two-parameter family. Furthermore, it is proved that when k ≥ 3 the problem is nonparametrically identifiable under particularly mild regularity conditions. In this case we introduce root-n consistent nonparametric estimators of the 2k univariate marginal distributions and the mixing proportion. Finite-sample and asymptotic properties of the estimators are described.

Journal ArticleDOI
TL;DR: In this paper, a nonparametric kernel approach with smoothing parameters obtained from the cross-validated minimization of the estimator's integrated squared error is proposed, and the rate of convergence of the cross-validated smoothing parameters to their "benchmark" optimal values is derived.

Journal ArticleDOI
TL;DR: In this paper, an alternative kernel smoothing method is proposed for longitudinal or clustered data with dependence within clusters, and the smallest variance of the new estimator is achieved when the true correlation is assumed.
Abstract: There has been substantial recent interest in non- and semiparametric methods for longitudinal or clustered data with dependence within clusters. It has been shown rather inexplicably that, when standard kernel smoothing methods are used in a natural way, higher efficiency is obtained by assuming independence than by using the true correlation structure. It is shown here that this result is a natural consequence of how standard kernel methods incorporate the within-subject correlation in the asymptotic setting considered, where the cluster sizes are fixed and the cluster number increases. In this paper, an alternative kernel smoothing method is proposed. Unlike the standard methods, the smallest variance of the new estimator is achieved when the true correlation is assumed. Asymptotically, the variance of the proposed method is uniformly smaller than that of the most efficient working independence approach. A small simulation study shows that significant improvement is obtained for finite samples.

Journal ArticleDOI
TL;DR: In this article, the asymptotic properties of kernel estimators of copulas and their derivatives in the context of a multivariate stationary process with strong mixing conditions are derived.
Abstract: We consider a nonparametric method to estimate copulas, i.e. functions linking joint distributions to their univariate margins. We derive the asymptotic properties of kernel estimators of copulas and their derivatives in the context of a multivariate stationary process satisfying strong mixing conditions. Monte Carlo results are reported for a stationary vector autoregressive process of order one with Gaussian innovations. An empirical illustration containing a comparison with the independence, comonotonic and Gaussian copulas is given for European and US stock index returns.
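The estimation target can be made concrete with the empirical copula, the unsmoothed step-function analogue of the kernel estimators studied in the paper: replace each margin by its rescaled ranks and evaluate the joint empirical CDF of those pseudo-observations. A sketch with invented, perfectly dependent data:

```python
import numpy as np

def empirical_copula(data, u):
    """Empirical copula of an (n, d) sample evaluated at rows of u.

    Margins are replaced by rescaled ranks (pseudo-observations) and
    the joint empirical CDF of those ranks is evaluated; the kernel
    estimators in the paper smooth exactly this step function.
    Assumes no ties within a margin.
    """
    x = np.asarray(data, float)
    n = x.shape[0]
    pseudo = (np.argsort(np.argsort(x, axis=0), axis=0) + 1) / n
    u = np.atleast_2d(u)
    return np.array([np.mean(np.all(pseudo <= ui, axis=1)) for ui in u])

a = np.arange(1.0, 11.0)
comonotone = np.column_stack([a, 2.0 * a])   # comonotonic margins
c = empirical_copula(comonotone, [[0.5, 0.5], [1.0, 0.3]])
```

For comonotonic data the copula is min(u1, u2), which the rank-based estimate reproduces exactly at grid points.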

Journal ArticleDOI
TL;DR: A proposed local linear regression model was applied to short-term traffic prediction and consistently showed better performance than the k-nearest neighbor and kernel smoothing methods.
Abstract: The traffic-forecasting model, when considered as a system with inputs of historical and current data and outputs of future data, behaves in a nonlinear fashion and varies with time of day. Traffic data are found to change abruptly during the transition times of entering and leaving peak periods. Accurate and real-time models are needed to approximate the nonlinear time-variant functions between system inputs and outputs from a continuous stream of training data. A proposed local linear regression model was applied to short-term traffic prediction. The performance of the model was compared with previous results of nonparametric approaches that are based on local constant regression, such as the k-nearest neighbor and kernel methods, by using 32-day traffic-speed data collected on US-290, in Houston, Texas, at 5-min intervals. It was found that the local linear methods consistently showed better performance than the k-nearest neighbor and kernel smoothing methods.
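The local linear idea is a small change to local constant smoothing: instead of a kernel-weighted average (as in k-nearest neighbor or Nadaraya-Watson methods), fit a kernel-weighted straight line around the query point, which tracks sharp transitions such as the onset of peak periods with less bias. A generic one-dimensional sketch, not the authors' exact specification:

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear regression estimate at x0 with a Gaussian kernel.

    Fits a kernel-weighted straight line centered at x0 and returns
    its intercept, i.e. the fitted value at x0.  Unlike local
    constant smoothers, it adapts to the local slope.
    """
    x = np.asarray(x, float)
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)    # Gaussian kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ np.asarray(y, float))
    return beta[0]
```

A quick sanity check of the bias advantage: on exactly linear data the local linear fit is exact for any bandwidth, whereas a local average is biased wherever the trend is nonzero.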

Journal ArticleDOI
TL;DR: In this paper, a nonparametric method and a Bayesian hierarchical modeling method are proposed for the detection of environmental thresholds; the nonparametric method is based on the reduction of deviance, while the Bayesian method is based on change in the response variable distribution parameters.