
Showing papers on "Nonparametric statistics published in 1997"


Journal ArticleDOI
TL;DR: This paper decomposes the conventional measure of evaluation bias into several components and finds that bias due to selection on unobservables, commonly called selection bias in econometrics, is empirically less important than other components, although it is still a sizeable fraction of the estimated programme impact.
Abstract: This paper considers whether it is possible to devise a nonexperimental procedure for evaluating a prototypical job training programme. Using rich nonexperimental data, we examine the performance of a two-stage evaluation methodology that (a) estimates the probability that a person participates in a programme and (b) uses the estimated probability in extensions of the classical method of matching. We decompose the conventional measure of programme evaluation bias into several components and find that bias due to selection on unobservables, commonly called selection bias in econometrics, is empirically less important than other components, although it is still a sizeable fraction of the estimated programme impact. Matching methods applied to comparison groups located in the same labour markets as participants and administered the same questionnaire eliminate much of the bias as conventionally measured, but the remaining bias is a considerable fraction of experimentally-determined programme impact estimates. We test and reject the identifying assumptions that justify the classical method of matching. We present a nonparametric conditional difference-in-differences extension of the method of matching that is consistent with the classical index-sufficient sample selection model and is not rejected by our tests of identifying assumptions. This estimator is effective in eliminating bias, especially when it is due to temporally-invariant omitted variables.

5,069 citations
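
The two-stage procedure described above lends itself to a compact illustration. Below is a minimal sketch, assuming simulated data, a logistic propensity model, and nearest-neighbour matching; the covariates, effect size, and matching rule are illustrative, not taken from the paper.

```python
# Sketch: propensity-score matching with a difference-in-differences
# extension. All data are simulated; the true programme impact is 2.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 3))                      # observed covariates
u = rng.normal(size=n)                           # unobserved, time-invariant
d = (x @ np.array([0.8, -0.5, 0.3]) + u + rng.normal(size=n) > 0).astype(int)
y_pre = x @ np.array([1.0, 0.5, -0.2]) + u + rng.normal(size=n)
y_post = y_pre + 0.5 * x[:, 0] + 2.0 * d + rng.normal(size=n)

# Stage (a): estimate the propensity score P(D = 1 | X).
ps = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]

# Stage (b): nearest-neighbour matching on the estimated score, then a
# difference-in-differences across matched pairs; differencing the
# outcomes removes the time-invariant unobservable u.
treated = np.flatnonzero(d == 1)
controls = np.flatnonzero(d == 0)
effects = []
for i in treated:
    j = controls[np.argmin(np.abs(ps[controls] - ps[i]))]
    effects.append((y_post[i] - y_pre[i]) - (y_post[j] - y_pre[j]))
print("conditional diff-in-diff estimate:", np.mean(effects))
```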


Book
01 Jan 1997
TL;DR: This book presents basic concepts of spectral analysis together with nonparametric methods, parametric methods for rational and line spectra, filter bank methods, and spatial methods, with appendices on linear algebra and Cramér-Rao bound tools.
Abstract: 1. Basic Concepts. 2. Nonparametric Methods. 3. Parametric Methods for Rational Spectra. 4. Parametric Methods for Line Spectra. 5. Filter Bank Methods. 6. Spatial Methods. Appendix A: Linear Algebra and Matrix Analysis Tools. Appendix B: Cramer-Rao Bound Tools. Bibliography. References Grouped by Subject. Subject Index.

2,154 citations
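
For flavour, here is a minimal sketch of one of the nonparametric methods covered in Chapter 2, Welch's averaged periodogram, applied to an illustrative noisy sinusoid; the signal and parameters are ours, not an example from the book.

```python
# Sketch: nonparametric spectral estimation via Welch's method.
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(1)
fs = 100.0                                    # sampling frequency (Hz)
t = np.arange(0, 20, 1 / fs)
x = np.sin(2 * np.pi * 12.5 * t) + 0.5 * rng.normal(size=t.size)

f, pxx = welch(x, fs=fs, nperseg=256)         # windowed, averaged periodogram
print("estimated spectral peak at %.2f Hz" % f[np.argmax(pxx)])
```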


Journal ArticleDOI
TL;DR: It is shown that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments; estimating the variability of an error rate estimate is also considered.
Abstract: A training set of data has been used to construct a rule for predicting future responses. What is the error rate of this rule? This is an important question both for comparing models and for assessing a final selected model. The traditional answer to this question is given by cross-validation. The cross-validation estimate of prediction error is nearly unbiased but can be highly variable. Here we discuss bootstrap estimates of prediction error, which can be thought of as smoothed versions of cross-validation. We show that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments. Besides providing point estimates, we also consider estimating the variability of an error rate estimate. All of the results here are nonparametric and apply to any possible prediction rule; however, we study only classification problems with 0–1 loss in detail. Our simulations include “smooth” prediction rules like Fisher's linear discriminant function...

1,602 citations
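
A compact sketch of the .632+ computation on simulated two-class data follows. It implements the leave-one-out bootstrap error, the no-information rate, and the relative overfitting weight, but omits some edge-case guards from the paper; the data and classifier are illustrative.

```python
# Sketch: .632+ bootstrap estimate of prediction error under 0-1 loss.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)
clf = LogisticRegression()

n, B = len(y), 200
err_app = np.mean(clf.fit(X, y).predict(X) != y)   # apparent (training) error

# Leave-one-out bootstrap error: each point is judged only by bootstrap
# samples that do not contain it.
errs = np.full((B, n), np.nan)
for b in range(B):
    idx = rng.integers(0, n, n)
    oob = np.setdiff1d(np.arange(n), idx)
    if oob.size == 0:
        continue
    pred = clf.fit(X[idx], y[idx]).predict(X[oob])
    errs[b, oob] = pred != y[oob]
err1 = np.nanmean(np.nanmean(errs, axis=0))

# No-information error rate and relative overfitting rate.
p1 = y.mean()
q1 = clf.fit(X, y).predict(X).mean()
gamma = p1 * (1 - q1) + (1 - p1) * q1
R = np.clip((err1 - err_app) / (gamma - err_app), 0, 1)
w = 0.632 / (1 - 0.368 * R)
print(".632+ error estimate: %.3f" % ((1 - w) * err_app + w * err1))
```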


Journal ArticleDOI
TL;DR: In this article, simulation results show that tests for long-horizon (i.e., multi-year) abnormal security returns around firm-specific events are severely misspecified.

962 citations


Journal ArticleDOI
TL;DR: It is concluded that percentile bootstrap confidence interval methods provide a promising approach to estimating the uncertainty of ICER point estimates; however, successive bootstrap estimates of bias and standard error suggest that these may be unstable, so a cautious interpretation of such estimates is strongly recommended.
Abstract: The statistic of interest in the economic evaluation of health care interventions is the incremental cost effectiveness ratio (ICER), which is defined as the difference in cost between two treatment interventions over the difference in their effect. Where patient-specific data on costs and health outcomes are available, it is natural to attempt to quantify uncertainty in the estimated ICER using confidence intervals. Recent articles have focused on parametric methods for constructing confidence intervals. In this paper, we describe the construction of non-parametric bootstrap confidence intervals. The advantage of such intervals is that they do not depend on parametric assumptions of the sampling distribution of the ICER. We present a detailed description of the non-parametric bootstrap applied to data from a clinical trial, in order to demonstrate the strengths and weaknesses of the approach. By examining the bootstrap confidence limits successively as the number of bootstrap replications increases, we conclude that percentile bootstrap confidence interval methods provide a promising approach to estimating the uncertainty of ICER point estimates. However, successive bootstrap estimates of bias and standard error suggest that these may be unstable; accordingly, we strongly recommend a cautious interpretation of such estimates.

849 citations
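
A minimal sketch of the percentile method described above follows, assuming simulated patient-level costs and effects in two independent arms; all numbers are illustrative placeholders, not trial data.

```python
# Sketch: non-parametric percentile bootstrap CI for an ICER.
import numpy as np

rng = np.random.default_rng(3)
cost_t, eff_t = rng.gamma(5, 400, 300), rng.normal(2.0, 1.0, 300)  # treatment
cost_c, eff_c = rng.gamma(5, 300, 300), rng.normal(1.5, 1.0, 300)  # control

boot = []
for _ in range(2000):
    it = rng.integers(0, 300, 300)            # resample patients within arm
    ic = rng.integers(0, 300, 300)
    boot.append((cost_t[it].mean() - cost_c[ic].mean()) /
                (eff_t[it].mean() - eff_c[ic].mean()))
lo, hi = np.percentile(boot, [2.5, 97.5])
print("ICER 95%% percentile CI: (%.0f, %.0f)" % (lo, hi))
```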


Journal ArticleDOI
TL;DR: Algorithms for wavelet network construction are proposed for the purpose of nonparametric regression estimation, with particular attention paid to sparse training data so that problems of large dimension can be better handled.
Abstract: Wavelet networks are a class of neural networks consisting of wavelets. In this paper, algorithms for wavelet network construction are proposed for the purpose of nonparametric regression estimation. Particular attention is paid to sparse training data so that problems of large dimension can be better handled. A numerical example on nonlinear system identification is presented for illustration.

760 citations


Journal ArticleDOI
TL;DR: In this paper, a nonparametric method is suggested to estimate the statistical significance of a computed correlation coefficient when serial correlation is a concern, and the method compares favorably with conventional methods.
Abstract: When analyzing pairs of time series, one often needs to know whether a correlation is statistically significant. If the data are Gaussian distributed and not serially correlated, one can use the results of classical statistics to estimate the significance. While some techniques can handle non-Gaussian distributions, few methods are available for data with nonzero autocorrelation (i.e., serially correlated). In this paper, a nonparametric method is suggested to estimate the statistical significance of a computed correlation coefficient when serial correlation is a concern. This method compares favorably with conventional methods.

721 citations
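
The idea can be illustrated with a random-phase (Fourier surrogate) resampling test, which preserves each series' autocorrelation while destroying any cross-correlation. A minimal sketch on simulated AR(1) series follows; the surrogate construction shown is one common variant, not necessarily the paper's exact recipe.

```python
# Sketch: surrogate test of correlation significance for serially
# correlated series. Assumes roughly stationary data.
import numpy as np

rng = np.random.default_rng(4)
n = 500
def ar1(phi):                        # serially correlated noise
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

x, y = ar1(0.8), ar1(0.8)
r_obs = np.corrcoef(x, y)[0, 1]

def phase_randomize(v):
    """Surrogate with the same power spectrum but random Fourier phases."""
    f = np.fft.rfft(v)
    phases = rng.uniform(0, 2 * np.pi, f.size)
    phases[0] = 0                    # keep the mean component real
    return np.fft.irfft(np.abs(f) * np.exp(1j * phases), n)

null = [np.corrcoef(phase_randomize(x), y)[0, 1] for _ in range(1000)]
p = np.mean(np.abs(null) >= abs(r_obs))
print("r = %.3f, surrogate p-value = %.3f" % (r_obs, p))
```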


01 Jan 1997
TL;DR: This work studies the differential entropy H(f), which is assumed to be well-defined and finite; the concept was introduced in Shannon's original paper ([55]).
Abstract: We assume that H(f) is well-defined and is finite. The concept of differential entropy was introduced in Shannon’s original paper ([55]). Since then, entropy has been of great theoretical and applied interest. The basic properties ...

695 citations
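
As a concrete illustration of estimating H(f) = −E[log f(X)] nonparametrically, here is a minimal leave-one-out kernel plug-in sketch on a Gaussian sample, for which the true differential entropy is known. The bandwidth rule and data are illustrative, and this is not necessarily the estimator studied in the paper.

```python
# Sketch: plug-in differential entropy estimate via leave-one-out KDE.
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=1000)
n = x.size
h = 1.06 * x.std() * n ** (-1 / 5)          # Silverman's rule of thumb

# Leave-one-out Gaussian KDE evaluated at each sample point.
d = (x[:, None] - x[None, :]) / h
k = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)
np.fill_diagonal(k, 0.0)
f_hat = k.sum(axis=1) / ((n - 1) * h)

H_hat = -np.mean(np.log(f_hat))
print("estimate %.3f vs true %.3f" % (H_hat, 0.5 * np.log(2 * np.pi * np.e)))
```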


Journal ArticleDOI
TL;DR: In this article, a marked empirical process based on residuals is studied; results on its large-sample behavior may be used to provide nonparametric full-model checks for regression, and their decomposition into principal components gives new insight into which kinds of departure from a hypothetical model are well detected by residual-based goodness-of-fit methods.
Abstract: In this paper we study a marked empirical process based on residuals. Results on its large-sample behavior may be used to provide nonparametric full-model checks for regression. Their decomposition into principal components gives new insight into the question: which kind of departure from a hypothetical model may be well detected by residual-based goodness-of-fit methods? The work also contains a small simulation study on straight-line regression.

523 citations



Journal ArticleDOI
TL;DR: In this paper, a nonparametric estimator of Tawn's dependence measure is proposed, which is shown to be uniformly, strongly convergent, and asymptotically unbiased.
Abstract: A bivariate extreme value distribution with fixed marginals is generated by a one-dimensional map called a dependence function. This paper proposes a new nonparametric estimator of this function. Its asymptotic properties are examined, and its small-sample behaviour is compared to that of other rank-based and likelihood-based procedures. The new estimator is shown to be uniformly strongly convergent and asymptotically unbiased. Through simulations, it is also seen to perform reasonably well against the maximum likelihood estimator based on the correct model and to have smaller L1, L2 and L∞ errors than any existing nonparametric alternative. The √n consistency of the proposed estimator leads to nonparametric estimation of Tawn's (1988) dependence measure that may be used to test independence in small samples.
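
For context, here is a minimal sketch of the classical Pickands estimator of the dependence function, one of the rank-based competitors the new estimator is compared against (this is not the paper's proposed estimator). Margins are reduced to unit exponentials via ranks, and the simulated pair is illustrative.

```python
# Sketch: raw Pickands estimator of a bivariate dependence function.
# A(t) = 1 corresponds to independence; A(t) = max(t, 1-t) to perfect
# dependence.
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(6)
n = 1000
z = rng.normal(size=n)
u = 0.7 * z + 0.3 * rng.normal(size=n)       # correlated pair, illustrative
v = 0.7 * z + 0.3 * rng.normal(size=n)

# Rank transform to (0, 1), then to unit-exponential margins.
xi = -np.log(rankdata(u) / (n + 1))
eta = -np.log(rankdata(v) / (n + 1))

def pickands(t):
    """Raw Pickands estimator at angle t in (0, 1)."""
    return n / np.sum(np.minimum(xi / (1 - t), eta / t))

for t in (0.25, 0.5, 0.75):
    print("A(%.2f) = %.3f" % (t, pickands(t)))
```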

Book
01 Jan 1997
Abstract: The What and the Why of Statistics Organization of Information Frequency Distributions Graphic Presentation Measures of Central Tendency Measures of Variability Relationships between Two Variables Cross-Tabulation Measures of Association for Nominal and Ordinal Variables Bivariate Regression and Correlation Organization of Information and Measurement of Relationships A Review of Descriptive Data Analysis The Normal Distribution Sampling and Sampling Distributions Estimation Testing Hypotheses about Two Samples The Chi-Square Test Reviewing Inferential Statistics

Journal ArticleDOI
TL;DR: In this article, the Efficient method of moments (EMM) is used to fit the standard stochastic volatility model of various extensions to several daily financial time series, and the extensions required for an adequate fit are so elaborate that nonparametric specifications are probably more convenient.

Journal ArticleDOI
TL;DR: In this paper, the authors consider a class of dynamic models in which both the conditional mean and the conditional variance (volatility) are unknown functions of the past and derive probabilistic conditions under which nonparametric estimation of these functions is possible.

Journal ArticleDOI
TL;DR: A survey that attempts to synthesize a broad variety of work on wavelets in statistics, including some recent developments in nonparametric curve estimation that have been omitted from review articles and books on the subject.
Abstract: The field of nonparametric function estimation has broadened its appeal in recent years with an array of new tools for statistical analysis. In particular, theoretical and applied research on the field of wavelets has had noticeable influence on statistical topics such as nonparametric regression, nonparametric density estimation, nonparametric discrimination and many other related topics. This is a survey article that attempts to synthesize a broad variety of work on wavelets in statistics and includes some recent developments in nonparametric curve estimation that have been omitted from review articles and books on the subject. After a short introduction to wavelet theory, wavelets are treated in the familiar context of estimation of «smooth» functions. Both «linear» and «nonlinear» wavelet estimation methods are discussed and cross-validation methods for choosing the smoothing parameters are addressed. Finally, some areas of related research are mentioned, such as hypothesis testing, model selection, hazard rate estimation for censored data, and nonparametric change-point problems. The closing section formulates some promising research directions relating to wavelets in statistics.
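
As a small illustration of the «nonlinear» wavelet methods surveyed, here is a minimal soft-thresholding sketch using PyWavelets, with the universal threshold σ√(2 log n); the test signal and wavelet choice are illustrative.

```python
# Sketch: nonlinear wavelet estimation of a function from noisy data
# via soft thresholding of the detail coefficients.
import numpy as np
import pywt

rng = np.random.default_rng(7)
n = 1024
t = np.linspace(0, 1, n)
signal = np.sin(4 * np.pi * t) * (t > 0.3)          # smooth with a jump
y = signal + 0.2 * rng.normal(size=n)

coeffs = pywt.wavedec(y, "db4", level=6)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise scale, finest level
thresh = sigma * np.sqrt(2 * np.log(n))
coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                        for c in coeffs[1:]]
est = pywt.waverec(coeffs, "db4")
print("RMSE: %.4f" % np.sqrt(np.mean((est - signal) ** 2)))
```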

Journal ArticleDOI
TL;DR: In this paper, the authors proposed simple finite-sample size approximations for the distribution of quadratic forms in factorial designs under a normal heteroscedastic error structure.
Abstract: Linear rank statistics in nonparametric factorial designs are asymptotically normal and, in general, heteroscedastic. In a comprehensive simulation study, the asymptotic chi-squared law of the corresponding quadratic forms is shown to be a rather poor approximation of the finite-sample distribution. Motivated by this problem, we propose simple finite-sample size approximations for the distribution of quadratic forms in factorial designs under a normal heteroscedastic error structure. These approximations are based on an F distribution with estimated degrees of freedom that generalizes ideas of Patnaik and Box. Simulation studies show that the nominal level is maintained with high accuracy and in most cases the power is comparable to the asymptotic maximin Wald test. Data-driven guidelines are given to select the most appropriate test procedure. These ideas are finally transferred to nonparametric factorial designs where the same quadratic forms as in the parametric case are applied to the vector ...

Journal ArticleDOI
TL;DR: In this paper, consistent cointegration tests, and estimators of a basis of the space of cointegrating vectors, are proposed that require neither specification of the data-generating process (apart from some mild regularity conditions) nor estimation of structural and/or nuisance parameters.

Journal ArticleDOI
TL;DR: In this paper, rank statistics are derived for testing the nonparametric hypotheses of no main effects, no interaction, and no factor effects in unbalanced crossed classifications, and a modification of the test statistics and approximations to their finite-sample distributions are also given.
Abstract: Factorial designs are studied with independent observations, fixed number of levels, and possibly unequal number of observations per factor level combination. In this context, the nonparametric null hypotheses introduced by Akritas and Arnold are considered. New rank statistics are derived for testing the nonparametric hypotheses of no main effects, no interaction, and no factor effects in unbalanced crossed classifications. The formulation of all results includes tied observations. Extensions of these procedures to higher-way layouts are given, and the efficacies of the test statistics against nonparametric alternatives are derived. A modification of the test statistics and approximations to their finite-sample distributions are also given. The small-sample performance of the procedures for two factors is examined in a simulation study. As an illustration, a real dataset with ordinal data is analyzed.

Journal ArticleDOI
TL;DR: In this paper, a nonparametric identification and estimation procedure for an Ito diffusion process based on discrete sampling observations is proposed, which avoids any functional form specification for either the drift function or the diffusion function.
Abstract: In this paper, we propose a nonparametric identification and estimation procedure for an Ito diffusion process based on discrete sampling observations. The nonparametric kernel estimator for the diffusion function developed in this paper deals with general Ito diffusion processes and avoids any functional form specification for either the drift function or the diffusion function. It is shown that under certain regularity conditions the nonparametric diffusion function estimator is pointwise consistent and asymptotically follows a normal mixture distribution. Under stronger conditions, a consistent nonparametric estimator of the drift function is also derived based on the diffusion function estimator and the marginal density of the process. An application of the nonparametric technique to a short-term interest rate model involving Canadian daily 3-month Treasury bill rates is also undertaken. The estimation results provide evidence for rejecting the common parametric or semiparametric specifications for both the drift and diffusion functions.
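
A minimal sketch of a first-order, Nadaraya-Watson-type kernel estimator of the drift and diffusion functions from discretely sampled data, in the spirit of the approach above (the paper's exact estimator and regularity conditions differ); the simulated Vasicek path and bandwidth are illustrative.

```python
# Sketch: kernel estimation of drift and diffusion from discrete samples.
import numpy as np

rng = np.random.default_rng(8)
kappa, theta, sigma, dt, n = 0.5, 0.05, 0.02, 1 / 252, 50000
x = np.empty(n)
x[0] = theta
for i in range(n - 1):                        # Euler scheme
    x[i + 1] = (x[i] + kappa * (theta - x[i]) * dt
                + sigma * np.sqrt(dt) * rng.normal())

dx = np.diff(x)
xs, h = x[:-1], 0.01

def kernel_est(x0):
    w = np.exp(-0.5 * ((xs - x0) / h) ** 2)   # Gaussian kernel weights
    drift = np.sum(w * dx) / (dt * np.sum(w))
    diff2 = np.sum(w * dx ** 2) / (dt * np.sum(w))
    return drift, diff2

mu_hat, s2_hat = kernel_est(theta)            # at the long-run mean, drift = 0
print("drift %.4f (true 0), diffusion %.6f (true %.6f)"
      % (mu_hat, s2_hat, sigma ** 2))
```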

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the potential of Bayes methods for the analysis of survival data using semiparametric models based on either the hazard or the intensity function, where the nonparametric part of every model is assumed to be a realization of a stochastic process.
Abstract: This review article investigates the potential of Bayes methods for the analysis of survival data using semiparametric models based on either the hazard or the intensity function. The nonparametric part of every model is assumed to be a realization of a stochastic process. The parametric part, which may include a regression parameter or a parameter quantifying the heterogeneity of a population, is assumed to have a prior distribution with possibly unknown hyperparameters. Careful applications of some recently popular computational tools, including sampling-based algorithms, are used to find posterior estimates of several quantities of interest even when dealing with complex models and unusual data structures. The methodologies developed herein are motivated and aimed at analyzing some common types of survival data from different medical studies; here we focus on univariate survival data in the presence of fixed and time-dependent covariates, multiple event-time data for repeated nonfatal events, ...

Book ChapterDOI
01 Jan 1997
TL;DR: The development of nonparametric approaches to psychometric and sociometric measurement dates back to the days before the establishment of regular item response theory (IRT) and has its roots in the early manifestations of scalogram analysis (Guttman, 1950), latent structure analysis (Lazarsfeld, 1950) and latent trait theory (Lord, 1953).
Abstract: The development of nonparametric approaches to psychometric and sociometric measurement dates back to the days before the establishment of regular item response theory (IRT). It has its roots in the early manifestations of scalogram analysis (Guttman, 1950), latent structure analysis (Lazarsfeld, 1950), and latent trait theory (Lord, 1953).

Journal ArticleDOI
TL;DR: In this article, the conditional variance function in a heteroscedastic, nonparametric regression model is estimated by linear smoothing of squared residuals, where the mean and variance functions are assumed to be smooth and neither is in a parametric family.
Abstract: The conditional variance function in a heteroscedastic, nonparametric regression model is estimated by linear smoothing of squared residuals. Attention is focused on local polynomial smoothers. Both the mean and variance functions are assumed to be smooth, but neither is assumed to be in a parametric family. The biasing effect of preliminary estimation of the mean is studied, and a degrees-of-freedom correction of bias is proposed. The corrected method is shown to be adaptive in the sense that the variance function can be estimated with the same asymptotic mean and variance as if the mean function were known. A proposal is made for using standard bandwidth selectors for estimating both the mean and variance functions. The proposal is illustrated with data from the LIDAR method of measuring atmospheric pollutants and from turbulence-model computations.
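
A minimal two-stage sketch of the method: smooth y on x to estimate the mean, then smooth the squared residuals to estimate the variance function. Local-constant (Nadaraya-Watson) smoothers stand in for the paper's local polynomials, and the degrees-of-freedom bias correction is omitted; the data are illustrative.

```python
# Sketch: conditional variance estimation by smoothing squared residuals.
import numpy as np

rng = np.random.default_rng(9)
n = 800
x = np.sort(rng.uniform(0, 1, n))
m = np.sin(2 * np.pi * x)                    # true mean function
s = 0.1 + 0.4 * x                            # true sd function
y = m + s * rng.normal(size=n)

def nw(x0, xd, yd, h):
    w = np.exp(-0.5 * ((xd - x0) / h) ** 2)  # Gaussian kernel weights
    return np.sum(w * yd) / np.sum(w)

m_hat = np.array([nw(xi, x, y, 0.05) for xi in x])   # stage 1: mean
r2 = (y - m_hat) ** 2                                 # squared residuals
for g in np.linspace(0.1, 0.9, 5):
    v_hat = nw(g, x, r2, 0.08)                        # stage 2: variance
    print("x=%.2f  sd_hat=%.3f  true=%.3f" % (g, np.sqrt(v_hat), 0.1 + 0.4 * g))
```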

Journal ArticleDOI
TL;DR: In this paper, the authors present locally weighted regression estimates of employment density in suburban Chicago and demonstrate that Chicago is indeed a polycentric city: although the traditional city center continues to affect employment density patterns in the suburbs, local peaks have developed around secondary employment centers.
Abstract: Nonparametric estimation procedures offer distinct advantages in modeling polycentric cities because they are flexible enough to account for functional form misspecification and incorrect subcenter sites. This paper presents locally weighted (LW) regression estimates of employment density in suburban Chicago. LW regression estimates are more accurate than OLS regression and capture the effects of missing variables. The results demonstrate that Chicago is indeed a polycentric city: although the traditional city center continues to affect employment density patterns in the suburbs, local peaks have developed around secondary employment centers.
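
A minimal sketch of locally weighted regression with tricube weights (loess-style), the estimator family used in the paper; a one-dimensional distance-to-centre profile with a secondary peak stands in for the two-dimensional employment-density surface.

```python
# Sketch: locally weighted (LW) regression on a polycentric density profile.
import numpy as np

rng = np.random.default_rng(10)
n = 500
dist = rng.uniform(0, 30, n)                  # distance to city centre
dens = np.exp(-0.15 * dist) + 0.5 * np.exp(-0.5 * (dist - 18) ** 2)
y = dens + 0.1 * rng.normal(size=n)           # subcentre bump at distance 18

def lw_regress(x0, x, y, h):
    """Fit a local line at x0 with tricube weights."""
    u = np.abs(x - x0) / h
    w = np.where(u < 1, (1 - u ** 3) ** 3, 0.0)
    X = np.column_stack([np.ones_like(x), x - x0])
    beta = np.linalg.lstsq(X * w[:, None] ** 0.5, y * w ** 0.5, rcond=None)[0]
    return beta[0]                            # local intercept = fit at x0

for d in (0, 9, 18, 27):
    print("distance %2d: fitted density %.3f" % (d, lw_regress(d, dist, y, 4.0)))
```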

Journal ArticleDOI
TL;DR: In this article, the authors derive finite sample properties of kernel density estimates of the ergodic distribution of the short-rate when it follows a continuous time AR(1) as in Vasicek.
Abstract: Nonparametric kernel density estimation has recently been used to estimate and test short-term interest rate models, but inference has been based on asymptotics. We derive finite sample properties of kernel density estimates of the ergodic distribution of the short-rate when it follows a continuous time AR(1) as in Vasicek. We find that the asymptotic distribution substantially understates finite sample bias, variance, and correlation. Also, estimator quality and bandwidth choice depend strongly on the persistence of the interest rate process and on the span of the data, but not on sampling frequency. We also examine the size and power of one of Ait-Sahalia's nonparametric tests of continuous time interest rate models. The test rejects too often. This is probably because the quality of the nonparametric density estimate depends on persistence, but the asymptotic distribution of the test does not. After critical values are adjusted for size, the test has low power in distinguishing between the Vasicek and Cox-Ingersoll-Ross models relative to a conditional moment-based specification test.
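
A minimal sketch of the experimental setup: simulate a discretized Vasicek short rate, form a Gaussian kernel density estimate of its ergodic distribution, and compare with the true stationary normal density. Parameters are illustrative; with a persistent process and a short data span, the KDE can be visibly biased, which is the paper's point.

```python
# Sketch: KDE of the ergodic distribution of a simulated Vasicek short rate.
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(11)
kappa, theta, sigma, dt, n = 0.2, 0.06, 0.02, 1 / 252, 2520  # 10 years, daily
r = np.empty(n)
r[0] = theta
for i in range(n - 1):                        # Euler scheme
    r[i + 1] = (r[i] + kappa * (theta - r[i]) * dt
                + sigma * np.sqrt(dt) * rng.normal())

kde = gaussian_kde(r)
sd_stat = sigma / np.sqrt(2 * kappa)          # true ergodic sd for Vasicek
for g in np.linspace(theta - 2 * sd_stat, theta + 2 * sd_stat, 5):
    print("r=%.3f  kde=%.1f  true=%.1f" % (g, kde(g)[0],
                                           norm.pdf(g, theta, sd_stat)))
```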

Journal ArticleDOI
TL;DR: Reviews the common statistical techniques employed to analyze survival data in public health research, including the Kaplan-Meier method for estimating the survival function and the Cox proportional hazards model for identifying risk factors and obtaining adjusted risk ratios.
Abstract: This paper reviews the common statistical techniques employed to analyze survival data in public health research. Due to the presence of censoring, the data are not amenable to the usual method of analysis. The improvement in statistical computing and wide accessibility of personal computers led to the rapid development and popularity of nonparametric over parametric procedures. The former required less stringent conditions. But, if the assumptions for parametric methods hold, the resulting estimates have smaller standard errors and are easier to interpret. Nonparametric techniques include the Kaplan-Meier method for estimating the survival function and the Cox proportional hazards model to identify risk factors and to obtain adjusted risk ratios. In cases where the assumption of proportional hazards is not tenable, the data can be stratified and a model fitted with different baseline functions in each stratum. Parametric modeling such as the accelerated failure time model also may be used. Hazard functions for the exponential, Weibull, gamma, Gompertz, lognormal, and log-logistic distributions are described. Examples from published literature are given to illustrate the various methods. The paper is intended for public health professionals who are interested in survival data analysis.
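
A minimal sketch of the Kaplan-Meier estimator on toy right-censored data: at each observed event time, the survival estimate is multiplied by (1 − d/n), where d is the number of deaths among the n subjects still at risk.

```python
# Sketch: Kaplan-Meier product-limit estimate of the survival function.
import numpy as np

time = np.array([2, 3, 3, 5, 6, 7, 8, 8, 9, 12])     # follow-up times
event = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])     # 1 = death, 0 = censored

s = 1.0
for t in np.unique(time[event == 1]):
    at_risk = np.sum(time >= t)
    deaths = np.sum((time == t) & (event == 1))
    s *= 1 - deaths / at_risk
    print("t=%2d  at risk=%2d  deaths=%d  S(t)=%.3f" % (t, at_risk, deaths, s))
```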

Journal ArticleDOI
TL;DR: The authors show that the pseudovalues used in the jackknife method are directly linked to the placement values, and because of the close link, the choice between the two methods can be based on users' preferences.

Journal ArticleDOI
TL;DR: A family of random walk rules for the sequential allocation of dose levels to patients in a dose-response study, or phase I clinical trial, is described and the small sample properties of this rule compare favorably to those of the continual reassessment method, determined by simulation.
Abstract: We describe a family of random walk rules for the sequential allocation of dose levels to patients in a dose-response study, or phase I clinical trial. Patients are sequentially assigned the next higher, same, or next lower dose level according to some probability distribution, which may be determined by ethical considerations as well as the patient's response. It is shown that one can choose these probabilities in order to center dose level assignments unimodally around any target quantile of interest. Estimation of the quantile is discussed; the maximum likelihood estimator and its variance are derived under a two-parameter logistic distribution, and the maximum likelihood estimator is compared with other nonparametric estimators. Random walk rules have clear advantages: they are simple to implement, and finite and asymptotic distribution theory is completely worked out. For a specific random walk rule, we compute finite and asymptotic properties and give examples of its use in planning studies. Having the finite distribution theory available and tractable obviates the need for elaborate simulation studies to analyze the properties of the design. The small sample properties of our rule, as determined by exact theory, compare favorably to those of the continual reassessment method, determined by simulation.
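
A minimal sketch of one such rule, a biased-coin design: step down after a toxicity, otherwise step up with probability Γ/(1 − Γ), which centres dose assignments near the dose whose toxicity probability is the target quantile Γ. The logistic dose-toxicity curve is illustrative, and the specific rule studied in the paper may differ.

```python
# Sketch: biased-coin random walk rule for sequential dose allocation.
import numpy as np

rng = np.random.default_rng(12)
doses = np.arange(1, 9)                       # dose levels 1..8
p_tox = 1 / (1 + np.exp(-(doses - 5)))        # true toxicity probabilities
gamma = 0.25                                  # target quantile
level, visits = 3, np.zeros(len(doses), int)

for _ in range(1000):                         # sequential patients
    visits[level] += 1
    if rng.random() < p_tox[level]:           # toxicity observed: step down
        level = max(level - 1, 0)
    elif rng.random() < gamma / (1 - gamma):  # biased coin: step up
        level = min(level + 1, len(doses) - 1)

print("dose with p_tox closest to 0.25:", doses[np.argmin(np.abs(p_tox - gamma))])
print("visit distribution:", visits)
```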

Journal ArticleDOI
TL;DR: In this paper, a framework for individual and joint tests of significance employing nonparametric estimation procedures is presented, which is robust to functional misspecification for general classes of models, and employs nested pivotal bootstrapping procedures.
Abstract: This article presents a framework for individual and joint tests of significance employing nonparametric estimation procedures. The proposed test is based on nonparametric estimates of partial derivatives, is robust to functional misspecification for general classes of models, and employs nested pivotal bootstrapping procedures. Two simulations and one application are considered to examine size and power relative to misspecified parametric models, and to test for the linear unpredictability of exchange-rate movements for G7 currencies.

Journal ArticleDOI
TL;DR: Nonparametric methods for estimating the spectral density, the conditional mean, higher-order conditional moments, and conditional densities are reviewed, and density estimation with correlated data, bootstrap methods for time series, and nonparametric trend analysis are described.
Abstract: Summary Various features of a given time series may be analyzed by nonparametric techniques. Generally the characteristic of interest is allowed to have a general form which is approximated increasingly precisely when the sample size goes to infinity. We review nonparametric methods of this type for estimating the spectral density, the conditional mean, higher order conditional moments or conditional densities. Moreover, density estimation with correlated data, bootstrap methods for time series and nonparametric trend analysis are described.

Journal ArticleDOI
TL;DR: In this article, the authors show that affine invariant and universally consistent tests have asymptotic power against sequences of contiguous alternatives converging at the rate n^(-1/2), independent of i.i.d.