
Showing papers on "Nonparametric statistics published in 1995"


Book
02 Aug 1995
TL;DR: In this book, the authors cover descriptive and inferential statistics, including the one-sample t statistic, two-way and repeated-measures analysis of variance, and nonparametric tests, with appendices on computer applications of statistics.
Abstract: Getting started: why study statistics? basic concepts and ideas. Descriptive statistics: frequency distributions and graphs summary measures relative measures and the normal curve linear correlation linear regression. Concepts of inferential statistics: sampling distributions logic of hypothesis testing. Methods of inferential statistics: one-sample t statistic - when a t ratio is not practical two-sample t tests analysis of variance two-way analysis of variance repeated-measures analysis of variance nonparametric tests bringing it all together. Appendices: statistical tables answers to selected review questions computer applications of statistics.

1,911 citations


BookDOI
TL;DR: Predictive (observable) inference is developed, mainly from the Bayesian viewpoint but also covering non-Bayesian predictive approaches, with applications including multivariate normal prediction problems and process control and optimization.
Abstract: The author's research has been directed towards inference involving observables rather than parameters. In this book, he brings together his views on predictive or observable inference and its advantages over parametric inference. While the book discusses a variety of approaches to prediction including those based on parametric, nonparametric, and nonstochastic statistical models, it is devoted mainly to predictive applications of the Bayesian approach. It not only substitutes predictive analyses for parametric analyses, but it also presents predictive analyses that have no real parametric analogues. It demonstrates that predictive inference can be a critical component of even strict parametric inference when dealing with interim analyses. This approach to predictive inference will be of interest to statisticians, psychologists, econometricians, and sociologists.

750 citations


Journal ArticleDOI
Udo Kamps1
TL;DR: In this article, a concept of generalized order statistics (GOS) is proposed as a unified framework that explains the similarities and analogies between the two models it contains (ordinary order statistics and record values) and generalizes related results; sufficient conditions for the existence of moments of GOS are also given.

670 citations


Posted Content
TL;DR: In this paper, a general time inhomogeneous multiple spell model is presented, which contains a variety of useful models as special cases, and conditions are discussed under which access to multiple spell data aids in solving the sensitivity problem.
Abstract: This paper considers the formulation and estimation of continuous time social science duration models. The focus is on new issues that arise in applying statistical models developed in biostatistics to analyze economic data and formulate economic models. Both single spell and multiple spell models are discussed. In addition, we present a general time inhomogeneous multiple spell model which contains a variety of useful models as special cases. Four distinctive features of social science duration analysis are emphasized: (1) Because of the limited size of samples available in economics and because of an abundance of candidate observed explanatory variables and plausible omitted explanatory variables, standard nonparametric procedures used in biostatistics are of limited value in econometric duration analysis. It is necessary to control for observed and unobserved explanatory variables to avoid biasing inference about underlying duration distributions. Controlling for such variables raises many new problems not discussed in the available literature. (2) The environments in which economic agents operate are not the time homogeneous laboratory environments assumed in biostatistics and reliability theory. Ad hoc methods for controlling for time inhomogeneity produce badly biased estimates. (3) Because the data available to economists are not obtained from the controlled experimental settings available to biologists, doing econometric duration analysis requires accounting for the effect of sampling plans on the distributions of sampled spells. (4) Econometric duration models that incorporate the restrictions produced by economic theory only rarely can be represented by the models used by biostatisticians. The estimation of structural econometric duration models raises new statistical and computational issues. Because of (1) it is necessary to parameterize econometric duration models to control for both observed and unobserved explanatory variables. Economic theory only provides qualitative guidance on the matter of selecting a functional form for a conditional hazard, and it offers no guidance at all on the matter of choosing a distribution of unobservables. This is unfortunate because empirical estimates obtained from econometric duration models are very sensitive to assumptions made about the functional forms of these model ingredients. In response to this sensitivity we present criteria for inferring qualitative properties of conditional hazards and distributions of unobservables from raw duration data sampled in time homogeneous environments; i.e. from unconditional duration distributions. No parametric structure need be assumed to implement these procedures. We also note that current econometric practice overparameterizes duration models. Given a functional form for a conditional hazard determined up to a finite number of parameters, it is possible to consistently estimate the distribution of unobservables nonparametrically. We report on the performance of such an estimator and show that it helps to solve the sensitivity problem. We demonstrate that in principle it is possible to identify both the conditional hazard and the distribution of unobservables without assuming parametric functional forms for either. Tradeoffs in assumptions required to secure such model identification are discussed.
Although under certain conditions a fully nonparametric model can be identified, the development of a consistent fully nonparametric estimator remains to be done. We also discuss conditions under which access to multiple spell data aids in solving the sensitivity problem. A superficially attractive conditional likelihood approach produces inconsistent estimators, but the practical significance of this inconsistency is not yet known. Conditional inference schemes for eliminating unobservables from multiple spell duration models that are based on sufficient or ancillary statistics require unacceptably strong assumptions about the functional forms of conditional hazards and so are not robust. Contrary to recent claims, they offer no general solution to the model sensitivity problem. The problem of controlling for time inhomogeneous environments (Point (2)) remains to be solved. Failure to control for time inhomogeneity produces serious biases in estimated duration models. Controlling for time inhomogeneity creates a potential identification problem. For single spell data it is impossible to separate the effect of duration dependence from the effect of time inhomogeneity by a fully nonparametric procedure. Although it is intuitively obvious that access to multiple spell data aids in the solution of this identification problem, the development of precise conditions under which this is possible is a topic left for future research. We demonstrate how sampling schemes distort the functional forms of sample duration distributions away from the population duration distributions that are the usual object of econometric interest (Point (3)). Inference based on misspecified duration distributions is in general biased. New formulae for the densities of commonly used duration measures are produced for duration models with unobservables in time inhomogeneous environments. We show how access to spells that begin after the origin date of a sample aids in solving econometric problems created by the sampling schemes that are used to generate economic duration data. We also discuss new issues that arise in estimating duration models explicitly derived from economic theory (Point (4)). For a prototypical search unemployment model we discuss and resolve new identification problems that arise in attempting to recover structural economic parameters. We also consider nonstandard statistical problems that arise in estimating structural models that are not treated in the literature. Imposing or testing the restrictions implied by economic theory requires duration models that do not appear in the received literature and often requires numerical solution of implicit equations derived from optimizing theory.

500 citations


Book
15 Dec 1995
TL;DR: In this book, the authors present univariate and spatial descriptive statistics, parametric and nonparametric inference, correlation and regression analysis, and time series models for geographic data.
Abstract: 1. Statistics and Geography I. Descriptive Statistics 2. Univariate Descriptive Statistics 3. Descriptive Statistics for Spatial Distributions II. Inferential Statistics 5. Elementary Probability Theory 6. Random Variables and Probability Distributions 7. Sampling 8. Parametric Statistical Inference: Estimation 9. Parametric Statistical Inference: Hypothesis Testing 10. Parametric Statistical Inference: Two Sample Tests 11. Nonparametric Statistics III. Statistical Relationships Between Two Variables 12. Correlation Analysis 13. Introduction to Regression Analysis 14. Inferential Aspects of Regression Analysis 15. Time Series Models IV. Modern Methods of Analysis 16. Exploratory Data Analysis 17. Bootstrapping and Related Computer Intensive Methods

493 citations


Journal ArticleDOI
TL;DR: This article examined the predictive performance of four structural exchange rate models using both parametric and nonparametric techniques and found that error correction terms can explain exchange rate movements significantly better than a no change forecast for a subset of the models and currencies.

410 citations


Book
01 Jan 1995
TL;DR: Topics include coherent systems analysis, lifetime distributions, parametric and specialized lifetime models, repairable systems, and nonparametric methods and model adequacy.
Abstract: 1 Introduction 2 Coherent Systems Analysis 3 Lifetime Distributions 4 Parametric Lifetime Models 5 Specialized Models 6 Repairable Systems 7 Lifetime Data Analysis 8 Parametric Estimation for Models without Covariates 9 Parametric Estimation for Models with Covariates 10 Nonparametric Methods and Model Adequacy

360 citations


Journal ArticleDOI
01 Mar 1995-Genetics
TL;DR: This paper determines the appropriate significance level for the statistic ZW, by showing that its asymptotic null distribution follows an Ornstein-Uhlenbeck process, and provides a robust, distribution-free method for mapping QTLs.
Abstract: Genetic mapping of quantitative trait loci (QTLs) is performed typically by using a parametric approach, based on the assumption that the phenotype follows a normal distribution. Many traits of interest, however, are not normally distributed. In this paper, we present a nonparametric approach to QTL mapping applicable to any phenotypic distribution. The method is based on a statistic ZW, which generalizes the nonparametric Wilcoxon rank-sum test to the situation of whole-genome search by interval mapping. We determine the appropriate significance level for the statistic ZW, by showing that its asymptotic null distribution follows an Ornstein-Uhlenbeck process. These results provide a robust, distribution-free method for mapping QTLs.
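To make the rank-based idea concrete, the sketch below computes a standardized Wilcoxon rank-sum statistic at a single fully informative marker. The simulated phenotype, genotype split, and QTL effect are hypothetical; the paper's Z_W statistic extends this to whole-genome interval mapping, and the genome-wide significance thresholds from the Ornstein-Uhlenbeck approximation are not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical backcross-style data: a non-normal phenotype and a 0/1 genotype
# at a single fully informative marker (illustrative values only).
phenotype = rng.gamma(shape=2.0, scale=1.0, size=200)
genotype = rng.integers(0, 2, size=200)
phenotype[genotype == 1] += 0.5          # a modest simulated QTL effect

# Wilcoxon rank-sum statistic at this marker: rank all phenotypes,
# sum the ranks of one genotype class, and standardize.
ranks = np.argsort(np.argsort(phenotype)) + 1     # ranks 1..n (no ties here)
n1 = int(np.sum(genotype == 1))
n0 = int(np.sum(genotype == 0))
W = ranks[genotype == 1].sum()
mean_W = n1 * (n1 + n0 + 1) / 2
var_W = n1 * n0 * (n1 + n0 + 1) / 12
Z = (W - mean_W) / np.sqrt(var_W)
print(f"standardized rank-sum statistic Z = {Z:.2f}")
```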

346 citations


Book
16 Nov 1995
TL;DR: In this book, the authors introduce statistics and the scientific method, cover displaying and summarizing data, probability and Bayes' rule, comparing proportions and means, nonparametric methods, and regression analysis, and provide answers to selected exercises.
Abstract: 1. Statistics and the Scientific Method 2. Displaying and Summarizing Data 3. Designing Experiments 4. Probability and Uncertainty 5. Conditional Probability and Bayes' Rule 6. Models for Proportions 7. Densities for Proportions 8. Comparing Two Proportions 9. Densities for Two Proportions 10. General Samples and Population Means 11. Densities for Means 12. Comparing Two or More Means 13. Data Transformations and Nonparametric Methods 14. Regression Analysis Answers to Selected Exercises

341 citations


Journal ArticleDOI
TL;DR: The authors compare the performance of univariate homoskedastic, GARCH, autoregressive, and nonparametric models for conditional variances, using five bilateral weekly exchange rates for the dollar, 1973-1989.

325 citations



Journal ArticleDOI
TL;DR: In this paper, the Efficient Method of Moments (EMM) is used to fit the standard stochastic volatility model and various extensions to several daily financial time series, and the standard model is rejected, although some extensions are accepted.
Abstract: Efficient Method of Moments (EMM) is used to fit the standard stochastic volatility model and various extensions to several daily financial time series. EMM matches to the score of a model determined by data analysis called the score generator. Discrepancies reveal characteristics of data that stochastic volatility models cannot approximate. The two score generators employed here are "Semiparametric ARCH" and "Nonlinear Nonparametric". With the first, the standard model is rejected, although some extensions are accepted. With the second, all versions are rejected. The extensions required for an adequate fit are so elaborate that nonparametric specifications are probably more convenient.

Journal ArticleDOI
TL;DR: In this article, the authors apply nonparametric regression models to estimation of demand curves of the type most often used in applied research and derive estimates of exact consumers surplus and deadweight loss from the demand curve estimators.
Abstract: We apply nonparametric regression models to estimation of demand curves of the type most often used in applied research. From the demand curve estimators we derive estimates of exact consumers surplus and deadweight loss, which are the most widely used welfare and economic efficiency measures in areas of economics such as public finance. We also develop tests of the symmetry and downward sloping properties of compensated demand. We work out asymptotic normal sampling theory for kernel and series nonparametric estimators, as well as for the parametric case. The paper includes an application to gasoline demand. Empirical questions of interest here are the shape of the demand curve and the average magnitude of welfare loss from a tax on gasoline. In this application we compare parametric and nonparametric estimates of the demand curve, calculate exact and approximate measures of consumers surplus and deadweight loss, and give standard error estimates. We also analyze the sensitivity of the welfare measures to components of nonparametric regression estimators such as the number of terms in a series approximation.
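As a rough illustration of the approach, the sketch below estimates a demand curve by kernel (Nadaraya-Watson) regression on simulated price-quantity data and integrates it to approximate the surplus loss from a price increase. All numbers, the bandwidth, and the function name demand_at are made up, and the paper's exact consumer surplus measures also account for income effects that this Marshallian approximation ignores.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated cross-section of prices and quantities (purely illustrative numbers).
price = rng.uniform(1.0, 2.0, size=500)
quantity = 10.0 * price ** -0.8 * np.exp(0.1 * rng.standard_normal(500))

def demand_at(p, h=0.1):
    """Nadaraya-Watson (kernel) estimate of expected quantity demanded at price p."""
    w = np.exp(-0.5 * ((p - price) / h) ** 2)
    return np.sum(w * quantity) / np.sum(w)

# Approximate the surplus loss from a price rise p0 -> p1 by integrating the
# estimated demand curve (trapezoid rule over a price grid).
p0, p1 = 1.2, 1.5
grid = np.linspace(p0, p1, 200)
demand = np.array([demand_at(p) for p in grid])
surplus_loss = float(np.sum(0.5 * (demand[1:] + demand[:-1]) * np.diff(grid)))
print(f"approximate consumer surplus loss: {surplus_loss:.2f}")
```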

Book
01 Jun 1995
TL;DR: In this monograph, a unified treatment of the analysis and calculation of the asymptotic efficiencies of nonparametric tests is presented, and powerful new methods are developed to evaluate explicitly different kinds of efficiencies.
Abstract: Making a substantiated choice of the most efficient statistical test is one of the basic problems of statistics. Asymptotic efficiency is an indispensable technique for comparing and ordering statistical tests in large samples. It is especially useful in nonparametric statistics where it is usually necessary to rely on heuristic tests. This monograph presents a unified treatment of the analysis and calculation of the asymptotic efficiencies of nonparametric tests. Powerful new methods are developed to evaluate explicitly different kinds of efficiencies. Of particular interest is the description of domains of the Bahadur local optimality and related characterisation problems based on recent research by the author. Other Russian results are also published here for the first time in English. Researchers, professionals and students in statistics will find this book invaluable.

Journal ArticleDOI
TL;DR: In this article, the authors propose two consistent one-sided specification tests for parametric regression models, one based on the sample covariance between the residual from the parametric model and the discrepancy between parametric and nonparametric fitted values, and the other based on a difference in sums of squared residuals between the parameterized and non-parametric models, which can be viewed as a test of the joint hypothesis that the true parameters of a series regression model are zero.
Abstract: This paper proposes two consistent one-sided specification tests for parametric regression models, one based on the sample covariance between the residual from the parametric model and the discrepancy between the parametric and nonparametric fitted values; the other based on the difference in sums of squared residuals between the parametric and nonparametric models. We estimate the nonparametric model by series regression. The new test statistics converge in distribution to a unit normal under correct specification and grow to infinity faster than the parametric rate (n^{-1/2}) under misspecification, while avoiding weighting, sample splitting, and non-nested testing procedures used elsewhere in the literature. Asymptotically, our tests can be viewed as a test of the joint hypothesis that the true parameters of a series regression model are zero, where the dependent variable is the residual from the parametric model, and the series terms are functions of the explanatory variables, chosen so as to support nonparametric estimation of a conditional expectation. We specifically consider Fourier series and regression splines, and present a Monte Carlo study of the finite sample performance of the new tests in comparison to consistent tests of Bierens (1990), Eubank and Spiegelman (1990), Jayasuriya (1990), Wooldridge (1992), and Yatchew (1992); the results show the new tests have good power, performing quite well in some situations. We suggest a joint Bonferroni procedure that combines a new test with those of Bierens and Wooldridge to capture the best features of the three approaches.
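A much-simplified sketch of the series-regression idea, assuming a linear null model and a small power-series basis (the paper uses Fourier series and regression splines and standardizes its statistics differently): fit the parametric model, regress its residuals on series terms of the explanatory variable, and inspect an n*R^2-style diagnostic. All data and tuning choices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
x = rng.uniform(-2.0, 2.0, size=n)
y = x + 0.4 * x ** 3 + rng.standard_normal(n)     # true relation is nonlinear

# Step 1: fit the parametric (here, linear) model and keep its residuals.
X_par = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X_par, y, rcond=None)
resid = y - X_par @ beta

# Step 2: regress those residuals on series terms in x (a short power series
# stands in for the Fourier or spline bases used in the paper).
series = np.column_stack([x ** k for k in range(1, 6)])
gamma, *_ = np.linalg.lstsq(series, resid, rcond=None)
fitted = series @ gamma

# Step 3: an n*R^2-style diagnostic; large values suggest the residuals still
# carry structure that the series terms can explain, i.e. misspecification.
r2 = 1.0 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)
print(f"n * R^2 = {n * r2:.1f} with {series.shape[1]} series terms")
```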

Journal ArticleDOI
TL;DR: In this paper, a number of consistency results for nonparametric kernel estimators of density and regression functions and their derivatives are presented, which allow for near-epoch dependent, nonidentically distributed random variables, data-dependent bandwidth sequences, preliminary estimation of parameters, and non-parametric regression on index functions.
Abstract: This paper presents a number of consistency results for nonparametric kernel estimators of density and regression functions and their derivatives. These results are particularly useful in semiparametric estimation and testing problems that rely on preliminary nonparametric estimators, as in Andrews (1994, Econometrica 62, 43–72). The results allow for near-epoch dependent, nonidentically distributed random variables, data-dependent bandwidth sequences, preliminary estimation of parameters (e.g., nonparametric regression based on residuals), and nonparametric regression on index functions.

Journal ArticleDOI
TL;DR: In this paper, the authors developed a class of semiparametric methods that are designed to work better than the kernel estimator in a broad nonparametric neighbourhood of a given parametric class of densities, for example, the normal, while not losing much in precision when the true density is far from the parametric classes.
Abstract: The traditional kernel density estimator of an unknown density is by construction completely nonparametric in the sense that it has no preferences and will work reasonably well for all shapes. The present paper develops a class of semiparametric methods that are designed to work better than the kernel estimator in a broad nonparametric neighbourhood of a given parametric class of densities, for example, the normal, while not losing much in precision when the true density is far from the parametric class. The idea is to multiply an initial parametric density estimate with a kernel-type estimate of the necessary correction factor. This works well in cases where the correction factor function is less rough than the original density itself. Extensive comparisons with the kernel estimator are carried out, including exact analysis for the class of all normal mixtures. The new method, with a normal start, wins quite often, even in many cases where the true density is far from normal. Procedures for choosing the smoothing parameter of the estimator are also discussed. The new estimator should be particularly useful in higher dimensions, where the usual nonparametric methods have problems. The idea is also spelled out for nonparametric regression.
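A one-dimensional sketch of the multiplicative idea described above, assuming a normal start and an arbitrary fixed bandwidth (the data, bandwidth, and function names are illustrative, and the paper's smoothing-parameter selection is not shown): the parametric density estimate is multiplied by a kernel-type estimate of the correction factor.

```python
import numpy as np

rng = np.random.default_rng(3)

# Skewed (clearly non-normal) data; sample size and bandwidth are illustrative.
data = rng.gamma(shape=3.0, scale=1.0, size=300)
mu, sigma = data.mean(), data.std(ddof=1)

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def corrected_density(x, h=0.5):
    """Parametric start times a kernel estimate of the correction factor."""
    start = normal_pdf(x, mu, sigma)                                  # normal start
    kernel = np.exp(-0.5 * ((x - data) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    correction = np.mean(kernel / normal_pdf(data, mu, sigma))        # kernel-estimated correction
    return start * correction

grid = np.linspace(0.0, 10.0, 6)
print([round(corrected_density(g), 4) for g in grid])
```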

Journal ArticleDOI
TL;DR: The operation and performance of an algorithm for segmenting connected points into a combination of representations such as lines, circular, elliptical and superelliptical arcs, and polynomials is described and demonstrated.
Abstract: This paper describes and demonstrates the operation and performance of an algorithm for segmenting connected points into a combination of representations such as lines, circular, elliptical and superelliptical arcs, and polynomials. The algorithm has a number of interesting properties including being scale invariant, nonparametric, general purpose, and efficient.

Journal ArticleDOI
TL;DR: In this paper, the authors consider regression analysis when incomplete or auxiliary covariate data are available for all study subjects and, in addition, for a subset called the validation sample, true covariates of interest have been ascertained.
Abstract: We consider regression analysis when incomplete or auxiliary covariate data are available for all study subjects and, in addition, for a subset called the validation sample, true covariate data of interest have been ascertained. The term auxiliary data refers to data not in the regression model, but thought to be informative about the true missing covariate data of interest. We discuss a method which is nonparametric with respect to the association between available and missing data, allows missingness to depend on available response and covariate values, and is applicable to both cohort and case-control study designs. The method previously proposed by Flanders & Greenland (1991) and by Zhao & Lipsitz (1992) is generalised and asymptotic theory is derived. Our expression for the asymptotic variance of the estimator provides intuition regarding performance of the method. Optimal sampling strategies for the validation set are also suggested by the asymptotic results.

Posted Content
TL;DR: In this paper, a nonparametric estimation procedure for continuous-time stochastic models is proposed; because prices of derivative securities depend crucially on the form of the instantaneous volatility of the underlying process, the volatility function is left unrestricted and estimated nonparametrically.
Abstract: We propose a nonparametric estimation procedure for continuous-time stochastic models. Because prices of derivative securities depend crucially on the form of the instantaneous volatility of the underlying process, we leave the volatility function unrestricted and estimate it nonparametrically. Only discrete data are used but the estimation procedure still does not rely on replacing the continuous-time model by some discrete approximation. Instead the drift and volatility functions are forced to match the densities of the process. We estimate the stochastic differential equation followed by the short term interest rate and compute nonparametric prices for bonds and bond options.

Journal ArticleDOI
TL;DR: In this article, the relation between categorization and probability density function (pdf) estimation is examined from the perspective of the categorization process, and it is shown that the prototype model and several decision-bound models of categorization are parametric, whereas most exemplar models are nonparametric.

Journal ArticleDOI
TL;DR: In this article, a nonparametric control chart is presented for detecting changes in the process median (or mean), or changes in process variability when samples are taken at regular time intervals.
Abstract: Nonparametric control charts are presented for the problem of detecting changes in the process median (or mean), or changes in the process variability, when samples are taken at regular time intervals. The proposed procedures are based on sign-test statistics computed for each sample and are used in Shewhart and cumulative sum control charts. When the process is in control, the run length distributions for the proposed nonparametric control charts do not depend on the distribution of the observations. An additional advantage of the nonparametric control charts is that the variance of the process does not need to be established in order to set up a control chart for the mean. Comparisons with the corresponding parametric control charts are presented. It is also shown that curtailed sampling plans can considerably reduce the expected number of observations used in the Shewhart control schemes based on the sign statistic.
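A minimal sketch of the Shewhart-type sign chart, assuming a known in-control median and an illustrative control limit (the target median, sample size, shift, and limit are made-up choices; the CUSUM version and curtailed sampling are not shown): each sample is reduced to a sign statistic whose in-control distribution does not depend on the distribution of the observations.

```python
import numpy as np

rng = np.random.default_rng(4)

target_median = 10.0     # assumed known in-control process median
sample_size = 10
limit = 8                # signal when |SN| >= limit (an illustrative choice)

def sign_statistic(sample, median0):
    """SN = (# observations above median0) - (# observations below)."""
    return int(np.sum(sample > median0) - np.sum(sample < median0))

# Simulate 20 in-control samples followed by 10 samples with a shifted median.
for t in range(30):
    shift = 0.0 if t < 20 else 1.5
    sample = target_median + shift + rng.standard_normal(sample_size)
    sn = sign_statistic(sample, target_median)
    if abs(sn) >= limit:
        print(f"sample {t}: SN = {sn:+d} -> out-of-control signal")
```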

Journal ArticleDOI
TL;DR: In this article, an estimated partial likelihood method is proposed for estimating relative risk parameters, which is an extension of the estimated likelihood regression analysis method for uncensored data (Pepe, 1992; Pepe & Fleming, 1991).
Abstract: We consider the problem of missing covariate data in the context of censored failure time relative risk regression. Auxiliary covariate data, which are considered informative about the missing data but which are not explicitly part of the relative risk regression model, may be available. Full covariate information is available for a validation set. An estimated partial likelihood method is proposed for estimating relative risk parameters. This method is an extension of the estimated likelihood regression analysis method for uncensored data (Pepe, 1992; Pepe & Fleming, 1991). A key feature of the method is that it is nonparametric with respect to the association between the missing and observed, including auxiliary, covariate components. Asymptotic distribution theory is derived for the proposed estimated partial likelihood estimator in the case where the auxiliary or mismeasured covariates are categorical. Asymptotic efficiencies are calculated for exponential failure times using an exponential relative risk model. The estimated partial likelihood estimator compares favourably with a fully parametric maximum likelihood analysis. Comparisons are also made with a standard partial likelihood analysis which ignores the incomplete observations. Important efficiency gains can be made with the estimated partial likelihood method. Small sample properties are investigated through simulation studies.

Journal ArticleDOI
TL;DR: In this article, a general notion of universal consistency of nonparametric estimators is introduced that applies to regression estimation, conditional median estimation, curve fitting, pattern recognition, and learning concepts.
Abstract: A general notion of universal consistency of nonparametric estimators is introduced that applies to regression estimation, conditional median estimation, curve fitting, pattern recognition, and learning concepts. General methods for proving consistency of estimators based on minimizing the empirical error are shown. In particular, distribution-free almost sure consistency of neural network estimates and generalized linear estimators is established.

Journal ArticleDOI
TL;DR: Empirical modeling of high-frequency currency market data reveals substantial evidence for nonnormality, stochastic volatility, and other nonlinearities, and a new method for estimation of structural economic models is developed.

Posted Content
TL;DR: In this paper, the authors proposed a general methodology of bootstrapping in nonparametric frontier models and applied it to analyze the sensitivity of efficiency scores relative to sampling variations of the estimated frontier.
Abstract: Efficiency scores of production units are generally measured relative to an estimated production frontier. Nonparametric estimators (DEA, FDH, ...) are based on a finite sample of observed production units. The bootstrap is one easy way to analyze the sensitivity of efficiency scores relative to the sampling variations of the estimated frontier. The main point in validating the bootstrap is to define a reasonable data generating process in this complex framework and to propose a reasonable estimator of it. This provides a general methodology of bootstrapping in nonparametric frontier models. Some adapted methods are illustrated in analyzing the bootstrap sampling variations of input efficiency measures of electricity plants.

Journal ArticleDOI
TL;DR: In this paper, the asymptotic distributions of estimators of global integral functionals of the regression surface were derived for nonparametric regression with multiple random predictor variables, and the results were applied to the problem of obtaining reliable estimators for the non-parametric coefficient of determination, which is also called Pearson's correlation ratio.
Abstract: In a nonparametric regression setting with multiple random predictor variables, we give the asymptotic distributions of estimators of global integral functionals of the regression surface. We apply the results to the problem of obtaining reliable estimators for the nonparametric coefficient of determination. This coefficient, which is also called Pearson's correlation ratio, gives the fraction of the total variability of a response that can be explained by a given set of covariates. It can be used to construct measures of nonlinearity of regression and relative importance of subsets of regressors, and to assess the validity of other model restrictions. In addition to providing asymptotic results, we propose several data-based bandwidth selection rules and carry out a Monte Carlo simulation study of finite sample properties of these rules and associated estimators of explanatory power. We also provide two real data examples.
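A naive plug-in sketch of the quantity being estimated, assuming a Gaussian kernel and an arbitrary bandwidth on simulated data (the paper's asymptotic theory and data-based bandwidth rules are not reproduced): fit the regression surface by Nadaraya-Watson smoothing and take the share of response variance it explains.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
x = rng.uniform(-2.0, 2.0, size=n)
y = np.sin(2.0 * x) + 0.5 * rng.standard_normal(n)   # nonlinear signal plus noise

def cond_mean(x0, h=0.25):
    """Nadaraya-Watson estimate of E[Y | X = x0] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

m_hat = np.array([cond_mean(xi) for xi in x])

# Plug-in estimate of the nonparametric coefficient of determination
# (Pearson's correlation ratio): the share of Var(Y) explained by E[Y | X].
eta_sq = 1.0 - np.mean((y - m_hat) ** 2) / np.var(y)
print(f"estimated correlation ratio: {eta_sq:.2f}")
```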

Journal ArticleDOI
TL;DR: Nonparametric function estimation refers to methods that strive to approximate a target function locally, i.e., using data from a "small" neighborhood of the point of estimate.
Abstract: Nonparametric function estimation refers to methods that strive to approximate a target function locally, i.e., using data from a “small” neighborhood of the point of estimate. “Weak” assumptions, such as continuity of the target function and its differentiability to some order in the neighborhood, rather than an a priori assumption of the global form (e.g., linear or quadratic) of the entire target function are used. Traditionally, parametric assumptions (e.g., hydraulic conductivity is log normally distributed, floods follow a log Pearson III (LP3) distribution, annual stream flow is either log normal or gamma distributed, daily rainfall amounts are exponentially distributed, and the variograms of spatial hydrologic data follow a power law) have dominated statistical hydrologic estimation. Applications of nonparametric methods to some classical problems (frequency analysis, classification, spatial surface fitting, trend analysis, time series forecasting and simulation) of stochastic hydrology are reviewed.

Journal ArticleDOI
TL;DR: In this paper, a bootstrap methodology for constructing confidence intervals for means of DEA and econometrically estimated efficiency scores, Malmquist productivity indices, and other similar measures in small samples is presented.
Abstract: This paper provides a bootstrap methodology for constructing confidence intervals for means of DEA and econometrically estimated efficiency scores, Malmquist productivity indices, and other similar measures in small samples. The procedure is nonparametric since no distributional assumptions are required. An empirical example is provided.
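A minimal sketch of the percentile bootstrap for the mean of a set of efficiency scores; the scores below are fabricated stand-ins for the output of a DEA (or econometric frontier) computation, which is not shown, and the number of replications is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical efficiency scores for 30 production units (in place of DEA output).
scores = rng.beta(5.0, 2.0, size=30)

# Percentile bootstrap for the mean efficiency score.
B = 2000
boot_means = np.array([
    rng.choice(scores, size=scores.size, replace=True).mean() for _ in range(B)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean efficiency {scores.mean():.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```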

Journal ArticleDOI
TL;DR: In this paper, the authors introduce tests of linearity for time series based on nonparametric estimates of the conditional mean and the conditional variance, which are compared to a number of parametric tests and to non-parametric tests based on the bispectrum.
Abstract: We introduce tests of linearity for time series based on nonparametric estimates of the conditional mean and the conditional variance. The tests are compared to a number of parametric tests and to nonparametric tests based on the bispectrum. Because asymptotic expressions give poor approximations, the null distribution under linearity is constructed by resampling from the best linear approximation. The new tests perform well on the examples tested.
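A rough sketch of the resampling logic for a conditional-mean test at a single lag, assuming a Gaussian kernel, an arbitrary bandwidth and grid, and a simple mean-squared distance; the simulated series and all tuning constants are illustrative, and the paper's actual statistics and its conditional-variance tests differ.

```python
import numpy as np

rng = np.random.default_rng(7)

def kernel_cond_mean(x_lag, x_now, grid, h=0.4):
    """Kernel estimate of E[X_t | X_{t-1} = g] for each g in grid."""
    out = np.empty(grid.size)
    for i, g in enumerate(grid):
        w = np.exp(-0.5 * ((g - x_lag) / h) ** 2)
        out[i] = np.sum(w * x_now) / np.sum(w)
    return out

def linearity_distance(series):
    """Mean squared gap between kernel and best linear AR(1) conditional means."""
    x_lag, x_now = series[:-1], series[1:]
    a, b = np.polyfit(x_lag, x_now, 1)             # best linear approximation
    grid = np.linspace(np.quantile(x_lag, 0.1), np.quantile(x_lag, 0.9), 50)
    gap = kernel_cond_mean(x_lag, x_now, grid) - (a * grid + b)
    return float(np.mean(gap ** 2)), a, b

# An illustrative, mildly nonlinear autoregressive series.
n = 400
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + 0.3 * np.exp(-x[t - 1] ** 2) + rng.standard_normal()

stat, a, b = linearity_distance(x)
resid = x[1:] - (a * x[:-1] + b)

# Null distribution: resample residuals, rebuild series from the fitted linear
# model, and recompute the distance statistic.
null_stats = []
for _ in range(200):
    innov = rng.choice(resid, size=n, replace=True)
    z = np.zeros(n)
    for t in range(1, n):
        z[t] = a * z[t - 1] + b + innov[t]
    null_stats.append(linearity_distance(z)[0])

p_value = float(np.mean(np.array(null_stats) >= stat))
print(f"distance statistic {stat:.4f}, resampling p-value {p_value:.3f}")
```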