
Showing papers on "Nonparametric statistics published in 1988"


Journal ArticleDOI
TL;DR: In this article, the authors proposed new tests for detecting the presence of a unit root in quite general time series models, which accommodate models with a fitted drift and a time trend so that they may be used to discriminate between unit root nonstationarity and stationarity about a deterministic trend.
Abstract: SUMMARY This paper proposes new tests for detecting the presence of a unit root in quite general time series models. Our approach is nonparametric with respect to nuisance parameters and thereby allows for a very wide class of weakly dependent and possibly heterogeneously distributed data. The tests accommodate models with a fitted drift and a time trend so that they may be used to discriminate between unit root nonstationarity and stationarity about a deterministic trend. The limiting distributions of the statistics are obtained under both the unit root null and a sequence of local alternatives. The latter noncentral distribution theory yields local asymptotic power functions for the tests and facilitates comparisons with alternative procedures due to Dickey & Fuller. Simulations are reported on the performance of the new tests in finite samples.

16,874 citations
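The Dickey-Fuller comparison mentioned in the abstract can be made concrete. Below is a minimal NumPy sketch, not the paper's method: it computes the t-statistic for the lagged level in a Dickey-Fuller regression with drift, which the new tests modify with a nonparametric long-run variance correction (omitted here). The function name `df_t_stat` is illustrative.

```python
import numpy as np

def df_t_stat(y):
    """t-statistic for rho = 0 in the regression
    dy_t = alpha + rho * y_{t-1} + e_t (Dickey-Fuller form).
    The paper's tests apply a nonparametric long-run variance
    correction to a statistic of this kind; omitted in this sketch."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    e = dy - X @ beta
    s2 = e @ e / (len(dy) - 2)           # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)    # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))  # unit root: stat near 0
print(round(float(df_t_stat(walk)), 2))
```

For stationary data the statistic is strongly negative, which is what the tests exploit to discriminate the two cases.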


Journal ArticleDOI
TL;DR: A nonparametric approach to the analysis of areas under correlated ROC curves is presented, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
Abstract: Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.

16,496 citations
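The area estimate at the heart of this approach is a generalized U-statistic: the Mann-Whitney probability that a diseased case scores above a non-diseased one. A minimal sketch of the point estimate follows (the covariance matrix for correlated curves, the paper's main contribution, is not shown):

```python
import numpy as np

def auc_mann_whitney(pos, neg):
    """Nonparametric AUC estimate: the Mann-Whitney U-statistic
    P(X_pos > X_neg) + 0.5 * P(tie), averaged over all pairs."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Perfect separation gives AUC = 1.0:
print(auc_mann_whitney([0.8, 0.9], [0.1, 0.2]))  # 1.0
```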


Journal ArticleDOI
TL;DR: In this paper, the authors argue that the quadratic assignment procedure (QAP) is superior to OLS for testing hypotheses in both simple and multiple regression models based on dyadic data, such as found in network analysis.

946 citations


Journal ArticleDOI
TL;DR: In this paper, the authors discuss the use of standard logistic regression techniques to estimate hazard rates and survival curves from censored data and demonstrate the increased structure that can be seen in a parametric analysis, as compared with the nonparametric Kaplan-Meier survival curves.
Abstract: We discuss the use of standard logistic regression techniques to estimate hazard rates and survival curves from censored data. These techniques allow the statistician to use parametric regression modeling on censored data in a flexible way that provides both estimates and standard errors. An example is given that demonstrates the increased structure that can be seen in a parametric analysis, as compared with the nonparametric Kaplan-Meier survival curves. In fact, the logistic regression estimates are closely related to Kaplan-Meier curves, and approach the Kaplan-Meier estimate as the number of parameters grows large.

886 citations
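Since the abstract benchmarks the parametric estimates against Kaplan-Meier curves, a minimal sketch of that nonparametric baseline may help (an illustration assuming distinct event times; tied deaths would require grouping):

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival curve: product over event times of
    (1 - 1/n_i), where n_i is the number at risk just before t_i.
    Assumes distinct times; events = 1 for death, 0 for censoring."""
    order = np.argsort(times)
    t, e = np.asarray(times)[order], np.asarray(events)[order]
    n = len(t)
    surv, s = [], 1.0
    for i in range(n):
        if e[i]:
            s *= 1.0 - 1.0 / (n - i)  # n - i subjects still at risk
        surv.append(s)
    return t, np.array(surv)

t, s = kaplan_meier([2, 3, 5, 7], [1, 1, 0, 1])
print(np.round(s, 3))  # steps down at deaths, flat at the censoring
```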



Journal ArticleDOI
TL;DR: In this article, a unified framework within which many commonly used bootstrap critical points and confidence intervals may be discussed and compared is developed, and seven different bootstrap methods are examined, each being usable in both parametric and nonparametric contexts.
Abstract: We develop a unified framework within which many commonly used bootstrap critical points and confidence intervals may be discussed and compared. In all, seven different bootstrap methods are examined, each being usable in both parametric and nonparametric contexts. Emphasis is on the way in which the methods cope with first- and second-order departures from normality. Percentile-$t$ and accelerated bias-correction emerge as the most promising of existing techniques. Certain other methods are shown to lead to serious errors in coverage and position of critical point. An alternative approach, based on "shortest" bootstrap confidence intervals, is developed. We also make several more technical contributions. In particular, we confirm Efron's conjecture that accelerated bias-correction is second-order correct in a variety of multivariate circumstances, and give a simple interpretation of the acceleration constant.

750 citations
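Of the seven methods compared, percentile-t is one of the two the paper finds most promising. A rough sketch for the mean of a single sample follows (illustrative only, with an invented name `percentile_t_ci`; the paper treats a much more general framework):

```python
import numpy as np

def percentile_t_ci(x, level=0.95, B=2000, seed=0):
    """Percentile-t bootstrap CI for the mean: bootstrap the
    studentized statistic t* = (mean* - mean) / se*, then invert
    its quantiles around the observed mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    n, m = len(x), x.mean()
    se = x.std(ddof=1) / np.sqrt(n)
    samples = rng.choice(x, size=(B, n), replace=True)
    se_star = samples.std(axis=1, ddof=1) / np.sqrt(n)
    t_star = (samples.mean(axis=1) - m) / se_star
    lo_q, hi_q = np.quantile(t_star, [(1 - level) / 2, (1 + level) / 2])
    return m - hi_q * se, m - lo_q * se   # note the reversed quantiles

x = np.random.default_rng(1).exponential(size=50)
lo, hi = percentile_t_ci(x)
print(lo < x.mean() < hi)  # True
```

Studentizing is what buys the second-order accuracy the paper emphasizes; the plain percentile interval skips it and pays in coverage error.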


BookDOI
TL;DR: In this article, a decision theory formulation for population selection followed by estimating the mean of the selected population is presented, and the problem of finding the largest normal mean under Heteroscedasticity is addressed.
Abstract: 1 - Selection, Ranking, and Multiple Comparisons.- Sequential Selection Procedures for Multi-Factor Experiments Involving Koopman-Darmois Populations with Additivity.- Selection Problem for a Modified Multinomial (Voting) Model.- A Decision Theory Formulation for Population Selection Followed by Estimating the Mean of the Selected Population.- On the Problem of Finding the Largest Normal Mean under Heteroscedasticity.- On Least Favorable Configurations for Some Poisson Selection Rules and Some Conditional Tests.- Selection of the Best Normal Populations Better Than a Control: Dependence Case.- Inference about the Change-Point in a Sequence of Random Variables: A Selection Approach.- On Confidence Sets in Multiple Comparisons.- 2 - Asymptotic and Sequential Analysis.- The VPRT: Optimal Sequential and Nonsequential Testing.- An Edgeworth Expansion for the Distribution of the F-Ratio under a Randomization Model for the Randomized Block Design.- On Bayes Sequential Tests.- Stochastic Search in a Square and on a Torus.- Distinguished Statistics, Loss of Information and a Theorem of Robert B. Davies.- Prophet Inequalities for Threshold Rules for Independent Bounded Random Variables.- Weak Convergence of the Aalen Estimator for a Censored Renewal Process.- Sequential Stein-Rule Maximum Likelihood Estimation: General Asymptotics.- Fixed Proportional Accuracy in Three Stages.- 3 - Estimation and Testing.- Dominating Inadmissible Tests in Exponential Family Models.- On Estimating Change Point in a Failure Rate.- A Nonparametric, Intersection-Union Test for Stochastic Order.- On Estimating the Number of Unseen Species and System Reliability.- The Effects of Variance Function Estimation on Prediction and Calibration: An Example.- On Estimating a Parameter and Its Score Function, II.- A Simple Test for the Equality of Correlation Matrices.- Conditions of Rao's Covariance Method Type for Set-Valued Estimators.- Conservation of Properties of Optimality of Some Statistical Tests and Point Estimators under Extensions of Distributions.- Some Recent Results in Signal Detection.- 4 - Design and Comparison of Experiments and Distributions.- Comparison of Experiments and Information in Censored Data.- A Note on Approximate D-Optimal Designs for G x 2m.- Some Statistical Design Aspects of Estimating Automotive Emission Deterioration Factors.- Peakedness in Multivariate Distributions.- Spatial Designs.

572 citations


Book
17 Nov 1988
TL;DR: This book introduces nonparametric methods, covering location estimates and distribution tests for single samples, methods for paired and independent samples, bivariate and multivariate data, counts and categories, and robustness, jackknives, and bootstraps.
Abstract: Introducing nonparametric methods. Location estimates for single samples. Distribution tests and rank transformations for single samples. Methods for paired samples. Tests and estimation for two independent samples. Three or more samples. Bivariate and multivariate data. Counts and categories. Robustness, jackknives and bootstraps. Looking ahead.

511 citations
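As a taste of the paired-sample methods listed in the contents, here is the exact sign test, one of the simplest nonparametric procedures for paired data (a generic sketch, not code from the book):

```python
from math import comb

def sign_test_p(diffs):
    """Exact two-sided sign test for paired data: under H0 the
    number of positive differences is Binomial(n, 1/2).
    Zero differences are dropped, following common practice."""
    d = [x for x in diffs if x != 0]
    n, k = len(d), sum(x > 0 for x in d)
    tail = min(k, n - k)
    p = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * p)

# Nine of ten paired differences positive:
print(sign_test_p([1, 2, 1, 3, 2, 1, 4, 1, 2, -1]))  # 22/1024, about 0.0215
```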


Book
01 Jan 1988
TL;DR: This book develops kernel and weighted local least squares methods for nonparametric regression with longitudinal data, including optimal and boundary kernels, global and local bandwidth choice, and FORTRAN routines for kernel smoothing and differentiation.
Abstract: 1. Introduction.- 2. Longitudinal data and regression models.- 2.1 Longitudinal data.- 2.2 Regression models.- 2.3 Longitudinal growth curves.- 3. Nonparametric regression methods.- 3.1 Kernel estimates.- 3.2 Weighted local least squares estimates.- 3.3 Smoothing splines.- 3.4 Orthogonal series estimates.- 3.5 Discussion.- 3.6 Heart pacemaker study.- 4. Kernel and weighted local least squares methods.- 4.1 Mean Squared Error of kernel estimates for curves and derivatives.- 4.2 Asymptotic normality.- 4.3 Boundary effects and Integrated Mean Squared Error.- 4.4 Muscular activity as a function of force.- 4.5 Finite sample comparisons.- 4.6 Equivalence of weighted local regression and kernel estimators.- 5. Optimization of kernel and weighted local regression methods.- 5.1 Optimal designs.- 5.2 Choice of kernel functions.- 5.3 Minimum variance kernels.- 5.4 Optimal kernels.- 5.5 Finite evaluation of higher order kernels.- 5.6 Further criteria for kernels.- 5.7 A hierarchy of smooth optimum kernels.- 5.8 Smooth optimum boundary kernels.- 5.9 Choice of the order of kernels for estimating b? functions.- 6. Multivariate kernel estimators.- 6.1 Definition and MSE/IMSE.- 6.2 Boundary effects and dimension problem.- 6.3 Rectangular designs and product kernels.- 7. Choice of global and local bandwidths.- 7.1 Overview.- 7.2 Pilot methods.- 7.3 Cross-validation and related methods.- 7.4 Bandwidth choice for derivatives.- 7.5 Confidence intervals for anthropokinetic data.- 7.6 Local versus global bandwidth choice.- 7.7 Weak convergence of a local bandwidth process.- 7.8 Practical local bandwidth choice.- 8. Longitudinal parameters.- 8.1 Comparison of samples of curves.- 8.2 Definition of longitudinal parameters and consistency.- 8.3 Limit distributions.- 9.
Nonparametric estimation of the human height growth curve.- 9.1 Introduction.- 9.2 Choice of kernels and bandwidths.- 9.3 Comparison of parametric and nonparametric regression.- 9.4 Estimation of growth velocity and acceleration.- 9.5 Longitudinal parameters for growth curves.- 9.6 Growth spurts.- 10. Further applications.- 10.1 Monitoring and prognosis based on longitudinal medical data.- 10.2 Estimation of heteroscedasticity and prediction intervals.- 10.3 Further developments.- 11. Consistency properties of moving weighted averages.- 11.1 Local weak consistency.- 11.2 Uniform consistency.- 12. FORTRAN routines for kernel smoothing and differentiation.- 12.1 Structure of main routines KESMO and KERN.- 12.2 Listing of programs.- References.

353 citations
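A minimal sketch of the kernel (Nadaraya-Watson) estimator of section 3.1 may be useful; the larger error near the endpoints is the boundary effect that motivates the book's boundary kernels (illustrative NumPy code, not the book's FORTRAN routines):

```python
import numpy as np

def nw_smooth(x, y, grid, h):
    """Nadaraya-Watson kernel regression with a Gaussian kernel:
    at each grid point, a weighted average of y with weights
    K((g - x_i) / h)."""
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)                      # noiseless, for illustration
est = nw_smooth(x, y, x, h=0.2)
# Interior error is small (order h^2); boundary error is larger
# (order h), since the kernel average becomes one-sided there.
print(round(float(np.abs(est - y)[20:-20].max()), 3))
```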


Journal ArticleDOI
21 May 1988-BMJ
TL;DR: Methods are described for calculating confidence intervals for a population median or other population quantiles from a sample of observations, and for the difference between two population medians or means (a non-parametric rather than a parametric approach) for both unpaired and paired samples.
Abstract: Gardner and Altman1 described the rationale behind the use of confidence intervals and gave methods for their calculation for a population mean and for differences between two population means for paired and unpaired samples. These methods are based on sample means, standard errors, and the t distribution and should strictly be used only for continuous data from Normal distributions (although small deviations from Normality are not important2). For non-Normal continuous data the median of the population or the sample is preferable to the mean as a measure of location. Medians are also appropriate in other situations, for example, when measurements are on an ordinal scale. This paper describes methods of calculating confidence intervals for a population median or for other population quantiles from a sample of observations. Calculations of confidence intervals for the difference between two population medians or means (a non-parametric approach rather than the parametric approach mentioned above) for both unpaired and paired samples are described. Worked examples are given for each situation. Because of the discrete nature of some of the sampling distributions involved in non-parametric analyses it is not usually possible to calculate confidence intervals with exactly the desired level of confidence. Hence, if a 95% confidence interval is wanted the choice is between the lowest possible level of confidence over 95% (a "conservative" interval) and the highest possible under 95%. There is no firm policy on which of these is preferred, but we will mainly describe conservative intervals in this paper. The exact level of confidence associated with any particular approximate level can be calculated from the distribution of the statistic being used. 
The methods outlined for obtaining confidence intervals are described in more detail in textbooks on non-parametric statistics.3 The calculations can be carried out using the statistical computer package MINITAB.4 A method for calculating confidence intervals for Spearman's rank correlation coefficient is given in an accompanying paper.5 A confidence interval indicates the precision of the sample statistic as an estimate of the overall population value. Confidence intervals convey the effects of sampling variation but cannot control for non-sampling errors in study design or conduct. They should not be used for basic description of the sample data but only for indicating the uncertainty in sample estimates for population values of medians or other statistics.

341 citations
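The order-statistic construction behind such median intervals can be sketched as follows (a generic illustration, not the paper's worked examples): the interval (x_(r), x_(n+1-r)) has exact coverage 1 - 2 P(Bin(n, 1/2) <= r - 1), and the conservative choice takes the largest r keeping this at or above the nominal level.

```python
from math import comb

def median_ci_ranks(n, level=0.95):
    """1-based ranks (r, n+1-r) whose order statistics form a
    conservative confidence interval for the median.  Coverage of
    (x_(r), x_(n+1-r)) is 1 - 2 * P(Bin(n, 1/2) <= r - 1); we take
    the largest r that keeps coverage >= level."""
    alpha = 1 - level
    r = 0
    while 2 * sum(comb(n, i) for i in range(r + 1)) / 2 ** n <= alpha:
        r += 1
    return r, n + 1 - r   # r = 0 means no valid interval for this n

print(median_ci_ranks(10))  # (2, 9): coverage about 0.979
```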


Book
01 Oct 1988
TL;DR: This book organizes families of life distributions used in reliability theory, from nonparametric and semiparametric families to parametric ones such as the exponential distribution and its extensions, the Pareto and F distributions, the inverse Gaussian distribution, and distributions with bounded support.
Abstract: Basics.- Preliminaries.- Ordering Distributions: Descriptive Statistics.- Mixtures.- Nonparametric Families.- Nonparametric Families: Densities and Hazard Rates.- Nonparametric Families: Origins in Reliability Theory.- Nonparametric Families: Inequalities for Moments and Survival Functions.- Semiparametric Families.- Semiparametric Families.- Parametric Families.- The Exponential Distribution.- Parametric Extensions of the Exponential Distribution.- Gompertz and Gompertz-Makeham Distributions.- The Pareto and F Distributions and Their Parametric Extensions.- Logarithmic Distributions.- The Inverse Gaussian Distribution.- Distributions with Bounded Support.- Additional Parametric Families.- Models Involving Several Variables.- Covariate Models.- Several Types of Failure: Competing Risks.- More About Semi-parametric Families.- Characterizations Through Coincidences of Semiparametric Families.- More About Semiparametric Families.- Complementary Topics.- Some Topics from Probability Theory.- Convexity and Total Positivity.- Some Functional Equations.- Gamma and Beta Functions.- Some Topics from Analysis.

Book
01 Jan 1988
TL;DR: This textbook covers probability, random variables (discrete, continuous, and mixed cases), point and interval estimation, tests of hypotheses, ranking and selection procedures, decision theory, and nonparametric statistical inference.
Abstract: Naive Set Theory. Probability. Random Variables: Discrete Case. Random Variables: Continuous and Mixed Cases. Moments. Sums of Random Variables, Probability Inequalities, and Limit Laws. Point Estimation. Data Reduction and Best Estimation (Sufficiency, Completeness, and UMVUE's). Tests of Hypotheses. Interval Estimation. Ranking and Selection Procedures. Decision Theory. Nonparametric Statistical Inference. Regression and Linear Statistical Inference. Analysis of Variance. Robust Statistical Procedures. Statistical Tables. Index.

Journal ArticleDOI
TL;DR: The author proposes a nonparametric estimation of transformations for regression that is much more flexible than the familiar Box-Cox procedure, allowing general smooth transformations of the variables, and is similar to the ACE (alternating conditional expectation) algorithm of Breiman and Friedman (1985).
Abstract: I propose a method for the nonparametric estimation of transformations for regression. It is much more flexible than the familiar Box-Cox procedure, allowing general smooth transformations of the variables, and is similar to the ACE (alternating conditional expectation) algorithm of Breiman and Friedman (1985). The ACE procedure uses scatterplot smoothers in an iterative fashion to find the maximally correlated transformations of the variables. Like ACE, my proposal can incorporate continuous, categorical, or periodic variables, or any mixture of these types. The method differs from ACE in that it uses a (nonparametric) variance-stabilizing transformation for the response variable. The technique seems to alleviate many of the anomalies that ACE suffers with regression data, including the inability to reproduce model transformations and sensitivity to the marginal distribution of the predictors. I provide several examples, including an analysis of the “brain and body weight” data and some data on ...

Posted Content
TL;DR: In this paper, the use of spectral regression techniques in the context of cointegrated systems of multiple time series was studied and several alternatives were considered including efficient and band spectral methods as well as system and single equation techniques.
Abstract: This paper studies the use of spectral regression techniques in the context of cointegrated systems of multiple time series. Several alternatives are considered including efficient and band spectral methods as well as system and single equation techniques. It is shown that single equation spectral regressions suffer asymptotic bias and nuisance parameter problems that render these regressions impotent for inferential purposes. By contrast systems methods are shown to be covered by LAMN asymptotic theory, bringing the advantages of asymptotic median unbiasedness, scale nuisance parameters and the convenience of asymptotic chi-squared tests. System spectral methods also have advantages over full system direct maximum likelihood in that they do not require complete specification of the error processes. Instead they offer a nonparametric treatment of regression errors which avoids certain methodological problems of dynamic specification and permits additional generality in the class of error processes.

Journal ArticleDOI
TL;DR: In this article, the authors discuss the possibility of truly nonparametric inference about functionals of an unknown density: discrete functionals, such as the number of modes of a density and the number of terms in the true model, and continuous functionals, such as the optimal bandwidth for kernel density estimates or the widths of confidence intervals for adaptive location estimators.
Abstract: This paper discusses the possibility of truly nonparametric inference about functionals of an unknown density. Examples considered include: discrete functionals, such as the number of modes of a density and the number of terms in the true model; and continuous functionals, such as the optimal bandwidth for kernel density estimates or the widths of confidence intervals for adaptive location estimators. For such functionals it is not generally possible to make two-sided nonparametric confidence statements. However, one-sided nonparametric confidence statements are possible: e.g., "I say with 95% confidence that the underlying distribution has at least three modes." Roughly, this is because the functionals of interest are semicontinuous with respect to the topology induced by a distribution-free metric. Then a neighborhood procedure can be used. The procedure is to find the minimum value of the functional over a neighborhood of the empirical distribution in function space. If this neighborhood is a nonparametric $1 - \alpha$ confidence region for the true distribution, the resulting minimum value lowerbounds the true value with a probability of at least $1 - \alpha$. This lower bound has good asymptotic properties in the high-confidence setting $\alpha$ close to 0.

Journal ArticleDOI
TL;DR: In this article, a formal definition of identification in a nonparametric context is presented, building on the work of Koopmans and Reiersol (1950).
Abstract: We present a formal definition of identification in a nonparametric context, building on the work of Koopmans and Reiersol (1950).

Journal ArticleDOI
Luc Devroye1
TL;DR: The Vapnik-Chervonenkis method can be used to choose the smoothing parameter in kernel-based rules, to choose k in the k-nearest neighbor rule, and to choose between parametric and nonparametric rules.
Abstract: A test sequence is used to select the best rule from a class of discrimination rules defined in terms of the training sequence. The Vapnik-Chervonenkis and related inequalities are used to obtain distribution-free bounds on the difference between the probability of error of the selected rule and the probability of error of the best rule in the given class. The bounds are used to prove the consistency and asymptotic optimality for several popular classes, including linear discriminators, nearest-neighbor rules, kernel-based rules, histogram rules, binary tree classifiers, and Fourier series classifiers. In particular, the method can be used to choose the smoothing parameter in kernel-based rules, to choose k in the k-nearest neighbor rule, and to choose between parametric and nonparametric rules.
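The model-selection idea can be sketched for choosing k in k-nearest-neighbor classification (an illustration of the test-sequence principle only; the paper's distribution-free bounds are not computed here, and the helper name `knn_select` is invented):

```python
import numpy as np

def knn_select(train_x, train_y, test_x, test_y, ks):
    """Pick k for 1-D k-NN classification by empirical error on a
    held-out test sequence.  Vapnik-Chervonenkis-type inequalities
    bound the gap between the selected rule's true error and the
    best candidate's true error (bounds omitted in this sketch)."""
    d = np.abs(test_x[:, None] - train_x[None, :])
    order = np.argsort(d, axis=1)        # neighbors sorted by distance
    errs = {}
    for k in ks:                         # odd k avoids vote ties
        votes = train_y[order[:, :k]].mean(axis=1) > 0.5
        errs[k] = float((votes != test_y).mean())
    return errs

rng = np.random.default_rng(0)
tx = rng.uniform(-1, 1, 200)
ty = (tx > 0).astype(int)
vx = rng.uniform(-1, 1, 200)
vy = (vx > 0).astype(int)
errs = knn_select(tx, ty, vx, vy, ks=[1, 3, 5])
best_k = min(errs, key=errs.get)
print(best_k, errs[best_k])
```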


Journal ArticleDOI
TL;DR: In this article, a nonparametric covariance sum method is proposed for statistical trend assessment in stream quality monitoring data, where the test statistic variance is computed as the sum of the covariances of the individual Mann-Kendall statistics.
Abstract: National and state fixed station stream quality monitoring networks have now been in existence for over ten years. The resulting data bases provide opportunities and challenges for statistical trend assessment. Although nonparametric tests have been developed that are well suited to such problems, the interpretation of variations in trend significance between seasons and variables remains a problem. One recently developed test is based on the sum of Mann-Kendall statistics over seasons or variables, with the test statistic variance computed as the sum of the covariances of the individual Mann-Kendall statistics. In this method, up- and downtrends can cancel, giving an overall indication of no trend. A related test which is sensitive to trend regardless of direction has been shown to behave poorly for typical stream quality record lengths. An alternative formulation which is sensitive to up- and downtrends and has power approaching that of the covariance sum method, is described. In addition, a variation of a contrast test for discriminating trend directions and magnitudes among variables or seasons where correlation between seasons or variables is present is described, and tests of its performance reported.
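The covariance-sum construction can be sketched as follows (point statistics only; the variance and covariance terms needed for the actual significance test are omitted):

```python
import numpy as np

def mann_kendall_s(x):
    """Mann-Kendall S statistic: the number of concordant pairs
    (later value larger) minus discordant pairs in the sequence."""
    x = np.asarray(x, float)
    signs = np.sign(x[None, :] - x[:, None])       # entry (i, j) = sign(x_j - x_i)
    return int(signs[np.triu_indices(len(x), 1)].sum())

def seasonal_s(series_by_season):
    """Covariance-sum idea: add the per-season S statistics, so up-
    and downtrends in different seasons can cancel.  The full test
    divides by a variance built from the covariances of the S's."""
    return sum(mann_kendall_s(s) for s in series_by_season)

up = [1, 2, 3, 4, 5]           # S = +10 (all pairs increasing)
down = [5, 4, 3, 2, 1]         # S = -10
print(seasonal_s([up, down]))  # 0: opposing seasonal trends cancel
```

This cancellation is exactly the behavior the abstract flags, motivating the alternative direction-insensitive formulation.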

Journal ArticleDOI
TL;DR: In this article, the authors review and explore the non-parametric density estimation approach for analysing various econometric functionals; some limitations of the non-parametric approach are examined, and potential future areas of applied and theoretical research are indicated.
Abstract: In this paper we have reviewed and explored the non-parametric density estimation approach for analysing various econometric functionals. The applications of density estimation have been emphasized in the specification, estimation, and testing problems arising in econometrics. Some limitations of the non-parametric approach are also examined, and potential future areas of applied and theoretical research have been indicated.

Journal ArticleDOI
TL;DR: In this article, the authors compare the empirical measure and the product of its marginals by taking a supremum over an appropriate Vapnik-Cervonenkis class of sets.
Abstract: Several tests based on the empirical measure have been proposed to test independence of variables, goodness of fit, equality of distributions, rotational invariance, and so forth. These tests have excellent power properties, but critical values are difficult, if not impossible, to obtain. Furthermore, these tests usually assume that the data are real-valued with continuous distributions. Here, critical values are determined by bootstrapping and the resulting tests are shown to have the correct asymptotic level under minimal assumptions. For example, given data Xi = (X i,1, …, Xi,d ), i = 1, …, n, it may be desired to test independence of the d components. The proposed test compares the empirical measure and the product of its marginals by taking a supremum over an appropriate Vapnik-Cervonenkis class of sets. No assumptions are made on the probability distribution of the data or on the space in which it lives; indeed, some components may be discrete, some continuous, and others categorical. Simil...

Journal ArticleDOI
TL;DR: In this article, an alternative to the local scoring method of Hastie and Tibshirani [Statist. Sci., 1 (1986), pp. 297-318] is provided for nonparametric estimation of the relative risk in the Cox model.
Abstract: An alternative to the local scoring method of Hastie and Tibshirani [Statist. Sci., 1 (1986), pp. 297–318] is provided for nonparametric estimation of the relative risk in the Cox model. The method involves penalized partial likelihood. Computations are carried out using a damped Newton–Raphson iteration. Each iterate is evaluated using an appropriately preconditioned conjugate gradient algorithm. The algorithm is globally convergent under mild conditions. One-step diagnostics are developed and cross-validation criteria are provided to guide the evaluation of the degree of smoothness of the estimator. These cross-validation scores have potential application to model selection in standard Cox regression contexts also. Bayesian confidence intervals akin to those of Wahba [J. Roy. Statist. Soc., 45 (1983), pp. 133–150] are defined. The performance of the methodology is illustrated on real and simulated data.

Book ChapterDOI
TL;DR: This chapter considers the nonparametric at-most-one-change (AMOC) changepoint problem and describes nonsequential AMOC procedures in terms of asymptotic results.
Abstract: Changepoint problems originally arose in the context of quality control, where one typically observes the output of a production line and wishes to signal deviation from an acceptable average output level while observing the data. When one observes a random process sequentially and stops observing at a random time of detecting change, one speaks of a sequential procedure. Otherwise, one observes a large finite sequence for the sake of determining possible change during the data collection; such procedures are described in terms of asymptotic results and are called nonsequential procedures. Sequential and nonsequential procedures are usually based on parametric or nonparametric models for changepoint problems, allowing at most one change (AMOC) or, possibly, more than one change. This chapter focuses on the nonparametric AMOC setting and discusses nonsequential nonparametric AMOC procedures. A large number of nonparametric and parametric modellings of AMOC problems result in the same test statistic: general rank statistics with quantile and Wilcoxon type scores, whose asymptotics are described in terms of a two-time-parameter stochastic process, and U-statistics type processes, which are considered for the nonparametric AMOC problem and detect change in the intensity parameter of a renewal process.

Journal ArticleDOI
TL;DR: A survey of the econometric and most relevant statistical literature on semiparametric inference can be found in this article, with a partial bibliography and a discussion of statistical properties.
Abstract: SUMMARY Semiparametric econometric models contain both parametric and nonparametric components, reflecting in some fashion what has been learned from economic theory and previous empirical experience, and what remains unknown. They raise such questions as how well the parametric component can be estimated, and how to construct rules of inference with good statistical properties. The paper attempts to survey the econometric and most relevant statistical literature on semiparametric inference, and includes a partial bibliography.


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of estimating the mean of an underlying exponential distribution and consider both fixed sample size problems and inverse sampling schemes, and demonstrate certain global optimality properties of an estimator based on the "total time on test" statistic.
Abstract: Consider an experiment in which only record-breaking values (e.g., values smaller than all previous ones) are observed. The data available may be represented as X1,K1,X2,K2, …, where X1,X2, … are successive minima and K1,K2, … are the numbers of trials needed to obtain new records. We treat the problem of estimating the mean of an underlying exponential distribution, and we consider both fixed sample size problems and inverse sampling schemes. Under inverse sampling, we demonstrate certain global optimality properties of an estimator based on the “total time on test” statistic. Under random sampling, it is shown that an analogous estimator is consistent, but can be improved for any fixed sample size.


Journal ArticleDOI
TL;DR: In this article, the joint asymptotic distributions of the marginal quantiles and quantile functions in samples from a p-variate population are derived, on the basis of which tests of significance for population medians are developed.

Journal ArticleDOI
TL;DR: A simple algorithm, based on Newton's method, is constructed, which permits asymptotic minimization of L1 distance for nonparametric density estimators and is applicable to multivariate kernel estimators, multivariate histogram estimators, and smoothed histogram estimators such as frequency polygons.

Journal ArticleDOI
TL;DR: In this article, the authors give sufficient conditions for strong consistency of maximum likelihood estimators in certain nonparametric families of mixtures, with emphasis on mixtures over exponential families.