
Showing papers on "Nonparametric statistics published in 1987"


Book
01 Jan 1987
TL;DR: A monograph on resampling methods: the jackknife, the bootstrap, the delta method and the influence function, cross-validation, balanced repeated replications (half-sampling), random subsampling, and nonparametric confidence intervals, as mentioned in this paper.
Abstract: The Jackknife Estimate of Bias. The Jackknife Estimate of Variance. Bias of the Jackknife Variance Estimate. The Bootstrap. The Infinitesimal Jackknife. The Delta Method and the Influence Function. Cross-Validation, Jackknife and Bootstrap. Balanced Repeated Replications (Half-Sampling). Random Subsampling. Nonparametric Confidence Intervals.

7,007 citations
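As a concrete illustration of the monograph's opening chapters, here is a minimal Python sketch of the jackknife estimates of bias and variance; the statistic and data are illustrative choices, not taken from the book.

```python
import numpy as np

def jackknife_bias_var(x, stat):
    """Jackknife estimates of bias and variance for the statistic `stat`."""
    n = len(x)
    theta_hat = stat(x)
    # leave-one-out replications theta_(i)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    theta_dot = loo.mean()
    bias = (n - 1) * (theta_dot - theta_hat)            # jackknife bias estimate
    var = (n - 1) / n * np.sum((loo - theta_dot) ** 2)  # jackknife variance estimate
    return bias, var

rng = np.random.default_rng(0)
x = rng.exponential(size=50)
print(jackknife_bias_var(x, np.var))  # bias/variance of the plug-in variance
```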


01 Jan 1987
TL;DR: In this article, the authors consider the problem of setting approximate confidence intervals for a single parameter θ in a multiparameter family, and propose bootstrap confidence intervals that automatically incorporate transformations, bias corrections, and so forth.
Abstract: We consider the problem of setting approximate confidence intervals for a single parameter θ in a multiparameter family. The standard approximate intervals based on maximum likelihood theory, θ̂ ± σ̂·z^(α), can be quite misleading. In practice, tricks based on transformations, bias corrections, and so forth, are often used to improve their accuracy. The bootstrap confidence intervals discussed in this article automatically incorporate such tricks without requiring the statistician to think them through for each new application, at the price of a considerable increase in computational effort. The new intervals incorporate an improvement over previously suggested methods, which results in second-order correctness in a wide variety of problems. In addition to parametric families, bootstrap intervals are also developed for nonparametric situations.

2,828 citations
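The second-order-correct intervals developed in the article (the BCa construction) are implemented in modern libraries; a minimal sketch using scipy on illustrative skewed data, contrasted with the standard first-order interval θ̂ ± z·σ̂:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(size=40)  # skewed sample, where standard intervals can mislead

# bias-corrected and accelerated (BCa) bootstrap interval for the mean
res = stats.bootstrap((x,), np.mean, confidence_level=0.95,
                      method='BCa', random_state=rng)
print("BCa:", res.confidence_interval)

# standard first-order interval from the normal approximation
se = x.std(ddof=1) / np.sqrt(len(x))
print("standard:", (x.mean() - 1.96 * se, x.mean() + 1.96 * se))
```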


Book
30 Jun 1987
TL;DR: This book introduces statistics and descriptive statistics, the elements of probability, random variables and expectation, distributions of sampling statistics, parameter estimation, and hypothesis testing.
Abstract: Preface. Introduction to Statistics. Descriptive Statistics. Elements of Probability. Random Variables and Expectation. Special Random Variables. Distributions of Sampling Statistics. Parameter Estimation. Hypothesis Testing. Regression. Analysis of Variance. Goodness of Fit Tests and Categorical Data Analysis. Nonparametric Hypothesis Tests. Quality Control. Life Testing. Appendix of Tables.

840 citations


Book
01 Jan 1987
TL;DR: A mathematical statistics text moving from probability and distribution theory through estimation and hypothesis testing to nonparametric methods, regression and linear models, and reliability and survival distributions, as mentioned in this paper.
Abstract: Probability. Random Variables and Their Distributions. Special Probability Distributions. Joint Distributions. Properties of Random Variables. Functions of Random Variables. Limiting Distributions. Statistics and Sampling Distributions. Point Estimation. Sufficiency and Completeness. Interval Estimation. Tests of Hypotheses. Contingency Tables and Goodness-of-Fit. Nonparametric Methods. Regression and Linear Models. Reliability and Survival Distributions. Answers to Selected Exercises.

640 citations


Journal ArticleDOI
TL;DR: Results of Monte Carlo simulations indicate that statistical bias and efficiency characteristics of the proposed test of spuriousness for structural data are very reasonable.

572 citations


Journal ArticleDOI
TL;DR: In this article, the authors introduce biased cross-validation criteria for selecting smoothing parameters for kernel and histogram density estimators, closely related to a criterion investigated in Scott and Factor (1981).
Abstract: Nonparametric density estimation requires the specification of smoothing parameters. The demands of statistical objectivity make it highly desirable to base the choice on properties of the data set. In this article we introduce some biased cross-validation criteria for selection of smoothing parameters for kernel and histogram density estimators, closely related to one investigated in Scott and Factor (1981). These criteria are obtained by estimating L2 norms of derivatives of the unknown density and provide slightly biased estimates of the average squared L2 error or mean integrated squared error. These criteria are roughly the analog of Wahba's (1981) generalized cross-validation procedure for orthogonal series density estimators. We present the relationship of the biased cross-validation procedure to the least squares cross-validation procedure, which provides unbiased estimates of the mean integrated squared error. Both methods are shown to be based on U statistics. We compare the two methods...

455 citations
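For context, here is a sketch of the least squares (unbiased) cross-validation score that the article uses as its comparator, for a Gaussian-kernel density estimator; the biased criterion itself, built from estimated derivative functionals, is not reproduced here. The data and bandwidth grid are illustrative.

```python
import numpy as np
from scipy.stats import norm

def lscv_score(h, x):
    """Least squares cross-validation score for a Gaussian-kernel KDE:
    integral of fhat^2 minus twice the mean leave-one-out density."""
    n = len(x)
    d = x[:, None] - x[None, :]
    # convolution of two Gaussian kernels has scale h*sqrt(2)
    int_f2 = norm.pdf(d, scale=h * np.sqrt(2)).sum() / n ** 2
    k = norm.pdf(d, scale=h)
    loo = (k.sum(axis=1) - norm.pdf(0, scale=h)) / (n - 1)
    return int_f2 - 2 * loo.mean()

rng = np.random.default_rng(2)
x = rng.normal(size=200)
hs = np.linspace(0.05, 1.0, 60)
print("CV bandwidth:", hs[np.argmin([lscv_score(h, x) for h in hs])])
```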


Journal ArticleDOI
TL;DR: In this paper, the authors examine the properties of a new class of bivariate distributions whose members are stochastically ordered and likelihood ratio dependent, which can be used to construct families of distributions whose marginals are arbitrary and which include the Fréchet bounds as well as the distribution corresponding to independent variables.
Abstract: This paper examines the properties of a new class of bivariate distributions whose members are stochastically ordered and likelihood ratio dependent. The proposed class can be used to construct bivariate families of distributions whose marginals are arbitrary and which include the Fréchet bounds as well as the distribution corresponding to independent variables. Three nonparametric estimators of the association parameter are suggested, and Monte Carlo experiments are used to compare their small-sample behaviour to that of the maximum likelihood estimate.

416 citations
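The abstract does not specify the three estimators of the association parameter, so none is reproduced here; as a hedged illustration of the nonparametric route, a rank-based association measure such as Kendall's tau can be computed on (hypothetical) dependent pairs and then inverted against the family's tau-parameter relationship.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(3)
# hypothetical positively dependent bivariate sample
L = np.linalg.cholesky(np.array([[1.0, 0.6], [0.6, 1.0]]))
z = rng.normal(size=(300, 2)) @ L.T
tau, p = kendalltau(z[:, 0], z[:, 1])
print(f"Kendall's tau = {tau:.3f} (p = {p:.2g})")
```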


Book
01 Jan 1987
TL;DR: An introductory statistics textbook spanning descriptive statistics, probability, interval estimation and hypothesis testing, regression, index numbers and forecasting, nonparametric methods, quality control, and decision analysis.
Abstract: 1. Data and Statistics. 2. Descriptive Statistics: Tabular and Graphical Presentations. 3. Descriptive Statistics: Numerical Measures. 4. Introduction to Probability. 5. Discrete Probability Distributions. 6. Continuous Probability Distributions. 7. Sampling and Sampling Distributions. 8. Interval Estimation. 9. Hypothesis Tests. 10. Statistical Inference about Means and Proportions with Two Populations. 11. Inferences About Population Variances. 12. Test of Goodness of Fit and Independence. 13. Experimental Design and Analysis of Variance. 14. Simple Linear Regression. 15. Multiple Regression. 16. Regression Analysis: Model Building. 17. Index Numbers. 18. Forecasting. 19. Nonparametric Methods. 20. Statistical Methods for Quality Control. 21. Decision Analysis. 22. Sample Survey (on CD). Appendices. A: References and Bibliography. B: Tables. C: Summation Notation. D: Self-Test Solutions and Answers to Even-Numbered Exercises. E: Using Excel Functions. F: Computing p-values Using Minitab and Excel.

312 citations


Journal ArticleDOI
TL;DR: In this article, the authors discuss the problem of determining the number of observations required by some common nonparametric tests so that the tests have power at least 1 − β against alternatives that differ sufficiently from the hypothesis being tested.
Abstract: The article discusses the problem of determining the number of observations required by some common nonparametric tests, so that the tests have power at least 1 – β against alternatives that differ sufficiently from the hypothesis being tested. It is shown that the number of observations depends on certain simple probabilities. A method is suggested for fixing the value of the appropriate probability when determining sample size.

285 citations
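The flavor of the result can be sketched for the sign test, where the required sample size depends only on the simple probability p = P(X > 0) under the alternative. The normal-approximation formula below is the standard one of this type; treat its exact form as an assumption rather than a quotation from the article.

```python
from scipy.stats import norm

def sign_test_n(p, alpha=0.05, beta=0.10, two_sided=True):
    """Approximate n for the sign test to attain power 1 - beta when
    p = P(X > 0) under the alternative (normal approximation)."""
    za = norm.ppf(1 - alpha / 2) if two_sided else norm.ppf(1 - alpha)
    zb = norm.ppf(1 - beta)
    return (za + zb) ** 2 / (4 * (p - 0.5) ** 2)

print(round(sign_test_n(p=0.65)))  # observations needed when P(X > 0) = 0.65
```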



Journal ArticleDOI
TL;DR: The statistical properties and tests of significance for two nonparametric measures of phenotypic stability (mean of the absolute rank differences of a genotype over the environments and variance among the ranksover the environments) are investigated.
Abstract: Parametric methods for estimating genotype-environment interactions and phenotypic stability (stability of genotypes over environments) are widely used in plant and animal breeding and production. Several nonparametric methods proposed by Huhn (1979, EDP in Medicine and Biology 10, 112-117) are based on the ranks of genotypes in each environment and use the idea of homeostasis as a measure of stability. In this study the statistical properties and tests of significance for two of these nonparametric measures of phenotypic stability (mean of the absolute rank differences of a genotype over the environments and variance among the ranks over the environments) are investigated. The purpose of this study is to develop approximate but easily applied statistical tests based on the normal distribution. Finally, application of the theoretical results is demonstrated using data on 20 genotypes (varieties) of winter wheat in 10 environments (locations) from the official German registration trials.
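A minimal sketch of the two rank-based stability measures under study, computed from a hypothetical genotype × environment yield matrix whose dimensions mirror the paper's 20 genotypes in 10 environments:

```python
import numpy as np
from scipy.stats import rankdata

def huhn_stability(yields):
    """Huhn's nonparametric stability measures for a genotype x environment
    matrix: S1 = mean absolute difference between a genotype's ranks over
    all pairs of environments; S2 = variance of its ranks over environments."""
    ranks = np.apply_along_axis(rankdata, 0, yields)  # rank within environments
    m = ranks.shape[1]
    pair_diffs = np.abs(ranks[:, :, None] - ranks[:, None, :])
    s1 = pair_diffs.sum(axis=(1, 2)) / (m * (m - 1))  # mean over ordered pairs
    s2 = ranks.var(axis=1, ddof=1)
    return s1, s2

rng = np.random.default_rng(4)
yields = rng.normal(size=(20, 10))  # 20 genotypes x 10 environments
s1, s2 = huhn_stability(yields)
print(s1[:3], s2[:3])
```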

Journal ArticleDOI
TL;DR: In this paper, the authors consider the use of the smoothed bootstrap and the standard bootstrap for estimating properties of unknown distributions such as the sampling error of parameter estimates, and they develop criteria for determining whether it is advantageous to use the smoothed bootstrap rather than the traditional bootstrap.
Abstract: SUMMARY The bootstrap and smoothed bootstrap are considered as alternative methods of estimating properties of unknown distributions such as the sampling error of parameter estimates. Criteria are developed for determining whether it is advantageous to use the smoothed bootstrap rather than the standard bootstrap. Key steps in the argument leading to these criteria include the study of the estimation of linear functionals of distributions and the approximation of general functionals by linear functionals. Consideration of an example, the estimation of the standard error in the variance-stabilized sample correlation coefficient, elucidates previously-published simulation results and also illustrates the use of computer algebraic manipulation as a useful technique in asymptotic statistics. Finally, the various approximations used are vindicated by a simulation study.
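A minimal sketch of the two competitors: a smoothed bootstrap draw is an ordinary bootstrap draw plus kernel noise, so bandwidth h = 0 recovers the standard bootstrap. The statistic, bandwidth, and data are illustrative.

```python
import numpy as np

def smoothed_bootstrap_se(x, stat, h, B=2000, rng=None):
    """Standard error of `stat` by the smoothed bootstrap: resample the
    data and add Gaussian kernel noise with bandwidth h."""
    rng = rng or np.random.default_rng()
    n = len(x)
    reps = np.empty(B)
    for b in range(B):
        xb = rng.choice(x, size=n, replace=True) + h * rng.normal(size=n)
        reps[b] = stat(xb)
    return reps.std(ddof=1)

rng = np.random.default_rng(5)
x = rng.normal(size=30)
print(smoothed_bootstrap_se(x, np.median, h=0.0, rng=rng))  # standard
print(smoothed_bootstrap_se(x, np.median, h=0.3, rng=rng))  # smoothed
```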

Book
01 Jan 1987
TL;DR: This book discusses the role of nonparametric models in continuous system identification and methods for obtaining transfer functions from nonparametric models using the frequency-domain approach.
Abstract: Introduction. Continuous-Time Models of Dynamical Systems. Nonparametric Models. Parametric Models. Stochastic Models of Linear Time-Invariant Systems. Models of Distributed Parameter Systems (DPS). Signals and their Representations. Functions in the Ordinary Sense. Distribution or Generalized Functions. Identification of Linear Time-Invariant (LTIV) Systems via Nonparametric Models. The Role of Nonparametric Models in Continuous System Identification. Test Signals for System Identification. Identification of Linear Time-Invariant Systems - Time-Domain Approach. Frequency-Domain Approach. Methods for Obtaining Transfer Functions from Nonparametric Models. Numerical Transformations between Time- and Frequency-Domains. Parameter Estimation for Continuous-Time Models. The Primary Stage. The Secondary Stage: Parameter Estimation. Identification of Linear Systems Using Adaptive Models. Gradient Methods. Frequency-Domain. Stability Theory. Linear Filters. Identification of Multi-Input Multi-Output (MIMO) Systems, Distributed Parameter Systems (DPS) and Systems with Unknown Delays and Nonlinear Elements. MIMO Systems. Time-Varying Parameter Systems (TVPS). Lumped Systems with Unknown Time-Delays. Identification of Systems with Unknown Nonlinear Elements. Identification of Distributed Parameter Systems. Determination of System Structure. Index.



Journal ArticleDOI
TL;DR: A general finding is that much of the information for forecasting is in the immediate past few observations or a few summary statistics based on past data.
Abstract: The problem of predicting a future measurement on an individual given the past measurements is discussed under nonparametric and parametric growth models. The efficiencies of different methods of prediction are assessed by cross-validation or leave-one-out technique in each of three data sets and the results are compared. Under nonparametric models, direct and inverse regression methods of prediction are described and their relative advantages and disadvantages are discussed. Under parametric models polynomial and factor analytic type growth curves are considered. Bayesian and empirical Bayesian methods are used to deal with unknown parameters. A general finding is that much of the information for forecasting is contained in the immediate past few observations or a few summary statistics based on past data. A number of data reduction methods are suggested and analyses based on them are described. The usefulness of the leave-one-out technique in model selection is demonstrated. A new method of calibration is introduced to improve prediction.
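A sketch of the leave-one-out scheme used in the article to assess predictors: each individual is held out in turn, the model is refitted, and the held-out response is predicted. The simple linear predictor below is an illustrative stand-in for the growth-curve models discussed.

```python
import numpy as np

def loo_prediction_error(X, y, fit_predict):
    """Mean squared leave-one-out prediction error for a fit/predict rule."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        errs[i] = y[i] - fit_predict(X[mask], y[mask], X[i:i + 1])[0]
    return np.mean(errs ** 2)

def linear_fit_predict(Xtr, ytr, Xte):
    # least-squares predictor with an intercept (illustrative)
    A = np.column_stack([np.ones(len(Xtr)), Xtr])
    beta, *_ = np.linalg.lstsq(A, ytr, rcond=None)
    return np.column_stack([np.ones(len(Xte)), Xte]) @ beta

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, size=(40, 1))
y = 2 + 3 * X[:, 0] + rng.normal(scale=0.5, size=40)
print(loo_prediction_error(X, y, linear_fit_predict))
```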

Book
01 Jan 1987
TL;DR: This book presents approaches to nonparametric density estimation, with emphasis on maximum likelihood and maximum penalized likelihood methods, extensions to higher dimensions, nonparametric regression and intensity function estimation, and model building and speculative data analysis.
Abstract: 1. Historical Background 2. Some Approaches to Nonparametric Density Estimation 3. Maximum Likelihood Density Estimation 4. Maximum Penalized Likelihood Density Estimation 5. Discrete Maximum Penalized Likelihood Estimation 6. Nonparametric Density Estimation in Higher Dimensions 7. Nonparametric Regression and Intensity Function Estimation 8. Model Building and Speculative Data Analysis Appendix I. An Introduction to Mathematical Optimization Theory Appendix II. Numerical Solution of Constrained Optimization Problems Appendix III. Optimization Algorithms for Noisy Problems Appendix IV. A Brief Primer in Simulation Index.

Book
01 Jan 1987
TL;DR: In this paper, the authors discuss the importance of selecting the correct statistical test to evaluate the effectiveness of a program and the importance to consider when selecting a statistical test when evaluating individual practitioners' effectiveness.
Abstract: All chapters conclude with "Concluding Thoughts" and "Study Questions" Preface 1 Introduction to Statistical Analysis Uses of Statistical AnalysisGeneral Methodological TermsLevels of MeasurementLevels of Measurement and Analysis of DataOther Measurement ClassificationsCategories of Statistical Analyses 2 Frequency Distributions and Graphs Frequency DistributionsGrouped Frequency DistributionsUsing Frequency Distributions to Analyze DataMisrepresentation of DataGraphical Presentation of DataA Common Mistake in Displaying Data 3 Central Tendency and Variability Central TendencyVariability 4 Normal Distributions Skewed DistributionsNormal DistributionsConverting Raw Scores to Z Scores and PercentilesDeriving Raw Scores From Percentiles 5 Introduction to Hypothesis Testing Alternative ExplanationsProbabilityRefuting Sampling ErrorResearch HypothesesTesting the Null HypothesisStatistical SignificanceErrors in Drawing Conclusions About RelationshipsStatistically Significant Relationships and Meaningful Findings 6 Sampling Distributions and Hypothesis Testing Sample Size and Sampling ErrorSampling Distributions and InferenceSampling Distribution of MeansEstimating Parameters From StatisticsOther Distributions 7 Selecting a Statistical Test The Importance of Selecting the Correct Statistical TestFactors to Consider When Selecting a Statistical TestParametric and Nonparametric TestsMultivariate Statistical TestsGeneral Guidelines for Test SelectionGetting Help With Data Analyses 8 Correlation Uses of CorrelationPerfect CorrelationsNonperfect CorrelationsInterpreting Linear CorrelationsUsing Correlation For InferenceComputation and Presentation of Person's RNonparametric AlternativesUsing Correlation With Three or More VariablesOther Multivariate Tests that Use Correlation 9 Regression Analyses What is Prediction?What is Simple Linear Regression?Computation of the Regression EquationMore About the Regression LineInterpreting ResultsUsing Regression Analyses in Social Work PracticeRegression With Three or More VariablesOther Types of Regression Analyses 10 Cross-Tabulation The Chi-Square Test of AssociationUsing Chi-Square in Social Work PracticeChi-Square With Three or More VariablesSpecial Applications of the Chi-Square Formula 11 t Tests and Analysis of Variance The Use of t TestsThe One-Sample t TestThe Dependent t TestThe Independent t TestSimple Analysis of Variance (One-Way Anova) Appendix A Using Statistics to Evaluate Practice Effectiveness Evaluating ProgramsEvaluating Individual Practitioner Effectiveness Glossary Index

Book
01 Jul 1987
TL;DR: This book introduces probability and statistics, hypothesis testing, and the comparison of means (one-way and two-way ANOVA, repeated measures, and randomized blocks designs), and covers correlation and regression, categorical data, and nonparametric procedures.
Abstract: Contents: Introduction to Statistics. Probability. Random Variables, Distributions, and Estimation. Binomial and Normal Distributions. Hypothesis Testing. Student's T, Chi-Square, and F Distribution. Comparing Two Means. One-Way ANOVA. Multiple Comparisons. Two-Way ANOVA. Repeated Measures and Randomized Blocks Designs. Selection Techniques. Correlation and Regression. Categorical Data. Nonparametric Procedures. Appendices: Tables. Elementary Matrix Algebra.

Journal ArticleDOI
TL;DR: Gill (1980) showed that the asymptotic efficiency properties of the Savage and Wilcoxon statistics are maintained under censoring by the log-rank and Peto and Peto Wilcoxon statistics, respectively, when censoring is the same in both samples.
Abstract: Considerable progress has been made in generalizing to censored data hypothesis tests based on rank statistics. The most commonly used generalizations are the log-rank test statistic, which extends the Savage exponential scores statistic (Cox 1972; Mantel 1966) and the generalized Wilcoxon statistics (Gehan 1965; Peto and Peto 1972). Gill (1980) showed that in the two-sample problem the asymptotic efficiency properties of the Savage and Wilcoxon statistics are maintained in censored data by the log-rank and Peto and Peto Wilcoxon statistics, respectively, when censoring is the same in both samples. Heuristics and simulations have implied that some recently proposed supremum statistics, closely related to these two popular linear rank statistics, can be used to construct tests that provide good power against a much wider range of nonlocal alternatives (see, e.g., Fleming and Harrington 1981; Fleming, O'Brien, O'Fallon, and Harrington 1980; Gill 1980; Schumacher 1984). In this article we consider n...
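A self-contained sketch of the two-sample log-rank statistic discussed above, accumulated as observed-minus-expected events over the distinct event times; the simulated data apply the same censoring to both samples, matching the condition in Gill's result.

```python
import numpy as np

def logrank_chi2(time, event, group):
    """Two-sample log-rank chi-squared statistic (1 df)."""
    o_minus_e, v = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        deaths = (time == t) & (event == 1)
        n, n1 = at_risk.sum(), (at_risk & (group == 1)).sum()
        dt, d1 = deaths.sum(), (deaths & (group == 1)).sum()
        o_minus_e += d1 - dt * n1 / n
        if n > 1:  # hypergeometric variance at this event time
            v += dt * (n1 / n) * (1 - n1 / n) * (n - dt) / (n - 1)
    return o_minus_e ** 2 / v

rng = np.random.default_rng(7)
t_all = np.concatenate([rng.exponential(1.0, 50), rng.exponential(1.5, 50)])
c = rng.exponential(2.0, 100)               # common censoring distribution
time, event = np.minimum(t_all, c), (t_all <= c).astype(int)
group = np.repeat([0, 1], 50)
print("log-rank chi-square:", logrank_chi2(time, event, group))
```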


Journal ArticleDOI
TL;DR: In this article, a combination of the parametric and nonparametric estimates is used to fit a density function, where π (0 ≤ π ≤ 1) is unknown and π is estimated from the data, and its estimate is then used in as the proposed density estimate.
Abstract: One method of fitting a parametric density function f(x, θ) is first to estimate θ by maximum likelihood, say, and then to estimate f(x, θ) by On the other hand, when the parametric model does not hold, the true density f(x) may be estimated nonparametrically, as in the case of a kernel estimate The key idea proposed is to fit a combination of the parametric and nonparametric estimates, namely where π (0 ≤ π ≤ 1) is unknown The parameter π is estimated from the data, and its estimate is then used in as the proposed density estimate The main point is that we expect to be close to unity when the parametric model prevails, and close to zero when it does not hold We show that, under certain conditions, converges to the parametric density when the parametric model holds and approaches the true f(x) when the parametric model does not hold The procedure was applied to a number of actual data sets In each case the maximum likelihood estimate was readily obtained and the semiparametric density es
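A sketch of the combined estimate π f(x, θ̂) + (1 − π) f̂(x) for a normal parametric family with a Gaussian-kernel nonparametric part; choosing π̂ by maximizing a leave-one-out likelihood is a plausible stand-in for the paper's estimator, not a reproduction of it.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde
from scipy.optimize import minimize_scalar

def semiparametric_density(x):
    """Return pi_hat and the density t -> pi*f(t, theta_hat) + (1-pi)*fhat(t)."""
    n = len(x)
    mu, sigma = x.mean(), x.std()      # normal MLE
    kde = gaussian_kde(x)
    h = kde.factor * x.std(ddof=1)     # the kde's effective bandwidth

    # leave-one-out kernel density values at the observations
    k = norm.pdf(x[:, None] - x[None, :], scale=h)
    f_np = (k.sum(axis=1) - norm.pdf(0, scale=h)) / (n - 1)
    f_par = norm.pdf(x, mu, sigma)

    nll = lambda p: -np.sum(np.log(p * f_par + (1 - p) * f_np))
    p_hat = minimize_scalar(nll, bounds=(0.0, 1.0), method='bounded').x
    return p_hat, lambda t: p_hat * norm.pdf(t, mu, sigma) + (1 - p_hat) * kde(t)

rng = np.random.default_rng(8)
p_hat, dens = semiparametric_density(rng.normal(size=200))
print("pi_hat (near 1 when the normal model holds):", round(p_hat, 2))
```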

Journal ArticleDOI
TL;DR: In this paper, the authors compare the performance of the linearization, nonlinear least squares, and a new nonparametric parameter space method as means of estimating parameters of the random predator equation.
Abstract: (1) Simulations of functional response experiments were used to compare Rogers's linearization, nonlinear least squares, and a new nonparametric parameter space method as means of estimating parameters of the random predator equation. (2) Rogers's linearization gave highly biased estimates of both parameters. These estimates were consistently too low. (3) Nonlinear least squares using the implicit form of the equation, and the nonparametric parameter space method, provided equally good estimates for data sets that departed only moderately from the assumptions of homogeneous, normally distributed error. For data sets with greater heteroscedasticity and more extreme nonnormality, the nonparametric procedure performed slightly better than nonlinear least squares. This was especially true for smaller data sets. (4) For both methods, actual frequencies of erroneous rejection of true null hypotheses were usually significantly greater than the nominal 0.05 for the cases studied. Even a moderate amount of heteroscedasticity seems to affect the probability of type I error for both methods. (5) The practice of analysing average values of number of prey eaten at each value of number of prey offered was compared to using individual data points. Use of average numbers eaten did not alter point estimates of parameters, but produced severe underestimates of S.E.s of parameter estimates. Use of average numbers eaten increased probability of type I error by 16 to 42%. (6) Simulation studies such as this may be generally useful to biologists as tools of applied statistics.
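A sketch of the nonlinear least squares route: the implicit random predator equation N_e = N_0(1 − exp(a(T_h·N_e − T))) has a closed-form solution via the Lambert W function, which makes direct curve fitting straightforward. The parameter values, total time T = 1, and binomial error model are illustrative assumptions.

```python
import numpy as np
from scipy.special import lambertw
from scipy.optimize import curve_fit

def random_predator(N0, a, Th, T=1.0):
    """Prey eaten under Rogers's random predator equation, solved with
    the Lambert W function: Ne = N0 - W(a*Th*N0*exp(-a*(T - Th*N0)))/(a*Th)."""
    w = lambertw(a * Th * N0 * np.exp(-a * (T - Th * N0)))
    return N0 - w.real / (a * Th)

rng = np.random.default_rng(9)
N0 = np.repeat([5, 10, 20, 40, 80], 10).astype(float)  # prey offered
p_eaten = random_predator(N0, a=1.2, Th=0.05) / N0
eaten = rng.binomial(N0.astype(int), p_eaten)          # simulated experiment

(a_hat, Th_hat), _ = curve_fit(random_predator, N0, eaten, p0=[1.0, 0.1],
                               bounds=([1e-6, 1e-6], [10.0, 1.0]))
print("a =", a_hat, "Th =", Th_hat)
```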

Journal ArticleDOI
TL;DR: In this article, the power of parametric procedures is low, and their results may be in error when applied to non-normal data, and five important advantages of nonparametric methods over commonly used parametric procedure are illustrated.
Abstract: Water quality data are usually analysed with parametric statistical procedures requiring the normality assumption for accuracy of their attained significance levels. However, these data are typically non-normally distributed. When applied to non-normal data, the power of parametric procedures is low, and their results may be in error. Three typical case studies are discussed: differentiation of water quality in streams using analysis of variance; discernment of water quality types using discriminant analysis; and t-tests on differences between two groups which include data below the detection limit. Five important advantages of nonparametric methods over commonly used parametric procedures are illustrated.
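The contrast is easy to demonstrate: on skewed, lognormal-like data typical of water quality records, a rank-based test detects a real shift that the t-test, with its normality assumption violated, may miss. The data are simulated, not from the case studies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
upstream = rng.lognormal(mean=0.0, sigma=1.0, size=30)
downstream = rng.lognormal(mean=0.8, sigma=1.0, size=30)  # real shift

print(stats.ttest_ind(upstream, downstream))     # parametric, assumes normality
print(stats.mannwhitneyu(upstream, downstream))  # nonparametric rank-sum test
```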

Journal ArticleDOI
TL;DR: In this paper, the authors compared four methods for estimating the 7-day, 10-year and 7day, 20-year low flows for streams using the bootstrap method.
Abstract: Four methods for estimating the 7-day, 10-year and 7-day, 20-year low flows for streams are compared by the bootstrap method. The bootstrap method is a Monte Carlo technique in which random samples are drawn from an unspecified sampling distribution defined from observed data. The nonparametric nature of the bootstrap makes it suitable for comparing methods based on a flow series for which the true distribution is unknown. Results show that the two methods based on hypothetical distributions (Log-Pearson III and Weibull) had lower mean square errors than did the Box-Cox transformation method or the Log-Boughton method which is based on a fit of plotting positions.
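A sketch of the comparison strategy: bootstrap resamples of a flow record estimate the mean square error of competing low-flow quantile estimators. The log-normal and Weibull fits stand in for the hypothetical-distribution methods; the simulated record, the pseudo-truth convention, and the 10th-percentile target are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def q10_lognormal(x):
    mu, sigma = np.log(x).mean(), np.log(x).std(ddof=1)
    return np.exp(stats.norm.ppf(0.1, mu, sigma))

def q10_weibull(x):
    c, loc, scale = stats.weibull_min.fit(x, floc=0)
    return stats.weibull_min.ppf(0.1, c, loc, scale)

rng = np.random.default_rng(11)
flows = rng.lognormal(mean=2.0, sigma=0.5, size=40)  # illustrative record
truth = q10_lognormal(flows)  # pseudo-truth: the full-record fitted value

errs = {"log-normal": [], "Weibull": []}
for _ in range(500):
    xb = rng.choice(flows, size=len(flows), replace=True)
    errs["log-normal"].append(q10_lognormal(xb) - truth)
    errs["Weibull"].append(q10_weibull(xb) - truth)
for name, e in errs.items():
    print(name, "bootstrap MSE:", np.mean(np.square(e)))
```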

Journal ArticleDOI
TL;DR: Calculation of confidence intervals demonstrated that 50 to 450 subjects are needed for a precise parametric estimation of the 95% reference interval, and applying a goodness-of-fit test to transformed values shows that one should accept gaussianity only for p-values greater than 0.15.
Abstract: In two-stage transformation systems for normalization of reference distributions, the asymmetry is first corrected, and any deviation of kurtosis is then adjusted. The simulation studies reported here show that these systems have previously been assessed too optimistically because the sample variation of the transformation parameters was neglected. Applying a goodness-of-fit test to transformed values shows that one should accept gaussianity only for p-values greater than 0.15 instead of those greater than 0.05. Further, the calculated 90% confidence intervals of reference limits should be expanded by 25%. When the correct level of significance is used, only real reference distributions that deviate moderately from the gaussian form are normalized. Calculation of confidence intervals demonstrated that 50 to 450 subjects are needed for a precise parametric estimation of the 95% reference interval. For the nonparametric approach, 125 to 700 reference subjects are necessary. The larger sample sizes are needed when distributions show pronounced skewness.
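A sketch of the nonparametric approach whose sample-size demands the study quantifies: the 95% reference interval is taken as the 2.5th and 97.5th sample percentiles, and the bootstrap displays the sampling variation of those limits. The reference values are simulated.

```python
import numpy as np

def reference_interval(x, lo=0.025, hi=0.975):
    """Nonparametric 95% reference interval: sample percentiles, no
    gaussianity assumption (hence the larger sample-size requirement)."""
    return np.quantile(x, [lo, hi])

rng = np.random.default_rng(12)
ref = rng.lognormal(mean=0.5, sigma=0.4, size=300)  # skewed reference sample
print("limits:", reference_interval(ref))

# 90% bootstrap confidence intervals for the two limits
boot = np.array([reference_interval(rng.choice(ref, len(ref), replace=True))
                 for _ in range(1000)])
print("90% CIs:", np.quantile(boot, [0.05, 0.95], axis=0))
```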

Journal ArticleDOI
TL;DR: In this article, nonparametric tests based on Kruskal-Wallis ranks are discussed, and problems that arise in their application to multiple comparison procedures and multifactorial analyses when the null hypothesis is false are investigated.
Abstract: Nonparametric tests based on Kruskal-Wallis ranks are discussed. Problems that arise in their application to multiple comparison procedures and multifactorial analyses when the null hypothesis is false are investigated. Solutions to the former are proposed which involve re-ranking the observations within subsets of the data. A solution to the latter is proposed based on combinations of Friedman rank sums.
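A sketch of the re-ranking remedy for pairwise follow-ups: after an overall Kruskal-Wallis test, each pair of groups is ranked afresh, which is what a two-sample rank-sum test on the pair does implicitly. The groups are simulated; this illustrates the idea rather than the paper's full procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
groups = [rng.normal(loc, 1.0, 15) for loc in (0.0, 0.0, 1.0)]

print(stats.kruskal(*groups))  # overall test on the joint ranking

# pairwise comparisons with re-ranking within each subset of two groups
for i in range(3):
    for j in range(i + 1, 3):
        res = stats.mannwhitneyu(groups[i], groups[j])
        print(f"groups {i} vs {j}: p = {res.pvalue:.4f}")
```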



Journal ArticleDOI
TL;DR: The effect of aligning the data, by using deviations from group means or group medians, is investigated for the Brown-Forsythe, O'Brien, Klotz, and Siegel-Tukey procedures, as mentioned in this paper.
Abstract: Estimated Type I error rates and power are reported for the Brown-Forsythe, O'Brien, Klotz, and Siegel-Tukey procedures. The effect of aligning the data, by using deviations from group means or group medians, is investigated for the latter two tests. Normal and non-normal distributions, equal and unequal sample-size combinations, and equal and unequal means are investigated for a two-group design. No test is robust and most powerful for all distributions; however, using O'Brien's procedure will avoid the possibility of a liberal test and provide power almost as large as what would be provided by choosing the most powerful test for each distribution type. Using the Brown-Forsythe procedure with heavy-tailed distributions and O'Brien's procedure for other distributions will increase power modestly and maintain robustness. Using the mean-aligned Klotz test or the unaligned Klotz test with appropriate distributions can increase power, but only at the risk of increased Type I error rates if the tests are not accurately matched to the distribution type.
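Of the four procedures, the Brown-Forsythe test is directly available in scipy as Levene's test with median centering; mean centering gives the classical Levene test for comparison. The O'Brien, Klotz, and Siegel-Tukey procedures are not reproduced here, and the heavy-tailed samples are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(14)
a = rng.standard_t(df=3, size=30)        # heavy-tailed group
b = 1.5 * rng.standard_t(df=3, size=30)  # same shape, larger spread

print(stats.levene(a, b, center='median'))  # Brown-Forsythe (median-aligned)
print(stats.levene(a, b, center='mean'))    # classical Levene (mean-aligned)
```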