
Showing papers on "Nonparametric statistics published in 1972"


Book
01 Jan 1972
TL;DR: This introductory textbook covers statistics and data analysis from probability and probability distributions through estimation, tests of hypotheses, regression, factorial experiments, nonparametric statistics, statistical quality control, and Bayesian statistics.
Abstract: 1. Introduction to Statistics and Data Analysis 2. Probability 3. Random Variables and Probability Distributions 4. Mathematical Expectations 5. Some Discrete Probability Distributions 6. Some Continuous Probability Distributions 7. Functions of Random Variables (optional) 8. Fundamental Distributions and Data Description 9. One and Two Sample Estimation Problems 10. One and Two Sided Tests of Hypotheses 11. Simple Linear Regression 12. Multiple Linear Regression 13. One Factor Experiments: General 14. Factorial Experiments (Two or More Factors) 15. 2^k Factorial Experiments and Fractions 16. Nonparametric Statistics 17. Statistical Quality Control 18. Bayesian Statistics

2,945 citations


Book
01 Jan 1972
TL;DR: This textbook covers basic probability and statistics, inference for means and proportions, regression, topics in classical and Bayesian inference (including nonparametric and robust statistics), and special topics for business and economics.
Abstract: BASIC PROBABILITY AND STATISTICS. The Nature of Statistics. Descriptive Statistics. Probability. Probability Distributions. Two Random Variables. INFERENCE FOR MEANS AND PROPORTIONS. Sampling. Point Estimation. Confidence Intervals. Hypothesis Testing. Analysis of Variance. REGRESSION: RELATING TWO OR MORE VARIABLES. Fitting a Line. Simple Regression. Multiple Regression. Regression Extensions. Correlation. TOPICS IN CLASSICAL AND BAYESIAN INFERENCE. Nonparametric and Robust Statistics. Chi Square Tests. Maximum Likelihood Estimation. Bayesian Inference. Bayesian Decision Theory. SPECIAL TOPICS FOR BUSINESS AND ECONOMICS. Decision Trees. Index Numbers. Sampling Designs. Time Series. Simultaneous Equations. Appendices. Tables. References. Answers to Odd-Numbered Problems. Glossary of Common Symbols. Index.

407 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of fitting a general functional relationship between two variables and require only that the function to be fitted is smooth, and do not assume that it has a known mathematical form involving only a finite number of unknown parameters.
Abstract: SUMMARY In this note we consider the problem of fitting a general functional relationship between two variables. We require only that the function to be fitted is, in some sense, "smooth", and do not assume that it has a known mathematical form involving only a finite number of unknown parameters.
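
The abstract describes the general smoothing problem rather than a specific algorithm. As an illustration only, the sketch below fits a curve with a Nadaraya-Watson kernel smoother (a locally weighted average); the Gaussian kernel, the bandwidth value, and the simulated data are assumptions of the example, not details taken from the paper.

```python
import numpy as np

def kernel_smooth(x, y, x_grid, bandwidth=0.5):
    """Nadaraya-Watson kernel smoother: a locally weighted average of y.

    The only requirement on the fitted function is smoothness, controlled
    here by the bandwidth; no parametric form is assumed.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fitted = np.empty(len(x_grid), dtype=float)
    for i, x0 in enumerate(x_grid):
        w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)  # Gaussian weights
        fitted[i] = np.sum(w * y) / np.sum(w)
    return fitted

# Example with simulated data: noisy observations around an unknown smooth curve
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 100))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
grid = np.linspace(0, 10, 50)
y_hat = kernel_smooth(x, y, grid, bandwidth=0.6)
```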

386 citations


Journal ArticleDOI
TL;DR: This paper summarizes the methods then available for nonparametric probability density estimation.
Abstract: (1972) Nonparametric Probability Density Estimation: I. A Summary of Available Methods. Technometrics, Vol. 14, No. 3, pp. 533-546.
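
Since the entry is a survey, the sketch below shows only one of the estimators such surveys typically cover, the Gaussian kernel density estimator, alongside the cruder histogram estimator; the simulated sample and the use of scipy.stats.gaussian_kde are illustrative assumptions, not the paper's own implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Kernel density estimate: f_hat(x) = (1/(n*h)) * sum_i K((x - X_i) / h)
rng = np.random.default_rng(1)
sample = rng.normal(loc=2.0, scale=1.5, size=200)

kde = gaussian_kde(sample)            # bandwidth set by Scott's rule by default
grid = np.linspace(-4.0, 8.0, 200)
kernel_density = kde(grid)            # estimated density over the grid

# A simpler member of the same family: the histogram estimator
hist_density, bin_edges = np.histogram(sample, bins=20, density=True)
```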

225 citations




Journal ArticleDOI
TL;DR: A sequence of transforming functions is proposed to convert nongaussian distributions often seen in laboratory data to gaussian form, which enables smooth curves to be drawn through observed cumulative distributions plotted on arithmetic or gaussian probability scales.
Abstract: A sequence of transforming functions is proposed to convert nongaussian distributions often seen in laboratory data to gaussian form. These transforms are chosen to eliminate or substantially reduce nongaussian characteristics of positive skewness and peakedness that result from two factors: ( a ) increases in variance with increasing mean values, and ( b ) general heterogeneity among intrapersonal variances. Use of these transforms, demonstrated on many sets of clinical laboratory data, enables smooth curves to be drawn through observed cumulative distributions plotted on arithmetic or gaussian probability scales. From such curves, normal ranges or proportions below a specified measurement may be estimated easily and with greater precision than possible through nonparametric methods. Formulas are given for obtaining confidence limits corresponding to these estimates. The entire process of transforming the original variable to gaussian form and graphing the cumulative distribution curve has been computerized. Programs are available to others interested in applying these methods.
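
The paper's particular sequence of transforms is not reproduced in the entry, so the sketch below illustrates the same workflow with a generic Box-Cox power transform: make positively skewed laboratory values roughly Gaussian, estimate a central 95% normal range on the transformed scale, and map the limits back. The simulated data and the choice of Box-Cox are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

def inverse_boxcox(y, lam):
    """Map Box-Cox transformed values back to the original measurement scale."""
    return np.exp(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam)

# Hypothetical positively skewed laboratory measurements
rng = np.random.default_rng(2)
raw = rng.lognormal(mean=1.0, sigma=0.4, size=500)

# Choose the power transform that makes the data as close to Gaussian as possible
transformed, lam = stats.boxcox(raw)

# Central 95% "normal range" estimated on the Gaussian scale ...
mean_t = np.mean(transformed)
sd_t = np.std(transformed, ddof=1)
low_t, high_t = mean_t - 1.96 * sd_t, mean_t + 1.96 * sd_t

# ... and mapped back to the original units
normal_range = (inverse_boxcox(low_t, lam), inverse_boxcox(high_t, lam))
```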

91 citations


Journal ArticleDOI
TL;DR: A classification rule built from parametric or nonparametric density estimates is asymptotically optimal under very general conditions; its apparent non-error rate is optimistically biased as an estimator of the (unknown) optimal rule's non-error rate, but it converges to that optimum.
Abstract: Parametric or nonparametric density estimators from a mixture of K distributions can be used to estimate the non-error rate of an arbitrary classification rule—and to construct a rule which maximizes estimated probability of correct classification. (For two multivariate normal distributions with common covariance matrix, this general criterion yields the usual linear discriminant.) Such a sample-based rule is asymptotically optimal under very general conditions. Often its “apparent” non-error rate exceeds its true rate and is even optimistically biased as an estimator of the (unknown) optimal rule's non-error rate; but the apparent rate converges to this optimum.
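
As a toy illustration of the plug-in idea in the abstract, the sketch below estimates each class density with a kernel estimator, classifies by the larger estimated density (equal priors assumed), and computes the resubstitution ("apparent") non-error rate, which, as the abstract notes, tends to be optimistic. The univariate simulated classes are an assumption of the example.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
x0 = rng.normal(0.0, 1.0, 150)     # hypothetical class 0 training sample
x1 = rng.normal(1.5, 1.0, 150)     # hypothetical class 1 training sample

f0, f1 = gaussian_kde(x0), gaussian_kde(x1)

def classify(x):
    # Assign each point to the class with the larger estimated density
    # (equal prior probabilities assumed).
    return (f1(x) > f0(x)).astype(int)

# Apparent non-error rate: the rule evaluated on its own training data.
correct = np.sum(classify(x0) == 0) + np.sum(classify(x1) == 1)
apparent_non_error_rate = correct / (x0.size + x1.size)
```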

90 citations


Journal ArticleDOI
TL;DR: Asymptotic normality of linear rank statistics for testing the hypothesis of independence is established under fixed alternatives, generalizing a result of Bhuchongkul [1] both with respect to the conditions on the orders of magnitude of the score functions and with respect to the smoothness conditions on these functions.
Abstract: Asymptotic normality of linear rank statistics for testing the hypothesis of independence is established under fixed alternatives. A generalization of a result of Bhuchongkul [1] is obtained both with respect to the conditions concerning the orders of magnitude of the score functions and with respect to the smoothness conditions on these functions.

80 citations


Journal ArticleDOI
TL;DR: A new class of nonparametric detectors based on m-interval partitioning is introduced; it is easier to treat analytically and to implement practically than the widely used class of nonparametric detectors based on rank partitioning of the received signal space.
Abstract: A new class of nonparametric detectors based on m-interval partitioning is introduced. It is easier to treat analytically and implement practically than the widely used class of nonparametric detectors based on rank partitioning of the received signal space. In addition, these new detectors are robust or relatively insensitive to the underlying noise distribution. If the quantiles of the noise distribution are known exactly, the structure of the detector is particularly simple. Some general results for an adaptive system, when the quantiles of the noise distribution must be estimated, were also obtained.
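
The detector below is only a sketch of the general idea, not the paper's design: the real line is cut at known noise quantiles into m equiprobable cells, each observation is scored by its cell index, and the summed score is compared with its noise-only mean. The score choice, the value of m, and the Gaussian noise model are assumptions of the example.

```python
import numpy as np
from scipy import stats

def m_interval_score_detector(x, noise_quantile_fn, m=8):
    """Toy m-interval partitioning detector.

    Under noise alone each cell is equally likely, so the cell index behaves
    like a discrete uniform score; a positive signal shifts mass into the
    upper cells and inflates the summed score.
    """
    cuts = noise_quantile_fn(np.arange(1, m) / m)   # known noise quantiles
    scores = np.searchsorted(cuts, x)               # cell index 0 .. m-1
    n = len(x)
    null_mean = n * (m - 1) / 2.0
    null_var = n * (m * m - 1) / 12.0               # variance of a sum of discrete uniforms
    return (scores.sum() - null_mean) / np.sqrt(null_var)

rng = np.random.default_rng(4)
z_noise  = m_interval_score_detector(rng.standard_normal(100), stats.norm.ppf)
z_signal = m_interval_score_detector(rng.standard_normal(100) + 0.5, stats.norm.ppf)
# z_noise behaves like a standard normal variate; z_signal should sit well above it.
```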

Journal ArticleDOI
TL;DR: Parametric and nonparametric procedures for the prediction of a time series are discussed, and in each case the increase in the mean squared error of prediction over its minimum level due to the use of estimated spectra is assessed.
Abstract: Parametric and nonparametric procedures for the prediction of a time series are discussed. In each case the increase in the mean squared error of prediction over its minimum level due to the use of estimated spectra is assessed. The fitting of simple parametric models as approximations is also discussed. (Author)
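
As a minimal parametric illustration of the prediction problem (not the paper's spectral analysis), the sketch below fits an AR(p) model by least squares, measures the in-sample one-step prediction mean squared error, and forms the next one-step forecast; the simulated AR(1) series and the order p = 2 are assumptions of the example.

```python
import numpy as np

def ar_one_step(x, p):
    """Least-squares AR(p) fit with its in-sample one-step prediction MSE."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Row t of the design matrix holds x_{t-1}, ..., x_{t-p}
    X = np.column_stack([x[p - 1 - k: n - 1 - k] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    mse = np.mean((y - X @ coef) ** 2)               # residual one-step MSE
    forecast = float(np.dot(coef, x[-1:-p - 1:-1]))  # prediction of the next value
    return coef, mse, forecast

# Simulated AR(1) series with unit innovation variance
rng = np.random.default_rng(5)
e = rng.standard_normal(500)
series = np.zeros(500)
for t in range(1, 500):
    series[t] = 0.7 * series[t - 1] + e[t]

coef, mse, forecast = ar_one_step(series, p=2)
# mse should be close to 1.0, the innovation variance; the excess over that
# minimum reflects the cost of using estimated rather than true parameters.
```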

Journal ArticleDOI
TL;DR: Dissatisfaction with theoretically based indexes of sensitivity and bias in detection and recognition experiments has motivated research into nonparametric indexes; a particularly satisfactory measure of sensitivity is the area under the receiver operating characteristic curve, and, since no measure of bias is independent of sensitivity, the false-alarm rate itself can be used when experiments fail to reveal a change in sensitivity.
Abstract: Notes that dissatisfaction with theoretically based indexes of sensitivity and bias in detection and recognition experiments has motivated research into nonparametric indexes. Various suggestions from past work are presented and discussed with regard to their independence of particular theories. A particularly satisfactory measure of sensitivity is considered the area under the receiver operating characteristic curve. While no measure of bias is independent of sensitivity, the false-alarm rate can itself be used when experiments fail to reveal a change in the latter.
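
A small numeric illustration of the recommended sensitivity index: the area under the empirical ROC curve computed by the trapezoidal rule from hit and false-alarm rates at several criteria. The rates below are made-up numbers, not data from the article.

```python
import numpy as np

# Hypothetical (false-alarm, hit) pairs from several response criteria,
# ordered from strictest to laxest and padded with the (0,0) and (1,1) corners.
false_alarm = np.array([0.0, 0.05, 0.15, 0.30, 0.55, 1.0])
hit         = np.array([0.0, 0.30, 0.55, 0.75, 0.90, 1.0])

# Trapezoidal area under the empirical ROC curve: 0.5 means no sensitivity,
# 1.0 perfect discrimination, with no assumption about the evidence distribution.
auc = np.sum((false_alarm[1:] - false_alarm[:-1]) * (hit[1:] + hit[:-1]) / 2.0)
```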

Journal ArticleDOI
01 Sep 1972
TL;DR: This work derives a modification of the Robbins-Monro method of stochastic approximation, and shows how this modification leads to training procedures that minimize the probability of error of a one-dimensional two-category pattern classifier.
Abstract: Some of the results of a study of asymptotically optimum nonparametric training procedures for two-category pattern classifiers are reported. The decision surfaces yielded by earlier forms of nonparametric training procedures generally do not minimize the probability of error. We derive a modification of the Robbins-Monro method of stochastic approximation, and show how this modification leads to training procedures that minimize the probability of error of a one-dimensional two-category pattern classifier. The class of probability density functions admitted by these training procedures is quite broad. We show that the sequence of decision points generated by any of these training procedures converges with probability one to the minimum-probability-of-error decision point.
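
The paper's own modification of Robbins-Monro is not spelled out in the entry, so the sketch below shows only the textbook stochastic-approximation iteration it builds on, here converging to a fixed quantile of the input distribution; the step-size constant, target quantile, and simulated stream are assumptions of the example.

```python
import numpy as np

def robbins_monro_quantile(sample_stream, q, t0=0.0, a=1.0):
    """Textbook Robbins-Monro iteration converging to the q-th quantile.

    Each noisy observation of P(X <= t) - q nudges the running point t;
    the paper modifies this kind of scheme so that the iterate converges to
    the minimum-probability-of-error decision point rather than a quantile.
    """
    t = t0
    for n, x in enumerate(sample_stream, start=1):
        step = a / n                      # step sizes with sum a_n = inf, sum a_n^2 < inf
        t -= step * (float(x <= t) - q)
    return t

rng = np.random.default_rng(6)
t_hat = robbins_monro_quantile(rng.standard_normal(20000), q=0.9)
# t_hat should be close to 1.2816, the 0.9 quantile of the standard normal.
```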

Journal ArticleDOI
TL;DR: Definitions of different strengths are given for the notion of a stochastically larger component of a two-dimensional random vector; some of them reduce to the known definitions of stochastic order when the components are stochastically independent, and the definitions and the approach are related to nonparametric problems.
Abstract: Definitions of different strengths are given to the notion of ‘a stochastically larger component of a two-dimensional random vector.’ Some of them reduce to the known definitions of stochastic order relationship when the components are stochastically independent. The definitions and the approach are related to nonparametric problems.
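
For reference, the familiar univariate notion that such definitions reduce to under independence can be written as follows (this is the standard textbook definition, not quoted from the paper):

```latex
X_1 \succeq_{\mathrm{st}} X_2
\quad\Longleftrightarrow\quad
\Pr(X_1 > t) \;\ge\; \Pr(X_2 > t) \quad \text{for all } t \in \mathbb{R}.
```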

Journal ArticleDOI
TL;DR: Nonparametric tests of the null hypothesis H0: θ1 = θ2 = 0 against the restricted class of alternatives H1: θ1 ≥ 0, θ2 ≥ 0, (θ1, θ2) ≠ (0, 0) are developed for situations where the second population is generated by the administration of a treatment which, if effective, would produce a positive location shift in either variable.
Abstract: Our object is to test the null hypothesis H0: θ1 = θ2 = 0. Nonparametric tests for H0 were proposed in Chatterjee and Sen (1964) and later developed in Chatterjee and Sen (1966), Puri and Sen (1966) [see also Bickel (1965), Sugiura (1965)] to situations where θ1, θ2 can possibly assume any real values. In this paper we are specially concerned with the situation θ1 ≥ 0, θ2 ≥ 0. In other words, we want to test H0 against the restricted class of alternatives H1: θ1 ≥ 0, θ2 ≥ 0, (θ1, θ2) ≠ (0, 0). In many practical situations, e.g., where the second population is generated by the administration of a treatment which, if effective, would produce a positive location shift in either variable, we are, in testing H0, interested only in the restricted class of alternatives H1. Tests meant for unrestricted alternatives, such as those in Chatterjee and Sen (1964), are of course consistent against H1 and may be used for the restricted problem. However, tests specifically designed for the detection of H1 need not bother about the three quadrants other than the first (in the (θ1, θ2)-plane) and hence may be more sensitive against H1. In this paper we are concerned precisely with the development of such tests.

Journal ArticleDOI
TL;DR: This article describes how following good measurement construction procedures results in normal variables, for which the use of nonparametric statistics would simply be inappropriate.
Abstract: This article describes how following good measurement construction procedures results in normal variables, for which the use of nonparametric statistics would simply be inappropriate.


Journal ArticleDOI
TL;DR: The significance of observed differences in well yields with respect to variation in controlling hydrogeologic factors is tested using the Kruskal-Wallis one-way analysis of variance and the Mann-Whitney U test.
Abstract: Appropriate nonparametric or distribution-free statistical techniques are useful tools when data do not satisfy the conditions required by parametric statistical tests, and may be applied to a variety of hydrogeological problems. Two nonparametric tests, the Kruskal-Wallis One-Way Analysis of Variance and the Mann-Whitney U Test, were used to test the significance of observed differences in well yields with respect to variation in controlling hydrogeologic factors. This paper presents the steps involved in performing these two tests, with one example for each, and suggests other applications to water-related problems. To avoid computational errors and save time, a computer program was written for calculating the statistics used in these tests.
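
A minimal sketch of the two tests on made-up well-yield data (the groups, units, and values below are hypothetical, and scipy is used in place of the program described in the paper):

```python
from scipy import stats

# Hypothetical well yields, in gallons per minute, grouped by aquifer rock type
yields_granite   = [3.1, 5.4, 2.2, 4.8, 3.9]
yields_sandstone = [8.2, 6.5, 9.1, 7.4, 10.3]
yields_shale     = [2.8, 3.5, 4.1, 2.0, 3.3]

# Kruskal-Wallis one-way analysis of variance by ranks across all k groups
h_stat, p_kruskal = stats.kruskal(yields_granite, yields_sandstone, yields_shale)

# Mann-Whitney U test for a pairwise comparison of two of the groups
u_stat, p_mannwhitney = stats.mannwhitneyu(yields_granite, yields_sandstone,
                                           alternative="two-sided")
```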

Journal ArticleDOI
TL;DR: In this article, some properties are developed for nonparametric estimators of the age of a Galton-Watson branching process, and a Monte-Carlo simulation shows that these estimators compare favourably with a parametric estimator studied earlier by Stigler.
Abstract: SUMMARY Some properties are developed for some nonparametric estimators of the age of a Galton-Watson branching process. A Monte-Carlo simulation shows that these estimators compare favourably with a parametric estimator studied earlier by Stigler (1970).

Journal ArticleDOI
TL;DR: A class of bivariate rank tests is developed for the two-sample problem of testing equality of distributions against certain one-sided alternatives, and their asymptotic distributions are obtained under the null hypothesis and under local alternatives.

Journal ArticleDOI
TL;DR: The distribution function of the Student's t distribution is often needed in applied statistics, e.g., in computing the significance probability (p-value) of the usual test when comparing two means, and in computing tables needed for statistical procedures.
Abstract: The distribution function of the Student's t distribution is often needed in applied statistics, e.g., in computing the significance probability (p-value) of the usual test when comparing two means, and in computing tables needed for statistical procedures.
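
In modern software the same quantity is available directly; a one-line sketch (the observed statistic and degrees of freedom are made-up numbers):

```python
from scipy import stats

t_observed = 2.31   # hypothetical two-sample t statistic
df = 18             # hypothetical degrees of freedom

# Two-sided significance probability: sf is the upper-tail (survival) function
p_value = 2 * stats.t.sf(abs(t_observed), df)
```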

Journal ArticleDOI
TL;DR: Under the assumption of two equiprobable classes that are normally distributed with equal covariance matrices, it is shown that the LSC is equivalent to Wald's sequential probability ratio test.
Abstract: A nonparametric sequential pattern classifier called a linear sequential classifier (LSC) is presented. The pattern components are measured sequentially and the decisions either to measure the next component or to stop and classify the pattern are made using linear functions derived from sample patterns based on the least mean-square error criterion. The required linear functions are computed using an adaption of Greville's recursive algorithm for computing the generalized inverse of a matrix. A recursive algorithm for computing the least mean-square error is given and is used to determine the order in which the pattern components are measured. Under the assumption of two equiprobable classes that are normally distributed with equal covariance matrices, it is shown that the LSC is equivalent to Wald's sequential probability ratio test. Computer-simulated experiments indicate that the LSC is more effective than existing nonparametric sequential classifiers.
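
The sketch below shows only the least mean-square-error linear function at the heart of such a classifier, computed with the Moore-Penrose generalized inverse; the sequential measurement ordering, Greville's recursive construction of the inverse, and the stopping rule of the LSC are not reproduced, and the two simulated classes are an assumption of the example.

```python
import numpy as np

rng = np.random.default_rng(7)
# Two hypothetical equiprobable classes with three pattern components each
class_a = rng.multivariate_normal([0.0, 0.0, 0.0], np.eye(3), 100)
class_b = rng.multivariate_normal([1.5, 1.0, 0.5], np.eye(3), 100)

# Regress +1 / -1 class labels on the (augmented) pattern components using the
# generalized inverse; this gives the least mean-square-error linear function.
X = np.vstack([class_a, class_b])
X = np.hstack([X, np.ones((X.shape[0], 1))])        # constant (threshold) term
labels = np.concatenate([np.ones(100), -np.ones(100)])

w = np.linalg.pinv(X) @ labels                      # weight vector of the linear classifier
decisions = np.sign(X @ w)                          # +1 -> class A, -1 -> class B
training_accuracy = np.mean(decisions == labels)
```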

Journal ArticleDOI
TL;DR: In this paper, a nonparametric null hypothesis is formulated in such a way that it involves the parameter for which we want to find a confidence interval, and the confidence interval then consists of the set of parameter values for which the null hypothesis was not rejected.
Abstract: In recent years, more and more statisticians have come to appreciate the advantages of nonparametric tests. Not only do nonparametric tests often have surprisingly high efficiency relative to their normal-theory equivalents even under the assumption of normality, but they are also less sensitive to the influence of "wild" observations than are the normal-theory equivalents. Much less appreciated seems to be the fact that many nonparametric tests can also be converted into confidence intervals for suitably chosen parameters. The basic idea is very simple. A nonparametric null hypothesis is formulated in such a way that it involves the parameter for which we want to find a confidence interval. The confidence interval then consists of the set of parameter values for which the null hypothesis is not rejected. To the extent that the test procedure is distribution-free, the resulting confidence interval is also distribution-free. In general, trial-and-error methods are required to determine the boundaries between acceptable and nonacceptable parameter values. Often it is possible to systematize this trial-and-error approach. And in some cases we can actually specify the endpoints in terms of the available observations. In what follows we shall investigate a class of procedures of this type. In general, the results are not new, though no attempt will be made to provide original references. When using nonparametric procedures it is customary to assume that sampled populations are continuous in order to insure the distribution-free character of the procedure when the null hypothesis is true. For the present we shall make the same assumption, though we shall see later that omission of this assumption does not cause any undue difficulties as far as confidence intervals are concerned. We illustrate our approach with a very simple and well-known example. Let us find a confidence interval for the median η of an arbitrary (continuous) population on the basis of a random sample z1, ..., zn. We start with the sign test of the hypothesis H0: η = η0 against a two-sided alternative. The test can be performed as follows. We define two statistics,
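
The sign-test example in the abstract is one of the cases where the interval endpoints can be written directly in terms of the ordered observations. A sketch of that construction (the exponential sample is just illustrative data):

```python
import numpy as np
from scipy import stats

def sign_test_median_ci(sample, confidence=0.95):
    """Distribution-free confidence interval for the median via the sign test.

    H0: eta = eta0 is retained exactly when eta0 lies between the (k+1)-th
    smallest and (k+1)-th largest observations, where k is the largest integer
    with P(S <= k) <= alpha/2 for S ~ Binomial(n, 1/2).
    """
    z = np.sort(np.asarray(sample, dtype=float))
    n = len(z)
    alpha = 1.0 - confidence
    k = int(stats.binom.ppf(alpha / 2.0, n, 0.5))
    if stats.binom.cdf(k, n, 0.5) > alpha / 2.0:
        k = max(k - 1, 0)
    coverage = 1.0 - 2.0 * stats.binom.cdf(k, n, 0.5)   # actual confidence achieved
    return z[k], z[n - 1 - k], coverage

rng = np.random.default_rng(8)
data = rng.exponential(scale=2.0, size=25)          # skewed sample, median unknown
low, high, achieved_confidence = sign_test_median_ci(data)
```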




01 Jan 1972
TL;DR: In this article, a Monte Carlo simulation technique was used to investigate the power efficiency of three nonparametric two-sample tests, i.e., the sign test, the Kolmogorov-Smirnov test, and the Mann-Whitney U test, compared with the power of their t-test equivalent.
Abstract: A Monte Carlo simulation technique was used to investigate the power efficiency of three nonparametric two-sample tests. The power of the sign test, the Kolmogorov-Smirnov test, and the Mann-Whitney U test was compared with the power of their t-test equivalent: the paired t-test in the case of the sign test, and the t-test for independent samples for the Kolmogorov-Smirnov test and the Mann-Whitney test. The simulation process permitted the investigation of a wide range of parameters. Each test was investigated for one-tailed significance levels of .05 and .01; equal samples of size m = n = 6(2)20, 30, 40, 50; and location-shift alternatives θ = 0.0(0.2)1.0, 2.0, 3.0, where θ = μ2 − μ1. Restrictions on computer time prevented the analysis from encompassing a wider range of parameters. The analysis was performed on an IBM 360/65 computer with a simulation process based on a Monte Carlo procedure for generating random normal deviates. Random samples of equal size were generated from normal distributions with equal variances of one, the first sample being drawn from a distribution with μ = 0 and the second sample from a distribution with μ = θ. Two thousand separate samples were tested for each set of parameters for samples of size 6 to 20, and 1,000 repetitions for samples of size 30 to 50. Power was obtained by establishing a decision rule and determining the number of rejections in the total number of test samples. The findings were divided into two categories: probability of a Type I error (θ = 0.0) and power efficiency. The results obtained from simulating the probability of a Type I error indicate that, in general, each nonparametric and parametric test was operating under similar test conditions and, therefore, valid findings were produced in the study. However, for the Kolmogorov-Smirnov test, which is based upon the establishment of cumulative frequency distributions, it was necessary to increase the number of class intervals in the cumulative distributions to 2(n + m) before valid results were obtained. The power efficiency of the sign test decreased from approximately 80 percent for the smaller parameter values of n and θ to approximately 60 percent as the parameters increased. Over the same range of parameter values, the relative efficiency of the K-S test increased from approximately 50 to 70-75 percent, and all of the power efficiency values for the U-test fluctuated, primarily, between 90 and 100 percent. A slight increase in power efficiency was noted for both the sign test and the Kolmogorov-Smirnov test as the significance level decreased. Sampling error prevented any patterns from emerging as parameters changed for the U-test. It was anticipated that the K-S test would outperform the sign test for all parameter values. This proved not to be true for the smaller parameters. The power of the K-S test relies upon the assumption of continuous distributions, and if this assumption is violated by creating too few classes then performance suffers. Therefore, the researcher is advised to use at least 2(n + m) class intervals in the test procedure. The power of the U-test was found to be very close to that of the t-test. The U-test is recommended over the t-test in all cases for testing the hypothesis of equal means, except those in which the underlying distributions can be safely assumed to be normal. The Kolmogorov-Smirnov test is preferred to the sign test when large samples or large location-shift alternatives are encountered. However, when small samples or small alternatives are involved, the evidence of this study favors the sign test, especially when the ease of computation is considered.
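
A much smaller sketch of the same Monte Carlo approach, estimating the power of the t-test, the Mann-Whitney U test, and the Kolmogorov-Smirnov test under a normal location shift. Two-sided tests and the specific n, shift, and repetition count are simplifying assumptions; the study above used one-tailed tests and far more parameter combinations.

```python
import numpy as np
from scipy import stats

def simulated_power(test, n=10, shift=0.8, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo power estimate for a two-sample normal location-shift alternative."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        y = rng.standard_normal(n) + shift
        if test == "t":
            p = stats.ttest_ind(x, y).pvalue
        elif test == "mann-whitney":
            p = stats.mannwhitneyu(x, y, alternative="two-sided").pvalue
        elif test == "ks":
            p = stats.ks_2samp(x, y).pvalue
        rejections += (p < alpha)
    return rejections / reps

power = {t: simulated_power(t) for t in ("t", "mann-whitney", "ks")}
# Consistent with the findings above, the Mann-Whitney power should track the
# t-test closely, with the Kolmogorov-Smirnov test somewhat behind.
```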


Journal ArticleDOI
TL;DR: If ties exist among the k sample values, and the researcher has access to a program like that described in this paper, then the tests presented by Ferguson (1965, 1971) are a viable alternative and offer many statistical advantages.
Abstract: an a priori ordering among the k treatment groups. For such situations, trend analysis is often an appropriate statistical technique. Comprehensive discussions of the use of trend analysis with parametric data are available in statistics textbooks commonly used by behavioral scientists (e.g., Ferguson, 1971; Hays, 1963; Kirk, 1968; Winer, 1962). Analogous procedures for ranked data have been presented by Ferguson (1965, 1971), Jonckheere (1954a, 1954b), Jonckheere and Bower (1967), May and Konkin (1970), and Page (1963). Those described by May and Konkin (1970), for testing ordered hypotheses for k independent samples, and Page (1963), for testing ordered hypotheses for k correlated samples, are accompanied by extensive tables and are computationally facile. However, if ties exist among the k sample values, and the researcher has access to a program like that described in this paper, then the tests presented by Ferguson (1965, 1971) are a viable alternative. The tests described by Ferguson employ the statistic S, which is related to Kendall's τ and offers many statistical advantages. For example, as n increases, the sampling distribution of S rapidly approaches the normal distribution. More precisely, for n > 10, the normal approximations and exact values based on the distribution of
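
A rough sketch of the underlying idea, using Kendall's tau between a hypothetical a priori dose ordering and observed scores; the S statistic in Ferguson's tests is the numerator of tau, and the reported p-value relies on the normal approximation mentioned above. The data and the use of scipy.stats.kendalltau (rather than the program described in the paper) are assumptions of the example.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for k = 4 treatment groups with an a priori ordering
dose  = np.repeat([1, 2, 3, 4], 5)
score = np.array([3, 5, 4, 2, 6,
                  5, 7, 6, 8, 5,
                  7, 9, 8, 6, 10,
                  9, 11, 10, 12, 9])

# Kendall's tau between the ordering and the responses; ties among the scores
# are handled by the tau-b correction, and the p-value uses a normal approximation.
tau, p_value = stats.kendalltau(dose, score)
```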