
Showing papers in "Technometrics in 1979"


Journal ArticleDOI
TL;DR: The generalized cross-validation (GCV) method as discussed by the authors is a rotation-invariant version of Allen's PRESS, which can be used in subset selection and singular value truncation, and even to choose from among mixtures of these methods.
Abstract: Consider the ridge estimate β̂(λ) for β in the model y = Xβ + ε, with σ² unknown, where β̂(λ) = (XᵀX + nλI)⁻¹Xᵀy. We study the method of generalized cross-validation (GCV) for choosing a good value for λ from the data. The estimate λ̂ is the minimizer of V(λ) given by V(λ) = (1/n)‖(I − A(λ))y‖² / [(1/n) Tr(I − A(λ))]², where A(λ) = X(XᵀX + nλI)⁻¹Xᵀ. This estimate is a rotation-invariant version of Allen's PRESS, or ordinary cross-validation. This estimate behaves like a risk-improvement estimator, but does not require an estimate of σ², so it can be used when n − p is small, or even if p ≥ 2n in certain cases. The GCV method can also be used in subset selection and singular value truncation methods for regression, and even to choose from among mixtures of these methods.
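A minimal sketch of the GCV computation described above, in Python (not the authors' code; the λ grid and the direct formation of A(λ) are illustrative, for clarity rather than efficiency):

```python
import numpy as np

def gcv_score(X, y, lam):
    # V(lam): mean squared residual of the ridge fit, divided by the squared
    # normalized trace of I - A(lam), with A(lam) = X (X'X + n*lam*I)^-1 X'
    n, p = X.shape
    A = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
    resid = y - A @ y
    return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2

def gcv_ridge(X, y, lam_grid):
    # pick the lambda minimizing V on a grid, then return its ridge estimate
    best = min(lam_grid, key=lambda l: gcv_score(X, y, l))
    n, p = X.shape
    beta = np.linalg.solve(X.T @ X + n * best * np.eye(p), X.T @ y)
    return best, beta
```

For large problems one would work from the SVD of X rather than forming A(λ) explicitly; the direct form above simply mirrors the definition of V(λ).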

3,697 citations


Journal ArticleDOI
TL;DR: In this article, the treatment of residuals associated with principal component analysis (PCA) is discussed; these residuals are the difference between the original observations and the predictions of them using less than a full set of principal components.
Abstract: This paper is concerned with the treatment of residuals associated with principal component analysis. These residuals are the difference between the original observations and the predictions of them using less than a full set of principal components. Specifically, procedures are proposed for testing the residuals associated with a single observation vector and for an overall test for a group of observations. In this development, it is assumed that the underlying covariance matrix is known; this is reasonable for many quality control applications where the proposed procedures may be quite useful in detecting outliers in the data. A numerical example is included.
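As a sketch, the residual for a single observation vector and its squared norm (the quantity such tests are built on) can be computed as follows; the retained principal directions are taken as known, in the spirit of the paper's known-covariance setting:

```python
import numpy as np

def pca_residual_q(x, mean, components):
    # components: the k retained principal directions, one per row (orthonormal)
    centered = np.asarray(x, float) - mean
    fitted = components.T @ (components @ centered)  # prediction from k PCs
    residual = centered - fitted                     # part the k PCs miss
    return float(residual @ residual)                # squared residual norm
```

Large values of the squared residual norm flag an observation poorly described by the retained components, which is the outlier-detection use the abstract mentions.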

1,111 citations


Journal ArticleDOI
TL;DR: A review of the book Nonparametrics: Statistical Methods Based on Ranks.
Abstract: (1979). Nonparametrics: Statistical Methods Based on Ranks. Technometrics: Vol. 21, No. 2, pp. 272-273.

802 citations


Journal ArticleDOI
TL;DR: The rank transform is a simple procedure which involves replacing the data with their corresponding ranks as mentioned in this paper, which has been shown to be useful in hypothesis testing with respect to experimental designs and has been used in regression.
Abstract: The rank transform is a simple procedure which involves replacing the data with their corresponding ranks. The rank transform has previously been shown by the authors to be useful in hypothesis testing with respect to experimental designs. This study shows the results of using the rank transform in regression. Two sets of data given by Daniel and Wood [8] are considered for purposes of illustrating the rank transform in simple and multiple regression. Also given are the results of a Monte Carlo study which compares regression on ranks with some published Monte Carlo results on isotonic regression. This Monte Carlo study is also modified to compare regression on ranks with robust regression. Another illustration gives the results of analyses on large computer codes by regression on ranks. The rank transform is a simple, repeatable process that compares favorably with other methods such as given by Andrews [1]. Our studies indicate the method works quite well on monotonic data.

469 citations


Journal ArticleDOI
TL;DR: A four-parameter probability distribution, which includes a wide variety of curve shapes, is presented, because of the flexibility, generality, and simplicity of the distribution, it is useful in the representation of data when the underlying model is unknown.
Abstract: A four-parameter probability distribution, which includes a wide variety of curve shapes, is presented. Because of the flexibility, generality, and simplicity of the distribution, it is useful in the representation of data when the underlying model is unknown. A table based on the first four moments, which simplifies parameter estimation, is given. Further important applications of the distribution include the modeling and subsequent generation of random variates for simulation studies and Monte Carlo sampling studies of the robustness of statistical procedures.
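This reads like a percentile-defined four-parameter family; assuming the generalized lambda form R(p) = λ₁ + (p^λ₃ − (1 − p)^λ₄)/λ₂ (an assumption of this sketch, not stated in the abstract), the "generation of random variates for simulation studies" is a one-line inverse transform:

```python
import numpy as np

def gld_quantile(p, l1, l2, l3, l4):
    # percentile (quantile) function of the assumed generalized lambda family
    p = np.asarray(p, float)
    return l1 + (p**l3 - (1.0 - p)**l4) / l2

def gld_sample(n, params, seed=None):
    # inverse-transform sampling: apply the percentile function to uniforms
    u = np.random.default_rng(seed).uniform(size=n)
    return gld_quantile(u, *params)
```

Varying the four λ parameters sweeps out the wide variety of curve shapes the abstract describes; for instance λ₁ = 0, λ₂ = 2, λ₃ = λ₄ = 1 reduces the family to Uniform(−0.5, 0.5).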

418 citations


Journal ArticleDOI
TL;DR: In this article, the authors present iterative techniques for obtaining reduced rank approximation of matrices when weights are introduced, which involve criss-cross regressions with careful initialization, and possible applications of the approximation are in modelling, biplotting, contingency table analysis, fitting of missing values, checking outliers, etc.
Abstract: Reduced rank approximation of matrices has hitherto been possible only by unweighted least squares. This paper presents iterative techniques for obtaining such approximations when weights are introduced. The techniques involve criss-cross regressions with careful initialization. Possible applications of the approximation are in modelling, biplotting, contingency table analysis, fitting of missing values, checking outliers, etc.
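For the rank-one case the criss-cross iteration is just a pair of alternating weighted simple regressions; a sketch (initialized from the unweighted SVD, since the paper's "careful initialization" is not reproduced here):

```python
import numpy as np

def weighted_rank1(X, W, iters=200):
    # minimize sum_ij W_ij * (X_ij - a_i * b_j)^2 by alternating
    # weighted least squares ("criss-cross regressions") for a and b
    X = np.asarray(X, float)
    W = np.asarray(W, float)
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    a, b = u[:, 0] * s[0], vt[0].copy()      # unweighted rank-1 start
    for _ in range(iters):
        a = (W * X) @ b / np.maximum(W @ b**2, 1e-12)        # regress rows on b
        b = (W * X).T @ a / np.maximum(W.T @ a**2, 1e-12)    # regress cols on a
    return np.outer(a, b)
```

Zero weights correspond to missing or suspect cells, which is how the approximation supports the fitting of missing values and outlier checking mentioned above.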

311 citations


Journal ArticleDOI
TL;DR: In this paper, the results of the comparisons continue to support the conclusion that the T′-Method is preferred if the unbalance of the sample sizes is slight and that T′ becomes a stronger competitor as α increases.
Abstract: Tables of the upper α-points m(α; k*, v) of the studentized maximum modulus distribution with parameter k* and v degrees of freedom are given for α = .01, .05, .10, and .20, twelve values of v, and k* = k(k − 1)/2 for k = 3(1)20. The tables are of use in applying Hochberg's GT2-Method [6] of multiple comparison and extend the use of this method to cases with k ≤ 20 means. As conjectured in [15], the GT2-Method is in almost all cases the strongest of several competitors of Spjøtvoll and Stoline's T′-Method [11] for carrying out the set of k* pairwise comparisons between the means of k groups with arbitrary sample sizes. The auxiliary tables of [15], in which the T′-Method and other procedures are compared, are extended here. These tables should aid the user in choosing between the T′- and GT2-Methods. The results of the comparisons continue to support the conclusion that the T′-Method is preferred if the unbalance of the sample sizes is slight and that T′ becomes a stronger competitor as α increases.

264 citations


Journal ArticleDOI
TL;DR: The literature of ridge regression and James-Stein estimation is broadly reviewed in this paper, and critical comments are interpolated on a number of papers, expressing their viewpoints on ridge regression, and their antipathy to its mechanical use.
Abstract: The literature of ridge regression and James-Stein estimation is broadly reviewed, and critical comments are interpolated on a number of papers. The authors also express their viewpoints on ridge regression and their antipathy to its mechanical use.

216 citations


Journal ArticleDOI
TL;DR: In this paper, an initial least squares fit is obtained by treating the censored values as failures; the expected failure time for each censored observation is then estimated and used in place of the censoring time in a revised fit.
Abstract: Problems requiring regression analysis of censored data arise frequently in practice. For example, in accelerated testing one wishes to relate stress and average time to failure from data including unfailed units, i.e., censored observations. Maximum likelihood is one method for obtaining the desired estimates; in this paper, we propose an alternative approach. An initial least squares fit is obtained treating the censored values as failures. Then, based upon this initial fit, the expected failure time for each censored observation is estimated. These estimates are then used, instead of the censoring times, to obtain a revised least squares fit and new expected failure times are estimated for the censored values. These are then used in a further least squares fit. The procedure is iterated until convergence is achieved. This method is simpler to implement and explain to non-statisticians than maximum likelihood and appears to have good statistical and convergence properties. The method is illustrated by a...
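The iteration described above can be sketched as follows; the normal error model and known error standard deviation are simplifying assumptions of this sketch (the paper also estimates the spread), and the variable names are illustrative:

```python
import math
import numpy as np

def norm_pdf(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def iterative_censored_fit(x, t, censored, sigma=1.0, iters=25):
    # x: stress levels, t: failure or censoring times, censored: bool flags
    x = np.asarray(x, float)
    t = np.asarray(t, float)
    y = t.copy()                    # start: treat censored values as failures
    slope = intercept = 0.0
    for _ in range(iters):
        slope, intercept = np.polyfit(x, y, 1)      # least squares fit
        mu = intercept + slope * x
        for i in np.where(censored)[0]:
            z = (t[i] - mu[i]) / sigma
            # replace censoring time by E[T | T > t_i] under the normal model
            y[i] = mu[i] + sigma * norm_pdf(z) / max(1.0 - norm_cdf(z), 1e-12)
    return slope, intercept
```

Each pass imputes an expected failure time no smaller than the censoring time, refits, and repeats until the fit stabilizes, matching the loop the abstract describes.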

182 citations


Journal ArticleDOI
TL;DR: In this paper, a nonparametric procedure is developed for the problem of quickly detecting any shift in the mean of a sequence of observations from a specified control value, based on Wilcoxon signed-rank statistics where ranking is within groups.
Abstract: A nonparametric procedure is developed for the problem of quickly detecting any shift in the mean of a sequence of observations from a specified control value. The proposed procedure is based on Wilcoxon signed-rank statistics where ranking is within groups. A cumulative sum control chart type stopping rule is used with the Wilcoxon statistics. Using a Markov chain approach, the average run length of the procedure can be computed exactly for any distribution for which the distribution of the Wilcoxon signed-rank statistic is known. The procedure has the same average run length for any continuous distribution which is symmetric about the control value.
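The stopping rule can be sketched as follows; the reference value k and decision limit h below are illustrative, not the paper's tabled choices, and ties in the within-group ranking are not handled:

```python
import numpy as np

def signed_rank_stat(group, target):
    # centered Wilcoxon signed-rank statistic, ranking |deviations|
    # within the group, so the in-control mean is 0
    d = np.asarray(group, float) - target
    r = np.abs(d).argsort().argsort() + 1      # ranks 1..n of |d|
    n = len(d)
    return r[d > 0].sum() - n * (n + 1) / 4

def cusum_stop(groups, target, k=0.0, h=10.0):
    # one-sided CUSUM on the per-group statistics; returns the index of
    # the group at which the chart signals, or None if it never signals
    s = 0.0
    for i, g in enumerate(groups):
        s = max(0.0, s + signed_rank_stat(g, target) - k)
        if s > h:
            return i
    return None
```

Because the statistic depends on the data only through signs and within-group ranks, its in-control run-length behavior is the same for any continuous distribution symmetric about the control value, as the abstract states.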

140 citations


Journal ArticleDOI
TL;DR: In this paper, a bias correction using higher derivatives of the log likelihood is shown to be effective in a simulation study, and the procedure is illustrated in a medical diagnosis problem.
Abstract: Maximum likelihood estimates of the parameters for logistic discrimination show considerable bias in small samples. Adjustment by a bias correction using higher derivatives of the log likelihood is shown to be effective in a simulation study. The procedure is illustrated in a medical diagnosis problem.

Journal ArticleDOI
TL;DR: In this paper, Bayesian results for the inverse Gaussian family of distributions with noninformative reference prior as well as the natural conjugate prior were derived for equipment failure data.
Abstract: Some Bayesian results are derived for the inverse Gaussian family of distributions with noninformative reference prior as well as the natural conjugate prior. With a particular parameterization, the posterior distributions are found to have remarkable similarities with the corresponding results for the normal model. Finally, an application of the Bayesian results is given toward analyzing some equipment failure data.

Journal ArticleDOI
TL;DR: For particular populations, the change in probability of correct classification caused by adding dimensions is given, offering insight into how many variables one should use for fixed training data sizes, especially when dealing with the populations of these studies.
Abstract: This paper is a continuation of earlier work (Van Ness and Simpson [9]) studying the high dimensionality problem in discriminant analysis. Frequently one has potentially many possible variables (dimensions) to be measured on each object but is limited to a fixed training data size. For particular populations, we give here the change in probability of correct classification caused by adding dimensions. This gives insight into how many variables one should use for fixed training data sizes, especially when dealing with the populations of these studies. We consider six basic discriminant analysis algorithms. Graphs are provided which compare the relative performance of the algorithms in high dimensions.

Journal ArticleDOI
TL;DR: In this article, the authors obtained orthogonal main-effect plans for experiments of the type 4 r 3 s 2 m by collapsing the levels of one or more four-level factors.
Abstract: Resolution III designs for 43·2 m experiments are constructed. These plans are available in 4 n runs, where n is a multiple of four, and permit estimation of all main-effects orthogonally. Once the plans for 43·2 m experiments are obtained, one can get orthogonal main-effect plans for experiments of the type 4 r 3 s 2 m by collapsing the levels of one or more four-level factors.

Journal ArticleDOI
TL;DR: In this article, an objective procedure for the detection of outliers is given by using Akaike's information criterion and numerical illustrations are given, using data from Grubbs [8] and Tietjen and Moore [7].
Abstract: An objective procedure for the detection of outliers is given by using Akaike's information criterion. Numerical illustrations are given, using data from Grubbs [8] and Tietjen and Moore [7].

Journal ArticleDOI
TL;DR: In this article, integrated mean squared error is used as a criterion to determine when the least squares, principal component, and ridge regression estimators of regression coefficients can produce satisfactory prediction equations in the presence of a multicollinear design matrix.
Abstract: Prediction equations constructed from multiple linear regression analyses are often intended for use in predicting response values throughout a region of the space of the predictor variables. Criteria for evaluating prediction equations, however, have generally concentrated attention on mean squared error properties of the estimated regression coefficients or on mean squared error properties of the predictor at the design points. If adequate prediction throughout a region of the space of predictor variables is the goal, neither of these criteria may be satisfactory in assessing the predictor. In this paper integrated mean squared error is used as a criterion to determine when the least squares, principal component, and ridge regression estimators of regression coefficients can produce satisfactory prediction equations in the presence of a multicollinear design matrix.

Journal ArticleDOI
TL;DR: In this paper, a recursive method is described to obtain certain distributional results relating to order statistics, which may be used to obtain the testing of single and multiple outliers, and some useful inequalities for tail probabilities.
Abstract: A recursive method is described, which may be used to obtain certain distributional results relating to order statistics. The procedure is applied to gamma samples to obtain null distributions of various statistics appropriate for the testing of single and multiple outliers, and some useful inequalities for tail probabilities.

Journal ArticleDOI
TL;DR: This paper serves as a sequel to the earlier review paper on mixtures, discussing the literature on mixture experiments from 1973 to the present: techniques for analyzing mixture data, additional design configurations for covering the whole simplex, and designs for exploring restricted regions of the simplex.
Abstract: This paper serves as a sequel to the earlier review paper on mixtures [4] by discussing the literature on mixture experiments dating from 1973 to the present. The sections consist of techniques for analyzing mixture data, additional design configurations for covering the whole simplex, and designs for exploring restricted regions of the simplex. The material from each of twenty-two papers is given a brief exposure, and suggestions are made of topics for future research.

Journal ArticleDOI
TL;DR: In this paper, a multiple methods comparison technique for this case is proposed, and is illustrated by an example from the field of clinical chemistry, showing that none of the individual methods are known to measure truth.
Abstract: The basic sciences all require an ability to measure the amounts of substances under study. With new methods of measurement constantly being proposed there is a need for techniques for comparing these methods in terms of their precision and accuracy. Of particular interest is the case in which none of the individual methods are known to measure “truth”. A multiple methods comparison technique for this case is proposed in this paper, and is illustrated by an example from the field of clinical chemistry. Estimates of the components of variance for each method are developed, and some of their properties explored.

Journal ArticleDOI
TL;DR: In this article, a procedure for screening using a variable X which will guarantee that 100 δ% of Y values for screened items will be between L and U (two specification limits) is described.
Abstract: A procedure is described for screening using a variable X which will guarantee that 100δ% of Y values for screened items will be between L and U (two specification limits). The random variables (X, Y) are assumed to have a joint bivariate normal distribution. Tables for implementing the procedure are also given. The procedure is then extended to the case where all the parameters are unknown.
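A Monte Carlo check of such a screening rule, assuming a standard bivariate normal (X, Y) with correlation ρ and an acceptance region X ≥ c (the cutoff and parameter values are illustrative, not the paper's tables):

```python
import math
import numpy as np

def screened_conformance(rho, L, U, c, n=200000, seed=0):
    # estimate P(L <= Y <= U | X >= c) for standard bivariate normal (X, Y)
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = rho * x + math.sqrt(1.0 - rho**2) * rng.normal(size=n)  # corr(X,Y)=rho
    keep = y[x >= c]                      # Y values for items passing the screen
    return float(((keep >= L) & (keep <= U)).mean())
```

Screening on a strongly correlated X raises the conformance rate of Y among accepted items, which is what the tabled cutoffs are designed to guarantee at level 100δ%.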

Journal ArticleDOI
TL;DR: In this paper, the authors derived methods for constructing prediction limits for the minimum or more generally, the jth smallest of some set of future observations from a Weibull or extreme value population.
Abstract: Methods are derived for constructing prediction limits for the minimum or, more generally, the jth smallest of some set of future observations from a Weibull or extreme-value population. Methods are also given for testing the difference of Weibull shape or scale parameters. All methods derived apply to both complete and censored samples.

Journal ArticleDOI
TL;DR: In this article, a new approach is proposed for testing goodness of fit for censored samples, where a transformation is applied so that the transformed censored sample behaves under the null hypothesis like a complete sample from the uniform (0, 1) distribution.
Abstract: A new approach is proposed for testing goodness of fit for censored samples. A transformation is applied so that the transformed censored sample behaves under the null hypothesis like a complete sample from the uniform (0,1) distribution. Any standard goodness-of-fit test can then be applied using existing tables. Empirical comparisons indicate that the proposed technique provides better overall power than other existing methods.

Journal ArticleDOI
TL;DR: In this paper, the critical values for testing for a two-phase regression are given by simulation methods and the dependence of the test on the values of the independent variable is investigated.
Abstract: The critical values for testing for a two-phase regression are given by simulation methods. The dependence of the test on the values of the independent variable is investigated, and a comparison is made between the critical values from the simulation and those given by using Bonferroni's inequality.

Journal ArticleDOI
TL;DR: The main purpose of this paper is to demonstrate the usefulness of a simplex search procedure for the problem of augmenting a given experimental design to “optimize” it with respect to some criterion.
Abstract: The main purpose of this paper is to demonstrate the usefulness of a simplex search procedure for the problem of augmenting a given experimental design to “optimize” it with respect to some criterion. In particular, the paper demonstrates the ability of the simplex search procedure to add m points simultaneously to an initial design in order to maximize |X′X|. In the process, the paper presents a method for improving the Nelder-Mead simplex search procedure.

Journal ArticleDOI
TL;DR: In this article, integrated mean square error is proposed as an optimizing criterion for the linear inverse estimation problem and a generalization allowing for several predictor variables is considered, under certain conditions both the classical and the inverse methods may tend towards the optimal estimator.
Abstract: Integrated mean square error is proposed as an optimizing criterion for the linear inverse estimation problem. Results are compared with the classical and inverse procedures. A generalization allowing for several predictor variables is considered. Under certain conditions both the classical and the inverse methods may tend towards the optimal estimator. Also, the sample inverse estimator is shown to be optimal for a realistic situation.

Journal ArticleDOI
TL;DR: In this article, the detection of singularities in the multiresponse dispersion matrix is discussed, and a procedure for this is presented, and the situation is also discussed in which linear dependencies in the data may be ignored to advantage.
Abstract: Care is required when analysing multiple response data if misleading results are to be avoided. Box et al. [4] have warned of errors in analysis resulting from linear relationships among the response data, and have provided a detection procedure. This is effective for most situations; however, we have encountered two cases that require additional consideration. This paper extends the work of Box et al. to include these cases. The key issue is the detection of singularities in the multiresponse dispersion matrix, and a procedure for this is presented. The situation is also discussed in which linear dependencies in the data may be ignored to advantage. Illustrations are from chemical kinetics investigations.

Journal ArticleDOI
TL;DR: In this paper, an efficient implicit enumeration algorithm is proposed for the problem of selecting subsets of predictor variables in a multiple linear regression model using the minimum sum of weighted absolute errors (MSWAE) criterion.
Abstract: An efficient implicit enumeration algorithm is proposed for the problem of selecting subsets of predictor variables in a multiple linear regression model using the minimum sum of weighted absolute errors (MSWAE) criterion. The proposed algorithm is illustrated with an example. Computational experience shows that the proposed algorithm is superior to the currently available algorithm in terms of computation time and the number of iterations required to solve a problem.

Journal ArticleDOI
TL;DR: In this paper, the Box-Andersen test and the jackknife are compared in the independent and identically distributed (i.i.d.) case, and the difference between the two robust procedures seems slight.
Abstract: Based on Pitman asymptotic relative efficiency and Monte Carlo studies of power functions, Shorack [29] concluded in the independent and identically distributed (i.i.d.) case that the most satisfactory tests for scale changes were the Box-Andersen test and the jackknife. Analogs of these two procedures are shown to have identical asymptotic properties when the observations follow a correctly identified autoregressive process. A Monte Carlo study indicates that the performance of these robust procedures is roughly the same as in the i.i.d. case. The difference between the two robust procedures seems slight. The Box-Andersen procedure does a better job of maintaining the stated significance level but it lags slightly behind the jackknife in power.

Journal ArticleDOI
TL;DR: In this article, exact percentage points are given for the ratio of the largest eigenvalue to the trace of Z′Z, where Z is a matrix having independent standard normal variates as entries.
Abstract: Some exact percentage points are given for the ratio of the largest eigenvalue to the trace of Z′Z, where Z is a matrix having independent standard normal variates as entries. These percentage points are useful in certain analysis of variance models for two-way cross-classified data, for which only approximations have been given to date. They also have implications for principal component analysis.

Journal ArticleDOI
TL;DR: In this article, the precision of the predicted response is given a serious consideration in the search for optimum operating conditions, and the authors consider two cases, one in which the variance contours are ellipsoidal and the more general (and perh...
Abstract: The purpose of this paper is to introduce a certain modification to the technique of ridge analysis. The standard ridge analysis is employed in order to find conditions on a set of design variables that maximize (or minimize) an estimated second order response function on spheres of varying radii. See for example [3], [4], and [6]. The standard ridge analysis does not account for the fact that the experimental design (or lack of design) might result in large fluctuations in the variance of the predicted response on a sphere of a given radius. Thus the technique might well produce poor estimates of maximum responses and also conditions that give rise to it. Obviously one can expect the difficulty when non-rotatable designs are used to estimate the second order function. In this paper the precision of the predicted response is given a serious consideration in the search for optimum operating conditions. We consider two cases, one in which the variance contours are ellipsoidal, and the more general (and perh...