
Showing papers on "Proper linear model published in 1976"



Book
01 Jan 1976
TL;DR: A book-length introduction to linear regression and correlation.
Abstract: An introduction to linear regression and correlation.

660 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that under very general circumstances coefficients in multiple regression models can be replaced with equal weights with almost no loss in accuracy on the original data sample, and that these equal weights will have greater robustness than least squares regression coefficients.
Abstract: It is proved that under very general circumstances coefficients in multiple regression models can be replaced with equal weights with almost no loss in accuracy on the original data sample. It is then shown that these equal weights will have greater robustness than least squares regression coefficients. The implications for problems of prediction are discussed. In the two decades since Meehl's (1954) book on the respective accuracy of clinical versus clerical prediction, little practical consequence has been observed. Diagnoses are still made by clinicians, not by clerks; college admissions are still done by committee, not by computer. This is true despite the considerable strength of Meehl's argument that humans are very poor at combining information optimally and that regression models evidently combine information rather well. These points were underlined in some recent work by Dawes and Corrigan (1974), in which they found again that human predictors do poorly when compared with regression models. Strikingly, they found that, for some reason, linear models with random regression weights also do better than humans do. Even more striking, when all regression weights were set equal to one another they found still higher correlation with the criterion on a validating sample. The obvious question here is: Why? Is it because humans are so terrible at combining information that almost any rule works better, or is it some artifact of linear regression?

636 citations
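
A minimal numpy sketch of the phenomenon described above (hypothetical data, not from the paper): least-squares weights and unit weights applied to standardized predictors are compared by their correlation with the criterion on a holdout half of the sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: criterion is a noisy linear function of 5 predictors.
n, p = 100, 5
X = rng.standard_normal((n, p))
y = X @ np.array([0.5, 0.4, 0.3, 0.2, 0.1]) + rng.standard_normal(n)

train, test = slice(0, 50), slice(50, None)

# Standardize predictors using training statistics only.
mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
Z = (X - mu) / sd

# Least-squares weights fitted on the training half.
b, *_ = np.linalg.lstsq(Z[train], y[train], rcond=None)

pred_ols = Z[test] @ b            # differential (least squares) weights
pred_unit = Z[test].sum(axis=1)   # all weights set equal to one

def validity(pred, crit):
    return np.corrcoef(pred, crit)[0, 1]

print("validity of OLS weights :", validity(pred_ols, y[test]))
print("validity of unit weights:", validity(pred_unit, y[test]))
```

On data like these the two validities are typically very close, which is the paper's point: the equal weights sacrifice almost nothing and cannot overfit the training sample.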


Journal ArticleDOI
TL;DR: In this paper, the linear least squares prediction approach is applied to some problems in two-stage sampling from finite populations, and a theorem giving the optimal estimator and its error-variance under a general linear "superpopulation" model for a finite population is stated.
Abstract: The linear least-squares prediction approach is applied to some problems in two-stage sampling from finite populations. A theorem giving the optimal (BLU) estimator and its error-variance under a general linear “superpopulation” model for a finite population is stated. This theorem is then applied to a model describing many populations whose elements are grouped naturally in clusters. Next, the probability model is used to analyze various conventional estimators and certain estimators suggested by the theory as alternatives to the conventional ones. Problems of design are considered, as are some consequences of regression-model failure.

183 citations


Journal ArticleDOI
TL;DR: In this paper, a unified approach to the study of biased estimators is provided in an effort to determine their relative merits; the class includes the simple and generalized ridge estimators, the principal component estimator with extensions such as that proposed by Marquardt [19], and the shrunken estimator proposed by Stein [23].
Abstract: Biased estimators of the coefficients in the linear regression model have been the subject of considerable discussion in the recent literature. The purpose of this paper is to provide a unified approach to the study of biased estimators in an effort to determine their relative merits. The class of estimators includes the simple and the generalized ridge estimators proposed by Hoerl and Kennard [9], the principal component estimator with extensions such as that proposed by Marquardt [19], and the shrunken estimator proposed by Stein [23]. The problem of estimating the biasing parameters is considered and illustrated with two examples.

182 citations
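
To make the family of estimators concrete, here is a small numpy sketch (synthetic, nearly collinear data; the eigenvalue retention threshold is an arbitrary choice for illustration) computing OLS, a simple ridge estimator, and a principal-component estimator side by side.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ill-conditioned design: first two predictors nearly collinear.
n = 60
x1 = rng.standard_normal(n)
X = np.column_stack([x1, x1 + 0.05 * rng.standard_normal(n),
                     rng.standard_normal(n)])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.standard_normal(n)

XtX, Xty = X.T @ X, X.T @ y

# Ordinary least squares.
b_ols = np.linalg.solve(XtX, Xty)

# Simple ridge estimator: (X'X + kI)^{-1} X'y.
k = 1.0
b_ridge = np.linalg.solve(XtX + k * np.eye(3), Xty)

# Principal component estimator: invert X'X only on the retained eigenspace.
w, V = np.linalg.eigh(XtX)
keep = w > 0.1 * w.max()          # crude retention rule, purely illustrative
b_pc = V[:, keep] @ ((V[:, keep].T @ Xty) / w[keep])

print("OLS:  ", b_ols)
print("ridge:", b_ridge)
print("PC:   ", b_pc)
```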


Journal ArticleDOI
TL;DR: Efficient algorithms are developed for linear spline and piecewise multiple linear regression, together with a plotting procedure, built as an adjunct to one of the algorithms, that shows the existence and location of changes in linear regression models.
Abstract: This paper develops some efficient algorithms for linear spline and piecewise multiple linear regression. A plotting procedure that shows the existence and location of changes in linear regression models is developed as an adjunct to one of the algorithms. The algorithms are compared with other presently available algorithms both in terms of efficiency and in terms of performance on sets of artificial data. An example shows how the algorithms, implemented in portable FORTRAN IV, can be used profitably in the analysis of data.

85 citations
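
The brute-force version of the idea, as a sketch: fit a continuous one-knot linear spline by least squares at each candidate knot and keep the knot with the smallest residual sum of squares. The paper's algorithms are more efficient; this grid search on synthetic data only illustrates what they compute.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data with a slope change at x = 0.5.
x = np.sort(rng.uniform(0, 1, 80))
y = np.where(x < 0.5, 2 * x, 1 + 0.2 * (x - 0.5)) + 0.05 * rng.standard_normal(80)

def fit_one_knot(x, y, knot):
    """Least-squares fit of a continuous linear spline with one knot."""
    B = np.column_stack([np.ones_like(x), x, np.maximum(x - knot, 0.0)])
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    rss = np.sum((y - B @ coef) ** 2)
    return coef, rss

# Grid search over candidate knot locations.
knots = np.linspace(0.1, 0.9, 81)
best = min(knots, key=lambda k: fit_one_knot(x, y, k)[1])
print("estimated change point:", best)
```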


Journal ArticleDOI
TL;DR: It is shown that the three approaches to detection of outliers from the general linear model Y = Xβ + μ are exactly equivalent.
Abstract: Several authors have considered the problem of detection of outliers from the general linear model Y = Xβ + μ. Ellenberg [1973], among others, has advocated use of a detection method which involves examination of the set of internally standardized least squares residuals. Mickey [1974] and Snedecor and Cochran [1968], apparently concerned about the usefulness of an outlier detection method which is based on residual estimates that are themselves biased by the presence of the outlier, have proposed two other alternatives. It is shown that the three approaches are exactly equivalent. A detection procedure is described which uses as its test statistic the maximum of the internally standardized least squares residuals, and upper and lower bounds for the percentage points of the test statistic are given by Bonferroni inequalities. The computations required to obtain these approximate percentage points are illustrated in a numerical example. Finally, a brief simulation study of the performance of the procedure illustrates that the power of the test can be influenced by the position of the outlier vis-à-vis the structure of the design matrix X.

48 citations
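
A short numpy sketch of the test statistic described here, on synthetic data with one planted outlier: internally standardized least squares residuals e_i / sqrt(s²(1 − h_ii)) and their maximum absolute value. The Bonferroni percentage-point bounds from the paper are not reproduced.

```python
import numpy as np

def max_standardized_residual(X, y):
    """Internally standardized residuals and the max-|r| outlier statistic."""
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
    e = y - H @ y                           # least squares residuals
    s2 = e @ e / (n - p)                    # residual variance estimate
    r = e / np.sqrt(s2 * (1 - np.diag(H)))  # internally standardized
    return r, np.max(np.abs(r))

# Hypothetical data with one planted outlier at index 10.
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(30), rng.standard_normal(30)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(30)
y[10] += 6.0

r, stat = max_standardized_residual(X, y)
print("max |r_i| =", stat, "at index", np.argmax(np.abs(r)))
```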


Book ChapterDOI
01 Jan 1976

38 citations



Journal ArticleDOI
TL;DR: A procedure which solves the dual of the original linear programming formulation by the dual simplex method with upper bounded variables, in addition to utilizing the special structure of the constraint matrix from the point of view of storage and computation, performs the best in terms of both computational efficiency and storage requirements.
Abstract: The ordinal regression problem is an extension to the standard multiple regression problem in terms of assuming only ordinal properties for the dependent variable (rank order of preferred brands in a product class, academic ranks for students in a class, etc.) while retaining the interval scale assumption for independent (or predictor) variables. The linear programming formulation for obtaining the regression weights for ordinal regression, developed in an earlier paper, is outlined and computational improvements and alternatives which utilize the special structure of this linear program are developed and compared for their computational efficiency and storage requirements. A procedure which solves the dual of the original linear programming formulation by the dual simplex method with upper bounded variables, in addition to utilizing the special structure of the constraint matrix from the point of view of storage and computation, performs the best in terms of both computational efficiency and storage requirements. Using this special procedure, problems with 100 observations and 4 independent variables take less than half a minute, on average, on the IBM 360/67. Results also show that the linear programming solution procedure for ordinal regression is valid — the correlation coefficient between “true” and predicted values for the dependent variable was greater than .9 for most of the problems tested.

25 citations
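
As a rough illustration of the underlying formulation (not the specialized dual-simplex procedure the paper benchmarks), the sketch below casts ordinal regression as a linear program with scipy, on synthetic data: consecutive order constraints on the fitted scores, slack variables for violations, and the total slack as the objective. The margin eps and the data are assumptions for the demo.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)

# Hypothetical data: 30 observations, 4 interval-scaled predictors; the
# dependent variable is observed only as a rank order.
n, p = 30, 4
X = rng.standard_normal((n, p))
order = np.argsort(X @ np.array([1.0, 0.5, -0.5, 0.25]))  # lowest to highest

dX = np.diff(X[order], axis=0)   # consecutive differences, (n-1) x p

# Variables: w (free) followed by slacks s >= 0. Require each consecutive
# pair of fitted scores to differ by at least eps, absorbing violations in
# the slacks, and minimize the total slack.
eps, m = 1e-3, n - 1
c = np.concatenate([np.zeros(p), np.ones(m)])
A_ub = np.hstack([-dX, -np.eye(m)])      # encodes -(dX @ w) - s <= -eps
b_ub = -eps * np.ones(m)
bounds = [(None, None)] * p + [(0, None)] * m

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
w = res.x[:p]
print("recovered weights (up to positive scaling):", w / np.abs(w).max())
```

The weights are identified only up to positive scaling, since any positive multiple of a rank-preserving w is also rank-preserving.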


Journal ArticleDOI
TL;DR: In this paper, the basic assumptions which must be met before linear regression techniques can be applied with maximum effectiveness are reviewed, and it is shown that linear regression models which predict events with only two outcomes violate two basic assumptions.
Abstract: This article reviews the basic assumptions which must be met before linear regression techniques can be applied with maximum effectiveness. Starting with an intuitive explanation of simple regression, the article proceeds to a discussion of more complex models which have applications in criminal justice research. Using this discussion as a background, it is shown that linear regression models which predict events with only two outcomes violate two basic assumptions. Several alternative methods of dealing with this problem are examined, and one method, a multivariate logistic model, is used to estimate the probability of recidivism based on data provided by the Michigan Department of Corrections.
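
For contrast with the linear probability model the article criticizes, a minimal numpy sketch (synthetic data, plain Newton/IRLS iteration, no convergence safeguards) fitting a logistic model alongside the linear fit:

```python
import numpy as np

def logistic_irls(X, y, n_iter=25):
    """Fit a logistic model by iteratively reweighted least squares (Newton)."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        W = p * (1.0 - p)
        # Newton step: b += (X'WX)^{-1} X'(y - p)
        b = b + np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return b

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
true = np.array([-1.0, 2.0])
y = (rng.uniform(size=200) < 1.0 / (1.0 + np.exp(-X @ true))).astype(float)

b_logit = logistic_irls(X, y)
b_lpm, *_ = np.linalg.lstsq(X, y, rcond=None)  # linear probability model
print("logistic coefficients:", b_logit)
print("linear model coefficients:", b_lpm)
```

The logistic fit keeps predicted probabilities inside (0, 1), while the linear probability model can predict values outside that range, one of the two assumption violations at issue.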

Journal ArticleDOI
TL;DR: The General Linear Model (GLM) as discussed by the authors is a family of models possessing a common characteristic, namely, linearity in the parameters of the equation specifying the model, which has been used extensively in the analysis of nonlinear data.
Abstract: Recent works by Cohen (1968), Kelly, Beggs, McNeil, Eichelberger, and Lyon (1969), Kerlinger and Pedhazur (1973), McNeil (1970), Walberg (1971), and Bottenberg and Ward (Note 1) have attested to the flexibility of the General Linear Model. These publications have shown the capabilities of a single approach to the solution of correlation, regression, and Fisherian analysis of variance problems. It is noteworthy that all six of these publications claim, more or less, to be using the General Linear Model, but in no case has the particular linear model and its assumptions been clearly specified and consistently applied. The General Linear Model is a name given to the family of models possessing a common characteristic, namely, linearity in the parameters of the equation specifying the model. The members of this family are distinguishable in terms of their various assumptions, and it is the contention of this author that the distinctions among these different linear models are of more than just passing interest. The above publications, plus those of Digman (1966) and of McNeil and Spaner (1971), have shown the capabilities of the General Linear Model in handling the analysis of nonlinear data. This approach, with a history dating back to Court (1930), ...

Journal ArticleDOI
TL;DR: In this paper, the problem of estimating the coefficient vector β of a linear regression model with quadratic loss function was considered and some biased estimators which utilize the prior information about β were considered.
Abstract: We consider the problem of estimating the coefficient vector β of a linear regression model with quadratic loss function. Some biased estimators which utilize the prior information about β are considered. Also studied is the problem of estimating the parameters of an over-identified structural equation from undersized samples.

Journal ArticleDOI
TL;DR: Standardized ("beta") regression coefficients are defined for linear models with nonstochastic predictor variables, analogous to those found in models with stochastic predictor variables, and several estimators of these parameters are introduced and studied.
Abstract: Often in reporting the results of a regression analysis, researchers, particularly in the social sciences, choose to standardize the estimators of the regression coefficients into what are called “beta coefficients.” Most studies in which beta coefficients are reported involve linear models which contain stochastic predictor variables. However, in dealing with regression models in which the predictor variables are nonstochastic, standardized regression coefficients can be defined which are analogous to those found in the models with stochastic predictor variables. We consider the problem of estimating these parameters. Several estimators are introduced and their properties are discussed. Two data examples are included to demonstrate the empirical behavior of the various estimators.
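
The definition is easy to state in code. A sketch, assuming the usual convention beta_j = b_j · s_{x_j} / s_y, applied to synthetic predictors on wildly different scales:

```python
import numpy as np

def beta_coefficients(X, y):
    """Standardized ('beta') coefficients: each slope scaled by s_xj / s_y."""
    A = np.column_stack([np.ones(len(y)), X])      # add intercept column
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

rng = np.random.default_rng(6)
# Hypothetical predictors with scales differing by factors of 10.
X = rng.standard_normal((50, 3)) * np.array([1.0, 10.0, 100.0])
y = X @ np.array([1.0, 0.1, 0.01]) + rng.standard_normal(50)

# All three contribute comparable variance, so the betas come out similar
# even though the raw slopes differ by two orders of magnitude.
print(beta_coefficients(X, y))
```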

Journal ArticleDOI
TL;DR: In this paper, an application and specialization of the Bayesian linear model developed by Lindley and Smith (1972) is presented; the context is m-group regression and the application is to grade prediction.
Abstract: This study is an application and specialization of the Bayesian linear model developed by Lindley and Smith (1972). The context is m-group regression and the application to the prediction of grade ...

Journal ArticleDOI
TL;DR: In this paper, a two-stage general linear regression approach to analysis of variance is described for use in univariate or multivariate designs involving one repeated measurement factor and one or more independent classification factors.
Abstract: A convenient, two-stage general linear regression approach to analysis of variance is described for use in univariate or multivariate designs involving one repeated measurement factor and one or more independent classification factors. A brief illustrative example is provided.

Journal ArticleDOI
Milan Zeleny
TL;DR: In this paper, a few examples are introduced to show that linear regression models represent only a quasi-rational paradigm at best, and simple graphical diagrams are used to clarify three main difficulties with the linear model.
Abstract: Although traditional instruments of research into human judgment - correlational statistics, the lens model, the ANOVA approach, etc. - are analytical, logical, and explicit tools of study, they might be inadequate, irrational and incorrect in their ultimate impact. In this short note a few examples are introduced to show that linear (in parameters) regression models could represent only a quasi-rational paradigm at best. Simple graphical diagrams are used to clarify three main difficulties with the linear model.

Journal ArticleDOI
TL;DR: The polynomial approximation of distributed lags is shown to involve linear restrictions on regression coefficients; two equivalent representations of these restrictions are used to clarify relationships between previous works by Almon and by Shiller.
Abstract: In this paper, the polynomial approximation of distributed lags is investigated within the framework of linear restrictions in linear regression models. In the first part, the polynomial approximation is analysed assuming the truncation point and the degree of the polynomial are known. The polynomial approximation is shown to involve linear restrictions on regression coefficients; two equivalent representations of these restrictions are used to clarify relationships between previous works by Almon and by Shiller. The difficulties related to the treatment of exact restrictions in a Bayesian framework are then tackled in the present context and alternative procedures are presented. In the second part, the analysis is extended to the case of unknown truncation point and/or unknown degree of the polynomial. This leads to considering mixed prior distributions, as for the problem of choosing among different models. The paper ends by investigating the sensitivity of a particular set of data w.r.t. changes in the truncation point, in the degree of the polynomial and in the prior tightness of the polynomial approximation.
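
The "linear restrictions" view is compact in code: if the lag weights lie on a degree-d polynomial, then beta = Ha for a known basis matrix H, so ordinary least squares on XH recovers them. A numpy sketch on synthetic data (the hump-shaped true weights are an assumption for the demo):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical series: y depends on lags 0..q of x with smooth weights
# that happen to lie exactly on a quadratic in the lag index.
T, q, d = 200, 6, 2
x = rng.standard_normal(T + q)
true_beta = np.array([0.1 * (i + 1) * (q + 1 - i) for i in range(q + 1)])
X = np.column_stack([x[q - i : T + q - i] for i in range(q + 1)])  # lag matrix
y = X @ true_beta + rng.standard_normal(T)

# Polynomial restriction beta = H a, with H[i, j] = i**j (degree d in the lag).
H = np.vander(np.arange(q + 1), d + 1, increasing=True).astype(float)
a, *_ = np.linalg.lstsq(X @ H, y, rcond=None)  # unrestricted OLS in reduced space
beta_hat = H @ a                                # implied restricted lag weights
print(beta_hat.round(2))
print(true_beta.round(2))
```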

Journal ArticleDOI
TL;DR: This paper discusses reasons (in addition to inconsistency) for the relative superiority of model over man, and it summarizes recent research in psychology concerning the robustness of linear regression models (and linear models in general).
Abstract: This paper elaborates on some issues discussed by Moskowitz, who presented evidence that linear multiple regression models, estimated from decisions made by individuals, often outperform the individuals themselves. In discussing his results, Moskowitz (1) suggested that inconsistency in information utilization by individuals may account for the relative superiority of regression models, and (2) expressed concern over the robustness of linear regression models to changes in (a) information environments, (b) weighting parameters, and (c) functional form of the model. This paper discusses reasons (in addition to inconsistency) for the relative superiority of model over man, and it summarizes recent research in psychology concerning the robustness of linear regression models (and linear models in general). This paper is supportive, rather than critical, of Moskowitz's research.

Journal ArticleDOI
TL;DR: In this paper, the problem of finding the most appropriate nonlinear relations among multiple specifications of a relationship has been studied in the context of research on state policy, and five recent contributions to the Journal of Politics face this problem by relying upon a criterion which is quite simply incorrect.
Abstract: Relations among variables are not always linear. Fortunately, standard statistical procedures used in the analysis of linear relations can also be used to investigate nonlinear ones. Having decided to investigate more than one specification of a relationship, the analyst is eventually faced with the problem of choosing the one most appropriate to the data. Five recent contributions to the Journal of Politics face this problem in the context of research on state policy. All five "solve" the problem by relying upon a criterion which is, quite simply, incorrect.

Journal ArticleDOI
TL;DR: A nonlinear regression function can often be transformed into a function which, after reparametrization, is linear in the parameters; it is discussed how this model transformation can be justified by an appropriate experimental design and a previous “model concentration”.
Abstract: In many interesting cases a nonlinear regression function can be transformed into a function which, after some reparametrization, is linear in the parameters. Such model transformations are in widespread use, often without regard for the resulting violations of the error assumptions. This paper discusses how the model transformation can be justified by an appropriate experimental design and a previous “model concentration”.
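
A familiar instance of such a transformation, as a sketch: y = a·e^{bx} with multiplicative error becomes linear in (log a, b) after taking logs, and the multiplicative-error assumption is exactly the kind of condition the paper is concerned with. Synthetic data throughout.

```python
import numpy as np

rng = np.random.default_rng(10)

# Hypothetical linearizable model: y = a * exp(b*x) with multiplicative error.
x = np.linspace(0.1, 2.0, 40)
y = 3.0 * np.exp(1.5 * x) * np.exp(0.05 * rng.standard_normal(40))

# Taking logs gives log y = log a + b*x, linear in the new parameters.
# This step is only justified if the error really is multiplicative.
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
print("a_hat =", np.exp(coef[0]), " b_hat =", coef[1])
```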

Journal ArticleDOI
TL;DR: In this paper, structural inference is applied to the linear regression model in which the errors follow an autoregressive process, and a marginal likelihood function is derived for the auto-gressive parameters while structural distributions are obtained for the regression parameters.
Abstract: Methods of structural inference are applied to the linear regression model in which the errors follow an autoregressive process. A marginal likelihood function is derived for the autoregressive parameters while structural distributions are obtained for the regression parameters. The marginal likelihood function, in the case of a Markov error process, is shown to be related under certain conditions to the Durbin-Watson statistic. This method of inference is illustrated by a simulated example.
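
Not the structural-inference machinery of the paper, but a sketch of the same model handled by the familiar Cochrane-Orcutt iteration (synthetic data, AR(1) Markov errors): estimate rho from the residuals, quasi-difference, refit.

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical regression with AR(1) errors, rho = 0.7.
n, rho = 200, 0.7
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + rng.standard_normal()
y = X @ np.array([1.0, 2.0]) + u

# Cochrane-Orcutt iteration: estimate rho from residuals, quasi-difference, refit.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
for _ in range(10):
    e = y - X @ b
    r = e[1:] @ e[:-1] / (e[:-1] @ e[:-1])           # AR(1) coefficient estimate
    Xs, ys = X[1:] - r * X[:-1], y[1:] - r * y[:-1]  # quasi-differenced data
    b, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
print("rho_hat =", r, " beta_hat =", b)
```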

Journal ArticleDOI
TL;DR: In this article, the authors compared different possible models in ternls of a risk function and used the properties of the risk function to study the effect of rejecting nonsignificant variables.
Abstract: The selection of the best subset of variables in a linear model is generally produccd by obtaining the regression equation that satisfies an optimality criterion. Unfortunately the statistical properties of the selected regression equations are not described by a general theory and in fact very, little is known of these properties. In this paper the different possible models are compared in ternls of a risk function. The properties of the risk function are used to study the effect of rejecting nonsignificant variables.

Journal ArticleDOI
TL;DR: It is demonstrated, using hypothetical data, that identical amounts of variance can be explained by ANCOVA relative to hierarchical ANOVA and multiple regression.
Abstract: Storandt and Hudson's treatment of the issue of which general linear model technique is preferable to use when age effects are confounded is misleading. Contrary to their position that hierarchical ANOVA or step-wise multiple regression is superior to ANCOVA, it is demonstrated, using hypothetical data, that identical amounts of variance can be explained by ANCOVA relative to hierarchical ANOVA and multiple regression. Multiple regression is recommended as the most appropriate technique for a variety of pragmatic reasons concerning calculation of significance tests, the distinction between gross and net effects, and the choice of the metric used in measurement.



Journal ArticleDOI
TL;DR: For a linear regression model for stochastic vector processes, a sequence of experimental designs is derived under certain conditions; the designs are asymptotically optimal for estimating the regression coefficients.
Abstract: For a linear regression model for stochastic vector processes, a sequence of experimental designs is derived under certain conditions; the designs are asymptotically optimal for estimating the regression coefficients.

Journal ArticleDOI
TL;DR: Structural distributions are derived for the regression and covariance parameters in the multivariate linear model under various underlying functional models.
Abstract: Structural distributions for the regression and covariance parameters in the multivariate linear model under various underlying functional models are derived.

Journal ArticleDOI
TL;DR: In this paper, three related methods for assigning interval scale values to a set of ordered categories are proposed, based on the assumption that certain patterns of linear regressions exist among the variables.
Abstract: Three related methods for assigning interval scale values to a set of ordered categories are proposed, based on the assumption that certain patterns of linear regressions exist among a set of variables. Technique I requires that the second administration of a variable be a linear function of the first. Technique II requires that two variables each be linear functions of one another (“symmetric linearity”). Technique III requires at least three linear regressions among three variables. The category values that are consistent with the assumed patterns of linear regressions are unique up to a linear transformation.

01 Jan 1976
TL;DR: Ridge regression is examined as an alternative to least squares for nonorthogonal estimation problems, and two objective methods of choosing the ridge parameter k are evaluated to determine whether either leads to a useful probability distribution of the ridge estimator.
Abstract: The estimation of the parameters of a linear statistical model is generally accomplished by the method of least squares. However, when the method of least squares is applied to nonorthogonal problems the resulting estimates may be significantly different from the true parameters. The method of ridge regression may provide better estimates in these cases; however, a probability distribution of the ridge estimator is presently not known. The form of such a distribution is dependent upon how the ridge parameter, k, is selected. Two possible objective methods of choosing k are examined to determine if either one leads to a useful probability distribution.

(Front matter omitted: table of contents; list of figures — Mean Squared Error Functions, Typical Ridge Trace, Typical Plot of the Squared Length of the Ridge Estimator.)

I. BACKGROUND

The following conventions will be used throughout. Unless otherwise noted, capital letters and Greek letters will refer to matrices and vectors while lower case letters will refer to scalars.

A. INTRODUCTION

The use of linear statistical models is widespread in scientific fields of all kinds. Generally, the linear statistical model is postulated as

Y = Xβ + e    (1)

where Y is an n x 1 vector of n observed values of a dependent variable, X is an n x p matrix containing n values for each of p predictor (independent) variables, β is a p x 1 vector of p unknown parameters (or coefficients) to be estimated from data, and e is an n x 1 vector representing experimental errors. Usually, the experimental error is assumed to have a multivariate normal distribution with mean equal to zero and variance-covariance matrix equal to σ²I, where σ² is the scalar value of the common variance of the experimental errors. This assumption will be made throughout this paper.

In practice, the modeling problem is to estimate the parameters β from data Y and X. The most common method of doing this is called least squares estimation or sometimes ordinary least squares (OLS); the latter designation will be used in this paper. Under certain fairly general and common conditions OLS is an adequate method of estimating β. However, when the data is "ill-conditioned" or nonorthogonal, OLS may yield poor estimates of the true parameters. Ridge regression (RR) has been proposed [Ref. 1] as an alternative estimation method that might yield better estimates under conditions where OLS does poorly.

B. ORDINARY LEAST SQUARES

For convenience, it is assumed that the elements of X are scaled such that X'X has the form of a correlation matrix. This is done by forming from each element x_ij a new element x'_ij such that

x'_ij = (x_ij − x̄_j) / s_j    (2)

where x̄_j is the mean value of the elements of the j-th independent variable and s_j is its standard deviation times an appropriate constant such that the diagonal elements of X'X are equal to one. The OLS estimator of β is then

β̂ = (X'X)⁻¹ X'Y    (3)
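
A minimal numpy sketch of the setup described above, on hypothetical ill-conditioned data: the predictors are scaled so that X'X is in correlation form as in equation (2), and the ridge estimator (X'X + kI)⁻¹X'Y is computed over a grid of k, which is the raw material of a ridge trace. The two objective criteria for choosing k examined in the thesis are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical nonorthogonal data: two nearly collinear predictors.
n = 50
x1 = rng.standard_normal(n)
X = np.column_stack([x1, x1 + 0.1 * rng.standard_normal(n)])
y = x1 + rng.standard_normal(n)

# Scale so that X'X is in correlation form, as in equation (2).
Z = (X - X.mean(axis=0)) / (X.std(axis=0) * np.sqrt(n))
yc = y - y.mean()

# Ridge trace: beta_hat(k) = (Z'Z + kI)^{-1} Z'y over a grid of k.
# k = 0 reproduces the OLS estimator of equation (3).
for k in [0.0, 0.01, 0.1, 1.0]:
    bk = np.linalg.solve(Z.T @ Z + k * np.eye(2), Z.T @ yc)
    print(f"k={k:5.2f}  beta_hat={bk.round(3)}")
```

On data like these the k = 0 (OLS) coefficients are large and unstable, and even a small k pulls them sharply toward more plausible values, which is the behavior the ridge trace is used to inspect.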