
Showing papers on "Unit-weighted regression published in 2001"


Book
09 Oct 2001
TL;DR: The second edition of this logistic regression text, as discussed by the authors, expands substantially on the first edition and includes a discussion of the relationship between linear regression and logistic regression.
Abstract: Series Editor's Introduction; Author's Introduction to the Second Edition; 1. Linear Regression and Logistic Regression Model; 2. Summary Statistics for Evaluating the Logistic Regression Model; 3. Interpreting the Logistic Regression Coefficients; 4. An Introduction to Logistic Regression Diagnostics; 5. Polytomous Logistic Regression and Alternatives to Logistic Regression; 6. Notes; Appendix A; References; Tables; Figures

4,046 citations


01 Jan 2001
TL;DR: In this text, the authors present an applied introduction to building models with regression and correlation, covering nonlinear and logistic regression, moderator and mediator analysis, and advanced techniques such as multilevel modelling and structural equation modelling.
Abstract: PART ONE: I NEED TO DO REGRESSION ANALYSIS TOMORROW. Building Models with Regression and Correlation; More Than One Independent Variable: Multiple Regression; Categorical Independent Variables. PART TWO: I NEED TO DO REGRESSION ANALYSIS NEXT WEEK. Assumptions in Regression Analysis; Issues in Regression Analysis. PART THREE: I NEED TO KNOW MORE OF THE THINGS THAT REGRESSION CAN DO. Nonlinear and Logistic Regression; Moderator and Mediator Analysis; Introducing Some Advanced Techniques: Multilevel Modelling and Structural Equation Modelling

1,474 citations


Journal ArticleDOI
TL;DR: This book serves well as an introduction to the specific area of methods for detecting and correcting model violations in the standard linear regression model; it provides a general overview of transformations of variables, focusing on three traditional situations where transformations can be applied.
Abstract: This book serves well as an introduction to the specific area of methods for detecting and correcting model violations in the standard linear regression model. In the preface, the authors state that they view regression analysis as a set of data-analytic techniques that examine the interrelationships among a given set of variables. They approach the topic from an informal analysis point of view directed at uncovering patterns in the data rather than from the formal statistical-tests-and-probabilities point of view. The book relies heavily on graphical methods and intuitive explanations to achieve this. Several examples are introduced early in the book and are drawn on throughout the later chapters to demonstrate the different methods discussed. The examples are more from sociological and economic areas than from engineering fields; however, they do demonstrate the given techniques well. There are no mathematical derivations for any of the results, although references are given throughout. The authors present the different subjects at a sufficient level of detail so that most standard regression packages can be used for the methods discussed. The foundations of regression analysis are summarized without much detail, so the reader needs to be knowledgeable of and comfortable with multiple regression and model building to get the most benefit from the material.

The chapter layout is as follows:

Chapter 1: Introduction. All of the datasets used in the examples throughout the book are available from the Web. The authors introduce a set of steps, cyclical in nature, that they use for a given regression analysis problem. They follow this process closely throughout the book.

Chapter 2: Simple Linear Regression. This chapter outlines, in very general terms, the distribution theory, confidence intervals, and hypothesis tests for the simple regression model. Equations and expressions are given with no derivations. There are typographical errors in the equations and table numbers throughout the chapter.

Chapter 3: Multiple Linear Regression. This chapter introduces the multiple linear regression model, again in very general terms. It contains one of the better discussions and interpretations of partial regression coefficients that I have read and includes a very effective example to emphasize the point. The chapter also has a good general introduction to the model comparison approach in regression analysis, discussing coefficient testing based on full and reduced models. It also introduces the idea of model constraints and how to test these using a model comparison approach. The details of applying these methods are also given in later chapters with some specific examples. There is an appendix at the end of this chapter giving details of multiple regression in matrix notation in terms of the estimators and residuals and their properties.

Chapter 4: Regression Diagnostics: Detection of Model Violations. This chapter addresses the issue of validating assumptions and the detection and correction of model violations. The authors discuss both standard techniques and more recently developed techniques to address nonnormality, outliers, high-leverage points, and influential observations. The applications of these techniques are demonstrated repeatedly in the examples discussed in later chapters.

Each of the remaining chapters in the book deals with a specific type of regression problem or situation.

Chapter 5: Qualitative Variables as Predictors. The authors do an excellent job of explaining how model parameterization works with qualitative variables. They also do an effective job of introducing the methods of analysis of covariance (ANCOVA) in terms of generating and comparing models with same and/or different slopes and/or intercepts. However, the material does not address the idea of degrees of freedom in these situations. This is one of the few drawbacks I see in the material. A good understanding of degrees of freedom in ANCOVA models is essential in specifying specific tests and evaluating model performance.

Chapter 6: Transformation of Variables. This chapter provides a general overview of transformations of variables and focuses on three traditional situations where transformations can be applied: (1) to achieve linearity of the model, (2) to achieve normality of the errors, and (3) to stabilize the variance.

Chapter 7: Weighted Least Squares. This chapter addresses the heterogeneity-of-variance assumption. The material has a good intuitive explanation of two specific situations where ordinary least squares is equivalent to weighted least squares: (1) when the variance of the residuals is a function of one of the predictor variables and (2) when the response variables are means with different sample sizes.

Chapter 8: The Problem of Correlated Errors. This chapter addresses the issue of the independent-errors assumption and techniques used to identify and correct the problem. It gives a good general overview of the Durbin–Watson statistic and some of its limitations, transformations to remove autocorrelation, and iterative estimation with autocorrelated errors.

Chapter 9: Analysis of Collinear Data. Chapter 10: Biased Estimation of Regression Coefficients. Both Chapters 9 and 10 present methods for the detection and correction of the collinearity problem. This is one of the best discussions in the book. Different techniques are used to identify whether collinearity exists, and different methods are used to correct the situation. The authors address three specific questions: (1) How does multicollinearity affect inference and forecasting? (2) How can it be detected? (3) What can be done to resolve the difficulties associated with it? Chapter 9 contains a brief appendix on using principal components to detect multicollinearity in matrix notation. Chapter 10 contains a brief appendix on ridge regression in matrix notation.

Chapter 11: Variable Selection Procedures. This chapter starts out by reviewing the standard methodology behind forward and backward selection and introduces different criteria useful for comparing results across different models. It contains two good examples using the methods introduced in Chapters 9 and 10 (principal components and ridge regression) as a means of evaluating situations containing a large number of predictor variables.

Chapter 12: Logistic Regression. This chapter contains a good overview of the aspects of logistic regression. It is one of the more mathematical chapters in the book; however, the authors present the material in a very reader-friendly manner. There is one example, using financial data, that goes into the details of diagnostic measures for the model, judging the fit of the model, and the model comparison approach using the chi-squared statistic.

The authors state that the primary focus of the book is on the detection and correction of violations of the basic linear model assumptions as a means of achieving a thorough and informative analysis of the data.

The book covers only univariate regression, both simple and multiple, linear and, to some extent, nonlinear (under linearizable conditions). The authors deal mainly with the least squares method of estimation and, to some extent, weighted least squares. They touch on some aspects of other estimation methods, such as maximum likelihood. Much of the material and many of the examples in the second half of the book use the methods of ridge regression and principal components repeatedly. They do a very thorough and effective job of demonstrating the variety of methods that are available to help the analyst in different situations. This book is not a stand-alone regression text, and I do not believe it was intended to be. Overall, the material that is covered is an excellent introduction to a substantial collection of diagnostic tools that aid in uncovering hidden structures in one's data. I would recommend the book as an addition to any applied statistician's library.
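
For readers who want to try a few of the diagnostics the review mentions, here is a minimal Python sketch using statsmodels on synthetic data; the variables, noise model, and choice of library are illustrative assumptions, not the book's own examples.

```python
# A minimal sketch of a few diagnostics the review describes, using
# statsmodels on synthetic data. Variable names and the noise model
# are illustrative assumptions, not the book's examples.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# Independent-errors assumption (Chapter 8): a Durbin-Watson statistic
# near 2 suggests little first-order autocorrelation in the residuals.
print("Durbin-Watson:", durbin_watson(fit.resid))

# Outliers, leverage, and influence (Chapter 4).
infl = fit.get_influence()
print("max leverage:", infl.hat_matrix_diag.max())
print("max Cook's distance:", infl.cooks_distance[0].max())

# Collinearity screening (Chapters 9 and 10): condition number of X.
print("condition number:", np.linalg.cond(X))
```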

1,126 citations


Journal ArticleDOI
TL;DR: This article reviewed articles published in the Journal of Applied Psychology to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or else bivariate rs of predictors with the criterion) had been interpreted.
Abstract: The importance of interpreting structure coefficients throughout the General Linear Model (GLM) is widely accepted. However, regression researchers too infrequently consult regression structure coefficients to augment their interpretations. The authors reviewed articles published in the Journal of Applied Psychology to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or else bivariate rs of predictors with the criterion) had been interpreted. Some dramatic misinterpretations or incomplete interpretations are summarized. It is suggested that beta weights and structure coefficients (or else bivariate rs of predictors with the criterion) ought to be interpreted when noteworthy regression results have been isolated.
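
To make the distinction concrete, here is a minimal Python sketch on synthetic, deliberately collinear data (variable names are hypothetical) that computes both standardized beta weights and structure coefficients.

```python
# A sketch contrasting beta weights with structure coefficients on
# synthetic, deliberately collinear data (hypothetical variable names).
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)  # x2 correlated with x1
y = x1 + 0.1 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b

# Standardized (beta) weights: b_j * sd(x_j) / sd(y).
beta = b[1:] * np.array([x1.std(), x2.std()]) / y.std()

# Structure coefficients: the correlation of each predictor with the
# predicted scores y_hat, equivalently r(x_j, y) divided by multiple R.
r_s = [np.corrcoef(x, y_hat)[0, 1] for x in (x1, x2)]

print("beta weights:          ", beta)
print("structure coefficients:", r_s)
```

Because x2 is largely redundant with x1, it earns a small beta weight yet a sizable structure coefficient; reading the beta weight alone as "unimportance" is the kind of misinterpretation the article documents.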

448 citations


Book
10 Apr 2001
TL;DR: A Review of the Correlation Coefficient and Its Properties; Testing Correlations for Statistical Significance; Applications of Pearson Correlation to Measurement Theory; Range Restriction; 'Simple' Two-Variable Regression; Three Applications of Bivariate Regression; Utility Analysis, Regression to the Mean, Partial Correlation; Multiple (Mostly Trivariate) Regression; Expanding the Regression Repertoire; Polynomial and Interaction Terms; More about Regression, and Beyond, as mentioned in this paper
Abstract: An Introduction, an Overview, and Some Reminders; A Review of the Correlation Coefficient and Its Properties; Testing Correlations for Statistical Significance; Applications of Pearson Correlation to Measurement Theory; Range Restriction; 'Simple' Two-Variable Regression; Three Applications of Bivariate Regression; Utility Analysis, Regression to the Mean, Partial Correlation; Multiple (Mostly Trivariate) Regression; Expanding the Regression Repertoire; Polynomial and Interaction Terms; More about Regression, and Beyond

155 citations


Journal ArticleDOI
TL;DR: In this paper, the authors develop test statistics to test hypotheses in nonlinear weighted regression models with serial correlation or conditional heteroscedasticity of unknown form, and derive the limiting null distributions of these new tests in a general nonlinear setting, and show that the distributions depend only on the number of restrictions being tested.
Abstract: We develop test statistics to test hypotheses in nonlinear weighted regression models with serial correlation or conditional heteroscedasticity of unknown form. The novel aspect is that these tests are simple and do not require the use of heteroscedasticity and autocorrelation consistent (HAC) covariance matrix estimators. This new class of tests uses stochastic transformations to eliminate nuisance parameters as a substitute for consistently estimating the nuisance parameters. We derive the limiting null distributions of these new tests in a general nonlinear setting and show that although the tests have nonstandard distributions, the distributions depend only on the number of restrictions being tested. We perform some simulations on a simple model, apply the new method of testing to an empirical example, and illustrate that the size of the new test is less distorted than that of tests using HAC covariance matrix estimators.
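
The authors' transformation-based tests are not reproduced here, but for reference, this is a minimal Python sketch of the conventional HAC (Newey–West) inference they compare against, using statsmodels on synthetic data; the data-generating process and lag choice are my own assumptions.

```python
# For reference only: the conventional HAC (Newey-West) inference the
# authors compare against, on synthetic data with AR(1) errors. The
# data-generating process and lag choice are assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
x = rng.normal(size=n)

# Serially correlated errors: e_t = 0.7 * e_{t-1} + u_t.
e = np.zeros(n)
u = rng.normal(size=n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + u[t]
y = 1.0 + 0.5 * x + e

X = sm.add_constant(x)
res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 8})
print(res.bse)      # HAC standard errors
print(res.tvalues)  # t statistics built on the HAC covariance matrix
```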

59 citations


Journal ArticleDOI
TL;DR: In this paper, the quality of the results obtained by quantitative and semiquantitative methods in ICP-MS was critically investigated with a certified reference material (SLRS-3 Riverine Water), under multi-element routine analysis conditions.
Abstract: The quality of the results obtained by quantitative and semiquantitative methods in ICP-MS was critically investigated with a certified reference material (SLRS-3 Riverine Water), under multi-element routine analysis conditions. Quantitative methods were based on external calibration by simple and weighted linear regression and the semiquantitative one on the use of a commercial program (TotalQuant III) and internal calibration with Sc, In and Bi. Statistical weights used in weighted linear regressions were calculated from a noise model, which included the different ICP-MS noise sources in order to facilitate the use of this regression mode. Analytical results indicated that the concentrations of elements studied were in agreement with certified values by the three quantification methods. Relative uncertainties obtained by semiquantitative analysis ranged from 20%, for concentrations around the LOQ, to 6% for concentrations higher than 100 times the LOQ. They were similar to or better than those obtained by weighted regression calibration over a wide calibration range (0.01–100 ng ml⁻¹). The uncertainty related to the linear regression calibrations could be reduced by one order of magnitude by adjusting the calibration range to the concentration of the analyte and increasing the analysis time.
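
As an illustration of the weighted-regression calibration described, here is a minimal Python sketch with statistical weights 1/s², using statsmodels; the standards, signals, and noise model are invented for the example and are not the paper's data.

```python
# A sketch of weighted linear regression calibration with statistical
# weights 1/s_i^2. The standards, signals, and noise model below are
# invented for illustration; they are not the paper's data.
import numpy as np
import statsmodels.api as sm

conc = np.array([0.01, 0.1, 1.0, 10.0, 100.0])      # standards, ng/ml
signal = np.array([0.9, 8.2, 79.0, 805.0, 8100.0])  # instrument counts
s = 0.02 * signal + 0.5  # assumed noise model: sd grows with signal

X = sm.add_constant(conc)
wls = sm.WLS(signal, X, weights=1.0 / s**2).fit()
intercept, slope = wls.params

# Invert the calibration line to quantify an unknown sample.
unknown_signal = 410.0
print("estimated conc:", (unknown_signal - intercept) / slope, "ng/ml")
```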

40 citations


Journal ArticleDOI
TL;DR: A new method is proposed for comparing all predictors in a multiple regression model; it generates a measure of predictor criticality, which is distinct from and has several advantages over traditional indices of predictor importance.
Abstract: A new method is proposed for comparing all predictors in a multiple regression model. This method generates a measure of predictor criticality, which is distinct from and has several advantages over traditional indices of predictor importance. Using the bootstrapping (resampling with replacement) procedure, a large number of samples are obtained from a given data set which contains one response variable and p predictors. For each sample, all 2^p − 1 subset regression models are fitted and the best subset model is selected. Thus, the (multinomial) distribution …
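
A minimal Python sketch of the procedure as the abstract describes it; because the abstract is truncated at the source, the model-selection criterion (BIC here) and the data are assumptions.

```python
# A sketch of the described procedure: bootstrap the data, fit all
# 2^p - 1 subset regressions per sample, keep the best subset, and
# tally how often each predictor appears in the winning model. The
# abstract is truncated, so the selection criterion (BIC here) and
# the data are assumptions.
import itertools
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, p = 120, 3
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

B = 200
counts = np.zeros(p)
for _ in range(B):
    idx = rng.integers(0, n, size=n)  # resample rows with replacement
    Xb, yb = X[idx], y[idx]
    best_bic, best_subset = np.inf, None
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            fit = sm.OLS(yb, sm.add_constant(Xb[:, list(subset)])).fit()
            if fit.bic < best_bic:
                best_bic, best_subset = fit.bic, subset
    counts[list(best_subset)] += 1

print("criticality (share of winning models):", counts / B)
```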

39 citations


Journal ArticleDOI
TL;DR: This paper discusses the statistical methods of logistic regression applied to predicting the probability of passing a course, based on scores on the California Chemistry Diagnostic Test at two different institutions, with two different instructors, over multiple years.
Abstract: Several chemistry diagnostic and placement exams are used to help place chemistry students in an appropriate course or to determine strengths and weaknesses for specific topics in chemistry or math. The purpose of obtaining pre-course measurements is to increase students' academic success. Often these tests are used to predict the chance a student has in passing a course. This paper discusses the statistical methods of logistic regression applied to predicting the probability of passing a course, based on the scores on the California Chemistry Diagnostic Test at two different institutions with two different instructors over multiple years. This technique describes the relation of a test score (a continuous variable) to the probability of passing the class (a binary variable). Many papers in the Journal of Chemical Education have used a simple linear regression technique to correlate placement test scores with the proportion of students passing a course. The model assumptions are difficult to satisfy when using simple linear regression. Simple linear regression is useful when continuous predictor variables predict a continuous response, whereas logistic regression is useful when continuous predictor variables predict a binary response. Differences between simple linear regression and logistic regression and methods for evaluating linear regression model assumptions are discussed in detail. The fundamental concepts behind regression are described, with the caveats of using regression equations for predictions. By using logistic regression, instructors will be able to provide students with an estimate of their probability of passing the course.
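
In the spirit of the paper's approach, here is a minimal Python sketch fitting a logistic regression of a binary pass/fail outcome on a placement-test score; the scores and coefficients are synthetic, not the California Chemistry Diagnostic Test data.

```python
# A sketch in the spirit of the paper: logistic regression of a binary
# pass/fail outcome on a placement-test score. Scores and coefficients
# are synthetic, not the California Chemistry Diagnostic Test data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
score = rng.uniform(10, 44, size=250)
true_p = 1 / (1 + np.exp(-(-6.0 + 0.25 * score)))  # assumed relation
passed = rng.binomial(1, true_p)  # binary response

X = sm.add_constant(score)
logit = sm.Logit(passed, X).fit(disp=False)

# Estimated probability of passing for a student who scores 30.
b0, b1 = logit.params
print("P(pass | score = 30):", 1 / (1 + np.exp(-(b0 + b1 * 30))))
```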

30 citations


01 Jan 2001
TL;DR: In this paper, the authors extend the bias reduced linearization (BRL) method to weighted regression analyses. Linearization is a nonparametric method for estimating the standard errors of design-based statistics such as means and ratios, as well as coefficients from linear and nonlinear regression models.

Abstract: Linearization (Skinner 1989) is a nonparametric method for estimating the standard errors of design-based statistics such as means and ratios, as well as coefficients from linear and nonlinear regression models. Although the traditional linearization estimator for standard errors is consistent as the number of primary sampling units (PSUs) grows, the estimator can be biased, in particular biased low, when the number of PSUs is small or when the predictor variables are unbalanced across the PSUs (Bell and McCaffrey 2000; Kott 1994; Murray et al. 1998). Bell and McCaffrey (2000) developed bias reduced linearization (BRL) to eliminate or reduce this bias for linear regression models with unweighted data from nonstratified two-stage samples. Reduction in bias is achieved by replacing the ordinary residuals used in the standard linearization estimator with residuals adjusted to better approximate the joint distribution of the true errors. In this paper, we extend the BRL method to weighted regression analyses. The method handles a variety of different types of weights, including:

• design weights equal to the inverse of the sample selection probability;
• weights that account for post-stratification, nonresponse, and other weighting adjustments (e.g., for multiplicity), provided the weights can be treated as known;
• diagonal or nondiagonal precision weights to account for heteroskedastic or correlated errors;
• logistic regression and other generalized linear models that can be fit by iteratively reweighted least squares; and
• generalized estimating equations.

We discuss four alternative BRL specifications and investigate the performance (bias and variance) of these estimators and commonly used alternatives via simulation. We also present an application of logistic regression used to estimate the treatment effect in a cluster-randomized experiment. The application demonstrates a natural extension of BRL to models where parameters are estimated by iteratively reweighted least squares.
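
For context, here is a minimal Python sketch of the standard, unadjusted linearization estimator, i.e., the cluster-robust sandwich variance for weighted least squares whose small-sample bias BRL targets; the BRL residual adjustment itself is not implemented, and all data, weights, and cluster structure are synthetic.

```python
# A sketch of the standard (unadjusted) linearization variance
# estimator for weighted least squares with clustered PSUs: the
# sandwich whose small-PSU bias BRL is designed to reduce. The BRL
# residual adjustment itself is not implemented, and all data,
# weights, and cluster structure are synthetic.
import numpy as np

rng = np.random.default_rng(5)
G, m = 10, 8  # few PSUs: the regime where the bias matters
n = G * m
cluster = np.repeat(np.arange(G), m)
x = rng.normal(size=n)
e = rng.normal(size=G)[cluster] + rng.normal(size=n)  # clustered errors
y = 1.0 + 0.5 * x + e
w = rng.uniform(0.5, 2.0, size=n)  # e.g., design weights

X = np.column_stack([np.ones(n), x])
XtWX = X.T @ (w[:, None] * X)
beta = np.linalg.solve(XtWX, X.T @ (w * y))
resid = y - X @ beta

# Sandwich: bread @ meat @ bread, accumulating score contributions
# PSU by PSU, using the ordinary residuals BRL would adjust.
meat = np.zeros((2, 2))
for g in range(G):
    i = cluster == g
    s_g = X[i].T @ (w[i] * resid[i])  # PSU-level score vector
    meat += np.outer(s_g, s_g)
bread = np.linalg.inv(XtWX)
V = bread @ meat @ bread
print("linearization standard errors:", np.sqrt(np.diag(V)))
```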

25 citations