Journal ArticleDOI

An Analysis of Transformations

TL;DR: In this paper, Box and Cox make the less restrictive assumption that such a normal, homoscedastic, linear model is appropriate after some suitable transformation has been applied to the y's.
Abstract: [Read at a RESEARCH METHODS MEETING of the SOCIETY, April 8th, 1964, Professor D. V. LINDLEY in the Chair] SUMMARY In the analysis of data it is often assumed that observations y1, y2, ..., yn are independently normally distributed with constant variance and with expectations specified by a model linear in a set of parameters θ. In this paper we make the less restrictive assumption that such a normal, homoscedastic, linear model is appropriate after some suitable transformation has been applied to the y's. Inferences about the transformation and about the parameters of the linear model are made by computing the likelihood function and the relevant posterior distribution. The contributions of normality, homoscedasticity and additivity to the transformation are separated. The relation of the present methods to earlier procedures for finding transformations is discussed. The methods are illustrated with examples.
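The transformation family analysed in the paper is the power family y(λ) = (y^λ − 1)/λ for λ ≠ 0 and log y for λ = 0, with λ chosen by maximising a profile likelihood. As a rough illustration of that idea (not the authors' own code), the Python sketch below evaluates the standard profile log-likelihood of λ for a normal linear model; the toy data, the λ grid and the least-squares fit via numpy are assumptions made purely for the example.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox power transformation: (y**lam - 1)/lam, or log(y) when lam is (numerically) zero."""
    return np.log(y) if abs(lam) < 1e-12 else (y**lam - 1.0) / lam

def profile_loglik(y, X, lam):
    """Profile log-likelihood of lambda for a normal linear model fitted to the
    transformed responses (additive constants dropped):
    -n/2 * log(RSS/n) + (lam - 1) * sum(log y)."""
    z = box_cox(y, lam)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)      # ordinary least squares
    rss = np.sum((z - X @ beta) ** 2)
    n = len(y)
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.sum(np.log(y))

# Toy usage: pick lambda by maximising the profile log-likelihood over a grid.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, 100)
y = np.exp(0.5 + 0.3 * x + rng.normal(0.0, 0.2, 100))  # a log scale is "true" here
X = np.column_stack([np.ones_like(x), x])
grid = np.linspace(-1.0, 1.0, 41)
lam_hat = grid[np.argmax([profile_loglik(y, X, lam) for lam in grid])]
print("lambda maximising the profile log-likelihood:", round(lam_hat, 2))
```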
Citations
Journal ArticleDOI
TL;DR: Previously proposed methods combine estimates of the cause-specific hazard functions under the proportional hazards formulation, but they do not allow the analyst to directly assess the effect of a covariate on the marginal probability function.
Abstract: With explanatory covariates, the standard analysis for competing risks data involves modeling the cause-specific hazard functions via a proportional hazards assumption. Unfortunately, the cause-specific hazard function does not have a direct interpretation in terms of survival probabilities for the particular failure type. In recent years many clinicians have begun using the cumulative incidence function, the marginal failure probabilities for a particular cause, which is intuitively appealing and more easily explained to the nonstatistician. The cumulative incidence is especially relevant in cost-effectiveness analyses in which the survival probabilities are needed to determine treatment utility. Previously, authors have considered methods for combining estimates of the cause-specific hazard functions under the proportional hazards formulation. However, these methods do not allow the analyst to directly assess the effect of a covariate on the marginal probability function. In this article we pro…
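For orientation, the cumulative incidence function mentioned in the abstract can be estimated nonparametrically when there are no covariates; the sketch below implements that standard estimator. It is not the regression method this article develops, and the toy data and function name are invented for illustration.

```python
import numpy as np

def cumulative_incidence(time, status, cause=1):
    """Nonparametric cumulative incidence for one cause in competing-risks data.

    time   : event or censoring times
    status : 0 = censored, otherwise the code of the cause of failure (1, 2, ...)
    At each event time t, CIF(cause) increases by S(t-) * d_cause / n_at_risk,
    where S is the all-cause Kaplan-Meier survival estimate.
    """
    time, status = np.asarray(time, dtype=float), np.asarray(status)
    event_times, cif_values = [], []
    surv, cif = 1.0, 0.0
    for t in np.unique(time[status != 0]):
        at_risk = np.sum(time >= t)
        d_all = np.sum((time == t) & (status != 0))
        d_cause = np.sum((time == t) & (status == cause))
        cif += surv * d_cause / at_risk      # increment for the cause of interest
        surv *= 1.0 - d_all / at_risk        # all-cause Kaplan-Meier update
        event_times.append(t)
        cif_values.append(cif)
    return np.array(event_times), np.array(cif_values)

# Toy usage with two competing causes and some censoring (status 0).
t = [2, 3, 3, 5, 6, 7, 8, 9, 11, 12]
s = [1, 2, 1, 0, 1, 2, 0, 1, 2, 0]
times, cif1 = cumulative_incidence(t, s, cause=1)
for ti, ci in zip(times, cif1):
    print(f"t = {ti:>4.0f}   CIF(cause 1) = {ci:.3f}")
```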

11,109 citations

Journal ArticleDOI
TL;DR: The bootstrap is extended to other measures of statistical accuracy such as bias and prediction error, and to complicated data structures such as time series, censored data, and regression models.
Abstract: This is a review of bootstrap methods, concentrating on basic ideas and applications rather than theoretical considerations. It begins with an exposition of the bootstrap estimate of standard error for one-sample situations. Several examples, some involving quite complicated statistical procedures, are given. The bootstrap is then extended to other measures of statistical accuracy such as bias and prediction error, and to complicated data structures such as time series, censored data, and regression models. Several more examples are presented illustrating these ideas. The last third of the paper deals mainly with bootstrap confidence intervals.
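A minimal illustration of the one-sample bootstrap standard error that the review opens with: resample the data with replacement many times and take the standard deviation of the replicated statistic. The choice of statistic (the median), the number of replicates and the toy sample below are arbitrary, not prescribed by the paper.

```python
import numpy as np

def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error: resample the data with replacement n_boot
    times and return the standard deviation of the replicated statistic."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    reps = [statistic(rng.choice(data, size=len(data), replace=True))
            for _ in range(n_boot)]
    return np.std(reps, ddof=1)

# Toy usage: standard error of the sample median.
sample = [10, 12, 9, 14, 11, 13, 8, 15, 12, 10]
print("bootstrap SE of the median:", round(bootstrap_se(sample, np.median), 3))
```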

5,894 citations

Journal Article
TL;DR: Created with improved data and statistical curve smoothing procedures, the United States growth charts represent an enhanced instrument to evaluate the size and growth of infants and children.
Abstract: Objectives—This report presents the revised growth charts for the United States. It summarizes the history of the 1977 National Center for Health Statistics (NCHS) growth charts, reasons for the revision, data sources and statistical procedures used, and major features of the revised charts. Methods—Data from five national health examination surveys collected from 1963 to 1994 and five supplementary data sources were combined to establish an analytic growth chart data set. A variety of statistical procedures were used to produce smoothed percentile curves for infants (from birth to 36 months) and older children (from 2 to 20 years), using a two-stage approach. Initial curve smoothing for selected major percentiles was accomplished with various parametric and nonparametric procedures. In the second stage, a normalization procedure was used to generate z-scores that closely match the smoothed percentile curves. Results—The 14 NCHS growth charts were revised and new body mass index-for-age (BMI-for-age) charts were created for boys and girls (http://www.cdc.gov/growthcharts). The growth percentile curves for infants and children are based primarily on national survey data. Use of national data ensures a smooth transition from the charts for infants to those for older children. These data better represent the racial/ethnic diversity and the size and growth patterns of combined breast- and formula-fed infants in the United States. New features include addition of the 3rd and 97th percentiles for all charts and extension of all charts for children and adolescents to age 20 years. Conclusion—Created with improved data and statistical curve smoothing procedures, the United States growth charts represent an enhanced instrument to evaluate the size and growth of infants and children.
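The normalization step that generates z-scores matching the smoothed percentiles is, in the published methodology, built on a Box-Cox-type power transformation (see the citation context quoted below). Purely as a hedged sketch of how such smoothed parameters turn a measurement into a z-score, here is the usual LMS-style formula; the parameter values are invented and this is not the NCHS code.

```python
import numpy as np

def lms_zscore(x, L, M, S):
    """Convert a measurement x into a z-score from smoothed Box-Cox-type
    parameters: L (power), M (median), S (coefficient of variation).
    z = ((x/M)**L - 1) / (L*S) for L != 0, and log(x/M)/S for L == 0."""
    if L == 0:
        return np.log(x / M) / S
    return ((x / M) ** L - 1.0) / (L * S)

# Illustrative, made-up parameters for a single age point (not NCHS values).
print("z-score:", round(lms_zscore(x=18.2, L=-1.6, M=16.0, S=0.08), 2))
```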

5,160 citations


Cites methods from "An Analysis of Transformations"

  • ...Then, the Box-Cox power transformation (36) was used to specify an equation at each of the previously smoothed major percentiles....


Book
17 May 2013
TL;DR: This research presents a novel and scalable approach called "Smartfitting" that automates the labor-intensive, and therefore time-consuming and expensive, process of designing and implementing statistical models for regression.
Abstract: General Strategies - Regression Models - Classification Models - Other Considerations - Appendix - References - Indices.

3,672 citations

01 Jan 2002

2,894 citations


Cites methods from "An Analysis of Transformations"

  • ...An alternative one-parameter family of transformations that could be considered in this case is t(y, α) = log(y + α). Using the same analysis as presented in Box and Cox (1964), the profile log likelihood for α is easily seen to be L̂(α) = const − (n/2) log RSS{log(y + α)} − ∑ log(y + α)...

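The excerpt above applies the Box and Cox (1964) profile-likelihood argument to the shifted-log family t(y, α) = log(y + α). A hedged sketch of evaluating that profile log-likelihood over a grid of α follows; the toy data, design matrix and grid are assumptions made for the example.

```python
import numpy as np

def profile_loglik_alpha(y, X, alpha):
    """Profile log-likelihood (up to a constant) for the shifted-log family
    t(y, alpha) = log(y + alpha) in a normal linear model:
    Lhat(alpha) = const - n/2 * log RSS{log(y + alpha)} - sum log(y + alpha)."""
    z = np.log(y + alpha)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)      # ordinary least squares
    rss = np.sum((z - X @ beta) ** 2)
    return -0.5 * len(y) * np.log(rss) - np.sum(np.log(y + alpha))

# Toy usage: choose alpha by maximising the profile log-likelihood over a grid.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 5.0, 80)
y = np.exp(1.0 + 0.4 * x + rng.normal(0.0, 0.3, 80)) - 2.0  # responses need a shift
X = np.column_stack([np.ones_like(x), x])
grid = np.linspace(2.1, 10.0, 80)
alpha_hat = grid[np.argmax([profile_loglik_alpha(y, X, a) for a in grid])]
print("alpha maximising the profile log-likelihood:", round(alpha_hat, 2))
```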

References
Book
01 Jan 1939
TL;DR: In this book, the authors cover fundamental notions of probability, direct probabilities, estimation problems, approximate methods and simplifications, significance tests for one new parameter and for various complications, and frequency definitions and direct methods.
Abstract: 1. Fundamental notions 2. Direct probabilities 3. Estimation problems 4. Approximate methods and simplifications 5. Significance tests: one new parameter 6. Significance tests: various complications 7. Frequency definitions and direct methods 8. General questions

7,086 citations

Journal ArticleDOI
TL;DR: In this work, the authors cover fundamental notions of probability, direct probabilities, estimation problems, approximate methods and simplifications, significance tests for one new parameter and for various complications, and frequency definitions and direct methods.
Abstract: 1. Fundamental notions 2. Direct probabilities 3. Estimation problems 4. Approximate methods and simplifications 5. Significance tests: one new parameter 6. Significance tests: various complications 7. Frequency definitions and direct methods 8. General questions

2,990 citations

Book ChapterDOI
TL;DR: In this article, the structure of small sample tests, whether these are related to problems of estimation and fiducial distributions, or are of the nature of tests of goodness of fit, is considered further.
Abstract: 1—In a previous paper*, dealing with the importance of properties of sufficiency in the statistical theory of small samples, attention was mainly confined to the theory of estimation. In the present paper the structure of small sample tests, whether these are related to problems of estimation and fiducial distributions, or are of the nature of tests of goodness of fit, is considered further.

2,432 citations

Journal ArticleDOI
TL;DR: In this paper, the authors summarize the transformations which have been used on raw statistical data, with particular reference to analysis of variance, and the usual purpose of the transformation is to change the scale of the measurements in order to make the analysis more valid.
Abstract: 1. Theoretical Discussion. The purpose of this note is to summarize the transformations which have been used on raw statistical data, with particular reference to analysis of variance. For any such analysis the usual purpose of the transformation is to change the scale of the measurements in order to make the analysis more valid. Thus the conditions required for assessing accuracy in the ordinary unweighted analysis of variance include the important one of a constant residual or error variance, and if the variance tends to change with the mean level of the measurements, the variance will only be stabilized by a suitable change of scale. If the form of the change of variance with mean level is known, this determines the type of transformation to use. Suppose we write
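To make the variance-mean argument concrete: when the variance is proportional to the mean (as for Poisson counts), the square-root transformation is the classical stabilizing choice. The sketch below simulates such data and compares group variances before and after the transformation; the group means and sample sizes are invented for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Counts whose variance grows with the mean (Poisson: variance equals mean).
group_means = [2, 8, 20, 50]
groups = [rng.poisson(m, size=200) for m in group_means]

# Variance is roughly proportional to the mean on the raw scale, but roughly
# constant (about 0.25) after the square-root transformation.
for m, g in zip(group_means, groups):
    raw_var = g.var(ddof=1)
    sqrt_var = np.sqrt(g).var(ddof=1)
    print(f"mean {m:>2}: raw variance {raw_var:6.2f}   sqrt-scale variance {sqrt_var:5.2f}")
```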

1,123 citations

Journal ArticleDOI
TL;DR: The present writer is usually much more concerned with and worried about non-additivity, and until recently has suffered from the lack of a systematic way to seek it out, and then to measure it.
Abstract: IN DISCUSSING the possible shortcomings of the analysis of variance, much attention has been paid to non-constancy and non-normality of the "error" contribution. (The recent papers in Biometrics by Eisenhart [4], Cochran [3] and Bartlett [1] discuss these matters and give references.) The present writer is usually much more concerned with and worried about non-additivity, and until recently has suffered from the lack of a systematic way to seek it out, and then to measure it. (Conversations with Frederick F. Stephan have contributed greatly to this development and presentation.) The purpose of the present paper is to indicate such a way, when the data is in the form of a row-by-column table. (The professional practitioner of the analysis of variance will have no difficulty in extending the process to more complex designs.) We shall show how to isolate one degree of freedom from the "residue", "error", "interaction" or "discrepance", call it what you will. There are two known situations to which this single degree of freedom is expected to react by swelling:
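The single degree of freedom described here can be isolated by projecting the additive-model residuals of a row-by-column table onto the products of row and column effects. The sketch below is one hedged implementation of that computation (Tukey's one degree of freedom for non-additivity); the toy table and the F-test wiring are illustrative, not the paper's worked example.

```python
import numpy as np
from scipy import stats

def tukey_one_df(table):
    """Tukey's one degree of freedom for non-additivity in a two-way table
    with one observation per cell.

    Row effects a_i and column effects b_j are deviations of the row and
    column means from the grand mean; the non-additivity sum of squares is
    (sum_ij a_i * b_j * y_ij)**2 / (sum_i a_i**2 * sum_j b_j**2), on 1 d.f.
    """
    y = np.asarray(table, dtype=float)
    r, c = y.shape
    grand = y.mean()
    a = y.mean(axis=1) - grand                      # row effects
    b = y.mean(axis=0) - grand                      # column effects
    ss_nonadd = (a @ y @ b) ** 2 / (np.sum(a**2) * np.sum(b**2))
    resid = y - grand - a[:, None] - b[None, :]     # residuals from the additive fit
    ss_resid = np.sum(resid**2)                     # interaction/"discrepance" SS
    df_rem = (r - 1) * (c - 1) - 1
    f_stat = ss_nonadd / ((ss_resid - ss_nonadd) / df_rem)
    p_value = stats.f.sf(f_stat, 1, df_rem)
    return ss_nonadd, f_stat, p_value

# Toy row-by-column table with a (noisy) multiplicative, non-additive pattern.
table = [[1.1, 2.0, 4.2],
         [2.0, 4.3, 7.9],
         [3.9, 8.1, 16.2],
         [3.1, 5.8, 12.1]]
ss, f_stat, p_value = tukey_one_df(table)
print(f"SS_nonadditivity = {ss:.3f}, F = {f_stat:.2f}, p = {p_value:.4f}")
```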

835 citations