Author

Norman E. Breslow

Bio: Norman E. Breslow is an academic researcher from the University of Washington. The author has contributed to research in topics: Wilms' tumor & Population. The author has an h-index of 76 and has co-authored 208 publications receiving 25,145 citations. Previous affiliations of Norman E. Breslow include Children's Hospital of Philadelphia & University of Geneva.


Papers
Journal ArticleDOI
TL;DR: In this paper, overdispersion, correlated errors, shrinkage estimation and smoothing are treated within the generalized linear mixed model (GLMM); a Laplace approximation to the marginal quasi-likelihood leads to penalized quasi-likelihood (PQL) estimating equations for the mean parameters and pseudo-likelihood estimation of the variance components, and for time series, spatial aggregation and smoothing the dispersion may be specified in terms of a rank-deficient inverse covariance matrix.
Abstract: Statistical approaches to overdispersion, correlated errors, shrinkage estimation, and smoothing of regression relationships may be encompassed within the framework of the generalized linear mixed model (GLMM). Given an unobserved vector of random effects, observations are assumed to be conditionally independent with means that depend on the linear predictor through a specified link function and conditional variances that are specified by a variance function, known prior weights and a scale factor. The random effects are assumed to be normally distributed with mean zero and dispersion matrix depending on unknown variance components. For problems involving time series, spatial aggregation and smoothing, the dispersion may be specified in terms of a rank-deficient inverse covariance matrix. Approximation of the marginal quasi-likelihood using Laplace's method leads eventually to estimating equations based on penalized quasi-likelihood or PQL for the mean parameters and pseudo-likelihood for the variances.
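As a point of reference for the model described above (the notation is ours, not taken verbatim from the paper), the GLMM can be written as

    E[y_i \mid b] = \mu_i, \qquad g(\mu_i) = x_i^\top \beta + z_i^\top b,
    \operatorname{Var}(y_i \mid b) = \phi \, a_i \, v(\mu_i), \qquad b \sim N(0, D(\theta)),

where g is the link function, v the variance function, a_i the known prior weights, \phi the scale factor, and D(\theta) the dispersion matrix of the random effects; PQL estimates \beta and b by maximizing a Laplace approximation to the integrated quasi-likelihood, with the variance components \theta estimated by pseudo-likelihood.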

4,317 citations

Journal ArticleDOI
TL;DR: The use of regression models for making covariance adjustments in the comparison of survival curves is illustrated by application to a clinical trial of maintenance therapy for childhood leukemia.

Abstract: The use of regression models for making covariance adjustments in the comparison of survival curves is illustrated by application to a clinical trial of maintenance therapy for childhood leukemia. Three models are considered: the log linear exponential (Glasser [1967]); Cox's [1972] nonparametric generalization of this; and the linear exponential (Feigl and Zelen [1965]). Age and white blood count at diagnosis are both shown to be important for prognosis; adjustment for the latter variable has marked effects on the treatment comparisons. Both advantages and disadvantages with the regression approach are noted.
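As a rough illustration of this kind of covariate-adjusted treatment comparison, the sketch below fits a Cox proportional hazards model with the lifelines library; the data frame and column names (time, event, treatment, age, log_wbc) are hypothetical stand-ins, and the two parametric exponential models discussed in the paper are not shown.

    # Hypothetical sketch: adjust a treatment comparison of survival times
    # for age and (log) white blood count with a Cox regression.
    import pandas as pd
    from lifelines import CoxPHFitter

    df = pd.DataFrame({
        "time":      [10, 22, 7, 32, 15, 6, 19, 4],   # remission durations (made up)
        "event":     [1, 0, 1, 0, 1, 1, 0, 1],        # 1 = relapse observed, 0 = censored
        "treatment": [1, 1, 1, 1, 0, 0, 0, 0],        # maintenance therapy indicator
        "age":       [5, 9, 3, 7, 4, 11, 6, 2],
        "log_wbc":   [2.3, 1.5, 3.0, 1.2, 2.8, 3.4, 1.9, 3.6],
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="time", event_col="event")
    cph.print_summary()  # treatment hazard ratio is now adjusted for age and log_wbc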

1,728 citations

Journal ArticleDOI
TL;DR: It is argued that the problem of estimation of failure rates under the removal of certain causes is not well posed until a mechanism for cause removal is specified, and a method involving the estimation of parameters that relate time-dependent risk indicators for some causes to cause-specific hazard functions for other causes is proposed for the study of interrelations among failure types.
Abstract: Distinct problems in the analysis of failure times with competing causes of failure include the estimation of treatment or exposure effects on specific failure types, the study of interrelations among failure types, and the estimation of failure rates for some causes given the removal of certain other failure types. The usual formulation of these problems is in terms of conceptual or latent failure times for each failure type. This approach is criticized on the basis of unwarranted assumptions, lack of physical interpretation and identifiability problems. An alternative approach utilizing cause-specific hazard functions for observable quantities, including time-dependent covariates, is proposed. Cause-specific hazard functions are shown to be the basic estimable quantities in the competing risks framework. A method, involving the estimation of parameters that relate time-dependent risk indicators for some causes to cause-specific hazard functions for other causes, is proposed for the study of interrelations among failure types. Further, it is argued that the problem of estimation of failure rates under the removal of certain causes is not well posed until a mechanism for cause removal is specified. Following such a specification, one will sometimes be in a position to make sensible extrapolations from available data to situations involving cause removal. A clinical program in bone marrow transplantation for leukemia provides a setting for discussion and illustration of each of these ideas. Failure due to censoring in a survivorship study leads to further discussion.
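For reference, the cause-specific hazard function that the abstract identifies as the basic estimable quantity can be written (in our notation, for failure time T, failure type J and covariate history x(t)) as

    \lambda_j(t; x) = \lim_{\Delta t \downarrow 0} \frac{\Pr\{ t \le T < t + \Delta t, \; J = j \mid T \ge t, \; x(t) \}}{\Delta t}, \qquad j = 1, \dots, m,

i.e. the instantaneous rate of failure from cause j among subjects still at risk, a quantity that is estimable from observable data without reference to latent failure times.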

1,429 citations

Journal ArticleDOI
TL;DR: In this article, a generalization of the Kruskal-Wallis test for testing the equality of K continuous distribution functions when observations are subject to arbitrary right censorship is proposed, where the distribution of the censoring variables is allowed to differ for different populations.
Abstract: A generalization of the Kruskal-Wallis test, which extends Gehan's generalization of Wilcoxon's test, is proposed for testing the equality of K continuous distribution functions when observations are subject to arbitrary right censorship. The distribution of the censoring variables is allowed to differ for different populations. An alternative statistic is proposed for use when the censoring distributions may be assumed equal. These statistics have asymptotic chi-squared distributions under their respective null hypotheses, whether the censoring variables are regarded as random or as fixed numbers. Asymptotic power and efficiency calculations are made and numerical examples provided. A generalization of Wilcoxon's statistic for comparing two populations has been proposed by Gehan (1965a) for use when the observations are subject to arbitrary right censorship. Mantel (1967), as well as Gehan (1965b), has considered a further generalization to the case of arbitrarily restricted observation, or left and right censorship. Both of these authors base their calculations on the permutation distribution of the statistic, conditional on the observed censoring pattern for the combined sample. However, this model is inapplicable when there are differences in the distribution of the censoring variables for the two populations. For instance, in medical follow-up studies, where Gehan's procedure has so far found its widest application, this would happen if the two populations had been under study for different lengths of time. This paper extends Gehan's procedure for right censored observations to the comparison of K populations. The probability distributions of the relevant statistics are here considered in a large sample framework under two models: Model I, corresponding to random or unconditional censorship; and Model II, which considers the observed censoring times as fixed numbers. Since the distributions of the censoring variables are allowed to vary with the population, Gehan's procedure is also extended to the case of unequal censorship. For Model I these distributions are theoretical distributions; for Model II they are empirical. Besides providing chi-squared statistics for use in testing the hypothesis of equality of the K populations against general alternatives, the paper shows how single degrees of freedom may be partitioned for use in discriminating specific alternative hypotheses. Several investigators (Efron, 1967) have pointed out that Gehan's test is not the most efficient against certain parametric alternatives and have proposed modifications to increase its power. Asymptotic power and efficiency calculations made below demonstrate that their criticisms would apply equally well to the test proposed here. Hopefully some of the modifications they suggest can likewise eventually be generalized to the case of K populations.
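For a sense of how such a K-sample comparison of censored survival times looks in practice, the sketch below uses multivariate_logrank_test from the lifelines library on made-up data; note that this is the (unweighted) K-sample log-rank statistic, a close relative of the generalized Kruskal-Wallis/Wilcoxon statistic proposed in the paper, which differs in how the contributions at each failure time are weighted.

    # Hypothetical K-sample comparison of right-censored survival times.
    import numpy as np
    from lifelines.statistics import multivariate_logrank_test

    durations = np.array([5, 8, 12, 3, 9, 14, 6, 11, 2])  # observed times (made up)
    events    = np.array([1, 0, 1, 1, 1, 0, 1, 0, 1])     # 1 = failure, 0 = censored
    groups    = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])     # K = 3 populations

    result = multivariate_logrank_test(durations, groups, events)
    print(result.test_statistic, result.p_value)  # chi-squared statistic on K - 1 df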

1,351 citations

Journal ArticleDOI
15 Jul 1989-Cancer
TL;DR: The Third National Wilms' Tumor Study as discussed by the authors sought to reduce treatment for low-risk patients and find better chemotherapy for those at high risk for relapse by randomized treatment regimens.
Abstract: The Third National Wilms' Tumor Study sought to reduce treatment for low-risk patients and find better chemotherapy for those at high risk for relapse. Eligible patients (1439) were randomized according to stage (I-IV) and histology (favorable [FH] or unfavorable [UH]), and contributed data to survival and relapse-free survival (RFS) analyses. Four-year (postnephrectomy) survival percentages and randomized treatment regimens for low-risk patients were 96.5% for 607 Stage I/FH patients who received dactinomycin (Actinomycin D [AMD], Merck Sharp & Dohme, West Point, PA) and vincristine (VCR) for 10 weeks versus 6 months; 92.2% for 278 Stage II/FH patients; and 86.9% for 275 Stage III/FH patients who received AMD + VCR +/- Adriamycin (ADR, Adria Laboratories, Columbus, OH) for 15 months. Stage II/FH patients also had either zero or 2000 cGy irradiation (RT) postoperatively and Stage III/FH patients either 1000 or 2000 cGy. Four-year survival was 73.0% for 279 high-risk patients (any Stage IV, all UH) who received postoperative radiation therapy (RT) and AMD + VCR + ADR +/- cyclophosphamide (CPM). Statistical analysis of survival and RFS experience shows that the less intensive therapy does not worsen results for low-risk patients and CPM does not benefit those at high risk.

629 citations


Cited by
Journal ArticleDOI
TL;DR: The method of classifying comorbidity provides a simple, readily applicable and valid method of estimating risk of death from comorbid disease for use in longitudinal studies; further work in larger populations is still required to refine the approach.

39,961 citations

Book ChapterDOI
TL;DR: The analysis of censored failure times is considered in this paper, where the hazard function is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time.
Abstract: The analysis of censored failure times is considered. It is assumed that on each individual are available values of one or more explanatory variables. The hazard function (age-specific failure rate) is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time. A conditional likelihood is obtained, leading to inferences about the unknown regression coefficients. Some generalizations are outlined.
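In symbols (our notation), the model described above takes the hazard for an individual with explanatory variables x to be

    \lambda(t; x) = \lambda_0(t) \, \exp(\beta^\top x),

with \lambda_0(t) left arbitrary, and the regression coefficients \beta are estimated from the conditional (partial) likelihood

    L(\beta) = \prod_{i:\; \delta_i = 1} \frac{\exp(\beta^\top x_i)}{\sum_{j \in R(t_i)} \exp(\beta^\top x_j)},

where the product runs over the observed failures and R(t_i) is the set of individuals still at risk just before failure time t_i.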

28,264 citations

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined and derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest; adding pD to the posterior mean deviance gives a deviance information criterion that is related to other information criteria and has an approximate decision theoretic justification.
Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.
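A minimal numerical sketch of the quantities defined above, assuming a hypothetical deviance function and a vector of posterior draws of a single parameter (both invented for illustration):

    # Compute pD and DIC from posterior draws, following the definitions above.
    import numpy as np

    def deviance(theta, y):
        # Hypothetical model: -2 * log-likelihood of y under Normal(theta, 1).
        return np.sum((y - theta) ** 2) + len(y) * np.log(2 * np.pi)

    y = np.array([1.2, 0.7, 1.9, 1.1])                            # made-up data
    draws = np.random.default_rng(0).normal(1.2, 0.3, size=5000)  # stand-in for MCMC draws

    D_bar = np.mean([deviance(t, y) for t in draws])  # posterior mean deviance
    D_hat = deviance(draws.mean(), y)                 # deviance at posterior mean
    p_D   = D_bar - D_hat                             # effective number of parameters
    DIC   = D_bar + p_D                               # deviance information criterion
    print(p_D, DIC)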

11,691 citations

Journal ArticleDOI
TL;DR: Methods that combine estimates of the cause-specific hazard functions under the proportional hazards formulation do not allow the analyst to directly assess the effect of a covariate on the marginal probability of failure from a particular cause (the cumulative incidence function); this article proposes an approach aimed directly at that question.
Abstract: With explanatory covariates, the standard analysis for competing risks data involves modeling the cause-specific hazard functions via a proportional hazards assumption. Unfortunately, the cause-specific hazard function does not have a direct interpretation in terms of survival probabilities for the particular failure type. In recent years many clinicians have begun using the cumulative incidence function, the marginal failure probabilities for a particular cause, which is intuitively appealing and more easily explained to the nonstatistician. The cumulative incidence is especially relevant in cost-effectiveness analyses in which the survival probabilities are needed to determine treatment utility. Previously, authors have considered methods for combining estimates of the cause-specific hazard functions under the proportional hazards formulation. However, these methods do not allow the analyst to directly assess the effect of a covariate on the marginal probability function.
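For reference, the cumulative incidence function for failure type k that the abstract refers to can be written (our notation) as

    F_k(t) = \Pr(T \le t, \; \text{cause} = k) = \int_0^t S(u^-) \, \lambda_k(u) \, du,

where S(u^-) is the overall survival probability just prior to u and \lambda_k is the cause-specific hazard for type k; because F_k involves all of the cause-specific hazards through S, a covariate's effect on \lambda_k does not translate directly into its effect on F_k, which is the difficulty the article addresses.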

11,109 citations

Book
24 Aug 2012
TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

8,059 citations