Showing papers by "Robert Tibshirani" published in 1997


Journal Article
TL;DR: A lasso-based method for variable selection and shrinkage in Cox's proportional hazards model is proposed; it reduces estimation variance while providing an interpretable final model, and simulations indicate that it can be more accurate than stepwise selection in this setting.
Abstract: SUMMARY I propose a new method for variable selection and shrinkage in Cox’s proportional hazards model. My proposal minimizes the log partial likelihood subject to the sum of the absolute values of the parameters being bounded by a constant. Because of the nature of this constraint, it shrinks coefficients and produces some coefficients that are exactly zero. As a result it reduces the estimation variance while providing an interpretable final model. The method is a variation of the ‘lasso’ proposal of Tibshirani, designed for the linear regression context. Simulations indicate that the lasso can be more accurate than stepwise selection in this setting.
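
The constrained problem described in this abstract can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the listing: β is the coefficient vector with p entries, l(β) is the Cox log partial likelihood, and s ≥ 0 is the bounding constant. The objective is written as minimizing the negative log partial likelihood, which is an assumed sign convention.

```latex
\hat{\beta} \;=\; \arg\min_{\beta} \; \bigl\{ -\,l(\beta) \bigr\}
\quad \text{subject to} \quad
\sum_{j=1}^{p} \lvert \beta_j \rvert \;\le\; s
```

Smaller values of s force more shrinkage; because the constraint region has corners on the coordinate axes, some coefficients can be set exactly to zero.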

3,004 citations


Journal Article
TL;DR: It is shown that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments and also considers estimating the variability of an error rate estimate.
Abstract: A training set of data has been used to construct a rule for predicting future responses. What is the error rate of this rule? This is an important question both for comparing models and for assessing a final selected model. The traditional answer to this question is given by cross-validation. The cross-validation estimate of prediction error is nearly unbiased but can be highly variable. Here we discuss bootstrap estimates of prediction error, which can be thought of as smoothed versions of cross-validation. We show that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments. Besides providing point estimates, we also consider estimating the variability of an error rate estimate. All of the results here are nonparametric and apply to any possible prediction rule; however, we study only classification problems with 0–1 loss in detail. Our simulations include “smooth” prediction rules like Fisher's linear discriminant fun...
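
As a rough illustration of how a .632+-style estimate combines its ingredients, here is a minimal sketch. The formulas are not given in this listing; they follow the commonly cited description of the rule and should be checked against the paper. The function name and the arguments `err_train` (apparent error on the training set), `err_boot1` (leave-one-out bootstrap error), and `gamma` (no-information error rate) are hypothetical names chosen for this example.

```python
def err_632_plus(err_train, err_boot1, gamma):
    """Sketch of a .632+-style blend of training error and bootstrap error.

    Assumed inputs (all error rates in [0, 1]):
      err_train -- apparent error on the training set
      err_boot1 -- leave-one-out bootstrap error estimate
      gamma     -- no-information error rate
    """
    # Cap the bootstrap error at the no-information rate.
    err1 = min(err_boot1, gamma)
    # Relative overfitting rate, kept in [0, 1]; zero when no overfitting shows.
    if err1 > err_train and gamma > err_train:
        r = (err1 - err_train) / (gamma - err_train)
    else:
        r = 0.0
    # Weight on the bootstrap error grows from 0.632 (r = 0) to 1 (r = 1).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_train + w * err1
```

With no apparent overfitting the weight stays at 0.632, which reduces to the plain .632 estimate; with maximal overfitting the estimate falls back on the bootstrap error alone.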

1,602 citations


Journal Article
TL;DR: The use of cellular telephones in motor vehicles is associated with a quadrupling of the risk of a collision during the brief time interval involving a call.
Abstract: Background Because of a belief that the use of cellular telephones while driving may cause collisions, several countries have restricted their use in motor vehicles, and others are considering such regulations. We used an epidemiologic method, the case–crossover design, to study whether using a cellular telephone while driving increases the risk of a motor vehicle collision. Methods We studied 699 drivers who had cellular telephones and who were involved in motor vehicle collisions resulting in substantial property damage but no personal injury. Each person's cellular-telephone calls on the day of the collision and during the previous week were analyzed through the use of detailed billing records. Results A total of 26,798 cellular-telephone calls were made during the 14-month study period. The risk of a collision when using a cellular telephone was four times higher than the risk when a cellular telephone was not being used (relative risk, 4.3; 95 percent confidence interval, 3.0 to 6.5). The relative ri...

1,202 citations


Proceedings Article
01 Dec 1997
TL;DR: A strategy for polychotomous classification is discussed that estimates class probabilities for each pair of classes and then couples the estimates together, using a coupling model similar to the Bradley-Terry method for paired comparisons.
Abstract: We discuss a strategy for polychotomous classification that involves estimating class probabilities for each pair of classes, and then coupling the estimates together. The coupling model is similar to the Bradley-Terry method for paired comparisons. We study the nature of the class probability estimates that arise, and examine the performance of the procedure in simulated datasets. The classifiers used include linear discriminants and nearest neighbors: application to support vector machines is also briefly described.
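
The coupling step can be sketched with a Bradley-Terry-style fixed-point iteration. This is a minimal illustration under stated assumptions, not necessarily the authors' exact procedure: all pairs are weighted equally, `r[i][j]` is a hypothetical input holding an estimate of P(class i | class i or j), and the function name is made up for this example.

```python
def couple_pairwise(r, n_iter=500):
    """Combine pairwise estimates r[i][j] ~ P(class i | class i or j)
    into one probability vector p, assuming equal weight for every pair.

    Diagonal entries of r are ignored; r[j][i] is expected to be 1 - r[i][j].
    """
    k = len(r)
    p = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(n_iter):
        for i in range(k):
            # Observed pairwise preference for class i.
            num = sum(r[i][j] for j in range(k) if j != i)
            # Preference for class i implied by the current p.
            den = sum(p[i] / (p[i] + p[j]) for j in range(k) if j != i)
            p[i] *= num / den
            # Renormalize so p remains a probability vector.
            total = sum(p)
            p = [v / total for v in p]
    return p
```

When the pairwise estimates are mutually consistent, that is r[i][j] = p_i / (p_i + p_j) for some probability vector, the iteration recovers that vector; otherwise it returns a compromise that matches the pairwise evidence class by class.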

232 citations


Journal Article
TL;DR: It is suggested that an understanding of the case-crossover design may help investigators explore selected questions in behavioral medical research.

78 citations


Journal Article
15 Aug 1997-Cancer
TL;DR: Among menstruating women who have used hormones, screening mammograms performed in the luteal phase of the menstrual cycle may carry an increased risk of false-negative results, which might adversely affect screening efficacy.
Abstract: BACKGROUND The efficacy of breast carcinoma screening should be enhanced if false-negative mammography were reduced. Prospectively collected data from the Canadian National Breast Screening Study were used to examine whether menstrual cycle phase was associated with false-negative outcomes for mammographic screening. METHODS Of 8887 women ages 40-44 years at the onset of screening, randomized to receive annual mammography and clinical breast examination, reporting menstruation no more than 28 days prior to their screening examination, and with a valid radiologic report, 1898 had never used oral contraceptives or replacement estrogen with or without progesterone. The remainder were past (6573) and current (416) estrogen users. Similar selection criteria were applied at subsequent screens. The distribution of false-negative and false-positive mammography in relation to true-negative and true-positive mammography was examined with respect to the follicular (Days 1-14) and luteal (Days 15-28) menstrual phases. RESULTS Comparing luteal with follicular mammograms in 6989 patients who ever used estrogen, the unadjusted odds ratio (2-sided P-values) for false-negatives versus true-negatives was 2.16 (0.05) and the adjusted odds ratio was 1.47 (0.05). In 1898 never-users, parallel odds ratios for luteal false-negatives were 0.55 (1.0) and 0.74 (1.0), respectively. CONCLUSIONS These results suggest that menstruating women who have used hormones may have an increased risk of false-negative results for screening mammograms performed in the luteal phase of the menstrual cycle. An increased risk of false-negative mammography might adversely affect screening efficacy. The impact of menstrual phase on mammographic interpretation, especially for women who ever used hormones, requires further investigation. Cancer 1997; 80:720-4. © 1997 American Cancer Society.

59 citations


Journal Article
TL;DR: The authors compare the world record sprint races of Donovan Bailey and Michael Johnson in the 1996 Olympic Games, and try to answer the questions: 1. Who is faster?, and 2. Which performance was more remarkable?
Abstract: I compare the world record sprint races of Donovan Bailey and Michael Johnson in the 1996 Olympic Games, and try to answer the questions: 1. Who is faster?, and 2. Which performance was more remarkable? The statistical methods used include cubic spline curve fitting, the parametric bootstrap, and Keller's model of running.

23 citations



Journal Article
TL;DR: In this paper, the authors describe the analysis of some matched-pair binary data arising from a study designed to investigate whether cellular-telephone use is associated with motor-vehicle collisions.
Abstract: We describe the analysis of some matched-pair binary data arising from a study designed to investigate whether cellular-telephone use is associated with motor-vehicle collisions. Conditional and random effects approaches to the problem are derived and compared. Driving intermittency is a potential confounder whose effect is assessed by strategic choices of the control period and by application of the bootstrap. The marked discrepancy between the conditional and random approaches merits further study.
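
For matched-pair binary data of this kind, the textbook conditional estimate uses only the discordant pairs. The sketch below shows that standard estimator and is not necessarily the exact analysis derived in the paper; the function name, the input layout, and the counts in the usage note are hypothetical and chosen purely for illustration.

```python
def conditional_odds_ratio(pairs):
    """Conditional (discordant-pair) odds ratio for matched-pair binary data.

    pairs: iterable of (exposed_in_hazard_window, exposed_in_control_window)
           booleans, one tuple per subject acting as their own control.
    """
    pairs = list(pairs)
    # Exposed just before the event but not in the control window.
    hazard_only = sum(1 for h, c in pairs if h and not c)
    # Exposed in the control window but not before the event.
    control_only = sum(1 for h, c in pairs if c and not h)
    if control_only == 0:
        raise ValueError("no control-only discordant pairs; ratio is undefined")
    # Concordant pairs carry no information in the conditional analysis.
    return hazard_only / control_only
```

For example, with made-up data in which 8 subjects are exposed only in the hazard window and 2 only in the control window, the estimate is 8 / 2 regardless of how many concordant pairs there are.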

20 citations


Journal Article
01 Mar 1997-Chance
TL;DR: The case cross-over design is a case control method in which the controls are the same people as the cases. In a study of 699 drivers who had had an accident, 24% used their phones in the ten-minute period before the accident, compared with 5% who used them while driving during the same time period the day before.
Abstract: The authors discuss a study they carried out which was reported in the New England Journal of Medicine (Feb 17, 1997) and discussed in Chance News 6.03. This article discusses interesting aspects of the study that would not appear in a technical article. For example, they were cautioned by friends about carrying out the study since it could affect large companies’ sales. They point out that the cellular phone companies in North America have significantly greater daily revenues than Microsoft. The design the authors used for their study, called the case cross-over design, is relatively new. It is a case control method where the controls are the same people as the cases. The authors considered 699 drivers who had had an accident. They compared the proportion of those who used their phones in the ten-minute period before their accidents (24%) with the proportion of those who used them while driving during the same time period the day before the accident (5%). Summary statistics led to a relative risk of 6.5 for using a phone while driving. The authors explain why they rejected the use of more standard methods that had been used in previous studies and which, they felt, led to biased results. They also discuss some issues involved in the media attention that the study received. They provide a cartoon from the Philadelphia Inquirer, suggesting that the danger of driving with a telephone should be compared to driving while drunk. Most writers included a statement similar to that of Gina Kolata in her article about the study in the Times. Referring to the risk of driving while talking on the telephone she writes:

12 citations