
Showing papers by "Robert Tibshirani published in 1998"


Journal ArticleDOI
TL;DR: In this article, the authors discuss a strategy for polychotomous classification that involves estimating class probabilities for each pair of classes, and then coupling the estimates together, similar to the Bradley-Terry method for paired comparisons.
Abstract: We discuss a strategy for polychotomous classification that involves estimating class probabilities for each pair of classes, and then coupling the estimates together. The coupling model is similar to the Bradley-Terry method for paired comparisons. We study the nature of the class probability estimates that arise, and examine the performance of the procedure in real and simulated data sets. Classifiers used include linear discriminants, nearest neighbors, adaptive nonlinear methods and the support vector machine.
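The coupling step can be sketched as a small iterative-scaling loop. The function below is a minimal sketch of the pairwise-coupling idea, assuming equal sample sizes for every pair of classes; `couple_pairwise` is an illustrative name, not from the paper.

```python
import numpy as np

def couple_pairwise(r, n_iter=1000, tol=1e-10):
    """Combine pairwise class-probability estimates into one probability
    vector. r[i, j] estimates P(class i | class is i or j). A minimal
    iterative-scaling sketch of the coupling idea, assuming equal
    sample sizes for every pair of classes."""
    K = r.shape[0]
    off = ~np.eye(K, dtype=bool)           # mask selecting i != j entries
    p = np.full(K, 1.0 / K)                # start from uniform probabilities
    for _ in range(n_iter):
        mu = p[:, None] / (p[:, None] + p[None, :])  # pairwise probs implied by p
        # multiplicative update: match observed r against the implied mu
        p_new = p * (r * off).sum(axis=1) / (mu * off).sum(axis=1)
        p_new /= p_new.sum()               # renormalize to a probability vector
        if np.abs(p_new - p).max() < tol:
            return p_new
        p = p_new
    return p
```

With consistent inputs r[i, j] = p[i] / (p[i] + p[j]), the true p is a fixed point of the update, so the loop recovers it from a uniform start.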

1,569 citations


Journal ArticleDOI
TL;DR: The paper studies the construction of confidence values and examines to what extent they approximate frequentist p-values and Bayesian a posteriori probabilities, and derives more accurate confidence levels using both frequentist and objective Bayesian approaches.
Abstract: In the problem of regions, we wish to know which one of a discrete set of possibilities applies to a continuous parameter vector. This problem arises in the following way: we compute a descriptive statistic from a set of data, notice an interesting feature and wish to assign a confidence level to that feature. For example, we compute a density estimate and notice that the estimate is bimodal. What confidence can we assign to bimodality? A natural way to measure confidence is via the bootstrap: we compute our descriptive statistic on a large number of bootstrap data sets and record the proportion of times that the feature appears. This seems like a plausible measure of confidence for the feature. The paper studies the construction of such confidence values and examines to what extent they approximate frequentist $p$-values and Bayesian a posteriori probabilities. We derive more accurate confidence levels using both frequentist and objective Bayesian approaches. The methods are illustrated with a number of examples, including polynomial model selection and estimating the number of modes of a density.
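The naive bootstrap confidence measure described in the abstract is easy to sketch: resample the data, recompute the feature, and report the hit rate. The function and its `feature` predicate below are illustrative names, not from the paper, and this is the simple measure before any of the paper's accuracy corrections.

```python
import numpy as np

def bootstrap_confidence(data, feature, n_boot=2000, seed=0):
    """Naive bootstrap confidence for a feature of the data: the fraction
    of bootstrap resamples on which the boolean statistic `feature` holds
    (e.g. 'the density estimate is bimodal', or 'the mean is positive')."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    hits = 0
    for _ in range(n_boot):
        # resample the data with replacement, same size as the original
        resample = rng.choice(data, size=data.shape[0], replace=True)
        hits += bool(feature(resample))
    return hits / n_boot
```

For example, `bootstrap_confidence(sample, lambda x: x.mean() > 0)` reports how often the resampled mean stays positive.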

139 citations


Journal ArticleDOI
TL;DR: A battery of modern, adaptive non-linear learning methods is applied to a large real database of cardiac patient data; none of the methods outperforms a relatively simple logistic regression model previously developed for this problem.
Abstract: We apply a battery of modern, adaptive non-linear learning methods to a large real database of cardiac patient data. We use each method to predict 30-day mortality from a large number of potential risk factors, and we compare their performances. We find that none of the methods could outperform a relatively simple logistic regression model previously developed for this problem.

88 citations


Journal ArticleDOI
TL;DR: In this article, a generalized estimating equations approach for longitudinal data is proposed to incorporate the flexibility of nonparametric smoothing, and the convergence of the estimating equations and consistency of the resulting solutions are discussed.
Abstract: We introduce a class of models for longitudinal data by extending the generalized estimating equations approach of Liang and Zeger (1986) to incorporate the flexibility of nonparametric smoothing. The algorithm provides a unified estimation procedure for marginal distributions from the exponential family. We propose pointwise standard-error bands and approximate likelihood-ratio and score tests for inference. The algorithm is formally derived by using the penalized quasilikelihood framework. Convergence of the estimating equations and consistency of the resulting solutions are discussed. We illustrate the algorithm with data on the population dynamics of Colorado potato beetles on potato plants.

42 citations


Journal ArticleDOI
TL;DR: A new method for regression trees which obtains estimates and predictions subject to constraints on the coefficients representing the effects of splits in the tree and for some problems gives better predictions than cost-complexity pruning used in the classification and regression tree (CART) algorithm.
Abstract: We investigate a new method for regression trees which obtains estimates and predictions subject to constraints on the coefficients representing the effects of splits in the tree. The procedure leads to both shrinking of the node estimates and pruning of branches in the tree and for some problems gives better predictions than cost-complexity pruning used in the classification and regression tree (CART) algorithm. The new method is based on the least absolute shrinkage and selection operator (LASSO) method developed by Tibshirani.
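The LASSO fit at the heart of the method can be sketched generically: represent each split's effect as a coefficient in a linear model and shrink the coefficients with an L1 penalty. The coordinate-descent solver below is a standard textbook sketch, not the paper's exact algorithm, and `lasso_cd` is an illustrative name.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=300):
    """Coordinate-descent LASSO: minimize (1/2n)||y - Xb||^2 + lam*||b||_1.
    A generic solver sketch; in the tree setting the columns of X would be
    basis functions encoding the splits, so coefficients driven to zero
    correspond to pruned branches and shrunken ones to shrunken node
    estimates."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n            # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            resid = y - X @ b + X[:, j] * b[j]   # partial residual excluding j
            rho = X[:, j] @ resid / n
            # soft-threshold update for coordinate j
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b
```

Setting `lam=0` recovers the least-squares fit, while increasing `lam` zeroes out weak coefficients, which is the shrinking-and-pruning behavior the abstract describes.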

24 citations


Journal ArticleDOI
TL;DR: This work considers two methods of making use of the coaching variables in order to improve the prediction of Y from x1, x2, …, xp.
Abstract: In a regression or classification setting where we wish to predict Y from x1, x2, …, xp, we suppose that an additional set of 'coaching' variables z1, z2, …, zm is available in our training sample. These might be variables that are difficult to measure, and they will not be available when we predict Y from x1, x2, …, xp in the future. We consider two methods of making use of the coaching variables in order to improve the prediction of Y from x1, x2, …, xp. The relative merits of these approaches are discussed and compared in a number of examples.

20 citations