
Showing papers by "Jerome H. Friedman published in 2001"



Journal ArticleDOI
TL;DR: A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
Abstract: Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent “boosting” paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such “TreeBoost” models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire and of Friedman, Hastie and Tibshirani are discussed.
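A minimal sketch of the least-squares case of this paradigm, in which each stage fits a regression tree to the current residuals (the negative gradient of squared-error loss). It assumes scikit-learn's DecisionTreeRegressor as the base learner; the shrinkage rate, tree depth, and number of stages are illustrative choices, not values from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_ls(X, y, n_stages=100, learning_rate=0.1, max_depth=3):
    """Least-squares gradient boosting: each stage fits a regression tree
    to the current residuals (the negative gradient of L2 loss)."""
    f0 = np.mean(y)                       # initial constant model
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_stages):
        residuals = y - pred              # negative gradient for squared error
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def boosted_predict(X, f0, trees, learning_rate=0.1):
    # Stagewise additive expansion: constant plus shrunken tree contributions.
    return f0 + learning_rate * sum(t.predict(X) for t in trees)
```

The shrinkage factor plays the role of a step length in the paper's steepest-descent view; smaller values generally require more stages.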

17,764 citations



Book ChapterDOI
01 Jan 2001
TL;DR: The first three examples described in Chapter 1 have several components in common: for each there is a set of variables that might be denoted as inputs, which are measured or preset.
Abstract: The first three examples described in Chapter 1 have several components in common. For each there is a set of variables that might be denoted as inputs, which are measured or preset. These have some influence on one or more outputs. For each example the goal is to use the inputs to predict the values of the outputs. This exercise is called supervised learning.
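A minimal sketch of this input/output setting, assuming synthetic data and scikit-learn's LinearRegression (both illustrative choices, not part of the chapter):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                        # inputs: measured or preset variables
true_coef = np.array([2.0, -1.0, 0.5])
y = X @ true_coef + rng.normal(scale=0.1, size=100)  # output influenced by the inputs

# Supervised learning: use the inputs to predict the values of the output.
model = LinearRegression().fit(X, y)
predictions = model.predict(X[:5])
```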

181 citations


Book ChapterDOI
01 Jan 2001
TL;DR: A linear regression model assumes the regression function E(Y|X) is linear in the inputs; applying linear methods to transformations of the inputs considerably expands their scope, and these generalizations are called basis-function methods.
Abstract: A linear regression model assumes that the regression function E(Y|X) is linear in the inputs X_1, ..., X_p. Linear models were largely developed in the precomputer age of statistics, but even in today’s computer era there are still good reasons to study and use them. They are simple and often provide an adequate and interpretable description of how the inputs affect the output. For prediction purposes they can sometimes outperform fancier nonlinear models, especially in situations with small numbers of training cases, low signal-to-noise ratio or sparse data. Finally, linear methods can be applied to transformations of the inputs and this considerably expands their scope. These generalizations are sometimes called basis-function methods, and are discussed in Chapter 5.
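A short sketch of the basis-function idea: the model stays linear in the coefficients while the raw inputs are replaced by transformed features. scikit-learn's PolynomialFeatures and the cubic expansion are illustrative assumptions, not the chapter's construction.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=(200, 1))
y = np.sin(x).ravel() + rng.normal(scale=0.1, size=200)  # nonlinear truth

# The model remains linear in its coefficients, but the input is replaced
# by basis functions h_m(x) -- here x, x^2, x^3.
H = PolynomialFeatures(degree=3, include_bias=False).fit_transform(x)
fit = LinearRegression().fit(H, y)
```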

104 citations


Journal ArticleDOI
TL;DR: This article explores some of the causes of this situation, and asks why statisticians should be interested in taking part in the development of new methods for handling large and complex data sets.
Abstract: The nature of data is changing rapidly. Data sets are becoming increasingly large and complex. The modern methodology for analyzing these new kinds of data comes from the fields of database management, artificial intelligence, pattern recognition, and data visualization. So far, statistics as a discipline has played only a minor role. This article explores some of the causes of this situation, and asks why statisticians should be interested in taking part in the development of new methods for handling large and complex data sets.

57 citations


Book ChapterDOI
01 Jan 2001
TL;DR: This chapter revisits the classification problem and focuses on linear methods for classification, procedures whose decision boundaries between the predicted regions of input space are linear.
Abstract: In this chapter we revisit the classification problem and focus on linear methods for classification. Since our predictor G(x) takes values in a discrete set G, we can always divide the input space into a collection of regions labeled according to the classification. We saw in Chapter 2 that the boundaries of these regions can be rough or smooth, depending on the prediction function. For an important class of procedures, these decision boundaries are linear; this is what we will mean by linear methods for classification.
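A brief sketch of one such procedure, linear discriminant analysis, whose decision boundary is linear in the inputs. scikit-learn and the two-Gaussian synthetic data set are assumptions for illustration:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
# Two Gaussian classes with a shared covariance, so the optimal boundary is linear.
X = np.vstack([rng.normal(loc=-1.0, size=(100, 2)),
               rng.normal(loc=+1.0, size=(100, 2))])
y = np.repeat([0, 1], 100)

lda = LinearDiscriminantAnalysis().fit(X, y)
w, b = lda.coef_[0], lda.intercept_[0]   # decision boundary: w @ x + b = 0
```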

36 citations


Book ChapterDOI
01 Jan 2001
TL;DR: The true regression function f(X) = E(Y|X) will typically be nonlinear and nonadditive in X, so representing it by a linear model is usually a convenient, and sometimes a necessary, approximation.
Abstract: We have already made use of models linear in the input features, both for regression and classification. Linear regression, linear discriminant analysis, logistic regression and separating hyperplanes all rely on a linear model. It is extremely unlikely that the true function f(X) is actually linear in X. In regression problems, f(X) = E(Y|X) will typically be nonlinear and nonadditive in X, and representing f(X) by a linear model is usually a convenient, and sometimes a necessary, approximation. Convenient because a linear model is easy to interpret, and is the first-order Taylor approximation to f(X). Sometimes necessary, because with N small and/or p large, a linear model might be all we are able to fit to the data without overfitting. Likewise in classification, a linear, Bayes-optimal decision boundary implies that some monotone transformation of Pr(Y = 1|X) is linear in X. This is inevitably an approximation.
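A small sketch of the closing classification remark: in logistic regression the monotone transformation is the logit, so log(Pr(Y=1|X) / (1 - Pr(Y=1|X))) is modeled as linear in X. scikit-learn and the synthetic data are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
p = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] - 2.0 * X[:, 1])))  # true Pr(Y=1|X)
y = rng.uniform(size=500) < p

clf = LogisticRegression().fit(X, y)
# decision_function returns the fitted log-odds, a linear function of X:
# log(p_hat / (1 - p_hat)) = intercept_ + coef_ @ x
logits = clf.decision_function(X[:3])
```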

19 citations