
Showing papers by "Jerome H. Friedman" published in 2009


Book ChapterDOI
01 Jan 2009
TL;DR: The generalization performance of a learning method is its prediction capability on independent test data; assessing it guides the choice of learning method or model and gives a measure of the quality of the ultimately chosen model.
Abstract: The generalization performance of a learning method relates to its prediction capability on independent test data. Assessment of this performance is extremely important in practice, since it guides the choice of learning method or model, and gives us a measure of the quality of the ultimately chosen model.

220 citations
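As a minimal, illustrative sketch of the assessment idea described above (my own setup, not the chapter's code): estimate generalization error with a held-out test set and with K-fold cross-validation on a synthetic regression dataset; the model, dataset, and hyperparameters are assumptions chosen only for illustration.

```python
# Sketch: training error vs. held-out test error vs. 5-fold cross-validation.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = DecisionTreeRegressor(max_depth=4, random_state=0)
model.fit(X_train, y_train)

# Training error is optimistic; error on independent test data estimates
# the prediction capability the chapter is concerned with.
train_mse = np.mean((model.predict(X_train) - y_train) ** 2)
test_mse = np.mean((model.predict(X_test) - y_test) ** 2)

# Cross-validation estimates test error without sacrificing a fixed test set.
cv_mse = -cross_val_score(model, X, y, cv=5,
                          scoring="neg_mean_squared_error").mean()
print(f"train MSE {train_mse:.2f}  test MSE {test_mse:.2f}  CV MSE {cv_mse:.2f}")
```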


Book ChapterDOI
01 Jan 2009
TL;DR: Boosting is one of the most powerful learning ideas introduced in the last ten years; originally designed for classification, it can, as this chapter shows, profitably be extended to regression as well.
Abstract: Boosting is one of the most powerful learning ideas introduced in the last ten years. It was originally designed for classification problems, but as will be seen in this chapter, it can profitably be extended to regression as well. The motivation for boosting was a procedure that combines the outputs of many “weak” classifiers to produce a powerful “committee.” From this perspective boosting bears a resemblance to bagging and other committee-based approaches (Section 8.8). However we shall see that the connection is at best superficial and that boosting is fundamentally different.

192 citations
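A minimal sketch of the "committee of weak classifiers" idea from the abstract above (an assumed scikit-learn setup, not the book's implementation): boost depth-1 decision trees ("stumps") with AdaBoost and compare a single stump against the boosted committee.

```python
# Sketch: a single weak classifier vs. a boosted committee of weak classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A depth-1 tree is a weak classifier on its own.
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# AdaBoost reweights the training data at each round and combines many
# stumps by a weighted vote (the "committee").
boosted = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

print("stump accuracy:  ", stump.score(X_test, y_test))
print("boosted accuracy:", boosted.score(X_test, y_test))
```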


Book ChapterDOI
01 Jan 2009
TL;DR: This chapter begins the discussion of some specific methods for supervised learning by describing five related techniques: generalized additive models, trees, multivariate adaptive regression splines, the patient rule induction method, and hierarchical mixtures of experts.
Abstract: In this chapter we begin our discussion of some specific methods for supervised learning. These techniques each assume a (different) structured form for the unknown regression function, and by doing so they finesse the curse of dimensionality. Of course, they pay the possible price of misspecifying the model, and so in each case there is a tradeoff that has to be made. They take off where Chapters 3–6 left off. We describe five related techniques: generalized additive models, trees, multivariate adaptive regression splines, the patient rule induction method, and hierarchical mixtures of experts.

58 citations
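To make the "structured form finesses the curse of dimensionality" point concrete, here is a rough backfitting sketch for one of the five techniques named above, a generalized additive model y = alpha + sum_j f_j(x_j). The data, the number of iterations, and the use of a cubic polynomial as a stand-in univariate smoother are all my own assumptions for illustration, not the chapter's algorithm as printed.

```python
# Sketch: backfitting an additive model with a crude univariate smoother.
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(-2, 2, size=(n, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2] + rng.normal(0, 0.2, n)

alpha = y.mean()
f = np.zeros((n, X.shape[1]))           # current estimates of f_j(x_ij)

for _ in range(20):                     # backfitting iterations
    for j in range(X.shape[1]):
        # Partial residual: what is left for f_j to explain.
        others = [k for k in range(X.shape[1]) if k != j]
        r = y - alpha - f[:, others].sum(axis=1)
        coefs = np.polyfit(X[:, j], r, deg=3)   # stand-in "smoother"
        f[:, j] = np.polyval(coefs, X[:, j])
        f[:, j] -= f[:, j].mean()       # center each component for identifiability

fitted = alpha + f.sum(axis=1)
print("residual variance:", np.var(y - fitted))
```

Each f_j is fit one coordinate at a time, which is what keeps the procedure tractable as the number of features grows.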


Book ChapterDOI
01 Jan 2009
TL;DR: In this chapter, the authors describe generalizations of linear decision boundaries for classification, including flexible discriminant analysis, which facilitates construction of nonlinear boundaries in a manner very similar to support vector machines.
Abstract: In this chapter we describe generalizations of linear decision boundaries for classification. Optimal separating hyperplanes are introduced in Chapter 4 for the case when two classes are linearly separable. Here we cover extensions to the nonseparable case, where the classes overlap. These techniques are then generalized to what is known as the support vector machine, which produces nonlinear boundaries by constructing a linear boundary in a large, transformed version of the feature space. The second set of methods generalize Fisher’s linear discriminant analysis (LDA). The generalizations include flexible discriminant analysis which facilitates construction of nonlinear boundaries in a manner very similar to the support vector machines, penalized discriminant analysis for problems such as signal and image classification where the large number of features are highly correlated, and mixture discriminant analysis for irregularly shaped classes.

45 citations
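A minimal sketch of the kernel idea in the abstract above (assumed scikit-learn setup, synthetic two-class data with overlap): an SVM with an RBF kernel constructs a linear boundary in an enlarged, implicitly transformed feature space, which appears nonlinear in the original space.

```python
# Sketch: linear vs. RBF-kernel support vector machine on overlapping classes.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=1000, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Linear boundary in the original feature space.
linear_svm = SVC(kernel="linear", C=1.0).fit(X_train, y_train)

# RBF kernel: a linear boundary in a large transformed feature space,
# nonlinear when viewed in the original coordinates.
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))
```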


Book ChapterDOI
01 Jan 2009

29 citations


Book ChapterDOI
01 Jan 2009

28 citations


Book ChapterDOI
01 Jan 2009
TL;DR: For most of this book, the fitting (learning) of models has been achieved by minimizing a sum of squares for regression or by minimizing cross-entropy for classification; both are instances of the maximum likelihood approach to fitting.
Abstract: For most of this book, the fitting (learning) of models has been achieved by minimizing a sum of squares for regression, or by minimizing cross-entropy for classification. In fact, both of these minimizations are instances of the maximum likelihood approach to fitting.

22 citations
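A small numerical check of the claim above (my own illustration, not the chapter's): for a linear model with Gaussian errors, minimizing the sum of squares and maximizing the likelihood give the same coefficients. The data-generating values and the fixed noise level are assumptions.

```python
# Sketch: least squares and Gaussian maximum likelihood coincide.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one feature
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(0, 0.5, n)

# Least squares: minimize the residual sum of squares.
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Maximum likelihood: minimize the Gaussian negative log-likelihood in beta.
# For fixed sigma this is the residual sum of squares up to constants.
def neg_log_lik(beta, sigma=0.5):
    resid = y - X @ beta
    return 0.5 * np.sum(resid ** 2) / sigma ** 2 + n * np.log(sigma)

beta_ml = minimize(neg_log_lik, x0=np.zeros(2)).x

print("least squares:     ", beta_ls)
print("maximum likelihood:", beta_ml)   # agrees up to optimizer tolerance
```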


Book ChapterDOI
01 Jan 2009
TL;DR: Because these model-free methods are highly unstructured, they typically aren't useful for understanding the nature of the relationship between the features and the class outcome, but as black box prediction engines they can be very effective and are often among the best performers on real data problems.
Abstract: In this chapter we discuss some simple and essentially model-free methods for classification and pattern recognition. Because they are highly unstructured, they typically aren’t useful for understanding the nature of the relationship between the features and class outcome. However, as black box prediction engines, they can be very effective, and are often among the best performers in real data problems. The nearest-neighbor technique can also be used in regression; this was touched on in Chapter 2 and works reasonably well for low-dimensional problems. However, with high-dimensional features, the bias-variance tradeoff does not work as favorably for nearest-neighbor regression as it does for classification.

19 citations
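A minimal sketch of the nearest-neighbor classifier described above as a black-box predictor (assumed scikit-learn setup and synthetic data; the candidate values of k are arbitrary choices for illustration), with the neighborhood size chosen by cross-validation.

```python
# Sketch: k-nearest-neighbor classification with k chosen by cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1500, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pick the neighborhood size k by 5-fold cross-validation on the training set.
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             X_train, y_train, cv=5).mean()
          for k in (1, 5, 15, 31)}
best_k = max(scores, key=scores.get)

knn = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
print("chosen k:", best_k, " test accuracy:", knn.score(X_test, y_test))
```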


Book ChapterDOI
01 Jan 2009

6 citations