Author

Jerome H. Friedman

Other affiliations: University of Washington
Bio: Jerome H. Friedman is an academic researcher from Stanford University. The author has contributed to research in topics: Lasso (statistics) & Multivariate statistics. The author has an h-index of 70 and has co-authored 155 publications receiving 138,619 citations. Previous affiliations of Jerome H. Friedman include the University of Washington.


Papers
Journal ArticleDOI
TL;DR: In this article, the authors explore some of the causes of this situation and ask why statisticians should want to take part in developing new methods for handling large and complex data sets.
Abstract: The nature of data is changing rapidly. Data sets are becoming ever larger and more complex. The modern methodology for analyzing these new kinds of data comes from the fields of database management, artificial intelligence, pattern recognition, and data visualization. So far, statistics as a discipline has played only a minor role. This article explores some of the causes of this situation and asks why statisticians should want to take part in developing new methods for handling large and complex data sets.

57 citations

Book ChapterDOI
01 Jan 1979
TL;DR: In this article, a nonparametric regression problem is formulated in which one is given a set of vector-valued variables X (termed carriers), each with an associated scalar quantity Y (the response), and the problem is to estimate each corresponding response using the values of its carriers and the training sample.
Abstract: In the nonparametric regression problem, one is given a set of vector-valued variables X (termed carriers) and, with each, an associated scalar quantity Y (termed the response). This set of carriers and associated responses $\{Y_i, X_i\}$ $(1 \le i \le N)$ constitutes the training sample.

55 citations
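As a minimal illustration of the carrier/response setup formulated above, the sketch below fits a simple nonparametric regressor to a simulated training sample and predicts responses at new carrier values. It uses k-nearest-neighbour averaging purely for concreteness; that choice is an assumed stand-in, not the specific technique developed in the chapter.

```python
# Nonparametric regression in the carrier/response setup: estimate the
# response at new carrier values from the training sample {Y_i, X_i}.
# k-NN averaging is used only as a simple illustrative estimator.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 3))             # carriers X_i (vector-valued)
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)  # responses Y_i

model = KNeighborsRegressor(n_neighbors=10).fit(X, y)
X_new = rng.uniform(-2, 2, size=(5, 3))           # new carrier values
print(model.predict(X_new))                       # estimated responses
```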

Posted Content
TL;DR: In this article, the authors review several variance estimators and perform a reasonably extensive simulation study in an attempt to compare their finite sample performance; the results suggest that variance estimators with adaptively chosen regularisation parameters perform admirably over a broad range of sparsity and signal strength settings.
Abstract: Variance estimation in the linear model when $p > n$ is a difficult problem. Standard least squares estimation techniques do not apply. Several variance estimators have been proposed in the literature, all with accompanying asymptotic results proving consistency and asymptotic normality under a variety of assumptions. It is found, however, that most of these estimators suffer large biases in finite samples when true underlying signals become less sparse with larger per-element signal strength. One estimator seems to be largely neglected in the literature: a residual sum of squares based estimator using Lasso coefficients with regularisation parameter selected adaptively (via cross-validation). In this paper, we review several variance estimators and perform a reasonably extensive simulation study in an attempt to compare their finite sample performance. It would seem from the results that variance estimators with adaptively chosen regularisation parameters perform admirably over a broad range of sparsity and signal strength settings. Finally, some initial theoretical analyses pertaining to these types of estimators are proposed and developed.

50 citations
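The neglected estimator the abstract singles out is easy to sketch: fit the Lasso with a cross-validation-chosen regularisation parameter and build a variance estimate from the residual sum of squares. The normalisation by n minus the size of the selected support is an assumption here (a natural degrees-of-freedom correction), not a detail taken from this listing, and the data are simulated.

```python
# RSS-based variance estimator: lasso with CV-chosen lambda, then
# sigma^2_hat = RSS / (n - s_hat), where s_hat counts the nonzero
# lasso coefficients. The correction term is an assumed convention.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p, sigma = 100, 200, 1.5                 # p > n setting
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                              # sparse true signal
y = X @ beta + sigma * rng.normal(size=n)

fit = LassoCV(cv=5).fit(X, y)               # lambda chosen adaptively by CV
rss = np.sum((y - fit.predict(X)) ** 2)
s_hat = np.sum(fit.coef_ != 0)              # size of the selected support
sigma2_hat = rss / (n - s_hat)
print(sigma2_hat, sigma ** 2)               # estimate vs. true variance
```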

Journal ArticleDOI
TL;DR: A rigorous formalization of the additive tree, an empirically validated technique for learning a single decision tree, is introduced, and it is shown that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter.
Abstract: The expansion of machine learning to high-stakes application domains such as medicine, finance, and criminal justice, where making informed decisions requires clear understanding of the model, has increased the interest in interpretable machine learning. The widely used Classification and Regression Trees (CART) have played a major role in health sciences, due to their simple and intuitive explanation of predictions. Ensemble methods like gradient boosting can improve the accuracy of decision trees, but at the expense of the interpretability of the generated model. Additive models, such as those produced by gradient boosting, and full interaction models, such as CART, have been investigated largely in isolation. We show that these models exist along a spectrum, revealing previously unseen connections between these approaches. This paper introduces a rigorous formalization for the additive tree, an empirically validated learning technique for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although the additive tree is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it can also produce decision trees, intermediate between CART and boosted stumps, that can outperform either of these approaches.

49 citations
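The spectrum described above has CART at one extreme and gradient boosted stumps at the other. The sketch below fits only those two extreme models with scikit-learn on simulated data; the additive tree itself, the family interpolating between them, is not implemented here, and the depth and ensemble-size settings are arbitrary.

```python
# The two extremes of the paper's spectrum: a single full-interaction
# tree (CART-style) versus an additive ensemble of depth-1 stumps.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Full-interaction extreme: one deep decision tree.
cart = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)
# Additive extreme: gradient boosted stumps (max_depth=1).
stumps = GradientBoostingClassifier(max_depth=1, n_estimators=200).fit(X_tr, y_tr)

print(cart.score(X_te, y_te), stumps.score(X_te, y_te))
```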

Book ChapterDOI
01 Jan 2009
TL;DR: In this article, the authors describe generalizations of linear decision boundaries for classification, including flexible discriminant analysis which facilitates construction of nonlinear boundaries in a manner very similar to the support vector machines.
Abstract: In this chapter we describe generalizations of linear decision boundaries for classification. Optimal separating hyperplanes are introduced in Chapter 4 for the case when two classes are linearly separable. Here we cover extensions to the nonseparable case, where the classes overlap. These techniques are then generalized to what is known as the support vector machine, which produces nonlinear boundaries by constructing a linear boundary in a large, transformed version of the feature space. The second set of methods generalize Fisher’s linear discriminant analysis (LDA). The generalizations include flexible discriminant analysis which facilitates construction of nonlinear boundaries in a manner very similar to the support vector machines, penalized discriminant analysis for problems such as signal and image classification where the large number of features are highly correlated, and mixture discriminant analysis for irregularly shaped classes.

45 citations
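The chapter's central construction, a linear boundary built in a large transformed feature space that becomes nonlinear in the original space, is what a kernelised SVM computes. The sketch below contrasts it with plain LDA on simulated overlapping (nonseparable) classes; the dataset and parameter choices are illustrative assumptions.

```python
# Linear boundary in a transformed feature space (kernel SVM) versus a
# linear boundary in the original space (LDA) on nonseparable classes.
from sklearn.datasets import make_moons
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)  # overlapping classes
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)  # linear decision boundary
svm = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)      # linear in kernel feature space,
                                                    # nonlinear in the original space
print(lda.score(X_te, y_te), svm.score(X_te, y_te))
```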


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations
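A minimal sketch of the uniform estimator interface (fit, predict, score) behind the abstract's emphasis on ease of use and API consistency; the particular dataset and estimator are arbitrary choices, and any scikit-learn estimator could be swapped in unchanged.

```python
# The scikit-learn estimator API: every model exposes fit/predict, so
# utilities like cross_val_score work uniformly across estimators.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)       # any estimator would do here
print(cross_val_score(clf, X, y, cv=5).mean())  # mean cross-validated accuracy
```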

Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations
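DESeq2 itself is an R/Bioconductor package, so the Python sketch below is only a loose stand-in: it fits the kind of per-gene negative binomial GLM that DESeq2 builds on and reads off a Wald test for the condition effect. The fixed dispersion and simulated counts are assumptions, and DESeq2's defining shrinkage of dispersions and fold changes is not reproduced here.

```python
# Per-gene negative binomial GLM for count data: intercept plus a
# condition effect, with a Wald test on that effect. This illustrates
# the modelling core only; it is not DESeq2 and applies no shrinkage.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
condition = np.repeat([0, 1], 5)                      # 5 control, 5 treated samples
design = sm.add_constant(condition.astype(float))     # intercept + condition column
counts = rng.negative_binomial(n=10, p=0.3, size=10)  # read counts for one gene

fit = sm.GLM(counts, design,
             family=sm.families.NegativeBinomial(alpha=0.1)).fit()  # fixed dispersion (assumed)
print(fit.params[1], fit.pvalues[1])  # condition effect (natural-log scale), Wald p-value
```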

Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations
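The matching pipeline the abstract describes, distinctive invariant features matched by a fast nearest-neighbour search, can be sketched with OpenCV's SIFT implementation. The image file names below are placeholders, and the 0.75 ratio-test threshold is a commonly used value, not one taken from this listing.

```python
# SIFT feature matching: detect keypoints, compute descriptors, find the
# two nearest neighbours for each, and keep matches passing the ratio test.
import cv2

img1 = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)  # placeholder file names
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)            # two nearest neighbours each
good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test
print(len(good), "matches passed the ratio test")
```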

Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

40,826 citations
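LIBSVM is a C/C++ library with bindings in many languages; scikit-learn's SVC, for one, is built on it. The short sketch below touches two of the issues the abstract lists, multiclass classification (handled internally by one-vs-one) and probability estimates, on a stock dataset chosen purely for illustration.

```python
# Driving LIBSVM through scikit-learn's SVC wrapper: a three-class
# problem (decomposed one-vs-one internally) with probability estimates.
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                    # three classes
clf = SVC(kernel="rbf", probability=True).fit(X, y)  # enables probability estimates
print(clf.predict(X[:3]))                            # predicted class labels
print(clf.predict_proba(X[:3]).round(2))             # per-class probabilities
```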

Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

40,785 citations
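The abstract's key property, that the L1 constraint drives some coefficients exactly to zero and so yields interpretable models, is easy to see numerically. A minimal sketch on simulated data, with the penalty weight chosen arbitrarily:

```python
# The lasso's defining behaviour: the L1 penalty sets many coefficients
# exactly to zero, recovering a sparse, interpretable model.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
beta = np.array([3.0, -2.0] + [0.0] * 8)  # only two active predictors
y = X @ beta + rng.normal(size=100)

fit = Lasso(alpha=0.5).fit(X, y)          # alpha chosen arbitrarily here
print(fit.coef_.round(2))                 # most coefficients are exactly 0
```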