Author

Jerome H. Friedman

Other affiliations: University of Washington
Bio: Jerome H. Friedman is an academic researcher from Stanford University. The author has contributed to research in topics: Lasso (statistics) & Multivariate statistics. The author has an h-index of 70 and has co-authored 155 publications receiving 138,619 citations. Previous affiliations of Jerome H. Friedman include University of Washington.


Papers
Posted Content
TL;DR: A fast algorithm is presented that provides regularized solutions for all constraint values t, for any differentiable function f and loss L, and any constraint P that is a monotone increasing function of the absolute value of each parameter.
Abstract: Many regression and classification procedures fit a parameterized function $f(x;w)$ of predictor variables $x$ to data $\{x_{i},y_{i}\}_1^N$ based on some loss criterion $L(y,f)$. Often, regularization is applied to improve accuracy by placing a constraint $P(w)\leq t$ on the values of the parameters $w$. Although efficient methods exist for finding solutions to these constrained optimization problems for all values of $t\geq0$ in the special case when $f$ is a linear function, none are available when $f$ is non-linear (e.g. Neural Networks). Here we present a fast algorithm that provides all such solutions for any differentiable function $f$ and loss $L$, and any constraint $P$ that is an increasing monotone function of the absolute value of each parameter. Applications involving sparsity inducing regularization of arbitrary Neural Networks are discussed. Empirical results indicate that these sparse solutions are usually superior to their dense counterparts in both accuracy and interpretability. This improvement in accuracy can often make Neural Networks competitive with, and sometimes superior to, state-of-the-art methods in the analysis of tabular data.
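The paper's path algorithm itself is not reproduced in this abstract; as a hedged illustration of the general setting it describes (fitting a differentiable nonlinear f under an L1-type penalty, traced over a grid of penalty strengths), here is a minimal numpy sketch using proximal gradient descent. The toy model, step size, and grid are all hypothetical choices, not the paper's method:

```python
# Hypothetical sketch: trace L1-regularized solutions for a small nonlinear
# model f(x; w) over a grid of penalties via proximal gradient descent.
# This is NOT the paper's algorithm, only a generic illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.tanh(X[:, 0]) - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)

def f(X, w):                      # a simple nonlinear model: tanh of a linear score
    return np.tanh(X @ w)

def grad_loss(X, y, w):           # gradient of squared-error loss via the chain rule
    r = f(X, w) - y
    return X.T @ (2 * r * (1 - np.tanh(X @ w) ** 2)) / len(y)

def soft_threshold(w, t):         # proximal operator of the L1 penalty
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

eta = 0.1                         # fixed step size (hypothetical)
w = np.zeros(X.shape[1])
path = []
for lam in np.geomspace(1.0, 1e-3, 30):    # strong -> weak regularization
    for _ in range(300):
        w = soft_threshold(w - eta * grad_loss(X, y, w), eta * lam)
    path.append((lam, w.copy()))

for lam, w_lam in path[::10]:
    print(f"lambda={lam:.4f}  nonzeros={np.count_nonzero(w_lam)}")
```

Warm-starting each penalty value from the previous solution, as above, is the standard trick for tracing such paths cheaply.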

3 citations

Journal ArticleDOI
TL;DR: There is more of a continuum between the old and new methodologies, and an opportunity for both to improve through their synergy.
Abstract: Professor Efron has presented us with a thought‐provoking paper on the relationship between prediction, estimation, and attribution in the modern era of data science. While we appreciate many of his arguments, we see more of a continuum between the old and new methodology, and the opportunity for both to improve through their synergy.

2 citations

Book ChapterDOI
08 Jun 2005
TL;DR: A new class of similarity functions, SFs, is introduced that can be used to discover properties in the feature space X and to group them with standard clustering techniques.
Abstract: Variability and noise in data-set entries make it hard to discover important regularities among association rules in mining problems. Flexible and robust similarity measures between association rules are therefore needed. This paper introduces a new class of similarity functions, SFs, that can be used to discover properties in the feature space X and to group them with standard clustering techniques. Properties of the proposed SFs are investigated, and experiments on simulated data sets are shown to evaluate the grouping performance.
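The chapter's actual SF definitions are not given in this abstract; as a purely hypothetical illustration of the pipeline it describes (a similarity between association rules, then standard clustering), the sketch below substitutes a simple Jaccard similarity on the items appearing in each rule:

```python
# Hypothetical illustration only: Jaccard similarity between association rules
# (treated as item sets), fed to standard hierarchical clustering.
# The chapter's actual SFs are not reproduced here.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rules = [                                  # toy rules: (antecedent, consequent)
    ({"bread", "butter"}, {"milk"}),
    ({"bread"}, {"milk"}),
    ({"beer"}, {"chips"}),
    ({"beer", "chips"}, {"salsa"}),
]

def jaccard(rule_a, rule_b):               # similarity on the union of items
    a = rule_a[0] | rule_a[1]
    b = rule_b[0] | rule_b[1]
    return len(a & b) / len(a | b)

n = len(rules)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = 1.0 - jaccard(rules[i], rules[j])

labels = fcluster(linkage(squareform(dist), method="average"),
                  t=2, criterion="maxclust")
print(labels)   # groups the bread/milk rules apart from the beer rules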

2 citations

01 Nov 1981
TL;DR: In this paper, an importance sampling technique is proposed to improve the efficiency of an acceptance/rejection generating method by adaptively partitioning the sampling region so that the variation of density values within each subregion is relatively small.
Abstract: Monte Carlo calculations often require generation of a random sample of n-dimensional points drawn from a specified multivariate probability distribution. We present an importance sampling technique that can often greatly improve the efficiency of an acceptance/rejection generating method. The importance sampling function is defined as piecewise constant on a set of subregions, which are obtained by adaptively partitioning the sampling region so that the variation of density values within each subregion is relatively small. The partitioning strategy is based on multiparameter optimization to estimate the maximum and minimum of the original density function in each subregion.
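As a hedged one-dimensional sketch of the idea (adaptive partitioning into cells where the density varies little, a piecewise-constant envelope per cell, then acceptance/rejection), here is a toy version; the report's multidimensional, optimization-based partitioning is not reproduced:

```python
# 1-D toy of the idea: split the domain until the density varies little within
# each cell, use the cell's (approximate) maximum as a piecewise-constant
# envelope, then do acceptance/rejection within cells chosen by their mass.
import numpy as np

rng = np.random.default_rng(1)
density = lambda x: np.exp(-0.5 * ((x - 0.3) / 0.05) ** 2)  # unnormalized target

def partition(lo, hi, depth=0):
    xs = np.linspace(lo, hi, 16)
    fmax, fmin = density(xs).max(), density(xs).min()
    # grid max is only an approximate envelope; the report estimates true
    # extrema in each subregion by multiparameter optimization
    if depth >= 8 or fmax - fmin < 0.05 * fmax + 1e-12:
        return [(lo, hi, fmax)]             # variation small: keep the cell
    mid = 0.5 * (lo + hi)
    return partition(lo, mid, depth + 1) + partition(mid, hi, depth + 1)

cells = partition(0.0, 1.0)
mass = np.array([(hi - lo) * m for lo, hi, m in cells])
probs = mass / mass.sum()

def sample(n):
    out = []
    while len(out) < n:
        lo, hi, m = cells[rng.choice(len(cells), p=probs)]  # pick cell by mass
        x = rng.uniform(lo, hi)
        if rng.uniform(0, m) < density(x):                  # accept/reject
            out.append(x)
    return np.array(out)

xs = sample(1000)
print(f"sample mean ≈ {xs.mean():.3f} (target mode at 0.30)")
```

Because the envelope hugs the density inside each small cell, far fewer proposals are rejected than with a single global bound.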

2 citations

Journal ArticleDOI
TL;DR: The predictions of the multiperipheral model are compared with inclusive data in K+p and πp reactions, and the relation of this work to the multi-Regge model and to other approaches to inclusive analysis is discussed.
Abstract: The predictions of the multiperipheral model are compared to inclusive data in K+p and πp reactions. We compare with topological longitudinal momentum distributions, double differential distributions, multiplicity cross-sections, the π+/π− ratio, asymmetry characteristics, isotropy in the c.m., and Regge behavior near the kinematical limit. The agreement is reasonably good. We discuss the relation of this work to earlier work on the multi-Regge model, to results of other models, and to the results obtained by other types of approaches to the inclusive analysis. 1. INTRODUCTION. During the last two years the inclusive type of reaction(1,2) a + b → c + anything has become a popular means of studying high energy collisions. Two different approaches to this study can perhaps be distinguished. On the one hand, detailed studies have been made of the momentum distribution of particle "c" in the momentum regions near the kinematical limit. For example, comparisons of a given reaction (e.g. π + p → π + anything for slow π in the lab.)(3) at several energies have been made to test the Yang conjecture(2) of limiting distributions. Comparisons of the π distributions from proton targets with different incident particles have been made(4) to test the factorization hypothesis(5). Finally, studies of a single reaction at a single energy have been made to test the quantitative predictions of the Regge limit near the kinematical boundary(6). The advantage of this type of approach is that by examining this momentum range in such detail with these various methods, one can perhaps obtain insight into the precise character of the production process. However, the scope of the knowledge is limited: for example, little is said about the distribution at pL ≈ 0, or about its dependence on prong number, or about correlations between the spectra of different types of secondaries (for example, in a pp reaction the relation between fast produced π spectra and inelastic p spectra). On the other hand, various dynamical models have been proposed that describe the spectra over the entire momentum range. For example, we list: (a) the multiperipheral model in the exclusive form of ABFST(7), Chew and Pignotti(8), and CLA(9), and in the inclusive form of Caneschi and Pignotti(10); (b) the thermodynamical model

2 citations


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
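As a brief usage illustration (not part of the abstract), fitting and evaluating one of scikit-learn's estimators takes only a few lines; the dataset and model choice here are arbitrary:

```python
# Minimal scikit-learn usage sketch: fit and evaluate a classifier.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```

The uniform fit/predict/score API across estimators is what the abstract's emphasis on consistency refers to.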

47,974 citations

Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .
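DESeq2 itself is an R/Bioconductor package; as a heavily hedged toy (a normal–normal shrinkage of per-gene log fold changes toward a zero-centered prior, illustrating the shrinkage idea only, not DESeq2's actual estimators), the core effect looks like this:

```python
# Toy normal-normal shrinkage of log2 fold changes toward a zero-centered
# prior. Illustrates why shrinkage stabilizes noisy per-gene estimates;
# this is NOT DESeq2's actual method.
import numpy as np

rng = np.random.default_rng(2)
true_lfc = np.where(rng.random(1000) < 0.1,          # 10% of genes change
                    rng.normal(0, 2, 1000), 0.0)
se = rng.uniform(0.2, 1.5, 1000)                     # per-gene standard errors
obs_lfc = true_lfc + rng.normal(0, se)               # noisy observed estimates

tau2 = 1.0                                           # assumed prior variance
shrunk = obs_lfc * tau2 / (tau2 + se ** 2)           # posterior mean, N(0, tau2) prior

print(f"raw    MSE: {np.mean((obs_lfc - true_lfc) ** 2):.3f}")
print(f"shrunk MSE: {np.mean((shrunk - true_lfc) ** 2):.3f}")
```

Genes with large standard errors are pulled strongly toward zero, which is the stability-versus-noise trade-off the abstract describes.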

47,038 citations

Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
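As a short illustration of the matching pipeline the abstract describes, OpenCV exposes a SIFT implementation; the ratio test below is the standard nearest-neighbor heuristic, and the image file names are placeholders:

```python
# SIFT feature matching with the nearest-neighbor ratio test, via OpenCV.
# Image paths are placeholders; requires opencv-python >= 4.4.
import cv2

img1 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)     # two nearest neighbors per feature

# keep a match only if it is clearly better than the second-best candidate
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} matches passed the ratio test")
```

The surviving matches would then feed the Hough-transform clustering and least-squares pose verification stages described above.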

46,906 citations

Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
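LIBSVM has bindings in many languages; scikit-learn's SVC, for instance, is backed by it, so a minimal usage sketch (dataset and parameters chosen arbitrarily) looks like:

```python
# Minimal SVM fit; scikit-learn's SVC wraps LIBSVM under the hood.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, probability=True)   # probability estimates enabled
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```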

40,826 citations

Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
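As a hedged sketch of the lasso's key mechanism (the soft-thresholding update used in modern coordinate-descent solvers; a toy implementation, not the paper's original quadratic-programming formulation):

```python
# Toy coordinate-descent lasso. Illustrates why the L1 constraint sets some
# coefficients exactly to zero; not the paper's original QP formulation.
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ w + X[:, j] * w[j]          # partial residual w/o j
            rho = X[:, j] @ r_j / n
            z = (X[:, j] @ X[:, j]) / n
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z  # soft-threshold
    return w

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 8))
y = X @ np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0]) + rng.normal(size=100)
print(np.round(lasso_cd(X, y, lam=0.5), 2))   # irrelevant coefficients land at 0
```

The soft-threshold step is where the exact zeros come from: any coordinate whose correlation with the residual falls below lam is clipped to 0 rather than merely shrunk.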

40,785 citations