
Showing papers by "Jerome H. Friedman published in 2016"


Journal ArticleDOI
TL;DR: In this paper, the authors review several variance estimators and perform a reasonably extensive simulation study to compare their finite-sample performance; the results suggest that variance estimators with adaptively chosen regularisation parameters perform admirably over a broad range of sparsity and signal strength settings.
Abstract: Variance estimation in the linear model when p > n is a difficult problem. Standard least squares estimation techniques do not apply. Several variance estimators have been proposed in the literature, all with accompanying asymptotic results proving consistency and asymptotic normality under a variety of assumptions. It is found, however, that most of these estimators suffer large biases in finite samples when true underlying signals become less sparse with larger per element signal strength. One estimator seems to merit more attention than it has received in the literature: a residual sum of squares based estimator using Lasso coefficients with regularisation parameter selected adaptively (via cross-validation). In this paper, we review several variance estimators and perform a reasonably extensive simulation study in an attempt to compare their finite sample performance. It would seem from the results that variance estimators with adaptively chosen regularisation parameters perform admirably over a broad range of sparsity and signal strength settings. Finally, some initial theoretical analyses pertaining to these types of estimators are proposed and developed.
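The estimator singled out in the abstract can be sketched in a few lines. The following is an illustrative reconstruction, not the authors' code: it forms the residual sum of squares from a Lasso fit with cross-validated regularisation parameter, and divides by n minus the number of selected coefficients. The data-generating settings (n, p, sparsity, signal strength) are arbitrary choices for the demo.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Simulated p > n regression with a sparse true signal (illustrative settings).
rng = np.random.default_rng(0)
n, p, s = 100, 200, 5
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 2.0                              # first s coefficients carry signal
sigma = 1.5
y = X @ beta + sigma * rng.standard_normal(n)

# Lasso with regularisation parameter chosen adaptively via cross-validation.
lasso = LassoCV(cv=5).fit(X, y)
df = np.count_nonzero(lasso.coef_)          # number of selected predictors
rss = np.sum((y - lasso.predict(X)) ** 2)   # residual sum of squares
sigma2_hat = rss / (n - df)                 # RSS-based variance estimate
print(sigma2_hat)
```

Here sigma2_hat targets the noise variance sigma^2 = 2.25; the degrees-of-freedom correction n - df is the standard adjustment for the Lasso in this type of estimator.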

127 citations


Journal ArticleDOI
TL;DR: Wavelet-based gradient boosting takes advantage of the approximate $\ell_1$ penalization induced by gradient boosting to give appropriate penalized additive fits.
Abstract: A new data science tool named wavelet-based gradient boosting is proposed and tested. The approach is a special case of componentwise linear least squares gradient boosting, and involves wavelet functions of the original predictors. Wavelet-based gradient boosting takes advantage of the approximate $\ell_1$ penalization induced by gradient boosting to give appropriate penalized additive fits. The method is readily implemented in R and produces parsimonious and interpretable regression fits and classifiers.
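The base procedure the paper builds on, componentwise linear least squares gradient boosting, can be sketched as follows. This is a hedged illustration on raw predictors rather than a wavelet expansion of them; the function name and settings are hypothetical, and the paper's actual implementation is in R.

```python
import numpy as np

def componentwise_boost(X, y, n_steps=200, nu=0.1):
    """At each step, fit every single-predictor least squares model to the
    current residuals, keep the best one, and take a shrunken step toward it.
    The shrinkage (nu < 1) is what induces the approximate l1-type path."""
    n, p = X.shape
    coef = np.zeros(p)
    intercept = y.mean()
    r = y - intercept                        # current residuals
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_steps):
        b = X.T @ r / col_ss                 # univariate LS coefficient per column
        sse = ((r[:, None] - X * b) ** 2).sum(axis=0)
        j = np.argmin(sse)                   # best single component this step
        coef[j] += nu * b[j]                 # shrunken update
        r -= nu * b[j] * X[:, j]
    return intercept, coef

# Demo: two active predictors out of twenty (illustrative data).
rng = np.random.default_rng(1)
X = rng.standard_normal((80, 20))
y = 3 * X[:, 0] - 2 * X[:, 3] + 0.1 * rng.standard_normal(80)
b0, b = componentwise_boost(X, y)
print(np.nonzero(np.abs(b) > 0.1)[0])        # indices of selected components
```

In the paper's method the columns of X would be wavelet basis evaluations of each original predictor, so the selected components form a parsimonious additive fit.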

8 citations


Posted Content
TL;DR: rCOSA is a software package interfaced to the R language that implements statistical techniques for clustering objects on subsets of attributes in multivariate data; the main output of COSA is a dissimilarity matrix that one can subsequently analyze with a variety of proximity analysis methods.
Abstract: rCOSA is a software package interfaced to the R language. It implements statistical techniques for clustering objects on subsets of attributes in multivariate data. The main output of COSA is a dissimilarity matrix that one can subsequently analyze with a variety of proximity analysis methods. Our package extends the original COSA software (Friedman and Meulman, 2004) by adding functions for hierarchical clustering methods, least squares multidimensional scaling, partitional clustering, and data visualization. In the many publications that cite the COSA paper by Friedman and Meulman (2004), the COSA program is actually used only a small number of times. This can be attributed to the fact that the original implementation is not very easy to install and use. Moreover, the available software is out-of-date. Here, we introduce an up-to-date software package and clear guidance for this advanced technique. The software package and related links are available for free at: \url{this https URL}
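The downstream workflow the abstract describes, taking a dissimilarity matrix and feeding it to a proximity analysis method such as hierarchical clustering, can be sketched generically. This is not rCOSA (which is an R package); the dissimilarity matrix here is an ordinary Euclidean stand-in for the attribute-subset dissimilarities COSA would produce.

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

# Two well-separated groups of objects (illustrative data).
rng = np.random.default_rng(2)
A = np.vstack([rng.normal(0, 0.3, (10, 4)), rng.normal(3, 0.3, (10, 4))])

# Stand-in for a COSA output: a symmetric object-by-object dissimilarity matrix.
D = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)

# Proximity analysis on the dissimilarity matrix: average-linkage hierarchy,
# then cut the dendrogram into two clusters.
Z = linkage(squareform(D, checks=False), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

rCOSA additionally wraps multidimensional scaling and partitional clustering around the same dissimilarity matrix; the point of the sketch is only that the matrix, not the raw data, is the interface to all of them.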