Author

Jerome H. Friedman

Other affiliations: University of Washington
Bio: Jerome H. Friedman is an academic researcher at Stanford University. He has contributed to research in topics including Lasso (statistics) and multivariate statistics, has an h-index of 70, and has co-authored 155 publications receiving 138,619 citations. His previous affiliations include the University of Washington.


Papers
Journal ArticleDOI
TL;DR: In this paper, the generalized Veneziano model is fitted to the Kπ mass spectrum at all energies, and it is shown that the model fits the mass spectra very well.
Abstract: We present data on the reaction K⁺p → π⁺pK⁰ at 12 GeV/c, and we present a study of the generalized Veneziano model at 4.6, 9, and 12 GeV/c. We have studied the K⁰π⁺ mass spectrum and found that, in addition to the K*(890) and K*(1420) resonances, there is a hint of a resonance at 1.8 GeV, but we were unable to measure its parameters. We have measured the differential cross sections, the total cross sections, the masses and widths, and the spin density matrices of the two K* and the Δ(1236) resonances, and find them to be in agreement with previously published data. We have fitted the generalized Veneziano model to our reaction and to data at the other energies. We find that the five parameters used in the theory do not significantly change with energy. The fits at all energies are quite good. However, difficulties with the model at one energy persist at all energies. In particular, we find that the model fits the mass spectra very well at all energies, that the momentum transfer distributions to the single particles and to the resonances fit well, and that the ratios of the cross sections of resonances in a given channel are well predicted by the model. The model's inability to fit the pπ⁺ mass spectrum is evident at all energies.
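For background (standard Regge-theory material, not taken from the thesis itself), the simplest four-point Veneziano amplitude from which generalized versions are built is

$$A(s,t) \;=\; \frac{\Gamma\bigl(1-\alpha(s)\bigr)\,\Gamma\bigl(1-\alpha(t)\bigr)}{\Gamma\bigl(1-\alpha(s)-\alpha(t)\bigr)}, \qquad \alpha(x) = \alpha(0) + \alpha' x,$$

where s and t are Mandelstam variables and α(x) is a linear Regge trajectory; the generalized model fitted here presumably extends this to the multi-particle amplitude appropriate to a three-body final state.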

1 citation

01 Jan 2004
TL;DR: This paper presents the motivation for clustering objects on subsets of attributes (COSA) and offers new insight into the weights that are crucial to the COSA procedure but were rather underexposed as diagnostics in the original paper.
Abstract: The motivation for clustering objects on subsets of attributes (COSA) came from considering data where the number of attributes is much larger than the number of objects. An obvious application is in systems biology (genomics, proteomics, and metabolomics). When we have a large number of attributes, objects might cluster on some attributes and be far apart on all others. A common data analysis approach in systems biology is to cluster the attributes first, and only after reducing the original many-attribute data set to a much smaller one, to try to cluster the objects. The problem here, of course, is that we would like to select those attributes that discriminate most among the objects (so we have to do this while regarding all attributes multivariately), and it is usually not good enough to inspect each attribute univariately. Therefore, two tasks have to be carried out simultaneously: cluster the objects into homogeneous groups, while selecting different subsets of variables (one for each group of objects). The attribute subset for any discovered group may be completely overlapping, partially overlapping, or non-overlapping with those for other groups. The notorious local-optima problem is dealt with by starting from the inverse exponential mean (rather than the arithmetic mean) of the separate attribute distances. Using a homotopy strategy, the algorithm creates a smooth transition from the inverse exponential distance to the mean of the ordinary Euclidean distances over attributes. New insight is presented for the homotopy strategy and for the weights that are crucial in the COSA procedure but were rather underexposed as diagnostics in the original paper.
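As a concrete illustration of the inverse exponential mean and its homotopy toward the arithmetic mean, here is a minimal numerical sketch; the function name and array layout are ours, not from the paper.

```python
import numpy as np
from scipy.special import logsumexp

def inverse_exponential_mean(d, lam):
    """Soft-min combination of per-attribute distances.

    d   : array of shape (n_pairs, n_attributes) of non-negative distances
    lam : homotopy parameter; small lam emphasises the closest attributes
          (inverse exponential mean), large lam approaches the ordinary
          arithmetic mean over attributes.
    """
    p = d.shape[1]
    # -lam * log( (1/p) * sum_k exp(-d_k / lam) ), computed stably
    return -lam * (logsumexp(-d / lam, axis=1) - np.log(p))

d = np.array([[0.1, 5.0, 4.0]])           # one object pair, three attributes
for lam in (0.05, 1.0, 100.0):
    print(lam, inverse_exponential_mean(d, lam))
# small lam -> close to the minimum attribute distance (0.1);
# large lam -> close to the arithmetic mean (~3.03)
```

Gradually increasing lam is the smooth transition the abstract describes: early iterations let objects agree on just a few attributes, later ones approach ordinary Euclidean averaging.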

1 citation

Book ChapterDOI
01 Jan 1986
TL;DR: The panel discussion on Data Analysis Trends in X-ray and Gamma-ray Astronomy is opened by introducing the panel members and inviting the speakers to give short contributions.
Abstract: ZIMMERMANN: Let me begin the panel discussion on Data Analysis Trends in X-ray and Gamma-ray astronomy by first introducing the panel members. These are, from left to right: Roland Diehl, Ethan Schreier, Rosolino Buccheri, Livio Scarsi, Jean-Marc Chassery, and Wolfgang Voges. With the exception of Jean-Marc Chassery, who is an expert on image processing and statistical analysis, all the others have long experience in the X-ray and Gamma-ray fields. May I ask the speakers to keep their contributions to no more than five minutes, in order to allow ample room for discussion with the audience.

1 citation

Book ChapterDOI
01 Jan 1982
TL;DR: The Orion I workstation is an experimental computer graphics system, built at the Stanford Linear Accelerator Center in 1980–81, used to study applications of recent developments in computer graphics technology to statistics.
Abstract: The Orion I workstation is an experimental computer graphics system, built at the Stanford Linear Accelerator Center in 1980–81. It is used to study applications of recent developments in computer graphics technology to statistics. Orion I is the newest of several descendants of an earlier system built at SLAC called Prim 9. The principal common feature of “Prim” systems is the use of real-time motion graphics to display three-dimensional scatter plots. We demonstrate some of the more basic methods for data analysis in our film “Exploring Data with Orion I”.
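The core idea here, real-time rotation of a three-dimensional point cloud projected onto a two-dimensional screen, can be sketched in a few lines of modern Python; this is an illustration of the technique, not code from Orion I or the Prim systems.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Hypothetical data: a 3D point cloud, rotated about the y-axis and
# orthographically projected onto the screen plane each frame.
rng = np.random.default_rng(0)
points = rng.standard_normal((500, 3))

fig, ax = plt.subplots()
scat = ax.scatter(points[:, 0], points[:, 1], s=5)
ax.set_xlim(-4, 4)
ax.set_ylim(-4, 4)
ax.set_aspect("equal")

def rotate(frame):
    theta = np.radians(frame)
    rot = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                    [0.0, 1.0, 0.0],
                    [-np.sin(theta), 0.0, np.cos(theta)]])
    xy = (points @ rot.T)[:, :2]        # project rotated points to 2D
    scat.set_offsets(xy)
    return (scat,)

anim = FuncAnimation(fig, rotate, frames=range(0, 360, 2), interval=30)
plt.show()
```

The motion parallax produced by the continuous rotation is what lets the eye perceive depth structure in the scatter plot, which is the effect the Prim systems exploited.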

1 citation


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
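A minimal usage sketch of the estimator API the abstract describes (consistent fit/score methods with sensible defaults); the dataset and model choice are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)              # every estimator exposes fit()
print(clf.score(X_test, y_test))       # mean accuracy on held-out data
```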

47,974 citations

Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.
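DESeq2 itself is an R/Bioconductor package; the toy Python sketch below only illustrates the general idea of shrinkage estimation, pulling noisy per-gene fold-change estimates toward a zero-centred prior. It is not the DESeq2 algorithm, which uses negative-binomial GLMs and estimates the prior width from the data.

```python
import numpy as np

def shrink_lfc(lfc_hat, se, prior_var):
    """Posterior mean under a zero-centred normal prior: noisy estimates
    (large se) are pulled hardest toward zero."""
    weight = prior_var / (prior_var + se ** 2)
    return weight * lfc_hat

rng = np.random.default_rng(1)
# Simulated truth: 10% of genes have a real log fold change, the rest none.
true_lfc = np.where(rng.random(1000) < 0.1, rng.normal(0.0, 2.0, 1000), 0.0)
se = rng.uniform(0.2, 2.0, 1000)            # per-gene standard errors
lfc_hat = true_lfc + rng.normal(0.0, se)    # noisy per-gene estimates
print(shrink_lfc(lfc_hat, se, prior_var=1.0)[:5])
```

This is why shrunken fold changes support analysis of the strength, not just the presence, of differential expression: poorly measured genes no longer dominate rankings with inflated estimates.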

47,038 citations

Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
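The matching stage described here, nearest-neighbour search over descriptors with a test to reject ambiguous matches, can be sketched with OpenCV's SIFT implementation; the file names are placeholders.

```python
import cv2 as cv

img1 = cv.imread("object.png", cv.IMREAD_GRAYSCALE)   # placeholder paths
img2 = cv.imread("scene.png", cv.IMREAD_GRAYSCALE)

sift = cv.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Nearest-neighbour matching with Lowe's ratio test to discard ambiguous
# matches, mirroring the recognition stage described in the abstract.
matcher = cv.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} candidate matches")
```

In the full pipeline of the paper, the surviving matches would then vote in a Hough transform over pose and be verified by a least-squares pose fit.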

46,906 citations

Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
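LIBSVM has C/C++, Java, and many other interfaces; one convenient way to see it in action from Python is through scikit-learn, whose SVC class is built on the LIBSVM solver. A small sketch (the dataset choice is arbitrary):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC   # SVC wraps the LIBSVM solver

X, y = load_breast_cancer(return_X_y=True)

# Feature scaling matters for RBF kernels; probability=True turns on the
# Platt-style probability estimates the article discusses.
clf = make_pipeline(StandardScaler(),
                    SVC(kernel="rbf", C=1.0, probability=True))
clf.fit(X, y)
print(clf.predict_proba(X[:3]))
```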

40,826 citations

Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
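The sparsity property, coefficients driven exactly to zero, is easy to verify with the penalised (Lagrangian) form of the constrained problem, for example via scikit-learn; the simulated data below are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta = np.zeros(20)
beta[:3] = [3.0, -2.0, 1.5]                   # only three true signals
y = X @ beta + rng.normal(0.0, 0.5, size=100)

# Lagrangian form of the constrained problem in the abstract:
#   minimise ||y - Xb||^2 / (2n) + alpha * ||b||_1
fit = Lasso(alpha=0.1).fit(X, y)
print("nonzero coefficients:", np.sum(fit.coef_ != 0))  # most are exactly 0
```

Increasing alpha tightens the implicit L1 constraint, zeroing more coefficients; this is the interpretability-stability trade-off the abstract compares to subset selection and ridge regression.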

40,785 citations