scispace - formally typeset
Author

Robert Tibshirani

Bio: Robert Tibshirani is an academic researcher from Stanford University. The author has contributed to research in topics: Lasso (statistics) & Elastic net regularization. The author has an h-index of 147 and has co-authored 593 publications receiving 326,580 citations. Previous affiliations of Robert Tibshirani include University of Toronto & University of California.


Papers
Journal ArticleDOI
TL;DR: The paper studies the construction of confidence values and examines to what extent they approximate frequentist p-values and Bayesian a posteriori probabilities, and derives more accurate confidence levels using both frequentist and objective Bayesian approaches.
Abstract: In the problem of regions, we wish to know which one of a discrete set of possibilities applies to a continuous parameter vector. This problem arises in the following way: we compute a descriptive statistic from a set of data, notice an interesting feature and wish to assign a confidence level to that feature. For example, we compute a density estimate and notice that the estimate is bimodal. What confidence can we assign to bimodality? A natural way to measure confidence is via the bootstrap: we compute our descriptive statistic on a large number of bootstrap data sets and record the proportion of times that the feature appears. This seems like a plausible measure of confidence for the feature. The paper studies the construction of such confidence values and examines to what extent they approximate frequentist $p$-values and Bayesian a posteriori probabilities. We derive more accurate confidence levels using both frequentist and objective Bayesian approaches. The methods are illustrated with a number of examples, including polynomial model selection and estimating the number of modes of a density.

139 citations
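The naive bootstrap confidence measure described in the abstract (the proportion of bootstrap replications in which the feature appears) can be sketched in a few lines. This is an illustrative toy, not the paper's corrected frequentist/Bayesian confidence levels; the Gaussian kernel bandwidth and the mode-counting rule are choices made here for the example.

```python
import numpy as np

def kde(x, grid, bw):
    # simple Gaussian kernel density estimate evaluated on a grid
    z = (grid[:, None] - x[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x) * bw * np.sqrt(2 * np.pi))

def n_modes(x, grid, bw):
    # count interior local maxima of the density estimate
    d = kde(x, grid, bw)
    return int(np.sum((d[1:-1] > d[:-2]) & (d[1:-1] > d[2:])))

def bootstrap_confidence(x, feature, B=500, seed=0):
    # proportion of bootstrap data sets in which the feature appears
    rng = np.random.default_rng(seed)
    hits = sum(feature(rng.choice(x, size=len(x), replace=True)) for _ in range(B))
    return hits / B

rng = np.random.default_rng(1)
# a clearly bimodal sample: two well-separated Gaussian clusters
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])
grid = np.linspace(-8, 8, 200)
conf = bootstrap_confidence(x, lambda xb: n_modes(xb, grid, bw=0.8) >= 2)
```

For well-separated clusters like these the bootstrap confidence in bimodality is essentially 1; the paper's point is that such raw proportions can be biased, motivating the corrected levels it derives.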

Journal ArticleDOI
TL;DR: In this paper, a general procedure for posterior sampling from additive and generalized additive models is proposed, which is a stochastic generalization of the well-known backfitting algorithm for fitting additive models.
Abstract: We propose general procedures for posterior sampling from additive and generalized additive models. The procedure is a stochastic generalization of the well-known backfitting algorithm for fitting additive models. One chooses a linear operator (“smoother”) for each predictor, and the algorithm requires only the application of the operator and its square root. The procedure is general and modular, and we describe its application to nonparametric, semiparametric and mixed models.

138 citations
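The deterministic backfitting algorithm that the paper's sampler generalizes can be sketched as follows: cycle over predictors, smoothing the partial residuals against each one in turn. The crude bin-averaging smoother is a stand-in chosen here for simplicity, not a smoother from the paper.

```python
import numpy as np

def bin_smoother(x, r, n_bins=10):
    # crude linear smoother: average the partial residuals within
    # quantile bins of x (a stand-in for a spline or kernel smoother)
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    means = np.array([r[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])
    return means[idx]

def backfit(y, xs, n_iter=20):
    # classical backfitting for an additive model y = alpha + sum_j f_j(x_j)
    fs = [np.zeros_like(y) for _ in xs]
    alpha = y.mean()
    for _ in range(n_iter):
        for j, x in enumerate(xs):
            partial = y - alpha - sum(f for k, f in enumerate(fs) if k != j)
            fs[j] = bin_smoother(x, partial)
            fs[j] -= fs[j].mean()      # center each component for identifiability
    return alpha, fs

rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1, 1, 300), rng.uniform(-1, 1, 300)
y = np.sin(3 * x1) + x2**2 + rng.normal(0, 0.1, 300)
alpha, (f1, f2) = backfit(y, [x1, x2])
resid = y - alpha - f1 - f2
```

The stochastic version in the paper replaces each smoothing step with a draw from the corresponding conditional posterior, which is why only the smoothing operator and its square root are needed.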

Journal ArticleDOI
TL;DR: The expression of specific miRNAs may be useful for DLBCL survival prediction and their role in the pathogenesis of this disease should be examined further.
Abstract: Purpose: Diffuse large B-cell lymphoma (DLBCL) heterogeneity has prompted investigations for new biomarkers that can accurately predict survival. A previously reported 6-gene model combined with the International Prognostic Index (IPI) could predict patients' outcome. However, even these predictors are not capable of unambiguously identifying outcome, suggesting that additional biomarkers might improve their predictive power. Experimental Design: We studied expression of 11 microRNAs (miRNA) that had previously been reported to have variable expression in DLBCL tumors. We measured the expression of each miRNA by quantitative real-time PCR analyses in 176 samples from uniformly treated DLBCL patients and correlated the results to survival. Results: In a univariate analysis, the expression of miR-18a correlated with overall survival (OS), whereas the expression of miR-181a and miR-222 correlated with progression-free survival (PFS). A multivariate Cox regression analysis including the IPI, the 6-gene model–derived mortality predictor score and expression of the miR-18a, miR-181a, and miR-222, revealed that all variables were independent predictors of survival except the expression of miR-222 for OS and the expression of miR-18a for PFS. Conclusion: The expression of specific miRNAs may be useful for DLBCL survival prediction and their role in the pathogenesis of this disease should be examined further. Clin Cancer Res; 17(12); 4125–35. ©2011 AACR.

138 citations

Journal ArticleDOI
01 Mar 2000-Blood
TL;DR: The findings indicate that the etiology and the driving forces for clonal expansion are heterogeneous, which may explain the well-known clinical and pathologic heterogeneity of DLBCL.

137 citations

01 Jan 2010
TL;DR: It is found that for edge selection, a simple method based on univariate screening of the elements of the empirical correlation matrix usually performs as well or better than all of the more complex methods proposed here and elsewhere.
Abstract: We propose several methods for estimating edge-sparse and node-sparse graphical models based on lasso and grouped lasso penalties. We develop efficient algorithms for fitting these models when the numbers of nodes and potential edges are large. We compare them to competing methods including the graphical lasso and SPACE (Peng, Wang, Zhou & Zhu 2008). Surprisingly, we find that for edge selection, a simple method based on univariate screening of the elements of the empirical correlation matrix usually performs as well or better than all of the more complex methods proposed here and elsewhere. Running title: Applications of the lasso and grouped lasso

135 citations
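The univariate screening rule highlighted in the abstract — keep an edge (i, j) whenever the absolute empirical correlation exceeds a threshold — is simple enough to sketch directly; the threshold value and the toy data below are illustrative choices, not from the paper.

```python
import numpy as np

def screen_edges(X, threshold):
    # univariate screening: select edges whose absolute empirical
    # correlation exceeds the threshold
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(R[i, j]) > threshold}

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
# three variables: x0 and x1 share the latent factor z, x2 is independent
X = np.column_stack([z + 0.3 * rng.normal(size=n),
                     z + 0.3 * rng.normal(size=n),
                     rng.normal(size=n)])
edges = screen_edges(X, threshold=0.5)   # recovers only the (0, 1) edge
```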


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations
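A minimal sketch of the scikit-learn estimator API the abstract describes (fit, predict, score). The estimator, penalty value, and synthetic data are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# only the first two features carry signal
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.5, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = Lasso(alpha=0.1).fit(X_tr, y_tr)   # every estimator exposes fit()
score = model.score(X_te, y_te)            # R^2 on held-out data
```

The uniform fit/predict/score interface is the point: swapping `Lasso` for any other scikit-learn estimator leaves the surrounding code unchanged.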

Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations
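DESeq2 itself is an R/Bioconductor package; the shrinkage idea behind it can be illustrated with a toy normal-normal calculation (not DESeq2's actual estimator): under a zero-centered prior on the true log-fold-change, noisier per-gene estimates are pulled more strongly toward zero.

```python
import numpy as np

def shrink_lfc(lfc, se, tau=1.0):
    # posterior mean under a Normal(0, tau^2) prior and a
    # Normal(lfc, se^2) likelihood: shrinkage grows with the
    # standard error of the raw estimate
    return lfc * tau**2 / (tau**2 + se**2)

lfc = np.array([2.0, 2.0])
se = np.array([0.1, 2.0])    # well-measured gene vs. low-count, noisy gene
shrunk = shrink_lfc(lfc, se)
```

The well-measured gene keeps nearly its full fold change while the noisy one is shrunk heavily, which is the stability/interpretability gain the abstract refers to.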

Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

40,785 citations
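One standard way to compute the lasso solution (cyclic coordinate descent with soft-thresholding, which postdates the 1996 paper) can be sketched on synthetic data; the penalty value below is an illustrative choice.

```python
import numpy as np

def soft_threshold(z, t):
    # soft-thresholding operator: the one-dimensional lasso solution
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # cyclic coordinate descent for
    #   min_b (1/2n) * ||y - X b||^2 + lam * ||b||_1
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X**2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ b + X[:, j] * b[j]         # partial residual
            b[j] = soft_threshold(X[:, j] @ r_j / n, lam) / col_sq[j]
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 2 * X[:, 0] + rng.normal(0, 0.1, 100)
b = lasso_cd(X, y, lam=0.1)   # irrelevant coefficients land exactly at 0
```

The exact zeros produced by soft-thresholding are the interpretability property the abstract emphasizes.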

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations

Book
18 Nov 2016
TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; this book covers the techniques and their applications, including natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations