Author

Robert Tibshirani

Bio: Robert Tibshirani is an academic researcher from Stanford University. The author has contributed to research in topics: Lasso (statistics) & Elastic net regularization. The author has an h-index of 147 and has co-authored 593 publications receiving 326,580 citations. Previous affiliations of Robert Tibshirani include the University of Toronto and the University of California.


Papers
Journal ArticleDOI
TL;DR: Genistein produces diverse effects on gene expression that are dose-dependent and this has important implications in developing genistein as a putative prostate cancer preventive agent.
Abstract: Epidemiological evidence suggests that soy consumption is associated with a decreased risk of prostate cancer. The isoflavone genistein is found at high levels in soy and a large body of evidence suggests it is important in mediating the cancer preventive effects of soy. The mechanisms through which genistein acts in prostate cancer cells have not been fully defined. We used gene expression profiling to identify genes significantly modulated by low and high doses of genistein in LNCaP cells. Significant genes were identified using StepMiner analysis, and significantly altered pathways with Ingenuity Pathways analysis. Genistein significantly altered expression of transcripts involved in cell growth, carcinogen defenses and steroid signaling pathways. The effects of genistein on these pathways were confirmed by directly assessing dose-related effects on LNCaP cell growth, NQO-1 enzymatic activity and PSA protein expression. Genistein produces diverse effects on gene expression that are dose-dependent and this has important implications in developing genistein as a putative prostate cancer preventive agent.

11 citations

Posted Content
TL;DR: In this paper, a regularized model which adaptively pools elements of the precision matrices is proposed, which decreases the variance of our estimates without overly biasing them, and is shown to be effective on real and simulated datasets.
Abstract: Linear and quadratic discriminant analysis (LDA/QDA) are common tools for classification problems. For these methods we assume observations are normally distributed within each group. We estimate a mean and covariance matrix for each group and classify using Bayes' theorem. With LDA, we estimate a single, pooled covariance matrix, while for QDA we estimate a separate covariance matrix for each group. Rarely do we believe in a homogeneous covariance structure between groups, but often there is insufficient data to estimate covariance matrices separately. We propose L1-PDA, a regularized model which adaptively pools elements of the precision matrices. Adaptively pooling these matrices decreases the variance of our estimates (as in LDA) without overly biasing them. In this paper, we propose and discuss this method, give an efficient algorithm to fit it for moderate-sized problems, and show its efficacy on real and simulated datasets.
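The pooled-versus-separate covariance distinction the abstract describes can be illustrated with a minimal one-dimensional Gaussian classifier (a sketch of plain LDA/QDA mechanics, not the proposed L1-PDA method; all data and labels below are made up):

```python
import math

def gaussian_logpdf(x, mu, var):
    # Log density of N(mu, var) at x
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def fit(groups):
    # groups: dict mapping label -> list of 1-D observations
    params = {}
    for label, xs in groups.items():
        n = len(xs)
        mu = sum(xs) / n
        var = sum((x - mu) ** 2 for x in xs) / n
        params[label] = (mu, var, n)
    return params

def classify(x, params, pool=False):
    total = sum(n for _, _, n in params.values())
    if pool:
        # LDA-style: a single variance pooled across all groups
        pooled = sum(var * n for _, var, n in params.values()) / total
    best, best_score = None, -math.inf
    for label, (mu, var, n) in params.items():
        v = pooled if pool else var  # QDA-style keeps per-group variance
        # Bayes' theorem: log-likelihood plus log prior (group frequency)
        score = gaussian_logpdf(x, mu, v) + math.log(n / total)
        if score > best_score:
            best, best_score = label, score
    return best

# Made-up 1-D data for two groups
params = fit({"A": [1.0, 1.2, 0.8, 1.1, 0.9],
              "B": [3.0, 3.5, 2.5, 3.2, 2.8]})
```

With full covariance matrices the same trade-off appears in higher dimensions: pooling stabilizes the estimate when per-group data are scarce, at the cost of bias when the groups' covariances truly differ.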

11 citations

Journal ArticleDOI
TL;DR: It is proved that when log-transformed signal is used as the input for signal reconstruction, it will always yield an underestimation of the true signal.
Abstract: gene-expression values1. The authors mixed complementary RNA from the tissues and observed similar off-diagonal effects. They concluded that the off-diagonal effects are due to technical reasons, such as nonlinear sample amplification or probe cross-hybridization, rather than statistical deconvolution. We found that this deviation of signal reconstruction was the result of data transformation. In microarray studies, expression data are logarithm-transformed for variance stabilization or for approximation of a normal distribution2. However, we argue that in the context of expression-profile deconvolution, the log transformation will produce biased estimation. Deconvolution is modeled by a linear equation O = S × W, where O is the expression data for mixed tissue samples, S is the tissue-specific expression profile, and W is the cell-type frequency matrix. If the signal is log-transformed, the linearity will no longer be preserved. The concavity of the log function will induce a downward bias in the reconstructed signal (Fig. 1a and Supplementary Fig. 1). Mathematically, it can be shown that the deconvolution model used on log-transformed signals is log(O′) = log(S) × W, where O′ is the csSAM estimate of gene-expression profiles. As W is a frequency matrix and its column values sum to 1, the following is true by the properties of concave functions3: log(S × W) > log(S) × W. Taking these two equations together, we can conclude that log(O′) < log(S × W) = log(O). Thus, we proved that when log-transformed signal is used as the input for signal reconstruction, it will always yield an underestimation of the true signal. By taking an anti-log transformation, we obtained an unbiased reconstruction of the mixed tissue samples (Fig. 1b and Supplementary Fig. 2). The log transformation also introduced a large bias to the results of deconvolution (Fig. 1c and Supplementary Fig. 3).
A substantial portion of the genes were off diagonal in the deconvolved cell type–specific gene-expression profiles. By performing the deconvolution in linear space, we achieved a considerably more accurate result (Fig. 1d and Supplementary Fig. 3). In summary, an incorrect transformation of data can greatly bias the final results of deconvolution. In the context of gene-expression deconvolution, a linear model achieves better accuracy. Accurate deconvolution of expression profiles is important for downstream analysis, such as gene-expression analysis and pathway-enrichment analysis. We urge caution in selecting data-transformation functions and any preprocessing steps in model-based statistical analysis.
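The concavity argument is easy to check numerically. The sketch below uses a hypothetical one-gene, two-cell-type mixture (invented values, not data from the paper) to show that averaging log-signals with mixing weights underestimates the log of the true linear mixture, exactly as Jensen's inequality predicts:

```python
import math

# Hypothetical example: S holds the expression of one gene in two
# cell types; w holds the mixing fractions (a column of W, summing to 1).
S = [10.0, 100.0]
w = [0.5, 0.5]

true_mixed = sum(s * wi for s, wi in zip(S, w))             # S x W = 55
log_of_mix = math.log(true_mixed)                           # log(S x W)
mix_of_logs = sum(math.log(s) * wi for s, wi in zip(S, w))  # log(S) x W

# Concavity of log: mixing log-signals underestimates the true mixture.
assert mix_of_logs < log_of_mix

# Anti-logging the biased estimate gives the geometric mean (~31.6),
# well below the true linear mixture of 55.
biased = math.exp(mix_of_logs)
```

Here the downward bias is large (31.6 versus 55) because the two cell-type signals differ by an order of magnitude; the gap shrinks as the signals become more similar, which matches the intuition that the bias is driven by the curvature of the log over the spread of the mixed values.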

11 citations

01 Jan 2013
TL;DR: As mentioned in this paper, the field of machine learning, discussed in this volume by the author's friend Larry Wasserman, has exploded and brought with it the computational side of statistical research; statistics itself is a thriving discipline, more and more an essential part of science, business and societal activities.
Abstract: When asked to reflect on an anniversary of their field, scientists in most fields would sing the praises of their subject. As a statistician, I will do the same. However, here the praise is justified! Statistics is a thriving discipline, more and more an essential part of science, business and societal activities. Class enrollments are up — it seems that everyone wants to be a statistician — and there are jobs everywhere. The field of machine learning, discussed in this volume by my friend Larry Wasserman, has exploded and brought along with it the computational side of statistical research. Hal Varian, Chief Economist at Google, said “I keep saying that the sexy job in the next 10 years will be statisticians. And I’m not kidding.” Nate Silver, creator of the New York Times political forecasting blog “538”, was constantly in the news and on talk shows in the run-up to the 2012 US election. Using careful statistical modelling, he forecast the election with near 100% accuracy (in contrast to many others). Although his training is in economics, he (proudly?) calls himself a statistician. When meeting people at a party, the label “Statistician” used to kill one’s chances of making a new friend. But no longer! In the midst of all this excitement about the growing importance of statistics, there are fascinating developments within the field itself. Here I will discuss one that has been the focus of my research and that of many other statisticians.

11 citations


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .
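The general idea behind shrinkage estimation of fold changes can be sketched in a few lines. This is a toy posterior-mean pull toward zero under invented prior and noise variances, purely illustrative and not the DESeq2 estimator:

```python
# Toy illustration of shrinkage (NOT the DESeq2 method): pull noisy
# per-gene log2 fold-change estimates toward zero, weighting by how
# reliable each raw estimate is relative to an assumed prior spread
# of true effects. Both variances below are made-up values.
prior_var = 1.0   # assumed variance of true log2 fold changes
noise_var = 4.0   # sampling variance of a raw estimate (few replicates)

def shrink(raw_lfc):
    # Posterior-mean-style weight under normal prior/noise assumptions:
    # the noisier the raw estimate, the harder it is pulled toward 0.
    w = prior_var / (prior_var + noise_var)
    return w * raw_lfc

raw = [0.1, -0.3, 6.0]   # third gene: large but unreliable raw estimate
shrunk = [shrink(x) for x in raw]
```

The payoff is the one the abstract describes: extreme estimates from poorly measured genes are moderated, so downstream ranking reflects the strength of evidence rather than sampling noise. DESeq2 does this with empirical-Bayes machinery fitted to the data rather than fixed variances.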

47,038 citations

Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
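The "exactly 0" behaviour of the constraint can be seen with a minimal coordinate-descent sketch (a standard way to fit the lasso, shown here on made-up data; this is not code from the paper). With a response that depends only on the first feature, the second coefficient is set to exactly zero rather than merely shrunk:

```python
def soft_threshold(a, t):
    # The key lasso operation: shrink toward 0, and clip to exactly 0
    # whenever |a| <= t. This is what produces exact zeros.
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    # Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual leaving out feature j
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / z
    return b

# Made-up data: y = 3 * x1 exactly; x2 is an irrelevant feature.
X = [[1, 5], [2, 1], [3, 4], [4, 2], [5, 6], [6, 3]]
y = [3, 6, 9, 12, 15, 18]
b = lasso_cd(X, y, lam=0.5)
# b[0] is shrunk slightly below 3; b[1] is exactly 0.
```

This is the interpretability property the abstract highlights: unlike ridge regression, which would leave both coefficients small but nonzero, the lasso's constraint zeroes out the irrelevant feature entirely, performing variable selection and estimation in one step.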

40,785 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations

Book
18 Nov 2016
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations