Showing papers by "Robert Tibshirani" published in 2010

Open Access

Journal Article•DOI•

Regularization Paths for Generalized Linear Models via Coordinate Descent

Jerome H. Friedman¹, Trevor Hastie¹, Robert Tibshirani•Institutions (1)

02 Feb 2010-Journal of Statistical Software

TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods; they can handle large problems and deal efficiently with sparse features.

Abstract: We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include ℓ1 (the lasso), ℓ2 (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.

13,656 citations
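
The abstract above describes cyclical coordinate descent computed along a regularization path. A minimal sketch of that idea for the lasso-penalized linear model only is given here, assuming the objective (1/(2n))·||y − Xβ||² + λ·||β||₁ with no intercept and a dense NumPy design matrix. The function names `soft_threshold`, `lasso_cd`, and `lasso_path` are hypothetical and are not the authors' software; logistic and multinomial models, the elastic net mixing parameter, and sparse-feature handling are all omitted.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, beta=None, n_sweeps=200, tol=1e-8):
    """Cyclical coordinate descent for (1/(2n))||y - X b||^2 + lam * ||b||_1.

    `beta` is an optional warm start; no intercept is fitted (assumption).
    """
    n, p = X.shape
    beta = np.zeros(p) if beta is None else np.array(beta, dtype=float)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * x_j' x_j for each column
    resid = y - X @ beta                   # full residual, kept up to date
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # an all-zero column never enters
            old = beta[j]
            # (1/n) * x_j' (partial residual with coordinate j removed)
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta

def lasso_path(X, y, n_lambda=20, ratio=1e-2):
    """Solve on a decreasing lambda grid, warm-starting each fit from the last."""
    n = X.shape[0]
    lam_max = np.max(np.abs(X.T @ y)) / n  # smallest lambda with all-zero solution
    lams = lam_max * np.logspace(0, np.log10(ratio), n_lambda)
    beta = np.zeros(X.shape[1])
    path = []
    for lam in lams:
        beta = lasso_cd(X, y, lam, beta=beta)
        path.append(beta.copy())
    return lams, np.array(path)
```

Starting the grid at the smallest λ that gives an all-zero solution and reusing each solution as the next starting point is what makes the whole path cheap in this sketch; the grid length and the ratio between the largest and smallest λ are arbitrary illustrative choices.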

Journal Article•

Spectral Regularization Algorithms for Learning Large Incomplete Matrices

Rahul Mazumder¹, Trevor Hastie¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Mar 2010-Journal of Machine Learning Research

TL;DR: Using the nuclear norm as a regularizer, the algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD, yielding a sequence of regularized low-rank solutions for large-scale matrix completion problems.

Abstract: We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm SOFT-IMPUTE iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity of order linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices; for example SOFT-IMPUTE takes a few hours to compute low-rank approximations of a 10⁶ × 10⁶ incomplete matrix with 10⁷ observed entries, and fits a rank-95 approximation to the full Netflix training set in 3.3 hours. Our methods achieve good training and test errors and exhibit superior timings when compared to other competitive state-of-the-art techniques.

1,195 citations
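
The iteration described in the abstract above (fill the missing entries from the current estimate, take an SVD, soft-threshold the singular values, repeat) can be sketched in a few lines. This is an illustrative small-matrix version only: it uses a full dense SVD from NumPy rather than the low-rank SVD and problem structure that the abstract credits for scalability. The names `soft_impute` and `soft_impute_path`, the `observed` boolean mask, and the stopping rule are assumptions made for this sketch.

```python
import numpy as np

def soft_impute(X, observed, lam, Z0=None, n_iters=500, tol=1e-7):
    """Iterate Z <- soft-thresholded SVD of (observed entries of X, else Z).

    X        : 2-D array; values at unobserved positions are ignored (may be NaN)
    observed : boolean array, True where X is observed
    lam      : amount subtracted from each singular value
    Z0       : optional warm start (for example the solution at a larger lam)
    """
    X = np.asarray(X, dtype=float)
    Z = np.zeros_like(X) if Z0 is None else np.array(Z0, dtype=float)
    for _ in range(n_iters):
        filled = np.where(observed, X, Z)            # impute from current estimate
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt  # shrink singular values
        change = np.linalg.norm(Z_new - Z)
        Z = Z_new
        if change <= tol * max(np.linalg.norm(Z), 1.0):
            break
    return Z

def soft_impute_path(X, observed, lams):
    """Solve for each lam from largest to smallest, warm-starting each fit."""
    Z, out = None, []
    for lam in sorted(lams, reverse=True):
        Z = soft_impute(X, observed, lam, Z0=Z)
        out.append((lam, Z))
    return out
```

When every entry is observed, the loop reduces to a single soft-thresholded SVD of `X`, which is a convenient sanity check for the sketch.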

Posted Content•

A note on the group lasso and a sparse group lasso

Jerome H. Friedman, Trevor Hastie, Robert Tibshirani

05 Jan 2010-arXiv: Statistics Theory

TL;DR: An efficient algorithm is derived for the resulting convex problem based on coordinate descent that can be used to solve the general form of the group lasso, with non-orthonormal model matrices.

Abstract: We consider the group lasso penalty for the linear model. We note that the standard algorithm for solving the problem assumes that the model matrices in each group are orthonormal. Here we consider a more general penalty that blends the lasso (L1) with the group lasso ("two-norm"). This penalty yields solutions that are sparse at both the group and individual feature levels. We derive an efficient algorithm for the resulting convex problem based on coordinate descent. This algorithm can also be used to solve the general form of the group lasso, with non-orthonormal model matrices.

800 citations

Journal Article•DOI•

A framework for feature selection in clustering

Daniela Witten¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Jun 2010-Journal of the American Statistical Association

TL;DR: A novel framework for sparse clustering is proposed, in which one clusters the observations using an adaptively chosen subset of the features, selected with a lasso-type penalty.

Abstract: We consider the problem of clustering observations using a potentially large set of features. One might expect that the true underlying clusters present in the data differ only with respect to a small fraction of the features, and will be missed if one clusters the observations using the full set of features. We propose a novel framework for sparse clustering, in which one clusters the observations using an adaptively chosen subset of the features. The method uses a lasso-type penalty to select the features. We use this framework to develop simple methods for sparse K-means and sparse hierarchical clustering. A single criterion governs both the selection of the features and the resulting clusters. These approaches are demonstrated on simulated and genomic data.

643 citations

Journal Article•DOI•

Cell type–specific gene expression differences in complex tissues

Shai S. Shen-Orr¹, Robert Tibshirani¹, Purvesh Khatri¹, Dale L. Bodian¹,², Frank Staedtler², Nicholas Perry¹, Trevor Hastie¹, Minnie M. Sarwal¹, Mark M. Davis¹,³, Atul J. Butte¹•Institutions (3)

Stanford University¹, Novartis², Howard Hughes Medical Institute³

01 Apr 2010-Nature Methods

TL;DR: This work validated csSAM with predesigned mixtures and applied it to whole-blood gene expression datasets from stable post-transplant kidney transplant recipients and those experiencing acute transplant rejection, which revealed hundreds of differentially expressed genes that were otherwise undetectable.

Abstract: We describe cell type-specific significance analysis of microarrays (csSAM) for analyzing differential gene expression for each cell type in a biological sample from microarray data and relative cell-type frequencies. First, we validated csSAM with predesigned mixtures and then applied it to whole-blood gene expression datasets from stable post-transplant kidney transplant recipients and those experiencing acute transplant rejection, which revealed hundreds of differentially expressed genes that were otherwise undetectable.

499 citations

Journal Article•DOI•

In Situ Vaccination With a TLR9 Agonist Induces Systemic Lymphoma Regression: A Phase I/II Study

Joshua Brody¹, Weiyun Z. Ai, Debra K. Czerwinski, James A. Torchia, Mia A. Levy, Ranjana H. Advani, Youn H. Kim, Richard T. Hoppe, Susan J. Knox, Lewis K. Shin, Irene Wapnir, Robert Tibshirani, Ronald Levy•Institutions (1)

Stanford University¹

01 Oct 2010-Journal of Clinical Oncology

TL;DR: In situ tumor vaccination with a TLR9 agonist induces systemic antilymphoma clinical responses, is clinically feasible, and does not require the production of a customized vaccine product.

Abstract: Purpose: Combining tumor antigens with an immunostimulant can induce the immune system to specifically eliminate cancer cells. Generally, this combination is accomplished in an ex vivo, customized manner. In a preclinical lymphoma model, intratumoral injection of a Toll-like receptor 9 (TLR9) agonist induced systemic antitumor immunity and cured large, disseminated tumors. Patients and Methods: We treated 15 patients with low-grade B-cell lymphoma using low-dose radiotherapy to a single tumor site and, at that same site, injected the C-G enriched, synthetic oligodeoxynucleotide (also referred to as CpG) TLR9 agonist PF-3512676. Clinical responses were assessed at distant, untreated tumor sites. Immune responses were evaluated by measuring T-cell activation after in vitro restimulation with autologous tumor cells. Results: This in situ vaccination maneuver was well-tolerated with only grade 1 to 2 local or systemic reactions and no treatment-limiting adverse events. One patient had a complete clinical response,...

443 citations

Journal Article•DOI•

Prediction of survival in diffuse large B-cell lymphoma based on the expression of 2 genes reflecting tumor and microenvironment.

Ash A. Alizadeh, Andrew J. Gentles¹, Alvaro J. Alencar², Chih Long Liu, Holbrook E Kohrt, Roch Houot³, Matthew J. Goldstein, Shuchun Zhao¹, Yasodha Natkunam¹, Ranjana H. Advani, Randy D. Gascoyne, Javier Briones⁴, Robert Tibshirani¹, June Helen Myklebust⁵, Sylvia K. Plevritis¹, Izidore S. Lossos², Ronald Levy•Institutions (5)

Stanford University¹, University of Miami², French Institute of Health and Medical Research³, Autonomous University of Barcelona⁴, Oslo University Hospital⁵

19 Nov 2010-Blood

TL;DR: It is concluded that the measurement of a single gene expressed by tumor cells (LMO2) and a single gene expressed by the immune microenvironment (TNFRSF9) powerfully predicts overall survival in patients with DLBCL.

183 citations

Journal Article•DOI•

Ultra-high throughput sequencing-based small RNA discovery and discrete statistical biomarker analysis in a collection of cervical tumours and matched controls

Daniela Witten¹, Robert Tibshirani¹, Sam Guoping Gu¹, Andrew Fire¹, Weng-Onn Lui¹,²•Institutions (2)

Stanford University¹, Karolinska University Hospital²

11 May 2010-BMC Biology

TL;DR: A novel application of a log linear model has been described that resulted in the identification of 67 miRNAs that were differentially-expressed between the tumour and normal samples at a false discovery rate less than 0.001.

Abstract: Ultra-high throughput sequencing technologies provide opportunities both for discovery of novel molecular species and for detailed comparisons of gene expression patterns. Small RNA populations are particularly well suited to this analysis, as many different small RNAs can be completely sequenced in a single instrument run. We prepared small RNA libraries from 29 tumour/normal pairs of human cervical tissue samples. Analysis of the resulting sequences (42 million in total) defined 64 new human microRNA (miRNA) genes. Both arms of the hairpin precursor were observed in twenty-three of the newly identified miRNA candidates. We tested several computational approaches for the analysis of class differences between high throughput sequencing datasets and describe a novel application of a log linear model that has provided the most effective analysis for this data. This method resulted in the identification of 67 miRNAs that were differentially-expressed between the tumour and normal samples at a false discovery rate less than 0.001. This approach can potentially be applied to any kind of RNA sequencing data for analysing differential sequence representation between biological sample sets.

174 citations

Journal Article•DOI•

Survival analysis with high-dimensional covariates.

Daniela Witten¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Feb 2010-Statistical Methods in Medical Research

TL;DR: A number of methods from the literature are reviewed that address the problems of identifying features that are associated with survival and developing a multivariate model for the relationship between the features and survival that can be used to predict survival in a new observation.

Abstract: In recent years, breakthroughs in biomedical technology have led to a wealth of data in which the number of features (for instance, genes on which expression measurements are available) exceeds the number of observations (e.g., patients). Sometimes survival outcomes are also available for those same observations. In this case, one might be interested in (a) identifying features that are associated with survival (in a univariate sense), and (b) developing a multivariate model for the relationship between the features and survival that can be used to predict survival in a new observation. Due to the high dimensionality of this data, most classical statistical methods for survival analysis cannot be applied directly. Here, we review a number of methods from the literature that address these two problems.

158 citations

Applications of the lasso and grouped lasso to the estimation of sparse graphical models

Jerome H. Friedman, Trevor Hastie, Robert Tibshirani

01 Jan 2010

TL;DR: It is found that for edge selection, a simple method based on univariate screening of the elements of the empirical correlation matrix usually performs as well or better than all of the more complex methods proposed here and elsewhere.

Abstract: We propose several methods for estimating edge-sparse and node-sparse graphical models based on lasso and grouped lasso penalties. We develop efficient algorithms for fitting these models when the numbers of nodes and potential edges are large. We compare them to competing methods including the graphical lasso and SPACE (Peng, Wang, Zhou & Zhu 2008). Surprisingly, we find that for edge selection, a simple method based on univariate screening of the elements of the empirical correlation matrix usually performs as well or better than all of the more complex methods proposed here and elsewhere.

135 citations

Journal Article•DOI•

Discovery of molecular subtypes in leiomyosarcoma through integrative molecular profiling

Andrew H. Beck¹, Cheng-Han Lee², Daniela Witten¹, Briana C. Gleason³, Badreddin Edris¹, Inigo Espinosa¹, Shirley Zhu¹, Ruilin Li¹, Kelli Montgomery¹, Robert J. Marinelli¹, Robert Tibshirani¹, Trevor Hastie¹, David M. Jablons⁴, Brian P. Rubin⁵, Christopher D.M. Fletcher³, Robert B. West¹,⁶, M van de Rijn¹•Institutions (6)

Stanford University¹, University of British Columbia², Harvard University³, University of California, San Francisco⁴, Cleveland Clinic Lerner College of Medicine⁵, Veterans Health Administration⁶

11 Feb 2010-Oncogene

TL;DR: This analysis, which combined gene expression profiling, aCGH and IHC, characterized distinct molecular LMS subtypes, provided insight into their pathogenesis, and identified prognostic biomarkers.

Abstract: Leiomyosarcoma (LMS) is a soft tissue tumor with a significant degree of morphologic and molecular heterogeneity. We used integrative molecular profiling to discover and characterize molecular subtypes of LMS. Gene expression profiling was performed on 51 LMS samples. Unsupervised clustering showed three reproducible LMS clusters. Array comparative genomic hybridization (aCGH) was performed on 20 LMS samples and showed that the molecular subtypes defined by gene expression showed distinct genomic changes. Tumors from the 'muscle-enriched' cluster showed significantly increased copy number changes (P=0.04). A majority of the muscle-enriched cases showed loss at 16q24, which contains Fanconi anemia, complementation group A, known to have an important role in DNA repair, and loss at 1p36, which contains PRDM16, of which loss promotes muscle differentiation. Immunohistochemistry (IHC) was performed on LMS tissue microarrays (n=377) for five markers with high levels of messenger RNA in the muscle-enriched cluster (ACTG2, CASQ2, SLMAP, CFL2 and MYLK) and showed significantly correlated expression of the five proteins (all pairwise P<0.005). Expression of the five markers was associated with improved disease-specific survival in a multivariate Cox regression analysis (P<0.04). In this analysis that combined gene expression profiling, aCGH and IHC, we characterized distinct molecular LMS subtypes, provided insight into their pathogenesis, and identified prognostic biomarkers.

Journal Article•DOI•

Transposable regularized covariance models with an application to missing data imputation

Genevera I. Allen¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Jun 2010-The Annals of Applied Statistics

TL;DR: In this paper, a transposable regularized covariance model is proposed to estimate the mean and non-singular covariance matrices of high-dimensional data in the form of a matrix, where rows and columns each have a separate mean vector and covariance matrix.

Abstract: Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so called transposable regularized covariance models allow for maximum likelihood estimation of the mean and non-singular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.

Journal Article•DOI•

3'-end sequencing for expression quantification (3SEQ) from archival tumor samples.

Andrew H. Beck¹, Ziming Weng¹, Daniela Witten¹, Shirley Zhu¹, Joseph W. Foley¹, Phil Lacroute¹, Cheryl L. Smith¹, Robert Tibshirani¹, Matt van de Rijn¹, Arend Sidow¹, Robert B. West¹,²•Institutions (2)

Stanford University¹, Veterans Health Administration²

19 Jan 2010-PLOS ONE

TL;DR: It is demonstrated that 3SEQ is an effective technique for gene expression profiling from archival tumor samples and may facilitate significant advances in translational cancer research.

Abstract: Gene expression microarrays are the most widely used technique for genome-wide expression profiling. However, microarrays do not perform well on formalin fixed paraffin embedded tissue (FFPET). Consequently, microarrays cannot be effectively utilized to perform gene expression profiling on the vast majority of archival tumor samples. To address this limitation of gene expression microarrays, we designed a novel procedure (3'-end sequencing for expression quantification (3SEQ)) for gene expression profiling from FFPET using next-generation sequencing. We performed gene expression profiling by 3SEQ and microarray on both frozen tissue and FFPET from two soft tissue tumors (desmoid type fibromatosis (DTF) and solitary fibrous tumor (SFT)) (total n = 23 samples, which were each profiled by at least one of the four platform-tissue preparation combinations). Analysis of 3SEQ data revealed many genes differentially expressed between the tumor types (FDR<0.01) on both the frozen tissue (approximately 9.6K genes) and FFPET (approximately 8.1K genes). Analysis of microarray data from frozen tissue revealed fewer differentially expressed genes (approximately 4.64K), and analysis of microarray data on FFPET revealed very few (69) differentially expressed genes. Functional gene set analysis of 3SEQ data from both frozen tissue and FFPET identified biological pathways known to be important in DTF and SFT pathogenesis and suggested several additional candidate oncogenic pathways in these tumors. These findings demonstrate that 3SEQ is an effective technique for gene expression profiling from archival tumor samples and may facilitate significant advances in translational cancer research.

Journal Article•DOI•

Lymphoma cell VEGFR2 expression detected by immunohistochemistry predicts poor overall survival in diffuse large B cell lymphoma treated with immunochemotherapy (R‐CHOP)

Dita Gratzinger¹, Ranjana H. Advani, Shuchun Zhao, Neha Talreja, Robert Tibshirani¹, Ragini Shyam¹, Sandra J. Horning, Laurie H. Sehn², Pedro Farinha², Javier Briones³, Izidore S. Lossos⁴, Randy D. Gascoyne², Yasodha Natkunam•Institutions (4)

Stanford University¹, University of British Columbia², Autonomous University of Barcelona³, University of Miami⁴

01 Jan 2010-British Journal of Haematology

TL;DR: It is postulated that VEGFR1 may oppose autocrine VEGFR2 signalling in DLBCL by competing for VEGF binding; this is concordant with the prior finding of an association of VEGFR1 with longer OS in DLBCL treated with chemotherapy alone.

Abstract: Diffuse large B cell lymphoma (DLBCL) is clinically and biologically heterogeneous. In most cases of DLBCL, lymphoma cells co-express vascular endothelial growth factor (VEGF) and its receptors VEGFR1 and VEGFR2, suggesting autocrine in addition to angiogenic effects. We enumerated microvessel density and scored lymphoma cell expression of VEGF, VEGFR1, VEGFR2 and phosphorylated VEGFR2 in 162 de novo DLBCL patients treated with R-CHOP (rituximab, cyclophosphamide, vincristine, doxorubicin and prednisone)-like regimens. VEGFR2 expression correlated with shorter overall survival (OS) independent of International Prognostic Index (IPI) (P = 0.0028). Phosphorylated VEGFR2 (detected in 13% of cases) correlated with shorter progression-free survival (PFS, P = 0.044) and trended toward shorter OS on univariate analysis. VEGFR1 was not predictive of survival on univariate analysis, but it did correlate with better OS on multivariate analysis with VEGF, VEGFR2 and IPI (P = 0.036); in patients with weak VEGFR2, lack of VEGFR1 coexpression was significantly correlated with poor OS independent of IPI (P = 0.01). These results are concordant with our prior finding of an association of VEGFR1 with longer OS in DLBCL treated with chemotherapy alone. We postulate that VEGFR1 may oppose autocrine VEGFR2 signalling in DLBCL by competing for VEGF binding. In contrast to our prior results with chemotherapy alone, microvessel density was not prognostic of PFS or OS with R-CHOP-like therapy.

Journal Article•DOI•

CD81 Protein is Expressed at High Levels in Normal Germinal Center B cells and in Subtypes of Human Lymphomas

Robert F. Luo¹, Shuchun Zhao¹, Robert Tibshirani¹, June Helen Myklebust¹, Mrinmoy Sanyal¹, Rosemary Fernandez¹, Dita Gratzinger¹, Robert J. Marinelli¹, Zhi Shun Lu¹, Anna K. Wong¹, Ronald Levy¹, Shoshana Levy¹, Yasodha Natkunam¹•Institutions (1)

Stanford University¹

01 Feb 2010-Human Pathology

TL;DR: High-dimensional flow cytometry analysis of normal hematopoietic tissue confirmed that among B- and T-cell subsets, germinal center B cells showed the highest level of CD81 expression and its role in the risk stratification of patients with diffuse large B-cell lymphoma.

Journal Article•DOI•

DR-Integrator

Keyan Salari¹, Robert Tibshirani¹, Jonathan R. Pollack¹•Institutions (1)

Stanford University¹

01 Feb 2010-Bioinformatics

TL;DR: DNA/RNA-Integrator is introduced, a statistical software tool that performs integrative analyses on paired DNA copy number and gene expression data and implements a supervised analysis that captures genes with significant alterations in both DNA copy number and gene expression between two sample classes.

Abstract: Summary: DNA copy number alterations (CNA) frequently underlie gene expression changes by increasing or decreasing gene dosage. However, only a subset of genes with altered dosage exhibit concordant changes in gene expression. This subset is likely to be enriched for oncogenes and tumor suppressor genes, and can be identified by integrating these two layers of genome-scale data. We introduce DNA/RNA-Integrator (DR-Integrator), a statistical software tool to perform integrative analyses on paired DNA copy number and gene expression data. DR-Integrator identifies genes with significant correlations between DNA copy number and gene expression, and implements a supervised analysis that captures genes with significant alterations in both DNA copy number and gene expression between two sample classes. Availability: DR-Integrator is freely available for non-commercial use from the Pollack Lab at http://pollacklab.stanford.edu/ and can be downloaded as a plug-in application to Microsoft Excel and as a package for the R statistical computing environment. The R package is available under the name ‘DRI’ at http://cran.r-project.org/. An example analysis using DR-Integrator is included as supplemental material. Contact: ksalari@stanford.edu; pollack1@stanford.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

Posted Content•

Strong rules for discarding predictors in lasso-type problems

Robert Tibshirani¹, Jacob Bien¹, Jerome H. Friedman¹, Trevor Hastie¹, Noah Simon¹, Jonathan Taylor¹, Ryan J. Tibshirani¹•Institutions (1)

Stanford University¹

09 Nov 2010-arXiv: Statistics Theory

TL;DR: In this paper, the authors propose strong rules for discarding predictors in lasso regression and related problems, for computational efficiency, complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions.

Abstract: We consider rules for discarding predictors in lasso regression and related problems, for computational efficiency. El Ghaoui et al (2010) propose "SAFE" rules that guarantee that a coefficient will be zero in the solution, based on the inner products of each predictor with the outcome. In this paper we propose strong rules that are not foolproof but rarely fail in practice. These can be complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions to provide safe rules that offer substantial speed and space savings in a variety of statistical convex optimization problems.
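
The abstract does not state the screening threshold itself. The sketch below uses the basic strong rule as it is commonly stated for the lasso objective ½·||y − Xβ||² + λ·||β||₁, namely discard predictor j when |xⱼᵀy| < 2λ − λ_max with λ_max = maxⱼ |xⱼᵀy|; treat that formula as an assumption to be checked against the paper. The function names `basic_strong_rule` and `kkt_violations` are hypothetical. The second function illustrates the KKT check mentioned in the abstract: a discarded predictor whose gradient exceeds λ at the fitted solution was discarded wrongly and has to be added back.

```python
import numpy as np

def basic_strong_rule(X, y, lam):
    """Return a boolean mask of predictors KEPT by the basic strong rule.

    Assumed rule: discard j when |x_j' y| < 2 * lam - lam_max,
    for the objective 0.5 * ||y - X b||^2 + lam * ||b||_1.
    """
    score = np.abs(X.T @ y)
    lam_max = score.max()
    return score >= 2.0 * lam - lam_max

def kkt_violations(X, y, beta, lam, discarded, tol=1e-9):
    """Flag discarded predictors that violate the KKT condition |x_j' r| <= lam.

    `beta` is the solution fitted on the kept predictors (zeros elsewhere).
    """
    resid = y - X @ beta
    grad = np.abs(X.T @ resid)
    return discarded & (grad > lam + tol)
```

In use, one would fit the lasso on the kept columns only, call `kkt_violations` on the result, and refit with any flagged predictors restored until no violations remain.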

Journal Article•DOI•

C-C chemokine receptor 1 expression in human hematolymphoid neoplasia.

Matthew W. Anderson¹, Shuchun Zhao¹, Weiyun Z. Ai², Robert Tibshirani¹, Ronald Levy¹, Izidore S. Lossos³, Yasodha Natkunam¹•Institutions (3)

Stanford University¹, University of California, San Francisco², University of Miami³

01 Mar 2010-American Journal of Clinical Pathology

TL;DR: Immunohistochemical analysis of 944 cases of hematolymphoid neoplasia identified CCR1 expression in a subset of B- and T-cell lymphomas, plasma cell myeloma, acute myeloid leukemia, and classical Hodgkin lymphoma, and suggested that CCR1 may be useful for lymphoma classification and support a role for chemokine signaling in the pathogenesis of hematolymphoid neoplasia.

Abstract: Chemokine receptor 1 (CCR1) is a G protein–coupled receptor that binds to members of the C-C chemokine family. Recently, CCL3 (MIP-1α), a high-affinity CCR1 ligand, was identified as part of a model that independently predicts survival in patients with diffuse large B-cell lymphoma (DLBCL). However, the role of chemokine signaling in the pathogenesis of human lymphomas is unclear. In normal human hematopoietic tissues, we found CCR1 expression in intraepithelial B cells of human tonsil and granulocytic/monocytic cells in the bone marrow. Immunohistochemical analysis of 944 cases of hematolymphoid neoplasia identified CCR1 expression in a subset of B- and T-cell lymphomas, plasma cell myeloma, acute myeloid leukemia, and classical Hodgkin lymphoma. CCR1 expression correlated with the non–germinal center subtype of DLBCL but did not predict overall survival in follicular lymphoma. These data suggest that CCR1 may be useful for lymphoma classification and support a role for chemokine signaling in the pathogenesis of hematolymphoid neoplasia.

Journal Article•DOI•

Predicting Patient Survival from Longitudinal Gene Expression

Yuping Zhang¹, Robert Tibshirani, Ronald W. Davis•Institutions (1)

Stanford University¹

01 Jan 2010-Statistical Applications in Genetics and Molecular Biology

TL;DR: A novel prediction approach for patient survival time that makes use of time course structure of gene expression and is consistently better than prediction methods using individual time point gene expression or simply pooling gene expression from each time point.

Abstract: Characterizing dynamic gene expression pattern and predicting patient outcome is now significant and will be of more interest in the future with large scale clinical investigation of microarrays. However, there is currently no method that has been developed for prediction of patient outcome using longitudinal gene expression, where gene expression of patients is being monitored across time. Here, we propose a novel prediction approach for patient survival time that makes use of time course structure of gene expression. This method is applied to a burn study. The genes involved in the final predictors are enriched in the inflammatory response and immune system related pathways. Moreover, our method is consistently better than prediction methods using individual time point gene expression or simply pooling gene expression from each time point.

Journal Article•DOI•

Gene Expression Changes Induced by Genistein in the Prostate Cancer Cell Line LNCaP

Suvarna Bhamre, Debashis Sahoo, Robert Tibshirani, David L. Dill, James D. Brooks¹•Institutions (1)

Stanford University¹

23 Oct 2010-The Open Prostate Cancer Journal

TL;DR: Genistein produces diverse effects on gene expression that are dose-dependent and this has important implications in developing genistein as a putative prostate cancer preventive agent.

Abstract: Epidemiological evidence suggests that soy consumption is associated with a decreased risk of prostate cancer. The isoflavone genistein is found at high levels in soy and a large body of evidence suggests it is important in mediating the cancer preventive effects of soy. The mechanisms through which genistein acts in prostate cancer cells have not been fully defined. We used gene expression profiling to identify genes significantly modulated by low and high doses of genistein in LNCaP cells. Significant genes were identified using StepMiner analysis and significantly altered pathways with Ingenuity Pathways analysis. Genistein significantly altered expression of transcripts involved in cell growth, carcinogen defenses and steroid signaling pathways. The effects of genistein on these pathways were confirmed by directly assessing dose-related effects on LNCaP cell growth, NQO-1 enzymatic activity and PSA protein expression. Genistein produces diverse effects on gene expression that are dose-dependent and this has important implications in developing genistein as a putative prostate cancer preventive agent.

Posted Content•

Inference with Transposable Data: Modeling the Effects of Row and Column Correlations

Genevera I. Allen, Robert Tibshirani

01 Apr 2010-arXiv: Methodology

TL;DR: In this article, the effect of both row and column correlations on commonly used test-statistics, null distributions, and multiple testing procedures, by explicitly modeling the covariances with the matrix-variate normal distribution, is investigated.

Abstract: We consider the problem of large-scale inference on the row or column variables of data in the form of a matrix. Often this data is transposable, meaning that both the row variables and column variables are of potential interest. An example of this scenario is detecting significant genes in microarrays when the samples or arrays may be dependent due to underlying relationships. We study the effect of both row and column correlations on commonly used test-statistics, null distributions, and multiple testing procedures, by explicitly modeling the covariances with the matrix-variate normal distribution. Using this model, we give both theoretical and simulation results revealing the problems associated with using standard statistical methodology on transposable data. We solve these problems by estimating the row and column covariances simultaneously, with transposable regularized covariance models, and de-correlating or sphering the data as a pre-processing step. Under reasonable assumptions, our method gives test statistics that follow the scaled theoretical null distribution and are approximately independent. Simulations based on various models with structured and observed covariances from real microarray data reveal that our method offers substantial improvements in two areas: 1) increased statistical power and 2) correct estimation of false discovery rates.
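
The de-correlating (sphering) step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the row and column covariances are already known, whereas the paper estimates them with transposable regularized covariance models. The helper names `inv_sqrt` and `sphere` and the simulated data are hypothetical.

```python
import numpy as np

def inv_sqrt(cov):
    """Symmetric inverse square root of a positive-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    # Scale each eigenvector column by 1/sqrt(eigenvalue), then rotate back.
    return (vecs / np.sqrt(vals)) @ vecs.T

def sphere(X, row_cov, col_cov):
    """De-correlate rows and columns of X given (assumed known) covariances."""
    return inv_sqrt(row_cov) @ X @ inv_sqrt(col_cov)

# Simulated example: build two positive-definite covariances (assumptions).
rng = np.random.default_rng(0)
n, p = 4, 3
A = rng.normal(size=(n, n))
row_cov = A @ A.T + n * np.eye(n)
B = rng.normal(size=(p, p))
col_cov = B @ B.T + p * np.eye(p)

# Draw X from a mean-zero matrix-variate normal with these covariances.
Z = rng.normal(size=(n, p))
X = np.linalg.cholesky(row_cov) @ Z @ np.linalg.cholesky(col_cov).T

# After sphering, the entries are uncorrelated with unit variance.
X_sphered = sphere(X, row_cov, col_cov)
```

With the true covariances plugged in, the sphered entries are independent standard normals, which is why row-wise or column-wise test statistics computed afterwards can be compared against the usual null distribution.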

Journal Article•DOI•

Extracting Cell-type-specific Gene Expression Differences from Complex Tissues

Shai S. Shen-Orr¹, Alexander Gaidarski¹, Robert Tibshirani¹, Purvesh Khatri¹, Mark M. Davis¹, Atul J. Butte¹•Institutions (1)

Stanford University¹

01 Jan 2010-Clinical Immunology

Journal Article•DOI•

Road crashes and the next U.S. presidential election

Donald A. Redelmeier, Robert Tibshirani

01 Jun 2010-Chance

TL;DR: The United States contains some of the world’s most dangerous roads, and the shortfall in U.S. road safety is a new issue, since American roads were considered the safest in the world 50 years ago.

Abstract: (2010). Road Crashes and the Next U.S. Presidential Election. CHANCE: Vol. 23, Election Issues, pp. 20-24.

Journal Article•

Extracting cell-type-specific gene expression differences from complex tissues

Shai S. Shen-Orr, Robert Tibshirani, Purvesh Khatri, Alexander Gaidarski, Dale L. Bodian, Frank Staedtler, Nicholas Perry, Trevor Hastie, Minnie M. Sarwal, Mark M. Davis, Atul J. Butte

01 Apr 2010-Journal of Immunology

Journal Article•DOI•

Reply to D.R. Catchpoole et al

Branimir I. Sikic¹, Robert Tibshirani¹, Norman J. Lacayo¹•Institutions (1)

Stanford University¹

01 Nov 2010-Journal of Clinical Oncology

TL;DR: The development of truly personalized therapies will require ascertaining the key differences among individuals as well as similarities between cohorts within a disease type, which is a major challenge in clinical trials designs.

Abstract: We substantially agree with the comments of Catchpoole et al in response to our editorial. The idea that analyses of multidimensional, highly complex datasets of cancer and host factors will lead to more precise and successful individualized therapies is immensely appealing. We agree that building the algorithms for truly personalized medicine indeed will require paradigm shifts in clinical and translational research. The path to tailored therapies in individuals will be paved in large part by a deeper understanding of the complexity and heterogeneity of cancers. Genomics has contributed much to this understanding, as exemplified by the reclassification of breast cancers into many distinct subtypes based on gene expression profiles. Our endorsement of the so-called virtue of complexity is an appeal to strengthen the scientific rigor of genomic studies in cancer. Increasing the number of patients and samples is necessary but far from sufficient. Other important factors in conducting and reporting such studies include patient stratification, integration of various high-throughput technologies, novel trial designs and bioinformatics approaches, integration of emerging concepts in cancer biology, and transparent and complete presentation of statistical analyses. Despite the limitations of reductionism, genomic signatures derived from such approaches have proved useful in defining risks for relapse and appropriate patients for adjuvant therapies in early-stage breast cancers. Moreover, the development of predictive therapeutic biomarkers based on the understanding of molecular pathways, networks, and drug mechanisms has much to contribute to the personalization of cancer therapies, as illustrated by the importance of RAS mutation status in colorectal cancers treated with epidermal growth factor receptor–targeted antibodies.
Ultimately, however, we agree that the development of truly personalized therapies will require ascertaining the key differences among individuals as well as similarities between cohorts within a disease type. Proof that tailoring makes a difference, particularly in a highly curable disease like pediatric acute lymphoblastic leukemia, is itself a major challenge in clinical trials designs.

Journal Article•DOI•

In Situ Vaccination with TLR9 Agonist Combined with Local Radiation In Mycosis Fungoides: Analysis of Phase I/II Study

Youn H. Kim¹, Dita Gratzinger¹, Cameron Harrison¹, Joshua Brody¹, Debra K. Czerwinski¹, Leon Xing¹, Anjali V. Morales¹, Weiyun Z. Ai², Farah Abdulla¹, Daniel Navi¹, Robert Tibshirani¹, Ranjana H. Advani¹, Yasodha Natkunam¹, Richard T. Hoppe¹, Ronald Levy¹•Institutions (2)

Stanford University¹, University of California, San Francisco²

19 Nov 2010-Blood

TL;DR: A novel in situ vaccination strategy using a combination of intratumoral CpG ODN and low-dose radiation is feasible in CTCL/MF with acceptable toxicities. Reduction of skin DCs may suggest cross-priming and migration of DCs to regional lymph nodes.

Posted Content•

Bayesian Gene Set Analysis

Babak Shahbaba, Robert Tibshirani, Catherine M. Shachaf, Sylvia K. Plevritis

26 Jun 2010-arXiv: Applications

TL;DR: This work introduces a new methodology for identifying gene sets that are differentially expressed under varying experimental conditions that uses a hierarchical Bayesian framework where a hyperparameter measures the significance of each gene set.

Abstract: Gene expression microarray technologies provide the simultaneous measurements of a large number of genes. Typical analyses of such data focus on the individual genes, but recent work has demonstrated that evaluating changes in expression across predefined sets of genes often increases statistical power and produces more robust results. We introduce a new methodology for identifying gene sets that are differentially expressed under varying experimental conditions. Our approach uses a hierarchical Bayesian framework where a hyperparameter measures the significance of each gene set. Using simulated data, we compare our proposed method to alternative approaches, such as Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA). Our approach provides the best overall performance. We also discuss the application of our method to experimental data based on p53 mutation status.
