Showing papers by "Robert Tibshirani" published in 2004

Journal Article•DOI•

Least Angle Regression

Bradley Efron¹, Trevor Hastie¹, Iain M. Johnstone¹, Robert Tibshirani¹, Hemant Ishwaran², Keith Knight³, Jean-Michel Loubes⁴,⁵, Pascal Massart⁵,⁶, David Madigan⁷,⁸, Greg Ridgeway⁷,⁹, Saharon Rosset¹,¹⁰, Ji Zhu, Robert A. Stine¹¹, Berwin A. Turlach¹², Sanford Weisberg¹³•Institutions (13)

Stanford University¹, Cleveland Clinic², University of Toronto³, Centre national de la recherche scientifique⁴, Université Paris-Saclay⁵, University of Paris-Sud⁶, Rutgers University⁷, Avaya⁸, RAND Corporation⁹, IBM¹⁰, University of Pennsylvania¹¹, University of Western Australia¹², University of Minnesota¹³

01 Apr 2004-Annals of Statistics

TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.

Abstract: The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods. (2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method; this connection explains the similar numerical results previously observed for the Lasso and Stagewise, and helps us understand the properties of both methods, which are seen as constrained versions of the simpler LARS algorithm. (3) A simple approximation for the degrees of freedom of a LARS estimate is available, from which we derive a Cp estimate of prediction error; this allows a principled choice among the range of possible LARS estimates. LARS and its variants are computationally efficient: the paper describes a publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.
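The Forward Stagewise idea named in the abstract can be sketched in a few lines. This is a minimal illustration, not the LARS algorithm or the authors' implementation: it assumes the usual incremental formulation (repeatedly nudge the coefficient of the predictor most correlated with the current residual), and the function name, step size, and iteration count are hypothetical choices made here.

```python
def forward_stagewise(X, y, step=0.01, n_iter=2000):
    """Incremental forward stagewise regression (illustrative sketch).

    X is a list of rows (columns assumed centered and scaled),
    y is a list of centered responses. Returns the coefficient list.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)
    for _ in range(n_iter):
        # Inner product of each column with the current residual.
        corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]
        # Pick the predictor most correlated with the residual.
        j = max(range(p), key=lambda k: abs(corr[k]))
        if abs(corr[j]) < 1e-12:
            break  # residual is orthogonal to every predictor
        # Move that coefficient a small step in the direction of the correlation.
        delta = step if corr[j] > 0 else -step
        beta[j] += delta
        for i in range(n):
            residual[i] -= delta * X[i][j]
    return beta
```

With a small step the coefficients grow slowly and only for predictors that keep earning it, which is the "less greedy" behaviour the abstract contrasts with classical forward selection.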

7,828 citations

Journal Article•DOI•

Gene expression profiling identifies clinically relevant subtypes of prostate cancer

Jacques Lapointe¹, Chunde Li², John P. Higgins¹, Matt van de Rijn¹, Eric Bair¹, Kelli Montgomery¹, Michelle Ferrari¹, Lars Egevad², Walter Rayford³, Ulf S.R. Bergerheim⁴, Peter Ekman², Angelo M. DeMarzo⁵, Robert Tibshirani¹, David Botstein¹, Patrick O. Brown¹, James D. Brooks¹, Jonathan R. Pollack¹•Institutions (5)

Stanford University¹, Karolinska Institutet², Louisiana State University³, Linköping University⁴, Johns Hopkins University⁵

20 Jan 2004-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: It is suggested that prostate tumors can be usefully classified according to their gene expression patterns, and these tumor subtypes may provide a basis for improved prognostication and treatment stratification.

Abstract: Prostate cancer, a leading cause of cancer death, displays a broad range of clinical behavior from relatively indolent to aggressive metastatic disease. To explore potential molecular variation underlying this clinical heterogeneity, we profiled gene expression in 62 primary prostate tumors, as well as 41 normal prostate specimens and nine lymph node metastases, using cDNA microarrays containing ≈26,000 genes. Unsupervised hierarchical clustering readily distinguished tumors from normal samples, and further identified three subclasses of prostate tumors based on distinct patterns of gene expression. High-grade and advanced stage tumors, as well as tumors associated with recurrence, were disproportionately represented among two of the three subtypes, one of which also included most lymph node metastases. To further characterize the clinical relevance of tumor subtypes, we evaluated as surrogate markers two genes differentially expressed among tumor subgroups by using immunohistochemistry on tissue microarrays representing an independent set of 225 prostate tumors. Positive staining for MUC1, a gene highly expressed in the subgroups with “aggressive” clinicopathological features, was associated with an elevated risk of recurrence (P = 0.003), whereas strong staining for AZGP1, a gene highly expressed in the other subgroup, was associated with a decreased risk of recurrence (P = 0.0008). In multivariate analysis, MUC1 and AZGP1 staining were strong predictors of tumor recurrence independent of tumor grade, stage, and preoperative prostate-specific antigen levels. Our results suggest that prostate tumors can be usefully classified according to their gene expression patterns, and these tumor subtypes may provide a basis for improved prognostication and treatment stratification.

1,315 citations

Journal Article•DOI•

Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia.

Lars Bullinger¹, Konstanze Döhner², Eric Bair¹, Stefan Fröhling², Richard F. Schlenk², Robert Tibshirani¹, Hartmut Döhner², Jonathan R. Pollack¹•Institutions (2)

Stanford University¹, University of Ulm²

15 Apr 2004-The New England Journal of Medicine

TL;DR: The use of gene-expression profiling improves the molecular classification of adult AML and identifies new molecular subtypes of AML, including two prognostically relevant subgroups in AML with a normal karyotype.

Abstract: Background: In patients with acute myeloid leukemia (AML), the presence or absence of recurrent cytogenetic aberrations is used to identify the appropriate therapy. However, the current classification system does not fully reflect the molecular heterogeneity of the disease, and treatment stratification is difficult, especially for patients with intermediate-risk AML with a normal karyotype. Methods: We used complementary-DNA microarrays to determine the levels of gene expression in peripheral-blood samples or bone marrow samples from 116 adults with AML (including 45 with a normal karyotype). We used unsupervised hierarchical clustering analysis to identify molecular subgroups with distinct gene-expression signatures. Using a training set of samples from 59 patients, we applied a novel supervised learning algorithm to devise a gene-expression–based clinical-outcome predictor, which we then tested using an independent validation group comprising the 57 remaining patients. Results: Unsupervised analysis identi

992 citations

Journal Article•DOI•

Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes.

Izidore S. Lossos¹, Debra K. Czerwinski¹, Ash A. Alizadeh¹, Mark A. Wechser², Robert Tibshirani¹, David Botstein¹, Ronald Levy¹•Institutions (2)

Stanford University¹, Applied Biosystems²

29 Apr 2004-The New England Journal of Medicine

TL;DR: Measurement of the expression of six genes is sufficient to predict overall survival in diffuse large-B-cell lymphoma.

Abstract: Background: Several gene-expression signatures can be used to predict the prognosis in diffuse large-B-cell lymphoma, but the lack of practical tests for a genome-scale analysis has restricted the use of this method. Methods: We studied 36 genes whose expression had been reported to predict survival in diffuse large-B-cell lymphoma. We measured the expression of each of these genes in independent samples of lymphoma from 66 patients by quantitative real-time polymerase-chain-reaction analyses and related the results to overall survival. Results: In a univariate analysis, genes were ranked on the basis of their ability to predict survival. The genes that were the strongest predictors were LMO2, BCL6, FN1, CCND2, SCYA3, and BCL2. We developed a multivariate model that was based on the expression of these six genes, and we validated the model in two independent microarray data sets. The model was independent of the International Prognostic Index and added to its predictive power. Conclusions: Measurement of the expression of six genes is sufficient to predict overall survival in diffuse large-B-cell lymphoma.

891 citations

Journal Article•

The Entire Regularization Path for the Support Vector Machine

Trevor Hastie¹, Saharon Rosset², Robert Tibshirani¹, Ji Zhu•Institutions (2)

Stanford University¹, IBM²

01 Dec 2004-Journal of Machine Learning Research

TL;DR: An algorithm is derived that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model.

Abstract: The support vector machine (SVM) is a widely used tool for classification. Many efficient implementations exist for fitting a two-class SVM model. The user has to supply values for the tuning parameters: the regularization cost parameter, and the kernel parameters. It seems a common practice is to use a default value for the cost parameter, often leading to the least restrictive model. In this paper we argue that the choice of the cost parameter can be critical. We then derive an algorithm that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model. We illustrate our algorithm on some examples, and use our representation to give further insight into the range of SVM solutions.
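For orientation, the cost parameter the abstract refers to enters the usual soft-margin SVM criterion as shown here. This is the standard textbook form, not a formula quoted from the paper; the symbols $C$, $\beta$, $\beta_0$ and $\xi_i$ are notation assumed for this sketch.

```latex
\min_{\beta_0,\,\beta,\,\xi}\; \frac{1}{2}\lVert\beta\rVert^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(\beta_0 + x_i^{\top}\beta) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1,\dots,n.
```

A large $C$ penalizes margin violations heavily and gives the least regularized fit; a small $C$ tolerates more violations in exchange for a wider margin. A path algorithm follows the solution as $C$ varies instead of refitting at each value.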

699 citations

Journal Article•DOI•

Semi-supervised methods to predict patient survival from gene expression data.

Eric Bair¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

13 Apr 2004-PLOS Biology

TL;DR: Diagnostic procedures are presented that accurately predict the survival of future patients from the gene expression profiles and survival times of previous patients; the procedures were successfully applied to several publicly available datasets.

Abstract: An important goal of DNA microarray research is to develop tools to diagnose cancer more accurately based on the genetic profile of a tumor. There are several existing techniques in the literature for performing this type of diagnosis. Unfortunately, most of these techniques assume that different subtypes of cancer are already known to exist. Their utility is limited when such subtypes have not been previously identified. Although methods for identifying such subtypes exist, these methods do not work well for all datasets. It would be desirable to develop a procedure to find such subtypes that is applicable in a wide variety of circumstances. Even if no information is known about possible subtypes of a certain form of cancer, clinical information about the patients, such as their survival time, is often available. In this study, we develop some procedures that utilize both the gene expression data and the clinical data to identify subtypes of cancer and use this knowledge to diagnose future patients. These procedures were successfully applied to several publicly available datasets. We present diagnostic procedures that accurately predict the survival of future patients based on the gene expression profile and survival times of previous patients. This has the potential to be a powerful tool for diagnosing and treating cancer.

678 citations

Journal Article•DOI•

Least Angle Regression

Bradley Efron, Trevor Hastie, Iain M. Johnstone, Robert Tibshirani

23 Jun 2004-arXiv: Statistics Theory

TL;DR: Least Angle Regression (LARS) is a new model selection algorithm, a useful and less greedy version of traditional forward selection methods.

547 citations

Journal Article•DOI•

Different Gene Expression Patterns in Invasive Lobular and Ductal Carcinomas of the Breast

Hongjuan Zhao¹, Anita Langerød², Youngran Ji¹, Kent W. Nowels¹, Jahn M. Nesland², Robert Tibshirani¹, Ida R. K. Bukholm², Rolf Kåresen², David Botstein¹,³, Anne Lise Børresen-Dale², Stefanie S. Jeffrey¹•Institutions (3)

Stanford University¹, University of Oslo², Princeton University³

01 Jun 2004-Molecular Biology of the Cell

TL;DR: Over half of the ILCs differ from IDCs not only in histological and clinical features but also in global transcription programs, and the remaining ILCs closely resemble IDCs in their transcription patterns.

Abstract: Invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC) are the two major histological types of breast cancer worldwide. Whereas IDC incidence has remained stable, ILC is the most rapidly increasing breast cancer phenotype in the United States and Western Europe. It is not clear whether IDC and ILC represent molecularly distinct entities and what genes might be involved in the development of these two phenotypes. We conducted comprehensive gene expression profiling studies to address these questions. Total RNA from 21 ILCs, 38 IDCs, two lymph node metastases, and three normal tissues were amplified and hybridized to approximately 42,000 clone cDNA microarrays. Data were analyzed using hierarchical clustering algorithms and statistical analyses that identify differentially expressed genes (significance analysis of microarrays) and minimal subsets of genes (prediction analysis for microarrays) that succinctly distinguish ILCs and IDCs. Eleven of 21 (52%) of the ILCs ("typical" ILCs) clustered together and displayed different gene expression profiles from IDCs, whereas the other ILCs ("ductal-like" ILCs) were distributed between different IDC subtypes. Many of the differentially expressed genes between ILCs and IDCs code for proteins involved in cell adhesion/motility, lipid/fatty acid transport and metabolism, immune/defense response, and electron transport. Many genes that distinguish typical and ductal-like ILCs are involved in regulation of cell growth and immune response. Our data strongly suggest that over half the ILCs differ from IDCs not only in histological and clinical features but also in global transcription programs. The remaining ILCs closely resemble IDCs in their transcription patterns. Further studies are needed to explore the differences between ILC molecular subtypes and to determine whether they require different therapeutic strategies.

418 citations

Journal Article•DOI•

Sample classification from protein mass spectrometry, by 'peak probability contrasts'

Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, Scott G. Soltys¹, Gongyi Shi¹, Albert C. Koong¹, Quynh-Thu Le¹•Institutions (1)

Stanford University¹

22 Nov 2004-Bioinformatics

TL;DR: The peak probability contrast method is a potentially useful tool for sample classification from protein mass spectrometry data and performs as well as or better than several methods that require the full spectra, rather than just labelled peaks.

Abstract: Motivation: Early cancer detection has always been a major research focus in solid tumor oncology. Early tumor detection can theoretically result in lower stage tumors, more treatable diseases and ultimately higher cure rates with less treatment-related morbidities. Protein mass spectrometry is a potentially powerful tool for early cancer detection. We propose a novel method for sample classification from protein mass spectrometry data. When applied to spectra from both diseased and healthy patients, the 'peak probability contrast' technique provides a list of all common peaks among the spectra, their statistical significance and their relative importance in discriminating between the two groups. We illustrate the method on matrix-assisted laser desorption and ionization mass spectrometry data from a study of ovarian cancers. Results: Compared to other statistical approaches for class prediction, the peak probability contrast method performs as well or better than several methods that require the full spectra, rather than just labelled peaks. It is also much more interpretable biologically. The peak probability contrast method is a potentially useful tool for sample classification from protein mass spectrometry data. Supplementary Information: http://www.stat.stanford.edu/~tibs/ppc

218 citations

Journal Article•DOI•

Toxicity from radiation therapy associated with abnormal transcriptional responses to DNA damage

Kerri E. Rieger¹, Wan-Jen Hong, Virginia Goss Tusher, Jean Y. Tang, Robert Tibshirani, Gilbert Chu•Institutions (1)

Stanford University¹

27 Apr 2004-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Transcriptional responses in 24 genes predicted radiation toxicity in 9 of 14 patients with no false positives among 43 controls; the responses of these nine patients displayed significant heterogeneity, and the results may enable physicians to predict toxicity and tailor treatment for individual patients.

Abstract: Toxicity from radiation therapy is a grave problem for cancer patients. We hypothesized that some cases of toxicity are associated with abnormal transcriptional responses to radiation. We used microarrays to measure responses to ionizing and UV radiation in lymphoblastoid cells derived from 14 patients with acute radiation toxicity. The analysis used heterogeneity-associated transformation of the data to account for a clinical outcome arising from more than one underlying cause. To compute the risk of toxicity for each patient, we applied nearest shrunken centroids, a method that identifies and cross-validates predictive genes. Transcriptional responses in 24 genes predicted radiation toxicity in 9 of 14 patients with no false positives among 43 controls (P = 2.2 × 10⁻⁷). The responses of these nine patients displayed significant heterogeneity. Of the five patients with toxicity and normal responses, two were treated with protocols that proved to be highly toxic. These results may enable physicians to predict toxicity and tailor treatment for individual patients.

132 citations

Regularized Discriminant Analysis and Its Application in Microarrays

Yaqian Guo, Trevor Hastie, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Jan 2004

TL;DR: The SCRDA methods generalize the nearest shrunken centroids idea of Prediction Analysis of Microarray (PAM) to classical discriminant analysis; they perform uniformly well in multivariate classification problems and, in particular, outperform the currently popular PAM.

Abstract: In this paper, we introduce a family of some modified versions of linear discriminant analysis, called “shrunken centroids regularized discriminant analysis” (SCRDA). These methods generalize the idea of the nearest shrunken centroids of Prediction Analysis of Microarray (PAM) into the classical discriminant analysis. These SCRDA methods are specially designed for classification problems in high dimension low sample size situations, for example microarray data. Through both simulation study and real life data, it is shown that these SCRDA methods perform uniformly well in the multivariate classification problems, especially outperform the currently popular PAM. Some of them are also suitable for feature elimination purpose and can be used as gene selection methods. The open source R codes for these methods are also available and will be added to the R libraries in the near future.
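The "nearest shrunken centroids" idea that SCRDA builds on can be illustrated with a simplified sketch. It is not the PAM or SCRDA implementation: it assumes plain soft-thresholding of each class centroid's offset from the overall centroid and omits the per-gene standardization that the published method uses. All function names and the `delta` parameter are hypothetical.

```python
def soft_threshold(value, delta):
    """Shrink value toward zero by delta, clipping at zero."""
    magnitude = abs(value) - delta
    if magnitude <= 0:
        return 0.0
    return magnitude if value > 0 else -magnitude


def shrunken_centroids(samples, labels, delta):
    """Return {label: centroid}, each class offset from the overall mean soft-thresholded."""
    p = len(samples[0])
    overall = [sum(row[j] for row in samples) / len(samples) for j in range(p)]
    centroids = {}
    for label in sorted(set(labels)):
        rows = [row for row, lab in zip(samples, labels) if lab == label]
        class_mean = [sum(row[j] for row in rows) / len(rows) for j in range(p)]
        centroids[label] = [
            overall[j] + soft_threshold(class_mean[j] - overall[j], delta)
            for j in range(p)
        ]
    return centroids


def classify(x, centroids):
    """Assign x to the class whose shrunken centroid is nearest (squared distance)."""
    return min(
        centroids,
        key=lambda label: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[label])),
    )
```

Features whose class offsets are all thresholded to zero have identical centroid values in every class and so stop influencing the classification, which is how the shrinkage doubles as gene selection.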

Journal Article•DOI•

Efficient quadratic regularization for expression arrays.

Trevor Hastie¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

01 Jul 2004-Biostatistics

TL;DR: This article exposes a class of techniques based on quadratic regularization of linear models, including regularized (ridge) regression, logistic and multinomial regression, linear and mixture discriminant analysis, the Cox model and neural networks, and shows that dramatic computational savings are possible over naive implementations.

Abstract: SUMMARY Gene expression arrays typically have 50 to 100 samples and 1000 to 20 000 variables (genes). There have been many attempts to adapt statistical models for regression and classification to these data, and in many cases these attempts have challenged the computational resources. In this article we expose a class of techniques based on quadratic regularization of linear models, including regularized (ridge) regression, logistic and multinomial regression, linear and mixture discriminant analysis, the Cox model and neural networks. For all of these models, we show that dramatic computational savings are possible over naive implementations, using standard transformations in numerical linear algebra.
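One standard identity of the kind the abstract alludes to is shown here for ridge regression only. The notation is assumed for this sketch rather than taken from the listing: $X$ is the $n \times p$ data matrix, $y$ the response, and $\lambda > 0$ the ridge penalty.

```latex
\hat\beta_\lambda
= (X^{\top}X + \lambda I_p)^{-1} X^{\top} y
= X^{\top}(X X^{\top} + \lambda I_n)^{-1} y ,
\qquad \lambda > 0 .
```

The equality follows from $(X^{\top}X + \lambda I_p)\,X^{\top} = X^{\top}(X X^{\top} + \lambda I_n)$. The second form solves an $n \times n$ system instead of a $p \times p$ one, which matters when $n$ is 50 to 100 and $p$ is in the thousands.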

Journal Article•DOI•

Gene expression profiles at diagnosis in de novo childhood AML patients identify FLT3 mutations with good clinical outcomes

Norman J. Lacayo¹, Soheil Meshinchi, Paivi Kinnunen, Ron Yu, Yan Wang, Christianna M. Stuber, Lorrie Douglas, Romina Wahab, David L. Becton, Howard J. Weinstein, Myron Chang, Cheryl L. Willman, Jerald P. Radich, Robert Tibshirani, Yaddanapudi Ravindranath, Branimir I. Sikic, Gary V. Dahl•Institutions (1)

Stanford University¹

01 Nov 2004-Blood

TL;DR: Gene expression profiling identified AML patients with divergent prognoses within the FLT3-MU group, and the RUNX3 to ATRX expression ratio should be a useful prognostic indicator in these patients.

Journal Article•DOI•

Developmental response to hypoxia

S.-T. Joseph Huang, Kim Chi Vo, Deirdre J. Lyell, Gerarda H. Faessen, Suzana Tulac, Robert Tibshirani, Amato J. Giaccia, Linda C. Giudice

01 Sep 2004-The FASEB Journal

TL;DR: Time-dependent changes in fetal tissue gene expression in a rat model of in utero hypoxia, compared with normoxic controls, were investigated as an initial approach to understanding the molecular events underlying fetal development in response to hypoxia.

Abstract: Molecular mechanisms underlying fetal growth restriction due to placental insufficiency and in utero hypoxia are not well understood. In the current study, time-dependent (3 h-11 days) changes in fetal tissue gene expression in a rat model of in utero hypoxia compared with normoxic controls were investigated as an initial approach to understand molecular events underlying fetal development in response to hypoxia. Under hypoxic conditions, litter size was reduced and IGFBP-1 was up-regulated in maternal serum and in fetal liver and heart. Tissue-specific, distinct regulatory patterns of gene expression were observed under acute vs chronic hypoxic conditions. Induction of glycolytic enzymes was an early event in response to hypoxia during organ development; consistently, tissue-specific induction of calcium homeostasis-related genes and suppression of growth-related genes were observed, suggesting mechanisms underlying hypoxia-related fetal growth restriction. Furthermore, induction of inflammation-related genes in placentas exposed to long-term hypoxia (11 days) suggests a mechanism for placental dysfunction and impaired pregnancy outcome accompanying in utero hypoxia.

Journal Article•DOI•

The use of plasma surface-enhanced laser desorption/ionization time-of-flight mass spectrometry proteomic patterns for detection of head and neck squamous cell cancers.

Scott G. Soltys¹, Quynh-Thu Le, Gongyi Shi, Robert Tibshirani, Amato J. Giaccia, Albert C. Koong•Institutions (1)

Stanford University¹

15 Jul 2004-Clinical Cancer Research

TL;DR: Plasma proteomic profiling with SELDI-TOF mass spectrometry provides moderate sensitivity and specificity in discriminating HNSCC, and is likely to overpredict cancer in control smokers.

Abstract: Purpose: Our study was undertaken to determine the utility of plasma proteomic profiling using surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry for the detection of head and neck squamous cell carcinomas (HNSCCs). Experimental Design: Pretreatment plasma samples from HNSCC patients or controls without known neoplastic disease were analyzed on the Protein Biology System IIc SELDI-TOF mass spectrometer (Ciphergen Biosystems, Fremont, CA). Proteomic spectra of mass:charge ratio ( m / z ) were generated by the application of plasma to immobilized metal-affinity-capture (IMAC) ProteinChip arrays activated with copper. A total of 37,356 data points were generated for each sample. A training set of spectra from 56 cancer patients and 52 controls were applied to the “Lasso” technique to identify protein profiles that can distinguish cancer from noncancer, and cross-validation was used to determine test errors in this training set. The discovery pattern was then used to classify a separate masked test set of 57 cancer and 52 controls. In total, we analyzed the proteomic spectra of 113 cancer patients and 104 controls. Results: The Lasso approach identified 65 significant data points for the discrimination of normal from cancer profiles. The discriminatory pattern correctly identified 39 of 57 HNSCC patients and 40 of 52 noncancer controls in the masked test set. These results yielded a sensitivity of 68% and specificity of 73%. Subgroup analyses in the test set of four different demographic factors (age, gender, and cigarette and alcohol use) that can potentially confound the interpretation of the results suggest that this model tended to overpredict cancer in control smokers. Conclusions: Plasma proteomic profiling with SELDI-TOF mass spectrometry provides moderate sensitivity and specificity in discriminating HNSCC. 
Further improvement and validation of this approach is needed to determine its usefulness in screening for this disease.

Journal Article•DOI•

Mouse Strain–Specific Differences in Vascular Wall Gene Expression and Their Relationship to Vascular Disease

Raymond Tabibiazar¹, Roger A. Wagner, Joshua M. Spin, Euan A. Ashley, Balasubramanian Narasimhan, Edward M. Rubin, Bradley Efron, P.S. Tsao, Robert Tibshirani, Thomas Quertermous•Institutions (1)

Stanford University¹

18 Nov 2004-Arteriosclerosis, Thrombosis, and Vascular Biology

TL;DR: Gene expression differences between the 2 strains suggest that aortas of C57Bl/6 mice have a higher genetic propensity to develop inflammation in response to appropriate atherogenic stimuli.

Abstract: Objective— Different strains of inbred mice exhibit different susceptibility to the development of atherosclerosis. The C3H/HeJ and C57Bl/6 mice have been used in several studies aimed at understanding the genetic basis of atherosclerosis. Under controlled environmental conditions, variations in susceptibility to atherosclerosis reflect differences in genetic makeup, and these differences must be reflected in gene expression patterns that are temporally related to the development of disease. In this study, we sought to identify the genetic pathways that are differentially activated in the aortas of these mice. Methods and Results— We performed genome-wide transcriptional profiling of aortas from C3H/HeJ and C57Bl/6 mice. Differences in gene expression were identified at baseline as well as during normal aging and longitudinal exposure to high-fat diet. The significance of these genes to the development of atherosclerosis was evaluated by observing their temporal pattern of expression in the well-studied apolipoprotein E model of atherosclerosis. Conclusion— Gene expression differences between the 2 strains suggest that aortas of C57Bl/6 mice have a higher genetic propensity to develop inflammation in response to appropriate atherogenic stimuli. This study expands the repertoire of factors in known disease-related signaling pathways and identifies novel candidate genes for future study.

Proceedings Article•

The Entire Regularization Path for the Support Vector Machine

Saharon Rosset¹, Robert Tibshirani², Ji Zhu³, Trevor Hastie²•Institutions (3)

IBM¹, Stanford University², University of Michigan³

01 Dec 2004

TL;DR: In this article, the authors argue that the choice of the SVM cost parameter can be critical and derive an algorithm that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model.

Abstract: In this paper we argue that the choice of the SVM cost parameter can be critical. We then derive an algorithm that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model.

Journal Article•DOI•

Cancer characterization and feature set extraction by discriminative margin clustering.

Kamesh Munagala¹, Robert Tibshirani¹, Patrick O. Brown¹•Institutions (1)

Stanford University¹

03 Mar 2004-BMC Bioinformatics

TL;DR: Discriminative margin clustering is a new technique for analyzing high dimensional quantitative datasets, specially applicable to gene expression data from microarray experiments related to cancer, which yields highly specialized tumor subtypes which are similar in terms of potential diagnostic markers.

Abstract: A central challenge in the molecular diagnosis and treatment of cancer is to define a set of molecular features that, taken together, distinguish a given cancer, or type of cancer, from all normal cells and tissues. Discriminative margin clustering is a new technique for analyzing high dimensional quantitative datasets, specially applicable to gene expression data from microarray experiments related to cancer. The goal of the analysis is to find highly specialized sub-types of a tumor type which are similar in having a small combination of genes which together provide a unique molecular portrait for distinguishing the sub-type from any normal cell or tissue. Detection of the products of these genes can then, in principle, provide a basis for detection and diagnosis of a cancer, and a therapy directed specifically at the distinguishing constellation of molecular features can, in principle, provide a way to eliminate the cancer cells, while minimizing toxicity to any normal cell. The new methodology yields highly specialized tumor subtypes which are similar in terms of potential diagnostic markers.

Proceedings Article•DOI•

Boosted PRIM with application to searching for oncogenic pathway of lung cancer

Pei Wang¹, Young Ho Kim¹, Jonathan R. Pollack¹, Robert Tibshirani¹•Institutions (1)

Stanford University¹

16 Aug 2004

TL;DR: The motivation for boosted PRIM is to solve the problem of "searching for oncogenic pathways" based on array-CGH data, though the algorithm itself is suitable for general classification problems.

Abstract: Boosted PRIM (patient rule induction method) is a new algorithm developed for two-class classification problems. PRIM is a variation on tree-based methods that seeks box-shaped regions in the feature space to separate different classes. Boosted PRIM implements PRIM-style weak learners in AdaBoost, one of the most popular boosting algorithms. In addition, we improve the performance of the algorithm by introducing regularization into the boosting process, which supports Jerry Friedman's perspective of viewing boosting as a steepest-descent numerical optimization. The motivation for boosted PRIM is to solve the problem of "searching for oncogenic pathways" based on array-CGH (comparative genomic hybridization) data, though the algorithm itself is suitable for general classification problems. We illustrate the performance of the method through some simulation studies as well as an application to a lung cancer array-CGH data set.
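
Illustrative sketch (not the authors' implementation): the abstract describes box-shaped weak learners combined by AdaBoost with a regularized boosting step. The Python below shows that general idea under strong simplifying assumptions. Each weak learner is a single-feature interval chosen from quantile cut points rather than a box found by PRIM's peeling procedure, and the regularization is a plain shrinkage factor on each learner's weight. The names `fit_box`, `box_predict`, `boosted_boxes`, `predict` and the `shrinkage` parameter are hypothetical.

```python
import numpy as np

def fit_box(X, y, w, quantiles=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the single-feature interval with the lowest weighted error.

    Assumes w sums to 1 and y takes values in {-1, +1}.
    Returns ((feature, low, high, sign), error); the learner predicts
    `sign` inside the interval and `-sign` outside it.
    """
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], quantiles)
        for a in range(len(cuts)):
            for b in range(a + 1, len(cuts)):
                inside = (X[:, j] >= cuts[a]) & (X[:, j] <= cuts[b])
                pred = np.where(inside, 1, -1)
                err = np.sum(w[pred != y])
                # Flipping the sign of the box flips the weighted error.
                for sign, e in ((1, err), (-1, 1.0 - err)):
                    if e < best_err:
                        best_err, best = e, (j, cuts[a], cuts[b], sign)
    return best, best_err

def box_predict(box, X):
    """Evaluate one interval learner on the rows of X."""
    j, lo, hi, sign = box
    inside = (X[:, j] >= lo) & (X[:, j] <= hi)
    return np.where(inside, sign, -sign)

def boosted_boxes(X, y, rounds=20, shrinkage=0.5):
    """AdaBoost-style loop with a shrunken learner weight (assumed regularizer)."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        box, err = fit_box(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)  # avoid log(0) for perfect learners
        alpha = shrinkage * 0.5 * np.log((1 - err) / err)
        # Up-weight misclassified points, down-weight correct ones.
        w = w * np.exp(-alpha * y * box_predict(box, X))
        w /= w.sum()
        ensemble.append((alpha, box))
    return ensemble

def predict(ensemble, X):
    """Sign of the weighted vote over all interval learners."""
    score = sum(alpha * box_predict(box, X) for alpha, box in ensemble)
    return np.where(score >= 0, 1, -1)
```

With labels coded as -1 and +1, `boosted_boxes(X, y)` returns a list of weighted interval learners and `predict(ensemble, X)` returns their weighted vote. Setting `shrinkage` below 1 makes each round take a smaller step, which is one common way to regularize a boosting procedure.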

Journal Article•DOI•

Discussions of boosting papers, and rejoinders

Peter L. Bartlett, Peter J. Bickel, Peter Bühlmann, Yoav Freund, Jerome H. Friedman, Trevor Hastie, Wenxin Jiang, Michael J. Jordan, Vladimir Koltchinskii, Gábor Lugosi, Jon McAuliffe, Ya'acov Ritov, Saharon Rosset, Robert E. Schapire, Robert Tibshirani, Nicolas Vayatis, Bin Yu, Tong Zhang, Ji Zhu

01 Feb 2004-Annals of Statistics

TL;DR: Discussions of three papers on boosting, covering process consistency for AdaBoost, the Bayes-risk consistency of regularized boosting methods, and the statistical behavior and consistency of classification methods based on convex risk minimization, with rejoinders by the authors.

Abstract: Discussions of: "Process consistency for AdaBoost" [Ann. Statist. 32 (2004), no. 1, 13-29] by W. Jiang; "On the Bayes-risk consistency of regularized boosting methods" [ibid., 30-55] by G. Lugosi and N. Vayatis; and "Statistical behavior and consistency of classification methods based on convex risk minimization" [ibid., 56-85] by T. Zhang. Includes rejoinders by the authors.

Journal Article•

Flawed analysis, implausible results — move on

C. David Naylor¹, Marius Sinclair, Robert Tibshirani•Institutions (1)

University of Toronto¹

03 Feb 2004-Canadian Medical Association Journal

TL;DR: A short paper on the organization of queues for coronary surgery brings to mind H.L. Mencken's tag that every complex problem has a neat, simple solution — and it is wrong.

Abstract: Gerry B. Hill's short paper on the organization of queues for coronary surgery (page 354) [1] brings to mind H.L. Mencken's tag that every complex problem has a neat, simple solution — and it is wrong. For busy readers, in the recent tradition of 4-word movie reviews, [2] we offer a 6-word

Journal Article•DOI•

FLT3 Mutations Determine the Clinical Outcome in Children with De Novo Acute Myelogenous Leukemia (AML) and Normal Karyotype: Pediatric Oncology Group (POG) Study # 9421.

Norman J. Lacayo¹,², Soheil Meshinchi¹, Susana C. Raimondi¹, Dennis J. Kuo², Ron Yu², Myron Chang¹, C. L. Willman¹, Robert Tibshirani², Yaddanapudi Ravindranath¹, Branimir I. Sikic², Howard Weinstein¹, Gary V. Dahl¹,²•Institutions (2)

Children's Oncology Group¹, Stanford University²

16 Nov 2004-Blood

TL;DR: The authors hypothesized that gene expression profiles would identify genes that cooperate with FLT3 mutations in conferring poor clinical outcome, and observed that patients with normal karyotypes who were enrolled in the Pediatric Oncology Group (POG) study #9421 had two significantly different clinical outcomes that were associated with the expression of FLT3 mutations.

Journal Article•DOI•

A heightened global response of interferon-alfa treatment on host gene transcription is associated with clearance of HCV after interferon therapy

Xiao-Song He¹, Xuhuai Ji¹, Ramsey Cheung¹, Lawrence M. Pfeffer², Robert Tibshirani¹, Harry B. Greenberg¹•Institutions (2)

Stanford University¹, University of Tennessee Health Science Center²

01 Jul 2004-Gastroenterology