Author

Pablo Tamayo

Bio: Pablo Tamayo is an academic researcher from University of California, San Diego. The author has contributed to research in topics: Gene expression profiling & Cancer. The author has an h-index of 72 and has co-authored 177 publications receiving 97318 citations. Previous affiliations of Pablo Tamayo include University of California, Berkeley & Harvard University.


Papers
Journal ArticleDOI
TL;DR: Pooled in vivo screen and gene expression profiling identified functional variants and demonstrated that expression of rare variants induced tumorigenesis, underscoring the value of integrating genomic information with functional studies.
Abstract: Cancer genome characterization efforts now provide an initial view of the somatic alterations in primary tumors. However, most point mutations occur at low frequency, and the function of these alleles remains undefined. We have developed a scalable systematic approach to interrogate the function of cancer-associated gene variants. We subjected 474 mutant alleles curated from 5,338 tumors to pooled in vivo tumor formation assays and gene expression profiling. We identified 12 transforming alleles, including two in genes (PIK3CB, POT1) that have not been shown to be tumorigenic. One rare KRAS allele, D33E, displayed tumorigenicity and constitutive activation of known RAS effector pathways. By comparing gene expression changes induced upon expression of wild-type and mutant alleles, we inferred the activity of specific alleles. Because alleles found to be mutated only once in 5,338 tumors rendered cells tumorigenic, these observations underscore the value of integrating genomic information with functional studies. Significance: Experimentally inferring the functional status of cancer-associated mutations facilitates the interpretation of genomic information in cancer. Pooled in vivo screen and gene expression profiling identified functional variants and demonstrated that expression of rare variants induced tumorigenesis. Variant phenotyping through functional studies will facilitate defining key somatic events in cancer. Cancer Discov; 6(7); 714–26. ©2016 AACR. See related commentary by Cho and Collisson, p. 694. This article is highlighted in the In This Issue feature, p. 681.

133 citations

Patent
07 Apr 2000
TL;DR: Methods and apparatus for classifying or predicting the classes for samples based on gene expression are described, as well as methods and apparatus for ascertaining or discovering new, previously unknown classes based on gene expression.
Abstract: Methods and apparatus for classifying or predicting the classes for samples based on gene expression are described. Also described are methods and apparatus for ascertaining or discovering new, previously unknown classes based on gene expression.

132 citations

Journal ArticleDOI
TL;DR: This work uses a computational method, Analytic Technique for Assessment of RNAi by Similarity (ATARiS), that takes advantage of patterns in RNAi data across multiple samples in order to enrich for RNAi reagents whose phenotypic effects relate to suppression of their intended targets and produces quantitative, gene-level phenotype values, which provide an intuitive measure of the effect of gene suppression in each sample.
Abstract: Genome-scale RNAi libraries enable the systematic interrogation of gene function. However, the interpretation of RNAi screens is complicated by the observation that RNAi reagents designed to suppress the mRNA transcripts of the same gene often produce a spectrum of phenotypic outcomes due to differential on-target gene suppression or perturbation of off-target transcripts. Here we present a computational method, Analytic Technique for Assessment of RNAi by Similarity (ATARiS), that takes advantage of patterns in RNAi data across multiple samples in order to enrich for RNAi reagents whose phenotypic effects relate to suppression of their intended targets. By summarizing only such reagent effects for each gene, ATARiS produces quantitative, gene-level phenotype values, which provide an intuitive measure of the effect of gene suppression in each sample. This method is robust for data sets that contain as few as 10 samples and can be used to analyze screens of any number of targeted genes. We used this analytic approach to interrogate RNAi data derived from screening more than 100 human cancer cell lines and identified HNF1B as a transforming oncogene required for the survival of cancer cells that harbor HNF1B amplifications. ATARiS is publicly available at http://broadinstitute.org/ataris.

132 citations

Journal ArticleDOI
TL;DR: This study quantitatively profiled global proteomes and phospho-proteomes of 45 medulloblastoma samples and found kinases associated with subtypes and showed that inhibiting PRKDC sensitizes MYC-driven cells to radiation.

124 citations

Book ChapterDOI
TL;DR: This work focuses on the recently developed procedure for automated high-dimensional flow cytometry analysis called FLAME (FLow analysis with Automated Multivariate Estimation), which introduced novel finite mixture models of heavy-tailed and asymmetric distributions to identify and model cell populations in a flow cytometric sample.
Abstract: Flow cytometry is widely used for single cell interrogation of surface and intracellular protein expression by measuring fluorescence intensity of fluorophore-conjugated reagents. We focus on the recently developed procedure of Pyne et al. (2009, Proceedings of the National Academy of Sciences USA 106, 8519-8524) for automated high-dimensional flow cytometric analysis called FLAME (FLow analysis with Automated Multivariate Estimation). It introduced novel finite mixture models of heavy-tailed and asymmetric distributions to identify and model cell populations in a flow cytometric sample. This approach robustly addresses the complexities of flow data without the need for transformation or projection to lower dimensions. It also addresses the critical task of matching cell populations across samples that enables downstream analysis. It thus facilitates application of flow cytometry to new biological and clinical problems. To facilitate pipelining with standard bioinformatic applications such as high-dimensional visualization, subject classification or outcome prediction, FLAME has been incorporated with the GenePattern package of the Broad Institute. Thereby analysis of flow data can be approached similarly as other genomic platforms. We also consider some new work that proposes a rigorous and robust solution to the registration problem by a multi-level approach that allows us to model and register cell populations simultaneously across a cohort of high-dimensional flow samples. This new approach is called JCM (Joint Clustering and Matching). It enables direct and rigorous comparisons across different time points or phenotypes in a complex biological study as well as for classification of new patient samples in a more clinical setting.

122 citations


Cited by
Journal ArticleDOI
TL;DR: Gene Set Enrichment Analysis (GSEA) interprets gene expression data by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations
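
To make the gene-set idea in the GSEA entry above concrete, here is a minimal sketch of a running-sum enrichment score. It is an assumption-laden simplification, not the published implementation: it uses the unweighted (Kolmogorov-Smirnov-like) form, omits permutation-based significance testing, and the function name `enrichment_score` and the toy gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: gene identifiers ordered from most to least
                  associated with the phenotype (assumed pre-ranked).
    gene_set:     set of gene identifiers to test.
    Simplified illustration only; not the weighted statistic or the
    permutation test used by the GSEA software.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a gene-set member, step down otherwise.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        # Keep the largest deviation from zero, preserving its sign.
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_score = enrichment_score(ranked, {"g1", "g2"})
bottom_score = enrichment_score(ranked, {"g5", "g6"})
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one, which is the intuition behind scoring whole gene sets rather than single genes.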

Journal ArticleDOI
TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Abstract: Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

32,980 citations

Journal ArticleDOI
TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.
Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal ArticleDOI
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal ArticleDOI
TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Abstract: Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the

16,538 citations
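
For orientation, the elastic net entry above can be summarized by its penalized least-squares criterion. The notation here (response y, design matrix X, coefficients β, tuning parameters λ1 and λ2) is an assumption drawn from the standard statement of the method, not from the truncated abstract, and this is the so-called naive form; the published estimator additionally rescales the solution.

```latex
\hat{\beta}
  = \arg\min_{\beta}\;
    \lVert y - X\beta \rVert_2^2
    + \lambda_2 \lVert \beta \rVert_2^2
    + \lambda_1 \lVert \beta \rVert_1,
\qquad \lambda_1 \ge 0,\; \lambda_2 \ge 0 .
```

Setting λ2 = 0 recovers the lasso and setting λ1 = 0 recovers ridge regression; the ridge term is what encourages strongly correlated predictors to enter or leave the model together.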