Author

Pablo Tamayo

Bio: Pablo Tamayo is an academic researcher at the University of California, San Diego. The author has contributed to research on the topics of gene expression profiling and cancer. The author has an h-index of 72 and has co-authored 177 publications receiving 97,318 citations. Previous affiliations of Pablo Tamayo include the University of California, Berkeley and Harvard University.


Papers
Journal ArticleDOI
17 Jan 2008-Nature
TL;DR: It is found that partial loss of function of the ribosomal subunit protein RPS14 phenocopies the disease in normal haematopoietic progenitor cells, and also that forced expression of RPS14 rescues the disease phenotype in patient-derived bone marrow cells.
Abstract: Somatic chromosomal deletions in cancer are thought to indicate the location of tumour suppressor genes, by which a complete loss of gene function occurs through biallelic deletion, point mutation or epigenetic silencing, thus fulfilling Knudson's two-hit hypothesis. In many recurrent deletions, however, such biallelic inactivation has not been found. One prominent example is the 5q- syndrome, a subtype of myelodysplastic syndrome characterized by a defect in erythroid differentiation. Here we describe an RNA-mediated interference (RNAi)-based approach to discovery of the 5q- disease gene. We found that partial loss of function of the ribosomal subunit protein RPS14 phenocopies the disease in normal haematopoietic progenitor cells, and also that forced expression of RPS14 rescues the disease phenotype in patient-derived bone marrow cells. In addition, we identified a block in the processing of pre-ribosomal RNA in RPS14-deficient cells that is functionally equivalent to the defect in Diamond-Blackfan anaemia, linking the molecular pathophysiology of the 5q- syndrome to a congenital syndrome causing bone marrow failure. These results indicate that the 5q- syndrome is caused by a defect in ribosomal protein function and suggest that RNAi screening is an effective strategy for identifying causal haploinsufficiency disease genes.

865 citations

Journal ArticleDOI
TL;DR: The genes that displayed an expression profile most similar to endogenous Myc in microarray-based expression profiling of myeloid differentiation models were highly enriched for MYC target genes.
Abstract: MYC affects normal and neoplastic cell proliferation by altering gene expression, but the precise pathways remain unclear. We used oligonucleotide microarray analysis of 6,416 genes and expressed sequence tags to determine changes in gene expression caused by activation of c-MYC in primary human fibroblasts. In these experiments, 27 genes were consistently induced, and 9 genes were repressed. The identity of the genes revealed that MYC may affect many aspects of cell physiology altered in transformed cells: cell growth, cell cycle, adhesion, and cytoskeletal organization. Identified targets possibly linked to MYC's effects on cell growth include the nucleolar proteins nucleolin and fibrillarin, as well as the eukaryotic initiation factor 5A. Among the cell cycle genes identified as targets, the G1 cyclin D2 and the cyclin-dependent kinase binding protein CksHs2 were induced whereas the cyclin-dependent kinase inhibitor p21(Cip1) was repressed. A role for MYC in regulating cell adhesion and structure is suggested by repression of genes encoding the extracellular matrix proteins fibronectin and collagen, and the cytoskeletal protein tropomyosin. A possible mechanism for MYC-mediated apoptosis was revealed by identification of the tumor necrosis factor receptor associated protein TRAP1 as a MYC target. Finally, two immunophilins, peptidyl-prolyl cis-trans isomerase F and FKBP52, the latter of which plays a role in cell division in Arabidopsis, were up-regulated by MYC. We also explored pattern-matching methods as an alternative approach for identifying MYC target genes. The genes that displayed an expression profile most similar to endogenous Myc in microarray-based expression profiling of myeloid differentiation models were highly enriched for MYC target genes.

849 citations

Journal ArticleDOI
02 Aug 2012-Nature
TL;DR: Together, this study reveals the alteration of WNT, hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic β-catenin signalling in medulloblastoma.
Abstract: Medulloblastomas are the most common malignant brain tumours in children. Identifying and understanding the genetic events that drive these tumours is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma on the basis of transcriptional and copy number profiles. Here we use whole-exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas have low mutation rates consistent with other paediatric tumours, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53. Recurrent somatic mutations were newly identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR and LDB1. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant, but not wild-type, β-catenin. Together, our study reveals the alteration of WNT, hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic β-catenin signalling in medulloblastoma.

692 citations

Journal ArticleDOI
TL;DR: An algorithm for classification of cell line chemosensitivity based on gene expression profiles alone is developed, and the results suggest that, at least for a subset of compounds, genomic approaches to chemosensitivity prediction are feasible.
Abstract: In an effort to develop a genomics-based approach to the prediction of drug response, we have developed an algorithm for classification of cell line chemosensitivity based on gene expression profiles alone. Using oligonucleotide microarrays, the expression levels of 6,817 genes were measured in a panel of 60 human cancer cell lines (the NCI-60) for which the chemosensitivity profiles of thousands of chemical compounds have been determined. We sought to determine whether the gene expression signatures of untreated cells were sufficient for the prediction of chemosensitivity. Gene expression-based classifiers of sensitivity or resistance for 232 compounds were generated and then evaluated on independent sets of data. The classifiers were designed to be independent of the cells’ tissue of origin. The accuracy of chemosensitivity prediction was considerably better than would be expected by chance. Eighty-eight of 232 expression-based classifiers performed accurately (with P < 0.05) on an independent test set, whereas only 12 of the 232 would be expected to do so by chance. These results suggest that at least for a subset of compounds genomic approaches to chemosensitivity prediction are feasible.

668 citations
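
The chance baseline quoted in the abstract above (about 12 of 232 classifiers) follows from simple arithmetic: 232 tests at a 0.05 significance level. The sketch below reproduces that figure and, under the added assumption that the 232 classifiers are statistically independent (an assumption made here for illustration, not stated in the abstract), computes how unlikely 88 or more successes would be by chance. The helper name `binomial_tail` is hypothetical.

```python
from math import comb

n_classifiers = 232  # classifiers evaluated on the independent test set (from the abstract)
alpha = 0.05         # significance level quoted in the abstract
observed = 88        # classifiers reported as accurate at P < 0.05

# Expected number of classifiers passing at level alpha if none had real signal.
expected_by_chance = n_classifiers * alpha


def binomial_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p); assumes independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Probability of seeing at least `observed` successes purely by chance.
tail_probability = binomial_tail(n_classifiers, observed, alpha)

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least {observed} by chance): {tail_probability:.3g}")
```

The independence assumption is likely too strong for compounds with related mechanisms, so the tail probability should be read as an order-of-magnitude illustration only.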

Journal ArticleDOI
TL;DR: In an effort to find gene regulatory networks and clusters of genes that affect cancer susceptibility to anticancer agents, a database with baseline expression levels of 7,245 genes measured using microarrays in 60 cancer cell lines was joined to a database of anticancer agent activity in the same cell lines, and hypotheses for potential single-gene determinants of anticancer agent susceptibility were constructed.
Abstract: In an effort to find gene regulatory networks and clusters of genes that affect cancer susceptibility to anticancer agents, we joined a database with baseline expression levels of 7,245 genes measured by using microarrays in 60 cancer cell lines, to a database with the amounts of 5,084 anticancer agents needed to inhibit growth of those same cell lines. Comprehensive pair-wise correlations were calculated between gene expression and measures of agent susceptibility. Associations weaker than a threshold strength were removed, leaving networks of highly correlated genes and agents called relevance networks. Hypotheses for potential single-gene determinants of anticancer agent susceptibility were constructed. The effect of random chance in the large number of calculations performed was empirically determined by repeated random permutation testing; only associations stronger than those seen in multiply permuted data were used in clustering. We discuss the advantages of this methodology over alternative approaches, such as phylogenetic-type tree clustering and self-organizing maps.

655 citations
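
The relevance-network procedure described in the abstract above (pairwise correlation, a strength threshold, and a permutation-based check against chance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Pearson correlation is assumed as the association measure, the permutation scheme (shuffling each gene's values across cell lines) is one plausible choice, and all function names and the toy data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def permutation_threshold(genes, agents, n_perm=20, seed=0):
    """Largest |r| between any shuffled gene profile and any agent profile."""
    rng = random.Random(seed)
    strongest = 0.0
    for _ in range(n_perm):
        for profile in genes.values():
            shuffled = list(profile)
            rng.shuffle(shuffled)  # break the link between gene values and cell lines
            for response in agents.values():
                strongest = max(strongest, abs(pearson(shuffled, response)))
    return strongest


def relevance_edges(genes, agents, threshold):
    """Keep only gene-agent pairs whose |r| exceeds the threshold."""
    edges = []
    for gene_name, profile in genes.items():
        for agent_name, response in agents.items():
            r = pearson(profile, response)
            if abs(r) > threshold:
                edges.append((gene_name, agent_name, r))
    return edges


# Toy data: expression and agent response across six hypothetical cell lines.
toy_genes = {"gene_A": [1, 2, 3, 4, 5, 6], "gene_B": [5, 5, 5, 5, 5, 5]}
toy_agents = {"agent_X": [2, 4, 6, 8, 10, 12]}

toy_edges = relevance_edges(toy_genes, toy_agents, threshold=0.9)
toy_null = permutation_threshold(toy_genes, toy_agents)
```

In practice the threshold passed to `relevance_edges` would come from `permutation_threshold` on the real data; with only six toy cell lines the permuted correlations can be large, so a fixed threshold is used here to keep the example readable.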


Cited by
Journal ArticleDOI
TL;DR: The Gene Set Enrichment Analysis (GSEA) method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal ArticleDOI
TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Abstract: Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

32,980 citations

Journal ArticleDOI
TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.
Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal ArticleDOI
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal ArticleDOI
TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Abstract: Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the

16,538 citations
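
For readers unfamiliar with the method, the elastic net criterion is commonly written as a least-squares fit with both an L1 (lasso) and an L2 (ridge) penalty. The formula below is supplied here from the standard formulation rather than from the truncated abstract; the symbols y (response), X (predictor matrix), β (coefficients) and the tuning parameters λ1, λ2 are notation assumed for illustration.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \lVert y - X\beta \rVert_2^2
  \;+\; \lambda_2 \lVert \beta \rVert_2^2
  \;+\; \lambda_1 \lVert \beta \rVert_1
```

Setting λ2 = 0 recovers the lasso and setting λ1 = 0 recovers ridge regression. The L1 term is what yields sparse solutions, while the L2 term is generally credited with the grouping effect on correlated predictors mentioned in the abstract. This form is usually called the naive elastic net; the published estimator additionally rescales the solution.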