Author

Pablo Tamayo

Bio: Pablo Tamayo is an academic researcher from the University of California, San Diego. The author has contributed to research in topics including gene expression profiling and cancer. The author has an h-index of 72 and has co-authored 177 publications receiving 97,318 citations. Previous affiliations of Pablo Tamayo include the University of California, Berkeley and Harvard University.


Papers
Posted Content
11 Jul 2022-bioRxiv
TL;DR: A suite of computational tools that implement non-negative matrix factorization (NMF) and provide methods for accurate, clear biological interpretation and analysis are introduced.
Abstract: Non-negative matrix factorization (NMF) is an unsupervised learning method well suited to high-throughput biology. Still, inferring biological processes requires additional post hoc statistics and annotation for interpretation of features learned from software packages developed for NMF implementation. Here, we aim to introduce a suite of computational tools that implement NMF and provide methods for accurate, clear biological interpretation and analysis. A generalized discussion of NMF covering its benefits, limitations, and open questions in the field is followed by three vignettes for the Bayesian NMF algorithm CoGAPS (Coordinated Gene Activity across Pattern Subsets). Each vignette will demonstrate NMF analysis to quantify cell state transitions in public domain single-cell RNA-sequencing (scRNA-seq) data of malignant epithelial cells in 25 pancreatic ductal adenocarcinoma (PDAC) tumors and 11 control samples. The first uses PyCoGAPS, our new Python interface for CoGAPS that we developed to enhance runtime of Bayesian NMF for large datasets. The second vignette steps through the same analysis using our R CoGAPS interface, and the third introduces two new cloud-based, plug-and-play options for running CoGAPS using GenePattern Notebook and Docker. By providing Python support, cloud-based computing options, and relevant example workflows, we facilitate user-friendly interpretation and implementation of NMF for single-cell analyses.

2 citations
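
As a rough illustration of the factorization that the entry above builds on, the sketch below factors a small non-negative matrix V into two non-negative factors W and H using the classic multiplicative update rules. It is plain NMF, not the Bayesian CoGAPS algorithm, and it does not call PyCoGAPS or the R CoGAPS interface. The function names (`nmf`, `frobenius_error`) and the toy matrix are assumptions made for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate v (n x m, non-negative) as w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual v - w h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


# Hypothetical toy data: 4 "genes" x 3 "cells", built from two patterns.
V = [[1, 2, 0],
     [2, 4, 0],
     [0, 1, 2],
     [0, 3, 6]]

W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

In this picture the rows of H play the role of patterns across samples and the columns of W the corresponding feature weights; interpreting them biologically is the post hoc step the abstract refers to.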

Journal Article
TL;DR: Biochemical analyses support the hypothesis that STK33 promotes cell growth and survival in a kinase activity-dependent manner by regulating the activity of S6K1 selectively in mutant KRAS-dependent cells, and illustrate the potential of RNAi for discovering critical functional dependencies created by oncogenic mutations that cannot be identified using other genomic technologies.
Abstract: AACR Annual Meeting, Apr 18-22, 2009; Denver, CO. Activating KRAS mutations are among the most common pathogenetic events in a broad spectrum of hematologic malignancies and epithelial tumors. However, oncogenic KRAS has thus far not proven to be a tractable target for therapeutic intervention. An alternative to direct targeting of known oncogenes is to perform “synthetic lethality” screens to identify genes that are selectively required for cell viability in the context of specific cancer-causing mutations. Using this approach, we have discovered a synthetic lethal interaction between mutant KRAS, the most frequently mutated oncogene in human cancer, and inactivation of the gene encoding the STK33 serine/threonine protein kinase. To identify genes that are essential for cell viability in the context of mutant KRAS, we performed high-throughput loss-of-function RNA interference (RNAi) screens in eight human cancer cell lines (mutant KRAS, n=4; wild-type KRAS, n=4), representing five different tumor types (acute myeloid leukemia, colon cancer, breast cancer, prostate cancer, glioblastoma), as well as normal human fibroblasts and mammary epithelial cells. We screened each cell line with a subset of the short hairpin RNA (shRNA) library developed by the RNAi Consortium that consists of 5,024 individual shRNA constructs targeting 1,011 human genes, including the majority of known and putative protein kinase genes and a selection of protein phosphatase genes and known cancer-related genes. In these cell lines, suppression of STK33 preferentially inhibited the viability and proliferation of cells that were dependent on mutant KRAS. The differential requirement for STK33 based on oncogenic KRAS dependency was confirmed in 17 additional cell lines using in vitro transformation assays and human tumor xenograft models. 
Biochemical analyses support the hypothesis that STK33 promotes cell growth and survival in a kinase activity-dependent manner by regulating the activity of S6K1 selectively in mutant KRAS-dependent cells. Notably, molecular genetic characterization of cancer cell lines and analysis of patient-derived genomic data sets indicated that STK33 is not frequently mutated or overexpressed in human tumors. These observations identify STK33 as a potential target for the treatment of mutant KRAS-driven cancers that may have a broad therapeutic index in normal versus malignant cells, and illustrate the potential of RNAi for discovering critical functional dependencies created by oncogenic mutations that cannot be identified using other genomic technologies. Citation Information: In: Proc Am Assoc Cancer Res; 2009 Apr 18-22; Denver, CO. Philadelphia (PA): AACR; 2009. Abstract nr 5608.

1 citation

Journal Article
TL;DR: It is demonstrated that the human neurosphere model of medulloblastoma can facilitate the identification of agents with therapeutic efficacy against aggressive medulloblastoma, such as flavopiridol, a cyclin-dependent kinase inhibitor currently in clinical trials for a variety of tumor types.
Abstract: Medulloblastoma, the most common malignant brain tumor in children, is divided into multiple subgroups with different associated mutations and clinical prognoses. To facilitate studying the biology and developing new therapeutic options, we created a model of aggressive medulloblastoma using human neural stem cells. Neural stem cells were obtained from the cerebellar anlage of first trimester fetal autopsy specimens and transfected using lentiviral vectors with different combinations of oncogenes associated with poor-prognosis medulloblastoma: dominant-negative p53, hTERT, c-MYC, and constitutively active AKT. Cerebellar-derived stem cells transformed with all four oncogenes formed fast-growing, aggressive tumors that were histologically similar to primary anaplastic medulloblastoma tumors. We compared the expression profile of primary medulloblastoma tumors to our model, and confirmed that our model most closely matches the most aggressive and clinically devastating subgroup of medulloblastoma. The expression profile of our model was compared with a drug sensitivity database to identify therapeutic agents that could target our model. One class of drugs identified was cyclin-dependent kinase inhibitors. Because MYC also regulates the expression of CDKs 4/6, we hypothesized that flavopiridol (alvocidib), a cyclin-dependent kinase inhibitor currently in clinical trials for a variety of tumor types, would be effective against MYC-driven medulloblastoma. Our model and the medulloblastoma cell lines D425Med and D283Med were sensitive to treatment with flavopiridol. The IC50 of flavopiridol was less than 50 nM for all cell lines tested. Flavopiridol decreased the growth and proliferation of both cell lines and our neurosphere model as measured by MTS assay and BrdU incorporation (p < 0.001). Flavopiridol also induced apoptosis as determined by staining for cleaved caspase-3 and flow cytometry (p < 0.01). 
The data shown here demonstrate that our human neurosphere model of medulloblastoma can facilitate the identification of agents with therapeutic efficacy against aggressive medulloblastoma.

1 citation

01 May 2012
TL;DR: Gene expression profiling of RNA from 30 trial patients is used to derive an ezatiostat response profile in MDS, following the authors' earlier approach of constructing a predictive score for lenalidomide response in patients without deletion 5q.
Abstract: Approximately 70% of all patients with myelodysplastic syndrome (MDS) present with lower-risk disease. Some of these patients will initially respond to treatment with growth factors to improve anemia but will eventually cease to respond, while others will be resistant to growth factor therapy. Eventually, all lower-risk MDS patients require multiple transfusions and long-term therapy. While some patients may respond briefly to hypomethylating agents or lenalidomide, the majority will not, and new therapeutic options are needed for these lower-risk patients. Our previous clinical trials with ezatiostat (ezatiostat hydrochloride, Telentra®, TLK199), a glutathione S-transferase P1-1 inhibitor in clinical development for the treatment of low- to intermediate-risk MDS, have shown significant clinical activity, including multilineage responses as well as durable red-blood-cell transfusion independence. It would be of significant clinical benefit to be able to identify patients most likely to respond to ezatiostat before therapy is initiated. We have previously shown that by using gene expression profiling and grouping by response, it is possible to construct a predictive score that indicates the likelihood that patients without deletion 5q will respond to lenalidomide. The success of that study was based in part on the fact that the profile for response was linked to the biology of the disease. RNA was available on 30 patients enrolled in the trial and analyzed for gene expression on the Illumina HT12v4 whole genome array according to the manufacturer’s protocol. Gene marker analysis was performed. The selection of genes associated with the responders (R) vs. non-responders (NR) phenotype was obtained using a normalized and rescaled mutual information score (NMI). We have shown that an ezatiostat response profile contains two miRNAs that regulate expression of genes known to be implicated in MDS disease pathology. 
Remarkably, pathway analysis of the response profile revealed that the genes comprising the jun-N-terminal kinase/c-Jun molecular pathway, which is known to be activated by ezatiostat, are under-expressed in patients who respond and over-expressed in patients who were non-responders to the drug, suggesting that both the biology of the disease and the molecular mechanism of action of the drug are positively correlated.

1 citation

Journal Article
16 Nov 2006-Blood
TL;DR: It is suggested that lenalidomide-responsive patients without 5q deletions have a defect in erythroid differentiation analogous to the ineffective erythropoiesis in patients with 5q deletions, and that an erythroid gene expression signature predicts lenalidomide activity in MDS.

1 citation


Cited by
Journal Article
TL;DR: The Gene Set Enrichment Analysis (GSEA) method as discussed by the authors focuses on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations
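
The abstract above does not spell out the statistic, so the following is only a simplified sketch of a weighted running-sum enrichment score of the kind GSEA is known for: walk down a ranked gene list, step up when a gene belongs to the set and down otherwise, and report the largest deviation from zero. It omits permutation-based significance testing and score normalization, and it is not the GSEA software package. The function name `enrichment_score`, the weight parameter `p`, and the toy genes are assumptions for illustration.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene names sorted by decreasing correlation with the phenotype.
    scores: the ranking metric for each gene, in the same order.
    gene_set: collection of gene names to test.
    p: exponent weighting hits by the magnitude of their score.
    """
    members = set(gene_set)
    in_set = [gene in members for gene in ranked_genes]
    n_hit = sum(in_set)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    norm = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)
    if norm == 0:
        raise ValueError("all gene-set members have a zero score")

    running, best = 0.0, 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** p / norm   # step up, weighted by score magnitude
        else:
            running -= 1.0 / n_miss         # step down by a constant amount
        if abs(running) > abs(best):
            best = running                  # keep the maximum deviation from zero
    return best


# Hypothetical ranked list: six genes ordered by a made-up ranking metric.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

top_set_score = enrichment_score(genes, metric, {"g1", "g2"})
bottom_set_score = enrichment_score(genes, metric, {"g5", "g6"})
print(top_set_score, bottom_set_score)
```

A set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom a negative score, which is what lets the method detect coordinated shifts that single-gene tests can miss.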

Journal Article
TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Abstract: Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

32,980 citations

Journal Article
TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.
Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal Article
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article
TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Abstract: Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the

16,538 citations
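
To make the grouping effect described in the entry above concrete, here is a minimal sketch that fits an elastic net by cyclic coordinate descent. It is not the LARS-EN algorithm from the paper, and it uses the uncorrected ("naive") objective (1/2n)·||y - Xb||² + l1·||b||₁ + (l2/2)·||b||² without any rescaling step. The function names, penalty values, and toy data are assumptions chosen for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(x, y, l1, l2, n_sweeps=200):
    """Cyclic coordinate descent for the naive elastic net objective."""
    n, p = len(x), len(x[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # mean squared value of column j
            for i in range(n):
                partial = y[i] - sum(x[i][k] * beta[k] for k in range(p) if k != j)
                rho += x[i][j] * partial
                z += x[i][j] ** 2
            rho /= n
            z /= n
            # The l1 penalty thresholds the update; the l2 penalty shrinks it.
            beta[j] = soft_threshold(rho, l1) / (z + l2)
    return beta


# Hypothetical toy data: columns 0 and 1 are identical (perfectly correlated),
# column 2 is orthogonal to both and unrelated to the response.
signal = [-2.0, -1.0, 0.0, 1.0, 2.0]
noise = [1.0, -2.0, 2.0, -2.0, 1.0]
X = [[s, s, e] for s, e in zip(signal, noise)]
Y = [3.0 * s for s in signal]

coef = elastic_net(X, Y, l1=0.1, l2=0.5)
print(coef)
```

With a positive `l2`, the two duplicated predictors end up sharing the weight equally while the unrelated column is dropped. Setting `l2=0` reduces the same routine to lasso coordinate descent, for which duplicated columns have no unique solution.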