Journal ArticleDOI

A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles.

TL;DR: The expanded CMap is reported, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that is shown to be highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts.
About: This article was published in Cell on 2017-11-30 and is currently open access. It has received 1943 citations to date.
Citations
Journal ArticleDOI
TL;DR: In this paper, a principal component analysis (PCA) algorithm was used to evaluate the ferroptosis regulation patterns of individual tumors, and three distinct regulation subtypes, which were linked to the outcomes and clinical relevance of each patient, were established.
Abstract: Ferroptosis is a newly recognized mechanism of regulated cell death. It has been reported to be highly associated with immune therapy and chemotherapy. However, its mechanism of regulation in the tumor microenvironment (TME) and its influence on oral squamous cell carcinoma (OSCC) therapy are unknown. We identified a ferroptosis-specific gene-expression signature, the FPscore, developed with a principal component analysis (PCA) algorithm to evaluate the ferroptosis regulation patterns of individual tumors. Multi-omics analysis of ferroptosis regulation patterns was conducted. Three distinct ferroptosis regulation subtypes, which were linked to the outcomes and clinical relevance of each patient, were established. A high FPscore in patients with OSCC was associated with a favorable prognosis, a ferroptosis-related immune-activation phenotype, and potential sensitivity to chemotherapy and immunotherapy. Importantly, a high FPscore correlated with a low gene copy number burden and high immune checkpoint expression. We validated the prognostic value of the FPscore using independent immunotherapy and pan-cancer cohorts. Comprehensive evaluation of individual tumors with distinct ferroptosis regulation patterns provides new mechanistic insights, which may be clinically relevant for the application of combination therapies in OSCC.

12 citations

Journal ArticleDOI
30 Jul 2021
TL;DR: The EC organoids and O-PDX models mimic the tissue architecture, protein biomarker expression and genetic profile of the original tissue and show heterogenous sensitivity to conventional chemotherapy, and drug response is reproduced in vivo.
Abstract: A major hurdle in translational endometrial cancer (EC) research is the lack of robust preclinical models that capture both inter- and intra-tumor heterogeneity. This has hampered the development of new treatment strategies for people with EC. EC organoids were derived from resected patient tumor tissue and expanded in a chemically defined medium. Established EC organoids were orthotopically implanted into female NSG mice. Patient tissue and corresponding models were characterized by morphological evaluation, biomarker and gene expression and by whole exome sequencing. A gene signature was defined and its prognostic value was assessed in multiple EC cohorts using Mantel-Cox (log-rank) test. Response to carboplatin and/or paclitaxel was measured in vitro and evaluated in vivo. Statistical difference between groups was calculated using paired t-test. We report EC organoids established from EC patient tissue, and orthotopic organoid-based patient-derived xenograft models (O-PDXs). The EC organoids and O-PDX models mimic the tissue architecture, protein biomarker expression and genetic profile of the original tissue. Organoids show heterogenous sensitivity to conventional chemotherapy, and drug response is reproduced in vivo. The relevance of these models is further supported by the identification of an organoid-derived prognostic gene signature. This signature is validated as prognostic both in our local patient cohorts and in the TCGA endometrial cancer cohort. We establish robust model systems that capture both the diversity of endometrial tumors and intra-tumor heterogeneity. These models are highly relevant preclinical tools for the elucidation of the molecular pathogenesis of EC and identification of potential treatment strategies. To study the biology of cancer and test new potential treatments, it is important to use models that mimic patients’ tumors. Such models have largely been lacking in endometrial cancer. 
We therefore aimed to develop miniature tumors, called “organoids”, directly from patient tumor tissue. Our organoids maintained the characteristics and genetic features of the tumors from which they were derived, grew into endometrial tumors in mice, and exhibited patient-specific responses to chemotherapy drugs. In summary, we have developed models that will help us better understand the biology of endometrial tumors and can potentially be used to identify new effective drugs for endometrial cancer patients. Berg et al. establish a panel of patient-derived endometrial cancer organoids and xenograft models. They show that their models recapitulate the genetic profile of the donor tumor and can be used for drug testing and development of a prognostic gene signature.

12 citations

Journal ArticleDOI
TL;DR: The 10 most vital small molecule drugs of GBM, which potentially imitate or reverse GBM carcinogenic status, were predicted by using Connectivity Map (CMAP) database and validated in silico.
Abstract: Glioblastoma (GBM) is a common and aggressive primary brain tumor, and the prognosis for GBM patients remains poor. This study aimed to identify the key genes associated with the development of GBM and provide new diagnostics and therapies for GBM. Three microarray datasets (GSE111260, GSE103227, and GSE104267) were selected from the Gene Expression Omnibus (GEO) database for integrated analysis. The differentially expressed genes (DEGs) between GBM and normal tissues were identified. Then, prognosis-related DEGs were screened by survival analysis, followed by functional enrichment analysis. The protein–protein interaction (PPI) network was constructed to explore the hub genes associated with GBM. The mRNA and protein expression levels of hub genes were respectively validated in silico using The Cancer Genome Atlas (TCGA) and Human Protein Atlas (HPA) databases. Subsequently, the small molecule drugs of GBM were predicted by using the Connectivity Map (CMAP) database. A total of 78 prognosis-related DEGs were identified, of which 10 hub genes with higher degree were obtained by PPI analysis. CETN2, MKI67, ARL13B, and SETDB1 were overexpressed at the mRNA and protein levels in GBM tissues, while CALN1, ELAVL3, ADCY3, SYN2, SLC12A5, and SOD1 were down-regulated in GBM tissues. Additionally, these genes were significantly associated with the prognosis of GBM. We eventually predicted the 10 most vital small molecule drugs, which potentially imitate or reverse GBM carcinogenic status. Cycloserine and 11-deoxy-16,16-dimethylprostaglandin E2 might be considered as potential therapeutic drugs of GBM. Our study provided 10 key genes for diagnosis, prognosis, and therapy for GBM. These findings might contribute to a better comprehension of the molecular mechanisms of GBM development, and provide a new perspective for further GBM research. However, the specific regulatory mechanisms of these genes need further elaboration.

12 citations

Journal ArticleDOI
TL;DR: The results provide systems biology support for using BCG and small-molecule BCG mimics as putative vaccine and drug candidates against emergent viruses including SARS-CoV-2.
Abstract: Coronavirus disease 2019 (COVID-19) is expected to continue to cause worldwide fatalities until the world population develops ‘herd immunity’, or until a vaccine is developed and used as a prevention. Meanwhile, there is an urgent need to identify alternative means of antiviral defense. The Bacillus Calmette–Guerin (BCG) vaccine, which has been recognized for its off-target beneficial effects on the immune system, can be exploited to boost immunity and protect from emerging novel viruses. We developed and employed a systems biology workflow capable of identifying small-molecule antiviral drugs and vaccines that can boost immunity and affect a wide variety of viral disease pathways to protect from the fatal consequences of emerging viruses. Our analysis demonstrates that BCG vaccine affects the production and maturation of naive T cells resulting in enhanced, long-lasting trained innate immune responses that can provide protection against novel viruses. We have identified small-molecule BCG mimics, including antiviral drugs such as raltegravir and lopinavir as high confidence hits. Strikingly, our top hits emetine and lopinavir were independently validated by recent experimental findings that these compounds inhibit the growth of SARS-CoV-2 in vitro. Our results provide systems biology support for using BCG and small-molecule BCG mimics as putative vaccine and drug candidates against emergent viruses including SARS-CoV-2.

12 citations


Cites background or methods from "A Next Generation Connectivity Map:..."

  • ...This database of cellular signatures has been produced using the L1000 platform (27); a...

    [...]

  • ...In order to identify experimentally validated upstream regulators that cause transcriptional changes similar to those induced by BCG, we queried the Connectivity Map (CMap) (27) database of the Broad Institute with BCGCGS and identified proteins and small-molecule drugs that have strong connectivity scores with BCG (Fig....

    [...]

  • ...The CMap (27,89) is a chemogenomics database that catalogs 1....

    [...]

Journal ArticleDOI
TL;DR: This work demonstrates prediction and preservation of cardiotoxic relationships for six drug-induced cardiotoxicity types using a machine learning approach on a large collected and curated dataset of transcriptional and molecular profiles.
Abstract: Computational methods can increase productivity of drug discovery pipelines, through overcoming challenges such as cardiotoxicity identification. We demonstrate prediction and preservation of cardiotoxic relationships for six drug-induced cardiotoxicity types using a machine learning approach on a large collected and curated dataset of transcriptional and molecular profiles (1,131 drugs, 35% with known cardiotoxicities, and 9,933 samples). The algorithm generality is demonstrated through validation in an independent drug dataset, in addition to cross-validation. The best prediction attains an average area under the curve (AUC) of 79% for safe versus risky drugs, across all six cardiotoxicity types on validation and 66% on the unseen set of drugs. Individual cardiotoxicities for specific drug types are also predicted with high accuracy, including cardiac disorder signs and symptoms for a previously unseen set of anti-inflammatory agents (AUC = 80%) and heart failures for an unseen set of anti-neoplastic agents (AUC = 76%). In addition, independent testing on transcriptional data from the Drug Toxicity Signature Generation Center (DToxS) produces similar results in terms of accuracy and shows an average AUC of 72% for previously seen drugs and 60% for unseen drugs. Given the ubiquitous manifestation of multiple drug adverse effects in every human organ, the methodology is expected to be applicable to additional tissue-specific side effects beyond cardiotoxicity.

12 citations


Cites background from "A Next Generation Connectivity Map:..."

  • ...…knowledge and data repositories, including DrugBank (Wishart et al., 2008) (www.drugbank.com), Connectivity Map Project (https://clue.io/cmap) (Subramanian et al., 2017), SIDER (Kuhn et al., 2016) (sideeffects.embl.de), MedDRA (https://bioportal.bioontology.org/ontologies/MEDDRA) and MESH…...

    [...]

References
Journal ArticleDOI
TL;DR: The Gene Set Enrichment Analysis (GSEA) method interprets gene expression data by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations
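
For the GSEA entry above, the abstract does not spell out the statistic. As a minimal sketch, assuming the commonly described weighted running-sum form of the enrichment score (walk down a ranked gene list, step up at genes in the set, step down otherwise, and report the largest deviation from zero), the idea might look like the following. The function name `enrichment_score` and its parameters are hypothetical and are not taken from the GSEA software package.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set (illustrative sketch).

    ranked_genes: gene names, already sorted by the ranking metric.
    scores:       ranking metric for each gene, in the same order.
    gene_set:     collection of gene names to test.
    p:            weight exponent (assumption: p=0 gives an unweighted walk).
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    # Each hit contributes |score|^p, normalised so all hits sum to 1.
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total = sum(hit_weights)
    if total == 0:
        raise ValueError("all hit weights are zero; try p=0")

    running = 0.0
    best = 0.0
    for weight, hit in zip(hit_weights, in_set):
        # Step up at a hit, step down by a constant at a miss.
        running += weight / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best
```

A set concentrated at the top of the ranking yields a score near +1, and a set concentrated at the bottom yields a score near -1. Significance testing (for example by permutation) is deliberately left out of this sketch.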

Journal Article
TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
Abstract: We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large datasets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of datasets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the datasets.

30,124 citations

Journal ArticleDOI
TL;DR: The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data and provides a flexible and open design that facilitates submission, storage and retrieval of heterogeneous data sets from high-throughput gene expression and genomic hybridization experiments.
Abstract: The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data. GEO provides a flexible and open design that facilitates submission, storage and retrieval of heterogeneous data sets from high-throughput gene expression and genomic hybridization experiments. GEO is not intended to replace in house gene expression databases that benefit from coherent data sets, and which are constructed to facilitate a particular analytic method, but rather complement these by acting as a tertiary, central data distribution hub. The three central data entities of GEO are platforms, samples and series, and were designed with gene expression and genomic hybridization experiments in mind. A platform is, essentially, a list of probes that define what set of molecules may be detected. A sample describes the set of molecules that are being probed and references a single platform used to generate its molecular abundance data. A series organizes samples into the meaningful data sets which make up an experiment. The GEO repository is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.

10,968 citations

Journal ArticleDOI
TL;DR: This paper describes how BLAT was optimized; the tool is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences.
Abstract: Analyzing vertebrate genomes requires rapid mRNA/DNA and cross-species protein alignments. A new tool, BLAT, is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences. BLAT's speed stems from an index of all nonoverlapping K-mers in the genome. This index fits inside the RAM of inexpensive computers, and need only be computed once for each genome assembly. BLAT has several major stages. It uses the index to find regions in the genome likely to be homologous to the query sequence. It performs an alignment between homologous regions. It stitches together these aligned regions (often exons) into larger alignments (typically genes). Finally, BLAT revisits small internal exons possibly missed at the first stage and adjusts large gap boundaries that have canonical splice sites where feasible. This paper describes how BLAT was optimized. Effects on speed and sensitivity are explored for various K-mer sizes, mismatch schemes, and number of required index matches. BLAT is compared with other alignment programs on various test sets and then used in several genome-wide applications. http://genome.ucsc.edu hosts a web-based BLAT server for the human genome.

8,326 citations
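
For the BLAT entry above, the first stage (an index of non-overlapping K-mers in the genome, used to find candidate regions for a query) can be illustrated with a toy example. This is a minimal sketch under simplifying assumptions: exact-match seeds only, a single match per hit, and plain Python strings for sequences. The function names `build_index` and `find_hits` are hypothetical and do not come from the BLAT source; mismatch schemes and multiple required index matches, which the abstract mentions, are omitted.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index = defaultdict(list)
    # Step by k so that indexed k-mers do not overlap.
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(index, query, k):
    """Look up every overlapping k-mer of the query in the genome index.

    Returns (query_offset, genome_offset) pairs marking candidate regions.
    """
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            hits.append((q, g))
    return hits


genome = "ACGTACGGTTAC"
index = build_index(genome, 4)
print(find_hits(index, "GACGGT", 4))
```

Indexing only non-overlapping K-mers keeps the index roughly K times smaller than indexing every position, while sliding over every K-mer of the query still lets any indexed genome K-mer be found. Later stages (alignment, stitching, and splice-site adjustment) are outside the scope of this sketch.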

Journal ArticleDOI
TL;DR: This paper proposed parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that is robust to outliers in small sample sizes and performs comparable to existing methods for large samples.
Abstract: SUMMARY Non-biological experimental variation or “batch effects” are commonly observed across multiple batches of microarray experiments, often rendering the task of combining data from these batches difficult. The ability to combine microarray data sets is advantageous to researchers to increase statistical power to detect biological phenomena from studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that is robust to outliers in small sample sizes and performs comparable to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.

6,319 citations
