Journal ArticleDOI

A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles.

TL;DR: The expanded CMap is reported, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that is shown to be highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts.
About: This article was published in Cell on 2017-11-30 and is currently open access. It has received 1943 citations to date.
Citations
Journal ArticleDOI
TL;DR: DeSIDE-DDI is a deep learning-based framework for DDI prediction that uses drug-induced gene expression signatures, so that the model's predictions can be interpreted at the level of gene expression.
Abstract: Adverse drug-drug interaction (DDI) is a major concern to polypharmacy due to its unexpected adverse side effects and must be identified at an early stage of drug discovery and development. Many computational methods have been proposed for this purpose, but most require specific types of information, or they have less concern in interpretation on underlying genes. We propose a deep learning-based framework for DDI prediction with drug-induced gene expression signatures so that the model can provide the expression level of interpretability for DDIs. The model engineers dynamic drug features using a gating mechanism that mimics the co-administration effects by imposing attention to genes. Also, each side-effect is projected into a latent space through translating embedding. As a result, the model achieved an AUC of 0.889 and an AUPR of 0.915 in unseen interaction prediction, which is competitively very accurate and outperforms other state-of-the-art methods. Furthermore, it can predict potential DDIs with new compounds not used in training. In conclusion, using drug-induced gene expression signatures followed by gating and translating embedding can increase DDI prediction accuracy while providing model interpretability. The source code is available on GitHub ( https://github.com/GIST-CSBL/DeSIDE-DDI ).

6 citations

Journal ArticleDOI
TL;DR: In this article, the authors performed targeted sequencing of 40 medication-relevant genes on plasma samples from 98 non-small cell lung cancer patients and analyzed impact of genetic alterations on clinical presentation as well as response to systemic treatments.
Abstract: Non-small cell lung cancer (NSCLC) is characterized by relatively rapid response to systemic treatments yet inevitable resistance and predisposed to distant metastasis. We thus aimed at performing sequencing analysis to determine genomic events and underlying mechanisms concerning drug resistance in NSCLC. We performed targeted sequencing of 40 medication-relevant genes on plasma samples from 98 NSCLC patients and analyzed impact of genetic alterations on clinical presentation as well as response to systemic treatments. Profiling of multi-omics data from 1024 NSCLC tissues in public datasets was carried out for comparison and validation of identified molecular events implicated in resistance. A genetic association of CYP2D6 deletion with drug resistance was identified through circulating tumor DNA (ctDNA) profiling and response assessment. FCGR3A amplification was potentially involved in resistance to EGFR inhibitors. We further verified our findings in tissue samples and focused on potential resistance mechanisms, which uncovered that depleted CYP2D6 affected a set of genes involved in EMT, oncogenic signaling as well as inflammatory pathways. Tumor microenvironment analysis revealed that NSCLC with CYP2D6 loss manifested increased levels of immunomodulatory gene expressions, PD-L1 expression, relatively high mutational burden and lymphocyte infiltration. DNA methylation alterations were also found to be correlated with mRNA expressions and copy numbers of CYP2D6. Finally, MEK inhibitors were identified by CMap as the prospective therapeutic drugs for CYP2D6 deletion. These analyses identified novel resistance mechanisms to systemic NSCLC treatments and had significant implications for the development of new treatment strategies.

6 citations

Journal ArticleDOI
TL;DR: In this paper, the authors conduct a comprehensive bioinformatics analysis to elucidate the prognostic values of the CC and CXC chemokine families in breast cancer (BC), motivated by the increasingly evident prognostic value of these families' expression in various types of cancers.
Abstract: Breast cancer (BC) is a major human health problem due to its increasing incidence and mortality rate. CC and CXC chemokines are associated with tumorigenesis and the progression of many cancers. Since the prognostic values of CC and CXC families' expression in various types of cancers are becoming increasingly evident, we aimed to conduct a comprehensive bioinformatics analysis elucidating the prognostic values of the CC and CXC families in BC. Therefore, TCGA, UALCAN, Kaplan–Meier plotter, bc-GenExMiner, cBioPortal, STRING, Enrichr, and TIMER were utilized for analysis. We found that high levels of CCL4/5/14/19/21/22 were associated with better OS and RFS, while elevated expression of CCL24 was correlated with shorter OS in BC patients. Also, high levels of CXCL9/13 indicated longer OS, and enhanced expression of CXCL12/14 was linked with better OS and RFS in BC patients. Meanwhile, increased transcription levels of CXCL8 were associated with worse OS and RFS in BC patients. In addition, our results showed that CCL5, CCL8, CCL14, CCL20, CCL27, CXCL4, and CXCL14 were notably correlated with the clinical outcomes of BC patients. Our findings provide a new point of view that may help the clinical application of CC and CXC chemokines as prognostic biomarkers in BC.

6 citations

Journal ArticleDOI
TL;DR: In this paper, the authors utilize NIS in high-throughput drug screening and undertake rigorous evaluation of lead compounds to identify and target key processes underpinning NIS function, finding that multiple proteostasis pathways, including proteasomal degradation and autophagy, are central to the cellular processing of NIS.
Abstract: The sodium iodide symporter (NIS) functions to transport iodide and is critical for successful radioiodide ablation of cancer cells. Approaches to bolster NIS function and diminish recurrence post-radioiodide therapy are impeded by oncogenic pathways that suppress NIS, as well as the inherent complexity of NIS regulation. Here, we utilize NIS in high-throughput drug screening and undertake rigorous evaluation of lead compounds to identify and target key processes underpinning NIS function. We find that multiple proteostasis pathways, including proteasomal degradation and autophagy, are central to the cellular processing of NIS. Utilizing inhibitors targeting distinct molecular processes, we pinpoint combinatorial drug strategies giving robust >5-fold increases in radioiodide uptake. We also reveal significant dysregulation of core proteostasis genes in human tumors, identifying a 13-gene risk score classifier as an independent predictor of recurrence in radioiodide-treated patients. We thus propose and discuss a model for targetable steps of intracellular processing of NIS function.

6 citations

Journal ArticleDOI
01 Feb 2019 - Alcohol
TL;DR: This review highlights several of the newly developed sequencing methodologies and resultant discoveries in neuroscience, as well as the importance of a multi-faceted and integrative approach for determining causal factors in AUD.

6 citations

References
Journal ArticleDOI
TL;DR: The Gene Set Enrichment Analysis (GSEA) method interprets gene expression data by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article
TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
Abstract: We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large datasets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of datasets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the datasets.

30,124 citations

Journal ArticleDOI
TL;DR: The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data and provides a flexible and open design that facilitates submission, storage and retrieval of heterogeneous data sets from high-throughput gene expression and genomic hybridization experiments.
Abstract: The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data. GEO provides a flexible and open design that facilitates submission, storage and retrieval of heterogeneous data sets from high-throughput gene expression and genomic hybridization experiments. GEO is not intended to replace in house gene expression databases that benefit from coherent data sets, and which are constructed to facilitate a particular analytic method, but rather complement these by acting as a tertiary, central data distribution hub. The three central data entities of GEO are platforms, samples and series, and were designed with gene expression and genomic hybridization experiments in mind. A platform is, essentially, a list of probes that define what set of molecules may be detected. A sample describes the set of molecules that are being probed and references a single platform used to generate its molecular abundance data. A series organizes samples into the meaningful data sets which make up an experiment. The GEO repository is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.

10,968 citations
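
The GEO abstract above names three central data entities: platforms, samples and series. A minimal Python sketch of how they relate is given here, assuming hypothetical class names, field names, and identifiers chosen only for illustration; it is not GEO's actual schema, accession format, or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Platform:
    """A list of probes defining which molecules may be detected."""
    platform_id: str          # hypothetical identifier, not a real accession
    probes: tuple[str, ...]


@dataclass
class Sample:
    """Abundance measurements that reference exactly one platform."""
    sample_id: str
    platform: Platform
    abundance: dict[str, float]   # probe name -> measured abundance

    def __post_init__(self) -> None:
        # A sample may only report probes that its platform defines.
        unknown = set(self.abundance) - set(self.platform.probes)
        if unknown:
            raise ValueError(f"probes not on platform: {sorted(unknown)}")


@dataclass
class Series:
    """Groups samples into the data set that makes up one experiment."""
    series_id: str
    samples: list[Sample] = field(default_factory=list)


platform = Platform("P1", ("probe_a", "probe_b"))
sample = Sample("S1", platform, {"probe_a": 1.5})
series = Series("E1", [sample])
```

The validation in `Sample.__post_init__` is a design choice of this sketch: it enforces the abstract's statement that a sample references a single platform by rejecting measurements for probes that platform does not list.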

Journal ArticleDOI
TL;DR: This paper describes how BLAT was optimized; BLAT is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences.
Abstract: Analyzing vertebrate genomes requires rapid mRNA/DNA and cross-species protein alignments. A new tool, BLAT, is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences. BLAT's speed stems from an index of all nonoverlapping K-mers in the genome. This index fits inside the RAM of inexpensive computers, and need only be computed once for each genome assembly. BLAT has several major stages. It uses the index to find regions in the genome likely to be homologous to the query sequence. It performs an alignment between homologous regions. It stitches together these aligned regions (often exons) into larger alignments (typically genes). Finally, BLAT revisits small internal exons possibly missed at the first stage and adjusts large gap boundaries that have canonical splice sites where feasible. This paper describes how BLAT was optimized. Effects on speed and sensitivity are explored for various K-mer sizes, mismatch schemes, and number of required index matches. BLAT is compared with other alignment programs on various test sets and then used in several genome-wide applications. http://genome.ucsc.edu hosts a web-based BLAT server for the human genome.

8,326 citations
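
The first stage described in the BLAT abstract above (an index of nonoverlapping K-mers used to find candidate regions) can be illustrated with a small Python sketch. This is a toy version under stated assumptions, not BLAT's implementation: the function names are hypothetical, only exact K-mer matches are considered, and scanning the query with overlapping K-mers is an assumption of this sketch rather than something the abstract states.

```python
from collections import defaultdict


def build_index(genome: str, k: int) -> dict[str, list[int]]:
    """Index every nonoverlapping k-mer of the genome by its start position."""
    index: dict[str, list[int]] = defaultdict(list)
    # Step by k so that consecutive indexed k-mers do not overlap.
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return dict(index)


def find_hits(index: dict[str, list[int]], query: str, k: int) -> list[tuple[int, int]]:
    """Return (query_pos, genome_pos) pairs where a query k-mer is in the index."""
    hits = []
    # Assumption: the query is scanned with overlapping k-mers (step 1),
    # so a match is found regardless of how the query is phased.
    for q_pos in range(len(query) - k + 1):
        for g_pos in index.get(query[q_pos:q_pos + k], []):
            hits.append((q_pos, g_pos))
    return hits


genome = "ACGTACGTTTGA"
index = build_index(genome, k=4)
hits = find_hits(index, "GTTTGA", k=4)
```

Because the index is built once per genome and holds only one K-mer per K bases, it stays small relative to an index of all overlapping K-mers, which is consistent with the abstract's remark that it fits in RAM and need only be computed once per assembly.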

Journal ArticleDOI
TL;DR: This paper proposed parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that is robust to outliers in small sample sizes and performs comparable to existing methods for large samples.
Abstract: Non-biological experimental variation or “batch effects” are commonly observed across multiple batches of microarray experiments, often rendering the task of combining data from these batches difficult. The ability to combine microarray data sets is advantageous to researchers to increase statistical power to detect biological phenomena from studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that is robust to outliers in small sample sizes and performs comparable to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.

6,319 citations
