Author

Bang Wong

Bio: Bang Wong is an academic researcher at the Broad Institute. The author has contributed to research on the topics of drug repositioning and biological data. The author has an h-index of 23 and has co-authored 59 publications receiving 5,046 citations. Previous affiliations of Bang Wong include Johns Hopkins University School of Medicine and Harvard University.

Papers
Journal ArticleDOI
30 Nov 2017-Cell
TL;DR: The expanded CMap is reported, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that is shown to be highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts.

1,943 citations

Posted ContentDOI
10 May 2017-bioRxiv
TL;DR: A new, low-cost, high throughput reduced representation expression profiling method, L1000, is shown to be highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts.
Abstract: We previously piloted the concept of a Connectivity Map (CMap), whereby genes, drugs and disease states are connected by virtue of common gene-expression signatures. Here, we report more than a 1,000-fold scale-up of the CMap as part of the NIH LINCS Consortium, made possible by a new, low-cost, high throughput reduced representation expression profiling method that we term L1000. We show that L1000 is highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts. We further show that the expanded CMap can be used to discover mechanism of action of small molecules, functionally annotate genetic variants of disease genes, and inform clinical trials. The 1.3 million L1000 profiles described here, as well as tools for their analysis, are available at https://clue.io.

636 citations

Journal ArticleDOI
TL;DR: This work hand-curated a collection of 4,707 compounds, experimentally confirmed their identities, and annotated them with literature-reported targets, to assemble a comprehensive library of drugs that have reached the clinic and established a blueprint for others to easily assemble such a repurposing library.
Abstract: To the Editor: Drug repurposing, the application of an existing therapeutic to a new disease indication, holds promise of rapid clinical impact at a lower cost than de novo drug development. So far, there has not been a systematic effort to identify such opportunities, limited in part by the lack of a comprehensive library of clinical compounds suitable for testing. To address this challenge, we hand-curated a collection of 4,707 compounds, experimentally confirmed their identities, and annotated them with literature-reported targets. The collection includes 3,422 drugs that are marketed around the world or that have been tested in human clinical trials. Compounds were obtained from more than 50 chemical vendors, and the purity of each sample was established. We have thus established a blueprint for others to easily assemble such a repurposing library, and we have created an online Drug Repurposing Hub (http://www.broadinstitute.org/repurposing) that contains detailed annotation for each of the compounds. Repurposing is attractive and pragmatic, given the substantial cost and time requirements (on average, a decade or more) for drug development [1]. In addition, a large number of potential drugs never reach clinical testing. Moreover, fewer than 15% of compounds that enter clinical development ultimately receive approval, despite the majority of them being deemed safe [2]. For either approved or failed drugs for which safety has already been established, finding new indications can rapidly bring benefits to patients. Prior drug-repurposing successes span disease areas; examples include the cyclooxygenase inhibitor aspirin to treat coronary-artery disease, the phosphodiesterase inhibitor sildenafil to treat erectile dysfunction, and the antibiotic erythromycin for impaired gastric motility (Supplementary Table 1) [3].
Even drugs associated with troubling side effects merit reconsideration, as evidenced by the successful repurposing of the antiemetic thalidomide to treat multiple myeloma [4]. Risk-mediating measures for avoiding the potential teratogenicity of thalidomide and its derivatives are reasonable in patients with life-threatening cancer, whereas the use of these drugs to treat nausea remains unacceptable. Although the benefits of repurposing are clear, successes thus far have been mostly serendipitous. Systematic, large-scale repurposing efforts have not been possible owing to the lack of a definitive physical drug collection, the low quality of drug annotations, and insufficient readouts of drug activity from which new indications can be predicted. Recent technological advances have enabled a step change in our ability to assess drug activities comprehensively. For example, perturbational gene expression profiles can now be obtained at high throughput across multiple cell types [5]. Gene expression profiling has enabled recent repurposing discoveries, including sirolimus for glucocorticoid-resistant acute lymphocytic leukemia, topiramate for inflammatory-bowel disease, and imipramine for small-cell lung cancer. For cancer therapeutics, a recently developed assay known as PRISM, which uses barcoded cell lines, enables rapid testing of many drugs against a large number of cancer cell lines in pools [6]. Molecular features of the cell lines (for example, gene expression, mutation, or copy-number variation) can then be used to identify predictive biomarkers of drug sensitivity (Supplementary Table 2). Finally, morphologic changes in cells can be assessed using high-throughput microscopy and machine-learning approaches. Such imaging-based screening unexpectedly identified the cholesterol drug lovastatin as a potent inhibitor of leukemia stem cells.
To take advantage of these advances in experimental methods, we sought to assemble a comprehensive library of drugs that have reached the clinic. Surprisingly, we found that no such chemical library of approved and clinical trial drugs is available for purchase. In particular, drugs that have been tested in clinical trials but did not reach approval are not readily accessible. Even obtaining a complete list of such drugs and their annotations is challenging. A prior effort led by the US National Institutes of Health (NIH) focused on drugs approved by the US Food and Drug Administration (FDA), but the library has few compounds that have yet to achieve FDA approval [7]. Some chemical vendors offer a subset of approved drugs, but most of these commercial libraries overlap in their content and include only a small fraction of the approximately 10,000 drugs that have reached the clinic in the United States and Europe. Given that no complete collection exists, we launched a three-step effort to create the Repurposing Library by (i) identifying and purchasing compounds; (ii) comprehensively annotating their known activities and clinical indications; and (iii) experimentally confirming drug identity and purity. We employed two approaches to identify clinical-drug structures for the Repurposing Library. First, we searched existing databases, both publicly accessible and proprietary, for clinically tested drugs and then manually integrated them to ensure sufficient drug coverage and chemical-structure reliability (Supplementary Table 3). Sources included DrugBank, the NCATS NCGC Pharmaceutical Collection (NPC), Thomson Reuters Integrity, Thomson Reuters Cortellis, and Citeline Pharmaprojects [7–9]. Second, we located marketed or approved ingredient lists from regulatory agencies worldwide, including the FDA.
After structure standardization and the removal of duplicates, approximately 10,000 small-molecule drugs with disclosed structures were found to have reached clinical development. Most of these drugs are not widely available in commercial screening libraries. Through structure-matching (as opposed to relying on compound names), chemical suppliers were identified for 5,691 compounds (Fig. 1). Controlled substances, nonpharmaceutical substances, and redundant elemental formulations were not pursued further. To assemble the collection, we ultimately purchased 8,584 samples (representing 5,691 unique compounds) from 75 chemical vendors, at an average cost of $29 per sample. We performed chemical-structure analysis on all clinical-drug structures (whether commercially available or not) to assess the extent of
The Drug Repurposing Hub: a next-generation drug library and information resource

619 citations

Journal ArticleDOI
08 Aug 2018-Nature
TL;DR: The extent, origins and consequences of genetic variation within human cell lines are studied, providing a framework for researchers to measure such variation in efforts to support maximally reproducible cancer research.
Abstract: Human cancer cell lines are the workhorse of cancer research. Although cell lines are known to evolve in culture, the extent of the resultant genetic and transcriptional heterogeneity and its functional consequences remain understudied. Here we use genomic analyses of 106 human cell lines grown in two laboratories to show extensive clonal diversity. Further comprehensive genomic characterization of 27 strains of the common breast cancer cell line MCF7 uncovered rapid genetic diversification. Similar results were obtained with multiple strains of 13 additional cell lines. Notably, genetic changes were associated with differential activation of gene expression programs and marked differences in cell morphology and proliferation. Barcoding experiments showed that cell line evolution occurs as a result of positive clonal selection that is highly sensitive to culture conditions. Analyses of single-cell-derived clones demonstrated that continuous instability quickly translates into heterogeneity of the cell line. When the 27 MCF7 strains were tested against 321 anti-cancer compounds, we uncovered considerably different drug responses: at least 75% of compounds that strongly inhibited some strains were completely inactive in others. This study documents the extent, origins and consequences of genetic variation within cell lines, and provides a framework for researchers to measure such variation in efforts to support maximally reproducible cancer research.

601 citations

Journal ArticleDOI
TL;DR: Several CNAs recurrently observed in primary tumors gradually disappeared in PDXs, indicating that events undergoing positive selection in humans can become dispensable during propagation in mice.
Abstract: Patient-derived xenografts (PDXs) have become a prominent cancer model system, as they are presumed to faithfully represent the genomic features of primary tumors. Here we monitored the dynamics of copy number alterations (CNAs) in 1,110 PDX samples across 24 cancer types. We observed rapid accumulation of CNAs during PDX passaging, often due to selection of preexisting minor clones. CNA acquisition in PDXs was correlated with the tissue-specific levels of aneuploidy and genetic heterogeneity observed in primary tumors. However, the particular CNAs acquired during PDX passaging differed from those acquired during tumor evolution in patients. Several CNAs recurrently observed in primary tumors gradually disappeared in PDXs, indicating that events undergoing positive selection in humans can become dispensable during propagation in mice. Notably, the genomic stability of PDXs was associated with their response to chemotherapy and targeted drugs. These findings have major implications for PDX-based modeling of human cancer.

494 citations


Cited by
01 Jan 2011
TL;DR: The sheer volume and scope of diverse genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal ArticleDOI
TL;DR: All of the major steps in RNA-seq data analysis are reviewed, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping.
Abstract: RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics.

1,963 citations

Journal ArticleDOI
30 Nov 2017-Cell
TL;DR: The expanded CMap is reported, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that is shown to be highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts.

1,943 citations

Journal ArticleDOI
08 May 2019-Nature
TL;DR: The original Cancer Cell Line Encyclopedia is expanded with deeper characterization of over 1,000 cell lines, including genomic, transcriptomic, and proteomic data, and integration with drug-sensitivity and gene-dependency data, which reveals potential targets for cancer drugs and associated biomarkers.
Abstract: Large panels of comprehensively characterized human cancer models, including the Cancer Cell Line Encyclopedia (CCLE), have provided a rigorous framework with which to study genetic variants, candidate targets, and small-molecule and biological therapeutics and to identify new marker-driven cancer dependencies. To improve our understanding of the molecular features that contribute to cancer phenotypes, including drug responses, here we have expanded the characterizations of cancer cell lines to include genetic, RNA splicing, DNA methylation, histone H3 modification, microRNA expression and reverse-phase protein array data for 1,072 cell lines from individuals of various lineages and ethnicities. Integration of these data with functional characterizations such as drug-sensitivity, short hairpin RNA knockdown and CRISPR-Cas9 knockout data reveals potential targets for cancer drugs and associated biomarkers. Together, this dataset and an accompanying public data portal provide a resource for the acceleration of cancer research using model cancer cell lines.

1,801 citations

Journal ArticleDOI
04 Dec 2014-Cell
TL;DR: A combination of tissue- and lineage-specific transcription factors forms the regulatory networks controlling chromatin specification in tissue-resident macrophages; the environment is capable of shaping the chromatin landscape of transplanted bone marrow precursors, and even differentiated macrophages can be reprogrammed when transferred into a new microenvironment.

1,628 citations