The GCTx format and cmap{Py, R, M, J} packages: resources for optimized storage and integrated traversal of annotated dense matrices.
Oana M. Enache, David L. Lahr, Ted Natoli, Lev Litichevskiy, David Wadden, Corey Flynn, Joshua Gould, Jacob K. Asiedu, Rajiv Narayan, Aravind Subramanian +9 more
TLDR
The GCTx file format and a suite of open-source packages for the efficient storage, serialization and analysis of dense two-dimensional matrices are presented, and it is anticipated that the format's generalizability will lower barriers for integrated cross-assay analysis and algorithm development.
Abstract:
Motivation: Facilitated by technological improvements, pharmacologic and genetic perturbational datasets have grown in recent years to include millions of experiments. Sharing and publicly distributing these diverse data creates many opportunities for discovery, but in recent years the unprecedented size of data generated and its complex associated metadata have also created data storage and integration challenges.
Results: We present the GCTx file format and a suite of open-source packages for the efficient storage, serialization and analysis of dense two-dimensional matrices. We have extensively used the format in the Connectivity Map to assemble and share massive datasets currently comprising 1.3 million experiments, and we anticipate that the format's generalizability, paired with code libraries that we provide, will lower barriers for integrated cross-assay analysis and algorithm development.
Availability and implementation: Software packages (available in Python, R, Matlab and Java) are freely available at https://github.com/cmap. Additional instructions, tutorials and datasets are available at clue.io/code.
Supplementary information: Supplementary data are available at Bioinformatics online.