Open Access Journal Article

BIGCHEM: Challenges and Opportunities for Big Data Analysis in Chemistry

TLDR
The efficient exploration of billions of molecules is shown to require the development of smart strategies, and the importance of education in “Big Data” for further progress in this area is highlighted.
Abstract
The increasing volume of biomedical data in chemistry and the life sciences requires the development of new methods and approaches for handling them. Here, we briefly discuss some challenges and opportunities of this fast-growing area of research, with a focus on those to be addressed within the BIGCHEM project. The article starts with a brief description of some available resources for "Big Data" in chemistry and a discussion of the importance of data quality. We then discuss the challenges of visualizing millions of compounds by combining chemical and biological data, the expectations from mining "Big Data" using advanced machine-learning methods, and their applications in polypharmacology prediction and target deconvolution in phenotypic screening. We show that the efficient exploration of billions of molecules requires the development of smart strategies. We also address the issue of secure information sharing without disclosing chemical structures, which is critical to enable bi-party or multi-party data sharing. Data sharing is important in the context of the recent trend of "open innovation" in the pharmaceutical industry, which has led not only to more information sharing between academia and the pharma industry but also to so-called "precompetitive" collaboration between pharma companies. Finally, we highlight the importance of education in "Big Data" for further progress in this area.

