BIGCHEM: Challenges and Opportunities for Big Data Analysis in Chemistry
TLDR
It is shown that the efficient exploration of billions of molecules requires the development of smart strategies, and the importance of education in “Big Data” for further progress of this area is highlighted.
Abstract
The increasing volume of biomedical data in chemistry and life sciences requires the development of new methods and approaches for their handling. Here, we briefly discuss some challenges and opportunities of this fast-growing area of research with a focus on those to be addressed within the BIGCHEM project. The article starts with a brief description of some available resources for "Big Data" in chemistry and a discussion of the importance of data quality. We then discuss challenges with visualization of millions of compounds by combining chemical and biological data, the expectations from mining the "Big Data" using advanced machine-learning methods, and their applications in polypharmacology prediction and target de-convolution in phenotypic screening. We show that the efficient exploration of billions of molecules requires the development of smart strategies. We also address the issue of secure information sharing without disclosing chemical structures, which is critical to enable bi-party or multi-party data sharing. Data sharing is important in the context of the recent trend of "open innovation" in the pharmaceutical industry, which has led to not only more information sharing among academics and pharma industries but also the so-called "precompetitive" collaboration between pharma companies. At the end we highlight the importance of education in "Big Data" for further progress of this area.
Citations
Introducing the Special Issue on “Bioinformatics” (“Bioinformatics” 특집을 내면서)
TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.
Journal Article
Automating drug discovery
TL;DR: This article aims to identify the approaches and technologies that could be implemented robustly by medicinal chemists in the near future and to critically analyse the opportunities and challenges for their more widespread application.
Journal Article
A renaissance of neural networks in drug discovery
TL;DR: This review discusses traditional and newly emerging neural network approaches to drug discovery, focusing on backpropagation neural networks and their variants, self-organizing maps and associated methods, and a relatively new technique, deep learning.
Journal Article
Synthetic organic chemistry driven by artificial intelligence
TL;DR: The underlying concepts of artificial intelligence are examined to demystify AI for bench chemists so that they may embrace it as a tool rather than fear it as a competitor, spur future research by pinpointing the gaps in knowledge, and delineate how chemical AI will run in the era of digital chemistry.
Journal Article
Autonomous Discovery in the Chemical Sciences Part II: Outlook.
TL;DR: The majority of this article defines a large set of open research directions, including improving the ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether progress is being made toward the ultimate goal of autonomous discovery.
References
Journal Article
PubChem Substance and Compound databases
Sunghwan Kim, Paul A. Thiessen, Evan E. Bolton, Jie Chen, Gang Fu, Asta Gindulyte, Lianyi Han, Jane He, Siqian He, Benjamin A. Shoemaker, Jiyao Wang, Bo Yu, Jian-Jian Zhang, Stephen H. Bryant
TL;DR: An overview of the PubChem Substance and Compound databases is provided, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access.
Journal Article
New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays
TL;DR: A number of substructural features which can help to identify compounds that appear as frequent hitters (promiscuous compounds) in many biochemical high throughput screens are described.
Journal Article
Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17
TL;DR: To better define the unknown chemical space, 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens forming the chemical universe database GDB-17 are enumerated, covering a size range containing many drugs and typical for lead compounds.
Journal Article
BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology.
TL;DR: The first update of BindingDB since 2007 is provided, focusing on new and unique features and highlighting directions of importance to the field as a whole.