Showing papers by "Nina Jeliazkova" published in 2017


Journal Article
TL;DR: This paper highlights the continued efforts to provide a community driven, open source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library.
Abstract: Background: The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such ...

347 citations


Journal Article
TL;DR: In this article, the authors compile a comprehensive chemogenomics dataset with over 70 million SAR data points from publicly available databases (PubChem and ChEMBL) including structure, target information and activity annotations.
Abstract: Chemogenomics data generally refers to the activity data of chemical compounds on an array of protein targets and represents an important source of information for building in silico target prediction models. The increasing volume of chemogenomics data offers exciting opportunities to build models based on Big Data. Preparing a high quality data set is a vital step in realizing this goal and this work aims to compile such a comprehensive chemogenomics dataset. This dataset comprises over 70 million SAR data points from publicly available databases (PubChem and ChEMBL) including structure, target information and activity annotations. Our aspiration is to create a useful chemogenomics resource reflecting industry-scale data not only for building predictive models of in silico polypharmacology and off-target effects but also for the validation of cheminformatics approaches in general.

136 citations


Journal Article
TL;DR: The aim was to interpret and expand the guidance for the well-known "OECD Principles for the Validation, for Regulatory Purposes, of (Q)SAR Models", with reference to nano-SAR, and present opinions on the criteria to be fulfilled for models developed for nanoparticles.

29 citations


Journal Article
TL;DR: The aspiration is to create a useful chemogenomics resource reflecting industry‐scale data not only for building predictive models of in silico polypharmacology and off‐target effects but also for the validation of cheminformatics approaches in general.
Abstract: Chemogenomics data generally refers to the activity data of chemical compounds on an array of protein targets and represents an important source of information for building in silico target prediction models. The increasing volume of chemogenomics data offers exciting opportunities to build models based on Big Data. Preparing a high quality data set is a vital step in realizing this goal and this work aims to compile such a comprehensive chemogenomics dataset. This dataset comprises over 70 million SAR data points from publicly available databases (PubChem and ChEMBL) including structure, target information and activity annotations. Our aspiration is to create a useful chemogenomics resource reflecting industry‐scale data not only for building predictive models of in silico polypharmacology and off‐target effects but also for the validation of cheminformatics approaches in general.

11 citations