
Showing papers by "Nina Jeliazkova" published in 2014


Journal Article
TL;DR: An R‐code is provided that evaluates performances of the suggested algorithms to assess predictive error based on log likelihood scores and empirical coverage graphs, and which applies these to derive confidence intervals or samples from the predictive distributions of query compounds.
Abstract: Predictive models used in decision making, such as QSARs in chemical regulation or drug discovery, call for evaluated approaches to quantitatively assess associated uncertainty in predictions. Uncertainty in less reliable predictions may be captured by locally varying predictive errors. In the current study, model-based bootstrapping was combined with analogy reasoning to generate predictive distributions varying in magnitude over a model's domain of applicability. A resampling experiment based on PLS regressions on four QSAR data sets demonstrated that predictive errors assessed by k nearest neighbour or weighted PRedicted Error Sum of Squares (PRESS) on samples of external test data or by internal cross-validation improved the performance of the uncertainty assessment. Analogy using similarity defined by Euclidean distances, or differences in standard deviation in perturbed predictions, resulted in better performances than similarity defined by distance to, or density of, the training data. Locally assessed predictive distributions had on average at least as good coverage as a Gaussian distribution with variance assessed from the PRESS. An R-code is provided that evaluates performances of the suggested algorithms to assess predictive error based on log likelihood scores and empirical coverage graphs, and which applies these to derive confidence intervals or samples from the predictive distributions of query compounds.

24 citations


Journal Article
TL;DR: The main features of the QSPR-THESAURUS website are overviewed, such as model upload, experimental design and hazard assessment to support risk assessment, and integration with other web tools, all of which are essential parts of the project.
Abstract: The aim of the CADASTER project (CAse Studies on the Development and Application of in Silico Techniques for Environmental Hazard and Risk Assessment) was to exemplify REACH-related hazard assessments for four classes of chemical compound, namely, polybrominated diphenylethers, per- and polyfluorinated compounds, (benzo)triazoles, and musks and fragrances. The QSPR-THESAURUS website (http://qspr-thesaurus.eu) was established as the project's online platform to upload, store, apply, and also create, models within the project. We overview the main features of the website, such as model upload, experimental design and hazard assessment to support risk assessment, and integration with other web tools, all of which are essential parts of the QSPR-THESAURUS.

12 citations


Proceedings Article
01 Jan 2014
TL;DR: The eNanoMapper database solution builds on previous experience of the consortium partners in supporting diverse data through flexible data storage, semantic web technologies, open source components and web services, and adopts an ontology-supported data model, describing the materials and measurements.
Abstract: The EU-funded eNanoMapper project proposes a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs) based on open standards, ontologies and an interoperable design to enable a more effective, integrated approach to European research in nanotechnology. eNanoMapper's goal is to support the collaborative safety assessment for ENMs by creating a modular, extensible infrastructure for transparent data sharing, data analysis, and the creation of computational toxicology models for ENMs. The eNanoMapper database solution builds on previous experience of the consortium partners in supporting diverse data through flexible data storage, semantic web technologies, open source components and web services. A number of opportunities and challenges exist in nanomaterials representation and integration of ENM information, originating from diverse systems. A short summary, highlighting the pros and cons of the existing integration approaches and data models is presented. We demonstrate the approach of adopting an ontology-supported data model, describing the materials and measurements. The data sources supported include diverse formats (ISA-Tab, OECD Harmonized Templates, custom spreadsheet templates, various databases provided by consortia members). Besides retaining the data provenance, the focus on measurements provides insights into how to reuse the chemical structure database tools for nanomaterials characterization and safety.

10 citations



01 Jan 2014
TL;DR: The strategy to adopt and extend ontologies in support of data integration for eNanoMapper, a pan-European computational infrastructure for toxicological data management for ENMs, based on semantic web standards and ontologies is described.
Abstract: Engineered nanomaterials (ENMs) are being developed to meet specific application needs in diverse domains across the engineering and biomedical sciences (e.g. drug delivery). However, accompanying the exciting proliferation of novel nanomaterials is a challenging race to understand and predict their possibly detrimental effects on human health and the environment. The eNanoMapper project (www.enanomapper.net) is creating a pan-European computational infrastructure for toxicological data management for ENMs, based on semantic web standards and ontologies. Here, we describe our strategy to adopt and extend ontologies in support of data integration for eNanoMapper. ENM safety is at the boundary between engineering and the life sciences, and at the boundary between molecular granularity and bulk granularity. This creates challenges for the definition of key entities in the domain, which we will also discuss.

1 citation