Journal Article

Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects.

TL;DR: DEOGEN is a novel variant effect predictor that handles both missense SNVs and in-frame INDELs and provides a 10% improvement in MCC over current state-of-the-art tools.
Abstract
Motivation: There are now many predictors capable of identifying the likely phenotypic effects of single nucleotide variants (SNVs) or short in-frame insertions or deletions (INDELs) in the increasing amount of genome sequence data. Most of these predictors focus on SNVs and use a combination of features related to sequence conservation and biophysical and/or structural properties to link the observed variant to either a neutral or a disease phenotype. Despite notable successes, the mapping between genetic variants and their phenotypic effects involves levels of complexity that are not yet fully understood and that are often not taken into account in the predictions, even though they promise to significantly improve the prediction of deleterious mutants.

Results: We present DEOGEN, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs. By integrating information from different biological scales and mimicking the complex mixture of effects that lead from the variant to the phenotype, we obtain significant improvements in the variant-effect prediction results. In addition to the typical variant-oriented features based on the evolutionary conservation of the mutated positions, we added a collection of protein-oriented features that are based on functional aspects of the affected gene. We cross-validated DEOGEN on 36 825 polymorphisms, 20 821 deleterious SNVs, and 1038 INDELs from SwissProt. The multilevel contextualization of each (variant, protein) pair in DEOGEN provides a 10% improvement of MCC with respect to current state-of-the-art tools.

Availability and implementation: The software and the data presented here are publicly available at http://ibsquare.be/deogen.

Contact: wvranken@vub.ac.be

Supplementary information: Supplementary data are available at Bioinformatics online.
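The abstract reports its headline result as a Matthews correlation coefficient (MCC). For readers unfamiliar with the metric, the following is a minimal sketch of how MCC is computed from binary predictions. The function names and the toy labels are illustrative assumptions and do not come from DEOGEN or its data; returning 0.0 when the denominator vanishes is a common convention, not something stated in the abstract.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = deleterious, 0 = neutral)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Toy labels, invented for illustration only (not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
mcc = mcc_from_counts(tp, tn, fp, fn)
print(f"TP={tp} TN={tn} FP={fp} FN={fn} MCC={mcc:.2f}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive than accuracy to class imbalance such as the unequal numbers of polymorphisms and deleterious variants mentioned above.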

Citations
Journal Article

DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins.

TL;DR: In this paper, the authors present a method, DEOGEN2, which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates.
Journal Article

Prediction and interpretation of deleterious coding variants in terms of protein structural stability.

TL;DR: A stability-driven knowledge-based classifier that uses protein structure, artificial neural networks and solvent accessibility-dependent combinations of statistical potentials to predict whether destabilizing or stabilizing mutations are disease-causing.
Journal Article

Predicting disease-causing variant combinations

TL;DR: VarCoPP has been designed to act as an interpretable method that can explain why a bilocus combination is predicted as pathogenic and which biological information is important for that prediction, paving the way to clinical knowledge and improved patient care.
Journal Article

Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning.

TL;DR: Two widely used structure‐based approaches, originally developed in the Blundell lab, are focused on: site‐directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph‐based signatures to represent the wild‐type structural environment and machine learning to predict the effect of mutations on protein stability.
Journal Article

Understanding mutational effects in digenic diseases

TL;DR: In this paper, a combination of variant, gene and higher-level features was used to differentiate between true and composite digenic effects with high accuracy, and a digenic effect decision profile, extracted from the predictive model, was shown to explain why an instance was assigned to either of the two classes.
References
Journal Article

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Journal Article

Scikit-learn: Machine Learning in Python

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Journal Article

A method and server for predicting damaging missense mutations.

TL;DR: A new method and corresponding software tool, PolyPhen-2, is presented. It differs from the earlier tool PolyPhen in its set of predictive features, its alignment pipeline, and its method of classification, and its performance, as shown by receiver operating characteristic curves, was consistently superior.
Book

The Metabolic and Molecular Bases of Inherited Disease

TL;DR: In this paper, the authors present a list of disorders of MITOCHONDRIAL FUNCTION, including the following: DISORDERS OF MIOCHONDRIC FERTILITY XIX, XVI, XIX.