Journal Article
Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects.
Daniele Raimondi, Andrea Gazzo, Marianne Rooman, Tom Lenaerts, Wim F. Vranken
Vol. 32, Iss. 12, pp. 1797-1804
TL;DR: DEOGEN is presented, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs and provides a 10% improvement of MCC with respect to current state-of-the-art tools.
Abstract:
Motivation: There are now many predictors capable of identifying the likely phenotypic effects of single nucleotide variants (SNVs) or short in-frame insertions or deletions (INDELs) in the increasing amount of genome sequence data. Most of these predictors focus on SNVs and use a combination of features related to sequence conservation, biophysical, and/or structural properties to link the observed variant to either a neutral or a disease phenotype. Despite notable successes, the mapping between genetic variants and their phenotypic effects involves levels of complexity that are not yet fully understood and that are often not taken into account in the predictions, even though they promise to significantly improve the prediction of deleterious mutants.
Results: We present DEOGEN, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs. By integrating information from different biological scales and mimicking the complex mixture of effects that lead from the variant to the phenotype, we obtain significant improvements in the variant-effect prediction results. In addition to the typical variant-oriented features based on the evolutionary conservation of the mutated positions, we added a collection of protein-oriented features that are based on functional aspects of the affected gene. We cross-validated DEOGEN on 36 825 polymorphisms, 20 821 deleterious SNVs, and 1038 INDELs from SwissProt. The multilevel contextualization of each (variant, protein) pair in DEOGEN provides a 10% improvement of MCC with respect to current state-of-the-art tools.
Availability and implementation: The software and the data presented here are publicly available at http://ibsquare.be/deogen.
Contact: wvranken@vub.ac.be
Supplementary information: Supplementary data are available at Bioinformatics online.
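The MCC quoted in the abstract is the Matthews correlation coefficient, a single-number summary of a binary confusion matrix that stays informative when the two classes are unbalanced. The sketch below shows how it is computed from raw counts. The function name `mcc` and all counts are illustrative assumptions, not values from the paper, and the example does not reproduce DEOGEN's evaluation.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fn: deleterious variants predicted deleterious/neutral.
    tn/fp: neutral variants predicted neutral/deleterious.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the matrix is empty.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for two predictors on the same made-up test set
# (100 deleterious and 100 neutral variants); NOT taken from the paper.
baseline = mcc(tp=80, tn=80, fp=20, fn=20)
improved = mcc(tp=85, tn=85, fp=15, fn=15)

print(f"baseline MCC: {baseline:.2f}")
print(f"improved MCC: {improved:.2f}")
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes differences between tools easy to compare on the same dataset.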
Citations
Journal Article
DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins.
Daniele Raimondi, Ibrahim Tanyalcin, Julien Ferte, Andrea Gazzo, Gabriele Orlando, Tom Lenaerts, Marianne Rooman, Wim F. Vranken
TL;DR: In this paper, the authors present a method, DEOGEN2, which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates.
Journal Article
Prediction and interpretation of deleterious coding variants in terms of protein structural stability.
TL;DR: A stability-driven knowledge-based classifier that uses protein structure, artificial neural networks and solvent accessibility-dependent combinations of statistical potentials to predict whether destabilizing or stabilizing mutations are disease-causing.
Journal Article
Predicting disease-causing variant combinations
Sofia Papadimitriou, Andrea Gazzo, Nassim Versbraegen, Charlotte Nachtegael, Jan Aerts, Yves Moreau, Sonia Van Dooren, Ann Nowé, Guillaume Smits, Tom Lenaerts
TL;DR: VarCoPP is designed as an interpretable method that can explain why a bilocus combination is predicted as pathogenic and which biological information is important for that prediction, paving the way to clinical knowledge and improved patient care.
Journal Article
Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning.
TL;DR: Two widely used structure‐based approaches, originally developed in the Blundell lab, are focused on: site‐directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph‐based signatures to represent the wild‐type structural environment and machine learning to predict the effect of mutations on protein stability.
Journal Article
Understanding mutational effects in digenic diseases
Andrea Gazzo, Daniele Raimondi, Dorien Daneels, Yves Moreau, Guillaume Smits, Sonia Van Dooren, Tom Lenaerts
TL;DR: In this paper, a combination of variant, gene and higher-level features was used to differentiate between true and composite digenic effects with high accuracy, and a digenic effect decision profile, extracted from the predictive model, was shown to explain why an instance was assigned to either of the two classes.
References
Journal Article
Random Forests
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Journal Article
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Posted Content
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Journal Article
A method and server for predicting damaging missense mutations.
Ivan Adzhubei, Steffen Schmidt, Leonid Peshkin, Vasily Ramensky, Anna Gerasimova, Peer Bork, Alexey S. Kondrashov, Shamil R. Sunyaev
TL;DR: A new method and the corresponding software tool, PolyPhen-2, is presented; it differs from the early tool PolyPhen1 in the set of predictive features, the alignment pipeline, and the method of classification, and its performance, as shown by its receiver operating characteristic curves, was consistently superior.
Book
The Metabolic and Molecular Bases of Inherited Disease
TL;DR: The authors present a list of inherited disorders, including disorders of mitochondrial function.
Related Papers (5)
PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels
Yongwook Choi, Agnes P. Chan