Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models.
Hashem A. Shihab, Julian Gough, David Neil Cooper, Peter D. Stenson, Gary L. A. Barker, Keith J. Edwards, Ian N. M. Day, Tom R. Gaunt
TLDR
This paper describes the Functional Analysis Through Hidden Markov Models (FATHMM) software and server, a species-independent method with optional species-specific weightings for predicting the functional effects of protein missense variants, and demonstrates that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations.
Abstract:
The rate at which nonsynonymous single nucleotide polymorphisms (nsSNPs) are being identified in the human genome is increasing dramatically owing to advances in whole-genome/whole-exome sequencing technologies. Automated methods capable of accurately and reliably distinguishing between pathogenic and functionally neutral nsSNPs are therefore assuming ever-increasing importance. Here, we describe the Functional Analysis Through Hidden Markov Models (FATHMM) software and server: a species-independent method with optional species-specific weightings for the prediction of the functional effects of protein missense variants. Using a model weighted for human mutations, we obtained performance accuracies that outperformed traditional prediction methods (i.e., SIFT, PolyPhen, and PANTHER) on two separate benchmarks. Furthermore, in one benchmark, we achieve performance accuracies that outperform current state-of-the-art prediction methods (i.e., SNPs&GO and MutPred). We demonstrate that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations. To illustrate this, we evaluated nsSNPs in wheat (Triticum spp.) to identify some of the important genetic variants responsible for the phenotypic differences introduced by intense selection during domestication. A Web-based implementation of FATHMM, including a high-throughput batch facility and a downloadable standalone package, is available at http://fathmm.biocompute.org.uk.