A new disease-specific machine learning approach for the prediction of cancer-causing missense variants
Emidio Capriotti,Russ B. Altman
TLDR
A Support Vector Machine (SVM) classifier trained on a set of 3163 cancer-causing variants and an equal number of neutral polymorphisms is presented, which results in higher prediction accuracy and correlation coefficient in identifying cancer-causing variants.
About:
This article is published in Genomics. The article was published on 2011-10-01 and is currently open access. It has received 75 citations to date. The article focuses on the topic: Single-nucleotide polymorphism.
Citations
Journal Article
UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches
TL;DR: The results support the use of UniRef clusters as a comprehensive and scalable alternative to native sequence databases for similarity searches and reinforce their reliability for use in functional annotation.
Journal Article
Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models.
Hashem A. Shihab,Julian Gough,David Neil Cooper,Peter D. Stenson,Gary L A Barker,Keith J. Edwards,Ian N. M. Day,Tom R. Gaunt
TL;DR: The Functional Analysis Through Hidden Markov Models (FATHMM) software and server is described: a species‐independent method with optional species‐specific weightings for the prediction of the functional effects of protein missense variants, demonstrating that FATHMM can be efficiently applied to high‐throughput/large‐scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations.
Journal Article
Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.
TL;DR: The recent progress of SVMs in cancer genomic studies is reviewed, and the strengths of SVM learning and its future prospects in cancer genomic applications are discussed.
Journal Article
Identifying Mendelian disease genes with the Variant Effect Scoring Tool
TL;DR: The ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases, demonstrates the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies.
Journal Article
WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation
Emidio Capriotti,Remo Calabrese,Piero Fariselli,Pier Luigi Martelli,Russ B. Altman,Rita Casadio
TL;DR: This work presents the web server implementation of SNPs&GO, a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function.
References
Journal Article
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Stephen F. Altschul,Thomas L. Madden,Alejandro A. Schäffer,Jinghui Zhang,Zheng Zhang,Webb Miller,David J. Lipman
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article
Gene Ontology: tool for the unification of biology
M Ashburner,Catherine A. Ball,Judith A. Blake,David Botstein,Heather Butler,J. M. Cherry,Allan Peter Davis,Kara Dolinski,Selina S. Dwight,J.T. Eppig,Midori A. Harris,David P. Hill,Laurie Issel-Tarver,Andrew Kasarskis,Suzanna E. Lewis,John C. Matese,Joel E. Richardson,M. Ringwald,Gerald M. Rubin,Gavin Sherlock
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Journal Article
The Pfam protein families database
Marco Punta,Penny Coggill,Ruth Y. Eberhardt,Jaina Mistry,John Tate,Chris Boursnell,Ningze Pang,Kristoffer Forslund,Goran Ceric,Jody Clements,Andreas Heger,Liisa Holm,Erik L. L. Sonnhammer,Sean R. Eddy,Alex Bateman,Robert D. Finn
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Journal Article
Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls
Paul Burton,David Clayton,Lon R. Cardon,Nicholas John Craddock,Panos Deloukas,Audrey Duncanson,Dominic P. Kwiatkowski,Mark I. McCarthy,Willem H. Ouwehand,Nilesh J. Samani,John A. Todd,Peter Donnelly,Jeffrey C. Barrett,Dan Davison,Doug Easton,David M. Evans,H. T. Leung,Jonathan Marchini,Andrew P. Morris,Chris C. A. Spencer,Martin D. Tobin,Antony P. Attwood,James P. Boorman,Barbara Cant,Ursula Everson,Judith M. Hussey,Jennifer Jolley,Alexandra S. Knight,Kerstin Koch,Elizabeth Meech,Sarah Nutland,Christopher Prowse,Helen Stevens,Niall C. Taylor,Graham R. Walters,Neil Walker,Nicholas A. Watkins,Thilo Winzer,Richard Jones,Wendy L. McArdle,Susan M. Ring,David P. Strachan,Marcus Pembrey,Gerome Breen,David St Clair,Sian Caesar,Katherine Gordon-Smith,Lisa Jones,Christine Fraser,Elaine K. Green,Detelina Grozeva,Marian L. Hamshere,Peter Holmans,Ian Jones,George Kirov,Valentina Moskvina,Ivan Nikolov,Michael Conlon O'Donovan,Michael John Owen,David A. Collier,Amanda Elkin,Anne Farmer,Richard Williamson,Peter McGuffin,Allan H. Young,I. Nicol Ferrier,Stephen G. Ball,Anthony J. Balmforth,Jennifer H. Barrett,D. Timothy Bishop,Mark M. Iles,Azhar Maqbool,Nadira Yuldasheva,Alistair S. Hall,Peter S. Braund,Richard J. Dixon,Massimo Mangino,Suzanne Stevens,John R. Thompson,Francesca Bredin,Mark Tremelling,Miles Parkes,Hazel E. Drummond,Charlie W. Lees,Elaine R. Nimmo,Jack Satsangi,Sheila A. Fisher,Alastair Forbes,Cathryn M. Lewis,Clive M. Onnie,Natalie J. Prescott,Jeremy D. Sanderson,Christopher G. Mathew,Jamie Barbour,M. Khalid Mohiuddin,Catherine E. Todhunter,John C. Mansfield,Tariq Ahmad,Fraser Cummings,Derek P. Jewell,John Webster,Morris J. Brown,G. Mark Lathrop,John M. C. Connell,Anna F. Dominiczak,Carolina A. Braga Marcano,Beverley Burke,Richard Dobson,Johannie Gungadoo,Kate L. Lee,Patricia B. Munroe,Stephen Newhouse,Abiodun Onipinla,Chris Wallace,Mingzhan Xue,Mark J. Caulfield,Martin Farrall,Anne Barton,Ian N. Bruce,Hannah Donovan,Steve Eyre,Paul D. Gilbert,Samantha L. Hider,Anne Hinks,Sally John,Catherine Potter,Alan J. Silman,Deborah P M Symmons,Wendy Thomson,Jane Worthington,David B. Dunger,Barry Widmer,Timothy M. Frayling,Rachel M. Freathy,Hana Lango,John R. B. Perry,Beverley M. Shields,Michael N. Weedon,Andrew T. Hattersley,Graham A. Hitman,Mark Walker,Kate S. Elliott,Christopher J. Groves,Cecilia M. Lindgren,Nigel W. Rayner,Nicholas J. Timpson,Eleftheria Zeggini,Melanie J. Newport,Giorgio Sirugo,Emily J. Lyons,Fredrik O. Vannberg,Adrian V. S. Hill,Linda A. Bradbury,C Farrar,J J Pointon,Paul Wordsworth,Matthew A. Brown,Jayne A. Franklyn,Joanne M. Heward,Matthew J. Simmonds,Stephen C. L. Gough,Sheila Seal,Michael R. Stratton,Nazneen Rahman,Maria Ban,An Goris,Stephen Sawcer,Alastair Compston,David J. Conway,Muminatou Jallow,Kirk A. Rockett,Suzannah Bumpstead,Amy Chaney,Kate Downes,Mohammed J. R. Ghori,Rhian Gwilliam,Sarah E. Hunt,Michael Inouye,Andrew Keniry,Emma King,Ralph McGinnis,Simon C. Potter,Rathi Ravindrarajah,Pamela Whittaker,Claire Widden,David Withers,Niall Cardin,Teresa Ferreira,Joanne Pereira-Gale,Ingileif B. Hallgrímsdóttir,Bryan Howie,Zhan Su,Yik Ying Teo,Damjan Vukcevic,David Bentley,A Compston
TL;DR: This study has demonstrated that careful use of a shared control group represents a safe and effective approach to GWA analyses of multiple disease phenotypes; generated a genome-wide genotype database for future studies of common diseases in the British population; and shown that, provided individuals with non-European ancestry are excluded, the extent of population stratification in the British population is generally modest.
Journal Article
A Map of Human Genome Variation From Population-Scale Sequencing
Gonçalo R. Abecasis,David Altshuler,Adam Auton,Lisa D Brooks,Richard Durbin,Richard A. Gibbs,Matthew E. Hurles,Gil McVean
TL;DR: The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. This paper presents the results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms.