Showing papers on "Variant Call Format" published in 2018

Open Access

Journal Article•DOI•

ClinVar: improving access to variant interpretations and supporting evidence.

Melissa J. Landrum¹, Jennifer M. Lee¹, Mark L. Benson¹, Garth Brown¹, Chen Chao¹, Shanmuga Chitipiralla¹, Baoshan Gu¹, Jennifer Hart¹, Douglas W. Hoffman¹, Wonhee Jang¹, Karen Karapetyan¹, Kenneth S. Katz¹, Chunlei Liu¹, Zenith Maddipatla¹, Malheiro Aj¹, Kurt McDaniel¹, Michael Ovetsky¹, George R. Riley¹, George Zhou¹, J. Bradley Holmes¹, Brandi L. Kattman¹, Donna Maglott¹•Institutions (1)

National Institutes of Health¹

04 Jan 2018-Nucleic Acids Research

TL;DR: ClinVar continues to make improvements to its search and retrieval functions.

Abstract: ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) is a freely available, public archive of human genetic variants and interpretations of their significance to disease, maintained at the National Institutes of Health. Interpretations of the clinical significance of variants are submitted by clinical testing laboratories, research laboratories, expert panels and other groups. ClinVar aggregates data by variant-disease pairs, and by variant (or set of variants). Data aggregated by variant are accessible on the website, in an improved set of variant call format files and as a new comprehensive XML report. ClinVar recently started accepting submissions that are focused primarily on providing phenotypic information for individuals who have had genetic testing. Submissions may come from clinical providers providing their own interpretation of the variant ('provider interpretation') or from groups such as patient registries that primarily provide phenotypic information from patients ('phenotyping only'). ClinVar continues to make improvements to its search and retrieval functions. Several new fields are now indexed for more precise searching, and filters allow the user to narrow down a large set of search results.

2,345 citations

Journal Article•DOI•

A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids.

Guilherme da Silva Pereira¹,², Antonio Augusto Franco Garcia², Gabriel Rodrigues Alves Margarido²•Institutions (2)

North Carolina State University¹, University of São Paulo²

01 Nov 2018-BMC Bioinformatics

TL;DR: VCF2SM is a Python script that integrates sequencing depth information of polymorphisms in variant call format (VCF) files and SuperMASSA software for quantitative genotype calling and was successfully applied in analyzing GBS data from diverse panels and full-sib mapping populations of polyploid species.

Abstract: Genotyping-by-sequencing (GBS) has been used broadly in genetic studies for several species, especially those with agricultural importance. However, its use is still limited in autopolyploid species because genotype calling software generally fails to properly distinguish heterozygous classes based on allele dosage. VCF2SM is a Python script that integrates sequencing depth information of polymorphisms in variant call format (VCF) files and SuperMASSA software for quantitative genotype calling. VCFs can be obtained from any variant discovery software that outputs exact allele sequencing depth, such as a modified version of the Tassel-GBS pipeline provided here. VCF2SM was successfully applied in analyzing GBS data from diverse panels (alfalfa and potato) and full-sib mapping populations (alfalfa and switchgrass) of polyploid species. We demonstrate that our approach can help plant geneticists working with autopolyploid species to advance their studies by distinguishing allele dosage from GBS data.
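As a rough illustration of what "exact allele sequencing depth" looks like in a VCF record, the sketch below pulls per-sample reference and alternate read counts out of one data line. It assumes the depths are stored under the `AD` FORMAT key, as many callers do; the function name `allele_depths` and the example record are hypothetical and are not taken from VCF2SM.

```python
def allele_depths(vcf_line):
    """Return one (ref_depth, alt_depth) tuple per sample, or None when AD is missing.

    Assumption: allele depths are stored in the FORMAT key "AD" as
    comma-separated counts, reference allele first.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    samples = fields[9:]
    if "AD" not in format_keys:
        return [None] * len(samples)
    ad_index = format_keys.index("AD")

    depths = []
    for sample in samples:
        values = sample.split(":")
        # Trailing fields may be dropped, and missing values are written as "."
        if ad_index >= len(values) or "." in values[ad_index].split(","):
            depths.append(None)
            continue
        counts = [int(x) for x in values[ad_index].split(",")]
        # Pool all alternate alleles into a single alternate depth
        depths.append((counts[0], sum(counts[1:])))
    return depths


# Hypothetical record with three samples; the last one has no call
line = "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:AD:DP\t0/1:12,4:16\t1/1:0,9:9\t./.:.:."
print(allele_depths(line))  # [(12, 4), (0, 9), None]
```

Read-count pairs of this kind are the sort of input a dosage-aware genotype caller works from; how VCF2SM itself passes them to SuperMASSA is not described in the abstract.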

42 citations

Journal Article•DOI•

SV-plaudit: A cloud-based framework for manually curating thousands of structural variants.

Jonathan R Belyeu¹, Thomas J. Nicholas¹, Brent S. Pedersen¹, Thomas A Sasani¹, James M Havrilla¹, Stephanie N Kravitz¹, Megan E Conway¹, Brian K. Lohman¹, Aaron R. Quinlan¹, Ryan M. Layer¹•Institutions (1)

University of Utah¹

01 Jul 2018-GigaScience

TL;DR: SV-plaudit is a framework for rapidly curating structural variant (SV) predictions; its authors anticipate that it will become a standard step in variant calling pipelines and in the crowd-sourced curation of other biological results.

Abstract: SV-plaudit is a framework for rapidly curating structural variant (SV) predictions. For each SV, we generate an image that visualizes the coverage and alignment signals from a set of samples. Images are uploaded to our cloud framework where users assess the quality of each image using a client-side web application. Reports can then be generated as a tab-delimited file or annotated Variant Call Format (VCF) file. As a proof of principle, nine researchers collaborated for 1 hour to evaluate 1,350 SVs each. We anticipate that SV-plaudit will become a standard step in variant calling pipelines and the crowd-sourced curation of other biological results. Code available at https://github.com/jbelyeu/SV-plaudit; demonstration video available at https://www.youtube.com/watch?v=ono8kHMKxDs

32 citations

Journal Article•DOI•

m6ASNP: a tool for annotating genetic variants by m6A function.

Shuai Jiang¹, Yubin Xie¹, Zhihao He¹, Ya Zhang¹, Yuli Zhao¹, Li Chen¹, Yueyuan Zheng¹, Yanyan Miao¹, Zhixiang Zuo¹, Jian Ren¹,²•Institutions (2)

Sun Yat-sen University¹, National University of Defense Technology²

01 May 2018-GigaScience

TL;DR: A user-friendly web server called “m6ASNP” is presented that is dedicated to the identification of genetic variants that target m6A modification sites and is believed to be a very convenient tool that can be used to boost further functional studies investigating genetic variants.

Abstract: Background: Large-scale genome sequencing projects have identified many genetic variants for diverse diseases. A major goal of these projects is to characterize these genetic variants to provide insight into their function and roles in diseases. N6-methyladenosine (m6A) is one of the most abundant RNA modifications in eukaryotes. Recent studies have revealed that aberrant m6A modifications are involved in many diseases. Findings: In this study, we present a user-friendly web server called "m6ASNP" that is dedicated to the identification of genetic variants that target m6A modification sites. A random forest model was implemented in m6ASNP to predict whether the methylation status of an m6A site is altered by the variants that surround the site. In m6ASNP, genetic variants in a standard variant call format (VCF) are accepted as the input data, and the output includes an interactive table that contains the genetic variants annotated by m6A function. In addition, statistical diagrams and a genome browser are provided to visualize the characteristics and to annotate the genetic variants. Conclusions: We believe that m6ASNP is a very convenient tool that can be used to boost further functional studies investigating genetic variants. The web server "m6ASNP" is implemented in JAVA and PHP and is freely available at [60].

32 citations

Journal Article•DOI•

Catching hidden variation: systematic correction of reference minor allele annotation in clinical variant calling.

Yury A. Barbitoff¹, Igor V Bezdvornykh, Dmitrii E. Polev¹, Elena A. Serebryakova¹, Andrey S. Glotov¹, Oleg S. Glotov¹, Alexander V. Predeus•Institutions (1)

Saint Petersburg State University¹

01 Mar 2018-Genetics in Medicine

TL;DR: A simple bioinformatic tool is developed that identifies variation at RMA sites and provides correct annotations for all such substitutions, which enhances the accuracy of next-generation sequencing–based methods in clinical practice.

24 citations

Journal Article•DOI•

Seshat: A Web service for accurate annotation, validation, and analysis of TP53 variants generated by conventional and next-generation sequencing.

Tuomas Tikkanen, B. Leroy¹, Jean Louis Fournier¹, Rosa Ana Risques², Jitka Malčíková³,⁴, Thierry Soussi¹,⁵,⁶•Institutions (6)

University of Paris¹, University of Washington², Masaryk University³, Central European Institute of Technology⁴, Karolinska Institutet⁵, French Institute of Health and Medical Research⁶

01 Jul 2018-Human Mutation

TL;DR: Seshat, a Web service for annotating TP53 information derived from sequencing data, provides multiple statistical information for each TP53 variant including database frequency, functional activity, or pathogenicity.

Abstract: Accurate annotation of genomic variants in human diseases is essential to allow personalized medicine. Assessment of somatic and germline TP53 alterations has now reached the clinic and is required in several circumstances such as the identification of the most effective cancer therapy for patients with chronic lymphocytic leukemia (CLL). Here, we present Seshat, a Web service for annotating TP53 information derived from sequencing data. A flexible framework allows the use of standard file formats such as Mutation Annotation Format (MAF) or Variant Call Format (VCF), as well as common TXT files. Seshat performs accurate variant annotations using the Human Genome Variation Society (HGVS) nomenclature and the stable TP53 genomic reference provided by the Locus Reference Genomic (LRG). In addition, using the 2017 release of the UMD_TP53 database, Seshat provides multiple statistical information for each TP53 variant including database frequency, functional activity, or pathogenicity. The information is delivered in standardized output tables that minimize errors and facilitate comparison of mutational data across studies. Seshat is a beneficial tool to interpret the ever-growing TP53 sequencing data generated by multiple sequencing platforms and it is freely available via the TP53 Website, http://p53.fr or directly at http://vps338341.ovh.net/.

20 citations

Journal Article•DOI•

OpenEHR modeling for genomics in clinical practice.

Cecilia Mascia¹, Paolo Uva¹, Simone Leo¹, Gianluigi Zanetti¹•Institutions (1)

Center for Advanced Studies Research and Development in Sardinia¹

17 Oct 2018-International Journal of Medical Informatics

TL;DR: The proposed model allows genetic test results to be represented in health records in a structured format, enabling both automated processing and clinical decision support, and is extensible via external references, making it possible to keep track of data provenance and adapt to future domain changes.

16 citations

Journal Article•DOI•

Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.

Brian J. Knaus¹, Niklaus J. Grünwald¹•Institutions (1)

United States Department of Agriculture¹

13 Apr 2018-Frontiers in Genetics

TL;DR: A method to infer copy number that uses variant call format (VCF) data as input is implemented in the R package vcfR, validated with the model system Saccharomyces cerevisiae, and applied to the oomycete Phytophthora infestans.

Abstract: Inference of copy number variation presents a technical challenge because variant callers typically require the copy number of a genome or genomic region to be known a priori. Here we present a method to infer copy number that uses variant call format (VCF) data as input and is implemented in the R package vcfR. This method is based on the relative frequency of each allele (in both genic and non-genic regions) sequenced at heterozygous positions throughout a genome. These heterozygous positions are summarized by using arbitrarily sized windows of heterozygous positions, binning the allele frequencies, and selecting the bin with the greatest abundance of positions. This provides a non-parametric summary of the frequency that alleles were sequenced at. The method is applicable to organisms that have reference genomes that consist of full chromosomes or sub-chromosomal contigs. In contrast to other software designed to detect copy number variation, our method does not rely on an assumption of base ploidy, but instead infers it. We validated these approaches with the model system of Saccharomyces cerevisiae and applied it to the oomycete Phytophthora infestans, both known to vary in copy number. This functionality has been incorporated into the current release of the R package vcfR to provide modular and flexible methods to investigate copy number variation in genomic projects.
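The windowing and binning summary described in this abstract can be sketched in a few lines. The code below is only an illustration of the general idea, written in Python rather than R, and is not the vcfR implementation; the function name `modal_allele_balance`, the window size, the number of bins and the example values are all assumptions chosen for readability.

```python
from collections import Counter


def modal_allele_balance(balances, window_size=4, n_bins=10):
    """Summarize allele balance per window of heterozygous positions.

    `balances` holds one allele frequency (0..1) per heterozygous position,
    in genomic order. Each full window is reduced to the centre of the bin
    that contains the most positions, a non-parametric summary.
    """
    summaries = []
    for start in range(0, len(balances) - window_size + 1, window_size):
        window = balances[start:start + window_size]
        # Assign each frequency to a bin; a value of exactly 1.0 goes to the last bin
        counts = Counter(min(int(b * n_bins), n_bins - 1) for b in window)
        top_bin = counts.most_common(1)[0][0]
        summaries.append((top_bin + 0.5) / n_bins)
    return summaries


# Hypothetical allele frequencies for two windows of four heterozygous sites
balances = [0.47, 0.52, 0.49, 0.46, 0.31, 0.34, 0.68, 0.33]
print(modal_allele_balance(balances))  # [0.45, 0.35]
```

In this toy input the first window peaks near 1/2, which is what a 1:1 allele ratio (two copies) would produce, while the second peaks near 1/3, as expected for a 1:2 ratio (three copies). Real data would need far larger windows and finer bins than used here.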

15 citations

Journal Article•DOI•

SNPitty: An Intuitive Web Application for Interactive B-Allele Frequency and Copy Number Visualization of Next-Generation Sequencing Data.

Job van Riet¹, Niels M.G. Krol¹, Peggy N. Atmodimedjo¹, Erwin Brosens¹, Wilfred F. J. van IJcken¹, Maurice P.H.M. Jansen¹, John W.M. Martens¹, Leendert H. J. Looijenga¹, Guido Jenster¹, Hendrikus J. Dubbink¹, Winand N.M. Dinjens¹, Harmen J.G. van de Werken¹•Institutions (1)

Erasmus University Rotterdam¹

01 Mar 2018-The Journal of Molecular Diagnostics

TL;DR: SNPitty is a web application that allows interactive visualization and interrogation of variant call format files by using B-allele frequencies of single-nucleotide polymorphisms and single-nucleotide variants, coverage metrics, and copy number analysis results.

13 citations

Posted Content•DOI•

Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE): A Cloud-based Platform for Curating and Classifying Germline Variants

Michael N. Edmonson¹, Aman Patel¹, Dale Hedges¹, Zhaoming Wang¹, Evadnie Rampersaud¹, Chimene Kesserwan¹, Xin Zhou¹, Yanling Liu¹, Scott Newman¹, Michael Rusch¹, Clay McLeod¹, Mark R. Wilkinson¹, Stephen V. Rice¹, Jared Becksfort¹, Kim E. Nichols¹, Leslie L. Robison¹, James R. Downing¹, Jinghui Zhang¹•Institutions (1)

St. Jude Children's Research Hospital¹

06 Jun 2018-bioRxiv

TL;DR: PeCanPIE is a web- and cloud-based platform for annotation, identification, and classification of variations in known or putative disease genes, applied to classify variant pathogenicity in cancer predisposition genes in two large-scale investigations involving >4,000 pediatric cancer patients.

Abstract: Variant interpretation in the era of next-generation sequencing (NGS) is challenging. While many resources and guidelines are available to assist with this task, few integrated end-to-end tools exist. Here we present "PeCanPIE", the Pediatric Cancer Variant Pathogenicity Information Exchange, a web- and cloud-based platform for annotation, identification, and classification of variations in known or putative disease genes. Starting from a set of variants in Variant Call Format (VCF), variants are annotated, ranked by putative pathogenicity, and presented for formal classification using a decision-support interface based on published guidelines from the American College of Medical Genetics and Genomics (ACMG). The system can accept files containing millions of variants and handle single-nucleotide variants (SNVs), simple insertions/deletions (indels), multiple-nucleotide variants (MNVs), and complex substitutions. PeCanPIE has been applied to classify variant pathogenicity in cancer predisposition genes in two large-scale investigations involving >4,000 pediatric cancer patients, and serves as a repository for the expert-reviewed results. While PeCanPIE's web-based interface was designed to be accessible to non-bioinformaticians, its back end pipelines may also be run independently on the cloud, facilitating direct integration and broader adoption. PeCanPIE is publicly available and free for research use.

11 citations

Journal Article•DOI•

A Pipeline for Markers Selection Using Restriction Site Associated DNA Sequencing (RADSeq)

Hanan Begali

15 Nov 2018

TL;DR: It is shown that the pipeline is efficient in RADSeq-based marker selection for Arabidopsis thaliana, and the visualization of SNPs and Indels has been very helpful and has provided valuable insights on marker selection.

Abstract: The discovery and assessment of genetic variants from Next Generation Sequencing (NGS) data, including Restriction site Associated DNA sequencing (RADSeq), is an important task in bioinformatics and comparative genetics. The genetic variants can be single-nucleotide polymorphisms (SNPs) or insertions and deletions (Indels) relative to a reference genome. Usually, the short reads are first aligned to a reference genome using NGS alignment software, such as the Burrows-Wheeler Aligner (BWA). The alignment is usually stored in a BAM file, the binary form of the standard SAM (Sequence Alignment/Map) format. Then analysis software, such as the Genome Analysis Toolkit (GATK) or SAMtools, together with scripts written in the R programming language, can provide an efficient solution for calling variants. In this project, we focus on RADSeq-based marker selection for Arabidopsis thaliana. RADSeq consists of short reads which do not cover the whole reference genome. To obtain four call-sets of SNPs as output in Variant Call Format (VCF), SNPs were called by GATK or SAMtools. The VCF files were then visualized with the Integrative Genomics Viewer (IGV) software. We found that the visualization of SNPs and Indels was very helpful and provided us with valuable insights on marker selection. We also found that applying a chi-square test of Hardy-Weinberg Equilibrium (HWE) to all target genotypes, which are homozygous reference 0/0, heterozygous variant 0/1 and homozygous variant 1/1, reduces the false positive rate significantly. We show that our pipeline is efficient in RADSeq-based marker selection.
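The chi-square test of Hardy-Weinberg Equilibrium mentioned in this abstract can be illustrated with a minimal sketch for a biallelic site. This is the textbook goodness-of-fit calculation, not the author's script (which the abstract says was written in R); the function name `hwe_chi_square`, the 5% threshold and the example counts are assumptions.

```python
def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Chi-square statistic for departure from Hardy-Weinberg Equilibrium.

    Arguments are the observed counts of genotypes 0/0, 0/1 and 1/1
    at one biallelic site.
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("at least one genotype call is required")
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1 - p                              # alternate allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_alt_hom)
    # Skip classes with zero expectation (monomorphic sites)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Critical value of the chi-square distribution with 1 degree of freedom
# at alpha = 0.05 (assumed threshold; the abstract does not state one)
CRITICAL_VALUE = 3.841

print(hwe_chi_square(25, 50, 25))  # 0.0, counts match HWE expectations exactly
print(hwe_chi_square(40, 20, 40))  # 36.0, a strong heterozygote deficit
print(hwe_chi_square(40, 20, 40) > CRITICAL_VALUE)  # True, site would be flagged
```

Sites whose statistic exceeds the critical value depart from the expected genotype proportions and are candidates for removal as likely false positives, under the assumption that the sampled population is close to equilibrium.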

Journal Article•

Applying filtration steps to interpret the results of whole-exome sequencing in a consanguineous population to achieve a high detection rate.

Ahmed Alfares¹•Institutions (1)

Qassim University¹

14 Aug 2018-International journal of health sciences

TL;DR: A custom filtration process and strategy targeting a specific population provide excellent detection rates in less time and should be considered as a first-tier laboratory workflow for analysis.

Abstract: Objective: Interpreting whole-exome sequencing (WES) data is challenging, requiring extensive time and effort to review all the variants in the variant call format. Here, we examined the application of custom filters to narrow the number of candidate variants that require further analysis in a consanguineous population. Methods: In 100 cases undergoing WES, we applied a custom filtration process to look primarily for homozygous variants in autosomal recessive (AR) disorders, and secondarily for variants in either autosomal dominant or X-linked disorders. Results: Most identified disease-causing variants were homozygous in AR disorders. By applying our custom filtration process, we narrowed the number of candidate variants requiring further analysis to 5-15 per case while maintaining a high detection rate and completing analysis in around 45 min. Conclusion: A custom filtration process and strategy targeting a specific population provides excellent detection rates in less time and should be considered as a first-tier laboratory workflow for analysis.
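The first filtering step described here, keeping homozygous variants, might look roughly like the sketch below when applied to plain-text VCF lines. It covers only the genotype check; the study's full filtration process is not detailed in the abstract, and the function names, sample name and example records are hypothetical.

```python
def is_homozygous_alt(genotype):
    """True for diploid genotypes carrying two copies of the same non-reference allele."""
    alleles = genotype.replace("|", "/").split("/")
    return (
        len(alleles) == 2
        and alleles[0] == alleles[1]
        and alleles[0] not in ("0", ".")
    )


def homozygous_variants(vcf_lines, sample_index=0):
    """Return (chrom, pos, ref, alt) for records where the chosen sample is homozygous alt."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # GT is the first FORMAT field whenever it is present
        genotype = fields[9 + sample_index].split(":")[0]
        if is_homozygous_alt(genotype):
            kept.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return kept


# Hypothetical single-sample VCF content
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tproband",
    "chr1\t12345\t.\tA\tG\t60\tPASS\t.\tGT:DP\t1/1:30",
    "chr2\t67890\t.\tC\tT\t55\tPASS\t.\tGT:DP\t0/1:28",
    "chr3\t13579\t.\tG\tA\t70\tPASS\t.\tGT:DP\t0/0:35",
]
print(homozygous_variants(example))  # [('chr1', 12345, 'A', 'G')]
```

A practical workflow would combine this check with further criteria such as population frequency and predicted effect, none of which are specified in the abstract.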

Journal Article•DOI•

OVAS: an open-source variant analysis suite with inheritance modelling.

Monika Mozere¹, Mehmet Tekman¹, Jameela A. Kari², Detlef Bockenhauer¹, Robert Kleta¹, Horia Stanescu¹•Institutions (2)

University College London¹, King Abdulaziz University²

08 Feb 2018-BMC Bioinformatics

TL;DR: OVAS is an offline open-source modular-driven analysis environment designed to annotate and extract useful variants from Variant Call Format files, and process them under an inheritance context through a top-down filtering schema of swappable modules, run entirely off a live bootable medium and accessed locally through a web-browser.

Abstract: The advent of modern high-throughput genetics continually broadens the gap between the rising volume of sequencing data and the tools required to process them. The need to pinpoint a small subset of functionally important variants has now shifted towards identifying the critical differences between normal variants and disease-causing ones. The ever-increasing reliance on cloud-based services for sequence analysis and the non-transparent methods they utilize has prompted the need for more in-situ services that can provide a safer and more accessible environment to process patient data, especially in circumstances where continuous internet usage is limited. To address these issues, we herein propose our standalone Open-source Variant Analysis Sequencing (OVAS) pipeline, consisting of three key stages of processing that pertain to the separate modes of annotation, filtering, and interpretation. Core annotation performs variant-mapping to gene-isoforms at the exon/intron level, appends functional data pertaining to the type of variant mutation, and determines hetero/homozygosity. An extensive inheritance-modelling module in conjunction with 11 other filtering components can be used in sequence, ranging from single quality control to multi-file penetrance model specifics such as X-linked recessive or mosaicism. Depending on the type of interpretation required, additional annotation is performed to identify organ specificity through gene expression and protein domains. In the course of this paper we analysed an autosomal recessive case study. OVAS made effective use of the filtering modules to recapitulate the results of the study by identifying the prescribed compound-heterozygous disease pattern from exome-capture sequence input samples.
OVAS is an offline open-source modular-driven analysis environment designed to annotate and extract useful variants from Variant Call Format (VCF) files, and process them under an inheritance context through a top-down filtering schema of swappable modules, run entirely off a live bootable medium and accessed locally through a web-browser.

Splice Site Variant Analyzer: Determining the Pathogenicity of Splice Site Variants

Corinne E. Sexton, Mark E. Wadsworth, Justin B. Miller, Michael J. Cormier, Perry G. Ridge

25 Jul 2018

TL;DR: Splice Site Variant Analyzer (SSVA) fills a void in splice site variant analysis by merging the output from several databases to provide researchers with a free and comprehensive analysis of the pathogenicity of splice site variants in a single step at runtime.

Abstract: We present Splice Site Variant Analyzer (SSVA) to simplify the characterization of deleterious and benign variants in or around splice sites. SSVA uses a Variant Call Format (VCF) file to query variants in humans against the Annovar database, MaxEntScan software, and the Conserved Domain Database. From Annovar, SSVA calculates the GERP score, the Exac score for each population, the allele frequency from the 1000 Genomes Project, and the likelihood score that the variant affects splicing. From MaxEntScan, SSVA calculates a splice site efficiency score based on the sequence. Finally, SSVA uses the Conserved Domain Database through rpsblast to determine if conserved domains are affected by the variant. SSVA presents each of these scores in a single output file that allows researchers to easily classify each splice site variant as pathogenic or benign. SSVA fills a void in splice site variant analysis by merging the output from several databases to provide researchers with a free and comprehensive analysis of the pathogenicity of splice site variants in a single step at runtime.

Journal Article•DOI•

A Pipeline for Markers Selection Using Restriction Site Associated DNA Sequencing (RADSeq)

Hanan Begali

20 Jan 2018

TL;DR: SNPs as output in Variant Call Format (VCF) have been visualized by Integrative Genomics Viewer (IGV) software and it is found that the visualization of SNPs and Indels is helpful and provides valuable insights on marker selection.

Abstract: Motivation: The discovery and assessment of genetic variants from Next Generation Sequencing (NGS) data, including Restriction site Associated DNA sequencing (RADSeq), is an important task in bioinformatics and comparative genetics. The genetic variants can be single-nucleotide polymorphisms (SNPs) or insertions and deletions (Indels) relative to a reference genome. Usually, the short reads are first aligned to a reference genome using NGS alignment software, such as the Burrows-Wheeler Aligner (BWA). The alignment is usually stored in a BAM file, the binary form of the standard SAM (Sequence Alignment/Map) format. Then analysis software, such as the Genome Analysis Toolkit (GATK) or SAMtools [30] [31], together with scripts written in the R programming language, can provide an efficient solution for calling variants. We focused on RADSeq-based marker selection for Arabidopsis thaliana. RADSeq consists of short reads that do not cover the whole reference genome. Finally, SNPs output in Variant Call Format (VCF) were visualized with the Integrative Genomics Viewer (IGV) software. We found that the visualization of SNPs and Indels is helpful and provides us with valuable insights on marker selection. We also found that applying a chi-square test of Hardy-Weinberg Equilibrium (HWE) to all target genotypes, which are homozygous reference 0/0, heterozygous variant 0/1 and homozygous variant 1/1, reduces the false positive rate significantly, and we showed that our pipeline is efficient in RADSeq-based marker selection.

Posted Content•DOI•

Population-wide copy number variation calling using variant call format files from 6,898 individuals

Grace Png¹, Daniel Suveges¹, Young-Chan Park¹, Klaudia Walter¹, Kousik Kundu¹, Ioanna Ntalla², Emmanouil Tsafantakis, Maria Karaleftheri, George Dedoussis³, Eleftheria Zeggini¹, Arthur Gilly¹•Institutions (3)

Wellcome Trust Sanger Institute¹, Queen Mary University of London², National and Kapodistrian University of Athens³

21 Dec 2018-bioRxiv

TL;DR: This work demonstrates that existing population-wide WGS call-sets can be mined for CNVs with minimal computational overhead, delivering insight into a less well-studied, yet potentially impactful class of genetic variant.

Abstract: Copy number variants (CNVs) are large deletions or duplications at least 50 to 200 base pairs long. They play an important role in multiple disorders, but accurate calling of CNVs remains challenging. Most current approaches to CNV detection use raw read alignments, which are computationally intensive to process. We use a regression tree-based approach to call CNVs from whole-genome sequencing (WGS, >18x) variant call-sets in 6,898 samples across four European cohorts, and describe a rich large variation landscape comprising 1,320 CNVs. 61.8% of detected events have been previously reported in the Database of Genomic Variants. 23% of high-quality deletions affect entire genes, and we recapitulate known events such as the GSTM1 and RHD gene deletions. We test for association between the detected deletions and 275 protein levels in 1,457 individuals to assess the potential clinical impact of the detected CNVs. We describe the LD structure and copy number variation underlying the association between levels of the CCL3 protein and a complex structural variant (MAF=0.15, p=3.6×10⁻¹²) affecting CCL3L3, a paralog of the CCL3 gene. We also identify a cis-association between a low-frequency NOMO1 deletion and the protein product of this gene (MAF=0.02, p=2.2×10⁻⁷), for which no cis- or trans- single nucleotide variant-driven protein quantitative trait locus (pQTL) has been documented to date. This work demonstrates that existing population-wide WGS call-sets can be mined for CNVs with minimal computational overhead, delivering insight into a less well-studied, yet potentially impactful class of genetic variant. The regression tree based approach, UN-CNVc, is available as an R and bash executable on GitHub at https://github.com/agilly/un-cnvc. Supplementary information is appended.

Posted Content•DOI•

VCF/Plotein: A web application to facilitate the clinical interpretation of genetic and genomic variants from exome sequencing projects

Raul Ossio¹, Diego Said Anaya-Mancilla¹, O. Isaac Garcia-Salinas¹, Jair S. García-Sotelo¹, Luis A. Aguilar¹, David J. Adams², Carla Daniela Robles-Espinoza¹,²•Institutions (2)

National Autonomous University of Mexico¹, Wellcome Trust Sanger Institute²

14 Nov 2018-bioRxiv

TL;DR: A number of features make VCF/Plotein especially suited for the medical community, such as its speed, security, the ability to filter by disease or gene function, and the ease with which information may be shared with collaborators/co-workers.

Abstract: Purpose: To create a user-friendly web application that allows researchers, medical professionals and patients to easily and securely view, filter and interact with human exome sequencing data in the Variant Call Format (VCF). Methods: We have created VCF/Plotein, a web application written entirely in JavaScript using the Vue.js framework, available at http://vcfplotein.liigh.unam.mx. After a VCF is loaded, gene and variant information is extracted from Ensembl, and cross-referencing with external databases is performed via the Elasticsearch search engine. Support for application-based gene and variant filtering has also been implemented. Interactive graphs are created using the D3.js library. All data processing is done locally on the user’s CPU to ensure the security of patient data. Results: VCF/Plotein allows users to interactively view and filter VCF files without needing any bioinformatics knowledge. A number of features make it especially suited for the medical community, such as its speed, security, the ability to filter by disease or gene function, and the ease with which information may be shared with collaborators/co-workers. Conclusion: VCF/Plotein is a novel web application that allows users to easily and interactively filter and display exome sequencing information, and that is especially suited for bench researchers, medical professionals and patients.
