Open Access Journal Article

Discretized Gaussian mixture for genotyping of microsatellite loci containing homopolymer runs

TLDR
GenoTan, a program using a discretized Gaussian mixture model combined with a rules-based approach to identify inherited variation of microsatellite loci from short sequence reads without paired-end information, effectively distinguishes length variants from noise including insertion/deletion errors in homopolymer runs by addressing the bidirectional aspect of insertion and deletion errors in sequence reads.
Abstract
Motivation: Inferring lengths of inherited microsatellite alleles with single base pair resolution from short sequence reads is challenging due to several sources of noise caused by the repetitive nature of microsatellites and the technologies used to generate raw sequence data. Results: We have developed a program, GenoTan, using a discretized Gaussian mixture model combined with a rules-based approach to identify inherited variation of microsatellite loci from short sequence reads without paired-end information. It effectively distinguishes length variants from noise, including insertion/deletion errors in homopolymer runs, by addressing the bidirectional aspect of insertion and deletion errors in sequence reads. Here we first introduce a homopolymer decomposition method which estimates error bias toward insertion or deletion in homopolymer sequence runs. Combining these approaches, GenoTan was able to genotype 94.9% of microsatellite loci accurately and quickly from simulated data with 40× sequence coverage, while the other programs showed <90% correct calls for the same data and required 5 to 30 times more computational time than GenoTan. It also showed the highest true-positive rate for real data using mixed sequence data of two Drosophila inbred lines, which was a novel validation approach for genotyping. Availability: GenoTan is open-source software available at http://genotan.sourceforge.net.
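
To make the phrase "discretized Gaussian mixture" concrete, the following is a minimal illustrative sketch and not GenoTan's actual implementation. It assumes that each allele contributes a Gaussian centered on its true length, that the Gaussian is discretized by integrating it over unit-width bins around each integer length, and that a diploid genotype is an equal-weight mixture of two such components. The function names, the fixed `sigma`, and the exhaustive search over allele pairs are all hypothetical choices made for illustration; the rules-based filters and the homopolymer decomposition described in the abstract are omitted.

```python
import math
from itertools import combinations_with_replacement


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def discretized_gaussian_pmf(k, mu, sigma):
    """Mass that a Gaussian(mu, sigma) assigns to the integer bin [k - 0.5, k + 0.5]."""
    return normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)


def mixture_log_likelihood(length_counts, alleles, sigma, floor=1e-12):
    """Log-likelihood of observed read-length counts under an equal-weight
    mixture of discretized Gaussians, one component per allele (assumption)."""
    ll = 0.0
    for length, count in length_counts.items():
        p = sum(discretized_gaussian_pmf(length, a, sigma) for a in alleles)
        p /= len(alleles)
        # The floor avoids log(0) for lengths far from every allele.
        ll += count * math.log(max(p, floor))
    return ll


def call_genotype(length_counts, sigma=0.6):
    """Return the pair of observed lengths that maximizes the mixture likelihood.

    sigma is a hypothetical fixed noise level; a real tool would estimate it.
    """
    candidates = sorted(length_counts)
    return max(
        combinations_with_replacement(candidates, 2),
        key=lambda pair: mixture_log_likelihood(length_counts, pair, sigma),
    )
```

With these assumptions, `call_genotype({20: 18, 21: 2, 23: 3, 24: 17})` returns `(20, 24)`: the few reads at lengths 21 and 23 are explained as one-base noise around the two well-supported alleles rather than as alleles of their own. A symmetric Gaussian treats insertion and deletion errors as equally likely, which is exactly the simplification that the abstract's estimate of insertion/deletion bias is meant to address.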


Citations
Dissertation

Population Based Microsatellite Genotyping

TL;DR: A practical implementation of microsatellite genotyping which is both much faster and more accurate than previously presented solutions is given.
Journal Article

Whole-exome sequencing reveals microsatellite DNA markers for response to dofetilide initiation in patients with persistent atrial fibrillation: A pilot study.

TL;DR: Dofetilide is a class III antiarrhythmic drug effective for the treatment of atrial fibrillation, and microsatellites are novel genetic markers associated with congenital and acquired health conditions.
Journal Article

Molecular Surveillance of Malaria Using the PF AmpliSeq Custom Assay for Plasmodium falciparum Parasites from Dried Blood Spot DNA Isolates from Peru

TL;DR: A three-day workflow is presented for targeted resequencing of markers in 13 resistance-associated genes, histidine-rich protein 2 and 3 (hrp2&3), a country (Peru)-specific 28-SNP barcode for population genetic analysis, and apical membrane antigen 1 (ama1), using Illumina short-read sequencing technology.
Dissertation

Bioinformatics methods and approaches to discover disease variants from DNA sequencing data

TL;DR: STRetch, a new bioinformatics method to detect STR expansions using STR decoy chromosomes, is developed and validated, and it is shown that STRetch can detect both known pathogenic STR expansions and novel expansions at other annotated STR loci across the genome.
References
Journal Article

The Sequence Alignment/Map format and SAMtools

TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, a variant caller and an alignment viewer, and thus provides universal tools for processing read alignments.
Journal Article

Fast and accurate short read alignment with Burrows–Wheeler transform

TL;DR: The Burrows–Wheeler Alignment tool (BWA), a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT), is implemented to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Journal Article

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Journal Article

Base-calling of automated sequencer traces using Phred. I. accuracy assessment

TL;DR: A base-calling program for automated sequencer traces, phred, is described that achieves a lower error rate than the ABI software, averaging 40%-50% fewer errors in the data sets examined, independent of position in read, machine running conditions, or sequencing chemistry.
Journal Article

Tandem repeats finder: a program to analyze DNA sequences

TL;DR: A new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size is presented and its ability to detect tandem repeats that have undergone extensive mutational change is demonstrated.