Open Access | Journal Article

A fast and accurate algorithm for single individual haplotyping.

TLDR
The paper proposes a new optimization model for single individual haplotyping, called Balanced Optimal Partition (BOP). BOP generalizes two existing models, Minimum Error Correction (MEC) and Maximum Fragments Cut (MFC), and reduces to either one under extreme parameter values.
Abstract
Because the two (paternal and maternal) copies of a chromosome are difficult to separate, most published human genome sequences provide only genotype information, i.e., the mixed information of the two underlying haplotypes. However, phased haplotype information is needed to fully understand complex genetic polymorphisms and to increase the power of genome-wide association studies for complex diseases. With the rapid development of DNA sequencing technologies, reconstructing a pair of haplotypes from an individual's aligned DNA fragments by computer algorithms (i.e., Single Individual Haplotyping) has become a practical haplotyping approach. In this paper, we combine two measures, "errors corrected" and "fragments cut," and propose a new optimization model for single individual haplotyping, called Balanced Optimal Partition (BOP). The model generalizes two existing models, Minimum Error Correction (MEC) and Maximum Fragments Cut (MFC), and reduces to either one under extreme parameter values. To solve the model, we design a heuristic dynamic programming algorithm, H-BOP. By limiting the number of intermediate solutions kept at each iteration to an appropriately chosen small integer k, H-BOP solves the model efficiently. Extensive experiments on simulated and real data show that, with k = 8, H-BOP is generally faster and more accurate in haplotype reconstruction than ReFHap, a recent state-of-the-art algorithm. The running time of H-BOP depends linearly on some of the key parameters controlling the input size, and H-BOP scales well to large input data. The code of H-BOP is available free of charge upon request to the corresponding author.
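
The abstract does not state the BOP objective or the details of H-BOP, so the following is only a rough illustration of the two ideas it names: the MEC score of a candidate haplotype, and a search that keeps at most k intermediate solutions per step. It is not the authors' algorithm. The fragment encoding ('0'/'1' alleles, '-' for uncovered sites), the all-heterozygous assumption (the second haplotype is the complement of the first), and every function name are assumptions made for this sketch.

```python
# Illustrative sketch only: MEC scoring plus a k-limited (beam) search.
# This is NOT H-BOP; the BOP objective is not given in the abstract.

def distance(fragment, haplotype):
    """Mismatches at sites the fragment covers ('-' means not covered).
    zip() stops at the shorter string, so prefixes can be scored too."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)

def complement(haplotype):
    """Second haplotype under the all-heterozygous assumption."""
    return ''.join('1' if a == '0' else '0' for a in haplotype)

def mec_cost(fragments, haplotype):
    """Errors to correct so each fragment agrees with one of the two haplotypes."""
    other = complement(haplotype)
    return sum(min(distance(f, haplotype), distance(f, other)) for f in fragments)

def beam_haplotype(fragments, k=8):
    """Extend haplotype prefixes one site at a time, keeping the k cheapest."""
    n_sites = len(fragments[0])
    beam = ['0']  # fix the first allele: a haplotype and its complement score the same
    for _ in range(n_sites - 1):
        candidates = [prefix + allele for prefix in beam for allele in '01']
        candidates.sort(key=lambda prefix: mec_cost(fragments, prefix))
        beam = candidates[:k]
    best = beam[0]
    return best, mec_cost(fragments, best)

# Six fragments consistent with the pair 0101/1010, plus one with a single error.
example_fragments = ["01--", "-10-", "--01", "10--", "-01-", "--10", "011-"]
print(beam_haplotype(example_fragments, k=8))  # ('0101', 1)
```

In this toy search a larger k explores more prefixes at higher cost; the abstract reports k = 8 as the setting used for H-BOP in its comparison with ReFHap, and the default here merely echoes that value.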

