Proceedings Article

ReFHap: a reliable and fast algorithm for single individual haplotyping

TLDR
A novel problem formulation for single individual haplotyping is presented: the algorithm first finds the best cut using a heuristic for max-cut and then builds haplotypes consistent with that cut. ReFHap is found to run significantly faster than previous methods without loss of accuracy.
Abstract
Full human genomic sequences have been published in the last two years for a growing number of individuals. Most of them are a mixed consensus of the two real haplotypes, because it is still very expensive to separate the information coming from the two copies of a chromosome. However, recent improvements and new experimental approaches promise to solve these issues and to provide enough information to reconstruct the sequences of the two copies of each chromosome through bioinformatics methods such as single individual haplotyping. Full haploid sequences provide a complete understanding of the structure of the human genome, allowing accurate predictions of translation in protein coding regions and increasing the power of association studies.

In this paper we present a novel problem formulation for single individual haplotyping. We start by assigning a score to each pair of fragments based on their common allele calls, and then use these scores to formulate the problem as finding the cut of the fragments that maximizes an objective function, similar to the well-known max-cut problem. Our algorithm first finds the best cut using a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut. We compared both the accuracy and the running time of ReFHap with other heuristic methods on simulated and real data, and found that ReFHap runs significantly faster than previous methods without loss of accuracy.
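The three steps named in the abstract (score fragment pairs, cut the fragments, build haplotypes from the cut) can be illustrated with a minimal Python sketch. It is not the ReFHap implementation: the abstract gives neither the scoring function nor the max-cut heuristic, so the score used here (mismatches minus matches over shared positions), the greedy local search, and the majority-vote consensus are all assumptions chosen for illustration. The function names and the toy fragments are hypothetical as well.

```python
from itertools import combinations

def pair_score(f, g):
    # ASSUMPTION: score = mismatches minus matches over shared positions.
    # A positive score suggests the two fragments come from different copies.
    shared = f.keys() & g.keys()
    return sum(1 if f[p] != g[p] else -1 for p in shared)

def greedy_cut(fragments):
    # ASSUMPTION: plain local search for max-cut, standing in for the
    # heuristic used in the paper. Each flip strictly increases the cut
    # weight, so the loop terminates.
    n = len(fragments)
    w = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = pair_score(fragments[i], fragments[j])
    side = [0] * n
    improved = True
    while improved:
        improved = False
        for i in range(n):
            # Moving i turns same-side edges into cut edges and vice versa.
            gain = sum(w[i][j] if side[i] == side[j] else -w[i][j]
                       for j in range(n) if j != i)
            if gain > 0:
                side[i] ^= 1
                improved = True
    return side

def build_haplotypes(fragments, side):
    # ASSUMPTION: majority vote per position; fragments on side 1 are
    # complemented so that every fragment votes for the same haplotype.
    votes = {}
    for frag, s in zip(fragments, side):
        for pos, allele in frag.items():
            votes.setdefault(pos, [0, 0])[allele ^ s] += 1
    h1 = {pos: (1 if v[1] > v[0] else 0) for pos, v in sorted(votes.items())}
    h2 = {pos: 1 - a for pos, a in h1.items()}
    return h1, h2

# Toy fragments: SNP position -> allele call (0 or 1), error-free,
# sampled from the haplotype pair 0011 / 1100.
fragments = [
    {0: 0, 1: 0, 2: 1},
    {1: 0, 2: 1, 3: 1},
    {0: 1, 1: 1, 2: 0},
    {2: 0, 3: 0},
]
side = greedy_cut(fragments)
h1, h2 = build_haplotypes(fragments, side)
print(side, h1, h2)
```

On this toy input the first two fragments end up on one side of the cut and the last two on the other, and the two consensus haplotypes are complements of each other. With sequencing errors or sparse coverage a local search like this can stall in a poor local optimum, which is one reason the choice of heuristic matters in practice.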



Citations
Journal Article

The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line

TL;DR: Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region of chromosome 8q24 at which integration of the human papilloma virus type 18 (HPV-18) genome occurred and that is likely to be the event that initiated tumorigenesis.
Journal Article

HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies.

TL;DR: It is shown that HapCUT2 rapidly assembles haplotypes with best-in-class accuracy for all data types, scales well for high sequencing coverage, and rapidly assembles haplotypes for two long-read WGS data sets on which other methods struggled.
Journal Article

In vitro, long-range sequence information for de novo genome assembly via transposase contiguity

TL;DR: It is demonstrated that fragScaff is complementary to Hi-C-based contact probability maps, providing midrange contiguity to support robust, accurate chromosome-scale de novo genome assemblies without the need for laborious in vivo cloning steps.
Patent

Linking sequence reads using paired code tags

TL;DR: Artificial transposon sequences having code tags and target nucleic acids containing such sequences were used for making artificial transposons and for using their properties to analyze targets.
Journal Article

Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of Single Individual Haplotyping techniques.

TL;DR: Comparisons indicate that fosmid-based haplotyping can deliver highly accurate results even at low coverage and that the proposed SIH algorithm, ReFHap, is able to efficiently produce high-quality haplotypes.