A fast and accurate algorithm for single individual haplotyping.
TLDR
This paper proposes a new optimization model, Balanced Optimal Partition (BOP), for single individual haplotyping. BOP generalizes two existing models, Minimum Error Correction (MEC) and Maximum Fragments Cut (MFC), and reduces to either one under extreme parameter values.
Abstract:
Because the two (paternal and maternal) copies of a chromosome are difficult to separate, most published human genome sequences provide only genotype information, i.e., the mixed information of the two underlying haplotypes. However, phased haplotype information is needed to fully understand complex genetic polymorphisms and to increase the power of genome-wide association studies for complex diseases. With the rapid development of DNA sequencing technologies, computationally reconstructing a pair of haplotypes from an individual's aligned DNA fragments (i.e., Single Individual Haplotyping) has become a practical haplotyping approach. In this paper, we combine two measures, "errors corrected" and "fragments cut", and propose a new optimization model, called Balanced Optimal Partition (BOP), for single individual haplotyping. The model generalizes two existing models, Minimum Error Correction (MEC) and Maximum Fragments Cut (MFC), and reduces to either model under extreme parameter values. To solve the model, we design a heuristic dynamic programming algorithm, H-BOP. By limiting the number of intermediate solutions kept at each iteration to an appropriately chosen small integer k, H-BOP solves the model efficiently. Extensive experimental results on simulated and real data show that, with k = 8, H-BOP is generally faster and more accurate in haplotype reconstruction than ReFHap, a recent state-of-the-art algorithm. The running time of H-BOP depends linearly on some of the key parameters controlling the input size, and H-BOP scales well to large inputs. The code of H-BOP is freely available upon request to the corresponding author.
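To make the idea of "limiting the number of intermediate solutions at each iteration to k" concrete, the following is a minimal sketch and not the authors' H-BOP. It assumes a plain MEC-style objective without the all-heterozygous constraint (the BOP objective itself is not spelled out in the abstract), processes fragments one at a time, and keeps only the k cheapest partial partitions. The names `mec_cost`, `beam_partition`, and `example_fragments` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# A fragment has one entry per SNP site: allele 0 or 1, or None if the site is not covered.
Fragment = List[Optional[int]]


def mec_cost(fragments: List[Fragment], assignment: List[int], n_sites: int) -> int:
    """Number of allele calls that must be flipped so that, within each of the
    two groups, all fragments agree at every site (an MEC-style cost)."""
    cost = 0
    for side in (0, 1):
        for j in range(n_sites):
            alleles = [f[j] for f, a in zip(fragments, assignment)
                       if a == side and f[j] is not None]
            ones = sum(alleles)
            # The minority allele at this site is the cheapest set to correct.
            cost += min(ones, len(alleles) - ones)
    return cost


def beam_partition(fragments: List[Fragment], n_sites: int,
                   k: int = 8) -> Tuple[int, List[int]]:
    """Assign fragments to two groups one at a time, keeping at most k
    partial assignments after each step (assumed heuristic, not H-BOP)."""
    beam: List[Tuple[int, List[int]]] = [(0, [])]
    for i in range(len(fragments)):
        candidates = []
        for _, assignment in beam:
            for side in (0, 1):
                extended = assignment + [side]
                candidates.append(
                    (mec_cost(fragments[:i + 1], extended, n_sites), extended))
        candidates.sort(key=lambda c: c[0])  # cheapest partial solutions first
        beam = candidates[:k]                # prune to k intermediate solutions
    return beam[0]


# Toy input: reads drawn from haplotypes 0101 and 1010, with one sequencing
# error in the last fragment (site index 2).
example_fragments: List[Fragment] = [
    [0, 1, 0, None],
    [1, 0, 1, None],
    [None, 1, 0, 1],
    [None, 0, 1, 0],
    [0, 1, 1, 1],
]

best_cost, best_assignment = beam_partition(example_fragments, n_sites=4, k=8)
```

On this toy input the sketch separates fragments 0, 2, and 4 from fragments 1 and 3 at a cost of one corrected call. A larger k explores more partial partitions at higher running time, which is the trade-off the abstract attributes to the choice of k.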