Proceedings Article
ReFHap: a reliable and fast algorithm for single individual haplotyping
Jorge Duitama,Thomas Huebsch,Gayle K. McEwen,Eun-Kyung Suk,Margret R. Hoehe +4 more
- pp 160-169
TLDR
A novel problem formulation for single individual haplotyping that first finds the best cut using a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut. ReFHap is found to perform significantly faster than previous methods without loss of accuracy.
Abstract:
Full human genomic sequences have been published in the last two years for a growing number of individuals. Most of them are a mixed consensus of the two real haplotypes because it is still very expensive to separate information coming from the two copies of a chromosome. However, recent improvements and new experimental approaches promise to solve these issues and provide enough information to reconstruct the sequences for the two copies of each chromosome through bioinformatics methods such as single individual haplotyping. Full haploid sequences provide a complete understanding of the structure of the human genome, allowing accurate predictions of translation in protein coding regions and increasing the power of association studies.
In this paper we present a novel problem formulation for single individual haplotyping. We start by assigning a score to each pair of fragments based on their common allele calls, and then we use these scores to formulate the problem as finding the cut of fragments that maximizes an objective function, similar to the well-known max-cut problem. Our algorithm initially finds the best cut based on a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut. We have compared both accuracy and running time of ReFHap with other heuristic methods on both simulated and real data and found that ReFHap performs significantly faster than previous methods without loss of accuracy.
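The pipeline described in the abstract (score fragment pairs, find a cut, build haplotypes consistent with it) can be illustrated with a small Python sketch. This is not the ReFHap implementation: the scoring rule (mismatches minus matches over shared SNP positions), the simple local-search cut heuristic, the majority-vote consensus, and all function names are assumptions chosen for illustration. Fragments are modeled as dictionaries mapping a SNP position to a 0/1 allele call.

```python
from itertools import combinations

def pair_score(f1, f2):
    """Hypothetical score: mismatches minus matches over shared SNP positions."""
    shared = f1.keys() & f2.keys()
    return sum(1 if f1[p] != f2[p] else -1 for p in shared)

def greedy_cut(fragments, max_rounds=100):
    """Local-search max-cut heuristic (an assumption, not the paper's heuristic).

    A fragment is moved to the other side whenever that raises the cut weight.
    """
    n = len(fragments)
    w = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = pair_score(fragments[i], fragments[j])
    side = [0] * n
    for _ in range(max_rounds):
        improved = False
        for i in range(n):
            # Moving i turns same-side edges into cut edges and vice versa.
            gain = sum(w[i][j] if side[i] == side[j] else -w[i][j]
                       for j in range(n) if j != i)
            if gain > 0:
                side[i] ^= 1
                improved = True
        if not improved:
            break
    return side

def build_haplotype(fragments, side):
    """Majority vote per SNP; calls from side 1 are complemented first."""
    votes = {}
    for frag, s in zip(fragments, side):
        for pos, allele in frag.items():
            votes.setdefault(pos, []).append(allele ^ s)
    return {pos: int(2 * sum(v) > len(v)) for pos, v in votes.items()}

# Toy input: fragments 0 and 2 come from one chromosome copy, 1, 3 and 4 from the other.
fragments = [
    {0: 0, 1: 1},
    {1: 0, 2: 0},
    {1: 1, 2: 1, 3: 0},
    {2: 0, 3: 1},
    {0: 1, 1: 0},
]
side = greedy_cut(fragments)
haplotype = build_haplotype(fragments, side)
complement = {pos: 1 - a for pos, a in haplotype.items()}
```

The second haplotype is taken as the bitwise complement of the first, which assumes every position is heterozygous and biallelic. A local search of this kind can stop in a local optimum, so it only stands in for the max-cut heuristic mentioned in the abstract.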
Citations
Journal Article
The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line
Andrew Adey,Joshua N. Burton,Jacob O. Kitzman,Joseph B. Hiatt,Alexandra P. Lewis,Beth Martin,Ruolan Qiu,Choli Lee,Jay Shendure +8 more
TL;DR: Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region of chromosome 8q24 at which integration of the human papilloma virus type 18 (HPV-18) genome occurred and that is likely to be the event that initiated tumorigenesis.
Journal Article
HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies.
TL;DR: It is shown that HapCUT2 rapidly assembles haplotypes with best-in-class accuracy for all data types, scales well to high sequencing coverage, and rapidly assembled haplotypes for two long-read WGS data sets on which other methods struggled.
Journal Article
In vitro, long-range sequence information for de novo genome assembly via transposase contiguity
Andrew Adey,Jacob O. Kitzman,Joshua N. Burton,Riza M. Daza,Akash Kumar,Lena Christiansen,Mostafa Ronaghi,Sasan Amini,Kevin L. Gunderson,Frank J. Steemers,Jay Shendure +10 more
TL;DR: It is demonstrated that fragScaff is complementary to Hi-C-based contact probability maps, providing midrange contiguity to support robust, accurate chromosome-scale de novo genome assemblies without the need for laborious in vivo cloning steps.
Patent
Linking sequence reads using paired code tags
Frank J. Steemers,Kevin L. Gunderson,Thomas Royce,Natasha Pignatelli,Igor Goryshin,Nicholas Caruccio +5 more
TL;DR: Artificial transposon sequences having code tags and target nucleic acids containing such sequences were used for making artificial transposons and for using their properties to analyze targets.
Journal Article
Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of Single Individual Haplotyping techniques.
Jorge Duitama,Gayle K. McEwen,Thomas Huebsch,Stefanie Palczewski,Sabrina Schulz,Kevin J. Verstrepen,Eun-Kyung Suk,Margret R. Hoehe +7 more
TL;DR: Comparisons indicate that fosmid-based haplotyping can deliver highly accurate results even at low coverage and that the proposed SIH algorithm, ReFHap, is able to efficiently produce high-quality haplotypes.
Related Papers (5)
HapCUT: an efficient and accurate algorithm for the haplotype assembly problem
Vikas Bansal,Vineet Bafna +1 more
The Diploid Genome Sequence of an Individual Human
Samuel Levy,Granger G. Sutton,Pauline C. Ng,Lars Feuk,Aaron L. Halpern,Brian P. Walenz,Nelson Axelrod,Jiaqi Huang,Ewen F. Kirkness,Gennady Denisov,Yuan Lin,Jeffrey R. MacDonald,Andy Wing Chun Pang,Mary Shago,Timothy B. Stockwell,Alexia Tsiamouri,Vineet Bafna,Vikas Bansal,Saul A. Kravitz,Dana A. Busam,Karen Beeson,Tina C McIntosh,Karin A. Remington,Josep F. Abril,John Gill,Jon Borman,Yu-Hui Rogers,Marvin Frazier,Stephen W. Scherer,Robert L. Strausberg,J. Craig Venter +30 more