Author

Alexander Wait Zaranek

Other affiliations: Walter and Eliza Hall Institute of Medical Research

Bio: Alexander Wait Zaranek is an academic researcher from Harvard University. The author has contributed to research on the topics of genome and genomics. The author has an h-index of 14 and has co-authored 22 publications that have received 3,849 citations. Previous affiliations of Alexander Wait Zaranek include the Walter and Eliza Hall Institute of Medical Research.

Topics: Genome, Genomics, Human genome, Personal genomics, DNA sequencing

Papers

Journal Article

Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays

Radoje Drmanac, Andrew B. Sparks, Matthew J. Callow, Aaron L. Halpern, Norman L. Burns, Bahram G. Kermani, Paolo Carnevali, Igor Nazarenko, Geoffrey B. Nilsen, George Yeung, Fredrik A. Dahl, Andres Fernandez, Bryan Staker, Krishna Pant, Jonathan Baccash, Adam P. Borcherding, Anushka Brownley, Ryan J. Cedeno, Linsu Chen, Daniel F. Chernikoff, Alex Cheung, Razvan Chirita, Benjamin Curson, Jessica Ebert, Coleen R. Hacker, Robert Hartlage, Brian Hauser, Steve Huang, Yuan Jiang, Vitali Karpinchyk, Mark Koenig, Calvin Kong, Tom Landers, Catherine Le, Jia Liu, Celeste E. McBride, Matt Morenzoni, Robert E. Morey, Karl Mutch, Helena Perazich, Kimberly Perry, Brock A. Peters, Joe Peterson, Charit L. Pethiyagoda, Kaliprasad Pothuraju, Claudia Richter, Abraham M. Rosenbaum¹, Shaunak Roy, Jay Shafto, Uladzislau Sharanhovich, Karen W. Shannon, Conrad G. Sheppy, Michel Sun, Joseph V. Thakuria¹, Anne Tran, Dylan Vu, Alexander Wait Zaranek¹, Xiaodi Wu², Snezana Drmanac, Arnold R. Oliphant, William C. Banyai, Bruce L. Martin, Dennis G. Ballinger, George M. Church¹, Clifford Reid

Harvard University¹, Washington University in St. Louis²

01 Jan 2010-Science

TL;DR: A genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs is described.

Abstract: Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs. We sequenced three human genomes with this platform, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrates a sequence accuracy of about 1 false variant per 100 kilobases. The high accuracy, affordable cost of $4400 for sequencing consumables, and scalability of this platform enable complete human genome sequencing for the detection of rare variants in large-scale genetic studies.

1,343 citations

Journal Article

Clinical assessment incorporating a personal genome

Euan A. Ashley¹, Atul J. Butte¹, Matthew T. Wheeler¹, Rong Chen¹, Teri E. Klein¹, Frederick E. Dewey¹, Joel T. Dudley¹, Kelly E. Ormond¹, Aleksandra Pavlovic¹, Alexander A. Morgan¹, Dmitry Pushkarev¹, Norma F. Neff¹, Louanne Hudgins¹, Li Gong¹, Laura M. Hodges¹, Dorit S. Berlin¹, Caroline F. Thorn¹, Katrin Sangkuhl¹, Joan M. Hebert¹, Mark Woon¹, Hersh Sagreiya¹, Ryan Whaley¹, Joshua W. Knowles¹, Michael F. Chou², Joseph V. Thakuria², Abraham M. Rosenbaum², Alexander Wait Zaranek², George M. Church², Henry T. Greely¹, Stephen R. Quake¹, Russ B. Altman¹

Stanford University¹, Harvard University²

01 May 2010-The Lancet

TL;DR: Although challenges remain, the results suggest that whole-genome sequencing can yield useful and clinically relevant information for individual patients.

686 citations

Journal Article

Extensive sequencing of seven human genomes to characterize benchmark reference materials

Justin M. Zook¹, David Catoe¹, Jennifer McDaniel¹, Lindsay K. Vang¹, Noah Spies¹,², Arend Sidow², Ziming Weng², Yuling Liu², Christopher E. Mason³, Noah Alexander³, Elizabeth Henaff³, Alexa B. R. McIntyre³, Dhruva Chandramohan³, Feng Chen⁴, Erich Jaeger⁴, Ali Moshrefi⁴, Khoa Pham, William Stedman, Tiffany Y. Liang, Michael Saghbini, Zeljko Dzakula, Alex Hastie, Han Cao, Gintaras Deikus⁵, Eric E. Schadt⁵, Robert Sebra⁵, Ali Bashir⁵, R Truty, Christopher C. Chang, Natali Gulbahce, Keyan Zhao⁶, Srinka Ghosh⁶, Fiona Hyland⁶, Yutao Fu⁶, Mark Chaisson⁷, Chunlin Xiao⁸, Jonathan Trow⁸, Stephen T. Sherry⁸, Alexander Wait Zaranek, Madeleine Ball, Jason Bobe⁵, Preston W. Estep⁹, George M. Church⁹, Patrick Marks, Sofia Kyriazopoulou-Panagiotopoulou, Grace X.Y. Zheng, Michael Schnall-Levin, Heather Ordonez, Patrice A Mudivarti, Kristina Giorda, Ying Sheng¹⁰, Karoline Bjarnesdatter Rypdal¹⁰, Marc L. Salit¹,²

National Institute of Standards and Technology¹, Stanford University², Cornell University³, Illumina⁴, Icahn School of Medicine at Mount Sinai⁵, Thermo Fisher Scientific⁶, University of Washington⁷, National Institutes of Health⁸, Harvard University⁹, Oslo University Hospital¹⁰

07 Jun 2016-Scientific Data

TL;DR: A large, diverse set of sequencing data for seven human genomes is described; five are current or candidate NIST Reference Materials. Data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry, are also described.

Abstract: The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.

581 citations

Journal Article

A highly annotated whole-genome sequence of a Korean individual

Jong Il Kim, Young Seok Ju¹,², Hansoo Park², Sheehyun Kim, Seonwook Lee, Jae-Hyuk Yi², Joann Mudge³, Neil A. Miller³, Dongwan Hong², Callum J. Bell³, Hye Sun Kim, In Soon Chung, Woo Chung Lee, Ji Sun Lee, Seung Hyun Seo, Ji Young Yun, Hyun Nyun Woo, Heewook Lee, Dongwhan Suh¹,², Seung-Bok Lee¹,², Hyunjin Kim², Maryam Yavartanoo¹,², Minhye Kwak¹,², Ying Zheng¹,², Mi Kyeong Lee, Hyun Jun Park², Jeongyeon Kim², Omer Gokcumen⁴, Ryan E. Mills⁴, Alexander Wait Zaranek⁵, Joseph V. Thakuria⁵, Xiaodi Wu⁵, Ryan W. Kim³, Jim J. Huntley⁶, Shujun Luo⁶, Gary P. Schroth⁶, Thomas D. Wu⁷, Hye-Ran Kim, Kap-Seok Yang, Woong-Yang Park¹,², Hyungtae Kim, George M. Church⁵, Charles Lee⁴, Stephen F. Kingsmore³, Jeong-Sun Seo

New Generation University College¹, Seoul National University², National Center for Genome Resources³, Brigham and Women's Hospital⁴, Harvard University⁵, Illumina⁶, Genentech⁷

20 Aug 2009-Nature

TL;DR: A highly annotated whole-genome sequence is provided for a Korean individual, known as AK1, using a combination of methods, including whole-genome shotgun sequencing (27.8× coverage), targeted bacterial artificial chromosome sequencing, and high-resolution comparative genomic hybridization using custom microarrays featuring more than 24 million probes.

Abstract: The genome of an anonymous Korean male has been sequenced using a broad spread of genomic techniques. This combinatorial approach allows for detailed characterization of sequence and structural variation. The first four individual genomes to have been determined spanned three distinct ethnic groups: a Yoruba African, northwest European (Craig Venter and James Watson) and Han Chinese. This new work, together with another Korean sequence reported elsewhere, adds the Altaic ethnic grouping to the list. Human genome sequences have so far been reported for individuals with ancestry in three distinct geographical regions: a Yoruba African, two individuals of northwest European origin, and a person from China. Here, using a combination of methods, a highly annotated, whole-genome sequence is provided for a Korean male. Recent advances in sequencing technologies have initiated an era of personal genome sequences. To date, human genome sequences have been reported for individuals with ancestry in three distinct geographical regions: a Yoruba African, two individuals of northwest European origin, and a person from China (refs. 1–4). Here we provide a highly annotated, whole-genome sequence for a Korean individual, known as AK1. The genome of AK1 was determined by an exacting, combined approach that included whole-genome shotgun sequencing (27.8× coverage), targeted bacterial artificial chromosome sequencing, and high-resolution comparative genomic hybridization using custom microarrays featuring more than 24 million probes. Alignment to the NCBI reference, a composite of several ethnic clades (refs. 5, 6), disclosed nearly 3.45 million single nucleotide polymorphisms (SNPs), including 10,162 non-synonymous SNPs, and 170,202 deletion or insertion polymorphisms (indels). SNP and indel densities were strongly correlated genome-wide. Applying very conservative criteria yielded highly reliable copy number variants for clinical considerations. Potential medical phenotypes were annotated for non-synonymous SNPs, coding domain indels, and structural variants. The integration of several human whole-genome sequences derived from several ethnic groups will assist in understanding genetic ancestry, migration patterns and population bottlenecks.

324 citations

Journal Article

Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells

Brock A. Peters, Bahram Ghaffarzadeh Kermani, Andrew B. Sparks, Oleg Alferov, Peter Hong, Andrei Alexeev, Yuan Jiang, Fredrik A. Dahl, Y. Tom Tang, Juergen Haas, Kimberly Robasky¹,², Alexander Wait Zaranek², Je-Hyuk Lee²,³, Madeleine Ball², Joseph E. Peterson, Helena Perazich, George Yeung, Jia Liu, Linsu Chen, Michael I. Kennemer, Kaliprasad Pothuraju, Karel Konvicka, Mike Tsoupko-Sitnikov, Krishna Pant, Jessica Ebert, Geoffrey B. Nilsen, Jonathan Baccash, Aaron L. Halpern, George M. Church², Radoje Drmanac

Boston University¹, Harvard University², Wyss Institute for Biologically Inspired Engineering³

12 Jul 2012-Nature

TL;DR: A low-cost DNA sequencing and haplotyping process, long fragment read (LFR) technology, is described; it is similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes.

Abstract: Recent advances in whole-genome sequencing have brought the vision of personal genomics and genomic medicine closer to reality. However, current methods lack clinical accuracy and the ability to describe the context (haplotypes) in which genome variants co-occur in a cost-effective manner. Here we describe a low-cost DNA sequencing and haplotyping process, long fragment read (LFR) technology, which is similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes. In this study, ten LFR libraries were made using only ∼100 picograms of human DNA per sample. Up to 97% of the heterozygous single nucleotide variants were assembled into long haplotype contigs. Removal of false positive single nucleotide variants not phased by multiple LFR haplotypes resulted in a final genome error rate of 1 in 10 megabases. Cost-effective and accurate genome sequencing and haplotyping from 10–20 human cells, as demonstrated here, will enable comprehensive genetic studies and diverse clinical applications. A new DNA analysis method termed long fragment read technology is described, and the approach is used to determine parental haplotypes and to sequence human genomes cost-effectively and accurately from only 10 to 20 cells. Many of the hoped-for advances in the field of personalized medicine are dependent on the development of low-cost genome-sequencing technology that combines clinical accuracy with the ability to describe the context (the genetic haplotype) in which variants occur on an individual chromosome. The technique described here, termed long-fragment read technology, is similar to that used to sequence long single DNA molecules, but without DNA cloning or chromosome separation. The authors demonstrate the potential of this approach by generating seven accurate human genome sequences, as well as haplotype data, from samples containing just 10–20 cells. This advance shows that it should be possible to achieve clinical quality and scale in personal genome sequencing of microbiopsies and circulating cancer cells.

320 citations

Cited by

Journal Article

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

Aaron McKenna¹, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran V. Garimella, David Altshuler, Stacey Gabriel, Mark J. Daly, Mark A. DePristo

Broad Institute¹

01 Sep 2010-Genome Research

TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

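Changes in Plasmodium var gene switching rates lead to antigenic variation [in English] / Paul H, Robert P, Christodoulou Z, et al. // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

TL;DR: Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.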

18,940 citations

Journal Article

A framework for variation discovery and genotyping using next-generation DNA sequencing data

Mark A. DePristo¹, Eric Banks¹, Ryan Poplin¹, Kiran V. Garimella¹, Jared Maguire¹, Christopher Hartl¹, Anthony A. Philippakis¹,²,³, Guillermo del Angel¹, Manuel A. Rivas¹,², Matt Hanna¹, Aaron McKenna¹, Timothy Fennell¹, Andrew Kernytsky¹, Andrey Sivachenko¹, Kristian Cibulskis¹, Stacey Gabriel¹, David Altshuler¹,², Mark J. Daly¹,²

Broad Institute¹, Harvard University², Brigham and Women's Hospital³

01 May 2011-Nature Genetics

TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.

Abstract: Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

10,056 citations

Journal Article

An integrated map of genetic variation from 1,092 human genomes

Gonçalo R. Abecasis¹, Adam Auton², Lisa D. Brooks³, Mark A. DePristo⁴, Richard Durbin⁵, Robert E. Handsaker⁴,⁶, Hyun Min Kang¹, Gabor T. Marth⁷, Gil McVean⁸

University of Michigan¹, Yeshiva University², National Institutes of Health³, Broad Institute⁴, Wellcome Trust Sanger Institute⁵, Harvard University⁶, Boston College⁷, University of Oxford⁸

01 Nov 2012-Nature

TL;DR: It is shown that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites.

Abstract: By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.

7,710 citations

Journal Article

Sequencing technologies - the next generation

Michael L. Metzker¹

Baylor College of Medicine¹

01 Jan 2010-Nature Reviews Genetics

TL;DR: A technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments is presented.

Abstract: Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.

7,023 citations
