Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity

Austin M. Dulak¹, Petar Stojanov¹,²,³, Shouyong Peng¹,³, Michael S. Lawrence³, Cameron Fox¹, Chip Stewart³, Santhoshi Bandla⁴, Yu Imamura¹, Steven E. Schumacher¹,³, Erica Shefler³, Aaron McKenna³, Scott L. Carter³, Kristian Cibulskis³, Andrey Sivachenko³, Gordon Saksena³, Douglas Voet³, Alex H. Ramos³, Daniel Auclair³, Kristin Thompson³, Carrie Sougnez³, Robert C. Onofrio³, Candace Guiducci³, Rameen Beroukhim, Zhongren Zhou⁴, Lin Lin⁵, Jules Lin⁵, Rishindra M. Reddy⁵, Andrew C. Chang⁵, Rodney Landrenau⁶, Arjun Pennathur⁶, Shuji Ogino, James D. Luketich⁶, Todd R. Golub, Stacey Gabriel³, Eric S. Lander¹,²,³, David G. Beer⁵, Tony E. Godfrey⁴, Gad Getz¹,³, Adam J. Bass

Harvard University¹, Massachusetts Institute of Technology², Broad Institute³, University of Rochester⁴, University of Michigan⁵, University of Pittsburgh⁶

01 Mar 2013

TL;DR: A mutational signature defined by a high prevalence of A>C transversions at AA dinucleotides is identified and the potential activation of the RAC1 pathway is suggested as a contributor to EAC tumorigenesis.
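To make the signature in the TL;DR concrete, the following is a minimal sketch of how one might tally A>C transversions at AA dinucleotides from a list of substitutions. It assumes that "at AA dinucleotides" means the mutated A has an A immediately 5′ of it, and that each mutation is given on the reference strand with its two flanking bases; both are assumptions for illustration, and the function names and toy data are hypothetical rather than taken from the paper.

```python
def is_a_to_c_at_aa(ref, alt, five_prime, three_prime):
    """Return True for an A>C change whose 5' neighbour is A (either strand).

    All bases are given on the reference strand. Assumption: the AA
    dinucleotide is the mutated A plus the base immediately 5' of it.
    """
    if ref == "A" and alt == "C":
        return five_prime == "A"
    # The same event seen from the opposite strand appears as T>G, and the
    # 5' neighbour on that strand is the complement of the 3' reference base.
    if ref == "T" and alt == "G":
        return three_prime == "T"
    return False


def fraction_a_to_c_at_aa(mutations):
    """Fraction of substitutions that are A>C at an AA dinucleotide."""
    if not mutations:
        return 0.0
    hits = sum(is_a_to_c_at_aa(*m) for m in mutations)
    return hits / len(mutations)


# Hypothetical toy input: (ref, alt, 5' base, 3' base) per substitution.
TOY_MUTATIONS = [
    ("A", "C", "A", "G"),  # A>C with an A on the 5' side: counted
    ("T", "G", "C", "T"),  # reverse-strand view of the same class: counted
    ("A", "C", "G", "A"),  # A>C, but the 5' base is G: not counted
    ("C", "T", "A", "A"),  # a different substitution class: not counted
]
```

Folding the T>G case into the same class reflects that a substitution call does not reveal which strand was originally altered.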

Funding: National Human Genome Research Institute (U.S.) (Large Scale Sequencing Program Grant U54 HG003067)

Citations

Signatures of mutational processes in human cancer

Ludmil B. Alexandrov¹, Serena Nik-Zainal²,³, David C. Wedge¹, Samuel Aparicio⁴, Sam Behjati¹,⁵, Andrew V. Biankin, Graham R. Bignell¹, Niccolo Bolli¹,⁵, Åke Borg³, Anne Lise Børresen-Dale⁶,⁷, Sandrine Boyault⁸, Birgit Burkhardt⁸, Adam Butler¹, Carlos Caldas⁹, Helen Davies¹, Christine Desmedt, Roland Eils⁵, Jorunn E. Eyfjord¹⁰, John A. Foekens¹¹, Mel Greaves¹², Fumie Hosoda¹³, Barbara Hutter⁵, Tomislav Ilicic¹, Sandrine Imbeaud¹⁴,¹⁵, Marcin Imielinski¹⁵, Natalie Jäger⁵, David T. W. Jones¹⁶, David T. Jones¹, Stian Knappskog¹¹,¹⁷, Marcel Kool¹¹, Sunil R. Lakhani¹⁸, Carlos López-Otín¹⁸, Sancha Martin¹, Nikhil C. Munshi¹⁹,²⁰, Hiromi Nakamura¹³, Paul A. Northcott¹⁶, Marina Pajic²¹, Elli Papaemmanuil¹, Angelo Paradiso²², John V. Pearson²³, Xose S. Puente¹⁸, Keiran Raine¹, Manasa Ramakrishna¹, Andrea L. Richardson¹⁹,²², Julia Richter²², Philip Rosenstiel²², Matthias Schlesner⁵, Ton N. Schumacher²⁴, Paul N. Span²⁵, Jon W. Teague¹, Yasushi Totoki¹³, Andrew Tutt²⁴, Rafael Valdés-Mas¹⁸, Marit M. van Buuren²⁵, Laura van ’t Veer²⁶, Anne Vincent-Salomon²⁷, Nicola Waddell²³, Lucy R. Yates¹, ICGC PedBrain²⁴, Jessica Zucman-Rossi¹⁴,¹⁵, P. Andrew Futreal¹, Ultan McDermott¹, Peter Lichter²⁴, Matthew Meyerson¹⁵,¹⁹, Sean M. Grimmond²³, Reiner Siebert²², Elias Campo²⁸, Tatsuhiro Shibata¹³, Stefan M. Pfister¹¹,¹⁶, Peter J. Campbell²,²⁹,³⁰, Michael R. Stratton²,³¹

Wellcome Trust Sanger Institute¹, Wellcome Trust², Cambridge University Hospitals NHS Foundation Trust³, University of British Columbia⁴, University of Cambridge⁵, The Breast Cancer Research Foundation⁶, Oslo University Hospital⁷, University of Oslo⁸, University of Münster⁹, Université libre de Bruxelles¹⁰, German Cancer Research Center¹¹, University of Iceland¹², Erasmus University Rotterdam¹³, Paris Descartes University¹⁴, French Institute of Health and Medical Research¹⁵, University of Paris¹⁶, Broad Institute¹⁷, University of Bergen¹⁸, University of Queensland¹⁹, University of Oviedo²⁰, University of Glasgow²¹, Harvard University²², United States Department of Veterans Affairs²³, Netherlands Cancer Institute²⁴, University of Kiel²⁵, Radboud University Nijmegen²⁶, King's College London²⁷, Curie Institute²⁸, University of New South Wales²⁹, Bankstown Lidcombe Hospital³⁰, University of Barcelona³¹

22 Aug 2013-Nature

TL;DR: It is shown that hypermutation localized to small genomic regions, ‘kataegis’, is found in many cancer types, and the results reveal the diversity of mutational processes underlying the development of cancer.

Abstract: All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.
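The idea of hypermutation localized to small genomic regions can be illustrated with a simple scan for runs of closely spaced mutations. This is only a sketch: the gap and run-length thresholds below are illustrative defaults chosen for the example, not values stated in the abstract, and the function name is hypothetical.

```python
def find_dense_clusters(positions, max_gap=1000, min_mutations=6):
    """Return (start, end, count) for runs of closely spaced mutations.

    positions: mutation coordinates on a single chromosome.
    max_gap and min_mutations are assumed thresholds for illustration.
    """
    clusters = []
    run = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current run.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_mutations:
                clusters.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    # Flush the final run.
    if len(run) >= min_mutations:
        clusters.append((run[0], run[-1], len(run)))
    return clusters


# Hypothetical coordinates: one tight cluster plus scattered mutations.
TOY_POSITIONS = [100, 300, 500, 900, 1200, 1500, 50000, 120000, 120100]
```

On the toy coordinates, only the first six positions form a run long enough to be reported; the isolated and paired mutations further along are ignored.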

7,904 citations

Comprehensive molecular characterization of gastric adenocarcinoma

Adam J. Bass, Vesteinn Thorsson, Ilya Shmulevich, Sheila Reynolds, +254 more authors (32 institutions)

11 Sep 2014-Nature

TL;DR: A comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project is described and a molecular classification dividing gastric cancer into four subtypes is proposed.

Abstract: Gastric cancer was the world’s third leading cause of cancer mortality in 2012, responsible for 723,000 deaths¹. The vast majority of gastric cancers are adenocarcinomas, which can be further subdivided into intestinal and diffuse types according to the Lauren classification². An alternative system, proposed by the World Health Organization, divides gastric cancer into papillary, tubular, mucinous (colloid) and poorly cohesive carcinomas³. These classification systems have little clinical utility, making the development of robust classifiers that can guide patient therapy an urgent priority. The majority of gastric cancers are associated with infectious agents, including the bacterium Helicobacter pylori⁴ and Epstein–Barr virus (EBV). The distribution of histological subtypes of gastric cancer and the frequencies of H. pylori and EBV associated gastric cancer vary across the globe⁵. A small minority of gastric cancer cases are associated with germline mutation in E-cadherin (CDH1)⁶ or mismatch repair genes⁷ (Lynch syndrome), whereas sporadic mismatch repair-deficient gastric cancers have epigenetic silencing of MLH1 in the context of a CpG island methylator phenotype (CIMP)⁸. Molecular profiling of gastric cancer has been performed using gene expression or DNA sequencing⁹⁻¹², but has not led to a clear biologic classification scheme. The goals of this study by The Cancer Genome Atlas (TCGA) were to develop a robust molecular classification of gastric cancer and to identify dysregulated pathways and candidate drivers of distinct classes of gastric cancer.

4,583 citations

Mutational heterogeneity in cancer and the search for new cancer genes

Elena Helman, Eric S. Lander

01 Jun 2013

TL;DR: MutSigCV, a method that accounts for mutational heterogeneity, is applied to exome sequences from 3,083 tumour-normal pairs and reveals extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology.

Abstract: Major international projects are underway that are aimed at creating a comprehensive catalogue of all the genes responsible for the initiation and progression of cancer. These studies involve the sequencing of matched tumour–normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour–normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.
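The false-positive problem described in this abstract can be illustrated with a toy significance test. The sketch below is not MutSigCV: it swaps in a plain Poisson tail test to show how the same mutation count in a long gene looks highly significant under a single genome-wide background rate, yet unremarkable once a higher gene-specific rate is used. The function names, rates, gene length, and counts are all hypothetical.

```python
import math


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    cdf = sum(
        math.exp(-expected) * expected**i / math.factorial(i)
        for i in range(observed)
    )
    return max(0.0, 1.0 - cdf)


def gene_p_value(n_mutations, gene_length_bp, n_samples, rate_per_bp):
    """Tail probability of seeing at least n_mutations by chance alone."""
    expected = gene_length_bp * n_samples * rate_per_bp
    return poisson_tail(n_mutations, expected)


# Hypothetical long gene: 100 kb of coding sequence, 100 tumours, 40 mutations.
# With one genome-wide rate, the expected count is 20 mutations.
p_global = gene_p_value(40, 100_000, 100, rate_per_bp=2e-6)
# With a gene-specific rate twice as high, the expected count is 40.
p_local = gene_p_value(40, 100_000, 100, rate_per_bp=4e-6)
```

Here `p_global` is far below conventional significance thresholds while `p_local` is close to one half, which mirrors how a uniform background model can nominate implausible genes.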

2,145 citations

Integrated Genomic Characterization of Papillary Thyroid Carcinoma

Nishant Agrawal, Rehan Akbani, B. Arman Aksoy, Adrian Ally, +239 more authors (1 institution)

23 Oct 2014-Cell

TL;DR: The genomic landscape of 496 PTCs is described and a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties is proposed, which has the potential to improve their pathological classification and better inform the management of the disease.

2,096 citations

Maftools: efficient and comprehensive analysis of somatic variants in cancer.

Anand Mayakonda¹,², De-Chen Lin³, Yassen Assenov¹,⁴, Christoph Plass¹,⁴, H. Phillip Koeffler²,³

German Cancer Research Center¹, National University of Singapore², Cedars-Sinai Medical Center³, Heidelberg University⁴

19 Oct 2018-Genome Research

TL;DR: An R Bioconductor package, Maftools, is described, which offers a multitude of analysis and visualization modules that are commonly used in cancer genomic studies, including driver gene identification, pathway, signature, enrichment, and association analyses, and is independent of larger alignment files.

Abstract: Numerous large-scale genomic studies of matched tumor-normal samples have established the somatic landscapes of most cancer types. However, the downstream analysis of data from somatic mutations entails a number of computational and statistical approaches, requiring usage of independent software and numerous tools. Here, we describe an R Bioconductor package, Maftools, which offers a multitude of analysis and visualization modules that are commonly used in cancer genomic studies, including driver gene identification, pathway, signature, enrichment, and association analyses. Maftools only requires somatic variants in Mutation Annotation Format (MAF) and is independent of larger alignment files. With the implementation of well-established statistical and computational methods, Maftools facilitates data-driven research and comparative analysis to discover novel results from publicly available data sets. In the present study, using three of the well-annotated cohorts from The Cancer Genome Atlas (TCGA), we describe the application of Maftools to reproduce known results. More importantly, we show that Maftools can also be used to uncover novel findings through integrative analysis.
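Because the package needs only somatic variants in MAF, the kind of per-gene summary it starts from can be sketched in a few lines. The example below is plain Python, not Maftools (which is an R package) and not its API; it assumes a tab-delimited MAF with the standard `Hugo_Symbol`, `Tumor_Sample_Barcode`, and `Variant_Classification` columns, and the function name and toy records are hypothetical.

```python
import csv
import io
from collections import defaultdict


def samples_per_gene(maf_text, skip_classes=("Silent",)):
    """Count distinct mutated samples per gene from MAF-formatted text."""
    # MAF files may start with '#' comment lines; drop them before parsing.
    lines = [ln for ln in maf_text.splitlines() if not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    samples = defaultdict(set)
    for row in reader:
        if row["Variant_Classification"] in skip_classes:
            continue
        samples[row["Hugo_Symbol"]].add(row["Tumor_Sample_Barcode"])
    # Most frequently mutated genes first, ties broken alphabetically.
    return sorted(
        ((gene, len(ids)) for gene, ids in samples.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy MAF with only the three columns used above.
TOY_MAF = "\n".join([
    "#version 2.4",
    "Hugo_Symbol\tTumor_Sample_Barcode\tVariant_Classification",
    "TP53\tS1\tMissense_Mutation",
    "TP53\tS2\tNonsense_Mutation",
    "TP53\tS2\tMissense_Mutation",
    "KRAS\tS1\tMissense_Mutation",
    "TTN\tS3\tSilent",
])
```

Counting distinct samples rather than rows keeps a tumour with two hits in the same gene from being counted twice.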

1,990 citations

References

A method and server for predicting damaging missense mutations.

Ivan Adzhubei¹, Steffen Schmidt², Leonid Peshkin³, Vasily Ramensky⁴, Anna Gerasimova⁵, Peer Bork, Alexey S. Kondrashov⁵, Shamil R. Sunyaev¹

Brigham and Women's Hospital¹, Max Planck Society², Harvard University³, Engelhardt Institute of Molecular Biology⁴, University of Michigan⁵

01 Apr 2010-Nature Methods

TL;DR: A new method and corresponding software tool, PolyPhen-2, is presented; it differs from the earlier tool PolyPhen in its set of predictive features, alignment pipeline, and method of classification, and its performance, as measured by receiver operating characteristic curves, was consistently superior.

Abstract: To the Editor: Applications of rapidly advancing sequencing technologies exacerbate the need to interpret individual sequence variants. Sequencing of phenotyped clinical subjects will soon become a method of choice in studies of the genetic causes of Mendelian and complex diseases. New exon capture techniques will direct sequencing efforts towards the most informative and easily interpretable protein-coding fraction of the genome. Thus, the demand for computational predictions of the impact of protein sequence variants will continue to grow. Here we present a new method and the corresponding software tool, PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), which is different from the early tool PolyPhen¹ in the set of predictive features, alignment pipeline, and the method of classification (Fig. 1a). PolyPhen-2 uses eight sequence-based and three structure-based predictive features (Supplementary Table 1) which were selected automatically by an iterative greedy algorithm (Supplementary Methods). Majority of these features involve comparison of a property of the wild-type (ancestral, normal) allele and the corresponding property of the mutant (derived, disease-causing) allele, which together define an amino acid replacement. Most informative features characterize how well the two human alleles fit into the pattern of amino acid replacements within the multiple sequence alignment of homologous proteins, how distant the protein harboring the first deviation from the human wild-type allele is from the human protein, and whether the mutant allele originated at a hypermutable site². The alignment pipeline selects the set of homologous sequences for the analysis using a clustering algorithm and then constructs and refines their multiple alignment (Supplementary Fig. 1). The functional significance of an allele replacement is predicted from its individual features (Supplementary Figs. 2–4) by Naive Bayes classifier (Supplementary Methods).
[Figure 1: PolyPhen-2 pipeline and prediction accuracy. (a) Overview of the algorithm. (b) Receiver operating characteristic (ROC) curves for predictions made by PolyPhen-2 using five-fold cross-validation on HumDiv (red) and HumVar³ (light green). UniRef100 (solid ...]
We used two pairs of datasets to train and test PolyPhen-2. We compiled the first pair, HumDiv, from all 3,155 damaging alleles with known effects on the molecular function causing human Mendelian diseases, present in the UniProt database, together with 6,321 differences between human proteins and their closely related mammalian homologs, assumed to be non-damaging (Supplementary Methods). The second pair, HumVar³, consists of all the 13,032 human disease-causing mutations from UniProt, together with 8,946 human nsSNPs without annotated involvement in disease, which were treated as non-damaging. We found that PolyPhen-2 performance, as presented by its receiver operating characteristic curves, was consistently superior compared to PolyPhen (Fig. 1b) and it also compared favorably with the three other popular prediction tools⁴⁻⁶ (Fig. 1c). For a false positive rate of 20%, PolyPhen-2 achieves the rate of true positive predictions of 92% and 73% on HumDiv and HumVar, respectively (Supplementary Table 2). One reason for a lower accuracy of predictions on HumVar is that nsSNPs assumed to be non-damaging in HumVar contain a sizable fraction of mildly deleterious alleles. In contrast, most of amino acid replacements assumed non-damaging in HumDiv must be close to selective neutrality. Because alleles that are even mildly but unconditionally deleterious cannot be fixed in the evolving lineage, no method based on comparative sequence analysis is ideal for discriminating between drastically and mildly deleterious mutations, which are assigned to the opposite categories in HumVar. Another reason is that HumDiv uses an extra criterion to avoid possible erroneous annotations of damaging mutations.
For a mutation, PolyPhen-2 calculates Naive Bayes posterior probability that this mutation is damaging and reports estimates of false positive (the chance that the mutation is classified as damaging when it is in fact non-damaging) and true positive (the chance that the mutation is classified as damaging when it is indeed damaging) rates. A mutation is also appraised qualitatively, as benign, possibly damaging, or probably damaging (Supplementary Methods). The user can choose between HumDiv- and HumVar-trained PolyPhen-2. Diagnostics of Mendelian diseases requires distinguishing mutations with drastic effects from all the remaining human variation, including abundant mildly deleterious alleles. Thus, HumVar-trained PolyPhen-2 should be used for this task. In contrast, HumDiv-trained PolyPhen-2 should be used for evaluating rare alleles at loci potentially involved in complex phenotypes, dense mapping of regions identified by genome-wide association studies, and analysis of natural selection from sequence data, where even mildly deleterious alleles must be treated as damaging.
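The abstract describes a Naive Bayes posterior probability together with true and false positive rates. As a rough illustration of those two quantities (not PolyPhen-2's implementation, features, or trained parameters), the sketch below combines per-feature likelihoods under a conditional-independence assumption and then evaluates a score threshold on labelled examples. All numbers and function names are hypothetical.

```python
import math


def naive_bayes_posterior(prior_damaging, likelihoods):
    """Posterior P(damaging | features) under a Naive Bayes assumption.

    likelihoods: list of (P(feature | damaging), P(feature | benign)) pairs.
    """
    log_d = math.log(prior_damaging)
    log_b = math.log(1.0 - prior_damaging)
    for p_damaging, p_benign in likelihoods:
        log_d += math.log(p_damaging)
        log_b += math.log(p_benign)
    # Normalise in log space to avoid underflow with many features.
    top = max(log_d, log_b)
    w_d = math.exp(log_d - top)
    w_b = math.exp(log_b - top)
    return w_d / (w_d + w_b)


def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) at a score cut-off."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Hypothetical example: two features, equal prior.
posterior = naive_bayes_posterior(0.5, [(0.8, 0.2), (0.6, 0.3)])
```

Sweeping the threshold passed to `rates_at_threshold` over all observed scores traces out a receiver operating characteristic curve of the kind the abstract refers to.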

11,571 citations

A comparison of normalization methods for high density oligonucleotide array data based on variance and bias

Benjamin M. Bolstad¹, Rafael A. Irizarry², Magnus Åstrand³, Terence P. Speed¹,⁴

University of California, Berkeley¹, Johns Hopkins University², AstraZeneca³, Walter and Eliza Hall Institute of Medical Research⁴

22 Jan 2003-Bioinformatics

TL;DR: Three complete data methods of normalization at the probe intensity level are presented and compared with two baseline-array methods (a one-number scaling algorithm and a non-linear normalizing relation) in terms of the variability and bias of an expression measure; the simplest and quickest complete data method is found to perform favorably.

Abstract: Motivation: When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations. Results: We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably. Availability: Software implementing all three of the complete data normalization methods is available as part of the R package Affy, which is a part of the Bioconductor project http://www.bioconductor.org. Contact: bolstad@stat.berkeley.edu Supplementary information: Additional figures may be found at http://www.stat.berkeley.edu/~bolstad/normalize/index.html
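A complete data method uses every array to build the normalizing relation. One widely used example of that idea is quantile normalization, sketched below in plain Python. The abstract does not name its three methods, so treating quantile normalization as representative is an assumption here, and the function is a simplified illustration (ties are not averaged) rather than the Affy package's implementation.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    arrays: list of equal-length lists of probe intensities, one per array.
    Simplification: tied values are ranked in input order, not averaged.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(col[k] for col in sorted_arrays) / len(arrays) for k in range(n)
    ]
    normalized = []
    for a in arrays:
        # Indices of this array's probes from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities for three probes on two arrays.
TOY_ARRAYS = [[5, 2, 3], [4, 1, 6]]
TOY_NORMALIZED = quantile_normalize(TOY_ARRAYS)
```

After normalization each array contains the same set of values, assigned according to each probe's rank within its own array, so no single array has to be chosen as a baseline.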

8,324 citations

Comprehensive mapping of long-range interactions reveals folding principles of the human genome.

Erez Lieberman Aiden¹, Nynke L. van Berkum², Louise Williams¹, Maxim Imakaev¹, Tobias Ragoczy³,⁴, Agnes Telling³,⁴, Ido Amit¹, Bryan R. Lajoie², Peter J. Sabo³, Michael O. Dorschner³, Richard Sandstrom³, Bradley E. Bernstein¹,⁵, Michaël Bender³, Mark Groudine³,⁴, Andreas Gnirke¹, John A. Stamatoyannopoulos³, Leonid A. Mirny¹, Eric S. Lander¹,⁵, Job Dekker²

Massachusetts Institute of Technology¹, University of Massachusetts Medical School², University of Washington³, Fred Hutchinson Cancer Research Center⁴, Harvard University⁵

09 Oct 2009-Science

TL;DR: Hi-C is described, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing and demonstrates the power of Hi-C to map the dynamic conformations of entire genomes.

Abstract: We describe Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We constructed spatial proximity maps of the human genome with Hi-C at a resolution of 1 megabase. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free, polymer conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.
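A spatial proximity map at 1 megabase resolution is, at its simplest, a count of ligated read pairs per pair of genomic bins. The sketch below shows that binning step only; it assumes read pairs are already aligned and given as (chromosome, position) tuples, and it omits the filtering and normalization a real Hi-C analysis would need. The function name and coordinates are hypothetical.

```python
from collections import Counter


def bin_contacts(read_pairs, bin_size=1_000_000):
    """Count contacts between fixed-size genomic bins.

    read_pairs: iterable of ((chrom_a, pos_a), (chrom_b, pos_b)).
    Returns a Counter keyed by an order-independent pair of (chrom, bin).
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in read_pairs:
        bin_a = (chrom_a, pos_a // bin_size)
        bin_b = (chrom_b, pos_b // bin_size)
        # Sort so that (a, b) and (b, a) land in the same matrix cell.
        counts[tuple(sorted((bin_a, bin_b)))] += 1
    return counts


# Hypothetical aligned read pairs.
TOY_PAIRS = [
    (("chr1", 150_000), ("chr1", 2_300_000)),
    (("chr1", 2_900_000), ("chr1", 700_000)),
    (("chr1", 500_000), ("chr2", 10)),
]
TOY_CONTACTS = bin_contacts(TOY_PAIRS)
```

Storing the map as a sparse counter rather than a dense matrix keeps the example short; most bin pairs on different chromosomes receive few or no contacts.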

7,180 citations

Comprehensive molecular characterization of human colon and rectal cancer

Donna M. Muzny, Matthew N. Bainbridge, Kyle Chang, Huyen Dinh, +317 more authors (24 institutions)

19 Jul 2012-Nature

TL;DR: Integrative analyses suggest new markers for aggressive colorectal carcinoma and an important role for MYC-directed transcriptional activation and repression.

Abstract: To characterize somatic alterations in colorectal carcinoma, we conducted a genome-scale analysis of 276 samples, analysing exome sequence, DNA copy number, promoter methylation and messenger RNA and microRNA expression. A subset of these samples (97) underwent low-depth-of-coverage whole-genome sequencing. In total, 16% of colorectal carcinomas were found to be hypermutated: three-quarters of these had the expected high microsatellite instability, usually with hypermethylation and MLH1 silencing, and one-quarter had somatic mismatch-repair gene and polymerase ε (POLE) mutations. Excluding the hypermutated cancers, colon and rectum cancers were found to have considerably similar patterns of genomic alteration. Twenty-four genes were significantly mutated, and in addition to the expected APC, TP53, SMAD4, PIK3CA and KRAS mutations, we found frequent mutations in ARID1A, SOX9 and FAM123B. Recurrent copy-number alterations include potentially drug-targetable amplifications of ERBB2 and newly discovered amplification of IGF2. Recurrent chromosomal translocations include the fusion of NAV2 and WNT pathway member TCF7L1. Integrative analyses suggest new markers for aggressive colorectal carcinoma and an important role for MYC-directed transcriptional activation and repression.

6,883 citations

dbSNP: the NCBI database of genetic variation

Stephen T. Sherry¹, Minghong Ward, Michael Kholodov, Jonathan Baker, Lon Phan, Elizabeth M. Smigielski, Karl Sirotkin

National Institutes of Health¹

01 Jan 2001-Nucleic Acids Research

TL;DR: The dbSNP database is a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, and is integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data.

Abstract: In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K.Sirotkin (1999) Genome Res., 9, 677–679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.

6,449 citations