Author

Elizabeth Nickerson

Bio: Elizabeth Nickerson is an academic researcher from the Broad Institute. The author has contributed to research on the topics Exome and Mutation, has an h-index of 24, and has co-authored 33 publications receiving 20,987 citations. Previous affiliations of Elizabeth Nickerson include University of California, Irvine and Baylor College of Medicine.

Papers
Journal Article
15 Sep 2005-Nature
TL;DR: A scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments is described and demonstrated by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.
Abstract: The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

8,434 citations

Journal Article
Michael S. Lawrence1, Petar Stojanov1,2, Paz Polak1,2,3, Gregory V. Kryukov1,2,3, Kristian Cibulskis1, Andrey Sivachenko1, Scott L. Carter1, Chip Stewart1, Craig H. Mermel1,2, Steven A. Roberts4, Adam Kiezun1, Peter S. Hammerman1,2, Aaron McKenna1,5, Yotam Drier, Lihua Zou1, Alex H. Ramos1, Trevor J. Pugh1,2, Nicolas Stransky1, Elena Helman1,6, Jaegil Kim1, Carrie Sougnez1, Lauren Ambrogio1, Elizabeth Nickerson1, Erica Shefler1, Maria L. Cortes1, Daniel Auclair1, Gordon Saksena1, Douglas Voet1, Michael S. Noble1, Daniel DiCara1, Pei Lin1, Lee Lichtenstein1, David I. Heiman1, Timothy Fennell1, Marcin Imielinski1,2, Bryan Hernandez1, Eran Hodis1,2, Sylvan C. Baca1,2, Austin M. Dulak1,2, Jens G. Lohr1,2, Dan A. Landau1,2,7, Catherine J. Wu2, Jorge Melendez-Zajgla, Alfredo Hidalgo-Miranda, Amnon Koren1,2, Steven A. McCarroll1,2, Jaume Mora8, Ryan S. Lee2,9, Brian D. Crompton2,9, Robert C. Onofrio1, Melissa Parkin1, Wendy Winckler1, Kristin G. Ardlie1, Stacey Gabriel1, Charles W. M. Roberts2,9, Jaclyn A. Biegel10, Kimberly Stegmaier1,2,9, Adam J. Bass1,2, Levi A. Garraway1,2, Matthew Meyerson1,2, Todd R. Golub, Dmitry A. Gordenin4, Shamil R. Sunyaev1,2,3, Eric S. Lander1,2,6, Gad Getz1,2
11 Jul 2013-Nature
TL;DR: A fundamental problem with cancer genome studies is described: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds and the list includes many implausible genes, suggesting extensive false-positive findings that overshadow true driver events.
Abstract: Major international projects are underway that are aimed at creating a comprehensive catalogue of all the genes responsible for the initiation and progression of cancer. These studies involve the sequencing of matched tumour-normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour-normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.

4,411 citations
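
The abstract above attributes false-positive "significant" genes to mutational heterogeneity: a single cohort-wide background rate underestimates how many mutations some genes accumulate by chance. The sketch below illustrates that reasoning with a plain binomial tail test. It is not MutSigCV, which the abstract does not specify in detail; the function name `binomial_tail`, the background rates, and the gene and cohort sizes are all hypothetical values chosen for illustration.

```python
import math


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    cdf = 0.0
    for i in range(k):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        cdf += math.exp(log_pmf)
    return max(0.0, 1.0 - cdf)


# Hypothetical example: 100 tumours, one 10,000-base gene, 12 observed mutations.
bases = 100 * 10_000
observed = 12
uniform_rate = 3e-6  # assumed cohort-wide average background rate per base
local_rate = 1e-5    # assumed gene-specific rate (e.g. a late-replicating gene)

p_uniform = binomial_tail(observed, bases, uniform_rate)
p_local = binomial_tail(observed, bases, local_rate)
print(f"uniform background: p = {p_uniform:.2e}")
print(f"gene-specific background: p = {p_local:.2e}")
```

Under these assumed numbers the same 12 mutations look highly significant against the uniform rate but unremarkable against the gene-specific rate, which is the kind of artefact the abstract describes.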

Journal Article
20 Jul 2012-Cell
TL;DR: The spectrum of driver mutations provided unequivocal genomic evidence for a direct mutagenic role of UV light in melanoma pathogenesis, providing oncogenic insights in BRAF- and NRAS-driven melanoma as well as those without known NRAS/BRAF mutations.

2,251 citations

Journal Article
TL;DR: SPOP mutations may define a new molecular subtype of prostate cancer, with mutations involving the SPOP substrate-binding cleft in 6–15% of tumors across multiple independent cohorts.
Abstract: Prostate cancer is the second most common cancer in men worldwide and causes over 250,000 deaths each year. Overtreatment of indolent disease also results in significant morbidity. Common genetic alterations in prostate cancer include losses of NKX3.1 (8p21) and PTEN (10q23), gains of AR (the androgen receptor gene) and fusion of ETS family transcription factor genes with androgen-responsive promoters. Recurrent somatic base-pair substitutions are believed to be less contributory in prostate tumorigenesis but have not been systematically analyzed in large cohorts. Here, we sequenced the exomes of 112 prostate tumor and normal tissue pairs. New recurrent mutations were identified in multiple genes, including MED12 and FOXA1. SPOP was the most frequently mutated gene, with mutations involving the SPOP substrate-binding cleft in 6-15% of tumors across multiple independent cohorts. Prostate cancers with mutant SPOP lacked ETS family gene rearrangements and showed a distinct pattern of genomic alterations. Thus, SPOP mutations may define a new molecular subtype of prostate cancer.

1,370 citations

Journal Article
TL;DR: The Generic Genome Browser (GBrowse) is described: a Web-based application for displaying genomic annotations and other features, designed for easy integration with other components of a model organism system Web site.
Abstract: The Generic Model Organism System Database Project (GMOD) seeks to develop reusable software components for model organism system databases. In this paper we describe the Generic Genome Browser (GBrowse), a Web-based application for displaying genomic annotations and other features. For the end user, features of the browser include the ability to scroll and zoom through arbitrary regions of a genome, to enter a region of the genome by searching for a landmark or performing a full text search of all features, and the ability to enable and disable tracks and change their relative order and appearance. The user can upload private annotations to view them in the context of the public ones, and publish those annotations to the community. For the data provider, features of the browser software include reliance on readily available open source components, simple installation, flexible configuration, and easy integration with other components of a model organism system Web site. GBrowse is freely available under an open source license. The software, its documentation, and support are available at http://www.gmod.org.

1,177 citations


Cited by
Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations
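
The GATK abstract above mentions a MapReduce-style design and coverage calculators as an example tool. As a rough illustration of that pattern only, the sketch below computes per-position read depth with a map step and a reduce step in plain Python. It does not use or reproduce the GATK API; the function names and the `(start, length)` read representation are assumptions made for this example.

```python
from collections import Counter
from functools import reduce


def map_read(read):
    """Map step: emit a count of 1 for every reference position the read covers."""
    start, length = read
    return Counter(range(start, start + length))


def reduce_counts(acc, counts):
    """Reduce step: add one read's per-position counts into the running depth table."""
    acc.update(counts)
    return acc


def coverage(reads):
    """Return a Counter mapping reference position -> number of covering reads."""
    return reduce(reduce_counts, map(map_read, reads), Counter())


# Hypothetical reads given as (start position, read length).
depth = coverage([(0, 4), (2, 4)])
for position in sorted(depth):
    print(position, depth[position])
```

Because the map step handles each read independently and the reduce step only sums counts, the per-read work could in principle be distributed, which is the property the abstract attributes to separating analysis calculations from data management.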

Journal Article
21 Dec 2006-Nature
TL;DR: Metagenomic and biochemical analyses demonstrate that changes in the relative abundance of the Bacteroidetes and Firmicutes affect the metabolic potential of the mouse gut microbiota, and the results indicate that the obese microbiome has an increased capacity to harvest energy from the diet.
Abstract: The worldwide obesity epidemic is stimulating efforts to identify host and environmental factors that affect energy balance. Comparisons of the distal gut microbiota of genetically obese mice and their lean littermates, as well as those of obese and lean human volunteers have revealed that obesity is associated with changes in the relative abundance of the two dominant bacterial divisions, the Bacteroidetes and the Firmicutes. Here we demonstrate through metagenomic and biochemical analyses that these changes affect the metabolic potential of the mouse gut microbiota. Our results indicate that the obese microbiome has an increased capacity to harvest energy from the diet. Furthermore, this trait is transmissible: colonization of germ-free mice with an 'obese microbiota' results in a significantly greater increase in total body fat than colonization with a 'lean microbiota'. These results identify the gut microbiota as an additional contributing factor to the pathophysiology of obesity.

10,126 citations

Journal Article
TL;DR: Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies; on real Solexa data sets without read pairs, its contig lengths are in close agreement with simulated results without read-pair information.
Abstract: We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.

9,389 citations
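
The Velvet abstract above relies on two ideas: a de Bruijn graph built from short words (k-mers) and the N50 contig length used to report assembly quality. The sketch below shows the textbook form of both, with (k-1)-mers as nodes and k-mers as edges. It is not Velvet's actual data structure or algorithm, and the function names and example reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def n50(contig_lengths):
    """Return the largest length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Two hypothetical overlapping reads, split into 4-mers.
graph = de_bruijn_graph(["ACGTAC", "CGTACG"], k=4)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))

print("N50:", n50([8, 5, 3, 2, 2]))
```

Overlapping reads contribute the same edges, so identical k-mers collapse into one path in the graph; this is why the representation stays compact at high coverage.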

Journal Article
Ludmil B. Alexandrov1, Serena Nik-Zainal2,3, David C. Wedge1, Samuel Aparicio4, Sam Behjati1,5, Andrew V. Biankin, Graham R. Bignell1, Niccolo Bolli1,5, Åke Borg3, Anne Lise Børresen-Dale6,7, Sandrine Boyault8, Birgit Burkhardt8, Adam Butler1, Carlos Caldas9, Helen Davies1, Christine Desmedt, Roland Eils5, Jorunn E. Eyfjord10, John A. Foekens11, Mel Greaves12, Fumie Hosoda13, Barbara Hutter5, Tomislav Ilicic1, Sandrine Imbeaud14,15, Marcin Imielinsk14, Natalie Jäger5, David T. W. Jones16, David T. Jones1, Stian Knappskog11,17, Marcel Kool11, Sunil R. Lakhani18, Carlos López-Otín18, Sancha Martin1, Nikhil C. Munshi19,20, Hiromi Nakamura13, Paul A. Northcott16, Marina Pajic21, Elli Papaemmanuil1, Angelo Paradiso22, John V. Pearson23, Xose S. Puente18, Keiran Raine1, Manasa Ramakrishna1, Andrea L. Richardson20,22, Julia Richter22, Philip Rosenstiel22, Matthias Schlesner5, Ton N. Schumacher24, Paul N. Span25, Jon W. Teague1, Yasushi Totoki13, Andrew Tutt24, Rafael Valdés-Mas18, Marit M. van Buuren25, Laura van ’t Veer26, Anne Vincent-Salomon27, Nicola Waddell23, Lucy R. Yates1, ICGC PedBrain24, Jessica Zucman-Rossi14,15, P. Andrew Futreal1, Ultan McDermott1, Peter Lichter24, Matthew Meyerson14,20, Sean M. Grimmond23, Reiner Siebert22, Elias Campo28, Tatsuhiro Shibata13, Stefan M. Pfister11,16, Peter J. Campbell2,29,30, Michael R. Stratton2,31
22 Aug 2013-Nature
TL;DR: It is shown that hypermutation localized to small genomic regions, ‘kataegis’, is found in many cancer types, and the results reveal the diversity of mutational processes underlying the development of cancer.
Abstract: All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

7,904 citations

Journal Article
TL;DR: A technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments is presented.
Abstract: Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.

7,023 citations