Author

Rachel Maupin

Bio: Rachel Maupin is an academic researcher from Washington University in St. Louis. The author has contributed to research on topics including Chromosome 20 and sequence analysis. The author has an h-index of 5 and has co-authored 6 publications that have received 2,562 citations.

Papers
Journal Article
19 Jun 2003-Nature
TL;DR: The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. It is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic.
Abstract: The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. Here, we report that the MSY is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic. These classes contain all 156 known transcription units, which include 78 protein-coding genes that collectively encode 27 distinct proteins. The X-transposed sequences exhibit 99% identity to the X chromosome. The X-degenerate sequences are remnants of ancient autosomes from which the modern X and Y chromosomes evolved. The ampliconic class includes large regions (about 30% of the MSY euchromatin) where sequence pairs show greater than 99.9% identity, which is maintained by frequent gene conversion (non-reciprocal transfer). The most prominent features here are eight massive palindromes, at least six of which contain testis genes.

2,022 citations

Journal Article
LaDeana W. Hillier1, Robert S. Fulton1, Lucinda Fulton1, Tina Graves1, Kymberlie H. Pepin1, Caryn Wagner-McPherson1, Dan Layman1, Jason Maas1, Sara Jaeger1, Rebecca S. Walker1, Kristine M. Wylie1, Mandeep Sekhon1, Michael C. Becker1, Michelle O'Laughlin1, Mark E. Schaller1, Ginger A. Fewell1, Kimberly D. Delehaunty1, Tracie L. Miner1, William E. Nash1, Matt Cordes1, Hui Du1, Hui Sun1, Jennifer Edwards1, Holland Bradshaw-Cordum1, Johar Ali1, Stephanie Andrews1, Amber Isak1, Andrew Vanbrunt1, Christine Nguyen1, Feiyu Du1, Betty Lamar1, Laura Courtney1, Joelle Kalicki1, Philip Ozersky1, Lauren Bielicki1, Kelsi Scott1, Andrea Holmes1, Richard Harkins1, Anthony R. Harris1, Cindy Strong1, Shunfang Hou1, Chad Tomlinson1, Sara Dauphin-Kohlberg1, Amy Kozlowicz-Reilly1, Shawn Leonard1, Theresa Rohlfing1, Susan M. Rock1, Aye-Mon Tin-Wollam1, Amanda Abbott1, Patrick Minx1, Rachel Maupin1, Catrina Strowmatt1, Phil Latreille1, Nancy Miller1, Doug Johnson1, Jennifer Murray1, Jeffrey Woessner1, Michael C. Wendl1, Shiaw-Pyng Yang1, Brian Schultz1, John W. Wallis1, John Spieth1, Tamberlyn Bieri1, Joanne O. Nelson1, Nicolas Berkowicz1, Patricia Wohldmann1, Lisa Cook1, Matthew T. Hickenbotham1, James M. Eldred1, Donald Williams1, Joseph A. Bedell1, Elaine R. Mardis1, Sandra W. Clifton1, Stephanie L. Chissoe1, Marco A. Marra1, Marco A. Marra2, Christopher K. Raymond3, Eric Haugen3, Will Gillett3, Yang Zhou3, R. James3, Karen A. Phelps3, Shawn Iadanoto3, Kerry L. Bubb3, Elizabeth Simms3, Ruth Levy3, James B. Clendenning3, Rajinder Kaul3, W. James Kent4, Terrence S. Furey4, Robert Baertsch4, Michael R. Brent1, Evan Keibler1, Paul Flicek1, Peer Bork5, Mikita Suyama5, Jeffrey A. Bailey6, Matthew E. Portnoy7, David Torrents5, Asif T. Chinwalla1, Warren Gish1, Sean R. Eddy1, John Douglas Mcpherson1, John Douglas Mcpherson8, Maynard V. Olson3, Evan E. Eichler6, Eric D. Green7, Robert H. Waterston3, Robert H. Waterston1, Richard K. Wilson1 
10 Jul 2003-Nature
TL;DR: The euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far, has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence.
Abstract: Human chromosome 7 has historically received prominent attention in the human genetics community, primarily related to the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame.

244 citations

Journal Article
TL;DR: The use of an unbiased high-resolution genomic screen identified many genes not previously implicated in AML that may be relevant for pathogenesis, along with many known oncogenes and tumor suppressor genes.
Abstract: Cytogenetic analysis of acute myeloid leukemia (AML) cells has accelerated the identification of genes important for AML pathogenesis. To complement cytogenetic studies and to identify genes altered in AML genomes, we performed genome-wide copy number analysis with paired normal and tumor DNA obtained from 86 adult patients with de novo AML using 1.85 million feature SNP arrays. Acquired copy number alterations (CNAs) were confirmed using an ultra-dense array comparative genomic hybridization platform. A total of 201 somatic CNAs were found in the 86 AML genomes (mean, 2.34 CNAs per genome), with French-American-British system M6 and M7 genomes containing the most changes (10–29 CNAs per genome). Twenty-four percent of AML patients with normal cytogenetics had CNA, whereas 40% of patients with an abnormal karyotype had additional CNA detected by SNP array, and several CNA regions were recurrent. The mRNA expression levels of 57 genes were significantly altered in 27 of 50 recurrent CNA regions <5 megabases in size. A total of 8 uniparental disomy (UPD) segments were identified in the 86 genomes; 6 of 8 UPD calls occurred in samples with a normal karyotype. Collectively, 34 of 86 AML genomes (40%) contained alterations not found with cytogenetics, and 98% of these regions contained genes. Of 86 genomes, 43 (50%) had no CNA or UPD at this level of resolution. In this study of 86 adult AML genomes, the use of an unbiased high-resolution genomic screen identified many genes not previously implicated in AML that may be relevant for pathogenesis, along with many known oncogenes and tumor suppressor genes.

241 citations

Journal Article
LaDeana W. Hillier1, Tina Graves1, Robert S. Fulton1, Lucinda Fulton1, Kymberlie H. Pepin1, Patrick Minx1, Caryn Wagner-McPherson1, Dan Layman1, Kristine M. Wylie1, Mandeep Sekhon1, Michael C. Becker1, Ginger A. Fewell1, Kimberly D. Delehaunty1, Tracie L. Miner1, William E. Nash1, Colin Kremitzki1, Lachlan G. Oddy1, Hui Du1, Hui Sun1, Holland Bradshaw-Cordum1, Johar Ali1, Jason Carter1, Matt Cordes1, Anthony R. Harris1, Amber Isak1, Andrew Van Brunt1, Christine Nguyen1, Feiyu Du1, Laura Courtney1, Joelle Kalicki1, Philip Ozersky1, Scott Abbott1, Jon R. Armstrong1, Edward A. Belter1, Lauren Caruso1, Maria Cedroni1, Marc Cotton1, Teresa Davidson1, Anu Desai1, Glendoria Elliott1, Thomas Erb1, Catrina Fronick1, Tony Gaige1, William Haakenson1, Krista Haglund1, Andrea Holmes1, Richard Harkins1, Kyung Kim1, Scott Kruchowski1, Cindy Strong1, Neenu Grewal1, Ernest Goyea1, Shunfang Hou1, Andrew Levy1, Scott Martinka1, Kelly Mead1, Michael D. McLellan1, Rick Meyer1, Jennifer Randall-Maher1, Chad Tomlinson1, Sara Dauphin-Kohlberg1, Amy Kozlowicz-Reilly1, Neha Shah1, Sharhonda Swearengen-Shahid1, Jacqueline E. Snider1, Joseph T. Strong1, Johanna Thompson1, Martin Yoakum1, Shawn Leonard1, Charlene Pearman1, Lee Trani1, Maxim Radionenko1, Jason Waligorski1, Chunyan Wang1, Susan M. Rock1, Aye Mon Tin-Wollam1, Rachel Maupin1, Phil Latreille1, Michael C. Wendl1, Shiaw Pyng Yang1, Craig Pohl1, John W. Wallis1, John Spieth1, Tamberlyn Bieri1, Nicolas Berkowicz1, Joanne O. Nelson1, John R. Osborne1, Li Ding1, Rekha Meyer1, Aniko Sabo1, Yoram Shotland1, Prashant R. Sinha1, Patricia Wohldmann1, Lisa Cook1, Matthew T. Hickenbotham1, James M. Eldred1, Donald Williams1, Thomas A. Jones1, Xinwei She2, Francesca D. Ciccarelli, Elisa Izaurralde, James Taylor3, Jeremy Schmutz4, Richard M. Myers4, David R. Cox4, Xiaoqiu Huang5, John Douglas Mcpherson6, John Douglas Mcpherson1, Elaine R. Mardis1, Sandra W. Clifton1, Wesley C. Warren1, Asif T. Chinwalla1, Sean R. Eddy1, Marco A. Marra7, Marco A. Marra1, Ivan Ovcharenko8, Terrence S. Furey9, Webb Miller3, Evan E. Eichler2, Peer Bork, Mikita Suyama, David Torrents, Robert H. Waterston1, Robert H. Waterston2, Richard K. Wilson1
07 Apr 2005-Nature
TL;DR: Extensive analyses confirm the underlying construction of the sequence, and expand the understanding of the structure and evolution of mammalian chromosomes, including gene deserts, segmental duplications and highly variant regions.
Abstract: Human chromosome 2 is unique to the human lineage in being the product of a head-to-head fusion of two intermediate-sized ancestral chromosomes. Chromosome 4 has received attention primarily related to the search for the Huntington's disease gene, but also for genes associated with Wolf-Hirschhorn syndrome, polycystic kidney disease and a form of muscular dystrophy. Here we present approximately 237 million base pairs of sequence for chromosome 2, and 186 million base pairs for chromosome 4, representing more than 99.6% of their euchromatic sequences. Our initial analyses have identified 1,346 protein-coding genes and 1,239 pseudogenes on chromosome 2, and 796 protein-coding genes and 778 pseudogenes on chromosome 4. Extensive analyses confirm the underlying construction of the sequence, and expand our understanding of the structure and evolution of mammalian chromosomes, including gene deserts, segmental duplications and highly variant regions.

107 citations

Journal Article
TL;DR: The generated sequence reveals the precise architecture of genes residing near CFTR/Cftr, including one known gene (WNT2/Wnt2) and two previously unknown genes that immediately flank CFTR or Cftr.
Abstract: The identification of the cystic fibrosis transmembrane conductance regulator gene (CFTR) in 1989 represents a landmark accomplishment in human genetics. Since that time, there have been numerous advances in elucidating the function of the encoded protein and the physiological basis of cystic fibrosis. However, numerous areas of cystic fibrosis biology require additional investigation, some of which would be facilitated by information about the long-range sequence context of the CFTR gene. For example, the latter might provide clues about the sequence elements responsible for the temporal and spatial regulation of CFTR expression. We thus sought to establish the sequence of the chromosomal segments encompassing the human CFTR and mouse Cftr genes, with the hope of identifying conserved regions of biologic interest by sequence comparison. Bacterial clone-based physical maps of the relevant human and mouse genomic regions were constructed, and minimally overlapping sets of clones were selected and sequenced, eventually yielding ≈1.6 Mb and ≈358 kb of contiguous human and mouse sequence, respectively. These efforts have produced the complete sequence of the ≈189-kb and ≈152-kb segments containing the human CFTR and mouse Cftr genes, respectively, as well as significant amounts of flanking DNA. Analyses of the resulting data provide insights about the organization of the CFTR/Cftr genes and potential sequence elements regulating their expression. Furthermore, the generated sequence reveals the precise architecture of genes residing near CFTR/Cftr, including one known gene (WNT2/Wnt2) and two previously unknown genes that immediately flank CFTR/Cftr.

81 citations


Cited by
Journal Article
Robert H. Waterston1, Kerstin Lindblad-Toh2, Ewan Birney, Jane Rogers3, and 219 more authors (26 institutions)
05 Dec 2002-Nature
TL;DR: The results of an international collaboration to produce a high-quality draft sequence of the mouse genome are reported, and an initial comparative analysis of the mouse and human genomes is presented, describing some of the insights that can be gleaned from the two sequences.
Abstract: The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

6,643 citations

Journal Article
TL;DR: This work introduces Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner and constitutes a starting point to build pathway-centric models of biology.
Abstract: Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state of the art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point to build pathway-centric models of biology. Moreover, GSVA contributes to the current need of GSE methods for RNA-seq data. GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.

6,125 citations

Journal Article
TL;DR: New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments, and the voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline.
Abstract: New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.

4,475 citations

Journal Article
21 Oct 2004-Nature
TL;DR: The current human genome sequence (Build 35) as discussed by the authors contains 2.85 billion nucleotides interrupted by only 341 gaps and is accurate to an error rate of approximately 1 event per 100,000 bases.
Abstract: The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

3,989 citations

Journal Article
Timothy J. Ley1, Christopher A. Miller1, Li Ding1, Benjamin J. Raphael2, Andrew J. Mungall3, Gordon Robertson3, Katherine A. Hoadley4, Timothy J. Triche5, Peter W. Laird5, Jack Baty1, Lucinda Fulton1, Robert S. Fulton1, Sharon Heath1, Joelle Kalicki-Veizer1, Cyriac Kandoth1, Jeffery M. Klco1, Daniel C. Koboldt1, Krishna L. Kanchi1, Shashikant Kulkarni1, Tamara Lamprecht1, David E. Larson1, G. Lin1, Charles Lu1, Michael D. McLellan1, Joshua F. McMichael1, Jacqueline E. Payton1, Heather Schmidt1, David H. Spencer1, Michael H. Tomasson1, John W. Wallis1, Lukas D. Wartman1, Mark A. Watson1, John S. Welch1, Michael C. Wendl1, Adrian Ally3, Miruna Balasundaram3, Inanc Birol3, Yaron S.N. Butterfield3, Readman Chiu3, Andy Chu3, Eric Chuah3, Hye Jung E. Chun3, Richard Corbett3, Noreen Dhalla3, Ranabir Guin3, An He3, Carrie Hirst3, Martin Hirst3, Robert A. Holt3, Steven J.M. Jones3, Aly Karsan3, Darlene Lee3, Haiyan I. Li3, Marco A. Marra3, Michael Mayo3, Richard A. Moore3, Karen Mungall3, Jeremy Parker3, Erin Pleasance3, Patrick Plettner3, Jacquie Schein3, Dominik Stoll3, Lucas Swanson3, Angela Tam3, Nina Thiessen3, Richard Varhol3, Natasja Wye3, Yongjun Zhao3, Stacey Gabriel6, Gad Getz6, Carrie Sougnez6, Lihua Zou6, Mark D.M. Leiserson2, Fabio Vandin2, Hsin-Ta Wu2, Frederick Applebaum7, Stephen B. Baylin8, Rehan Akbani9, Bradley M. Broom9, Ken Chen9, Thomas C. Motter9, Khanh Thi-Thuy Nguyen9, John N. Weinstein9, Nianziang Zhang9, Martin L. Ferguson, Christopher Adams10, Aaron D. Black10, Jay Bowen10, Julie M. Gastier-Foster10, Thomas Grossman10, Tara M. Lichtenberg10, Lisa Wise10, Tanja Davidsen11, John A. Demchok11, Kenna R. Mills Shaw11, Margi Sheth11, Heidi J. Sofia, Liming Yang11, James R. Downing, Greg Eley, Shelley Alonso12, Brenda Ayala12, Julien Baboud12, Mark Backus12, Sean P. Barletta12, Dominique L. Berton12, Anna L. Chu12, Stanley Girshik12, Mark A. Jensen12, Ari B. Kahn12, Prachi Kothiyal12, Matthew C. Nicholls12, Todd Pihl12, David Pot12, Rohini Raman12, Rashmi N. Sanbhadti12, Eric E. Snyder12, Deepak Srinivasan12, Jessica Walton12, Yunhu Wan12, Zhining Wang12, Jean Pierre J. Issa13, Michelle M. Le Beau14, Martin Carroll15, Hagop M. Kantarjian, Steven M. Kornblau, Moiz S. Bootwalla5, Phillip H. Lai5, Hui Shen5, David Van Den Berg5, Daniel J. Weisenberger5, Daniel C. Link1, Matthew J. Walter1, Bradley A. Ozenberger11, Elaine R. Mardis1, Peter Westervelt1, Timothy A. Graubert1, John F. DiPersio1, Richard K. Wilson1
TL;DR: It is found that a complex interplay of genetic events contributes to AML pathogenesis in individual patients, and the databases from this study are widely available to serve as a foundation for further investigations of AML pathogenesis, classification, and risk stratification.
Abstract: BACKGROUND: Many mutations that contribute to the pathogenesis of acute myeloid leukemia (AML) are undefined. The relationships between patterns of mutations and epigenetic phenotypes are not yet clear. METHODS: We analyzed the genomes of 200 clinically annotated adult cases of de novo AML, using either whole-genome sequencing (50 cases) or whole-exome sequencing (150 cases), along with RNA and microRNA sequencing and DNA-methylation analysis. RESULTS: AML genomes have fewer mutations than most other adult cancers, with an average of only 13 mutations found in genes. Of these, an average of 5 are in genes that are recurrently mutated in AML. A total of 23 genes were significantly mutated, and another 237 were mutated in two or more samples. Nearly all samples had at least 1 nonsynonymous mutation in one of nine categories of genes that are almost certainly relevant for pathogenesis, including transcription-factor fusions (18% of cases), the gene encoding nucleophosmin (NPM1) (27%), tumor-suppressor genes (16%), DNA-methylation–related genes (44%), signaling genes (59%), chromatin-modifying genes (30%), myeloid transcription-factor genes (22%), cohesin-complex genes (13%), and spliceosome-complex genes (14%). Patterns of cooperation and mutual exclusivity suggested strong biologic relationships among several of the genes and categories. CONCLUSIONS: We identified at least one potential driver mutation in nearly all AML samples and found that a complex interplay of genetic events contributes to AML pathogenesis in individual patients. The databases from this study are widely available to serve as a foundation for further investigations of AML pathogenesis, classification, and risk stratification. (Funded by the National Institutes of Health.) The molecular pathogenesis of acute myeloid leukemia (AML) has been studied with the use of cytogenetic analysis for more than three decades. Recurrent chromosomal structural variations are well established as diagnostic and prognostic markers, suggesting that acquired genetic abnormalities (i.e., somatic mutations) have an essential role in pathogenesis [1,2]. However, nearly 50% of AML samples have a normal karyotype, and many of these genomes lack structural abnormalities, even when assessed with high-density comparative genomic hybridization or single-nucleotide polymorphism (SNP) arrays [3-5] (see Glossary). Targeted sequencing has identified recurrent mutations in FLT3, NPM1, KIT, CEBPA, and TET2 [6-8]. Massively parallel sequencing enabled the discovery of recurrent mutations in DNMT3A [9,10] and IDH1 [11]. Recent studies have shown that many patients with

3,980 citations