Showing papers by "Richard K. Wilson" published in 2009

Open Access

Journal Article

The B73 Maize Genome: Complexity, Diversity, and Dynamics

Patrick S. Schnable¹, Doreen Ware², Robert S. Fulton³, Joshua C. Stein² +156 more•Institutions (18)

20 Nov 2009-Science

TL;DR: The maize genome sequence reveals it to be the most complex genome known to date. The paper also reports the correlation of methylation-poor regions with Mu transposon insertions and recombination, and how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state.

Abstract: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.

3,761 citations

Journal Article

Recurring mutations found by sequencing an acute myeloid leukemia genome.

Elaine R. Mardis¹, Li Ding¹, David J. Dooling¹, David E. Larson¹, Michael D. McLellan¹, Ken Chen, Daniel C. Koboldt¹, Robert S. Fulton¹, Kim D. Delehaunty¹, Sean McGrath¹, Lucinda Fulton¹, Devin P. Locke¹, Vincent Magrini¹, Rachel Abbott¹, Tammi L. Vickery¹, Jerry S. Reed¹, Jody S. Robinson¹, Todd Wylie¹, Scott M. Smith¹, Lynn K. Carmichael¹, James M. Eldred¹, Chris Harris¹, Jason Walker¹, Joshua Peck¹, Feiyu Du¹, Adam F. Dukes¹, Gabriel E. Sanderson¹, Anthony M. Brummett¹, Eric M. Clark¹, Joshua F. McMichael¹, Rick Meyer¹, Jonathan K. Schindler¹, Craig Pohl¹, John W. Wallis¹, Xiaoqi Shi¹, Ling Lin¹, Heather Schmidt¹, Yuzhu Tang¹, Carrie A. Haipek¹, Madeline E. Wiechert¹, Jolynda V. Ivy¹, Joelle Kalicki¹, Glendoria Elliott¹, Rhonda E. Ries¹, Jacqueline E. Payton¹, Peter Westervelt¹, Michael H. Tomasson¹, Mark A. Watson¹, Jack Baty¹, Sharon Heath¹, William D. Shannon¹, Rakesh Nagarajan¹, Daniel C. Link¹, Matthew J. Walter¹, Timothy A. Graubert¹, John F. DiPersio¹, Richard K. Wilson¹, Timothy J. Ley¹ - Show less +54 more•Institutions (1)

Washington University in St. Louis¹

10 Sep 2009-The New England Journal of Medicine

TL;DR: By comparing the sequences of tumor and skin genomes of a patient with AML-M1, recurring mutations that may be relevant for pathogenesis are identified.

Abstract: From the Departments of Genetics (E.R.M., L.D., V.J.M., R.K.W., T.J.L.), Medicine (R.E.R., P.W., M.H.T., S.H., W.D.S., D.C.L., M.J.W., T.A.G., J.F.D., T.J.L.), and Pathology and Immunology (J.E.P., M.A.W., R.N.); the Genome Center (E.R.M., L.D., D.J.D., D.E.L., M.D.M., K.C., D.C.K., R.S.F., K.D.D., S.D.M., L.A.F., D.P.L., V.J.M., R.M.A.,

2,151 citations

Journal Article

BreakDancer: An algorithm for high resolution mapping of genomic structural variation

Ken Chen¹, John W. Wallis¹, Michael D. McLellan¹, David E. Larson¹, Joelle Kalicki¹, Craig Pohl¹, Sean McGrath¹, Michael C. Wendl¹, Qunyuan Zhang¹, Devin P. Locke¹, Xiaoqi Shi¹, Robert S. Fulton¹, Timothy J. Ley¹, Richard K. Wilson¹, Li Ding¹, Elaine R. Mardis¹•Institutions (1)

Washington University in St. Louis¹

09 Aug 2009-Nature Methods

TL;DR: BreakDancer predicts a wide variety of structural variants, including insertions and deletions (indels), inversions, and translocations, and sensitively and accurately detects indels ranging from 10 base pairs to 1 megabase pair that are difficult to detect with a single conventional approach.

Abstract: This software package provides genome-wide detection of structural variants (insertions, deletions, inversions and inter- and intrachromosomal translocations) from 50-base-pair paired-end reads. The sizes of the detected variants vary from 10 base pairs to 1 megabase pair.
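
The abstract says only that variants are detected from paired-end reads. As a rough illustration of the general idea behind read-pair methods, the sketch below labels a single read pair by comparing its alignment with what a normal library would produce. It is a toy under stated assumptions and not BreakDancer's actual algorithm: the `ReadPair` type, the `classify_pair` function, the library parameters (mean insert 200 bp, standard deviation 20 bp) and the 3-standard-deviation cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """One aligned paired-end read (hypothetical, simplified record)."""
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair, mean_insert=200, sd_insert=20, n_sd=3):
    """Return a coarse label for one pair using toy rules.

    Assumes a library where normal pairs map to the same chromosome,
    on opposite strands, roughly mean_insert bases apart.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal_translocation"
    if pair.strand1 == pair.strand2:
        return "inversion"
    span = abs(pair.pos2 - pair.pos1)
    if span > mean_insert + n_sd * sd_insert:
        # Mates map too far apart: sequence is missing in the sample.
        return "deletion"
    if span < mean_insert - n_sd * sd_insert:
        # Mates map too close together: extra sequence in the sample.
        return "insertion"
    return "normal"


if __name__ == "__main__":
    print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 5000, "-")))
```

A real caller would cluster many anomalous pairs per region and score the cluster statistically rather than label pairs one at a time.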

1,418 citations

Journal Article

VarScan: Variant detection in massively parallel sequencing of individual and pooled samples

Daniel C. Koboldt¹, Ken Chen, Todd Wylie¹, David E. Larson¹, Michael D. McLellan¹, Elaine R. Mardis¹, George M. Weinstock¹, Richard K. Wilson¹, Li Ding¹•Institutions (1)

Washington University in St. Louis¹

01 Sep 2009-Bioinformatics

TL;DR: VarScan is an open source tool for variant detection that is compatible with several short read aligners; it is shown to detect SNPs and indels with high sensitivity and specificity, in both Roche/454 sequencing of individuals and deep Illumina/Solexa sequencing of pooled samples.

Abstract: Summary: Massively parallel sequencing technologies hold incredible promise for the study of DNA sequence variation, particularly the identification of variants affecting human disease. The unprecedented throughput and relatively short read lengths of Roche/454, Illumina/Solexa, and other platforms have spurred development of a new generation of sequence alignment algorithms. Yet detection of sequence variants based on short read alignments remains challenging, and most currently available tools are limited to a single platform or aligner type. We present VarScan, an open source tool for variant detection that is compatible with several short read aligners. We demonstrate VarScan’s ability to detect SNPs and indels with high sensitivity and specificity, in both Roche/454 sequencing of individuals and deep Illumina/Solexa sequencing of pooled samples. Availability and Implementation: Source code and documentation freely available at http://genome.wustl.edu/tools/cancer-genomics, implemented as a Perl package and supported on Linux/UNIX, MS Windows and Mac OSX.
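
To make concrete what variant detection from read alignments involves, here is a minimal sketch of threshold-based calling at a single position: count the reads supporting the reference and variant alleles, then apply coverage and frequency cutoffs. It is an illustration only and does not reproduce VarScan's implementation or options; the function name `call_variant` and every threshold value are assumptions chosen for the example.

```python
def call_variant(ref_reads, var_reads,
                 min_coverage=8, min_var_reads=2, min_var_freq=0.01):
    """Return the variant allele frequency if the site passes, else None.

    All three thresholds are hypothetical defaults for this sketch.
    """
    depth = ref_reads + var_reads
    if depth < min_coverage:
        return None  # too few reads to say anything
    if var_reads < min_var_reads:
        return None  # a single supporting read may be a sequencing error
    freq = var_reads / depth
    if freq < min_var_freq:
        return None  # below the frequency floor
    return freq


if __name__ == "__main__":
    # Deeply sequenced pool: 5 of 100 reads carry the variant allele.
    print(call_variant(ref_reads=95, var_reads=5))
```

In pooled samples the frequency floor matters most, because a rare allele carried by one individual in the pool appears in only a small fraction of reads.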

1,250 citations

Journal Article

Characterizing a model human gut microbiota composed of members of its two dominant bacterial phyla

Michael A. Mahowald¹, Federico E. Rey¹, Henning Seedorf¹, Peter J. Turnbaugh¹, Robert S. Fulton¹, Aye Wollam¹, Neha Shah¹, Chunyan Wang¹, Vincent Magrini¹, Richard K. Wilson¹, Brandi L. Cantarel², Brandi L. Cantarel³, Pedro M. Coutinho², Bernard Henrissat³, Bernard Henrissat², Lara W. Crock¹, Alison L Russell⁴, Nathan C Verberkmoes⁴, Robert L. Hettich⁴, Jeffrey I. Gordon¹ - Show less +16 more•Institutions (4)

Washington University in St. Louis¹, Aix-Marseille University², Centre national de la recherche scientifique³, Oak Ridge National Laboratory⁴

07 Apr 2009-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A simplified model of the human gut microbiota illustrates niche specialization and functional redundancy within members of its major bacterial phyla, and the importance of host glycans as a nutrient foundation that ensures ecosystem stability.

Abstract: The adult human distal gut microbial community is typically dominated by 2 bacterial phyla (divisions), the Firmicutes and the Bacteroidetes. Little is known about the factors that govern the interactions between their members. Here, we examine the niches of representatives of both phyla in vivo. Finished genome sequences were generated from Eubacterium rectale and E. eligens, which belong to Clostridium Cluster XIVa, one of the most common gut Firmicute clades. Comparison of these and 25 other gut Firmicutes and Bacteroidetes indicated that the Firmicutes possess smaller genomes and a disproportionately smaller number of glycan-degrading enzymes. Germ-free mice were then colonized with E. rectale and/or a prominent human gut Bacteroidetes, Bacteroides thetaiotaomicron, followed by whole-genome transcriptional profiling, high-resolution proteomic analysis, and biochemical assays of microbial-microbial and microbial-host interactions. B. thetaiotaomicron adapts to E. rectale by up-regulating expression of a variety of polysaccharide utilization loci encoding numerous glycoside hydrolases, and by signaling the host to produce mucosal glycans that it, but not E. rectale, can access. E. rectale adapts to B. thetaiotaomicron by decreasing production of its glycan-degrading enzymes, increasing expression of selected amino acid and sugar transporters, and facilitating glycolysis by reducing levels of NADH, in part via generation of butyrate from acetate, which in turn is used by the gut epithelium. This simplified model of the human gut microbiota illustrates niche specialization and functional redundancy within members of its major bacterial phyla, and the importance of host glycans as a nutrient foundation that ensures ecosystem stability.

670 citations

Journal Article

Acquired copy number alterations in adult acute myeloid leukemia genomes

Matthew J. Walter¹, Jacqueline E. Payton¹, Rhonda E. Ries¹, William D. Shannon¹, Hrishikesh Deshmukh¹, Yu Zhao¹, Jack Baty¹, Sharon Heath¹, Peter Westervelt¹, Mark A. Watson¹, Michael H. Tomasson¹, Rakesh Nagarajan¹, Brian O’Gara¹, Clara D. Bloomfield², Krzysztof Mrózek², Rebecca R. Selzer³, Todd Richmond³, Jacob O. Kitzman³, Joel Geoghegan³, Peggy S. Eis³, Rachel Maupin¹, Robert S. Fulton¹, Michael D. McLellan¹, Richard K. Wilson¹, Elaine R. Mardis¹, Daniel C. Link¹, Timothy A. Graubert¹, John F. DiPersio¹, Timothy J. Ley¹ - Show less +25 more•Institutions (3)

Washington University in St. Louis¹, Ohio State University², Hoffmann-La Roche³

04 Aug 2009-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The use of an unbiased high-resolution genomic screen identified many genes not previously implicated in AML that may be relevant for pathogenesis, along with many known oncogenes and tumor suppressor genes.

Abstract: Cytogenetic analysis of acute myeloid leukemia (AML) cells has accelerated the identification of genes important for AML pathogenesis. To complement cytogenetic studies and to identify genes altered in AML genomes, we performed genome-wide copy number analysis with paired normal and tumor DNA obtained from 86 adult patients with de novo AML using 1.85 million feature SNP arrays. Acquired copy number alterations (CNAs) were confirmed using an ultra-dense array comparative genomic hybridization platform. A total of 201 somatic CNAs were found in the 86 AML genomes (mean, 2.34 CNAs per genome), with French-American-British system M6 and M7 genomes containing the most changes (10–29 CNAs per genome). Twenty-four percent of AML patients with normal cytogenetics had CNA, whereas 40% of patients with an abnormal karyotype had additional CNA detected by SNP array, and several CNA regions were recurrent. The mRNA expression levels of 57 genes were significantly altered in 27 of 50 recurrent CNA regions <5 megabases in size. A total of 8 uniparental disomy (UPD) segments were identified in the 86 genomes; 6 of 8 UPD calls occurred in samples with a normal karyotype. Collectively, 34 of 86 AML genomes (40%) contained alterations not found with cytogenetics, and 98% of these regions contained genes. Of 86 genomes, 43 (50%) had no CNA or UPD at this level of resolution. In this study of 86 adult AML genomes, the use of an unbiased high-resolution genomic screen identified many genes not previously implicated in AML that may be relevant for pathogenesis, along with many known oncogenes and tumor suppressor genes.
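
The summary statistics in this abstract follow directly from the counts it quotes. The short check below recomputes them; the variable names are ours, and every number is copied from the abstract.

```python
# Counts quoted in the abstract above.
GENOMES = 86
SOMATIC_CNAS = 201
NOT_FOUND_BY_CYTOGENETICS = 34
NO_CNA_OR_UPD = 43
UPD_SEGMENTS = 8
UPD_IN_NORMAL_KARYOTYPE = 6


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


mean_cnas_per_genome = SOMATIC_CNAS / GENOMES

if __name__ == "__main__":
    print(f"mean CNAs per genome: {mean_cnas_per_genome:.2f}")  # 2.34
    print(f"alterations missed by cytogenetics: "
          f"{percent(NOT_FOUND_BY_CYTOGENETICS, GENOMES):.1f}%")  # 39.5%
    print(f"no CNA or UPD: {percent(NO_CNA_OR_UPD, GENOMES):.1f}%")  # 50.0%
    print(f"UPD calls with normal karyotype: "
          f"{percent(UPD_IN_NORMAL_KARYOTYPE, UPD_SEGMENTS):.1f}%")  # 75.0%
```

The 39.5% result is what the abstract rounds to 40%.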

241 citations

Journal Article

A burst of segmental duplications in the genome of the African great ape ancestor

Tomas Marques-Bonet¹, Jeffrey M. Kidd², Mario Ventura³, Tina Graves⁴, Ze Cheng², LaDeanna W. Hillier⁴, Zhaoshi Jiang², Carl Baker², Ray Malfavon-Borja², Lucinda Fulton⁴, Can Alkan², Gozde Aksay², Santhosh Girirajan², Priscillia Siswara², Lin Chen², Maria Francesca Cardone³, Arcadi Navarro⁵, Arcadi Navarro⁶, Elaine R. Mardis⁴, Richard K. Wilson⁴, Evan E. Eichler²•Institutions (6)

Howard Hughes Medical Institute¹, University of Washington², University of Bari³, Washington University in St. Louis⁴, Spanish National Research Council⁵, Catalan Institution for Research and Advanced Studies⁶

12 Feb 2009-Nature

TL;DR: The results suggest that the evolutionary properties of copy-number mutation differ significantly from other forms of genetic mutation and, in contrast to the hominid slowdown of single-base-pair mutations, there has been a genomic burst of duplication activity at this period during human evolution.

Abstract: It is generally accepted that the extent of phenotypic change between human and great apes is dissonant with the rate of molecular change. Between these two groups, proteins are virtually identical, cytogenetically there are few rearrangements that distinguish ape-human chromosomes, and rates of single-base-pair change and retrotransposon activity have slowed particularly within hominid lineages when compared to rodents or monkeys. Studies of gene family evolution indicate that gene loss and gain are enriched within the primate lineage. Here, we perform a systematic analysis of duplication content of four primate genomes (macaque, orang-utan, chimpanzee and human) in an effort to understand the pattern and rates of genomic duplication during hominid evolution. We find that the ancestral branch leading to human and African great apes shows the most significant increase in duplication activity both in terms of base pairs and in terms of events. This duplication acceleration within the ancestral species is significant when compared to lineage-specific rate estimates even after accounting for copy-number polymorphism and homoplasy. We discover striking examples of recurrent and independent gene-containing duplications within the gorilla and chimpanzee that are absent in the human lineage. Our results suggest that the evolutionary properties of copy-number mutation differ significantly from other forms of genetic mutation and, in contrast to the hominid slowdown of single-base-pair mutations, there has been a genomic burst of duplication activity at this period during human evolution.

227 citations

Journal Article

Comparative genomics of protoploid Saccharomycetaceae.

Jean-Luc Souciet, Bernard Dujon, Claude Gaillardin, Mark Johnston, Philippe Baret¹, Paul F. Cliften, David James Sherman, Jean Weissenbach, Eric Westhof, Patrick Wincker, Claire Jubin, Julie Poulain, Valérie Barbe, Béatrice Segurens, François Artiguenave, Véronique Anthouard, Benoit Vacherie, Marie-Eve Val, Robert S. Fulton, Patrick Minx, Richard K. Wilson, Pascal Durrens, Géraldine Jean, Christian Marck, Tiphaine Martin, Macha Nikolski, Thomas Rolland, Marie-Line Seret¹, Serge Casaregola, Laurence Despons, Cécile Fairhead, Gilles Fischer, Ingrid Lafontaine, Véronique Leh, Marc Lemaire, Jacky de Montigny, Cécile Neuvéglise, Agnès Thierry, Isabelle Blanc-Lenfle, Claudine Bleykasten, Julie Diffels¹, Emilie S. Fritsch, Lionel Frangeul, Adrien Goëffon, Nicolas Jauniaux, Rym Kachouri-Lafond, Celia Payen, Serge Potier, Lenka Pribylova, Christophe Ozanne, Guy-Franck Richard, Christine Sacerdot, Marie-Laure Straub, Emmanuel Talla - Show less +50 more•Institutions (1)

Université catholique de Louvain¹

01 Oct 2009-Genome Research

TL;DR: The study concentrates on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that are called "protoploid" because they diverged from the S. cerevisiae lineage prior to its genome duplication.

Abstract: Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call "protoploid" because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified.

221 citations

Journal Article

Cancer genome sequencing: a review

Elaine R. Mardis¹, Richard K. Wilson¹•Institutions (1)

Washington University in St. Louis¹

15 Oct 2009-Human Molecular Genetics

TL;DR: Several areas within cancer genomics are being transformed by the application of new technology, and in the process are dramatically expanding the authors' understanding of this disease.

Abstract: A genomic era of cancer studies is developing rapidly, fueled by the emergence of next-generation sequencing technologies that provide exquisite sensitivity and resolution. This article discusses several areas within cancer genomics that are being transformed by the application of new technology, and in the process are dramatically expanding our understanding of this disease. Although we anticipate that there will be many exciting discoveries in the near future, the ultimate success of these endeavors rests on our ability to translate what is learned into better diagnosis, treatment and prevention of cancer.

217 citations

Journal Article

The Physical and Genetic Framework of the Maize B73 Genome

Fusheng Wei¹, Jianwei Zhang¹, Shiguo Zhou², Ruifeng He¹, Mary L. Schaeffer³, Kristi Collura¹, David Kudrna¹, Ben Faga⁴, Marina Wissotski¹, Wolfgang Golser¹, Susan M. Rock⁵, Tina Graves⁵, Robert S. Fulton⁵, Edward H. Coe³, Patrick S. Schnable⁶, David C. Schwartz², Doreen Ware⁴, Sandra W. Clifton⁵, Richard K. Wilson⁵, Rod A. Wing¹•Institutions (6)

University of Arizona¹, University of Wisconsin-Madison², University of Missouri³, Cold Spring Harbor Laboratory⁴, Washington University in St. Louis⁵, Iowa State University⁶

20 Nov 2009-PLOS Genetics

TL;DR: All available physical, sequence, genetic, and optical data were used to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1).

Abstract: Maize is a major cereal crop and an important model system for basic biological research. Knowledge gained from maize research can also be used to genetically improve its grass relatives such as sorghum, wheat, and rice. The primary objective of the Maize Genome Sequencing Consortium (MGSC) was to generate a reference genome sequence that was integrated with both the physical and genetic maps. Using a previously published integrated genetic and physical map, combined with incoming maize genomic sequence, new sequence-based genetic markers, and an optical map, we dynamically picked a minimum tiling path (MTP) of 16,910 bacterial artificial chromosome (BAC) and fosmid clones that were used by the MGSC to sequence the maize genome. The final MTP resulted in a significantly improved physical map that reduced the number of contigs from 721 to 435, incorporated a total of 8,315 mapped markers, and ordered and oriented the majority of FPC contigs. The new integrated physical and genetic map covered 2,120 Mb (93%) of the 2,300-Mb genome, of which 405 contigs were anchored to the genetic map, totaling 2,103.4 Mb (99.2% of the 2,120 Mb physical map). More importantly, 336 contigs, comprising 94.0% of the physical map (approximately 1,993 Mb), were ordered and oriented. Finally, we used all available physical, sequence, genetic, and optical data to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1).

106 citations

Journal Article

Detailed analysis of a contiguous 22-Mb region of the maize genome.

Fusheng Wei¹, Joshua C. Stein², Chengzhi Liang², Jianwei Zhang¹, Robert S. Fulton³, Regina S. Baucom⁴, Emanuele De Paoli⁵, Shiguo Zhou⁶, Lixing Yang⁴, Yujun Han⁴, Shiran Pasternak², Apurva Narechania², Lifang Zhang², Cheng Ting Yeh⁷, Kai Ying⁷, Dawn H. Nagel⁴, Kristi Collura¹, David Kudrna¹, Jennifer Currie¹, Jinke Lin¹, Hyeran Kim¹, Angelina Angelova¹, Gabriel Scara¹, Marina Wissotski¹, Wolfgang Golser¹, Laura Courtney³, Scott Kruchowski³, Tina Graves³, Susan M. Rock³, Stephanie Adams³, Lucinda Fulton³, Catrina Fronick³, William Courtney³, Melissa Kramer², Lori Spiegel², Lydia Nascimento², Ananth Kalyanaraman⁸, Cristian Chaparro⁹, Jean-Marc Deragon⁹, Phillip San Miguel¹⁰, Ning Jiang, Susan R. Wessler⁴, Pamela J. Green⁵, Yeisoo Yu¹, David C. Schwartz⁶, Blake C. Meyers⁵, Jeffrey L. Bennetzen⁴, Robert A. Martienssen², W. Richard McCombie², Srinivas Aluru⁷, Sandra W. Clifton³, Patrick S. Schnable⁷, Doreen Ware², Richard K. Wilson³, Rod A. Wing - Show less +51 more•Institutions (10)

University of Arizona¹, Cold Spring Harbor Laboratory², Washington University in St. Louis³, University of Georgia⁴, University of Delaware⁵, University of Wisconsin-Madison⁶, Iowa State University⁷, Washington State University⁸, University of Perpignan⁹, Purdue University¹⁰

20 Nov 2009-PLOS Genetics

TL;DR: The results demonstrate the feasibility of refining the B73 RefGen_v1 genome assembly by incorporating optical map, high-resolution genetic map, and comparative genomic data sets. Such refinements, together with improvements in gene and repeat annotation, will serve to promote future functional genomic and phylogenomic research in maize and other grasses.

Abstract: Most of our understanding of plant genome structure and evolution has come from the careful annotation of small (e.g., 100 kb) sequenced genomic regions or from automated annotation of complete genome sequences. Here, we sequenced and carefully annotated a contiguous 22 Mb region of maize chromosome 4 using an improved pseudomolecule for annotation. The sequence segment was comprehensively ordered, oriented, and confirmed using the maize optical map. Nearly 84% of the sequence is composed of transposable elements (TEs) that are mostly nested within each other, of which most families are low-copy. We identified 544 gene models using multiple levels of evidence, as well as five miRNA genes. Gene fragments, many captured by TEs, are prevalent within this region. Elimination of gene redundancy from a tetraploid maize ancestor that originated a few million years ago is responsible in this region for most disruptions of synteny with sorghum and rice. Consistent with other sub-genomic analyses in maize, small RNA mapping showed that many small RNAs match TEs and that most TEs match small RNAs. These results, performed on approximately 1% of the maize genome, demonstrate the feasibility of refining the B73 RefGen_v1 genome assembly by incorporating optical map, high-resolution genetic map, and comparative genomic data sets. Such improvements, along with those of gene and repeat annotation, will serve to promote future functional genomic and phylogenomic research in maize and other grasses.

Journal Article

Sequencing human–gibbon breakpoints of synteny reveals mosaic new insertions at rearrangement sites

Santhosh Girirajan¹, Lin Chen¹, Tina Graves², Tomas Marques-Bonet¹, Mario Ventura³, Catrina Fronick², Lucinda Fulton², Mariano Rocchi³, Robert S. Fulton², Richard K. Wilson², Elaine R. Mardis², Evan E. Eichler¹•Institutions (3)

University of Washington¹, Washington University in St. Louis², University of Bari³

01 Feb 2009-Genome Research

TL;DR: Analysis of 24 synteny breakpoints in the white-cheeked gibbon provides a model for a replication-dependent repair mechanism for double-strand breaks (DSBs) at rearrangement sites and insights into the structure and formation of primate segmental duplications at sites of genomic rearrangements during evolution.

Abstract: The gibbon genome exhibits extensive karyotypic diversity with an increased rate of chromosomal rearrangements during evolution. In an effort to understand the mechanistic origin and implications of these rearrangement events, we sequenced 24 synteny breakpoint regions in the white-cheeked gibbon (Nomascus leucogenys, NLE) in the form of high-quality BAC insert sequences (4.2 Mbp). While there is a significant deficit of breakpoints in genes, we identified seven human gene structures involved in signaling pathways (DEPDC4, GNG10), phospholipid metabolism (ENPP5, PLSCR2), beta-oxidation (ECH1), cellular structure and transport (HEATR4), and transcription (ZNF461), that have been disrupted in the NLE gibbon lineage. Notably, only three of these genes show the expected evolutionary signatures of pseudogenization. Sequence analysis of the breakpoints suggested both nonclassical nonhomologous end-joining (NHEJ) and replication-based mechanisms of rearrangement. A substantial number (11/24) of human-NLE gibbon breakpoints showed new insertions of gibbon-specific repeats and mosaic structures formed from disparate sequences including segmental duplications, LINE, SINE, and LTR elements. Analysis of these sites provides a model for a replication-dependent repair mechanism for double-strand breaks (DSBs) at rearrangement sites and insights into the structure and formation of primate segmental duplications at sites of genomic rearrangements during evolution.

Journal Article

Next-generation sequencing of cancer genomes: back to the future

Matthew J. Walter¹, Timothy A. Graubert¹, John F. DiPersio¹, Elaine R. Mardis¹, Richard K. Wilson¹, Timothy J. Ley¹•Institutions (1)

Washington University in St. Louis¹

01 Nov 2009-Personalized Medicine

TL;DR: The systematic karyotyping of bone marrow cells was the first genomic approach used to personalize therapy for patients with leukemia; the paradigm established by those cytogenetic studies now has the potential to be rapidly extended through whole-genome sequencing approaches for cancer.

Abstract: The systematic karyotyping of bone marrow cells was the first genomic approach used to personalize therapy for patients with leukemia. The paradigm established by cytogenetic studies in leukemia (from gene discovery to therapeutic intervention) now has the potential to be rapidly extended with the use of whole-genome sequencing approaches for cancer, which are now possible. We are now entering a period of exponential growth in cancer gene discovery that will provide many novel therapeutic targets for a large number of cancer types. Establishing the pathogenetic relevance of individual mutations is a major challenge that must be solved. However, after thousands of cancer genomes have been sequenced, the genetic rules of cancer will become known and new approaches for diagnosis, risk stratification and individualized treatment of cancer patients will surely follow.

Journal Article

Transcriptomic analysis of the entomopathogenic nematode Heterorhabditis bacteriophora TTO1

Xiaodong Bai¹, Byron J. Adams², Todd A. Ciche³, Sandra W. Clifton⁴, Randy Gaugler⁵, Saskia A. Hogenhout⁶, John Spieth⁴, Paul W. Sternberg⁷, Richard K. Wilson⁴, Parwinder S. Grewal¹•Institutions (7)

Ohio State University¹, Brigham Young University², Michigan State University³, Washington University in St. Louis⁴, Rutgers University⁵, John Innes Centre⁶, California Institute of Technology⁷

30 Apr 2009-BMC Genomics

TL;DR: This large-scale expressed sequence tag (EST) analysis effort enables gene discovery and development of microsatellite markers and will enable genetic mapping and population genetic studies.

Abstract: The entomopathogenic nematode Heterorhabditis bacteriophora and its symbiotic bacterium, Photorhabdus luminescens, are important biological control agents of insect pests. This nematode-bacterium-insect association represents an emerging tripartite model for research on mutualistic and parasitic symbioses. Elucidation of mechanisms underlying these biological processes may serve as a foundation for improving the biological control potential of the nematode-bacterium complex. This large-scale expressed sequence tag (EST) analysis effort enables gene discovery and development of microsatellite markers. These ESTs will also aid in the annotation of the upcoming complete genome sequence of H. bacteriophora. A total of 31,485 high quality ESTs were generated from cDNA libraries of the adult H. bacteriophora TTO1 strain. Cluster analysis revealed the presence of 3,051 contigs and 7,835 singletons, representing 10,886 distinct EST sequences. About 72% of the distinct EST sequences had significant matches (E value < 1e-5) to proteins in GenBank's non-redundant (nr) and Wormpep190 databases. We have identified 12 ESTs corresponding to 8 genes potentially involved in RNA interference, 22 ESTs corresponding to 14 genes potentially involved in dauer-related processes, and 51 ESTs corresponding to 27 genes potentially involved in defense and stress responses. Comparison to ESTs and proteins of free-living nematodes led to the identification of 554 parasitic nematode-specific ESTs in H. bacteriophora, among which are those encoding F-box-like/WD-repeat protein theromacin, Bax inhibitor-1-like protein, and PAZ domain containing protein. Gene Ontology terms were assigned to 6,685 of the 10,886 ESTs. A total of 168 microsatellite loci were identified with primers designable for 141 loci. A total of 10,886 distinct EST sequences were identified from adult H. bacteriophora cDNA libraries. 
BLAST searches revealed ESTs potentially involved in parasitism, RNA interference, defense responses, stress responses, and dauer-related processes. The putative microsatellite markers identified in H. bacteriophora ESTs will enable genetic mapping and population genetic studies. These genomic resources provide the material base necessary for genome annotation, microarray development, and in-depth gene functional analysis.
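
The EST counts in this abstract are internally consistent, and the few lines below show how they relate. The variable names are ours; the numbers are copied from the abstract, and the derived ratios are simple arithmetic that the abstract does not state.

```python
# Counts quoted in the abstract above.
TOTAL_ESTS = 31_485
CONTIGS = 3_051
SINGLETONS = 7_835
GO_ANNOTATED = 6_685
MICROSATELLITE_LOCI = 168
LOCI_WITH_PRIMERS = 141

# Distinct sequences are the clustered contigs plus the unclustered singletons.
distinct_ests = CONTIGS + SINGLETONS

go_fraction = GO_ANNOTATED / distinct_ests
primer_fraction = LOCI_WITH_PRIMERS / MICROSATELLITE_LOCI
reads_per_distinct = TOTAL_ESTS / distinct_ests

if __name__ == "__main__":
    print(f"distinct EST sequences: {distinct_ests}")  # 10886
    print(f"with Gene Ontology terms: {go_fraction:.1%}")  # 61.4%
    print(f"loci with designable primers: {primer_fraction:.1%}")  # 83.9%
    print(f"mean ESTs per distinct sequence: {reads_per_distinct:.2f}")  # 2.89
```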

Journal Article

Molecular determinants archetypical to the phylum Nematoda

Yong Yin¹, John Martin¹, Sahar Abubucker¹, Zhengyuan Wang¹, Lucijan Wyrwicz, Leszek Rychlewski, James P. McCarter¹, Richard K. Wilson¹, Makedonka Mitreva¹•Institutions (1)

Washington University in St. Louis¹

18 Mar 2009-BMC Genomics

TL;DR: This study identified and characterized the molecular determinants that help in defining the phylum Nematoda, and therefore improved the understanding of nematode protein evolution and provided novel insights for the development of next generation parasite control strategies.

Abstract: Nematoda diverged from other animals 600–1,200 million years ago and has become one of the most diverse animal phyla on earth. Most nematodes are free-living animals, but many are parasites of plants and animals including humans, posing major ecological and economic challenges around the world. We investigated phylum-specific molecular characteristics in Nematoda by exploring over 214,000 polypeptides from 32 nematode species including 27 parasites. Over 50,000 nematode protein families were identified based on primary sequence, including ~10% with members from at least three different species. Nearly 1,600 of the multi-species families did not share homology to Pfam domains, including a total of 758 restricted to Nematoda. The majority of the 462 families that were conserved among both free-living and parasitic species contained members from multiple nematode clades, yet ~90% of the 296 parasite-specific families originated only from a single clade. Features of these protein families were revealed through extrapolation of essential functions from observed RNAi phenotypes in C. elegans, bioinformatics-based functional annotations, identification of distant homology based on protein folds, and prediction of expression at accessible nematode surfaces. In addition, we identified a group of nematode-restricted sequence features in energy-generating electron transfer complexes as potential targets for new chemicals with minimal or no toxicity to the host. This study identified and characterized the molecular determinants that help in defining the phylum Nematoda, and therefore improved our understanding of nematode protein evolution and provided novel insights for the development of next generation parasite control strategies.

Journal Article•DOI•

The theory of discovering rare variants via DNA sequencing

Michael C. Wendl¹, Richard K. Wilson¹•Institutions (1)

Washington University in St. Louis¹

20 Oct 2009-BMC Genomics

TL;DR: Optimal project-wide redundancy and sample size are shown to be inversely proportional to the desired variant frequency, and optimization principles reported here dramatically simplify the design process and should be broadly useful as rare-variant projects become both more important and routine in the future.

Abstract: Rare population variants are known to have important biomedical implications, but their systematic discovery has only recently been enabled by advances in DNA sequencing. The design process of a discovery project remains formidable, being limited to ad hoc mixtures of extensive computer simulation and pilot sequencing. Here, the task is examined from a general mathematical perspective. We pose and solve the population sequencing design problem and subsequently apply standard optimization techniques that maximize the discovery probability. Emphasis is placed on cases whose discovery thresholds place them within reach of current technologies. We find that parameter values characteristic of rare-variant projects lead to a general, yet remarkably simple set of optimization rules. Specifically, optimal processing occurs at constant values of the per-sample redundancy, refuting current notions that sample size should be selected outright. Optimal project-wide redundancy and sample size are then shown to be inversely proportional to the desired variant frequency. A second family of constants governs these relationships, permitting one to immediately establish the most efficient settings for a given set of discovery conditions. Our results largely concur with the empirical design of the Thousand Genomes Project, though they furnish some additional refinement. The optimization principles reported here dramatically simplify the design process and should be broadly useful as rare-variant projects become both more important and routine in the future.

Journal Article•DOI•

The transcriptomes of the cattle parasitic nematode Ostertagia ostertagi

Sahar Abubucker¹, Dante S. Zarlenga², John Martin¹, Yong Yin¹, Zhengyuan Wang¹, James P. McCarter¹, Louis Gasbarree², Richard K. Wilson¹, Makedonka Mitreva¹•Institutions (2)

Washington University in St. Louis¹, United States Department of Agriculture²

26 May 2009-Veterinary Parasitology

TL;DR: This study presents the first large-scale genomic survey of O. ostertagi by the analysis of expressed transcripts from three stages of the parasite: third-stage larvae, fourth-stage larvae and adult worms, and identifies transcripts that can facilitate the design of control strategies and vaccine programs.

Journal Article•DOI•

Statistical aspects of discerning indel-type structural variation via DNA sequence alignment.

Michael C. Wendl¹, Richard K. Wilson¹•Institutions (1)

Washington University in St. Louis¹

05 Aug 2009-BMC Genomics

TL;DR: The statistical theory characterizing the length-discrepancy scheme for Gaussian libraries resolves several outstanding issues and furnishes a general methodology for designing future projects from the standpoint of a spectrum-wide constant risk.

Abstract: Structural variations in the form of DNA insertions and deletions are an important aspect of human genetics and especially relevant to medical disorders. Investigations have shown that such events can be detected via tell-tale discrepancies in the aligned lengths of paired-end DNA sequencing reads. Quantitative aspects underlying this method remain poorly understood, despite its importance and conceptual simplicity. We report the statistical theory characterizing the length-discrepancy scheme for Gaussian libraries, including coverage-related effects that preceding models are unable to account for. Deletion and insertion statistics both depend heavily on physical coverage, but otherwise differ dramatically, refuting a commonly held doctrine of symmetry. Specifically, coverage restrictions render insertions much more difficult to capture. Increased read length has the counterintuitive effect of worsening insertion detection characteristics of short inserts. Variance in library insert length is also a critical factor here and should be minimized to the greatest degree possible. Conversely, no significant improvement would be realized in lowering fosmid variances beyond current levels. Detection power is examined under a straightforward alternative hypothesis and found to be generally acceptable. We also consider the proposition of characterizing variation over the entire spectrum of variant sizes under constant risk of false-positive errors. At 1% risk, many designs will leave a significant gap in the 100 to 200 bp neighborhood, requiring unacceptably high redundancies to compensate. We show that a few modifications largely close this gap and we give a few examples of feasible spectrum-covering designs. The theory resolves several outstanding issues and furnishes a general methodology for designing future projects from the standpoint of a spectrum-wide constant risk.
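
The length-discrepancy idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a plain two-sided z-test on the mean mapped length of the clones spanning a locus, with a Gaussian library of known mean and standard deviation, and it ignores the coverage-related effects the paper analyzes. The function name `indel_call` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def indel_call(mapped_lengths, lib_mean, lib_sd, risk=0.01):
    """Classify a locus from the mapped lengths of the clones spanning it.

    Hypothetical helper, assuming insert lengths are Gaussian with
    mean `lib_mean` and standard deviation `lib_sd`. `risk` is the
    two-sided false-positive rate under the no-variant hypothesis.
    """
    n = len(mapped_lengths)
    if n == 0:
        raise ValueError("need at least one spanning clone")
    # Critical value for a two-sided test at the requested risk.
    z_crit = NormalDist().inv_cdf(1 - risk / 2)
    # Averaging n clones shrinks the standard error to lib_sd / sqrt(n).
    z = (mean(mapped_lengths) - lib_mean) / (lib_sd / sqrt(n))
    if z > z_crit:
        # Ends map farther apart on the reference than the library size:
        # the sample is missing sequence the reference has.
        return "deletion"
    if z < -z_crit:
        # Ends map closer together: the sample carries extra sequence.
        return "insertion"
    return "none"


if __name__ == "__main__":
    print(indel_call([3300, 3250, 3350], lib_mean=3000, lib_sd=100))
```

Under these assumptions, a single clone that is 200 bp longer than a 3,000 bp library mean (standard deviation 100 bp) is not significant at 1% risk, while three such clones are, which is one simple way to see why physical coverage and library variance matter for detecting small events.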

Journal Article•

Mapping and sequencing of structural variation from eight human genomes

[...]

01 Jan 2009-Nature Genetics

TL;DR: In this article, the authors explore variation on an intermediate scale, particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs, and find that 50% of the structural variants were seen in more than one individual and nearly half lay outside regions of the genome previously described as structurally variant.

Abstract: Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale, particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation: a standard for genotyping platforms and a prelude to future individual genome sequencing projects.

Journal Article•DOI•

Chromatin Immunoprecipitation of GFP-Tagged PML-Rara Coupled to High-Throughput Next Generation Sequencing.

Nicole R. Grieselhuber¹, Jahangheer S. Shaik¹, Li-Wei Chang¹, Sean McGrath¹, Lukas D. Wartman¹, Rakesh Nagarajan¹, Richard K. Wilson¹, Elaine R. Mardis¹, Timothy J. Ley¹•Institutions (1)

Washington University in St. Louis¹

20 Nov 2009-Blood

TL;DR: The results suggest that PML-RARA has an extended repertoire of genomic DNA binding sites compared to wild-type RARA, reflecting novel gain-of-function properties of the fusion protein.
