Institution
Wellcome Trust Sanger Institute
Nonprofit • Cambridge, United Kingdom
About: Wellcome Trust Sanger Institute is a nonprofit organization based in Cambridge, United Kingdom. It is known for research contributions in the topics Population and Genome. The organization has 4009 authors who have published 9671 publications receiving 1224479 citations.
Topics: Population, Genome, Gene, Genome-wide association study, Genomics
Papers
••
TL;DR: The Ensembl gene-building system enables fast automated annotation of eukaryotic genomes and annotates genes based on evidence derived from known protein, cDNA, and EST sequences.
Abstract: As more genomes are sequenced, there is an increasing need for automated first-pass annotation which allows timely access to important genomic information. The Ensembl gene-building system enables fast automated annotation of eukaryotic genomes. It annotates genes based on evidence derived from known protein, cDNA, and EST sequences. The gene-building system rests on top of the core Ensembl (MySQL) database schema and Perl Application Programming Interface (API), and the data generated are accessible through the Ensembl genome browser (http://www.ensembl.org). To date, the Ensembl predicted gene sets are available for the A. gambiae, C. briggsae, zebrafish, mouse, rat, and human genomes and have been heavily relied upon in the publication of the human, mouse, rat, and A. gambiae genome sequence analysis. Here we describe in detail the gene-building system and the algorithms involved. All code and data are freely available from http://www.ensembl.org.
406 citations
••
Affiliations: (1) University of Nevada, Reno; (2) Purdue University; (3) Monsanto; (4) Old Dominion University; (5) North Carolina State University; (6) University College London; (7) Oklahoma State University–Stillwater; (8) Spanish National Research Council; (9) National Institutes of Health; (10) University of Cambridge; (11) Wellcome Trust; (12) J. Craig Venter Institute; (13) Leidos; (14) Broad Institute; (15) University of Nevada, Las Vegas; (16) University of Notre Dame; (17) University of Barcelona; (18) Carlos III Health Institute; (19) University of Massachusetts Medical School; (20) University of Connecticut; (21) University of Oxford; (22) University of Lausanne; (23) West Virginia University; (24) Virginia Tech; (25) Indiana University; (26) University of Maryland, Baltimore; (27) Kansas State University; (28) Texas A&M University; (29) University of Minnesota; (30) University of Manchester; (31) National University of Singapore; (32) University of California, San Francisco; (33) Iowa State University; (34) Colorado State University; (35) Pennsylvania State University; (36) University of California, Riverside; (37) Max Planck Society; (38) ANSES; (39) University of Santiago de Compostela; (40) Pompeu Fabra University; (41) California State Polytechnic University, Pomona; (42) University of Queensland; (43) University of the Sunshine Coast; (44) University of Geneva; (45) Swiss Institute of Bioinformatics; (46) University of Copenhagen; (47) University of Tennessee Health Science Center; (48) Wellcome Trust Sanger Institute; (49) University of Vigo; (50) University of Illinois at Urbana–Champaign; (51) Quinnipiac University; (52) International Livestock Research Institute
TL;DR: Insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival are reported.
Abstract: Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick-host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.
406 citations
••
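TL;DR: The genome sequence of a plant pathogenic enterobacterium, Erwinia carotovora subsp. atroseptica (Eca) strain SCRI1043, the causative agent of soft rot and blackleg potato diseases, is reported.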
Abstract: The bacterial family Enterobacteriaceae is notable for its well studied human pathogens, including Salmonella, Yersinia, Shigella, and Escherichia spp. However, it also contains several plant pathogens. We report the genome sequence of a plant pathogenic enterobacterium, Erwinia carotovora subsp. atroseptica (Eca) strain SCRI1043, the causative agent of soft rot and blackleg potato diseases. Approximately 33% of Eca genes are not shared with sequenced enterobacterial human pathogens, including some predicted to facilitate unexpected metabolic traits, such as nitrogen fixation and opine catabolism. This proportion of genes also contains an overrepresentation of pathogenicity determinants, including possible horizontally acquired gene clusters for putative type IV secretion and polyketide phytotoxin synthesis. To investigate whether these gene clusters play a role in the disease process, an arrayed set of insertional mutants was generated, and mutations were identified. Plant bioassays showed that these mutants were significantly reduced in virulence, demonstrating both the presence of novel pathogenicity determinants in Eca, and the impact of functional genomics in expanding our understanding of phytopathogenicity in the Enterobacteriaceae.
406 citations
••
TL;DR: This work validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrated that 86% and 82% of the human and mouse reference genomes are error-free, respectively.
Abstract: Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.
406 citations
••
Affiliations: (1) GlaxoSmithKline; (2) University of Cambridge; (3) University of Ottawa; (4) University of Helsinki; (5) Erasmus University Rotterdam; (6) Imperial College London; (7) Wellcome Trust Sanger Institute; (8) University of London; (9) University of Leicester; (10) Queen Mary University of London; (11) National Institute for Health and Welfare; (12) University of Amsterdam; (13) National Institutes of Health; (14) National Research Council; (15) University of Michigan; (16) University of North Carolina at Chapel Hill; (17) University of Oulu; (18) University of Bristol; (19) University of Lausanne; (20) MedStar Washington Hospital Center; (21) University of Pennsylvania; (22) University of Texas Southwestern Medical Center; (23) University of Leeds; (24) The Heart Research Institute; (25) Massachusetts Institute of Technology
TL;DR: In addition to those that are largely associated with LDL-C, genetic loci mainly associated with circulating triglycerides and HDL-C are also associated with risk of CAD, and these findings potentially provide new insights into the biological mechanisms underlying lipid metabolism and CAD risk.
Abstract: Objective—Genetic studies might provide new insights into the biological mechanisms underlying lipid metabolism and risk of CAD. We therefore conducted a genome-wide association study to identify n...
403 citations
Authors
Showing all 4058 results
Name | H-index | Papers | Citations |
---|---|---|---|
Nicholas J. Wareham | 212 | 1657 | 204896 |
Gonçalo R. Abecasis | 179 | 595 | 230323 |
Panos Deloukas | 162 | 410 | 154018 |
Michael R. Stratton | 161 | 443 | 142586 |
David W. Johnson | 160 | 2714 | 140778 |
Michael John Owen | 160 | 1110 | 135795 |
Naveed Sattar | 155 | 1326 | 116368 |
Robert E. W. Hancock | 152 | 775 | 88481 |
Julian Parkhill | 149 | 759 | 104736 |
Nilesh J. Samani | 149 | 779 | 113545 |
Michael Conlon O'Donovan | 142 | 736 | 118857 |
Jian Yang | 142 | 1818 | 111166 |
Christof Koch | 141 | 712 | 105221 |
Andrew G. Clark | 140 | 823 | 123333 |
Stylianos E. Antonarakis | 138 | 746 | 93605 |