Institution

Wellcome Trust Sanger Institute

Nonprofit · Cambridge, United Kingdom
About: Wellcome Trust Sanger Institute is a nonprofit organization based in Cambridge, United Kingdom. It is known for research contributions on the topics Population and Genome. The organization has 4,009 authors who have published 9,671 publications receiving 1,224,479 citations.


Papers
Journal Article
TL;DR: The Ensembl gene-building system enables fast automated annotation of eukaryotic genomes and annotates genes based on evidence derived from known protein, cDNA, and EST sequences.
Abstract: As more genomes are sequenced, there is an increasing need for automated first-pass annotation which allows timely access to important genomic information. The Ensembl gene-building system enables fast automated annotation of eukaryotic genomes. It annotates genes based on evidence derived from known protein, cDNA, and EST sequences. The gene-building system rests on top of the core Ensembl (MySQL) database schema and Perl Application Programming Interface (API), and the data generated are accessible through the Ensembl genome browser (http://www.ensembl.org). To date, the Ensembl predicted gene sets are available for the A. gambiae, C. briggsae, zebrafish, mouse, rat, and human genomes and have been heavily relied upon in the publication of the human, mouse, rat, and A. gambiae genome sequence analysis. Here we describe in detail the gene-building system and the algorithms involved. All code and data are freely available from http://www.ensembl.org.

406 citations

Journal Article
Monika Gulia-Nuss (1, 2), Andrew B. Nuss (2, 1), Jason M. Meyer (2, 3), Daniel E. Sonenshine (4), R. Michael Roe (5), Robert M. Waterhouse, David B. Sattelle (6), José de la Fuente (7, 8), José M. C. Ribeiro (9), Karyn Megy (10, 11), Jyothi Thimmapuram (2), Jason R. Miller (12), Brian P. Walenz (12, 9), Sergey Koren (12, 9), Jessica B. Hostetler (12, 9), Mathangi Thiagarajan (12, 13), Vinita Joardar (12, 9), Linda Hannick (13, 12), Shelby L. Bidwell (9, 12), Martin Hammond (11), Sarah Young (14), Qiandong Zeng (14), Jenica L. Abrudan (15, 16), Francisca C. Almeida (17), Nieves Ayllón (8), Ketaki Bhide (2), Brooke W. Bissinger (5), Elena Bonzón-Kulichenko (18), Steven D. Buckingham (6), Daniel R. Caffrey (19), Melissa J. Caimano (20), Vincent Croset (21, 22), Timothy P. Driscoll (23, 24), Don Gilbert (25), Joseph J. Gillespie (26, 24), Gloria I. Giraldo-Calderón (2, 16), Jeffrey M. Grabowski (2, 9), David Jiang (24), Sayed M.S. Khalil, Donghun Kim (27, 28), Katherine M. Kocan (7), Juraj Koči (26, 27), Richard J. Kuhn (2), Timothy J. Kurtti (29), Kristin Lees (30, 31), Emma G. Lang (2), Ryan C. Kennedy (32), Hyeogsun Kwon (33, 28), Rushika Perera (2, 34), Yumin Qi (24), Justin D. Radolf (20), Joyce M. Sakamoto (35), Alejandro Sánchez-Gracia (17), Maiara S. Severo (36, 37), Neal S. Silverman (19), Ladislav Šimo (38, 27), Marta Tojo (10, 39), Cristian Tornador (40), Janice P. Van Zee (2), Jesús Vázquez (18), Filipe G. Vieira (17), Margarita Villar (8), Adam R. Wespiser (19), Yunlong Yang (28), Jiwei Zhu (5), Peter Arensburger (41), Patricia V. Pietrantonio (28), Stephen C. Barker (42), Renfu Shao (43), Evgeny M. Zdobnov (44, 45), Frank Hauser (46), Cornelis J. P. Grimmelikhuijzen (46), Yoonseong Park (27), Julio Rozas (17), Richard Benton (22), Joao H. F. Pedra (26, 36), David R. Nelson (47), Maria F. Unger (16), Jose M. C. Tubio (48, 49), Zhijian Jake Tu (24), Hugh M. Robertson (50), Martin Shumway (37, 12), Granger G. Sutton (12), Jennifer R. Wortman (12), Daniel Lawson (11), Stephen K. Wikel (51), Vishvanath Nene (12, 52), Claire M. Fraser (26), Frank H. Collins (16), Bruce W. Birren (14), Karen E. Nelson (12), Elisabet Caler (12, 9), Catherine A. Hill (2)
Affiliations: (1) University of Nevada, Reno; (2) Purdue University; (3) Monsanto; (4) Old Dominion University; (5) North Carolina State University; (6) University College London; (7) Oklahoma State University–Stillwater; (8) Spanish National Research Council; (9) National Institutes of Health; (10) University of Cambridge; (11) Wellcome Trust; (12) J. Craig Venter Institute; (13) Leidos; (14) Broad Institute; (15) University of Nevada, Las Vegas; (16) University of Notre Dame; (17) University of Barcelona; (18) Carlos III Health Institute; (19) University of Massachusetts Medical School; (20) University of Connecticut; (21) University of Oxford; (22) University of Lausanne; (23) West Virginia University; (24) Virginia Tech; (25) Indiana University; (26) University of Maryland, Baltimore; (27) Kansas State University; (28) Texas A&M University; (29) University of Minnesota; (30) University of Manchester; (31) National University of Singapore; (32) University of California, San Francisco; (33) Iowa State University; (34) Colorado State University; (35) Pennsylvania State University; (36) University of California, Riverside; (37) Max Planck Society; (38) ANSES; (39) University of Santiago de Compostela; (40) Pompeu Fabra University; (41) California State Polytechnic University, Pomona; (42) University of Queensland; (43) University of the Sunshine Coast; (44) University of Geneva; (45) Swiss Institute of Bioinformatics; (46) University of Copenhagen; (47) University of Tennessee Health Science Center; (48) Wellcome Trust Sanger Institute; (49) University of Vigo; (50) University of Illinois at Urbana–Champaign; (51) Quinnipiac University; (52) International Livestock Research Institute
TL;DR: Insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival are reported.
Abstract: Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick-host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.

406 citations

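Journal Article
TL;DR: The genome sequence of a plant pathogenic enterobacterium, Erwinia carotovora subsp. atroseptica (Eca) strain SCRI1043, the causative agent of soft rot and blackleg potato diseases, is reported.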
Abstract: The bacterial family Enterobacteriaceae is notable for its well studied human pathogens, including Salmonella, Yersinia, Shigella, and Escherichia spp. However, it also contains several plant pathogens. We report the genome sequence of a plant pathogenic enterobacterium, Erwinia carotovora subsp. atroseptica (Eca) strain SCRI1043, the causative agent of soft rot and blackleg potato diseases. Approximately 33% of Eca genes are not shared with sequenced enterobacterial human pathogens, including some predicted to facilitate unexpected metabolic traits, such as nitrogen fixation and opine catabolism. This proportion of genes also contains an overrepresentation of pathogenicity determinants, including possible horizontally acquired gene clusters for putative type IV secretion and polyketide phytotoxin synthesis. To investigate whether these gene clusters play a role in the disease process, an arrayed set of insertional mutants was generated, and mutations were identified. Plant bioassays showed that these mutants were significantly reduced in virulence, demonstrating both the presence of novel pathogenicity determinants in Eca, and the impact of functional genomics in expanding our understanding of phytopathogenicity in the Enterobacteriaceae.

406 citations

Journal Article
TL;DR: This work validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrated that 86% and 82% of the human and mouse reference genomes are error-free, respectively.
Abstract: Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.

406 citations

Journal Article
TL;DR: In addition to those that are largely associated with LDL-C, genetic loci mainly associated with circulating triglycerides and HDL-C are also associated with risk of CAD, and these findings potentially provide new insights into the biological mechanisms underlying lipid metabolism and CAD risk.
Abstract: Objective—Genetic studies might provide new insights into the biological mechanisms underlying lipid metabolism and risk of CAD. We therefore conducted a genome-wide association study to identify n...

403 citations


Authors

Name                        H-index   Papers   Citations
Nicholas J. Wareham         212       1657     204896
Gonçalo R. Abecasis         179       595      230323
Panos Deloukas              162       410      154018
Michael R. Stratton         161       443      142586
David W. Johnson            160       2714     140778
Michael John Owen           160       1110     135795
Naveed Sattar               155       1326     116368
Robert E. W. Hancock        152       775      88481
Julian Parkhill             149       759      104736
Nilesh J. Samani            149       779      113545
Michael Conlon O'Donovan    142       736      118857
Jian Yang                   142       1818     111166
Christof Koch               141       712      105221
Andrew G. Clark             140       823      123333
Stylianos E. Antonarakis    138       746      93605

Network Information
Related Institutions (5)
Broad Institute
11.6K papers, 1.5M citations

96% related

Howard Hughes Medical Institute
34.6K papers, 5.2M citations

95% related

Laboratory of Molecular Biology
24.2K papers, 2.1M citations

94% related

Salk Institute for Biological Studies
13.1K papers, 1.6M citations

93% related

National Institutes of Health
297.8K papers, 21.3M citations

93% related

Performance
Metrics
No. of papers from the Institution in previous years
Year    Papers
2023    17
2022    70
2021    836
2020    810
2019    854
2018    764