Journal ArticleDOI

The i5K initiative: Advancing arthropod genomics for knowledge, human health, agriculture, and the environment

TL;DR: An international effort to guide arthropod genomic efforts, from species prioritization to methodology and informatics, is described; it aims to deliver sequences and analytical tools for each of the arthropod branches and each of the species having beneficial and negative effects on humankind.
Abstract: Insects and their arthropod relatives including mites, spiders, and crustaceans play major roles in the world's terrestrial, aquatic, and marine ecosystems. Arthropods compete with humans for food and transmit devastating diseases. They also comprise the most diverse and successful branch of metazoan evolution, with millions of extant species. Here, we describe an international effort to guide arthropod genomic efforts, from species prioritization to methodology and informatics. The 5000 arthropod genomes initiative (i5K) community met formally in 2012 to discuss a roadmap for sequencing and analyzing 5000 high-priority arthropods and is continuing this effort via pilot projects, the development of standard operating procedures, and training of students and career scientists. With university, governmental, and industry support, the i5K Consortium aspires to deliver sequences and analytical tools for each of the arthropod branches and each of the species having beneficial and negative effects on humankind.
Citations
Journal ArticleDOI
TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.
Abstract: The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.

4,104 citations

Journal ArticleDOI
TL;DR: This update features a major scaling up of the resource coverage, sampling the genomic diversity of 1271 eukaryotes, 6013 prokaryotes and 6488 viruses, and picking up the best sequenced and annotated representatives for each species or operational taxonomic unit.
Abstract: OrthoDB (https://www.orthodb.org) provides evolutionary and functional annotations of orthologs. This update features a major scaling up of the resource coverage, sampling the genomic diversity of 1271 eukaryotes, 6013 prokaryotes and 6488 viruses. These include putative orthologs among 448 metazoan, 117 plant, 549 fungal, 148 protist, 5609 bacterial, and 404 archaeal genomes, picking up the best sequenced and annotated representatives for each species or operational taxonomic unit. OrthoDB relies on a concept of hierarchy of levels-of-orthology to enable more finely resolved gene orthologies for more closely related species. Since orthologs are the most likely candidates to retain functions of their ancestor gene, OrthoDB is aimed at narrowing down hypotheses about gene functions and enabling comparative evolutionary studies. Optional registered-user sessions allow on-line BUSCO assessments of gene set completeness and mapping of the uploaded data to OrthoDB to enable further interactive exploration of related annotations and generation of comparative charts. The accelerating expansion of genomics data continues to add valuable information, and OrthoDB strives to provide orthologs from the broadest coverage of species, as well as to extensively collate available functional annotations and to compute evolutionary annotations. The data can be browsed online, downloaded or assessed via REST API or SPARQL RDF compatible with both UniProt and Ensembl.

608 citations


Cites background from "The i5K initiative: Advancing arthr..."

  • ...17–19), particularly in the i5K initiative (20)....


Journal ArticleDOI
TL;DR: A perspective on the Earth BioGenome Project (EBP), a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth’s eukaryotic biodiversity over a period of 10 years, is presented.
Abstract: Increasing our understanding of Earth’s biodiversity and responsibly stewarding its resources are among the most crucial scientific and social challenges of the new millennium. These challenges require fundamental new knowledge of the organization, evolution, functions, and interactions among millions of the planet’s organisms. Herein, we present a perspective on the Earth BioGenome Project (EBP), a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth’s eukaryotic biodiversity over a period of 10 years. The outcomes of the EBP will inform a broad range of major issues facing humanity, such as the impact of climate change on biodiversity, the conservation of endangered species and ecosystems, and the preservation and enhancement of ecosystem services. We describe hurdles that the project faces, including data-sharing policies that ensure a permanent, freely available resource for future scientific discovery while respecting access and benefit sharing guidelines of the Nagoya Protocol. We also describe scientific and organizational challenges in executing such an ambitious project, and the structure proposed to achieve the project’s goals. The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can be realized only by a coordinated international effort.

560 citations

Journal ArticleDOI
TL;DR: A thorough phylogenetic analysis of the seven arthropod and human ABC protein subfamilies is conducted, to infer orthologous relationships that might suggest conserved function of ABC transporters in arthropods.

442 citations

References
Journal ArticleDOI
24 Mar 2000-Science
TL;DR: The nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome is determined using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map.
Abstract: The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

6,180 citations


"The i5K initiative: Advancing arthr..." refers to methods in this paper

  • ...Beginning with the sequencing of the Drosophila melanogaster genome in 2000 (Adams et al. 2000) and aided by relatively small genome sizes for many of the best-studied species (Gregory 2013; Grbic et al. 2011), insects and now mites have provided general insights into the evolution of genomes and…...


Journal ArticleDOI
24 Nov 2011-Nature
TL;DR: The completely sequenced and annotated genome of the spider mite Tetranychus urticae is presented. It is the first complete chelicerate genome and, at 90 megabases, the smallest sequenced arthropod genome, and it shows strong signatures of polyphagy and detoxification in gene families associated with feeding on different hosts.
Abstract: The spider mite Tetranychus urticae is a cosmopolitan agricultural pest with an extensive host plant range and an extreme record of pesticide resistance. Here we present the completely sequenced and annotated spider mite genome, representing the first complete chelicerate genome. At 90 megabases T. urticae has the smallest sequenced arthropod genome. Compared with other arthropods, the spider mite genome shows unique changes in the hormonal environment and organization of the Hox complex, and also reveals evolutionary innovation of silk production. We find strong signatures of polyphagy and detoxification in gene families associated with feeding on different hosts and in new gene families acquired by lateral gene transfer. Deep transcriptome analysis of mites feeding on different plants shows how this pest responds to a changing host environment. The T. urticae genome thus offers new insights into arthropod evolution and plant-herbivore interactions, and provides unique opportunities for developing novel plant protection strategies.

894 citations

01 Jan 2002

602 citations

Journal ArticleDOI
TL;DR: A precipitous drop in costs and increase in sequencing efficiency is anticipated, with concomitant development of improved annotation technology, and it is proposed to create a collection of tissue and DNA specimens for 10,000 vertebrate species specifically designated for whole-genome sequencing in the very near future.
Abstract: American Genetic Association, Gordon and Betty Moore Foundation, NHGRI Intramural Sequencing Center, and UCSC Alumni Association to cost of the Genome 10K workshop; Howard Hughes Medical Institute to D. H.; Gordon and Betty Moore Foundation to S. C. S.; A

545 citations