Michael DiCuccio

Journal Article

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

- 04 Jan 2016 -

TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.

Journal Article

NCBI prokaryotic genome annotation pipeline

Tatiana Tatusova, +9 more

- 19 Aug 2016 -

Nucleic Acids Research

TL;DR: NCBI's new Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, and more on statistical predictions in the absence of external evidence.

Journal Article

RefSeq: an update on mammalian reference sequences

Kim D. Pruitt, +28 more

- 01 Jan 2014 -

Nucleic Acids Research

TL;DR: The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of annotated genomic, transcript and protein sequence records derived from data in public sequence archives and from computation, curation and collaboration.

Journal Article

RefSeq: an update on prokaryotic genome annotation and curation.

Daniel H. Haft, +20 more

- 04 Jan 2018 -

Nucleic Acids Research

TL;DR: The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information provides annotation for over 95 000 prokaryotic genomes that meet standards for sequence quality, completeness, and freedom from contamination.

Journal Article

The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes

Kim D. Pruitt, +48 more

- 01 Jul 2009 -

Genome Research

TL;DR: The CCDS database centralizes the identification of well-supported, identically annotated protein-coding regions; the results indicate that entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups that are not included in CCDS.
