Author

Eric Cox

Bio: Eric Cox is an academic researcher at the National Institutes of Health. The author has contributed to research on SUMO protein and phosphorylation, has an h-index of 15, and has co-authored 24 publications that have received 4,681 citations. Previous affiliations of Eric Cox include Johns Hopkins University and Johns Hopkins University School of Medicine.

Papers
Journal Article
TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.
Abstract: The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.

4,104 citations

Journal Article
Benjamin J. Matthews, Olga Dudchenko, Sarah B. Kingan, Sergey Koren, Igor Antoshechkin, Jacob E. Crawford, William J. Glassford, Margaret Herre, Seth Redmond, Noah H. Rose, Gareth D. Weedall, Yang Wu, Sanjit S. Batra, Carlos A Brito-Sierra, Steven D. Buckingham, Corey L. Campbell, Saki Chan, Eric Cox, Benjamin R. Evans, Thanyalak Fansiri, Igor Filipović, Albin Fontaine, Andrea Gloria-Soria, Richard Hall, Vinita Joardar, Andrew K. Jones, Raissa G.G. Kay, Vamsi K. Kodali, Joyce Lee, Gareth J Lycett, Sara N. Mitchell, Jill Muehling, Michael R. Murphy, Arina D. Omer, Frederick A. Partridge, Paul Peluso, Aviva Presser Aiden, Vidya Ramasamy, Gordana Rašić, Sourav Roy, Karla Saavedra-Rodriguez, Shruti Sharan, Atashi Sharma, Melissa Smith, Joe Turner, Allison M Weakley, Zhilei Zhao, Omar S. Akbari, William C. Black, Han Cao, Alistair C. Darby, Catherine A. Hill, J. Spencer Johnston, Terence Murphy, Alexander S. Raikhel, David B. Sattelle, Igor V. Sharakhov, Bradley J. White, Li Zhao, Erez Lieberman Aiden, Richard S. Mann, Louis Lambrechts, Jeffrey R. Powell, Maria V. Sharakhova, Zhijian Tu, Hugh M. Robertson, Carolyn S. McBride, Alex Hastie, Jonas Korlach, Daniel E. Neafsey, Adam M. Phillippy, Leslie B. Vosshall
14 Nov 2018-Nature
TL;DR: An improved, fully re-annotated Aedes aegypti genome assembly (AaegL5) provides insights into the sex-determining M locus, chemosensory systems that help mosquitoes to hunt humans and loci involved in insecticide resistance and will help to generate intervention strategies to fight this deadly disease vector.
Abstract: Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.

392 citations

Journal Article
03 Sep 2013-eLife
TL;DR: It is suggested that mCpG-dependent TF binding activity is a widespread phenomenon and provides a new framework to understand the role and mechanism of TFs in epigenetic regulation of gene transcription.
Abstract: DNA methylation, especially CpG methylation at promoter regions, has been generally considered as a potent epigenetic modification that prohibits transcription factor (TF) recruitment, resulting in transcription suppression. Here, we used a protein microarray-based approach to systematically survey the entire human TF family and found numerous purified TFs with methylated CpG (mCpG)-dependent DNA-binding activities. Interestingly, some TFs exhibit specific binding activity to methylated and unmethylated DNA motifs of distinct sequences. To elucidate the underlying mechanism, we focused on Kruppel-like factor 4 (KLF4), and decoupled its mCpG- and CpG-binding activities via site-directed mutagenesis. Furthermore, KLF4 binds specific methylated or unmethylated motifs in human embryonic stem cells in vivo. Our study suggests that mCpG-dependent TF binding activity is a widespread phenomenon and provides a new framework to understand the role and mechanism of TFs in epigenetic regulation of gene transcription. DOI: http://dx.doi.org/10.7554/eLife.00726.001

326 citations

Patent
06 Dec 1999
TL;DR: The invention covers compounds of formula (1), methods of treating abnormal cell growth in mammals by administering those compounds, and pharmaceutical compositions for treating such disorders that contain them.
Abstract: The invention relates to compounds of formula (1) and to pharmaceutically acceptable salts and solvates thereof, wherein A, X, R1, R3 and R4 are as defined herein. The invention also relates to methods of treating abnormal cell growth in mammals by administering the compounds of formula (1) and to pharmaceutical compositions for treating such disorders which contain the compounds of formula (1). The invention also relates to methods of preparing the compounds of formula (1).

146 citations


Cited by
Journal Article
Minoru Kanehisa, Miho Furumichi, Mao Tanabe, Yoko Sato, Kanae Morishima
TL;DR: The KO (KEGG Orthology) database has been expanded and its quality improved irrespective of whether or not the KOs appear in the three molecular network databases, and the newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined.
Abstract: KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes. Assigning functional meanings to genes and genomes both at the molecular and higher levels is the primary objective of the KEGG database project. Molecular-level functions are stored in the KO (KEGG Orthology) database, where each KO is defined as a functional ortholog of genes and proteins. Higher-level functions are represented by networks of molecular interactions, reactions and relations in the forms of KEGG pathway maps, BRITE hierarchies and KEGG modules. In the past the KO database was developed for the purpose of defining nodes of molecular networks, but now the content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases. The newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined. Furthermore, the DISEASE and DRUG databases have been improved by systematic analysis of drug labels for better integration of diseases and drugs with the KEGG molecular networks. KEGG is moving towards becoming a comprehensive knowledge base for both functional interpretation and practical application of genomic information.

5,741 citations

Journal Article
TL;DR: This work generates primary data, creates bioinformatics tools and provides analysis to support the work of expert manual gene annotators and automated gene annotation pipelines to identify and characterise gene loci to the highest standard.
Abstract: The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.

2,095 citations

Journal Article
TL;DR: The new version of the MPI Bioinformatics Toolkit is introduced, focusing on improved features for the comprehensive analysis of proteins, as well as on promoting teaching.

1,757 citations

Journal Article
TL;DR: Improvements make the miRTarBase one of the more comprehensively annotated, experimentally validated miRNA-target interactions databases and motivate additional miRNA research efforts.
Abstract: MicroRNAs (miRNAs) are small non-coding RNAs of approximately 22 nucleotides, which negatively regulate the gene expression at the post-transcriptional level. This study describes an update of the miRTarBase (http://miRTarBase.mbc.nctu.edu.tw/) that provides information about experimentally validated miRNA-target interactions (MTIs). The latest update of the miRTarBase expanded it to identify systematically Argonaute-miRNA-RNA interactions from 138 crosslinking and immunoprecipitation sequencing (CLIP-seq) data sets that were generated by 21 independent studies. The database contains 4966 articles, 7439 strongly validated MTIs (using reporter assays or western blots) and 348 007 MTIs from CLIP-seq. The number of MTIs in the miRTarBase has increased around 7-fold since the 2014 miRTarBase update. The miRNA and gene expression profiles from The Cancer Genome Atlas (TCGA) are integrated to provide an effective overview of this exponential growth in the miRNA experimental data. These improvements make the miRTarBase one of the more comprehensively annotated, experimentally validated miRNA-target interactions databases and motivate additional miRNA research efforts.

1,517 citations