Topic

Hypothetical protein

About: Hypothetical protein is a research topic. Over its lifetime, 1,364 publications have appeared within this topic, receiving 40,253 citations.


Papers
Journal Article
TL;DR: Characterization of unannotated and uncharacterized protein segments is expected to lead to the discovery of novel functions as well as provide important insights into existing biological processes and is likely to shed new light on molecular mechanisms of diseases that are not yet fully understood.
Abstract: 1.1. Uncharacterized Protein Segments Are a Source of Functional Novelty. Over the past decade, we have observed a massive increase in the amount of information describing protein sequences from a variety of organisms [1,2]. While this may reflect the diversity in sequence space, and possibly also in function space [3], a large proportion of the sequences lacks any useful function annotation [4,5]. Often these sequences are annotated as putative or hypothetical proteins, and for the majority their functions still remain unknown [6,7]. Suggestions about potential protein function, primarily molecular function, often come from computational analysis of their sequences. For instance, homology detection allows for the transfer of information from well-characterized protein segments to those with similar sequences that lack annotation of molecular function [8-10]. Other aspects of function, such as the biological processes proteins participate in, may come from genetic- and disease-association studies, expression and interaction network data, and comparative genomics approaches that investigate genomic context [11-17]. Characterization of unannotated and uncharacterized protein segments is expected to lead to the discovery of novel functions as well as provide important insights into existing biological processes. In addition, it is likely to shed new light on molecular mechanisms of diseases that are not yet fully understood. Thus, uncharacterized protein segments are likely to be a large source of functional novelty relevant for discovering new biology.

1,540 citations

Journal Article
13 Mar 1987 - Science
TL;DR: A gene encoding a messenger RNA (mRNA) of 4.6 kilobases, located in the proximity of esterase D, was identified as the retinoblastoma susceptibility (RB) gene on the basis of chromosomal location, homozygous deletion, and tumor-specific alterations in expression.
Abstract: Recent evidence indicates the existence of a genetic locus in chromosome region 13q14 that confers susceptibility to retinoblastoma, a cancer of the eye in children. A gene encoding a messenger RNA (mRNA) of 4.6 kilobases (kb), located in the proximity of esterase D, was identified as the retinoblastoma susceptibility (RB) gene on the basis of chromosomal location, homozygous deletion, and tumor-specific alterations in expression. Transcription of this gene was abnormal in six of six retinoblastomas examined: in two tumors, RB mRNA was not detectable, while four others expressed variable quantities of RB mRNA with decreased molecular size of about 4.0 kb. In contrast, full-length RB mRNA was present in human fetal retina and placenta, and in other tumors such as neuroblastoma and medulloblastoma. DNA from retinoblastoma cells had a homozygous gene deletion in one case and hemizygous deletion in another case, while the remainder were not grossly different from normal human control DNA. The gene contains at least 12 exons distributed in a region of over 100 kb. Sequence analysis of complementary DNA clones yielded a single long open reading frame that could encode a hypothetical protein of 816 amino acids. A computer-assisted search of a protein sequence database revealed no closely related proteins. Features of the predicted amino acid sequence include potential metal-binding domains similar to those found in nucleic acid-binding proteins. These results provide a framework for further study of recessive genetic mechanisms in human cancers.

1,407 citations

Journal Article
TL;DR: The state of the art in function prediction is reviewed and some of the underlying difficulties and successes are described, including inferring conservation patterns in members of a functionally uncharacterized family for which many sequences and structures are known.
Abstract: The sequence of a genome contains the plans of the possible life of an organism, but implementation of genetic information depends on the functions of the proteins and nucleic acids that it encodes. Many individual proteins of known sequence and structure present challenges to the understanding of their function. In particular, a number of genes responsible for diseases have been identified but their specific functions are unknown. Whole-genome sequencing projects are a major source of proteins of unknown function. Annotation of a genome involves assignment of functions to gene products, in most cases on the basis of amino-acid sequence alone. 3D structure can aid the assignment of function, motivating the challenge of structural genomics projects to make structural information available for novel uncharacterized proteins. Structure-based identification of homologues often succeeds where sequence-alone-based methods fail, because in many cases evolution retains the folding pattern long after sequence similarity becomes undetectable. Nevertheless, prediction of protein function from sequence and structure is a difficult problem, because homologous proteins often have different functions. Many methods of function prediction rely on identifying similarity in sequence and/or structure between a protein of unknown function and one or more well-understood proteins. Alternative methods include inferring conservation patterns in members of a functionally uncharacterized family for which many sequences and structures are known. However, these inferences are tenuous. Such methods provide reasonable guesses at function, but are far from foolproof. It is therefore fortunate that the development of whole-organism approaches and comparative genomics permits other approaches to function prediction when the data are available. These include the use of protein-protein interaction patterns, and correlations between occurrences of related proteins in different organisms, as indicators of functional properties. Even if it is possible to ascribe a particular function to a gene product, the protein may have multiple functions. A fundamental problem is that function is in many cases an ill-defined concept. In this article we review the state of the art in function prediction and describe some of the underlying difficulties and successes.

1,299 citations

Journal Article
TL;DR: The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins.
Abstract: The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins. The algorithm finds protein similarities across significant evolutionary distances using sensitive protein domain profiles rather than by direct sequence similarity. Proteins similar to a query protein are grouped and scored by architecture. Relying on domain profiles allows CDART to be fast, and, because it relies on annotated functional domains, informative. Domain profiles are derived from several collections of domain definitions that include functional annotation. Searches can be further refined by taxonomy and by selecting domains of interest. CDART is available at http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi.

675 citations
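
The idea in the CDART abstract above, comparing proteins by their ordered list of conserved domains rather than by raw sequence, can be illustrated with a small sketch. This is not CDART's implementation or scoring scheme: the function names, the protein names, the domain labels, and the scoring rule (number of distinct shared domains) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def shared_domain_score(query, candidate):
    """Hypothetical score: number of distinct domains both architectures contain."""
    return len(set(query) & set(candidate))


def group_by_architecture(query, proteins):
    """Group proteins that share an identical domain architecture,
    then rank the groups by their score against the query architecture."""
    groups = defaultdict(list)
    for name, architecture in proteins.items():
        # An architecture is the sequential order of domains, so a tuple is the key.
        groups[tuple(architecture)].append(name)

    ranked = []
    for architecture, names in groups.items():
        score = shared_domain_score(query, architecture)
        if score > 0:  # drop groups with no domain in common with the query
            ranked.append((score, architecture, sorted(names)))

    # Highest score first; ties broken by architecture for a stable order.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return ranked


# Made-up example data (illustrative only, not taken from any database).
proteins = {
    "protA": ["SH3", "SH2", "Kinase"],
    "protB": ["SH2", "Kinase"],
    "protC": ["SH3", "SH2", "Kinase"],
    "protD": ["PDZ"],
}
query = ["SH3", "SH2", "Kinase"]
ranked = group_by_architecture(query, proteins)

for score, architecture, names in ranked:
    print(score, "-".join(architecture), names)
```

Grouping on the ordered tuple means two proteins fall into the same group only when their domains appear in the same sequence, which mirrors the abstract's definition of architecture. The score itself ignores order and is only a stand-in for whatever ranking the real tool applies.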

Journal Article
01 Oct 1987 - Nature
TL;DR: Biochemical fractionation and immunofluorescence studies demonstrate that the majority of the RB protein is located within the nucleus, suggesting that the RB gene product may function in regulating other genes within the cell.
Abstract: The human gene (RB) that determines susceptibility to hereditary retinoblastoma has been identified recently by molecular genetic techniques. Previous results indicate that complete inactivation of the RB gene is required for tumour formation. As a 'cancer suppressor' gene, RB thus functions in a manner opposite to that of most other oncogenes. Sequence analysis of RB complementary DNA clones demonstrated a long open reading frame encoding a hypothetical protein with features suggestive of a DNA-binding function. To further substantiate and identify the RB protein, we have prepared rabbit antisera against a trypE-RB fusion protein. The purified anti-RB IgG immunoprecipitates a protein doublet with apparent relative molecular mass (Mr) of 110,000-114,000. The specific protein(s) are present in all cell lines expressing normal RB mRNA, but are not detected in five retinoblastoma cell lines examined. The RB protein can be metabolically labelled with 32P-phosphoric acid, indicating that it is a phosphoprotein. Biochemical fractionation and immunofluorescence studies demonstrate that the majority of the protein is located within the nucleus. Furthermore, the protein can be retained by and eluted from DNA-cellulose columns, suggesting that it is associated with DNA binding activity. Taken together, these results imply that the RB gene product may function in regulating other genes within the cell.

656 citations


Network Information
Related Topics (5)
Peptide sequence: 84.1K papers, 4.3M citations, 86% related
Mutant: 74.5K papers, 3.4M citations, 85% related
Protein structure: 42.3K papers, 3M citations, 85% related
Genome: 74.2K papers, 3.8M citations, 83% related
Gene: 211.7K papers, 10.3M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    2
2022    10
2021    47
2020    59
2019    50
2018    31