Topic

Annotation

About: Annotation is a research topic. Over its lifetime, 6,719 publications have been published within this topic, receiving 203,463 citations. The topic is also known as: note, markup.


Papers
Patent
14 Sep 2005
TL;DR: In this article, an annotation is anchored to a piece of runtime content, such as upon creation of the annotation, by maintaining annotation data as well as start and end pointers mapped to the annotated piece.
Abstract: Described is the annotating of computer document content, particularly editable content, by saving annotations in a separate annotation store, and mapping the annotations back to the content. By mapping, no data are added to the original content at runtime, and only minimal data need be added to the content when persisted. An annotation is anchored to a piece of runtime content, such as upon creation of the annotation, by maintaining annotation data as well as start and end pointers mapped to the annotated piece. Upon saving the content, information (e.g., an anchor marker including an identifier) is persisted with the piece of content to allow the annotation to be re-anchored to the piece upon subsequent reload. For example, when loading content from persistent storage that includes an annotation identifier, the annotation identifier is processed to locate a start and end of the portion and an annotation in the annotation store.

42 citations
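
The anchoring scheme described in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `Annotation` class, the `save` and `load` functions, and the `[[a:ID]]` / `[[/a:ID]]` marker syntax are all hypothetical. The sketch keeps annotations in a separate store with start and end offsets, writes only small identifier markers into the persisted content, and re-anchors the annotations when the content is reloaded.

```python
import re
from dataclasses import dataclass
from typing import Dict


@dataclass
class Annotation:
    """One entry in the separate annotation store (hypothetical structure)."""
    ann_id: str
    note: str
    start: int  # offset of the annotated piece in the runtime content
    end: int    # offset just past the annotated piece


# Hypothetical anchor-marker syntax: [[a:ID]] opens, [[/a:ID]] closes.
MARKER = re.compile(r"\[\[(/?)a:([A-Za-z0-9_-]+)\]\]")


def save(content: str, store: Dict[str, Annotation]) -> str:
    """Persist content with anchor markers; the runtime content is not modified."""
    inserts = []
    for ann in store.values():
        inserts.append((ann.start, f"[[a:{ann.ann_id}]]"))
        inserts.append((ann.end, f"[[/a:{ann.ann_id}]]"))
    # Insert from the rightmost position so earlier offsets stay valid.
    for pos, marker in sorted(inserts, key=lambda item: item[0], reverse=True):
        content = content[:pos] + marker + content[pos:]
    return content


def load(persisted: str, store: Dict[str, Annotation]) -> str:
    """Strip markers and re-anchor each known annotation to its piece."""
    pieces = []
    clean_length = 0
    last = 0
    for match in MARKER.finditer(persisted):
        chunk = persisted[last:match.start()]
        pieces.append(chunk)
        clean_length += len(chunk)
        last = match.end()
        ann = store.get(match.group(2))
        if ann is None:
            continue  # marker with no matching annotation in the store
        if match.group(1):
            ann.end = clean_length
        else:
            ann.start = clean_length
    pieces.append(persisted[last:])
    return "".join(pieces)
```

In this sketch the annotation text never enters the content itself; only the identifier is persisted, which mirrors the abstract's point that minimal data need be added when the content is saved.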

01 Jan 2013
TL;DR: BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies and could be very easily adapted to work with new technologies in the future.
Abstract: BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next-generation sequencing (NGS) technologies. The system is versatile and able to annotate genes even at the preliminary assembly stage of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in a single step. BG7 is especially tolerant of sequencing errors in start and stop codons, of frameshifts, and of assembly or scaffolding errors. The system is also tolerant of the high level of gene fragmentation that is frequently found in genomes that are not fully assembled. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC outbreak in Germany, in which BG7 was used to produce the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. In addition, thanks to its plasticity, the system could be very easily adapted to work with new technologies in the future.

42 citations

Journal Article
TL;DR: The coverage of FunSimMat is significantly increased by adding data from the Gene Ontology Annotation project as well as new functional similarity measures, and two new visualization tools allow an interactive analysis of the functional relationships between proteins or protein families.
Abstract: Quantifying the functional similarity of genes and their products based on Gene Ontology annotation is an important tool for diverse applications like the analysis of gene expression data, the prediction and validation of protein functions and interactions, and the prioritization of disease genes. The Functional Similarity Matrix (FunSimMat, http://www.funsimmat.de) is a comprehensive database providing various precomputed functional similarity values for proteins in UniProtKB and for protein families in Pfam and SMART. With this update, we significantly increase the coverage of FunSimMat by adding data from the Gene Ontology Annotation project as well as new functional similarity measures. The applicability of the database is greatly extended by the implementation of a new Gene Ontology-based method for disease gene prioritization. Two new visualization tools allow an interactive analysis of the functional relationships between proteins or protein families. This is enhanced further by the introduction of an automatically derived hierarchy of annotation classes. Additional changes include a revised user front-end and a new RESTlike interface for improving the user-friendliness and online accessibility of FunSimMat.

42 citations

Patent
14 Aug 2003
TL;DR: An annotation-based automatic web document generation apparatus includes a web server, an annotation editor, an annotation server, an annotation/meta file database, and an annotation processing system. It is used to generate a new web document adapted for a user terminal by combining context information with the merged meta file, and to efficiently provide existing web documents to a plurality of user terminals with various characteristics.
Abstract: An annotation-based automatic web document generation apparatus includes a web server, an annotation editor, an annotation server, an annotation/meta file database, and an annotation processing system. The web server provides a web document. The annotation editor refers to the web document from the web server to generate at least one annotation and generates a merged command for data for said at least one annotation. The annotation/meta file database stores the data for said at least one annotation and a merged meta file including the merged command for the data for said at least one annotation. Thus, it is possible to generate a new web document adapted for a user terminal by combining context information with the merged meta file, and to efficiently provide existing web documents to a plurality of user terminals with various characteristics.

41 citations

Proceedings Article
22 Jul 2006
TL;DR: This paper discusses the problems associated with lower-density languages in the context of the development of linguistically annotated resources and frames the work with three key questions: how to define lower-density languages, how to increase available resources, and how to reduce data requirements.
Abstract: The languages that are most commonly subject to linguistic annotation on a large scale tend to be those with the largest populations or with recent histories of linguistic scholarship. In this paper we discuss the problems associated with lower-density languages in the context of the development of linguistically annotated resources. We frame our work with three key questions: how to define lower-density languages, how to increase available resources, and how to reduce data requirements. A number of steps forward are identified for increasing the number of lower-density language corpora with linguistic annotations.

41 citations


Network Information
Related Topics (5)
Inference: 36.8K papers, 1.3M citations, 81% related
Deep learning: 79.8K papers, 2.1M citations, 80% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 80% related
Unsupervised learning: 22.7K papers, 1M citations, 79% related
Cluster analysis: 146.5K papers, 2.9M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    1,461
2022    3,073
2021    305
2020    401
2019    383
2018    373