scispace - formally typeset
Search or ask a question
Topic

Annotation

About: Annotation is a research topic. Over the lifetime, 6719 publications have been published within this topic receiving 203463 citations. The topic is also known as: note & markup.


Papers
More filters
Patent
26 Oct 2004
TL;DR: Semantic Web attributes are transmitted via an electronic message such as an email message as discussed by the authors, and the attributes are extracted from the message by a program agent according to a predetermined plan, and the extracted attributes are saved in storage wherein the storage is optionally an annotation store.
Abstract: Semantic Web attributes are transmitted via an electronic message such as an email message. The attributes are extracted from the message by a program agent according to a predetermined plan. The extracted attributes are saved in storage wherein the storage is optionally an annotation store.

50 citations

Journal ArticleDOI
TL;DR: The use of MAKER-P is reported to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure and demonstrate the utility of MAker-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes.
Abstract: The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes.

50 citations

Journal ArticleDOI
25 Jul 2012-PLOS ONE
TL;DR: This work analyzes the full GO molecular function annotation of UniProtKB proteins and presents and evaluates a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms.
Abstract: Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full GO molecular function annotation of UniProtKB proteins, and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.

50 citations

Patent
20 May 2008
TL;DR: In this article, a method for fast semi-automatic semantic annotation is described, which assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and a SVM engine.
Abstract: A method, apparatus and computer instructions is provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and a SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the efforts required for human annotation is reduced.

49 citations

Proceedings ArticleDOI
05 Jun 2009
TL;DR: It is demonstrated that a combination of instance, annotator and annotation task characteristics are important for developing an accurate estimator, and argued that both correlation coefficient and root mean square error should be used for evaluating annotation cost estimators.
Abstract: We present an empirical investigation of the annotation cost estimation task for active learning in a multi-annotator environment. We present our analysis from two perspectives: selecting examples to be presented to the user for annotation; and evaluating selective sampling strategies when actual annotation cost is not available. We present our results on a movie review classification task with rationale annotations. We demonstrate that a combination of instance, annotator and annotation task characteristics are important for developing an accurate estimator, and argue that both correlation coefficient and root mean square error should be used for evaluating annotation cost estimators.

49 citations


Network Information
Related Topics (5)
Inference
36.8K papers, 1.3M citations
81% related
Deep learning
79.8K papers, 2.1M citations
80% related
Graph (abstract data type)
69.9K papers, 1.2M citations
80% related
Unsupervised learning
22.7K papers, 1M citations
79% related
Cluster analysis
146.5K papers, 2.9M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20231,461
20223,073
2021305
2020401
2019383
2018373