Annotation

About: Annotation is a research topic. Over its lifetime, 6,719 publications have appeared within this topic, receiving 203,463 citations. The topic is also known as: note & markup.


Papers
Proceedings Article
07 Dec 2009
TL;DR: A probabilistic topic model for analyzing and extracting content-related annotations from noisy annotated discrete data such as web pages stored in social bookmarking services, in which the annotations are assumed to originate from topics that generated the content or from a general distribution unrelated to the content.
Abstract: We propose a probabilistic topic model for analyzing and extracting content-related annotations from noisy annotated discrete data such as web pages stored in social bookmarking services. In these services, since users can attach annotations freely, some annotations do not describe the semantics of the content, thus they are noisy, i.e. not content-related. The extraction of content-related annotations can be used as a preprocessing step in machine learning tasks such as text classification and image recognition, or can improve information retrieval performance. The proposed model is a generative model for content and annotations, in which the annotations are assumed to originate either from topics that generated the content or from a general distribution unrelated to the content. We demonstrate the effectiveness of the proposed method by using synthetic data and real social annotation data for text and images.

44 citations

Proceedings Article
10 Nov 2017
TL;DR: This study examines inter-annotator agreement in multi-class, multi-label sentiment annotation of messages, using several annotation agreement measures, as well as statistical analysis and Machine Learning to assess the resulting annotations.
Abstract: Manual text annotation is an essential part of Big Text analytics. Although annotators work with limited parts of data sets, their results are extrapolated by automated text classification and affect the final classification results. Reliability of annotations and adequacy of assigned labels are especially important in the case of sentiment annotations. In the current study we examine inter-annotator agreement in multi-class, multi-label sentiment annotation of messages. We used several annotation agreement measures, as well as statistical analysis and Machine Learning to assess the resulting annotations.

44 citations
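
The abstract above does not name the agreement measures it used. As a purely illustrative sketch, one widely used measure for two annotators is Cohen's kappa, which compares observed agreement with the agreement expected by chance. The function name `cohen_kappa` and the sample labels are hypothetical, and the sketch assumes one label per message, which is simpler than the multi-label setting the study describes.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who each assign one label per item.

    Illustrative only: the listed study does not state which measures it used,
    and it deals with multi-label annotation, which this sketch does not cover.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability both pick the same label independently,
    # estimated from each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    annotator_1 = ["pos", "pos", "neg", "neg"]
    annotator_2 = ["pos", "neg", "neg", "neg"]
    print(cohen_kappa(annotator_1, annotator_2))
```

A value near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance.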

Journal Article
TL;DR: In the past year, the authors at DDBJ collected and released 1 066 084 entries or 718 072 425 bases including the whole chromosome 22 of chimpanzee, the whole-genome shotgun sequences of silkworm and various others.
Abstract: In the past year, we at DDBJ (DNA Data Bank of Japan; http://www.ddbj.nig.ac.jp) collected and released 1 066 084 entries or 718 072 425 bases including the whole chromosome 22 of chimpanzee, the whole-genome shotgun sequences of silkworm and various others. On the other hand, we hosted workshops for human full-length cDNA annotation and participated in jamborees of mouse full-length cDNA annotation. The annotated data are made public at DDBJ. We are also in collaboration with a RIKEN team to accept and release the CAGE (Cap Analysis Gene Expression) data under a new category, MGA (Mass Sequences for Genome Annotation). The data will be useful for studying gene expression control in many aspects.

44 citations

Book Chapter
30 Nov 2006
TL;DR: A tool-supported process for semantic annotation of documents based on techniques and technologies traditionally used in software analysis and reverse engineering for large-scale legacy code bases is elaborated.
Abstract: The Web is the greatest information source in human history. Unfortunately, mining knowledge out of this source is a laborious and error-prone task. Many researchers believe that a solution to the problem can be founded on semantic annotations that need to be inserted in web-based documents and guide information extraction and knowledge mining. In this paper, we further elaborate a tool-supported process for semantic annotation of documents based on techniques and technologies traditionally used in software analysis and reverse engineering for large-scale legacy code bases. The outcomes of the paper include an experimental evaluation framework and empirical results based on two case studies adopted from the Tourism sector. The conclusions suggest that our approach can facilitate the semi-automatic annotation of large document bases.

44 citations

Book Chapter
05 Nov 2006
TL;DR: It is shown how information can be inferred about the semantics of operation parameters based on their connections to other (annotated) operation parameters within tried-and-tested workflows.
Abstract: Semantic annotations of web services can facilitate the discovery of services, as well as their composition into workflows. At present, however, the practical utility of such annotations is limited by the small number of service annotations available for general use. Resources for manual annotation are scarce, and therefore some means is required by which services can be automatically (or semi-automatically) annotated. In this paper, we show how information can be inferred about the semantics of operation parameters based on their connections to other (annotated) operation parameters within tried-and-tested workflows. In an open-world context, we can infer only constraints on the semantics of parameters, but these so-called loose annotations are still of value in detecting errors within workflows, annotations and ontologies, as well as in simplifying the manual annotation task.

44 citations


Network Information
Related Topics (5)
Inference: 36.8K papers, 1.3M citations, 81% related
Deep learning: 79.8K papers, 2.1M citations, 80% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 80% related
Unsupervised learning: 22.7K papers, 1M citations, 79% related
Cluster analysis: 146.5K papers, 2.9M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    1,461
2022    3,073
2021    305
2020    401
2019    383
2018    373