scispace - formally typeset
Topic

Annotation

About: Annotation is a research topic. Over its lifetime, 6,719 publications have been published within this topic, receiving 203,463 citations. The topic is also known as: note & markup.


Papers
01 Jan 2005
TL;DR: An inter-annotator agreement test indicated that the writing style, rather than the content, of the research abstracts is the source of difficulty in tree annotation, and that annotation can be done reliably by linguists without much knowledge of biology, given appropriate guidelines on linguistic phenomena particular to scientific texts.
Abstract: A linguistically annotated corpus based on texts in the biomedical domain has been constructed to tune natural language processing (NLP) tools for bio-text mining. As the focus of information extraction shifts from "nominal" information such as named entities to "verbal" information such as the function and interaction of substances, the application of parsers has become one of the key technologies, and a corpus annotated for the syntactic structure of sentences is therefore in demand. A subset of the GENIA corpus consisting of 500 MEDLINE abstracts has been annotated for syntactic structure in an XML-based format based on the Penn Treebank II (PTB) scheme. An inter-annotator agreement test indicated that the writing style, rather than the content, of the research abstracts is the source of difficulty in tree annotation, and that annotation can be done reliably by linguists without much knowledge of biology, given appropriate guidelines on linguistic phenomena particular to scientific texts.

147 citations

Proceedings Article
17 Jun 2007
TL;DR: A content-based image annotation refinement (CIAR) algorithm is proposed to re-rank the candidate annotations of images and leverages both corpus information and the content feature of a query image.
Abstract: Automatic image annotation has been an active research topic due to its great importance in image retrieval and management. However, results of the state-of-the-art image annotation methods are often unsatisfactory. Despite continuous efforts in inventing new annotation algorithms, it would be advantageous to develop a dedicated approach that could refine imprecise annotations. In this paper, a novel approach to automatically refining the original annotations of images is proposed. For a query image, an existing image annotation method is first employed to obtain a set of candidate annotations. Then, the candidate annotations are re-ranked and only the top ones are reserved as the final annotations. By formulating the annotation refinement process as a Markov process and defining the candidate annotations as the states of a Markov chain, a content-based image annotation refinement (CIAR) algorithm is proposed to re-rank the candidate annotations. It leverages both corpus information and the content feature of a query image. Experimental results on a typical Corel dataset show not only the validity of the refinement, but also the superiority of the proposed algorithm over existing ones.
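The idea of treating candidate annotations as states of a Markov chain can be illustrated with a minimal sketch. This is not the CIAR algorithm from the paper, whose transition model is not given here; it is a generic random walk with restart. The function name `rerank_annotations`, the `similarity` scores (standing in for corpus information), the `prior` scores (standing in for the content feature of the query image), and the `damping` value are all assumptions made for illustration.

```python
def rerank_annotations(candidates, similarity, prior, damping=0.85, iters=100):
    """Re-rank candidate labels by a random walk with restart (illustrative only).

    candidates: list of label strings from some existing annotation method.
    similarity: dict mapping (label_a, label_b) to a non-negative score,
                a hypothetical stand-in for corpus co-occurrence.
    prior:      dict mapping label to a non-negative score, a hypothetical
                stand-in for evidence from the query image's content.
    """
    n = len(candidates)

    # Row-normalise similarities into transition probabilities.
    trans = []
    for a in candidates:
        row = [similarity.get((a, b), 0.0) for b in candidates]
        total = sum(row)
        trans.append([v / total for v in row] if total > 0 else [1.0 / n] * n)

    # Normalise the prior into a restart distribution.
    total_prior = sum(prior.get(c, 0.0) for c in candidates)
    if total_prior > 0:
        restart = [prior.get(c, 0.0) / total_prior for c in candidates]
    else:
        restart = [1.0 / n] * n

    # Power iteration towards the stationary distribution of the chain.
    rank = restart[:]
    for _ in range(iters):
        rank = [
            damping * sum(rank[i] * trans[i][j] for i in range(n))
            + (1.0 - damping) * restart[j]
            for j in range(n)
        ]

    # Highest stationary probability first.
    return sorted(zip(candidates, rank), key=lambda pair: pair[1], reverse=True)


labels = ["sky", "cloud", "car"]
sim = {
    ("sky", "cloud"): 5.0, ("cloud", "sky"): 5.0,
    ("sky", "car"): 1.0, ("car", "sky"): 1.0,
    ("cloud", "car"): 1.0, ("car", "cloud"): 1.0,
}
uniform_prior = {label: 1.0 for label in labels}
ranking = rerank_annotations(labels, sim, uniform_prior)
top_two = [label for label, _ in ranking[:2]]
```

In this toy input, "sky" and "cloud" reinforce each other, so the weakly connected "car" ends up last; keeping only `top_two` mirrors the step of retaining the top-ranked candidates as the final annotations.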

144 citations

Journal Article
TL;DR: This paper describes an approach where a named-entity recognition system produces a first annotation and annotators revise this annotation using a web-based interface, showing that the inter-annotator agreement is much better than the agreement with the system-provided annotations.
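Inter-annotator agreement is commonly summarised with a chance-corrected statistic. The sketch below computes Cohen's kappa over two annotators' label sequences; the listing does not say which agreement measure the paper used, so the choice of kappa, the function name `cohen_kappa`, and the example labels are assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labelled independently
    # according to their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical token-level entity labels from two annotators.
annotator_1 = ["PER", "O", "O", "ORG"]
annotator_2 = ["PER", "O", "ORG", "ORG"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

The same function can compare each human annotator against the system's first-pass output, which is the kind of contrast the summary above describes.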

144 citations

01 Jan 2003
TL;DR: A new version of VideoAnnEx, a.k.a. the IBM MPEG-7 Annotation Tool, is developed for collaborative multimedia annotation tasks in a distributed environment, and a forum to collaboratively annotate the NIST TRECVID 2003 development set with semantic labels is proposed.
Abstract: We developed a new version of VideoAnnEx, a.k.a. the IBM MPEG-7 Annotation Tool, for collaborative multimedia annotation tasks in a distributed environment. VideoAnnEx assists authors in the task of annotating video sequences with MPEG-7 metadata. Each shot in the video sequence can be annotated with static scene descriptions, key object descriptions, event descriptions, and other lexicon sets. The annotated descriptions are associated with each video shot or with regions in the keyframes, and are stored as an MPEG-7 XML file. We proposed a forum to collaboratively annotate the NIST TRECVID 2003 development set with semantic labels. From April to July 2003, 111 researchers from 23 institutes worked together to associate 198K ground-truth labels (433K after hierarchy propagation) with 62.2 hours of video. This large set of valuable ground-truth data is publicly available to the research community, especially for the fields of multimedia indexing and retrieval, semantic understanding, and supervised machine learning.

143 citations

Proceedings Article
24 Sep 2007
TL;DR: This work developed Kodak's consumer video benchmark data set, which includes a significant number of videos from actual users, a rich lexicon that accommodates consumers' needs, and the annotation of a subset of concepts over the entire video data set.
Abstract: Semantic indexing of images and videos in the consumer domain has become a very important issue for both research and practical applications. In this work we developed Kodak's consumer video benchmark data set, which includes (1) a significant number of videos from actual users, (2) a rich lexicon that accommodates consumers' needs, and (3) the annotation of a subset of concepts over the entire video data set. To the best of our knowledge, this is the first systematic work in the consumer domain aimed at the definition of a large lexicon, the construction of a large benchmark data set, and the annotation of videos in a rigorous fashion. Such an effort will have a significant impact by providing a sound foundation for developing and evaluating large-scale learning-based semantic indexing/annotation techniques in the consumer domain.

141 citations


Network Information
Related Topics (5)
Inference
36.8K papers, 1.3M citations
81% related
Deep learning
79.8K papers, 2.1M citations
80% related
Graph (abstract data type)
69.9K papers, 1.2M citations
80% related
Unsupervised learning
22.7K papers, 1M citations
79% related
Cluster analysis
146.5K papers, 2.9M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    1,461
2022    3,073
2021    305
2020    401
2019    383
2018    373