Topic

Annotation

About: Annotation is a research topic. Over its lifetime, 6,719 publications have been published within this topic, receiving 203,463 citations. The topic is also known as: note & markup.


Papers
Proceedings ArticleDOI
28 Nov 2011
TL;DR: An effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique is proposed, which exploits the principle of local coordinate coding by learning sparse features, and employs the idea of graph-based weak label regularization to enhance the weak labels of the similar facial images.
Abstract: This paper investigates a retrieval-based annotation paradigm of mining web facial images for automated face annotation. In general, there are two key challenges for such an annotation paradigm. The first challenge is how to efficiently retrieve a short list of most similar facial images from facial image databases, and the second challenge is how to effectively perform annotation by exploiting these similar facial images and their weak labels, which are often noisy and incomplete. In this paper, we mainly focus on tackling the second challenge of the retrieval-based face annotation paradigm. In particular, we propose an effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique, which exploits the local coordinate coding principle in learning sparse features, and at the same time employs the graph-based weak label regularization principle to enhance the weak labels of the short list of similar facial images. We present an efficient optimization algorithm to solve the WLRLCC task, and develop an effective sparse reconstruction scheme to perform the final face name annotation. We conduct a set of extensive empirical studies on a large-scale facial image database with a total of 6,000 persons and over 600,000 web facial images, in which encouraging results show that the proposed WLRLCC algorithm significantly boosts the performance of the regular retrieval-based face annotation approaches.

70 citations

Journal ArticleDOI
05 May 2010
TL;DR: In adapting the extent tagger to new domains, merging the training data from the ACE corpus with annotated data in the new domain provides the best performance.
Abstract: SpatialML is an annotation scheme for marking up references to places in natural language. It covers both named and nominal references to places, grounding them where possible with geo-coordinates, and characterizes relationships among places in terms of a region calculus. A freely available annotation editor has been developed for SpatialML, along with several annotated corpora. Inter-annotator agreement on SpatialML extents is 91.3 F-measure on a corpus of SpatialML-annotated ACE documents released by the Linguistic Data Consortium. Disambiguation agreement on geo-coordinates on ACE is 87.93 F-measure. An automatic tagger for SpatialML extents scores 86.9 F on ACE, while a disambiguator scores 93.0 F on it. Results are also presented for two other corpora. In adapting the extent tagger to new domains, merging the training data from the ACE corpus with annotated data in the new domain provides the best performance.

70 citations

Journal ArticleDOI
Purvesh Khatri, Bogdan Done, Archana Rao, Arina Done, Sorin Draghici
TL;DR: The technique is able to identify missing and inaccurate annotations in existing annotation databases, and thus help improve their accuracy; it can be used to analyze and improve the quality of the data of any public or private annotation database.
Abstract: The correct interpretation of any biological experiment depends in an essential way on the accuracy and consistency of the existing annotation databases. Such databases are ubiquitous and used by all life scientists in most experiments. However, it is well known that such databases are incomplete and many annotations may also be incorrect. In this paper we describe a technique that can be used to analyze the semantic content of such annotation databases. Our approach is able to extract implicit semantic relationships between genes and functions. This ability allows us to discover novel functions for known genes. This approach is able to identify missing and inaccurate annotations in existing annotation databases, and thus help improve their accuracy. We used our technique to analyze the current annotations of the human genome. From this body of annotations, we were able to predict 212 additional gene--function assignments. A subsequent literature search found that 138 of these gene--function assignments are supported by existing peer-reviewed papers. An additional 23 assignments have been confirmed in the meantime by the addition of the respective annotations in later releases of the Gene Ontology database. Overall, the 161 confirmed assignments represent 75.95% of the proposed gene--function assignments. Only one of our predictions (0.4%) was contradicted by the existing literature. We could not find any relevant articles for 50 of our predictions (23.58%). The method is independent of the organism and can be used to analyze and improve the quality of the data of any public or private annotation database. Availability: http://vortex.cs.wayne.edu/papers/semantic_analysis_bioinfo.pdf Contact: [email protected]

69 citations

Proceedings Article
01 Nov 2012
TL;DR: This paper provides a framework to illustrate how a clustering method can support the annotation task, and shows that a large reduction in both the time to annotate images and number of mouse clicks needed for the annotation is achieved.
Abstract: As more subject-specific image datasets (medical images, birds, etc.) become available, high quality labels associated with these datasets are essential for building statistical models and method evaluation. Obtaining these annotations is a time-consuming and thus costly business. We propose a clustering method to support this annotation task, making the task easier and more efficient to perform for users. In this paper, we provide a framework to illustrate how a clustering method can support the annotation task. A large reduction in both the time to annotate images and number of mouse clicks needed for the annotation is achieved. By investigating the quality of the annotation, we show that this framework is affected by the particular clustering method used. This, however, does not have a large influence on the overall accuracy and disappears if the data is annotated by multiple persons.

69 citations
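
The clustering-supported annotation idea in the abstract above can be illustrated with a minimal sketch. It assumes that cluster assignments already exist (the paper's actual clustering method and user interface are not described here) and that the annotator gives one label per cluster, which is then copied to every member. The names `annotate_by_cluster` and `ask_label` are hypothetical.

```python
from collections import defaultdict


def annotate_by_cluster(cluster_ids, ask_label):
    """Propagate one annotator-supplied label per cluster to all its members.

    cluster_ids: list where cluster_ids[i] is the cluster of image i
                 (assumed to come from some clustering step, not shown).
    ask_label:   hypothetical callback standing in for the human annotator;
                 called once per cluster with (cluster_id, member_indices).
    Returns (labels, queries): per-image labels and the number of annotator
    queries, a rough proxy for the clicks needed.
    """
    members = defaultdict(list)
    for index, cluster_id in enumerate(cluster_ids):
        members[cluster_id].append(index)

    labels = [None] * len(cluster_ids)
    queries = 0
    for cluster_id, indices in members.items():
        label = ask_label(cluster_id, indices)  # one query per cluster
        queries += 1
        for index in indices:
            labels[index] = label
    return labels, queries


# Toy example: six images in three clusters, so three queries instead of six.
example_clusters = [0, 0, 1, 1, 1, 2]
answers = {0: "sparrow", 1: "robin", 2: "finch"}  # made-up labels
example_labels, example_queries = annotate_by_cluster(
    example_clusters, lambda cluster_id, indices: answers[cluster_id]
)
```

In this simplified form an impure cluster propagates a wrong label to some of its members, which is one way the choice of clustering method could affect annotation quality, as the abstract notes.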

Proceedings Article
01 Jan 2001
TL;DR: The development of a fast, robust and highly usable annotation tool for the annotation of XML-encoded multi-modal language corpora was a major objective of the work presented.
Abstract: We present a tool for the annotation of XML-encoded multi-modal language corpora. Non-hierarchical data is supported by means of standoff annotation. We define base level and suprabase level elements and theory-independent markables for multi-modal annotation and apply them to a cospecification annotation scheme. We also describe how arbitrary annotation schemes can be represented in terms of these elements. Apart from theoretical considerations, however, the development of a fast, robust and highly usable annotation tool was a major objective of the work presented.

69 citations
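
To illustrate why standoff annotation, as mentioned in the abstract above, can hold non-hierarchical data, here is a minimal sketch. It is not the tool's actual data model: the `Markable` class, its fields, and the layer names are assumptions made for illustration. Annotations point at base-level elements (here, word indices) instead of wrapping them, so two spans may cross without breaking well-formedness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Markable:
    """Hypothetical standoff annotation over a list of base-level words."""
    layer: str   # annotation layer name (illustrative)
    start: int   # index of first covered word, inclusive
    end: int     # index one past the last covered word, exclusive


def text_of(words, markable):
    """Return the words covered by a markable, joined by spaces."""
    return " ".join(words[markable.start:markable.end])


def crosses(a, b):
    """True if two markables overlap without one containing the other."""
    overlap = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return overlap and not (a_in_b or b_in_a)


# Base level: the words. The annotations live apart from them.
words = ["put", "that", "red", "cup", "over", "there"]
gesture = Markable("gesture", 1, 4)  # covers "that red cup"
prosody = Markable("prosody", 3, 6)  # covers "cup over there"
```

Here `gesture` and `prosody` cross at "cup". Inline XML elements could not express this overlap as properly nested tags, whereas the standoff records simply coexist.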


Network Information
Related Topics (5)
Inference
36.8K papers, 1.3M citations
81% related
Deep learning
79.8K papers, 2.1M citations
80% related
Graph (abstract data type)
69.9K papers, 1.2M citations
80% related
Unsupervised learning
22.7K papers, 1M citations
79% related
Cluster analysis
146.5K papers, 2.9M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    1,461
2022    3,073
2021    305
2020    401
2019    383
2018    373