Topic
Annotation
About: Annotation is a research topic. Over its lifetime, 6,719 publications have been published on this topic, receiving 203,463 citations. The topic is also known as: note & markup.
Papers
28 Nov 2011
TL;DR: An effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique is proposed, which exploits the principle of local coordinate coding by learning sparse features, and employs the idea of graph-based weak label regularization to enhance the weak labels of the similar facial images.
Abstract: This paper investigates a retrieval-based annotation paradigm of mining web facial images for automated face annotation. In general, there are two key challenges for such an annotation paradigm. The first challenge is how to efficiently retrieve a short list of most similar facial images from facial image databases, and the second challenge is how to effectively perform annotation by exploiting these similar facial images and their weak labels which are often noisy and incomplete. In this paper, we mainly focus on tackling the second challenge of the retrieval-based face annotation paradigm. In particular, we propose an effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique, which exploits the local coordinate coding principle in learning sparse features, and at the same time employs the graph-based weak label regularization principle to enhance the weak labels of the short list of similar facial images. We present an efficient optimization algorithm to solve the WLRLCC task, and develop an effective sparse reconstruction scheme to perform the final face name annotation. We conduct a set of extensive empirical studies on a large-scale facial image database with a total of 6,000 persons and over 600,000 web facial images, in which encouraging results show that the proposed WLRLCC algorithm significantly boosts the performance of the regular retrieval-based face annotation approaches.
70 citations
05 May 2010
TL;DR: In adapting the extent tagger to new domains, merging the training data from the ACE corpus with annotated data in the new domain provides the best performance.
Abstract: SpatialML is an annotation scheme for marking up references to places in natural language. It covers both named and nominal references to places, grounding them where possible with geo-coordinates, and characterizes relationships among places in terms of a region calculus. A freely available annotation editor has been developed for SpatialML, along with several annotated corpora. Inter-annotator agreement on SpatialML extents is 91.3 F-measure on a corpus of SpatialML-annotated ACE documents released by the Linguistic Data Consortium. Disambiguation agreement on geo-coordinates on ACE is 87.93 F-measure. An automatic tagger for SpatialML extents scores 86.9 F on ACE, while a disambiguator scores 93.0 F on it. Results are also presented for two other corpora. In adapting the extent tagger to new domains, merging the training data from the ACE corpus with annotated data in the new domain provides the best performance.
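The agreement figures above are F-measures. As a rough illustration only, the sketch below computes an F-measure between two annotators' extents, assuming each extent is a (start, end) character span and that only exact span matches count. The abstract does not state the matching criterion, so that rule, the function name `f_measure`, and the sample spans are all hypothetical.

```python
def f_measure(reference, candidate):
    """F-measure between two collections of (start, end) extents.

    Assumption: an extent counts as matched only if both offsets are
    identical in the two annotations (exact-match scoring).
    """
    reference, candidate = set(reference), set(candidate)
    matched = len(reference & candidate)
    if matched == 0:
        return 0.0
    precision = matched / len(candidate)  # share of candidate extents that match
    recall = matched / len(reference)     # share of reference extents that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical extents marked by two annotators on the same document.
annotator_a = [(0, 6), (15, 21), (40, 52)]
annotator_b = [(0, 6), (15, 21), (60, 66), (70, 75)]

score = f_measure(annotator_a, annotator_b)
print(f"{100 * score:.1f} F-measure")
```

Treating one annotator as the reference is a convention; swapping the two arguments exchanges precision and recall but leaves the F-measure unchanged.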
70 citations
TL;DR: The technique identifies missing and inaccurate annotations in existing annotation databases, helping to improve their accuracy, and can be used to analyze and improve the quality of the data in any public or private annotation database.
Abstract: The correct interpretation of any biological experiment depends in an essential way on the accuracy and consistency of the existing annotation databases. Such databases are ubiquitous and used by all life scientists in most experiments. However, it is well known that such databases are incomplete and many annotations may also be incorrect. In this paper we describe a technique that can be used to analyze the semantic content of such annotation databases. Our approach is able to extract implicit semantic relationships between genes and functions. This ability allows us to discover novel functions for known genes. This approach is able to identify missing and inaccurate annotations in existing annotation databases, and thus help improve their accuracy. We used our technique to analyze the current annotations of the human genome. From this body of annotations, we were able to predict 212 additional gene-function assignments. A subsequent literature search found that 138 of these gene-function assignments are supported by existing peer-reviewed papers. An additional 23 assignments have been confirmed in the meantime by the addition of the respective annotations in later releases of the Gene Ontology database. Overall, the 161 confirmed assignments represent 75.95% of the proposed gene-function assignments. Only one of our predictions (0.4%) was contradicted by the existing literature. We could not find any relevant articles for 50 of our predictions (23.58%). The method is independent of the organism and can be used to analyze and improve the quality of the data of any public or private annotation database.
Availability: http://vortex.cs.wayne.edu/papers/semantic_analysis_bioinfo.pdf
69 citations
01 Nov 2012
TL;DR: This paper provides a framework to illustrate how a clustering method can support the annotation task, and shows that a large reduction in both the time to annotate images and number of mouse clicks needed for the annotation is achieved.
Abstract: As more subject-specific image datasets (medical images, birds, etc.) become available, high quality labels associated with these datasets are essential for building statistical models and method evaluation. Obtaining these annotations is time-consuming and thus costly. We propose a clustering method to support this annotation task, making the task easier and more efficient to perform for users. In this paper, we provide a framework to illustrate how a clustering method can support the annotation task. A large reduction in both the time to annotate images and number of mouse clicks needed for the annotation is achieved. By investigating the quality of the annotation, we show that this framework is affected by the particular clustering method used. This, however, does not have a large influence on the overall accuracy and disappears if the data is annotated by multiple persons.
69 citations
01 Jan 2001
TL;DR: The development of a fast, robust and highly usable annotation tool for the annotation of XML-encoded multi-modal language corpora was a major objective of the work presented.
Abstract: We present a tool for the annotation of XML-encoded multi-modal language corpora. Non-hierarchical data is supported by means of standoff annotation. We define base level and suprabase level elements and theory-independent markables for multi-modal annotation and apply them to a cospecification annotation scheme. We also describe how arbitrary annotation schemes can be represented in terms of these elements. Apart from theoretical considerations, however, the development of a fast, robust and highly usable annotation tool was a major objective of the work presented.
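To illustrate the general idea of standoff annotation mentioned in the abstract, the sketch below keeps the base-level words in one XML fragment and stores markables separately, pointing at word ids instead of wrapping the text. This lets two markables overlap without nesting. The element names, attribute names, and sample data are hypothetical and are not taken from the tool's actual file format.

```python
import xml.etree.ElementTree as ET

# Base level (hypothetical format): one element per word, each with an id.
BASE = """<words>
  <word id="w1">Put</word>
  <word id="w2">the</word>
  <word id="w3">red</word>
  <word id="w4">block</word>
  <word id="w5">there</word>
</words>"""

# Standoff level (hypothetical format): markables reference word ids.
# m1 and m2 share w4 but neither contains the other, which inline
# nested XML markup could not express directly.
MARKABLES = """<markables>
  <markable id="m1" span="w2 w3 w4" type="referring_expression"/>
  <markable id="m2" span="w4 w5" type="gesture_aligned"/>
</markables>"""


def resolve(base_xml, markable_xml):
    """Map each markable id to the text of the words it points at."""
    words = {w.get("id"): w.text for w in ET.fromstring(base_xml)}
    resolved = {}
    for markable in ET.fromstring(markable_xml):
        word_ids = markable.get("span").split()
        resolved[markable.get("id")] = " ".join(words[i] for i in word_ids)
    return resolved


spans = resolve(BASE, MARKABLES)
for markable_id, text in spans.items():
    print(markable_id, "->", text)
```

Because the base file is never modified, further annotation layers can be added as additional standoff files that reference the same word ids.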
69 citations