Topic

Annotation

About: Annotation is a research topic. Over its lifetime, 6,719 publications have been published within this topic, receiving 203,463 citations. The topic is also known as: note & markup.


Papers
Proceedings Article
15 Aug 2005
TL;DR: This work shows that, under the database centric probabilistic model, optimal annotation and retrieval can be implemented with algorithms that are conceptually simple, computationally efficient, and do not require prior semantic segmentation of training images.
Abstract: We introduce a new model for semantic annotation and retrieval from image databases. The new model is based on a probabilistic formulation that poses annotation and retrieval as classification problems, and produces solutions that are optimal in the minimum probability of error sense. It is also database centric, by establishing a one-to-one mapping between semantic classes and the groups of database images that share the associated semantic labels. In this work we show that, under the database centric probabilistic model, optimal annotation and retrieval can be implemented with algorithms that are conceptually simple, computationally efficient, and do not require prior semantic segmentation of training images. Due to its simplicity, the annotation and retrieval architecture is also amenable to sophisticated parameter tuning, a property that is exploited to investigate the role of feature selection in the design of optimal annotation and retrieval systems. Finally, we demonstrate the benefits of simply establishing a one-to-one mapping between keywords and the states of the semantic classification problem over the more complex, and currently popular, joint modeling of keyword and visual feature distributions. The database centric probabilistic retrieval model is compared to existing semantic labeling and retrieval methods, and shown to achieve higher accuracy than the previously best published results, at a fraction of their computational cost.
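
The abstract frames annotation as a classification problem solved in the minimum probability of error sense, which corresponds to picking the semantic classes with the highest posterior probability. The following is a minimal sketch of that decision rule only, not the authors' implementation: the 1-D Gaussian class models, the feature values, the priors, and the names `log_gaussian` and `annotate` are all assumptions made for illustration.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian; stands in for a learned class-conditional model."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def annotate(features, class_models, priors, top_k=2):
    """Rank semantic classes by log posterior and return the top_k labels.

    features: scalar feature values from one image, treated as independent
              samples (an assumption of this sketch).
    class_models: {label: (mean, var)}, hypothetical per-class parameters.
    priors: {label: P(label)}.
    """
    scores = {}
    for label, (mean, var) in class_models.items():
        log_likelihood = sum(log_gaussian(x, mean, var) for x in features)
        # Bayes rule up to a constant: log P(label | x) = log P(x | label) + log P(label) + const
        scores[label] = math.log(priors[label]) + log_likelihood
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy class models: one group of database images per semantic label.
models = {"sky": (0.9, 0.01), "grass": (0.4, 0.01), "water": (0.7, 0.01)}
priors = {"sky": 1 / 3, "grass": 1 / 3, "water": 1 / 3}

labels = annotate([0.88, 0.91, 0.86], models, priors, top_k=2)
print(labels)
```

Retrieval under the same assumptions would reverse the roles: for a query keyword, rank the database images by the posterior of that keyword's class.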

54 citations

Journal Article
TL;DR: This paper exploits social annotations and proposes a novel framework that simultaneously considers user and query relevance to learn personalized image search, and introduces user-specific topic modeling to map the query relevance and user preference into the same user-specific topic space.
Abstract: Increasingly developed social sharing websites like Flickr and YouTube allow users to create, share, annotate, and comment on media. The large-scale user-generated metadata not only help users share and organize multimedia content, but also provide useful information to improve media retrieval and management. Personalized search is one such example, where the web search experience is improved by generating the returned list according to the modified user search intents. In this paper, we exploit the social annotations and propose a novel framework that simultaneously considers user and query relevance to learn personalized image search. The basic premise is to embed the user preference and query-related search intent into user-specific topic spaces. Since the users' original annotation is too sparse for topic modeling, we need to enrich users' annotation pool before constructing the user-specific topic spaces. The proposed framework contains two components: (1) a ranking-based multicorrelation tensor factorization model is proposed to perform annotation prediction, which is considered as users' potential annotations for the images; (2) we introduce user-specific topic modeling to map the query relevance and user preference into the same user-specific topic space. For performance evaluation, two resources involved with users' social activities are employed. Experiments on a large-scale Flickr dataset demonstrate the effectiveness of the proposed method.

54 citations

Journal Article
TL;DR: This work describes the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon, EST sequencing, clustering, and annotation by assigning putative function to the transcripts; the sequences represent 97% of all sequences submitted to GenBank from the pre-smoltification stage.
Abstract: To identify as many different transcripts/genes in the Atlantic salmon genome as possible, it is crucial to acquire good cDNA libraries from different tissues and developmental stages, their relevant sequences (ESTs or full length sequences) and attempt to predict function. Such libraries allow identification of a large number of different transcripts and can provide valuable information on genes expressed in a particular tissue at a specific developmental stage. This data is important in constructing a microarray chip, identifying SNPs in coding regions, and for future identification of genes in the whole genome sequence. An important factor that determines the usefulness of generated data for biologists is efficient data access. Public searchable databases play a crucial role in providing such service. Twenty-three Atlantic salmon cDNA libraries were constructed from 15 tissues, yielding nearly 155,000 clones. From these libraries 58,109 ESTs were generated, of which 57,212 were used for contig assembly. Following deletion of mitochondrial sequences 55,118 EST sequences were submitted to GenBank. In all, 20,019 unique sequences, consisting of 6,424 contigs and 13,595 singlets, were generated. The Norwegian Salmon Genome Project Database has been constructed and annotation performed by the annotation transfer approach. Annotation was successful for 50.3% (10,075) of the sequences and 6,113 sequences (30.5%) were annotated with Gene Ontology terms for molecular function, biological process and cellular component. We describe the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon (Salmo salar), EST sequencing, clustering, and annotation by assigning putative function to the transcripts. These sequences represent 97% of all sequences submitted to GenBank from the pre-smoltification stage. The data has been grouped into datasets according to its source and type of annotation. Various data query options are offered including searches on function assignments and Gene Ontology terms. Data delivery options include summaries for the datasets and their annotations, detailed self-explanatory annotations, and access to the original BLAST results and Gene Ontology annotation trees. Potential presence of a relatively high number of immune-related genes in the dataset was shown by annotation searches.

54 citations

Journal Article
TL;DR: This is the first attempt to assess cross-species GO annotation consistency, and compares annotation sets utilizing the hierarchical structure of the GO to compare GO annotations between orthologous gene pairs.
Abstract: Motivation: The Gene Ontology (GO) is widely used to annotate molecular attributes of genes and gene products. Multiple groups undertaking functional annotations of genomes contribute their annotation sets to the GO database resource and these data are subsequently used in comparative functional analysis research. Although GO curators adhere to the same protocols and standards while assigning GO annotations, the specific procedure followed by each annotation group can vary. Since differences in application of annotation standards would dilute the effectiveness of comparative analysis, methods for assessing annotation consistency are essential. The development of methodologies that are broadly applicable for the assessment of GO annotation consistency is an important issue for the comparative genomics community. Results: We have developed a methodology for assessing the consistency of GO annotations provided by different annotation groups. The method is completely general and can be applied to compare any two sets of GO annotations. This is the first attempt to assess cross-species GO annotation consistency. Our method compares annotation sets utilizing the hierarchical structure of the GO to compare GO annotations between orthologous gene pairs. The method produces a report on the annotation consistency and inconsistency for each orthologous pair. We present results obtained by comparing GO annotations for mouse and human gene sets. Availability: The complete current MGI_GOA GO annotation consistency report is available online at http://www.spatial.maine.edu/~mdolan/ Contact: mdolan@informatics.jax.org
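
The abstract says the method uses the hierarchical structure of the GO to compare annotations between orthologous gene pairs, without spelling out the procedure. One plausible reading, sketched below under stated assumptions, is to treat two terms as consistent when they are identical or when one is an ancestor of the other. The three-way classification, the function names, and the simplified toy hierarchy (not the real GO graph) are hypothetical and are not taken from the paper.

```python
def ancestors(term, parents):
    """Return the term plus all of its ancestors, given a map {term: [parent terms]}."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def compare_pair(terms_a, terms_b, parents):
    """Classify each annotation of gene A against the annotations of its ortholog B."""
    report = {}
    for ta in terms_a:
        if ta in terms_b:
            report[ta] = "exact"
        elif any(ta in ancestors(tb, parents) or tb in ancestors(ta, parents)
                 for tb in terms_b):
            # One term is a more specific or more general version of the other.
            report[ta] = "same lineage"
        else:
            report[ta] = "no match"
    return report

# Simplified toy hierarchy for illustration only.
parents = {
    "protein kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "DNA binding": ["binding"],
}

mouse_terms = {"protein kinase activity", "DNA binding"}
human_terms = {"kinase activity"}
report = compare_pair(mouse_terms, human_terms, parents)
print(report)
```

In a real comparison the root terms would need special handling, since every term shares a lineage with the root of its ontology.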

54 citations

Proceedings Article
09 Jul 2007
TL;DR: This work demonstrates that the iterative annotation algorithm can incorporate the keyword correlations and the region matching approaches handily to improve the image annotation significantly and outperforms the state-of-the-art continuous feature model MBRM with recall and precision improving 21% and 11% respectively.
Abstract: Automatic image annotation labels image content with semantic keywords. For instance, the Relevance Model estimates the joint probability of the keyword and the image [3]. Most of the previous annotation methods assign keywords separately. Recently the correlation between annotated keywords has been used to improve image annotation. However, directly estimating the joint probability of a set of keywords and the unlabeled image is computationally prohibitive. To avoid the computational difficulty we propose a heuristic greedy iterative algorithm to estimate the probability of a keyword subset being the caption of an image. In our approach, the correlations between keywords are analyzed by "Automatic Local Analysis" of text information retrieval. In addition, a new image generation probability estimation method is proposed based on region matching. We demonstrate that our iterative annotation algorithm can incorporate the keyword correlations and the region matching approaches handily to improve the image annotation significantly. The experiments on the ECCV2002 [2] benchmark show that our method outperforms the state-of-the-art continuous feature model MBRM with recall and precision improving 21% and 11% respectively.
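
The greedy iterative idea described in the abstract can be illustrated with a small sketch: instead of scoring every keyword subset jointly, keywords are added one at a time, each chosen to fit both the image and the keywords already selected. The scoring rule below (image relevance multiplied by average correlation with the chosen keywords) is an assumption made for illustration, as are the function name `greedy_caption` and all numeric values; the abstract does not give the paper's actual probability estimate.

```python
def greedy_caption(relevance, correlation, size=3):
    """Greedily build a keyword set of the given size.

    relevance: {keyword: score of the keyword for the image} (hypothetical values).
    correlation: {(kw_a, kw_b): co-occurrence strength in [0, 1]}, looked up symmetrically.
    """
    def corr(a, b):
        return correlation.get((a, b), correlation.get((b, a), 0.0))

    chosen = []
    candidates = set(relevance)
    while candidates and len(chosen) < size:
        def gain(keyword):
            # Average correlation with the keywords chosen so far; neutral 1.0 for the first pick.
            if not chosen:
                return relevance[keyword]
            boost = sum(corr(keyword, c) for c in chosen) / len(chosen)
            return relevance[keyword] * boost
        best = max(sorted(candidates), key=gain)  # sorted() keeps tie-breaking deterministic
        chosen.append(best)
        candidates.remove(best)
    return chosen

relevance = {"tiger": 0.6, "grass": 0.5, "car": 0.55, "forest": 0.3}
correlation = {
    ("tiger", "grass"): 0.8,
    ("tiger", "forest"): 0.9,
    ("tiger", "car"): 0.05,
    ("grass", "forest"): 0.7,
}

caption = greedy_caption(relevance, correlation, size=3)
print(caption)
```

In this toy run "car" has the second-highest individual relevance but is never selected, because it rarely co-occurs with the keywords already chosen. Assigning keywords separately would have included it.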

54 citations


Network Information
Related Topics (5)
Inference: 36.8K papers, 1.3M citations, 81% related
Deep learning: 79.8K papers, 2.1M citations, 80% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 80% related
Unsupervised learning: 22.7K papers, 1M citations, 79% related
Cluster analysis: 146.5K papers, 2.9M citations, 78% related
Performance Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 1,461
2022: 3,073
2021: 305
2020: 401
2019: 383
2018: 373