scispace - formally typeset
Search or ask a question
Topic

Annotation

About: Annotation is a research topic. Over the lifetime, 6719 publications have been published within this topic receiving 203463 citations. The topic is also known as: note & markup.


Papers
More filters
Patent
Steve Nelson1, Jason Harris1
15 May 2003
TL;DR: In this article, an annotation management system for providing real-time annotations for media content during a videoconference session is provided, which includes a media management server configured to manage media data and annotation data for distribution to participants of the videocon conference session.
Abstract: An annotation management system for providing real-time annotations for media content during a videoconference session is provided. The annotation management system includes a media management server configured to manage media data and annotation data for distribution to participants of the videoconference session. A storage server in communication with the media management server is configured to store the media data and the annotation data. An event database in communication with the media management server is configured to capture events associated with the annotation data. A media analysis server is in communication with the media management server, the event database, and the storage server. The media analysis server is configured to associate the stored annotation data with the captured events to enable reconstruction of the videoconference session based on the captured events. A videoconference system, a computer readable medium, a graphical user interface, and a method are also included.

121 citations

Journal ArticleDOI
TL;DR: COCO-CN as mentioned in this paper is a dataset enriched with manually written Chinese sentences and tags, which provides a unified and challenging platform for cross-lingual image tagging, captioning, and retrieval.
Abstract: This paper contributes to cross-lingual image annotation and retrieval in terms of data and baseline methods. We propose COCO-CN , a novel dataset enriching MS-COCO with manually written Chinese sentences and tags. For effective annotation acquisition, we develop a recommendation-assisted collective annotation system, automatically providing an annotator with several tags and sentences deemed to be relevant with respect to the pictorial content. Having 20 342 images annotated with 27 218 Chinese sentences and 70 993 tags, COCO-CN is currently the largest Chinese–English dataset that provides a unified and challenging platform for cross-lingual image tagging, captioning, and retrieval. We develop conceptually simple yet effective methods per task for learning from cross-lingual resources. Extensive experiments on the three tasks justify the viability of the proposed dataset and methods. Data and code are publicly available at https://github.com/li-xirong/coco-cn .

121 citations

Proceedings Article
01 Nov 2008
TL;DR: The creators of signed language corpora should prioritize annotation above transcription, and ensure that signs are identified using unique gloss-based annotations, and the whole rationale for corpus-creation is undermined.
Abstract: The essential characteristic of a signed language corpus is that it has been annotated, and not, contrary to the practice of many signed language researchers, that it has been transcribed. Annotations are necessary for corpus-based investigations of signed or spoken languages. Multi-media annotation software can now be used to transform a recording into a machine-readable text without it first being necessary to transcribe the text, provided that linguistic units are uniquely identified and annotations subsequently appended to these units. These unique identifiers are here referred to as ID-glosses. The use of ID-glosses is only possible if a reference lexical database (i.e., dictionary) exists as the result of prior foundation research into the lexicon. In short, the creators of signed language corpora should prioritize annotation above transcription, and ensure that signs are identified using unique gloss-based annotations. Without this the whole rationale for corpus-creation is undermined.

121 citations

Journal ArticleDOI
TL;DR: This work builds a prototype system for ontology based annotation and indexing of biomedical data that enables ontology-based querying and integration of tissue and gene expression microarray data, and enables identification of datasets on specific diseases across both repositories.
Abstract: The volume of publicly available genomic scale data is increasing. Genomic datasets in public repositories are annotated with free-text fields describing the pathological state of the studied sample. These annotations are not mapped to concepts in any ontology, making it difficult to integrate these datasets across repositories. We have previously developed methods to map text-annotations of tissue microarrays to concepts in the NCI thesaurus and SNOMED-CT. In this work we generalize our methods to map text annotations of gene expression datasets to concepts in the UMLS. We demonstrate the utility of our methods by processing annotations of datasets in the Gene Expression Omnibus. We demonstrate that we enable ontology-based querying and integration of tissue and gene expression microarray data. We enable identification of datasets on specific diseases across both repositories. Our approach provides the basis for ontology-driven data integration for translational research on gene and protein expression data. Based on this work we have built a prototype system for ontology based annotation and indexing of biomedical data. The system processes the text metadata of diverse resource elements such as gene expression data sets, descriptions of radiology images, clinical-trial reports, and PubMed article abstracts to annotate and index them with concepts from appropriate ontologies. The key functionality of this system is to enable users to locate biomedical data resources related to particular ontology concepts.

121 citations

Journal ArticleDOI
TL;DR: How the gene prediction in the research field increasingly shifts from methods that typically exploited one or two types of data to more integrative approaches that simultaneously deal with various experimental, statistical, or other in silico evidence is discussed.
Abstract: In this era of whole genome sequencing, reliable genome annotations (identification of functional regions) are the cornerstones for many subsequent analyses. Not only is careful annotation important for studying the gene and gene family content of a genome and its host, but also for wide-scale transcriptome and proteome analyses attempting to de- scribe a certain biological process or to get a global picture of a cell's behavior. Although the number of sequenced ge- nomes is increasing thanks to the application of new technologies, genome-wide analyses will critically depend on the quality of the genome annotations. However, the annotation process is more complicated in the plant field than in the animal field because of the limited funding that leads to much fewer experimental data and less annotation expertise. This situation calls for highly automated annotation platforms that can make the best use of all available data, experimental or not. We discuss how the gene prediction (the process of predicting protein gene structures in genomic sequences) research field increasingly shifts from methods that typically exploited one or two types of data to more integrative approaches that simultaneously deal with various experimental, statistical, or other in silico evidence. We illustrate the importance of inte- grative approaches for producing high-quality automatic annotations of genomes of plants and algae as well as of fungi that live in close association with plants using the platform EuGene as an example.

121 citations


Network Information
Related Topics (5)
Inference
36.8K papers, 1.3M citations
81% related
Deep learning
79.8K papers, 2.1M citations
80% related
Graph (abstract data type)
69.9K papers, 1.2M citations
80% related
Unsupervised learning
22.7K papers, 1M citations
79% related
Cluster analysis
146.5K papers, 2.9M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20231,461
20223,073
2021305
2020401
2019383
2018373