scispace - formally typeset
Search or ask a question
Topic

Annotation

About: Annotation is a research topic. Over the lifetime, 6719 publications have been published within this topic receiving 203463 citations. The topic is also known as: note & markup.


Papers
More filters
Journal ArticleDOI
TL;DR: The current release of ORegAnno comprises 30 145 records curated from 922 publications and describing regulatory sequences for over 3853 genes and 465 transcription factors from 19 species.
Abstract: ORegAnno is an open-source, open-access database and literature curation system for communitybased annotation of experimentally identified DNA regulatory regions, transcription factor binding sites and regulatory variants. The current release comprises 30145 records curated from 922 publications and describing regulatory sequences for over 3853 genes and 465 transcription factors from 19 species. A new feature called the ‘publication queue’ allows users to input relevant papers from scientific literature as targets for annotation. The queue contains 4438 gene regulation papers entered by experts and another 54351 identified by text-mining methods. Users can enter or ‘check out’ papers from the queue for manual curation using a series of user-friendly annotation pages. A typical record entry consists of species, sequence type, sequence, target gene, binding factor, experimental outcome and one or more lines of experimental evidence. An evidence ontology was developed to describe and categorize these experiments. Records are cross-referenced to Ensembl or Entrez gene identifiers, PubMed and dbSNP and can be visualized in the Ensembl or UCSC genome browsers. All data are freely available through search pages, XML data dumps or web services at: http://www.oreganno.org.

262 citations

Proceedings ArticleDOI
07 May 2002
TL;DR: This work provides a framework, CREAM, that allows for creation of metadata, and describes its implementation, viz.
Abstract: Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, CREAM, that allows for creation of metadata. While the annotation mode of CREAM allows to create metadata for existing web pages, the authoring mode lets authors create metadata --- almost for free --- while putting together the content of a page.As a particularity of our framework, CREAM allows to create relational metadata, i.e. metadata that instantiate interrelated definitions of classes in a domain ontology rather than a comparatively rigid template-like schema asm Dublin Core. We discuss some of the requirements one has to meet when developing such an ontology-based framework, e.g. the integration of a metadata crawler, inference services, document management and a meta-ontology, and describe its implementation, viz. Ont-O-Mat, a component-based, ontology-driven Web page authoring and annotation tool.

261 citations

Patent
18 May 1989
TL;DR: In this article, a data processing system provides annotation of a document with annotations inputted through an electronic tablet (16), a keyboard (12), and an audio assembly (24), the annotations are entered in a modeless operation of the three input streams.
Abstract: A data processing system (10) provides annotation of a document with annotations inputted through an electronic tablet (16), a keyboard (12) and an audio assembly (24). The annotations are entered in a modeless operation of the three input streams. Input by the tablet utilizes a two ended pen (14). One end (30) of the pen is used for writing annotations to be added to a document and the second end (28) of the pen is used for erasing the added written annotations. Files record annotations made to the document in a manner which allows a time sequenced playback of the annotations including handwritten, typed and voiced annotations. A state machine of various procedures and controllers which call the procedures is used for processing the annotations as they are made, for replaying the annotations at a subsequent desired time, and for printing the annotations.

258 citations

Journal ArticleDOI
TL;DR: The web-based workbench PlutoF is described, which is designed to bridge the gap between the needs of contemporary research in biology and the existing software resources and databases.
Abstract: DNA sequences accumulating in the International Nucleotide Sequence Databases (INSD) form a rich source of information for taxonomic and ecological meta-analyses. However, these databases include many erroneous entries, and the data itself is poorly annotated with metadata, making it difficult to target and extract entries of interest with any degree of precision. Here we describe the web-based workbench PlutoF, which is designed to bridge the gap between the needs of contemporary research in biology and the existing software resources and databases. Built on a relational database, PlutoF allows remote-access rapid submission, retrieval, and analysis of study, specimen, and sequence data in INSD as well as for private datasets though web-based thin clients. In contrast to INSD, PlutoF supports internationally standardized terminology to allow very specific annotation and linking of interacting specimens and species. The sequence analysis module is optimized for identification and analysis of environmental ITS sequences of fungi, but it can be modified to operate on any genetic marker and group of organisms. The workbench is available at http://plutof.ut.ee.

258 citations

Journal ArticleDOI
TL;DR: The TIGER Treebank, a corpus of currently 40,000 syntactically annotated German newspaper sentences, is described and the query language which was designed to facilitate a simple formulation of complex queries is described, a graphical user interface for query input.
Abstract: This paper reports on the TIGER Treebank, a corpus of currently 40,000 syntactically annotated German newspaper sentences. We describe what kind of information is encoded in the treebank and introduce the different representation formats that are used for the annotation and exploitation of the treebank. We explain the different methods used for the annotation: interactive annotation, using the tool ANNOTATE, and LFG parsing. Furthermore, we give an account of the annotation scheme used for the TIGER treebank. This scheme is an extended and improved version of the NEGRA annotation scheme and we illustrate in detail the linguistic extensions that were made concerning the annotation in the TIGER project. The main differences are concerned with coordination, verb-subcategorization, expletives as well as proper nouns. In addition, the paper also presents the query tool TIGERSearch that was developed in the project to exploit the treebank in an adequate way. We describe the query language which was designed to facilitate a simple formulation of complex queries; furthermore, we shortly introduce TIGER in, a graphical user interface for query input. The paper concludes with a summary and some directions for future work.

253 citations


Network Information
Related Topics (5)
Inference
36.8K papers, 1.3M citations
81% related
Deep learning
79.8K papers, 2.1M citations
80% related
Graph (abstract data type)
69.9K papers, 1.2M citations
80% related
Unsupervised learning
22.7K papers, 1M citations
79% related
Cluster analysis
146.5K papers, 2.9M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20231,461
20223,073
2021305
2020401
2019383
2018373