Open Access Proceedings Article

Developing a large semantically annotated corpus

Abstract
What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the starting ingredients for fulfilling this enterprise. The result is a semantic resource that anyone can edit and that integrates various phenomena, including predicate-argument structure, scope, tense, thematic roles, rhetorical relations and presuppositions, into a single semantic formalism: Discourse Representation Theory. Taking texts rather than sentences as the units of annotation results in deep semantic representations that incorporate discourse structure and dependencies. To manage the various (possibly conflicting) annotations provided by experts and non-experts, we introduce a method that stores "Bits of Wisdom" in a database as stand-off annotations.
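
The idea of stand-off annotation mentioned in the abstract can be illustrated with a minimal sketch: the text is never modified, and each contributed annotation points at a character span. Everything below is an assumption made for illustration, not the authors' schema or database design: the `BitOfWisdom` fields, the layer name `"pos"`, the source categories, and above all the conflict rule (higher-priority source wins, then the more recent contribution), which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitOfWisdom:
    """One stand-off annotation; all field names here are hypothetical."""
    start: int      # character offset where the annotated span begins
    end: int        # character offset one past the end of the span
    layer: str      # annotation layer, e.g. "pos" (assumed name)
    value: str      # the proposed label
    source: str     # "tool", "crowd", or "expert" (assumed categories)
    timestamp: int  # logical time of the contribution


# Assumed priority order; the abstract does not say how conflicts are settled.
PRIORITY = {"tool": 0, "crowd": 1, "expert": 2}


def resolve(bows):
    """Keep one annotation per (span, layer): best source first, then most recent."""
    winners = {}
    for bow in bows:
        key = (bow.start, bow.end, bow.layer)
        current = winners.get(key)
        rank = (PRIORITY[bow.source], bow.timestamp)
        if current is None or rank > (PRIORITY[current.source], current.timestamp):
            winners[key] = bow
    return winners


# The raw text stays untouched; annotations only reference offsets into it.
text = "Mary saw a duck."
bows = [
    BitOfWisdom(5, 8, "pos", "VBD", "tool", 1),
    BitOfWisdom(11, 15, "pos", "VB", "tool", 1),     # automatic guess
    BitOfWisdom(11, 15, "pos", "NN", "crowd", 2),    # conflicting crowd answer
    BitOfWisdom(11, 15, "pos", "NN", "expert", 3),   # expert correction
]

resolved = resolve(bows)
for (start, end, layer), bow in sorted(resolved.items()):
    print(text[start:end], layer, bow.value, bow.source)
```

Because the conflicting contributions are all kept in `bows` and only the resolved view is computed, a different priority rule could be applied later without losing any annotator's input.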


Citations
Proceedings Article

Abstract Meaning Representation for Sembanking

TL;DR: A sembank of simple, whole-sentence semantic structures will spur new work in statistical natural language understanding and generation, like the Penn Treebank encouraged work on statistical parsing.
Journal Article

The Sketch Engine: ten years on

TL;DR: The paper describes the core functions (word sketches, concordancing, thesaurus), and outlines the different kinds of users, and the approach taken to working with many different languages.
Proceedings Article

Universal Conceptual Cognitive Annotation (UCCA)

TL;DR: UCCA, a novel multi-layered framework for semantic representation, aims to accommodate the semantic distinctions expressed through linguistic utterances; its relative insensitivity to meaning-preserving syntactic variation is demonstrated.
Proceedings Article

Question-Answer Driven Semantic Role Labeling: Using Natural Language to Annotate Natural Language

TL;DR: The results show that non-expert annotators can produce high-quality QA-SRL data and establish baseline performance levels for future work on this task; the paper also introduces simple classifier-based models for predicting which questions to ask and what their answers should be.
Journal Article

Frege in Space: A Program of Compositional Distributional Semantics

TL;DR: Two ideas are adopted: from statistical semantics, that word meaning can be approximated by the patterns of co-occurrence of words in corpora; and from formal semantics, that compositionality can be captured in terms of a syntax-driven calculus of function application.
References
Journal Article

WordNet: an electronic lexical database

Christiane Fellbaum
01 Sep 2000
TL;DR: Topics presented include the lexical database (nouns in WordNet), a semantic network of English verbs, and applications of WordNet such as building semantic concordances.
Proceedings Article

The Berkeley FrameNet Project

TL;DR: This report will present the project's goals and workflow, and information about the computational tools that have been adapted or created in-house for this work.
Journal Article

The Proposition Bank: An Annotated Corpus of Semantic Roles

TL;DR: An automatic system for semantic role tagging trained on the corpus is described and the effect on its performance of various types of information is discussed, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty trace categories of the treebank.
Book

The syntactic process

TL;DR: The book covers topics in formal linguistics, intonational phonology, computational linguistics and experimental psycholinguistics, presenting them as an integrated theory of the language faculty in a form accessible to readers from any of those fields.
Book

Logics of conversation

TL;DR: The semantics of DRT is studied as a model for logical forms for discourse interpretation and some proofs in the glue logic are shown.