Proceedings Article

Universal Conceptual Cognitive Annotation (UCCA)

01 Aug 2013 - Vol. 1, pp. 228-238
TL;DR: This paper presents UCCA, a novel multi-layered framework for semantic representation that aims to accommodate the semantic distinctions expressed through linguistic utterances, and demonstrates its relative insensitivity to meaning-preserving syntactic variation.
Abstract: Syntactic structures, by their nature, reflect first and foremost the formal constructions used for expressing meanings. This renders them sensitive to formal variation both within and across languages, and limits their value to semantic applications. We present UCCA, a novel multi-layered framework for semantic representation that aims to accommodate the semantic distinctions expressed through linguistic utterances. We demonstrate UCCA’s portability across domains and languages, and its relative insensitivity to meaning-preserving syntactic variation. We also show that UCCA can be effectively and quickly learned by annotators with no linguistic background, and describe the compilation of a UCCA-annotated corpus.

Citations
Proceedings Article
01 Sep 2015
TL;DR: The results show that non-expert annotators can produce high quality QA-SRL data, and also establish baseline performance levels for future work on this task, and introduce simple classifier-based models for predicting which questions to ask and what their answers should be.
Abstract: This paper introduces the task of question-answer driven semantic role labeling (QA-SRL), where question-answer pairs are used to represent predicate-argument structure. For example, the verb “introduce” in the previous sentence would be labeled with the questions “What is introduced?”, and “What introduces something?”, each paired with the phrase from the sentence that gives the correct answer. Posing the problem this way allows the questions themselves to define the set of possible roles, without the need for predefined frame or thematic role ontologies. It also allows for scalable data collection by annotators with very little training and no linguistic expertise. We gather data in two domains, newswire text and Wikipedia articles, and introduce simple classifier-based models for predicting which questions to ask and what their answers should be. Our results show that non-expert annotators can produce high quality QA-SRL data, and also establish baseline performance levels for future work on this task.
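
The question-answer encoding described in this abstract can be pictured with a small data structure. The following is only a minimal sketch: the `QAPair` class, the `annotation` dictionary layout and the `answers_are_spans` helper are hypothetical names introduced here, the sentence is a shortened form of the abstract's first sentence, and the answer spans are one plausible reading of the example rather than data from the QA-SRL corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    question: str  # natural-language question about the predicate
    answer: str    # phrase copied from the sentence that answers it


# Shortened version of the abstract's first sentence (assumption: truncated here).
sentence = ("This paper introduces the task of question-answer driven "
            "semantic role labeling (QA-SRL).")

# Hypothetical layout: predicate -> list of question-answer pairs.
# The questions come from the abstract; the answer spans are our own reading.
annotation = {
    "introduces": [
        QAPair("What is introduced?",
               "the task of question-answer driven semantic role labeling (QA-SRL)"),
        QAPair("What introduces something?", "This paper"),
    ]
}


def answers_are_spans(sentence, annotation):
    """Return True if every answer is a contiguous phrase of the sentence."""
    return all(pair.answer in sentence
               for pairs in annotation.values()
               for pair in pairs)
```

Note that no role inventory appears anywhere in the structure: the questions themselves stand in for role labels, which is the point the abstract makes.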

222 citations


Cites background from "Universal Conceptual Cognitive Anno..."

  • ...The UCCA foundational layer does not distinguish semantic roles, so Frogs eat herons and Herons eat frogs will receive identical annotation — thereby discarding information which is potentially useful for translation or question answering....

    [...]

  • ...Universal Cognitive Conceptual Annotation (UCCA) (Abend and Rappoport, 2013) is an attempt to create a linguistically universal annotation scheme by using general labels such as argument or scene....

    [...]
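
The first excerpt above claims that a role-free annotation cannot tell the two example sentences apart. A toy illustration of that claim, assuming each sentence is a single flat Scene labelled only with the UCCA categories Participant ("A") and Process ("P"); the list-of-pairs encoding, the helper names and the ARG0/ARG1 role labels used for contrast are illustrative assumptions, not a UCCA or PropBank file format:

```python
# Toy encoding (assumption): one flat Scene per sentence, as (token, category) pairs.
# "A" = Participant, "P" = Process.
frogs_eat_herons = [("Frogs", "A"), ("eat", "P"), ("herons", "A")]
herons_eat_frogs = [("Herons", "A"), ("eat", "P"), ("frogs", "A")]

# Illustrative role-labelled versions of the same sentences, for contrast.
roles_frogs_eat = [("Frogs", "ARG0"), ("herons", "ARG1")]
roles_herons_eat = [("Herons", "ARG0"), ("frogs", "ARG1")]


def category_skeleton(scene):
    """Sequence of category labels, ignoring which word fills each slot."""
    return [category for _, category in scene]


def role_signature(roles):
    """Case-insensitive set of (word, role) pairs."""
    return sorted((word.lower(), role) for word, role in roles)


same_categories = category_skeleton(frogs_eat_herons) == category_skeleton(herons_eat_frogs)
same_roles = role_signature(roles_frogs_eat) == role_signature(roles_herons_eat)
```

Under these assumptions the category skeletons coincide while the role signatures differ, which is the information loss the excerpt points to.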

Proceedings Article
07 Apr 2017
TL;DR: This article proposed a transition-based parser for AMR that parses sentences left-to-right, in linear time, and showed that their parser is competitive with the state of the art on the LDC2015E86 dataset and that it outperforms state-of-the-art parsers for recovering named entities and handling polarity.
Abstract: Abstract Meaning Representation (AMR) is a semantic representation for natural language that embeds annotations related to traditional tasks such as named entity recognition, semantic role labeling, word sense disambiguation and co-reference resolution. We describe a transition-based parser for AMR that parses sentences left-to-right, in linear time. We further propose a test-suite that assesses specific subtasks that are helpful in comparing AMR parsers, and show that our parser is competitive with the state of the art on the LDC2015E86 dataset and that it outperforms state-of-the-art parsers for recovering named entities and handling polarity.

174 citations

Proceedings Article
01 Sep 2017
TL;DR: This year, the WMT17 Metrics Shared Task builds upon two types of manual judgements: direct assessment (DA) and HUME manual semantic judgements.
Abstract: This paper presents the results of the WMT17 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT17 news translation task and Neural MT training task. We collected scores of 14 metrics from 8 research groups. In addition to that, we computed scores of 7 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with WMT17 official manual ranking of systems) and in terms of segment level correlation (how often a metric agrees with humans in judging the quality of a particular sentence). This year, we build upon two types of manual judgements: direct assessment (DA) and HUME manual semantic judgements.

158 citations

Journal Article
TL;DR: The authors introduce a Question Decomposition Meaning Representation (QDMR) for questions, which constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question.
Abstract: Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning Representation (QDMR) for questions. QDMR constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question. We develop a crowdsourcing pipeline, showing that quality QDMRs can be annotated at scale, and release the Break dataset, containing over 83K pairs of questions and their QDMRs. We demonstrate the utility of QDMR by showing that (a) it can be used to improve open-domain question answering on the HotpotQA dataset, (b) it can be deterministically converted to a pseudo-SQL formal language, which can alleviate annotation in semantic parsing applications. Last, we use Break to train a sequence-to-sequence model with copying that parses questions into QDMR structures, and show that it substantially outperforms several natural baselines.

149 citations

Proceedings Article
01 Aug 2016
TL;DR: This paper presents the results of the WMT16 Metrics Shared Task, which asked participants of this task to score the outputs of the MT systems involved in the WMT16 Shared Translation Task.
Abstract: This paper presents the results of the WMT16 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT16 Shared Translation Task. We collected scores of 16 metrics from 9 research groups. In addition to that, we computed scores of 9 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with WMT16 official manual ranking of systems) and in terms of segment level correlation (how often a metric agrees with humans in comparing two translations of a particular sentence). This year there are several additions to the setup: large number of language pairs (18 in total), datasets from different domains (news, IT and medical), and different kinds of judgments: relative ranking (RR), direct assessment (DA) and HUME manual semantic judgments. Finally, generation of large number of hybrid systems was trialed for provision of more conclusive system-level metric rankings.
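
The two evaluation views named in this abstract can be sketched in a few lines. This is a minimal sketch under stated assumptions: the abstract does not name the statistics used, so Pearson's r stands in for system-level correlation and a plain agreement rate over human pairwise preferences stands in for segment-level correlation; the function names and the input layout are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between metric scores and human scores, one pair per MT system."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def pairwise_agreement(metric_scores, human_prefs):
    """Fraction of human (better, worse) translation pairs that the metric orders the same way.

    metric_scores maps a translation id to the metric's score for it;
    human_prefs is a list of (better_id, worse_id) tuples.
    """
    agree = sum(1 for better, worse in human_prefs
                if metric_scores[better] > metric_scores[worse])
    return agree / len(human_prefs)
```

For instance, `pairwise_agreement({"a": 0.9, "b": 0.5, "c": 0.7}, [("a", "b"), ("b", "c")])` counts one agreement out of two human preferences.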

145 citations


Cites background from "Universal Conceptual Cognitive Anno..."

  • ...UCCA (Abend and Rappoport, 2013) is an appealing candidate for semantic analysis, due to its cross-linguistic applicability, support for rapid annotation, and coverage of many fundamental semantic phenomena, such as verbal, nominal and adjectival argument structures and their inter-relations....

    [...]

  • ...HUME is a human evaluation measure that decomposes over the UCCA semantic units (Birch et al., 2016)....

    [...]

References
Report
TL;DR: As a result of this grant, the researchers have now published on CDROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Abstract: As a result of this grant, the researchers have now published on CDROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, with over 3 million words of that material assigned skeletal grammatical structure. This material now includes a fully hand-parsed version of the classic Brown corpus. About one half of the papers at the ACL Workshop on Using Large Text Corpora this past summer were based on the materials generated by this grant.

8,377 citations


"Universal Conceptual Cognitive Anno..." refers methods in this paper

  • ...For instance, both the PTB and the Prague Dependency Treebank (Böhmová et al., 2003) employed annotators with extensive linguistic background....

    [...]

  • ...PropBank and NomBank are built on top of the PTB annotation, and provide for each verb (PropBank) and noun (NomBank), a delineation of their arguments and their categorization into semantic roles....

    [...]

  • ...In fact, the annotations of (a) and (c) are identical under the most widely-used schemes for English, the Penn Treebank (PTB) (Marcus et al., 1993) and CoNLL-style dependencies (Surdeanu et al....

    [...]

  • ...A recent work that did report inter-annotator agreement in terms of bracketing F-score is an annotation project of the PTB’s noun phrases with more elaborate syntactic structure (Vadas and Curran, 2011)....

    [...]

  • ...An increasingly popular alternative to the PTB are dependency structures, which are usually represented as trees whose nodes are the words of the sentence (Ivanova et al., 2012)....

    [...]
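
One of the excerpts above mentions inter-annotator agreement measured as bracketing F-score. A minimal sketch of that measure, assuming each annotation is reduced to a set of unlabeled `(start, end)` token spans and one annotator is treated as the reference; the function name and span encoding are hypothetical, and the excerpts do not say whether the cited work scored labeled or unlabeled brackets.

```python
def bracketing_f_score(reference, other):
    """Harmonic mean of precision and recall over the two annotators' bracket sets.

    Each argument is an iterable of (start, end) token-offset pairs.
    """
    reference, other = set(reference), set(other)
    matched = len(reference & other)
    if matched == 0:
        return 0.0
    precision = matched / len(other)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)


# Two annotators bracket a five-token sentence; they disagree on one inner span.
annotator_1 = {(0, 5), (0, 2), (3, 5)}
annotator_2 = {(0, 5), (0, 2), (2, 5)}
score = bracketing_f_score(annotator_1, annotator_2)
```

Because the F-score is symmetric in precision and recall, swapping which annotator counts as the reference leaves the score unchanged.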

Book
01 Jan 1994
TL;DR: This book presents the most complete exposition of the theory of head-driven phrase structure grammar, introduced in the authors' "Information-Based Syntax and Semantics," and demonstrates the applicability of the HPSG approach to a wide range of empirical problems.
Abstract: This book presents the most complete exposition of the theory of head-driven phrase structure grammar (HPSG), introduced in the authors' "Information-Based Syntax and Semantics." HPSG provides an integration of key ideas from the various disciplines of cognitive science, drawing on results from diverse approaches to syntactic theory, situation semantics, data type theory, and knowledge representation. The result is a conception of grammar as a set of declarative and order-independent constraints, a conception well suited to modelling human language processing. This self-contained volume demonstrates the applicability of the HPSG approach to a wide range of empirical problems, including a number which have occupied center-stage within syntactic theory for well over twenty years: the control of "understood" subjects, long-distance dependencies conventionally treated in terms of "wh"-movement, and syntactic constraints on the relationship between various kinds of pronouns and their antecedents. The authors make clear how their approach compares with and improves upon approaches undertaken in other frameworks, including in particular the government-binding theory of Noam Chomsky.

3,600 citations

Proceedings Article
10 Aug 1998
TL;DR: This report will present the project's goals and workflow, and information about the computational tools that have been adapted or created in-house for this work.
Abstract: FrameNet is a three-year NSF-supported project in corpus-based computational lexicography, now in its second year (NSF IRI-9618838, "Tools for Lexicon Building"). The project's key features are (a) a commitment to corpus evidence for semantic and syntactic generalizations, and (b) the representation of the valences of its target words (mostly nouns, adjectives, and verbs) in which the semantic portion makes use of frame semantics. The resulting database will contain (a) descriptions of the semantic frames underlying the meanings of the words described, and (b) the valence representation (semantic and syntactic) of several thousand words and phrases, each accompanied by (c) a representative collection of annotated corpus attestations, which jointly exemplify the observed linkings between "frame elements" and their syntactic realizations (e.g. grammatical function, phrase type, and other syntactic traits). This report will present the project's goals and workflow, and information about the computational tools that have been adapted or created in-house for this work.

2,900 citations


"Universal Conceptual Cognitive Anno..." refers background or methods in this paper

  • ...The FrameNet project (Baker et al., 1998) (footnote 11: The experiment was conducted on the first 30 sentences of section 02.)...

    [...]

  • ...Every Scene contains one main relation, which is the anchor of the Scene, the most important relation it describes (similar to frame-evoking lexical units in FrameNet (Baker et al., 1998))....

    [...]

  • ...The leading SRL approaches are PropBank (Palmer et al., 2005) and NomBank (Meyers et al., 2004) on the one hand, and FrameNet (Baker et al., 1998) on the other....

    [...]

Journal Article
TL;DR: An automatic system for semantic role tagging trained on the corpus is described and the effect on its performance of various types of information is discussed, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty trace categories of the treebank.
Abstract: The Proposition Bank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank. The resulting resource can be thought of as shallow, in that it does not represent coreference, quantification, and many other higher-order phenomena, but also broad, in that it covers every instance of every verb in the corpus and allows representative statistics to be calculated. We discuss the criteria used to define the sets of semantic roles used in the annotation process and to analyze the frequency of syntactic/semantic alternations in the corpus. We describe an automatic system for semantic role tagging trained on the corpus and discuss the effect on its performance of various types of information, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty "trace" categories of the treebank.

2,416 citations


"Universal Conceptual Cognitive Anno..." refers background in this paper

  • ...The leading SRL approaches are PropBank (Palmer et al., 2005) and NomBank (Meyers et al., 2004) on the one hand, and FrameNet (Baker et al., 1998) on the other....

    [...]

Book
04 Feb 2008
TL;DR: This book presents a synthesis that draws together and refines the descriptive and theoretical notions developed in this framework over the course of three decades in a unified manner that accommodates both the conceptual and the social-interactive basis of linguistic structure.
Abstract: This book fills a long standing need for a basic introduction to Cognitive Grammar that is current, authoritative, comprehensive, and approachable. It presents a synthesis that draws together and refines the descriptive and theoretical notions developed in this framework over the course of three decades. In a unified manner, it accommodates both the conceptual and the social-interactive basis of linguistic structure, as well as the need for both functional explanation and explicit structural description. Starting with the fundamentals, essential aspects of the theory are systematically laid out with concrete illustrations and careful discussion of their rationale. Among the topics surveyed are conceptual semantics, grammatical classes, grammatical constructions, the lexicon-grammar continuum characterized as assemblies of symbolic structures (form-meaning pairings), and the usage-based account of productivity, restrictions, and well-formedness. The theory's central claim - that grammar is inherently meaningful - is thereby shown to be viable. The framework is further elucidated through application to nominal structure, clause structure, and complex sentences. These are examined in broad perspective, with exemplification from English and numerous other languages. In line with the theory's general principles, they are discussed not only in terms of their structural characterization, but also their conceptual value and functional motivation. Other matters explored include discourse, the temporal dimension of language structure, and what grammar reveals about cognitive processes and the construction of our mental world.

1,631 citations


"Universal Conceptual Cognitive Anno..." refers background in this paper

  • ...UCCA’s representation is guided by conceptual notions and has its roots in the Cognitive Linguistics tradition and specifically in Cognitive Grammar (Langacker, 2008)....

    [...]