On the Development of the RST Spanish Treebank

Open Access Proceedings Article

Iria da Cunha, +2 more

- pp 1-10

TLDR

This paper presents the RST Spanish Treebank, the first corpus annotated with rhetorical relations for Spanish, and shows the interface developed to search the corpus's annotated texts.

Abstract:

In this article we present the RST Spanish Treebank, the first corpus annotated with rhetorical relations for Spanish. We describe the characteristics of the corpus, the annotation criteria, the annotation procedure, the inter-annotator agreement, and other related aspects. We also show the interface that we have developed to search the corpus's annotated texts.
