Open Access Proceedings Article
Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English
Daniel Dahlmeier, Hwee Tou Ng, Siew Mei Wu
pp. 22-31
TLDR
This paper describes the annotation schema and the data collection and annotation process of NUCLE, and reports on a previously unpublished study of annotator agreement for grammatical error correction.
Abstract:
We describe the NUS Corpus of Learner English (NUCLE), a large, fully annotated corpus of learner English that is freely available for research purposes. The goal of the corpus is to provide a large data resource for the development and evaluation of grammatical error correction systems. Although NUCLE has been available for almost two years, there has been no reference paper that describes the corpus in detail. In this paper, we address this need. We describe the annotation schema and the data collection and annotation process of NUCLE. Most importantly, we report on an unpublished study of annotator agreement for grammatical error correction. Finally, we present statistics on the distribution of grammatical errors in the NUCLE corpus.
Citations
Proceedings Article
The CoNLL-2014 Shared Task on Grammatical Error Correction
Hwee Tou Ng, Siew Mei Wu, Ted Briscoe, Christian Hadiwinoto, Raymond Hendy Susanto, Christopher Bryant
TL;DR: The CoNLL-2014 shared task was devoted to grammatical error correction of all error types: a participating system is expected to detect and correct grammatical errors of every type.
Proceedings Article
The BEA-2019 Shared Task on Grammatical Error Correction
TL;DR: This paper reports on the BEA-2019 Shared Task on Grammatical Error Correction (GEC), which introduces a new dataset, the Write&Improve+LOCNESS corpus, representing a wider range of native and learner English levels and abilities.
Proceedings Article
Grammatical error correction using neural machine translation
Zheng Yuan, Ted Briscoe
TL;DR: This paper presents the first study using neural machine translation (NMT) for grammatical error correction (GEC), with a two-step approach to handle the rare word problem in NMT that proves useful and effective for the GEC task.
Proceedings Article
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
TL;DR: This paper presents a simple and efficient GEC sequence tagger using a Transformer encoder, pre-trained on synthetic data and then fine-tuned in two stages: first on errorful corpora, and second on a combination of errorful and error-free parallel corpora.
Proceedings Article
The CoNLL-2013 Shared Task on Grammatical Error Correction
TL;DR: The task definition is given, the data sets are presented, and the evaluation metric and scorer used in the shared task are described. An overview of the various approaches adopted by the participating teams is also given, and the evaluation results are presented.
References
Journal Article
The measurement of observer agreement for categorical data
J. R. Landis, Gary G. Koch
TL;DR: A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented. Tests for interobserver bias are presented in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.
Journal Article
A Coefficient of Agreement for Nominal Scales
TL;DR: This article considers the procedure of having two or more judges independently categorize a sample of units and determining the degree and significance of their agreement, that is, the extent to which these judgments are reproducible (reliable).
Proceedings Article
A New Dataset and Method for Automatically Grading ESOL Texts
TL;DR: It is demonstrated how supervised discriminative machine learning techniques can be used to automate the assessment of 'English as a Second or Other Language' (ESOL) examination scripts by using rank preference learning to explicitly model the grade relationships between scripts.
Proceedings Article
The CoNLL-2014 Shared Task on Grammatical Error Correction
Hwee Tou Ng, Siew Mei Wu, Ted Briscoe, Christian Hadiwinoto, Raymond Hendy Susanto, Christopher Bryant
TL;DR: The CoNLL-2014 shared task was devoted to grammatical error correction of all error types: a participating system is expected to detect and correct grammatical errors of every type.
Proceedings Article
Better Evaluation for Grammatical Error Correction
Daniel Dahlmeier, Hwee Tou Ng
TL;DR: This work presents a novel method for evaluating grammatical error correction: an algorithm that efficiently computes the sequence of phrase-level edits between a source sentence and a system hypothesis that achieves the highest overlap with the gold-standard annotation.