Findings of the VarDial Evaluation Campaign 2017

doi:10.18653/V1/W17-1201

Open Access, Proceedings Article

pp. 1-15

TLDR

The VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which was organized as part of the fourth edition of the VarDial workshop at EACL’2017, is presented.

Abstract:

We present the results of the VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part of the fourth edition of the VarDial workshop at EACL’2017. This year, we included four shared tasks: Discriminating between Similar Languages (DSL), Arabic Dialect Identification (ADI), German Dialect Identification (GDI), and Cross-lingual Dependency Parsing (CLP). A total of 19 teams submitted runs across the four tasks, and 15 of them wrote system description papers.

Citations

Proceedings Article

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Daniel Zeman, +60 more

TL;DR: The paper defines the task and evaluation methodology, describes how the data sets were prepared, reports and analyzes the main results, and provides a brief categorization of the approaches of the participating systems.

Proceedings Article

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign

Marcos Zampieri, +18 more

TL;DR: The results and the findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects and Indo-Aryan Language Identification are presented.

Proceedings Article

Fine-Grained Arabic Dialect Identification

Mohammad Salameh, +2 more

TL;DR: This paper presents the first results on a fine-grained dialect classification task covering 25 specific cities from across the Arab World, in addition to Standard Arabic, and builds several classification systems and explores a large space of features.

Proceedings Article

CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing

Ossama Obeid, +9 more

TL;DR: The design of CAMeL Tools and the functionalities it provides are described, including utilities for pre-processing, morphological modeling, dialect identification, named entity recognition, and sentiment analysis.

Proceedings Article

The MADAR Shared Task on Arabic Fine-Grained Dialect Identification

Houda Bouamor, +2 more

TL;DR: This shared task is the first to target a large set of dialect labels at the city and country levels and was organized as part of The Fourth Arabic Natural Language Processing Workshop, collocated with ACL 2019.

References

Proceedings Article

What is your Mother Tongue?: Improving Chinese native language identification by cleaning noisy data and adopting BM25

Lan Wang, +2 more

TL;DR: The authors used a BM25 term weighting technique to score each feature and adopted a hierarchical structure of linear support vector machine classifiers, achieving a state-of-the-art accuracy of 77.1%.

Posted Content

Discriminating between similar languages in Twitter using label propagation.

Will Radford, +1 more

19 Jul 2016, arXiv: Computation and Language

TL;DR: This work proposes a label propagation approach that takes the social graph of tweet authors into account as well as content to better tease apart similar languages in Twitter messages.

USHEF and USAAR-USHEF Participation in the WMT15 Quality Estimation Shared Task

Carolina Scarton, +2 more

TL;DR: It is found that a model of comparable performance can be built with only three features selected by the exhaustive search procedure, which shows slight improvements over the baseline with the use of discourse features.

Related Papers (5)

Discriminating between Similar Languages and Arabic Dialect Identification: A Report on the Third DSL Shared Task

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign

A Report on the DSL Shared Task 2014

Overview of the DSL Shared Task 2015

Automatic Language Identification in Texts: A Survey