Findings of the VarDial Evaluation Campaign 2017

doi:10.18653/V1/W17-1201

Open Access · Proceedings Article

- pp 1-15

TLDR

The VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which was organized as part of the fourth edition of the VarDial workshop at EACL’2017, is presented.

Abstract:

We present the results of the VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part of the fourth edition of the VarDial workshop at EACL’2017. This year, we included four shared tasks: Discriminating between Similar Languages (DSL), Arabic Dialect Identification (ADI), German Dialect Identification (GDI), and Cross-lingual Dependency Parsing (CLP). A total of 19 teams submitted runs across the four tasks, and 15 of them wrote system description papers.

Citations

Proceedings Article

Identifying dialects with textual and acoustic cues

Abualsoud Hanani, +2 more

TL;DR: Several systems for identifying short samples of Arabic or Swiss-German dialects were prepared for the shared task of the 2017 DSL Workshop, and the best runs achieved an accuracy of nearly 63% on both the Swiss-German and Arabic dialect tasks.

Proceedings Article

Iterative Language Model Adaptation for Indo-Aryan Language Identification

Tommi Jauhiainen, +2 more

TL;DR: The SUKI team's submission using a HeLI-method based language identifier with iterative language model adaptation obtained the best results in the shared task with a macro F1-score of 0.958.

Posted Content

The Unreasonable Effectiveness of Machine Learning in Moldavian versus Romanian Dialect Identification

Mihaela Gaman, +1 more

- 30 Jul 2020 -

arXiv: Computation and Language

TL;DR: A subjective evaluation by human annotators shows that humans attain much lower accuracy rates than machine learning (ML) models, and experiments show that the machine learning performance on the MRC shared task can be improved through an ensemble based on classifier stacking.

Proceedings Article

CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects

Simon Clematide, +1 more

TL;DR: The submissions for the GDI 2017 Shared Task are the results from three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM).

Findings of the VarDial Evaluation Campaign 2022

Noëmi Aepli, +7 more

TL;DR: This paper presents the results of the shared tasks organized as part of the VarDial Evaluation Campaign 2022, which included three separate shared tasks: Identification of Languages and Dialects of Italy (ITDI), French Cross-Domain Dialect Identification (FDI), and Dialectal Extractive Question Answering (DialQA).

References

Proceedings Article

Okapi at TREC

Stephen Robertson, +4 more

TL;DR: Much of the work involved investigating plausible methods of applying Okapi-style weighting to phrases, and expansion using terms from the top documents retrieved by a pilot search on topic terms was used.

Proceedings Article

Parallel Data, Tools and Interfaces in OPUS

Jörg Tiedemann

TL;DR: New data sets and their features, additional annotation tools and models provided from the website and essential interfaces and on-line services included in the OPUS project are reported.

Proceedings Article

Universal Dependencies v1: A Multilingual Treebank Collection

Joakim Nivre, +11 more

TL;DR: This paper describes v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages, as well as highlighting the needs for sound comparative evaluation and cross-lingual learning experiments.

Proceedings Article

Universal Dependency Annotation for Multilingual Parsing

Ryan McDonald, +12 more

TL;DR: A new collection of treebanks with homogeneous syntactic dependency annotation for six languages: German, English, Swedish, Spanish, French and Korean is presented, made freely available in order to facilitate research on multilingual dependency parsing.

Journal Article

Bootstrapping parsers via syntactic projection across parallel texts

Rebecca Hwa, +4 more

- 01 Sep 2005 -

Natural Language Engineering

TL;DR: Parallel text is used to help solve the problem of creating syntactic annotation in more languages: the English side of a parallel corpus is annotated, the analysis is projected to the second language, and a stochastic analyzer is trained on the resulting noisy annotations.

Related Papers (5)

Discriminating between Similar Languages and Arabic Dialect Identification: A Report on the Third DSL Shared Task

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign

A Report on the DSL Shared Task 2014

Overview of the DSL Shared Task 2015

Automatic Language Identification in Texts: A Survey