Open Access Proceedings Article

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign

TLDR
The results and findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects are presented.
Abstract
We present the results and findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, co-located with COLING’2018. This year, the campaign comprised five shared tasks: two re-runs, Arabic Dialect Identification (ADI) and German Dialect Identification (GDI), and three new tasks, Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report.

