Findings of the VarDial Evaluation Campaign 2017
Marcos Zampieri, Shervin Malmasi, Nikola Ljubešić, Preslav Nakov, Ahmed Ali, Jörg Tiedemann, Yves Scherrer, Noëmi Aepli +7 more
- pp 1-15
TLDR
The VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which was organized as part of the fourth edition of the VarDial workshop at EACL’2017, is presented.
Abstract:
We present the results of the VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part of the fourth edition of the VarDial workshop at EACL’2017. This year, we included four shared tasks: Discriminating between Similar Languages (DSL), Arabic Dialect Identification (ADI), German Dialect Identification (GDI), and Cross-lingual Dependency Parsing (CLP). A total of 19 teams submitted runs across the four tasks, and 15 of them wrote system description papers.
Citations
Proceedings Article
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis M. Tyers, Elena Badmaeva, Memduh Gökırmak, Anna Nedoluzhko, Silvie Cinková, Jaroslava Hlaváčová, Václava Kettnerová, Zdenka Uresova, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonça, Tatiana Lando, Rattima Nitisaroj, Josie Li +60 more
TL;DR: This overview defines the task and evaluation methodology, describes how the data sets were prepared, reports and analyzes the main results, and provides a brief categorization of the approaches taken by the participating systems.
Proceedings Article
Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign
Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Ahmed Ali, Suwon Shon, James Glass, Yves Scherrer, Tanja Samardžić, Nikola Ljubešić, Jörg Tiedemann, Chris van der Lee, Stefan Grondelaers, Nelleke Oostdijk, Dirk Speelman, Antal van den Bosch, Ritesh Kumar, Bornini Lahiri, Mayank Jain +18 more
TL;DR: The results and findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which included an Indo-Aryan Language Identification task, are presented.
Proceedings Article
Fine-Grained Arabic Dialect Identification
TL;DR: This paper presents the first results on a fine-grained dialect classification task covering 25 specific cities from across the Arab World, in addition to Standard Arabic, and builds several classification systems and explores a large space of features.
Proceedings Article
CAMeL Tools: An open source Python toolkit for Arabic natural language processing
Ossama Obeid, Nasser Zalmout, Salam Khalifa, Dima Taji, Mai Oudah, Bashar Alhafni, Go Inoue, Fadhl Eryani, Alexander Erdmann, Nizar Habash +9 more
TL;DR: The design of CAMeL Tools and the functionalities it provides are described, including utilities for pre-processing, morphological modeling, dialect identification, named entity recognition and sentiment analysis.
Proceedings Article
The MADAR Shared Task on Arabic Fine-Grained Dialect Identification
TL;DR: This shared task is the first to target a large set of dialect labels at the city and country levels and was organized as part of The Fourth Arabic Natural Language Processing Workshop, collocated with ACL 2019.
References
Proceedings Article
What is your Mother Tongue?: Improving Chinese native language identification by cleaning noisy data and adopting BM25
TL;DR: The authors use BM25 term weighting to score each feature and adopt a hierarchical structure of linear support vector machine classifiers, achieving a state-of-the-art accuracy of 77.1%.
Posted Content
Discriminating between similar languages in Twitter using label propagation.
Will Radford, Matthias Gallé +1 more
TL;DR: This work proposes a label propagation approach that takes the social graph of tweet authors into account as well as content to better tease apart similar languages in Twitter messages.
USHEF and USAAR-USHEF Participation in the WMT15 Quality Estimation Shared Task
TL;DR: It is found that a model of comparable performance can be built with only three features selected by the exhaustive search procedure, which shows slight improvements over the baseline with the use of discourse features.