Open Access · Proceedings Article

Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources

TL;DR
Evaluating on POS datasets from 14 languages in the Universal Dependencies corpus, the proposed transfer learning model is shown to improve the POS tagging performance of the target languages without exploiting any linguistic knowledge about the relation between the source language and the target language.
Abstract
Training a POS tagging model with cross-lingual transfer learning usually requires linguistic knowledge and resources about the relation between the source language and the target language. In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora. The proposed cross-lingual model utilizes a common BLSTM that enables knowledge transfer from other languages, and private BLSTMs for language-specific representations. The cross-lingual model is trained with language-adversarial training and bidirectional language modeling as auxiliary objectives to better represent language-general information without losing the information about a specific target language. Evaluating on POS datasets from 14 languages in the Universal Dependencies corpus, we show that the proposed transfer learning model improves the POS tagging performance of the target languages without exploiting any linguistic knowledge between the source language and the target language.
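The architecture described in the abstract lends itself to a compact sketch. Below is a minimal PyTorch rendering of a shared/private BLSTM tagger with a gradient-reversal language adversary and bidirectional language-model heads; the module names, dimensions, and the mean-pooling choice for the adversary are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the shared/private BLSTM tagger described in the abstract.
# Names, sizes, and the gradient-reversal adversary are illustrative assumptions.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses gradients so the shared BLSTM
    learns language-general features that fool the language discriminator."""

    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None


class CrossLingualTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, n_langs, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        # One BLSTM shared across languages, one private BLSTM per language.
        self.shared = nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
        self.private = nn.ModuleList(
            [nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
             for _ in range(n_langs)])
        self.tagger = nn.Linear(4 * hid, n_tags)      # POS head
        self.lang_clf = nn.Linear(2 * hid, n_langs)   # language adversary
        self.fwd_lm = nn.Linear(2 * hid, vocab_size)  # bidirectional LM
        self.bwd_lm = nn.Linear(2 * hid, vocab_size)  #   auxiliary heads

    def forward(self, tokens, lang_id, lamb=1.0):
        x = self.embed(tokens)                        # (batch, seq_len, emb)
        shared_out, _ = self.shared(x)                # language-general
        private_out, _ = self.private[lang_id](x)     # language-specific
        tag_logits = self.tagger(torch.cat([shared_out, private_out], dim=-1))
        # Adversarial language prediction from the gradient-reversed,
        # mean-pooled shared representation.
        pooled = GradReverse.apply(shared_out.mean(dim=1), lamb)
        lang_logits = self.lang_clf(pooled)
        # Auxiliary bidirectional language-model predictions.
        fwd_logits = self.fwd_lm(shared_out)
        bwd_logits = self.bwd_lm(shared_out)
        return tag_logits, lang_logits, fwd_logits, bwd_logits
```

In training, the POS cross-entropy, the adversarial language-classification loss (acting through the reversed gradients), and the forward/backward language-model losses would be summed, with the weights of the auxiliary terms treated as hyperparameters.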



Citations
Proceedings Article

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT

TL;DR: This paper explores the broader cross-lingual potential of multilingual BERT as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing.
Posted Content

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT.

TL;DR: This paper explores the broader cross-lingual potential of multilingual BERT (mBERT) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing.
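For readers who want to see the zero-shot setting these two entries describe in code, a minimal sketch with the Hugging Face transformers library follows; the label count and the target-language sentence are placeholders, and the source-language fine-tuning step is elided.

```python
# Zero-shot cross-lingual transfer sketch: fine-tune multilingual BERT on
# source-language POS data, then run it unchanged on a target-language sentence.
# num_labels and the example sentence are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name, num_labels=17)

# ... fine-tune `model` on source-language (e.g. English) POS data here ...

sentence = "Das ist ein Beispiel ."          # target language unseen in fine-tuning
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # (1, num_subwords, num_labels)
pred_tags = logits.argmax(dim=-1)            # subword-level tag ids
```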
Journal Article

Deep convolutional neural networks with ensemble learning and transfer learning for capacity estimation of lithium-ion batteries

TL;DR: The verification and comparison results demonstrate that the proposed DCNN-ETL method achieves higher accuracy and robustness than the other data-driven methods in estimating the capacities of Li-ion cells in the target task.
Proceedings Article

Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism

TL;DR: This paper proposes a novel adversarial transfer learning framework that makes full use of the word-boundary information shared between the tasks while filtering out the task-specific features of Chinese word segmentation (CWS), and exploits self-attention to explicitly capture long-range dependencies between two tokens.
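The self-attention component referred to above is standard scaled dot-product attention; a single-head PyTorch sketch with illustrative dimensions is shown below.

```python
# Generic single-head scaled dot-product self-attention, of the kind used to
# capture long-range dependencies between tokens. Dimensions are illustrative.
import math
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x):                         # x: (batch, seq_len, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        weights = torch.softmax(scores, dim=-1)   # each token attends to all tokens
        return weights @ v
```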
Proceedings Article

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

TL;DR: The recent cross-lingual pre-trained model Unicoder is extended to cover both understanding and generation tasks and is evaluated on XGLUE as a strong baseline; the base versions of Multilingual BERT, XLM, and XLM-R are evaluated for comparison.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
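The Adam update rule itself is compact; the following NumPy sketch applies one step to a single parameter array, using the commonly cited default hyperparameters.

```python
# Hand-rolled Adam update for one parameter array (t counts steps from 1).
# Hyperparameter values are the commonly used defaults.
import numpy as np


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```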
Journal Article

A Survey on Transfer Learning

TL;DR: The relationship between transfer learning and other related machine learning techniques, such as domain adaptation, multitask learning, sample selection bias, and covariate shift, is discussed.
Proceedings Article

Convolutional Neural Networks for Sentence Classification

TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification, and a simple modification to the architecture is proposed to allow for the use of both task-specific and static vectors.
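The architecture summarized here (parallel convolutions of several filter widths over word embeddings, max-over-time pooling, then a linear classifier) can be sketched in a few lines of PyTorch; the sizes below are illustrative, not the paper's exact configuration.

```python
# Compact text-CNN sketch: parallel convolutions over word embeddings with
# several filter widths, max-over-time pooling, then a linear classifier.
import torch
import torch.nn as nn


class TextCNN(nn.Module):
    def __init__(self, vocab_size, n_classes, emb=300, n_filters=100, widths=(3, 4, 5)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb, n_filters, w) for w in widths])
        self.fc = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)    # (batch, emb, seq_len)
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))
```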
Proceedings Article

Understanding the difficulty of training deep feedforward neural networks

TL;DR: The objective is to better understand why standard gradient descent from random initialization performs so poorly with deep neural networks, in order to better understand recent relative successes and help design better algorithms in the future.
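The initialization scheme proposed in that paper (Glorot/Xavier initialization) scales the weight variance by fan-in and fan-out so that activations and gradients keep comparable magnitude across layers; a small NumPy sketch of the uniform variant follows.

```python
# Glorot/Xavier uniform initialization: limit = sqrt(6 / (fan_in + fan_out)).
import numpy as np


def xavier_uniform(fan_in, fan_out, rng=np.random.default_rng()):
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))
```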
Posted Content

Convolutional Neural Networks for Sentence Classification

TL;DR: In this article, CNNs are trained on top of pre-trained word vectors for sentence-level classification tasks and a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks.