Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources
Joo-Kyung Kim, Young-Bum Kim, Ruhi Sarikaya, Eric Fosler-Lussier
pp. 2832-2838
TLDR
Evaluating on POS datasets from 14 languages in the Universal Dependencies corpus, it is shown that the proposed transfer learning model improves the POS tagging performance of the target languages without exploiting any linguistic knowledge between the source language and the target language.
Abstract
Training a POS tagging model with cross-lingual transfer learning usually requires linguistic knowledge and resources about the relation between the source language and the target language. In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora. The proposed cross-lingual model utilizes a common BLSTM that enables knowledge transfer from other languages, and private BLSTMs for language-specific representations. The cross-lingual model is trained with language-adversarial training and bidirectional language modeling as auxiliary objectives to better represent language-general information while not losing the information about a specific target language. Evaluating on POS datasets from 14 languages in the Universal Dependencies corpus, we show that the proposed transfer learning model improves the POS tagging performance of the target languages without exploiting any linguistic knowledge between the source language and the target language.
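The training scheme described in the abstract combines a POS tagging loss with two auxiliary objectives, and reverses the language discriminator's gradient before it reaches the common BLSTM. A minimal sketch of those two ingredients, assuming illustrative names and weightings (`lam`, `alpha`, `beta` are not from the paper):

```python
def grad_reverse_backward(upstream_grad, lam=1.0):
    """Gradient reversal layer: the identity on the forward pass; on the
    backward pass it multiplies the language discriminator's gradient by
    -lam, pushing the shared encoder toward language-general features."""
    return -lam * upstream_grad

def total_loss(pos_loss, lm_loss, adv_loss, alpha=1.0, beta=1.0):
    """Combined objective: POS tagging loss plus the bidirectional
    language-modeling and language-adversarial auxiliary losses."""
    return pos_loss + alpha * lm_loss + beta * adv_loss
```

In a real implementation the reversal would be wired into an autograd framework; this only shows the sign flip and the loss composition.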
Citations
Proceedings ArticleDOI
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Shijie Wu, Mark Dredze
TL;DR: This paper explores the broader cross-lingual potential of multilingual BERT as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing.
Posted Content
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT.
Shijie Wu, Mark Dredze
TL;DR: This paper explores the broader cross-lingual potential of multilingual BERT (mBERT) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing.
Journal ArticleDOI
Deep convolutional neural networks with ensemble learning and transfer learning for capacity estimation of lithium-ion batteries
TL;DR: The verification and comparison results demonstrate that the proposed DCNN-ETL method can produce a higher accuracy and robustness than these other data-driven methods in estimating the capacities of the Li-ion cells in the target task.
Proceedings ArticleDOI
Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism
TL;DR: This paper proposes a novel adversarial transfer learning framework to make full use of task-shared boundaries information and prevent the task-specific features of CWS, and exploits self-attention to explicitly capture long range dependencies between two tokens.
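The self-attention mechanism this summary credits with capturing long-range dependencies between tokens can be sketched in plain Python. Below is a minimal scaled dot-product version in which queries, keys, and values are all the token vectors themselves (the learned projection matrices of a full implementation are omitted):

```python
import math

def softmax(xs):
    m = max(xs)                      # subtract max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens: every
    token attends to every other token, so a dependency between two
    distant tokens is captured in a single step."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)          # attention weights over all tokens
        out.append([sum(wi * v[j] for wi, v in zip(w, tokens))
                    for j in range(d)])
    return out
```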
Proceedings ArticleDOI
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou
TL;DR: A recent cross-lingual pre-trained model Unicoder is extended to cover both understanding and generation tasks, which is evaluated on XGLUE as a strong baseline and the base versions of Multilingual BERT, XLM and XLM-R are evaluated for comparison.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
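The "adaptive estimates of lower-order moments" in this summary are exponential moving averages of the gradient and of its square, with bias correction for their zero initialization. A minimal scalar-parameter sketch (the hyperparameter defaults follow the paper's suggested values):

```python
import math

def adam_step(param, grad, m, v, t,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (t counts steps starting from 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

For example, iterating `x, m, v = adam_step(x, 2 * x, m, v, t)` for t = 1, 2, ... drives x toward the minimum of f(x) = x², since 2x is that function's gradient.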
Journal ArticleDOI
A Survey on Transfer Learning
Sinno Jialin Pan, Qiang Yang
TL;DR: The relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift are discussed.
Proceedings ArticleDOI
Convolutional Neural Networks for Sentence Classification
TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification, and a simple modification to the architecture is proposed to allow the use of both task-specific and static vectors.
Proceedings Article
Understanding the difficulty of training deep feedforward neural networks
Xavier Glorot, Yoshua Bengio
TL;DR: The objective is to better understand why standard gradient descent from random initialization performs poorly with deep neural networks, to shed light on recent relative successes, and to help design better algorithms in the future.
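The initialization this paper proposes (now commonly called Xavier or Glorot initialization) draws each weight from a uniform range scaled by the layer's fan-in and fan-out, chosen so activation and gradient variances stay roughly constant across layers. A minimal sketch:

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    """Glorot/Xavier uniform init: W_ij ~ U(-a, a) with
    a = sqrt(6 / (fan_in + fan_out)), giving Var(W_ij) = a^2 / 3
    = 2 / (fan_in + fan_out)."""
    a = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-a, a) for _ in range(fan_out)]
            for _ in range(fan_in)]
```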
Posted Content
Convolutional Neural Networks for Sentence Classification
TL;DR: In this article, CNNs are trained on top of pre-trained word vectors for sentence-level classification tasks and a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks.
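The core operation of such a sentence-classification CNN is a filter of width h slid over the sequence of (pre-trained) word vectors, followed by max-over-time pooling to produce one feature per filter. A minimal single-filter sketch in plain Python (names are illustrative):

```python
def conv_max_pool(embeddings, filt, bias=0.0):
    """One feature of a Kim-style sentence CNN: slide a width-h filter
    over the token-embedding sequence, apply ReLU, then take the max
    over all window positions (max-over-time pooling)."""
    h = len(filt)       # filter width in tokens
    d = len(filt[0])    # embedding dimension
    feats = []
    for i in range(len(embeddings) - h + 1):
        s = bias
        for j in range(h):
            for k in range(d):
                s += filt[j][k] * embeddings[i + j][k]
        feats.append(max(0.0, s))   # ReLU nonlinearity
    return max(feats)               # max-over-time pooling
```

In the full model, many such filters of several widths are applied, and their pooled features are concatenated and fed to a softmax classifier.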