Semi-supervised Multitask Learning for Sequence Labeling
Marek Rei
Vol. 1, pp. 2121–2130
TLDR
The authors propose a language modeling objective to incentivise the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks.

Abstract
We propose a sequence labeling framework with a secondary training objective: learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data.
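The abstract's joint objective amounts to a weighted sum of the per-token labeling loss and an auxiliary language modeling loss. A minimal sketch in plain Python, assuming the model has already produced per-token probability distributions; the weight `GAMMA` and all function names here are illustrative, not taken from the paper:

```python
import math

GAMMA = 0.1  # hypothetical weight on the auxiliary LM objective

def cross_entropy(probs, gold):
    """Negative log-probability the model assigns to the gold item."""
    return -math.log(probs[gold])

def joint_loss(label_probs, gold_labels, next_word_probs, next_words, gamma=GAMMA):
    """Labeling loss plus a down-weighted loss for predicting surrounding words."""
    label_loss = sum(cross_entropy(p, g) for p, g in zip(label_probs, gold_labels))
    lm_loss = sum(cross_entropy(p, w) for p, w in zip(next_word_probs, next_words))
    return label_loss + gamma * lm_loss
```

Because the LM term shares the encoder with the labeler, minimising this sum pushes the shared representations toward general-purpose composition patterns even when labeled data is scarce.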
Citations
Proceedings ArticleDOI
Grammatical Error Detection with Self Attention by Pairwise Training
Quanbin Wang, Ying Tan +1 more
TL;DR: The experimental results show that the proposed method achieves state-of-the-art performance on four different standard benchmarks, and the overall improvements across the four test sets are around 2.5%, which demonstrates the generality of pairwise training for datasets from different domains.
Book ChapterDOI
Combining neural and knowledge-based approaches to Named Entity Recognition in Polish
TL;DR: This paper proposes a named entity recognition framework composed of knowledge-based feature extractors and a deep learning model including contextual word embeddings, long short-term memory (LSTM) layers and a conditional random field (CRF) inference layer.
Proceedings Article
Assessing Grammatical Correctness in Language Learning
TL;DR: This paper used pre-trained BERT to detect grammatical errors and fine-tuned it using synthetic training data to assess the correctness of learners' answers in a language-learning system.
Posted Content
Multi-task Learning for Chinese Word Usage Errors Detection
Jinbin Zhang, Heng Wang +1 more
TL;DR: The authors proposed a novel approach, which takes advantage of different auxiliary tasks, such as POS-tagging prediction and word log frequency prediction, to help the task of Chinese word usage error detection.
Posted Content
Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019
TL;DR: In this paper, a hybrid loss including ranking and conditional random fields (CRF), multi-task objective and token-level ensembling strategy was used to improve NER.
References
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
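The constant error carousel mentioned in this TL;DR refers to the additive cell-state update, which lets gradients flow unchanged when the forget gate is open. A scalar sketch in plain Python; the weight names (`wi`, `ui`, `bi`, …) are our own shorthand, not notation from the paper:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell(x, h_prev, c_prev, w):
    """One scalar LSTM step; w maps gate-weight names to floats."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])   # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])   # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])   # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"]) # candidate value
    c = f * c_prev + i * g          # additive update: the "carousel"
    h = o * math.tanh(c)            # exposed hidden state
    return h, c
```

With the forget gate saturated open and the input gate closed, the cell state passes through each step unchanged, which is exactly the mechanism that bridges long time lags.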
Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
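The mechanism in this TL;DR can be illustrated in a few lines. This sketch uses inverted dropout, the variant most modern libraries implement, where survivors are rescaled at training time so no correction is needed at test time; the function name is ours:

```python
import random

def inverted_dropout(values, p_drop, rng):
    """Zero each unit with probability p_drop; scale survivors by
    1/(1 - p_drop) so the expected activation is unchanged."""
    keep = 1.0 - p_drop
    return [v / keep if rng.random() >= p_drop else 0.0 for v in values]
```

Randomly silencing units prevents co-adaptation: no unit can rely on a specific partner being present, which is the regularising effect the paper reports.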
Posted Content
Efficient Estimation of Word Representations in Vector Space
TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task and the results are compared to the previously best performing techniques based on different types of neural networks.
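The word similarity evaluation mentioned here typically reduces to comparing embedding vectors by cosine similarity. A minimal sketch with hand-picked toy vectors rather than real learned embeddings:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Parallel vectors score 1.0 and orthogonal vectors 0.0, so semantically related words should end up with high cosine scores in a well-trained embedding space.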
Proceedings Article
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
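CRF inference chooses the jointly best tag sequence rather than the best tag at each position independently, which is what lets transition scores veto locally attractive but globally invalid tags. A minimal Viterbi decoding sketch over toy scores; the tag set, score values, and function name are illustrative, not from the paper:

```python
def viterbi(emissions, transitions, tags):
    """emissions: list of {tag: score} per token;
    transitions: {(prev_tag, cur_tag): score}.
    Returns the highest-scoring tag path."""
    best = {t: emissions[0][t] for t in tags}
    back = []
    for em in emissions[1:]:
        ptr, new = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[p] + transitions[(p, cur)])
            ptr[cur] = prev
            new[cur] = best[prev] + transitions[(prev, cur)] + em[cur]
        back.append(ptr)
        best = new
    last = max(tags, key=best.get)
    path = [last]
    for ptr in reversed(back):  # walk back-pointers to recover the path
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

A strongly negative transition score (for example, forbidding an I tag directly after O in BIO tagging) steers decoding away from invalid sequences even when the per-token emissions favour them.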