Open Access · Proceedings Article (DOI)

Semi-supervised Multitask Learning for Sequence Labeling

Marek Rei
Vol. 1, pp. 2121–2130
TLDR
The authors proposed a language modeling objective to incentivize the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks.
Abstract
We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data.
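As a rough illustration of the idea described in the abstract (not the authors' released implementation; the layer sizes, names, and loss weight below are assumptions), the secondary objective can be added to a standard bi-LSTM tagger along these lines:

```python
# Minimal sketch of a sequence labeler with a secondary language modeling
# objective (assumes PyTorch; all sizes and names are illustrative only).
import torch
import torch.nn as nn

class MultitaskTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, emb_dim=100,
                 hidden_dim=100, lm_cost_weight=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim,
                              bidirectional=True, batch_first=True)
        self.tag_out = nn.Linear(2 * hidden_dim, tagset_size)
        # Separate output layers predict the next word from the forward
        # states and the previous word from the backward states.
        self.fwd_lm = nn.Linear(hidden_dim, vocab_size)
        self.bwd_lm = nn.Linear(hidden_dim, vocab_size)
        self.lm_cost_weight = lm_cost_weight
        self.xent = nn.CrossEntropyLoss()

    def forward(self, words, tags):
        hidden, _ = self.bilstm(self.embed(words))   # (batch, seq, 2*hidden)
        h_fwd, h_bwd = hidden.chunk(2, dim=-1)

        # Primary objective: per-token tag prediction.
        tag_logits = self.tag_out(hidden)
        tag_loss = self.xent(tag_logits.flatten(0, 1), tags.flatten())

        # Auxiliary objective: forward states predict the following word,
        # backward states predict the preceding word.
        next_logits = self.fwd_lm(h_fwd[:, :-1])
        prev_logits = self.bwd_lm(h_bwd[:, 1:])
        lm_loss = self.xent(next_logits.flatten(0, 1), words[:, 1:].flatten()) + \
                  self.xent(prev_logits.flatten(0, 1), words[:, :-1].flatten())

        return tag_loss + self.lm_cost_weight * lm_loss
```

A small weight on the language modeling cost keeps the auxiliary signal from dominating the tagging objective; the paper tunes a comparable weight, but the value used here is arbitrary.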


Citations
Proceedings Article (DOI)

Multi-Task Learning with Language Modeling for Question Generation

TL;DR: Zhang et al. propose incorporating an auxiliary language modeling task to help question generation in a hierarchical multi-task learning structure; the auxiliary task enables the encoder to learn a better representation of the input sequence, which in turn guides the decoder to generate more coherent and fluent questions.
Posted Content

Variational Sequential Labelers for Semi-Supervised Learning.

TL;DR: This article introduces a family of multitask variational methods for semi-supervised sequence labeling, each consisting of a latent-variable generative model and a discriminative labeler.
Proceedings Article (DOI)

Toward a Task of Feedback Comment Generation for Writing Learning.

TL;DR: This paper takes a first step toward feedback comment generation by creating a learner corpus of approximately 1,900 essays in which all preposition errors are manually annotated with feedback comments, and tests three baseline methods on the dataset.
Posted Content

Prototype-to-Style: Dialogue Generation with Style-Aware Editing on Retrieval Memory

TL;DR: A prototype-to-style (PS) framework is proposed to tackle the challenge of stylistic dialogue generation; it significantly outperforms existing baselines in both in-domain and cross-domain evaluations.
Journal Article (DOI)

Position-aware self-attention based neural sequence labeling

TL;DR: This work proposes an attention-based model with a self-attentional context fusion layer that exploits the positional information of an input sequence to capture the latent relations among tokens.
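For intuition only, one generic way to inject position information into self-attention looks like the following (an illustrative sketch, not the cited paper's architecture; all names and sizes are assumptions):

```python
# Generic sketch of self-attention augmented with positional information
# (illustrative only; not the cited paper's exact model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositionAwareSelfAttention(nn.Module):
    def __init__(self, dim, max_len=512):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.pos = nn.Embedding(max_len, dim)   # learned absolute positions
        self.scale = dim ** -0.5

    def forward(self, x):                        # x: (batch, seq, dim)
        positions = torch.arange(x.size(1), device=x.device)
        x = x + self.pos(positions)              # inject position information
        scores = self.query(x) @ self.key(x).transpose(-2, -1) * self.scale
        return F.softmax(scores, dim=-1) @ self.value(x)
```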
References
Journal Article (DOI)

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
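As background for the gating and constant-error-flow idea mentioned above, a bare-bones single LSTM step might look like this (a didactic sketch; the parameter packing and names are assumptions, not the original paper's notation):

```python
# Didactic sketch of one LSTM cell step; the additive cell-state update is
# what lets error flow stay roughly constant across many time steps.
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """W: (4H, input_dim), U: (4H, H), b: (4H,) hold the stacked parameters
    for the input, forget, and output gates plus the candidate cell."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0*H:1*H])   # input gate
    f = sigmoid(z[1*H:2*H])   # forget gate
    o = sigmoid(z[2*H:3*H])   # output gate
    g = np.tanh(z[3*H:4*H])   # candidate cell state
    c = f * c_prev + i * g    # constant-error-carousel style update
    h = o * np.tanh(c)
    return h, c
```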
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
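The mechanism itself is simple to sketch: during training, randomly zero out units and rescale the survivors (inverted dropout, shown here with NumPy as a generic illustration, not the paper's exact procedure):

```python
# Minimal sketch of inverted dropout at training time (NumPy).
import numpy as np

def dropout(x, p=0.5, training=True):
    if not training or p == 0.0:
        return x
    # Keep each unit with probability (1 - p) and rescale so the expected
    # activation stays the same at test time.
    mask = (np.random.rand(*x.shape) >= p) / (1.0 - p)
    return x * mask
```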
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large datasets; the quality of these representations is measured on a word similarity task, and the results are compared to previously best-performing techniques based on different types of neural networks.
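A hedged usage sketch of training skip-gram word vectors with gensim (assuming gensim 4.x is available; the toy corpus and hyperparameters are made up):

```python
# Usage sketch: skip-gram word vectors with gensim (gensim 4.x parameter names).
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "lay", "on", "the", "rug"]]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
print(model.wv.most_similar("cat", topn=3))
```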
Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
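For reference, one building block of CRF training, the forward recursion for the log-partition function of a linear-chain CRF, can be sketched as follows (a generic NumPy illustration, not the paper's parameter estimation algorithm; scores are arbitrary):

```python
# Forward recursion for the log-partition function of a linear-chain CRF.
import numpy as np

def crf_log_partition(emissions, transitions):
    """emissions: (T, K) per-position label scores; transitions: (K, K)."""
    alpha = emissions[0]
    for t in range(1, len(emissions)):
        # log-sum-exp over previous labels for each current label
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0))
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())
```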