Open Access Proceedings Article (DOI)

Semi-supervised Multitask Learning for Sequence Labeling

Marek Rei
Vol. 1, pp. 2121–2130
TL;DR: The authors propose a language modeling objective that incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks.
Abstract
We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data.
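To make the training setup concrete, below is a minimal PyTorch sketch (the framework choice is an assumption; this page does not specify one) of a bidirectional LSTM tagger with the auxiliary objective of predicting the next word from the forward states and the previous word from the backward states. All class and function names, layer sizes, and the loss weight `gamma` are illustrative, and the paper's full architecture is richer (it also uses character-level representations).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultitaskTagger(nn.Module):
    """BiLSTM tagger with auxiliary forward/backward LM heads (illustrative)."""

    def __init__(self, vocab_size, num_labels, emb_dim=100, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Primary head: one label per token.
        self.label_head = nn.Linear(2 * hidden_dim, num_labels)
        # Auxiliary heads: forward states predict the next word,
        # backward states predict the previous word.
        self.fwd_lm_head = nn.Linear(hidden_dim, vocab_size)
        self.bwd_lm_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        states, _ = self.bilstm(self.embed(token_ids))   # (B, T, 2H)
        h_fwd, h_bwd = states.chunk(2, dim=-1)           # (B, T, H) each
        return (self.label_head(states),
                self.fwd_lm_head(h_fwd),
                self.bwd_lm_head(h_bwd))

def multitask_loss(model, tokens, labels, gamma=0.1):
    """Tagging loss plus gamma-weighted language modeling losses."""
    label_logits, fwd_logits, bwd_logits = model(tokens)
    tag_loss = F.cross_entropy(label_logits.transpose(1, 2), labels)
    # The forward LM at position t predicts token t+1; the backward LM
    # at position t predicts token t-1. No extra annotation is needed.
    lm_fwd = F.cross_entropy(fwd_logits[:, :-1].transpose(1, 2), tokens[:, 1:])
    lm_bwd = F.cross_entropy(bwd_logits[:, 1:].transpose(1, 2), tokens[:, :-1])
    return tag_loss + gamma * (lm_fwd + lm_bwd)
```

The design point to notice is that `gamma` keeps the language modeling losses secondary, so the labeled objective still dominates training while the LM heads supply an additional, annotation-free training signal from the same data.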


Citations
Dissertation

Extracting Clinical Event Timelines: Temporal Information Extraction and Coreference Resolution in Electronic Health Records

TL;DR: A neural approach to temporal information extraction is presented, and the study examines how categorical features and neural network components such as attention mechanisms and character-level token representations influence the performance of the coreference resolution approach in clinical narratives.
Posted Content

Neural Multi-task Learning in Automated Assessment

TL;DR: A multi-task neural network model is developed that jointly optimises for both grammatical error detection and essay scoring, showing that neural automated essay scoring can be significantly improved.
Posted Content

Multi-Task Learning for Domain-General Spoken Disfluency Detection in Dialogue Systems

TL;DR: This paper presents a multi-task LSTM-based model for incremental detection of disfluency structure, which can be hooked up to any component for incremental interpretation, or else simply used to 'clean up' the current utterance as it is being produced.
Journal Article (DOI)

Dual Learning for Semi-Supervised Natural Language Understanding

TL;DR: Zhang et al. introduce semantic-to-sentence generation (SSG) as a dual task of NLU and propose a new framework for semi-supervised NLU built on the corresponding dual model.

A Multi-Task Approach to Incremental Dialogue State Tracking

TL;DR: This paper presents the design of the incremental dialogue state tracker in detail, provides an evaluation on the well-known Dialogue State Tracking Challenge 2 (DSTC2) dataset, and finds that the multi-task learning-based model achieves state-of-the-art results for incremental processing.
References
Journal Article (DOI)

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
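As a quick illustration of the mechanism this summary describes, here is a single LSTM step in NumPy. It is a sketch only: the gate layout follows the standard modern formulation (including a forget gate, which postdates the original 1997 paper), and all shapes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. Shapes: x (D,), h_prev/c_prev (H,), W (4H, D), U (4H, H), b (4H,)."""
    z = W @ x + U @ h_prev + b
    H = h_prev.shape[0]
    i = sigmoid(z[:H])           # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    o = sigmoid(z[2 * H:3 * H])  # output gate
    g = np.tanh(z[3 * H:])       # candidate cell values
    # Additive cell update: the "constant error carousel" that lets error
    # signals flow across long time lags when the forget gate stays near 1.
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```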
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
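A minimal NumPy sketch of the technique, using the now-common "inverted" variant that rescales at training time so the network can be used unchanged at test time; the rate of 0.5 matches the commonly cited default for hidden units, but is illustrative here.

```python
import numpy as np

def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero units with probability `rate` during training
    and rescale the survivors so expected activations are unchanged."""
    if not training or rate == 0.0:
        return activations           # identity at test time
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)
```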
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
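A toy NumPy sketch of one skip-gram training step with a full softmax, one of the paper's two architectures (the other is continuous bag-of-words, CBOW). Practical implementations replace the full softmax with hierarchical softmax or negative sampling for efficiency; the function name, matrix layout, and learning rate here are illustrative.

```python
import numpy as np

def skipgram_step(center_id, context_id, W_in, W_out, lr=0.025):
    """One SGD step: predict one context word from the center word.
    W_in: (V, D) input vectors; W_out: (V, D) output vectors."""
    v = W_in[center_id].copy()           # copy: the W_in row is updated below
    scores = W_out @ v                   # (V,) logits over the vocabulary
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                 # softmax
    grad = probs.copy()
    grad[context_id] -= 1.0              # d(cross-entropy)/d(scores)
    W_in[center_id] -= lr * (W_out.T @ grad)
    W_out -= lr * np.outer(grad, v)
    return -np.log(probs[context_id])    # loss for this (center, context) pair
```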
Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
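To illustrate the model family, here is a minimal forward algorithm for a linear-chain CRF in NumPy, computing the log-partition function that normalizes the sequence likelihood. This sketches the model itself, not the paper's iterative parameter estimation algorithms; names and shapes are illustrative.

```python
import numpy as np

def crf_log_partition(emissions, transitions):
    """emissions: (T, K) per-token tag scores; transitions: (K, K),
    where transitions[i, j] scores a move from tag i to tag j."""
    alpha = emissions[0].astype(float)    # log-forward scores, shape (K,)
    for t in range(1, len(emissions)):
        # log-sum-exp over the previous tag, for each current tag
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0))
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())
```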