Self-adaptive hierarchical sentence model

Open Access Proceedings Article

Han Zhao, +2 more

- pp 4069-4076

TLDR

This paper proposes a self-adaptive hierarchical sentence model (AdaSent), which forms a hierarchy of representations from words to phrases and then to sentences through recursive gated local composition of adjacent segments.

Abstract:

The ability to accurately model a sentence at varying stages (e.g., word-phrase-sentence) plays a central role in natural language processing. As an effort towards this goal, we propose a self-adaptive hierarchical sentence model (AdaSent). AdaSent forms a hierarchy of representations from words to phrases and then to sentences through recursive gated local composition of adjacent segments. We design a competitive mechanism (through gating networks) that allows the representations of the same sentence to be engaged in a particular learning task (e.g., classification), thereby mitigating the vanishing gradient problem that persists in other recursive models. Both qualitative and quantitative analyses show that AdaSent can automatically form and select the representations suitable for the task at hand during training, yielding superior classification performance over competitor models on 5 benchmark data sets.
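The hierarchy described in the abstract can be pictured as a pyramid: each level merges every pair of adjacent segments from the level below, and a second gate weighs the levels against each other. The following is a minimal sketch of that idea, not the authors' exact formulation. The function names, the three-way softmax gate (copy left, copy right, or use a new candidate), the mean pooling per level, and the random toy parameters are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def compose(left, right, params):
    """Gated local composition of two adjacent segment vectors (assumed form)."""
    W_l, W_r, G_l, G_r = params
    # Candidate vector for the merged segment.
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(W_l, left), matvec(W_r, right))]
    # Three gate weights: keep the left child, keep the right child,
    # or use the newly composed candidate.
    g_l, g_r, g_c = softmax(
        [a + b for a, b in zip(matvec(G_l, left), matvec(G_r, right))])
    return [g_l * l + g_r * r + g_c * c
            for l, r, c in zip(left, right, cand)]


def build_pyramid(words, params):
    """Level 0 holds the word vectors; each higher level has one unit fewer."""
    levels = [words]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([compose(prev[j], prev[j + 1], params)
                       for j in range(len(prev) - 1)])
    return levels


def level_summaries(levels):
    """Mean-pool each level into one vector (pooling choice is an assumption)."""
    return [[sum(col) / len(level) for col in zip(*level)] for level in levels]


def level_weights(summaries, gate_vec):
    """Softmax over per-level scores: the levels compete for the task."""
    scores = [sum(g * x for g, x in zip(gate_vec, s)) for s in summaries]
    return softmax(scores)


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy run with untrained, random parameters (hypothetical values).
rng = random.Random(0)
d = 4
params = (random_matrix(rng, d, d), random_matrix(rng, d, d),
          random_matrix(rng, 3, d), random_matrix(rng, 3, d))
words = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(5)]

levels = build_pyramid(words, params)
summaries = level_summaries(levels)
weights = level_weights(summaries, [rng.uniform(-1, 1) for _ in range(d)])

print([len(level) for level in levels])  # [5, 4, 3, 2, 1]
```

In a trained model the composition and gating parameters would be learned jointly with the classifier, and the level weights would combine per-level predictions. Because every level contributes to the output directly, gradients need not pass through the entire pyramid to reach the lower levels, which is one way to read the abstract's claim about mitigating vanishing gradients.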

Citations

Posted Content

Representation Learning with Contrastive Predictive Coding

Aaron van den Oord, +2 more

- 10 Jul 2018 -

arXiv: Learning

TL;DR: This work proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, which it calls Contrastive Predictive Coding, and demonstrates that the approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.

Proceedings Article

Skip-thought vectors

Ryan Kiros, +6 more

TL;DR: This article used the continuity of text from books to train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage, which can produce highly generic sentence representations that are robust and perform well in practice.

Proceedings Article

Generating Sentences from a Continuous Space

Samuel R. Bowman, +5 more

TL;DR: This work introduces and studies an RNN-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences, which allows it to explicitly model holistic properties of sentences such as style, topic, and high-level syntactic features.

Proceedings Article

Supervised learning of universal sentence representations from natural language inference data

Alexis Conneau, +4 more

TL;DR: This article showed how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.

Proceedings Article

Document Modeling with Gated Recurrent Neural Network for Sentiment Classification

Duyu Tang, +2 more

TL;DR: A neural network model is introduced to learn vector-based document representations in a unified, bottom-up fashion; it dramatically outperforms standard recurrent neural networks in document modeling for sentiment classification.

References

Journal Article

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.

Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, +2 more

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

Vinod Nair, +1 more

TL;DR: Restricted Boltzmann machines were developed using binary stochastic hidden units; replacing these with noisy rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.

Posted Content

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, +2 more

- 01 Sep 2014 -

arXiv: Computation and Language

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

Related Papers (5)

Convolutional Neural Networks for Sentence Classification

Long short-term memory

Distributed Representations of Words and Phrases and their Compositionality

Glove: Global Vectors for Word Representation

Adam: A Method for Stochastic Optimization