Open Access
Posted Content

Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information

TLDR
The authors propose a densely-connected recurrent and co-attentive neural network (DRCN), each layer of which uses the concatenated attentive features and hidden features of all the preceding recurrent layers.
Abstract
Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. These tasks require understanding the logical and semantic relationship between two sentences, which remains challenging. Although attention mechanisms are useful for capturing this semantic relationship and for properly aligning the elements of two sentences, previous attention-based methods simply use a summation operation, which does not sufficiently retain the original features. Inspired by DenseNet, a densely connected convolutional network, we propose a densely-connected co-attentive recurrent neural network, each layer of which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers. This preserves the original and co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. To alleviate the ever-increasing size of the feature vectors caused by dense concatenation, we also propose using an autoencoder after the concatenation. We evaluate the proposed architecture on highly competitive benchmark datasets for sentence matching. Experimental results show that our architecture, which retains recurrent and attentive features, achieves state-of-the-art performance on most of the tasks.
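To make the idea concrete, here is a minimal PyTorch sketch of one densely-connected co-attentive layer and the autoencoder bottleneck described above. It assumes dot-product co-attention and a shared BiLSTM encoder; class and variable names are illustrative, and this is not the authors' released code.

```python
import torch
import torch.nn as nn


class DenseCoAttentiveLayer(nn.Module):
    """One layer: recurrent encoding + co-attention + dense concatenation."""
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.rnn = nn.LSTM(input_dim, hidden_dim, batch_first=True,
                           bidirectional=True)

    def forward(self, x_p, x_h):
        # x_p, x_h: (batch, seq_len, input_dim) for premise / hypothesis
        h_p, _ = self.rnn(x_p)                 # (batch, len_p, 2*hidden_dim)
        h_h, _ = self.rnn(x_h)                 # (batch, len_h, 2*hidden_dim)

        # Co-attention: align each position with the other sentence.
        scores = torch.bmm(h_p, h_h.transpose(1, 2))         # (batch, len_p, len_h)
        a_p = torch.bmm(torch.softmax(scores, dim=2), h_h)   # attended hypothesis
        a_h = torch.bmm(torch.softmax(scores, dim=1).transpose(1, 2), h_p)

        # Dense connection: concatenate rather than sum, so the layer input,
        # the new hidden features, and the attentive features all flow upward.
        out_p = torch.cat([x_p, h_p, a_p], dim=-1)
        out_h = torch.cat([x_h, h_h, a_h], dim=-1)
        return out_p, out_h


class BottleneckAE(nn.Module):
    """Autoencoder that shrinks the ever-growing concatenated features."""
    def __init__(self, in_dim: int, code_dim: int):
        super().__init__()
        self.encoder = nn.Linear(in_dim, code_dim)
        self.decoder = nn.Linear(code_dim, in_dim)  # reconstruction loss used in training

    def forward(self, x):
        return self.encoder(x)  # compressed features passed to the next layer
```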


Citations
Posted Content

The Natural Language Decathlon: Multitask Learning as Question Answering

TL;DR: The Natural Language Decathlon (decaNLP) casts ten diverse NLP tasks as question answering over a given context and trains a single multitask question answering network to solve them jointly, without task-specific modules or parameters.
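As a concrete illustration of this framing, the snippet below casts a few different tasks as (question, context, answer) triples so one QA model can handle them all. The examples are invented for illustration.

```python
# Hypothetical examples of the "every task is question answering" framing.
examples = [
    {  # natural language inference
        "question": "Is the hypothesis entailed by the premise?",
        "context": "Premise: A man is playing a guitar. "
                   "Hypothesis: A person is making music.",
        "answer": "entailment",
    },
    {  # sentiment classification
        "question": "Is this review positive or negative?",
        "context": "The film was a delightful surprise from start to finish.",
        "answer": "positive",
    },
    {  # summarization (answer is generated, not classified)
        "question": "What is the summary?",
        "context": "<full article text>",
        "answer": "<abstractive summary>",
    },
]
```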
Journal ArticleDOI

Deep Learning-based Text Classification: A Comprehensive Review

TL;DR: This paper provides a comprehensive review of more than 150 deep learning-based models for text classification developed in recent years, discussing their technical contributions, similarities, and strengths, and offering a quantitative analysis of their performance on popular benchmarks.
Posted Content

Multi-Task Deep Neural Networks for Natural Language Understanding

TL;DR: A Multi-Task Deep Neural Network (MT-DNN) learns representations across multiple natural language understanding (NLU) tasks, allowing domain adaptation with substantially fewer in-domain labels than pre-trained BERT representations.
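A schematic PyTorch sketch of the shared-encoder, per-task-head setup the TL;DR describes. The encoder here is a stand-in (the paper uses BERT), and all names are illustrative, not the MT-DNN implementation.

```python
import torch.nn as nn


class MultiTaskModel(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int,
                 task_num_labels: dict):
        super().__init__()
        self.encoder = encoder  # shared across all tasks
        # one lightweight output head per task
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, n)
            for task, n in task_num_labels.items()
        })

    def forward(self, inputs, task: str):
        pooled = self.encoder(inputs)    # (batch, hidden_dim) sentence vector
        return self.heads[task](pooled)  # task-specific logits
```

In multi-task training, mini-batches are drawn from the different tasks in turn, so the shared encoder sees all of them while each head only sees its own.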
Posted Content

Semantics-aware BERT for Language Understanding

TL;DR: This work proposes to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduces an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone.
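A hedged PyTorch sketch of the fusion idea: embed predicted semantic-role labels and concatenate them with the backbone's token representations. This simplifies SemBERT considerably (for example, it ignores multiple predicate-argument structures per sentence); names and dimensions are illustrative.

```python
import torch
import torch.nn as nn


class SemanticFusion(nn.Module):
    def __init__(self, bert_dim: int, num_srl_tags: int,
                 tag_dim: int, out_dim: int):
        super().__init__()
        self.tag_emb = nn.Embedding(num_srl_tags, tag_dim)
        self.proj = nn.Linear(bert_dim + tag_dim, out_dim)

    def forward(self, bert_hidden, srl_tags):
        # bert_hidden: (batch, seq_len, bert_dim) from the BERT backbone
        # srl_tags:    (batch, seq_len) label ids from a pre-trained SRL tagger
        fused = torch.cat([bert_hidden, self.tag_emb(srl_tags)], dim=-1)
        return self.proj(fused)  # semantics-aware token representations
```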
Proceedings ArticleDOI

Exploiting Edge Features for Graph Neural Networks

TL;DR: In this article, the authors propose doubly stochastic normalization of graph edge features in place of the row or symmetric normalization common in current graph neural networks, and construct new formulas for the operations in each layer so that they can handle multi-dimensional edge features.
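For intuition, one standard way to make a non-negative matrix doubly stochastic is Sinkhorn normalization (alternating row and column normalization), sketched below in PyTorch. The paper derives its own closed-form normalization, which differs in detail; this is a generic illustration of the property, not the paper's formula.

```python
import torch


def sinkhorn_normalize(E: torch.Tensor, n_iters: int = 10,
                       eps: float = 1e-8) -> torch.Tensor:
    """E: (num_nodes, num_nodes) non-negative edge weights.

    Alternately rescale rows and columns until both approximately sum to 1.
    """
    E = E.clamp(min=eps)
    for _ in range(n_iters):
        E = E / E.sum(dim=1, keepdim=True)  # rows sum to 1
        E = E / E.sum(dim=0, keepdim=True)  # columns sum to 1
    return E
```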
References
Proceedings ArticleDOI

aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model

TL;DR: This article proposes an attention-based neural matching model for ranking short answer texts, which adopts a value-shared weighting scheme instead of a position-shared one for combining different matching signals, and incorporates question term importance learning using a question attention network.
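A minimal PyTorch sketch of the value-shared weighting idea: matching signals (e.g., cosine similarities) are bucketed by value, and each bucket shares a learned weight, rather than tying weights to positions. Bin boundaries, dimensions, and names are illustrative, not the aNMM implementation.

```python
import torch
import torch.nn as nn


class ValueSharedCombiner(nn.Module):
    def __init__(self, num_bins: int = 10):
        super().__init__()
        # interior boundaries splitting similarity values in [-1, 1] into bins
        self.register_buffer(
            "boundaries", torch.linspace(-1.0, 1.0, num_bins + 1)[1:-1])
        self.bin_weights = nn.Parameter(torch.zeros(num_bins))

    def forward(self, sim: torch.Tensor) -> torch.Tensor:
        # sim: (q_len, a_len) similarities for one question/answer pair
        bins = torch.bucketize(sim, self.boundaries)  # bin index per signal
        w = self.bin_weights[bins]                    # shared weight per bin
        return (w * sim).sum(dim=1)                   # one score per question term
```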
Posted Content

A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

TL;DR: The Multi-Genre Natural Language Inference (MultiNLI) corpus is a dataset designed for use in the development and evaluation of machine learning models for sentence understanding.
Proceedings Article

Learning to Compose Task-Specific Tree Structures

TL;DR: Gumbel Tree-LSTM uses the Straight-Through Gumbel-Softmax estimator to dynamically decide the parent node among candidates and to compute gradients through the discrete decision.
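The Straight-Through Gumbel-Softmax estimator itself is compact enough to sketch in PyTorch: sample a discrete choice (here, which candidate becomes the parent node) in the forward pass, but backpropagate through the soft distribution. PyTorch also ships this as F.gumbel_softmax(..., hard=True); the standalone version below is for illustration.

```python
import torch
import torch.nn.functional as F


def st_gumbel_softmax(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    # Sample Gumbel(0, 1) noise; the small constants avoid log(0).
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    y_soft = F.softmax((logits + gumbel) / tau, dim=-1)

    # Hard one-hot selection of the argmax candidate.
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)

    # Straight-through: forward pass uses the hard one-hot,
    # gradients flow through the soft probabilities.
    return y_hard + (y_soft - y_soft.detach())
```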
Posted Content

Natural Language Inference over Interaction Space.

TL;DR: This paper proposes the Interactive Inference Network (IIN), a novel class of neural network architectures that achieves a high-level understanding of a sentence pair by hierarchically extracting semantic features from the interaction space.
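A small PyTorch sketch of the "interaction space" idea: build an interaction tensor over all word pairs of the two sentences, then extract features from it convolutionally. The elementwise-product interaction and the single conv layer are illustrative stand-ins; the paper uses a richer interaction function and a DenseNet-style feature extractor.

```python
import torch
import torch.nn as nn

batch, len_p, len_h, dim = 4, 12, 10, 64
h_p = torch.randn(batch, len_p, dim)  # encoded premise
h_h = torch.randn(batch, len_h, dim)  # encoded hypothesis

# (batch, len_p, len_h, dim): one feature vector per word pair
interaction = h_p.unsqueeze(2) * h_h.unsqueeze(1)

# Treat the interaction tensor as an image and extract features with a CNN.
conv = nn.Conv2d(dim, 32, kernel_size=3, padding=1)
features = conv(interaction.permute(0, 3, 1, 2))  # (batch, 32, len_p, len_h)
```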
Posted Content

aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model

TL;DR: It is shown that the relatively simple aNMM model can significantly outperform other neural network models that have been used for the question answering task, and is competitive with models combined with additional features.