Open Access · Posted Content
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information
TLDR
The authors propose a densely-connected co-attentive recurrent neural network (DRCN), each layer of which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers.
Abstract:
Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. These tasks require understanding the logical and semantic relationship between two sentences, which remains challenging. Although the attention mechanism is useful for capturing the semantic relationship and properly aligning the elements of two sentences, previous attention-based methods simply use a summation operation, which does not sufficiently retain the original features. Inspired by DenseNet, a densely connected convolutional network, we propose a densely-connected co-attentive recurrent neural network, each layer of which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers. This preserves the original and the co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. To alleviate the ever-increasing size of the feature vectors caused by dense concatenation, we also propose to use an autoencoder after dense concatenation. We evaluate the proposed architecture on highly competitive benchmark datasets for sentence matching. Experimental results show that our architecture, which retains recurrent and attentive features, achieves state-of-the-art performance on most of the tasks.
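The core idea of the abstract — concatenating, rather than summing, the word embedding, all preceding hidden features, and the co-attentive features — can be sketched roughly as follows. This is a minimal NumPy illustration under simplifying assumptions: the function names are illustrative, attention is plain dot-product attention, and the recurrent cells and the dimensionality-reducing autoencoder are omitted; it is not the authors' exact implementation.

```python
import numpy as np

def co_attention(p, q):
    """Dot-product co-attention: for each token of p, a softmax-weighted
    summary of q's hidden states. p: (len_p, d), q: (len_q, d)."""
    scores = p @ q.T                                      # (len_p, len_q) similarity matrix
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # softmax over q positions
    return weights @ q                                    # (len_p, d) attended features

def dense_layer_input(embeddings, prev_hiddens, attended):
    """DenseNet-style input to the next recurrent layer: concatenate the
    original embeddings, every preceding layer's hidden states, and the
    co-attentive features, instead of summing them."""
    return np.concatenate([embeddings] + prev_hiddens + [attended], axis=-1)

# Example: a 3-token premise and 5-token hypothesis with d = 4.
p = np.random.rand(3, 4)
q = np.random.rand(5, 4)
attended = co_attention(p, q)                  # (3, 4)
x = dense_layer_input(p, [p], attended)        # (3, 12): features grow with depth
```

Because the feature dimension grows with every layer (here 4 → 12), the paper's autoencoder step would compress `x` back to a fixed size before the next recurrent layer.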
Citations
Book Chapter · DOI
Using Fractional Latent Topic to Enhance Recurrent Neural Network in Text Similarity Modeling
Yang Song, Wenxin Hu, Liang He, et al.
TL;DR: This paper proposes a novel fractional latent topic based RNN (FraLT-RNN) model, which focuses on text representation at the topic level and largely preserves the whole semantic information of a text.
Proceedings Article · DOI
Pre-trained Language Model Based Active Learning for Sentence Matching
TL;DR: A pre-trained language model based active learning approach for sentence matching that uses linguistic criteria from the pre-trained language model to measure instances and helps select more effective instances for annotation.
Journal Article · DOI
Adversarial shared-private model for cross-domain clinical text entailment recognition
TL;DR: This work proposes a domain adaptation framework for cross-domain clinical RTE, which achieves significantly better performance than baseline domain adaptation methods in few-shot and zero-shot transfer settings.
Journal Article · DOI
Improved Machine Reading Comprehension Using Data Validation for Weakly Labeled Data
TL;DR: The proposed MRC model addresses the limitation of irrelevant context in MRC better than human supervision, showing a 4.33% performance improvement on TriviaQA Wiki over the existing baseline model.
Journal Article · DOI
Deep bi-directional interaction network for sentence matching
TL;DR: A Deep Bi-Directional Interaction Network (DBDIN) is proposed, which captures semantic relatedness from two directions, with each direction employing multiple attention-based interaction units, and finally introduces a self-attention mechanism to enhance global matching information with smaller model complexity.
References
Proceedings Article · DOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
Proceedings Article · DOI
Glove: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Proceedings Article · DOI
Densely Connected Convolutional Networks
TL;DR: DenseNet connects each layer to every other layer in a feed-forward fashion, which alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.