Open Access · Proceedings Article · DOI

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

TLDR
This paper conducts a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, and word-embedding-based RNN/CNN models.
Abstract
Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation of the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, and word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging.
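The parameter-free pooling operations described in the abstract (average-pooling, max-pooling, and hierarchical pooling over word embeddings) can be sketched as follows. This is a minimal illustration in numpy, not the authors' implementation; the function names and the window size are assumptions for clarity.

```python
import numpy as np

def swem_avg(emb):
    # emb: (seq_len, dim) matrix of word embeddings;
    # average-pooling collapses the sequence into one dim-vector
    return emb.mean(axis=0)

def swem_max(emb):
    # max-pooling keeps, per dimension, the largest value over the sequence
    return emb.max(axis=0)

def swem_hier(emb, window=2):
    # hierarchical pooling: average over local n-gram windows,
    # then take a global max over the window averages, preserving
    # some spatial (n-gram) information
    n, _ = emb.shape
    windows = [emb[i:i + window].mean(axis=0)
               for i in range(max(1, n - window + 1))]
    return np.max(windows, axis=0)
```

All three produce a fixed-size sentence representation with no learned composition parameters, which is what makes SWEMs so cheap relative to RNN/CNN encoders.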



Citations
Posted Content

Graph Convolutional Networks for Text Classification

TL;DR: A Text Graph Convolutional Network (Text GCN) is proposed for text classification; it jointly learns embeddings for both words and documents, supervised by the known class labels of documents.
Journal ArticleDOI

A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

TL;DR: Sarcasm detection on text documents is one of the most challenging tasks in NLP. This paper presents an effective sarcasm identification framework for social media data that pursues the paradigms of neural language models and deep neural networks.
Posted Content

Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations

TL;DR: A Knowledge-Enriched Transformer (KET) is proposed, where contextual utterances are interpreted using hierarchical self-attention and external commonsense knowledge is dynamically leveraged using a context-aware affective graph attention mechanism.
Posted Content

Tensor Graph Convolutional Networks for Text Classification.

TL;DR: This paper investigates graph-based neural networks for text classification problem with a new framework TensorGCN (tensor graph convolutional networks), which presents an effective way to harmonize and integrate heterogeneous information from different kinds of graphs.
Posted Content

Estimating Training Data Influence by Tracking Gradient Descent

TL;DR: TracIn is a method that computes the influence of a training example on a prediction made by the model, by tracing how the loss on the test point changes during the training process whenever the training example of interest is utilized.
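The tracing idea behind TracIn reduces to summing, over training checkpoints, the learning-rate-weighted dot products between the loss gradient of the training example and that of the test point. A minimal sketch of that reduction, assuming the per-checkpoint gradients have already been computed (the function name and inputs are illustrative, not the paper's code):

```python
import numpy as np

def tracin_influence(train_grads, test_grads, lrs):
    # train_grads[i], test_grads[i]: flattened loss gradients at checkpoint i
    # lrs[i]: learning rate in effect at checkpoint i
    # influence = sum_i lr_i * <grad_train_i, grad_test_i>
    return sum(lr * np.dot(g_tr, g_te)
               for lr, g_tr, g_te in zip(lrs, train_grads, test_grads))
```

A large positive value suggests the training example repeatedly pushed the test loss down (a "proponent"); a negative value suggests the opposite.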
References
Proceedings Article

Deconvolutional Latent-Variable Model for Text Sequence Matching.

TL;DR: The authors employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization, and apply the model to text sequence matching problems.
Posted Content

Adaptive Convolutional Filter Generation for Natural Language Understanding.

TL;DR: This work proposes an adaptive convolutional filter generation framework for natural language understanding, leveraging a meta network to generate input-aware filters, and builds on it an adaptive question answering (AdaQA) model; a novel two-way feature abstraction mechanism is introduced to encapsulate co-dependent sentence representations.