Open Access · Proceedings Article (DOI)

Multinomial Adversarial Networks for Multi-Domain Text Classification

Xilun Chen, +1 more · Vol. 1, pp. 1226–1240
TL;DR: A multinomial adversarial network (MAN) is proposed to tackle the real-world problem of multi-domain text classification (MDTC), in which labeled data may exist for multiple domains but in insufficient amounts to train effective classifiers for one or more of them.
Abstract
Many text classification tasks are known to be highly domain-dependent. Unfortunately, the availability of training data can vary drastically across domains. Worse still, for some domains there may not be any annotated data at all. In this work, we propose a multinomial adversarial network (MAN) to tackle this real-world problem of multi-domain text classification (MDTC) in which labeled data may exist for multiple domains, but in insufficient amounts to train effective classifiers for one or more of the domains. We provide theoretical justifications for the MAN framework, proving that different instances of MANs are essentially minimizers of various f-divergence metrics (Ali and Silvey, 1966) among multiple probability distributions. MANs are thus a theoretically sound generalization of traditional adversarial networks that discriminate over two distributions. More specifically, for the MDTC task, MAN learns features that are invariant across multiple domains by resorting to its ability to reduce the divergence among the feature distributions of each domain. We present experimental results showing that MANs significantly outperform the prior art on the MDTC task. We also show that MANs achieve state-of-the-art performance for domains with no labeled data.
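To make the setup concrete, here is a minimal PyTorch sketch of the multinomial adversarial idea; it is illustrative, not the authors' released code. The sizes, the λ weight, and the use of pre-pooled input features are assumptions, and the paper's domain-specific feature extractors are omitted. A shared encoder feeds both a text classifier and a multinomial (N-domain) discriminator; the discriminator learns to identify the domain of shared features, while the encoder is trained to defeat it (in the spirit of the paper's MAN-NLL objective), driving the per-domain feature distributions together:

```python
# Illustrative MAN-style sketch (assumed names and sizes, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_DOMAINS, NUM_CLASSES, IN_DIM, HID = 4, 2, 100, 64

shared_encoder = nn.Sequential(nn.Linear(IN_DIM, HID), nn.ReLU())  # F_s
classifier = nn.Linear(HID, NUM_CLASSES)                           # C
discriminator = nn.Linear(HID, NUM_DOMAINS)                        # D: multinomial

opt_main = torch.optim.Adam(
    list(shared_encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
opt_disc = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

def train_step(x, y, d, lam=0.1):
    """x: pooled text features, y: class labels, d: domain ids."""
    # 1) Train D to recognize the domain of shared features (NLL over domains).
    opt_disc.zero_grad()
    d_loss = F.cross_entropy(discriminator(shared_encoder(x).detach()), d)
    d_loss.backward()
    opt_disc.step()

    # 2) Train F_s and C: classify correctly while *maximizing* D's loss,
    #    which pushes the per-domain feature distributions together.
    opt_main.zero_grad()
    feats = shared_encoder(x)
    loss = (F.cross_entropy(classifier(feats), y)
            - lam * F.cross_entropy(discriminator(feats), d))
    loss.backward()
    opt_main.step()
```

Because D's output is a softmax over all N domains rather than a binary real/fake decision, the same alternating updates apply however many domains there are, which is what the paper's f-divergence analysis generalizes.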


Citations
Posted Content

WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale

TL;DR: The authors introduce WinoGrande, a large-scale dataset of 44k problems inspired by the original Winograd Schema Challenge (WSC) design, but adjusted to improve both the scale and the hardness of the dataset.
Proceedings Article (DOI)

Multi-Source Domain Adaptation with Mixture of Experts

TL;DR: This article proposes a mixture-of-experts approach for unsupervised domain adaptation from multiple sources, which explicitly captures the relationship between a target example and different source domains. Their approach is, however, limited to sentiment analysis and part-of-speech tagging.
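A minimal sketch of the mixture-of-experts idea described in this TL;DR; the module names, gating network, and dimensions are illustrative assumptions, not the paper's code. Each source domain gets its own expert, and a gate scores how related an example is to each source:

```python
# Illustrative mixture-of-experts over source domains (assumed architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfSourceExperts(nn.Module):
    def __init__(self, num_sources, feat_dim, num_classes):
        super().__init__()
        # One expert classifier per source domain.
        self.experts = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_sources)])
        # Gate: scores the example-to-source-domain relationship.
        self.gate = nn.Linear(feat_dim, num_sources)

    def forward(self, x):
        alpha = F.softmax(self.gate(x), dim=-1)               # (B, S) weights
        preds = torch.stack([e(x) for e in self.experts], 1)  # (B, S, C)
        return (alpha.unsqueeze(-1) * preds).sum(dim=1)       # weighted vote
```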
Proceedings Article (DOI)

Multi-Source Cross-Lingual Model Transfer: Learning What to Share

TL;DR: This model leverages adversarial networks to learn language-invariant features and mixture-of-experts models to dynamically exploit the similarity between the target language and each individual source language, further boosting target-language performance.
Posted Content

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

TL;DR: This paper provides a comprehensive overview of the application of adversarial training to affective computing and sentiment analysis, and highlights a range of potential future research directions.
Proceedings Article (DOI)

Domain-agnostic Question-Answering with Adversarial Training

TL;DR: An adversarial training framework is utilized for domain generalization in the question answering (QA) task, in which two models constantly compete so that the QA model learns domain-invariant features.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
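For quick reference, the Adam update from the paper, where g_t is the gradient at step t and the paper's suggested defaults are β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸:

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\,g_t, &
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2, \\
\hat{m}_t &= m_t / (1-\beta_1^t), &
\hat{v}_t &= v_t / (1-\beta_2^t), \\
\theta_t &= \theta_{t-1} - \alpha\,\hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)
\end{aligned}
```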
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
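The transform normalizes each activation over a mini-batch of size m and then rescales it with learned parameters γ and β:

```latex
\begin{aligned}
\mu_B &= \tfrac{1}{m}\textstyle\sum_{i=1}^{m} x_i, &
\sigma_B^2 &= \tfrac{1}{m}\textstyle\sum_{i=1}^{m} (x_i - \mu_B)^2, \\
\hat{x}_i &= \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, &
y_i &= \gamma \hat{x}_i + \beta
\end{aligned}
```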
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
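Of the two architectures, the skip-gram variant trains each word to predict its context; its objective is usually written as the average log-probability of context words within a window of size c, with the conditional probability given by a softmax over the vocabulary (this formulation follows the authors' companion work on skip-gram):

```latex
\frac{1}{T}\sum_{t=1}^{T}\ \sum_{-c \le j \le c,\; j \ne 0} \log p(w_{t+j} \mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp({v'_{w_O}}^{\top} v_{w_I})}{\sum_{w=1}^{W} \exp({v'_w}^{\top} v_{w_I})}
```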
Posted Content

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
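The soft search is realized as an attention mechanism: an alignment score between the previous decoder state s_{i-1} and each encoder annotation h_j is normalized into weights, which mix the annotations into a per-step context vector c_i:

```latex
e_{ij} = a(s_{i-1}, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij}\, h_j
```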

Automatic differentiation in PyTorch

TL;DR: This paper describes the automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models; it differentiates purely imperative programs, with a focus on extensibility and low overhead.
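A minimal example of the imperative style this enables, using the standard torch API: operations on tensors with requires_grad=True are recorded as they run, and reverse-mode differentiation is invoked on demand.

```python
# PyTorch records imperative tensor operations and differentiates them on demand.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # y = x1^2 + x2^2, built as ordinary imperative code
y.backward()         # reverse-mode automatic differentiation
print(x.grad)        # tensor([4., 6.]) == dy/dx = 2x
```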