Open Access · Journal Article

Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation

TL;DR: An unsupervised neural model with an interactive attention mechanism is presented, which learns the semantic relationship between records and reference texts to achieve better content transfer and better style preservation.
Abstract
In this paper, we focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer: it aims to preserve text style while altering the content. In detail, the input is a set of structured records and a reference text describing another recordset. The output is a summary that accurately describes the partial content of the source recordset in the same writing style as the reference. The task is unsupervised due to the lack of parallel data, and it is challenging both to select suitable records and style words from the bi-aspect inputs and to generate a high-fidelity long document. To tackle these problems, we first build a dataset based on a basketball game report corpus as our testbed, and present an unsupervised neural model with an interactive attention mechanism, which learns the semantic relationship between records and reference texts to achieve better content transfer and better style preservation. In addition, we explore the effectiveness of back-translation in our task for constructing pseudo-training pairs. Empirical results show the superiority of our approaches over competitive methods, and the models also yield a new state-of-the-art result on a sentence-level dataset.
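For intuition, here is a minimal sketch of what an interactive attention layer between record encodings and reference-text encodings could look like; the function name, shapes, and fusion-by-concatenation are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F

def interactive_attention(record_h, ref_h):
    """record_h: (n_records, d) record encodings; ref_h: (n_ref_tokens, d)."""
    # Pairwise affinity between every record and every reference token.
    scores = record_h @ ref_h.T                       # (n_records, n_ref_tokens)
    # Records attend to reference tokens (style side) ...
    rec2ref = F.softmax(scores, dim=-1) @ ref_h       # (n_records, d)
    # ... and reference tokens attend back to records (content side).
    ref2rec = F.softmax(scores.T, dim=-1) @ record_h  # (n_ref_tokens, d)
    # Fused representations can then drive the decoder's select/copy decisions.
    return torch.cat([record_h, rec2ref], -1), torch.cat([ref_h, ref2rec], -1)

# Example: 5 records, a 12-token reference, hidden size 64.
fused_rec, fused_ref = interactive_attention(torch.randn(5, 64), torch.randn(12, 64))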



Citations
Posted Content

A Survey of Knowledge-Enhanced Text Generation

TL;DR: A comprehensive review of the research on knowledge-enhanced text generation over the past five years is presented, which includes two parts: (i) general methods and architectures for integrating knowledge into text generation; (ii) specific techniques and applications according to different forms of knowledge data.
Proceedings Article

TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching

TL;DR: This work proposes TableGPT, a model for few-shot table-to-text generation that outperforms existing systems in most settings. It exploits multi-task learning with two auxiliary tasks: reconstructing the table structure from GPT-2's representation to preserve structural information, and matching the generated text against the table content to improve fidelity.
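As a hedged sketch of the multi-task setup: the total training loss combines the main generation objective with the two auxiliary terms. The function name and the fixed weights below are illustrative assumptions, not the paper's exact formulation.

def table_gpt_loss(gen_loss, struct_loss, match_loss, alpha=0.5, beta=0.5):
    # gen_loss:    cross-entropy of the generated table description (main task)
    # struct_loss: reconstructing the table structure from GPT-2's hidden
    #              states, so structural information is preserved
    # match_loss:  scoring whether the generated text matches the table
    #              content, improving fidelity
    # alpha, beta: assumed mixing weights (hypothetical values)
    return gen_loss + alpha * struct_loss + beta * match_loss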
Journal Article

KID-Review: Knowledge-Guided Scientific Review Generation with Oracle Pre-training

TL;DR: An end-to-end knowledge-guided review generation framework for scientific papers, grounded in the finding from cognitive psychology that a better understanding of text requires different types of knowledge, together with an oracle pre-training strategy that makes KID-Review better informed and helps the generated reviews cover more aspects.
Journal Article

Learning number reasoning for numerical table-to-text generation

TL;DR: This paper proposes a neural table reasoning generator (NTRG) that improves the number-reasoning capability of neural table-to-text generation by generating additional mathematical equations from numerical table records.
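To make the idea concrete, a toy sketch of deriving an equation from numeric records before generation; the record names and values are hypothetical, not drawn from the paper's data.

# Hypothetical numeric records from a game table.
records = {"home_PTS": 105, "away_PTS": 98}
margin = records["home_PTS"] - records["away_PTS"]
# The derived equation is fed to the generator alongside the raw records.
equation = f"{records['home_PTS']} - {records['away_PTS']} = {margin}"
# Conditioning on "105 - 98 = 7" supports generating text such as
# "won by 7 points" rather than copying unrelated numbers.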
References
Journal Article

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1,000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
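A minimal LSTM cell in Python/numpy illustrates the constant error carousel: the cell state c is updated additively, which is what lets error flow across long time lags. Note this sketch uses the modern variant with a forget gate, which postdates the original paper; the fused affine map and shapes are conventions of this sketch.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    z = W @ x + U @ h + b              # all four gates in one affine map
    i, f, o, g = np.split(z, 4)        # input, forget, output gates; candidate
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c + i * g                  # additive update: the error carousel
    h = o * np.tanh(c)                 # gated exposure of the cell state
    return h, c

# Example: hidden size 8, input size 5, 20 time steps.
d, dx = 8, 5
W, U, b = np.random.randn(4 * d, dx), np.random.randn(4 * d, d), np.zeros(4 * d)
h, c = np.zeros(d), np.zeros(d)
for x in np.random.randn(20, dx):
    h, c = lstm_step(x, h, c, W, U, b)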
Journal Article

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G.
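The adversarial process is the paper's two-player minimax game over the value function

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))],

where D is trained to tell real samples from generated ones while G is trained to fool D.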
Proceedings Article

GloVe: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context-window methods, and produces a vector space with meaningful substructure.
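Concretely, the GloVe objective fits word-vector dot products to the logarithm of the co-occurrence counts X_{ij}, with a weighting function f that damps rare and overly frequent pairs:

J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2 .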
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
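Negative sampling replaces the full softmax with a binary objective: for each observed input/output word pair (w_I, w_O), the true output word is contrasted against k noise words drawn from a noise distribution P_n(w), maximizing

\log \sigma\!\left( {v'_{w_O}}^{\top} v_{w_I} \right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma\!\left( -{v'_{w_i}}^{\top} v_{w_I} \right) \right].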
Proceedings Article

Auto-Encoding Variational Bayes

TL;DR: A stochastic variational inference and learning algorithm is introduced that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.
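The algorithm maximizes the evidence lower bound (ELBO), made differentiable via the reparameterization trick:

\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\left[ \log p_\theta(x \mid z) \right] - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z) \right), \qquad z = \mu + \sigma \odot \epsilon, \; \epsilon \sim \mathcal{N}(0, I).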