Open Access · Posted Content

Neural data-to-text generation: A comparison between pipeline and end-to-end architectures

TL;DR
Automatic and human evaluations together with a qualitative analysis suggest that having explicit intermediate steps in the generation process results in better texts than the ones generated by end-to-end approaches.
Abstract
Traditionally, most data-to-text applications have been designed using a modular pipeline architecture, in which non-linguistic input data is converted into natural language through several intermediate transformations. In contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with far fewer explicit intermediate representations in between. This study introduces a systematic comparison between neural pipeline and end-to-end data-to-text approaches for the generation of text from RDF triples. Both architectures were implemented using state-of-the-art deep learning methods, such as the encoder-decoder Gated Recurrent Unit (GRU) and the Transformer. Automatic and human evaluations, together with a qualitative analysis, suggest that having explicit intermediate steps in the generation process results in better texts than those generated by end-to-end approaches. Moreover, the pipeline models generalize better to unseen inputs. Data and code are publicly available.
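To make the contrast concrete, the sketch below (Python, not the authors' released code) shows how a set of RDF triples might be fed either to a single end-to-end model or through the explicit intermediate stages of a pipeline; the stage functions are hypothetical stand-ins for trained GRU or Transformer models.

```python
# Minimal sketch (not the paper's code): contrasting the two architectures on
# one RDF input. The stage functions are hypothetical stand-ins for trained
# seq2seq models (GRU or Transformer in the paper).

triples = [("Alan_Bean", "birthPlace", "Wheeler,_Texas"),
           ("Alan_Bean", "occupation", "Test_pilot")]

def linearize(ts):
    # Flatten triples into a single token sequence, the usual end-to-end input.
    return " ".join(f"<S> {s} <P> {p} <O> {o}" for s, p, o in ts)

# --- End-to-end: one model maps the linearized triples straight to text. ---
def end_to_end(ts, seq2seq):
    return seq2seq(linearize(ts))

# --- Pipeline: explicit intermediate steps, each its own (hypothetical) model. ---
def pipeline(ts, order, structure, lexicalize, reg, realize):
    ordered   = order(ts)           # discourse ordering of the triples
    plan      = structure(ordered)  # grouping into sentence-sized chunks
    template  = lexicalize(plan)    # abstract text with entity placeholders
    referring = reg(template)       # referring-expression generation
    return realize(referring)       # surface realization / detokenization
```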


Citations
Posted Content

Investigating Pretrained Language Models for Graph-to-Text Generation

TL;DR: The study suggests that pretrained language models (PLMs) benefit from similar facts seen during pretraining or fine-tuning, such that they perform well even when the input graph is reduced to a simple bag of node and edge labels.
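As a rough illustration of the input conditions mentioned in the TL;DR, the snippet below builds a structured linearization of a small graph and the corresponding shuffled bag of node and edge labels; the example graph and the <H>/<R>/<T> markers are assumptions, not necessarily the paper's exact format.

```python
# Sketch of two input conditions: a structured linearization of the graph
# versus a shuffled "bag" of node and edge labels with all structure removed.
import random

graph = [("Alan_Bean", "birthPlace", "Wheeler,_Texas"),
         ("Alan_Bean", "selectedByNasa", "1963")]

# Structured input: triples kept intact and ordered.
structured = " ".join(f"<H> {h} <R> {r} <T> {t}" for h, r, t in graph)

# Bag of labels: the same strings with the triple structure discarded.
labels = [x for triple in graph for x in triple]
random.shuffle(labels)
bag_of_labels = " ".join(labels)

print(structured)
print(bag_of_labels)
```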
Posted Content

Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation.

TL;DR: This work proposes a novel confidence-oriented decoder that assigns a confidence score to each target position during training using a variational Bayes objective; these scores can be leveraged at inference time through a calibration technique to promote more faithful generation.
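The sketch below is a simplified proxy for the idea, not the paper's variational-Bayes formulation: a greedy decoder that records a per-token confidence (here just the softmax probability) and flags low-confidence positions that a calibration step could then handle. The model signature and threshold are assumptions.

```python
# Simplified proxy for confidence-oriented decoding (not the paper's method):
# greedy decoding that tracks per-token confidence and flags risky positions.
import torch

def confident_greedy_decode(model, input_ids, bos_id, eos_id,
                            max_len=64, threshold=0.4):
    """Assumes an encoder-decoder model called as
    model(input_ids=..., decoder_input_ids=...) returning .logits."""
    decoded, confidences = [bos_id], []
    for _ in range(max_len):
        dec = torch.tensor([decoded])
        logits = model(input_ids=input_ids, decoder_input_ids=dec).logits
        probs = logits[0, -1].softmax(dim=-1)
        conf, token = probs.max(dim=-1)
        decoded.append(token.item())
        confidences.append(conf.item())
        if token.item() == eos_id:
            break
    # Positions below the threshold are candidates for calibration / re-ranking.
    risky = [i for i, c in enumerate(confidences) if c < threshold]
    return decoded, confidences, risky
```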
Posted Content

Text-to-Text Pre-Training for Data-to-Text Tasks

TL;DR: The results indicate that text-to-text pre-training in the form of T5 enables simple, end-to-end, transformer-based models to outperform pipelined neural architectures tailored for data-to-text generation, as well as alternatives such as BERT and GPT-2.
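A minimal inference sketch with the Hugging Face transformers library is shown below; the task prefix and triple-linearization format are assumptions, and an off-the-shelf t5-small checkpoint would need fine-tuning on data-to-text pairs before producing fluent verbalizations.

```python
# Minimal T5 inference sketch; input format and prefix are assumptions.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

source = "translate Graph to English: <H> Alan_Bean <R> occupation <T> Test_pilot"
inputs = tokenizer(source, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```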
Proceedings ArticleDOI

Bridging the Structural Gap Between Encoding and Decoding for Data-To-Text Generation

TL;DR: This work proposes DualEnc, a dual encoding model that can not only incorporate the graph structure, but can also cater to the linear structure of the output text, demonstrating that dual encoding can significantly improve the quality of the generated text.
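The toy module below illustrates the general dual-encoding idea (it is not DualEnc itself): one encoder mixes node embeddings over the graph's adjacency matrix while a GRU encodes the linearized sequence, and the two representations are concatenated for a downstream decoder. All dimensions and layer choices are arbitrary assumptions.

```python
# Toy illustration of dual encoding: a graph view plus a sequence view.
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.graph_proj = nn.Linear(dim, dim)        # one GCN-style mixing layer
        self.seq_enc = nn.GRU(dim, dim, batch_first=True)

    def forward(self, node_ids, adjacency, seq_ids):
        # Graph view: propagate node embeddings over the adjacency matrix.
        nodes = self.embed(node_ids)                       # (n, dim)
        graph_repr = torch.relu(self.graph_proj(adjacency @ nodes)).mean(0)
        # Sequence view: encode the linearized triples with a GRU.
        _, h = self.seq_enc(self.embed(seq_ids).unsqueeze(0))
        return torch.cat([graph_repr, h[-1, 0]], dim=-1)   # fed to a decoder
```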
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
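For reference, a minimal NumPy version of the Adam update rule with bias-corrected first and second moment estimates, applied to a one-dimensional toy objective:

```python
# Minimal Adam update rule (bias-corrected moment estimates).
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage: minimize f(x) = x^2 starting from x = 5 (gradient is 2x).
theta, m, v = np.array(5.0), 0.0, 0.0
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)  # moves toward 0
```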
Journal ArticleDOI

Long Short-Term Memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
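A single LSTM cell step, written out in NumPy to show the gating that protects the cell state (the "constant error carousel"); the stacked weight layout is one common convention, not the paper's notation:

```python
# One LSTM cell step: input/forget/output gates plus the cell-state update.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """W: (4*d_h, d_in), U: (4*d_h, d_h), b: (4*d_h,)."""
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input / forget / output gates
    c = f * c_prev + i * np.tanh(g)                # cell state carries the error flow
    h = o * np.tanh(c)                             # hidden state
    return h, c
```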
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
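The core operation of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, which a few lines of NumPy can reproduce:

```python
# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (n_q, n_k) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted sum of the values

Q = np.random.randn(2, 8); K = np.random.randn(5, 8); V = np.random.randn(5, 16)
print(scaled_dot_product_attention(Q, K, V).shape)     # (2, 16)
```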
Proceedings ArticleDOI

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes BLEU, a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
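A compact sentence-level BLEU sketch (modified n-gram precisions combined with a brevity penalty); production implementations such as sacreBLEU add smoothing and multi-reference handling:

```python
# Minimal single-reference BLEU: geometric mean of modified n-gram precisions
# (n = 1..4) times a brevity penalty.
from collections import Counter
import math

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))  # brevity penalty
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "the quick brown fox jumps over the lazy dog".split()
hyp = "the quick brown fox jumped over the lazy dog".split()
print(bleu(hyp, ref))   # nonzero score below 1.0
```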
Proceedings ArticleDOI

Neural Machine Translation of Rare Words with Subword Units

TL;DR: This paper introduces a simpler and more effective approach that makes the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline on the WMT '15 English-German and English-Russian translation tasks by up to 1.3 BLEU.
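The merge loop at the heart of byte-pair encoding can be sketched in a few lines; the toy vocabulary below follows the style of the paper's example, with </w> marking word ends:

```python
# Byte-pair encoding merge loop: repeatedly merge the most frequent adjacent
# symbol pair in a frequency-weighted vocabulary. Counts are illustrative.
import re
from collections import Counter

def get_pair_stats(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(10):
    best = get_pair_stats(vocab).most_common(1)[0][0]
    vocab = merge_pair(best, vocab)
    print(best)   # the learned merge operations, most frequent first
```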