Open Access Proceedings Article
Retrieval Augmentation Reduces Hallucination in Conversation
Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
pp. 3784-3803
TL;DR: This paper explores the use of neural retrieval-in-the-loop architectures for knowledge-grounded dialogue, a task that is arguably more challenging than open-domain QA as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses.
Abstract: Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020). In this work we explore the use of neural-retrieval-in-the-loop architectures - recently shown to be effective in open-domain QA (Lewis et al., 2020b; Izacard and Grave, 2020) - for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses. We study various types of architectures with multiple components - retrievers, rankers, and encoder-decoders - with the goal of maximizing knowledgeability while retaining conversational ability. We demonstrate that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks. The models exhibit open-domain conversational capabilities, generalize effectively to scenarios not within the training data, and, as verified by human evaluations, substantially reduce the well-known problem of knowledge hallucination in state-of-the-art chatbots.
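As a concrete illustration of the retrieve-then-generate loop the abstract describes, the sketch below uses the Hugging Face Transformers implementation of RAG (Lewis et al., 2020b), one of the architectures the paper builds on. It is a stand-in, not the paper's released ParlAI models, and the dummy index keeps it runnable without downloading the full Wikipedia passage index.

```python
# Minimal retrieval-augmented generation sketch using the Hugging Face
# implementation of RAG (Lewis et al., 2020b). Illustrative stand-in only:
# the paper's own dialogue models are built and released in ParlAI.
from transformers import RagTokenizer, RagRetriever, RagTokenForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
# use_dummy_dataset=True swaps in a tiny toy index so the sketch runs
# without the full Wikipedia passage index.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained(
    "facebook/rag-token-nq", retriever=retriever
)

# In the dialogue setting the paper studies, the retrieval query is the
# (possibly multi-turn) conversation so far, not a single question.
dialogue_context = "I love classical music. Who composed The Magic Flute?"
inputs = tokenizer(dialogue_context, return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

The retriever fetches supporting passages for the context, and the encoder-decoder conditions on them when generating, which is the mechanism credited with reducing hallucination.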
Citations
Proceedings Article
A Model of Cross-Lingual Knowledge-Grounded Response Generation for Open-Domain Dialogue Systems
TL;DR: This paper showed that the knowledge inherent in cross-lingual language models can help generate responses in open-domain Korean dialogue systems, even when only English knowledge is given to the dialogue system, and developed a knowledge-grounded Korean dialogue model based on KE-T5.
Proceedings Article
Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories
David Wilmot, Frank Keller
TL;DR: This paper improved the standard transformer language model by incorporating an external knowledge base (derived from Retrieval Augmented Generation) and adding a memory mechanism to enhance performance on longer works, using a novel approach that derives salience annotations from chapter-aligned summaries in the Shmoop corpus for classic literary works.
Posted Content
MFAQ: a Multilingual FAQ Dataset
TL;DR: The authors collected around 6M FAQ pairs from the web in 21 different languages, adopted a setup similar to Dense Passage Retrieval (DPR), and tested various bi-encoders on this dataset.
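For context, a DPR-style bi-encoder embeds queries and passages separately and scores them by dot product. The sketch below uses the publicly released English DPR checkpoints as stand-ins; MFAQ's own multilingual encoders are not shown here.

```python
# Bi-encoder retrieval in the style of Dense Passage Retrieval (DPR):
# questions and passages are embedded independently, then scored by
# dot product. Stand-in checkpoints, not the MFAQ models.
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

q_tok = DPRQuestionEncoderTokenizer.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base")
c_tok = DPRContextEncoderTokenizer.from_pretrained(
    "facebook/dpr-ctx_encoder-single-nq-base")
c_enc = DPRContextEncoder.from_pretrained(
    "facebook/dpr-ctx_encoder-single-nq-base")

question = "How do I reset my password?"
faqs = [
    "To reset your password, click 'Forgot password' on the login page.",
    "Shipping normally takes three to five business days.",
]

with torch.no_grad():
    q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output
    c_emb = c_enc(**c_tok(faqs, return_tensors="pt",
                          padding=True)).pooler_output

scores = q_emb @ c_emb.T            # dot-product relevance scores
print(faqs[int(scores.argmax())])   # best-matching FAQ answer
```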
Posted Content
Reason first, then respond: Modular Generation for Knowledge-infused Dialogue
TL;DR: The authors propose a knowledge-to-response (K2R) model that first generates a knowledge sequence from the dialogue context as an intermediate step, and then attends to both its own generated knowledge sequence and the dialogue context to produce the final response.
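A minimal sketch of this two-step pipeline follows, assuming two separately fine-tuned seq2seq models; the checkpoint names and the knowledge separator token are hypothetical placeholders, not artifacts from the paper.

```python
# Sketch of the K2R "reason first, then respond" idea: one seq2seq model
# generates an intermediate knowledge sequence; a second conditions on both
# the dialogue context and that knowledge to produce the final response.
# Both checkpoint names below are hypothetical placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def generate(model, tokenizer, text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output[0], skip_special_tokens=True)

k_tok = AutoTokenizer.from_pretrained("my-org/k2r-knowledge")  # hypothetical
k_model = AutoModelForSeq2SeqLM.from_pretrained("my-org/k2r-knowledge")
r_tok = AutoTokenizer.from_pretrained("my-org/k2r-response")   # hypothetical
r_model = AutoModelForSeq2SeqLM.from_pretrained("my-org/k2r-response")

context = "Have you seen the new Dune movie? Who directed it?"
# Step 1: generate a knowledge sequence from the dialogue context.
knowledge = generate(k_model, k_tok, context)
# Step 2: condition on both context and generated knowledge for the reply.
# The "__knowledge__" separator is an illustrative convention only.
response = generate(r_model, r_tok, f"{context} __knowledge__ {knowledge}")
print(response)
```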
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
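The Adam update itself is compact; below is a minimal NumPy sketch of one step using the paper's default hyperparameters.

```python
# One Adam update step (Kingma & Ba), with the paper's defaults:
# alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8.
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)  # converges toward 0
```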
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposed the Transformer, a simple network architecture based solely on an attention mechanism that dispenses with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.
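The core operation, scaled dot-product attention, computes softmax(QK^T / sqrt(d_k)) V and fits in a few lines of NumPy:

```python
# Scaled dot-product attention from "Attention is All you Need":
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

Q = np.random.randn(2, 4)        # 2 queries, d_k = 4
K = np.random.randn(3, 4)        # 3 keys
V = np.random.randn(3, 8)        # 3 values, d_v = 8
print(attention(Q, K, V).shape)  # (2, 8)
```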
Proceedings Article
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
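That one-extra-layer fine-tuning recipe is what Hugging Face Transformers exposes directly; a minimal sketch for binary sentence classification follows (the example sentence and label are toy inputs):

```python
# Fine-tuning BERT with a single added classification head.
# AutoModelForSequenceClassification places one linear layer on top of the
# pooled [CLS] representation; only the label count must be specified.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # head is randomly initialized

inputs = tokenizer("A delightful, well-acted film.", return_tensors="pt")
labels = torch.tensor([1])              # 1 = positive (toy label)
outputs = model(**inputs, labels=labels)
outputs.loss.backward()                 # gradients flow through all layers
print(float(outputs.loss))
```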
Proceedings Article
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Thomas Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Samuel McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
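Few-shot learning here means placing worked examples directly in the prompt, with no gradient updates. The sketch below shows the 3-digit-arithmetic setup using GPT-2 as a freely available stand-in; it will usually get the arithmetic wrong, since the paper shows this capability emerging only at GPT-3 scale.

```python
# Few-shot (in-context) prompting: task demonstrations go in the prompt
# itself, with no fine-tuning. GPT-2 is a small stand-in for GPT-3 here.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Q: What is 128 + 367?\nA: 495\n"
    "Q: What is 714 + 205?\nA: 919\n"
    "Q: What is 333 + 444?\nA:"
)
out = generator(prompt, max_new_tokens=5, do_sample=False)
# Print only the model's continuation after the prompt.
print(out[0]["generated_text"][len(prompt):])
```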
Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
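In the text-to-text framing, every task is a string-to-string mapping selected by a task prefix; a minimal sketch with a released T5 checkpoint:

```python
# T5 casts every task as text-to-text: a task prefix selects the behavior,
# so translation, summarization, and classification share one interface.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```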