Open Access Proceedings Article

Retrieval Augmentation Reduces Hallucination in Conversation

TL;DR
This paper explores the use of neural retrieval-in-the-loop architectures for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses.
Abstract: 
Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020). In this work we explore the use of neural-retrieval-in-the-loop architectures - recently shown to be effective in open-domain QA (Lewis et al., 2020b; Izacard and Grave, 2020) - for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses. We study various types of architectures with multiple components - retrievers, rankers, and encoder-decoders - with the goal of maximizing knowledgeability while retaining conversational ability. We demonstrate that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks. The models exhibit open-domain conversational capabilities, generalize effectively to scenarios not within the training data, and, as verified by human evaluations, substantially reduce the well-known problem of knowledge hallucination in state-of-the-art chatbots.
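
As a rough, hypothetical illustration of the retrieve-rank-generate structure the abstract describes, the sketch below collapses a multi-turn dialogue context into a query, retrieves and re-ranks knowledge passages, and conditions a stubbed generator on the best passage. All names and the word-overlap scoring are placeholders, not the paper's implementation.

```python
# Toy sketch of a retrieval-in-the-loop dialogue pipeline (hypothetical names;
# the paper's systems use neural retrievers, rankers, and encoder-decoders).
from collections import Counter

KNOWLEDGE = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
    "The Great Barrier Reef is the world's largest coral reef system.",
]

def retrieve(query, docs, k=2):
    """Score documents by word overlap with the query (stand-in for a dense retriever)."""
    q = Counter(query.lower().split())
    scored = [(sum(q[w] for w in d.lower().split()), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True)[:k] if score > 0]

def rank(candidates, last_turn):
    """Re-rank retrieved passages against the latest turn (stand-in for a neural ranker)."""
    last = set(last_turn.lower().split())
    return sorted(candidates, key=lambda d: len(last & set(d.lower().split())), reverse=True)

def generate(dialogue_history, passage):
    """Stand-in for an encoder-decoder conditioned on dialogue context plus retrieved knowledge."""
    return f"From what I know, {passage}"

def respond(dialogue_history):
    # Collapse the multi-turn context into a single retrieval query.
    query = " ".join(dialogue_history)
    passages = retrieve(query, KNOWLEDGE)
    if not passages:
        return "I'm not sure, tell me more."
    best = rank(passages, dialogue_history[-1])[0]
    return generate(dialogue_history, best)

if __name__ == "__main__":
    history = ["I love visiting famous landmarks.", "What do you know about the Eiffel Tower?"]
    print(respond(history))
```

In the architectures the abstract enumerates, each of these placeholder steps is a learned neural component: a dense retriever, a ranker, and a large pre-trained encoder-decoder.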


Citations
Proceedings Article

A Model of Cross-Lingual Knowledge-Grounded Response Generation for Open-Domain Dialogue Systems

TL;DR: This paper showed that knowledge inherent in cross-lingual language models can be helpful for generating responses in open-domain Korean dialogue systems, even with only English knowledge given to the dialogue system, and developed a knowledge-grounded Korean dialogue model based on KE-T5.
Posted Content

Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories

TL;DR: This article improved the standard transformer language model by incorporating an external knowledge base (derived from Retrieval Augmented Generation) and adding a memory mechanism to enhance performance on longer works, using a novel approach that derives salience annotations from chapter-aligned summaries in the Shmoop corpus for classic literary works.
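
The summary above only names the components, so here is a loose, hypothetical sketch of how retrieved knowledge and a rolling memory of earlier story chunks might both feed a chunk-level salience score; every function is an illustrative placeholder rather than the authors' model.

```python
# Hypothetical sketch: augmenting a chunk-level salience scorer with retrieved
# knowledge and a rolling memory of earlier story chunks.
from collections import deque

def retrieve_knowledge(chunk, knowledge_base, k=2):
    """Placeholder for a RAG-style knowledge lookup; real systems use dense retrieval."""
    words = set(chunk.lower().split())
    ranked = sorted(knowledge_base, key=lambda p: len(words & set(p.lower().split())), reverse=True)
    return ranked[:k]

def salience_score(chunk, memory, knowledge):
    """Placeholder for the model's salience head; here, a crude overlap heuristic."""
    context = " ".join(list(memory) + knowledge)
    overlap = len(set(chunk.lower().split()) & set(context.lower().split()))
    return overlap / max(len(chunk.split()), 1)

def score_story(chunks, knowledge_base, memory_size=3):
    memory = deque(maxlen=memory_size)   # rolling memory of earlier chunks
    scores = []
    for chunk in chunks:
        knowledge = retrieve_knowledge(chunk, knowledge_base)
        scores.append(salience_score(chunk, memory, knowledge))
        memory.append(chunk)             # update memory after scoring the chunk
    return scores
```
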
Posted Content

MFAQ: a Multilingual FAQ Dataset

TL;DR: The authors collected around 6M FAQ pairs from the web in 21 different languages, adopted a setup similar to Dense Passage Retrieval (DPR), and tested various bi-encoders on this dataset.
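
For readers unfamiliar with the DPR-style bi-encoder setup mentioned above, the sketch below scores FAQ answers against a question using separately encoded embeddings and a dot product, with the public English DPR checkpoints from Hugging Face Transformers as a stand-in; the MFAQ baselines themselves are multilingual and trained on the FAQ data.

```python
# Minimal DPR-style bi-encoder retrieval sketch (illustrative; not the MFAQ
# paper's exact models or training setup).
import torch
from transformers import (DPRContextEncoder, DPRContextEncoderTokenizer,
                          DPRQuestionEncoder, DPRQuestionEncoderTokenizer)

q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
c_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
c_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

question = "How do I reset my password?"
answers = [
    "Click 'Forgot password' on the login page and follow the emailed link.",
    "Our shipping times vary between 3 and 5 business days.",
]

with torch.no_grad():
    q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output                  # (1, 768)
    a_emb = c_enc(**c_tok(answers, padding=True, return_tensors="pt")).pooler_output     # (2, 768)

scores = q_emb @ a_emb.T        # dot-product relevance, as in DPR
print(answers[scores.argmax().item()])
```
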
Posted Content

Reason first, then respond: Modular Generation for Knowledge-infused Dialogue.

TL;DR: The authors propose a knowledge-to-response (K2R) model, which generates a knowledge sequence, given a dialogue context, as an intermediate step, and then attends to its own generated knowledge sequence and the dialogue context to produce a final response.
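
A minimal sketch of that K2R control flow, with both models replaced by stubs; the actual K2R components are trained seq2seq models, not the placeholders shown here.

```python
# Hypothetical sketch of K2R-style modular generation: first produce a knowledge
# string from the dialogue context, then produce the response conditioned on
# both the context and that generated knowledge.

def knowledge_model(dialogue_context: str) -> str:
    """Stand-in for a seq2seq model trained to output a relevant knowledge sentence."""
    return "The Eiffel Tower is 330 metres tall."

def response_model(dialogue_context: str, knowledge: str) -> str:
    """Stand-in for a seq2seq model that attends to the context plus generated knowledge."""
    return f"I read that {knowledge.rstrip('.')}, so it's quite a climb!"

def k2r_respond(dialogue_context: str) -> str:
    knowledge = knowledge_model(dialogue_context)        # step 1: generate the knowledge sequence
    return response_model(dialogue_context, knowledge)   # step 2: respond, grounded on step 1

print(k2r_respond("How tall is the Eiffel Tower?"))
```
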
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
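
The update rule itself is compact enough to show directly; below is an illustrative NumPy version of a single Adam step with the paper's default hyperparameters (not a drop-in replacement for framework optimizers).

```python
# One Adam update step: adaptive estimates of the first and second moments of
# the gradient, with bias correction.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad           # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# One step on a toy quadratic loss 0.5 * theta^2, whose gradient is theta.
theta = np.array([1.0, -2.0]); m = np.zeros_like(theta); v = np.zeros_like(theta)
theta, m, v = adam_step(theta, grad=theta, m=m, v=v, t=1)
print(theta)
```
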
Proceedings Article

Attention is All you Need

TL;DR: This paper proposed the Transformer, a simple network architecture based solely on attention mechanisms that dispenses with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.
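
The core operation behind that architecture is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; a minimal NumPy sketch:

```python
# Scaled dot-product attention over toy query, key, and value matrices.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)      # (2, 4)
```
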
Proceedings ArticleDOI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can then be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
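
The "one additional output layer" recipe is easy to show concretely; the sketch below uses Hugging Face Transformers (not the original BERT TensorFlow code) to put a freshly initialized classification head on a pretrained encoder and take one fine-tuning step on a toy batch.

```python
# Pretrained BERT encoder plus a new classification head, fine-tuned end to end.
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

batch = tokenizer(["a delightful film", "a tedious mess"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

outputs = model(**batch, labels=labels)   # classification head sits on the pooled [CLS] state
outputs.loss.backward()                   # gradients flow through head and encoder jointly
print(outputs.logits.shape)               # (2, 2)
```
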
Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
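
The text-to-text framing means every task is expressed as "input text in, output text out", distinguished only by a task prefix; a small sketch using the public t5-small checkpoint and its standard prefixes:

```python
# Different tasks, one model, one interface: prefixed input text to output text.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

tasks = [
    "translate English to German: The house is wonderful.",
    "summarize: Retrieval augmentation reduces hallucination in dialogue models "
    "by grounding responses in retrieved documents.",
]
for text in tasks:
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=30)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```
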