Open Access · Posted Content

Retrieval Augmentation Reduces Hallucination in Conversation

TLDR
This paper explores the use of neural retrieval-in-the-loop architectures for knowledge-grounded dialogue, a task that is arguably more challenging than open-domain QA as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses.
Abstract: 
Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020). In this work we explore the use of neural-retrieval-in-the-loop architectures - recently shown to be effective in open-domain QA (Lewis et al., 2020b; Izacard and Grave, 2020) - for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses. We study various types of architectures with multiple components - retrievers, rankers, and encoder-decoders - with the goal of maximizing knowledgeability while retaining conversational ability. We demonstrate that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks. The models exhibit open-domain conversational capabilities, generalize effectively to scenarios not within the training data, and, as verified by human evaluations, substantially reduce the well-known problem of knowledge hallucination in state-of-the-art chatbots.
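The pipeline the abstract describes — query a knowledge source using the multi-turn dialogue context, then condition response generation on the retrieved passages — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the real systems use dense retrievers and neural encoder-decoders (e.g. BART/T5-style models), whereas here a toy bag-of-words retriever and a stub generator stand in, and all function names are illustrative.

```python
# Sketch of a retrieval-in-the-loop dialogue pipeline. The bag-of-words
# scorer below is a stand-in for a learned dense retriever, and the
# "generator" is a stub; component names are illustrative only.

from collections import Counter
import math

def score(query_tokens, doc_tokens):
    """Cosine-style overlap between token bags (retriever stand-in)."""
    q, d = Counter(query_tokens), Counter(doc_tokens)
    overlap = sum((q & d).values())
    norm = math.sqrt(sum(q.values()) * sum(d.values())) or 1.0
    return overlap / norm

def retrieve(dialogue_context, knowledge, k=2):
    """Query on the full multi-turn context; return the top-k passages."""
    query = " ".join(dialogue_context).lower().split()
    ranked = sorted(knowledge,
                    key=lambda doc: score(query, doc.lower().split()),
                    reverse=True)
    return ranked[:k]

def respond(dialogue_context, knowledge):
    """Condition the response on retrieved knowledge (generator is a stub)."""
    passages = retrieve(dialogue_context, knowledge)
    # A real system feeds context + passages into an encoder-decoder here.
    return f"Grounded on: {passages[0]}"

context = ["Tell me about the Eiffel Tower.", "How tall is it?"]
docs = ["The Eiffel Tower is 330 metres tall.",
        "The Louvre is a museum in Paris."]
print(respond(context, docs))
```

The key point the sketch preserves is that the retrieval query is built from the whole dialogue context, not just the last turn, which is what makes conversational retrieval harder than single-question QA.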


Citations
Posted Content

Internet-Augmented Dialogue Generation.

TL;DR: This article proposed an approach that learns to generate an internet search query based on the context, and then conditions on the search results to finally generate a response, a method that can employ up-to-the-minute relevant information.
Journal Article

FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

TL;DR: This work creates FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark; it benchmarks a series of state-of-the-art models and proposes an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness.
Posted Content

Beyond Goldfish Memory: Long-Term Open-Domain Conversation.

TL;DR: In this article, the authors collected and released a human-human dataset consisting of multiple chat sessions in which the speaking partners learn about each other's interests and discuss what they have learnt from past sessions; they show that models trained on existing datasets perform poorly in this long-term conversation setting in both automatic and human evaluations.
Posted Content

TruthfulQA: Measuring How Models Mimic Human Falsehoods

TL;DR: This paper proposed a benchmark to measure whether a language model is truthful in generating answers to questions, which consists of 817 questions that span 38 categories, including health, law, finance and politics.
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-German and English-to-French translation.
Proceedings Article

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This article introduces a unified framework that converts all text-based language problems into a text-to-text format and compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Journal Article

Natural Questions: A Benchmark for Question Answering Research

TL;DR: The Natural Questions corpus, a question answering data set, is presented, introducing robust metrics for the purposes of evaluating question answering systems; demonstrating high human upper bounds on these metrics; and establishing baseline results using competitive methods drawn from related literature.
Proceedings Article

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

TL;DR: It is shown that, in comparison to other recently introduced large-scale datasets, TriviaQA has relatively complex, compositional questions, has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and requires more cross sentence reasoning to find answers.