Open Access · Posted Content
Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick S. H. Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih
TL;DR: This work shows that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework.
Abstract:
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
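As a rough illustration of the dual-encoder retrieval idea described in the abstract (a toy sketch, not the paper's implementation: DPR trains two BERT encoders and indexes passages with FAISS, whereas here a deterministic pseudo-random token embedding stands in for a learned encoder, and brute-force inner product stands in for an index):

```python
import numpy as np

DIM = 256

def toy_encode(text: str) -> np.ndarray:
    """Stand-in for a trained encoder: each token gets a fixed
    pseudo-random vector; the text embedding is their mean."""
    vecs = [
        np.random.default_rng(sum(tok.encode()) % (2**32)).standard_normal(DIM)
        for tok in text.lower().split()
    ]
    return np.mean(vecs, axis=0)

def build_index(passages):
    """Pre-compute passage embeddings once, as DPR does offline."""
    return np.stack([toy_encode(p) for p in passages])

def retrieve(question, passages, index, k=2):
    """Rank passages by inner product with the question embedding."""
    scores = index @ toy_encode(question)
    order = np.argsort(-scores)[:k]
    return [(passages[i], float(scores[i])) for i in order]

passages = [
    "the capital of france is paris",
    "bm25 is a sparse retrieval baseline",
    "dense retrieval learns embeddings for questions and passages",
]
index = build_index(passages)
# the france passage should rank first for this question
print(retrieve("what is the capital of france", passages, index, k=1))
```

The key property the sketch preserves is that questions and passages are embedded independently, so all passage vectors can be precomputed and retrieval reduces to a maximum-inner-product search.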
Citations
Posted Content
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Michael Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
TL;DR: A general-purpose fine-tuning recipe for retrieval-augmented generation (RAG): models that combine pre-trained parametric and non-parametric memory for language generation. RAG models are found to generate more specific, diverse, and factual language than a state-of-the-art parametric-only seq2seq baseline.
Journal ArticleDOI
Self-supervised Learning: Generative or Contrastive
TL;DR: This survey examines new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning, and comprehensively reviews the existing empirical methods in three main categories according to their objectives.
Posted Content
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
TL;DR: Approximate nearest neighbor Negative Contrastive Estimation (ANCE) is presented, a training mechanism that constructs negatives from an Approximate Nearest Neighbor (ANN) index of the corpus, which is updated in parallel with the learning process to select more realistic negative training instances.
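The core of the negative-selection step can be sketched as follows (an illustrative toy, not ANCE itself: the real method retrieves negatives from an asynchronously refreshed ANN index over millions of passages, while here the "hardest" non-gold passages are found by exact brute-force scoring):

```python
import numpy as np

def hardest_negatives(question_embs, passage_embs, gold_ids, k=1):
    """For each question, pick the top-scoring passages that are NOT the
    gold passage -- the 'realistic' negatives ANCE draws from an ANN index
    (computed exactly here on a toy corpus)."""
    scores = question_embs @ passage_embs.T        # (n_q, n_p) inner products
    negatives = []
    for qi, gold in enumerate(gold_ids):
        order = np.argsort(-scores[qi])            # passage ids, best first
        negs = [int(p) for p in order if p != gold][:k]
        negatives.append(negs)
    return negatives

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 8))                    # 2 toy question embeddings
p = rng.standard_normal((5, 8))                    # 5 toy passage embeddings
print(hardest_negatives(q, p, gold_ids=[0, 3]))
```

The point of drawing negatives this way is that passages the current model already scores highly are much harder, and therefore more informative, than random or BM25-mined negatives.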
Journal Article
LaMDA: Language Models for Dialog Applications
Romal Thoppilan, Daniel Adiwardana, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, Yaguang Li, Hongrae Lee, Huaixiu Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Yaoqi Zhou, Chung-Ching Chang, I. A. Krivokon, Willard J. Rusch, Marc Pickett, Kathleen S. Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Hartz Søraker, Bendert Zevenbergen, Velu Prabhakaran, Mark Díaz, Ben Hutchinson, Kristen Olson, Alejandra Aguirre Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravindran Rajakumar, Alena Butryna, Matthew Lamm, V. O. Kuzmina, Joseph Fenton, Aaron Cohen, Rachel Bernstein, Raymond C. Kurzweil, Blaise Aguera-Arcas, Claire Cui, Marian Rogers Croak, Ed H. Chi, Quoc Hoai Le
TL;DR: The authors present LaMDA (Language Models for Dialog Applications), a family of Transformer-based neural language models specialized for dialog, with up to 137B parameters and pre-trained on 1.56T words of public dialog data and web text. They demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources lead to significant improvements on the two key challenges of safety and factual grounding.
Posted Content
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Gautier Izacard, Edouard Grave
TL;DR: Interestingly, the performance of this method improves significantly as the number of retrieved passages increases, evidence that sequence-to-sequence models offer a flexible framework to efficiently aggregate and combine evidence from multiple passages.
References
Proceedings ArticleDOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Journal ArticleDOI
Indexing by Latent Semantic Analysis
TL;DR: A new method for automatic indexing and retrieval that takes advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") to improve the detection of relevant documents on the basis of terms found in queries.
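The latent-structure idea behind LSA can be sketched with a truncated SVD of a tiny term-document matrix (a toy only: the term counts, the rank k=2, and the raw-count weighting are illustrative assumptions; real LSA typically applies TF-IDF weighting to a large corpus):

```python
import numpy as np

# Toy term-document count matrix (rows = terms, columns = documents).
A = np.array([
    [1, 1, 0, 0],   # "car"
    [1, 0, 0, 0],   # "auto"
    [0, 1, 1, 0],   # "engine"
    [0, 0, 1, 1],   # "tree"
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                   # keep the top-k latent dimensions
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # documents in latent space, (n_docs, k)

def fold_in(query_counts):
    """Project a query's term-count vector into the same latent space."""
    return query_counts @ U[:, :k]

q = fold_in(np.array([0.0, 1.0, 0.0, 0.0]))  # query: the single term "auto"
sims = doc_vecs @ q                          # rank-k similarity to each document
print(np.round(sims, 2))
```

Because "auto" and "car" co-occur through document 0, the rank-k space associates the query with the car-related documents even where the literal term is absent, which is exactly the higher-order structure the TL;DR describes.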
Proceedings ArticleDOI
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TL;DR: The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage.
Proceedings Article
Signature Verification using a "Siamese" Time Delay Neural Network
TL;DR: An algorithm for verification of signatures written on a pen-input tablet, based on a novel artificial neural network called a "Siamese" neural network, which consists of two identical sub-networks joined at their outputs.
Proceedings ArticleDOI
Learning to rank using gradient descent
Chris J.C. Burges, Tal Shaked, Erin L. Renshaw, Ari Lazier, Matt Deeds, Nicole A. Hamilton, Greg Hullender
TL;DR: RankNet is introduced, an implementation of a probabilistic pairwise ranking cost using a neural network to model the underlying ranking function, with test results presented on toy data and on data from a commercial internet search engine.
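RankNet's pairwise formulation can be sketched in a few lines (a minimal toy, assuming a linear scorer and hand-made features rather than the paper's neural network): the model sets P(i ranked above j) = sigma(s_i - s_j) and minimizes cross-entropy against the target ordering by gradient descent.

```python
import numpy as np

def ranknet_loss(s_i, s_j, target=1.0):
    """Cross-entropy between the modeled probability sigma(s_i - s_j)
    that item i outranks item j and the target probability."""
    p = 1.0 / (1.0 + np.exp(-(s_i - s_j)))
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

# Linear scorer w @ x on toy features; gradient descent on one pair
# where item a is labeled to outrank item b (target = 1).
w = np.zeros(3)
a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 1.0, 1.0])
lr = 0.5
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(w @ a - w @ b)))
    grad = (p - 1.0) * (a - b)        # dLoss/dw for target = 1
    w -= lr * grad
print(w @ a > w @ b)                  # the learned scorer now ranks a above b
```

Note the loss depends only on score *differences*, so training pushes the scorer toward correct pairwise orderings rather than absolute score values.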