Open Access · Posted Content
Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick S. H. Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih
TL;DR: This work shows that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework.
Abstract:
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
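As a rough illustration of the dual-encoder retrieval idea described in the abstract (a toy sketch, not the paper's implementation: DPR trains two BERT encoders and indexes passages with FAISS, whereas here a deterministic pseudo-random token embedding stands in for a learned encoder, and brute-force inner product stands in for an index):

```python
import numpy as np

DIM = 256

def toy_encode(text: str) -> np.ndarray:
    """Stand-in for a trained encoder: each token gets a fixed
    pseudo-random vector; the text embedding is their mean."""
    vecs = [
        np.random.default_rng(sum(tok.encode()) % (2**32)).standard_normal(DIM)
        for tok in text.lower().split()
    ]
    return np.mean(vecs, axis=0)

def build_index(passages):
    """Pre-compute passage embeddings once, as DPR does offline."""
    return np.stack([toy_encode(p) for p in passages])

def retrieve(question, passages, index, k=2):
    """Rank passages by inner product with the question embedding."""
    scores = index @ toy_encode(question)
    order = np.argsort(-scores)[:k]
    return [(passages[i], float(scores[i])) for i in order]

passages = [
    "the capital of france is paris",
    "bm25 is a sparse retrieval baseline",
    "dense retrieval learns embeddings for questions and passages",
]
index = build_index(passages)
# the france passage should rank first for this question
print(retrieve("what is the capital of france", passages, index, k=1))
```

The key property the sketch preserves is that questions and passages are embedded independently, so all passage vectors can be precomputed and retrieval reduces to a maximum-inner-product search.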
Citations
Posted Content
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Michael Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
TL;DR: A general-purpose fine-tuning recipe for retrieval-augmented generation (RAG): models that combine pre-trained parametric and non-parametric memory for language generation. RAG models are found to generate more specific, diverse, and factual language than a state-of-the-art parametric-only seq2seq baseline.
Journal ArticleDOI
Self-supervised Learning: Generative or Contrastive
TL;DR: This survey examines new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning, and comprehensively reviews the existing empirical methods in three main categories according to their objectives.
Posted Content
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
TL;DR: Approximate nearest neighbor Negative Contrastive Estimation (ANCE) is presented, a training mechanism that constructs negatives from an Approximate Nearest Neighbor (ANN) index of the corpus, which is updated in parallel with the learning process to select more realistic negative training instances.
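The core of the negative-selection step can be sketched as follows (an illustrative toy, not ANCE itself: the real method retrieves negatives from an asynchronously refreshed ANN index over millions of passages, while here the "hardest" non-gold passages are found by exact brute-force scoring):

```python
import numpy as np

def hardest_negatives(question_embs, passage_embs, gold_ids, k=1):
    """For each question, pick the top-scoring passages that are NOT the
    gold passage -- the 'realistic' negatives ANCE draws from an ANN index
    (computed exactly here on a toy corpus)."""
    scores = question_embs @ passage_embs.T        # (n_q, n_p) inner products
    negatives = []
    for qi, gold in enumerate(gold_ids):
        order = np.argsort(-scores[qi])            # passage ids, best first
        negs = [int(p) for p in order if p != gold][:k]
        negatives.append(negs)
    return negatives

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 8))                    # 2 toy question embeddings
p = rng.standard_normal((5, 8))                    # 5 toy passage embeddings
print(hardest_negatives(q, p, gold_ids=[0, 3]))
```

The point of drawing negatives this way is that passages the current model already scores highly are much harder, and therefore more informative, than random or BM25-mined negatives.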
Journal Article
LaMDA: Language Models for Dialog Applications
Romal Thoppilan, Daniel Adiwardana, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, Yaguang Li, Hongrae Lee, Huaixiu Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Yaoqi Zhou, Chung-Ching Chang, I. A. Krivokon, Willard J. Rusch, Marc Pickett, Kathleen S. Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Hartz Søraker, Bendert Zevenbergen, Velu Prabhakaran, Mark Díaz, Ben Hutchinson, Kristen Olson, Alejandra Aguirre Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravindran Rajakumar, Alena Butryna, Matthew Lamm, V. O. Kuzmina, Joseph Fenton, Aaron Cohen, Rachel Bernstein, Raymond C. Kurzweil, Blaise Aguera-Arcas, Claire Cui, Marian Rogers Croak, Ed H. Chi, Quoc Hoai Le
TL;DR: The authors present LaMDA (Language Models for Dialog Applications), a family of Transformer-based neural language models specialized for dialog, with up to 137B parameters and pre-trained on 1.56T words of public dialog data and web text. They demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources lead to significant improvements on the two key challenges of safety and factual grounding.
Posted Content
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Gautier Izacard, Edouard Grave
TL;DR: Interestingly, the performance of this method improves significantly as the number of retrieved passages increases, evidence that sequence-to-sequence models offer a flexible framework to efficiently aggregate and combine evidence from multiple passages.
References
Proceedings ArticleDOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Journal ArticleDOI
Indexing by Latent Semantic Analysis
TL;DR: A new method for automatic indexing and retrieval that takes advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") to improve the detection of relevant documents on the basis of terms found in queries.
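The latent-structure idea behind LSA can be sketched with a truncated SVD of a tiny term-document matrix (a toy only: the term counts, the rank k=2, and the raw-count weighting are illustrative assumptions; real LSA typically applies TF-IDF weighting to a large corpus):

```python
import numpy as np

# Toy term-document count matrix (rows = terms, columns = documents).
A = np.array([
    [1, 1, 0, 0],   # "car"
    [1, 0, 0, 0],   # "auto"
    [0, 1, 1, 0],   # "engine"
    [0, 0, 1, 1],   # "tree"
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                   # keep the top-k latent dimensions
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # documents in latent space, (n_docs, k)

def fold_in(query_counts):
    """Project a query's term-count vector into the same latent space."""
    return query_counts @ U[:, :k]

q = fold_in(np.array([0.0, 1.0, 0.0, 0.0]))  # query: the single term "auto"
sims = doc_vecs @ q                          # rank-k similarity to each document
print(np.round(sims, 2))
```

Because "auto" and "car" co-occur through document 0, the rank-k space associates the query with the car-related documents even where the literal term is absent, which is exactly the higher-order structure the TL;DR describes.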
Proceedings ArticleDOI
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TL;DR: The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage.
Proceedings Article
Signature Verification using a "Siamese" Time Delay Neural Network
TL;DR: An algorithm for verification of signatures written on a pen-input tablet, based on a novel artificial neural network called a "Siamese" neural network, which consists of two identical sub-networks joined at their outputs.
Proceedings ArticleDOI
Learning to rank using gradient descent
Chris J.C. Burges, Tal Shaked, Erin L. Renshaw, Ari Lazier, Matt Deeds, Nicole A. Hamilton, Greg Hullender
TL;DR: RankNet is introduced, an implementation of a probabilistic pairwise ranking cost using a neural network to model the underlying ranking function, with test results presented on toy data and on data from a commercial internet search engine.
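RankNet's pairwise formulation can be sketched in a few lines (a minimal toy, assuming a linear scorer and hand-made features rather than the paper's neural network): the model sets P(i ranked above j) = sigma(s_i - s_j) and minimizes cross-entropy against the target ordering by gradient descent.

```python
import numpy as np

def ranknet_loss(s_i, s_j, target=1.0):
    """Cross-entropy between the modeled probability sigma(s_i - s_j)
    that item i outranks item j and the target probability."""
    p = 1.0 / (1.0 + np.exp(-(s_i - s_j)))
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

# Linear scorer w @ x on toy features; gradient descent on one pair
# where item a is labeled to outrank item b (target = 1).
w = np.zeros(3)
a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 1.0, 1.0])
lr = 0.5
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(w @ a - w @ b)))
    grad = (p - 1.0) * (a - b)        # dLoss/dw for target = 1
    w -= lr * grad
print(w @ a > w @ b)                  # the learned scorer now ranks a above b
```

Note the loss depends only on score *differences*, so training pushes the scorer toward correct pairwise orderings rather than absolute score values.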