Open Access · Proceedings Article (DOI)

Expansion via Prediction of Importance with Contextualization

TLDR
Expansion via Prediction of Importance with Contextualization (EPIC) is a representation-based ranking approach that explicitly models the importance of each term with a contextualized language model and performs passage expansion by propagating that importance to similar terms.
Abstract: 
The identification of relevance with little textual context is a primary challenge in passage retrieval. We address this problem with a representation-based ranking approach that: (1) explicitly models the importance of each term using a contextualized language model; (2) performs passage expansion by propagating the importance to similar terms; and (3) grounds the representations in the lexicon, making them interpretable. Passage representations can be pre-computed at index time to reduce query-time latency. We call our approach EPIC (Expansion via Prediction of Importance with Contextualization). We show that EPIC significantly outperforms prior importance-modeling and document expansion approaches. We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches. Specifically, EPIC achieves an MRR@10 of 0.304 on the MS-MARCO passage ranking dataset with 78ms average query latency on commodity hardware. We also find that the latency is further reduced to 68ms by pruning document representations, with virtually no difference in effectiveness.
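As a rough illustration of the approach the abstract describes, the sketch below models per-term query importance with a contextualized masked-language-model head, expands a passage into importance weights over the full vocabulary (so importance propagates to similar terms), and scores by a lexicon-grounded dot product. This is a minimal sketch assuming a Hugging Face BERT checkpoint; the log1p/ReLU weighting, max-pooling, and function names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of an EPIC-style scorer (not the authors' code).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def passage_vocab_weights(passage: str) -> torch.Tensor:
    """Expand a passage into a |V|-dimensional importance vector.

    Each token's contextualized MLM logits spread its importance over
    similar vocabulary terms; max-pooling keeps the strongest evidence
    per vocabulary entry. This vector can be pre-computed at index time.
    """
    enc = tok(passage, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = mlm(**enc).logits.squeeze(0)                  # [seq_len, |V|]
    return torch.log1p(torch.relu(logits)).max(dim=0).values   # [|V|]

def query_term_importance(query: str):
    """Return query token ids and a per-token importance weight."""
    enc = tok(query, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = mlm(**enc).logits.squeeze(0)
    ids = enc["input_ids"].squeeze(0)
    # Importance of each query token, read off its own vocabulary logit.
    imp = torch.log1p(torch.relu(logits[torch.arange(len(ids)), ids]))
    return ids, imp

def score(query: str, passage: str) -> float:
    """Lexicon-grounded dot product of query and passage representations."""
    ids, imp = query_term_importance(query)
    vocab_w = passage_vocab_weights(passage)
    return float((imp * vocab_w[ids]).sum())
```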


Citations
Posted Content

Pretrained Transformers for Text Ranking: BERT and Beyond

TL;DR: This tutorial provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example, and covers a wide range of techniques.
Posted Content

PARADE: Passage Representation Aggregation for Document Reranking

TL;DR: An end-to-end Transformer-based model that considers document-level context for document reranking: it aggregates passage-level relevance representations to predict a document relevance score, overcoming the limitation of prior approaches that combine independently computed per-passage scores.
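The aggregation idea can be sketched as follows: per-passage [CLS] vectors from a BERT encoder are run through a small transformer so passages can exchange document-level context, and a learned document-level token is scored. Layer sizes and the document-token pooling below are illustrative assumptions, not the exact PARADE configuration.

```python
# Sketch of PARADE-style passage representation aggregation (illustrative).
import torch
import torch.nn as nn

class ParadeStyleAggregator(nn.Module):
    def __init__(self, hidden: int = 768, n_layers: int = 2, n_heads: int = 8):
        super().__init__()
        # A small transformer runs over the sequence of passage [CLS] vectors,
        # letting passages exchange document-level context before scoring.
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.doc_cls = nn.Parameter(torch.zeros(1, 1, hidden))  # learned doc token
        self.score = nn.Linear(hidden, 1)

    def forward(self, passage_cls: torch.Tensor) -> torch.Tensor:
        # passage_cls: [batch, n_passages, hidden] (pre-computed BERT [CLS] vectors)
        b = passage_cls.size(0)
        x = torch.cat([self.doc_cls.expand(b, -1, -1), passage_cls], dim=1)
        x = self.encoder(x)
        return self.score(x[:, 0]).squeeze(-1)  # one relevance score per document
```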
Proceedings Article (DOI)

SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval

TL;DR: SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including open SQuAD and CMRC.
Posted Content

BERT-QE: Contextualized Query Expansion for Document Re-ranking

TL;DR: A novel query expansion model is proposed that leverages the strength of BERT to select relevant document chunks for expansion; it significantly outperforms BERT-Large models.
Posted Content

SLEDGE: A Simple Yet Effective Baseline for Coronavirus Scientific Knowledge Search

TL;DR: This work presents a search system called SLEDGE, which uses SciBERT to effectively re-rank articles; the model is trained on a general-domain answer ranking dataset, and its relevance signals are transferred to the SARS-CoV-2 domain for evaluation.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
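For reference, the update the summary refers to can be written in scalar form as below; this is the standard published Adam rule, with variable names chosen for illustration rather than taken from any particular library.

```python
# Single-parameter Adam update (standard published form; illustrative names).
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: adaptive estimates of first and second moments."""
    m = beta1 * m + (1 - beta1) * grad          # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```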
Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
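The "one additional output layer" fine-tuning recipe the summary describes is commonly set up as in the minimal sketch below, using the Hugging Face transformers API; the example query/passage pair, label convention, and learning rate are assumptions for illustration.

```python
# Minimal sketch of fine-tuning BERT with a single added classification layer.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # adds one linear layer on the [CLS] output

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Text pairs are encoded jointly so both segments condition on each other.
batch = tok(["example query text"], ["example passage text"],
            return_tensors="pt", padding=True, truncation=True)
labels = torch.tensor([1])  # 1 = relevant, 0 = not relevant (example labels)

out = model(**batch, labels=labels)  # cross-entropy loss computed internally
out.loss.backward()
optimizer.step()
```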
Posted Content

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Proceedings Article (DOI)

A Deep Relevance Matching Model for Ad-hoc Retrieval

TL;DR: A novel deep relevance matching model (DRMM) for ad-hoc retrieval that employs a joint deep architecture at the query term level for relevance matching and can significantly outperform some well-known retrieval models as well as state-of-the-art deep matching models.
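A rough sketch of the query-term-level matching idea: for each query term, a histogram of cosine similarities against all document terms is scored by a small feed-forward network, and per-term scores are combined by a gating network. Bin count, layer sizes, and the IDF-based gate below are illustrative assumptions rather than the exact DRMM configuration.

```python
# Sketch of a DRMM-style matching model (illustrative; embeddings supplied by caller).
import torch
import torch.nn as nn

class DRMMStyle(nn.Module):
    def __init__(self, n_bins: int = 30):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(n_bins, 5), nn.Tanh(),
                                 nn.Linear(5, 1), nn.Tanh())
        self.gate = nn.Linear(1, 1, bias=False)  # term gating from query-term IDF
        self.n_bins = n_bins

    def forward(self, q_emb, d_emb, q_idf):
        # q_emb: [n_q, dim], d_emb: [n_d, dim], q_idf: [n_q, 1]
        sims = torch.nn.functional.cosine_similarity(
            q_emb.unsqueeze(1), d_emb.unsqueeze(0), dim=-1)    # [n_q, n_d]
        # Matching histogram per query term over similarity bins in [-1, 1].
        hists = torch.stack([
            torch.histc(row, bins=self.n_bins, min=-1.0, max=1.0)
            for row in sims])
        hists = torch.log1p(hists)                              # log-count histogram
        term_scores = self.ffn(hists)                           # [n_q, 1]
        gates = torch.softmax(self.gate(q_idf), dim=0)          # [n_q, 1]
        return (gates * term_scores).sum()                      # document score
```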