Open Access · Proceedings Article (DOI)

Expansion via Prediction of Importance with Contextualization

TLDR
Expansion via Prediction of Importance with Contextualization (EPIC) is a representation-based ranking approach that explicitly models the importance of each term with a contextualized language model and performs passage expansion by propagating that importance to similar terms.
Abstract: 
The identification of relevance with little textual context is a primary challenge in passage retrieval. We address this problem with a representation-based ranking approach that: (1) explicitly models the importance of each term using a contextualized language model; (2) performs passage expansion by propagating the importance to similar terms; and (3) grounds the representations in the lexicon, making them interpretable. Passage representations can be pre-computed at index time to reduce query-time latency. We call our approach EPIC (Expansion via Prediction of Importance with Contextualization). We show that EPIC significantly outperforms prior importance-modeling and document expansion approaches. We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches. Specifically, EPIC achieves an MRR@10 of 0.304 on the MS-MARCO passage ranking dataset with 78ms average query latency on commodity hardware. We also find that the latency is further reduced to 68ms by pruning document representations, with virtually no difference in effectiveness.
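As a rough illustration of the approach the abstract describes, the sketch below models per-term query importance with a contextualized masked-language-model head, expands a passage into importance weights over the full vocabulary (so importance propagates to similar terms), and scores by a lexicon-grounded dot product. This is a minimal sketch assuming a Hugging Face BERT checkpoint; the log1p/ReLU weighting, max-pooling, and function names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of an EPIC-style scorer (not the authors' code).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def passage_vocab_weights(passage: str) -> torch.Tensor:
    """Expand a passage into a |V|-dimensional importance vector.

    Each token's contextualized MLM logits spread its importance over
    similar vocabulary terms; max-pooling keeps the strongest evidence
    per vocabulary entry. This vector can be pre-computed at index time.
    """
    enc = tok(passage, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = mlm(**enc).logits.squeeze(0)                  # [seq_len, |V|]
    return torch.log1p(torch.relu(logits)).max(dim=0).values   # [|V|]

def query_term_importance(query: str):
    """Return query token ids and a per-token importance weight."""
    enc = tok(query, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = mlm(**enc).logits.squeeze(0)
    ids = enc["input_ids"].squeeze(0)
    # Importance of each query token, read off its own vocabulary logit.
    imp = torch.log1p(torch.relu(logits[torch.arange(len(ids)), ids]))
    return ids, imp

def score(query: str, passage: str) -> float:
    """Lexicon-grounded dot product of query and passage representations."""
    ids, imp = query_term_importance(query)
    vocab_w = passage_vocab_weights(passage)
    return float((imp * vocab_w[ids]).sum())
```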


Citations
Posted Content

Pretrained Transformers for Text Ranking: BERT and Beyond

TL;DR: This tutorial provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example, and covers a wide range of techniques.
Posted Content

PARADE: Passage Representation Aggregation for Document Reranking

TL;DR: An end-to-end Transformer-based model that considers document-level context for document reranking: it aggregates passage-level relevance representations to predict a document relevance score, overcoming the limitation of prior approaches that combine independently computed per-passage scores.
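The aggregation idea can be sketched as follows: per-passage [CLS] vectors from a BERT encoder are run through a small transformer so passages can exchange document-level context, and a learned document-level token is scored. Layer sizes and the document-token pooling below are illustrative assumptions, not the exact PARADE configuration.

```python
# Sketch of PARADE-style passage representation aggregation (illustrative).
import torch
import torch.nn as nn

class ParadeStyleAggregator(nn.Module):
    def __init__(self, hidden: int = 768, n_layers: int = 2, n_heads: int = 8):
        super().__init__()
        # A small transformer runs over the sequence of passage [CLS] vectors,
        # letting passages exchange document-level context before scoring.
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.doc_cls = nn.Parameter(torch.zeros(1, 1, hidden))  # learned doc token
        self.score = nn.Linear(hidden, 1)

    def forward(self, passage_cls: torch.Tensor) -> torch.Tensor:
        # passage_cls: [batch, n_passages, hidden] (pre-computed BERT [CLS] vectors)
        b = passage_cls.size(0)
        x = torch.cat([self.doc_cls.expand(b, -1, -1), passage_cls], dim=1)
        x = self.encoder(x)
        return self.score(x[:, 0]).squeeze(-1)  # one relevance score per document
```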
Proceedings Article (DOI)

SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval

TL;DR: SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including open SQuAD and CMRC.
Posted Content

BERT-QE: Contextualized Query Expansion for Document Re-ranking

TL;DR: A novel query expansion model is proposed that leverages the strength of BERT to select relevant document chunks for expansion; it significantly outperforms BERT-Large models.
Posted Content

SLEDGE: A Simple Yet Effective Baseline for Coronavirus Scientific Knowledge Search

TL;DR: This work presents a search system called SLEDGE, which uses SciBERT to effectively re-rank articles; the model is trained on a general-domain answer ranking dataset, and its relevance signals are transferred to the SARS-CoV-2 domain for evaluation.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
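For reference, the update the summary refers to can be written in scalar form as below; this is the standard published Adam rule, with variable names chosen for illustration rather than taken from any particular library.

```python
# Single-parameter Adam update (standard published form; illustrative names).
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: adaptive estimates of first and second moments."""
    m = beta1 * m + (1 - beta1) * grad          # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```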
Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
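The "one additional output layer" fine-tuning recipe the summary describes is commonly set up as in the minimal sketch below, using the Hugging Face transformers API; the example query/passage pair, label convention, and learning rate are assumptions for illustration.

```python
# Minimal sketch of fine-tuning BERT with a single added classification layer.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # adds one linear layer on the [CLS] output

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Text pairs are encoded jointly so both segments condition on each other.
batch = tok(["example query text"], ["example passage text"],
            return_tensors="pt", padding=True, truncation=True)
labels = torch.tensor([1])  # 1 = relevant, 0 = not relevant (example labels)

out = model(**batch, labels=labels)  # cross-entropy loss computed internally
out.loss.backward()
optimizer.step()
```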
Posted Content

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Proceedings Article (DOI)

A Deep Relevance Matching Model for Ad-hoc Retrieval

TL;DR: A novel deep relevance matching model (DRMM) for ad-hoc retrieval that employs a joint deep architecture at the query term level for relevance matching and can significantly outperform some well-known retrieval models as well as state-of-the-art deep matching models.
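A rough sketch of the query-term-level matching idea: for each query term, a histogram of cosine similarities against all document terms is scored by a small feed-forward network, and per-term scores are combined by a gating network. Bin count, layer sizes, and the IDF-based gate below are illustrative assumptions rather than the exact DRMM configuration.

```python
# Sketch of a DRMM-style matching model (illustrative; embeddings supplied by caller).
import torch
import torch.nn as nn

class DRMMStyle(nn.Module):
    def __init__(self, n_bins: int = 30):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(n_bins, 5), nn.Tanh(),
                                 nn.Linear(5, 1), nn.Tanh())
        self.gate = nn.Linear(1, 1, bias=False)  # term gating from query-term IDF
        self.n_bins = n_bins

    def forward(self, q_emb, d_emb, q_idf):
        # q_emb: [n_q, dim], d_emb: [n_d, dim], q_idf: [n_q, 1]
        sims = torch.nn.functional.cosine_similarity(
            q_emb.unsqueeze(1), d_emb.unsqueeze(0), dim=-1)    # [n_q, n_d]
        # Matching histogram per query term over similarity bins in [-1, 1].
        hists = torch.stack([
            torch.histc(row, bins=self.n_bins, min=-1.0, max=1.0)
            for row in sims])
        hists = torch.log1p(hists)                              # log-count histogram
        term_scores = self.ffn(hists)                           # [n_q, 1]
        gates = torch.softmax(self.gate(q_idf), dim=0)          # [n_q, 1]
        return (gates * term_scores).sum()                      # document score
```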