Open Access Journal Article

Multi-Layer Contextual Passage Term Embedding for Ad-Hoc Retrieval

Weihong Cai, +5 more
25 Apr 2022 · Vol. 13, Iss. 5, p. 221
TLDR
This paper explores a novel multi-layer contextual passage architecture that leverages extractive text summarization to generate passage-level evidence for pre-selected document passages, opening new possibilities for the long-document relevance task.
Abstract
Nowadays, pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) are becoming a basic building block in Information Retrieval tasks. Nevertheless, there are several limitations when applying BERT to the query-document matching task: (1) relevance assessments apply at the document level, and documents often exceed BERT's maximum input length in tokens; (2) applying BERT to long documents consumes a great deal of memory and run time, owing to the computational cost of the interactions between tokens. This paper explores a novel multi-layer contextual passage architecture that leverages extractive text summarization to generate passage-level evidence for pre-selected document passages, opening new possibilities for the long-document relevance task. Experiments were conducted on two standard ad-hoc retrieval collections with different characteristics: the Text Retrieval Conference (TREC) 2004 Robust Track (Robust04) and ClueWeb09. Experimental results show that our approach can significantly outperform strong baselines, and its precision is competitive even with other BERT-based models and state-of-the-art neural ranking models.
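Below is a minimal sketch of the passage-level scoring pipeline the abstract describes: split a long document into overlapping passages that fit BERT's input limit, score each query-passage pair, and aggregate. The window and stride sizes, the max-score aggregation, and all function names are illustrative assumptions rather than the authors' implementation; the paper's summarization-extraction layer for selecting passage evidence is only noted in a comment.

```python
# Hedged sketch of passage-level document ranking with BERT; hyperparameters
# and the aggregation rule are assumptions, not the paper's exact architecture.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

def split_into_passages(doc_tokens, window=150, stride=75):
    """Slide a fixed-size window over the document so every passage fits
    within BERT's 512-token input limit."""
    return [doc_tokens[i:i + window] for i in range(0, len(doc_tokens), stride)]

def score_passage(query, passage):
    """Score one query-passage pair as the softmax probability of relevance."""
    inputs = tokenizer(query, passage, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def score_document(query, doc_text):
    passages = [" ".join(p) for p in split_into_passages(doc_text.split())]
    # Taking the max over passages is one common aggregation (BERT-MaxP style);
    # the paper additionally extracts summary sentences as passage-level evidence.
    return max(score_passage(query, p) for p in passages)
```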


References
Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Proceedings Article

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT

TL;DR: ColBERT is presented, a novel ranking model that adapts deep LMs (in particular, BERT) for efficient retrieval that is competitive with existing BERT-based models (and outperforms every non-BERT baseline) and enables leveraging vector-similarity indexes for end-to-end retrieval directly from millions of documents.
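The "late interaction" above is ColBERT's MaxSim operator: every query token embedding is matched against its most similar document token embedding, and those maxima are summed. A minimal sketch, assuming both sets of token embeddings have already been produced and L2-normalized by the BERT encoders:

```python
import torch

def maxsim_score(query_embs: torch.Tensor, doc_embs: torch.Tensor) -> float:
    """ColBERT late interaction. query_embs: (q_len, dim); doc_embs:
    (d_len, dim); both unit-normalized, so dot products are cosine similarities."""
    sim = query_embs @ doc_embs.T               # (q_len, d_len) similarity matrix
    return sim.max(dim=1).values.sum().item()   # max over doc tokens, sum over query tokens
```

Because document embeddings can be precomputed offline, the per-token max lookups can be served from a vector-similarity index, which is what enables the end-to-end retrieval the TL;DR mentions.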

Models and Data for Simple Applications of BERT for Ad Hoc Document Retrieval

TL;DR: This work addresses the challenge posed by documents that are typically longer than the input length BERT was designed to handle by applying inference on sentences individually and then aggregating sentence scores to produce document scores.
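A sketch of that aggregation step, assuming per-sentence relevance scores have already been produced by a query-sentence BERT classifier like the one sketched under the abstract; the top-k cutoff and the interpolation weight are illustrative assumptions, since the cited work tunes how sentence evidence combines with the first-stage retrieval score:

```python
def aggregate_document_score(retrieval_score, sentence_scores, k=3, alpha=0.5):
    """Combine a first-stage score (e.g. BM25) with the sum of the top-k
    BERT sentence scores to produce the final document score."""
    top_k = sorted(sentence_scores, reverse=True)[:k]
    return alpha * retrieval_score + (1 - alpha) * sum(top_k)
```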
Journal Article

An Adaptive Term Proximity Based Rocchio's Model for Clinical Decision Support Retrieval

TL;DR: A new concept of term proximity feedback weight is proposed, and experimental results demonstrate that the proposed HRoc and HRoc_AP models are superior to other advanced models, such as the PRoc2 and TF-PRF methods, on various evaluation metrics.
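For context, the Rocchio model that HRoc extends moves the query vector toward relevant feedback documents and away from non-relevant ones: q' = α·q + β·mean(R) − γ·mean(NR). A minimal sketch of that classic formulation; the adaptive term-proximity feedback weighting the paper proposes is not reproduced here:

```python
import numpy as np

def rocchio(query_vec, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio relevance feedback over term-weight vectors."""
    q = alpha * np.asarray(query_vec, dtype=float)
    if len(relevant_docs):
        q += beta * np.mean(relevant_docs, axis=0)      # pull toward relevant docs
    if len(nonrelevant_docs):
        q -= gamma * np.mean(nonrelevant_docs, axis=0)  # push away from non-relevant docs
    return np.clip(q, 0.0, None)  # negative term weights are conventionally dropped
```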