Open Access Journal Article

Multi-Layer Contextual Passage Term Embedding for Ad-Hoc Retrieval

Weihong Cai, +5 more
25 Apr 2022 · Vol. 13, Iss. 5, p. 221
TLDR
This paper explores a novel multi-layer contextual passage architecture that leverages extractive text summarization to generate passage-level evidence for pre-selected document passages, opening new possibilities for the long-document relevance task.
Abstract
Nowadays, pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) are becoming a basic building block in Information Retrieval tasks. Nevertheless, there are several limitations when applying BERT to the query-document matching task: (1) relevance assessments apply at the document level, and documents often exceed BERT's maximum input length in tokens; (2) applying BERT to long documents consumes a great deal of memory and run time, owing to the computational cost of the interactions between tokens. This paper explores a novel multi-layer contextual passage architecture that leverages extractive text summarization to generate passage-level evidence for pre-selected document passages, opening new possibilities for the long-document relevance task. Experiments were conducted on two standard ad-hoc retrieval collections with different characteristics: the Text Retrieval Conference (TREC) 2004 Robust Track (Robust04) and ClueWeb09. Experimental results show that our approach can significantly outperform strong baselines, and its precision is competitive even with other BERT-based models and state-of-the-art neural ranking models.
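Below is a minimal sketch of the passage-level scoring pipeline the abstract describes: split a long document into overlapping passages that fit BERT's input limit, score each query-passage pair, and aggregate. The window and stride sizes, the max-score aggregation, and all function names are illustrative assumptions rather than the authors' implementation; the paper's summarization-extraction layer for selecting passage evidence is only noted in a comment.

```python
# Hedged sketch of passage-level document ranking with BERT; hyperparameters
# and the aggregation rule are assumptions, not the paper's exact architecture.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

def split_into_passages(doc_tokens, window=150, stride=75):
    """Slide a fixed-size window over the document so every passage fits
    within BERT's 512-token input limit."""
    return [doc_tokens[i:i + window] for i in range(0, len(doc_tokens), stride)]

def score_passage(query, passage):
    """Score one query-passage pair as the softmax probability of relevance."""
    inputs = tokenizer(query, passage, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def score_document(query, doc_text):
    passages = [" ".join(p) for p in split_into_passages(doc_text.split())]
    # Taking the max over passages is one common aggregation (BERT-MaxP style);
    # the paper additionally extracts summary sentences as passage-level evidence.
    return max(score_passage(query, p) for p in passages)
```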


References
Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Proceedings Article

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT

TL;DR: ColBERT is presented, a novel ranking model that adapts deep LMs (in particular, BERT) for efficient retrieval that is competitive with existing BERT-based models (and outperforms every non-BERT baseline) and enables leveraging vector-similarity indexes for end-to-end retrieval directly from millions of documents.
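The "late interaction" above is ColBERT's MaxSim operator: every query token embedding is matched against its most similar document token embedding, and those maxima are summed. A minimal sketch, assuming both sets of token embeddings have already been produced and L2-normalized by the BERT encoders:

```python
import torch

def maxsim_score(query_embs: torch.Tensor, doc_embs: torch.Tensor) -> float:
    """ColBERT late interaction. query_embs: (q_len, dim); doc_embs:
    (d_len, dim); both unit-normalized, so dot products are cosine similarities."""
    sim = query_embs @ doc_embs.T               # (q_len, d_len) similarity matrix
    return sim.max(dim=1).values.sum().item()   # max over doc tokens, sum over query tokens
```

Because document embeddings can be precomputed offline, the per-token max lookups can be served from a vector-similarity index, which is what enables the end-to-end retrieval the TL;DR mentions.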

Models and Data for Simple Applications of BERT for Ad Hoc Document Retrieval

TL;DR: This work addresses the challenge posed by documents that are typically longer than the input length BERT was designed to handle by applying inference on sentences individually and then aggregating sentence scores to produce document scores.
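A sketch of that aggregation step, assuming per-sentence relevance scores have already been produced by a query-sentence BERT classifier like the one sketched under the abstract; the top-k cutoff and the interpolation weight are illustrative assumptions, since the cited work tunes how sentence evidence combines with the first-stage retrieval score:

```python
def aggregate_document_score(retrieval_score, sentence_scores, k=3, alpha=0.5):
    """Combine a first-stage score (e.g. BM25) with the sum of the top-k
    BERT sentence scores to produce the final document score."""
    top_k = sorted(sentence_scores, reverse=True)[:k]
    return alpha * retrieval_score + (1 - alpha) * sum(top_k)
```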
Journal Article

An Adaptive Term Proximity Based Rocchio's Model for Clinical Decision Support Retrieval

TL;DR: A new concept of term proximity feedback weight is proposed, and experimental results demonstrate that the proposed HRoc and HRoc_AP models are superior to other advanced models, such as the PRoc2 and TF-PRF methods, on various evaluation metrics.
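For context, the Rocchio model that HRoc extends moves the query vector toward relevant feedback documents and away from non-relevant ones: q' = α·q + β·mean(R) − γ·mean(NR). A minimal sketch of that classic formulation; the adaptive term-proximity feedback weighting the paper proposes is not reproduced here:

```python
import numpy as np

def rocchio(query_vec, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio relevance feedback over term-weight vectors."""
    q = alpha * np.asarray(query_vec, dtype=float)
    if len(relevant_docs):
        q += beta * np.mean(relevant_docs, axis=0)      # pull toward relevant docs
    if len(nonrelevant_docs):
        q -= gamma * np.mean(nonrelevant_docs, axis=0)  # push away from non-relevant docs
    return np.clip(q, 0.0, None)  # negative term weights are conventionally dropped
```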