Curriculum Learning for Dense Retrieval Distillation
Hansi Zeng, Hamed Zamani, V. Vinay
TLDR
A generic curriculum learning based optimization framework called CL-DRD is proposed that controls the difficulty level of the training data produced by the re-ranking (teacher) model and iteratively optimizes the dense retrieval (student) model by increasing the difficulty of the knowledge distillation data made available to it.
Abstract
Recent work has shown that more effective dense retrieval models can be obtained by distilling ranking knowledge from an existing base re-ranking model. In this paper, we propose a generic curriculum learning based optimization framework called CL-DRD that controls the difficulty level of training data produced by the re-ranking (teacher) model. CL-DRD iteratively optimizes the dense retrieval (student) model by increasing the difficulty of the knowledge distillation data made available to it. In more detail, we initially provide the student model coarse-grained preference pairs between documents in the teacher's ranking, and progressively move towards finer-grained pairwise document ordering requirements. In our experiments, we apply a simple implementation of the CL-DRD framework to enhance two state-of-the-art dense retrieval models. Experiments on three public passage retrieval datasets demonstrate the effectiveness of our proposed framework.
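The coarse-to-fine curriculum described in the abstract can be illustrated as a pairwise distillation loss whose difficulty grows with the granularity of the teacher's ranking groups. The following is a minimal sketch, not the authors' implementation; the hinge margin, the grouping scheme, and all variable names are illustrative assumptions:

```python
import torch

def pairwise_preference_loss(scores, groups, margin=1.0):
    """Hinge loss over document pairs where doc i's group outranks doc j's.
    scores: student relevance scores for the teacher-ranked docs (1-D tensor).
    groups: group index per doc; a lower index means higher teacher preference.
    Coarse groups yield few, easy pairs; finer groups add harder pairs."""
    loss, n = scores.new_zeros(()), 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if groups[i] < groups[j]:  # teacher prefers doc i over doc j
                loss = loss + torch.relu(margin - (scores[i] - scores[j]))
                n += 1
    return loss / max(n, 1)

# Curriculum: an early stage splits the teacher ranking into two coarse
# groups; later stages refine until every rank position is its own group.
scores = torch.tensor([2.0, 1.5, 0.3, -0.2], requires_grad=True)
coarse = [0, 0, 1, 1]  # early stage: top half vs. bottom half
fine   = [0, 1, 2, 3]  # late stage: full pairwise ordering
```

With the coarse grouping only cross-group pairs contribute, so nearby documents within a group are not forced apart; the fine grouping additionally penalizes within-group inversions, which is the harder requirement.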
Citations
Proceedings Article
PLAID: An Efficient Engine for Late Interaction Retrieval
TL;DR: The Performance-Optimized Late Interaction Driver (PLAID) engine is introduced, which uses centroid interaction as well as centroid pruning, a mechanism for sparsifying the bag of centroids, within a highly-optimized engine to reduce late interaction search latency by up to 7x on a GPU and 45x on a CPU against vanilla ColBERTv2.
Journal Article
Dense Text Retrieval based on Pretrained Language Models: A Survey
TL;DR: A comprehensive, practical reference on dense text retrieval is provided, covering the major progress in dense retrieval based on pretrained language models, with a focus on relevance matching.
Journal Article
Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey
Xiaoyu Shen, Svitlana Vakulenko, Marco Del Tredici, Gianni Barlacchi, Bill Byrne, Adrià de Gispert +5 more
TL;DR: A thorough, structured overview of mainstream techniques for low-resource dense retrieval is presented, dividing the techniques into three main categories based on their required resources and highlighting open issues along with the pros and cons of each.
Proceedings Article
PROD: Progressive Distillation for Dense Retrieval
Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan +9 more
TL;DR: This work proposes PROD, a PROgressive Distillation method for dense retrieval, which consists of a teacher progressive distillation and a data progressive distillation to gradually improve the student.
Journal Article
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval
TL;DR: This work demonstrates that MLM pre-trained transformers can be used to effectively encode text information into a single vector for dense retrieval.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
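The update rule summarized above (adaptive estimates of lower-order moments with bias correction) can be written down in a few lines. This is a minimal NumPy sketch using the paper's default hyperparameters; the toy quadratic objective and the function name are assumptions for demonstration:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and squared gradient (v), with bias correction for step t >= 1."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)  # bias-corrected first moment
    v_hat = v / (1 - b2**t)  # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 (gradient 2x) from x = 1.0.
x, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
```

Because `m_hat / sqrt(v_hat)` is roughly unit-scale, the per-step displacement is bounded by approximately the learning rate, which is the property that makes Adam's step sizes insensitive to gradient magnitude.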
Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Journal Article
A vector space model for automatic indexing
Gerard Salton, A. Wong, C. S. Yang
TL;DR: An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents, demonstrating the usefulness of the model.
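The vector space model represents each document as a vector of term weights and compares documents by cosine similarity. The following is an illustrative sketch of that idea; the toy corpus, the tf·idf weighting variant, and whitespace tokenization are simplifying assumptions:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Weight each term by tf * idf; each document becomes a sparse
    vector in term space, stored as a dict term -> weight."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d.split()))
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: c * idf[t] for t, c in Counter(d.split()).items()}
            for d in docs]

def cosine(u, v):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = ["dense retrieval model", "sparse retrieval index",
        "dense passage model"]
vecs = tfidf_vectors(docs)
```

On this toy corpus, the first and third documents share two idf-weighted terms and score a higher cosine than the first and second, which share only one.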
Proceedings Article
Curriculum learning
TL;DR: It is hypothesized that curriculum learning has both an effect on the speed of convergence of the training process to a minimum and on the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).
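The easy-to-hard schedule hypothesized above can be sketched as exposing the learner to a growing, difficulty-sorted prefix of the training data. This is a generic sketch; the stage count and the length-based difficulty measure are stand-in assumptions:

```python
def curriculum_stages(examples, difficulty, n_stages=3):
    """Sort examples from easy to hard by the given difficulty function,
    then return one growing prefix of the sorted data per stage."""
    ordered = sorted(examples, key=difficulty)
    return [ordered[: max(1, len(ordered) * (s + 1) // n_stages)]
            for s in range(n_stages)]

# Hypothetical difficulty proxy: longer sequences are "harder".
data = ["ab", "abcdef", "a", "abcd"]
stages = curriculum_stages(data, difficulty=len)
```

The final stage covers the full dataset, so the curriculum only changes the order and timing in which examples are introduced, not the data ultimately seen.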
Posted Content
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
TL;DR: This work proposes a method to pre-train a smaller general-purpose language representation model, called DistilBERT, which can be fine-tuned with good performance on a wide range of tasks like its larger counterparts, and introduces a triple loss combining language modeling, distillation and cosine-distance losses.
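The triple loss mentioned in the summary combines a masked-LM term, a soft-target distillation term, and a cosine term aligning student and teacher representations. A hedged PyTorch sketch follows; the temperature value, the equal weighting of the three terms, and the tensor shapes are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn.functional as F

def distil_triple_loss(s_logits, t_logits, s_hidden, t_hidden, labels, T=2.0):
    """DistilBERT-style objective: masked-LM cross-entropy on the student,
    KL distillation against temperature-softened teacher logits, and a
    cosine loss pulling student hidden states toward the teacher's."""
    mlm = F.cross_entropy(s_logits, labels)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                  F.softmax(t_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    align = 1.0 - F.cosine_similarity(s_hidden, t_hidden, dim=-1).mean()
    return mlm + kd + align  # equal weighting assumed for simplicity
```

When the student exactly matches the teacher (identical logits and hidden states), the distillation and cosine terms vanish and only the masked-LM cross-entropy remains.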