Effective retrieval of structured documents
Ross Wilkinson
- pp 311-317
TLDR
This work considers what information is needed to retrieve effectively and shows that knowledge of the structure of documents can lead to improved retrieval performance.
Abstract:
Information systems usually retrieve whole documents as answers to queries. However, it may in some circumstances be more appropriate to retrieve parts of documents. We consider formulas for retrieving whole documents and parts of documents from a large structured document collection. We consider what information is needed to retrieve effectively and show that knowledge of the structure of documents can lead to improved retrieval performance.
Citations
Proceedings Article
Simple BM25 extension to multiple weighted fields
TL;DR: This paper describes a simple way of adapting the BM25 ranking formula to deal with structured documents and proposes a much more intuitive alternative which weights term frequencies before the non-linear term frequency saturation function is applied.
Proceedings Article
Passage-level evidence in document retrieval
TL;DR: The increasing lengths of documents in full-text collections encourage renewed interest in the ranking and retrieval of document passages, but questions are raised about how passages are defined, how they can be ranked efficiently, and what their proper role is in long, structured documents.
Patent
Method and apparatus for generating query responses in a computer-based document retrieval system
TL;DR: In this article, a method and apparatus for generating responses to queries to a document retrieval system is presented, which responds to a specific request for information by locating and ranking portions of text that may contain the information sought.
Posted Content
Pretrained Transformers for Text Ranking: BERT and Beyond
TL;DR: This tutorial provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example, and covers a wide range of techniques.
Proceedings Article
Passage retrieval revisited
Marcin Kaszkiel, Justin Zobel +1 more
TL;DR: This paper compares the authors' scheme of arbitrary passage retrieval to several other document retrieval and passage retrieval methods and shows experimentally that, compared to these methods, ranking via fixed-length passages is robust and effective.
References
Proceedings Article
Approaches to passage retrieval in full text information systems
TL;DR: New approaches are described in this study for implementing selective passage retrieval systems, and identifying text passages responsive to particular user needs.
Proceedings Article
Subtopic structuring for full-length document access
Marti A. Hearst, Christian Plaunt +1 more
TL;DR: It is argued that the advent of large volumes of full-length text, as opposed to short texts like abstracts and newswire, should be accompanied by corresponding new approaches to information access and a partition of the text into coherent multi-paragraph units that represent the pattern of subtopics that comprise the text.
Journal Article
Overview of the second text retrieval conference (TREC-2)
TL;DR: The second Text Retrieval Conference (TREC-2) was held in August 1993 and was attended by about 150 people involved in 31 participating groups. A wide variety of retrieval techniques was reported, including methods using automatic thesauri, sophisticated term weighting, natural language techniques, relevance feedback, and advanced pattern matching.
Proceedings Article
The use of cluster hierarchies in hypertext information retrieval
TL;DR: A hierarchical structure is described that effectively supports the graphical traversal of a document collection in a hypertext system, and an overview of an interactive browser based on cluster hierarchies is provided.