Open Access Proceedings Article

FluxCapacitor: efficient time-travel text search

TLDR
Time-travel text search, as presented in this paper, evaluates a keyword query on the state of the text collection as of a user-specified time point.
Abstract
An increasing number of temporally versioned text collections is available today, with Web archives being a prime example. Search on such collections, however, is often not satisfactory and ignores their temporal dimension completely. Time-travel text search solves this problem by evaluating a keyword query on the state of the text collection as of a user-specified time point. This work demonstrates our approach to efficient time-travel text search and its implementation in the FLUXCAPACITOR prototype.
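To make the idea of evaluating a query "as of" a time point concrete, here is a minimal sketch in Python. It is not the FLUXCAPACITOR implementation: the class and method names (`TimeTravelIndex`, `add_version`, `search`), the integer time points, the half-open validity interval per posting, and the plain term-frequency scoring are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Posting:
    """One index entry: a term occurring in one version of a document."""
    doc_id: str
    begin: float  # first time point at which this version is valid (inclusive)
    end: float    # time point at which this version stops being valid (exclusive)
    tf: int       # term frequency within this version


class TimeTravelIndex:
    """Toy inverted index whose postings carry validity intervals (hypothetical design)."""

    def __init__(self):
        self._lists = defaultdict(list)  # term -> list of Posting

    def add_version(self, doc_id, text, begin, end=math.inf):
        """Index one document version that is valid during [begin, end)."""
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        for term, tf in counts.items():
            self._lists[term].append(Posting(doc_id, begin, end, tf))

    def search(self, query, t):
        """Rank documents by summed term frequency, using only versions valid at time t."""
        scores = defaultdict(int)
        for term in query.lower().split():
            for p in self._lists.get(term, []):
                if p.begin <= t < p.end:  # skip versions not valid at time t
                    scores[p.doc_id] += p.tf
        # Highest score first; ties broken by document id for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = TimeTravelIndex()
index.add_version("page1", "flux capacitor prototype", begin=0, end=10)
index.add_version("page1", "time travel search demo", begin=10)
index.add_version("page2", "flux flux capacitor", begin=5)

print(index.search("flux capacitor", t=7))   # both pages match at time 7
print(index.search("flux capacitor", t=12))  # only page2 still matches at time 12
```

The sketch scans every posting of a query term and filters by time, which is the simplest possible strategy; avoiding that scan is where an efficient index design would differ.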



Citations
Journal Article

Survey of Temporal Information Retrieval and Related Applications

TL;DR: This survey reviews the existing literature on temporal information retrieval, categorizes the relevant research, describes the main contributions, and compares different approaches to provide a coherent view of the field.
Proceedings Article

A time machine for text search

TL;DR: This work proposes an efficient solution for time-travel text search by extending the inverted file index to make it ready for temporal search, and introduces approximate temporal coalescing as a tunable method to reduce the index size without significantly affecting the quality of results.
Proceedings Article

Exploiting time-based synonyms in searching document archives

TL;DR: This paper presents an approach to extracting synonyms of named entities over time from the whole history of Wikipedia, and uses their temporal patterns as a feature in ranking and classifying them into two types, i.e., time-independent or time-dependent.
Proceedings Article

Temporal index sharding for space-time efficiency in archive search

TL;DR: This work presents a novel index organization scheme that shards each index list with almost zero increase in index size but still minimizes the cost of reading index entries during query processing, and demonstrates the feasibility of faster time-travel query processing with no space overhead.
Proceedings Article

EverLast: a distributed architecture for preserving the web

TL;DR: This paper proposes EverLast, a scalable distributed framework for next-generation Web archival and temporal text analytics over the archive, built on a loosely coupled distributed architecture that can be deployed over large-scale peer-to-peer networks.
References
Journal Article

Inverted files for text search engines

TL;DR: This tutorial introduces the key techniques in the area of text indexing, describing both a core implementation and how the core can be enhanced through a range of extensions.
Proceedings Article

An online algorithm for segmenting time series

TL;DR: This paper undertakes the first extensive review and empirical comparison of all proposed techniques for segmenting time series, finds that they have fatal flaws from a data-mining perspective, and introduces a novel algorithm that is empirically shown to be superior to all others in the literature.
Proceedings Article

Okapi/Keenbow at TREC-8.

TL;DR: Three ad hoc runs were submitted: long (title, description and narrative), medium (title and description) and short (title only).
Book Chapter

REHIST: relative error histogram construction algorithms

TL;DR: This paper considers histogram construction for the known relative error measures, develops optimal as well as fast approximation algorithms, and demonstrates their effectiveness in providing significantly more accurate answers on synthetic and real-life data sets.
Proceedings Article

A time machine for text search

TL;DR: This work proposes an efficient solution for time-travel text search by extending the inverted file index to make it ready for temporal search, and introduces approximate temporal coalescing as a tunable method to reduce the index size without significantly affecting the quality of results.