FluxCapacitor: efficient time-travel text search

Open Access Proceedings Article

Klaus Berberich, +3 more

- pp 1414-1417

TLDR

Time-travel text search, as described in this paper, evaluates a keyword query on the state of the text collection as of a user-specified time point, rather than on its current state only.

Abstract:

An increasing number of temporally versioned text collections is available today, with Web archives being a prime example. Search on such collections, however, is often not satisfactory and ignores their temporal dimension completely. Time-travel text search solves this problem by evaluating a keyword query on the state of the text collection as of a user-specified time point. This work demonstrates our approach to efficient time-travel text search and its implementation in the FLUXCAPACITOR prototype.
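To make the idea of evaluating a query "as of" a time point concrete, the following is a minimal sketch and not the FLUXCAPACITOR implementation. It assumes each document version carries a validity interval [begin, end) and that each posting in an inverted index records that interval. The names `Posting`, `TimeTravelIndex`, `add_version`, and `search`, as well as the term-frequency scoring, are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    doc_id: str
    begin: float  # first time point at which this version is valid (inclusive)
    end: float    # time point at which this version stops being valid (exclusive)
    tf: int       # term frequency within this version


class TimeTravelIndex:
    """Toy inverted index whose postings carry validity intervals (an assumption)."""

    def __init__(self):
        self._lists = defaultdict(list)  # term -> list of Posting

    def add_version(self, doc_id, text, begin, end=math.inf):
        # Index one version of a document, valid during [begin, end).
        counts = defaultdict(int)
        for token in text.lower().split():
            counts[token] += 1
        for term, tf in counts.items():
            self._lists[term].append(Posting(doc_id, begin, end, tf))

    def search(self, query, t):
        # Conjunctive keyword query over the versions that were valid at time t.
        scores = None
        for term in query.lower().split():
            matches = {
                p.doc_id: p.tf
                for p in self._lists.get(term, [])
                if p.begin <= t < p.end
            }
            if scores is None:
                scores = matches
            else:
                scores = {d: scores[d] + tf for d, tf in matches.items() if d in scores}
        scores = scores or {}
        # Rank by summed term frequency, breaking ties by document id.
        return sorted(scores, key=lambda d: (-scores[d], d))


index = TimeTravelIndex()
index.add_version("page1", "flux capacitor prototype", begin=0, end=10)
index.add_version("page1", "time travel search prototype", begin=10)
index.add_version("page2", "time travel text search search", begin=5)
print(index.search("time travel", t=7))
print(index.search("time travel", t=12))
```

The sketch scans every posting of a query term and filters by time afterwards, which is the simplest correct behaviour but not an efficient one. Avoiding that scan and keeping the index small are the concerns the paper's approach addresses.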

Citations

Journal Article

Survey of Temporal Information Retrieval and Related Applications

Ricardo Campos, +3 more

- 25 Aug 2014 -

ACM Computing Surveys

TL;DR: This survey of the existing literature on temporal information retrieval categorizes the relevant research, describes the main contributions, and compares different approaches to provide a coherent view of the field.

Proceedings Article

A time machine for text search

Klaus Berberich, +3 more

TL;DR: This work proposes an efficient solution for time-travel text search by extending the inverted file index to make it ready for temporal search, and introduces approximate temporal coalescing as a tunable method to reduce the index size without significantly affecting the quality of results.

Proceedings Article

Exploiting time-based synonyms in searching document archives

Nattiya Kanhabua, +1 more

TL;DR: This paper presents an approach to extracting synonyms of named entities over time from the whole history of Wikipedia, and uses their temporal patterns as a feature in ranking and classifying them into two types, i.e., time-independent or time-dependent.

Proceedings Article

Temporal index sharding for space-time efficiency in archive search

Avishek Anand, +3 more

TL;DR: This work presents a novel index organization scheme that shards each index list with almost zero increase in index size but still minimizes the cost of reading index entries during query processing, and demonstrates the feasibility of faster time-travel query processing with no space overhead.

Proceedings Article

EverLast: a distributed architecture for preserving the web

Avishek Anand, +4 more

TL;DR: EverLast, a scalable distributed framework for next generation Web archival and temporal text analytics over the archive, is proposed, built on a loosely-coupled distributed architecture that can be deployed over large-scale peer-to-peer networks.

References

Journal Article

Inverted files for text search engines

Justin Zobel, +1 more

- 25 Jul 2006 -

ACM Computing Surveys

TL;DR: This tutorial introduces the key techniques in the area of text indexing, describing both a core implementation and how the core can be enhanced through a range of extensions.

Proceedings Article

An online algorithm for segmenting time series

Eamonn Keogh, +3 more

TL;DR: This paper undertakes the first extensive review and empirical comparison of all proposed techniques for segmenting time series in the context of mining time-series data, argues that they have fatal flaws, and introduces a novel algorithm that is empirically shown to be superior to all others in the literature.

Proceedings Article

Okapi/Keenbow at TREC-8.

Stephen Robertson, +1 more

TL;DR: Three ad hoc runs were submitted: long (title, description and narrative), medium (title and description) and short (title only).

Book Chapter

REHIST: relative error histogram construction algorithms

Sudipto Guha, +2 more

TL;DR: This paper considers histogram construction for the known relative error measures, develops optimal as well as fast approximation algorithms, and demonstrates their effectiveness in providing significantly more accurate answers on synthetic and real-life data sets.

Proceedings Article

A time machine for text search

Klaus Berberich, +3 more

TL;DR: This work proposes an efficient solution for time-travel text search by extending the inverted file index to make it ready for temporal search, and introduces approximate temporal coalescing as a tunable method to reduce the index size without significantly affecting the quality of results.

Related Papers (5)

A time machine for text search

Comparison of access methods for time-evolving data

Inverted files for text search engines

Versioning a full-text information retrieval system

On the value of temporal information in information retrieval