Srikanta Bedathur

Researcher at Indian Institute of Technology Delhi

Publications -  120
Citations -  1897

Srikanta Bedathur is an academic researcher at the Indian Institute of Technology Delhi. The author has contributed to research on topics including computer science and SPARQL. The author has an h-index of 21 and has co-authored 108 publications receiving 1680 citations. Previous affiliations of Srikanta Bedathur include IBM and the Indraprastha Institute of Information Technology.

Papers
Proceedings Article

Using Word Embeddings for Information Retrieval: How Collection and Term Normalization Choices Affect Performance

TL;DR: This article studies the effect of varying two parameters, namely (i) term normalization and (ii) the choice of training collection, on ad hoc retrieval performance with word2vec and fastText embeddings.
Proceedings Article

Index maintenance for time-travel text search

TL;DR: This work describes a novel index structure that efficiently supports time-travel text search and can be maintained incrementally as new document versions are added to the web archive.
Proceedings Article

Computing n-gram statistics in MapReduce

TL;DR: This work studies how n-gram statistics can be computed efficiently by harnessing MapReduce for distributed data processing, and describes different algorithms, ranging from an extension of word counting, via methods based on the Apriori principle, to a novel method, Suffix-σ, that relies on sorting and aggregating suffixes.
Proceedings Article

Durable top-k search in document archives

TL;DR: This work proposes a new ranking problem over versioned objects, which have different valid instances along a history, together with a special indexing technique for archived data that is based on a shared execution paradigm and is more efficient than the first approach.
Journal Article

Interesting-phrase mining for ad-hoc text analytics

TL;DR: This work develops preprocessing and indexing methods for phrases, paired with new search techniques for the top-k most interesting phrases in ad-hoc subsets of the corpus of New York Times news articles.