Author

Srikanta Bedathur

Bio: Srikanta Bedathur is an academic researcher from Indian Institute of Technology Delhi. The author has contributed to research in topics including computer science and SPARQL. The author has an h-index of 21 and has co-authored 108 publications receiving 1680 citations. Previous affiliations of Srikanta Bedathur include IBM and Indraprastha Institute of Information Technology.


Papers
Proceedings Article
26 Oct 2010
TL;DR: An approach is proposed that integrates natural language processing, association rule mining, and contextual similarity as a learning technique; it has been evaluated on real data and found to yield good results with respect to efficiency and accuracy.
Abstract: Time-stamped documents such as newswire articles, blog posts and other web pages are often archived online. When these archives cover long spans of time, the terminology within them could undergo significant changes. Hence, when users pose queries pertaining to historical information over such documents, the queries need to be translated, taking into account these temporal changes, to provide accurate responses to users. For example, a query on Sri Lanka should automatically retrieve documents with its former name Ceylon. We call such concepts SITACs, i.e., Semantically Identical Temporally Altering Concepts. In order to discover SITACs, we propose an approach based on a novel framework that integrates natural language processing, association rule mining, and contextual similarity as a learning technique. The proposed approach has been evaluated on real data and has been found to yield good results with respect to efficiency and accuracy.

29 citations

01 Jan 2009
TL;DR: The results unequivocally show that time-stamps of past interactions significantly improve the prediction accuracy of new and recurrent links over rather sophisticated methods proposed recently.
Abstract: Prediction of links - both new as well as recurring - in a social network representing interactions between individuals is an important problem. In recent years, there has been significant interest in methods that use only the graph structure to make predictions. However, most of them consider a single snapshot of the network as the input, neglecting an important aspect of these social networks, viz., their evolution over time. In this work, we investigate the value of incorporating the history information available on the interactions (or links) of the current social network state. Our results unequivocally show that time-stamps of past interactions significantly improve the prediction accuracy of new and recurrent links over rather sophisticated methods proposed recently. Furthermore, we introduce a novel testing method which reflects the application of link prediction better than previous approaches.

29 citations

Proceedings Article
25 Jun 2019
TL;DR: This paper designs a random-walk based sampling algorithm called ARRIVAL, which is backed by theoretical guarantees on its expected quality and shown to be 100 times faster than baseline strategies with an average accuracy of 95%.
Abstract: A fundamental query in labeled graphs is to determine if there exists a path between given source and target vertices, such that the path satisfies a given label constraint. One of the powerful forms of specifying label constraints is through regular expressions, and the resulting problem of reachability queries under regular simple paths (RSP) forms the core of many practical graph query languages such as SPARQL from W3C, Cypher of Neo4J, Oracle's PGQL and LDBC's G-CORE. Despite its importance, since it is known that answering RSP queries is NP-Hard, there are no scalable and practical solutions for answering reachability with the full range of regular expressions as constraints. In this paper, we circumvent this computational bottleneck by designing a random-walk based sampling algorithm called ARRIVAL, which is backed by theoretical guarantees on its expected quality. Extensive experiments on billion-sized real graph datasets with thousands of labels show ARRIVAL to be 100 times faster than baseline strategies with an average accuracy of 95%.

27 citations
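
The abstract above describes answering regular-simple-path reachability by random-walk sampling. As a rough illustration of that general idea (not the ARRIVAL algorithm itself, whose details are not given here), the sketch below runs repeated one-directional random walks that only follow edges allowed by a regular expression, which is assumed to be supplied as a hand-built DFA. The function name, the graph and DFA encodings, and the parameters `num_walks` and `max_len` are all hypothetical choices made for this example.

```python
import random


def sampled_rsp_reachable(graph, dfa, start_state, accepting,
                          source, target, num_walks=200, max_len=20, seed=0):
    """Estimate whether `target` is reachable from `source` via a simple path
    whose edge labels are accepted by the DFA.

    graph:     dict mapping node -> list of (label, neighbour) edges
    dfa:       dict mapping (state, label) -> next state (hypothetical encoding)
    accepting: set of accepting DFA states

    A True answer is always backed by a concrete walk; a False answer only
    means that no sampled walk succeeded, so it may be a false negative.
    """
    rng = random.Random(seed)
    for _ in range(num_walks):
        node, state = source, start_state
        visited = {source}          # enforce the "simple path" requirement
        steps = 0
        while True:
            if node == target and state in accepting:
                return True
            if steps == max_len:
                break
            # Keep only edges the DFA can consume and that avoid revisits.
            candidates = [
                (nbr, dfa[(state, label)])
                for label, nbr in graph.get(node, [])
                if (state, label) in dfa and nbr not in visited
            ]
            if not candidates:
                break               # dead end: abandon this walk
            node, state = rng.choice(candidates)
            visited.add(node)
            steps += 1
    return False


# Tiny example: the label constraint is the regular expression `knows+`.
graph = {
    "a": [("knows", "b"), ("likes", "d")],
    "b": [("knows", "c")],
}
knows_plus = {(0, "knows"): 1, (1, "knows"): 1}

found_c = sampled_rsp_reachable(graph, knows_plus, 0, {1}, "a", "c")
found_d = sampled_rsp_reachable(graph, knows_plus, 0, {1}, "a", "d")
```

Because the walks are sampled, the answer has one-sided error: increasing `num_walks` lowers the chance of missing an existing path at the cost of more work per query.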

Proceedings Article
01 Jan 2013
TL;DR: A novel method is developed to determine search results consisting of documents that are relevant to the query and were published at diverse times of interest to the query.
Abstract: We investigate the notion of temporal diversity, bringing together two recently active threads of research, namely temporal ranking and diversification of search results. A novel method is developed to determine search results consisting of documents that are relevant to the query and were published at diverse times of interest to the query. Preliminary experiments on twenty years’ worth of newspaper articles from The New York Times demonstrate characteristics of our method and compare it against two baselines.

26 citations

Proceedings Article
26 Apr 2010
TL;DR: The proposed system, named ANTOURAGE, solves the intractable problem of automatically extracting tourist trips from large volumes of geo-tagged photographs using a novel adaptation of the max-min ant system (MMAS) meta-heuristic.
Abstract: We study how to automatically extract tourist trips from large volumes of geo-tagged photographs. Working with more than 8 million of these photographs that are publicly available via photo-sharing communities such as Flickr and Panoramio, our goal is to satisfy the needs of a tourist who specifies a starting location (typically a hotel) together with a bounded travel distance and demands a tour that visits the popular sites along the way. Our system, named ANTOURAGE, solves this intractable problem using a novel adaptation of the max-min ant system (MMAS) meta-heuristic. Experiments using GPS metadata crawled from Flickr show that ANTOURAGE can generate high-quality tours.

26 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Topics covered include probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article
TL;DR: Recent progress on link prediction algorithms is summarized, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods.
Abstract: Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress on link prediction algorithms, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanisms, and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.

2,530 citations

Journal Article
TL;DR: YAGO2 is an extension of the YAGO knowledge base in which entities, facts, and events are anchored in both time and space; it contains 447 million facts about 9.8 million entities.

1,186 citations

Journal Article
TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model with a decidable consistency that allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.

912 citations