Author

Srikanta Bedathur

Bio: Srikanta Bedathur is an academic researcher from the Indian Institute of Technology Delhi. The author has contributed to research in topics including Computer science & SPARQL. The author has an h-index of 21 and has co-authored 108 publications receiving 1,680 citations. Previous affiliations of Srikanta Bedathur include IBM & the Indraprastha Institute of Information Technology.


Papers
Proceedings ArticleDOI
08 Apr 2019
TL;DR: This paper proposes ARROW, a first practical attempt to address reachability query processing on large, dynamic graphs by abandoning the traditional indexing approach altogether and operating directly on the graph as it evolves, and shows that ARROW, despite its simplicity, is near-accurate and scales to graphs with tens of millions of vertices and hundreds of millions of edges.
Abstract: Efficiently answering reachability queries on a directed graph is a fundamental problem, and many solutions, both theoretical and practical, have been proposed. A common strategy to make reachability query processing efficient, accurate and scalable is to precompute indexes on the graph. However, this often becomes impractical, particularly when dealing with large graphs that are highly dynamic or when queries have additional constraints known only at the time of querying. In the former case, indexes become stale very quickly and keeping them up to date at the same speed as changes to the graph is untenable. For the latter setting, currently proposed indexes are often quite bulky and are highly customized to handle only a small class of constraints. In this paper, we propose a first practical attempt to address these issues by abandoning the traditional indexing approach altogether and operating directly on the graph as it evolves. Our approach, called ARROW, uses random walks to efficiently approximate reachability between vertices, building on ideas that have been prevalent in the theory community but ignored by practitioners. Not only is ARROW well suited for highly dynamic settings, since it is index-free, but it can also be easily adapted to handle many different forms of ad-hoc constraints while being competitive with custom-made index structures. In this paper, we show that ARROW, despite its simplicity, is near-accurate and scales to graphs with tens of millions of vertices and hundreds of millions of edges. We present extensive empirical evidence to illustrate these advantages.
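For illustration, the sketch below shows the general random-walk idea that ARROW builds on: answer a reachability query by running a bounded number of bounded-length random walks from the source and reporting success if any walk hits the target. This is not the authors' exact algorithm; the adjacency-dict representation, walk count, and walk length are illustrative assumptions.

# Illustrative sketch of random-walk reachability estimation in the spirit of
# ARROW (not the authors' exact algorithm). The graph is a plain adjacency
# dict, so edges can be added or removed without rebuilding any index.
import random

def reachable(adj, source, target, num_walks=64, max_steps=32):
    """Approximate source->target reachability with bounded random walks.

    adj: dict mapping vertex -> list of out-neighbours (assumed structure).
    Returns True if any walk from `source` reaches `target`; may return
    false negatives, which is why the answer is only near-accurate.
    """
    if source == target:
        return True
    for _ in range(num_walks):
        v = source
        for _ in range(max_steps):
            neighbours = adj.get(v, [])
            if not neighbours:
                break              # dead end: give up and start a fresh walk
            v = random.choice(neighbours)
            if v == target:
                return True
    return False

# Usage on a toy dynamic graph: edges are mutated in place between queries.
adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(reachable(adj, 0, 3))   # True with high probability
adj[3] = [4]; adj[4] = []     # the graph evolves; there is no index to rebuild
print(reachable(adj, 0, 4))

Because the method keeps no index, edge insertions and deletions only mutate the adjacency structure, which is why this style of query processing suits highly dynamic graphs.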

23 citations

Proceedings ArticleDOI
08 May 2007
TL;DR: This work presents an efficiently computable normalization for PageRank scores that makes them comparable across graphs, and shows that the normalized PageRank scores are robust to non-local changes in the graph, unlike the standard PageRank measure.
Abstract: PageRank is the best known technique for link-based importance ranking. The computed importance scores, however, are not directly comparable across different snapshots of an evolving graph. We present an efficiently computable normalization for PageRank scores that makes them comparable across graphs. Furthermore, we show that the normalized PageRank scores are robust to non-local changes in the graph, unlike the standard PageRank measure.
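As a toy illustration of why raw PageRank scores are not comparable across snapshots, the sketch below computes standard power-iteration PageRank (scores sum to 1, so every score shrinks as the graph grows) and then rescales by the number of vertices so the mean score is 1. The rescaling is only an illustrative assumption and is not the normalization proposed in the paper.

# Minimal power-iteration PageRank plus a size-based rescaling (illustrative).
def pagerank(adj, damping=0.85, iters=50):
    """adj: dict vertex -> list of out-neighbours; every vertex is a key (assumed)."""
    n = len(adj)
    scores = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in adj}
        for v, out in adj.items():
            if not out:                       # dangling vertex: spread uniformly
                for u in adj:
                    nxt[u] += damping * scores[v] / n
            else:
                for u in out:
                    nxt[u] += damping * scores[v] / len(out)
        scores = nxt
    return scores

def rescaled_pagerank(adj, **kw):
    """Multiply by the vertex count so the mean score is 1 (illustrative choice)."""
    n = len(adj)
    return {v: s * n for v, s in pagerank(adj, **kw).items()}

adj = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
print(rescaled_pagerank(adj))   # averages to 1 regardless of graph size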

23 citations

Proceedings ArticleDOI
23 May 2006
TL;DR: This paper proposes BuzzRank, a method that quantifies trends in time series of importance scores based on a growth model of those scores, and experimentally demonstrates its usefulness on a bibliographic dataset.
Abstract: Ranking methods like PageRank assess the importance of Web pages based on the current state of the rapidly evolving Web graph. The dynamics of the resulting importance scores, however, have not been considered yet, although they provide the key to an understanding of the Zeitgeist on the Web. This paper proposes the BuzzRank method that quantifies trends in time series of importance scores and is based on a relevant growth model of importance scores. We experimentally demonstrate the usefulness of BuzzRank on a bibliographic dataset.
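The abstract describes BuzzRank as quantifying trends in time series of importance scores via a growth model, but does not spell the model out here. The sketch below therefore uses an assumed exponential-growth view: fit the slope of the log-scores over time and treat it as the trend of a page. It is a stand-in, not the BuzzRank formula.

# Hedged sketch: least-squares slope of log(importance score) over time.
import math

def buzz_trend(timestamps, scores):
    """Growth rate of a score series under an exponential-growth assumption."""
    logs = [math.log(s) for s in scores]
    t_mean = sum(timestamps) / len(timestamps)
    l_mean = sum(logs) / len(logs)
    cov = sum((t - t_mean) * (l - l_mean) for t, l in zip(timestamps, logs))
    var = sum((t - t_mean) ** 2 for t in timestamps)
    return cov / var   # > 0: rising importance, < 0: fading importance

# Example: a page whose score doubles every snapshot has trend log(2) ~ 0.69.
print(buzz_trend([0, 1, 2, 3], [0.001, 0.002, 0.004, 0.008]))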

22 citations

01 Jan 2006
TL;DR: In this article, a new layout strategy, called Stellar, is proposed to improve the search efficiency of disk-resident suffix-trees through customized layouts of tree-nodes to disk-pages.
Abstract: Suffix-trees are popular indexing structures for various sequence processing problems in biological data management. We investigate here the possibility of enhancing the search efficiency of disk-resident suffix-trees through customized layouts of tree-nodes to disk-pages. Specifically, we propose a new layout strategy, called Stellar, that provides significantly improved search performance on a representative set of real genomic sequences. Further, Stellar supports both the standard root-to-leaf lookup queries as well as sophisticated sequence-search algorithms that exploit the suffix-links of suffix-trees. Our results are encouraging with regard to the ultimate objective of seamlessly integrating sequence processing in database engines.
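As a rough illustration of the node-to-page layout idea (not Stellar's actual placement strategy, which also exploits suffix-links), the sketch below packs tree nodes into fixed-capacity pages in depth-first order so that root-to-leaf lookups tend to touch few pages. The tree representation and page capacity are assumptions made for the example.

# Illustrative node-to-page packing for a disk-resident tree index.
def assign_pages(children, root, page_capacity=4):
    """Map each node id to a page id by packing a depth-first traversal.

    children: dict node -> list of child nodes (assumed tree structure).
    """
    page_of, current_page, used = {}, 0, 0
    stack = [root]
    while stack:
        node = stack.pop()
        if used == page_capacity:          # current page is full: open a new one
            current_page, used = current_page + 1, 0
        page_of[node] = current_page
        used += 1
        stack.extend(reversed(children.get(node, [])))  # preserve DFS order
    return page_of

# A root-to-leaf path now tends to stay within one or two consecutive pages.
tree = {'r': ['a', 'b'], 'a': ['a1', 'a2'], 'b': ['b1']}
print(assign_pages(tree, 'r', page_capacity=3))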

21 citations

Proceedings ArticleDOI
23 Apr 2017
TL;DR: This paper develops a joint model called LoCaTe, consisting of a user mobility model estimated using kernel density estimates, a model of location semantics using topic models, and a model of the time gap between check-ins using an exponential distribution; LoCaTe significantly outperforms state-of-the-art models for the same task.
Abstract: Location-based social networks (LBSNs) such as Foursquare offer a platform for users to share and be aware of each other's physical movements. As a result of such sharing of check-in information, users can be influenced to visit the locations visited by their friends. Quantifying such influences in LBSNs is useful in various settings such as location promotion, personalized recommendations, and mobility pattern prediction. In this paper, we focus on the problem of location promotion and develop a model to quantify the influence specific to a location between a pair of users. Specifically, we develop a joint model called LoCaTe, consisting of (i) a user mobility model estimated using kernel density estimates; (ii) a model of the semantics of the location using topic models; and (iii) a model of the time gap between check-ins using an exponential distribution. We validate our model on a long-term crawl of Foursquare data collected between Jan 2015 and Feb 2016, as well as on publicly available LBSN datasets. Our experiments demonstrate that LoCaTe significantly outperforms state-of-the-art models for the same task.
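To make the three components concrete, the toy sketch below multiplies three illustrative likelihood terms: a Gaussian kernel density estimate for mobility, a topic-affinity term, and an exponential density over the time gap between check-ins. The way LoCaTe actually estimates and combines these components is not reproduced here; the function names and parameters are assumptions.

# Toy combination of the three component scores described in the abstract.
import math

def kde_mobility(dist_km, visited_dists, bandwidth=1.0):
    """Gaussian kernel density estimate over a user's past check-in distances."""
    k = lambda x: math.exp(-0.5 * (x / bandwidth) ** 2)
    return sum(k(dist_km - d) for d in visited_dists) / (
        len(visited_dists) * bandwidth * math.sqrt(2 * math.pi))

def topic_affinity(user_topics, location_topics):
    """Dot product of user and location topic distributions (illustrative)."""
    return sum(u * l for u, l in zip(user_topics, location_topics))

def timegap_likelihood(gap_hours, rate=1 / 24.0):
    """Exponential density over the gap between consecutive check-ins."""
    return rate * math.exp(-rate * gap_hours)

def locate_score(dist_km, visited_dists, user_topics, location_topics, gap_hours):
    # Assumed combination: simple product of the three likelihood terms.
    return (kde_mobility(dist_km, visited_dists)
            * topic_affinity(user_topics, location_topics)
            * timegap_likelihood(gap_hours))

print(locate_score(2.0, [1.0, 3.0, 2.5], [0.7, 0.2, 0.1], [0.6, 0.3, 0.1], 12.0))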

21 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book presents core topics in machine learning, covering probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, sequential data, and combining models.
Abstract: Table of contents: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: Recent progress on link prediction algorithms is summarized, emphasizing contributions from physical perspectives and approaches such as random-walk-based methods and maximum-likelihood methods.
Abstract: Link prediction in complex networks has attracted increasing attention from both the physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress on link prediction algorithms, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanisms, and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.
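As a small example of the local similarity indices covered by such surveys, the sketch below computes the resource-allocation index, which scores a candidate link by the inverse degrees of the endpoints' common neighbours. It is one of many indices; the random-walk-based and maximum-likelihood methods mentioned above are more involved and are not sketched here.

# Resource-allocation index for link prediction (illustrative example).
def resource_allocation(adj, x, y):
    """adj: dict node -> set of neighbours (undirected graph assumed)."""
    common = adj[x] & adj[y]
    return sum(1.0 / len(adj[z]) for z in common if adj[z])

adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
print(resource_allocation(adj, 1, 4))   # higher score => more likely link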

2,530 citations

Journal ArticleDOI
TL;DR: YAGO2 is an extension of the YAGO knowledge base in which entities, facts, and events are anchored in both time and space; it contains 447 million facts about 9.8 million entities.

1,186 citations

Journal ArticleDOI
TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model with decidable consistency, which allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.

912 citations