Author

Srikanta Bedathur

Bio: Srikanta Bedathur is an academic researcher from Indian Institute of Technology Delhi. The author has contributed to research in topics: Computer science & SPARQL. The author has an h-index of 21 and has co-authored 108 publications receiving 1680 citations. Previous affiliations of Srikanta Bedathur include IBM & Indraprastha Institute of Information Technology.


Papers
Proceedings Article
01 Jan 2007
TL;DR: This paper utilizes information-extraction techniques to identify entity candidates in documents, map them onto entries in a richly structured ontology, and derive a generalized data graph that encompasses Web pages, entities, and ontological concepts and relationships.
Abstract: This paper pursues the recently emerging paradigm of searching for entities that are embedded in Web pages. We utilize information-extraction techniques to identify entity candidates in documents, map them onto entries in a richly structured ontology, and derive a generalized data graph that encompasses Web pages, entities, and ontological concepts and relationships. We exploit this combination of pages and entities for a novel kind of search-result ranking, coined EntityAuthority, in order to improve the quality of keyword queries that return either pages or entities. To this end, we utilize the mutual reinforcement between authoritative pages and important entities. This resembles the HITS method for Web-graph link analysis and recently proposed ObjectRank methods, but our approach operates on a much richer, typed graph structure with different kinds of nodes and also differs in the underlying mathematical definitions. Preliminary experiments with topic-specific slices of Wikipedia demonstrate the effectiveness of our approach on certain classes of queries.
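The mutual reinforcement between pages and entities described in the abstract can be illustrated with a plain HITS-style iteration on a bipartite page-entity graph. This is only a minimal sketch: the abstract states that EntityAuthority uses a richer, typed graph and different mathematical definitions, which are not reproduced here. The function names and the input layout (`page_entities`) are assumptions made for illustration.

```python
import math


def _normalize(scores):
    """Scale scores in place to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    for key in scores:
        scores[key] /= norm


def mutual_reinforcement(page_entities, iterations=50):
    """HITS-style scoring on a bipartite page-entity graph (illustrative only).

    page_entities maps a page id to the set of entity ids it mentions.
    Returns (page_score, entity_score) dictionaries.
    """
    pages = list(page_entities)
    entities = sorted({e for ents in page_entities.values() for e in ents})
    page_score = {p: 1.0 for p in pages}
    entity_score = {e: 1.0 for e in entities}

    for _ in range(iterations):
        # An entity gains importance from the pages that mention it.
        entity_score = {e: 0.0 for e in entities}
        for page, ents in page_entities.items():
            for entity in ents:
                entity_score[entity] += page_score[page]
        _normalize(entity_score)

        # A page gains authority from the entities it mentions.
        page_score = {
            p: sum(entity_score[e] for e in page_entities[p]) for p in pages
        }
        _normalize(page_score)

    return page_score, entity_score
```

Calling `mutual_reinforcement({"p1": {"a", "b"}, "p2": {"b"}, "p3": {"c"}})` ranks entity `b` above `a`, because two pages mention it, and ranks page `p1` above `p2`.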

17 citations

Proceedings Article
23 Sep 2007
TL;DR: Time-travel text search evaluates a keyword query on the state of a text collection as of a user-specified time point; this work demonstrates an efficient approach to it, implemented in the FLUXCAPACITOR prototype.
Abstract: An increasing number of temporally versioned text collections is available today, with Web archives being a prime example. Search on such collections, however, is often not satisfactory and ignores their temporal dimension completely. Time-travel text search solves this problem by evaluating a keyword query on the state of the text collection as of a user-specified time point. This work demonstrates our approach to efficient time-travel text search and its implementation in the FLUXCAPACITOR prototype.
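To make the idea of querying a collection "as of" a time point concrete, here is a minimal sketch of an inverted index whose postings carry validity intervals. It is not the FLUXCAPACITOR implementation; the class `VersionedIndex`, its methods, and the half-open interval convention are all assumptions for illustration. It also assumes that the versions of one document never overlap in time.

```python
from collections import defaultdict


class VersionedIndex:
    """Toy inverted index over versioned documents (hypothetical design)."""

    def __init__(self):
        # term -> list of (doc_id, begin, end); end=None means "still current"
        self._postings = defaultdict(list)

    def add_version(self, doc_id, text, begin, end=None):
        """Index one document version valid during [begin, end)."""
        for term in set(text.lower().split()):
            self._postings[term].append((doc_id, begin, end))

    def search(self, query, t):
        """Return ids of documents whose version at time t has all query terms."""
        result = None
        for term in query.lower().split():
            docs = {
                doc_id
                for doc_id, begin, end in self._postings.get(term, [])
                if begin <= t and (end is None or t < end)
            }
            result = docs if result is None else result & docs
        return result or set()
```

For example, after indexing a version of `d1` containing "search" that is valid during [0, 10) and a later version without that term, `search("search", 7)` returns `d1` while `search("search", 12)` does not.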

17 citations

Journal ArticleDOI
01 Sep 2015
TL;DR: An in-memory approximation called BloomGraphs is proposed to store and update evolving subgraphs on an underlying social graph - with users as nodes and the connection between them as edges - that is compact and computationally efficient to maintain in the presence of updates.
Abstract: Monitoring the formation and evolution of communities in large online social networks such as Twitter is an important problem that has generated considerable interest in both industry and academia. Fundamentally, the problem can be cast as studying evolving subgraphs (each subgraph corresponding to a topical community) on an underlying social graph - with users as nodes and the connection between them as edges. A key metric of interest in this setting is tracking the changes to the conductance of subgraphs induced by edge activations. This metric quantifies how well or poorly connected a subgraph is to the rest of the graph relative to its internal connections. Conductance has been demonstrated to be of great use in many applications, such as identifying bursty topics, tracking the spread of rumors, and so on. However, tracking this simple metric presents a considerable scalability challenge - the underlying social network is large, the number of communities that are active at any moment is large, the rate at which these communities evolve is high, and moreover, we need to track conductance in real-time. We address these challenges in this paper. We propose an in-memory approximation called BloomGraphs to store and update these (possibly overlapping) evolving subgraphs. As the name suggests, we use Bloom filters to represent an approximation of the underlying graph. This representation is compact and computationally efficient to maintain in the presence of updates. This is especially important when we need to simultaneously maintain thousands of evolving subgraphs. BloomGraphs are used in computing and tracking conductance of these subgraphs as edge-activations arrive. BloomGraphs have several desirable properties in the context of this application, including a small memory footprint and efficient updateability.
We also demonstrate mathematically that the error incurred in computing conductance is one-sided and that in the case of evolving subgraphs the change in approximate conductance has the same sign as the change in exact conductance in most cases. We validate the effectiveness of BloomGraphs through extensive experimentation on large Twitter graphs and other social networks.
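As a point of reference for the metric discussed in the abstract, the sketch below computes exact conductance using the common textbook definition: cut edges divided by the smaller of the two volumes (degree sums). The paper's precise definition and the Bloom-filter approximation are not reproduced here; the function name and the edge-list input format are assumptions.

```python
def conductance(edges, community):
    """Exact conductance of `community` in an undirected graph.

    edges is an iterable of (u, v) pairs; community is a collection of nodes.
    Uses cut(S) / min(vol(S), vol(V \\ S)), a standard textbook definition.
    """
    community = set(community)
    cut = 0       # edges with exactly one endpoint inside the community
    vol_in = 0    # sum of degrees of nodes inside the community
    vol_out = 0   # sum of degrees of nodes outside the community
    for u, v in edges:
        u_in, v_in = u in community, v in community
        if u_in != v_in:
            cut += 1
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0
```

On two triangles joined by a single edge, taking one triangle as the community gives one cut edge against a volume of 7, a low conductance that reflects a well-separated community.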

16 citations

01 Jan 2010
TL;DR: NEAT is a prototype system that provides an exploration interface to news archive search, visualizing search results using two kinds of temporal information: news articles’ publication dates and the temporal expressions they contain.
Abstract: In this paper, we present NEAT, a prototype system that provides an exploration interface to news archive search. Our prototype visualizes search results making use of two kinds of temporal information, namely, news articles’ publication dates and the temporal expressions they contain. The displayed timelines are annotated with major events, harvested using crowdsourcing, to make it easier for users to put the shown search results into context. The prototype has been fully implemented and deployed on the New York Times Annotated Corpus.

16 citations

Book ChapterDOI
20 Oct 2020
TL;DR: This is the first attempt at jointly modeling the diffusion process with activity-driven implicit communities; CoLAB achieves up to 27% improvement in the location prediction task over recent deep point-process based methods on geo-tagged event traces collected from Foursquare check-ins.
Abstract: The location check-ins of users through various location-based services such as Foursquare, Twitter and Facebook Places generate large traces of geo-tagged events. These event traces often manifest in hidden (possibly overlapping) communities of users with similar interests. Inferring these implicit communities is crucial for forming user profiles for improvements in recommendation and prediction tasks. Given only time-stamped geo-tagged traces of users, can we find out these implicit communities, and characteristics of the underlying influence network? Can we use this network to improve the next location prediction task? In this paper, we focus on the problem of community detection as well as capturing the underlying diffusion process. We propose CoLAB, based on spatio-temporal point processes for information diffusion in continuous time but discrete space of locations. It simultaneously models the implicit communities of users based on their check-in activities, without making use of their social network connections. CoLAB captures the semantic features of the location and user-to-user influence, along with spatial and temporal preferences of users. The latent community of users and model parameters are learnt through stochastic variational inference. To the best of our knowledge, this is the first attempt at jointly modeling the diffusion process with activity-driven implicit communities. We demonstrate that CoLAB achieves up to 27% improvement in the location prediction task over recent deep point-process based methods on geo-tagged event traces collected from Foursquare check-ins.

14 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This work covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: Recent progress on link prediction algorithms is summarized, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods.
Abstract: Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress on link prediction algorithms, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.

2,530 citations

Journal ArticleDOI
TL;DR: YAGO2 is an extension of the YAGO knowledge base in which entities, facts, and events are anchored in both time and space; it contains 447 million facts about 9.8 million entities.

1,186 citations

Journal ArticleDOI
TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model with a decidable consistency that allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.

912 citations